Add a comment to the ARFF file

This is my first question on this forum. I am building a data-mining application in Java with the WEKA API. I do the preprocessing step first, and when I save the ARFF file I would like to add a couple of lines (as comments) that describe the preprocessing I applied to the file. The problem is that I don't know how to add comments to the ARFF file through the Java WEKA API. To save the file, I use the ArffSaver class, like this:

    try {
        ArffSaver saver = new ArffSaver();
        saver.setInstances(dataPost);
        saver.setFile(arffFile);
        saver.writeBatch();
        return true;
    } catch (IOException ex) {
        Logger.getLogger(Preprocesamiento.class.getName()).log(Level.SEVERE, null, ex);
        return false;
    }

I would be very grateful if someone could give me some ideas. Thanks!

+4
2 answers

You should AVOID writing comments in the .arff file, even more so when you write it out from Java. These files are very "parser sensitive", and the Weka API for creating them is restrictive for that very reason.

Despite this, you can always add your comments manually using the % symbol. That said, I would not recommend putting anything other than instances, attributes, and values into the .arff file. ;-)
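As a minimal sketch of adding the comments after the fact: the helper below prepends % lines to an ARFF file that has already been written (for example by ArffSaver). The class name ArffCommentPrepender and the method prependComments are hypothetical and not part of the Weka API. The sketch uses only java.nio, and it assumes the file is UTF-8 encoded and small enough to read into memory.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Weka: post-processes an already saved ARFF file.
class ArffCommentPrepender {

    /** Inserts the given lines as '%' comments at the top of an existing ARFF file. */
    static void prependComments(Path arffFile, List<String> commentLines) throws IOException {
        // Read the file as it was saved (assumption: UTF-8 encoded).
        List<String> original = Files.readAllLines(arffFile, StandardCharsets.UTF_8);

        List<String> result = new ArrayList<>();
        for (String commentLine : commentLines) {
            // Each comment line must start with '%'.
            result.add("% " + commentLine);
        }
        result.add(""); // blank line between the comments and the ARFF header
        result.addAll(original);

        // Overwrite the file with comments followed by the original content.
        Files.write(arffFile, result, StandardCharsets.UTF_8);
    }
}
```

It would be called after saver.writeBatch() has finished, for example as ArffCommentPrepender.prependComments(arffFile.toPath(), comments), where arffFile is the java.io.File passed to setFile.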

+1

I see no reason for not writing comments in the header of the ARFF file. The specification clearly says:

Lines starting with the % character are comments.

So although it is technically feasible, it can be tricky if you want to use ArffSaver#setFile. That method does a lot of work internally (convenient, but somewhat arbitrary and unspecified) before it finally calls

 setDestination(new FileOutputStream(m_outputFile)); 

If you do not need that extra behavior, the easiest option is to write the comments directly to an OutputStream, which you can then simply set as the destination of the ArffSaver. This can be wrapped in a small helper method, for example:

    static void writeArff(Instances instances, List<String> commentLines,
            OutputStream outputStream) throws IOException {
        ArffSaver saver = new ArffSaver();
        saver.setInstances(instances);
        if (commentLines != null && !commentLines.isEmpty()) {
            BufferedWriter bw = new BufferedWriter(
                new OutputStreamWriter(outputStream));
            for (String commentLine : commentLines) {
                bw.write("% " + commentLine + "\n");
            }
            bw.write("\n");
            bw.flush();
        }
        saver.setDestination(outputStream);
        saver.writeBatch();
    }

When it is called like this

    List<String> comments = Arrays.asList("A comment", "Another one");
    writeArff(instances, comments, outputStream);

then these comments will be inserted at the top of the ARFF file.
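One pitfall with this approach: if a comment string itself contains a line break, the text after the break ends up on a line that does not start with %, so it is no longer a comment. A minimal sketch of a guard against that is shown below. The class ArffComments and the method toSingleLines are hypothetical names, not Weka API. The method splits free-form text into single-line strings that can then be passed as commentLines.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Weka.
class ArffComments {

    /**
     * Splits free-form text into single-line strings, so that every
     * resulting line can be prefixed with '%' individually.
     */
    static List<String> toSingleLines(String text) {
        List<String> lines = new ArrayList<>();
        // "\\R" matches any line break (\n, \r\n, \r); -1 keeps trailing empty lines.
        for (String line : text.split("\\R", -1)) {
            lines.add(line);
        }
        return lines;
    }
}
```

For example, writeArff(instances, ArffComments.toSingleLines(description), outputStream) would keep a multi-line description entirely inside the comment block, assuming description is a String that holds the preprocessing notes.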

0
