In IndexWriterConfig you can set the Codec, which defines the storage format the index will use. This only takes effect when the IndexWriter is constructed (i.e. changing the config after construction will have no effect). You'll want to use Lucene40Codec.
Something like:
// You could also simply pass in Version.LUCENE_40 here, and not worry about the Codec
// (though that will likely affect other things as well)
IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_41, analyzer);
config.setCodec(new Lucene40Codec());
IndexWriter writer = new IndexWriter(directory, config);
You could also use Lucene40StoredFieldsFormat to get the old, uncompressed stored field format, and return it from a custom Codec implementation. You could probably take most of the code from Lucene41Codec and just replace the storedFieldsFormat() method. This might be the more targeted approach, but it is a bit more complex, and I don't know for sure whether you might run into other issues.
One more note on creating a custom codec: the API docs indicate that the way to do this is to extend FilterCodec. Modifying their example slightly:
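public final class CustomCodec extends FilterCodec {
    public CustomCodec() {
        // Delegate everything to the 4.1 codec except what is overridden below.
        super("CustomCodec", new Lucene41Codec());
    }

    @Override
    public StoredFieldsFormat storedFieldsFormat() {
        // Use the old, uncompressed 4.0 stored fields format.
        return new Lucene40StoredFieldsFormat();
    }
}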
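One thing to keep in mind with a custom codec: Lucene looks codecs up by name through Java's service provider mechanism when it reads segments, so the class needs a public no-arg constructor (as above) and should be registered on the classpath. A minimal sketch of the registration file, assuming the class lives in a hypothetical package com.example (adjust to wherever you actually put CustomCodec):

```text
# File: META-INF/services/org.apache.lucene.codecs.Codec
# com.example is a placeholder package, not something from the question.
com.example.CustomCodec
```

Without that entry I'd expect opening the index to fail with an error saying that no codec named "CustomCodec" exists, though I haven't tested that against 4.1 specifically.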
Of course, there is one other approach that comes to mind:
I think it's clear to you as well that the problem lies right around "I end up loading all of the candidate documents." I won't speculate too much about your scoring implementation, since I don't have full information or understanding of it, but it sounds like you are fighting against Lucene's architecture to make it do what you want. Stored fields shouldn't be used for scoring, as a rule, and you can expect performance to suffer very noticeably with the 4.0 stored field format as well, though to a somewhat lesser extent. Might there be a better implementation, in terms of either the scoring algorithm or the document structure, that would eliminate the need to score documents based on stored fields?