As I see in the code for BM25Similarity:
    public final long computeNorm(FieldInvertState state) {
        final int numTerms = discountOverlaps
            ? state.getLength() - state.getNumOverlap()
            : state.getLength();
        return encodeNormValue(state.getBoost(), numTerms);
    }
where FieldInvertState#getLength() is:

    public int getLength() {
        return length;
    }
So at this point the length is an integer. Could you tell me where you see non-integer values? In the Solr Admin UI, or somewhere else?
Now that you have posted the output, I found the place where it came from: source
Take a look at this:
    private Explanation explainTFNorm(int doc, Explanation freq, BM25Stats stats, NumericDocValues norms) {
        List<Explanation> subs = new ArrayList<>();
        subs.add(freq);
        subs.add(Explanation.match(k1, "parameter k1"));
        if (norms == null) {
            subs.add(Explanation.match(0, "parameter b (norms omitted for field)"));
            return Explanation.match(
                (freq.getValue() * (k1 + 1)) / (freq.getValue() + k1),
                "tfNorm, computed from:", subs);
        } else {
            float doclen = decodeNormValue((byte)norms.get(doc));
            subs.add(Explanation.match(b, "parameter b"));
            subs.add(Explanation.match(stats.avgdl, "avgFieldLength"));
            subs.add(Explanation.match(doclen, "fieldLength"));
            return Explanation.match(
                (freq.getValue() * (k1 + 1)) / (freq.getValue() + k1 * (1 - b + b * doclen/stats.avgdl)),
                "tfNorm, computed from:", subs);
        }
    }
So the value reported as fieldLength is not the raw term count but the value decoded from the stored norm byte:

    float doclen = decodeNormValue((byte)norms.get(doc));

where decodeNormValue and its lookup table are:

    protected float decodeNormValue(byte b) {
        return NORM_TABLE[b & 0xFF];
    }

    private static final float[] NORM_TABLE = new float[256];

    static {
        for (int i = 1; i < 256; i++) {
            float f = SmallFloat.byte315ToFloat((byte)i);
            NORM_TABLE[i] = 1.0f / (f*f);
        }
        NORM_TABLE[0] = 1.0f / NORM_TABLE[255];
    }
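To make the effect visible, here is a minimal sketch of the encode/decode round trip. It is not Lucene's code: the two SmallFloat methods are re-implemented from memory so the example runs without Lucene on the classpath, the encode step (boost / sqrt(length)) is my assumption about what encodeNormValue(boost, numTerms) does in this version, and the class name NormRoundTrip and the method decodedLength are made up for illustration.

```java
// Sketch only: shows why an integer term count comes back as a non-integer fieldLength.
class NormRoundTrip {

    // Assumed equivalent of SmallFloat.floatToByte315: squeezes a float into one byte
    // by keeping only the top bits of its representation (lossy, truncating).
    static byte floatToByte315(float f) {
        int bits = Float.floatToRawIntBits(f);
        int smallfloat = bits >> (24 - 3);
        if (smallfloat <= ((63 - 15) << 3)) {
            return (bits <= 0) ? (byte) 0 : (byte) 1; // zero/negative, or underflow
        }
        if (smallfloat >= ((63 - 15) << 3) + 0x100) {
            return -1; // overflow: largest representable byte
        }
        return (byte) (smallfloat - ((63 - 15) << 3));
    }

    // Assumed equivalent of SmallFloat.byte315ToFloat: expands the byte back to a float.
    static float byte315ToFloat(byte b) {
        if (b == 0) return 0.0f;
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    // Assumption: the norm stores boost / sqrt(fieldLength) in a single byte.
    static byte encodeNormValue(float boost, int fieldLength) {
        return floatToByte315(boost / (float) Math.sqrt(fieldLength));
    }

    // Mirrors NORM_TABLE from the answer: 1 / (f * f), with byte 0 special-cased.
    static float decodeNormValue(byte b) {
        if (b == 0) return 1.0f / decodeNormValue((byte) 255);
        float f = byte315ToFloat(b);
        return 1.0f / (f * f);
    }

    // Hypothetical helper: the length that explain would report for a field of numTerms terms.
    static float decodedLength(int numTerms) {
        return decodeNormValue(encodeNormValue(1.0f, numTerms));
    }

    public static void main(String[] args) {
        for (int numTerms = 1; numTerms <= 10; numTerms++) {
            System.out.println(numTerms + " terms -> fieldLength " + decodedLength(numTerms));
        }
    }
}
```

Under these assumptions a field with 10 terms comes back as roughly 10.24 and a field with 3 terms as 4.0, so only a small set of distinct lengths can ever appear in the explain output.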
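In fact, according to Wikipedia, this docLen should be |D|, the length of the document D in words. The value shown in the explain output is instead whatever comes back from the lossy one-byte norm, which is why it is not an integer.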
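For reference, here is the term-frequency part of BM25 written out with the same parameters that explainTFNorm uses. The notation is mine: f(t, D) stands for freq, avgdl for stats.avgdl, and |D| is the position where Lucene plugs in doclen.

```latex
\mathrm{tfNorm}(t, D) =
  \frac{f(t, D)\,(k_1 + 1)}
       {f(t, D) + k_1 \left(1 - b + b\,\frac{|D|}{\mathrm{avgdl}}\right)}
```

When norms are omitted, the code above uses b = 0, and the denominator reduces to f(t, D) + k_1.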