Different results from Murmur3 from Scala and Guava

I am trying to generate hashes using the Murmur3 algorithm. Each library is consistent with itself, but Scala and Guava return different values for the same input.

import com.google.common.hash.Hashing
import org.scalatest.FunSuite

import scala.util.hashing.MurmurHash3

class package$Test extends FunSuite {
  test("Generate hashes") {
    println(s"Seed = ${MurmurHash3.stringSeed}")
    val vs = Set("abc", "test", "bucket", 111.toString)
    vs.foreach { x =>
      println(s"[SCALA] Hash for $x = ${MurmurHash3.stringHash(x).abs % 1000}")
      println(s"[GUAVA] Hash for $x = ${Hashing.murmur3_32().hashString(x).asInt().abs % 1000}")
      println(s"[GUAVA with seed] Hash for $x = ${Hashing.murmur3_32(MurmurHash3.stringSeed).hashString(x).asInt().abs % 1000}")
      println()
    }
  }
}


Seed = -137723950
[SCALA] Hash for abc = 174
[GUAVA] Hash for abc = 419
[GUAVA with seed] Hash for abc = 195

[SCALA] Hash for test = 588
[GUAVA] Hash for test = 292
[GUAVA with seed] Hash for test = 714

[SCALA] Hash for bucket = 413
[GUAVA] Hash for bucket = 22
[GUAVA with seed] Hash for bucket = 414

[SCALA] Hash for 111 = 250
[GUAVA] Hash for 111 = 317
[GUAVA with seed] Hash for 111 = 958

Why do I get different hashes?

+4
2 answers

It seems to me that Scala's hashString converts pairs of UTF-16 chars into an int differently than Guava's hashUnencodedChars (hashString without a Charset).

Scala:

val data = (str.charAt(i) << 16) + str.charAt(i + 1)

Guava:

int k1 = input.charAt(i - 1) | (input.charAt(i) << 16);

In Guava, the char at an index i becomes the 16 least significant bits of the int and the char at i + 1 becomes the 16 most significant bits. The Scala implementation does the opposite: the char at i is the most significant 16 bits and the char at i + 1 the least significant. (The fact that Scala combines the two with + while Guava uses | could conceivably matter as well, but probably not.)
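To make the difference visible, here is a small standalone sketch (my own illustration, not code from either library) that packs the same two chars both ways; these ints are what gets fed into the mix step, so the hashes diverge from the very first block:

// Pack the first two chars of a string the way each implementation does.
val c0 = 'a'.toInt                 // 0x61
val c1 = 'b'.toInt                 // 0x62

val scalaBlock = (c0 << 16) + c1   // 0x00610062 - first char in the high 16 bits
val guavaBlock = c0 | (c1 << 16)   // 0x00620061 - first char in the low 16 bits

println(f"Scala block = 0x$scalaBlock%08x, Guava block = 0x$guavaBlock%08x")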

Note that Guava's implementation is equivalent to using ByteBuffer.putChar(c) twice to put two characters into a little-endian ByteBuffer and then calling ByteBuffer.getInt() to read an int back out. It is also equivalent to encoding the string to bytes with UTF-16LE and hashing those bytes. The Scala implementation is not equivalent to encoding the string with any of the charsets a JVM is required to support, and I am not sure what precedent (if any) Scala has for doing it the way it does.
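That equivalence is easy to check directly. The sketch below assumes Guava is on the classpath and uses hashUnencodedChars (the replacement name for the charset-less hashString the question calls); it should print true:

import java.nio.charset.StandardCharsets

import com.google.common.hash.Hashing

val s = "abc"
val viaChars = Hashing.murmur3_32().hashUnencodedChars(s).asInt()
val viaBytes = Hashing.murmur3_32().hashBytes(s.getBytes(StandardCharsets.UTF_16LE)).asInt()
println(viaChars == viaBytes) // expected: true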

Edit:

The Scala implementation also does one more thing differently than Guava: it passes the number of chars that were hashed to finalizeHash, whereas Guava passes the number of bytes to the equivalent fmix step.
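To make that concrete, here is roughly what Scala's stringHash boils down to, written against the public mix, mixLast and finalizeHash helpers (a sketch based on the description above, not a copy of the library source). Note that finalizeHash receives str.length, the number of chars, whereas Guava's fmix receives the number of bytes, i.e. twice that:

import scala.util.hashing.MurmurHash3

def scalaStyleStringHash(str: String, seed: Int = MurmurHash3.stringSeed): Int = {
  var h = seed
  var i = 0
  while (i + 1 < str.length) {
    // earlier char in the high 16 bits, later char in the low 16 bits
    h = MurmurHash3.mix(h, (str.charAt(i) << 16) + str.charAt(i + 1))
    i += 2
  }
  if (i < str.length) h = MurmurHash3.mixLast(h, str.charAt(i).toInt)
  MurmurHash3.finalizeHash(h, str.length) // length in chars, not bytes
}

// If this sketch matches the library's behaviour, this should print true.
println(scalaStyleStringHash("abc") == MurmurHash3.stringHash("abc"))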

+4

If I remember correctly, hashString(x, StandardCharsets.UTF_16BE) should match Scala's behavior. Give it a try.

(In general, though, I would not rely on Scala's hashes matching Guava's anyway!)
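For reference, the call being suggested looks like this (just a usage sketch; per the answer above, Scala's char packing is not equivalent to any standard charset, so this may still not reproduce Scala's values):

import java.nio.charset.StandardCharsets

import com.google.common.hash.Hashing

val h = Hashing.murmur3_32().hashString("abc", StandardCharsets.UTF_16BE).asInt()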

-1
