I am using Mahout to run k-means clustering, and I have a problem keeping track of which input record is which during clustering. For example, I have 100 data records:
id data
0 0.1 0.2 0.3 0.4
1 0.2 0.3 0.4 0.5
... ...
100 0.2 0.4 0.4 0.5
After clustering, I need to map each point in the result back to its id to see which record belongs to which cluster, but I cannot find a way to carry the id through the clustering.
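To make the goal concrete, here is a minimal plain-Java sketch of the association I want to preserve. It does not use Mahout at all; the class name `IdRecords` and the method `parse` are hypothetical names made up for illustration, and it assumes each input line is an id followed by whitespace-separated values, as in the records above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Mahout): keeps each record's id next to its values.
class IdRecords {

    // Parses lines of the form "id v1 v2 ..." into an id -> values map,
    // preserving the order in which the records were read.
    static Map<String, double[]> parse(List<String> lines) {
        Map<String, double[]> records = new LinkedHashMap<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            double[] values = new double[parts.length - 1];
            for (int i = 1; i < parts.length; i++) {
                values[i - 1] = Double.parseDouble(parts[i]);
            }
            records.put(parts[0], values);
        }
        return records;
    }

    public static void main(String[] args) {
        Map<String, double[]> records = parse(Arrays.asList(
                "0 0.1 0.2 0.3 0.4",
                "1 0.2 0.3 0.4 0.5"));
        // Print each id together with its vector.
        for (Map.Entry<String, double[]> e : records.entrySet()) {
            System.out.println(e.getKey() + " -> " + Arrays.toString(e.getValue()));
        }
    }
}
```

What I am looking for is the equivalent of this id-to-vector pairing inside Mahout, so that the id is still attached to each point when the clusters are written out.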
In the official Mahout example that clusters the synthetic control data, only the data values are fed into Mahout, without any id, for example:
28.7812 34.4632 31.3381 31.2834 28.9207 ...
...
24.8923 25.741 27.5532 32.8217 27.8789 ...
and the clustering output contains only the cluster identifier and the point values:
VL-539{n=38 c=[29.950, 30.459, ...
Weight: Point:
1.0: [28.974, 29.026, 31.404, 27.894, 35.985...
2.0: [24.214, 33.150, 31.521, 31.986, 29.064
There is no point identifier anywhere in the output. Does anyone know how to keep the point id when running Mahout clustering? Many thanks!