I am trying to count the number of unique users per day in a Java App Engine application. I decided to use the MapReduce framework for App Engine Java (mapreduce.appspot.com) to do this calculation offline. I have managed to create a MapReduce job that iterates over all of my objects, each of which represents a single user session event, and I can increment a simple counter from it. I have a few questions:
1) How do I increment the counter only once per user ID? I am currently mapping over objects that have a user ID property, but many of these objects can contain the same user ID, so how can I count each ID only once?
2) Once the job results are stored in these counters, how can I persist them to the datastore? I can see the counter values on the mapreduce status page, but I want them to be written to the datastore automatically.
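To make question 1 concrete, here is a minimal plain-Java sketch of the result I am after. It does not use the mapreduce library at all: `SessionEvent`, its `day` and `userId` fields, and `countUniqueUsersPerDay` are made-up names for illustration, and the in-memory `HashSet` only stands in for whatever de-duplication step the real job would need.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class UniqueUsersPerDay {

    /** Hypothetical stand-in for one stored user session event. */
    static final class SessionEvent {
        final String day;     // e.g. "2011-05-01" (assumed format)
        final String userId;

        SessionEvent(String day, String userId) {
            this.day = day;
            this.userId = userId;
        }
    }

    /** Returns, for each day, how many distinct user IDs appeared. */
    static Map<String, Integer> countUniqueUsersPerDay(List<SessionEvent> events) {
        // Group user IDs by day; a Set keeps each ID at most once per day.
        Map<String, Set<String>> usersByDay = new TreeMap<>();
        for (SessionEvent e : events) {
            usersByDay.computeIfAbsent(e.day, d -> new HashSet<>()).add(e.userId);
        }

        // The unique-user count for a day is the size of that day's set.
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Set<String>> entry : usersByDay.entrySet()) {
            counts.put(entry.getKey(), entry.getValue().size());
        }
        return counts;
    }

    public static void main(String[] args) {
        List<SessionEvent> events = new ArrayList<>();
        events.add(new SessionEvent("2011-05-01", "alice"));
        events.add(new SessionEvent("2011-05-01", "alice")); // same user, same day
        events.add(new SessionEvent("2011-05-01", "bob"));
        events.add(new SessionEvent("2011-05-02", "alice"));

        // prints {2011-05-01=2, 2011-05-02=1}
        System.out.println(countUniqueUsersPerDay(events));
    }
}
```

A plain counter incremented in the map step would report 3 for the first day here, which is exactly the problem: the two events for the same user need to collapse into one before counting.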
Ideas?
java google-app-engine parallel-processing mapreduce
aloo