Couchbase Data Modeling - Document Oriented

This question does not have to be developed by Couchbase 2.0, but I think it can help people with the investigation of the new Couchbase product.

I am looking for advice on data modeling. We are exploring Couchbase with a view to possibly using it for Realtime Analytics.

However, I cannot find documentation on how to best model data in the real world.

I propose a scenario, and if the community can help me or discuss some ideas on how this could be modeled, would it be very helpful?

Please note that this is not a representative of our product, and I do not ask people to solve our modeling for us, the question is more for discussion

Suppose that customers make purchases of products at a specific date / time, the products have information with them, such as identifier, name, description and price, the purchase is made on a date.

The initial requirement is to be able to count all purchases between two dates. There can be more than 100,000 purchases in one day - this is a pretty big business;)

If any of the syntax is incorrect, please tell me all the tips / help.

If we modeled the data like this (which may be completely wrong):

Grocery shopping

{ "_id" : "purchase_1", "_rev" : "1-1212afdd126126128ae", "products" : [ "prod_1" : { "name" : "Milk", "desc" : "Semi-skimmed 1ltr", "price" : "0.89" }, "prod_7568" : { "name" : "Crisps", "desc" : "Salt and Vinegar", "price: "0.85" } ] "date" : "2012-01-14 14:24:33" } { "_id" : "purchase_2", "_rev" : "1-1212afdd126126128ae", "products" : [ "prod_89001" : { "name" : "Bread", "desc" : "White thick sliced", "price: "1.20" } ] "date" : "2012-01-14 15:35:59" } 

So, given this layout of the document, we can see every purchase, and we can see the products that were in this purchase - how could we count on all purchases between two dates? Also, how could you see a log of all purchases between two dates in descending order of date?

Is Couchbase suitable for this?

Between two dates there can be hundreds of thousands of purchases, and the client does not like to wait for reports .... as I'm sure everyone experienced;)

It would be better to use incr functions, and if so, how are you going to model the data?

Many thanks to everyone who reads this - I hope this explains, adding even more examples of modeling problems in the real world, if possible.

James

+7
source share
1 answer

In the simplest case, you can write a Map function that creates a view using the date field as a key.

So, with a slightly redesigned document:

 { "_id": "purchase_1", "_rev": "2-c09e24efaffd446c6ee8ed6a6e2b4a22", "products": [ { "id": "prod_3", "name": "Bread", "desc": "Whole wheat high fiber", "price": 2.99 } ], "date": "2012-01-15 12:34:56" } { "_id": "purchase_2", "_rev": "2-3a7f4e4e5907d2163d6684f97c45a715", "products": [ { "id": "prod_1", "name": "Milk", "desc": "Semi-skimmed 1ltr", "price": 0.89 }, { "id": "prod_7568", "name": "Crisps", "desc": "Salt and Vinegar", "price": 0.85 } ], "date": "2012-01-14 14:24:33" } 

Your display function will look like this:

 function(doc) { for (var product in doc.products) { emit(doc.date, doc.products[product].price); } } 

You can add a reduction function that summarizes purchases by date.

 function(keys, values) { return sum(values); } 

You can then request a view using the start key and end key options.

 http://localhost:5984/couchbase/_design/Products/_view/total_price_by_date?startkey="2012-01-01"&endkey="2012-01-31"&group=true 

The result of a view request will look like this:

 {"rows":[ {"key":"2012-01-14 14:24:33","value":4.94}, {"key":"2012-01-15 12:34:56","value":2.99} ]} 

Or remove the group parameter to get the sum for the entire date range:

 {"rows":[ {"key":null,"value":7.930000000000001} ]} 

Hope this helps.

- John

+6
source

All Articles