Designing Social Feeds in DynamoDB

This question may be relevant for any document-based NoSQL database.

I am building a social network and decided to go with DynamoDB for its scalability and low operational pain. There are only two main objects in the database: users and posts.

The requirements for the regular queries are very simple:

  • Home feed (feed of the people I follow)
  • My / user feed (my own feed or a specific user's feed)
  • List of users I follow / a user follows
  • Followers list

Here is the database schema I have come up with so far (legend: __thisIsHashKey and _thisIsRangeKey):

    timeline = { // post
      __usarname: "totocaster",
      _date: "1245678901345",
      record_type: "collection",
      items: [
        "2d931510-d99f-494a-8c67-87feb05e1594",
        "2d931510-d99f-494a-8c67-87feb05e1594",
        "2d931510-d99f-494a-8c67-87feb05e1594",
        "2d931510-d99f-494a-8c67-87feb05e1594",
        "2d931510-d99f-494a-8c67-87feb05e1594"
      ],
      number_of_likes: 123,
      description: "Hello, this is cool"
    }

    timeline = { // new follower
      __usarname: "totocaster",
      _date: "1245678901345",
      type: "follow",
      follower: "tamuna123"
    }

    timeline = { // new like
      __usarname: "totocaster",
      _date: "1245678901345",
      record_type: "like",
      liker: "tamuna123",
      like_date: "123255634567456"
    }

    users = {
      __username: "totocaster",
      avatar_url: "2d931510-d99f-494a-8c67-87feb05e1594",
      followers: ["don_gio", "tamuna123", "barbie", "mikecsharp", "bassman"],
      following: ["tamuna123", "barbie", "mikecsharp"],
      likes: [
        { username: 'barbie', date: "123255634567456" },
        { username: "mikecsharp", date: "123255634567456" }
      ],
      full_name: "Toto Tvalavadze",
      password: "Hashed Key",
      email: "totocaster@myemailprovider.com"
    }

As you can see, I store all of a user's posts directly in the timeline collection. This way I can query posts by username and date (the hash and range keys). Everything seems fine, but here is the problem:

I cannot fetch a user's home feed (the posts of everyone they follow) in a single query. This will be one of the most frequent queries in the system, and I cannot find an efficient way to do it. Please help. Thanks.
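To make the problem concrete, here is a minimal sketch of what building the home feed looks like with this schema: one query per followed user, then a merge by date. It does not talk to DynamoDB at all; `TIMELINE`, `FOLLOWING`, `query_user_timeline` and `home_feed` are hypothetical in-memory stand-ins, and the dates and post texts are made up for illustration.

```python
import heapq
from itertools import islice

# Hypothetical stand-in for the timeline table:
# hash key (username) -> posts as (date, text), already sorted newest first,
# the way a range-key query in descending order would return them.
TIMELINE = {
    "tamuna123": [(1300, "tamuna post 2"), (1100, "tamuna post 1")],
    "barbie": [(1250, "barbie post 1")],
    "mikecsharp": [(1400, "mike post 2"), (1000, "mike post 1")],
}

# Hypothetical stand-in for the `following` attribute of the users table.
FOLLOWING = {"totocaster": ["tamuna123", "barbie", "mikecsharp"]}


def query_user_timeline(username, limit):
    """Stand-in for ONE hash-key query: the newest `limit` posts of one user."""
    posts = TIMELINE.get(username, [])[:limit]
    return [(date, username, text) for date, text in posts]


def home_feed(username, limit=3):
    """One query per followed user, then a k-way merge by date (newest first)."""
    per_user = [query_user_timeline(u, limit) for u in FOLLOWING.get(username, [])]
    merged = heapq.merge(*per_user, key=lambda item: item[0], reverse=True)
    return list(islice(merged, limit))


if __name__ == "__main__":
    for date, author, text in home_feed("totocaster"):
        print(date, author, text)
```

The number of queries grows with the number of people a user follows, which is exactly the cost I would like to avoid.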

database amazon-web-services amazon-dynamodb database-design database-schema
2 answers

I would look at the graph databases Titan ( http://thinkaurelius.github.com/titan/ ) and Neo4j ( http://www.neo4j.org/ ).

I know that Titan claims to scale quite well with large datasets.

Ultimately, I think your model fits a graph well. Users and posts would be nodes, and you can then connect them arbitrarily through edges. A user (node) is a friend (edge) of another user (node).

A user (node) has many posts (nodes) on their timeline. You can then run interesting traversals over the graph.
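As a rough illustration of the idea (not Titan's or Neo4j's actual API), here is a small in-memory sketch of users and posts as nodes with labeled edges, plus one traversal that collects the posts of everyone a user follows. The class `SocialGraph`, the edge labels `"follows"` and `"posted"`, and the post IDs are all made up for this example.

```python
from collections import defaultdict


class SocialGraph:
    """Toy labeled graph: edges[label][source] is the set of target nodes."""

    def __init__(self):
        self.edges = defaultdict(lambda: defaultdict(set))

    def add_edge(self, source, label, target):
        self.edges[label][source].add(target)

    def out(self, source, label):
        """Nodes reachable from `source` over one edge with this label."""
        return set(self.edges[label].get(source, set()))

    def posts_of_followed(self, user):
        """Two-hop traversal: user -follows-> other user -posted-> post."""
        posts = set()
        for followed in self.out(user, "follows"):
            posts |= self.out(followed, "posted")
        return posts


if __name__ == "__main__":
    g = SocialGraph()
    g.add_edge("totocaster", "follows", "tamuna123")
    g.add_edge("totocaster", "follows", "barbie")
    g.add_edge("tamuna123", "posted", "post-1")
    g.add_edge("barbie", "posted", "post-2")
    g.add_edge("bassman", "posted", "post-3")  # not followed, so not returned
    print(sorted(g.posts_of_followed("totocaster")))
```

A real graph database would express the same two-hop walk in its own query language and handle storage and indexing for you.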


I work with news feeds daily. (I am the author of Stream-Framework and the founder of getstream.io.)

The most common solutions that I see are:

  • Cassandra (Instagram)
  • Redis (expensive but easy)
  • MongoDB
  • DynamoDB
  • RocksDB (LinkedIn)

Most people use either fan-out on write or fan-out on read. That makes it easier to build a working solution, but it can quickly become expensive. Your best bet is to combine the two approaches: do a fan-out on write in most cases, but for very popular feeds keep the posts in memory and pull them in at read time.
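A minimal in-memory sketch of that combination, to show the shape of the logic rather than how Stream-Framework or getstream.io implement it. The class `HybridFeeds`, the `POPULAR_THRESHOLD` cutoff, and the tuple layout are assumptions made up for this example; real systems also have to handle backfilling on follow, trimming, and users crossing the popularity threshold.

```python
from collections import defaultdict

POPULAR_THRESHOLD = 2  # hypothetical cutoff: this many followers counts as "popular"


class HybridFeeds:
    def __init__(self):
        self.followers = defaultdict(set)       # author -> followers
        self.following = defaultdict(set)       # user -> authors they follow
        self.home = defaultdict(list)           # user -> materialized home feed
        self.popular_posts = defaultdict(list)  # popular author -> posts kept in memory

    def follow(self, follower, author):
        self.followers[author].add(follower)
        self.following[follower].add(author)

    def is_popular(self, author):
        return len(self.followers[author]) >= POPULAR_THRESHOLD

    def post(self, author, date, text):
        item = (date, author, text)
        if self.is_popular(author):
            # Fan-out on read: store once, followers pull it when they read.
            self.popular_posts[author].append(item)
        else:
            # Fan-out on write: copy the post into every follower's home feed.
            for follower in self.followers[author]:
                self.home[follower].append(item)

    def read_home(self, user, limit=10):
        items = list(self.home[user])
        for author in self.following[user]:
            items.extend(self.popular_posts.get(author, []))
        return sorted(items, reverse=True)[:limit]  # newest first


if __name__ == "__main__":
    feeds = HybridFeeds()
    feeds.follow("a", "star")
    feeds.follow("b", "star")
    feeds.follow("a", "friend")
    feeds.post("star", 2, "hi")     # popular: not copied to followers
    feeds.post("friend", 1, "yo")   # regular: copied to follower "a"
    print(feeds.read_home("a"))
```

The trade-off is that reads stay cheap for most users, while a post by an author with a huge following costs one write instead of one write per follower.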

Stream-Framework is open source and supports Cassandra / Redis and Python

getstream.io is a hosted solution built on top of Go and RocksDB.

If you end up using DynamoDB, be sure to choose the right partition key: https://shinesolutions.com/2016/06/27/a-deep-dive-into-dynamodb-partitions/
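One commonly described way to keep a single busy user from turning into a hot partition is write sharding: append a small, deterministic suffix to the partition key so that user's items spread over several key values. The sketch below only shows the key computation and is not taken from the linked article; `NUM_SHARDS`, the `username#shard` format, and both function names are assumptions for illustration.

```python
import hashlib

NUM_SHARDS = 10  # hypothetical number of key suffixes per user


def sharded_partition_key(username, item_id, num_shards=NUM_SHARDS):
    """Derive a partition key like 'totocaster#7' from the item's ID."""
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % num_shards
    return f"{username}#{shard}"


def all_partition_keys(username, num_shards=NUM_SHARDS):
    """Every key a reader has to query to see all of this user's items."""
    return [f"{username}#{shard}" for shard in range(num_shards)]


if __name__ == "__main__":
    print(sharded_partition_key("totocaster", "2d931510-d99f-494a-8c67-87feb05e1594"))
    print(all_partition_keys("totocaster"))
```

The cost is on the read side: fetching one user's full timeline now means querying every suffix and merging the results, so this only pays off when write traffic on a single key is the real bottleneck.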

Also note that a Redis or DynamoDB solution will become quite expensive. You will get the lowest cost per user with Cassandra or RocksDB.

