How many bytes of memory is a tweet?

140 characters. How much memory will it take?

I am trying to calculate how many tweets my EC2 Large Mongo DB instance can contain.

+7
6 answers

Twitter uses UTF-8 encoded messages.

A UTF-8 code point can be up to four octets long, which makes the maximum message size 140 x 4 = 560 8-bit bytes.

This, of course, covers only the raw message; it excludes storage overhead, indexing, and any other data your storage layer adds.

Edit: Twitter successfully let me post:

β„’ need to be, depending on what you need, depending on what you need, depending on what you need, depending on what you need, depending on your choice β„’ need to be, depending on what you need, depending on what you need, depending on what you need, depending on what you need, depending on your choice β„’ needs, depending on what you are, depending on whether you are

Yes, that is 140 trademark characters, each of which takes three octets in UTF-8 (420 bytes in total).
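The byte counts above can be checked with a minimal Python sketch. The helper name `utf8_size` and the sample strings are illustrative assumptions, not anything Twitter publishes, and the sketch only measures the encoded size of a 140-code-point string; how Twitter itself counts characters is a separate matter.

```python
def utf8_size(text: str) -> int:
    """Return the number of bytes needed to store text as UTF-8."""
    return len(text.encode("utf-8"))


TWEET_LIMIT = 140  # characters, as stated in the question

# Three hypothetical tweets, each exactly 140 code points long.
ascii_tweet = "a" * TWEET_LIMIT               # 1 byte per code point
trademark_tweet = "\u2122" * TWEET_LIMIT      # U+2122 TRADE MARK SIGN, 3 bytes each
four_byte_tweet = "\U0001F600" * TWEET_LIMIT  # outside the BMP, 4 bytes each

sizes = {
    "ascii": utf8_size(ascii_tweet),
    "trademark": utf8_size(trademark_tweet),
    "four_byte": utf8_size(four_byte_tweet),
}
print(sizes)
```

The three cases bracket the range: 140 bytes for plain ASCII, 420 for the trademark example, and 560 as the worst case.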

+9

Back in September, an engineer at Twitter gave a presentation that suggested about 200 bytes per tweet.

Of course, you still have to account for the overhead of your own metadata and of the database itself, but 200 bytes per record is probably a good place to start.
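As a rough sketch of how that figure turns into a capacity estimate: the function name, the 100 GiB volume size, and the 100-byte per-record overhead below are all made-up assumptions for illustration. Substitute the real size of your volume and a per-document overhead measured on your own MongoDB collection.

```python
GIB = 1024 ** 3


def estimate_tweet_capacity(storage_bytes: int,
                            bytes_per_tweet: int = 200,
                            overhead_per_record: int = 0) -> int:
    """Estimate how many tweets fit in storage_bytes.

    bytes_per_tweet defaults to the ~200 bytes quoted in the presentation;
    overhead_per_record is a placeholder for metadata and index cost.
    """
    record_size = bytes_per_tweet + overhead_per_record
    if record_size <= 0:
        raise ValueError("record size must be positive")
    return storage_bytes // record_size


# Hypothetical 100 GiB volume, with and without an assumed 100-byte overhead.
raw_capacity = estimate_tweet_capacity(100 * GIB)
padded_capacity = estimate_tweet_capacity(100 * GIB, overhead_per_record=100)
print(raw_capacity, padded_capacity)
```

Treat the result as an upper bound: indexes, padding, and free-space requirements in the database will lower the number you can actually store.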

+1

That is usually two bytes per character if you store the text as UTF-16, which would mean about 280 bytes per tweet; in UTF-8 each character takes between one and four bytes.

0

Probably 284 bytes in memory (a 4-byte length prefix + length * 2). I can't say for the inside of the database, but probably 280 if the database is UTF-8, plus a few bytes of overhead, metadata, etc.
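The 284-byte figure matches a string held as two-byte UTF-16 code units behind a 4-byte length prefix. The sketch below reproduces that arithmetic under that assumption; the layout and the name `length_prefixed_utf16` are illustrative only and do not describe any particular runtime or database.

```python
import struct


def length_prefixed_utf16(text: str) -> bytes:
    """Serialize text as a 4-byte length prefix followed by UTF-16 code units."""
    payload = text.encode("utf-16-le")  # 2 bytes per code unit, no BOM
    code_units = len(payload) // 2
    # "<I" packs an unsigned 32-bit little-endian integer (4 bytes).
    return struct.pack("<I", code_units) + payload


in_memory = length_prefixed_utf16("a" * 140)
print(len(in_memory))
```

Characters outside the Basic Multilingual Plane need two code units each in UTF-16, so 140 of them would take 4 + 560 = 564 bytes in this layout.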

0

Potentially interesting:
http://mehack.com/map-of-a-twitter-status-object
Twitter Status Object Anatomy

Also more about Twitter character encoding:
http://dev.twitter.com/pages/counting_characters

0

It is technically stored as UTF-8, and the slide deck from the Twitter engineer at http://www.slideshare.net/raffikrikorian/twitter-by-the-numbers gives a real figure for this:

140 characters, ~ 200 bytes

0
