I don't see any benefit to storing literal JSON data as a BLOB in Cassandra. At best your storage costs are identical, and the APIs are generally less convenient for BLOB types than for strings/text.
For instance, if you're using the Java API, then to store the data as a BLOB with a parameterized PreparedStatement you first need to load it all into a ByteBuffer, for example by reading the JSON data from an InputStream into a byte array and wrapping that.
Unless you're dealing with very large JSON snippets that force you to stream your data anyway, that's a fair bit of extra work just to use the BLOB type. And what would you gain from it? Essentially nothing.
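To make the extra step concrete, here is a minimal sketch of the conversion in plain Java. The class and method names (`JsonBlobExample`, `toBlob`, `fromBlob`) are made up for illustration, and the actual driver call that binds the buffer to a statement is deliberately left out, since it depends on the driver version you use.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any Cassandra driver.
class JsonBlobExample {
    // Encode the JSON string as UTF-8 and wrap the bytes in a ByteBuffer,
    // which is the Java type a blob column value is bound as.
    static ByteBuffer toBlob(String json) {
        return ByteBuffer.wrap(json.getBytes(StandardCharsets.UTF_8));
    }

    // Decode a blob read back from the table into the original JSON string.
    static String fromBlob(ByteBuffer blob) {
        // duplicate() leaves the caller's buffer position untouched.
        return StandardCharsets.UTF_8.decode(blob.duplicate()).toString();
    }
}
```

With a text column, none of this is needed: the JSON string is bound directly.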
However, I think there is some merit in the question, "Should I store JSON as text, or gzip it and store the compressed data as a BLOB?"
The answer to that comes down to how you've configured Cassandra and your table. In particular, as long as you're using Cassandra version 1.1 or later, your tables have compression enabled by default. That may be adequate, particularly if your JSON data is fairly uniform across rows.
However, Cassandra's built-in compression is applied at the table level rather than to individual rows, so you may get a better compression ratio by manually compressing your JSON data before storage, writing the compressed bytes into a ByteBuffer, and then sending the data to Cassandra as a BLOB.
So it essentially comes down to a trade-off between storage space on one side and programming convenience and CPU usage on the other. I would decide the matter as follows:
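A rough sketch of that manual compression step, using only the JDK's gzip classes. The names `GzipJsonBlob`, `compress`, and `decompress` are hypothetical, and `readAllBytes()` assumes Java 9 or later. Sending the resulting buffer to Cassandra is not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of any Cassandra driver.
class GzipJsonBlob {
    // Gzip the UTF-8 bytes of the JSON and wrap the result for a blob column.
    static ByteBuffer compress(String json) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Closing the gzip stream writes the trailer, so read `out` only afterwards.
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(json.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ByteBuffer.wrap(out.toByteArray());
    }

    // Reverse the process for a blob read back from the table.
    static String decompress(ByteBuffer blob) {
        byte[] compressed = new byte[blob.remaining()];
        blob.duplicate().get(compressed); // copy without moving the caller's position
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that gzip adds a fixed header and trailer, so very small JSON values can come out larger than they went in. Whether this beats the table's own compression is something to measure on your data.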
1. Is minimizing storage space your biggest concern?
   - If so, compress the JSON data and store the compressed bytes as a BLOB;
   - Otherwise, proceed to #2.
2. Is Cassandra's built-in compression available and enabled for your table?
   - If not (and if you can't enable compression), compress the JSON data and store the compressed bytes as a BLOB;
   - Otherwise, proceed to #3.
3. Is the data you'll be storing relatively uniform across rows?
   - The answer is probably yes for JSON data, in which case you should store the data as text and let Cassandra handle the compression;
   - Otherwise, proceed to #4.
4. Do you want efficiency or convenience?
   - Efficiency: compress the JSON data and store the compressed bytes as a BLOB.
   - Convenience: compress the JSON data, base64-encode the compressed data, and then store the base64-encoded data as text.
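For the "convenience" branch of the last question, the following is one possible sketch: gzip the JSON, then base64-encode the compressed bytes so the value can be stored in an ordinary text column. The class and method names (`Base64GzipJson`, `encode`, `decode`) are invented for this example, and `readAllBytes()` assumes Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of any Cassandra driver.
class Base64GzipJson {
    // JSON -> gzip -> base64 string, suitable for a text column.
    static String encode(String json) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(json.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(out.toByteArray());
    }

    // base64 string -> gzip bytes -> original JSON.
    static String decode(String stored) {
        byte[] compressed = Base64.getDecoder().decode(stored);
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The cost of this convenience is that base64 expands the compressed bytes by roughly a third, which is why the BLOB variant is listed as the efficient option.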
aroth