CQL3: Each row has its own schema

I want to use Cassandra in a .NET application. My goal is to store some data in a column family, but each row of data will have a different layout.

Example (very simple): I want a "Toys" column family that stores the following objects (note that, apart from the id property, they have very different properties):

Toy object 1 {"id": "1", "Name": "Car", "number_of_doors": 4, "Likes": 3}

Toy object 2 {"id": "2", "Type": "Airplane", "Flying_range": "100m"}

Toy object 3 {"id": "3", "Category": "Train", "number_of_carriages": 10}

From my initial understanding and use of the DataStax C# driver, I would have to alter the table (column family) every time, which does not suit me. I would like each row to have its own schema. The Thrift API might solve this problem, but HectorSharp seems to be almost dead.

There is a question similar to my requirement, but it does not have the answer I am looking for:

Cassandra for db schema, 10 million order tables and millions of queries per day

Am I barking up the wrong tree in expecting each row to have its own schema, or is there a way to do this using Cassandra + C#?

Thanks in advance for your answers.

+7
c# cassandra cql3
2 answers

Older versions of Cassandra were schema-less, meaning there was no definition of what a row could contain. You can now get part of what you need with a map on Cassandra 2.1:

 CREATE TABLE toys ( id text PRIMARY KEY, toy map<text, text> ) 

Insert some data ...

 INSERT INTO toys (id, toy) VALUES ( '1', {'name':'Car', 'number_of_doors':'4', 'likes':'3'});
 INSERT INTO toys (id, toy) VALUES ( '2', {'type':'Plane', 'flying_range':'100m'});
 INSERT INTO toys (id, toy) VALUES ( '3', {'category':'Train', 'number_of_carriages':'10'});

Table Contents ...

  id | toy
 ----+-------------------------------------------------------
   3 | {'category': 'Train', 'number_of_carriages': '10'}
   2 | {'flying_range': '100m', 'type': 'Plane'}
   1 | {'likes': '3', 'name': 'Car', 'number_of_doors': '4'}

Now we can create an index on the keys ...

 CREATE INDEX toy_idx ON toys (KEYS(toy)); 

... and execute queries on the map keys ...

 SELECT * FROM toys WHERE toy CONTAINS KEY 'name';

  id | toy
 ----+-------------------------------------------------------
   1 | {'likes': '3', 'name': 'Car', 'number_of_doors': '4'}

You can now update or delete entries in the map as you would with ordinary columns, without a read before the write:

 DELETE toy['name'] FROM toys WHERE id='1';
 UPDATE toys SET toy = toy + {'name': 'anewcar'} WHERE id = '1';
 SELECT * FROM toys;

  id | toy
 ----+-----------------------------------------------------------
   3 | {'category': 'Train', 'number_of_carriages': '10'}
   2 | {'flying_range': '100m', 'type': 'Plane'}
   1 | {'likes': '3', 'name': 'anewcar', 'number_of_doors': '4'}

A few limitations:

  • You cannot fetch part of the collection: even though each map entry is stored internally as a column, you can only read the whole collection.
  • You need to choose whether to create the index on the keys or on the values; both at the same time are not supported.
  • Since maps are typed, you cannot store mixed value types. In my examples, all the integers are now strings.
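
To make the second limitation concrete, here is a rough sketch of switching from a keys index to a values index on Cassandra 2.1. The index name toy_values_idx is made up for illustration, and the sketch assumes the toy_idx keys index created earlier is the only index on the column.

```cql
-- Assumption: Cassandra 2.1; toy_values_idx is a hypothetical index name.
-- Drop the keys index first, since keys and values cannot both be indexed.
DROP INDEX toy_idx;

-- Without KEYS(...), the index covers the map values.
CREATE INDEX toy_values_idx ON toys (toy);

-- CONTAINS (rather than CONTAINS KEY) matches against the values.
SELECT * FROM toys WHERE toy CONTAINS 'Train';
```

After this change, the earlier CONTAINS KEY query would no longer be served by an index, so the choice depends on which kind of lookup matters more.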

I personally consider the widespread use of this approach an anti-pattern.

HTH, Carlo

+12

To add to Carlo's answer:

  • Collection indexes are not available in older versions of Cassandra (pre 2.1). Secondary indexes also have limitations and are eventually consistent. Read up on them before relying on them.
  • Don't expect to answer queries like "give me all the toys that are cars" with this kind of model. As in most cases with Cassandra, think about how you are going to access the data (the queries) and model accordingly. Depending on the queries, it is perfectly acceptable to have several tables storing toys with different structures in order to serve different queries.
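
As a rough sketch of the "several tables" idea, a second table keyed by the attribute you want to query on might look like the following. The table name toys_by_category and its columns are assumptions made for illustration; they are not part of the question.

```cql
-- Hypothetical table: one partition per category, one row per toy.
CREATE TABLE toys_by_category (
    category text,
    id text,
    attributes map<text, text>,
    PRIMARY KEY (category, id)
);

-- The application writes each toy here as well as to the toys table.
INSERT INTO toys_by_category (category, id, attributes)
VALUES ('Train', '3', {'number_of_carriages': '10'});

-- "All the toys that are trains" becomes a read of a single partition.
SELECT * FROM toys_by_category WHERE category = 'Train';
```

The trade-off is that the application has to keep both tables in sync on every write.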
+3