"Is there a better way?"
All Cassandra data is stored in the data/ folder (check the configuration value data_file_directories in cassandra.yaml). You can also check the settings saved_caches_directory and commitlog_directory.
In the data folder you will have:
- One folder for each keyspace
- One folder for the system keyspace
- Some folders for authentication, etc.
Inside each keyspace folder you will have:
- *-Data.db files containing your real data
- *-Filter.db files
- *-Index.db files for the index
- ...
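To see which of these files a keyspace folder actually contains, something like the following Python sketch could help. It only looks at file names; the function name and the example file names in the comments are made up for illustration, and the exact naming pattern varies between Cassandra versions.

```python
from collections import defaultdict
from pathlib import Path


def group_sstable_components(keyspace_dir):
    """Map a component name (e.g. 'Data', 'Index') to the sorted .db file names found."""
    groups = defaultdict(list)
    # rglob also descends into subfolders, since some versions keep one folder per table.
    for path in Path(keyspace_dir).rglob("*.db"):
        # Assumes names shaped like <prefix>-Data.db; the component is the
        # part after the last dash, e.g. 'Data' for a hypothetical users-hc-1-Data.db.
        component = path.stem.rsplit("-", 1)[-1]
        groups[component].append(path.name)
    return {name: sorted(files) for name, files in groups.items()}
```

Calling it on a keyspace folder returns a dictionary keyed by component, which makes it easy to check that every Data file has its matching Index and Filter files before copying.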
To back up the data, you can make a regular copy of these folders.
Our ops team uses crontab to schedule regular backups of Cassandra data this way.
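A copy job of this kind could look roughly like the sketch below. It is not our ops team's actual script; the function name and the `cassandra-<timestamp>` folder naming are assumptions, and you would pass in the paths from your own cassandra.yaml.

```python
import shutil
import time
from pathlib import Path


def backup_data_dir(data_dir, backup_root):
    """Copy data_dir into a new timestamped folder under backup_root and return its path."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"cassandra-{stamp}"
    # copytree creates the target (and missing parents) and fails if it already exists,
    # so an earlier backup is never overwritten.
    shutil.copytree(data_dir, target)
    return target


# Example call with hypothetical paths:
# backup_data_dir("/var/lib/cassandra/data", "/backups")
```

Saved as a script, it could be scheduled with a crontab entry such as `0 2 * * * python3 /opt/scripts/cassandra_backup.py` (hypothetical path) to run every night at 02:00.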
Note: you may sometimes miss recent data that is still in memory and not yet flushed to disk. You can force a full compaction before backing up the data files. But a full compaction can hurt you, so be careful.
Better answer: use the provided tool to take a snapshot of your database:
http://www.datastax.com/docs/1.0/operations/backup_restore
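As a rough illustration of the snapshot approach, the sketch below wraps the nodetool snapshot command from Python. The function names, the keyspace name, and the tag are placeholders, and the exact flags may differ between Cassandra versions, so check `nodetool help` on your installation before relying on it.

```python
import subprocess


def snapshot_command(keyspace, tag, host="localhost"):
    """Build the nodetool command line for a tagged snapshot of one keyspace."""
    # Assumes nodetool is on PATH and accepts -h for the host and -t for the snapshot tag.
    return ["nodetool", "-h", host, "snapshot", "-t", tag, keyspace]


def take_snapshot(keyspace, tag, host="localhost"):
    """Run nodetool; raises CalledProcessError if it exits with a non-zero status."""
    subprocess.run(snapshot_command(keyspace, tag, host), check=True)


# Example call with a hypothetical keyspace and tag:
# take_snapshot("my_keyspace", "nightly")
```

The snapshot files end up in a snapshots folder inside the data directory, and that folder is what you would then copy to your backup location.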
doanduyhai