Does it make sense to use Neo4j to index a file system?

I am working on a Java-based backup client that scans the files on a file system and populates an SQLite database with the directories and file names it finds, for backup. Would it be wise to use Neo4j instead of SQLite? Would it be more nimble and easier to use for this application? Since a file system is a tree (or a graph, if you count symbolic links), I thought a graph database might be a good fit. The SQLite schema defines only two tables, one for directories (full path plus other information) and one for files (just a name plus a foreign key referencing a row in the directory table), so it is fairly simple.

The application must index many millions of files, so the solution needs to be fast.
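For reference, here is a minimal sketch of the two-table layout described above, using plain JDBC with the xerial sqlite-jdbc driver. The table and column names are illustrative, not taken from the actual project, and the single-transaction batching shown is one common way to keep inserts fast at this scale:

    import java.sql.*;

    public class SqliteIndexSketch {
        public static void main(String[] args) throws SQLException {
            try (Connection conn = DriverManager.getConnection("jdbc:sqlite:backup-index.db")) {
                try (Statement st = conn.createStatement()) {
                    st.execute("CREATE TABLE IF NOT EXISTS directories ("
                            + "id INTEGER PRIMARY KEY, path TEXT UNIQUE NOT NULL)");
                    st.execute("CREATE TABLE IF NOT EXISTS files ("
                            + "name TEXT NOT NULL, "
                            + "dir_id INTEGER NOT NULL REFERENCES directories(id))");
                }
                // Millions of single-row autocommit inserts are slow in SQLite;
                // wrapping batches in one transaction avoids an fsync per row.
                conn.setAutoCommit(false);
                long dirId;
                try (PreparedStatement pd = conn.prepareStatement(
                        "INSERT INTO directories(path) VALUES (?)",
                        Statement.RETURN_GENERATED_KEYS)) {
                    pd.setString(1, "/home/user/docs");  // illustrative row
                    pd.executeUpdate();
                    try (ResultSet keys = pd.getGeneratedKeys()) {
                        keys.next();
                        dirId = keys.getLong(1);
                    }
                }
                try (PreparedStatement pf = conn.prepareStatement(
                        "INSERT INTO files(name, dir_id) VALUES (?, ?)")) {
                    pf.setString(1, "notes.txt");  // illustrative row
                    pf.setLong(2, dirId);
                    pf.addBatch();
                    pf.executeBatch();
                }
                conn.commit();
            }
        }
    }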

+5
3 answers

As long as the operations you run against the database mostly amount to string matching on the stored file-system paths, a relational database makes sense. The moment the data model gets more complicated and you can no longer express your queries as string matching but have to traverse the graph, a graph database will make your life a lot easier.
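To make the distinction concrete, here is a sketch of the two query styles. The SQL assumes the question's two-table schema; the Cypher assumes hypothetical Dir/File/Symlink nodes wired up with CONTAINS and POINTS_TO relationships, none of which come from the original post:

    public class QueryStyles {
        // Plain string matching on stored paths: a relational store handles
        // this well, given an index on directories.path.
        static final String FILES_UNDER_PREFIX_SQL =
            "SELECT f.name FROM files f "
          + "JOIN directories d ON f.dir_id = d.id "
          + "WHERE d.path LIKE '/home/user/%'";

        // A real traversal: everything a backup starting at /home/user would
        // reach, including symlink targets at any depth. Painful as string
        // matching, natural as a variable-length Cypher pattern.
        static final String REACHABLE_FILES_CYPHER =
            "MATCH (root:Dir {path: '/home/user'})"
          + "-[:CONTAINS|POINTS_TO*]->(f:File) "
          + "RETURN DISTINCT f.name";
    }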

+3

As I understand it, one of the earliest applications of Neo4j was doing exactly this, as part of the CMS system that Neo4j came out of.

Lucene, which backs Neo4j's indexing, should cover the file-name lookups.
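As an illustration only (the label and index names are made up, and the indexing API varies by Neo4j version; recent releases use native property indexes rather than the old Lucene-backed ones), a file-name lookup through the official Neo4j Java driver could look like:

    import org.neo4j.driver.*;

    public class NameLookupSketch {
        public static void main(String[] args) {
            // Connection details are placeholders.
            try (Driver driver = GraphDatabase.driver(
                    "bolt://localhost:7687", AuthTokens.basic("neo4j", "secret"));
                 Session session = driver.session()) {
                // Neo4j 4.x+ syntax; a property index makes the lookup below fast.
                session.run("CREATE INDEX file_name IF NOT EXISTS "
                          + "FOR (f:File) ON (f.name)");
                Result result = session.run(
                    "MATCH (f:File {name: $name}) RETURN f.name AS name",
                    Values.parameters("name", "report.pdf"));
                while (result.hasNext()) {
                    System.out.println(result.next().get("name").asString());
                }
            }
        }
    }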

+3

Hard to say in general; it depends on what you need to query.

Some considerations:

sqlite:

  • if all you store is a mirror of the fs tree, the simple sqlite layout already fits the data (directories, files)
  • lookups by full path through an index are effectively O(1); you don't need neo4j for that (see the sketch after this list)
  • is sqlite actually too slow for your workload?
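For instance, assuming the question's schema (a sketch; the index and path names are illustrative), an indexed lookup by full path stays fast however deep the tree gets:

    import java.sql.*;

    public class PathLookupSketch {
        public static void main(String[] args) throws SQLException {
            try (Connection conn = DriverManager.getConnection("jdbc:sqlite:backup-index.db");
                 Statement st = conn.createStatement()) {
                // A unique index makes lookups by full path effectively
                // constant time in practice, regardless of tree depth.
                st.execute("CREATE UNIQUE INDEX IF NOT EXISTS idx_dir_path "
                         + "ON directories(path)");
                try (PreparedStatement ps = conn.prepareStatement(
                        "SELECT id FROM directories WHERE path = ?")) {
                    ps.setString(1, "/home/user/docs");
                    try (ResultSet rs = ps.executeQuery()) {
                        if (rs.next()) {
                            System.out.println("dir id = " + rs.getLong(1));
                        }
                    }
                }
            }
        }
    }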

neo4j:

  • variable-depth traversals (for example, walking an entire subtree) are natural to express in cypher
  • the data model is likely to be more complex than 2 tables: a node for each kind of object, then dir-in-dir relationships, file-in-dir relationships, symlink relationships (see the sketch below)
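A sketch of what that richer model could look like, issued as Cypher through the Java driver; the labels and relationship types here are invented for illustration:

    import org.neo4j.driver.*;

    public class FsGraphSketch {
        public static void main(String[] args) {
            try (Driver driver = GraphDatabase.driver(
                    "bolt://localhost:7687", AuthTokens.basic("neo4j", "secret"));
                 Session session = driver.session()) {
                // One node per file-system object, one relationship type per
                // kind of link; all names here are made up for illustration.
                session.run(
                    "MERGE (root:Dir {path: '/home'}) "
                  + "MERGE (sub:Dir {path: '/home/user'}) "
                  + "MERGE (f:File {name: 'notes.txt'}) "
                  + "MERGE (link:Symlink {path: '/home/latest'}) "
                  + "MERGE (root)-[:CONTAINS]->(sub) "
                  + "MERGE (sub)-[:CONTAINS]->(f) "
                  + "MERGE (link)-[:POINTS_TO]->(sub)");
            }
        }
    }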

Cheers, hj

0
