This is really more of a Lucene question, but it comes up in the context of a Neo4j database.
I have a database divided into roughly 50 node types (the equivalent of “collections” or “tables” in other kinds of databases). Each type has a subset of properties that need to be indexed; some property names are shared between types and some are not.
When searching, I always want to find nodes of one specific type, never across all nodes.
I can see three ways of organizing this:
1. One index per type, with properties mapping naturally to index fields: index 'foo', 'id'='1234'.
2. One global index, with each field mapping to a property name. To distinguish between types, either include the type as part of the value ('id'='foo:1234'), or check the nodes after they are returned (I expect duplicates to be very rare).
3. One global index, with the type as part of the field name: 'foo.id'='1234'.
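To make the three layouts concrete, here is a minimal sketch in plain Python that only builds the (index name, field, value) triple each option would store for the same property. It does not call Neo4j or Lucene; the function names and the index name "global" are hypothetical and chosen purely for illustration.

```python
# Illustrative only: no Neo4j or Lucene calls are made here.
# Each function returns the (index_name, field, value) triple that the
# corresponding layout would use for one indexed property of one node.

GLOBAL_INDEX = "global"  # hypothetical name for the single shared index


def per_type_index(node_type, prop, value):
    """Option 1: one index per node type; the property name is the field."""
    return (node_type, prop, value)


def type_in_value(node_type, prop, value):
    """Option 2: one global index; the type is prefixed to the value."""
    return (GLOBAL_INDEX, prop, f"{node_type}:{value}")


def type_in_field(node_type, prop, value):
    """Option 3: one global index; the type is prefixed to the field name."""
    return (GLOBAL_INDEX, f"{node_type}.{prop}", value)


if __name__ == "__main__":
    for layout in (per_type_index, type_in_value, type_in_field):
        print(layout.__name__, layout("foo", "id", "1234"))
```

In the second layout the lookup has to rebuild the same prefixed value ('foo:1234'), while in the third it has to rebuild the prefixed field name ('foo.id').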
Once created, the database is read-only.
Does any of these have an advantage in terms of convenience, size, caching efficiency, or performance?
As I understand it, with the first option neo4j will create a separate physical index for each type, which seems suboptimal. With the third, most Lucene documents end up having only a small subset of the fields, and I am not sure whether that has any effect.