I do not think the space needed to store billions of triples is realistically any worse than the space needed to store billions of rows in an SQL database.
A common approach, used by most systems whether they are native stores or SQL-backed, is to assign an identifier to each node and store each triple as just three node IDs. Given a good scheme for generating node identifiers and an efficient index between node IDs and node values, you can fairly easily build stores that scale massively.
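To make the node ID idea concrete, here is a minimal in-memory sketch. The class name `TripleStore` and its methods are made up for illustration and do not correspond to any particular store's API; a real system would keep the dictionary and the triples in on-disk indexes rather than Python containers.

```python
class TripleStore:
    """Toy dictionary-encoded triple store (illustrative only)."""

    def __init__(self):
        self._id_of = {}       # term -> node ID
        self._term_of = []     # node ID -> term (the ID is the list index)
        self._triples = set()  # each triple is stored as three integer IDs

    def _intern(self, term):
        # Reuse the existing ID if the term has been seen before,
        # otherwise hand out the next free integer.
        node_id = self._id_of.get(term)
        if node_id is None:
            node_id = len(self._term_of)
            self._id_of[term] = node_id
            self._term_of.append(term)
        return node_id

    def add(self, s, p, o):
        self._triples.add((self._intern(s), self._intern(p), self._intern(o)))

    def triples(self):
        # Map the IDs back to terms only when results are read out.
        for s, p, o in self._triples:
            yield self._term_of[s], self._term_of[p], self._term_of[o]


store = TripleStore()
store.add("http://example.org/alice", "http://example.org/knows", "http://example.org/bob")
store.add("http://example.org/alice", "http://example.org/knows", "http://example.org/carol")
```

However long the URIs are, each one is stored once in the dictionary, and every triple that mentions it costs only three integers.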
As an additional optimization, some stores generate node identifiers in such a way that simple value types (for example integers, booleans, dates, etc.) have their value encoded directly in the node identifier, so there is no need to look up the value from the identifier (or the identifier from the value when inserting such data).
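A rough sketch of how such inlining might work, assuming 64-bit node IDs. The bit layout here (top bit as an "inline" flag, three type bits, 60 payload bits) is an assumption chosen for illustration, not the layout of any specific store, and only integers and booleans are handled.

```python
INLINE_FLAG = 1 << 63            # set when the ID carries the value itself
TYPE_SHIFT = 60                  # bits 60-62 hold a small type tag
TYPE_INT = 0
TYPE_BOOL = 1
PAYLOAD_MASK = (1 << 60) - 1     # bits 0-59 hold the value


def encode_inline(value):
    """Return an inline node ID, or None if the value needs a dictionary entry."""
    # Check bool first: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return INLINE_FLAG | (TYPE_BOOL << TYPE_SHIFT) | int(value)
    if isinstance(value, int) and -(1 << 59) <= value < (1 << 59):
        # Store the integer as 60-bit two's complement.
        return INLINE_FLAG | (TYPE_INT << TYPE_SHIFT) | (value & PAYLOAD_MASK)
    return None


def decode_inline(node_id):
    """Recover the value from an inline node ID without any lookup."""
    if not node_id & INLINE_FLAG:
        raise ValueError("not an inline ID; look it up in the dictionary")
    type_tag = (node_id >> TYPE_SHIFT) & 0b111
    payload = node_id & PAYLOAD_MASK
    if type_tag == TYPE_BOOL:
        return bool(payload)
    if type_tag == TYPE_INT:
        # Sign-extend the 60-bit payload.
        if payload >= 1 << 59:
            payload -= 1 << 60
        return payload
    raise ValueError("unknown inline type tag")
```

Values that do not fit (strings, URIs, very large integers) make `encode_inline` return `None`, and the store would fall back to an ordinary dictionary-assigned ID for them.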
Robv