I think there is no right answer, just different mental models. You may find mine helpful, troublesome, or both, depending on your programming background.

I picture the datastore as a single huge distributed collection of key-value pairs, holding all entity data of every kind, in every namespace, for all GAE applications of all users. One bucket of that collection is called an entity group. It has a root key that (under the hood) consists of your appID, namespace, kind, and entity ID or name. An entity group contains one or more entities whose keys extend the root key. An entity that belongs to the root key itself may or may not exist. Operations within one entity group are atomic (transactional). An entity is a simple map-like data structure.

The 2 built-in indexes (ascending and descending) are again 2 giant sorted collections of index entries. Each index entry consists of appID, namespace, kind, property name, property type, property value, entity key, in that order. Every (automatically) indexed value of every property of every entity creates 2 such index entries. There is another index that contains only entity keys.

Custom indexes, however, go into another sorted collection whose entries contain appID, namespace, index type, combined index value, entity key. This is the only part of the entire datastore that uses metadata: it stores an index definition that tells the store how the combined index value is generated from the entity.

This is the picture that is burned into my mind, and from it I know how to keep the datastore happy.
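To make the picture concrete, here is a minimal Python sketch of that mental model. It is a toy, not the real datastore API: `make_key`, `entity_group`, `index_entries`, `query_equal`, and the sample data are all hypothetical names invented for illustration. Treating the descending index as the ascending one reversed is also a simplifying assumption.

```python
from bisect import bisect_left

APP, NS = "my-app", ""  # hypothetical appID and (default) namespace

def make_key(*path):
    """Build a key: appID, namespace, then (kind, name) pairs from the root down."""
    return (APP, NS) + tuple(path)

def entity_group(key):
    """The root key (appID, namespace, first path element) identifies the group."""
    return key[:3]

def index_entries(key, entity):
    """One ascending entry per indexed property value, fields in the order described."""
    kind = key[-1][0]
    entries = []
    for prop, value in entity.items():
        values = value if isinstance(value, list) else [value]
        for v in values:
            entries.append((APP, NS, kind, prop, type(v).__name__, v, key))
    return entries

def query_equal(index, kind, prop, value):
    """An equality query is a contiguous scan over the sorted index."""
    prefix = (APP, NS, kind, prop, type(value).__name__, value)
    start = bisect_left(index, prefix)
    keys = []
    for entry in index[start:]:
        if entry[:6] != prefix:
            break  # left the matching run, so the scan is done
        keys.append(entry[6])
    return keys

# The store itself: one big map from key to a map-like entity.
store = {
    make_key(("Person", "alice")): {"age": 30, "city": "Oslo"},
    make_key(("Person", "bob")): {"age": 30, "city": "Rome"},
    make_key(("Person", "alice"), ("Pet", "rex")): {"age": 4},
}

# Built-in indexes: every indexed value yields one entry in each sorted collection.
ascending = sorted(e for k, ent in store.items() for e in index_entries(k, ent))
descending = ascending[::-1]  # simplification: same entries in reverse order
```

With this toy store, `query_equal(ascending, "Person", "age", 30)` walks only the adjacent entries for that value, which is why I think of queries as scans over a sorted collection and not as searches through entities.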