Say we have schools with some data, including a name and a list of students, and students with some data, including the courses they are enrolled in and a reference to their school. On the client:
- I would like to show a screen that displays information about a school, including a list of all its students by name.
- I would like to show a screen that displays information about a student, including the name of their school and the names of the courses they take.
- I would like to cache this information so that I can show the same screen again without waiting for a new fetch. I should be able to go from a school to a student and back to the school without fetching the school again.
- I would like to show each screen with only one fetch. The transition from the school page to the student page can take a separate fetch, but I should be able to show the school with its complete list of student names in one fetch.
- I would like to avoid duplicating data, so that if the name of the school changes, a single fetch that updates the school results in the correct name being displayed both on the school page and on the student pages.
Is there a good way to do all this, or should some of the restrictions be lifted?
The first approach would be to have an API that does something like this:
    GET /school/1
    {
      id: 1,
      name: "Jefferson High",
      students: [
        { id: 1, name: "Joel Kim" },
        { id: 2, name: "Chris Green" },
        ...
      ]
    }

    GET /student/1
    {
      id: 1,
      name: "Joel Kim",
      school: { id: 1, name: "Jefferson High" },
      courses: [
        { id: 3, name: "Algebra 1" },
        { id: 5, name: "World History" },
        ...
      ]
    }
The advantage of this approach is that each screen needs only one fetch. On the client side, we could normalize schools and students so that they refer to each other by id, and then store the objects in separate data stores. However, the student object nested inside a school is not a complete object: it does not include the nested courses or a reference back to the school. Similarly, the school object nested inside a student does not have the list of all its students. Storing partial representations of objects in the data stores would require complex logic on the client side.
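To make the normalization idea concrete, here is a minimal TypeScript sketch of what I have in mind. The store layout and the helper names (`createStores`, `upsert`, `normalizeSchool`, `normalizeStudent`, `hasStudentDetails`) are assumptions made up for this example, not part of any real library; only the response shapes come from the API above.

```typescript
// Response shapes taken from the example API above.
interface NamedRef { id: number; name: string }
interface SchoolResponse extends NamedRef { students: NamedRef[] }
interface StudentResponse extends NamedRef { school: NamedRef; courses: NamedRef[] }

// Normalized records (hypothetical layout): nested objects become ids.
// Optional fields are missing when only a partial representation was seen.
interface SchoolRecord extends NamedRef { studentIds?: number[] }
interface StudentRecord extends NamedRef { schoolId?: number; courseIds?: number[] }

interface Stores {
  schools: Map<number, SchoolRecord>;
  students: Map<number, StudentRecord>;
  courses: Map<number, NamedRef>;
}

function createStores(): Stores {
  return { schools: new Map(), students: new Map(), courses: new Map() };
}

// Merge the new fields over whatever is already cached under the same id.
function upsert<T extends { id: number }>(store: Map<number, T>, record: T): void {
  const existing = store.get(record.id);
  store.set(record.id, existing ? { ...existing, ...record } : record);
}

function normalizeSchool(stores: Stores, res: SchoolResponse): void {
  upsert(stores.schools, {
    id: res.id,
    name: res.name,
    studentIds: res.students.map((s) => s.id),
  });
  for (const s of res.students) {
    // Partial student: the school response carries no course list.
    upsert(stores.students, { id: s.id, name: s.name, schoolId: res.id });
  }
}

function normalizeStudent(stores: Stores, res: StudentResponse): void {
  // Partial school: name only, so an already cached studentIds list is preserved.
  upsert(stores.schools, { id: res.school.id, name: res.school.name });
  for (const c of res.courses) {
    upsert(stores.courses, { id: c.id, name: c.name });
  }
  upsert(stores.students, {
    id: res.id,
    name: res.name,
    schoolId: res.school.id,
    courseIds: res.courses.map((c) => c.id),
  });
}

// The extra client logic: a cached student may still lack its course list.
function hasStudentDetails(student: StudentRecord | undefined): boolean {
  return student !== undefined && student.courseIds !== undefined;
}
```

With this layout a renamed school seen in a student response updates the single shared school record, but every screen has to check whether the record it needs is complete (for example with `hasStudentDetails`) before it can skip a fetch. That check is the kind of extra logic I am worried about.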
Instead of normalizing these objects, we could store schools and students together with their nested partial objects. However, this means duplicating data: every student at Jefferson High would carry a copy of the school's name. If the school's name changed just before we fetched a particular student, we would display the correct school name for that student but the old name everywhere else, including on the school information page.
Another approach would be to design an API to simply return the identifiers of nested objects:
    GET /school/1
    { id: 1, name: "Jefferson High", students: [1, 2] }

    GET /student/1
    { id: 1, name: "Joel Kim", school: 1, courses: [3, 5] }
We would always have "complete" representations of objects, with all of their references, so it is quite easy to store this information in client-side data stores. However, displaying each screen would take several fetches. To show information about a student, we would need to fetch the student, and then fetch their school as well as each of their courses.
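As a rough sketch of what the client would have to do with the id-only API, assuming a hypothetical `/course/:id` endpoint (not shown above) and an injected `fetcher` function that stands in for the real HTTP call:

```typescript
interface School { id: number; name: string; students: number[] }
interface Student { id: number; name: string; school: number; courses: number[] }
interface Course { id: number; name: string }

// Stand-in for the real HTTP layer; an assumption made for this example.
type Fetcher = (path: string) => Promise<unknown>;

// Caches one promise per path, so each object is fetched and stored once.
class EntityCache {
  private fetcher: Fetcher;
  private pending = new Map<string, Promise<unknown>>();
  fetchCount = 0;

  constructor(fetcher: Fetcher) {
    this.fetcher = fetcher;
  }

  get<T>(path: string): Promise<T> {
    let p = this.pending.get(path);
    if (!p) {
      this.fetchCount++;
      p = this.fetcher(path);
      this.pending.set(path, p);
      // Drop failed requests so that a later call can retry.
      p.catch(() => this.pending.delete(path));
    }
    return p as Promise<T>;
  }
}

interface StudentScreen { name: string; schoolName: string; courseNames: string[] }

// The student screen needs the student first, then the school and every
// course: 1 + 1 + N requests on a cold cache.
async function loadStudentScreen(cache: EntityCache, id: number): Promise<StudentScreen> {
  const student = await cache.get<Student>(`/student/${id}`);
  const [school, courses] = await Promise.all([
    cache.get<School>(`/school/${student.school}`),
    Promise.all(student.courses.map((c) => cache.get<Course>(`/course/${c}`))),
  ]);
  return {
    name: student.name,
    schoolName: school.name,
    courseNames: courses.map((c) => c.name),
  };
}
```

Every object lives in the cache exactly once, so the duplication problem goes away, but a cold student screen for Joel Kim costs four requests in two round trips instead of one.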
Is there a more reasonable approach that would let us cache only one copy of each object while avoiding multiple fetches to display these basic screens?