Database schema question

I am designing a data model for a local city page, or rather working out the requirements for it.

So, 4 tables: Country, State, City, Neighborhood.

Real-life relationships: a country has multiple states, which have multiple cities, which have multiple neighborhoods.

In the data model: do we connect these with foreign keys in the same way, each table to its parent, or do we connect each table with every other one? That is, would every table carry a country ID, state ID, city ID and neighborhood ID, so that each is linked to each? Otherwise, to reach a neighborhood from a country, do we need to join the two other tables in between?

There are more tables I need to maintain, for IP addresses of cities, latitude/longitude, etc.

1 answer

The closest thing to an industry standard is this: each child table is linked by a foreign key to its immediate parent:

    create table country
        ( country_id number not null
        , country_name varchar2(30)
        , constraint country_pk primary key (country_id)
        )
    /
    create table state
        ( state_id number not null
        , state_name varchar2(30)
        , country_id number not null
        , constraint state_pk primary key (state_id)
        , constraint state_country_fk foreign key (country_id)
              references country(country_id)
        )
    /
    create table city
        ( city_id number not null
        , city_name varchar2(30)
        , state_id number not null
        , constraint city_pk primary key (city_id)
        , constraint city_state_fk foreign key (state_id)
              references state(state_id)
        )
    /
    create table neighbourhood
        ( neighbourhood_id number not null
        , neighbourhood_name varchar2(30)
        , city_id number not null
        , constraint neighbourhood_pk primary key (neighbourhood_id)
        , constraint neighbourhood_city_fk foreign key (city_id)
              references city(city_id)
        )
    /
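With this design, reaching neighbourhoods from a country does mean joining through the two intermediate tables. A minimal sketch of such a query against the tables above; the country name used in the filter is only a placeholder value:

```sql
-- List every neighbourhood in one country by walking the hierarchy:
-- country -> state -> city -> neighbourhood
select n.neighbourhood_id
     , n.neighbourhood_name
     , ci.city_name
     , s.state_name
from   country c
       join state s         on s.country_id = c.country_id
       join city ci         on ci.state_id  = s.state_id
       join neighbourhood n on n.city_id    = ci.city_id
where  c.country_name = 'Canada'   -- placeholder value, not from the question
order  by s.state_name, ci.city_name, n.neighbourhood_name;
```

If you already hold the country_id, you can filter on state.country_id and leave COUNTRY out of the join altogether.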

An alternative approach that has largely gone out of use is to define the primary keys of the child tables as composite keys, including the keys of the immediate parent table:

    create table state
        ( country_id number not null
        , state_id number not null
        , state_name varchar2(30)
        , constraint state_pk primary key (country_id, state_id)
        , constraint state_country_fk foreign key (country_id)
              references country(country_id)
        )
    /
    create table city
        ( country_id number not null
        , state_id number not null
        , city_id number not null
        , city_name varchar2(30)
        , constraint city_pk primary key (country_id, state_id, city_id)
        , constraint city_state_fk foreign key (country_id, state_id)
              references state(country_id, state_id)
        )
    /
    create table neighbourhood
        ( country_id number not null
        , state_id number not null
        , city_id number not null
        , neighbourhood_id number not null
        , neighbourhood_name varchar2(30)
        , constraint neighbourhood_pk primary key (country_id, state_id, city_id, neighbourhood_id)
        , constraint neighbourhood_city_fk foreign key (country_id, state_id, city_id)
              references city(country_id, state_id, city_id)
        )
    /

This approach is deprecated because in the short term it creates very unwieldy joins and in the long term it creates horrible messes when keys change. Primary keys should not change, but compounding them gives them meaning, and meaningful values do change. So when the data of the system changes (say there is a state boundary reorganisation), the changes to a whole bunch of cities have to be cascaded to the Neighbourhood table and any other children. Yuck.
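To make the cascade concrete, here is a sketch of what moving a single city to a different state involves under the composite-key design. The ID values are invented for illustration, and it assumes neighbourhood_city_fk was declared DEFERRABLE, which the DDL above does not do. Oracle has no ON UPDATE CASCADE, so the child rows have to be updated by hand:

```sql
-- Hypothetical: city 42 moves from state 7 to state 8, both in country 1.
-- Assumes neighbourhood_city_fk was created as DEFERRABLE; otherwise the
-- first update is rejected because child rows still reference the old key.
set constraint neighbourhood_city_fk deferred;

update city
set    state_id = 8
where  country_id = 1 and state_id = 7 and city_id = 42;

-- Every child row carries the old state_id in its key and must follow.
update neighbourhood
set    state_id = 8
where  country_id = 1 and state_id = 7 and city_id = 42;

commit;  -- the deferred foreign key is checked here
```

With single-column keys, the same change is one update of city.state_id and no child table is touched.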

Your proposal is a variant of this:

    create table state
        ( state_id number not null
        , state_name varchar2(30)
        , country_id number not null
        , constraint state_pk primary key (state_id)
        , constraint state_country_fk foreign key (country_id)
              references country(country_id)
        )
    /
    create table city
        ( city_id number not null
        , city_name varchar2(30)
        , country_id number not null
        , state_id number not null
        , constraint city_pk primary key (city_id)
        , constraint city_country_fk foreign key (country_id)
              references country(country_id)
        , constraint city_state_fk foreign key (state_id)
              references state(state_id)
        )
    /
    create table neighbourhood
        ( neighbourhood_id number not null
        , neighbourhood_name varchar2(30)
        , country_id number not null
        , state_id number not null
        , city_id number not null
        , constraint neighbourhood_pk primary key (neighbourhood_id)
        , constraint neighbourhood_country_fk foreign key (country_id)
              references country(country_id)
        , constraint neighbourhood_state_fk foreign key (state_id)
              references state(state_id)
        , constraint neighbourhood_city_fk foreign key (city_id)
              references city(city_id)
        )
    /

It avoids the compound keys, but you still have the cascading data problem. It also violates relational practice by introducing foreign keys for relationships that do not exist (there is no direct relationship between Neighbourhood and Country; it is implied through the intermediate links).

On the plus side, as you point out, this can be very helpful for running queries that want to return the neighborhoods for a given country. I have worked on one system where this was useful (actually it used inherited compound keys, but the principle is the same). However, that was a very specialised data warehouse, and even then the queries I ran were administrator/developer queries rather than application queries. Unless you are dealing with huge amounts of data (millions of neighborhoods), I think the performance gain from skipping a couple of joins will not be worth the overhead of managing those extra columns.
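For comparison, with the extra country_id column on NEIGHBOURHOOD, listing the neighbourhoods of a country needs no joins at all. A sketch, with an invented ID value:

```sql
-- Denormalised variant: neighbourhood carries its own country_id,
-- so there is no join through city and state.
select neighbourhood_id
     , neighbourhood_name
from   neighbourhood
where  country_id = 1;   -- hypothetical ID
```

The management overhead shows up elsewhere: the three foreign keys are checked independently, so nothing in that DDL stops a neighbourhood row from naming a country different from the one its city belongs to.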

In short, use the first approach: it is neat and standard.

Edit

"State should be optional, since not all countries have a state. Then a country would connect to a city directly."

If true, that changes things. Obviously STATE cannot be used as an identifying foreign key for CITY. So CITY must reference COUNTRY. STATE can be an optional lookup on CITY.

That said, I think most countries do have some equivalent subdivision, such as counties or departments. Even microstates such as Liechtenstein and San Marino have municipalities (Monaco has just one). Perhaps the only country that does not is Vatican City. So consider carefully whether to structure your data model to support one or two edge cases, or to kludge the data by injecting an artificial "State" for exceptions like the Holy See. Neither approach is entirely satisfactory.
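A minimal sketch of what CITY could look like with an optional state, reusing the column names from the earlier DDL. The only structural change is that country_id becomes the mandatory parent and state_id becomes nullable:

```sql
-- Sketch: CITY always belongs to a COUNTRY; STATE is an optional lookup.
create table city
    ( city_id     number not null
    , city_name   varchar2(30)
    , country_id  number not null   -- mandatory parent
    , state_id    number            -- nullable: not every country has states
    , constraint city_pk primary key (city_id)
    , constraint city_country_fk foreign key (country_id)
          references country(country_id)
    , constraint city_state_fk foreign key (state_id)
          references state(state_id)
    )
/
```

Note that these two foreign keys alone do not guarantee that a city's state lies in the city's country; that would need an additional rule.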

"all of these fields will be autocomplete fields, so not sure if that alters the table structure in any way?"

Irrelevant.

"But who knows, a few months later we may discover some cool feature that needs country to be matched with neighborhoods too."

Yes, but then again you may not. XP has a powerful principle called YAGNI - You Aren't Gonna Need It. Basically, do not do a lot of work and complicate your design for the sake of some putative future requirement that may never arrive.

And if it does arrive, the first solution would be to join NEIGHBORHOOD and COUNTRY through the intermediate tables (or table, if you are not using STATE as a reference for CITY). Only if the performance of that query is truly awful, and it obstinately resists tuning, should you consider tweaking the data model.
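If that requirement does turn up, the join can be wrapped in a view so callers never see the intermediate table. A sketch for the optional-state case, assuming CITY carries a country_id column; the view name is hypothetical:

```sql
-- Sketch: expose neighbourhood-to-country without denormalising.
-- Assumes city.country_id exists (the optional-STATE model).
create or replace view neighbourhood_country_v as
select n.neighbourhood_id
     , n.neighbourhood_name
     , ci.city_id
     , c.country_id
     , c.country_name
from   neighbourhood n
       join city ci   on ci.city_id   = n.city_id
       join country c on c.country_id = ci.country_id;
```

The base tables stay normalised, and the view definition is the only thing to change if the model is tuned later.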


Source: https://habr.com/ru/post/1312933/

