Is there any systematic step-by-step or mathematical way to build an SQL query from a given human-readable description?
Yes, there is.
It turns out that natural language expressions, logic expressions, relational algebra expressions and SQL expressions (a hybrid of the last two) correspond in a rather direct way. (What follows assumes no duplicate rows and no nulls.)
Each table (base table or query result) has an associated predicate: a natural language fill-in-the-(named-)blanks statement template parameterized by column names.
[liker] likes [liked]
A table holds every row that, using the row's column values to fill in the (named) blanks, makes a true statement, also known as a proposition.
    liker | liked
    --------------
    Bob   | Dex    /* Bob likes Dex */
    Bob   | Alice  /* Bob likes Alice */
    Alice | Carol  /* Alice likes Carol */
Each proposition from filling the predicate with the values from a row in the table is true. And each proposition from filling the predicate with the values from a row that is not in the table is false.
    /*
        Alice likes Carol
    AND NOT Alice likes Alice
    AND NOT Alice likes Bob
    AND NOT Alice likes Dex
    AND NOT Alice likes Ed
    ...
    AND Bob likes Alice
    AND Bob likes Dex
    AND NOT Bob likes Bob
    AND NOT Bob likes Carol
    AND NOT Bob likes Ed
    ...
    AND NOT Carol likes Alice
    ...
    AND NOT Dex likes Alice
    ...
    AND NOT Ed likes Alice
    ...
    */
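The correspondence between rows and true propositions can be sketched with Python's standard `sqlite3` module. This is only an illustration under assumptions: the in-memory database, the helper name `likes`, and the list `people` (standing in for the domain of possible values) are not part of the original answer.

```python
import sqlite3

# In-memory database holding the example Likes table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Likes (liker TEXT, liked TEXT, PRIMARY KEY (liker, liked))")
conn.executemany(
    "INSERT INTO Likes VALUES (?, ?)",
    [("Bob", "Dex"), ("Bob", "Alice"), ("Alice", "Carol")],
)

def likes(liker, liked):
    """True exactly when '[liker] likes [liked]' is a true proposition,
    i.e. when the row (liker, liked) is in the table (hypothetical helper)."""
    row = conn.execute(
        "SELECT 1 FROM Likes WHERE liker = ? AND liked = ?", (liker, liked)
    ).fetchone()
    return row is not None

# Assumed domain of values; every pair outside the table yields a false proposition.
people = ["Alice", "Bob", "Carol", "Dex", "Ed"]
true_propositions = [(a, b) for a in people for b in people if likes(a, b)]
print(true_propositions)
```

Of the 25 possible pairs over the assumed domain, only the three rows present in the table produce true propositions; the other 22 correspond to the `AND NOT ...` lines in the comment above.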
The DBA gives the predicate for each base table. The SQL syntax for a table declaration is a lot like the traditional logic shorthand for the natural language version of a given predicate.
    /* (liker, liked) rows where [liker] likes [liked] */
    /* (liker, liked) rows where Likes(liker, liked) */
    SELECT * FROM Likes
An SQL query (sub)expression transforms argument table values into a new table value holding the rows that make a true statement from a new predicate. The new table's predicate can be expressed in terms of the argument table predicate(s) according to the (sub)expression's relational/table operators. A query is an SQL expression whose predicate is the predicate for the table of rows we want.
Inside a SELECT statement:
• A base table named T with alias A has predicate/rows where T(A.C,...).
• R CROSS JOIN S & R INNER JOIN S have predicate/rows where the predicate of R AND the predicate of S. (These are the rows that combine a row from each argument, where a table with alias A has had its columns C,... renamed to A.C,....)
• R ON condition & R WHERE condition have predicate/rows where the predicate of R AND condition.
• SELECT DISTINCT A.C AS D,... FROM R (possibly with implicit A. and/or implicit AS D) has predicate/rows where FOR SOME [value for] the dropped columns, the predicate of R with A.C,... replaced by D,.... (Dropped columns are not parameters of the new predicate.)
• Equivalently, SELECT DISTINCT A.C AS D,... FROM R has predicate/rows where FOR SOME A.*,..., A.C = D AND ... AND the predicate of R. (This can be less compact, but looks more like the SQL.)
• (X,...) IN (R) means the predicate of R with columns C,... replaced by X,....
• So (...) IN (SELECT * FROM T) means T(...).
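The projection and join rules above can be checked on the example table. The following is a rough sketch, assuming Python's standard `sqlite3` module; the set `people` is an assumed domain of values, and the variable names are made up for illustration. Each SQL result is compared with the set of rows that make the corresponding predicate true.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Likes (liker TEXT, liked TEXT)")
rows = [("Bob", "Dex"), ("Bob", "Alice"), ("Alice", "Carol")]
conn.executemany("INSERT INTO Likes VALUES (?, ?)", rows)

likes = set(rows)  # the rows that make Likes(liker, liked) true
people = {"Alice", "Bob", "Carol", "Dex", "Ed"}  # assumed domain of values

# SELECT DISTINCT drops a column:
# (person) rows where FOR SOME liked, Likes(person, liked)
sql_projection = set(conn.execute("SELECT DISTINCT l.liker AS person FROM Likes l"))
logic_projection = {(p,) for p in people if any((p, x) in likes for x in people)}

# INNER JOIN ... ON is AND, then SELECT DISTINCT drops the join columns:
# (a, c) rows where FOR SOME b, Likes(a, b) AND Likes(b, c)
sql_join = set(conn.execute(
    "SELECT DISTINCT l1.liker AS a, l2.liked AS c "
    "FROM Likes l1 INNER JOIN Likes l2 ON l1.liked = l2.liker"
))
logic_join = {
    (a, c)
    for a in people
    for c in people
    if any((a, b) in likes and (b, c) in likes for b in people)
}

print(sql_projection == logic_projection, sql_join == logic_join)
```

In both cases the dropped columns become the FOR SOME (existentially quantified) variables of the new predicate, which is why they do not appear among its parameters.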
Natural language and shorthand for (person, liked) rows where [person] is Bob and Bob likes someone who likes [liked] but who doesn't like Ed:
    /* (person, liked) rows where
    for some value for x,
        [person] likes [x]
    and [x] likes [liked]
    and [person] = 'Bob'
    and not [x] likes 'Ed'
    */
    /* (person, liked) rows where
    FOR SOME [value for] x,
            Likes(person, x)
        AND Likes(x, liked)
        AND person = 'Bob'
        AND NOT Likes(x, 'Ed')
    */
Rewrite using the predicates of our base tables, then translate to SQL.
    /* (person, liked) rows where
    FOR SOME [values for] l1.*, l2.*,
            person = l1.liker AND liked = l2.liked
        AND Likes(l1.liker, l1.liked)
        AND Likes(l2.liker, l2.liked)
        AND l1.liked = l2.liker
        AND person = 'Bob'
        AND NOT Likes(l1.liked, 'Ed')
    */
    SELECT l1.liker AS person, l2.liked AS liked
    FROM
        /* (l1.liker, l1.liked, l2.liker, l2.liked) rows where
                Likes(l1.liker, l1.liked)
            AND Likes(l2.liker, l2.liked)
            AND l1.liked = l2.liker
            AND l1.liker = 'Bob'
            AND NOT Likes(l1.liked, 'Ed')
        */
        Likes l1 INNER JOIN Likes l2
        ON l1.liked = l2.liker
    WHERE l1.liker = 'Bob'
    AND NOT (l1.liked, 'Ed') IN (SELECT * FROM Likes)
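To see this query run, here is a small sketch using Python's standard `sqlite3` module. It rests on some assumptions: the two rows `('Dex', 'Ed')` and `('Dex', 'Frank')` are added to the example data purely to exercise the NOT condition, and the row-value form `(l1.liked, 'Ed') IN (...)` needs a SQLite build that supports row values (3.15.0 or later, as far as I know).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Likes (liker TEXT, liked TEXT)")
conn.executemany(
    "INSERT INTO Likes VALUES (?, ?)",
    [
        ("Bob", "Dex"), ("Bob", "Alice"), ("Alice", "Carol"),  # rows from the example
        ("Dex", "Ed"), ("Dex", "Frank"),  # assumed extra rows, for illustration only
    ],
)

query = """
SELECT l1.liker AS person, l2.liked AS liked
FROM Likes l1 INNER JOIN Likes l2
  ON l1.liked = l2.liker
WHERE l1.liker = 'Bob'
  AND NOT (l1.liked, 'Ed') IN (SELECT * FROM Likes)
"""
result = conn.execute(query).fetchall()
print(result)
```

Bob likes both Dex and Alice. Dex likes Ed, so nothing Dex likes (neither Ed nor Frank) is returned; Alice does not like Ed, so the only row returned is `('Bob', 'Carol')`.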
• R UNION CORRESPONDING S has predicate/rows where the predicate of R OR the predicate of S.
• R EXCEPT S has predicate/rows where the predicate of R AND NOT the predicate of S.
• VALUES (C,...) ((X,...),...) has predicate/rows where (C = X AND ...) OR ....
    /* (person) rows where
        (FOR SOME liked, Likes(person, liked))
     OR person = 'Bob'
    */
    SELECT liker AS person
    FROM Likes
    UNION
    VALUES (person) (('Bob'))
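The set operators can be tried the same way. This sketch assumes Python's standard `sqlite3` module and SQLite's own spelling of the constructs: SQLite has no CORRESPONDING keyword and its VALUES clause does not name columns, so the result column takes its name from the first SELECT. The variable names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Likes (liker TEXT, liked TEXT)")
conn.executemany(
    "INSERT INTO Likes VALUES (?, ?)",
    [("Bob", "Dex"), ("Bob", "Alice"), ("Alice", "Carol")],
)

# UNION is OR:
# (person) rows where (FOR SOME liked, Likes(person, liked)) OR person = 'Bob'
union_rows = sorted(conn.execute(
    "SELECT liker AS person FROM Likes UNION VALUES ('Bob')"
))

# EXCEPT is AND NOT:
# (person) rows where (FOR SOME liked, Likes(person, liked))
#                 AND NOT (FOR SOME liker, Likes(liker, person))
except_rows = sorted(conn.execute(
    "SELECT liker AS person FROM Likes EXCEPT SELECT liked FROM Likes"
))

print(union_rows, except_rows)
```

Here the union returns Alice and Bob, since Bob already likes someone, and the difference returns only Bob, because Alice is liked by Bob.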
So if we express our desired rows in terms of the given base table natural language statement templates that rows make true or false (to be returned or not), then we can translate to SQL queries that are nestings of logic shorthands & operators and/or table names & operators. The DBMS can then convert entirely to tables to calculate the rows that make our predicate true.
See How to get matching data from another SQL table for two different columns: Inner Join and/or Join? for another application of this to SQL. (Another self-join.)
See Relational Algebra for Banking Scenarios for more information on natural language formulations. (In the context of relational algebra.)