Can I generate verbose error messages from complex database queries?

Let me illustrate the question with a simplified example. Suppose I am building a project in Python with a PostgreSQL relational database. The database has two tables, "parent" and "child", which are linked many-to-many (N:M) through the table "parent_child". I want to safely fetch some data about a particular child belonging to a specific parent, using the following query (X, Y and Z are literals provided by the user):

SELECT child.age FROM parent, parent_child, child WHERE child.id = parent_child.child_id AND parent_child.child_id = X AND parent_child.parent_id = parent.id AND parent.id = Y AND parent.password = Z;

Say the user enters the wrong value for X, Y or Z. The query returns an empty set, which I can detect, and I can show a message telling the user that an error has occurred. The problem, of course, is that I cannot determine which value caused the failure, and therefore cannot tell the user specifically what they entered incorrectly.

The simplest solution is to split the query into several parts. First, check that parent.id exists.

 SELECT parent.id FROM parent WHERE parent.id = Y; 

Second, check that the password is correct.

 SELECT parent.id FROM parent WHERE parent.id = Y and parent.password = Z; 

Third, check that the child exists.

 SELECT child.id FROM child WHERE child.id = X; 

Fourth, check that the child belongs to the parent and return the information we need.

 SELECT child.age FROM child, parent_child WHERE parent_child.child_id = child.id AND parent_child.parent_id = Y AND parent_child.child_id = X; 

These four queries will allow us to check specific things about the information provided by the user and report specific problems as they arise. Obviously, the four queries have some additional overhead per request, and I find the four queries less readable than one. Is there any way to have the best of both worlds: one request and detailed error messages?
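For reference, here is a minimal Python sketch of the four-step approach. It is only an illustration under some assumptions: it uses the standard-library sqlite3 module with an in-memory database as a stand-in for PostgreSQL, the function name get_child_age, the helper make_demo_db, the sample rows and the message strings are all hypothetical, and X is taken to be the child id as in the step-by-step queries. With a PostgreSQL driver such as psycopg2 the placeholders would be %s rather than ?.

```python
import sqlite3


def get_child_age(conn, child_id, parent_id, password):
    """Return (age, None) on success, or (None, message) for the first failed check."""
    cur = conn.cursor()

    # 1. Does the parent exist?
    cur.execute("SELECT id FROM parent WHERE id = ?", (parent_id,))
    if cur.fetchone() is None:
        return None, "unknown parent id"

    # 2. Is the password correct for that parent?
    cur.execute(
        "SELECT id FROM parent WHERE id = ? AND password = ?",
        (parent_id, password),
    )
    if cur.fetchone() is None:
        return None, "wrong password"

    # 3. Does the child exist?
    cur.execute("SELECT id FROM child WHERE id = ?", (child_id,))
    if cur.fetchone() is None:
        return None, "unknown child id"

    # 4. Is the child linked to the parent? If so, fetch the age.
    cur.execute(
        "SELECT child.age FROM child, parent_child "
        "WHERE parent_child.child_id = child.id "
        "AND parent_child.parent_id = ? AND parent_child.child_id = ?",
        (parent_id, child_id),
    )
    row = cur.fetchone()
    if row is None:
        return None, "child does not belong to this parent"
    return row[0], None


def make_demo_db():
    """Build a tiny in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE parent (id INTEGER PRIMARY KEY, password TEXT);
        CREATE TABLE child (id INTEGER PRIMARY KEY, age INTEGER);
        CREATE TABLE parent_child (id INTEGER PRIMARY KEY, parent_id INTEGER, child_id INTEGER);
        INSERT INTO parent VALUES (1, 'pw1'), (2, 'pw2');
        INSERT INTO child VALUES (1, 7), (2, 9);
        INSERT INTO parent_child VALUES (1, 1, 1), (2, 2, 2);
    """)
    return conn


conn = make_demo_db()
print(get_child_age(conn, child_id=1, parent_id=1, password="pw1"))
print(get_child_age(conn, child_id=2, parent_id=1, password="pw1"))
```

Each check stops at the first failure, so the caller always gets exactly one message naming the first input that did not validate.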

3 answers
 SELECT p.id,
        p2.z AS pw,
        pc.parent_id,
        CASE p2.z WHEN p.pw THEN c.age END AS age
 FROM (VALUES (1)) AS p1(y)
 LEFT JOIN parent p ON p.id = p1.y
 LEFT JOIN (VALUES ('pw1')) AS p2(z) ON p2.z = p.pw
 CROSS JOIN (VALUES (1)) AS p3(x)
 LEFT JOIN child c ON c.id = p3.x
 LEFT JOIN parent_child pc ON pc.parent_id = p.id AND pc.child_id = c.id;

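A NULL in a column means that the corresponding condition failed: a NULL id means the parent does not exist, a NULL pw means the password did not match, and a NULL parent_id means the child does not exist or is not linked to that parent. (The literals 1, 'pw1' and 1 stand in for Y, Z and X.)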
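To show how the NULL columns could be turned into messages on the Python side, here is a rough sketch. It is an adaptation, not the query above verbatim: it runs on the standard-library sqlite3 module as a stand-in for PostgreSQL, so the VALUES lists with column aliases are replaced by one-row subselects with named parameters, and the password column is called password as in the question rather than pw. The function name diagnose, the sample rows and the message strings are hypothetical, and it assumes at most one parent_child row per (parent, child) pair.

```python
import sqlite3

# One-row subselects play the role of the VALUES lists p1(y), p2(z), p3(x).
DIAGNOSTIC_SQL = """
SELECT p.id,
       p2.z AS pw,
       pc.parent_id,
       CASE p2.z WHEN p.password THEN c.age END AS age
FROM (SELECT :y AS y) AS p1
LEFT JOIN parent AS p ON p.id = p1.y
LEFT JOIN (SELECT :z AS z) AS p2 ON p2.z = p.password
CROSS JOIN (SELECT :x AS x) AS p3
LEFT JOIN child AS c ON c.id = p3.x
LEFT JOIN parent_child AS pc ON pc.parent_id = p.id AND pc.child_id = c.id
"""


def diagnose(conn, x, y, z):
    """Run the single query and map NULL columns to a specific message."""
    parent_id, pw, link_parent_id, age = conn.execute(
        DIAGNOSTIC_SQL, {"x": x, "y": y, "z": z}
    ).fetchone()
    if parent_id is None:
        return None, "unknown parent id"
    if pw is None:
        return None, "wrong password"
    # age alone is not enough: it is filled in whenever the child exists and
    # the password matches, so the link column has to be checked as well.
    if link_parent_id is None:
        return None, "child does not exist or is not linked to this parent"
    return age, None


# Hypothetical sample data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY, password TEXT);
    CREATE TABLE child (id INTEGER PRIMARY KEY, age INTEGER);
    CREATE TABLE parent_child (id INTEGER PRIMARY KEY, parent_id INTEGER, child_id INTEGER);
    INSERT INTO parent VALUES (1, 'pw1'), (2, 'pw2');
    INSERT INTO child VALUES (1, 7), (2, 9);
    INSERT INTO parent_child VALUES (1, 1, 1), (2, 2, 2);
""")

print(diagnose(conn, x=1, y=1, z="pw1"))
print(diagnose(conn, x=1, y=1, z="bad"))
```

Note that the selected columns cannot tell a missing child apart from an unlinked one; selecting c.id as well would separate those two cases.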


Well, the problem here is that an empty result is not really an error: the query gives you the right answer for your criteria every time. So there really is no way to find out what went wrong without checking each condition individually.

Perhaps you could first check whether you got any rows, and only THEN run your other queries to find out why. That would reduce your overhead.
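A rough Python sketch of that idea, under a few assumptions: sqlite3 with an in-memory database stands in for PostgreSQL, X is taken to be the child id, and the names get_child_age and explain_failure, the sample rows and the message strings are hypothetical. The combined query runs first, and the diagnostic queries run only when it returns nothing.

```python
import sqlite3

COMBINED_SQL = (
    "SELECT child.age FROM parent, parent_child, child "
    "WHERE child.id = parent_child.child_id AND parent_child.child_id = ? "
    "AND parent_child.parent_id = parent.id AND parent.id = ? AND parent.password = ?"
)


def explain_failure(conn, x, y, z):
    """Only called after the combined query came back empty."""

    def found(sql, args):
        return conn.execute(sql, args).fetchone() is not None

    if not found("SELECT 1 FROM parent WHERE id = ?", (y,)):
        return "unknown parent id"
    if not found("SELECT 1 FROM parent WHERE id = ? AND password = ?", (y, z)):
        return "wrong password"
    if not found("SELECT 1 FROM child WHERE id = ?", (x,)):
        return "unknown child id"
    # Parent, password and child are all fine, so the link must be missing.
    return "child does not belong to this parent"


def get_child_age(conn, x, y, z):
    """Happy path costs one query; extra queries run only on failure."""
    row = conn.execute(COMBINED_SQL, (x, y, z)).fetchone()
    if row is not None:
        return row[0], None
    return None, explain_failure(conn, x, y, z)


# Hypothetical sample data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY, password TEXT);
    CREATE TABLE child (id INTEGER PRIMARY KEY, age INTEGER);
    CREATE TABLE parent_child (id INTEGER PRIMARY KEY, parent_id INTEGER, child_id INTEGER);
    INSERT INTO parent VALUES (1, 'pw1'), (2, 'pw2');
    INSERT INTO child VALUES (1, 7), (2, 9);
    INSERT INTO parent_child VALUES (1, 1, 1), (2, 2, 2);
""")

print(get_child_age(conn, 1, 1, "pw1"))
print(get_child_age(conn, 9, 1, "pw1"))
```

The trade-off is that a failing request costs more than in the purely sequential version, in exchange for a cheaper successful one.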


These four queries will allow us to check specific things about the information provided by the user and report specific problems as they arise.

Yes, this is the standard procedure, and it exists for a reason. Suppose you were updating rows: you would use all kinds of server resources, such as the transaction log, only to find out that the operation failed. So always check each level before attempting the next. Never lock or update anything until the checks are complete. Never attempt anything unless you are sure it will work. Your case is not an update, but the standard procedure still lets you isolate the error reliably and avoids wasting resources at later levels because of an earlier failure.

Obviously, the four queries have some additional overhead per request

I do not understand your arithmetic. Say each query against a table via its PK costs 50 units of resources if the page is not in the data cache, and 2 units if it is. Assuming PostgreSQL has a data cache and a multi-threaded engine, and your code segment runs as one continuous sequence (in a stored procedure or not):

  • first statement = 50
  • second statement (since the page is in the cache) = 2
  • third statement = 50
  • fourth statement (since the parent and child are in the cache) = 2 + 2 + 50 = 54
  • total = 156 units

  • More importantly, in the event of an error, the cost (depending on where the error occurs) is 50, 52 or 102 units

  • whereas the fourth statement run on its own costs 150 units
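The arithmetic above can be checked with a few lines of Python. The unit costs are the made-up figures from this answer, not measurements, and the variable names are only illustrative. The 150 for the standalone statement is read here as three uncached reads (parent, child and parent_child), which is an interpretation rather than something the answer spells out.

```python
UNCACHED, CACHED = 50, 2  # hypothetical cost units per PK lookup

# Cost of each statement when the four run back to back.
steps = [
    UNCACHED,                    # 1: parent not yet cached
    CACHED,                      # 2: parent page already cached
    UNCACHED,                    # 3: child not yet cached
    CACHED + CACHED + UNCACHED,  # 4: parent and child cached, parent_child not
]

total = sum(steps)

# Cumulative cost when the failure is detected at statement 1, 2 or 3.
failure_costs = [sum(steps[:k]) for k in (1, 2, 3)]

# The join run on its own, with nothing cached (assumed: three uncached reads).
standalone = 3 * UNCACHED

print(total, failure_costs, standalone)
```

Under these figures the sequential version costs 6 units more than the standalone join on success, but between 48 and 100 units less whenever one of the first three checks fails.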

I find four queries less readable than one.

Put some whitespace and comments between them if you need to improve readability. (Your code is hard for others to read; I would format it.)

One request and detailed error messages?

Well, you get detailed errors, nothing more; what you are asking for is isolating the error to a specific point in your code (or user request). If you write a stored procedure for general use and return an error code, you will need the sequence that I identified.

Any other method (and I'm sure there are complex and tricky methods) will (a) add even more overhead and (b) introduce unnecessary complexity into a simple pedestrian requirement and therefore will be difficult to maintain.

