Performance of “smart logic” queries inside PostgreSQL functions

Consider the following SQL query:

SELECT a,b,c FROM t WHERE (id1 = :p_id1 OR :p_id1 IS NULL) AND (id2 = :p_id2 OR :p_id2 IS NULL) 

Markus Winand, in his book “SQL Performance Explained”, calls this approach one of the worst performance anti-patterns of all and explains why: the database has to prepare a plan for the worst case, in which all filters are disabled.

Later, however, he also writes that in PostgreSQL this problem only occurs when a statement handle (PreparedStatement) is reused.

Suppose also that the query above is wrapped in a function, for example:

 CREATE FUNCTION func(IN p_id1 BIGINT,IN p_id2 BIGINT) ... $BODY$ BEGIN ... END; $BODY$ 

There are a few points I don't understand yet:

  • Will this problem occur when the query is wrapped in a function? (I tried to look at the execution plan of the function call, but Postgres does not show me details for the statements inside the function, even with SET auto_explain.log_nested_statements = ON.)

  • Let's say I'm working on a legacy project and cannot change the function itself, only the Java code that calls it. Would it be better to avoid a prepared statement here and build a dynamic query every time? (Assume the execution time is fairly long, up to a few seconds.) For example, an ugly approach like this:


 getSession().doWork(connection -> {
     ResultSet rs = connection.createStatement()
         .executeQuery("select * from func(" + id1 + "," + id2 + ")");
     ...
 })
1 answer

1. It depends.

If you do not use prepared statements, PostgreSQL plans the query anew each time, using the actual parameter values. This is known as a custom plan.

With prepared statements (and you're right, PL/pgSQL functions use prepared statements), it is more complicated. PostgreSQL prepares the statement (parses its text and saves the parse tree) but still replans it on every execution. Custom plans are generated for at least the first 5 executions. After that, the planner considers switching to a generic plan (that is, one that does not depend on the parameter values) if its cost is lower than the average cost of the custom plans generated so far.
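The switch from custom to generic plans can be observed in a plain psql session. The following is a minimal sketch, not the original poster's schema: the column types, the index t_id1_idx, the test data, and the statement name q are all assumptions made for illustration.

```sql
-- Hypothetical table shaped like the one in the question (types are assumed).
CREATE TABLE t (id1 bigint, id2 bigint, a int, b int, c int);
CREATE INDEX t_id1_idx ON t (id1);
INSERT INTO t
SELECT g % 1000, g % 10, g, g, g FROM generate_series(1, 100000) AS g;
ANALYZE t;

-- The "smart logic" query as a prepared statement.
PREPARE q(bigint, bigint) AS
SELECT a, b, c FROM t
WHERE (id1 = $1 OR $1 IS NULL)
  AND (id2 = $2 OR $2 IS NULL);

-- Executions 1 to 5: planned with the actual values (custom plans).
-- The plan shows constants instead of $1/$2, and conditions such as
-- "NULL IS NULL" are simplified away before planning.
EXPLAIN EXECUTE q(42, NULL);
EXPLAIN EXECUTE q(42, NULL);
EXPLAIN EXECUTE q(42, NULL);
EXPLAIN EXECUTE q(42, NULL);
EXPLAIN EXECUTE q(42, NULL);

-- Execution 6 and later: the planner may now choose a generic plan.
-- A generic plan is recognizable by $1 and $2 appearing in the conditions.
EXPLAIN EXECUTE q(42, NULL);

DEALLOCATE q;
```

Whether the sixth EXPLAIN shows a generic plan depends on the cost estimates. For this query the generic plan usually looks more expensive than the custom ones, in which case PostgreSQL keeps producing custom plans.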

Note that the cost of a plan is the planner's estimate, not the actual I/O operations or CPU cycles.

So the problem can occur, but it takes some bad luck.
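To check what actually happens inside the function, the plans of its nested statements can be logged with auto_explain. This is a sketch under some assumptions: the session is allowed to load the module, and func is the function from the question. Note that auto_explain.log_min_duration defaults to -1, which disables logging entirely, so setting only log_nested_statements produces no output.

```sql
LOAD 'auto_explain';                           -- typically requires superuser
SET auto_explain.log_min_duration = 0;         -- log the plan of every statement
SET auto_explain.log_nested_statements = on;   -- include statements run inside functions
SET client_min_messages = log;                 -- assumption: also show LOG output in this session

-- Call the function several times and compare the logged plans:
-- constants in the conditions indicate a custom plan, $1/$2 a generic one.
SELECT * FROM func(42, NULL);
```

Without the client_min_messages line, the plans should still be written to the server log.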

2. The approach you propose will not work, because it does not change the behavior of the function.

In general, using non-parameterized queries is not as ugly in PostgreSQL as it is, for example, in Oracle, because PostgreSQL has no shared plan cache. Prepared plans are stored in the private memory of each backend, so replanning does not affect other sessions.

But, as far as I know, there is currently no way to force the planner to use custom plans (except for reconnecting after 5 executions ...).
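This changed in later releases: PostgreSQL 12 added the plan_cache_mode setting, which can be set per session without touching the function. A minimal sketch, assuming PostgreSQL 12 or newer and the function func from the question:

```sql
-- PostgreSQL 12 or later only.
-- Always replan cached statements, including those inside PL/pgSQL
-- functions, using the actual parameter values.
SET plan_cache_mode = force_custom_plan;

SELECT * FROM func(42, NULL);

-- Return to the default behavior (auto).
RESET plan_cache_mode;
```

The SET statement could be issued from the Java side on the same connection before the function is called. The trade-off is that every execution then pays the planning cost again.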
