The Postgres docs say

For best optimization results, you should label your functions with the strictest volatility category that is valid for them.

However, I seem to have an example where this is not the case, and I would like to understand what is happening. (Background: I am running postgres 9.2.)
I often need to convert times, expressed as integer seconds since the Unix epoch, to dates. I wrote a function for this:
CREATE OR REPLACE FUNCTION to_datestamp(time_int double precision) RETURNS date AS $$
    SELECT date_trunc('day', to_timestamp($1))::date;
$$ LANGUAGE SQL;
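As a quick sanity check of what it returns (my example, not part of the original test; note that to_timestamp() yields timestamptz, so the resulting date depends on the session's TimeZone setting — this assumes UTC):

SELECT to_datestamp(1262304000);
-- 2010-01-01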
Let's compare its performance against otherwise identical functions whose volatility is declared IMMUTABLE and STABLE:
CREATE OR REPLACE FUNCTION to_datestamp_immutable(time_int double precision) RETURNS date AS $$
    SELECT date_trunc('day', to_timestamp($1))::date;
$$ LANGUAGE SQL IMMUTABLE;

CREATE OR REPLACE FUNCTION to_datestamp_stable(time_int double precision) RETURNS date AS $$
    SELECT date_trunc('day', to_timestamp($1))::date;
$$ LANGUAGE SQL STABLE;
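(To confirm the labels took effect — a catalog check I'm adding here, not part of the original test — the declared volatility of each variant can be read from pg_proc, where 'v' = volatile, 's' = stable, 'i' = immutable:)

SELECT proname, provolatile FROM pg_proc WHERE proname LIKE 'to_datestamp%';
-- to_datestamp             v
-- to_datestamp_immutable   i
-- to_datestamp_stable      s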
To test this, I'll create a table of 10^6 random numbers corresponding to times between 2010-01-01 and 2015-01-01:
CREATE TEMPORARY TABLE random_times AS
    SELECT 1262304000 + round(random() * 157766400) AS time_int
    FROM generate_series(1, 1000000) x;
Finally, I'll time each of the functions on this table. On my particular machine, the original takes ~6 seconds, the immutable version takes ~33 seconds, and the stable version takes ~6 seconds.
EXPLAIN ANALYZE SELECT to_datestamp(time_int) FROM random_times;
Seq Scan on random_times  (cost=0.00..20996.62 rows=946950 width=8) (actual time=0.150..5493.722 rows=1000000 loops=1)
Total runtime: 6258.827 ms

EXPLAIN ANALYZE SELECT to_datestamp_immutable(time_int) FROM random_times;
Seq Scan on random_times  (cost=0.00..250632.00 rows=946950 width=8) (actual time=0.211..32209.964 rows=1000000 loops=1)
Total runtime: 33060.918 ms

EXPLAIN ANALYZE SELECT to_datestamp_stable(time_int) FROM random_times;
Seq Scan on random_times  (cost=0.00..20996.62 rows=946950 width=8) (actual time=0.086..5295.608 rows=1000000 loops=1)
Total runtime: 6063.498 ms
What's going on here? For example, is postgres spending time caching results even though that isn't actually useful, since the arguments passed to the function are unlikely to ever be repeated?
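One way to see what the planner is doing — a diagnostic I'm adding here, on the understanding that EXPLAIN VERBOSE prints each plan node's output expressions — is to check whether the function body was inlined into the query. An inlined call shows up as the expanded expression, while a non-inlined call shows up as the function itself:

EXPLAIN VERBOSE SELECT to_datestamp(time_int) FROM random_times;
--   Output: (date_trunc('day'::text, to_timestamp(time_int)))::date

EXPLAIN VERBOSE SELECT to_datestamp_immutable(time_int) FROM random_times;
--   Output: to_datestamp_immutable(time_int)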
Thanks!
UPDATE
Thanks to Craig Ringer, this has been discussed on the pgsql-performance mailing list. Highlights:
Tom Lane says
[ shrug... ] Using IMMUTABLE to lie about the mutability of a function (in this case, date_trunc) is a bad idea. It's likely to lead to wrong answers, not to mention performance issues. In this particular case, I imagine the performance problem comes from having suppressed the option to inline the function body ... but you should be more worried about whether you aren't getting bogus answers in other cases.
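(To make the "wrong answers" point concrete — my illustration, not from the thread — the function body depends on the session's TimeZone setting, because to_timestamp() returns timestamptz. The same input can therefore produce different dates in different sessions, which is exactly what IMMUTABLE promises cannot happen:)

SET timezone = 'UTC';
SELECT to_datestamp_immutable(1262304000);  -- 2010-01-01

SET timezone = 'America/Los_Angeles';
SELECT to_datestamp_immutable(1262304000);  -- 2009-12-31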
Pavel Stehule says
If I understand correctly, the IMMUTABLE flag used here disables inlining. What you see is the overhead of SQL function evaluation. My rule: don't use volatility flags on SQL functions when possible.
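Following that advice, here is a sketch of a variant that really is immutable (my addition; the name to_datestamp_utc is made up, and it assumes the integer times are meant to be interpreted as UTC). Pinning the conversion with AT TIME ZONE 'UTC' removes the dependency on the session's TimeZone setting, so the IMMUTABLE label becomes truthful and no longer suppresses inlining:

CREATE OR REPLACE FUNCTION to_datestamp_utc(time_int double precision) RETURNS date AS $$
    -- to_timestamp() is immutable; AT TIME ZONE 'UTC' pins the conversion,
    -- so the result no longer depends on the session's TimeZone setting.
    SELECT (to_timestamp($1) AT TIME ZONE 'UTC')::date;
$$ LANGUAGE SQL IMMUTABLE;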