BUG #18831: Particular queries using gin-indexes are not interruptible, resulting in resource usage concerns. - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #18831: Particular queries using gin-indexes are not interruptible, resulting in resource usage concerns.
Msg-id 18831-e845ac44ebc5dd36@postgresql.org
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      18831
Logged by:          Niek
Email address:      niek.brasa@hitachienergy.com
PostgreSQL version: 17.4
Operating system:   Windows/Linux
Description:

While investigating why some of our queries were not interruptible (even
though statement_timeout was set), I was able to reduce the problem to
something generic: the combination of a gin index and the ?| operator.
I have tested this in PostgreSQL 13.20, 16.4 and 17.4.
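The problematic shape can be condensed to the following sketch (distilled from the full reproduction block further down; the table and index names are arbitrary):

```sql
-- A jsonb column with a GIN (jsonb_ops) index on one of its array
-- attributes, probed with ?| (exists-any) and a long key list.
CREATE TABLE t1 AS
  SELECT jsonb_build_object('a',
           jsonb_build_array(gen_random_uuid()::text,
                             gen_random_uuid()::text)) AS doc
  FROM generate_series(1, 100000);
CREATE INDEX t1_idx ON t1 USING gin ((doc->'a') jsonb_ops);
ANALYZE t1;

-- With a large enough key array, this scan runs for a long time and
-- does not respond to cancellation until it completes.
SELECT count(*)
  FROM t1
 WHERE doc->'a' ?| (SELECT array_agg(gen_random_uuid()::text)
                    FROM generate_series(1, 50000));
```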

Although performance is certainly a real issue as well, it is not the main
reason for filing this bug report. As you can imagine, it is relatively easy
to execute several of these statements concurrently and bring the server to
a halt, which is what happened in our case. Our statement_timeout is 120
seconds, and because it was not honored, our services reconnected after
their own timeout was triggered, opened another connection, and executed the
same query again.

Output for the 1000-row table:

NOTICE:  Query for  1000/     10 took      0.001004 s
NOTICE:  Query for  1000/    100 took      0.000983 s
NOTICE:  Query for  1000/   1000 took      0.029361 s
NOTICE:  Query for  1000/  10000 took      3.364018 s
NOTICE:  Query for  1000/  50000 took     84.101790 s
NOTICE:  Query for  1000/ 100000 took    373.761158 s
NOTICE:  Query for  1000/ 150000 took    846.428113 s
NOTICE:  Query for  1000/ 200000 took   1486.085538 s
NOTICE:  Query for  1000/ 300000 took   3382.042434 s
NOTICE:  Query for  1000/ 400000 took   6249.934149 s
NOTICE:  Query for  1000/ 500000 took   9731.607982 s
NOTICE:  Query for  1000/ 600000 took  14113.382917 s
NOTICE:  Query for  1000/ 700000 took  19276.934712 s
NOTICE:  Query for  1000/ 800000 took  24991.138558 s

And one run on the 1000-row table, interrupted at the 50000 point (note
that the exception is only returned after the query has finished executing):

NOTICE:  Query for  1000/  50000 took     93.362066 s, before being canceled.

Below you find the reproducible block. Let it run, and when you see that you
have to wait a long time for a result, use pg_terminate_backend or
pg_cancel_backend to stop the statement. You will notice that nothing
happens: the statement still runs to completion, and only then do you get a
query_canceled exception.
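From a second session, the stuck backend can be located and cancelled along these lines (the pid value and the query filter are illustrative):

```sql
-- Find active backends running the ?| query and how long they have run.
SELECT pid, state, now() - query_start AS runtime, query
  FROM pg_stat_activity
 WHERE state = 'active'
   AND query LIKE '%?|%';

-- Substitute the pid reported above. While the gin scan is running,
-- neither call takes effect until the scan finishes.
SELECT pg_cancel_backend(12345);     -- soft cancel (SIGINT)
SELECT pg_terminate_backend(12345);  -- terminate (SIGTERM)
```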

DO
$$
DECLARE
  _dummy int;
  _start timestamp;
  _a_t   int[] := ARRAY[1,10,100,1000,10000,100000];
  _a_l   int[] := ARRAY[10,100,1000,10000,50000,100000,150000,200000,300000,400000,500000,600000,700000,800000,900000,1000000];
  _i_l   int;
  _i_t   int;
BEGIN
    FOREACH _i_t IN ARRAY _a_t
    LOOP
        -- Create a table with _i_t rows containing a jsonb column 'doc', with
        -- one attribute 'a' holding an array of two random UUIDs as text.
        -- The gin index here is what makes the statement non-interruptible.
        DROP TABLE IF EXISTS public.t1;
        CREATE TABLE public.t1 AS (
            SELECT jsonb_build_object('a', jsonb_build_array(gen_random_uuid()::text, gen_random_uuid()::text)) doc
              FROM generate_series(1,_i_t));
        CREATE INDEX t1_idx ON public.t1 USING gin ((doc->'a') jsonb_ops);
        ANALYZE public.t1;

        FOREACH _i_l IN ARRAY _a_l
        LOOP
          BEGIN
            _start := clock_timestamp();

            -- The following query cannot be cancelled or terminated; the postgres
            -- process keeps running and can only be hard-killed, which sends the
            -- database into recovery mode. If the statement exceeds the statement
            -- timeout, it returns with a canceled-statement exception only after
            -- finishing. Although performance is horrendous with larger table
            -- contents, the fact that the query is not interruptible is the main
            -- concern raised here.
            --
            --  - Using an in-memory table makes the statement interruptible:
            --    ,t1 AS (SELECT jsonb_build_object('a', jsonb_build_array(gen_random_uuid()::text, gen_random_uuid()::text)) doc FROM generate_series(1,_i_t))
            --  - Using an actual table without the gin index makes the statement
            --    interruptible.
            --  - Using a gin index makes the statement non-interruptible.
            --
            WITH l1 AS (SELECT gen_random_uuid()::text uuid FROM generate_series(1,_i_l))
            SELECT 1 INTO _dummy
              FROM t1
             WHERE t1.doc->'a' ?| (SELECT array_agg(uuid) FROM l1);
          EXCEPTION WHEN query_canceled THEN
            RAISE NOTICE 'Query for %/% took % s, before being canceled.'
                        ,LPAD(_i_t::text,5,' ')
                        ,LPAD(_i_l::text,7,' ')
                        ,LPAD(EXTRACT(epoch FROM (clock_timestamp() - _start))::text,13,' ');
            RETURN;
          END;
          RAISE NOTICE 'Query for %/% took % s'
                      ,LPAD(_i_t::text,5,' ')
                      ,LPAD(_i_l::text,7,' ')
                      ,LPAD(EXTRACT(epoch FROM (clock_timestamp() - _start))::text,13,' ');
        END LOOP;
    END LOOP;
END;
$$;

