Re: query locks up when run concurrently - Mailing list pgsql-general

From azhwkd
Subject Re: query locks up when run concurrently
Date
Msg-id CAJ_VqixL6Ab9-VL6qpttnT1YHw9ZiH=ZmFZgEiCL6+7GvdDkWQ@mail.gmail.com
Whole thread Raw
In response to Re: query locks up when run concurrently  (Adrian Klaver <adrian.klaver@aklaver.com>)
Responses Re: query locks up when run concurrently  (Adrian Klaver <adrian.klaver@aklaver.com>)
List pgsql-general
Greetings!

Yes I had a look at the logfiles but there is not a single logfile generated when I'm reproducing this issue. 
Concerning locks I used there queries at https://wiki.postgresql.org/wiki/Lock_Monitoring and they came up empty.
The group_history table and its sub-tables do not have any foreign keys as they only contain calculated data and the source data is removed after some time.
The trigger which calls the partition function below is defined as follows:

create
    trigger group_history_trigger before insert
        on
        item_history for each row execute procedure group_history_partition_function()

Kind regards,
Sebastian

Adrian Klaver <adrian.klaver@aklaver.com> schrieb am Mi., 23. Nov. 2016 um 16:25 Uhr:
On 11/22/2016 11:41 PM, azhwkd@gmail.com wrote:
> Greetings!
>
> I'm using postgres version 9.5.5 on a ubuntu 16.04.1 server
> installation which was installed through apt-get.
>
> I have a query which if run alone usually completes in about 300ms.
> When run in my application this query constantly locks up and bogs
> down all connections of the connection pool (In the application this
> query is run up to 10 times in parallel with different parameters).
> What's really weird is that I can re-run one of the hung queries from
> the command line while it's hung and it will complete as expected
> while the hung queries continue to use 100% CPU time.

Have you looked in the Postgres logs to see if there are any relevant
messages?

Best guess is that running in parallel is creating a lock problem where
one of the queries is holding a lock that is keeping the others from
completing.

What relationships do the group_history and group_history_* tables have
to other tables?

What is the definition of the trigger that fires the trigger function below?

>
> The query in question is this:
>
> insert into group_history ("group", id, sub_category, "date", aa, ab,
> bb, ba, quantity, "hour")
> (select
> a."group",
> a.id,
> b.sub_category,
> to_timestamp($2)::date as "date",
> max(a.aa / a.quantity) as aa,
> min(a.aa / a.quantity) as ab,
> max(a.bb / a.quantity) as bb,
> min(a.bb/ a.quantity) as ba,
> sum(a.quantity) as quantity,
> extract('hour' from to_timestamp($2)) as "hour"
> from tbla a
> join tblb b on a.id = b.id
> where a."group" = $1 and b."group" = $1
> group by a."group", a.id, b.sub_category
> );
>
> When I'm running a perf on the system it looks like this while running
> the query 10 times:
>
> Samples: 4M of event 'cpu-clock', Event count (approx.): 18972107951
> Overhead Shared Object Symbol
> 17.95% postgres [.] heap_hot_search_buffer
> 5.64% postgres [.] heap_page_prune_opt
> 4.62% postgres [.] hash_search_with_hash_value
> 3.80% postgres [.] LWLockRelease
> 3.73% postgres [.] 0x00000000002f420d
> 2.50% postgres [.] _bt_checkkeys
> 2.48% postgres [.] hash_any
> 2.45% postgres [.] 0x00000000002f41e7
> 2.10% postgres [.] slot_getattr
> 1.80% postgres [.] ResourceOwnerForgetBuffer
> 1.58% postgres [.] LWLockAcquire
> 1.58% postgres [.] ReadBufferExtended
> 1.54% postgres [.] index_fetch_heap
> 1.47% postgres [.] MemoryContextReset
> 1.43% postgres [.] btgettuple
> 1.38% postgres [.] 0x00000000002d710c
> 1.36% postgres [.] 0x00000000002d70a5
> 1.35% postgres [.] ExecQual
>
> Explain (Analyze, Verbose) Output
>
> QUERY PLAN
> ----------------------------------------------------------------------------
> -------------------------------------------------------------------------------------------------------------------------------------------------
> Insert on public.group_history (cost=10254.36..10315.16 rows=2432
> width=62) (actual time=1833.967..1833.967 rows=0 loops=1)
> -> Subquery Scan on "*SELECT*" (cost=10254.36..10315.16 rows=2432
> width=62) (actual time=353.880..376.490 rows=6139 loops=1)
> Output: "*SELECT*"."group", "*SELECT*".id,
> "*SELECT*".sub_category, "*SELECT*"."when", "*SELECT*".aa,
> "*SELECT*".ab, "*SELECT*".bb, "*SELECT*".ba, "*SELECT*".quantity,
> "*SELECT*"."hour"
> -> HashAggregate (cost=10254.36..10278.68 rows=2432 width=28)
> (actual time=353.871..367.144 rows=6139 loops=1)
> Output: a."group", a.id, b.sub_category, '2016-11-20'::date,
> max((a.aa / a.quantity)), min((a.aa / a.quantity)), max((a.bb /
> a.quantity)), min((a.bb / a.quantity)), sum(a.quantity), '21'::double
> precision
> Group Key: a."group", a.id, b.sub_category
> -> Hash Join (cost=5558.64..10181.40 rows=2432 width=28)
> (actual time=193.949..294.106 rows=30343 loops=1)
> Output: a."group", a.id, a.aa, a.quantity, a.bb, b.sub_category
> Hash Cond: (b.id = a.id)
> -> Bitmap Heap Scan on public.auctions_extra b
> (cost=685.19..4719.06 rows=30550 width=8) (actual time=56.678..111.038
> rows=30343 loops=1)
> Output: b.sub_category, b.id
> Recheck Cond: (b."group" = 7)
> Heap Blocks: exact=289
> -> Bitmap Index Scan on auction_extra_pk
> (cost=0.00..677.55 rows=30550 width=0) (actual time=55.966..55.966
> rows=30343 loops=1)
> Index Cond: (b."group" = 7)
> -> Hash (cost=4280.62..4280.62 rows=30627 width=28)
> (actual time=137.160..137.160 rows=30343 loops=1)
> Output: a."group", a.id, a.aa, a.quantity, a.bb, a.id
> Buckets: 16384 Batches: 4 Memory Usage: 638kB
> -> Bitmap Heap Scan on public.tbla a
> (cost=689.78..4280.62 rows=30627 width=28) (actual
> time=58.530..117.064 rows=30343 loops=1)
> Output: a."group", a.id, a.aa, a.quantity,
> a.bb, a.id
> Recheck Cond: (a."group" = 7)
> Heap Blocks: exact=254
> -> Bitmap Index Scan on tbla_pk
> (cost=0.00..682.12 rows=30627 width=0) (actual time=57.801..57.801
> rows=30343 loops=1)
> Index Cond: (a."group" = 7)
> Planning time: 0.475 ms
> Trigger group_history_trigger: time=1442.561 calls=6139
> Execution time: 1834.119 ms
>
>
> group_history_trigger:
>
> CREATE OR REPLACE FUNCTION public.group_history_partition_function()
> RETURNS trigger
> LANGUAGE plpgsql
> AS $function$
> declare
> _new_date timestamptz;
> _tablename text;
> _startdate text;
> begin
> -- Takes the current inbound "when" value and determines when
> midnight is for the given date
> _new_date := date_trunc('day', new."when");
> _startdate := to_char(_new_date, 'YYYY_MM_DD');
> _tablename := 'group_history_'||_startdate;
>
> -- Insert the current record into the correct partition
> execute 'INSERT INTO public.' || quote_ident(_tablename) || ' VALUES ($1.*)
> on conflict ("group", id, sub_category, "when", "hour") do
> update set aa = excluded.aa,
> ab = excluded.ab,
> bb = excluded.bb,
> ba = excluded.ba,
> quantity = excluded.quantity' using new;
> return null;
> end
> $function$
>
> Has anyone experienced this before or even has a solution for this?
> I'd be very grateful.
>
> Kind regards
> Sebastian
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com

pgsql-general by date:

Previous
From: John R Pierce
Date:
Subject: Re: Methods to quiesce PostgreSQL DB without configuring log archival
Next
From: azhwkd
Date:
Subject: Re: query locks up when run concurrently