Re: Increasing work_mem slows down query, why? - Mailing list pgsql-performance

From Pavel Stehule
Subject Re: Increasing work_mem slows down query, why?
Date
Msg-id CAFj8pRA+teK=u9ei82WXcxAXF2o+-6NShsKt8XBo20ycUcPVCQ@mail.gmail.com
Whole thread Raw
In response to Re: Increasing work_mem slows down query, why?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Increasing work_mem slows down query, why?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance


po 30. 3. 2020 v 18:02 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> po 30. 3. 2020 v 10:12 odesílatel Silvio Moioli <moio@suse.de> napsal:
>> ->  Sort  (cost=299108.00..300335.41 rows=490964 width=79)
>>         (actual time=6475.147..6494.111 rows=462600 loops=1)
>>         Output: rhnpackagecapability_1.name,
>>         rhnpackagecapability_1.version, rhnpackagecapability_1.id
>>         Sort Key: rhnpackagecapability_1.name
>>         Sort Method: quicksort  Memory: 79862kB
>>         Buffers: shared hit=7217
>>         ->  Seq Scan on public.rhnpackagecapability rhnpackagecapability_1  (cost=0.00..252699.00 rows=490964 width=79) (actual time=0.016..59.976 rows=490964 loops=1)

>> ->  Sort  (cost=299108.00..300335.41 rows=490964
>>         width=79) (actual time=6458.988..6477.151 rows=462600 loops=1)
>>         Output: rhnpackagecapability.id,
>>         rhnpackagecapability.name, rhnpackagecapability.version
>>         Sort Key: rhnpackagecapability.name
>>         Sort Method: quicksort  Memory: 79862kB
>>         Buffers: shared hit=7217
>>         ->  Seq Scan on public.rhnpackagecapability (cost=0.00..252699.00 rows=490964 width=79) (actual time=0.012..50.467 rows=490964 loops=1)

> I did some tests and it looks so a penalization for sort long keys is not
> too high. In your case it is reason why sort is very slow (probably due
> slow locales). Then the cost of hash join and sort is similar, although in
> reality it is not true.

Yeah, the run time of the slow query seems to be almost entirely expended
in these two sort steps, while the planner doesn't think that they'll be
very expensive.  Tweaking unrelated cost settings to work around that is
not going to be helpful.  What you'd be better off trying to do is fix
the slow sorting.  Is rhnpackagecapability.name some peculiar datatype?
If it's just relatively short text strings, as one would guess from the
column name, then what you must be looking at is really slow locale-based
sorting.  What's the database's LC_COLLATE setting?  Can you get away
with switching it to C?

There is another interesting thing

               ->  Hash Join  (cost=18263.69..18347.78 rows=1 width=10) (actual time=173.223..173.750 rows=1100 loops=1)
                     Output: wanted_capability_1.ordering, rhnpackagecapability.id
                     Hash Cond: (wanted_capability_1.name = (rhnpackagecapability.name)::text)
                     Join Filter: (NOT (wanted_capability_1.version IS DISTINCT FROM (rhnpackagecapability.version)::text))
                     Buffers: shared hit=7217
                     ->  CTE Scan on wanted_capability wanted_capability_1  (cost=0.00..22.00 rows=1100 width=68) (actual time=0.000..0.070 rows=1100 loops=1)
                           Output: wanted_capability_1.ordering, wanted_capability_1.name, wanted_capability_1.version
                     ->  Hash  (cost=12126.64..12126.64 rows=490964 width=79) (actual time=172.220..172.220 rows=490964 loops=1)
                           Output: rhnpackagecapability.id, rhnpackagecapability.name, rhnpackagecapability.version
                           Buckets: 524288  Batches: 1  Memory Usage: 53922kB
                           Buffers: shared hit=7217
                           ->  Seq Scan on public.rhnpackagecapability  (cost=0.00..12126.64 rows=490964 width=79) (actual time=0.008..52.573 rows=490964 loops=1)
                                 Output: rhnpackagecapability.id, rhnpackagecapability.name, rhnpackagecapability.version
                                 Buffers: shared hit=7217

CTE scan has only 1100 rows, public.rhnpackagecapability  has 490964 rows. But planner does hash from public.rhnpackagecapability table. It cannot be very effective.

Pavel



                        regards, tom lane

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Increasing work_mem slows down query, why?
Next
From: Tom Lane
Date:
Subject: Re: Increasing work_mem slows down query, why?