Re: [SQL] Re: How to optimize a query... -- Extreme performance difference, same EXPLAIN - Mailing list pgsql-sql

From secret
Subject Re: [SQL] Re: How to optimize a query... -- Extreme performance difference, same EXPLAIN
Date
Msg-id 36E6EB54.D6624BD@kearneydev.com
In response to Re: How to optimize a query...  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-sql
Tom Lane wrote:

> >     I originally had this query with an OR clause instead of the UNION,
> > which made it amazingly, impossibly slow; with the UNION it's far faster,
> > however it's still very slow (10,000+ rows in both tables).  Is there a
> > way to speed this up?
>
> Hmm.  What does EXPLAIN give as the query plan if you use the OR
> approach?  The UNION method requires much more work than a decent OR
> plan would need; as you can see from the query plan, it has to sort and
> unique-ify the results of the two sub-selects in order to discard
> duplicate tuples.  If you don't mind seeing the same tuple twice when
> it matches on both PO fields, you could use UNION ALL rather than UNION
> to avoid the sort step.  But that's just a hack...
>
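
    (Just to make it concrete, here is what I take the UNION ALL rewrite to
look like; a rough sketch with made-up column names po_1 and po_2 standing in
for my two PO fields:

SELECT ticket_pk FROM tickets, po WHERE tickets.po_1 = po.po_id
UNION ALL
SELECT ticket_pk FROM tickets, po WHERE tickets.po_2 = po.po_id;

As you say, that skips the sort/unique step but can return the same ticket
twice when it matches on both PO fields.)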

    EXPLAIN seems to be telling a different story now...  It used to give a
bunch of sequential scans; when I used UNION it gave fewer hash joins, and I
thought fewer hash joins might be better (and they were: one version would
take hours and run out of memory, the other would take an hour and not run
out of memory)...  I have a stripped-down version of the query I need.  The
UNION version takes:
! system usage stats:
!       938.603740 elapsed 914.250000 user 1.140000 system sec
!       [914.260000 user 1.140000 sys total]
!       0/0 [0/0] filesystem blocks in/out
!       3799/13766 [3915/13873] page faults/reclaims, 654 [654] swaps
!       0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
!       0/0 [0/0] voluntary/involuntary context switches
! postgres usage stats:
!       Shared blocks:          0 read,          0 written, buffer hit rate = 100.00%
!       Local  blocks:          0 read,          0 written, buffer hit rate = 0.00%
!       Direct blocks:        688 read,        716 written

When I use OR:
! system usage stats:
!       706.158704 elapsed 698.970000 user 0.370000 system sec
!       [698.980000 user 0.370000 sys total]
!       0/0 [0/0] filesystem blocks in/out
!       1025/6810 [1062/6906] page faults/reclaims, 0 [0] swaps
!       0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
!       0/0 [0/0] voluntary/involuntary context switches
! postgres usage stats:
!       Shared blocks:        232 read,          0 written, buffer hit rate = 84.61%
!       Local  blocks:          0 read,          0 written, buffer hit rate = 0.00%
!       Direct blocks:          0 read,          0 written


>
> I think the problem might be that Postgres doesn't know how to handle
> ORed join clauses very efficiently.  Something to work on in the future.
>

    I think that may have been because I didn't know to VACUUM ANALYZE.
However, today I found a much more disturbing thing: if I simply change the
order of the SELECT list I can turn a 9.69-second query into a 605-second
query.  Below is the example:

SELECT po_id,           -- PK of po
       material.name,   -- notice material.name here
       ticket_pk        -- PK of tickets
FROM po, tickets, material
WHERE po_id = material_po
  AND po.units_id = tickets.units_id
  AND po.material_id = material.material_id;
! system usage stats:
!       315.427223 elapsed 315.010000 user 0.250000 system sec
!       [315.020000 user 0.250000 sys total]
!       0/0 [0/0] filesystem blocks in/out
!       645/6717 [682/6812] page faults/reclaims, 0 [0] swaps
!       0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
!       0/0 [0/0] voluntary/involuntary context switches
! postgres usage stats:
!       Shared blocks:        144 read,          0 written, buffer hit rate = 89.66%
!       Local  blocks:          0 read,          0 written, buffer hit rate = 0.00%
!       Direct blocks:          0 read,          0 written


SELECT po_id, ticket_pk, material.name
FROM po, tickets, material
WHERE po_id = material_po
  AND po.units_id = tickets.units_id
  AND po.material_id = material.material_id;

! system usage stats:
!       2.290408 elapsed 1.890000 user 0.250000 system sec
!       [1.900000 user 0.250000 sys total]
!       0/0 [0/0] filesystem blocks in/out
!       73/6691 [110/6786] page faults/reclaims, 0 [0] swaps
!       0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
!       0/0 [0/0] voluntary/involuntary context switches
! postgres usage stats:
!       Shared blocks:          0 read,          0 written, buffer hit rate = 100.00%
!       Local  blocks:          0 read,          0 written, buffer hit rate = 0.00%
!       Direct blocks:          0 read,          0 written

    315 seconds vs 2.3 seconds?  Is there some light you can shed on
how queries are processed under PostgreSQL?  They both have the
exact same EXPLAIN:

Hash Join  (cost=5773.47 size=24494486 width=36)
  ->  Hash Join  (cost=899.69 size=101128 width=26)
        ->  Seq Scan on po  (cost=530.38 size=10860 width=10)
        ->  Hash  (cost=0.00 size=0 width=0)
              ->  Seq Scan on material  (cost=5.47 size=105 width=16)
  ->  Hash  (cost=0.00 size=0 width=0)
        ->  Seq Scan on tickets  (cost=878.78 size=19872 width=10)

    The only GUESS I can come up with is that in the slow case it's joining
po & material and then tickets, versus po & tickets and then material in the
fast one.  The latter would be more efficient, since joining po & tickets is
going to knock out a lot of rows from po, so there is less joining against
material, which has a matching row for every po.  I have indexes on all the
columns involved.
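
    (One way I can think of to test that guess is to force the join order by
hand with a scratch table; a rough sketch, assuming material_po is the
tickets column that points back at po:

SELECT po_id, ticket_pk, po.material_id
INTO TABLE po_tickets_tmp
FROM po, tickets
WHERE po_id = material_po
  AND po.units_id = tickets.units_id;

SELECT po_id, ticket_pk, material.name
FROM po_tickets_tmp, material
WHERE po_tickets_tmp.material_id = material.material_id;

If the two-step version runs in a few seconds, that would point at the join
order rather than the indexes.)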

>
> >     I have another query that joins the result of this with 5 other
> > tables, unfortunately that one takes like 10 minutes...
>
> How long does it take to EXPLAIN that query?  Postgres' optimizer has
> some bugs that cause it to take an unreasonable amount of time to plan
> a query that joins more than a few tables.  You can tell whether it's
> the planning or execution time that's the problem by comparing EXPLAIN
> runtime to the actual query.  (The bugs are fixed in 6.5, btw.  In the
> meantime a workaround is to reduce the GEQO threshold to less than the
> number of tables in the query that's giving you trouble.)
>
>                         regards, tom lane

    The EXPLAIN takes a few seconds, maybe 10...  It's definitely a runtime
thing; one time I ran the silly thing and it sucked up 192M of RAM (64M
physical + 128M swap) and then died...  Sigh.  I'm slowly working my way back
up to it; I just bought 128M of extra RAM...  If you could shed some light on
the extreme performance difference from a simple change of order in the
SELECT list, I'd appreciate it...  I don't think PostgreSQL is using my
indexes very well, or at all... :(
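
    (On the GEQO workaround: if I'm reading the SET documentation right, that
would be something along the lines of

SET GEQO TO 'ON=4';

before running the big join, with 4 picked to be lower than the number of
tables involved.  Please correct me if I have the syntax wrong.)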

    Sorry about the length of the message; I wanted to include as much detail
as possible.

David Secret
MIS Director
Kearney Development Co., Inc.



