Re: Performance problem... - Mailing list pgsql-admin

From Marcin Giedz
Subject Re: Performance problem...
Date
Msg-id 200503161446.26640.marcin.giedz@eulerhermes.pl
Whole thread Raw
In response to Re: Performance problem...  (Scott Marlowe <smarlowe@g2switchworks.com>)
Responses Re: Performance problem...  (Scott Marlowe <smarlowe@g2switchworks.com>)
List pgsql-admin
Dnia wtorek, 15 marca 2005 18:00, Scott Marlowe napisał:

OK now I know I mys query lasts so long:

SELECT DISTINCT t1.Id, t1.IdTypNazwa, t1.IdFirma, t1.Nazwa, t1.NazwaAscii,
t1.MskNazwa, t3.Id, t3.numer, t3.MskNumer, t4.Id, t4.numer, t4.MskNumer,
t5.Id, t5.numer, t5.MskNumer, t6.Id, t6.numer, t6.MskNumer, t7.Id, t7.numer,
t7.MskNumer, t8.Id, t8.numer, t8.MskNumer, t9.Id, t9.numer, t9.MskNumer,
t10.Id, t10.IdTypFormaPrawna, t10.MskFormaPrawna, t11.slowo FROM firmy.adres
AS n INNER JOIN firmy.nazwa AS t1 ON t1.IdFirma = n.IdPodmiot AND
t1.IdTypNazwa = 153 AND t1.Historia = 0 LEFT JOIN firmy.numer AS t3 ON
t3.IdPodmiot = t1.IdFirma AND t3.IdTypNumer = 156 AND t3.Historia = 0 LEFT
JOIN firmy.numer AS t4 ON t4.IdPodmiot = t1.IdFirma AND t4.IdTypNumer = 160
AND t4.Historia = 0 LEFT JOIN firmy.numer AS t5 ON t5.IdPodmiot = t1.IdFirma
AND t5.IdTypNumer = 155 AND t5.Historia = 0 LEFT JOIN firmy.numer AS t6 ON
t6.IdPodmiot = t1.IdFirma AND t6.IdTypNumer = 627 AND t6.Historia = 0 LEFT
JOIN firmy.numer AS t7 ON t7.IdPodmiot = t1.IdFirma AND t7.IdTypNumer = 848
AND t7.Historia = 0 LEFT JOIN firmy.numer AS t8 ON t8.IdPodmiot = t1.IdFirma
AND t8.IdTypNumer = 763 AND t8.Historia = 0 LEFT JOIN firmy.numer AS t9 ON
t9.IdPodmiot = t1.IdFirma AND t9.IdTypNumer = 762 AND t9.Historia = 0 LEFT
JOIN firmy.formaprawna AS t10 ON t10.IdFirma = t1.IdFirma LEFT JOIN
slowniki.tslownik AS t11 ON t11.idslownik = t10.IdTypFormaPrawna AND
t11.idjezyktyp = 2 WHERE n.Miasto ILIKE 'warszawa%' LIMIT 25

...because of DISTINCT :D With DISTINCT it tooks about 0,5 minute to search
several records ..... without DISTINCT (of course a few doubled records
occured) but query tooks 10ms ;). On the other hand doubled records are
annoying ..... is it possible to avoid these without changing data in tables?
Maybe some other query?

BR,
Marcin

> On Tue, 2005-03-15 at 10:17, Marcin Giedz wrote:
> > Dnia wtorek, 15 marca 2005 17:08, Scott Marlowe napisał:
> > > On Tue, 2005-03-15 at 02:59, Marcin Giedz wrote:
> > > > Dnia poniedziałek, 14 marca 2005 19:32, Scott Marlowe napisał:
> > > > > On Mon, 2005-03-14 at 12:03, Marcin Giedz wrote:
> > > > > > Hello...
> > > > > >
> > > > > >
> > > > > > Our company is going to change SQL engine from MySQL to PSQL. Of
> > > > > > course some performance problems occured. Our server is Dual Xeon
> > > > > > 3.0GHz + 8GB RAM + RAID1(software - two 146GB SCSI 15k) for sql
> > > > > > data + RAID1(software - two 146GB SCSI 15k) for pg_xlog.
> > > > > > Postgres.conf parameters are as follows:
> > > > > >
> > > > > > max_connections = 150
> > > > > > shared_buffers = 50000          # min 16, at least
> > > > > > max_connections*2, 8KB each work_mem = 2048         # min 64,
> > > > > > size in KB
> > > > >
> > > > > 50,000 shared buffers may or may not be too much.  Try it at
> > > > > different sizes from 5,000 or so up to 50,000 and find the "knee".
> > > > > It's usually closer to 10,000 than 50,000, but ymmv...
> > > >
> > > > Playing with shared_buffers from 10000 to 50000 doesn't change
> > > > anything in total time for this query :( But when I change work_mem a
> > > > little higher to 10000 total runtime decreases a little about 10% but
> > > > when I change random_page_cost to 0.2 (I know that almost all papers
> > > > say it should be higher then 1.0) total runtime decreases almost 3
> > > > times and lasts about 900ms - earlier with random_page_cost=1.2 it
> > > > took 2.7s. Is it possible to have random_page_cost on this value?
> > >
> > > IF random_page_cost needs to be that low, then it's likely that the
> > > query planner is either getting bad statistics and making a poor
> > > decision, or that you've got a corner case that it just can't figure
> > > out.  What does explain analyze <yourqueryhere> say with
> > > random_page_cost set to 1.2 and 0.2?  HAve you run analyze and vacuumed
> > > full lately?
> >
> > It cann't be possible - I've run vacuum full analyze - it didn't change
> > anything ;)
>
> You might want to try adjusting these values to see if you can get the
> query planner to choose the faster plan without dropping
> random_page_cost to 0.2.  I.e. give the query planner candy and flowers,
> don't just bonk it on the head with a big stick and drag it back home...
>
> #cpu_tuple_cost = 0.01          # (same)
> #cpu_index_tuple_cost = 0.001   # (same)
> #cpu_operator_cost = 0.0025     # (same)
>
> Does explain analyze show a big difference in expected an actual rows
> returned for any of the parts of the query plan?
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if your
>       joining column's datatypes do not match

pgsql-admin by date:

Previous
From: "Wes Williams"
Date:
Subject: Re: FYI: Interview with Josh Berkus at Mad Penguin
Next
From: Scott Marlowe
Date:
Subject: Re: Performance problem...