Thread: Performance problem...

Performance problem...

From
Marcin Giedz
Date:
Hello...


Our company is going to change SQL engine from MySQL to PSQL. Of course some
performance problems occured. Our server is Dual Xeon 3.0GHz + 8GB RAM +
RAID1(software - two 146GB SCSI 15k) for sql data + RAID1(software - two
146GB SCSI 15k) for pg_xlog. Postgres.conf parameters are as follows:

max_connections = 150
shared_buffers = 50000          # min 16, at least max_connections*2, 8KB each
work_mem = 2048         # min 64, size in KB
maintenance_work_mem = 524288   # min 1024, size in KB
checkpoint_segments = 32        # in logfile segments, min 1, 16MB each
archive_command = 'cp "%p" /mnt/logs/"%f"'              # command to use to
archive a logfile segment

effective_cache_size = 655360   # typically 8KB each
random_page_cost = 1.2          # units are one sequential page fetch cost
stats_start_collector = true
stats_row_level = true

Of course our system is Debian Sarge with Shared memory size = 1GB

Here is an example:

.... I know you don't have our schemas/tables etc. but I also attached QUERY
PLAN for such query. Maybe there is something wrong with this query maybe it
should be changed? or so?

SELECT DISTINCT t1.Id, t1.IdTypNazwa, t1.IdFirma, t1.Nazwa, t1.NazwaAscii,
t1.MskNazwa, t3.Id, t3.numer, t3.MskNumer, t4.Id, t4.numer, t4.MskNumer,
t5.Id, t5.numer, t5.MskNumer, t6.Id, t6.numer, t6.MskNumer, t7.Id, t7.numer,
t7.MskNumer, t8.Id, t8.numer, t8.MskNumer, t9.Id, t9.numer, t9.MskNumer,
t10.Id, t10.IdTypFormaPrawna, t10.MskFormaPrawna, t11.slowo FROM firmy.adres
AS n INNER JOIN firmy.nazwa AS t1 ON t1.IdFirma = n.IdPodmiot AND
t1.IdTypNazwa = 153 AND t1.Historia = 0 LEFT JOIN firmy.numer AS t3 ON
t3.IdPodmiot = t1.IdFirma AND t3.IdTypNumer = 156 AND t3.Historia = 0 LEFT
JOIN firmy.numer AS t4 ON t4.IdPodmiot = t1.IdFirma AND t4.IdTypNumer = 160
AND t4.Historia = 0 LEFT JOIN firmy.numer AS t5 ON t5.IdPodmiot = t1.IdFirma
AND t5.IdTypNumer = 155 AND t5.Historia = 0 LEFT JOIN firmy.numer AS t6 ON
t6.IdPodmiot = t1.IdFirma AND t6.IdTypNumer = 627 AND t6.Historia = 0 LEFT
JOIN firmy.numer AS t7 ON t7.IdPodmiot = t1.IdFirma AND t7.IdTypNumer = 848
AND t7.Historia = 0 LEFT JOIN firmy.numer AS t8 ON t8.IdPodmiot = t1.IdFirma
AND t8.IdTypNumer = 763 AND t8.Historia = 0 LEFT JOIN firmy.numer AS t9 ON
t9.IdPodmiot = t1.IdFirma AND t9.IdTypNumer = 762 AND t9.Historia = 0 LEFT
JOIN firmy.formaprawna AS t10 ON t10.IdFirma = t1.IdFirma LEFT JOIN
slowniki.tslownik AS t11 ON t11.idslownik = t10.IdTypFormaPrawna AND
t11.idjezyktyp = 2 WHERE n.Ulica ILIKE 'pu%' AND n.IdKraj = 190 LIMIT 25

pl=# explain analyze SELECT DISTINCT t1.Id, t1.IdTypNazwa, t1.IdFirma,
t1.Nazwa, t1.NazwaAscii,
pl-# t1.MskNazwa, t3.Id, t3.numer, t3.MskNumer, t4.Id, t4.numer, t4.MskNumer,
pl-# t5.Id, t5.numer, t5.MskNumer, t6.Id, t6.numer, t6.MskNumer, t7.Id,
t7.numer,
pl-# t7.MskNumer, t8.Id, t8.numer, t8.MskNumer, t9.Id, t9.numer, t9.MskNumer,
pl-# t10.Id, t10.IdTypFormaPrawna, t10.MskFormaPrawna, t11.slowo FROM
firmy.adres
pl-# AS n INNER JOIN firmy.nazwa AS t1 ON t1.IdFirma = n.IdPodmiot AND
pl-# t1.IdTypNazwa = 153 AND t1.Historia = 0 LEFT JOIN firmy.numer AS t3 ON
pl-# t3.IdPodmiot = t1.IdFirma AND t3.IdTypNumer = 156 AND t3.Historia = 0
LEFT
pl-# JOIN firmy.numer AS t4 ON t4.IdPodmiot = t1.IdFirma AND t4.IdTypNumer =
160
pl-# AND t4.Historia = 0 LEFT JOIN firmy.numer AS t5 ON t5.IdPodmiot =
t1.IdFirma
pl-# AND t5.IdTypNumer = 155 AND t5.Historia = 0 LEFT JOIN firmy.numer AS t6
ON
pl-# t6.IdPodmiot = t1.IdFirma AND t6.IdTypNumer = 627 AND t6.Historia = 0
LEFT
pl-# JOIN firmy.numer AS t7 ON t7.IdPodmiot = t1.IdFirma AND t7.IdTypNumer =
848
pl-# AND t7.Historia = 0 LEFT JOIN firmy.numer AS t8 ON t8.IdPodmiot =
t1.IdFirma
pl-# AND t8.IdTypNumer = 763 AND t8.Historia = 0 LEFT JOIN firmy.numer AS t9
ON
pl-# t9.IdPodmiot = t1.IdFirma AND t9.IdTypNumer = 762 AND t9.Historia = 0
LEFT
pl-# JOIN firmy.formaprawna AS t10 ON t10.IdFirma = t1.IdFirma LEFT JOIN
pl-# slowniki.tslownik AS t11 ON t11.idslownik = t10.IdTypFormaPrawna AND
pl-# t11.idjezyktyp = 2 WHERE n.Ulica ILIKE 'pu%' AND n.IdKraj = 190 LIMIT 25;

                      QUERY PLAN

 Limit  (cost=84757.05..84759.05 rows=25 width=264) (actual
time=3153.752..3154.418 rows=25 loops=1)
   ->  Unique  (cost=84757.05..88861.61 rows=51307 width=264) (actual
time=3153.748..3154.391 rows=25 loops=1)
         ->  Sort  (cost=84757.05..84885.32 rows=51307 width=264) (actual
time=3153.745..3153.768 rows=44 loops=1)
               Sort Key: t1.id, t1.idtypnazwa, t1.idfirma, t1.nazwa,
t1.nazwaascii, t1.msknazwa, t3.id, t3.numer, t3.msknumer, t4.id, t4.numer,
t4.msknumer,
 t5.id, t5.numer, t5.msknumer, t6.id, t6.numer, t6.msknumer, t7.id, t7.numer,
t7.msknumer, t8.id, t8.numer, t8.msknumer, t9.id, t9.numer, t9.msknumer,
t10.i
d, t10.idtypformaprawna, t10.mskformaprawna, t11.slowo
               ->  Hash Left Join  (cost=18104.92..77085.08 rows=51307
width=264) (actual time=643.240..3131.874 rows=1128 loops=1)
                     Hash Cond: ("outer".idtypformaprawna = "inner".idslownik)
                     ->  Merge Left Join  (cost=17680.10..73498.20 rows=50457
width=227) (actual time=626.711..3100.239 rows=1128 loops=1)
                           Merge Cond: ("outer".idfirma = "inner".idfirma)
                           ->  Merge Left Join  (cost=17680.10..71408.80
rows=50457 width=215) (actual time=626.209..2930.366 rows=1128 loops=1)
                                 Merge Cond: ("outer".idfirma =
"inner".idpodmiot)
                                 ->  Merge Left Join  (cost=17101.31..70698.83
rows=50457 width=195) (actual time=623.431..2915.149 rows=1128 loops=1)
                                       Merge Cond: ("outer".idfirma =
"inner".idpodmiot)
                                       ->  Merge Left Join
(cost=15446.22..68901.47 rows=50457 width=175) (actual time=614.432..2892.178
rows=1128 loops=1)
                                             Merge Cond: ("outer".idfirma =
"inner".idpodmiot)
                                             ->  Merge Left Join
(cost=7301.66..60235.12 rows=50457 width=155) (actual time=260.861..2454.992
rows=1128 loo
ps=1)
                                                   Merge Cond:
("outer".idfirma = "inner".idpodmiot)
                                                   ->  Merge Left Join
(cost=7301.66..49786.89 rows=50457 width=135) (actual time=258.841..2054.790
rows=11
28 loops=1)
                                                         Merge Cond:
("outer".idfirma = "inner".idpodmiot)
                                                         ->  Merge Left Join
(cost=0.00..42050.02 rows=50457 width=115) (actual time=5.759..1735.173
rows=1
128 loops=1)
                                                               Merge Cond:
("outer".idfirma = "inner".idpodmiot)
                                                               ->  Merge Left
Join  (cost=0.00..31611.72 rows=50457 width=95) (actual time=4.530..1337.763
r
ows=1128 loops=1)
                                                                     Merge
Cond: ("outer".idfirma = "inner".idpodmiot)
                                                                     ->  Merge
Join  (cost=0.00..21021.26 rows=50457 width=75) (actual time=2.709..813.394 r
ows=1128 loops=1)
                                                                           Merge
Cond: ("outer".idpodmiot = "inner".idfirma)
                                                                           ->
Index Scan using firmy_adres_idpodmiot on adres n  (cost=0.00..12596.46 rows=
42837 width=4) (actual time=1.261..337.163 rows=1128 loops=1)
                                                                                 Filter:
(((ulica)::text ~~* 'pu%'::text) AND (idkraj = 190))
                                                                           ->
Index Scan using firmy_nazwa_idfirma on nazwa t1  (cost=0.00..7539.00 rows=11
0134 width=75) (actual time=0.023..392.591 rows=109085 loops=1)
                                                                                 Filter:
((idtypnazwa = 153) AND (historia = 0))
                                                                     ->  Index
Scan using firmy_numer_idpodmiot on numer t3  (cost=0.00..9869.42 rows=75337
width=24) (actual time=0.018..463.952 rows=77155 loops=1)
                                                                           Filter:
((idtypnumer = 156) AND (historia = 0))
                                                               ->  Index Scan
using firmy_numer_idpodmiot on numer t4  (cost=0.00..9869.42 rows=56067
width=
24) (actual time=0.012..335.900 rows=57050 loops=1)
                                                                     Filter:
((idtypnumer = 160) AND (historia = 0))
                                                         ->  Sort
(cost=7301.66..7375.98 rows=29728 width=24) (actual time=252.882..288.512
rows=28192 loop
s=1)
                                                               Sort Key:
t5.idpodmiot
                                                               ->  Index Scan
using firmy_numer_idtypnumer on numer t5  (cost=0.00..5092.94 rows=29728
width
=24) (actual time=0.029..66.937 rows=27904 loops=1)
                                                                     Index
Cond: ((idtypnumer = 155) AND (historia = 0))
                                                   ->  Index Scan using
firmy_numer_idpodmiot on numer t6  (cost=0.00..9869.42 rows=57326 width=24)
(actual
time=0.049..340.456 rows=59336 loops=1)
                                                         Filter: ((idtypnumer
= 627) AND (historia = 0))
                                             ->  Sort  (cost=8144.56..8239.70
rows=38056 width=24) (actual time=353.474..395.087 rows=37693 loops=1)
                                                   Sort Key: t7.idpodmiot
                                                   ->  Index Scan using
firmy_numer_idtypnumer on numer t7  (cost=0.00..5249.29 rows=38056 width=24)
(actual
 time=0.032..90.333 rows=37549 loops=1)
                                                         Index Cond:
((idtypnumer = 848) AND (historia = 0))
                                       ->  Sort  (cost=1655.10..1658.97
rows=1550 width=24) (actual time=8.884..9.971 rows=1699 loops=1)
                                             Sort Key: t8.idpodmiot
                                             ->  Index Scan using
firmy_numer_idtypnumer on numer t8  (cost=0.00..1572.96 rows=1550 width=24)
(actual time=0
.053..5.287 rows=1690 loops=1)
                                                   Index Cond: ((idtypnumer =
763) AND (historia = 0))
                                 ->  Sort  (cost=578.79..580.00 rows=485
width=24) (actual time=2.698..3.077 rows=509 loops=1)
                                       Sort Key: t9.idpodmiot
                                       ->  Index Scan using
firmy_numer_idtypnumer on numer t9  (cost=0.00..557.15 rows=485 width=24)
(actual time=0.032..1.
757 rows=508 loops=1)
                                             Index Cond: ((idtypnumer = 762)
AND (historia = 0))
                           ->  Index Scan using formaprawna_idfirma_key on
formaprawna t10  (cost=0.00..1500.26 rows=58650 width=16) (actual
time=0.013..103
.667 rows=59116 loops=1)
                     ->  Hash  (cost=415.86..415.86 rows=3583 width=41)
(actual time=16.463..16.463 rows=0 loops=1)
                           ->  Seq Scan on tslownik t11  (cost=0.00..415.86
rows=3583 width=41) (actual time=0.020..12.802 rows=3595 loops=1)
                                 Filter: (idjezyktyp = 2)
 Total runtime: 3159.199 ms
(55 rows)

pl=#


I now .... query plan is not very readable but .... as you can see only one
seq scan occured for field "ulica" in table "adres" .... on the rest Index
scan were used. I'm rather new and really don't know how to interpret and
what are they mean: Merge Left Join? How to read actual time in each row ow
query plan.... and why this query takes about 3 second to receive 25 records
when on MySQL it takes only 0.14sec.? All indexes are made - the same indexes
on mysql and postgresql. What is the cause of such big difference?

Many thanks for response....

Marcin Giedz

Re: Performance problem...

From
Scott Marlowe
Date:
On Mon, 2005-03-14 at 12:03, Marcin Giedz wrote:
> Hello...
>
>
> Our company is going to change SQL engine from MySQL to PSQL. Of course some
> performance problems occured. Our server is Dual Xeon 3.0GHz + 8GB RAM +
> RAID1(software - two 146GB SCSI 15k) for sql data + RAID1(software - two
> 146GB SCSI 15k) for pg_xlog. Postgres.conf parameters are as follows:
>
> max_connections = 150
> shared_buffers = 50000          # min 16, at least max_connections*2, 8KB each
> work_mem = 2048         # min 64, size in KB

50,000 shared buffers may or may not be too much.  Try it at different
sizes from 5,000 or so up to 50,000 and find the "knee".  It's usually
closer to 10,000 than 50,000, but ymmv...

On the other hand, for a machine with 8 gigs of ram, 2 meg of work_mem
is pretty small.  Try bumping it up to 8 or 16 megs.  You can change
this one "on the fly" for testing, so just do:

set work_mem=16384;
and then run the query again and see if that helps.  The hash aggregate
method uses sort/work mem to do it's work, and if it doesn't think it
can hold the result set in that space the planner will pick another
method, like the merge left join.

In your explain analyze output, look for gross mismatches between
estimated and actual rows.  Most of yours here look pretty good in the
areas where the data is being collected, but during the merges, the
numbers are WAY off, but i'm not sure what to do to change that.

Re: Performance problem...

From
Marcin Giedz
Date:
Dnia poniedziałek, 14 marca 2005 19:32, Scott Marlowe napisał:
> On Mon, 2005-03-14 at 12:03, Marcin Giedz wrote:
> > Hello...
> >
> >
> > Our company is going to change SQL engine from MySQL to PSQL. Of course
> > some performance problems occured. Our server is Dual Xeon 3.0GHz + 8GB
> > RAM + RAID1(software - two 146GB SCSI 15k) for sql data + RAID1(software
> > - two 146GB SCSI 15k) for pg_xlog. Postgres.conf parameters are as
> > follows:
> >
> > max_connections = 150
> > shared_buffers = 50000          # min 16, at least max_connections*2, 8KB
> > each work_mem = 2048         # min 64, size in KB
>
> 50,000 shared buffers may or may not be too much.  Try it at different
> sizes from 5,000 or so up to 50,000 and find the "knee".  It's usually
> closer to 10,000 than 50,000, but ymmv...

Playing with shared_buffers from 10000 to 50000 doesn't change anything in
total time for this query :( But when I change work_mem a little higher to
10000 total runtime decreases a little about 10% but when I change
random_page_cost to 0.2 (I know that almost all papers say it should be
higher then 1.0) total runtime decreases almost 3 times and lasts about 900ms
- earlier with random_page_cost=1.2 it took 2.7s. Is it possible to have
random_page_cost on this value?

>
> On the other hand, for a machine with 8 gigs of ram, 2 meg of work_mem
> is pretty small.  Try bumping it up to 8 or 16 megs.  You can change
> this one "on the fly" for testing, so just do:
>
> set work_mem=16384;
> and then run the query again and see if that helps.  The hash aggregate
> method uses sort/work mem to do it's work, and if it doesn't think it
> can hold the result set in that space the planner will pick another
> method, like the merge left join.
>
> In your explain analyze output, look for gross mismatches between
> estimated and actual rows.  Most of yours here look pretty good in the
> areas where the data is being collected, but during the merges, the
> numbers are WAY off, but i'm not sure what to do to change that.

Re: Performance problem...

From
Scott Marlowe
Date:
On Tue, 2005-03-15 at 02:59, Marcin Giedz wrote:
> Dnia poniedziałek, 14 marca 2005 19:32, Scott Marlowe napisał:
> > On Mon, 2005-03-14 at 12:03, Marcin Giedz wrote:
> > > Hello...
> > >
> > >
> > > Our company is going to change SQL engine from MySQL to PSQL. Of course
> > > some performance problems occured. Our server is Dual Xeon 3.0GHz + 8GB
> > > RAM + RAID1(software - two 146GB SCSI 15k) for sql data + RAID1(software
> > > - two 146GB SCSI 15k) for pg_xlog. Postgres.conf parameters are as
> > > follows:
> > >
> > > max_connections = 150
> > > shared_buffers = 50000          # min 16, at least max_connections*2, 8KB
> > > each work_mem = 2048         # min 64, size in KB
> >
> > 50,000 shared buffers may or may not be too much.  Try it at different
> > sizes from 5,000 or so up to 50,000 and find the "knee".  It's usually
> > closer to 10,000 than 50,000, but ymmv...
>
> Playing with shared_buffers from 10000 to 50000 doesn't change anything in
> total time for this query :( But when I change work_mem a little higher to
> 10000 total runtime decreases a little about 10% but when I change
> random_page_cost to 0.2 (I know that almost all papers say it should be
> higher then 1.0) total runtime decreases almost 3 times and lasts about 900ms
> - earlier with random_page_cost=1.2 it took 2.7s. Is it possible to have
> random_page_cost on this value?

IF random_page_cost needs to be that low, then it's likely that the
query planner is either getting bad statistics and making a poor
decision, or that you've got a corner case that it just can't figure
out.  What does explain analyze <yourqueryhere> say with
random_page_cost set to 1.2 and 0.2?  HAve you run analyze and vacuumed
full lately?

Re: Performance problem...

From
Marcin Giedz
Date:
Dnia wtorek, 15 marca 2005 17:08, Scott Marlowe napisał:
> On Tue, 2005-03-15 at 02:59, Marcin Giedz wrote:
> > Dnia poniedziałek, 14 marca 2005 19:32, Scott Marlowe napisał:
> > > On Mon, 2005-03-14 at 12:03, Marcin Giedz wrote:
> > > > Hello...
> > > >
> > > >
> > > > Our company is going to change SQL engine from MySQL to PSQL. Of
> > > > course some performance problems occured. Our server is Dual Xeon
> > > > 3.0GHz + 8GB RAM + RAID1(software - two 146GB SCSI 15k) for sql data
> > > > + RAID1(software - two 146GB SCSI 15k) for pg_xlog. Postgres.conf
> > > > parameters are as follows:
> > > >
> > > > max_connections = 150
> > > > shared_buffers = 50000          # min 16, at least max_connections*2,
> > > > 8KB each work_mem = 2048         # min 64, size in KB
> > >
> > > 50,000 shared buffers may or may not be too much.  Try it at different
> > > sizes from 5,000 or so up to 50,000 and find the "knee".  It's usually
> > > closer to 10,000 than 50,000, but ymmv...
> >
> > Playing with shared_buffers from 10000 to 50000 doesn't change anything
> > in total time for this query :( But when I change work_mem a little
> > higher to 10000 total runtime decreases a little about 10% but when I
> > change random_page_cost to 0.2 (I know that almost all papers say it
> > should be higher then 1.0) total runtime decreases almost 3 times and
> > lasts about 900ms - earlier with random_page_cost=1.2 it took 2.7s. Is it
> > possible to have random_page_cost on this value?
>
> IF random_page_cost needs to be that low, then it's likely that the
> query planner is either getting bad statistics and making a poor
> decision, or that you've got a corner case that it just can't figure
> out.  What does explain analyze <yourqueryhere> say with
> random_page_cost set to 1.2 and 0.2?  HAve you run analyze and vacuumed
> full lately?
It cann't be possible - I've run vacuum full analyze - it didn't change
anything ;)

Marcin


Re: Performance problem...

From
Scott Marlowe
Date:
On Tue, 2005-03-15 at 10:17, Marcin Giedz wrote:
> Dnia wtorek, 15 marca 2005 17:08, Scott Marlowe napisał:
> > On Tue, 2005-03-15 at 02:59, Marcin Giedz wrote:
> > > Dnia poniedziałek, 14 marca 2005 19:32, Scott Marlowe napisał:
> > > > On Mon, 2005-03-14 at 12:03, Marcin Giedz wrote:
> > > > > Hello...
> > > > >
> > > > >
> > > > > Our company is going to change SQL engine from MySQL to PSQL. Of
> > > > > course some performance problems occured. Our server is Dual Xeon
> > > > > 3.0GHz + 8GB RAM + RAID1(software - two 146GB SCSI 15k) for sql data
> > > > > + RAID1(software - two 146GB SCSI 15k) for pg_xlog. Postgres.conf
> > > > > parameters are as follows:
> > > > >
> > > > > max_connections = 150
> > > > > shared_buffers = 50000          # min 16, at least max_connections*2,
> > > > > 8KB each work_mem = 2048         # min 64, size in KB
> > > >
> > > > 50,000 shared buffers may or may not be too much.  Try it at different
> > > > sizes from 5,000 or so up to 50,000 and find the "knee".  It's usually
> > > > closer to 10,000 than 50,000, but ymmv...
> > >
> > > Playing with shared_buffers from 10000 to 50000 doesn't change anything
> > > in total time for this query :( But when I change work_mem a little
> > > higher to 10000 total runtime decreases a little about 10% but when I
> > > change random_page_cost to 0.2 (I know that almost all papers say it
> > > should be higher then 1.0) total runtime decreases almost 3 times and
> > > lasts about 900ms - earlier with random_page_cost=1.2 it took 2.7s. Is it
> > > possible to have random_page_cost on this value?
> >
> > IF random_page_cost needs to be that low, then it's likely that the
> > query planner is either getting bad statistics and making a poor
> > decision, or that you've got a corner case that it just can't figure
> > out.  What does explain analyze <yourqueryhere> say with
> > random_page_cost set to 1.2 and 0.2?  HAve you run analyze and vacuumed
> > full lately?
> It cann't be possible - I've run vacuum full analyze - it didn't change
> anything ;)
>

You might want to try adjusting these values to see if you can get the
query planner to choose the faster plan without dropping
random_page_cost to 0.2.  I.e. give the query planner candy and flowers,
don't just bonk it on the head with a big stick and drag it back home...

#cpu_tuple_cost = 0.01          # (same)
#cpu_index_tuple_cost = 0.001   # (same)
#cpu_operator_cost = 0.0025     # (same)

Does explain analyze show a big difference in expected an actual rows
returned for any of the parts of the query plan?

FYI: Interview with Josh Berkus at Mad Penguin

From
"Wes Williams"
Date:
http://madpenguin.org/cms/html/62/3677.html

I for one think that Postgres 8.0 is great and that the largest hurdle for the project is the lack of [easy] developer
toolsthat MySQL has.  This in turn, seems to make learning Postgres and SQL more difficult for less experienced users
thatturn to MySQL for a quick solution. 

Keep up the excellent work all!


Re: FYI: Interview with Josh Berkus at Mad Penguin

From
Robert Perry
Date:
    I am not certain that this is the correct forum for Kudos, but after
taking a long break from using PostgreSQL I am also delighted with how
good it is doing.  I always hated the fact the it was missing table
spaced and I am also excited about Point In Time backups.

    Maybe it is just because I learned PostgreSQLfirst, but I have always
felt much more comfortable with it as opposed to mysql.  But, then I
also do not like to use GUI tools much.
On Mar 15, 2005, at 4:12 PM, Wes Williams wrote:

> http://madpenguin.org/cms/html/62/3677.html
>
> I for one think that Postgres 8.0 is great and that the largest hurdle
> for the project is the lack of [easy] developer tools that MySQL has.
> This in turn, seems to make learning Postgres and SQL more difficult
> for less experienced users that turn to MySQL for a quick solution.
>
> Keep up the excellent work all!
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
>                http://archives.postgresql.org
>


Re: FYI: Interview with Josh Berkus at Mad Penguin

From
"Wes Williams"
Date:
I suppose I should have apologized if that was off topic entirely, but
seeing as how there are many Admins in this group with 7.x or older
questions being sent, I thought some insight into 8.0+ would be useful for
such a short read to those who were interested enough to follow the link.

As for me, I've just been learning SQL over the past few months and started
with MySQL since it was easier and "faster" to setup a solution.  It was the
SQL-newbie friendly GUI tools that supported MySQL that attracted me at
first, but even in this relatively short period of time I started to see the
merits of Postgres and have delayed my projects until I can learn more about
Postgres and the underpinnings of SQL.  My primary point being that if more
of the MySQL tools worked with Postgres (although I do understand the
technical challenges) or that phpPgAdmin was as exhaustive as phpMyAdmin,
more users might be FIRST inclined to start with Postgres.

Again, I apologize to anyone

-----Original Message-----
From: pgsql-admin-owner@postgresql.org
[mailto:pgsql-admin-owner@postgresql.org]On Behalf Of Robert Perry
Sent: Tuesday, March 15, 2005 5:06 PM
To: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] FYI: Interview with Josh Berkus at Mad Penguin


    I am not certain that this is the correct forum for Kudos, but after
taking a long break from using PostgreSQL I am also delighted with how
good it is doing.  I always hated the fact the it was missing table
spaced and I am also excited about Point In Time backups.

    Maybe it is just because I learned PostgreSQLfirst, but I have always
felt much more comfortable with it as opposed to mysql.  But, then I
also do not like to use GUI tools much.
On Mar 15, 2005, at 4:12 PM, Wes Williams wrote:

> http://madpenguin.org/cms/html/62/3677.html
>
> I for one think that Postgres 8.0 is great and that the largest hurdle
> for the project is the lack of [easy] developer tools that MySQL has.
> This in turn, seems to make learning Postgres and SQL more difficult
> for less experienced users that turn to MySQL for a quick solution.
>
> Keep up the excellent work all!
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
>                http://archives.postgresql.org
>


---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)


Re: Performance problem...

From
Marcin Giedz
Date:
Dnia wtorek, 15 marca 2005 18:00, Scott Marlowe napisał:

OK now I know I mys query lasts so long:

SELECT DISTINCT t1.Id, t1.IdTypNazwa, t1.IdFirma, t1.Nazwa, t1.NazwaAscii,
t1.MskNazwa, t3.Id, t3.numer, t3.MskNumer, t4.Id, t4.numer, t4.MskNumer,
t5.Id, t5.numer, t5.MskNumer, t6.Id, t6.numer, t6.MskNumer, t7.Id, t7.numer,
t7.MskNumer, t8.Id, t8.numer, t8.MskNumer, t9.Id, t9.numer, t9.MskNumer,
t10.Id, t10.IdTypFormaPrawna, t10.MskFormaPrawna, t11.slowo FROM firmy.adres
AS n INNER JOIN firmy.nazwa AS t1 ON t1.IdFirma = n.IdPodmiot AND
t1.IdTypNazwa = 153 AND t1.Historia = 0 LEFT JOIN firmy.numer AS t3 ON
t3.IdPodmiot = t1.IdFirma AND t3.IdTypNumer = 156 AND t3.Historia = 0 LEFT
JOIN firmy.numer AS t4 ON t4.IdPodmiot = t1.IdFirma AND t4.IdTypNumer = 160
AND t4.Historia = 0 LEFT JOIN firmy.numer AS t5 ON t5.IdPodmiot = t1.IdFirma
AND t5.IdTypNumer = 155 AND t5.Historia = 0 LEFT JOIN firmy.numer AS t6 ON
t6.IdPodmiot = t1.IdFirma AND t6.IdTypNumer = 627 AND t6.Historia = 0 LEFT
JOIN firmy.numer AS t7 ON t7.IdPodmiot = t1.IdFirma AND t7.IdTypNumer = 848
AND t7.Historia = 0 LEFT JOIN firmy.numer AS t8 ON t8.IdPodmiot = t1.IdFirma
AND t8.IdTypNumer = 763 AND t8.Historia = 0 LEFT JOIN firmy.numer AS t9 ON
t9.IdPodmiot = t1.IdFirma AND t9.IdTypNumer = 762 AND t9.Historia = 0 LEFT
JOIN firmy.formaprawna AS t10 ON t10.IdFirma = t1.IdFirma LEFT JOIN
slowniki.tslownik AS t11 ON t11.idslownik = t10.IdTypFormaPrawna AND
t11.idjezyktyp = 2 WHERE n.Miasto ILIKE 'warszawa%' LIMIT 25

...because of DISTINCT :D With DISTINCT it tooks about 0,5 minute to search
several records ..... without DISTINCT (of course a few doubled records
occured) but query tooks 10ms ;). On the other hand doubled records are
annoying ..... is it possible to avoid these without changing data in tables?
Maybe some other query?

BR,
Marcin

> On Tue, 2005-03-15 at 10:17, Marcin Giedz wrote:
> > Dnia wtorek, 15 marca 2005 17:08, Scott Marlowe napisał:
> > > On Tue, 2005-03-15 at 02:59, Marcin Giedz wrote:
> > > > Dnia poniedziałek, 14 marca 2005 19:32, Scott Marlowe napisał:
> > > > > On Mon, 2005-03-14 at 12:03, Marcin Giedz wrote:
> > > > > > Hello...
> > > > > >
> > > > > >
> > > > > > Our company is going to change SQL engine from MySQL to PSQL. Of
> > > > > > course some performance problems occured. Our server is Dual Xeon
> > > > > > 3.0GHz + 8GB RAM + RAID1(software - two 146GB SCSI 15k) for sql
> > > > > > data + RAID1(software - two 146GB SCSI 15k) for pg_xlog.
> > > > > > Postgres.conf parameters are as follows:
> > > > > >
> > > > > > max_connections = 150
> > > > > > shared_buffers = 50000          # min 16, at least
> > > > > > max_connections*2, 8KB each work_mem = 2048         # min 64,
> > > > > > size in KB
> > > > >
> > > > > 50,000 shared buffers may or may not be too much.  Try it at
> > > > > different sizes from 5,000 or so up to 50,000 and find the "knee".
> > > > > It's usually closer to 10,000 than 50,000, but ymmv...
> > > >
> > > > Playing with shared_buffers from 10000 to 50000 doesn't change
> > > > anything in total time for this query :( But when I change work_mem a
> > > > little higher to 10000 total runtime decreases a little about 10% but
> > > > when I change random_page_cost to 0.2 (I know that almost all papers
> > > > say it should be higher then 1.0) total runtime decreases almost 3
> > > > times and lasts about 900ms - earlier with random_page_cost=1.2 it
> > > > took 2.7s. Is it possible to have random_page_cost on this value?
> > >
> > > IF random_page_cost needs to be that low, then it's likely that the
> > > query planner is either getting bad statistics and making a poor
> > > decision, or that you've got a corner case that it just can't figure
> > > out.  What does explain analyze <yourqueryhere> say with
> > > random_page_cost set to 1.2 and 0.2?  HAve you run analyze and vacuumed
> > > full lately?
> >
> > It cann't be possible - I've run vacuum full analyze - it didn't change
> > anything ;)
>
> You might want to try adjusting these values to see if you can get the
> query planner to choose the faster plan without dropping
> random_page_cost to 0.2.  I.e. give the query planner candy and flowers,
> don't just bonk it on the head with a big stick and drag it back home...
>
> #cpu_tuple_cost = 0.01          # (same)
> #cpu_index_tuple_cost = 0.001   # (same)
> #cpu_operator_cost = 0.0025     # (same)
>
> Does explain analyze show a big difference in expected an actual rows
> returned for any of the parts of the query plan?
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if your
>       joining column's datatypes do not match

Re: Performance problem...

From
Scott Marlowe
Date:
On Wed, 2005-03-16 at 07:46, Marcin Giedz wrote:
> Dnia wtorek, 15 marca 2005 18:00, Scott Marlowe napisał:
>
> OK now I know I mys query lasts so long:
>
> SELECT DISTINCT t1.Id, t1.IdTypNazwa, t1.IdFirma, t1.Nazwa, t1.NazwaAscii,
> t1.MskNazwa, t3.Id, t3.numer, t3.MskNumer, t4.Id, t4.numer, t4.MskNumer,
> t5.Id, t5.numer, t5.MskNumer, t6.Id, t6.numer, t6.MskNumer, t7.Id, t7.numer,
> t7.MskNumer, t8.Id, t8.numer, t8.MskNumer, t9.Id, t9.numer, t9.MskNumer,
> t10.Id, t10.IdTypFormaPrawna, t10.MskFormaPrawna, t11.slowo FROM firmy.adres
> AS n INNER JOIN firmy.nazwa AS t1 ON t1.IdFirma = n.IdPodmiot AND
> t1.IdTypNazwa = 153 AND t1.Historia = 0 LEFT JOIN firmy.numer AS t3 ON
> t3.IdPodmiot = t1.IdFirma AND t3.IdTypNumer = 156 AND t3.Historia = 0 LEFT
> JOIN firmy.numer AS t4 ON t4.IdPodmiot = t1.IdFirma AND t4.IdTypNumer = 160
> AND t4.Historia = 0 LEFT JOIN firmy.numer AS t5 ON t5.IdPodmiot = t1.IdFirma
> AND t5.IdTypNumer = 155 AND t5.Historia = 0 LEFT JOIN firmy.numer AS t6 ON
> t6.IdPodmiot = t1.IdFirma AND t6.IdTypNumer = 627 AND t6.Historia = 0 LEFT
> JOIN firmy.numer AS t7 ON t7.IdPodmiot = t1.IdFirma AND t7.IdTypNumer = 848
> AND t7.Historia = 0 LEFT JOIN firmy.numer AS t8 ON t8.IdPodmiot = t1.IdFirma
> AND t8.IdTypNumer = 763 AND t8.Historia = 0 LEFT JOIN firmy.numer AS t9 ON
> t9.IdPodmiot = t1.IdFirma AND t9.IdTypNumer = 762 AND t9.Historia = 0 LEFT
> JOIN firmy.formaprawna AS t10 ON t10.IdFirma = t1.IdFirma LEFT JOIN
> slowniki.tslownik AS t11 ON t11.idslownik = t10.IdTypFormaPrawna AND
> t11.idjezyktyp = 2 WHERE n.Miasto ILIKE 'warszawa%' LIMIT 25
>
> ...because of DISTINCT :D With DISTINCT it tooks about 0,5 minute to search
> several records ..... without DISTINCT (of course a few doubled records
> occured) but query tooks 10ms ;). On the other hand doubled records are
> annoying ..... is it possible to avoid these without changing data in tables?
> Maybe some other query?

Can you enclose the whole query in something like:

select distinct * from (your query here) as a

and see if that helps?

Re: FYI: Interview with Josh Berkus at Mad Penguin

From
John DeSoi
Date:
On Mar 16, 2005, at 8:09 AM, Wes Williams wrote:

> phpPgAdmin was as exhaustive as phpMyAdmin,
> more users might be FIRST inclined to start with Postgres


I'm not familiar with phpMyAdmin. Could you give a few examples of
things it has that phpPgAdmin is lacking?

Thanks,

John DeSoi, Ph.D.
http://pgedit.com/
Power Tools for PostgreSQL


Re: Performance problem...

From
Marcin Giedz
Date:
Dnia środa, 16 marca 2005 15:05, Scott Marlowe napisał:
> On Wed, 2005-03-16 at 07:46, Marcin Giedz wrote:
> > Dnia wtorek, 15 marca 2005 18:00, Scott Marlowe napisał:
> >
> > OK now I know I mys query lasts so long:
> >
> > SELECT DISTINCT t1.Id, t1.IdTypNazwa, t1.IdFirma, t1.Nazwa,
> > t1.NazwaAscii, t1.MskNazwa, t3.Id, t3.numer, t3.MskNumer, t4.Id,
> > t4.numer, t4.MskNumer, t5.Id, t5.numer, t5.MskNumer, t6.Id, t6.numer,
> > t6.MskNumer, t7.Id, t7.numer, t7.MskNumer, t8.Id, t8.numer, t8.MskNumer,
> > t9.Id, t9.numer, t9.MskNumer, t10.Id, t10.IdTypFormaPrawna,
> > t10.MskFormaPrawna, t11.slowo FROM firmy.adres AS n INNER JOIN
> > firmy.nazwa AS t1 ON t1.IdFirma = n.IdPodmiot AND t1.IdTypNazwa = 153 AND
> > t1.Historia = 0 LEFT JOIN firmy.numer AS t3 ON t3.IdPodmiot = t1.IdFirma
> > AND t3.IdTypNumer = 156 AND t3.Historia = 0 LEFT JOIN firmy.numer AS t4
> > ON t4.IdPodmiot = t1.IdFirma AND t4.IdTypNumer = 160 AND t4.Historia = 0
> > LEFT JOIN firmy.numer AS t5 ON t5.IdPodmiot = t1.IdFirma AND
> > t5.IdTypNumer = 155 AND t5.Historia = 0 LEFT JOIN firmy.numer AS t6 ON
> > t6.IdPodmiot = t1.IdFirma AND t6.IdTypNumer = 627 AND t6.Historia = 0
> > LEFT JOIN firmy.numer AS t7 ON t7.IdPodmiot = t1.IdFirma AND
> > t7.IdTypNumer = 848 AND t7.Historia = 0 LEFT JOIN firmy.numer AS t8 ON
> > t8.IdPodmiot = t1.IdFirma AND t8.IdTypNumer = 763 AND t8.Historia = 0
> > LEFT JOIN firmy.numer AS t9 ON t9.IdPodmiot = t1.IdFirma AND
> > t9.IdTypNumer = 762 AND t9.Historia = 0 LEFT JOIN firmy.formaprawna AS
> > t10 ON t10.IdFirma = t1.IdFirma LEFT JOIN slowniki.tslownik AS t11 ON
> > t11.idslownik = t10.IdTypFormaPrawna AND t11.idjezyktyp = 2 WHERE
> > n.Miasto ILIKE 'warszawa%' LIMIT 25
> >
> > ...because of DISTINCT :D With DISTINCT it tooks about 0,5 minute to
> > search several records ..... without DISTINCT (of course a few doubled
> > records occured) but query tooks 10ms ;). On the other hand doubled
> > records are annoying ..... is it possible to avoid these without changing
> > data in tables? Maybe some other query?
>
> Can you enclose the whole query in something like:
>
> select distinct * from (your query here) as a
>
> and see if that helps?
>

Yes it works excellent ;) Thanks so much .....

BR,
Marcin

> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org