Thread: another query optimization question

another query optimization question

From

David Teran

Date:

30 January 2004, 14:00:47

Hi,

its me again. As far as we tested postgresql ist fast, very fast
compared to the other db system we test and are using currently.

  We are now testing some test databases on Postgres. We use one
function which simply calculates a difference between two values and
checks if on value is 0, so something like this:


declare
diff integer;
begin
if $1 > $2
then
diff := $1 -$2;
return diff * diff;
else
return 0;
end if;
end;

Language for this function is plpgsql

executing a select like this:

select
sum(job_property_difference(t0.int_value, t1.int_value)) as rank
   from
   job_property t0,
   job_property t1
   where
   t0.id_job_profile = 911
   and t0.id_job_attribute = t1.id_job_attribute
   and t1.id_job_profile in (select id_job_profile from unemployed)
   and t1.id_job_profile <> 911;

results in a query plan result:

                                                             QUERY PLAN
------------------------------------------------------------------------
-----------------------------------------------------------
  Aggregate  (cost=70521.28..70521.28 rows=1 width=8)
    ->  Merge Join  (cost=66272.11..70158.29 rows=145194 width=8)
          Merge Cond: ("outer".id_job_attribute =
"inner".id_job_attribute)
          ->  Sort  (cost=31.53..32.44 rows=366 width=8)
                Sort Key: t0.id_job_attribute
                ->  Index Scan using
job_property__id_job_profile__fk_index on job_property t0
(cost=0.00..15.95 rows=366 width=8)
                      Index Cond: (id_job_profile = 911)
          ->  Sort  (cost=66240.58..67456.79 rows=486483 width=8)
                Sort Key: t1.id_job_attribute
                ->  Hash IN Join  (cost=34.08..20287.32 rows=486483
width=8)
                      Hash Cond: ("outer".id_job_profile =
"inner".id_job_profile)
                      ->  Seq Scan on job_property t1
(cost=0.00..12597.89 rows=558106 width=12)
                            Filter: (id_job_profile <> 911)
                      ->  Hash  (cost=31.46..31.46 rows=1046 width=4)
                            ->  Seq Scan on unemployed
(cost=0.00..31.46 rows=1046 width=4)
(21 rows)




This takes about 1minute, 45 seconds on a test database with about
31.882 job_profile and 8.483.005 job_property records. The final
solution will have about 1.000.000 job_profile records and, well ...
about 266.074.901  so we wonder what options we have in order to
improve this select. Should we rewrite the function (and others) in C?
Turning off seqscans makes it slower which might be because psql is
hopping between the index and the row values back and forth as a lot of
rows are involved.

Any hint how to speed up this would be great.


regards David

Re: another query optimization question

From

PC Drew

Date:

30 January 2004, 14:09:50

On Jan 30, 2004, at 11:00 AM, David Teran wrote:
>
> executing a select like this:
>
> select
> sum(job_property_difference(t0.int_value, t1.int_value)) as rank
>   from
>   job_property t0,
>   job_property t1
>   where
>   t0.id_job_profile = 911
>   and t0.id_job_attribute = t1.id_job_attribute
>   and t1.id_job_profile in (select id_job_profile from unemployed)
>   and t1.id_job_profile <> 911;
>
> results in a query plan result:
>

What's the goal here?  Are you trying to rank by how the int_value
relates to each other?  I'd like to know more about what kind of result
you're trying to achieve.

--
PC Drew

Re: another query optimization question

From

Stephan Szabo

Date:

30 January 2004, 14:10:19

On Fri, 30 Jan 2004, David Teran wrote:

> select
> sum(job_property_difference(t0.int_value, t1.int_value)) as rank
>    from
>    job_property t0,
>    job_property t1
>    where
>    t0.id_job_profile = 911
>    and t0.id_job_attribute = t1.id_job_attribute
>    and t1.id_job_profile in (select id_job_profile from unemployed)
>    and t1.id_job_profile <> 911;
>
> results in a query plan result:

Can we see explain analyze output for the query, it'll give more
information about actual time and row counts than plain explain.

Re: another query optimization question

From

David Teran

Date:

30 January 2004, 14:20:30

Hi,


On 30.01.2004, at 19:10, Stephan Szabo wrote:

>
> On Fri, 30 Jan 2004, David Teran wrote:
>
>> select
>> sum(job_property_difference(t0.int_value, t1.int_value)) as rank
>>    from
>>    job_property t0,
>>    job_property t1
>>    where
>>    t0.id_job_profile = 911
>>    and t0.id_job_attribute = t1.id_job_attribute
>>    and t1.id_job_profile in (select id_job_profile from unemployed)
>>    and t1.id_job_profile <> 911;
>>
>> results in a query plan result:
>
> Can we see explain analyze output for the query, it'll give more
> information about actual time and row counts than plain explain.
>


sure, here it is comes. What we need to achieve is: we have different
job_profiles, each profile has multiple values. For one given profile
we need the ' sum of the distance of every value in the given profile
and every other profile'. The result is usually grouped by the profile
id but to make the query easier i removed this, it does not cost too
much time and it turned out that this query here uses most of the time.

thanks, David




            QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
--------------------------------
  Aggregate  (cost=2689349.81..2689349.81 rows=1 width=8) (actual
time=100487.423..100487.423 rows=1 loops=1)
    ->  Merge Join  (cost=2451266.53..2655338.83 rows=13604393 width=8)
(actual time=82899.466..-2371037.726 rows=2091599 loops=1)
          Merge Cond: ("outer".id_job_attribute =
"inner".id_job_attribute)
          ->  Sort  (cost=97.43..100.63 rows=1281 width=8) (actual
time=3.937..4.031 rows=163 loops=1)
                Sort Key: t0.id_job_attribute
                ->  Index Scan using
job_property__id_job_profile__fk_index on job_property t0
(cost=0.00..31.31 rows=1281 width=8) (actual time=1.343..3.766 rows=163
loops=1)
                      Index Cond: (id_job_profile = 911)
          ->  Sort  (cost=2451169.10..2483246.47 rows=12830947 width=8)
(actual time=82891.076..-529619.213 rows=4187378 loops=1)
                Sort Key: t1.id_job_attribute
                ->  Hash IN Join  (cost=507.32..439065.37 rows=12830947
width=8) (actual time=61.943..1874640.807 rows=4187378 loops=1)
                      Hash Cond: ("outer".id_job_profile =
"inner".id_job_profile)
                      ->  Seq Scan on job_property t1
(cost=0.00..246093.84 rows=12830947 width=12) (actual
time=0.136..19101.796 rows=8482533 loops=1)
                            Filter: (id_job_profile <> 911)
                      ->  Hash  (cost=467.46..467.46 rows=15946 width=4)
(actual time=61.313..61.313 rows=0 loops=1)
                            ->  Seq Scan on unemployed
(cost=0.00..467.46 rows=15946 width=4) (actual time=0.157..50.842
rows=15960 loops=1)
  Total runtime: 103769.592 ms

Re: another query optimization question

From

Tom Lane

Date:

30 January 2004, 15:19:41

David Teran <david.teran@cluster9.com> writes:
> [ query plan ]

I got a little distracted by the bizarre actual-time values shown for
some of the query steps:

>     ->  Merge Join  (cost=2451266.53..2655338.83 rows=13604393 width=8)
> (actual time=82899.466..-2371037.726 rows=2091599 loops=1)

>           ->  Sort  (cost=2451169.10..2483246.47 rows=12830947 width=8)
> (actual time=82891.076..-529619.213 rows=4187378 loops=1)

>                 ->  Hash IN Join  (cost=507.32..439065.37 rows=12830947
> width=8) (actual time=61.943..1874640.807 rows=4187378 loops=1)

The hash-join total time is obviously wrong seeing that the total
runtime is only about 100000 msec, and the negative values for the other
two are even more obviously wrong.

I recall that we saw similar symptoms once before, and I thought we'd
fixed it, but I didn't find any relevant mentions in the CVS logs.

What version are you running, exactly?  Could you repeat the EXPLAIN
with client_min_messages set to 'debug2', and see if you see any
messages about InstrStartTimer or InstrStopNode?

            regards, tom lane

Re: another query optimization question

From

Richard Huxton

Date:

30 January 2004, 16:25:30

On Friday 30 January 2004 19:19, Tom Lane wrote:
>
> The hash-join total time is obviously wrong seeing that the total
> runtime is only about 100000 msec, and the negative values for the other
> two are even more obviously wrong.
>
> I recall that we saw similar symptoms once before, and I thought we'd
> fixed it, but I didn't find any relevant mentions in the CVS logs.

You're not thinking of the -ve durations on the auto-vacuum gizmo? Something
to do with system clock accuracy IIRC.

--
  Richard Huxton
  Archonet Ltd

Re: another query optimization question

From

David Teran

Date:

30 January 2004, 18:02:49

HI Tom.

> I got a little distracted by the bizarre actual-time values shown for
> some of the query steps:
>
>>     ->  Merge Join  (cost=2451266.53..2655338.83 rows=13604393
>> width=8)
>> (actual time=82899.466..-2371037.726 rows=2091599 loops=1)
>
>>           ->  Sort  (cost=2451169.10..2483246.47 rows=12830947
>> width=8)
>> (actual time=82891.076..-529619.213 rows=4187378 loops=1)
>
>>                 ->  Hash IN Join  (cost=507.32..439065.37
>> rows=12830947
>> width=8) (actual time=61.943..1874640.807 rows=4187378 loops=1)
>
> The hash-join total time is obviously wrong seeing that the total
> runtime is only about 100000 msec, and the negative values for the
> other
> two are even more obviously wrong.
>
> I recall that we saw similar symptoms once before, and I thought we'd
> fixed it, but I didn't find any relevant mentions in the CVS logs.
>
> What version are you running, exactly?  Could you repeat the EXPLAIN
> with client_min_messages set to 'debug2', and see if you see any
> messages about InstrStartTimer or InstrStopNode?
>

7.4.1, build from sourcecode. Running on MacOS X Server 10.3.2, dual G5
with 3.5 GB RAM

I have set client_min_messages in postgresql.conf to debug2 but i see
nothing. Is this going to the normal logfile? Must i activate anything
else?

regards David

Re: another query optimization question

From

Tom Lane

Date:

30 January 2004, 18:40:53

David Teran <david.teran@cluster9.com> writes:
>> I recall that we saw similar symptoms once before, and I thought we'd
>> fixed it, but I didn't find any relevant mentions in the CVS logs.
>>
>> What version are you running, exactly?

> 7.4.1, build from sourcecode. Running on MacOS X Server 10.3.2, dual G5
> with 3.5 GB RAM

Interesting.  I have recollected where we saw this before:
http://archives.postgresql.org/pgsql-hackers/2003-11/msg01528.php
Apparently gettimeofday() has a glitch on some BSD releases.  OS X is
a BSD derivative and it's not so surprising if it has it too.

May I suggest that you try the test program given here:
http://archives.postgresql.org/pgsql-hackers/2003-11/msg01546.php
and file a bug report with Apple if it shows any out-of-order results?

I am fairly certain that I tried that test program when I devised it
on my own OS X machine, and couldn't get a failure.  Maybe it depends
on your hardware (hm, could dual CPUs be the key factor)?

            regards, tom lane

Re: another query optimization question

From

David Teran

Date:

31 January 2004, 06:28:47

Hi Tim,

you are right:


> Interesting.  I have recollected where we saw this before:
> http://archives.postgresql.org/pgsql-hackers/2003-11/msg01528.php
> Apparently gettimeofday() has a glitch on some BSD releases.  OS X is
> a BSD derivative and it's not so surprising if it has it too.
>
> May I suggest that you try the test program given here:
> http://archives.postgresql.org/pgsql-hackers/2003-11/msg01546.php
> and file a bug report with Apple if it shows any out-of-order results?
>
> I am fairly certain that I tried that test program when I devised it
> on my own OS X machine, and couldn't get a failure.  Maybe it depends
> on your hardware (hm, could dual CPUs be the key factor)?


p:~ david$ ./a.out
bogus tv_usec: 1075544305 -615731632, prev 1075544304 349672
out of order tv_sec: 1075544304 349759, prev 1075544305 -615731632
out of order tv_usec: 1075544305 339904, prev 1075544305 339905
bogus tv_usec: 1075544307 -615731811, prev 1075544306 349493
out of order tv_sec: 1075544306 349498, prev 1075544307 -615731811
out of order tv_usec: 1075544307 339442, prev 1075544307 339443
out of order tv_usec: 1075544308 339351, prev 1075544308 339352

This is a part of the output. Whats -very- interesting:

Apple provides a little tool that can enable / disable the l2 cache ...
one CPU of a dual CPU system on the fly. When i start the testapp with
two CPU's enabled i get this output here, when i turn off one CPU while
the app is still running the messages disappear as long as one CPU is
turned off. Reactivating the CPU again produces new error messages. I
checked the app on a single G4, no errors and i checked the app on a
dual G4, -not- G5 and also no error messages.

Do you remember where one can find a patch? Maybe its something one can
fix because parts of the OS from Apple are 'open source'.

Do you know if this bug makes a system unusable with PostgresSQL?

Regards David

Re: another query optimization question

From

Tom Lane

Date:

31 January 2004, 11:48:39

David Teran <david.teran@cluster9.com> writes:
> Apple provides a little tool that can enable / disable the l2 cache ...
> one CPU of a dual CPU system on the fly. When i start the testapp with
> two CPU's enabled i get this output here, when i turn off one CPU while
> the app is still running the messages disappear as long as one CPU is
> turned off. Reactivating the CPU again produces new error messages.

Ah-hah, so the gettimeofday bug *is* linked to multiple CPUs.  Marc,
were the machines you saw it on all multi-CPU?

> Do you remember where one can find a patch?

File a bug report with Apple.

> Do you know if this bug makes a system unusable with PostgresSQL?

Not at all.  The only bad effect we've noted is that EXPLAIN results
are sometimes wacky.  In theory you could sometimes see the result of
now() being off by a second or two, but no one's reported seeing that.

            regards, tom lane

Re: another query optimization question

From

"Marc G. Fournier"

Date:

31 January 2004, 16:35:31

On Sat, 31 Jan 2004, Tom Lane wrote:

> David Teran <david.teran@cluster9.com> writes:
> > Apple provides a little tool that can enable / disable the l2 cache ...
> > one CPU of a dual CPU system on the fly. When i start the testapp with
> > two CPU's enabled i get this output here, when i turn off one CPU while
> > the app is still running the messages disappear as long as one CPU is
> > turned off. Reactivating the CPU again produces new error messages.
>
> Ah-hah, so the gettimeofday bug *is* linked to multiple CPUs.  Marc,
> were the machines you saw it on all multi-CPU?

I'm not sure ... I thought I ran it on my P4 here in the office and saw it
too, albeit not near as frequently ... but, in FreeBSD's case, it is a
"design issue" ... there are two different functions, once that is kinda
fuzzy (but fast), and the other that is designed to be exact, but at a
performance loss ... or was it the same function, but a 'sysctl' variable
that changes the state?

Can't remember which, but it is by design on FreeBSD ... and, if we're
talking about Apple, the same most likely applies, as its based on the
same kernel ...

Back of my mind, I *think* it was these sysctl variables:

kern.timecounter.method: 0
kern.timecounter.hardware: i8254

 ----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

Re: another query optimization question

From

David Teran

Date:

31 January 2004, 17:40:56

Hi,

> I'm not sure ... I thought I ran it on my P4 here in the office and
> saw it
> too, albeit not near as frequently ... but, in FreeBSD's case, it is a
> "design issue" ... there are two different functions, once that is
> kinda
> fuzzy (but fast), and the other that is designed to be exact, but at a
> performance loss ... or was it the same function, but a 'sysctl'
> variable
> that changes the state?
>
> Can't remember which, but it is by design on FreeBSD ... and, if we're
> talking about Apple, the same most likely applies, as its based on the
> same kernel ...
>
> Back of my mind, I *think* it was these sysctl variables:
>
> kern.timecounter.method: 0
> kern.timecounter.hardware: i8254
>
I will try to check this on my system.

But here another hint, maybe more interesting for Apple though: The bug
does -not- occur if another process uses a lot of CPU time. We encoded
a quicktime movie into mpeg2 and while this was using about 90% and
while encoding the vcd i wanted to show the bug to a friend and it did
not work.

But besides this, is there any chance that we can optimize our initial
performance problem ;-)

regards David