Thread: Updates on large tables are extremely slow

Updates on large tables are extremely slow

From
Yves Vindevogel
Date:
Hi,

I'm trying to update a table that has about 600,000 records.
The update query is very simple: update mytable set pagesdesc = -pages;

(I use pagesdesc to avoid problems with sorts that have one field in
ascending order and one in descending order.  That was a problem I had
a week ago.)

The query takes about half an hour to an hour to execute.  I have tried
a lot of things.
This is my setup:

Linux Slackware 10.1
Postgres 8.0.1
My filesystem is EXT2, so I don't have journaling.
My partition is mounted in fstab with the noatime option.

I have tried to change some settings in $PGDATA/postgresql.conf, but
that does not seem to matter a lot.
I'm not even sure that file is being used.  I ran KSysGuard while
executing my query and I don't see my processor being used more than
20%.
The memory increases for the cache, but not for the app itself.

My test system is an Asus portable: P4 with 1 GB of RAM.
The disk is speedy.  Everything runs fine except for the update queries.

I would appreciate some help or a document to point me to the settings
I must change.

Met vriendelijke groeten,
Bien à vous,
Kind regards,

Yves Vindevogel
Implements



Mail: yves.vindevogel@implements.be  - Mobile: +32 (478) 80 82 91

Kempische Steenweg 206 - 3500 Hasselt - Tel-Fax: +32 (11) 43 55 76

Web: http://www.implements.be

First they ignore you.  Then they laugh at you.  Then they fight you.
Then you win.
Mahatma Gandhi.


Re: Updates on large tables are extremely slow

From
Jacques Caron
Date:
Hi,

At 19:40 12/06/2005, Yves Vindevogel wrote:
>Hi,
>
>I'm trying to update a table that has about 600.000 records.
>The update query is very simple  :    update mytable set pagesdesc = -
>pages ;
>
>(I use pagesdesc to avoid problems with sort that have one field in
>ascending order and one in descending order.  That was a problem I had a
>week ago)

An index on (-pages) would probably do exactly what you want without having
to add another column.
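
For example, a sketch of that approach (untested; "eventdate" is just a
stand-in for whatever the ascending sort column is):

    -- Index the negated column; the extra parentheses are required
    -- around an expression in CREATE INDEX:
    CREATE INDEX mytable_mixed_sort ON mytable (eventdate, (-pages));

    -- The mixed-direction sort can then be written with both keys
    -- ascending, so a plain ascending index can serve it:
    SELECT * FROM mytable ORDER BY eventdate, (-pages);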

>The query takes about half an hour to an hour to execute.

Depending on the total size of the table and associated indexes, and on
your exact setup (especially your hardware), this could be quite normal:
the executor goes through all rows in the table and, for each, creates a
copy with the additional column, updates indexes, and logs to WAL. You
might want to look into moving your WAL files (pg_xlog) to a separate
disk, increasing WAL and checkpoint buffers, adding more RAM, adding
more disks...
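
For what it's worth, a sketch of the 8.0 postgresql.conf knobs involved
(the values are only illustrative starting points, not tuned
recommendations for this machine):

    checkpoint_segments = 16   # default 3; checkpoints happen less often
    wal_buffers = 64           # default 8 (units of 8 kB pages)
    shared_buffers = 10000     # default 1000 (units of 8 kB, ~80 MB)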

But as I said, you might not even need to do that, just use an index on an
expression...

Jacques.



Re: Updates on large tables are extremely slow

From
Mark Kirkwood
Date:
Yves Vindevogel wrote:
>
> I'm trying to update a table that has about 600.000 records.
> The update query is very simple : update mytable set pagesdesc = - pages ;
>
> The query takes about half an hour to an hour to execute. I have tried a
> lot of things.
>

Half an hour seems a bit long - I would expect less than 5 minutes on
reasonable hardware.

You may have dead tuple bloat - can you post the output of 'ANALYZE
VERBOSE mytable' ?

Cheers

Mark

Re: Updates on large tables are extremely slow

From
Mark Kirkwood
Date:
Apologies - I should have said output of 'VACUUM VERBOSE mytable'.

(I've been using 8.1, which displays dead tuple info in ANALYZE...).

Mark

Yves Vindevogel wrote:
> rvponp=# analyze verbose tblPrintjobs ;
> INFO: analyzing "public.tblprintjobs"
> INFO: "tblprintjobs": 19076 pages, 3000 rows sampled, 588209 estimated
> total rows
> ANALYZE
>
>
> On 13 Jun 2005, at 04:43, Mark Kirkwood wrote:
>
>     Yves Vindevogel wrote:
>
>         I'm trying to update a table that has about 600.000 records.
>         The update query is very simple : update mytable set pagesdesc =
>         - pages ;
>         The query takes about half an hour to an hour to execute. I have
>         tried a lot of things.
>
>
>     Half an hour seem a bit long - I would expect less than 5 minutes on
>     reasonable hardware.
>
>     You may have dead tuple bloat - can you post the output of 'ANALYZE
>     VERBOSE mytable' ?

Re: Updates on large tables are extremely slow

From
Yves Vindevogel
Date:
rvponp=# vacuum verbose tblPrintjobs ;
INFO:  vacuuming "public.tblprintjobs"
INFO:  index "pkprintjobs" now contains 622972 row versions in 8410
pages
DETAIL:  9526 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.60s/0.31u sec elapsed 31.68 sec.
INFO:  index "uxprintjobs" now contains 622972 row versions in 3978
pages
DETAIL:  9526 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.15s/0.48u sec elapsed 3.59 sec.
INFO:  index "ixprintjobsipaddress" now contains 622972 row versions in
2542 pages
DETAIL:  9526 index row versions were removed.
49 index pages have been deleted, 0 are currently reusable.
CPU 0.13s/0.24u sec elapsed 2.57 sec.
INFO:  index "ixprintjobshostname" now contains 622972 row versions in
2038 pages
DETAIL:  9526 index row versions were removed.
35 index pages have been deleted, 0 are currently reusable.
CPU 0.09s/0.30u sec elapsed 1.14 sec.
INFO:  index "ixprintjobsrecordnumber" now contains 622972 row versions
in 1850 pages
DETAIL:  9526 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.07s/0.28u sec elapsed 1.51 sec.
INFO:  index "ixprintjobseventdate" now contains 622972 row versions in
1408 pages
DETAIL:  9526 index row versions were removed.
4 index pages have been deleted, 0 are currently reusable.
CPU 0.05s/0.24u sec elapsed 2.61 sec.
INFO:  index "ixprintjobseventtime" now contains 622972 row versions in
1711 pages
DETAIL:  9526 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.12s/0.53u sec elapsed 11.66 sec.
INFO:  index "ixprintjobseventcomputer" now contains 622972 row
versions in 2039 pages
DETAIL:  9526 index row versions were removed.
36 index pages have been deleted, 0 are currently reusable.
CPU 0.12s/0.23u sec elapsed 1.27 sec.
INFO:  index "ixprintjobseventuser" now contains 622972 row versions in
2523 pages
DETAIL:  9526 index row versions were removed.
19 index pages have been deleted, 0 are currently reusable.
CPU 0.14s/0.24u sec elapsed 1.74 sec.
INFO:  index "ixprintjobsloginuser" now contains 622972 row versions in
2114 pages
DETAIL:  9526 index row versions were removed.
13 index pages have been deleted, 0 are currently reusable.
CPU 0.07s/0.32u sec elapsed 4.29 sec.
INFO:  index "ixprintjobsprintqueue" now contains 622972 row versions
in 2201 pages
DETAIL:  9526 index row versions were removed.
30 index pages have been deleted, 0 are currently reusable.
CPU 0.10s/0.34u sec elapsed 1.92 sec.
INFO:  index "ixprintjobsprintport" now contains 622972 row versions in
3040 pages
DETAIL:  9526 index row versions were removed.
40 index pages have been deleted, 0 are currently reusable.
CPU 0.18s/0.27u sec elapsed 2.63 sec.
INFO:  index "ixprintjobssize" now contains 622972 row versions in 1733
pages
DETAIL:  9526 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.16s/0.43u sec elapsed 4.07 sec.
INFO:  index "ixprintjobspages" now contains 622972 row versions in
1746 pages
DETAIL:  9526 index row versions were removed.
24 index pages have been deleted, 0 are currently reusable.
CPU 0.13s/0.22u sec elapsed 1.58 sec.
INFO:  index "ixprintjobsapplicationtype" now contains 622972 row
versions in 1395 pages
DETAIL:  9526 index row versions were removed.
27 index pages have been deleted, 0 are currently reusable.
CPU 0.07s/0.29u sec elapsed 1.20 sec.
INFO:  index "ixprintjobsusertype" now contains 622972 row versions in
1393 pages
DETAIL:  9526 index row versions were removed.
24 index pages have been deleted, 0 are currently reusable.
CPU 0.07s/0.22u sec elapsed 0.82 sec.
INFO:  index "ixprintjobsdocumentname" now contains 622972 row versions
in 4539 pages
DETAIL:  9526 index row versions were removed.
6 index pages have been deleted, 0 are currently reusable.
CPU 0.24s/0.38u sec elapsed 5.83 sec.
INFO:  index "ixprintjobsdesceventdate" now contains 622972 row
versions in 1757 pages
DETAIL:  9526 index row versions were removed.
4 index pages have been deleted, 0 are currently reusable.
CPU 0.08s/0.25u sec elapsed 1.16 sec.
INFO:  index "ixprintjobsdesceventtime" now contains 622972 row
versions in 1711 pages
DETAIL:  9526 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.18s/0.52u sec elapsed 9.44 sec.
INFO:  index "ixprintjobsdescpages" now contains 622972 row versions in
1748 pages
DETAIL:  9526 index row versions were removed.
24 index pages have been deleted, 0 are currently reusable.
CPU 0.06s/0.26u sec elapsed 0.94 sec.
INFO:  index "ixprintjobspagesperjob" now contains 622972 row versions
in 5259 pages
DETAIL:  9526 index row versions were removed.
4 index pages have been deleted, 0 are currently reusable.
CPU 0.31s/0.36u sec elapsed 5.47 sec.
INFO:  "tblprintjobs": removed 9526 row versions in 307 pages
DETAIL:  CPU 0.00s/0.06u sec elapsed 0.23 sec.
INFO:  "tblprintjobs": found 9526 removable, 622972 nonremovable row
versions in 19382 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 75443 unused item pointers.
0 pages are entirely empty.
CPU 3.43s/6.83u sec elapsed 97.86 sec.
INFO:  vacuuming "pg_toast.pg_toast_2169880"
INFO:  index "pg_toast_2169880_index" now contains 0 row versions in 1
pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_toast_2169880": found 0 removable, 0 nonremovable row
versions in 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM
rvponp=#



On 13 Jun 2005, at 10:54, Mark Kirkwood wrote:

> Apologies - I should have said output of 'VACUUM VERBOSE mytable'.
>
> (been using 8.1, which displays dead tuple info in ANALYZE...).
>
> Mark
>
> Yves Vindevogel wrote:
>> rvponp=# analyze verbose tblPrintjobs ;
>> INFO: analyzing "public.tblprintjobs"
>> INFO: "tblprintjobs": 19076 pages, 3000 rows sampled, 588209
>> estimated total rows
>> ANALYZE
>> On 13 Jun 2005, at 04:43, Mark Kirkwood wrote:
>>     Yves Vindevogel wrote:
>>         I'm trying to update a table that has about 600.000 records.
>>         The update query is very simple : update mytable set
>> pagesdesc =
>>         - pages ;
>>         The query takes about half an hour to an hour to execute. I
>> have
>>         tried a lot of things.
>>     Half an hour seem a bit long - I would expect less than 5 minutes
>> on
>>     reasonable hardware.
>>     You may have dead tuple bloat - can you post the output of
>> 'ANALYZE
>>     VERBOSE mytable' ?
>
>
Met vriendelijke groeten,
Bien à vous,
Kind regards,

Yves Vindevogel
Implements




Mail: yves.vindevogel@implements.be  - Mobile: +32 (478) 80 82 91

Kempische Steenweg 206 - 3500 Hasselt - Tel-Fax: +32 (11) 43 55 76

Web: http://www.implements.be

First they ignore you.  Then they laugh at you.  Then they fight you.
Then you win.
Mahatma Gandhi.


How to enhance the chance that data is in disk cache

From
Jona
Date:
Hi there
I have a query (please refer to
http://213.173.234.215:8080/get_content_plan.htm for the query as well
as query plan) that is slow when it's run the first time and fast(ish)
on all successive runs within a reasonable time period.
That is, if the query is not run for around 30 minutes, execution time
returns to the initial time.

This leads me to suspect that when the query is first run, all the data
used has to be fetched from disk, whereas once it has been run, all the
data is available in the OS's disk cache.
Comparing the execution times, we're talking roughly a factor of 35
difference, so optimization would be handy.
Is there any way to either enhance the chance that the data can be found
in the disk cache, or allow the database to fetch the data faster?
Is this what the CLUSTER command is for? If so, which tables would I
need to cluster?
Or is my only option to de-normalize the table structure around this
query to speed it up?
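
For reference, a sketch of CLUSTER's 7.3 form, with hypothetical index
and table names; it physically rewrites the table in the order of the
given index, which can cut down the number of pages a query touches:

    CLUSTER some_index ON some_table;
    -- CLUSTER rewrites the table, so refresh statistics afterwards:
    ANALYZE some_table;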

Furthermore, it seems the database spends the majority of its time in
the loop marked in italics in the initial plan; any idea what it spends
its time on there?

Database is PG 7.3.9 on RH ES 3.0, with dual Xeon 1.9GHz processors and
2GB of RAM.
effective_cache_size = 100k
shared_buffers = 14k
random_page_cost = 3
default_statistics_target = 50
VACUUM ANALYZE runs every few hours, so statistics should be up to date.

Appreciate any input here.

Cheers
Jona

Re: Updates on large tables are extremely slow

From
Tom Lane
Date:
Yves Vindevogel <yves.vindevogel@implements.be> writes:
> rvponp=# vacuum verbose tblPrintjobs ;
> INFO:  vacuuming "public.tblprintjobs"
> [ twenty-one different indexes on one table ]

Well, there's your problem.  You think updating all those indexes is
free?  It's *expensive*.  Heed the manual's advice: avoid creating
indexes you are not certain you need for identifiable commonly-used
queries.

(The reason delete is fast is it doesn't have to touch the indexes ...
the necessary work is left to be done by VACUUM.)
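
A sketch of how to review what's there (the DROP example reuses an
index name from the VACUUM output above; drop an index only once you've
confirmed no commonly-used query needs it):

    -- List every index on the table, with its definition:
    SELECT indexname, indexdef
    FROM pg_indexes
    WHERE tablename = 'tblprintjobs';

    -- Then drop the ones no identifiable query uses, e.g.:
    DROP INDEX ixprintjobsusertype;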

            regards, tom lane

Re: How to enhance the chance that data is in disk cache

From
Tom Lane
Date:
Jona <jonanews@oismail.com> writes:
> I have a query (please refer to
> http://213.173.234.215:8080/get_content_plan.htm for the query as well
> as query plan) that is slow when it's run the first time and fast(ish)
> on all successive runs within a reasonable time period.

> This leads me to suspect that when the query is first run, all used data
> have to be fetched from the disk where as once it has been run all data
> is available in the OS's disk cache.

Sounds like that to me too.

> Is there anway to either enhance the chance that the data can be found
> in the disk cache or allowing the database to fetch the data faster?

Run the query more often?

Also, that pile of INNER JOINs is forcing a probably-bad join order;
you need to think carefully about the order you want things joined in,
or else convert the query to non-JOIN syntax.  See the "Performance
Tips" chapter of the manual.

            regards, tom lane

Re: How to enhance the chance that data is in disk cache

From
Jona
Date:
Thank you for the response, Tom. I bet you get a lot of mails with
"trivial" solutions (mine likely being one of them); I, for one, truly
appreciate you taking the time to answer them.

> Run the query more often?
The query is dynamically constructed from user input. Although the total
number of different queries that can be run is limited (around 10k
different combinations, I suspect), it seems rather pointless to run all
of them (or even the most common ones) more often just to keep the data
in the disk cache.
Is there a way to make the data more accessible on the disk?
> Also, that pile of INNER JOINs is forcing a probably-bad join order;
> you need to think carefully about the order you want things joined in,
> or else convert the query to non-JOIN syntax.  See the "Performance
> Tips" chapter of the manual.
You're probably right here; the join order must be bad. However, just
flattening the join and letting the planner decide what would be best
makes the plan change on every execution, with query costs varying from
1350 to 4500.
I wager it ends up using GEQO due to the number of possibilities for a
join order the query has, and thus just settles on a "good" plan out of
those it examined.
In any case, the "right" way to do this is defining a good explicit join
order, no?
Off the top of my head I'm not sure how to rewrite it properly; I
suppose trial and error is the only way...
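
For instance, a sketch of one explicit order, using the three tables
from the plan fragment below (the selected column is a guess, and
whether this particular order is a good one is exactly what the trial
and error would have to establish):

    SELECT sc.sctid
    FROM (sct2subcattype_tbl sc
          JOIN price_tbl p ON (p.sctid = sc.sctid AND p.affid = 8))
         JOIN statcon_tbl st ON (st.sctid = sc.sctid AND st.ctpid = 1)
    WHERE sc.subcattpid = 50;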
From the plan it appears that the following part is where the cost dramatically increases (although the time does not??):
->  Nested Loop  (cost=0.00..1207.19 rows=75 width=32) (actual time=0.28..18.47 rows=164 loops=1)
    ->  Nested Loop  (cost=0.00..868.23 rows=58 width=20) (actual time=0.16..13.91 rows=164 loops=1)
        ->  Index Scan using subcat_uq on sct2subcattype_tbl  (cost=0.00..479.90 rows=82 width=8) (actual time=0.11..9.47 rows=164 loops=1)
              Index Cond: (subcattpid = 50)
              Filter: (NOT (subplan))
              SubPlan
              ->  Seq Scan on aff2sct2subcattype_tbl  (cost=0.00..1.92 rows=1 width=4) (actual time=0.05..0.05 rows=0 loops=164)
                    Filter: ((affid = 8) AND ($0 = sctid))
        ->  Index Scan using aff_price_uq on price_tbl  (cost=0.00..4.72 rows=1 width=12) (actual time=0.02..0.02 rows=1 loops=164)
              Index Cond: ((price_tbl.affid = 8) AND (price_tbl.sctid = "outer".sctid))
    ->  Index Scan using ctp_statcon on statcon_tbl  (cost=0.00..5.86 rows=1 width=12) (actual time=0.02..0.02 rows=1 loops=164)
          Index Cond: ((statcon_tbl.sctid = "outer".sctid) AND (statcon_tbl.ctpid = 1))
The index scan on subcat_uq in particular seems rather expensive, yet is
pretty fast. Can a relation be drawn between estimated cost and
execution time?
Any other pointers in the right direction would be very much appreciated.

For the full query and query plan, please refer to: http://213.173.234.215:8080/get_content_plan.htm

Cheers
Jona

