Thread: Optimizing further

From: Ken Williams
Hi,

I'm wondering whether I need to do something to further optimize
a query.  My query looks like this:


   SELECT price, date FROM foo
    WHERE date < '2000-07-01 00:29:00+10' AND code = 'UCL'
    ORDER BY date DESC, stock DESC
    LIMIT 100;

I've got over 7 million rows in the table, and a btree
index on (date, code).  According to "explain", the query will be
executed as follows:

   Limit  (cost=0.00..393.45 rows=100 width=32)
     ->  Index Scan Backward using foo_date_code on trades
                                (cost=0.00..12309.13 rows=3128 width=32)

When I run the query, it takes a really long time (10 seconds)
the first time I execute it, and then returns instantly the
second time.

Is there something I need to do to get this query running
faster the *first* time?  I've run "VACUUM ANALYZE" on the
table after all the rows were inserted.

Thanks.

  -Ken


Re: Optimizing further

From: Tom Lane
Ken Williams <ken@mathforum.org> writes:
> When I run the query, it takes a really long time (10 seconds)
> the first time I execute it, and then returns instantly the
> second time.

That sounds like a caching issue.  How much RAM do you have,
what have you got shared_buffers set to, etc?

            regards, tom lane

Re: Optimizing further

From: Ken Williams
On Friday, June 14, 2002, at 12:35  PM, Tom Lane wrote:
> Ken Williams <ken@mathforum.org> writes:
>> When I run the query, it takes a really long time (10 seconds)
>> the first time I execute it, and then returns instantly the
>> second time.
>
> That sounds like a caching issue.  How much RAM do you have,
> what have you got shared_buffers set to, etc?

Hi Tom,

'top' reports 1024M of real memory, with 819M virtual.  shared_buffers
is currently set to 64.  Is there some information about the table
itself or the index that would be helpful to know?

  -Ken


Re: Optimizing further

From: Tom Lane
Ken Williams <ken@mathforum.org> writes:
> On Friday, June 14, 2002, at 12:35  PM, Tom Lane wrote:
>> That sounds like a caching issue.  How much RAM do you have,
>> what have you got shared_buffers set to, etc?

> 'top' reports 1024M of real memory, with 819M virtual.  shared_buffers
> is currently set to 64.

Hm.  If there were nothing else going on in your system, I'd expect
the kernel to buffer disk pages more or less indefinitely.  Do you
have other processes running that might be sucking memory away from
kernel disk buffers?

Most people think that shared_buffers=64 is a ridiculously low
default, and that something in the low thousands is more realistic for
a production database.  However, I'm not sure that bumping that up will
help for your particular issue.  I'm guessing that your machine is
under memory pressure and that what you really need to do is buy more
RAM.  Does 'top' tell you anything about the amount of RAM being used
for kernel disk buffers?
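
If you do experiment with raising it, it's a one-line change in
postgresql.conf followed by a postmaster restart.  The number below is
only illustrative, not a recommendation for your machine:

   # postgresql.conf -- shared_buffers is counted in 8K disk pages,
   # so 2000 buffers is roughly 16MB of shared cache.
   shared_buffers = 2000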

            regards, tom lane

Re: Optimizing further

From: Doug Fields
>When I run the query, it takes a really long time (10 seconds) the first
>time I execute it, and then returns instantly the second time.

This is behavior I often encounter with my databases as well (and I
have 8 gigs of RAM now), and it is easy to explain.

The first time, the system has to page in all the necessary data from the
disk. This is slow.

The second and subsequent times, the pages are in memory (either in the
shared memory segment or the OS disk cache) and hence the query runs extremely
quickly.

Of course, if in the interim you run a different query whose page
loads evict these pages, this query will be slow again the next time
it runs.
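
An easy way to watch this happen is to time the same query twice in
one psql session (assuming a psql new enough to have \timing):

   foo=> \timing
   Timing is on.
   foo=> SELECT price, date FROM foo WHERE code = 'UCL' LIMIT 100;
   -- first run: the pages come off the disk, so this is slow
   foo=> SELECT price, date FROM foo WHERE code = 'UCL' LIMIT 100;
   -- second run: the same pages are now cached, so this is fast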

Cheers,

Doug


Re: Optimizing further

From: Curt Sampson
On Fri, 14 Jun 2002, Ken Williams wrote:

>    SELECT price, date FROM foo
>     WHERE date < '2000-07-01 00:29:00+10' AND code = 'UCL'
>     ORDER BY date DESC, stock DESC
>     LIMIT 100;
>
> I've got over 7 million rows in the table, and a btree
> index(date, code).  According to "explain", the query will be
> executed as follows:
>
>    Limit  (cost=0.00..393.45 rows=100 width=32)
>      ->  Index Scan Backward using foo_date_code on trades
>                                 (cost=0.00..12309.13 rows=3128 width=32)
>
> When I run the query, it takes a really long time (10 seconds)
> the first time I execute it, and then returns instantly the
> second time.

My first suspicion is that, as someone else explained, the data is
being dragged (very slowly) off the disk by the first query, and
then is being read from the cache by the second query.  So it's disk
I/O that's your problem.  You can do a "set show_query_stats =
true", run the query, and check the log to find out how many
I/O operations you're doing.
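
Something like this, in a psql session (the exact format of the
statistics in the postmaster log varies between versions):

   foo=> SET show_query_stats = true;
   foo=> SELECT price, date FROM foo
   foo->  WHERE date < '2000-07-01 00:29:00+10' AND code = 'UCL'
   foo->  ORDER BY date DESC, stock DESC
   foo->  LIMIT 100;
   -- now check the postmaster log for the buffer usage statistics,
   -- in particular how many shared blocks were read vs. hit in cache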

So let's see how we might process this query.  (Experts should
correct me if I go wrong here.)  First we find the last occurrence
of the given date in the index, thus doing perhaps 2-4 reads
(depending on the depth of the btree).  Then we scan backwards from
that point, looking at the code value in the index, and collecting
the entries that have a code of UCL.  This should be pretty quick:
even though we probably are not using OS read-ahead, the blocks
are going to be darn close together and the index entries are all
clustered together in a few blocks.  The planner estimates 3128 rows
read, so let's say it's 300 or so blocks we read.  (It's probably less.)

Once we've got a hundred of these babies, we have to go and fetch
their tuples, which could be up to 100 reads, if they're scattered
randomly about the table.

Then we sort the whole lot, and we're done.

So one would think you'd see only about 400 blocks read for the
query, 300 of which are very close together. On modern disks this
should only take a couple of seconds. You don't happen to have this on
a really old, slow disk, do you? I suspect not.

Maybe it's reading a lot more than 400 blocks? Perhaps the optimizer
doesn't realize it can get the value of code from the index, and
is instead reading the tuple for every item in the index it scans.
If that's the case, maybe reversing the two columns in your index
(foo_code_date instead of foo_date_code) would help?
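
Concretely, the reversed index would be something like (the name is
just an example):

   CREATE INDEX foo_code_date ON foo (code, date);

With (code, date), all the UCL entries sit next to each other in the
index, with their dates in order, so a backward scan never has to look
at index entries (or fetch tuples) for any other code at all.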

cjs
--
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC