Thread: Digesting explain analyze

Digesting explain analyze

From: Jesper Krogh
Hi.

I have a table with somewhere on the order of 100,000,000 rows, where
each row is a tuple of the form

(id1,id2,evalue);

Then I'd like to speed up a query like this:

explain analyze select id from table where id1 = 2067 or id2 = 2067 order
by evalue asc limit 100;
                                                                  QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1423.28..1423.28 rows=100 width=12) (actual
time=2.565..2.567 rows=100 loops=1)
   ->  Sort  (cost=1423.28..1424.54 rows=505 width=12) (actual
time=2.560..2.560 rows=100 loops=1)
         Sort Key: evalue
         Sort Method:  top-N heapsort  Memory: 25kB
         ->  Bitmap Heap Scan on table  (cost=16.58..1420.75 rows=505
width=12) (actual time=0.709..1.752 rows=450 loops=1)
               Recheck Cond: ((id1 = 2067) OR (id2 = 2067))
               ->  BitmapOr  (cost=16.58..16.58 rows=506 width=0) (actual
time=0.676..0.676 rows=0 loops=1)
                     ->  Bitmap Index Scan on id1_evalue_idx
(cost=0.00..11.44 rows=423 width=0) (actual
time=0.599..0.599 rows=450 loops=1)
                           Index Cond: (id1_id = 2067)
                     ->  Bitmap Index Scan on id2_evalue_idx
(cost=0.00..4.89 rows=83 width=0) (actual
time=0.070..0.070 rows=1 loops=1)
                           Index Cond: (id2_id = 2067)
 Total runtime: 2.642 ms
(12 rows)


What I had expected was to see the "Bitmap Index Scan on id1_evalue_idx"
get chopped off by the limit. The inner sets are on average 3,000 rows for
both id1 and id2, and a typical limit would be 100, so if I could convince
postgresql not to fetch all of them then I would reduce the set retrieved
by a factor of around 60. The dataset is quite large, so a random query is
not very likely to be hitting the same part of the dataset again; there is
going to be a fair amount of going to disk.

That would also mean that using it in a FOR loop in a pl/pgsql stored
procedure would not get any benefit from the cursor effect?

I actually tried to stuff id1 and id2 into an array and build a GiST index
on (array, evalue), hoping that it would satisfy this query directly. It
used the GiST index to fetch the rows, then did the sorting and limit on
the set afterwards.

What it boils down to is more or less:

Does a "bitmap index scan" support ordering and limit?
Does a "multicolumn GiST index" support ordering and limit?

Have I missed anything that could hugely speed up fetching these (typically
100) rows from the database?

--
Jesper



Re: Digesting explain analyze

From: Ron Mayer
Jesper Krogh wrote:
> I have a table with somewhere on the order of 100,000,000 rows, where
> each row is a tuple of the form
>
> (id1,id2,evalue);
>
> Then I'd like to speed up a query like this:
>
> explain analyze select id from table where id1 = 2067 or id2 = 2067 order
> by evalue asc limit 100;
>
> ...The inner sets are on average 3,000 rows for
> both id1 and id2, and a typical limit would be 100, so if I could convince
> postgresql not to fetch all of them then I would reduce the set retrieved
> by a factor of around 60. The dataset is quite large, so a random query is
> not very likely to be hitting the same part of the dataset again; there is
> going to be a fair amount of going to disk.

If disk seeks are killing you, a kinda crazy idea would be to
duplicate the table: cluster one copy by an index on (id1) and
the other copy by an index on (id2), and union the
results of each.

Since each of these duplicates of the table will be clustered
by the column you're querying it on, it should just take one
seek in each table.
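
The setup might look something like this (untested sketch; the source table
name "original_table" and the column list are just placeholders):

  CREATE TABLE t1 AS SELECT id, id1, id2, evalue FROM original_table;
  CREATE INDEX t1_id1_idx ON t1 (id1);
  CLUSTER t1 USING t1_id1_idx;

  CREATE TABLE t2 AS SELECT id, id1, id2, evalue FROM original_table;
  CREATE INDEX t2_id2_idx ON t2 (id2);
  CLUSTER t2 USING t2_id2_idx;

  ANALYZE t1;
  ANALYZE t2;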

Then your query could be something like

  select * from (
    (select * from t1 where id1=2067 order by evalue limit 100)
    union
    (select * from t2 where id2=2067 order by evalue limit 100)
  ) as foo order by evalue limit 100;

Hmm...  and I wonder whether, if you put evalue into the clustering
criteria too (i.e. cluster on (id1, evalue)), you could make it so the
limit finds the right 100 evalues first for each table....
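
That variant would just change the clustering index in the sketch above,
e.g. (again untested):

  CREATE INDEX t1_id1_evalue_idx ON t1 (id1, evalue);
  CLUSTER t1 USING t1_id1_evalue_idx;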






Re: Digesting explain analyze

From: Robert Haas
On Wed, Jan 6, 2010 at 2:10 PM, Jesper Krogh <jesper@krogh.cc> wrote:
> Hi.
>
> I have a table with somewhere on the order of 100,000,000 rows, where
> each row is a tuple of the form
>
> (id1,id2,evalue);
>
> Then I'd like to speed up a query like this:
>
> explain analyze select id from table where id1 = 2067 or id2 = 2067 order
> by evalue asc limit 100;
>                                                                  QUERY PLAN
>
-----------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=1423.28..1423.28 rows=100 width=12) (actual
> time=2.565..2.567 rows=100 loops=1)
>   ->  Sort  (cost=1423.28..1424.54 rows=505 width=12) (actual
> time=2.560..2.560 rows=100 loops=1)
>         Sort Key: evalue
>         Sort Method:  top-N heapsort  Memory: 25kB
>         ->  Bitmap Heap Scan on table  (cost=16.58..1420.75 rows=505
> width=12) (actual time=0.709..1.752 rows=450 loops=1)
>               Recheck Cond: ((id1 = 2067) OR (id2 = 2067))
>               ->  BitmapOr  (cost=16.58..16.58 rows=506 width=0) (actual
> time=0.676..0.676 rows=0 loops=1)
>                     ->  Bitmap Index Scan on id1_evalue_idx
> (cost=0.00..11.44 rows=423 width=0) (actual
> time=0.599..0.599 rows=450 loops=1)
>                           Index Cond: (id1_id = 2067)
>                     ->  Bitmap Index Scan on id2_evalue_idx
> (cost=0.00..4.89 rows=83 width=0) (actual
> time=0.070..0.070 rows=1 loops=1)
>                           Index Cond: (id2_id = 2067)
>  Total runtime: 2.642 ms
> (12 rows)
>
>
> What I had expected was to see the "Bitmap Index Scan on id1_evalue_idx"
> get chopped off by the limit. The inner sets are on average 3,000 rows for
> both id1 and id2, and a typical limit would be 100, so if I could convince
> postgresql not to fetch all of them then I would reduce the set retrieved
> by a factor of around 60. The dataset is quite large, so a random query is
> not very likely to be hitting the same part of the dataset again; there is
> going to be a fair amount of going to disk.
>
> That would also mean that using it in a FOR loop in a pl/pgsql stored
> procedure would not get any benefit from the cursor effect?
>
> I actually tried to stuff id1 and id2 into an array and build a GiST index
> on (array, evalue), hoping that it would satisfy this query directly. It
> used the GiST index to fetch the rows, then did the sorting and limit on
> the set afterwards.
>
> What it boils down to is more or less:
>
> Does a "bitmap index scan" support ordering and limit?
> Does a "multicolumn GiST index" support ordering and limit?
>
> Have I missed anything that could hugely speed up fetching these (typically
> 100) rows from the database?

Bitmap index scans always return all the matching rows.  It would be
nice if they could fetch them in chunks for queries like this, but
they can't.  I am not sure whether there's any way to make GIST do
what you want.

You might try something like this (untested):

SELECT * FROM (
   (select id, evalue from table where id1 = 2067 order by evalue asc limit 100)
   union all
   (select id, evalue from table where id2 = 2067 order by evalue asc limit 100)
) x ORDER BY evalue LIMIT 100;

If you have an index by (id1, evalue) and by (id2, evalue) then I
would think this would be pretty quick, as it should do two index
scans (not bitmap index scans) to fetch 100 rows for each, then append
the results, sort them, and then limit again.
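
The indexes in question would be defined roughly like this (index names are
made up, and "table" stands in for the real table name as above):

CREATE INDEX table_id1_evalue_idx ON table (id1, evalue);
CREATE INDEX table_id2_evalue_idx ON table (id2, evalue);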

...Robert

Re: Digesting explain analyze

From: Jesper Krogh
Ron Mayer wrote:
>> ...The inner sets are on average 3,000 rows for
>> both id1 and id2, and a typical limit would be 100, so if I could convince
>> postgresql not to fetch all of them then I would reduce the set retrieved
>> by a factor of around 60. The dataset is quite large, so a random query is
>> not very likely to be hitting the same part of the dataset again; there is
>> going to be a fair amount of going to disk.
>
> If disk seeks are killing you, a kinda crazy idea would be to
> duplicate the table: cluster one copy by an index on (id1) and
> the other copy by an index on (id2), and union the
> results of each.

That doubles the disk space needed for the table. What are the odds
that this would help when the initial table by itself already
significantly exceeds available memory?

> Since each of these duplicates of the table will be clustered
> by the column you're querying it on, it should just take one
> seek in each table.
>
> Then your query could be something like
>
>   select * from (
>     (select * from t1 where id1=2067 order by evalue limit 100)
>     union
>     (select * from t2 where id2=2067 order by evalue limit 100)
>   ) as foo order by evalue limit 100;

This is actually what I ended up with as the best performing query, just
still on a single table, because even without duplication I can add indexes
on (id1,evalue) and (id2,evalue) and optimize for this one query. It is
still getting killed quite a lot by disk IO. So I guess my options are:

1) Buy better disks (I need to get an estimate of how large the dataset is
actually going to get).
2) Stick with one table, but make sure there is enough activity to keep a
large part of the index in the OS cache anyway (and add more memory if
necessary).

The data is seeing a fair amount of growth (it doubles in a couple of
years), so I would suspect it is fairly hard to maintain clustering on it.

Is it possible to get PG to tell me how many rows fit in a disk page?
All columns are using "plain" storage according to \d+ on the table.

> Hmm...  and I wonder whether, if you put evalue into the clustering
> criteria too (i.e. cluster on (id1, evalue)), you could make it so the
> limit finds the right 100 evalues first for each table....

I didn't cluster it, since clustering "locks everything".

--
Jesper

Re: Digesting explain analyze

From: Greg Smith
Jesper Krogh wrote:
> Is it possible to get PG to tell me how many rows fit in a disk page?
> All columns are using "plain" storage according to \d+ on the table.
>
select relname,round(reltuples / relpages) as "avg_rows_per_page" from
pg_class where relpages > 0;
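
To look at just the one table, the same query can be restricted by relname,
e.g. (the table name below is a placeholder):

select relname, round(reltuples / relpages) as avg_rows_per_page
  from pg_class
 where relname = 'your_table_name' and relpages > 0;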

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com


Re: Digesting explain analyze

From: Oleg Bartunov
Jesper,

the whole idea of a bitmap index scan is to optimize heap access, so it
ruins any ordering returned by the index. That's why our new KNNGist, which
returns ordered index tuples, doesn't support bitmap index scans (note,
this is only for knn search).

Oleg

On Wed, 6 Jan 2010, Robert Haas wrote:

> On Wed, Jan 6, 2010 at 2:10 PM, Jesper Krogh <jesper@krogh.cc> wrote:
> > Does a "bitmap index scan" support ordering and limit?
> > Does a "multicolumn GiST index" support ordering and limit?
>
> Bitmap index scans always return all the matching rows.  It would be
> nice if they could fetch them in chunks for queries like this, but
> they can't.  I am not sure whether there's any way to make GIST do
> what you want.
>
> ...Robert

     Regards,
         Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

Re: Digesting explain analyze

From: Matthew Wakeling
On Thu, 7 Jan 2010, Jesper Krogh wrote:
>> If disk seeks are killing you, a kinda crazy idea would be to
>> duplicate the table: cluster one copy by an index on (id1) and
>> the other copy by an index on (id2), and union the
>> results of each.
>
> That doubles the disk space needed for the table. What are the odds
> that this would help when the initial table by itself already
> significantly exceeds available memory?

If the table already greatly exceeds the available RAM, then doubling the
amount of data won't make a massive difference to performance. You're
going to disc for everything anyway.

>> Since each of these duplicates of the table will be clustered
>> by the column you're querying it on, it should just take one
>> seek in each table.
>>
>> Then your query could be something like
>>
>>   select * from (
>>     (select * from t1 where id1=2067 order by evalue limit 100)
>>     union
>>     (select * from t2 where id2=2067 order by evalue limit 100)
>>   ) as foo order by evalue limit 100;
>
> This is actually what I ended up with as the best performing query, just
> still on a single table, because even without duplication I can add indexes
> on (id1,evalue) and (id2,evalue) and optimize for this one query. It is
> still getting killed quite a lot by disk IO. So I guess my options are:

You're kind of missing the point. The crucial step in the above suggestion
is to cluster the table on the index. This will mean that all the rows
that are fetched together are located together on disc, and you will no
longer be killed by disc IO.

> 1) Buy better disks (I need to get an estimate of how large the dataset is
> actually going to get).

Unless you cluster, you are still going to be limited by the rate at which
the discs can seek. Postgres 8.4 has some improvements here for bitmap
index scans if you have a RAID array and set the effective_io_concurrency
setting correctly.
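
For example, something along these lines (the value is only an illustration;
it should roughly match the number of drives in the array):

  SET effective_io_concurrency = 4;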

> 2) Stick with one table, but make sure there is enough activity to keep a
> large part of the index in the OS cache anyway (and add more memory if
> necessary).

In order to win here, you will need to make memory at least as big as the
commonly-accessed parts of the database. This could get expensive.

> I didn't cluster it, since clustering "locks everything".

You can also test out the hypothesis by copying the table instead:

CREATE TABLE test1 AS SELECT * FROM table1 ORDER BY id1;

Then create an index on id1, and test against that table. The copy will
become out of date quickly, but it will allow you to see whether the
performance benefit is worth it. It will also tell you how long a cluster
will actually take, without actually locking anything.
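
For instance (just a sketch; the index could also include evalue, as
discussed above):

CREATE INDEX test1_id1_idx ON test1 (id1);
ANALYZE test1;
-- then re-run the original query against test1 and compare the timings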

Matthew

--
 In the beginning was the word, and the word was unsigned,
 and the main() {} was without form and void...