Thread: Re: Select time jump after adding filter; please help me figure out what I'm doing wrong.

Don't forget to CC: the list.

Andrew Edson wrote:
> I apologize; You are correct in that I mistyped my original structure.  Here is the information for the correct
explainand explain analyze statements. 
>
>   attest=# explain select cntrct_id, tran_dt from ptrans where rcrd_cd = '0A';

No need for the simple explain - explain analyse includes all the
information.

>   attest=# explain analyze select cntrct_id, tran_dt from ptrans where rcrd_cd = '0A';
>                                                                    QUERY PLAN
>
------------------------------------------------------------------------------------------------------------------------------------------------
>  Bitmap Heap Scan on ptrans  (cost=1223.86..149853.23 rows=85741 width=21) (actual time=2302.363..70321.838
rows=6701655loops=1) 
>    ->  Bitmap Index Scan on ptrans_cid_trandt_idx  (cost=0.00..1223.86 rows=85741 width=0) (actual
time=2269.064..2269.064rows=204855 loops=1) 
>  Total runtime: 89854.843 ms

Well, it's taking 90 seconds to return 6.7 million rows. Depending on
your system and memory settings, that might not be unreasonable.

It *is* getting the estimate of returned rows wrong (it thinks 85,741
will match) which is hugely out of line. Is there something odd with
this table/column or haven't you analysed recently? How many unique
values does rcrd_cd have, and how many rows does the table have?

I don't know that you'll get this down to sub-second responses though,
not if you're trying to return 6 million rows from an even larger table.

--
   Richard Huxton
   Archonet Ltd


I apologize about the CC; I thought I had done so.
 
There are fourteen (14) distinct values in rcrd_cd.  And I don't know if this counts as something odd, but I got the following values by doing a vacuum full analyze, then running the set with index, dropping index, running set without.
 
Values with index:
>
> attest=# select count(*) from ptrans;
>  16488578
>
> attest=# select count(*) from ptrans where rcrd_cd = '0A';
>  6701655
>
> attest=# explain analyze select cntrct_id, tran_dt from ptrans where rcrd_cd = '                                                                                                                                                            >  0A';
>  Bitmap Heap Scan on ptrans  (cost=1223.86..151183.39 rows=87439 width=21) (actu                                                                                                                                                            >  al time=2255.640..70371.304 rows=6701655 loops=1)
>    ->  Bitmap Index Scan on ptrans_cid_trandt_idx  (cost=0.00..1223.86 rows=8743                                                                                                                                                            >  9 width=0) (actual time=2216.856..2216.856 rows=204855 loops=1)
>  Total runtime: 89964.419 ms
>
Values without index:
>
> attest=# select count(*) from ptrans;
>  16488578
>
> attest=# select count(*) from ptrans where rcrd_cd = '0A';
>  204855
>
> attest=# explain analyze select cntrct_id, tran_dt from ptrans where rcrd_cd = '0A';
>  Seq Scan on ptrans  (cost=0.00..384813.22 rows=87439 width=21) (actual time=20.286..65330.049 rows=204855 loops=1)
>    Filter: (rcrd_cd = '0A'::bpchar)
>  Total runtime: 65945.160 ms
 
Again, someone mind pointing out to me where I've managed to mess this one up?


Richard Huxton <dev@archonet.com> wrote:
Don't forget to CC: the list.

Andrew Edson wrote:
> I apologize; You are correct in that I mistyped my original structure. Here is the information for the correct explain and explain analyze statements.
>
> attest=# explain select cntrct_id, tran_dt from ptrans where rcrd_cd = '0A';

No need for the simple explain - explain analyse includes all the
information.

> attest=# explain analyze select cntrct_id, tran_dt from ptrans where rcrd_cd = '0A';
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------------------
> Bitmap Heap Scan on ptrans (cost=1223.86..149853.23 rows=85741 width=21) (actual time=2302.363..70321.838 rows=6701655 loops=1)
> -> Bitmap Index Scan on ptrans_cid_trandt_idx (cost=0.00..1223.86 rows=85741 width=0) (actual time=2269.064..2269.064 rows=204855 loops=1)
> Total runtime: 89854.843 ms

Well, it's taking 90 seconds to return 6.7 million rows. Depending on
your system and memory settings, that might not be unreasonable.

It *is* getting the estimate of returned rows wrong (it thinks 85,741
will match) which is hugely out of line. Is there something odd with
this table/column or haven't you analysed recently? How many unique
values does rcrd_cd have, and how many rows does the table have?

I don't know that you'll get this down to sub-second responses though,
not if you're trying to return 6 million rows from an even larger table.

--
Richard Huxton
Archonet Ltd


---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend


Ready for the edge of your seat? Check out tonight's top picks on Yahoo! TV.
Andrew Edson wrote:
> I apologize about the CC; I thought I had done so.

no problem

> There are fourteen (14) distinct values in rcrd_cd.  And I don't know
> if this counts as something odd, but I got the following values by
> doing a vacuum full analyze, then running the set with index,
> dropping index, running set without.

Might want to do ALTER TABLE ... ALTER COLUMN rcrd_cd SET STATISTICS =
14 (or a few more than that if you think it might be useful) - won't
help you with this though.

So - are you saying that with these two queries...

>> attest=# select count(*) from ptrans where rcrd_cd = '0A';
 >> 6701655

>> attest=# select count(*) from ptrans where rcrd_cd = '0A';
 >> 204855

...the only difference is that you've dropped an index?

Because that's just strange - and I don't think it's anything you're doing.

Do you have other partial indexes for different values of rcrd_cd, and
do they have similar problems? If this can be reproduced it might point
to something odd with bitmap scans.

Oh, remind me what version of PostgreSQL you're running?

--
   Richard Huxton
   Archonet Ltd

Yes.  The only difference between the two selects was that the index on the table in question was dropped.  As far as I know, that was the only partial index on there, although since it's a test db, I could probably go in and experiment on a few more if needed.
 
This problem may have already been solved; I'm using an older version of Postgres; 8.1.3.  My boss has requested that it not be upgraded just yet, however, so I'm stuck with it for the moment.
 
Richard Huxton <dev@archonet.com> wrote:
Andrew Edson wrote:
> I apologize about the CC; I thought I had done so.

no problem

> There are fourteen (14) distinct values in rcrd_cd. And I don't know
> if this counts as something odd, but I got the following values by
> doing a vacuum full analyze, then running the set with index,
> dropping index, running set without.

Might want to do ALTER TABLE ... ALTER COLUMN rcrd_cd SET STATISTICS =
14 (or a few more than that if you think it might be useful) - won't
help you with this though.

So - are you saying that with these two queries...

>> attest=# select count(*) from ptrans where rcrd_cd = '0A';
>> 6701655

>> attest=# select count(*) from ptrans where rcrd_cd = '0A';
>> 204855

...the only difference is that you've dropped an index?

Because that's just strange - and I don't think it's anything you're doing.

Do you have other partial indexes for different values of rcrd_cd, and
do they have similar problems? If this can be reproduced it might point
to something odd with bitmap scans.

Oh, remind me what version of PostgreSQL you're running?

--
Richard Huxton
Archonet Ltd


Moody friends. Drama queens. Your life? Nope! - their life, your story.
Play Sims Stories at Yahoo! Games.
Andrew Edson <cheighlund@yahoo.com> writes:
>   This problem may have already been solved; I'm using an older
>   version of Postgres; 8.1.3.

Ah.  I think your result is explained by this 8.1.4 bug fix:

2006-05-18 14:57  tgl

    * src/backend/optimizer/plan/createplan.c (REL8_1_STABLE): When a
    bitmap indexscan is using a partial index, it is necessary to
    include the partial index predicate in the scan's "recheck
    condition".  Otherwise, if the scan becomes lossy for lack of
    bitmap memory, we would fail to enforce that returned rows satisfy
    the predicate.    Noted while studying bug #2441 from Arjen van der
    Meijden.

IOW, once the bitmap goes lossy, we'd return *every* row on any page
that the index fingered as having *any* relevant row.

>   My boss has requested that it not be
>   upgraded just yet, however, so I'm stuck with it for the moment.

Better press him a little harder.  There are a lot of known bugs in
8.1.3, and not any very good reason not to update to a later 8.1.x.

As a really short-term measure, you could possibly avoid this bug by
increasing work_mem enough that the bitmap doesn't get compressed.

            regards, tom lane



Tom Lane <tgl@sss.pgh.pa.us> wrote:
Ah. I think your result is explained by this 8.1.4 bug fix:

2006-05-18 14:57 tgl

* src/backend/optimizer/plan/createplan.c (REL8_1_STABLE): When a
bitmap indexscan is using a partial index, it is necessary to
include the partial index predicate in the scan's "recheck
condition". Otherwise, if the scan becomes lossy for lack of
bitmap memory, we would fail to enforce that returned rows satisfy
the predicate. Noted while studying bug #2441 from Arjen van der
Meijden.

IOW, once the bitmap goes lossy, we'd return *every* row on any page
that the index fingered as having *any* relevant row.

Better press him a little harder. There are a lot of known bugs in
8.1.3, and not any very good reason not to update to a later 8.1.x.

As a really short-term measure, you could possibly avoid this bug by
increasing work_mem enough that the bitmap doesn't get compressed.

regards, tom lane
I'll press as I can...thanks for pointing that out to me.  How would I determine how much to increase work_mem by (or to) until then?


Got a little couch potato?
Check out fun summer activities for kids.
Andrew Edson wrote:
> Yes.  The only difference between the two selects was that the index
> on the table in question was dropped.  As far as I know, that was the
> only partial index on there, although since it's a test db, I could
> probably go in and experiment on a few more if needed.
>
> This problem may have already been solved; I'm using an older version
> of Postgres; 8.1.3.  My boss has requested that it not be upgraded
> just yet, however, so I'm stuck with it for the moment.

You should probably upgrade to 8.1.9 - there are a couple of bugfixes
that mention bitmap scans, although nothing that describes your problem
in-depth.

http://www.postgresql.org/docs/8.1/static/release.html#RELEASE-8-1-9

You don't need to dump/restore the database, it's just a matter of
upgrading the binaries within the 8.1 series. You should plan on doing
point-release updates fairly frequently.

I'd put the partial index back, see if you can reproduce it and if so
tell your boss you should upgrade. If you can't talk him around, give
him the URL for the release notes along with the three bugs you think
are most dangerous. If that doesn't work, I'm sure we can get someone
from the core developers to tell you the same thing I just have, and we
can forward that to your boss :-)

--
   Richard Huxton
   Archonet Ltd