Thread: inconsistent/weird index usage

From: Dustin Sallings
    To save some time, let me start by saying

PostgreSQL 7.4.3 on powerpc-apple-darwin7.4.0, compiled by GCC gcc
(GCC) 3.3 20030304 (Apple Computer, Inc. build 1640)

    OK, now on to details...

    I'm trying to implement Oracle-style ``partitions'' in postgres.  I've
run into my first snag on what should be a fairly quick query.
Basically, I started with the following schema and split the
``samples'' table into one table per year (1999-2004).


-- BEGIN SCHEMA

create table sensor_types (
     sensor_type_id serial,
     sensor_type text not null,
     units varchar(10) not null,
     primary key(sensor_type_id)
);

create table sensors (
     sensor_id serial,
     sensor_type_id integer not null,
     serial char(16) not null,
     name text not null,
     low smallint not null,
     high smallint not null,
     active boolean default true,
     primary key(sensor_id),
     foreign key(sensor_type_id) references sensor_types(sensor_type_id)
);
create unique index sensors_byserial on sensors(serial);

create table samples (
     ts datetime not null,
     sensor_id integer not null,
     sample float not null,
     foreign key(sensor_id) references sensors(sensor_id)
);
create index samples_bytime on samples(ts);
create unique index samples_bytimeid on samples(ts, sensor_id);

-- END SCHEMA
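
    (The create statements for the per-year tables are omitted here; a
minimal sketch, assuming each simply mirrors the samples definition:)

create table samples_1999 (
     ts datetime not null,
     sensor_id integer not null,
     sample float not null,
     foreign key(sensor_id) references sensors(sensor_id)
);
-- ...and likewise for samples_2000 through samples_2004.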

    Each samples_[year] table has the same layout as the samples table
above, and is indexed in the same way using the following commands:

create index samples_1999_bytime on samples_1999(ts);
create index samples_2000_bytime on samples_2000(ts);
create index samples_2001_bytime on samples_2001(ts);
create index samples_2002_bytime on samples_2002(ts);
create index samples_2003_bytime on samples_2003(ts);
create index samples_2004_bytime on samples_2004(ts);

create unique index samples_1999_bytimeid on samples_1999(ts, sensor_id);
create unique index samples_2000_bytimeid on samples_2000(ts, sensor_id);
create unique index samples_2001_bytimeid on samples_2001(ts, sensor_id);
create unique index samples_2002_bytimeid on samples_2002(ts, sensor_id);
create unique index samples_2003_bytimeid on samples_2003(ts, sensor_id);
create unique index samples_2004_bytimeid on samples_2004(ts, sensor_id);

    The tables contain the following number of rows:

samples_1999    311030
samples_2000    2142245
samples_2001    2706571
samples_2002    3111602
samples_2003    3149316
samples_2004    2375972

    The following view creates the illusion of the old ``single-table''
model:

create view samples as
     select * from samples_1999
  union  select * from samples_2000
  union  select * from samples_2001
  union  select * from samples_2002
  union  select * from samples_2003
  union  select * from samples_2004;

    ...along with the following rule on the view for the applications
performing inserts:

create rule sample_rule as on insert to samples
   do instead
   insert into samples_2004 (ts, sensor_id, sample)
     values(new.ts, new.sensor_id, new.sample);
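
    With that rule in place, an insert against the view lands in
samples_2004; for example (values here are purely illustrative):

insert into samples (ts, sensor_id, sample)
     values ('2004-10-01 12:00:00', 1, 21.5);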


    OK, now that that's over with, I have this one particular query that I
attempt to run for a report from my phone that no longer works because
it tries to do a table scan on *some* of the tables.  Why it chooses
this table scan, I can't imagine.  The query is as follows:

select
         s.serial as serial_num,
         s.name as name,
         date(ts) as day,
         min(sample) as min_temp,
         avg(sample) as avg_temp,
         stddev(sample) as stddev_temp,
         max(sample) as max_temp
     from
         samples inner join sensors s using (sensor_id)
     where
         ts > current_date - 7
     group by
         serial_num, name, day
     order by
         serial_num, day desc


    explain analyze reports the following:

  Sort  (cost=1185281.45..1185285.95 rows=1800 width=50) (actual time=82832.106..82832.147 rows=56 loops=1)
    Sort Key: s.serial, date(samples.ts)
    ->  HashAggregate  (cost=1185161.62..1185184.12 rows=1800 width=50) (actual time=82830.624..82831.601 rows=56 loops=1)
          ->  Hash Join  (cost=1063980.21..1181539.96 rows=206952 width=50) (actual time=80408.123..81688.590 rows=66389 loops=1)
                Hash Cond: ("outer".sensor_id = "inner".sensor_id)
                ->  Subquery Scan samples  (cost=1063979.10..1155957.38 rows=4598914 width=20) (actual time=80392.477..80922.764 rows=66389 loops=1)
                      ->  Unique  (cost=1063979.10..1109968.24 rows=4598914 width=20) (actual time=80392.451..80646.761 rows=66389 loops=1)
                            ->  Sort  (cost=1063979.10..1075476.39 rows=4598914 width=20) (actual time=80392.437..80442.787 rows=66389 loops=1)
                                  Sort Key: ts, sensor_id, sample
                                  ->  Append  (cost=0.00..312023.46 rows=4598914 width=20) (actual time=79014.428..80148.396 rows=66389 loops=1)
                                        ->  Subquery Scan "*SELECT* 1"  (cost=0.00..9239.37 rows=103677 width=20) (actual time=4010.181..4010.181 rows=0 loops=1)
                                              ->  Seq Scan on samples_1999  (cost=0.00..8202.60 rows=103677 width=20) (actual time=4010.165..4010.165 rows=0 loops=1)
                                                    Filter: (ts > ((('now'::text)::date - 7))::timestamp without time zone)
                                        ->  Subquery Scan "*SELECT* 2"  (cost=0.00..28646.17 rows=714082 width=20) (actual time=44.827..44.827 rows=0 loops=1)
                                              ->  Index Scan using samples_2000_bytime on samples_2000  (cost=0.00..21505.35 rows=714082 width=20) (actual time=44.818..44.818 rows=0 loops=1)
                                                    Index Cond: (ts > ((('now'::text)::date - 7))::timestamp without time zone)
                                        ->  Subquery Scan "*SELECT* 3"  (cost=0.00..80393.33 rows=902191 width=20) (actual time=34772.377..34772.377 rows=0 loops=1)
                                              ->  Seq Scan on samples_2001  (cost=0.00..71371.42 rows=902191 width=20) (actual time=34772.366..34772.366 rows=0 loops=1)
                                                    Filter: (ts > ((('now'::text)::date - 7))::timestamp without time zone)
                                        ->  Subquery Scan "*SELECT* 4"  (cost=0.00..92424.05 rows=1037201 width=20) (actual time=40072.103..40072.103 rows=0 loops=1)
                                              ->  Seq Scan on samples_2002  (cost=0.00..82052.04 rows=1037201 width=20) (actual time=40072.090..40072.090 rows=0 loops=1)
                                                    Filter: (ts > ((('now'::text)::date - 7))::timestamp without time zone)
                                        ->  Subquery Scan "*SELECT* 5"  (cost=0.00..42380.58 rows=1049772 width=20) (actual time=49.455..49.455 rows=0 loops=1)
                                              ->  Index Scan using samples_2003_bytime on samples_2003  (cost=0.00..31882.86 rows=1049772 width=20) (actual time=49.448..49.448 rows=0 loops=1)
                                                    Index Cond: (ts > ((('now'::text)::date - 7))::timestamp without time zone)
                                        ->  Subquery Scan "*SELECT* 6"  (cost=0.00..58939.96 rows=791991 width=20) (actual time=65.458..1124.363 rows=66389 loops=1)
                                              ->  Index Scan using samples_2004_bytime on samples_2004  (cost=0.00..51020.05 rows=791991 width=20) (actual time=65.430..750.336 rows=66389 loops=1)
                                                    Index Cond: (ts > ((('now'::text)::date - 7))::timestamp without time zone)
                ->  Hash  (cost=1.09..1.09 rows=9 width=38) (actual time=15.295..15.295 rows=0 loops=1)
                      ->  Seq Scan on sensors s  (cost=0.00..1.09 rows=9 width=38) (actual time=15.122..15.187 rows=9 loops=1)
  Total runtime: 82865.119 ms


    Essentially, what you can see here is that it's doing an index scan on
samples_2000, samples_2003, and samples_2004, but a sequential scan on
samples_1999, samples_2001, and samples_2002.  It's very strange to me
that it would make these choices.  If I disable sequential scans
altogether for this session, the query runs in under 4 seconds.
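
    (For reference, a sketch of the session-level toggle meant by
``disable sequential scans'':)

set enable_seqscan = off;  -- planner setting for this session only
-- ...run the report query, then restore the default:
set enable_seqscan = on;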

    This is a very cool solution for long-term storage, and isn't terribly
hard to manage.  I actually have other report queries that seem to be
making pretty good index selection currently...but I always want more!
:)  Does anyone have any suggestions as to how to get this to do what I
want?

    Of course, ideally, it would ignore five of the tables altogether.  :)

--
SPY                      My girlfriend asked me which one I like better.
pub  1024/3CAE01D5 1994/11/03 Dustin Sallings <dustin@spy.net>
|    Key fingerprint =  87 02 57 08 02 D0 DA D6  C8 0F 3E 65 51 98 D8 BE
L_______________________ I hope the answer won't upset her. ____________


Re: inconsistent/weird index usage

From: John Meinel
Dustin Sallings wrote:
>

[...]

>     OK, now that that's over with, I have this one particular query that
> I attempt to run for a report from my phone that no longer works because
> it tries to do a table scan on *some* of the tables.  Why it chooses
> this table scan, I can't imagine.  The query is as follows:
>
> select
>         s.serial as serial_num,
>         s.name as name,
>         date(ts) as day,
>         min(sample) as min_temp,
>         avg(sample) as avg_temp,
>         stddev(sample) as stddev_temp,
>         max(sample) as max_temp
>     from
>         samples inner join sensors s using (sensor_id)
>     where
>         ts > current_date - 7
>     group by
>         serial_num, name, day
>     order by
>         serial_num, day desc
>
>

[ next section heavily clipped for clarity ]

->  Seq Scan on samples_1999  (cost rows=103677) (actual rows=0 loops=1)

->  Index Scan using samples_2000_bytime on samples_2000  (cost rows=714082) (actual rows=0 loops=1)

->  Seq Scan on samples_2001  (cost rows=902191) (actual rows=0 loops=1)

->  Seq Scan on samples_2002  (cost rows=1037201) (actual rows=0 loops=1)

->  Index Scan using samples_2003_bytime on samples_2003  (cost rows=1049772) (actual rows=0 loops=1)

->  Index Scan using samples_2004_bytime on samples_2004  (cost rows=791991) (actual rows=66389 loops=1)

[...]
>
>
>     Essentially, what you can see here is that it's doing an index scan
> on samples_2000, samples_2003, and samples_2004, but a sequential scan
> on samples_1999, samples_2001, and samples_2002.  It's very strange to
> me that it would make these choices.  If I disable sequential scans
> altogether for this session, the query runs in under 4 seconds.
>
>     This is a very cool solution for long-term storage, and isn't
> terribly hard to manage.  I actually have other report queries that seem
> to be making pretty good index selection currently...but I always want
> more!  :)  Does anyone have any suggestions as to how to get this to do
> what I want?
>
>     Of course, ideally, it would ignore five of the tables altogether.  :)

Just as a heads up: you have run VACUUM ANALYZE before running this
query, correct?

Because you'll notice that the query planner thinks it will get 103677
rows from 1999, ~700,000 rows from 2000, ~900,000 rows from 2001, etc.,
etc. Obviously the query planner is not planning well, considering
there are only about 66,000 rows from 2004 and no rows at all from the
other years.

It just seems like it hasn't updated its statistics to reflect the
time ranges actually present in most of the tables.
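
For what it's worth, refreshing the statistics is just the standard
commands (sketch):

vacuum analyze samples_1999;
-- ...repeat for each samples_[year] table; plain ANALYZE also
-- refreshes statistics without reclaiming space:
analyze samples_2004;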

(By the way, an indexed scan returning 0 entries is *really* fast, so I
wouldn't worry about ignoring the extra tables. :)

I suppose the other question is whether this is a prepared or stored
query, because sometimes the query planner cannot do enough
optimization in a prepared query. (I ran into this where one column had
500,000+ rows referencing a single value. Run manually, the query was
much faster because I wasn't asking for *that* value; as a prepared
query, the planner had to allow for the possibility that I *might* ask
for it, and didn't want to risk 500,000+ index lookups.)

The only other thing I can think of is that there might be some
collision between datetime and date: perhaps it thinks it is looking at
the time of day when it plans the queries (hence the huge row
estimates), when really it is looking at the date. Perhaps a cast is in
order to make it work right. I don't really know.
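
A sketch of such a cast (illustrative query, not from the original
post):

select count(*) from samples_2004
     where ts > (current_date - 7)::timestamp;  -- cast to match ts's type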

Interesting problem, though.
John
=:->



Re: inconsistent/weird index usage

From: Tom Lane
Dustin Sallings <dustin@spy.net> writes:
>     The following view creates the illusion of the old ``single-table''
> model:

> create view samples as
>      select * from samples_1999
>   union  select * from samples_2000
>   union  select * from samples_2001
>   union  select * from samples_2002
>   union  select * from samples_2003
>   union  select * from samples_2004

You really, really, really want to use UNION ALL not UNION here.
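
Spelled out, that's the same view with ALL added, which skips the
duplicate-eliminating Sort/Unique steps visible in the plan above:

create view samples as
     select * from samples_1999
  union all select * from samples_2000
  union all select * from samples_2001
  union all select * from samples_2002
  union all select * from samples_2003
  union all select * from samples_2004;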

>     OK, now that that's over with, I have this one particular query that I
> attempt to run for a report from my phone that no longer works because
> it tries to do a table scan on *some* of the tables.  Why it chooses
> this table scan, I can't imagine.

Most of the problem here comes from the fact that "current_date - 7"
isn't reducible to a constant and so the planner is making bad guesses
about how much of each table will be scanned.  If possible, do the date
arithmetic on the client side and send over a simple literal constant.
If that's not practical you can fake it with a mislabeled IMMUTABLE
function --- see the list archives for previous discussions of the
same issue.
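
A minimal sketch of that trick (function name and body illustrative,
not from the archives): the function is really stable, since its result
changes from day to day, but the IMMUTABLE label lets the planner fold
the call to a constant at plan time:

create function recent_cutoff() returns timestamp as
  'select (current_date - 7)::timestamp'
  language sql immutable;

-- ...then:  where ts > recent_cutoff()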

            regards, tom lane

Re: inconsistent/weird index usage

From: Richard Huxton
Dustin Sallings wrote:
>     The following view creates the illusion of the old ``single-table''
> model:
>
> create view samples as
>     select * from samples_1999
>  union  select * from samples_2000
>  union  select * from samples_2001
>  union  select * from samples_2002
>  union  select * from samples_2003
>  union  select * from samples_2004

Try this with UNION ALL (you know there won't be any duplicates) and
possibly with some limits too:

SELECT * FROM samples_1999 WHERE ts BETWEEN '1999-01-01 00:00:00+00' AND
'1999-12-31 23:59:59+00'
UNION ALL ...
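
In full, a sketch along those lines (the remaining years follow the
same pattern; bounds shown are illustrative):

create view samples as
      select * from samples_1999
      where ts between '1999-01-01 00:00:00+00' and '1999-12-31 23:59:59+00'
  union all
      select * from samples_2000
      where ts between '2000-01-01 00:00:00+00' and '2000-12-31 23:59:59+00'
  union all
      select * from samples_2001
      where ts between '2001-01-01 00:00:00+00' and '2001-12-31 23:59:59+00'
  union all
      select * from samples_2002
      where ts between '2002-01-01 00:00:00+00' and '2002-12-31 23:59:59+00'
  union all
      select * from samples_2003
      where ts between '2003-01-01 00:00:00+00' and '2003-12-31 23:59:59+00'
  union all
      select * from samples_2004
      where ts between '2004-01-01 00:00:00+00' and '2004-12-31 23:59:59+00';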

> select
>         s.serial as serial_num,
>         s.name as name,
>         date(ts) as day,
>         min(sample) as min_temp,
>         avg(sample) as avg_temp,
>         stddev(sample) as stddev_temp,
>         max(sample) as max_temp
>     from
>         samples inner join sensors s using (sensor_id)
>     where
>         ts > current_date - 7
>     group by
>         serial_num, name, day
>     order by
>         serial_num, day desc

Try restricting the timestamp too

WHERE
   ts BETWEEN (current_date - 7) AND current_timestamp

Hopefully that will give the planner enough smarts to know it can skip
most of the samples_200x tables.
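
That is, the report query becomes (sketch):

select
         s.serial as serial_num,
         s.name as name,
         date(ts) as day,
         min(sample) as min_temp,
         avg(sample) as avg_temp,
         stddev(sample) as stddev_temp,
         max(sample) as max_temp
     from
         samples inner join sensors s using (sensor_id)
     where
         ts between (current_date - 7) and current_timestamp
     group by
         serial_num, name, day
     order by
         serial_num, day desc;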

--
   Richard Huxton
   Archonet Ltd

Re: inconsistent/weird index usage

From: Josh Berkus
Tom,

> Most of the problem here comes from the fact that "current_date - 7"
> isn't reducible to a constant and so the planner is making bad guesses
> about how much of each table will be scanned.

I thought this was fixed in 7.4.  No?

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: inconsistent/weird index usage

From: Tom Lane
Josh Berkus <josh@agliodbs.com> writes:
>> Most of the problem here comes from the fact that "current_date - 7"
>> isn't reducible to a constant and so the planner is making bad guesses
>> about how much of each table will be scanned.

> I thought this was fixed in 7.4.  No?

No.  It's not fixed as of CVS tip either, although there was some talk
of doing something in time for 8.0.

            regards, tom lane

Re: inconsistent/weird index usage

From: Hannu Krosing
On R, 2004-10-01 at 19:34, Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
> >> Most of the problem here comes from the fact that "current_date - 7"
> >> isn't reducible to a constant and so the planner is making bad guesses
> >> about how much of each table will be scanned.
>
> > I thought this was fixed in 7.4.  No?
>
> No.  It's not fixed as of CVS tip either, although there was some talk
> of doing something in time for 8.0.

That's weird - my 7.4.2 databases did not consider (now() - '15
min'::interval) to be a constant, whereas 7.4.5 does (i.e. it does use
an index scan on the datetime column's index).

Is this somehow different for date types?

--------------
Hannu

Re: inconsistent/weird index usage

From: Tom Lane
Hannu Krosing <hannu@tm.ee> writes:
>> No.  It's not fixed as of CVS tip either, although there was some talk
>> of doing something in time for 8.0.

> That's weird - my 7.4.2 databases did not consider (now() - '15
> min'::interval) to be a constant, whereas 7.4.5 does (i.e. it does use
> an index scan on the datetime column's index).

The question isn't whether it can use it as an indexscan bound; the
question is whether it can derive an accurate rowcount estimate.
The issue is exactly that STABLE functions work for one but not the
other.
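
A sketch of the distinction (queries illustrative):

-- Fine as an indexscan bound: the STABLE expression is evaluated once
-- at executor startup, so the index can still be used...
select * from samples_2004 where ts > now() - '15 min'::interval;

-- ...but only a literal constant is available at plan time, so only
-- this form gives the planner an accurate rowcount estimate:
select * from samples_2004 where ts > '2004-10-01 19:19:00';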

            regards, tom lane