Thread: Re: [GENERAL] select query takes 13 seconds to run with index

Re: [GENERAL] select query takes 13 seconds to run with index

From: mark
On Mon, May 26, 2008 at 4:26 PM, Justin <justin@emproshunts.com> wrote:
> mark wrote:
>> Hi, is there any way this can be made faster?  id is the primary key,
>> and there is an index on uid.
>> thanks
>> EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id DESC limit 6;
>>                                                  QUERY PLAN
>> ----------------------------------------------------------------------------------------------------------------
>>  Limit  (cost=0.00..9329.02 rows=6 width=135) (actual time=13612.247..13612.247 rows=0 loops=1)
>>    ->  Index Scan Backward using pokes_pkey on pokes  (cost=0.00..5182270.69 rows=3333 width=135) (actual time=13612.245..13612.245 rows=0 loops=1)
>>          Filter: (uid = 578439028)
>>  Total runtime: 13612.369 ms
>> (4 rows)
> First, this should be posted on the performance list.
sorry about this.

> how many records are in this table?
22334262, 22 million records.

> The estimate is way off; when was the last time vacuum was run on the table?
about a week ago I ran VACUUM VERBOSE ANALYZE;
this table is never updated or deleted, rows are just inserted...


> What version of PostgreSQL are you running?
8.3.1

> Size of the Table
22 million rows approximately

> Table layout
CREATE TABLE pokes
(
  id serial NOT NULL,
  uid integer,
  action_id integer,
  created timestamp without time zone DEFAULT now(),
  friend_id integer,
  message text,
  pic text,
  "name" text,
  CONSTRAINT pokes_pkey PRIMARY KEY (id)
)
WITH (OIDS=FALSE);
ALTER TABLE pokes OWNER TO postgres;

-- Index: idx_action_idx

-- DROP INDEX idx_action_idx;

CREATE INDEX idx_action_idx
  ON pokes
  USING btree
  (action_id);

-- Index: idx_friend_id

-- DROP INDEX idx_friend_id;

CREATE INDEX idx_friend_id
  ON pokes
  USING btree
  (friend_id);

-- Index: idx_pokes_uid

-- DROP INDEX idx_pokes_uid;

CREATE INDEX idx_pokes_uid
  ON pokes
  USING btree
  (uid);


> Load on the database
how do I measure load on the database?

Re: [GENERAL] select query takes 13 seconds to run with index

From: mark
On Mon, May 26, 2008 at 4:49 PM, Justin <justin@emproshunts.com> wrote:
> I have no experience with large datasets, so people with more experience in
> this area are going to have to chime in.
> My gut feeling is that 13 seconds for PostgreSQL to sort through an index and
> table of that size is not bad.
>
> You may need to take a look at hardware and postgresql.conf settings to
> improve the performance of this query.
>
> The query is very simple; changing it around or adding an index is not going
> to produce massive improvements in this case.
the hardware is a dual quad-core E5405 machine with 16GB RAM, running 8.3.1
with default settings except maximum connections increased...

>> The estimate is way off; when was the last time vacuum was run on the table?
> about a week ago I ran VACUUM VERBOSE ANALYZE;
> this table is never updated or deleted, rows are just inserted...

    You should analyze it more often, then... Postgres probably thinks the
table has the same data distribution as last week!
    Analyze just takes a couple seconds...
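
    As a concrete example (table name taken from this thread), a per-table
ANALYZE is far lighter than the database-wide VACUUM VERBOSE ANALYZE mentioned
above:

        -- refresh the planner statistics for this one table
        ANALYZE pokes;
        -- or, with progress messages about what was sampled:
        ANALYZE VERBOSE pokes;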

>> Load on the database
> how do I measure load on the database?

    Just look at vmstat.

    Also if you very often do SELECT .. WHERE x = ... ORDER BY id DESC you'll
benefit from an index on (x,id) instead of just (x).
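
    In this thread's terms that would be something like the following (the
index name is only an example):

CREATE INDEX idx_pokes_uid_id ON pokes (uid, id);

    With that index the planner can walk the (uid, id) entries for a single
uid backwards in id order and stop after six rows, instead of scanning
pokes_pkey backwards and filtering every row on uid.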



Re: [GENERAL] select query takes 13 seconds to run with index

From: hubert depesz lubaczewski
On Mon, May 26, 2008 at 04:32:50PM -0700, mark wrote:
> >> EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id
> >> DESC limit 6;
> > The estimate is way off; when was the last time vacuum was run on the table?
> about a week ago I ran VACUUM VERBOSE ANALYZE;
> this table is never updated or deleted, rows are just inserted...

1. boost default_statistics_target
2. run analyze more often - a daily job, for example (see the sketch below)
3. create index q on pokes (uid, id); should help
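
a minimal sketch of points 1 and 2, using 100 only as an example target (pick
per-column or global as you prefer):

-- per-column: only the uid statistics get the larger target
ALTER TABLE pokes ALTER COLUMN uid SET STATISTICS 100;
-- or globally, via postgresql.conf:  default_statistics_target = 100
-- either way, re-analyze so the new target takes effect:
ANALYZE pokes;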

depesz

Re: [GENERAL] select query takes 13 seconds to run with index

From
mark
Date:
On Tue, May 27, 2008 at 1:22 AM, hubert depesz lubaczewski
<depesz@depesz.com> wrote:
> On Mon, May 26, 2008 at 04:32:50PM -0700, mark wrote:
>> >> EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id
>> >> DESC limit 6;
>> > The estimate is way off; when was the last time vacuum was run on the table?
>> about a week ago I ran VACUUM VERBOSE ANALYZE;
>> this table is never updated or deleted, rows are just inserted...
>
> 1. boost default_statistics_target
> 2. run analyze more often - daily job for example
> 3. create index q on pokes (uid, id); should help

OK I did this

ALTER TABLE pokes ALTER uid set statistics 500;
ALTER TABLE

ANALYZE pokes;
ANALYZE

and then it became super fast!! thanks a lot!!!
my question:
-> is 500 too high? what all does this affect?
-> now increasing this number does it affect only when i am running
analyze commands, or will it slow down inserts and other operations?
EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id desc limit 6;
                                                             QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=467.80..467.81 rows=6 width=134) (actual time=0.016..0.016 rows=0 loops=1)
   ->  Sort  (cost=467.80..468.09 rows=117 width=134) (actual time=0.016..0.016 rows=0 loops=1)
         Sort Key: id
         Sort Method:  quicksort  Memory: 25kB
         ->  Index Scan using idx_pokes_uid on pokes  (cost=0.00..465.70 rows=117 width=134) (actual time=0.011..0.011 rows=0 loops=1)
               Index Cond: (uid = 578439028)
 Total runtime: 0.037 ms

Re: [GENERAL] select query takes 13 seconds to run with index

From: hubert depesz lubaczewski
On Tue, May 27, 2008 at 07:46:05AM -0700, mark wrote:
> and then it became super fast!! thanks a lot!!!
> my question:
> -> is 500 too high? what all does this affect?

i usually don't go over 100. it affects the number of elements kept in the
statistics for a column. you can see the stats in:
select * from pg_stats;
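
for example, to check what got gathered for the uid column (these are standard
pg_stats columns):

select attname, null_frac, n_distinct, correlation
  from pg_stats
 where tablename = 'pokes'
   and attname = 'uid';

the most_common_vals and histogram_bounds columns of the same view hold the
actual value lists - that is where the extra elements from a higher statistics
target end up.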

> -> now increasing this number does it affect only when i am running
> analyze commands, or will it slow down inserts and other operations?
> EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id
> desc limit 6;

it (theoretically) can slow down selects due to the fact that it now has
to load more data to be able to plan (i.e. it loads the statistics, and
since there are more values - the statistics are larger).

generally - in most cases this shouldn't be an issue.

additionally - i think that the 2-column index would work even better in this
particular case.

regards,

depesz