Thread: Problems with ordering (can't force query planner to use an index)
Hey, I have a table that links content together and it currently holds about 17 mio records. Typical query is a join with a content table and link table: noovo-new=# explain analyze SELECT "core_accessor"."id", "core_accessor"."content_type_id", "core_accessor"."object_id", "core_accessor"."ordering", "core_accessor"."label", "core_accessor"."date_posted", "core_accessor"."publish_state", "core_accessor"."nooximity_old", "core_accessor"."rising", "core_accessor"."nooximity", "core_accessor"."nooximity_old_date_posted", "core_accessor"."nooximity_date_posted", "core_accessor"."user_id", "core_accessor"."slot_id", "core_accessor"."slot_type_id", "core_accessor"."role", "core_base"."object_id", "core_base"."content_type_id", "core_base"."abstract", "core_base"."abstract_title", "core_base"."image", "core_base"."date_posted", "core_base"."date_modified", "core_base"."date_expires", "core_base"."publish_state", "core_base"."location", "core_base"."location_x", "core_base"."location_y", "core_base"."raw", "core_base"."author_id", "core_base"."excerpt", "core_base"."state_id", "core_base"."country_id", "core_base"."language", "core_base"."_identifier", "core_base"."slot_url", "core_base"."source_id", "core_base"."source_content_type_id", "core_base"."source_type", "core_base"."source_value", "core_base"."source_title", "core_base"."direct_to_source", "core_base"."comment_count", "core_base"."public" FROM "core_accessor" INNER JOIN core_base AS core_base ON core_base.content_type_id = core_accessor.content_type_id AND core_base.object_id = core_accessor.object_id WHERE (("core_accessor"."slot_type_id" = 119 AND "core_accessor"."slot_id" = 472 AND "core_accessor"."label" = E'' AND "core_accessor"."publish_state" >= 60 AND "core_accessor"."role" IN (0) AND "core_accessor"."user_id" = 0)) order by core_accessor.date_posted, core_accessor.nooximity LIMIT 5 ; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=31930.65..31930.66 rows=5 width=860) (actual time=711.924..711.927 rows=5 loops=1) -> Sort (cost=31930.65..31937.80 rows=2861 width=860) (actual time=711.923..711.923 rows=5 loops=1) Sort Key: core_accessor.date_posted, core_accessor.nooximity Sort Method: top-N heapsort Memory: 31kB -> Nested Loop (cost=0.00..31883.13 rows=2861 width=860) (actual time=0.089..543.497 rows=68505 loops=1) -> Index Scan using core_accessor_fresh_idx on core_accessor (cost=0.00..5460.07 rows=2970 width=92) (actual time=0.068..54.921 rows=69312 loops=1) Index Cond: ((slot_id = 472) AND (slot_type_id = 119) AND (label = ''::text) AND (user_id = 0) AND (role = 0) AND (publish_state >= 60)) -> Index Scan using core_base_pkey on core_base (cost=0.00..8.88 rows=1 width=768) (actual time=0.004..0.005 rows=1 loops=69312) Index Cond: ((core_base.object_id = core_accessor.object_id) AND (core_base.content_type_id = core_accessor.content_type_id)) Total runtime: 712.031 ms (10 rows) noovo-new=# select * from pg_stat_user_tables where relname='core_accessor'; relid | schemaname | relname | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup | last_vacuum | last_autovacuum | last_analyze | last_autoanalyze -------+------------+---------------+----------+--------------+----------+---------------+-----------+-----------+-----------+---------------+------------+------------+-------------------------------+-----------------+-------------------------------+------------------ 51159 | public | core_accessor | 58 | 749773516 | 13785608 | 149165183 | 9566 | 548 | 347 | 206 | 17144303 | 251 | 2009-03-03 07:02:19.733778-06 | | 2009-03-03 06:17:47.784268-06 | (1 row) noovo-new=# \d+ core_accessor; Table "public.core_accessor" Column | Type | Modifiers | Description ---------------------------+--------------------------+------------------------------------------------------------+------------- id | bigint | not null default nextval('core_accessor_id_seq'::regclass) | flavor | character varying(32) | | content_type_id | integer | not null | object_id | integer | not null | publish_state | smallint | not null | date_posted | timestamp with time zone | not null | user_id | integer | | slot_id | integer | | slot_type_id | integer | | role | smallint | | ordering | integer | | author_id | integer | | nooximity_old | double precision | default 0.0 | rising | double precision | default 0.0 | label | text | | nooximity | double precision | not null default 1.0 | nooximity_old_date_posted | timestamp with time zone | | nooximity_date_posted | timestamp with time zone | | Indexes: "portal_metainfo_pkey" PRIMARY KEY, btree (id) "portal_metainfo_unique_constr" UNIQUE, btree (content_type_id, object_id, user_id, slot_id, slot_type_id, role, label) CLUSTER "core_accessor_date_idx" btree (date_posted, nooximity) "core_accessor_dated_idx" btree (slot_id, slot_type_id, label, user_id, role, publish_state, date_posted, nooximity) "core_accessor_fresh_idx" btree (slot_id, slot_type_id, label, user_id, role, publish_state) "core_accessor_popularity_idx" btree (nooximity, date_posted) Check constraints: "portal_metainfo_object_id_check" CHECK (object_id >= 0) "portal_metainfo_owner_id_check" CHECK (slot_id >= 0) Foreign-key constraints: "portal_metainfo_accessor_id_fkey" FOREIGN KEY (user_id) REFERENCES auth_user(id) DEFERRABLE INITIALLY DEFERRED "portal_metainfo_content_type_id_fkey" FOREIGN KEY (content_type_id) REFERENCES django_content_type(id) DEFERRABLE INITIALLY DEFERRED "portal_metainfo_owner_type_id_fkey" FOREIGN KEY (slot_type_id) REFERENCES django_content_type(id) DEFERRABLE INITIALLY DEFERRED Has OIDs: no As far as I understand the explain, it fetches 68505 rows, matches them with core_base and then tries to sort them? AFAIK it would probably be much more effective to just find the records in accessor via core_accessor_dated_idx and then lookup the core_base table? But for some reason it doesn't want to? I ran analyze, vacuum and reindex but nothing helped. Queries just eat all the I/O and block. There is a huge difference between cached and non-cached queries, like 50.000 to 50 ms. Help! :) Thanks, Sebastjan
On Tue, Mar 3, 2009 at 12:05 PM, Sebastjan Trepca <trepca@gmail.com> wrote: > Hey, > > I have a table that links content together and it currently holds > about 17 mio records. Typical query is a join with a content table and > link table: > > noovo-new=# explain analyze SELECT "core_accessor"."id", > "core_accessor"."content_type_id", > "core_accessor"."object_id", "core_accessor"."ordering", > "core_accessor"."label", "core_accessor"."date_posted", > "core_accessor"."publish_state", "core_accessor"."nooximity_old", > "core_accessor"."rising", "core_accessor"."nooximity", > "core_accessor"."nooximity_old_date_posted", > "core_accessor"."nooximity_date_posted", "core_accessor"."user_id", > "core_accessor"."slot_id", "core_accessor"."slot_type_id", > "core_accessor"."role", "core_base"."object_id", > "core_base"."content_type_id", "core_base"."abstract", > "core_base"."abstract_title", "core_base"."image", > "core_base"."date_posted", "core_base"."date_modified", > "core_base"."date_expires", "core_base"."publish_state", > "core_base"."location", "core_base"."location_x", > "core_base"."location_y", "core_base"."raw", "core_base"."author_id", > "core_base"."excerpt", "core_base"."state_id", > "core_base"."country_id", "core_base"."language", > "core_base"."_identifier", > "core_base"."slot_url", "core_base"."source_id", > "core_base"."source_content_type_id", "core_base"."source_type", > "core_base"."source_value", "core_base"."source_title", > "core_base"."direct_to_source", "core_base"."comment_count", > "core_base"."public" FROM "core_accessor" INNER JOIN core_base AS > core_base ON core_base.content_type_id = > core_accessor.content_type_id AND core_base.object_id = > core_accessor.object_id WHERE (("core_accessor"."slot_type_id" = 119 > AND "core_accessor"."slot_id" = 472 AND "core_accessor"."label" = E'' > AND "core_accessor"."publish_state" >= 60 AND > "core_accessor"."role" IN (0) AND "core_accessor"."user_id" = 0)) > order by core_accessor.date_posted, core_accessor.nooximity LIMIT 5 > ; > > QUERY PLAN > --------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=31930.65..31930.66 rows=5 width=860) (actual > time=711.924..711.927 rows=5 loops=1) > -> Sort (cost=31930.65..31937.80 rows=2861 width=860) (actual > time=711.923..711.923 rows=5 loops=1) > Sort Key: core_accessor.date_posted, core_accessor.nooximity > Sort Method: top-N heapsort Memory: 31kB > -> Nested Loop (cost=0.00..31883.13 rows=2861 width=860) > (actual time=0.089..543.497 rows=68505 loops=1) > -> Index Scan using core_accessor_fresh_idx on > core_accessor (cost=0.00..5460.07 rows=2970 width=92) (actual > time=0.068..54.921 rows=69312 loops=1) > Index Cond: ((slot_id = 472) AND (slot_type_id = > 119) AND (label = ''::text) AND (user_id = 0) AND (role = 0) AND > (publish_state >= 60)) > -> Index Scan using core_base_pkey on core_base > (cost=0.00..8.88 rows=1 width=768) (actual time=0.004..0.005 rows=1 > loops=69312) > Index Cond: ((core_base.object_id = > core_accessor.object_id) AND (core_base.content_type_id = > core_accessor.content_type_id)) > Total runtime: 712.031 ms > (10 rows) > > noovo-new=# select * from pg_stat_user_tables where relname='core_accessor'; > relid | schemaname | relname | seq_scan | seq_tup_read | > idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | > n_tup_hot_upd | n_live_tup | n_dead_tup | last_vacuum > | last_autovacuum | last_analyze | last_autoanalyze > -------+------------+---------------+----------+--------------+----------+---------------+-----------+-----------+-----------+---------------+------------+------------+-------------------------------+-----------------+-------------------------------+------------------ > 51159 | public | core_accessor | 58 | 749773516 | > 13785608 | 149165183 | 9566 | 548 | 347 | > 206 | 17144303 | 251 | 2009-03-03 07:02:19.733778-06 | > | 2009-03-03 06:17:47.784268-06 | > (1 row) > > noovo-new=# \d+ core_accessor; > Table "public.core_accessor" > Column | Type | > Modifiers | Description > ---------------------------+--------------------------+------------------------------------------------------------+------------- > id | bigint | not null > default nextval('core_accessor_id_seq'::regclass) | > flavor | character varying(32) | > | > content_type_id | integer | not null > | > object_id | integer | not null > | > publish_state | smallint | not null > | > date_posted | timestamp with time zone | not null > | > user_id | integer | > | > slot_id | integer | > | > slot_type_id | integer | > | > role | smallint | > | > ordering | integer | > | > author_id | integer | > | > nooximity_old | double precision | default 0.0 > | > rising | double precision | default 0.0 > | > label | text | > | > nooximity | double precision | not null > default 1.0 | > nooximity_old_date_posted | timestamp with time zone | > | > nooximity_date_posted | timestamp with time zone | > | > Indexes: > "portal_metainfo_pkey" PRIMARY KEY, btree (id) > "portal_metainfo_unique_constr" UNIQUE, btree (content_type_id, > object_id, user_id, slot_id, slot_type_id, role, label) CLUSTER > "core_accessor_date_idx" btree (date_posted, nooximity) > "core_accessor_dated_idx" btree (slot_id, slot_type_id, label, > user_id, role, publish_state, date_posted, nooximity) > "core_accessor_fresh_idx" btree (slot_id, slot_type_id, label, > user_id, role, publish_state) > "core_accessor_popularity_idx" btree (nooximity, date_posted) > Check constraints: > "portal_metainfo_object_id_check" CHECK (object_id >= 0) > "portal_metainfo_owner_id_check" CHECK (slot_id >= 0) > Foreign-key constraints: > "portal_metainfo_accessor_id_fkey" FOREIGN KEY (user_id) > REFERENCES auth_user(id) DEFERRABLE INITIALLY DEFERRED > "portal_metainfo_content_type_id_fkey" FOREIGN KEY > (content_type_id) REFERENCES django_content_type(id) DEFERRABLE > INITIALLY DEFERRED > "portal_metainfo_owner_type_id_fkey" FOREIGN KEY (slot_type_id) > REFERENCES django_content_type(id) DEFERRABLE INITIALLY DEFERRED > Has OIDs: no > > > > As far as I understand the explain, it fetches 68505 rows, matches > them with core_base and then tries to sort them? AFAIK it would > probably be much more effective to just find the records in accessor > via core_accessor_dated_idx and then lookup the core_base table? But > for some reason it doesn't want to? > > I ran analyze, vacuum and reindex but nothing helped. Queries just eat > all the I/O and block. There is a huge difference between cached and > non-cached queries, like 50.000 to 50 ms. > > Help! :) Please send the output of EXPLAIN ANALYZE for this query. ...Robert
But it's already attached in the first mail or am I missing something? If you don't see it, check this: http://pastebin.com/d71b996d0 Sebastjan On Tue, Mar 3, 2009 at 6:12 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Mar 3, 2009 at 12:05 PM, Sebastjan Trepca <trepca@gmail.com> wrote: >> Hey, >> >> I have a table that links content together and it currently holds >> about 17 mio records. Typical query is a join with a content table and >> link table: >> >> noovo-new=# explain analyze SELECT "core_accessor"."id", >> "core_accessor"."content_type_id", >> "core_accessor"."object_id", "core_accessor"."ordering", >> "core_accessor"."label", "core_accessor"."date_posted", >> "core_accessor"."publish_state", "core_accessor"."nooximity_old", >> "core_accessor"."rising", "core_accessor"."nooximity", >> "core_accessor"."nooximity_old_date_posted", >> "core_accessor"."nooximity_date_posted", "core_accessor"."user_id", >> "core_accessor"."slot_id", "core_accessor"."slot_type_id", >> "core_accessor"."role", "core_base"."object_id", >> "core_base"."content_type_id", "core_base"."abstract", >> "core_base"."abstract_title", "core_base"."image", >> "core_base"."date_posted", "core_base"."date_modified", >> "core_base"."date_expires", "core_base"."publish_state", >> "core_base"."location", "core_base"."location_x", >> "core_base"."location_y", "core_base"."raw", "core_base"."author_id", >> "core_base"."excerpt", "core_base"."state_id", >> "core_base"."country_id", "core_base"."language", >> "core_base"."_identifier", >> "core_base"."slot_url", "core_base"."source_id", >> "core_base"."source_content_type_id", "core_base"."source_type", >> "core_base"."source_value", "core_base"."source_title", >> "core_base"."direct_to_source", "core_base"."comment_count", >> "core_base"."public" FROM "core_accessor" INNER JOIN core_base AS >> core_base ON core_base.content_type_id = >> core_accessor.content_type_id AND core_base.object_id = >> core_accessor.object_id WHERE (("core_accessor"."slot_type_id" = 119 >> AND "core_accessor"."slot_id" = 472 AND "core_accessor"."label" = E'' >> AND "core_accessor"."publish_state" >= 60 AND >> "core_accessor"."role" IN (0) AND "core_accessor"."user_id" = 0)) >> order by core_accessor.date_posted, core_accessor.nooximity LIMIT 5 >> ; >> >> QUERY PLAN >> --------------------------------------------------------------------------------------------------------------------------------------------------------------------- >> Limit (cost=31930.65..31930.66 rows=5 width=860) (actual >> time=711.924..711.927 rows=5 loops=1) >> -> Sort (cost=31930.65..31937.80 rows=2861 width=860) (actual >> time=711.923..711.923 rows=5 loops=1) >> Sort Key: core_accessor.date_posted, core_accessor.nooximity >> Sort Method: top-N heapsort Memory: 31kB >> -> Nested Loop (cost=0.00..31883.13 rows=2861 width=860) >> (actual time=0.089..543.497 rows=68505 loops=1) >> -> Index Scan using core_accessor_fresh_idx on >> core_accessor (cost=0.00..5460.07 rows=2970 width=92) (actual >> time=0.068..54.921 rows=69312 loops=1) >> Index Cond: ((slot_id = 472) AND (slot_type_id = >> 119) AND (label = ''::text) AND (user_id = 0) AND (role = 0) AND >> (publish_state >= 60)) >> -> Index Scan using core_base_pkey on core_base >> (cost=0.00..8.88 rows=1 width=768) (actual time=0.004..0.005 rows=1 >> loops=69312) >> Index Cond: ((core_base.object_id = >> core_accessor.object_id) AND (core_base.content_type_id = >> core_accessor.content_type_id)) >> Total runtime: 712.031 ms >> (10 rows) >> >> noovo-new=# select * from pg_stat_user_tables where relname='core_accessor'; >> relid | schemaname | relname | seq_scan | seq_tup_read | >> idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | >> n_tup_hot_upd | n_live_tup | n_dead_tup | last_vacuum >> | last_autovacuum | last_analyze | last_autoanalyze >> -------+------------+---------------+----------+--------------+----------+---------------+-----------+-----------+-----------+---------------+------------+------------+-------------------------------+-----------------+-------------------------------+------------------ >> 51159 | public | core_accessor | 58 | 749773516 | >> 13785608 | 149165183 | 9566 | 548 | 347 | >> 206 | 17144303 | 251 | 2009-03-03 07:02:19.733778-06 | >> | 2009-03-03 06:17:47.784268-06 | >> (1 row) >> >> noovo-new=# \d+ core_accessor; >> Table "public.core_accessor" >> Column | Type | >> Modifiers | Description >> ---------------------------+--------------------------+------------------------------------------------------------+------------- >> id | bigint | not null >> default nextval('core_accessor_id_seq'::regclass) | >> flavor | character varying(32) | >> | >> content_type_id | integer | not null >> | >> object_id | integer | not null >> | >> publish_state | smallint | not null >> | >> date_posted | timestamp with time zone | not null >> | >> user_id | integer | >> | >> slot_id | integer | >> | >> slot_type_id | integer | >> | >> role | smallint | >> | >> ordering | integer | >> | >> author_id | integer | >> | >> nooximity_old | double precision | default 0.0 >> | >> rising | double precision | default 0.0 >> | >> label | text | >> | >> nooximity | double precision | not null >> default 1.0 | >> nooximity_old_date_posted | timestamp with time zone | >> | >> nooximity_date_posted | timestamp with time zone | >> | >> Indexes: >> "portal_metainfo_pkey" PRIMARY KEY, btree (id) >> "portal_metainfo_unique_constr" UNIQUE, btree (content_type_id, >> object_id, user_id, slot_id, slot_type_id, role, label) CLUSTER >> "core_accessor_date_idx" btree (date_posted, nooximity) >> "core_accessor_dated_idx" btree (slot_id, slot_type_id, label, >> user_id, role, publish_state, date_posted, nooximity) >> "core_accessor_fresh_idx" btree (slot_id, slot_type_id, label, >> user_id, role, publish_state) >> "core_accessor_popularity_idx" btree (nooximity, date_posted) >> Check constraints: >> "portal_metainfo_object_id_check" CHECK (object_id >= 0) >> "portal_metainfo_owner_id_check" CHECK (slot_id >= 0) >> Foreign-key constraints: >> "portal_metainfo_accessor_id_fkey" FOREIGN KEY (user_id) >> REFERENCES auth_user(id) DEFERRABLE INITIALLY DEFERRED >> "portal_metainfo_content_type_id_fkey" FOREIGN KEY >> (content_type_id) REFERENCES django_content_type(id) DEFERRABLE >> INITIALLY DEFERRED >> "portal_metainfo_owner_type_id_fkey" FOREIGN KEY (slot_type_id) >> REFERENCES django_content_type(id) DEFERRABLE INITIALLY DEFERRED >> Has OIDs: no >> >> >> >> As far as I understand the explain, it fetches 68505 rows, matches >> them with core_base and then tries to sort them? AFAIK it would >> probably be much more effective to just find the records in accessor >> via core_accessor_dated_idx and then lookup the core_base table? But >> for some reason it doesn't want to? >> >> I ran analyze, vacuum and reindex but nothing helped. Queries just eat >> all the I/O and block. There is a huge difference between cached and >> non-cached queries, like 50.000 to 50 ms. >> >> Help! :) > > Please send the output of EXPLAIN ANALYZE for this query. > > ...Robert >
On Tue, Mar 3, 2009 at 12:05 PM, Sebastjan Trepca <trepca@gmail.com> wrote: > -> Nested Loop (cost=0.00..31883.13 rows=2861 width=860) > (actual time=0.089..543.497 rows=68505 loops=1) > -> Index Scan using core_accessor_fresh_idx on > core_accessor (cost=0.00..5460.07 rows=2970 width=92) (actual > time=0.068..54.921 rows=69312 loops=1) > Index Cond: ((slot_id = 472) AND (slot_type_id = > 119) AND (label = ''::text) AND (user_id = 0) AND (role = 0) AND > (publish_state >= 60)) That index scan on core_accessor_fresh_idx has a pretty big disparity between what the planer expects to get (2970 rows) and what it actually gets (69312 rows). You should try increasing the statistics target if you haven't, then re-analyze and try the query again to see if the planner picks something better. The default of 10 is pretty small- try 100, or higher. -- - David T. Wilson david.t.wilson@gmail.com
Set statistics to 1000, reanalyzed and got exactly same results: noovo-new=# explain analyze SELECT "core_accessor"."id", "core_accessor"."content_type_id", "core_accessor"."object_id", "core_accessor"."ordering", "core_accessor"."label", "core_accessor"."date_posted", "core_accessor"."publish_state", "core_accessor"."nooximity_old", "core_accessor"."rising", "core_accessor"."nooximity", "core_accessor"."nooximity_old_date_posted", "core_accessor"."nooximity_date_posted", "core_accessor"."user_id", "core_accessor"."slot_id", "core_accessor"."slot_type_id", "core_accessor"."role", "core_base"."object_id", "core_base"."content_type_id", "core_base"."abstract", "core_base"."abstract_title", "core_base"."image", "core_base"."date_posted", "core_base"."date_modified", "core_base"."date_expires", "core_base"."publish_state", "core_base"."location", "core_base"."location_x", "core_base"."location_y", "core_base"."raw", "core_base"."author_id", "core_base"."excerpt", "core_base"."state_id", "core_base"."country_id", "core_base"."language", "core_base"."_identifier", "core_base"."slot_url", "core_base"."source_id", "core_base"."source_content_type_id", "core_base"."source_type", "core_base"."source_value", "core_base"."source_title", "core_base"."direct_to_source", "core_base"."comment_count", "core_base"."public" FROM "core_accessor" INNER JOIN core_base AS core_base ON core_base.content_type_id = core_accessor.content_type_id AND core_base.object_id = core_accessor.object_id WHERE (("core_accessor"."slot_type_id" = 119 AND "core_accessor"."slot_id" = 472 AND "core_accessor"."label" = E'' AND "core_accessor"."publish_state" >= 60 AND "core_accessor"."role" IN (0) AND "core_accessor"."user_id" = 0)) order by core_accessor.date_posted, core_accessor.nooximity LIMIT 5 ; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=31716.13..31716.14 rows=5 width=860) (actual time=711.340..711.343 rows=5 loops=1) -> Sort (cost=31716.13..31722.19 rows=2424 width=860) (actual time=711.339..711.339 rows=5 loops=1) Sort Key: core_accessor.date_posted, core_accessor.nooximity Sort Method: top-N heapsort Memory: 31kB -> Nested Loop (cost=0.00..31675.87 rows=2424 width=860) (actual time=0.076..544.039 rows=68505 loops=1) -> Index Scan using core_accessor_fresh_idx on core_accessor (cost=0.00..9234.77 rows=2511 width=92) (actual time=0.058..55.225 rows=69312 loops=1) Index Cond: ((slot_id = 472) AND (slot_type_id = 119) AND (label = ''::text) AND (user_id = 0) AND (role = 0) AND (publish_state >= 60)) -> Index Scan using core_base_pkey on core_base (cost=0.00..8.92 rows=1 width=768) (actual time=0.005..0.005 rows=1 loops=69312) Index Cond: ((core_base.object_id = core_accessor.object_id) AND (core_base.content_type_id = core_accessor.content_type_id)) Total runtime: 711.443 ms (10 rows) This is how I did it: noovo-new=# alter table core_accessor alter column slot_id set statistics 1000; ALTER TABLE noovo-new=# alter table core_accessor alter column slot_type_id set statistics 1000; ALTER TABLE noovo-new=# alter table core_accessor alter column label set statistics 1000; ALTER TABLE noovo-new=# alter table core_accessor alter column user_id set statistics 1000; ALTER TABLE noovo-new=# alter table core_accessor alter column role set statistics 1000; ALTER TABLE noovo-new=# alter table core_accessor alter column publish_state set statistics 1000; ALTER TABLE noovo-new=# analyze core_accessor; ANALYZE Sebastjan On Tue, Mar 3, 2009 at 6:34 PM, David Wilson <david.t.wilson@gmail.com> wrote: > On Tue, Mar 3, 2009 at 12:05 PM, Sebastjan Trepca <trepca@gmail.com> wrote: > >> -> Nested Loop (cost=0.00..31883.13 rows=2861 width=860) >> (actual time=0.089..543.497 rows=68505 loops=1) >> -> Index Scan using core_accessor_fresh_idx on >> core_accessor (cost=0.00..5460.07 rows=2970 width=92) (actual >> time=0.068..54.921 rows=69312 loops=1) >> Index Cond: ((slot_id = 472) AND (slot_type_id = >> 119) AND (label = ''::text) AND (user_id = 0) AND (role = 0) AND >> (publish_state >= 60)) > > That index scan on core_accessor_fresh_idx has a pretty big disparity > between what the planer expects to get (2970 rows) and what it > actually gets (69312 rows). You should try increasing the statistics > target if you haven't, then re-analyze and try the query again to see > if the planner picks something better. The default of 10 is pretty > small- try 100, or higher. > > > > -- > - David T. Wilson > david.t.wilson@gmail.com >
On Tue, Mar 3, 2009 at 12:20 PM, Sebastjan Trepca <trepca@gmail.com> wrote: > But it's already attached in the first mail or am I missing something? > > If you don't see it, check this: http://pastebin.com/d71b996d0 Woops, sorry, I thought you had sent plain EXPLAIN. I see it now. The lowest level at which I see a problem is here: -> Index Scan using core_accessor_fresh_idx on core_accessor (cost=0.00..5460.07 rows=2970 width=92) (actual time=0.068..54.921 rows=69312 loops=1) Index Cond: ((slot_id = 472) AND (slot_type_id = 119) AND (label = ''::text) AND (user_id = 0) AND (role = 0) AND (publish_state >= 60)) For some reason it expect 2970 rows but gets 69312. A good place to start is to change your default_statistics_target value to 100 in postgresql.conf, restart postgresql, and re-ANALYZE. ...Robert
Still the same :/ I raised the default_statistics_target to 600 (it was already 100). I then restarted pg, ran analyze through all tables and yet there is not effect. This is the output for core_accessor: INFO: analyzing "public.core_accessor" INFO: "core_accessor": scanned 291230 of 291230 pages, containing 17144315 live rows and 0 dead rows; 300000 rows in sample, 17144315 estimated total rows It thinks there are even less rows in the set: --------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=30816.49..30816.50 rows=5 width=855) (actual time=683.907..683.910 rows=5 loops=1) -> Sort (cost=30816.49..30822.29 rows=2321 width=855) (actual time=683.906..683.907 rows=5 loops=1) Sort Key: core_accessor.date_posted, core_accessor.nooximity Sort Method: top-N heapsort Memory: 31kB -> Nested Loop (cost=0.00..30777.94 rows=2321 width=855) (actual time=0.072..517.970 rows=68505 loops=1) -> Index Scan using core_accessor_fresh_idx on core_accessor (cost=0.00..8955.44 rows=2440 width=92) (actual time=0.056..53.107 rows=69312 loops=1) Index Cond: ((slot_id = 472) AND (slot_type_id = 119) AND (label = ''::text) AND (user_id = 0) AND (role = 0) AND (publish_state >= 60)) -> Index Scan using core_base_pkey on core_base (cost=0.00..8.93 rows=1 width=763) (actual time=0.004..0.005 rows=1 loops=69312) Index Cond: ((core_base.object_id = core_accessor.object_id) AND (core_base.content_type_id = core_accessor.content_type_id)) Total runtime: 684.015 ms (10 rows) Sebastjan On Tue, Mar 3, 2009 at 6:40 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Mar 3, 2009 at 12:20 PM, Sebastjan Trepca <trepca@gmail.com> wrote: >> But it's already attached in the first mail or am I missing something? >> >> If you don't see it, check this: http://pastebin.com/d71b996d0 > > Woops, sorry, I thought you had sent plain EXPLAIN. I see it now. > > The lowest level at which I see a problem is here: > > -> Index Scan using core_accessor_fresh_idx on core_accessor > (cost=0.00..5460.07 rows=2970 width=92) (actual time=0.068..54.921 > rows=69312 loops=1) > Index Cond: ((slot_id = 472) AND (slot_type_id = 119) AND (label = > ''::text) AND (user_id = 0) AND (role = 0) AND (publish_state >= 60)) > > For some reason it expect 2970 rows but gets 69312. > > A good place to start is to change your default_statistics_target > value to 100 in postgresql.conf, restart postgresql, and re-ANALYZE. > > ...Robert >
Maybe this is useful, I removed the JOIN and it uses other index(core_accessor_date_idx indexes (date_posted, nooximity)), but its still hardly any better: noovo-new=# explain analyze SELECT * FROM "core_accessor" WHERE (("core_accessor"."slot_type_id" = 119 noovo-new(# AND "core_accessor"."slot_id" = 472 AND "core_accessor"."label" = E'' AND "core_accessor"."publish_state" >= 60 AND noovo-new(# "core_accessor"."role" IN (0) AND "core_accessor"."user_id" = 0)) ORDER BY "core_accessor"."date_posted" DESC, "core_accessor"."nooximity" DESC LIMIT 5 noovo-new-# ; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..3709.56 rows=5 width=178) (actual time=4593.867..4597.587 rows=5 loops=1) -> Index Scan Backward using core_accessor_date_idx on core_accessor (cost=0.00..1810265.67 rows=2440 width=178) (actual time=4593.866..4597.583 rows=5 loops=1) Filter: ((publish_state >= 60) AND (slot_type_id = 119) AND (slot_id = 472) AND (label = ''::text) AND (role = 0) AND (user_id = 0)) Total runtime: 4597.632 ms (4 rows) Sebastjan On Tue, Mar 3, 2009 at 8:05 PM, Sebastjan Trepca <trepca@gmail.com> wrote: > Still the same :/ > > I raised the default_statistics_target to 600 (it was already 100). I > then restarted pg, ran analyze through all tables and yet there is not > effect. > This is the output for core_accessor: > INFO: analyzing "public.core_accessor" > INFO: "core_accessor": scanned 291230 of 291230 pages, containing > 17144315 live rows and 0 dead rows; 300000 rows in sample, 17144315 > estimated total rows > > It thinks there are even less rows in the set: > > --------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=30816.49..30816.50 rows=5 width=855) (actual > time=683.907..683.910 rows=5 loops=1) > -> Sort (cost=30816.49..30822.29 rows=2321 width=855) (actual > time=683.906..683.907 rows=5 loops=1) > Sort Key: core_accessor.date_posted, core_accessor.nooximity > Sort Method: top-N heapsort Memory: 31kB > -> Nested Loop (cost=0.00..30777.94 rows=2321 width=855) > (actual time=0.072..517.970 rows=68505 loops=1) > -> Index Scan using core_accessor_fresh_idx on > core_accessor (cost=0.00..8955.44 rows=2440 width=92) (actual > time=0.056..53.107 rows=69312 loops=1) > Index Cond: ((slot_id = 472) AND (slot_type_id = > 119) AND (label = ''::text) AND (user_id = 0) AND (role = 0) AND > (publish_state >= 60)) > -> Index Scan using core_base_pkey on core_base > (cost=0.00..8.93 rows=1 width=763) (actual time=0.004..0.005 rows=1 > loops=69312) > Index Cond: ((core_base.object_id = > core_accessor.object_id) AND (core_base.content_type_id = > core_accessor.content_type_id)) > Total runtime: 684.015 ms > (10 rows) > > > > > > Sebastjan > > > > On Tue, Mar 3, 2009 at 6:40 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Tue, Mar 3, 2009 at 12:20 PM, Sebastjan Trepca <trepca@gmail.com> wrote: >>> But it's already attached in the first mail or am I missing something? >>> >>> If you don't see it, check this: http://pastebin.com/d71b996d0 >> >> Woops, sorry, I thought you had sent plain EXPLAIN. I see it now. >> >> The lowest level at which I see a problem is here: >> >> -> Index Scan using core_accessor_fresh_idx on core_accessor >> (cost=0.00..5460.07 rows=2970 width=92) (actual time=0.068..54.921 >> rows=69312 loops=1) >> Index Cond: ((slot_id = 472) AND (slot_type_id = 119) AND (label = >> ''::text) AND (user_id = 0) AND (role = 0) AND (publish_state >= 60)) >> >> For some reason it expect 2970 rows but gets 69312. >> >> A good place to start is to change your default_statistics_target >> value to 100 in postgresql.conf, restart postgresql, and re-ANALYZE. >> >> ...Robert >> >
On Tue, Mar 3, 2009 at 2:16 PM, Sebastjan Trepca <trepca@gmail.com> wrote: > Maybe this is useful, I removed the JOIN and it uses other > index(core_accessor_date_idx indexes (date_posted, nooximity)), but > its still hardly any better: > > noovo-new=# explain analyze SELECT * FROM "core_accessor" WHERE > (("core_accessor"."slot_type_id" = 119 > noovo-new(# AND "core_accessor"."slot_id" = 472 AND > "core_accessor"."label" = E'' AND "core_accessor"."publish_state" >= > 60 AND > noovo-new(# "core_accessor"."role" IN (0) AND > "core_accessor"."user_id" = 0)) ORDER BY "core_accessor"."date_posted" > DESC, "core_accessor"."nooximity" DESC LIMIT 5 > noovo-new-# ; > > QUERY PLAN > ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Limit (cost=0.00..3709.56 rows=5 width=178) (actual > time=4593.867..4597.587 rows=5 loops=1) > -> Index Scan Backward using core_accessor_date_idx on > core_accessor (cost=0.00..1810265.67 rows=2440 width=178) (actual > time=4593.866..4597.583 rows=5 loops=1) > Filter: ((publish_state >= 60) AND (slot_type_id = 119) AND > (slot_id = 472) AND (label = ''::text) AND (role = 0) AND (user_id = > 0)) > Total runtime: 4597.632 ms > (4 rows) > > > Sebastjan Well, in that case, you are being bitten by the fact that our multi-column selectivity estimates are not very good. The planner has good information on how each column behaves in isolation, but not how they act together. I've found this to be a very difficult problem to fix. Which of the parameters in this query vary and which ones are typically always the same? Sometimes you can improve things by creating an appropriate partial index. ...Robert
Sebastjan Trepca <trepca@gmail.com> writes: > It thinks there are even less rows in the set: > -> Index Scan using core_accessor_fresh_idx on > core_accessor (cost=0.00..8955.44 rows=2440 width=92) (actual > time=0.056..53.107 rows=69312 loops=1) > Index Cond: ((slot_id = 472) AND (slot_type_id = > 119) AND (label = ''::text) AND (user_id = 0) AND (role = 0) AND > (publish_state >= 60)) Maybe you should get rid of this six-column index, if you'd rather the query didn't use it. It seems a tad overspecialized anyway. regards, tom lane