Thread: Order by and index
It looks like the Postgres optimizer cannot use indexes for "order by" conditions. The query that made me conclude this, looks like this: explain analyze select "document#" from moreover_documents where created_at<TIMESTAMP '2010-07-01' order by "document#" limit 10; The plan looks like this: QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Limit (cost=280833.01..280833.03 rows=10 width=8) (actual time=2335.322..2335.325 rows=10 loops=1) -> Sort (cost=280833.01..285481.01 rows=1859200 width=8) (actual time=2335.320..2335.321 rows=10 loops=1) Sort Key: moreover.moreover_documents."document#" Sort Method: top-N heapsort Memory: 25kB -> Result (cost=0.00..240656.36 rows=1859200 width=8) (actual time=0.022..1980.176 rows=1857510 loops=1) -> Append (cost=0.00..240656.36 rows=1859200 width=8) (actual time=0.021..1683.461 rows=1857510 loops=1) -> Seq Scan on moreover_documents (cost=0.00..10.25 rows=7 width=8) (actual time=0.001..0.001 rows=0 loops=1) Filter: (created_at < '2010-07-01 00:00:00'::timestamp without time zone) -> Seq Scan on moreover_documents_y2010m06 moreover_documents (cost=0.00..240646.11 rows=1859193 width=8) (actual time=0.020..1436.262 rows=1857510 loops=1) Filter: (created_at < '2010-07-01 00:00:00'::timestamp without time zone) Total runtime: 2335.364 ms (11 rows) Column "document#" is the primary key and will be renamed to "document_id" before the project enters the production phase. And there is the catch: I am accessing only the primary key, which is, of course, indexed by a B*tree index. Data set in the B*tree index is sorted, by virtue of the underlying mathematical structure which is, not surprisingly, known as B*tree. So, this query can be resolved by using index blocks only and reading them in order. The plan, however, says something else. The database is doing heap sort and sequential scan. The source database can use index quite well: SQL> set autotrace on explain; SQL> select document# from ( 2 select document# from moreover_documents 3 order by document#) 4 where rownum<=10; DOCUMENT# ---------- 927598124 927598126 927598128 927598130 927598132 927598134 927598136 927598138 927598140 927598142 10 rows selected. Execution Plan ---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=10 Bytes=130) 1 0 COUNT (STOPKEY) 2 1 VIEW (Cost=2 Card=31829315 Bytes=413781095) 3 2 INDEX (FULL SCAN) OF 'MOREOVER_DOCUMENTS_PK' (UNIQUE) (Cost=1 Card=31829315 Bytes=190975890) I am nitpicking because my application developers are used to that possibility and have created several indexes with the explicit purpose of helping them with "order by" clause. I know that the version 9 does resolve "max" and "min" using index. This should not be that hard to do, either. Would it be presumptuous from me to expect something like this in Postgres 9.1? What is actually needed is a the "index full scan" access method, which can be used when all of the columns in the select list are indexed. The difference in speed is quite significant: news=# set search_path=moreover; SET news=# \timing Timing is on. news=# select "document#" from moreover_documents news-# where created_at<TIMESTAMP '2010-07-01' news-# order by "document#" news-# limit 10; document# ----------- 927598124 927598126 927598128 927598130 927598132 927598134 927598136 927598138 927598140 927598142 (10 rows) Time: 1594.013 ms news=# The other database is almost 5 times faster: SQL> set timing on SQL> select document# from ( 2 select document# from moreover_documents 3 order by document#) 4 where rownum<=10; DOCUMENT# ---------- 927598124 927598126 927598128 927598130 927598132 927598134 927598136 927598138 927598140 927598142 10 rows selected. Elapsed: 00:00:00.32 -- Mladen Gogala Sr. Oracle DBA 1500 Broadway New York, NY 10036 (212) 329-5251 http://www.vmsinfo.com The Leader in Integrated Media Intelligence Solutions
On Fri, Aug 27, 2010 at 6:30 PM, Mladen Gogala <mgogala@vmsinfo.com> wrote: > It looks like the Postgres optimizer cannot use indexes for "order by" > conditions. The query that made me conclude this, looks like this: It looks to me like the reason that you have that heapsort step is because of your WHERE clause involving the "created_at" timestamp. > explain analyze > select "document#" from moreover_documents > where created_at<TIMESTAMP '2010-07-01' > order by "document#" > limit 10; And your comparison showing Oracle to be faster doesn't use this WHERE clause: > SQL> set autotrace on explain; > SQL> select document# from ( > 2 select document# from moreover_documents > 3 order by document#) > 4 where rownum<=10; Perhaps Oracle is smart enough to use indexes on "created_at" and "document#" together to avoid a sort entirely, but your example doesn't show this. Postgres should be able to use an Index Scan and avoid that sort step if you don't involve "created_at": Also, I'm not sure whether this would help in your case, but there was some talk recently about implementing "Index Organized Tables" for Postgres, borrowing from Oracle. http://archives.postgresql.org/pgsql-hackers/2010-02/msg01708.php Josh
Josh Kupershmidt wrote: > Perhaps Oracle is smart enough to use indexes on "created_at" and > "document#" together to avoid a sort entirely, but your example > doesn't show this. Postgres should be able to use an Index Scan and > avoid that sort step if you don't involve "created_at": > Josh, the problem is the fact that the Postgres table is partitioned, while the Oracle table is not. That should explain the mysterious "created_at" condition, absent on the Oracle side. Partitioning is the most important reason for copying the 200GB Oracle table to Postgres. The problem with the partitioning on the Oracle side is that it costs money. When I query a partition directly, the index is used: news=# explain analyze news-# select author from moreover_documents_y2010m06 news-# order by "document#" news-# limit 10; QUERY PLAN -------------------------------------------------------------------------------- -------------------------------------------------------------------------------- ------------------ Limit (cost=0.00..1.48 rows=10 width=20) (actual time=37.458..108.098 rows=10 loops=1) -> Index Scan using pk_moreover_documents_y2010m06 on moreover_documents_y20 10m06 (cost=0.00..274673.53 rows=1856853 width=20) (actual time=37.457..108.095 rows=10 loops=1) Total runtime: 108.130 ms (3 rows) If you take a look at the plan I have originally posted, you will note that the very same table is scanned, despite the fact that the table originally specified is "moreover_documents". I believe this to be a bug in optimizing queries to the partitioned tables. This is a "history table", needed for reporting purposes. Open source tools like Jasper and Pentaho, as well as the reporting software like Crystal Reports can use Postgres to produce reports, while this monster is slowing down everything on the Oracle side. My problem is what indexes can be used and when, so that the reports perform better than they do now. If that is not the case, the report users will storm my office, with tar and feathers. Somehow, I find such prospect unappealing. -- Mladen Gogala Sr. Oracle DBA 1500 Broadway New York, NY 10036 (212) 329-5251 www.vmsinfo.com