Thread: order by index, and inheritance
I have a query which performs not so well: SELECT * FROM mm_mediasources ORDER BY number DESC LIMIT 20; costs nearly a minute. The table contains over 300 000 records. The table has two extensions, which are (a the moment) nearly empty, but have something to do with this, because: SELECT * FROM only mm_mediasources ORDER BY number DESC LIMIT 20; performs ok (8ms). The query plan is then as I would expect: media=# explain SELECT * FROM only mm_mediasources ORDER BY number DESC LIMIT 20; QUERY PLAN --------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..8.36 rows=20 width=105) -> Index Scan Backward using mediasource_object on mm_mediasources (cost=0.00..114641.05 rows=274318 width=105) The query plan of the original query, without 'only' does table scans: media=# explain SELECT * FROM mm_mediasources ORDER BY number DESC LIMIT 20; QUERY PLAN -------------------------------------------------------------------------------------------------------- Limit (cost=47248.70..47248.75 rows=20 width=105) -> Sort (cost=47248.70..47934.52 rows=274328 width=105) Sort Key: public.mm_mediasources.number -> Result (cost=0.00..8364.28 rows=274328 width=105) -> Append (cost=0.00..8364.28 rows=274328 width=105) -> Seq Scan on mm_mediasources (cost=0.00..8362.18 rows=274318 width=105) -> Seq Scan on mm_audiosources mm_mediasources (cost=0.00..1.01 rows=1 width=84) -> Seq Scan on mm_videosources mm_mediasources (cost=0.00..1.09 rows=9 width=89) and presumably because if that performs so lousy. Simply selecting on a number does work fast: media=# explain SELECT * FROM mm_mediasources where number = 606973 ; QUERY PLAN -------------------------------------------------------------------------------------------------------- Result (cost=0.00..6.13 rows=4 width=105) -> Append (cost=0.00..6.13 rows=4 width=105) -> Index Scan using mediasource_object on mm_mediasources (cost=0.00..4.00 rows=2 width=105) Index Cond: (number = 606973) -> Seq Scan on mm_audiosources mm_mediasources (cost=0.00..1.01 rows=1 width=84) Filter: (number = 606973) -> Seq Scan on mm_videosources mm_mediasources (cost=0.00..1.11 rows=1 width=89) Filter: (number = 606973) (3ms) I suppose seq scans are used on the extensions because they contain so few records. All tables have index on number. How do I force it to use them also when I use order by? I use psql 7.3.2 Michiel -- Michiel Meeuwissen | Mediapark C101 Hilversum | +31 (0)35 6772979 | I hate computers nl_NL eo_XX en_US | mihxil' | [] () |
On Thu, 2004-04-22 at 07:02, Michiel Meeuwissen wrote: > Rod Taylor <pg@rbt.ca> wrote: > > The scan is picking the best method for grabbing everything within the > > table, since it is not aware that we do not require everything. > > Hmm. That is a bit silly. Why does it use the index if select only from > mm_mediasources? > > > You can explicitly tell it what you want to do via: > > > > SELECT * > > FROM (SELECT * FROM mm_mediasources ORDER BY number DESC LIMIT 20 > > UNION SELECT * FROM <subtable> ORDER BY number DESC LIMIT 20) AS tab > > ORDER BY number DESC LIMIT 20 > > I think you meant 'only mm_mediasources', and btw order by and limit are not > accepted before union, so the above query does not compile. Yes, I did mean only. Try putting another set of brackets around the selects to get ORDER BY, etc. accepted. You can add another layer of subselects in the from if that doesn't work.
> This indeeds performs good (about 10000 times faster then select number,url > from mm_mediasources order by number desc limit 20) . But hardly beautiful, > and quite useless too because of course I am now going to want to use an > offset (limit 20 offset 20, you see..), which seems more or less impossible > in this way, isn't it. Yes, and the offset is a good reason why PostgreSQL will not be able to do it by itself either. Is "number" unique across the board? If so, instead of the offset you could use WHERE number > $lastValue.
Rod Taylor <pg@rbt.ca> wrote: > On Thu, 2004-04-22 at 07:02, Michiel Meeuwissen wrote: > > Rod Taylor <pg@rbt.ca> wrote: > > > The scan is picking the best method for grabbing everything within the > > > table, since it is not aware that we do not require everything. > > > > Hmm. That is a bit silly. Why does it use the index if select only from > > mm_mediasources? > > > > > You can explicitly tell it what you want to do via: > > > > > > SELECT * > > > FROM (SELECT * FROM mm_mediasources ORDER BY number DESC LIMIT 20 > > > UNION SELECT * FROM <subtable> ORDER BY number DESC LIMIT 20) AS tab > > > ORDER BY number DESC LIMIT 20 > > > > I think you meant 'only mm_mediasources', and btw order by and limit are not > > accepted before union, so the above query does not compile. > > Yes, I did mean only. Try putting another set of brackets around the > selects to get ORDER BY, etc. accepted. You can add another layer of > subselects in the from if that doesn't work. Ok, I can get it working: select number,url from ( select number,url from (select number,url from only mm_mediasources order by number desc limit 20) as A union select number,url from (select number,url from mm_audiosources order by number desc limit 20) as B union select number,url from (select number,url from mm_videosources order by number desc limit 20) as C ) as TAB order by number desc limit 20; This indeeds performs good (about 10000 times faster then select number,url from mm_mediasources order by number desc limit 20) . But hardly beautiful, and quite useless too because of course I am now going to want to use an offset (limit 20 offset 20, you see..), which seems more or less impossible in this way, isn't it. select number,url from ( select number,url from (select number,url from only mm_mediasources order by number desc limit 100020) as A union select number,url from (select number,url from mm_audiosources order by number desc limit 100020) as B union select number,url from (select number,url from mm_videosources order by number desc limit 100020) as C ) as TAB order by number desc limit 20 offset 100000; This would be it, I think, but this performs, expectedly, quit bad again, though still 5 times faster then select url,number from mm_mediasources order by number desc limit 20 offset 100000; I'm thinking of dropping inheritance all together and using foreign keys or so for the extra fields, to simulate inheritance. That might perhaps work a whole lot better? Thanks anyway, -- Michiel Meeuwissen Mediapark C101 Hilversum +31 (0)35 6772979 nl_NL eo_XX en_US mihxil' [] ()
Rod Taylor <pg@rbt.ca> wrote: > The scan is picking the best method for grabbing everything within the > table, since it is not aware that we do not require everything. Hmm. That is a bit silly. Why does it use the index if select only from mm_mediasources? > You can explicitly tell it what you want to do via: > > SELECT * > FROM (SELECT * FROM mm_mediasources ORDER BY number DESC LIMIT 20 > UNION SELECT * FROM <subtable> ORDER BY number DESC LIMIT 20) AS tab > ORDER BY number DESC LIMIT 20 I think you meant 'only mm_mediasources', and btw order by and limit are not accepted before union, so the above query does not compile. I can't figure out any acceptable work-around. Even if something as the above would work, it still would be hardly elegant, and you can as well have no support for inheritance (actually, you can _better_ have no inheritance, because at least it is clear what works then). Michiel btw. Why are these messages not appearing on the list itself? -- Michiel Meeuwissen Mediapark C101 Hilversum +31 (0)35 6772979 nl_NL eo_XX en_US mihxil' [] ()