Inherits and "get highest id". - Mailing list pgsql-general
From | Dawid Kuroczko |
---|---|
Subject | Inherits and "get highest id". |
Date | |
Msg-id | 758d5e7f05061003272c52d341@mail.gmail.com Whole thread Raw |
Responses |
Re: Inherits and "get highest id".
|
List | pgsql-general |
Hello. I've tried data partitioning using INHERITS mechanism (pgsql 8.0.3). The schema looks like this: CREATE TABLE log ( logid integer NOT NULL PRIMARY KEY, logdate timestamp(0) without time zone NOT NULL, typeid integer NOT NULL, ip inet, [.....] ); CREATE TABLE log_data.log200501 () INHERITS log; CREATE TABLE log_data.log200502 () INHERITS log; ... CREATE TABLE log_data.log200512 () INHERITS log; Each month-table has over 2 milion rows. Each table has a primary index on logid column. And now, if I do a simple query like this: explain analyze SELECT logid FROM log_200501 ORDER BY logid LIMIT 1; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..0.04 rows=1 width=4) (actual time=0.113..0.115 rows=1 loops=1) -> Index Scan using log_200501_pkey on log_200501 (cost=0.00..168766.23 rows=4149475 width=4) (actual time=0.108..0.108 rows=1 loops=1) Total runtime: 0.196 ms But when I'm trying to get "global" highest logid I get: qnex=# explain SELECT logid FROM log ORDER BY logid LIMIT 1; QUERY PLAN ---------------------------------------------------------------------------------------------------- Limit (cost=4140892.91..4140892.91 rows=1 width=4) -> Sort (cost=4140892.91..4201607.87 rows=24285986 width=4) Sort Key: logs.log.logid -> Result (cost=0.00..887109.86 rows=24285986 width=4) -> Append (cost=0.00..887109.86 rows=24285986 width=4) -> Seq Scan on log (cost=0.00..10.70 rows=70 width=4) -> Seq Scan on log_200501 log (cost=0.00..155529.75 rows=4149475 width=4) -> Seq Scan on log_200502 log (cost=0.00..145904.29 rows=3857729 width=4) -> Seq Scan on log_200503 log (cost=0.00..165485.78 rows=4369278 width=4) -> Seq Scan on log_200504 log (cost=0.00..420093.74 rows=11908874 width=4) -> Seq Scan on log_200505 log (cost=0.00..10.70 rows=70 width=4) -> Seq Scan on log_200506 log (cost=0.00..10.70 rows=70 width=4) -> Seq Scan on log_200507 log (cost=0.00..10.70 rows=70 width=4) -> Seq Scan on log_200508 log (cost=0.00..10.70 rows=70 width=4) -> Seq Scan on log_200509 log (cost=0.00..10.70 rows=70 width=4) -> Seq Scan on log_200510 log (cost=0.00..10.70 rows=70 width=4) -> Seq Scan on log_200511 log (cost=0.00..10.70 rows=70 width=4) -> Seq Scan on log_200512 log (cost=0.00..10.70 rows=70 width=4) (18 rows) Time: 3,142 ms In other words -- doing such a one row select means scanning the whole partitioned data. the primaryindex on logid is not used anywhere. Creating view (SELECT * UNION ALL SELECT * UNION ALL....) does not help either. I wonder -- is there any better solution? selecting highest from each table, unioning it and then selecting highest of the highest works well but it strikes me as 'not exactly the niciest implementation' (one of the reasons is that I have to remembed about it each time new table will be added to schema...). Any comments, hints? Regards, Dawid
pgsql-general by date: