Efficiency of timestamps - Mailing list pgsql-performance
From | Martin Foster |
---|---|
Subject | Efficiency of timestamps |
Date | |
Msg-id | 3F0B6178.2020805@ethereal-realms.org Whole thread Raw |
Responses |
Re: Efficiency of timestamps
|
List | pgsql-performance |
As I keep looking through code to see where I can make things more efficient, I noticed that in some cases timestamps seem horribly inefficient. This leads to very long run times for certain queries. Here is an example: -- USING TIMESTAMPS TO NARROW DOWN -- SELECT Post.PostIDNumber, Post.PuppeteerLogin, Post.PuppetName, Post.PostCmd, Post.PostClass FROM ethereal.Post WHERE Post.PostTimeStamp > (LOCALTIMESTAMP - INTERVAL '10 Minutes') AND Post.RealmName='Amalgam' AND (Post.PostTo='all' OR Post.PostTo='root') AND (NOT EXISTS (SELECT PuppetIgnore.PuppetLogin FROM ethereal.PuppetIgnore WHERE PuppetIgnore.PuppetIgnore='global' AND PuppetIgnore.PuppeteerLogin='root' AND PuppetIgnore.PuppetLogin=Post.PuppeteerLogin) OR Post.PuppeteerLogin IS NULL) AND (NOT EXISTS (SELECT PuppetIgnore.PuppetName FROM ethereal.PuppetIgnore WHERE PuppetIgnore='single' AND PuppetIgnore.PuppeteerLogin='root' AND PuppetIgnore.PuppetName=Post.PuppetName) OR Post.PuppetName IS NULL) ORDER BY Post.PostIDNumber LIMIT 100 -- Explain of Above-- Limit (cost=0.00..260237.32 rows=100 width=48) -> Index Scan using pkpost on post (cost=0.00..3020594.00 rows=1161 width=48) Filter: ((posttimestamp > (('now'::text)::timestamp(6) without time zone - '00:10'::interval)) AND (realmname = 'Amalgam'::character varying) AND ((postto = 'all'::character varying) OR (postto = 'root'::character varying)) AND ((NOT (subplan)) OR (puppeteerlogin IS NULL)) AND ((NOT (subplan)) OR (puppetname IS NULL))) SubPlan -> Index Scan using pkpuppetignore on puppetignore (cost=0.00..13.31 rows=1 width=10) Index Cond: (puppeteerlogin = 'root'::character varying) Filter: ((puppetignore = 'global'::character varying) AND (puppetlogin = $0)) -> Index Scan using pkpuppetignore on puppetignore (cost=0.00..5.84 rows=1 width=15) Index Cond: ((puppeteerlogin = 'root'::character varying) AND (puppetname = $1)) Filter: (puppetignore = 'single'::character varying) Result : 22 rows fetched (17.21 sec) -- USING A GENERATED ID NUMBER -- SELECT Post.PostIDNumber, Post.PuppeteerLogin, Post.PuppetName, Post.PostCmd, Post.PostClass FROM ethereal.Post WHERE Post.PostIDNumber > 1 AND Post.RealmName='Amalgam' AND (Post.PostTo='all' OR Post.PostTo='root') AND (NOT EXISTS (SELECT PuppetIgnore.PuppetLogin FROM ethereal.PuppetIgnore WHERE PuppetIgnore.PuppetIgnore='global' AND PuppetIgnore.PuppeteerLogin='root' AND PuppetIgnore.PuppetLogin=Post.PuppeteerLogin) OR Post.PuppeteerLogin IS NULL) AND (NOT EXISTS (SELECT PuppetIgnore.PuppetName FROM ethereal.PuppetIgnore WHERE PuppetIgnore='single' AND PuppetIgnore.PuppeteerLogin='root' AND PuppetIgnore.PuppetName=Post.PuppetName) OR Post.PuppetName IS NULL) ORDER BY Post.PostIDNumber LIMIT 100 -- Explain of Above-- Limit (cost=0.00..86712.10 rows=100 width=48) -> Index Scan using pkpost on post (cost=0.00..3019119.56 rows=3482 width=48) Index Cond: (postidnumber > 1) Filter: ((realmname = 'Amalgam'::character varying) AND ((postto = 'all'::character varying) OR (postto = 'root'::character varying)) AND ((NOT (subplan)) OR (puppeteerlogin IS NULL)) AND ((NOT (subplan)) OR (puppetname IS NULL))) SubPlan -> Index Scan using pkpuppetignore on puppetignore (cost=0.00..13.31 rows=1 width=10) Index Cond: (puppeteerlogin = 'root'::character varying) Filter: ((puppetignore = 'global'::character varying) AND (puppetlogin = $0)) -> Index Scan using pkpuppetignore on puppetignore (cost=0.00..5.84 rows=1 width=15) Index Cond: ((puppeteerlogin = 'root'::character varying) AND (puppetname = $1)) Filter: (puppetignore = 'single'::character varying) Result : 100 rows fetched ( 0.19 sec) -- USING A MIXTURE OF BOTH -- SELECT Post.PostIDNumber, Post.PuppeteerLogin, Post.PuppetName, Post.PostCmd, Post.PostClass FROM ethereal.Post WHERE Post.PostIDNumber > (SELECT MIN(PostIDNumber) FROM ethereal.Post WHERE Post.PostTimeStamp > (LOCALTIMESTAMP - INTERVAL '10 minutes'))::INT AND Post.RealmName='Amalgam' AND (Post.PostTo='all' OR Post.PostTo='root') AND (NOT EXISTS (SELECT PuppetIgnore.PuppetLogin FROM ethereal.PuppetIgnore WHERE PuppetIgnore.PuppetIgnore='global' AND PuppetIgnore.PuppeteerLogin='root' AND PuppetIgnore.PuppetLogin=Post.PuppeteerLogin) OR Post.PuppeteerLogin IS NULL) AND (NOT EXISTS (SELECT PuppetIgnore.PuppetName FROM ethereal.PuppetIgnore WHERE PuppetIgnore='single' AND PuppetIgnore.PuppeteerLogin='root' AND PuppetIgnore.PuppetName=Post.PuppetName) OR Post.PuppetName IS NULL) ORDER BY Post.PostIDNumber LIMIT 100 -- Explain of Above-- Limit (cost=0.00..87101.38 rows=100 width=48) InitPlan -> Aggregate (cost=12412.82..12412.82 rows=1 width=4) -> Index Scan using idxpost_timestamp on post (cost=0.00..12282.42 rows=52160 width=4) Index Cond: (posttimestamp > (('now'::text)::timestamp(6) without time zone - '00:10'::interval)) -> Index Scan using pkpost on post (cost=0.00..1010992.25 rows=1161 width=48) Index Cond: (postidnumber > $0) Filter: ((realmname = 'Amalgam'::character varying) AND ((postto = 'all'::character varying) OR (postto = 'root'::character varying)) AND ((NOT (subplan)) OR (puppeteerlogin IS NULL)) AND ((NOT (subplan)) OR (puppetname IS NULL))) SubPlan -> Index Scan using pkpuppetignore on puppetignore (cost=0.00..13.31 rows=1 width=10) Index Cond: (puppeteerlogin = 'root'::character varying) Filter: ((puppetignore = 'global'::character varying) AND (puppetlogin = $1)) -> Index Scan using pkpuppetignore on puppetignore (cost=0.00..5.84 rows=1 width=15) Index Cond: ((puppeteerlogin = 'root'::character varying) AND (puppetname = $2)) Filter: (puppetignore = 'single'::character varying) Result : 18 rows fetched ( 0.04 sec) Both PostIDNumber and PostTimestamp are indexed, so that should not be a bottleneck in itself. However, as you can see in the third example the use of a sub-query actually accelerates the process considerably, meaning that integer based searching is much much faster. Under MySQL timestamps where in Unix time, which is why I may have never noticed such an extreme slowdown when doing similar on that script. Of course to boggle the mind, here is a view that works very well: CREATE VIEW ethereal.Who AS SELECT Po.PuppetName AS PuppetName, Po.PuppeteerLogin AS PuppeteerLogin, Po.RealmName AS RealmName, Re.RealmPublic AS RealmPublic, Re.RealmVerified AS RealmVerified FROM ethereal.Post Po, ethereal.Puppet Ch, ethereal.Realm Re WHERE Po.PuppeteerLogin = Ch.PuppeteerLogin AND Po.RealmName = Re.RealmName AND Po.PostTimestamp > (LOCALTIMESTAMP - INTERVAL '10 minutes') AND Po.PuppetName IS NOT NULL GROUP BY Po.PuppeteerLogin, Po.PuppetName, Po.RealmName, Re.RealmPublic, Re.RealmVerified ORDER BY Po.RealmName, Po.PuppetName; Sort (cost=309259.89..309629.34 rows=147780 width=79) Sort Key: po.realmname, po.puppetname -> Group (cost=270648.27..292815.19 rows=147780 width=79) -> Sort (cost=270648.27..274342.75 rows=1477795 width=79) Sort Key: po.puppeteerlogin, po.puppetname, po.realmname, re.realmpublic, re.realmverified -> Merge Join (cost=22181.60..41087.65 rows=1477795 width=79) Merge Cond: ("outer".puppeteerlogin = "inner".puppeteerlogin) -> Sort (cost=17172.82..17300.26 rows=50978 width=69) Sort Key: po.puppeteerlogin -> Hash Join (cost=12.41..13186.95 rows=50978 width=69) Hash Cond: ("outer".realmname = "inner".realmname) -> Index Scan using idxpost_timestamp on post po (cost=0.00..12282.42 rows=50978 width=42) Index Cond: (posttimestamp > (('now'::text)::timestamp(6) without time zone - '00:10'::interval)) Filter: (puppetname IS NOT NULL) -> Hash (cost=11.93..11.93 rows=193 width=27) -> Seq Scan on realm re (cost=0.00..11.93 rows=193 width=27) -> Sort (cost=5008.78..5100.22 rows=36574 width=10) Sort Key: ch.puppeteerlogin -> Seq Scan on puppet ch (cost=0.00..2236.74 rows=36574 width=10) Result : 48 rows fetched ( 0.55 sec) It uses the exact same time restraint as the first three examples, looks through the same table, does a tipple join and still gets off at higher speeds. This seems to indicate that timestamps are actually efficient, which contradicts above examples. Any ideas? Code for the table creation is below signature: Martin Foster Creator/Designer Ethereal Realms martin@ethereal-realms.org -- -- -- NAME : Post -- REFERENCES : Realm* -- Puppet* -- PuppeteerLogin* -- -- DESCRIPTION : Post is the hive of activity for all realms. Associated with all three -- major tables, it is not actually linked because of the nature of storing -- posts for statistics and auditing. CREATE TABLE ethereal.Post ( PostIDNumber INT NOT NULL DEFAULT NEXTVAL('ethereal.seqPost'), RealmName VARCHAR(30) NOT NULL, PuppetName VARCHAR(30), PuppeteerLogin VARCHAR(10), PostTo VARCHAR(30), PostTimestamp TIMESTAMP NOT NULL DEFAULT LOCALTIMESTAMP, PostClass VARCHAR(10) NOT NULL DEFAULT 'general', PostCmd VARCHAR(10) NOT NULL DEFAULT 'none', PostFullFormat TEXT, PostImagelessFormat TEXT, PostPartialFormat TEXT, CONSTRAINT pkPost PRIMARY KEY (PostIDNumber), CONSTRAINT enumPostClass CHECK (PostCLass IN ('banner','dice','duplicate','general','play','private','special','system')), CONSTRAINT enumPostCmd CHECK (PostCmd IN ('general','none','play','stream')) ) WITHOUT OIDS; -- STANDARD INDEX CREATE INDEX idxPost_Class ON ethereal.Post ( PostClass ); CREATE INDEX idxPost_Login ON ethereal.Post ( PuppeteerLogin ); CREATE INDEX idxPost_Puppet ON ethereal.Post ( PuppetName ); CREATE INDEX idxPost_Realm ON ethereal.Post ( RealmName ); CREATE INDEX idxPost_Timestamp ON ethereal.Post ( PostTimestamp );
pgsql-performance by date: