Re: SQL report - Mailing list pgsql-sql

From wkipjohn@gmail.com
Subject Re: SQL report
Date
Msg-id 0016364c637589ebcb046ff865c1@google.com
Whole thread Raw
In response to Re: SQL report  (Rob Sargent <robjsargent@gmail.com>)
List pgsql-sql
Hi Rob,<br /><br />I have default B-Tree indexes created for each of the indexed columes and primary key columes. (No
multiplecolumes indexe or NULL FIRST or DESC/ASC). I am using PostgreSQL 8.3 with the auto vacuum daemon on. I assume
analysewill be automatically run to collect statistics for use by the planner and there is no maintainance for B-tree
indexesonce it is created. (Please point me out if I am wrong about this)<br /><br />I will probably try to partition
thestatus table to group more recent status records together to minimize the dataset I am querying. <br /><br />Thx<br
/>John<br/><br /><br />On Jul 31, 2009 1:16am, Rob Sargent <robjsargent@gmail.com> wrote:<br />> I would be
curiousto know the performance curve for let's say 20K, 40K , 60K, 80K, 100K records. And what sort of indexing you
have,whether or not it's clustered, re-built and so on.<br />> <br />> <br />> <br />> One could envision
partitioningthe status table such that recent records were grouped together (on the assumption that they will be most
frequently"reported").<br />> <br />> <br />> <br />> wkipjohn@gmail.com wrote:<br />> <br />> <br
/>>I have the following senario.<br />> <br />> <br />> <br />> I have a tracking system. The system
willrecord the status of an object regularly, all the status records are stored in one table. And it will keep a
historyof maximum 1000 status record for each object it tracks. The maximum objects the system will track is 100,000.
Whichmeans I will potentially have a table size of 100 million records.<br />> <br />> <br />> <br />> I
haveto generate a report on the latest status of all objects being tracked at a particular point in time, and also I
haveto allow user to sort and filter on different columes in the status record displayed in the report.<br />> <br
/>><br />> <br />> The following is a brief description in the status record (they are not actual code)<br
/>><br />> <br />> <br />> ObjectRecord(<br />> <br />> objectId bigint PrimaryKey<br />> <br
/>>desc varchar<br />> <br />> )<br />> <br />> <br />> <br />> StatusRecord (<br />> <br
/>>id bigint PrimaryKey<br />> <br />> objectId bigint indexed<br />> <br />> datetime bigint indexed<br
/>><br />> capacity double<br />> <br />> reliability double<br />> <br />> efficiency double<br
/>><br />> )<br />> <br />> <br />> <br />> I have tried to do the following, it works very well with
around20,000 objects. (The query return in less than 10s) But when I have 100,000 objects it becomes very very slow. (I
don'teven have patience to wait for it to return.... I kill it after 30 mins)<br />> <br />> <br />> <br
/>>select * from statusrecord s1 INNER JOIN ( SELECT objectId , MAX(datetime) AS msdt FROM statusrecord WHERE
startDatetime<br />> <br />> <br />> I did try to write a store procedure like below, for 100,000 objects and
1000status records / object, it returns in around 30 mins.<br />> <br />> <br />> <br />> CREATE OR REPLACE
FUNCTIONgetStatus(pitvalue BIGINT) RETURNS SETOF statusrecord AS $BODY$<br />> <br />> DECLARE<br />> <br
/>>id VARCHAR;<br />> <br />> status statusrecord%ROWTYPE;<br />> <br />> BEGIN<br />> <br />> FOR
objectIN SELECT * FROM objectRecord<br />> <br />> LOOP<br />> <br />> EXECUTE 'SELECT * FROM statusrecord
WHEREobjectId = ' || quote_literal(object.objectId) ||<br />> <br />> ' AND datetime <br />> INTO status;<br
/>><br />> IF FOUND THEN<br />> <br />> RETURN NEXT status;<br />> <br />> END IF;<br />> <br
/>>END LOOP;<br />> <br />> RETURN;<br />> <br />> END<br />> <br />> $BODY$ LANGUAGE plpgsql;<br
/>><br />> <br />> <br />> Just wanna to know if anyone have a different approach to my senario. Thanks
alot.<br/>> <br />> <br />> <br />> John <br />> <br />> 

pgsql-sql by date:

Previous
From: wkipjohn@gmail.com
Date:
Subject: Re: SQL report
Next
From: dino
Date:
Subject: Activity count problem