Thread: Transaction table

Transaction table

From
Deepa Thulasidasan
Date:
Dear All,

I have a query in postgresql if any one can support.

A transaction table in a vehicle tracking application is  inserted with the current position of each vehicle at regular
interval(seconds). 
This  transaction table consists of 12 columns, which are of the type varchar, time, numeric or double precision. A new
transactiontable is created every day. Total number of records at the end of the day is around 1 million records. Data
isonly inserted in to this table and there is no update or delete. This table is indexed using 2 columns. Now, we are
expectingthis transaction table to grow by 10 times in near future. In this regard, we would like to know if this same
structureof the transaction table and the indexing would be sufficient for quick retrivel of data  or do we have to
partitionthis table? If so what kind of partition would be suitable? 

Awaiting positive reply.

Regards,
Deepa.




      The INTERNET now has a personality. YOURS! See your Yahoo! Homepage. http://in.yahoo.com/

Re: Transaction table

From
Scott Marlowe
Date:
On Sat, Mar 20, 2010 at 2:47 AM, Deepa Thulasidasan
<deepatulsidasan@yahoo.co.in> wrote:
> Dear All,
>
> I have a query in postgresql if any one can support.
>
> A transaction table in a vehicle tracking application is  inserted with the current position of each vehicle at
regularinterval (seconds). 
> This  transaction table consists of 12 columns, which are of the type varchar, time, numeric or double precision. A
newtransaction table is created every day. Total number of records at the end of the day is around 1 million records.
Datais only inserted in to this table and there is no update or delete. This table is indexed using 2 columns. Now, we
areexpecting this transaction table to grow by 10 times in near future. In this regard, we would like to know if this
samestructure of the transaction table and the indexing would be sufficient for quick retrivel of data  or do we have
topartition this table? If so what kind of partition would be suitable? 

You generally wanna partition on the one (or maybe two) fields you'll
regularly use to restrict your data set.  For many systems like this
that is a partition on date.  Sometimes you can partition on two
things, and if it makes sense to do so your current usage patterns
will show it.  Normally one axis of partitioning is fine.

Re: Transaction table

From
Scott Mead
Date:
On Sat, Mar 20, 2010 at 5:26 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
On Sat, Mar 20, 2010 at 2:47 AM, Deepa Thulasidasan
<deepatulsidasan@yahoo.co.in> wrote:
> Dear All,
>
> I have a query in postgresql if any one can support.
>
> A transaction table in a vehicle tracking application is  inserted with the current position of each vehicle at regular interval (seconds).
> This  transaction table consists of 12 columns, which are of the type varchar, time, numeric or double precision. A new transaction table is created every day. Total number of records at the end of the day is around 1 million records. Data is only inserted in to this table and there is no update or delete. This table is indexed using 2 columns. Now, we are expecting this transaction table to grow by 10 times in near future. In this regard, we would like to know if this same structure of the transaction table and the indexing would be sufficient for quick retrivel of data  or do we have to partition this table? If so what kind of partition would be suitable?

You generally wanna partition on the one (or maybe two) fields you'll
regularly use to restrict your data set.  For many systems like this
that is a partition on date.  Sometimes you can partition on two
things, and if it makes sense to do so your current usage patterns
will show it.  Normally one axis of partitioning is fine.

That'll help with reporting, how are you inserting the data now?  Are you using individual inserts, or are you loading in batches.  Typically, if you can buffer some of the inserts into a group on the application side and then load them you'll be better off then just doing straight inserts for every 'event'.  Then, you can combine that with partitioning to report over multiple days pretty easily.

--Scott Mead

(Just realized that not only are there lots of Scott's on this list... we have multiple Scott M's! :)

 

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: Transaction table

From
Vick Khera
Date:
On Sat, Mar 20, 2010 at 4:47 AM, Deepa Thulasidasan
<deepatulsidasan@yahoo.co.in> wrote:
> transaction table to grow by 10 times in near future. In this regard, we would like to know if this same structure of
thetransaction table and the indexing would be sufficient for quick retrivel of data  or do we have to partition this
table?If so what kind of partition would be suitable? 


My experience has been that when the tables are approaching the 100
million record mark things tend to slow down.  Running reindex and
vacuum on those tables also takes much longer since you tend not to
have enough memory to do those operations efficiently.

I like to partition tables such that they end up having under 10
million records each.  I just (a few hours ago...) finished
partitioning and migrating the data from a single table that had about
120 million records into 100 partitions of about 1.2 million rows
each.  For this particular case, I just partitioned on a mod 100
operation of one of the ID keys on which I do the bulk of my searches.

Like the two Scott M's recommended, figure out your usage patterns and
partition across those vectors to optimize those searches.  I would
not worry about optimizing the insert pattern.

You really *never* delete this data?  I would suspect then that having
a partitioning scheme where the number of partitions can grow over
time is going to be important to you.

Re: Transaction table

From
John R Pierce
Date:
Vick Khera wrote:
> You really *never* delete this data?  I would suspect then that having
> a partitioning scheme where the number of partitions can grow over
> time is going to be important to you.
>

he said a new table is created each day, but nothing about these daily
tables being partitions in a larger view.   I don't know if that means
the old daily tables are deleted eventually or just kept forever.  he
then said the daily table will be growing by 10X, I don't know if these
means 10 times more vehicles or 10 times the number of daily trackpoints
per vehicle.   he said this daily table has two indexes, I suspect these
are vehicle number, and time of track point, but I'm just guessing.

if it is by vehicle and by time, and the number of vehicles is
multiplying, he could partition by vehicle if the daily table becomes
oto large, or he could partition by hour.   partitioning per vehicle
would allow putting them in different tablespaces, which could be on
different disk drives and spread the write load out across more
spindles, while hourly would concentrate all the writes on the same
sub-table.

Re: Transaction table

From
Scott Marlowe
Date:
On Sun, Mar 21, 2010 at 1:30 PM, Vick Khera <vivek@khera.org> wrote:
> Like the two Scott M's recommended, figure out your usage patterns and
> partition across those vectors to optimize those searches.  I would
> not worry about optimizing the insert pattern.

Note that once the partitions get small enough, on bigger hardware,
there's often little need to index at that level anymore.  If you're
pulling all the records from a 50 meg db file, it'll be read in well
under a second.  Even if you hit a few partitions, it's still pretty
fast since it's at worst a sequential scan, or more likely a read from
OS level cache.