Thread: PG optimization question

PG optimization question

From
Nickolay
Date:
Hi all,

Here is my typical configuration: 1 (or 2) GB of RAM, HP ML 350 (or 150)
series server, SATA RAID, Linux.

I have one big table (called "archive") which contains short text messages
along with plenty of additional service info.
Currently this table contains more than 4M rows covering a period of 4.5
months, i.e. each row has an average size of about 1 KB.

I'm going to make our application work with partitions of this table
instead of one large table. The primary reason is that eventually we'd
need to remove old rows and it would be pretty hard with one table
because of blocking (and rows are being added constantly).

1. What would be your recommendations on how to partition this table (by
months, years or quarters)?
2. What are the recommended PG settings for such a configuration? Would it
be OK to set shared_buffers to, let's say, 512MB (or maybe 400MB if RAM is
1GB)? What other settings would you recommend?

Thanks in advance,
Nick.

Re: PG optimization question

From
Grzegorz Jaśkiewicz
Date:
Maybe that 'one big table' needs something called 'normalisation'
first. See how much that sheds off. You might be surprised.
The partitioning needs to be done by some constant interval of time in
your case. Whatever suits you; I would suggest choosing an interval that
gives you both easy archiving/removal of old data (so not too wide) and
ensures that most of the data your queries search for lands in one or
two partitions per query.
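As a concrete illustration of interval-based partitioning (a sketch only; the table name "archive" and the date_time column come from this thread, the other columns are hypothetical, and the inheritance/CHECK-constraint approach matches the PostgreSQL versions current at the time):

```sql
-- Parent table holds no rows itself; the children carry the data.
CREATE TABLE archive (
    id         bigserial PRIMARY KEY,
    date_time  timestamptz NOT NULL,
    body       text
    -- ... additional service fields ...
);

-- One child per month, with a CHECK constraint so the planner
-- can skip irrelevant partitions (constraint_exclusion).
CREATE TABLE archive_2010_01 (
    CHECK (date_time >= '2010-01-01' AND date_time < '2010-02-01')
) INHERITS (archive);

-- Route inserts to the right child with a trigger, or insert
-- directly into the child table from the application.
```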

Re: PG optimization question

From
Thomas Kellerer
Date:
Nickolay wrote on 09.01.2010 11:24:
> it would be pretty hard with one table because of blocking

What do you mean by "because of blocking"?

Thomas

Re: PG optimization question

From
Nickolay
Date:
I do not see any way to normalize this table any further. Its size is 4GB
for ~4M rows, i.e. about 1 KB per row, which I think is OK.
Also there are 2 indexes: one on date_time and one on a couple of service
fields (total index size is 250MB now).
I think I'll partition by months (approx. 1M rows or 1GB per month), so it
would be about 60 partitions for 5 years. Is that OK for Postgres?
Oh, btw, 95% of queries search rows for the current date (last 24 hours).
Also we use SELECT...FOR UPDATE row-level locking for updating the rows in
the archive (i.e. we INSERT a new row when starting an outgoing message
transmission and then do SELECT...FOR UPDATE and UPDATE for the source
(incoming) message once the outgoing message has been sent), so I guess we
would have to explicitly use the name of the partition table (i.e.
"archive_2009_12" instead of "archive") for SELECT...FOR UPDATE and
UPDATE requests, as they may need to access a row in a previous partition
instead of the current one.
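That update path might look roughly like the following (a sketch; column names such as msg_id, status, and sent_at are hypothetical stand-ins for the "service fields" mentioned above):

```sql
BEGIN;

-- Lock the source (incoming) message row; targeting the child table
-- directly avoids touching every partition under the parent.
SELECT *
  FROM archive_2009_12
 WHERE msg_id = 12345
   FOR UPDATE;

-- Mark it once the outgoing message has been sent.
UPDATE archive_2009_12
   SET status = 'sent', sent_at = now()
 WHERE msg_id = 12345;

COMMIT;
```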

Grzegorz Jaśkiewicz wrote:
> maybe that 'one big table' needs something called 'normalisation'
> first. See how much that will shed off. You might be surprised.
> The partitioning needs to be done by some constant intervals, of time
> - in your case. Whatever suits you, I would suggest to use the rate
> that will give you both ease of archiving/removal of old data (so not
> too wide), and also, one that would make sure that most of the data
> you'll be searching for in your queries will be in one , two
> partitions per query.

Re: PG optimization question

From
Kenneth Marshall
Date:
On Sat, Jan 09, 2010 at 03:42:08PM +0300, Nickolay wrote:
> I do not see any way to normalize this table any further. Its size is 4GB
> for ~4M rows, i.e. about 1 KB per row, which I think is OK.
> Also there are 2 indexes: one on date_time and one on a couple of service
> fields (total index size is 250MB now).
> I think I'll partition by months (approx. 1M rows or 1GB per month), so it
> would be about 60 partitions for 5 years. Is that OK for Postgres?

Not a problem. We have a log server that has 64 daily partitions.

> Oh, btw, 95% of queries are searching rows for current date (last 24
> hours).

You may want to use a daily staging table and then flush to the
monthly archive tables at the end of the day.
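A minimal version of that end-of-day flush, done in one transaction so readers never see a message missing from both tables (a sketch; the staging table name is an assumption, the archive naming follows this thread):

```sql
BEGIN;

-- Move yesterday's rows into the monthly archive partition...
INSERT INTO archive_2010_01
SELECT * FROM archive_staging
 WHERE date_time < date_trunc('day', now());

-- ...and remove them from staging in the same transaction.
DELETE FROM archive_staging
 WHERE date_time < date_trunc('day', now());

COMMIT;
```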

Ken

Re: PG optimization question

From
Nickolay
Date:
That may help with query speed (not a problem now), but we'd then have to
add a UNION for the daily staging table for the other 5% of requests,
right? And there would be a moment when a daily message is in the archive
table AND in the daily table (while being transferred from the daily
table to the archive).
Our main problem is blocking when doing DELETE (the app sometimes freezes
for a long time), and we also have to VACUUM the live table, which is not
acceptable in our app.

Thanks for your reply; I was kind of worried about the number of
partitions and how it would affect PG query execution speed.
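The blocking DELETE and the follow-up VACUUM both go away once old data lives in its own partition, because removal becomes a metadata operation (a sketch against the naming used earlier in the thread):

```sql
-- Instead of a long, lock-heavy bulk delete that also leaves
-- dead tuples for VACUUM to clean up:
--   DELETE FROM archive WHERE date_time < '2009-08-01';

-- ...drop the partition holding the expired month outright:
DROP TABLE archive_2009_07;
```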

Kenneth Marshall wrote:
>> Oh, btw, 95% of queries are searching rows for current date (last 24
>> hours).
>>
>
> You may want to use a daily staging table and then flush to the
> monthly archive tables at the end of the day.

Re: PG optimization question

From
Pierre Frédéric Caillaud
Date:
> That may help with query speed (not a problem now), but we'd then have to
> add a UNION for the daily staging table for the other 5% of requests,
> right? And there would be a moment when a daily message is in the archive
> table AND in the daily table (while being transferred from the daily
> table to the archive).
> Our main problem is blocking when doing DELETE (the app sometimes freezes
> for a long time), and we also have to VACUUM the live table, which is not
> acceptable in our app.
>
> Thanks for your reply, I was kinda worried about number of partitions
> and how this would affect PG query execution speed.
>
> Kenneth Marshall wrote:
>>> Oh, btw, 95% of queries are searching rows for current date (last 24
>>> hours).
>>>
>>
>> You may want to use a daily staging table and then flush to the monthly
>> archive tables at the end of the day.

    If the rows in the archive tables are never updated, this strategy means
you never need to vacuum the big archive tables (and indexes), which is
good. Also, you can insert the rows into the archive table in the order of
your choice, by timestamp for example, which keeps it nicely clustered
without ever needing to run CLUSTER.

    And with partitioning you can have lots of indexes on the staging table
(and the current month's partition) to speed up your most common queries
(which are likely to be more OLTP), while using fewer indexes on the older
partitions (saving disk space) if queries on old partitions are likely to
be reporting queries that will grind through a large part of the table
anyway.
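Sketched out, that flush-in-order / fewer-indexes idea could look like this (the index columns beyond date_time are illustrative assumptions, not from the thread):

```sql
-- Flush staging rows into the archive partition in timestamp order,
-- so the heap stays physically clustered without running CLUSTER.
INSERT INTO archive_2010_01
SELECT * FROM archive_staging
ORDER BY date_time;

-- Rich indexing on the hot, OLTP-ish staging table...
CREATE INDEX archive_staging_dt_idx  ON archive_staging (date_time);
CREATE INDEX archive_staging_svc_idx ON archive_staging (status, msg_id);

-- ...but only a minimal index on cold partitions that mostly
-- serve reporting scans.
CREATE INDEX archive_2009_07_dt_idx ON archive_2009_07 (date_time);
```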

Re: PG optimization question

From
Nickolay
Date:
Okay, I see your point with the staging table. That's a good idea!
The only problem I see here is the transfer-to-archive-table process. As
you've correctly noticed, the system is kind of real-time and there can
be dozens of processes writing to the staging table; I cannot see how to
make the transfer/flush process right and clean...

Pierre Frédéric Caillaud wrote:
>>>> Oh, btw, 95% of queries are searching rows for current date (last
>>>> 24 hours).
>>>>
>>>
>>> You may want to use a daily staging table and then flush to the
>>> monthly archive tables at the end of the day.
>
>     If the rows in the archive tables are never updated, this strategy
> means you never need to vacuum the big archive tables (and indexes),
> which is good. Also, you can insert the rows into the archive table in
> the order of your choice, by timestamp for example, which keeps it
> nicely clustered without ever needing to run CLUSTER.
>
>     And with partitioning you can have lots of indexes on the staging
> table (and the current month's partition) to speed up your most common
> queries (which are likely to be more OLTP), while using fewer indexes
> on the older partitions (saving disk space) if queries on old partitions
> are likely to be reporting queries that will grind through a large part
> of the table anyway.


Re: PG optimization question

From
Ludwik Dylag
Date:
2010/1/9 Nickolay <nitro@zhukcity.ru>
Okay, I see your point with the staging table. That's a good idea!
The only problem I see here is the transfer-to-archive-table process. As you've correctly noticed, the system is kind of real-time and there can be dozens of processes writing to the staging table; I cannot see how to make the transfer/flush process right and clean...

The simplest way to do this is to create a view, and add/remove the first/last day by recreating the view on a daily interval.
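Such a view could be sketched like this (a sketch only; it assumes the staging/archive table naming used earlier in the thread, with the view recreated once a day, e.g. from cron):

```sql
-- Recreated daily: one logical "archive" spanning the hot staging
-- table and the monthly partitions queries might still need.
CREATE OR REPLACE VIEW archive_all AS
    SELECT * FROM archive_staging
    UNION ALL
    SELECT * FROM archive_2010_01
    UNION ALL
    SELECT * FROM archive_2009_12;
```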

--
Ludwik Dyląg

Re: PG optimization question

From
Pierre Frédéric Caillaud
Date:
> If you transfer (delete from staging, insert into archive) in one
> transaction, then it will always be visible in exactly one of them,
> and exactly once in a view over both staging and archive(s).

    Does the latest version implement this:

INSERT INTO archive (...) DELETE FROM staging WHERE ... RETURNING ...


Re: PG optimization question

From
Robert Haas
Date:
2010/1/10 Pierre Frédéric Caillaud <lists@peufeu.com>:
>
>> If you transfer (delete from staging, insert into archive) in one
>> transaction, then it will always be visible in exactly one of them,
>> and exactly once in a view over both staging and archive(s).
>
>        Does the latest version implement this:
>
> INSERT INTO archive (...) DELETE FROM staging WHERE ... RETURNING ...

No.  There are no plans to support that, though there are proposals to
support:

WITH x AS (DELETE FROM staging WHERE ... RETURNING ...) INSERT INTO
archive (...) SELECT ... FROM x

I'm not sure how much that will help though since, in the designs
discussed so far, the tuples won't be pipelined.
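Filled out against the staging/archive schema discussed in this thread, that proposed form would read as follows (a sketch; data-modifying CTEs of exactly this shape did land in later PostgreSQL releases):

```sql
-- Move yesterday's rows from staging to the archive in one statement:
-- the DELETE's RETURNING set feeds the INSERT atomically.
WITH moved AS (
    DELETE FROM archive_staging
     WHERE date_time < date_trunc('day', now())
    RETURNING *
)
INSERT INTO archive_2010_01
SELECT * FROM moved;
```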

...Robert

Re: PG optimization question

From
Pierre Frédéric Caillaud
Date:
On Sun, 10 Jan 2010 19:45:32 +0100, Robert Haas <robertmhaas@gmail.com>
wrote:

> 2010/1/10 Pierre Frédéric Caillaud <lists@peufeu.com>:
>>
>>> If you transfer (delete from staging, insert into archive) in one
>>> transaction, then it will always be visible in exactly one of them,
>>> and exactly once in a view over both staging and archive(s).
>>
>>        Does the latest version implement this:
>>
>> INSERT INTO archive (...) DELETE FROM staging WHERE ... RETURNING ...
>
> No.  There are no plans to support that, though there are proposals to
> support:
>
> WITH x AS (DELETE FROM staging WHERE ... RETURNING ...) INSERT INTO
> archive (...) SELECT ... FROM x
>
> I'm not sure how much that will help though since, in the designs
> discussed so far, the tuples won't be pipelined.
>
> ...Robert
>

    Yeah, but it's a lot more user-friendly than SELECT FOR UPDATE, INSERT
SELECT, DELETE...