Thread: Scalability question

Scalability question

From
Zoltan Boszormenyi
Date:
Hi,

I have a question about scalability in a high-volume insert situation
where the table has a primary key and several non-unique indexes
on other columns of the table. How does PostgreSQL behave
in terms of scalability? The high volume of inserts comes from
multiple transactions.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/


Re: Scalability question

From
tv@fuzzy.cz
Date:
> Hi,
>
> I have a question about scalability in a high-volume insert situation
> where the table has a primary key and several non-unique indexes
> on other columns of the table. How does PostgreSQL behave
> in terms of scalability? The high volume of inserts comes from
> multiple transactions.
>
> Best regards,
> Zoltán Böszörményi

Well, that's a difficult question as it depends on hardware and software,
but with proper tuning the results may be very good. Just do the basic
PostgreSQL tuning and then tune it for INSERT performance if needed.
It's difficult to give any other recommendations without more detailed
knowledge of the problem, but consider these hints:

1) move pg_xlog to a separate drive (so the WAL writes stay sequential)
2) move the table with a large number of inserts to a separate tablespace
3) minimize the number of indexes, etc.

The basic rule is that each index adds some overhead to the insert, but it
depends on datatype, etc. Just prepare some data to import, and run the
insert with and without the indexes and compare the time.
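
A minimal sketch of that comparison, under loud assumptions: it uses
Python's bundled sqlite3 module as a stand-in so it runs without a server,
and the table "events", its columns and the index names are hypothetical.
The absolute numbers will not carry over to PostgreSQL; only the method
(same data, same insert, indexes on vs. off) does. For a real test the
same steps would be run against PostgreSQL with the actual schema.

```python
import sqlite3
import time


def time_inserts(with_indexes: bool, rows: int = 20000) -> float:
    """Load `rows` rows into a fresh table and return the elapsed seconds."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER, c TEXT)"
    )
    if with_indexes:
        # Several non-unique indexes, as in the question.
        conn.execute("CREATE INDEX events_a_idx ON events (a)")
        conn.execute("CREATE INDEX events_b_idx ON events (b)")
        conn.execute("CREATE INDEX events_c_idx ON events (c)")

    # Prepare the data before starting the clock so only the insert is timed.
    data = [(i, i % 97, (i * 7919) % 1000, f"row-{i}") for i in range(rows)]

    start = time.perf_counter()
    with conn:  # one transaction, committed on exit
        conn.executemany(
            "INSERT INTO events (id, a, b, c) VALUES (?, ?, ?, ?)", data
        )
    elapsed = time.perf_counter() - start

    # Sanity check: every row actually landed in the table.
    count = conn.execute("SELECT count(*) FROM events").fetchone()[0]
    assert count == rows
    conn.close()
    return elapsed


if __name__ == "__main__":
    print("without indexes:", time_inserts(False))
    print("with indexes:   ", time_inserts(True))
```

Run it a few times and compare the two timings; a single run is too noisy
to draw conclusions from.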

Tomas


Re: Scalability question

From
Zoltan Boszormenyi
Date:
tv@fuzzy.cz wrote:
>> Hi,
>>
>> I have a question about scalability in a high-volume insert situation
>> where the table has a primary key and several non-unique indexes
>> on other columns of the table. How does PostgreSQL behave
>> in terms of scalability? The high volume of inserts comes from
>> multiple transactions.
>>
>> Best regards,
>> Zoltán Böszörményi
>>
>
> Well, that's a difficult question as it depends on hardware and software,
> but with proper tuning the results may be very good. Just do the basic
> PostgreSQL tuning and then tune it for INSERT performance if needed.
> It's difficult to give any other recommendations without more detailed
> knowledge of the problem, but consider these hints:
>
> 1) move pg_xlog to a separate drive (so the WAL writes stay sequential)
> 2) move the table with a large number of inserts to a separate tablespace
> 3) minimize the number of indexes, etc.
>
> The basic rule is that each index adds some overhead to the insert, but it
> depends on datatype, etc. Just prepare some data to import, and run the
> insert with and without the indexes and compare the time.
>
> Tomas
>

Thanks. My question is more about how this works in theory.
E.g. if INSERTs add "similar" records with identical index entries
(these are non-unique indexes), does that cause contention? After all,
these similar records add index tuples that are supposed to sit near
each other in the btree.

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/


Re: Scalability question

From
Alvaro Herrera
Date:
Zoltan Boszormenyi wrote:
> Hi,
>
> I have a question about scalability in a high-volume insert situation
> where the table has a primary key and several non-unique indexes
> on other columns of the table. How does PostgreSQL behave
> in terms of scalability? The high volume of inserts comes from
> multiple transactions.

btree and gist indexes can have multiple concurrent insertions in
flight.  A potential for blocking is in UNIQUE indexes: if two
transactions try to insert the same value in the unique index, the
second one will block until the first transaction finishes.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: Scalability question

From
"Scott Marlowe"
Date:
On Wed, Jun 11, 2008 at 3:56 AM, Zoltan Boszormenyi <zb@cybertec.at> wrote:
> Hi,
>
> I have a question about scalability in a high-volume insert situation
> where the table has a primary key and several non-unique indexes
> on other columns of the table. How does PostgreSQL behave
> in terms of scalability? The high volume of inserts comes from
> multiple transactions.

PostgreSQL supports initial fill rates of < 100% for indexes (the
fillfactor storage parameter), so set it to 50% and new entries that
live near current entries will have room to be added without having to
split the btree pages.

PostgreSQL also allows you to easily put your indexes on other
partitions / drive arrays etc...
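
A minimal sketch of what those two options look like in a CREATE INDEX
statement. The helper create_index_sql and the names orders,
orders_customer_idx, customer_id and fast_disk are hypothetical; only the
WITH (fillfactor = ...) and TABLESPACE clauses are PostgreSQL syntax. The
helper just builds the statement text and does no identifier quoting, so
it is for illustration only.

```python
from typing import Optional


def create_index_sql(
    index: str,
    table: str,
    column: str,
    fillfactor: int = 50,
    tablespace: Optional[str] = None,
) -> str:
    """Build a CREATE INDEX statement with a fillfactor and optional tablespace."""
    # PostgreSQL accepts btree fillfactor values from 10 to 100 (default 90).
    if not 10 <= fillfactor <= 100:
        raise ValueError("fillfactor must be between 10 and 100")
    sql = (
        f"CREATE INDEX {index} ON {table} ({column}) "
        f"WITH (fillfactor = {fillfactor})"
    )
    if tablespace is not None:
        # The tablespace must already exist (created with CREATE TABLESPACE).
        sql += f" TABLESPACE {tablespace}"
    return sql + ";"


if __name__ == "__main__":
    print(create_index_sql("orders_customer_idx", "orders", "customer_id", 50, "fast_disk"))
```

A lower fillfactor trades a larger index on disk for fewer page splits
when new keys land between existing ones.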

PostgreSQL does NOT store visibility info in the indexes, so they stay
small and updates to them are pretty fast.