Thread: Insert performance with composite index

Insert performance with composite index

From
Divakar Singh
Date:
Hi,
I am trying to tune my libpq program for insert performance.
When I tried inserting 1M rows into a table with a Primary Key, it took almost 62 seconds.
After adding a composite index of 2 columns, the performance degrades to 125 seconds.
I am using COPY to insert all data in 1 transaction.

the table definition is

CREATE TABLE ABC
(
  event integer,
  innodeid character varying(80),
  innodename character varying(80),
  sourceid character varying(300),
  intime timestamp(3) without time zone,
  outnodeid character varying(80),
  outnodename character varying(80),
  destinationid character varying(300),
  outtime timestamp(3) without time zone,
  bytes integer,
  cdrs integer,
  tableindex integer NOT NULL,
  noofsubfilesinfile integer,
  recordsequenceintegerlist character varying(1000),
  CONSTRAINT tableindex_pkey PRIMARY KEY (tableindex)
)

the index definition is


CREATE INDEX "PK_AT2"
  ON ABC
  USING btree
  (event, tableindex)
TABLESPACE sample;

Any tip to increase the insert performance in this case?

It would also be helpful if someone can send comprehensive libpq programming guide for PG 9.x. Online doc of libpq is not much helpful for a newbie like me.

 
Best Regards,
Divakar


Re: Insert performance with composite index

From
Divakar Singh
Date:
Hi Marti,
Thanks for your tips. i will try those.
I am on Solaris Sparc 5.10
 
Best Regards,
Divakar



From: Marti Raudsepp <marti@juffo.org>
To: Divakar Singh <dpsmails@yahoo.com>
Cc: pgsql-performance@postgresql.org
Sent: Mon, November 1, 2010 6:23:17 PM
Subject: Re: [PERFORM] Insert performance with composite index

On Mon, Nov 1, 2010 at 14:49, Divakar Singh <dpsmails@yahoo.com> wrote:
> I am trying to tune my libpq program for insert performance.
> When I tried inserting 1M rows into a table with a Primary Key, it took
> almost 62 seconds.
> After adding a composite index of 2 columns, the performance degrades to 125
> seconds.

This sounds a lot like the bottleneck I was hitting. What Linux kernel
version are you running?

If it's 2.6.33 or later, see:
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server#wal_sync_method_wal_buffers
http://archives.postgresql.org/pgsql-performance/2010-10/msg00602.php

Regards,
Marti

Re: Insert performance with composite index

From
Marti Raudsepp
Date:
On Mon, Nov 1, 2010 at 14:49, Divakar Singh <dpsmails@yahoo.com> wrote:
> I am trying to tune my libpq program for insert performance.
> When I tried inserting 1M rows into a table with a Primary Key, it took
> almost 62 seconds.
> After adding a composite index of 2 columns, the performance degrades to 125
> seconds.

This sounds a lot like the bottleneck I was hitting. What Linux kernel
version are you running?

If it's 2.6.33 or later, see:
http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server#wal_sync_method_wal_buffers
http://archives.postgresql.org/pgsql-performance/2010-10/msg00602.php

Regards,
Marti

Re: Insert performance with composite index

From
Marti Raudsepp
Date:
On Mon, Nov 1, 2010 at 14:56, Divakar Singh <dpsmails@yahoo.com> wrote:
> Thanks for your tips. i will try those.
> I am on Solaris Sparc 5.10

Sorry, I assumed you were running Linux. But still it could be the
same problem as I had.

Be careful changing your wal_sync_method, as it has the potential to
corrupt your database. I have no experience with Solaris.

For what it's worth, Jignesh Shah recommends using
wal_sync_method=fsync on Solaris:
http://blogs.sun.com/jkshah/entry/postgresql_on_solaris_better_use
http://blogs.sun.com/jkshah/entry/postgresql_wal_sync_method_and

Regards,
Marti

Re: Insert performance with composite index

From
Cédric Villemain
Date:
2010/11/1 Divakar Singh <dpsmails@yahoo.com>:
> Hi,
> I am trying to tune my libpq program for insert performance.
> When I tried inserting 1M rows into a table with a Primary Key, it took
> almost 62 seconds.
> After adding a composite index of 2 columns, the performance degrades to 125
> seconds.
> I am using COPY to insert all data in 1 transaction.
>
> the table definition is
>
> CREATE TABLE ABC
> (
>   event integer,
>   innodeid character varying(80),
>   innodename character varying(80),
>   sourceid character varying(300),
>   intime timestamp(3) without time zone,
>   outnodeid character varying(80),
>   outnodename character varying(80),
>   destinationid character varying(300),
>   outtime timestamp(3) without time zone,
>   bytes integer,
>   cdrs integer,
>   tableindex integer NOT NULL,
>   noofsubfilesinfile integer,
>   recordsequenceintegerlist character varying(1000),
>   CONSTRAINT tableindex_pkey PRIMARY KEY (tableindex)
> )
>
> the index definition is
>
>
> CREATE INDEX "PK_AT2"
>   ON ABC
>   USING btree
>   (event, tableindex)
> TABLESPACE sample;

Indexing twice the same column is useless. (perhaps move your PK to
the tablespace 'sample' is good too ?)

>
> Any tip to increase the insert performance in this case?

If you create or truncate  table then copy to it, you should create
index after the copy order.

>
> It would also be helpful if someone can send comprehensive libpq programming
> guide for PG 9.x. Online doc of libpq is not much helpful for a newbie like
> me.
>
>
> Best Regards,
> Divakar
>
>



--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

Re: Insert performance with composite index

From
Andres Freund
Date:
Hi,

On Monday 01 November 2010 13:49:14 Divakar Singh wrote:
> When I tried inserting 1M rows into a table with a Primary Key, it took
> almost 62 seconds.
> After adding a composite index of 2 columns, the performance degrades to
> 125 seconds.
> I am using COPY to insert all data in 1 transaction.
Without seeing your config its hard to suggest anything here. Did you do basic
tuning of your pg installation?

wal_buffers, shared_buffers, checkpoint_segments, maintenance_work_mem are
likely most relevant for that specific case.

Andres

Re: Insert performance with composite index

From
Andres Freund
Date:
On Monday 01 November 2010 15:08:10 Divakar Singh wrote:
> here are my parameters:
Which pg version is that?

Re: Insert performance with composite index

From
Divakar Singh
Date:
I am using 9.0.1
 
Best Regards,
Divakar



From: Andres Freund <andres@anarazel.de>
To: Divakar Singh <dpsmails@yahoo.com>
Cc: pgsql-performance@postgresql.org
Sent: Mon, November 1, 2010 7:44:31 PM
Subject: Re: [PERFORM] Insert performance with composite index

On Monday 01 November 2010 15:08:10 Divakar Singh wrote:
> here are my parameters:
Which pg version is that?

Re: Insert performance with composite index

From
Andres Freund
Date:
On Monday 01 November 2010 15:16:49 Divakar Singh wrote:
> I am using 9.0.1
Either thats not true or you cargo culted loads of your config from a
significantly older pg version.

Things like:

#bgwriter_delay = 200            # 10-10000 milliseconds between rounds
bgwriter_lru_percent = 0        # 0-100% of LRU buffers scanned/round
#bgwriter_lru_maxpages = 5        # 0-1000 buffers max written/round
#bgwriter_all_percent = 0.333        # 0-100% of all buffers scanned/round
bgwriter_all_maxpages = 0        # 0-1000 buffers max written/round

make me very suspicious.

As I said, I would check the variables I referenced in my first post...

Andres

Re: Insert performance with composite index

From
Divakar Singh
Date:
Do you mean these parameters have been removed starting 9.X?
As I see on http://www.network-theory.co.uk/docs/postgresql/vol3/BackgroundWriter.html ,these parameters were added starting from 8.0 right?

 
Best Regards,
Divakar



From: Andres Freund <andres@anarazel.de>
To: Divakar Singh <dpsmails@yahoo.com>
Cc: pgsql-performance@postgresql.org
Sent: Mon, November 1, 2010 7:50:59 PM
Subject: Re: [PERFORM] Insert performance with composite index

On Monday 01 November 2010 15:16:49 Divakar Singh wrote:
> I am using 9.0.1
Either thats not true or you cargo culted loads of your config from a
significantly older pg version.

Things like:

#bgwriter_delay = 200            # 10-10000 milliseconds between rounds
bgwriter_lru_percent = 0        # 0-100% of LRU buffers scanned/round
#bgwriter_lru_maxpages = 5        # 0-1000 buffers max written/round
#bgwriter_all_percent = 0.333        # 0-100% of all buffers scanned/round
bgwriter_all_maxpages = 0        # 0-1000 buffers max written/round

make me very suspicious.

As I said, I would check the variables I referenced in my first post...

Andres

Re: Insert performance with composite index

From
Andres Freund
Date:
On Monday 01 November 2010 15:28:19 Divakar Singh wrote:
> Do you mean these parameters have been removed starting 9.X?
> As I see on
> http://www.network-theory.co.uk/docs/postgresql/vol3/BackgroundWriter.html
> ,these parameters were added starting from 8.0 right?
No, I mean setting to 0 is a bit of a strange value in many situations.

And you have comments like:
#max_fsm_pages = 20000            # min max_fsm_relations*16, 6 bytes each
#max_fsm_relations = 1000        # min 100, ~70 bytes each

Which reference config options which do not exist anymore. And you have
shared_buffers = 81920
Which indicates that you started from 8.1/8.2 or so...

Andres

Re: Insert performance with composite index

From
hubert depesz lubaczewski
Date:
On Mon, Nov 01, 2010 at 02:57:56PM +0100, Cédric Villemain wrote:
> >   CONSTRAINT tableindex_pkey PRIMARY KEY (tableindex)
> > )
> > the index definition is
> > CREATE INDEX "PK_AT2"
> >   ON ABC
> >   USING btree
> >   (event, tableindex)
> > TABLESPACE sample;
>
> Indexing twice the same column is useless. (perhaps move your PK to
> the tablespace 'sample' is good too ?)

why do you say that?
these are not the same indexes and they serve different purposes.

Best regards,

depesz

--
Linkedin: http://www.linkedin.com/in/depesz  /  blog: http://www.depesz.com/
jid/gtalk: depesz@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007

Re: Insert performance with composite index

From
Cédric Villemain
Date:
2010/11/2 hubert depesz lubaczewski <depesz@depesz.com>:
> On Mon, Nov 01, 2010 at 02:57:56PM +0100, Cédric Villemain wrote:
>> >   CONSTRAINT tableindex_pkey PRIMARY KEY (tableindex)
>> > )
>> > the index definition is
>> > CREATE INDEX "PK_AT2"
>> >   ON ABC
>> >   USING btree
>> >   (event, tableindex)
>> > TABLESPACE sample;
>>
>> Indexing twice the same column is useless. (perhaps move your PK to
>> the tablespace 'sample' is good too ?)
>
> why do you say that?
> these are not the same indexes and they serve different purposes.

Given that tableindex is the PK column, I really like to now the usage
pattern for having it indexed twice.

>
> Best regards,
>
> depesz
>
> --
> Linkedin: http://www.linkedin.com/in/depesz  /  blog: http://www.depesz.com/
> jid/gtalk: depesz@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007
>



--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

Re: Insert performance with composite index

From
hubert depesz lubaczewski
Date:
On Tue, Nov 02, 2010 at 12:04:42PM +0100, Cédric Villemain wrote:
> 2010/11/2 hubert depesz lubaczewski <depesz@depesz.com>:
> > On Mon, Nov 01, 2010 at 02:57:56PM +0100, Cédric Villemain wrote:
> >> >   CONSTRAINT tableindex_pkey PRIMARY KEY (tableindex)
> >> > )
> >> > the index definition is
> >> > CREATE INDEX "PK_AT2"
> >> >   ON ABC
> >> >   USING btree
> >> >   (event, tableindex)
> >> > TABLESPACE sample;
> >>
> >> Indexing twice the same column is useless. (perhaps move your PK to
> >> the tablespace 'sample' is good too ?)
> >
> > why do you say that?
> > these are not the same indexes and they serve different purposes.
>
> Given that tableindex is the PK column, I really like to now the usage
> pattern for having it indexed twice.

select * from table where event = 123 order by tableindex desc limit 50;

Best regards,

depesz

--
Linkedin: http://www.linkedin.com/in/depesz  /  blog: http://www.depesz.com/
jid/gtalk: depesz@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007

Re: Insert performance with composite index

From
Divakar Singh
Date:
May be a query that is filtering based on these 2 columns?
 
Best Regards,
Divakar



From: Cédric Villemain <cedric.villemain.debian@gmail.com>
To: depesz@depesz.com
Cc: Divakar Singh <dpsmails@yahoo.com>; pgsql-performance@postgresql.org
Sent: Tue, November 2, 2010 4:34:42 PM
Subject: Re: [PERFORM] Insert performance with composite index

2010/11/2 hubert depesz lubaczewski <depesz@depesz.com>:
> On Mon, Nov 01, 2010 at 02:57:56PM +0100, Cédric Villemain wrote:
>> >   CONSTRAINT tableindex_pkey PRIMARY KEY (tableindex)
>> > )
>> > the index definition is
>> > CREATE INDEX "PK_AT2"
>> >   ON ABC
>> >   USING btree
>> >   (event, tableindex)
>> > TABLESPACE sample;
>>
>> Indexing twice the same column is useless. (perhaps move your PK to
>> the tablespace 'sample' is good too ?)
>
> why do you say that?
> these are not the same indexes and they serve different purposes.

Given that tableindex is the PK column, I really like to now the usage
pattern for having it indexed twice.

>
> Best regards,
>
> depesz
>
> --
> Linkedin: http://www.linkedin.com/in/depesz  /  blog: http://www.depesz.com/
> jid/gtalk: depesz@depesz.com / aim:depeszhdl / skype:depesz_hdl / gg:6749007
>



--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: Insert performance with composite index

From
Cédric Villemain
Date:
2010/11/2 hubert depesz lubaczewski <depesz@depesz.com>:
> On Tue, Nov 02, 2010 at 12:04:42PM +0100, Cédric Villemain wrote:
>> 2010/11/2 hubert depesz lubaczewski <depesz@depesz.com>:
>> > On Mon, Nov 01, 2010 at 02:57:56PM +0100, Cédric Villemain wrote:
>> >> >   CONSTRAINT tableindex_pkey PRIMARY KEY (tableindex)
>> >> > )
>> >> > the index definition is
>> >> > CREATE INDEX "PK_AT2"
>> >> >   ON ABC
>> >> >   USING btree
>> >> >   (event, tableindex)
>> >> > TABLESPACE sample;
>> >>
>> >> Indexing twice the same column is useless. (perhaps move your PK to
>> >> the tablespace 'sample' is good too ?)
>> >
>> > why do you say that?
>> > these are not the same indexes and they serve different purposes.
>>
>> Given that tableindex is the PK column, I really like to now the usage
>> pattern for having it indexed twice.
>
> select * from table where event = 123 order by tableindex desc limit 50;

Correct. Thanks Hubert.

--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support