Thread: INSERT performance.

INSERT performance.

From
"Marc Mitchell"
Date:
We have a fresh database and have begun to observe performance degradation
for INSERTs as a table went from empty to 100,000-ish rows.  Initial
INSERTs were sub second while after 30k rows, they were 1-3 seconds.

Note that we have done NO performance tuning yet nor are there any indexes
on the tables.

My question is about the way Postgres manages the organizational structure
of the base table data.  From my Ingres background, I know that that
product allows you to set the structure of the base data to be any of its
supported types (Hash, Heap, Btree, etc.).  And my experience tells me
that this can greatly affect INSERT performance and must be properly
managed.  For example, you have to consider the allocation of data pages
for hash and overflow and page fill factors for btrees to avoid excessive
splits and balancing of index data pages.  Is there anything like this in
Postgres?

I realize that these issues must also be considered for alternate indexes
but currently, we don't have any.  I'm sure eventually we will and will
need to consider the overhead indexes add to INSERTs.  But currently, we
are basically just INSERTing into rather empty tables.

Is there anything that can be done when defining a table or managing it
after the fact to ensure it is ready for inserts?  Aside from alternate
indexes, how is the base data for a table stored and structured?  While
much is written about shared cache, buffer and spreading tables across
disks, my thinking is that this has a much bigger impact on SELECTs (or
UPDATEs) than INSERTs.  So what can be done to optimize this?  Also, I'm
aware of fsync but don't wish to go there until I know we're not missing
something simpler.

Any feedback is very much appreciated.


Re: INSERT performance.

From
Bruce Momjian
Date:
Marc Mitchell wrote:
> We have a fresh database and have begun to observe performance degradation
> for INSERTs as a table went from empty to 100,000-ish rows.  Initial
> INSERTs were sub second while after 30k rows, they were 1-3 seconds.
>
> Note that we have done NO performance tuning yet nor are there any indexes
> on the tables.

Can we have PostgreSQL version and OS information?

> My question is about the way Postgres manages the organizational structure
> of the base table data.  From my Ingres background, I know that that
> product allows you to set the structure of the base data to be any of its
> supported types (Hash, Heap, Btree, etc.).  And my experience tells me
> that this can greatly affect INSERT performance and must be properly
> managed.  For example, you have to consider the allocation of data pages
> for hash and overflow and page fill factors for btrees to avoid excessive
> splits and balancing of index data pages.  Is there anything like this in
> Postgres?

Data is stored in flat heap files in PostgreSQL.  It is not like Ingres
where you can put a structure on the base table;  it is always heap, and
you add indexes as you need them.

Off the top of my head I can't guess why the INSERT would get slower
because it is merely putting data on the end of the table.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: INSERT performance.

From
Tom Lane
Date:
"Marc Mitchell" <marcm@eisolution.com> writes:
> We have a fresh database and have begun to observe performance degradation
> for INSERTs as a table went from empty to 100,000-ish rows.  Initial
> INSERTs were sub second while after 30k rows, they were 1-3 seconds.

> Note that we have done NO performance tuning yet nor are there any indexes
> on the tables.

INSERT per se should be a constant-time operation: stick another tuple
on the end of the file.  There is undoubtedly something you're not
telling us about the database schema.  Foreign keys, triggers, rules,
constraints, something like that?

            regards, tom lane

Re: INSERT performance.

From
"Marc Mitchell"
Date:
----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Marc Mitchell" <marcm@eisolution.com>
Cc: <pgsql-admin@postgresql.org>
Sent: Saturday, January 26, 2002 11:06 PM
Subject: Re: [ADMIN] INSERT performance.


> "Marc Mitchell" <marcm@eisolution.com> writes:
> > We have a fresh database and have begun to observe performance degradation
> > for INSERTs as a table went from empty to 100,000-ish rows.  Initial
> > INSERTs were sub second while after 30k rows, they were 1-3 seconds.
>
> > Note that we have done NO performance tuning yet nor are there any indexes
> > on the tables.
>
> INSERT per se should be a constant-time operation: stick another tuple
> on the end of the file.  There is undoubtedly something you're not
> telling us about the database schema.  Foreign keys, triggers, rules,
> constraints, something like that?
>

Based on the feedback received so far, it at least seems confirmed that
there is no configuration that needs to be considered for the basic storage
structure of a new table.  I assume tables can be considered heaps with a
pointer to the "top", allowing additions to be done with little fluctuation
tied to size.

While there are currently no triggers, rules, or constraints on the tables,
there are foreign keys defined, so perhaps it is the lookups for these that
are contributing to the problem.  This makes a lot of sense.
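
As a quick sanity check (just a sketch; 'orders' is a made-up table name),
the RI triggers that foreign key constraints create should show up in
pg_trigger, which would confirm that every INSERT fires a check:

SELECT c.relname AS table_name, t.tgname AS trigger_name
  FROM pg_trigger t, pg_class c
 WHERE t.tgrelid = c.oid
   AND c.relname = 'orders';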

Marc

> regards, tom lane



Re: INSERT performance.

From
"Marc Mitchell"
Date:
----- Original Message -----
From: "Bruce Momjian" <pgman@candle.pha.pa.us>
To: "Marc Mitchell" <marcm@eisolution.com>
Cc: <pgsql-admin@postgresql.org>
Sent: Saturday, January 26, 2002 10:51 PM
Subject: Re: [ADMIN] INSERT performance.


> Marc Mitchell wrote:
> > We have a fresh database and have begun to observe performance degradation
> > for INSERTs as a table went from empty to 100,000-ish rows.  Initial
> > INSERTs were sub second while after 30k rows, they were 1-3 seconds.
> >
> > Note that we have done NO performance tuning yet nor are there any indexes
> > on the tables.
>
> Can we have PostgreSQL version and OS information?

OS is Red Hat Linux (kernel 2.2.19 #8 SMP Fri Sep 21 10:04:24 CDT).
Postgres is PostgreSQL 7.1.2 on i686-pc-linux-gnu, compiled by GCC 2.96.




Re: INSERT performance.

From
hubert depesz lubaczewski
Date:
On Sat, Jan 26, 2002 at 08:10:19PM -0600, Marc Mitchell wrote:
> We have a fresh database and have begun to observe performance degradation
> for INSERTs as a table went from empty to 100,000-ish rows.  Initial
> INSERTs were sub second while after 30k rows, they were 1-3 seconds.

We just hit this problem when moving an old database to a new one (new
schema).  We had to insert approx. 1.5 million rows, and from an initial
0.02s/insert it got to 20 seconds per insert after several thousand inserts.
What we did was remove the foreign keys from the table we were inserting
into.  It helped: we managed to put 300k records in ca. 2-3 hours.
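
The idea, as a rough sketch with made-up table and column names (newer
releases have ALTER TABLE ... DROP CONSTRAINT; on 7.1 the foreign key is
implemented as RI triggers, so the exact commands differ):

-- drop the foreign key before the bulk load (hypothetical names)
ALTER TABLE hits DROP CONSTRAINT hits_user_fk;

-- ... run the bulk INSERTs / COPY here ...

-- re-create it afterwards; the constraint is validated against the
-- existing rows in one go at this point
ALTER TABLE hits ADD CONSTRAINT hits_user_fk
  FOREIGN KEY (user_id) REFERENCES users (user_id);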

depesz

--
hubert depesz lubaczewski                          http://www.depesz.pl/
------------------------------------------------------------------------
... vows are spoken to be broken ...                 [enjoy the silence]
... words are meaningless and forgettable ...             [depeche mode]

Re: INSERT performance.

From
Sean Chittenden
Date:
> > We have a fresh database and have begun to observe performance
> > degradation for INSERTs as a table went from empty to 100,000-ish
> > rows.  Initial INSERTs were sub second while after 30k rows, they
> > were 1-3 seconds.
>
> we just hit this problem when moving old database to new one (new
> schema).  we had to insert approx. 1.5 million rows, and from
> initial 0.02s/insert after several thousands of inserts it came to
> 20 seconds per insert.  what we did was removing foreign keys from
> table which we made inserts to.  it helped. we manage to put 300k
> records in ca. 2-3 hours.

If possible, use the COPY command.  We did 90K rows in about 40sec
using this puppy on a Solaris U5 (took over 130sec for MySQL on the
same box in case the performance geeks in the crowd are interested).

http://www.postgresql.org/idocs/index.php?sql-copy.html
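
A minimal sketch (table name and file path are made up; COPY reads the file
on the server side and expects tab-delimited text by default):

COPY hits FROM '/tmp/hits.dat';

-- or, from a psql client session:
-- \copy hits FROM 'hits.dat'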

-sc

--
Sean Chittenden

Re: INSERT performance.

From
Ferdinand Smit
Date:
On Sunday 27 January 2002 20:31, Sean Chittenden wrote:
> > > We have a fresh database and have begun to observe performance
> > > degradation for INSERTs as a table went from empty to 100,000-ish
> > > rows.  Initial INSERTs were sub second while after 30k rows, they
> > > were 1-3 seconds.
> >
> > we just hit this problem when moving old database to new one (new
> > schema).  we had to insert approx. 1.5 million rows, and from
> > initial 0.02s/insert after several thousands of inserts it came to
> > 20 seconds per insert.  what we did was removing foreign keys from
> > table which we made inserts to.  it helped. we manage to put 300k
> > records in ca. 2-3 hours.
>
> If possible, use the COPY command.  We did 90K rows in about 40sec
> using this puppy on a Solaris U5 (took over 130sec for MySQL on the
> same box in case the performance geeks in the crowd are interested).

We were transferring a MySQL database to a new Linux server (dual PIII-800).
I don't know how many rows, but the dump was 8 GB (not zipped).
It took us 4 hours to import, and 5 hours to create the indexes.

As a test we also created a Postgres database on another server (same type).
The COPY command didn't work because of 'strange characters', so we used
normal INSERTs.  It took us 12 hours to import, and 10 hours to create the
indexes.

Although I like Postgres more, MySQL is still faster with huge (simple)
data.

Regards,
Ferdinand

Re: INSERT performance.

From
Sean Chittenden
Date:
> > > > We have a fresh database and have begun to observe performance
> > > > degradation for INSERTs as a table went from empty to
> > > > 100,000-ish rows.  Initial INSERTs were sub second while after
> > > > 30k rows, they were 1-3 seconds.
> > >
> > > we just hit this problem when moving old database to new one
> > > (new schema).  we had to insert approx. 1.5 million rows, and
> > > from initial 0.02s/insert after several thousands of inserts it
> > > came to 20 seconds per insert.  what we did was removing foreign
> > > keys from table which we made inserts to.  it helped. we manage
> > > to put 300k records in ca. 2-3 hours.
> >
> > If possible, use the COPY command.  We did 90K rows in about 40sec
> > using this puppy on a Solaris U5 (took over 130sec for MySQL on
> > the same box in case the performance geeks in the crowd are
> > interested).
>
> We were transfering a mysql-database to a new linux-server (PIII-800
> dual). I don't now how mutch rows, but the dump was 8 Gb (not
> zipped).  It took us 4 hours to import, and 5 hours to create the
> indexes.

How were you inserting the data?  Were you doing multiple inserts per
transaction?  Copy?  That sounds really slow to me.

> By testing we created a postgres database to on an other server
> (same type).  The copy command did'nt work, because of 'strange
> characters', so we used normal inserts. It took us 12 hours to
> import, and 10 hours to create the indexes.

Have you tried to escape the data before you inserted it?  That
should solve things.

http://www.postgresql.org/idocs/index.php?sql-copy.html
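
For example (made-up data; fields are separated by literal tab characters),
COPY's text format wants a literal backslash doubled, embedded newlines and
tabs written as \n and \t, and NULL written as \N:

COPY hits FROM stdin;
1	O'Brien	a literal backslash is written as \\
2	Smith	an embedded newline is \n, an embedded tab is \t
3	\N	NULL values are written as \N
\.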

> Although, i like postgres more, mysql is still faster with hugh
> (simple) data.

I've only found that to be the case in a few instances,
actually... and typically with small data sets of less than 1M
rows.  Vacuum analyze and turn fsync off.  :~)  -sc

--
Sean Chittenden

Re: INSERT performance.

From
Ferdinand Smit
Date:
Hi,

> > > > > We have a fresh database and have begun to observe performance
> > > > > degradation for INSERTs as a table went from empty to
> > > > > 100,000-ish rows.  Initial INSERTs were sub second while after
> > > > > 30k rows, they were 1-3 seconds.
> > > >
> > > > we just hit this problem when moving old database to new one
> > > > (new schema).  we had to insert approx. 1.5 million rows, and
> > > > from initial 0.02s/insert after several thousands of inserts it
> > > > came to 20 seconds per insert.  what we did was removing foreign
> > > > keys from table which we made inserts to.  it helped. we manage
> > > > to put 300k records in ca. 2-3 hours.
> > >
> > > If possible, use the COPY command.  We did 90K rows in about 40sec
> > > using this puppy on a Solaris U5 (took over 130sec for MySQL on
> > > the same box in case the performance geeks in the crowd are
> > > interested).
> >
> > We were transfering a mysql-database to a new linux-server (PIII-800
> > dual). I don't now how mutch rows, but the dump was 8 Gb (not
> > zipped).  It took us 4 hours to import, and 5 hours to create the
> > indexes.
>
> How were you inserting the data?  Were you doing multiple inserts per
> transactions?  Copy?  That sounds really slow to me.

The database mainly contains integers, which represent the behavior of
internet users on our site, so it's very compact data.
We used multi-row inserts with MySQL but I didn't find that option in Postgres.

> > By testing we created a postgres database to on an other server
> > (same type).  The copy command did'nt work, because of 'strange
> > characters', so we used normal inserts. It took us 12 hours to
> > import, and 10 hours to create the indexes.
>
> Have you tried to escape the data before you inserted it?  That
> should've solve things.

No, how do you do that?

> http://www.postgresql.org/idocs/index.php?sql-copy.html
>
> > Although, i like postgres more, mysql is still faster with hugh
> > (simple) data.
>
> I've never found that to be the case in only a few instances
> actually... and typically with small data sets that are less than 1M
> rows.  vacuum analyze and turn fsync off.  :~)  -sc

Of course, fsync was off and I increased the memory usage.  Vacuum is not
much use during a complete import ;-)

Ferdinand

Re: INSERT performance.

From
"Eduardo Caillava"
Date:
We have PostgreSQL 7.1.3 running on a NetVista PIII 800 with 512 MB RAM and
two 80 GB IDE 10krpm Seagate disks, using a linked boot script in
/etc/rc.d/rc2.d with "hdparm -X66 -u1 -d1 -m16 -c3".

We cannot use "copy" because of "strange char..."

So, each week (WITH 8K ROWS), from a Windows box running VFoxPro, we do
around 900,000 row inserts in 25-30 mins. max, and create a complex index
(three numeric fields + one varchar: rep,ord,afi,nomb) in about 20 or 30
seconds.

Can you use that?

Hope it helps you

Eduardo Caillava
Tucumán
Argentina

----- Original Message -----
From: "Ferdinand Smit" <ferdinand@telegraafnet.nl>
To: <pgsql-admin@postgresql.org>
Sent: Wednesday, January 30, 2002 6:42 AM
Subject: Re: [ADMIN] INSERT performance.


> Hi,
>
> > > > > > We have a fresh database and have begun to observe performance
> > > > > > degradation for INSERTs as a table went from empty to
> > > > > > 100,000-ish rows.  Initial INSERTs were sub second while after
> > > > > > 30k rows, they were 1-3 seconds.
> > > > >
> > > > > we just hit this problem when moving old database to new one
> > > > > (new schema).  we had to insert approx. 1.5 million rows, and
> > > > > from initial 0.02s/insert after several thousands of inserts it
> > > > > came to 20 seconds per insert.  what we did was removing foreign
> > > > > keys from table which we made inserts to.  it helped. we manage
> > > > > to put 300k records in ca. 2-3 hours.
> > > >
> > > > If possible, use the COPY command.  We did 90K rows in about 40sec
> > > > using this puppy on a Solaris U5 (took over 130sec for MySQL on
> > > > the same box in case the performance geeks in the crowd are
> > > > interested).
> > >
> > > We were transfering a mysql-database to a new linux-server (PIII-800
> > > dual). I don't now how mutch rows, but the dump was 8 Gb (not
> > > zipped).  It took us 4 hours to import, and 5 hours to create the
> > > indexes.
> >
> > How were you inserting the data?  Were you doing multiple inserts per
> > transactions?  Copy?  That sounds really slow to me.
>
> The database mainly contains integers, which represent the behavior of
> internet users on our site, so it's very compact data.
> We used multiple insert with mysql but i did'nt find that option in postgres.
>
> > > By testing we created a postgres database to on an other server
> > > (same type).  The copy command did'nt work, because of 'strange
> > > characters', so we used normal inserts. It took us 12 hours to
> > > import, and 10 hours to create the indexes.
> >
> > Have you tried to escape the data before you inserted it?  That
> > should've solve things.
>
> No, how do you do that ?
>
> > http://www.postgresql.org/idocs/index.php?sql-copy.html
> >
> > > Although, i like postgres more, mysql is still faster with hugh
> > > (simple) data.
> >
> > I've never found that to be the case in only a few instances
> > actually... and typically with small data sets that are less than 1M
> > rows.  vacuum analyze and turn fsync off.  :~)  -sc
>
> Of course, fsync was off and i increased the memory usage. Vacuum is not
> usable with a total import ;-)
>
> Ferdinand
>


Re: INSERT performance.

From
Sean Chittenden
Date:
> > How were you inserting the data?  Were you doing multiple inserts per
> > transactions?  Copy?  That sounds really slow to me.
>
> The database mainly contains integers, which represent the behavior of
> internet users on our site, so it's very compact data.
> We used multiple insert with mysql but i did'nt find that option in postgres.

BEGIN;
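-- all of these INSERTs run in one transaction, so they share a single commit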
INSERT INTO table (val1, val2, val3) VALUES (1,2,3);
INSERT INTO table (val1, val2, val3) VALUES (4,5,6);
INSERT INTO table (val1, val2, val3) VALUES (7,8,9);
INSERT INTO table (val1, val2, val3) VALUES (10,11,12);
[...]
COMMIT;

> > > By testing we created a postgres database to on an other server
> > > (same type).  The copy command did'nt work, because of 'strange
> > > characters', so we used normal inserts. It took us 12 hours to
> > > import, and 10 hours to create the indexes.
> >
> > Have you tried to escape the data before you inserted it?  That
> > should've solve things.
>
> No, how do you do that ?

Do you know what those 'strange characters' are?  -sc

--
Sean Chittenden