Thread: performance "tests"

performance "tests"

From: Raphael Bauduin
Hi,

Having read on this list (some time ago) that inserts could become slow
when there are foreign key constraints, I wanted to test it. So I
created a DB with 5 tables (T1 -> T5) where, for 0<i<j<6, Tj has a
foreign key referencing Ti. More concretely:

create table T1(k1 integer NOT NULL PRIMARY KEY, k2 integer, k3 integer, k4 integer, k5 integer);
create table T2(k2 integer NOT NULL PRIMARY KEY, k1 integer, k3 integer, k4 integer, k5 integer,
    FOREIGN KEY(k1) REFERENCES T1 ON DELETE CASCADE);
create table T3(k3 integer NOT NULL PRIMARY KEY, k1 integer, k2 integer, k4 integer, k5 integer,
    FOREIGN KEY(k1) REFERENCES T1 ON DELETE CASCADE,
    FOREIGN KEY(k2) REFERENCES T2 ON DELETE CASCADE);
create table T4(k4 integer NOT NULL PRIMARY KEY, k1 integer, k2 integer, k3 integer, k5 integer,
    FOREIGN KEY(k1) REFERENCES T1 ON DELETE CASCADE,
    FOREIGN KEY(k2) REFERENCES T2 ON DELETE CASCADE,
    FOREIGN KEY(k3) REFERENCES T3 ON DELETE CASCADE);
create table T5(k5 integer NOT NULL PRIMARY KEY, k1 integer, k2 integer, k3 integer, k4 integer,
    FOREIGN KEY(k1) REFERENCES T1 ON DELETE CASCADE,
    FOREIGN KEY(k2) REFERENCES T2 ON DELETE CASCADE,
    FOREIGN KEY(k3) REFERENCES T3 ON DELETE CASCADE,
    FOREIGN KEY(k4) REFERENCES T4 ON DELETE CASCADE);

I also wrote a Python script to populate this DB (see below). I certainly don't
pretend this test reflects reality, but I'd like an explanation for one of the
things that happens. I now have 300000 rows in each table. The database fills
quite fast, with the postmaster taking something like 25% of the CPU. At times,
though, it seems to hang: no rows are inserted for more than 10 seconds, and
during that time the postmaster process uses 0% CPU. Why is that? I would have
expected the postmaster to burn a lot of CPU to insert a few rows once the DB
gets large, but that's not what happens. When rows are being inserted, they go
in at more or less the same speed as when the DB was empty; when they go in
more slowly, it's because the postmaster process is using less CPU.

What's strange is that everything else on the machine hangs as well! Could that be due to the CPU?

Thanks for your help.

Raph.




Some info:

Debian GNU/Linux

cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 6
model           : 6
model name      : AMD Athlon(tm) XP
stepping        : 2
cpu MHz         : 1050.052
cache size      : 256 KB



dpkg -l postgresql
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Installed/Config-files/Unpacked/Failed-config/Half-installed
|/ Err?=(none)/Hold/Reinst-required/X=both-problems (Status,Err: uppercase=bad)
||/ Name           Version        Description
+++-==============-==============-============================================
ii  postgresql     7.1.3-7        Object-relational SQL database, descended fr


uname -r
2.4.17

cat /etc/debian_version
3.0




Here's the script:

import random
import sys
from pyPgSQL import libpq

dbname = 'test'


random.seed()


cnx = libpq.PQconnectdb('host=localhost user=rb password=linuxxxx dbname=%s' % dbname)

# Insert the first 5 entries in each table
for i in range(1, 6):        # insert 5 rows
    for j in range(1, 6):    # into each of the 5 tables
        res = cnx.query('INSERT INTO T'+str(j)+' VALUES('+str(i)+','+str(i)+','+str(i)+','+str(i)+','+str(i)+')')



N=[5,5,5,5,5]    #number of rows in each table


while N[0] < 1000000:
    t = random.randrange(1, 6, 1)    # table in which to insert the next row
    k = random.randrange(1, 6, 1)    # table to which we link the inserted row
    r = [random.randrange(1, N[0], 1), random.randrange(1, N[1], 1), random.randrange(1, N[2], 1),
         random.randrange(1, N[3], 1), random.randrange(1, N[4], 1)]
    r[t-1] = N[t-1] + 1    # in table Tt, the field kt is the primary key and gets the value N[t]+1
    sqlquery = "insert into T" + str(t) + " (k1,k2,k3,k4,k5) values ('" + str(r[0]) + "','" + str(r[1]) + "','" + str(r[2]) + "','" + str(r[3]) + "','" + str(r[4]) + "')"
    print sqlquery
    res = cnx.query(sqlquery)
    N[t-1] = N[t-1] + 1

del cnx, res
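
For reference, here is a minimal timing variant of the insert loop (just a
sketch, reusing the same pyPgSQL calls as above and assuming the 5 initial
rows have already been inserted) that reports the elapsed time per 1000
inserts, which makes the stalls easy to measure:

import time
import random
from pyPgSQL import libpq

cnx = libpq.PQconnectdb('host=localhost user=rb password=linuxxxx dbname=test')

N = [5, 5, 5, 5, 5]    # current number of rows in each table
batch = 1000           # report timing after every 'batch' inserts
count = 0
start = time.time()

while N[0] < 1000000:
    t = random.randrange(1, 6, 1)                 # table to insert into
    r = [random.randrange(1, n, 1) for n in N]    # random existing keys to reference
    r[t-1] = N[t-1] + 1                           # new primary key value for Tt
    cnx.query("insert into T%d (k1,k2,k3,k4,k5) values (%d,%d,%d,%d,%d)"
              % tuple([t] + r))
    N[t-1] = N[t-1] + 1
    count = count + 1
    if count % batch == 0:
        now = time.time()
        print "%d rows inserted, last %d took %.1f seconds" % (count, batch, now - start)
        start = now

del cnx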



Re: performance "tests"

From: Oliver Elphick
On Wed, 2002-04-10 at 14:00, Raphael Bauduin wrote:
...
> dpkg -l postgresql
> Desired=Unknown/Install/Remove/Purge/Hold
> | Status=Not/Installed/Config-files/Unpacked/Failed-config/Half-installed
> |/ Err?=(none)/Hold/Reinst-required/X=both-problems (Status,Err: uppercase=bad)
> ||/ Name           Version        Description
> +++-==============-==============-============================================
> ii  postgresql     7.1.3-7        Object-relational SQL database, descended fr

7.2.1 is available in Debian unstable.

It might be better to run tests on the latest version.

--
Oliver Elphick                                Oliver.Elphick@lfix.co.uk
Isle of Wight                              http://www.lfix.co.uk/oliver
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839  932A 614D 4C34 3E1D 0C1C

     "But seek ye first the kingdom of God, and his
      righteousness; and all these things shall be added
      unto you."     Matthew 6:33


Re: performance "tests"

From: Tom Lane
Raphael Bauduin <raphael@be.easynet.net> writes:
> At some times, it seems to hang: it doesn't insert any rows for more
> than 10 seconds. At that time, the postmaster process takes 0%. Why is
> that?

At a guess, you're seeing the syncer daemon flushing a lot of dirty
kernel disk buffers out to disk, and thereby monopolizing disk I/O.
I haven't experimented too much with Linux, but on HPUX it's not
difficult for a sync() call to bring the system to its knees for many
seconds, if you've got application programs that have written a whole
lot of pages since the last sync.

> What's strange is that everything else hangs also! Would that be due
> to the CPU??

The CPU is free, but everything that wants to access disk is starved
for disk bandwidth ...

> ii  postgresql     7.1.3-7        Object-relational SQL database, descended fr

If you're going to run tests on foreign-key performance, please use
7.2.*.

            regards, tom lane

Re: performance "tests"

From: Naomi Walker
From my many years of Informix experience, we noticed that checkpoints
during high-activity times took a long time, because they locked the
shared memory segment.  We found that setting the checkpoint knobs to flush
almost constantly was, overall, much better for performance.

Looking in postgresql.conf, it seems that some tweaking of
CHECKPOINT_SEGMENTS and CHECKPOINT_TIMEOUT is in order.

I also see some interesting items in the WAL_* configuration parameters,
and would look at these as well.  Again, in Informix-speak, we were able to
control when the buffers flushed to disk, with parameters like:
                         Start flushing buffers when they are X% full
                            and keep flushing until they are down to Y% full

Overall, having TONS of buffers helped benchmark performance, but could
have slowed down checkpoints had we not continually flushed to disk.
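
For illustration, these knobs live in postgresql.conf; a sketch of the
relevant lines (the parameter names exist in 7.2, but the values shown are
only placeholders to illustrate the idea of checkpointing and flushing more
often, not recommendations):

# postgresql.conf (7.2) -- illustrative values only
checkpoint_segments = 3      # WAL segments between automatic checkpoints; smaller = more frequent checkpoints
checkpoint_timeout = 300     # maximum seconds between automatic checkpoints
wal_buffers = 8              # shared-memory buffers for WAL before it is written out
wal_sync_method = fsync      # how WAL writes are forced to disk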

At 11:37 AM 4/10/02 -0400, Tom Lane wrote:
>Raphael Bauduin <raphael@be.easynet.net> writes:
> > At some times, it seems to hang: it doesn't insert any rows for more
> > than 10 seconds. At that time, the postmaster process takes 0%. Why is
> > that?
>
>At a guess, you're seeing the syncer daemon flushing a lot of dirty
>kernel disk buffers out to disk, and thereby monopolizing disk I/O.
>I haven't experimented too much with Linux, but on HPUX it's not
>difficult for a sync() call to bring the system to its knees for many
>seconds, if you've got application programs that have written a whole
>lot of pages since the last sync.


--
Naomi Walker
Chief Information Officer
Eldorado Computing, Inc.
602-604-3100  ext 242


Re: performance "tests"

From: Bruce Momjian
Naomi Walker wrote:
>  From my many years of Informix knowledge, we noticed that checkpoints,
> during high activity times, did take a long time, because it locked the
> shared memory segment.  We found that setting the checkpoint knobs to flush
> almost constantly, overall, was much better for performance.
>
> Looking in postgresql.conf, it seems that some tweaking of :
> CHECKPOINT_SEGMENTS and CHECKPOINT_TIMEOUT are in order.
>
> I also see some interesting items in the WAL_* configuration parameters,
> and would look at these as well.  Again, in Informix-speak, we were able to
> control when the buffers flushed to disk, with parameters like:
>                          Start flushing buffers when they are X% full
>                             and keep flushing until they are X% full
>
> Overall, having TONS of buffers helped benchmark performance, but could
> have slowed down checkpoints had we not continually flushed to disk.

Actually, we don't lock shared memory like Informix does.  As I
remember, other backends can write to the WAL while we are doing the
checkpoint.  In fact, there is code in there that expects the WAL file
may grow during checkpointing.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: performance "tests"

From: Bruce Momjian
Tom Lane wrote:
> Raphael Bauduin <raphael@be.easynet.net> writes:
> > At some times, it seems to hang: it doesn't insert any rows for more
> > than 10 seconds. At that time, the postmaster process takes 0%. Why is
> > that?
>
> At a guess, you're seeing the syncer daemon flushing a lot of dirty
> kernel disk buffers out to disk, and thereby monopolizing disk I/O.
> I haven't experimented too much with Linux, but on HPUX it's not
> difficult for a sync() call to bring the system to its knees for many
> seconds, if you've got application programs that have written a whole
> lot of pages since the last sync.

Some BSD's implement trickle sync, for this very reason.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: performance "tests"

From: Tom Lane
Raphael Bauduin <raphael.bauduin@be.easynet.net> writes:
> I just installed 7.2.1. I run 7.1.3 in parallel. after I run the script
> to introduce 1000 rows in T1 in each version, deleting is much faster
> with 7.1.3 than with 7.2.1 (I took the same configuration parameters for
> both servers...).  When running the delete query in parallel, 7.1.3
> finishes really faster.

Uh, what was the test query again?  And what does EXPLAIN say about it
on each version?

            regards, tom lane

Re: performance "tests"

From: Raphael Bauduin
On Thu, Apr 11, 2002 at 10:34:40AM -0400, Tom Lane wrote:
> Raphael Bauduin <raphael.bauduin@be.easynet.net> writes:
> > I just installed 7.2.1. I run 7.1.3 in parallel. after I run the script
> > to introduce 1000 rows in T1 in each version, deleting is much faster
> > with 7.1.3 than with 7.2.1 (I took the same configuration parameters for
> > both servers...).  When running the delete query in parallel, 7.1.3
> > finishes really faster.
>
> Uh, what was the test query again?  And what does EXPLAIN say about it
> on each version?

I just want to delete all the rows in table T1. As all the other tables have
foreign keys referencing T1, all rows in all tables get deleted.

For example, from T1:

test=# select * from T1 where k1>50 and k1< 55;
 k1 | k2 | k3 | k4 | k5
----+----+----+----+----
 51 |  3 | 42 | 26 |  7
 52 | 15 | 32 | 35 |  2
 53 |  4 |  8 | 11 | 12
 54 |  3 | 33 | 38 | 26
(4 rows)

For each Tj, kj is the primary key.
For 0<i<j<6, Tj has a foreign key (ki) referencing Ti (ki is also the
primary key of Ti), so Tj has j-1 foreign keys (T2 has one from T1, T3 has
2: from T1 and T2, ...).
The tables don't all have exactly the same number of rows, but they are more
or less the same size (rows are inserted into the tables at random).

From 7.1.3:
test=# select count(*) from T1;
 count
-------
  1000
(1 row)

test=# explain delete from T1;
NOTICE:  QUERY PLAN:

Seq Scan on t1  (cost=0.00..20.00 rows=1000 width=6)

EXPLAIN

From 7.2.1:
test=# select count(*) from T1;
 count
-------
  1000
(1 row)

test=# explain delete from T1;
NOTICE:  QUERY PLAN:

Seq Scan on t1  (cost=0.00..20.00 rows=1000 width=6)

EXPLAIN


So it says exactly the same thing for both versions.




Re: performance "tests"

From: Raphael Bauduin
> If you're going to run tests on foreign-key performance, please use
> 7.2.*.


I just installed 7.2.1 and am running 7.1.3 in parallel. After running the
script to put 1000 rows in T1 under each version, deleting is much faster
with 7.1.3 than with 7.2.1 (I used the same configuration parameters for
both servers...).  When running the delete query on both in parallel, 7.1.3
finishes much faster.

For inserts, 7.2.1 is slightly faster for the first 3000 inserts; from 3000
to 5000 inserts it's more or less equivalent.

I'll dig further into the documentation to see if I forgot something, but if
you have any hints, they're welcome :-)


Raph