Thread: performance test

performance test

From

Michael Meskes

Date:

18 January 1999, 12:01:40

I tried inserting 15000 tuples each in two tables via pgsql and timed it.
After insertion I also created btree indeces on both tables.

Some results:

First try was with a 6.4.1 version without -F option and all inserts in one
transaction:
real    2m11.084s
user    0m1.760s
sys     0m1.680s

Then I switched to 6.5 without -F:
real    2m35.833s
user    0m2.900s
sys     0m3.910s

So some of the latest changes cost quite a lot of time. On the other hand I
was surprised to see that the size of the table file decreased by about 10%
when switching to 6.5.

Then I switched on -F:
real    1m11.573s
user    0m3.150s
sys     0m3.760s

I've seen even bigger advantages of -F.

For comparison I tried without BEGIN/END:
real    1m12.463s
user    0m3.030s
sys     0m4.480s

So, yes it is slower than just one transaction, but not very much.

Finally I tried the same script (except for changing int4 to number(4)) on
Oracle 8.0.5:
real    1m29.248s
user    0m24.460s
sys     0m4.880s

I got similar results when I tried the last time. On stuff like insertions
we are quite a lot faster than Oracle if we use -F. 

All test were run on my AMD K6/350 with 64MB ram running Linux-2.2.0-pre7.

michael

-- 
Michael Meskes                         | Go SF 49ers!
Th.-Heuss-Str. 61, D-41812 Erkelenz    | Go Rhein Fire!
Tel.: (+49) 2431/72651                 | Use Debian GNU/Linux!
Email: Michael.Meskes@gmx.net          | Use PostgreSQL!

Re: [HACKERS] performance test

From

Vadim Mikheev

Date:

18 January 1999, 22:28:18

Michael Meskes wrote:
> 
> I tried inserting 15000 tuples each in two tables via pgsql and timed it.
> After insertion I also created btree indeces on both tables.
> 
...
> 
> Then I switched to 6.5 without -F:
> real    2m35.833s
> user    0m2.900s
> sys     0m3.910s
> 
...
> 
> Then I switched on -F:
> real    1m11.573s
> user    0m3.150s
> sys     0m3.760s
> 
> I've seen even bigger advantages of -F.

You've seen disadvantages of our unperfect buffer manager -:)
When server need in buffer for new data comming and there is
no unused buffers in pool (i.e. - all buffers filled with new
data and marked as dirty), server gets some dirty buffer,
writes it AND FSYNC file. So, server does many fsyncs
even with BEGIN/END while should do _one_ fsync at COMMIT.

Having this problem fixed you wouldn't had so big difference
between -F and BEGIN/END-no-F.

Vadim

Re: [HACKERS] performance test

From

Bruce Momjian

Date:

19 January 1999, 01:01:20

> You've seen disadvantages of our unperfect buffer manager -:)
> When server need in buffer for new data comming and there is
> no unused buffers in pool (i.e. - all buffers filled with new
> data and marked as dirty), server gets some dirty buffer,
> writes it AND FSYNC file. So, server does many fsyncs
> even with BEGIN/END while should do _one_ fsync at COMMIT.
> 
> Having this problem fixed you wouldn't had so big difference
> between -F and BEGIN/END-no-F.

Is this idea still reasonable?

---------------------------------------------------------------------------

Here is an archive of the pg_log discussion.

---------------------------------------------------------------------------

From: Bruce Momjian <maillist@candle.pha.pa.us>
Message-Id: <199711170542.AAA24561@candle.pha.pa.us>
Subject: [HACKERS] Bufferd loggins/pg_log
To: hackers@postgreSQL.org (PostgreSQL-development)
Date: Mon, 17 Nov 1997 00:42:18 -0500 (EST)
Cc: vadim@sable.krasnoyarsk.su (Vadim B. Mikheev)
X-Mailer: ELM [version 2.4 PL25]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-hackers@hub.org
Precedence: bulk
Status: OR

Here is my current idea for doing bufferd logging, and exists between
the normal fsync on every transaction and no-fsync options.  I believe
it will be very popular, because it mimicks the Unix file system
reliability structure.

---------------------------------------------------------------------------

On startup, the postmaster makes a copy of pg_log, called pg_log_live.
Each postgres backend mmaps() this new file into its address space.  A
lock is gotten to make changes to the file.  All backend use pg_log_live
rather than pg_log.  Only the postmaster write to pg_log.  (I will
someday remove the exec() from postmaster, so backends will get this
address space automatically.) 

The first 512 bytes of pg_log and pg_log_live are used for log managment
information.  We add a new field to pg_log_live called min_xid_commit
which records the lowest transaction id that any backend has committed
since the start of the last sync pass of the postmater.  We also add
fields to record current pg_variable oid and xid at the same time.  (xid
may have to be moved into pg_variable so backends can fsync it (see
below).)

Every 60-90 seconds, the postmaster gets a write lock on pg_log, takes
the minimum xid set in the start of pg_log, and resets its value.  It
records the current oid and xid from pg_variable.  It then clears the
lock, and starts reading from the minimum recorded xid changed to the
end of pg_log_live, and copies it into allocated memory.  It then does a
sync (twice?), waits for completion, and then writes the pg_log_live
partial copy it made to pg_log.  We update the copies of oid and xid we
saved before the sync to the bottom of pg_log_live.  

We can change the 60-90 seconds to be longer, but the system does it
every 30 seconds anyway.

When the postmaster stops, it does this same operation before shutting
down, and pg_log_live is removed.

We make a copy of the current xid and oid in the front of pg_log_live,
so that if the postmaster starts up, and pg_log_live exists, the
postmaster adds 10,000 to xid and oid of pg_variable, so no previously
used but unsynced values are used.

We know that the current values of pg_variable could not have been
exceeded by 10,000, because each backend consults the pg_log copies of
these variable to make sure they do not exceed 10,000 from the value
before the last sync.  They exceed those values only by fscyn'ing every
10,000 increments.

Said another way, if a postgres backend exceeds the pg_log last xid or
oid of pg_log, or any 10,000 multiple, it must fsync the change to
pg_variable.  This way, a crash skips over any unsynced oid/xid's used,
and this is done without having to keep fsyncing pg_variable.  In most
cases, the 10,000 will never be exceeded by a backend before the
postmaster does a sync and increases the last xid/oid again.

I think this is a very clean way to give us no-fync performance with
full-rollback buffered logging.  The specification is clean and almost
complete enough for coding.

I think this gives us what we need, by having a mmap'ed() pg_log_live,
which backends can use, and a postmaster-controlled pg_log, which is
used on startup, with xid/oid controls in a crash situation to skip over
partially committed transactions.

Comments?

-- 
Bruce Momjian
maillist@candle.pha.pa.us

---------------------------------------------------------------------------

Sender: root@www.krasnet.ru
Message-ID: <346FF895.167EB0E7@sable.krasnoyarsk.su>
Date: Mon, 17 Nov 1997 14:56:05 +0700
From: "Vadim B. Mikheev" <vadim@sable.krasnoyarsk.su>
Organization: ITTS (Krasnoyarsk)
X-Mailer: Mozilla 3.01 (X11; I; FreeBSD 2.2.5-RELEASE i386)
MIME-Version: 1.0
To: Bruce Momjian <maillist@candle.pha.pa.us>
CC: PostgreSQL-development <hackers@postgreSQL.org>,       "Vadim B. Mikheev" <vadim@post.krasnet.ru>
Subject: Re: Bufferd loggins/pg_log
References: <199711170542.AAA24561@candle.pha.pa.us>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Status: OR

Bruce Momjian wrote:
> 
> On startup, the postmaster makes a copy of pg_log, called pg_log_live.
> Each postgres backend mmaps() this new file into its address space.  A
> lock is gotten to make changes to the file.  All backend use pg_log_live
> rather than pg_log.  Only the postmaster write to pg_log.  (I will
> someday remove the exec() from postmaster, so backends will get this
> address space automatically.)

What are advantages of mmaping entire pg_log over "online" pg_log
pages ?
pg_log may be very big (tens of Mb) - why we have to spend
process address space for tens of Mb of mostly unused data ?
Also, do all systems have mmap ?

> 
> Every 60-90 seconds, the postmaster gets a write lock on pg_log, takes
> the minimum xid set in the start of pg_log, and resets its value.  It
> records the current oid and xid from pg_variable.  It then clears the
> lock, and starts reading from the minimum recorded xid changed to the
> end of pg_log_live, and copies it into allocated memory.  It then does a
> sync (twice?), waits for completion, and then writes the pg_log_live       ^^^^^
man sync:
    The sync() function forces a write of dirty (modified) buffers in the                        ^^^^^^    block buffer
cacheout to disk...

...

BUGS    Sync() may return before the buffers are completely flushed.

Vadim

---------------------------------------------------------------------------

From: Bruce Momjian <maillist@candle.pha.pa.us>
Message-Id: <199711171346.IAA01964@candle.pha.pa.us>
Subject: [HACKERS] Re: Bufferd loggins/pg_log
To: vadim@sable.krasnoyarsk.su (Vadim B. Mikheev)
Date: Mon, 17 Nov 1997 08:46:29 -0500 (EST)
Cc: hackers@postgreSQL.org (PostgreSQL-development)
In-Reply-To: <346FF895.167EB0E7@sable.krasnoyarsk.su> from "Vadim B. Mikheev" at Nov 17, 97 02:56:05 pm
X-Mailer: ELM [version 2.4 PL25]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-hackers@hub.org
Precedence: bulk
Status: OR

> 
> Bruce Momjian wrote:
> > 
> > On startup, the postmaster makes a copy of pg_log, called pg_log_live.
> > Each postgres backend mmaps() this new file into its address space.  A
> > lock is gotten to make changes to the file.  All backend use pg_log_live
> > rather than pg_log.  Only the postmaster write to pg_log.  (I will
> > someday remove the exec() from postmaster, so backends will get this
> > address space automatically.)
> 
> What are advantages of mmaping entire pg_log over "online" pg_log
> pages ?
> pg_log may be very big (tens of Mb) - why we have to spend
> process address space for tens of Mb of mostly unused data ?
> Also, do all systems have mmap ?

I believe you are correct that it would be better keeping the last few
pages of pg_log in shared memory rather than using mmap().

I think the important new ideas are keeping track of the oid/xid before
sync so we can accurately add 10,000 after a crash.

I am a little foggy on race condiions of growing the pg_log region while
other backends are running, and modifying non-shared memory pages, but
you seem to have a handle on it.

We don't need pg_log_live if only the postmaster writes those last two
pages to pg_log, and if we keep track of a crash status somewhere else,
perhaps at the start of pg_log.

> 
> > 
> > Every 60-90 seconds, the postmaster gets a write lock on pg_log, takes
> > the minimum xid set in the start of pg_log, and resets its value.  It
> > records the current oid and xid from pg_variable.  It then clears the
> > lock, and starts reading from the minimum recorded xid changed to the
> > end of pg_log_live, and copies it into allocated memory.  It then does a
> > sync (twice?), waits for completion, and then writes the pg_log_live
>         ^^^^^
> man sync:
> 
>      The sync() function forces a write of dirty (modified) buffers in the
>                          ^^^^^^
>      block buffer cache out to disk...
> ...
> 
> BUGS
>      Sync() may return before the buffers are completely flushed.
> 
> Vadim
> 

My BSD/OS doesn't mention this, but twice is a good idea.

-- 
Bruce Momjian
maillist@candle.pha.pa.us

-- 
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026 +  If your life is a hard drive,     |  (610)
353-9879(w)+  Christ can be your backup.        |  (610) 853-3000(h)

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026

Re: [HACKERS] performance test

From

Vadim Mikheev

Date:

19 January 1999, 01:16:55

Bruce Momjian wrote:
> 
> > You've seen disadvantages of our unperfect buffer manager -:)
> > When server need in buffer for new data comming and there is
> > no unused buffers in pool (i.e. - all buffers filled with new
> > data and marked as dirty), server gets some dirty buffer,
> > writes it AND FSYNC file. So, server does many fsyncs
> > even with BEGIN/END while should do _one_ fsync at COMMIT.
> >
> > Having this problem fixed you wouldn't had so big difference
> > between -F and BEGIN/END-no-F.
> 
> Is this idea still reasonable?
> 
> ---------------------------------------------------------------------------
> 
> Here is an archive of the pg_log discussion.

Thanks for archive, Bruce.
But delayed fsync is another idea.
bufmgr has to be fixed anyway.

Vadim

Re: [HACKERS] performance test

From

Michael Meskes

Date:

19 January 1999, 10:05:05

On Tue, Jan 19, 1999 at 10:27:50AM +0700, Vadim Mikheev wrote:
> You've seen disadvantages of our unperfect buffer manager -:)
> When server need in buffer for new data comming and there is
> no unused buffers in pool (i.e. - all buffers filled with new
> data and marked as dirty), server gets some dirty buffer,
> writes it AND FSYNC file. So, server does many fsyncs
> even with BEGIN/END while should do _one_ fsync at COMMIT.
> 
> Having this problem fixed you wouldn't had so big difference
> between -F and BEGIN/END-no-F.

I see. But without BEGIN/END a huge difference remains to be expected,
doesn't it?

Michael
-- 
Michael Meskes                         | Go SF 49ers!
Th.-Heuss-Str. 61, D-41812 Erkelenz    | Go Rhein Fire!
Tel.: (+49) 2431/72651                 | Use Debian GNU/Linux!
Email: Michael.Meskes@gmx.net          | Use PostgreSQL!

Re: [HACKERS] performance test

From

Vadim Mikheev

Date:

19 January 1999, 21:04:47

Michael Meskes wrote:
> 
> On Tue, Jan 19, 1999 at 10:27:50AM +0700, Vadim Mikheev wrote:
> > You've seen disadvantages of our unperfect buffer manager -:)
> > When server need in buffer for new data comming and there is
> > no unused buffers in pool (i.e. - all buffers filled with new
> > data and marked as dirty), server gets some dirty buffer,
> > writes it AND FSYNC file. So, server does many fsyncs
> > even with BEGIN/END while should do _one_ fsync at COMMIT.
> >
> > Having this problem fixed you wouldn't had so big difference
> > between -F and BEGIN/END-no-F.
> 
> I see. But without BEGIN/END a huge difference remains to be expected,
> doesn't it?

Of 'course! Two fsync per commit...

Vadim