Thread: Performance

Performance

From: "Jason C. Leach"
hi,

I've been playing with pgsql for a few days now and am getting the hang
of it.  I just did a loop that inserts a few thousand records into a
table.  I did a statement, prepare, execute; it worked fine, although pg
seemed to access the hard disk for every insert.  Is there a way to cache
inserts and then write them all at once later?  I'm using Perl with
DBD::Pg/DBI, and I see that DBI has a prepare_cached and a commit.
Not much in the way of docs for the modules, though.

Perhaps I should be doing statement, prepare, statement, prepare,
commit?

I'd also be curious how you start postgres with the -F option?  I did
not see it as a postmaster option.

This is just a curiosity, but some of the MS Certified people are being
taught that MS SQL Server requires something like a PIII with 512MB of
RAM.  That's pretty crazy, if you ask me, but I can see it depending
on how many requests your server will need to deal with. On
equal machines, could pg handle more or fewer requests than MS SQL?

Thanks,
    Jason

--
.............
......... Jason C. Leach
...... University College of the Cariboo
... jcl@mail.ocis.net.
.. http://www.ocis.net/~jcl
.

The Search for Extraterrestrial Intelligence from Home:
http://setiathome.ssl.berkeley.edu

                                                                LINUX!




Re: [GENERAL] Performance

From: Jim Richards
I don't know about DBI specifically, but it should support this;
try doing the inserts as:

BEGIN WORK

INSERT ...
INSERT ...
INSERT ...
INSERT ...

COMMIT WORK

This wraps all the inserts in a single transaction, so while they are
running other processes won't be able to see the changes until the commit
is done. It also means that if there is an error during the insert sequence,
the whole thing can be rolled back without a problem.
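
In DBI terms, something like this should do it. This is only an untested
sketch: the table "items", database name and credentials are placeholders,
and it assumes a DBI new enough to provide begin_work (which suspends
AutoCommit until the commit or rollback):

#!/usr/bin/perl
use strict;
use DBI;

# Placeholder DSN and credentials; adjust for your own setup.
my $dbh = DBI->connect("dbi:Pg:dbname=test", "jason", "",
                       { RaiseError => 1 }) or die $DBI::errstr;

my $sth = $dbh->prepare("INSERT INTO items (name) VALUES (?)");

$dbh->begin_work;                       # BEGIN WORK
eval {
    $sth->execute("item $_") for 1 .. 5000;
    $dbh->commit;                       # COMMIT WORK: all rows appear at once
};
if ($@) {
    warn "insert failed, rolling back: $@";
    $dbh->rollback;                     # none of the batch is kept
}
$dbh->disconnect;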

>I've been playing with pgsql for a few days now and am getting the hang
>of it.  I just did a loop that inserts a few thousand records into a
>table.  I did a statement, prepare, execute; it worked fine, although pg
>seemed to access the hard disk for every insert.  Is there a way to cache
>inserts and then write them all at once later?  I'm using Perl with
>DBD::Pg/DBI, and I see that DBI has a prepare_cached and a commit.
>Not much in the way of docs for the modules, though.
>
>Perhaps I should be doing statement, prepare, statement, prepare,
>commit?


--
Subvert the dominant paradigm
 http://www.cyber4.org/members/grumpy/index.html

Re: [GENERAL] Performance

From: Charles Tassell
Try turning off DBI's AutoCommit; that way none of the inserts are committed
until the whole batch has gone through without errors, and they all hit the
database in one transaction.  You do this by changing your DBI connect command from:
$pg_con=DBI->connect("DBI:Pg:....."
to
$pg_con=DBI->connect("DBI:Pg(AutoCommit=>0):....."

Don't forget to call $pg_con->commit; before disconnecting or else you will
lose your added data.
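
For example, something like this (a sketch only; the dbname, table and
credentials are placeholders), using the attribute hash form and the
prepare_cached that Jason mentioned:

use strict;
use DBI;

# AutoCommit => 0: nothing is committed until we say so (placeholder DSN).
my $pg_con = DBI->connect("dbi:Pg:dbname=test", "jason", "",
                          { AutoCommit => 0, RaiseError => 1 })
    or die $DBI::errstr;

my $sth = $pg_con->prepare_cached("INSERT INTO items (name) VALUES (?)");
$sth->execute($_) for qw(apples oranges pears);

$pg_con->commit;        # forget this and the inserts are rolled back at disconnect
$pg_con->disconnect;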

At 04:47 AM 10/29/99, Jason C. Leach wrote:
>hi,
>
>I've been playing with pgsql for a few days now and am getting the hang
>of it.  I just did a loop that inserts a few thousand records into a
>table.  I did a statement, prepare, execute; it worked fine, although pg
>seemed to access the hard disk for every insert.  Is there a way to cache
>inserts and then write them all at once later?  I'm using Perl with
>DBD::Pg/DBI, and I see that DBI has a prepare_cached and a commit.
>Not much in the way of docs for the modules, though.
>
>Perhaps I should be doing statement, prepare, statement, prepare,
>commit?


Re: [GENERAL] Performance

From: Jurgen Defurne

Jason C. Leach wrote:
>hi,
>
>What's a good way to calculate how many transactions you should buffer before
>you commit them?  Do you just make an estimate of how much memory each will take
>up and calculate how much you wish to spare?
>
>Thanks,
>    J

That's a tough question.
I don't think it can easily be calculated, because it depends upon the implementation of the transaction system. Maybe one of the implementors could give an answer. However...

According to the database manuals I had the chance to read, one should always make such decisions based upon measurements.

Take care with these measurements, though. What you are doing here is batch processing, so the numbers you get will only reflect your performance while doing batch processing.

There are two ways to process data. The first (and oldest) is batch processing; the second is interactive, or transactional, processing.
The difference for performance measurement is that for a batch job you can indeed measure first, using an ever-growing number of inserts between the BEGIN and COMMIT statements. When you notice your swap space being used, you have gone past the optimal transaction size.
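
As a rough sketch of that kind of measurement (the table, connection details
and batch sizes below are placeholders; Time::HiRes is a CPAN module for
sub-second timing), you could time the same number of inserts committed in
batches of different sizes and watch where the gain flattens out or the
machine starts swapping:

use strict;
use DBI;
use Time::HiRes qw(time);   # sub-second timing

my $dbh = DBI->connect("dbi:Pg:dbname=test", "jason", "",
                       { AutoCommit => 0, RaiseError => 1 });
my $sth = $dbh->prepare("INSERT INTO items (name) VALUES (?)");

my $total = 10_000;                        # same workload for every run
for my $batch (100, 500, 1_000, 5_000, 10_000) {
    my $t0 = time;
    for my $i (1 .. $total) {
        $sth->execute("row $i");
        $dbh->commit if $i % $batch == 0;  # commit every $batch rows
    }
    $dbh->commit;                          # commit any remainder
    printf "batch size %6d: %.2f seconds\n", $batch, time - $t0;
}
$dbh->disconnect;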

For a real batch application, you would write your program such that it writes (or updates) the optimal number of records, writes a checkpoint (or savepoint) and then commits the transaction. If the job should crash, then it can be restarted from where the last checkpoint was saved.
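
In Perl, that pattern might look something like this (again just a sketch;
the table, checkpoint file and batch size are invented for the example).
Commit every N records and record how far you got, so a crashed run can
pick up from the last committed batch:

use strict;
use DBI;

my $batch_size = 1_000;                # the "optimal" size found by measuring
my $checkpoint = "load.checkpoint";    # remembers the last committed record

# Resume from the checkpoint if a previous run died part-way through.
my $start = 0;
if (open my $fh, '<', $checkpoint) {
    $start = <$fh> + 0;
    close $fh;
}

my $dbh = DBI->connect("dbi:Pg:dbname=test", "jason", "",
                       { AutoCommit => 0, RaiseError => 1 });
my $sth = $dbh->prepare("INSERT INTO items (name) VALUES (?)");

for my $i ($start + 1 .. 100_000) {    # 100_000 stands in for "all the records"
    $sth->execute("record $i");
    if ($i % $batch_size == 0) {
        $dbh->commit;                  # this batch is now safely in the database
        open my $out, '>', $checkpoint or die $!;
        print $out $i;                 # record the restart point
        close $out;
    }
}
$dbh->commit;                          # commit the final partial batch
unlink $checkpoint;                    # finished cleanly
$dbh->disconnect;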

Establishing a performance reading for interactive, transactional processing is much harder, because many transactions will be open at the same time, with locks, commits, and so on. I am sure that someone good at mathematics, with the right data about the distribution of reads, writes and rewrites and some knowledge of the I/O channel, could make an estimate based on probability theory, but that is highly speculative.

That is one of the reasons there is an organisation which benchmarks databases. Unfortunately, those tests cost a lot of money and you have to be a member, and the only members I know of are the large database vendors. The only other source of database benchmark data that I know of is the people who wrote MySQL.

Lastly, have a look at the Benchmarking-HOWTO. It will give you some more grounding in testing systems.

Feel free to ask questions about these subjects anytime. I have finally found the time to dig into PostgreSQL, and also into Tcl/Tk, which I find a terrific combination for writing applications, though there are still some other technical details that I need to master myself.

Regards,

Jurgen Defurne
Flanders
Belgium