Re: [GENERAL] Performance

From Jurgen Defurne
Subject Re: [GENERAL] Performance
Date
Msg-id 381AC0D2.49212F9A@glo.be
In response to Performance  ("Jason C. Leach" <jcl@mail.ocis.net>)
List pgsql-general
 

Jason C. Leach wrote:
hi,

What's a good way to calculate how many transactions you should buffer before
you commit them? Do you just estimate how much memory each will take up and
calculate how much you wish to spare?

Thanks,
    J
 

That's a tough question.
I don't think it can easily be answered, because it will depend on the implementation of the transaction system; maybe one of the implementors can give an answer. However...

According to the database manuals I have had the chance to read, one should always base such decisions on measurements.

Take care with these measurements, however. What you are trying to do is batch processing to get a performance figure, and that figure will only reflect your performance while doing batch processing.

There are two ways to process data. The first, and oldest, is batch processing; the second is interactive, or transactional, processing.
The difference for performance measurement is that for a batch job you can indeed measure with an ever-growing number of operations between the BEGIN and COMMIT statements. When you notice your swap space being activated, you have reached your optimal batch size.
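
To make that concrete, here is a minimal sketch of such a measurement in modern Python with the psycopg2 driver. The connection string and the "measurements" table are hypothetical; adapt them to your own schema. It times ever-larger batches between BEGIN and COMMIT:

    # Time increasingly large batches between BEGIN and COMMIT.
    import time
    import psycopg2

    conn = psycopg2.connect("dbname=test")   # hypothetical DSN
    cur = conn.cursor()

    for batch_size in (100, 1000, 10000, 100000):
        start = time.time()
        # psycopg2 opens a transaction implicitly; commit() ends it.
        for i in range(batch_size):
            cur.execute("INSERT INTO measurements (value) VALUES (%s)", (i,))
        conn.commit()
        elapsed = time.time() - start
        print(f"{batch_size} rows: {elapsed:.2f}s, "
              f"{batch_size / elapsed:.0f} rows/s")

    # Watch memory and swap while this runs; the batch size at which
    # swapping starts marks the practical upper bound described above.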

For a real batch application, you would write your program so that it writes (or updates) the optimal number of records, writes a checkpoint (or savepoint), and then commits the transaction. If the job crashes, it can be restarted from the last saved checkpoint.
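
A minimal sketch of that checkpoint-and-restart pattern, again assuming psycopg2 and hypothetical tables ("source", "target", and "job_checkpoint"). The checkpoint row is committed in the same transaction as the batch, so after a crash either both survive or neither does:

    import psycopg2

    BATCH_SIZE = 10000   # the "optimal number of records" found by measuring

    conn = psycopg2.connect("dbname=test")   # hypothetical DSN
    cur = conn.cursor()

    # Resume from the last committed checkpoint (0 if the job never ran).
    cur.execute("SELECT coalesce(max(last_id), 0) FROM job_checkpoint")
    last_id = cur.fetchone()[0]

    while True:
        cur.execute(
            "SELECT id, payload FROM source WHERE id > %s ORDER BY id LIMIT %s",
            (last_id, BATCH_SIZE))
        rows = cur.fetchall()
        if not rows:
            break
        for row_id, payload in rows:
            cur.execute("INSERT INTO target (payload) VALUES (%s)", (payload,))
        last_id = rows[-1][0]
        # Checkpoint and batch commit together, atomically.
        cur.execute("INSERT INTO job_checkpoint (last_id) VALUES (%s)",
                    (last_id,))
        conn.commit()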

Establishing a performance figure for interactive transactional processing is much harder, because many transactions will be open at once, with locks, commits, and so on. Someone good at mathematics, with the right data about the distribution of reads, writes, and rewrites and some knowledge of the I/O channel, could make an estimate based on probability theory, but that would be highly speculative.
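
For what such a back-of-envelope estimate might look like, here is a sketch in which every number is an invented placeholder, not a measurement:

    # Hypothetical estimate of interactive throughput; all figures are
    # invented placeholders that real measurements would have to replace.
    read_frac, write_frac, rewrite_frac = 0.70, 0.20, 0.10
    read_cost, write_cost, rewrite_cost = 0.002, 0.010, 0.015   # seconds/op
    ops_per_txn = 12
    concurrency = 8   # simultaneously open transactions

    # Expected cost of one operation, weighted by the operation mix,
    # then the expected service time of a whole transaction.
    op_cost = (read_frac * read_cost
               + write_frac * write_cost
               + rewrite_frac * rewrite_cost)
    service_time = ops_per_txn * op_cost

    # Ignoring lock contention entirely -- the part that makes the real
    # problem speculative -- the throughput upper bound is roughly:
    print(f"~{concurrency / service_time:.0f} transactions/second at best")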

That is one of the reasons there is an organisation that benchmarks databases (the Transaction Processing Performance Council, TPC). Unfortunately, those tests cost a lot of money and you have to be a member, and the only members I know of are the large database vendors. The only other source of database benchmark data I know of is the people who wrote MySQL.

Lastly, have a look at the Benchmarking-HOWTO. It will give you some more grounding in testing systems.

Feel free to ask questions about these subjects anytime. I have finally found the time to push ahead with PostgreSQL, and also with Tcl/Tk, which I find a terrific combination for writing applications, though there are some other technical details I still need to master.

Regards,

Jurgen Defurne
Flanders
Belgium
