I have just restructured some programs that ran too slowly (too much time
in io-wait), and they sped up greatly. The speedup was not unexpected, but I
wonder about its limits.
By transaction, I mean a single INSERT or a few related INSERTs.
What I used to do was roughly like this:
for each file {
    for each record {
        BEGIN WORK;
        INSERT stuff in table(s);
        if error {
            ROLLBACK WORK;
        }
        else {
            COMMIT WORK;
        }
    }
}
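For concreteness, here is a minimal sketch of that per-record pattern in
Python with psycopg2 (the driver, the CSV input format, and the table
stuff(a, b) are all just assumptions for illustration; the real programs
differ):

# A minimal sketch only: psycopg2, CSV input, and a table stuff(a, b)
# are assumptions for illustration, not the real programs.
import csv
import psycopg2

def load_per_record(conn, filenames):
    """One transaction per record -- the slow version."""
    cur = conn.cursor()
    for filename in filenames:
        with open(filename, newline="") as f:
            for a, b in csv.reader(f):
                try:
                    # psycopg2 opens a transaction implicitly on the first
                    # statement after a commit/rollback (the BEGIN WORK).
                    cur.execute(
                        "INSERT INTO stuff (a, b) VALUES (%s, %s)", (a, b))
                    conn.commit()      # COMMIT WORK for this one record
                except psycopg2.Error:
                    conn.rollback()    # ROLLBACK WORK, lose only this record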
The speedup came from the obvious change:
for each file {
    BEGIN WORK;
    for each record {
        INSERT stuff in table(s);
    }
    if error {
        ROLLBACK WORK;
    }
    else {
        COMMIT WORK;
    }
}
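Continuing the same sketch, the batched version keeps the whole file in one
transaction:

def load_per_file(conn, filenames):
    """One transaction per file -- the batched version."""
    cur = conn.cursor()
    for filename in filenames:
        try:
            with open(filename, newline="") as f:
                for a, b in csv.reader(f):
                    cur.execute(
                        "INSERT INTO stuff (a, b) VALUES (%s, %s)", (a, b))
            conn.commit()          # one COMMIT WORK for the whole file
        except psycopg2.Error:
            conn.rollback()        # lose the whole file and start over

The only structural change is where the COMMIT happens, but each per-record
COMMIT forces a WAL flush to disk, which is presumably where the io-wait was
going.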
This means, of course, that the things I think of as transactions have been
bunched into a much smaller number of what PostgreSQL treats as large
transactions, since there is now only one per file rather than one per record.
Now if a file has several thousand records, this seems to work out just great.
But what is the limit on this approach? In this case I am just
populating the database, and there are no other users at that point. I am
willing to lose a file's entire set of inserts if something goes wrong -- I
would fix whatever went wrong and start over anyway.
But at some point, disk I/O has to happen. Is the practical limit on
transaction size just a function of how large shared_buffers is set in
/pgsql/data/postgresql.conf? Or does it depend on wal_buffers and
checkpoint_segments?
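For reference, these are the settings I am asking about, as they would appear
in postgresql.conf (the values below are purely illustrative, since sensible
numbers depend on the PostgreSQL version and the RAM available):

# postgresql.conf -- parameter names from the question above; values are
# illustrative only.
shared_buffers = 50000        # shared buffer cache (counted in 8kB pages here)
wal_buffers = 64              # WAL pages held in shared memory before writing
checkpoint_segments = 16      # WAL segments between forced checkpoints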
--
.~. Jean-David Beyer Registered Linux User 85642.
/V\ PGP-Key: 9A2FC99A Registered Machine 241939.
/( )\ Shrewsbury, New Jersey http://counter.li.org
^^-^^ 11:10:01 up 2 days, 3:28, 4 users, load average: 5.76, 5.70, 5.53