Thread: Size of Postgres Transaction Logs

Size of Postgres Transaction Logs

From
Vikaas BV
Date:
I had been using Postgres database version 6.5.3.  Initially, I had the
problem of the data files growing very huge and thereby affecting the
performance of the database.  However, on employing vacuum regularly, I
found that the size of files occupied on the disk would reduce except
for the  INDEX FILES.  It was rather funny to see that even though there
are no records in the table, the index file was occupying about 6-8MB.
The only solution I could find was to delete and recreate the index.

When I migrated to Postgres 7.x (7.0.3 and 7.1.1), I find that the
concept of a separate transaction log has been introduced (This was
located as pg_log in 7.0.3 in the data directory and as a numbered file
in the global directory under the data directory).  I find that if there
are repeated periodic operations on the postgres database, the
transaction log size increases very rapidly (I was aghast to see that
after initializing a new database with only 2 tables and a program
polling one of the tables every 20 secs, the transaction log had grown
to 120MB overnight) and there appears to be no mechanism of reducing the
log size except for taking a dump, reinitializing the database and
reimporting the dump.

I tried vacuuming the database(s) and unfortunately did not succeed in
reducing the overall size of the log.  I have tried truncating the log
file size by using some of the unix tools/programs but any tampering of
the log renders all the databases/tables useless.

Dumping the database and restoring it is too cumbersome to be done on a
day to day basis. My question is whether after vacuuming, the cleared
space in the log file reused and also whether the overall performance of
the DB is affected by a large log file on disk.  Further, I would like
to know if there is any mechanism of releasing the unused space from the
log.

Thanks
Vikaas

Re: Size of Postgres Transaction Logs

From
Tom Lane
Date:
Vikaas BV <vikas@cellcloud.com> writes:
> I find that if there
> are repeated periodic operations on the postgres database, the
> transaction log size increases very rapidly (I was aghast to see that
> after initializing a new database with only 2 tables and a program
> polling one of the tables every 20 secs, the transaction log had grown
> to 120MB overnight) and there appears to be no mechanism of reducing the
> log size except for taking a dump, reinitializing the database and
> reimporting the dump.

The WAL log should be auto-truncated at checkpoints, assuming that the
old entries are no longer needed.  However, if you leave uncommitted
transactions sitting around, their WAL entries can't be deleted until
you commit or abort them.  I'd only expect massive WAL growth if you
have very long-running transactions ...

            regards, tom lane

Re: Size of Postgres Transaction Logs

From
Lincoln Yeoh
Date:
At 12:45 PM 5/18/01 -0400, Tom Lane wrote:
>
>The WAL log should be auto-truncated at checkpoints, assuming that the
>old entries are no longer needed.  However, if you leave uncommitted
>transactions sitting around, their WAL entries can't be deleted until
>you commit or abort them.  I'd only expect massive WAL growth if you
>have very long-running transactions ...

Uhoh.

Scenario 1:
Using FCGI, persistent DB connections and Perl DBI.

The problem I see is that the Perl DBI doesn't have a BEGIN transaction,
can only rollback or commit, which implicitly begins a new transaction.

So what happens typically is:
1) open DB connection.
2) wait for http request.
3) rollback (to implicitly begin a new transaction so that 'now' isn't
years ago :) ).
4) do DB stuff
5) rollback/commit.

Would we get massive WAL growth if the long running transactions are
inactive (stuck at step 2) - not doing anything, but other transactions are
active?

Scenario 2:
search/query engine - only selects are done, so db connection and
transaction held open for long periods. Lots of selects being done.

Any issues with this?

Cheerio,
Link.


Re: Size of Postgres Transaction Logs

From
Tom Lane
Date:
Lincoln Yeoh <lyeoh@pop.jaring.my> writes:
> So what happens typically is:
> 1) open DB connection.
> 2) wait for http request.
> 3) rollback (to implicitly begin a new transaction so that 'now' isn't
> years ago :) ).
> 4) do DB stuff
> 5) rollback/commit.

> Would we get massive WAL growth if the long running transactions are
> inactive (stuck at step 2) - not doing anything, but other transactions are
> active?

Not if they haven't done anything yet.  "BEGIN" doesn't really start a
transaction, it's only your first database access that gets the
transaction engine running.  Furthermore, if I recall the WAL stuff
correctly, what really counts for WAL is your first *write* access.
A read-only xact doesn't exist as far as WAL is concerned.

Nonetheless, I'm pretty concerned about this log-growth issue.  See
current pghackers thread about whether we really need WAL to provide
UNDO support.  AFAICS, we could truncate the WAL log at each checkpoint
if we didn't need UNDO.

            regards, tom lane