Re: [HACKERS] Re: DB logging (was: Problem with the numbers I reported yesterday) - Mailing list pgsql-hackers

From jwieck@debis.com (Jan Wieck)
Subject Re: [HACKERS] Re: DB logging (was: Problem with the numbers I reported yesterday)
Msg-id m0y5Umk-000BFRC@orion.SAPserv.Hamburg.dsh.de
In response to Re: DB logging (was: Problem with the numbers I reported yesterday)  ("Kent S. Gordon" <kgor@inetspace.com>)
List pgsql-hackers
Kent wrote:
>
> >>>>> "jwieck" == Jan Wieck <jwieck@debis.com> writes:
>     >     When doing complete after-image logging, that is, taking
>     > all the tuples that are stored on insert/update, the tuple ids
>     > of deletes, plus the information about which transactions
>     > commit, the regression tests produce log data that is larger
>     > than the final regression database.  The performance
>     > increase when syncing only the log and control files (2 control
>     > files on different devices and the log file on a device separate
>     > from the database files) and running the backends with -F is
>     > about 15-20% for the regression test.
>
> Log files do get very big with image logging.  I would not expect a
> huge win in performance unless the log device is a raw device.  On a
> cooked device (file system) buffer cache effects are very large: all
> disk data is buffered both by postgresql and the OS buffer cache.
> The buffer cache is actually harmful in this case, since the data is
> not reused and the writes are synced.  The writes to the log also
> flush other buffers from the cache, leading to even more io.  If a
> system does not have raw devices (linux, NT), it would be very
> useful if a flag existed to tell the OS that the file will be read
> sequentially, like the madvise() call does for mmap.  Is your code
> available anywhere?
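
    Such a hint exists on POSIX systems as posix_fadvise(); a minimal
    sketch, assuming the platform provides that call (madvise() with
    MADV_SEQUENTIAL is the equivalent hint for mmap'ed regions):

        #include <fcntl.h>      /* posix_fadvise(), POSIX_FADV_SEQUENTIAL */
        #include <stdio.h>
        #include <string.h>

        /*
         * Tell the OS that the file behind fd will be read
         * sequentially, so it can read ahead aggressively and
         * drop pages from the cache after use.
         */
        int
        advise_sequential(int fd)
        {
            /* offset 0 and len 0 mean "the whole file" */
            int rc = posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);

            if (rc != 0)
            {
                /* posix_fadvise returns the error number directly */
                fprintf(stderr, "posix_fadvise: %s\n", strerror(rc));
                return -1;
            }
            return 0;
        }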

    I don't have that code any more. It wasn't that much, so I
    can redo it, at least if you would like to help on that
    topic. But since this will be a new feature, we should wait
    for the 6.3 release before touching anything.
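
    A sketch of what such log records might contain, following the
    description quoted above (a hypothetical layout; all names are
    invented for illustration, none of this is the lost code):

        #include <stdint.h>

        /* One record type per event that has to be replayed. */
        typedef enum
        {
            LOG_INSERT,         /* after image of the new tuple      */
            LOG_UPDATE,         /* old tuple id plus the after image */
            LOG_DELETE,         /* tuple id of the removed tuple     */
            LOG_COMMIT          /* the transaction has committed     */
        } LogRecType;

        /* Fixed header; datalen bytes of tuple data follow for
         * LOG_INSERT and LOG_UPDATE records. */
        typedef struct
        {
            LogRecType  type;
            uint32_t    xid;        /* transaction writing the record */
            uint32_t    dbid;       /* database the change belongs to */
            uint32_t    relid;      /* relation the change applies to */
            uint32_t    datalen;    /* length of the after image      */
        } LogRecHeader;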

>
>     >     I thought this was far too much logging data, and so I didn't
>     > spend much time trying to implement a recovery. But as far as I
>     > got, I can tell that the updates to the system catalogs and
>     > keeping the indices up to date will be really tricky.
>
> I have not looked at this area of the code.  Do the system catalogs
> have a separate storage manager?  I do not see why they could not be
> handled like any other data, except for keeping the buffers in the cache.

    I just had some problems with the system catalogs (maybe due
    to the system caching). I think it can be handled somehow.

    There are other details in the logging we should care about
    when we implement it.

    The logging should be configurable per database: some
    databases would have logging enabled while others run
    unprotected.

    It must be possible to do point-in-time recovery (restore
    the database from a backup and roll the log forward until an
    absolute time or a transaction ID).
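
    A sketch of the replay loop for that, with hypothetical helper
    functions (read_log_record(), redo_record() and the accessors
    stand for whatever the real log reader would provide):

        #include <stdint.h>
        #include <time.h>

        typedef struct LogRecord LogRecord;         /* opaque here */

        extern LogRecord *read_log_record(void);    /* NULL at end of log */
        extern void       redo_record(const LogRecord *rec);
        extern time_t     record_time(const LogRecord *rec);
        extern uint32_t   record_xid(const LogRecord *rec);

        /*
         * Replay the log after a restore from backup, stopping at
         * the first record past the absolute stop time or past the
         * stop transaction id.  (A real implementation would also
         * skip records of transactions that never committed.)
         */
        void
        recover_until(time_t stop_time, uint32_t stop_xid)
        {
            LogRecord *rec;

            while ((rec = read_log_record()) != NULL)
            {
                if (record_time(rec) > stop_time ||
                    record_xid(rec) > stop_xid)
                    break;
                redo_record(rec);
            }
        }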

    The previous two requirements create a problem for shared
    system relations. If a backend running in an unlogged
    database updates pg_user, for example, that change must
    still go into the log!
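
    As a decision rule this is small; a sketch (the function and
    both parameters are invented for illustration):

        #include <stdbool.h>

        /*
         * Decide whether a change to a relation has to be logged.
         * Shared system relations (pg_user etc.) are always logged,
         * even when the backend's own database is unprotected.
         */
        bool
        needs_logging(bool rel_is_shared, bool db_logging_enabled)
        {
            if (rel_is_shared)
                return true;
            return db_logging_enabled;
        }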

    We should give query logging instead of image logging a try.
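
    A sketch of what a query log record would be, with a
    hypothetical log_write() standing in for the real log writer:

        #include <stddef.h>
        #include <stdint.h>
        #include <string.h>

        extern int log_write(const void *data, size_t len);  /* assumed */

        /*
         * Instead of tuple images, write the query text together
         * with the executing transaction id; recovery would then
         * re-execute the logged queries in order.
         */
        int
        log_query(uint32_t xid, const char *query)
        {
            uint32_t len = (uint32_t) strlen(query);

            if (log_write(&xid, sizeof(xid)) < 0 ||
                log_write(&len, sizeof(len)) < 0 ||
                log_write(query, len) < 0)
                return -1;
            return 0;
        }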


Until later, Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #
