Thread: xlogdump fixups and WAL log question.

xlogdump fixups and WAL log question.

From

Theo Schlossnagle

Date:

20 October 2006, 14:18:22

Not sure who cares, so xzilla indicated I should drop a note here.  I  
just made the xlogdump stuff work for 8.1 (trivial) and fixed a few  
other small issues that caused it to not work right both generally  
and in our environment.

http://pgfoundry.org/tracker/index.php? 
func=detail&aid=1000760&group_id=1000202&atid=772

We're using it to track down what's causing some wal log ruckus.   
We're generating about a quarter terabyte of WAL logs a day (on bad  
days) which is posing some PITR backup pains.  That amount isn't a  
severe challenge to backup, but our previous install was on Oracle  
and it generated substantially less archive redo logs (10-20 gigs per  
day).

Is it possible to create tables in fashion that will not write info  
to the WAL log -- knowingly and intentionally making them  
unrecoverable?  This is very desirable for us.  We snapshot tables  
from a production environment.  If the database goes down and we  
recover, the old snapshots are out of date anyway and serve no useful  
purpose.  The periodic snapshot procedure would re-snap them in short  
order anyway.  I'd like to do:

INSERT INTO TABLE tblfoo_snap1 AS SELECT * from <table on remote  
database> NO LOGGING;

(NO LOGGING being the only part we're currently missing) Is something  
like this possible?

Cheers ;-)
Theo

// Theo Schlossnagle
// CTO -- http://www.omniti.com/~jesus/
// OmniTI Computer Consulting, Inc. -- http://www.omniti.com/

Re: xlogdump fixups and WAL log question.

From

Tom Lane

Date:

20 October 2006, 14:58:55

Theo Schlossnagle <jesus@omniti.com> writes:
> Is it possible to create tables in fashion that will not write info  
> to the WAL log -- knowingly and intentionally making them  
> unrecoverable?

Use temp tables?

Also, it's likely that much of the WAL volume is full-page images.
While you can't safely turn those off in 8.1, you can dial down the
frequency of occurrence by increasing checkpoint_segments and
checkpoint_timeout as much as you can stand.  (The tradeoffs are
amount of space occupied by pg_xlog/ and time to recover from a crash.)
        regards, tom lane

Re: xlogdump fixups and WAL log question.

From

Theo Schlossnagle

Date:

20 October 2006, 15:13:07

On Oct 20, 2006, at 1:58 PM, Tom Lane wrote:

> Theo Schlossnagle <jesus@omniti.com> writes:
>> Is it possible to create tables in fashion that will not write info
>> to the WAL log -- knowingly and intentionally making them
>> unrecoverable?
>
> Use temp tables?

temp tables won't work too well -- unless I can make a whole  
tablespace "temp" and multiple backends can see it.  They work fine  
for small tables we snapshot (couple hundred or even a few thousand  
rows), but many of the tables are a few hundred thousand rows and  
several processes on the system all need them.

> Also, it's likely that much of the WAL volume is full-page images.
> While you can't safely turn those off in 8.1, you can dial down the
> frequency of occurrence by increasing checkpoint_segments and
> checkpoint_timeout as much as you can stand.  (The tradeoffs are
> amount of space occupied by pg_xlog/ and time to recover from a  
> crash.)

Our pg_xlog is currently at 9.6GB.  Not sure I can reasonably tune it  
up much higher.

// Theo Schlossnagle
// CTO -- http://www.omniti.com/~jesus/
// OmniTI Computer Consulting, Inc. -- http://www.omniti.com/

Re: xlogdump fixups and WAL log question.

From

"Simon Riggs"

Date:

20 October 2006, 17:25:31

On Fri, 2006-10-20 at 13:18 -0400, Theo Schlossnagle wrote:
> Not sure who cares, so xzilla indicated I should drop a note here.  I  
> just made the xlogdump stuff work for 8.1 (trivial) and fixed a few  
> other small issues that caused it to not work right both generally  
> and in our environment.
> 
> http://pgfoundry.org/tracker/index.php? 
> func=detail&aid=1000760&group_id=1000202&atid=772

Diogo Biazus was working on that; I care also.

> We're using it to track down what's causing some wal log ruckus.   
> We're generating about a quarter terabyte of WAL logs a day (on bad  
> days) which is posing some PITR backup pains.  That amount isn't a  
> severe challenge to backup, but our previous install was on Oracle  
> and it generated substantially less archive redo logs (10-20 gigs per  
> day).

As Tom says, definitely because of full_page_writes=on

> Is it possible to create tables in fashion that will not write info  
> to the WAL log -- knowingly and intentionally making them  
> unrecoverable?  This is very desirable for us.  We snapshot tables  
> from a production environment.  If the database goes down and we  
> recover, the old snapshots are out of date anyway and serve no useful  
> purpose.  The periodic snapshot procedure would re-snap them in short  
> order anyway.  I'd like to do:
> 
> INSERT INTO TABLE tblfoo_snap1 AS SELECT * from <table on remote  
> database> NO LOGGING;
> 
> (NO LOGGING being the only part we're currently missing) Is something  
> like this possible?

Do you want this because of:
1) performance?
2) to reduce the WAL volume of PITR backups?

If you're thinking (1), then I guess I'd ask whether you've considered
what will happen when the reporting environment includes data from other
sources as it inevitably will. At that point, data loss would be much
more annoying. My experience is that the success of your current
implementation will lead quickly to a greatly increased user
requirement.

I've been looking at ways of reducing the WAL volume for PITR backups.
Here's a few ideas:

1. Provide a filter that can be easily used by archive_command to remove
full page writes from WAL files. This would require us to disable the
file size test when we begin recovery on a new WAL files, plus would
need to redesign initial location of the checkpoint record since we
could no longer rely on the XLogRecPtr being a byte offset within the
file.

e.g. archive_command = 'pg_WAL_filter -f | ... '

2. Include tablespaceid within the header of xlog records. This would
allow us to filter out WAL from one or more tablespaces, similarly to
(1), plus it would also allow single tablespace recovery.

e.g. archive_command = 'pg_WAL_filter -x 35456 | ... '

There are some other ideas for generally reducing WAL volume also.

--  Simon Riggs              EnterpriseDB   http://www.enterprisedb.com

Re: xlogdump fixups and WAL log question.

From

Theo Schlossnagle

Date:

20 October 2006, 18:05:21

On Oct 20, 2006, at 4:24 PM, Simon Riggs wrote:

> On Fri, 2006-10-20 at 13:18 -0400, Theo Schlossnagle wrote:
>> Not sure who cares, so xzilla indicated I should drop a note here.  I
>> just made the xlogdump stuff work for 8.1 (trivial) and fixed a few
>> other small issues that caused it to not work right both generally
>> and in our environment.
>>
>> http://pgfoundry.org/tracker/index.php?
>> func=detail&aid=1000760&group_id=1000202&atid=772
>
> Diogo Biazus was working on that; I care also.

Cool.  Patch is short.

>> We're using it to track down what's causing some wal log ruckus.
>> We're generating about a quarter terabyte of WAL logs a day (on bad
>> days) which is posing some PITR backup pains.  That amount isn't a
>> severe challenge to backup, but our previous install was on Oracle
>> and it generated substantially less archive redo logs (10-20 gigs per
>> day).
>
> As Tom says, definitely because of full_page_writes=on

Can I turn that off in 8.1?

>> Is it possible to create tables in fashion that will not write info
>> to the WAL log -- knowingly and intentionally making them
>> unrecoverable?  This is very desirable for us.  We snapshot tables
>> from a production environment.  If the database goes down and we
>> recover, the old snapshots are out of date anyway and serve no useful
>> purpose.  The periodic snapshot procedure would re-snap them in short
>> order anyway.  I'd like to do:
>>
>> INSERT INTO TABLE tblfoo_snap1 AS SELECT * from <table on remote
>> database> NO LOGGING;
>>
>> (NO LOGGING being the only part we're currently missing) Is something
>> like this possible?
>
> Do you want this because of:
> 1) performance?

performance in that a substantial portion of my time is spent writing  
to pg_xlog

> 2) to reduce the WAL volume of PITR backups?

Yes.  Main concern.

>
> e.g. archive_command = 'pg_WAL_filter -f | ... '
> e.g. archive_command = 'pg_WAL_filter -x 35456 | ... '
>
> There are some other ideas for generally reducing WAL volume also.

I'd like to see them not written to the xlogs at all (if possible).   
Seems rather unnecessary unless I'm missing something.

// Theo Schlossnagle
// CTO -- http://www.omniti.com/~jesus/
// OmniTI Computer Consulting, Inc. -- http://www.omniti.com/

Re: xlogdump fixups and WAL log question.

From

"Simon Riggs"

Date:

21 October 2006, 06:38:26

On Fri, 2006-10-20 at 17:04 -0400, Theo Schlossnagle wrote:
> On Oct 20, 2006, at 4:24 PM, Simon Riggs wrote:

> >> Is it possible to create tables in fashion that will not write info
> >> to the WAL log -- knowingly and intentionally making them
> >> unrecoverable?  This is very desirable for us.  We snapshot tables
> >> from a production environment.  If the database goes down and we
> >> recover, the old snapshots are out of date anyway and serve no useful
> >> purpose.  The periodic snapshot procedure would re-snap them in short
> >> order anyway.  I'd like to do:
> >>
> >> INSERT INTO TABLE tblfoo_snap1 AS SELECT * from <table on remote
> >> database> NO LOGGING;
> >>
> >> (NO LOGGING being the only part we're currently missing) Is something
> >> like this possible?

> > Do you want this because of:
> > 1) performance?
> 
> performance in that a substantial portion of my time is spent writing  
> to pg_xlog
> 
> > 2) to reduce the WAL volume of PITR backups?
> 
> Yes.  Main concern.
> 
> >
> > e.g. archive_command = 'pg_WAL_filter -f | ... '
> > e.g. archive_command = 'pg_WAL_filter -x 35456 | ... '
> >
> > There are some other ideas for generally reducing WAL volume also.
> 
> I'd like to see them not written to the xlogs at all (if possible).   
> Seems rather unnecessary unless I'm missing something.

You aware you can turn archive_command off until after the load? That
way you'll have nothing to back up at all until its done. In 8.1, when
archive_command is off, CREATE TABLE AS SELECT is optimised to produce
no WAL.

Turning off WAL is a difficult topic. Without it you have no crash
recovery, which IMHO everybody says they don't care about until they
crash, then they realise. It's hard to be selective about writing WAL
for specific operations also.

However, there may be two cases not discussed before:
1. A newly created database into which a full load and/or pg_dump
restore is being run. In that case we could have a mode where we turn
off WAL completely during initdb via pg_control and then turn it back on
again permanently (i.e. a one way switch) once the server is fully
loaded. That covers the "restore database" use case.

2. Turn off WAL for one or more tablespaces, though never the main data
directory. This allows for data which is "externally recoverable" to be
isolated from things like the catalog and other more normal data. That
would be a tablespace level option that would propagate to each object.
That covers the "regular snapshot load" use case you describe.

--  Simon Riggs              EnterpriseDB   http://www.enterprisedb.com

Re: xlogdump fixups and WAL log question.

From

Martijn van Oosterhout

Date:

21 October 2006, 07:09:06

On Sat, Oct 21, 2006 at 10:37:51AM +0100, Simon Riggs wrote:
> Turning off WAL is a difficult topic. Without it you have no crash
> recovery, which IMHO everybody says they don't care about until they
> crash, then they realise. It's hard to be selective about writing WAL
> for specific operations also.

It's been discussed before. One idea is to declare tables without
logging. The idea being that during recovery those tables and related
indexes are simply truncated. No foreign keys allowed. Obviously they
will not be saved via PITR either.

Put another way, the table structure is saved in WAL, but the data
isn't.

Have a ncie day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: xlogdump fixups and WAL log question.

From

Theo Schlossnagle

Date:

21 October 2006, 10:01:14

On Oct 21, 2006, at 6:08 AM, Martijn van Oosterhout wrote:

> On Sat, Oct 21, 2006 at 10:37:51AM +0100, Simon Riggs wrote:
>> Turning off WAL is a difficult topic. Without it you have no crash
>> recovery, which IMHO everybody says they don't care about until they
>> crash, then they realise. It's hard to be selective about writing WAL
>> for specific operations also.
>
> It's been discussed before. One idea is to declare tables without
> logging. The idea being that during recovery those tables and related
> indexes are simply truncated. No foreign keys allowed. Obviously they
> will not be saved via PITR either.
>
> Put another way, the table structure is saved in WAL, but the data
> isn't.

This is exactly what I'd like.  Simon suggested turning off WAL  
during the loads as a possible hack solution.  The reason this won't  
work is that we snap all the time, lots of tables.  We have between  
2000 and 4000 snapshot operations per day (throughout).  At the same  
time we have reporting queries running (that create and/or populate  
other tables) that last from 5 minutes to 18 hours.  It is important  
that we run everything but the snapshots with WAL on (as we must have  
PITR -- sans snapshots)

// Theo Schlossnagle
// CTO -- http://www.omniti.com/~jesus/
// OmniTI Computer Consulting, Inc. -- http://www.omniti.com/

Re: xlogdump fixups and WAL log question.

From

"Simon Riggs"

Date:

21 October 2006, 17:51:11

On Sat, 2006-10-21 at 09:00 -0400, Theo Schlossnagle wrote:
> On Oct 21, 2006, at 6:08 AM, Martijn van Oosterhout wrote:
> 
> > On Sat, Oct 21, 2006 at 10:37:51AM +0100, Simon Riggs wrote:
> >> Turning off WAL is a difficult topic. Without it you have no crash
> >> recovery, which IMHO everybody says they don't care about until they
> >> crash, then they realise. It's hard to be selective about writing WAL
> >> for specific operations also.
> >
> > It's been discussed before. One idea is to declare tables without
> > logging. The idea being that during recovery those tables and related
> > indexes are simply truncated. No foreign keys allowed. Obviously they
> > will not be saved via PITR either.
> >
> > Put another way, the table structure is saved in WAL, but the data
> > isn't.
> 
> This is exactly what I'd like.  Simon suggested turning off WAL  
> during the loads as a possible hack solution.  The reason this won't  
> work is that we snap all the time, lots of tables.  We have between  
> 2000 and 4000 snapshot operations per day (throughout).  At the same  
> time we have reporting queries running (that create and/or populate  
> other tables) that last from 5 minutes to 18 hours.  It is important  
> that we run everything but the snapshots with WAL on (as we must have  
> PITR -- sans snapshots)

These tables are loaded once then read-only, yes?

--  Simon Riggs              EnterpriseDB   http://www.enterprisedb.com

Re: xlogdump fixups and WAL log question.

From

Theo Schlossnagle

Date:

21 October 2006, 17:54:41

On Oct 21, 2006, at 3:12 PM, Simon Riggs wrote:

> On Sat, 2006-10-21 at 09:00 -0400, Theo Schlossnagle wrote:
>> On Oct 21, 2006, at 6:08 AM, Martijn van Oosterhout wrote:
>>
>>> On Sat, Oct 21, 2006 at 10:37:51AM +0100, Simon Riggs wrote:
>>>> Turning off WAL is a difficult topic. Without it you have no crash
>>>> recovery, which IMHO everybody says they don't care about until  
>>>> they
>>>> crash, then they realise. It's hard to be selective about  
>>>> writing WAL
>>>> for specific operations also.
>>>
>>> It's been discussed before. One idea is to declare tables without
>>> logging. The idea being that during recovery those tables and  
>>> related
>>> indexes are simply truncated. No foreign keys allowed. Obviously  
>>> they
>>> will not be saved via PITR either.
>>>
>>> Put another way, the table structure is saved in WAL, but the data
>>> isn't.
>>
>> This is exactly what I'd like.  Simon suggested turning off WAL
>> during the loads as a possible hack solution.  The reason this won't
>> work is that we snap all the time, lots of tables.  We have between
>> 2000 and 4000 snapshot operations per day (throughout).  At the same
>> time we have reporting queries running (that create and/or populate
>> other tables) that last from 5 minutes to 18 hours.  It is important
>> that we run everything but the snapshots with WAL on (as we must have
>> PITR -- sans snapshots)
>
> These tables are loaded once then read-only, yes?

No, they are loaded, and then reloaded, and then reloaded. Queries  
that use them will get the most recently loaded version of them.  It  
meets a business rule like: table foo on the warehouse should be  
representative of version of table foo on OLTP no older than 30 minutes.

// Theo Schlossnagle
// CTO -- http://www.omniti.com/~jesus/
// OmniTI Computer Consulting, Inc. -- http://www.omniti.com/

Re: xlogdump fixups and WAL log question.

From

"Simon Riggs"

Date:

21 October 2006, 19:44:47

On Sat, 2006-10-21 at 15:17 -0400, Theo Schlossnagle wrote:
> On Oct 21, 2006, at 3:12 PM, Simon Riggs wrote:
> 
> > On Sat, 2006-10-21 at 09:00 -0400, Theo Schlossnagle wrote:
> >> On Oct 21, 2006, at 6:08 AM, Martijn van Oosterhout wrote:
> >>
> >>> On Sat, Oct 21, 2006 at 10:37:51AM +0100, Simon Riggs wrote:
> >>>> Turning off WAL is a difficult topic. Without it you have no crash
> >>>> recovery, which IMHO everybody says they don't care about until  
> >>>> they
> >>>> crash, then they realise. It's hard to be selective about  
> >>>> writing WAL
> >>>> for specific operations also.
> >>>
> >>> It's been discussed before. One idea is to declare tables without
> >>> logging. The idea being that during recovery those tables and  
> >>> related
> >>> indexes are simply truncated. No foreign keys allowed. Obviously  
> >>> they
> >>> will not be saved via PITR either.
> >>>
> >>> Put another way, the table structure is saved in WAL, but the data
> >>> isn't.
> >>
> >> This is exactly what I'd like.  Simon suggested turning off WAL
> >> during the loads as a possible hack solution.  The reason this won't
> >> work is that we snap all the time, lots of tables.  We have between
> >> 2000 and 4000 snapshot operations per day (throughout).  At the same
> >> time we have reporting queries running (that create and/or populate
> >> other tables) that last from 5 minutes to 18 hours.  It is important
> >> that we run everything but the snapshots with WAL on (as we must have
> >> PITR -- sans snapshots)
> >
> > These tables are loaded once then read-only, yes?
> 
> No, they are loaded, and then reloaded, and then reloaded. Queries  
> that use them will get the most recently loaded version of them.  It  
> meets a business rule like: table foo on the warehouse should be  
> representative of version of table foo on OLTP no older than 30 minutes.

But they can be re-created anew with the same name each time? Or I guess
not, but you redefine a view every 30 minutes to point to the latest
one?

If so, then I have a patch that will speed up COPY when in the same
transaction as the table that created it. I've finally fixed a bug in my
earlier prototypes that seems to make that work now, in all cases.

I was being slightly slow before; I thought this was a new requirement
but its just the old one slightly restated.

--  Simon Riggs              EnterpriseDB   http://www.enterprisedb.com

Re: xlogdump fixups and WAL log question.

From

Theo Schlossnagle

Date:

21 October 2006, 20:20:39

On Oct 21, 2006, at 4:40 PM, Simon Riggs wrote:

> On Sat, 2006-10-21 at 15:17 -0400, Theo Schlossnagle wrote:
>> On Oct 21, 2006, at 3:12 PM, Simon Riggs wrote:
>>
>>> On Sat, 2006-10-21 at 09:00 -0400, Theo Schlossnagle wrote:
>>>> On Oct 21, 2006, at 6:08 AM, Martijn van Oosterhout wrote:
>>>>
>>>>> On Sat, Oct 21, 2006 at 10:37:51AM +0100, Simon Riggs wrote:
>>>>>> Turning off WAL is a difficult topic. Without it you have no  
>>>>>> crash
>>>>>> recovery, which IMHO everybody says they don't care about until
>>>>>> they
>>>>>> crash, then they realise. It's hard to be selective about
>>>>>> writing WAL
>>>>>> for specific operations also.
>>>>>
>>>>> It's been discussed before. One idea is to declare tables without
>>>>> logging. The idea being that during recovery those tables and
>>>>> related
>>>>> indexes are simply truncated. No foreign keys allowed. Obviously
>>>>> they
>>>>> will not be saved via PITR either.
>>>>>
>>>>> Put another way, the table structure is saved in WAL, but the data
>>>>> isn't.
>>>>
>>>> This is exactly what I'd like.  Simon suggested turning off WAL
>>>> during the loads as a possible hack solution.  The reason this  
>>>> won't
>>>> work is that we snap all the time, lots of tables.  We have between
>>>> 2000 and 4000 snapshot operations per day (throughout).  At the  
>>>> same
>>>> time we have reporting queries running (that create and/or populate
>>>> other tables) that last from 5 minutes to 18 hours.  It is  
>>>> important
>>>> that we run everything but the snapshots with WAL on (as we must  
>>>> have
>>>> PITR -- sans snapshots)
>>>
>>> These tables are loaded once then read-only, yes?
>>
>> No, they are loaded, and then reloaded, and then reloaded. Queries
>> that use them will get the most recently loaded version of them.  It
>> meets a business rule like: table foo on the warehouse should be
>> representative of version of table foo on OLTP no older than 30  
>> minutes.
>
> But they can be re-created anew with the same name each time? Or I  
> guess
> not, but you redefine a view every 30 minutes to point to the latest
> one?

closest to the latter.  A view is redefined when the new snapshot is  
complete.

> If so, then I have a patch that will speed up COPY when in the same
> transaction as the table that created it. I've finally fixed a bug  
> in my
> earlier prototypes that seems to make that work now, in all cases.
>
> I was being slightly slow before; I thought this was a new requirement
> but its just the old one slightly restated.

We don't use COPY.  We directly INSERT INTO target_snap SELECT * from  
remote_select(...) t(cast);

remote_select is part of dbi-link.

// Theo Schlossnagle
// CTO -- http://www.omniti.com/~jesus/
// OmniTI Computer Consulting, Inc. -- http://www.omniti.com/

Re: xlogdump fixups and WAL log question.

From

Tom Lane

Date:

21 October 2006, 22:59:24

"Simon Riggs" <simon@2ndquadrant.com> writes:
> But they can be re-created anew with the same name each time? Or I guess
> not, but you redefine a view every 30 minutes to point to the latest
> one?

> If so, then I have a patch that will speed up COPY when in the same
> transaction as the table that created it. I've finally fixed a bug in my
> earlier prototypes that seems to make that work now, in all cases.

Can you make the patch cover the case of
begin;truncate foo;copy foo from ...commit;

It might be infeasible to detect this case, but if it's not ...
        regards, tom lane

Re: xlogdump fixups and WAL log question.

From

"Simon Riggs"

Date:

22 October 2006, 09:14:56

On Sat, 2006-10-21 at 19:24 -0400, Tom Lane wrote:
> "Simon Riggs" <simon@2ndquadrant.com> writes:
> > But they can be re-created anew with the same name each time? Or I guess
> > not, but you redefine a view every 30 minutes to point to the latest
> > one?
> 
> > If so, then I have a patch that will speed up COPY when in the same
> > transaction as the table that created it. I've finally fixed a bug in my
> > earlier prototypes that seems to make that work now, in all cases.
> 
> Can you make the patch cover the case of
> 
>     begin;
>     truncate foo;
>     copy foo from ...
>     commit;
> 
> It might be infeasible to detect this case, but if it's not ...

I think it is possible to detect this case without making catalog
entries, so I'll give this a try. Methinks that the truncate *must* be
the immediately preceding command, otherwise we might have a trigger
executing to put rows back into the table before we COPY.

I should also be able to get that to work with Insert Select without
much bother too.

We might also add this capability to COPY itself by providing a WITH
TRUNCATE option, which would be even cleaner code-wise. We can add that
to pg_dump so it will work with/without --single-transaction mode. That
will change the permissions reqd slightly, but seems OK.

I'll go for all of the above changes unless there are objections.

--  Simon Riggs              EnterpriseDB   http://www.enterprisedb.com

Re: xlogdump fixups and WAL log question.

From

Alvaro Herrera

Date:

22 October 2006, 13:00:33

Simon Riggs wrote:
> On Sat, 2006-10-21 at 19:24 -0400, Tom Lane wrote:

> > Can you make the patch cover the case of
> > 
> >     begin;
> >     truncate foo;
> >     copy foo from ...
> >     commit;
> > 
> > It might be infeasible to detect this case, but if it's not ...
> 
> I think it is possible to detect this case without making catalog
> entries, so I'll give this a try. Methinks that the truncate *must* be
> the immediately preceding command, otherwise we might have a trigger
> executing to put rows back into the table before we COPY.

Hmm, is it possible to save the "is empty" info somewhere in local
memory, perhaps the relcache (not necessarily propagated), and have
heap_insert turn it off?

That would help when you do things like

begin;
truncate foo, bar;
copy foo from ...
copy bar from ...
commit;

On the other hand, what happens if you do

begin;
truncate foo;
-- another sessioncopy foo from ...

-- original session
copy foo from ...
commit;


How do you detect that the table is no longer empty?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: xlogdump fixups and WAL log question.

From

Tom Lane

Date:

22 October 2006, 13:12:34

Alvaro Herrera <alvherre@commandprompt.com> writes:
> Simon Riggs wrote:
>> I think it is possible to detect this case without making catalog
>> entries, so I'll give this a try. Methinks that the truncate *must* be
>> the immediately preceding command, otherwise we might have a trigger
>> executing to put rows back into the table before we COPY.

> Hmm, is it possible to save the "is empty" info somewhere in local
> memory, perhaps the relcache (not necessarily propagated), and have
> heap_insert turn it off?

The relcache isn't a very safe place to store state --- it's a cache,
not stable storage.

However, I don't understand why Simon is on about "empty".  ISTM the
important state is "new relfilenode assigned in this transaction".
        regards, tom lane

Re: xlogdump fixups and WAL log question.

From

"Simon Riggs"

Date:

22 October 2006, 18:41:43

On Sun, 2006-10-22 at 12:12 -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Simon Riggs wrote:
> >> I think it is possible to detect this case without making catalog
> >> entries, so I'll give this a try. Methinks that the truncate *must* be
> >> the immediately preceding command, otherwise we might have a trigger
> >> executing to put rows back into the table before we COPY.
> 
> > Hmm, is it possible to save the "is empty" info somewhere in local
> > memory, perhaps the relcache (not necessarily propagated), and have
> > heap_insert turn it off?
> 
> The relcache isn't a very safe place to store state --- it's a cache,
> not stable storage.

I was imagining adding this onto the Relation struct, just as we do with
rd_createSubid and rd_targblock. That isn't a safe place to store that
state so we can't do this across multiple backends. We wouldn't want
that anyway since otherwise various statements would need to access
shared state before they can act, which is not good.

> However, I don't understand why Simon is on about "empty".  ISTM the
> important state is "new relfilenode assigned in this transaction".

Thank you for supplying clarity of thought; empty would be important if
we were taking full table locks, which we don't want to do (Christmas
Past...). So we need not have the DML immediately following a truncate,
only that a truncate has previously occurred within the top level
transaction or an unaborted subtransaction.

--  Simon Riggs              EnterpriseDB   http://www.enterprisedb.com

Re: xlogdump fixups and WAL log question.

From

Markus Schaber

Date:

24 October 2006, 08:14:56

Hi, Simon,

Simon Riggs wrote:

> 1. Provide a filter that can be easily used by archive_command to remove
> full page writes from WAL files. This would require us to disable the
> file size test when we begin recovery on a new WAL files, plus would
> need to redesign initial location of the checkpoint record since we
> could no longer rely on the XLogRecPtr being a byte offset within the
> file.

pg_WAL_filter could "correct" the XLogRecPtr and file sizes during the
filter run.

HTH,
Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in Europe! www.ffii.org
www.nosoftwarepatents.org