Thread: DROP column: documentation unclear

DROP column: documentation unclear

From: Adrian von Bidder
Hi,

The documentation about DROP COLUMN is a bit unclear:

| The DROP COLUMN form does not physically remove the column, but simply
| makes it invisible to SQL operations. Subsequent insert and update
| operations in the table will store a null value for the column. Thus,
| dropping a column is quick but it will not immediately reduce the on-disk
| size of your table, as the space occupied by the dropped column is not
| reclaimed. The space will be reclaimed over time as existing rows are
| updated.

"subsequent ... will store a null value" would imply that deleted columns
will still take some place, while "the space will be reclaimed ..." would
suggest that new rows (insert or updates in mvcc) don't have the deleted
column anymore - I'm not quite sure how to interpret this.  What is pg
doing?

cheers
-- vbi


--
featured product: PostgreSQL - http://postgresql.org


Re: DROP column: documentation unclear

From: Martijn van Oosterhout
On Mon, Mar 08, 2010 at 05:09:14PM +0100, Adrian von Bidder wrote:
> Hi,
>
> The documentation about DROP COLUMN is a bit unclear:
>
> | The DROP COLUMN form does not physically remove the column, but simply
> | makes it invisible to SQL operations. Subsequent insert and update
> | operations in the table will store a null value for the column. Thus,
> | dropping a column is quick but it will not immediately reduce the on-disk
> | size of your table, as the space occupied by the dropped column is not
> | reclaimed. The space will be reclaimed over time as existing rows are
> | updated.
>
> "subsequent ... will store a null value" would imply that deleted columns
> will still take some place, while "the space will be reclaimed ..." would
> suggest that new rows (insert or updates in mvcc) don't have the deleted
> column anymore - I'm not quite sure how to interpret this.  What is pg
> doing?

What you're missing is that in Postgres, NULLs are stored as a bit in
the tuple header and there is no data.  So in a sense NULLs take no
space (well, one bit), which means both statements are true.
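
You can see both halves of that for yourself with something like this
(a sketch; the table is made up, but pg_attribute.attisdropped is the
real catalog flag):

    CREATE TABLE demo (a int, b int);
    ALTER TABLE demo DROP COLUMN b;

    -- the column is still in the catalog, just marked as dropped
    SELECT attname, attisdropped
      FROM pg_attribute
     WHERE attrelid = 'demo'::regclass AND attnum > 0;

    -- new rows store nothing for b beyond its bit in the null bitmap
    INSERT INTO demo (a) VALUES (1);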

Have a nice day,

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.


autovacuum question

From: Scot Kreienkamp
Hi everyone,

I have a database that is constantly getting reloaded several times per
day from production backups and is used for reporting purposes.  The
problem I'm having with it is that the database seems to be much slower
than the others I have that are more static.  I suspect that is due to
the lack of query planner statistics (ANALYZE) after each restore; I
skip that step because of how long it takes.

First, I'm hoping someone here can validate my theory.  Second, if
that's true, is there any way to trigger an auto-analyze on a table
automatically the first time a query touches that table?  (I ask because
there is no way to shrink the amount of time a database-wide analyze
would take into the window I have to do it in.  The expectations may be
a bit unrealistic here, I know.)  Third, what parameters can I set to
make analyze go as fast as possible, knowing that the disks are slow
because of the hardware?  (Can't do anything about that either, FYI.)
Obviously the more memory the better, and setting maintenance_work_mem
higher should help too.  Doing a vacuum is kind of pointless because the
database gets reloaded every 2 hours, so all I really need is an
analyze, I think.
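
Something like this is what I have in mind, settings-wise (just a
sketch, values made up):

    SET maintenance_work_mem = '512MB';   -- throw memory at it
    SET vacuum_cost_delay = 0;            -- 0 is already the default for manual runs
    SET default_statistics_target = 10;   -- the 8.3 default; higher samples more rows
    ANALYZE;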


I'm on 8.3.7.  Since 8.3.10 is due out I'll probably upgrade to it after
it's shaken out a bit.

Thanks for your help.  Any suggestions are welcome.

Scot Kreienkamp
skreien@la-z-boy.com



Re: autovacuum question

From: Scott Mead
On Mon, Mar 8, 2010 at 5:13 PM, Scot Kreienkamp <SKreien@la-z-boy.com> wrote:
Hi everyone,

I have a database that is constantly getting reloaded several times per
day from production backups and is used for reporting purposes.  The
problem I'm having with it is that the database seems to be much slower
than the others I have that are more static.  I suspect that is due to
the lack of query planner statistics (ANALYZE) after each restore; I
skip that step because of how long it takes.

First, I'm hoping someone here can validate my theory.  

  It would seem likely that this could be the problem...
 
Second, if
that's true, is there any way to trigger an auto-analyze on a table
automatically the first time a query touches that table?   
(I ask because
there is no way to shrink the amount of time a database-wide analyze
would take into the window I have to do it in.  The expectations may be
a bit unrealistic here, I know.)  

Why not just add an 'analyze' as the last step of the restore job?
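
e.g. something like this (a sketch; the table names are made up):

    ANALYZE;              -- whole database, simplest
    -- or, if that's too slow, only the tables the reports actually hit:
    ANALYZE orders;
    ANALYZE order_lines;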

 
Third, what parameters can I set to
make analyze go as fast as possible, knowing that the disks are slow
because of the hardware?  (Can't do anything about that either, FYI.)
Obviously the more memory the better, and setting maintenance_work_mem
higher should help too.  Doing a vacuum is kind of pointless because the
database gets reloaded every 2 hours, so all I really need is an
analyze, I think.

Sounds like you've done what you can.  How long does an analyze take?
 


I'm on 8.3.7.  Since 8.3.10 is due out I'll probably upgrade to it after
it's shaken out a bit.

  Why not move up to 8.4? 

--Scott M
 
Thanks for your help.  Any suggestions are welcome.

Scot Kreienkamp
skreien@la-z-boy.com




Re: autovacuum question

From: Scot Kreienkamp

 

On Mon, Mar 8, 2010 at 5:13 PM, Scot Kreienkamp <SKreien@la-z-boy.com> wrote:

Hi everyone,

I have a database that is constantly getting reloaded several times per
day from production backups and is used for reporting purposes.  The
problem I'm having with it is that the database seems to be much slower
than the others I have that are more static.  I suspect that is due to
the lack of query planner statistics (ANALYZE) after each restore; I
skip that step because of how long it takes.

First, I'm hoping someone here can validate my theory.  

 

  It would seem likely that this could be the problem...

 

Cool, thanks.  Glad I'm on the right track.

 

Second, if
that's true, is there any way to trigger an auto-analyze on a table
automatically the first time a query touches that table?   

(I ask because
there is no way to shrink the amount of time a database-wide analyze
would take into the window I have to do it in.  The expectations may be
a bit unrealistic here, I know.)  

 

Why not just add an 'analyze' as the last step of the restore job?

 

Due to the amount of time it takes.  The disks are slow enough to make a database-wide analyze painful, since I would have to repeat it every 1-2 hours, i.e., every reload time.

 

Third, what parameters can I set to
make analyze go as fast as possible, knowing that the disks are slow
because of the hardware?  (Can't do anything about that either, FYI.)
Obviously the more memory the better, and setting maintenance_work_mem
higher should help too.  Doing a vacuum is kind of pointless because the
database gets reloaded every 2 hours, so all I really need is an
analyze, I think.

 

Sounds like you've done what you can.  How long does an analyze take?

 

Last I tried it, it took 15 minutes on a 30 gig database while it was being used. 

 

I'm on 8.3.7.  Since 8.3.10 is due out I'll probably upgrade to it after
it's shaken out a bit.

 

  Why not move up to 8.4? 

 

Because I’m constrained by our application developers who don’t have the time to vet our app against 8.4 yet.  I’ve been pushing for it for the last 2 months. 

 

--Scott M

 

Thanks for your help.  Any suggestions are welcome.

Scot Kreienkamp
skreien@la-z-boy.com




 

Re: autovacuum question

From: Tom Lane
"Scot Kreienkamp" <SKreien@la-z-boy.com> writes:
>> Why not just add an 'analyze' as the last step of the restore job?

> Due to the amount of time it takes.  The disks are slow enough to make a
> database-wide analyze painful since I would have to repeat it every 1-2
> hours, i.e., every reload time.

You claimed that before.  It didn't make any sense then and it doesn't
now.  There is no way that an analyze is expensive compared to a
database reload.

Maybe what you need to be doing is rethinking the strategy that involves
reloading every couple of hours...

            regards, tom lane

Re: autovacuum question

From: Scot Kreienkamp
Wish I could, Tom.  I need a non-production, read-write copy of the
database that is updated every 1-2 hours from production. I don't set
this requirement, the business does. I just have to do it if it's
technically possible.

I found a way to do it very easily using LVM snapshots and WAL log
shipping, but the net effect is I'm bringing a new LVM snapshot copy of
the database out of recovery every 1-2 hours.  That means I'd have to
spend 15 minutes, or one-quarter of the time, doing an analyze every
time I refresh the database.  That's fairly painful.  The LVM snap and
restart only takes 1-2 minutes right now.

If you have any other ideas how I can accomplish or improve this I'm all
ears.

Thanks,

Scot Kreienkamp
skreien@la-z-boy.com

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Monday, March 08, 2010 10:32 PM
To: Scot Kreienkamp
Cc: Scott Mead; pgsql-general@postgresql.org
Subject: Re: [GENERAL] autovacuum question

"Scot Kreienkamp" <SKreien@la-z-boy.com> writes:
>> Why not just add an 'analyze' as the last step of the restore job?

> Due to the amount of time it takes.  The disks are slow enough to make a
> database-wide analyze painful since I would have to repeat it every 1-2
> hours, i.e., every reload time.

You claimed that before.  It didn't make any sense then and it doesn't
now.  There is no way that an analyze is expensive compared to a
database reload.

Maybe what you need to be doing is rethinking the strategy that involves
reloading every couple of hours...

            regards, tom lane

Re: autovacuum question

From: Greg Stark
On Tue, Mar 9, 2010 at 1:47 PM, Scot Kreienkamp <SKreien@la-z-boy.com> wrote:
> I found a way to do it very easily using LVM snapshots and WAL log
> shipping, but the net effect is I'm bringing a new LVM snapshot copy of
> the database out of recovery every 1-2 hours.  That means I'd have to
> spend 15 minutes, or one-quarter of the time, doing an analyze every
> time I refresh the database.  That's fairly painful.  The LVM snap and
> restart only takes 1-2 minutes right now.

Your snapshot should have the same stats that the server does, so this
doesn't actually seem to explain the discrepancy.

You may be running into performance problems with LVM if the snapshot
is the one paying the price for all the CoW copies. Or it could be that
doing retail block copies as needed leaves them fragmented and destroys
sequential scan performance. You might be able to reduce the difference
by making sure to do a vacuum and a checkpoint immediately prior to the
snapshot. That should set most hint bits, so that read-only queries on
the snapshot don't have to write blocks just to set them.

There might be an option in LVM to materialize the entire snapshot,
which might bring the performance up to the same level and hopefully
allocate all the blocks sequentially.
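
Concretely, something like this on the primary just before cutting the
snapshot (a sketch, untested):

    VACUUM;      -- sets hint bits so the snapshot's first readers
                 -- don't have to dirty pages just to set them
    CHECKPOINT;  -- flushes dirty shared buffers so the snapshot
                 -- starts from clean data files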


--
greg

Re: autovacuum question

From: Scot Kreienkamp
Would the stats come across in WAL log shipping to a physically separate server?  My understanding is that they won't.

Thanks,

Scot Kreienkamp
skreien@la-z-boy.com


-----Original Message-----
From: gsstark@gmail.com [mailto:gsstark@gmail.com] On Behalf Of Greg Stark
Sent: Tuesday, March 09, 2010 9:39 AM
To: Scot Kreienkamp
Cc: Tom Lane; Scott Mead; pgsql-general@postgresql.org
Subject: Re: autovacuum question

On Tue, Mar 9, 2010 at 1:47 PM, Scot Kreienkamp <SKreien@la-z-boy.com> wrote:
> I found a way to do it very easily using LVM snapshots and WAL log
> shipping, but the net effect is I'm bringing a new LVM snapshot copy of
> the database out of recovery every 1-2 hours.  That means I'd have to
> spend 15 minutes, or one-quarter of the time, doing an analyze every
> time I refresh the database.  That's fairly painful.  The LVM snap and
> restart only takes 1-2 minutes right now.

Your snapshot should have the same stats that the server does, so this
doesn't actually seem to explain the discrepancy.

You may be running into performance problems with LVM if the snapshot
is the one paying the price for all the CoW copies. Or it could be that
doing retail block copies as needed leaves them fragmented and destroys
sequential scan performance. You might be able to reduce the difference
by making sure to do a vacuum and a checkpoint immediately prior to the
snapshot. That should set most hint bits, so that read-only queries on
the snapshot don't have to write blocks just to set them.

There might be an option in LVM to materialize the entire snapshot,
which might bring the performance up to the same level and hopefully
allocate all the blocks sequentially.


--
greg

Re: autovacuum question

From: Scott Marlowe
On Tue, Mar 9, 2010 at 6:47 AM, Scot Kreienkamp <SKreien@la-z-boy.com> wrote:
> Wish I could Tom.  I need a non-production, read-write copy of the
> database that is updated every 1-2 hours from production. I don't set
> this requirement, the business does. I just have to do it if it's
> technically possible.
>
> I found a way to do it very easily using LVM snapshots and WAL log
> shipping, but the net effect is I'm bringing a new LVM snapshot copy of
> the database out of recovery every 1-2 hours.  That means I'd have to
> spend 15 minutes, or one-quarter of the time, doing an analyze every
> time I refresh the database.  That's fairly painful.  The LVM snap and
> restart only takes 1-2 minutes right now.
>
> If you have any other ideas how I can accomplish or improve this I'm all
> ears.

I'm gonna take a scientific wild-assed guess that the real issue here
is caching, or more specifically, lack thereof when you first start up
your copy of the db.
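
If that's it, a crude warm-up right after each reload might mask it
(a sketch; the table name is made up; the full scan just drags the
heap through the OS cache):

    SELECT count(*) FROM orders;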

Re: autovacuum question

From: Scott Mead

On Tue, Mar 9, 2010 at 9:56 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
On Tue, Mar 9, 2010 at 6:47 AM, Scot Kreienkamp <SKreien@la-z-boy.com> wrote:
> Wish I could Tom.  I need a non-production, read-write copy of the
> database that is updated every 1-2 hours from production. I don't set
> this requirement, the business does. I just have to do it if it's
> technically possible.
>
> I found a way to do it very easily using LVM snapshots and WAL log
> shipping, but the net effect is I'm bringing a new LVM snapshot copy of
> the database out of recovery every 1-2 hours.  That means I'd have to
> spend 15 minutes, or one-quarter of the time, doing an analyze every
> time I refresh the database.  That's fairly painful.  The LVM snap and
> restart only takes 1-2 minutes right now.
>
> If you have any other ideas how I can accomplish or improve this I'm all
> ears.

I'm gonna take a scientific wild-assed guess that the real issue here
is caching, or more specifically, lack thereof when you first start up
your copy of the db.

ISTM that 9.0's read-only standby feature may be of use to you.  I know it doesn't help you *today* but have you looked at it yet?

--Scott M
 


Re: autovacuum question

From: Scot Kreienkamp

 

ISTM that 9.0's read-only standby feature may be of use to you.  I know it doesn't help you *today* but have you looked at it yet?

 

[Scot Kreienkamp]

I had considered it and it will make my life easier for my reporting server, but unfortunately in this case I need a read-write copy. 

 

 

Re: autovacuum question

From: Scott Mead

On Tue, Mar 9, 2010 at 10:01 AM, Scott Mead <scott.lists@enterprisedb.com> wrote:

On Tue, Mar 9, 2010 at 9:56 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
On Tue, Mar 9, 2010 at 6:47 AM, Scot Kreienkamp <SKreien@la-z-boy.com> wrote:
> Wish I could Tom.  I need a non-production, read-write copy of the
> database that is updated every 1-2 hours from production. I don't set
> this requirement, the business does. I just have to do it if it's
> technically possible.
>
> I found a way to do it very easily using LVM snapshots and WAL log
> shipping, but the net effect is I'm bringing a new LVM snapshot copy of
> the database out of recovery every 1-2 hours.  That means I'd have to
> spend 15 minutes, or one-quarter of the time, doing an analyze every
> time I refresh the database.  That's fairly painful.  The LVM snap and
> restart only takes 1-2 minutes right now.
>
> If you have any other ideas how I can accomplish or improve this I'm all
> ears.

I'm gonna take a scientific wild-assed guess that the real issue here
is caching, or more specifically, lack thereof when you first start up
your copy of the db.

ISTM that 9.0's read-only standby feature may be of use to you.  I know it doesn't help you *today* but have you looked at it yet?


   Okay, so the RO database won't work.  How much data are we talking?  How much growth do you see between snapshots?

   --Scott M

Re: autovacuum question

From: Scot Kreienkamp

 

ISTM that 9.0's read-only standby feature may be of use to you.  I know it doesn't help you *today* but have you looked at it yet?

 

 

   Okay, so the RO database won't work.  How much data are we talking?  How much growth do you see between snapshots?

 

The initial database size is 31 gigs.  I give it 5 gigs in the snapshot to grow in, but I would be surprised if more than 200 megs of data changes.  The actual change rate should be very low. 

Re: autovacuum question

From: Bryan Montgomery
Could you have two of these non-production databases?  Prepare one in
the background, including an analyze, then make it the 'live'
non-production database; use the offline/alternative database for the
next load, prepare that, and switch it on when ready.

In this scenario you'd need twice the disk space, I guess, but the
'downtime' would be a lot less.  I'd imagine you could have the
databases on different ports and switch them at the rollover point, or
maybe even just use different database names.
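
The database-name version could be as simple as this sketch (names
made up; both renames require that nobody is connected to the
databases involved):

    ALTER DATABASE reporting      RENAME TO reporting_old;
    ALTER DATABASE reporting_next RENAME TO reporting;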

Just an alternative idea to throw out there.

Bryan.

On Tue, Mar 9, 2010 at 8:47 AM, Scot Kreienkamp <SKreien@la-z-boy.com> wrote:
Wish I could Tom.  I need a non-production, read-write copy of the
database that is updated every 1-2 hours from production. I don't set
this requirement, the business does. I just have to do it if it's
technically possible.

I found a way to do it very easily using LVM snapshots and WAL log
shipping, but the net effect is I'm bringing a new LVM snapshot copy of
the database out of recovery every 1-2 hours.  That means I'd have to
spend 15 minutes, or one-quarter of the time, doing an analyze every
time I refresh the database.  That's fairly painful.  The LVM snap and
restart only takes 1-2 minutes right now.

If you have any other ideas how I can accomplish or improve this I'm all
ears.

Thanks,

Scot Kreienkamp
skreien@la-z-boy.com

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Monday, March 08, 2010 10:32 PM
To: Scot Kreienkamp
Cc: Scott Mead; pgsql-general@postgresql.org
Subject: Re: [GENERAL] autovacuum question

"Scot Kreienkamp" <SKreien@la-z-boy.com> writes:
>> Why not just add an 'analyze' as the last step of the restore job?

> Due to the amount of time it takes.  The disks are slow enough to make a
> database-wide analyze painful since I would have to repeat it every 1-2
> hours, i.e., every reload time.

You claimed that before.  It didn't make any sense then and it doesn't
now.  There is no way that an analyze is expensive compared to a
database reload.

Maybe what you need to be doing is rethinking the strategy that involves
reloading every couple of hours...

                       regards, tom lane


Re: autovacuum question

From: Scot Kreienkamp
I'm gonna take a scientific wild-assed guess that the real issue here
is caching, or more specifically, lack thereof when you first start up
your copy of the db.

[Scot Kreienkamp]
That is definitely one of the problems.  There's no way to help that,
as far as I'm aware.

Re: DROP column: documentation unclear

From: Harald Fuchs
In article <20100308213549.GB660@svana.org>,
Martijn van Oosterhout <kleptog@svana.org> writes:

>> "subsequent ... will store a null value" would imply that deleted columns
>> will still take some place, while "the space will be reclaimed ..." would
>> suggest that new rows (insert or updates in mvcc) don't have the deleted
>> column anymore - I'm not quite sure how to interpret this.  What is pg
>> doing?

> What you're missing is that in Postgres, NULLs are stored as a bit in
> the tuple header and there is no data.  So in a sense NULLs take no
> space (well, one bit), which means both statements are true.

But if you already have eight nullable columns, the (maybe originally
non-null) column which has been dropped would cause the header to be
one byte larger, wouldn't it?

Re: DROP column: documentation unclear

From: Tom Lane
Harald Fuchs <hari.fuchs@gmail.com> writes:
> Martijn van Oosterhout <kleptog@svana.org> writes:
>> What you're missing is that in Postgres, NULLs are stored as a bit in
>> the tuple header and there is no data.  So in a sense NULLs take no
>> space (well, one bit), which means both statements are true.

> But if you already have eight nullable columns, the (maybe originally
> non-null) column which has been dropped would cause the header to be
> one byte larger, wouldn't it?

No; the size of the bitmap is equal to the total number of columns, not
the number of columns that are null.  One way to think about it is that
the first null in a particular row costs NUMCOLUMNS bits to store, and
then any additional nulls are free.
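
You can see that effect with pg_column_size on row values (the exact
byte counts vary with alignment and version):

    SELECT pg_column_size(ROW(1,2,3,4,5,6,7,8,9));     -- nine ints, no null bitmap
    SELECT pg_column_size(ROW(1,2,3,4,5,6,7,8,NULL));  -- the first NULL adds the whole bitmap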

Anyway, arguing about the size of the null bitmap seems to me to be
beside the point.  If you care about the space occupied by a column at
all, it's going to be a lot more than one bit.  It's that macroscopic
space usage that the DROP COLUMN documentation is talking about
reclaiming, not whether or not you need a null bitmap.

            regards, tom lane