Thread: Thoughts on maintaining 7.3

Thoughts on maintaining 7.3

From
"Joshua D. Drake"
Date:
Hello,
  With the recent stint of pg_upgrade statements and the impending
release of 7.4 what do people think about having a dedicated maintenance
team for 7.3? 7.3 is a pretty solid release and I think people will be
hard pressed to upgrade to 7.4.  Of course a lot of people will, but I
have customers that are just now upgrading to 7.3 because of legacy
application and migratory issues.

   Anyway I was considering a similar situation to how Linux works where
there is a maintainer for each release... Heck even Linux 2.0 was still
being released until recently.

  Of course the theory being that we backport "some" features and fix
any bugs that we find?

   What are people's thoughts on this?

Sincerely,

Joshua Drake

-- 
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming, shared hosting and dedicated hosting.
+1-503-222-2783 - jd@commandprompt.com - http://www.commandprompt.com
The most reliable support for the most reliable Open Source database.




Re: Thoughts on maintaining 7.3

From
"Marc G. Fournier"
Date:

On Tue, 30 Sep 2003, Joshua D. Drake wrote:

> Hello,
>
>   With the recent stint of pg_upgrade statements and the impending
> release of 7.4 what do people think about having a dedicated maintenance
> team for 7.3? 7.3 is a pretty solid release and I think people will be
> hard pressed to upgrade to 7.4.  Of course a lot of people will, but I
> have customer that are just now upgrading to 7.3 because of legacy
> application and migratory issues.
>
>    Anyway I was considering a similar situation to how Linux works where
> their is a maintainer for each release... Heck even Linux 2.0 still
> released until recently.
>
>   Of course the theory being that we backport "some" features and fix
> any bugs that we find?
>
>    What are people's thoughts on this?

The key issue here is that those creating the patches need to spend the
time to create appropriate ones for v7.3, and not many seem willing ...
Tom generally does a lot of work on back-patching where appropriate, but
those patches are generally either very critical, or benign to changes
since v7.3 ...

The main detractor from us doing this up to this point has been, I
believe, testing to make sure any back patches don't break *any* of the
various OS ports, testing that generally only gets done while in a Beta
freeze ...

Not saying that if someone submitted patches to v7.3, they wouldn't get
applied ... only that, to date, the work/effort has been greater than the
overall benefit, and nobody has stepped up to the plate to do it ...


Re: Thoughts on maintaining 7.3

From
Robert Treat
Date:
On Wed, 2003-10-01 at 08:36, Marc G. Fournier wrote:
> 
> 
> On Tue, 30 Sep 2003, Joshua D. Drake wrote:
> 
> > Hello,
> >
> >   With the recent stint of pg_upgrade statements and the impending
> > release of 7.4 what do people think about having a dedicated maintenance
> > team for 7.3? 7.3 is a pretty solid release and I think people will be
> > hard pressed to upgrade to 7.4.  Of course a lot of people will, but I
> > have customer that are just now upgrading to 7.3 because of legacy
> > application and migratory issues.
> >
> >    Anyway I was considering a similar situation to how Linux works where
> > their is a maintainer for each release... Heck even Linux 2.0 still
> > released until recently.
> >
> >   Of course the theory being that we backport "some" features and fix
> > any bugs that we find?
> >
> >    What are people's thoughts on this?
> 
> The key issue here is that those creating the patches need to spend the
> time to create appropriate ones for v7.3, and not many seem willing ...
> Tom generally does alot of work on back-patching where appropriate, but
> those patches are generally either very critical, or benign to changes
> since v7.3 ...
> 
> The main detractor from us doing this up to this point has been, I
> believe, testing to make sure any back patches don't break *any* of the
> various OS ports, testing that generally only gets done while in a Beta
> freeze ...
> 
> Not saying that if someone submit'd patches to v7.3, they wouldn't get
> applied ... only that, to date, the work/effort has been greater then the
> overall benefit, and nobody has step'd up to the plate to do it ...

Maybe I've mis-read Joshua's intentions, but I got the impression that
this 7.3 maintainer would follow the patches list and backport patches
whenever possible. This way folks coding for 7.4/7.5 can stay focused on
that, but folks who can't upgrade to 7.4 for whatever reason can still
get some features / improvements. 

Several linux distros already do this for many packages, and personally
I've always been surprised that, given postgresql's major release
upgrade issues, that no commercial company has stepped in to offer this
in the past. I think what Joshua is wondering is how much cooperation
would he get from the community if he was willing to donate these
efforts back into the project.

While your concerns about testing are valid, there are already issues
with that for minor releases, as evidenced by our need to do the quick
7.3.4 after trouble in 7.3.3. Not to mention how little testing is
happening to the code that's been back patched into 7.3 since 7.3.4...
Hmm... maybe that's actually an argument against having more changes get
put in, OTOH if Joshua can address the testing issues maybe there would
be an overall improvement.  

I personally think it's a good idea for *someone* to do this, but I'll
leave it to core to decide if they want to put the project's stamp of
approval on it for any official community release.

Robert Treat 
-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL



Re: Thoughts on maintaining 7.3

From
"scott.marlowe"
Date:
On Tue, 30 Sep 2003, Joshua D. Drake wrote:

> Hello,
> 
>   With the recent stint of pg_upgrade statements and the impending 
> release of 7.4 what
> do people think about having a dedicated maintenance team for 7.3? 7.3 
> is a pretty
> solid release and I think people will be hard pressed to upgrade to 7.4. 
> Of course
> a lot of people will, but I have customer that are just now upgrading to 
> 7.3 because
> of legacy application and migratory issues.
> 
>    Anyway I was considering a similar situation to how Linux works where 
> their is a
> maintainer for each release... Heck even Linux 2.0 still released until 
> recently.
> 
>   Of course the theory being that we backport "some" features and fix 
> any bugs that
> we find?
> 
>    What are people's thoughts on this?

It seems to me the upgrade from 7.2 to 7.4 is easier than an upgrade to 
7.3, since at least 7.4's pg_dumpall can connect to a 7.2 database and 
suck in everything, whereas in 7.3 I had to dump with 7.2's dumpall and 
then tweak the file by hand a fair bit to get it to go into 7.3.

With 7.4 I'm finding upgrading to be easier.  I'll likely upgrade our 
production servers to 7.4.0 when it comes out and wind up skipping 7.3 
altogether.



Re: Thoughts on maintaining 7.3

From
"Marc G. Fournier"
Date:
On Wed, 1 Oct 2003, Robert Treat wrote:

> Maybe I've mis-read Joshua's intentions, but I got the impression that
> this 7.3 maintainer would follow the patches list and backport patches
> whenever possible. This way folks coding for 7.4/7.5 can stay focused on
> that, but folks who can't upgrade to 7.4 for whatever reason can still
> get some features / improvements.

The problem, I think (and please note that I'm not against it, just
playing major devil's advocate here) is that there have always been some
major fundamental coding changes between releases that there are very few
patches that are "back-patchable" without having to do some heavy
re-writes ...

> Several linux distros already do this for many packages, and personally
> I've always been surprised that, given postgresql's major release
> upgrade issues, that no commercial company has stepped in to offer this
> in the past. I think what Joshua is wondering is how much cooperation
> would he get from the community if he was willing to donate these
> efforts back into project.

Using Linux/FreeBSD/Insert OS Here as an example is like comparing apples
to oranges ... take FreeBSD as an example, since I know it ... 5.x has had
some *major* re-writes to the kernel done to it, getting rid of 'the Giant
Lock' that SMP in 4.x uses ... those changes are not back-patchable, since
then you'd have 5.x ... there are a lot of changes to the 5.x kernel that
rely on those changes, and are therefore not *easily* back-patchable ...

Now, userland software is a totally different case, since they are rarely
"tied" to the kernel itself ...

Think of PostgreSQL as the kernel, not as the distro ... how many changes
from one kernel release are easily patched into an older one, without
having to take a lot of other baggage back with it ... ?

> I personally think it's a good idea for *someone* to do this, but I'll
> leave it to core to decide if they want to put the projects stamp of
> approval on it for any official community release.

I don't believe anyone would work against this, nor could I imagine that
anyone would think it was "a bad idea", I'm just curious as to how
possible it is to do ...



Re: Thoughts on maintaining 7.3

From
Tom Lane
Date:
"Marc G. Fournier" <scrappy@postgresql.org> writes:
> On Tue, 30 Sep 2003, Joshua D. Drake wrote:
>> Of course the theory being that we backport "some" features and fix
>> any bugs that we find?

> Not saying that if someone submit'd patches to v7.3, they wouldn't get
> applied ... only that, to date, the work/effort has been greater then the
> overall benefit, and nobody has step'd up to the plate to do it ...

The idea of backporting features scares me; I really doubt that you can
get enough beta-testing on a back branch to be confident that you
haven't broken anything with a feature addition.  In any case you'd be
quite limited in what you could do without forcing an initdb.

Another issue is that people expect dot-releases to be absolutely rock
solid.  If you start introducing new features then you considerably
increase the risk of introducing new bugs.  (I'm still embarrassed about
7.3.3's failure-to-start bug...)

Our past practice has been to back-port only bug fixes, and only
critical or low-risk ones at that.  I think this could be done in a more
thorough fashion, and it could be continued longer than we've done in
the past, but you shouldn't set the scope of the maintenance effort any
wider than that.
        regards, tom lane


Re: Thoughts on maintaining 7.3

From
Neil Conway
Date:
On Wed, 2003-10-01 at 09:14, Robert Treat wrote:
> Maybe I've mis-read Joshua's intentions, but I got the impression that
> this 7.3 maintainer would follow the patches list and backport patches
> whenever possible. This way folks coding for 7.4/7.5 can stay focused on
> that, but folks who can't upgrade to 7.4 for whatever reason can still
> get some features / improvements.

I don't think there's a need for a formalized "7.3 maintainer" -- if
individuals would like to see particular fixes backported to 7.3, they
can read pgsql-patches and post backported patches themselves. If
someone wants to go ahead and do that, I wouldn't complain. (Similarly,
if there is enough demand for a commercial company to do something
similar for their customers, that might also be a good idea).

However, I think it's a bad idea to backport any features into older
releases. The reason 7.3.x is really stable is precisely that it has had
a lot of testing and bugfixing work done, but no new features.
Furthermore, adding more features to 7.3.x reduces the incentive to
upgrade to 7.4, worsening the support problem: the more people using old
releases, the more demand there will be for backported features, leading
to more people using 7.3, leading to more demand for ...

(FWIW, I think that any energy we might spend on a 7.3 maintainer would
be better directed at improving the upgrade story...)

-Neil




Re: Thoughts on maintaining 7.3

From
Robert Treat
Date:
On Wed, 2003-10-01 at 10:49, Neil Conway wrote:
> On Wed, 2003-10-01 at 09:14, Robert Treat wrote:
> > Maybe I've mis-read Joshua's intentions, but I got the impression that
> > this 7.3 maintainer would follow the patches list and backport patches
> > whenever possible. This way folks coding for 7.4/7.5 can stay focused on
> > that, but folks who can't upgrade to 7.4 for whatever reason can still
> > get some features / improvements.
> 
> I don't think there's a need for a formalized "7.3 maintainer" -- if
> individuals would like to see particular fixes backported to 7.3, they
> can read pgsql-patches and post backported patches themselves. If
> someone wants to go ahead and do that, I wouldn't complain. (Similarly,
> if there is enough demand for a commercial company to do something
> similar for their customers, that might also be a good idea).

ok

> 
> However, I think it's a bad idea to backport any features into older
> releases. The reason 7.3.x is really stable is precisely that it has had
> a lot of testing and bugfixing work done, but no new features.

eh.. i could see some things, like tsearch2 or pg_autovacuum, which
afaik are almost if not completely compatible with 7.3, which will not
get back ported. Also fixes in some of the extra tools like psql could
be very doable, I know I had a custom psql for 7.2 that back patched the
\timing option and some of the pager fixes. now, whether that could be
done with stuff closer to core, i don't know...

btw personally i'm fine with these things not being backpatched, though
if someone came out with a 7.3 pg_autovacuum rpm, or a 7.3 psql rpm, i'm
sure a lot of people would use it. 

> Furthermore, adding more features to 7.3.x reduces the incentive to
> upgrade to 7.4, worsening the support problem: the more people using old
> releases, the more demand there will be for backported features, leading
> to more people using 7.3, leading to more demand for ...
> 
> (FWIW, I think that any energy we might spend on a 7.3 maintainer would
> be better directed at improving the upgrade story...)
> 

<homer>mmm. in place upgrade</homer>  

Robert Treat
-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL



Re: Thoughts on maintaining 7.3

From
Andrew Sullivan
Date:
On Tue, Sep 30, 2003 at 09:37:26AM -0700, Joshua D. Drake wrote:
> 
>  Of course the theory being that we backport "some" features and fix 
> any bugs that
> we find?

I would argue _very strongly_ against backporting features.

The backporting of features into the Linux kernel is an extremely
good analogy in this case.  Someone gets the clever idea that this or
that feature from 2.1/2.3/2.5 is desperately needed in 2.0/2.2/2.4
and merrily goes about adding all sorts of new cruft to the so-called
stable release.  As a result, we have plenty of examples of massive
filesystem corruption, modules that used to work and just plain don't
any more, sudden surprise hardware incompatibilities, &c.  All too
frequently releases in the "stable" series are one right atop the
other.  What's worse, all these additional features are bound up with
the important remote-root-type patches that make it into later
releases of the kernel.  As a result, it's a lot of work to compile a
known-safe and known-clean kernel for use on one's own machines. 

Patching an older release to fix critical, data-mangling bugs is one
thing.  But if people want the latest nifty feature backported to an
old release, let 'em pay the developer to do it in their private
source tree, and not force on the rest of us the job of sorting out
what crucial patches we need to apply to our old, pristine source of
PostgreSQL 7.3.4.  If you're really going to trust your database
software, you do not allow new features to be added after having
carefully tested all your applications against the system.

A

-- 
----
Andrew Sullivan                         204-4141 Yonge Street
Afilias Canada                        Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110



Re: Thoughts on maintaining 7.3

From
"Joshua D. Drake"
Date:
> eh.. i could see some things, like tsearch2 or pg_autovacuum, which
> afaik are almost if not completely compatible with 7.3, which will not
> get back ported. Also fixes in some of the extra tools like psql could
> be very doable, I know I had a custom psql for 7.2 that back patched the
> \timing option and some of the pager fixes. now, weather that could be
> done with stuff closer to core, i don't know...

Sure but businesses don't like to upgrade unless they have to. If we 
really want to attract more business to using PostgreSQL then they need
to feel like they don't have to upgrade every 12 months. Upgrading is 
expensive and it rarely goes as smoothly as a dump/restore.

> > Furthermore, adding more features to 7.3.x reduces the incentive to
> > upgrade to 7.4, worsening the support problem: the more people using old
> > releases, the more demand there will be for backported features, leading
> > to more people using 7.3, leading to more demand for ...

I am considering a time limited type thing. Not open ended. Something like 
18 or 24 months (max) from release of the new version. You can't expect
businesses to consider that timeframe during the development of the new 
release. They want to see the new release in action for a period of time.
They also want time to play with the new release without sacrificing 
support for the previous release.

> <homer>mmm. in place upgrade</homer>  

In reality in place upgrade will never work. Sure we can build a script 
that will deal with PostgreSQL itself, but not user defined data types, 
operators, functions etc... Those are all things that need stable time to 
migrate and test.

Sincerely,

Joshua Drake


> 
> Robert Treat
> 

-- 
Co-Founder
Command Prompt, Inc.
The wheel's spinning but the hamster's dead



Re: Thoughts on maintaining 7.3

From
"Joshua D. Drake"
Date:
> I would argue _very strongly_ against backporting features.

For massive features sure but an example of a feature that works
very well and easily with 7.3 is the preloading of libs.

Sincerely,

Joshua Drake

-- 
Co-Founder
Command Prompt, Inc.
The wheel's spinning but the hamster's dead



Re: Thoughts on maintaining 7.3

From
Robert Treat
Date:
On Wed, 2003-10-01 at 09:41, Marc G. Fournier wrote:
> On Wed, 1 Oct 2003, Robert Treat wrote:
> 
> > Several linux distros already do this for many packages, and personally
> > I've always been surprised that, given postgresql's major release
> > upgrade issues, that no commercial company has stepped in to offer this
> > in the past. I think what Joshua is wondering is how much cooperation
> > would he get from the community if he was willing to donate these
> > efforts back into project.
> 
> Using Linux/FreeBS/Insert OS Here as an example is like comparing apples
> to oranges ... take FreeBSD as an example, since I know it ... 5.x has had
> some *major* re-writes to the kernel done to it, getting rid of 'the Giant
> Lock' that SMP in 4.x uses ... those changes are not back-patchable, since
> then you'd have 5.x ... there are alot of changes to the 5.x kernel that
> rely on those changes, and are therefore not *easily* back-patchable ...
> 
> Now, userland software is a totally different case, since they are rarely
> "tied" to the kernel itself ...
> 

you missed my point. some distros (red hat, suse, mandrake, etc..)
backpatch into their distributed packages separately from the package's
original source tree. this is great for folks who may want/need a new
change, but can't upgrade to latest source for some reason. 

> Think of PostgreSQL as the kernel, not as the distro ... how many changes
> from one kernel release ae easily patched into an older one, without
> having to take alot of other baggage back with it ... ?

I wasn't thinking of PostgreSQL as a distro, but actually I think that
view is somewhat valid, since there are enough add ons to core that one
could modify without having to make huge changes. 

As Tom pointed out, with the restriction of not being able to initdb,
you're probably pretty limited on what you can push back, but I think
there's still enough there that folks might want to look at it. (The
recent bugs in pltcl handling dropped columns come to mind, though maybe
Tom backpatched those? Can't recall)


Robert Treat
-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL



Re: Thoughts on maintaining 7.3

From
Andrew Sullivan
Date:
On Wed, Oct 01, 2003 at 08:49:51AM -0700, Joshua D. Drake wrote:
> > I would argue _very strongly_ against backporting features.
> 
> For massive features sure but an example of a feature that works
> very well and easily with 7.3 is the preloading of libs.

Then let people patch the stable releases themselves, or pay
companies to produce such mini-branches (and thereby pay the cost of
the necessary testing, &c.).

How does one know in advance which set of "working well and easily"
features can be back ported and be sure not to break on some release
of IRIX, Solaris, AIX, or SCO?  Those are not platforms that get the
kind of kicking that Linux and FreeBSD do, but people are still
relying on the dot releases not to break anything on those platforms. 
I think that Postgres has a tradition that, when a release is stable,
it's _stable, man_ -- a tradition that other software (commercial or not)
should emulate.  I'd hate to see that go overboard in an attempt to
add features to the main releases.

A

-- 
----
Andrew Sullivan                         204-4141 Yonge Street
Afilias Canada                        Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110



Re: Thoughts on maintaining 7.3

From
Neil Conway
Date:
On Wed, 2003-10-01 at 11:48, Joshua D. Drake wrote:
> Sure but businesses don't like to upgrade unless they have too.

Granted, but maintaining old releases doesn't come at zero cost. It may
benefit some users, but the relevant question is whether that benefit is
worth the cost. The time someone spends backpatching changes into old
releases (and thoroughly testing those changes, and fixing the
regressions those changes cause) is presumably time that would otherwise
be spent improving the latest release of PostgreSQL.

So when the bugfix is important, has been well-tested, and is unlikely
to cause regressions, backpatching the change to previous stable
releases is a good idea. When this isn't the case (and even more so if
it's a feature and not a bugfix), I don't think it justifies the cost
(and the risk of destabilization) for most users.

In summary, I think the status quo is basically okay. Perhaps we should
backpatch a few more things, but we're basically in the right ballpark.

> In reality in place upgrade will never work.

Perhaps not, but the upgrade story can certainly be made more palatable.
I think that's the actual problem here -- rather than skating around it
by making it less necessary to do the upgrade in the first place, I
think our time is better spent making upgrades as painless as possible.
Just IMHO, of course (especially since I'm not particularly interested
in doing the work on the upgrade process myself).

-Neil




Re: Thoughts on maintaining 7.3

From
"Joshua D. Drake"
Date:
>With 7.4 I'm finding upgrading to be easier.  I'll likely upgrade out 
>production servers to 7.4.0 when it comes out and wind up skipping 7.3 
>altogether.
>  
>

Sure but I am talking about people who are running 7.3 and are happy with
it. The reality is that for probably 95% of the people out there, there
is no reason for 7.4. When you have an existing system that works... why
upgrade? That is one of the benefits of Open Source stuff, we no longer
get forced into un-needed upgrade cycles.

We use PostgreSQL for everything, and I don't have any inclination to 
upgrade to 7.4 except that it is 7.4. I only have two
customers that will see any real benefit from going to 7.4. The rest are 
going to stay on 7.3 because they don't want:

A. The downtime
B. Unknown or unexpected problems
C. A brand new database
D. Migration costs

When you deal with the systems I do, the cost to a customer to migrate
to 7.4 would be in the minimum of 10,000-20,000 dollars.
They start to ask why we're upgrading with those numbers.

That is not to say that 7.4 is not worth it from a technical sense but
for my customers, "If it ain't broke, don't fix it" is a mantra and
the reality is that 7.3 is not broke in their minds. There are
limitations: pg_dump/pg_restore has some issues, having to reindex the
database (which 7.4 doesn't fix), vacuum (which 7.4 doesn't fix), but my
customers accept them as that.

Your mileage may vary but I can only talk from my experience.

Sincerely,

Joshua D. Drake





-- 
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-222-2783 - jd@commandprompt.com - http://www.commandprompt.com
The most reliable support for the most reliable Open Source database.




Re: Thoughts on maintaining 7.3

From
"Joshua D. Drake"
Date:
>Maybe I've mis-read Joshua's intentions, but I got the impression that
>this 7.3 maintainer would follow the patches list and backport patches
>whenever possible. This way folks coding for 7.4/7.5 can stay focused on
>that, but folks who can't upgrade to 7.4 for whatever reason can still
>get some features / improvements. 
>  
>

And bug fixes but yes that is accurate.

Sincerely,

Joshua Drake


-- 
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-222-2783 - jd@commandprompt.com - http://www.commandprompt.com
The most reliable support for the most reliable Open Source database.




Re: Thoughts on maintaining 7.3

From
"Joshua D. Drake"
Date:
>I don't believe anyone would work against this, nor could I imagine that
>anyone would think it was "a bad idea", I'm just curious as to how
>possible it is to do ...
>  
>

For most things probably not that possible. For things like:

Simple feature enhancements (preloading of libs)
Fixing pl/Language bugs (and making sure they still work on 7.3)
Buffer overflow fixes
Security problems (the fact that alter user/createuser with encrypted
password ' will go into the .psql_history file is horrendous)
pg_dump/pg_restore enhancements

Would entirely be possible.

Sincerely,

Joshua Drake





-- 
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-222-2783 - jd@commandprompt.com - http://www.commandprompt.com
The most reliable support for the most reliable Open Source database.




Re: Thoughts on maintaining 7.3

From
Andrew Dunstan
Date:
Joshua D. Drake wrote:

>
> For most things probably not that possible. For things like:
>
> Simple feature enhancements (preloading of libs) 


How long is a piece of string? When does something stop being simple?

>
> Fixing pl/Language bugs (and making sure they still work on 7.3)
> Buffer overflow fixes 


Everyone seems to agree that bugs should be fixed.

>
> Security problems (the fact that alter user/createuser with encrypted 
> password ' will go into a .psqlhistory file is horrendous) 


You can avoid this in the create case by using createuser -P instead of 
psql. Or by using psql -c (although that might put stuff in your shell 
history ;-)
Maybe there's a good case for an alteruser counterpart to createuser.

>
> pg_dump/pg_restore enhancements
>
Which ones? If it is things known to be broken being fixed that comes 
under the bug fix category.

cheers

andrew



Re: Thoughts on maintaining 7.3

From
"scott.marlowe"
Date:
On Wed, 1 Oct 2003, Joshua D. Drake wrote:

> 
> >With 7.4 I'm finding upgrading to be easier.  I'll likely upgrade out 
> >production servers to 7.4.0 when it comes out and wind up skipping 7.3 
> >altogether.
> >  
> >
> 
> Sure but I talking about people who are running 7.3 and are happy with 
> it. The reality is that for probably 95% of the people
> out there , there is no reason for 7.4. When you have existing system 
> that works... why upgrade? That is one of the benefits
> of Open Source stuff, we no longer get force into un-needed upgrade cycles.

Agreed, we've been on 7.2 for a while now because it just works.  
The regex substring introduced in 7.3 was a pretty cool feature, for 
instance, that makes life easy.

> When you deal with the systems I do, the cost to a customer to migrate 
> to 7.4 would be in the minimum of 10,000-20,000 dollars.
> They start to ask why were upgrading with those numbers.

then maybe they would be willing to donate some small amount each ($500 or 
so) to pay for backporting issues.  Since mostly what I'd want on an older 
version would be bug / security fixes, that $500 should go a long way 
towards backporting.

> That is not to say that 7.4 is not worth it from a technical sense but 
> for my customers, "If it ain't broke, don't fix it" is a mantra and
> the reality is that 7.3 is not broke in their minds. There is 
> limitations pg_dump/pg_restore has some issues, having to reindex the 
> database
> (which 7.4 doesn't fix), vacuum (which 7.4 doesn't fix) but my customers 
> accept them as that.

I was under the impression that 7.4 removed the need to reindex caused by 
monotonically increasing index keys, no?

> Your mileage may vary but I can only talk from my experience.

Yeah, I would rather have had more back porting to 7.2 because there were 
tons of little improvements from 7.2 to 7.3 I could have used while 
waiting for 7.4's improved pg_dumpall to come along.

Cheers:-)



Re: Thoughts on maintaining 7.3

From
"Joshua D. Drake"
Date:
>then maybe they would be willing to donate some small amount each ($500 or 
>so) to pay for backporting issues.  Since mostly what I'd want on an older 
>version would be bug / security fixes, that $500 should go a long way 
>towards backporting.
>  
>
Sure.

>I was under the imporession that 7.4 removed the need to reindex caused by 
>monotonically increasing index keys, no?
>  
>

Someone else brought that up. Maybe I am misunderstanding something but
it was my understanding that 7.4 fixes a lot of the issues, but one of
the issues (index bloat), although improved, is not entirely fixed and
thus we would still need reindex?
Tom am I on crack?


>Yeah, I would rather have had more back porting to 7.2 because there were 
>tons of little improvements form 7.2 to 7.3 I could have used while 
>waiting for 7.4's improved pg_dumpall to come along.
>  
>

Well there ya go :)

Sincerely,

Joshua Drake




>Cheers:-)
>  
>

-- 
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-222-2783 - jd@commandprompt.com - http://www.commandprompt.com
Editor-N-Chief - PostgreSQL.Org - http://www.postgresql.org




Re: Thoughts on maintaining 7.3

From
Alvaro Herrera
Date:
On Wed, Oct 01, 2003 at 11:53:12AM -0700, Joshua D. Drake wrote:
> 
> >Eh?  In 7.4 you should not need to reindex.
>
> I thought tom was saying that the index bloat was "better" in 7.4 but it 
> was not gone... thus we would still need reindex yes?

The problem has been "corrected enough" for there to be no need to
reindex, AFAIK.

I think what Tom is concerned about is that this hasn't been tested
enough with big datasets.  Also there is a little loss of index pages but
it's much less (orders of magnitude, I think) than what was before.
This is because the index won't shrink "vertically".

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"I dream about dreams about dreams", sang the nightingale
under the pale moon (Sandman)


Re: Thoughts on maintaining 7.3

From
Robert Treat
Date:
On Wed, 2003-10-01 at 15:31, Joshua D. Drake wrote:
> 
> >then maybe they would be willing to donate some small amount each ($500 or 
> >so) to pay for backporting issues.  Since mostly what I'd want on an older 
> >version would be bug / security fixes, that $500 should go a long way 
> >towards backporting.
> >  
> >
> Sure.
> 

and the question as i thought was being discussed (or should be
discussed) was what is the level of interest in having this work kept in
the community cvs tree vs. someone else's quasi-forked branch... 

Robert Treat
-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL



Re: Thoughts on maintaining 7.3

From
"Joshua D. Drake"
Date:
>and the question as i thought was being discussed (or should be
>discussed) was what is the level of interest in having this work kept in
>the community cvs tree vs. someone else's quasi-forked branch... 
>  
>

It is my thinking that, regardless of commercial backing, the PostgreSQL
project as a whole would gain better validity within the commercial
world if we maintained releases longer.

It is really irrelevant whether somebody pays me or you 500.00 bucks to
make a patch and submit it to the tree. What is relevant IMHO is that
the community is backing a release for longer than 12-18 months.

Yes a commercial company could just pick it up and say ... hey we will 
support it for x (Mammoth 7.3.4 is supported until 2005 for example)
but I was more looking at this from an overall community perspective.

Sincerely,

Joshua D. Drake




>Robert Treat
>  
>

-- 
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-222-2783 - jd@commandprompt.com - http://www.commandprompt.com
Editor-N-Chief - PostgreSQl.Org - http://www.postgresql.org




Re: Thoughts on maintaining 7.3

From
Tom Lane
Date:
"Joshua D. Drake" <jd@commandprompt.com> writes:
> ... having to reindex the database (which 7.4 doesn't fix),

It's supposed to fix it.  What are you expecting not to be fixed?
        regards, tom lane


Re: Thoughts on maintaining 7.3

From
"Joshua D. Drake"
Date:
Hello,
  When I was reading hackers about the fixes you had made, it stated
that the index bloat problems should be better. I took that as meaning
that although it won't be required nearly as often, we may still need
to reindex occasionally. It was later pointed out to me that this may
not be the case, to which I responded:
Tom, am I on crack?

Sincerely,

Joshua Drake


Tom Lane wrote:

>"Joshua D. Drake" <jd@commandprompt.com> writes:
>  
>
>>... having to reindex the database (which 7.4 doesn't fix),
>>    
>>
>
>It's supposed to fix it.  What are you expecting not to be fixed?
>
>            regards, tom lane
>

-- 
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-222-2783 - jd@commandprompt.com - http://www.commandprompt.com
Editor-N-Chief - PostgreSQl.Org - http://www.postgresql.org




Re: Thoughts on maintaining 7.3

From
Tom Lane
Date:
"Joshua D. Drake" <jd@commandprompt.com> writes:
>    When I was reading hackers about the fixes you had made, it stated 
> that the index bloat problems should be better. I took
> that as meaning that although it won't be required nearly as often, we 
> still may need to reindex occasionally.

The critical word there is "may".  The index compression code covers
some cases and not others.  Depending on your usage pattern you might or
might not ever need to reindex.  I *think* that most people won't need
to reindex any more, but I'm waiting on field reports from 7.4 to find
out for sure.

In any case, people who aren't upgrading from 7.3 because they think
7.4 won't help them are making a self-fulfilling negative prophecy.
        regards, tom lane


Re: Thoughts on maintaining 7.3

From
Tom Lane
Date:
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> I think what Tom is concerned about is that this hasn't been tested
> enough with big datasets.  Also there is a little loss of index pages but
> it's much less (orders of magnitude, I think) than what was before.
> This is because the index won't shrink "vertically".

The fact that we won't remove levels shouldn't be meaningful at all ---
I mean, if the index was once big enough to require a dozen btree
levels, and you delete everything, are you going to be upset that it
drops to 13 pages rather than 2?  I doubt it.

The reason I'm waffling about whether the problem is completely fixed or
not is that the existing code will only remove-and-recycle completely
empty btree pages.  As long as you have one key left on a page it will
stay there.  So you could end up with ridiculously low percentage-filled
situations.  This could be fixed by collapsing together adjacent
more-than-half-empty pages, but we ran into a lot of problems trying to
do that in a concurrent fashion.  So I'm waiting to find out if real
usage patterns have a significant issue with this or not.

For example, if you have a timestamp index and you routinely clean out
all entries older than N-days-ago, you won't have a problem in 7.4.
If your pattern is to delete nine out of every ten entries (maybe you
drop minute-by-minute entries and keep only hourly entries after awhile)
then you might find the index loading getting unpleasantly low.  We'll
have to see whether it's a problem in practice.  I'm willing to revisit
the page-merging problem if it's proven to be a real practical problem,
but it looked hard enough that I think it's more profitable to spend the
development effort elsewhere until it's proven necessary.
        regards, tom lane
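
To illustrate the distinction Tom draws here, the following Python sketch models btree leaf pages as fixed-capacity lists and recycles only pages that end up completely empty, which is the 7.4 behaviour described above. The function name `simulate_index`, the page capacity of 100 keys, and the key counts are assumptions made for the example; this is a toy model, not PostgreSQL's actual btree code.

```python
def simulate_index(num_keys, keys_per_page, keep):
    """Return (pages_before, pages_after, avg_fill) after a bulk delete.

    keep(k) returns True if key k survives the delete. Only pages left
    with zero keys are recycled; partially filled pages stay in place.
    """
    # Pack the keys, in order, into fixed-capacity leaf pages.
    pages = [list(range(start, min(start + keys_per_page, num_keys)))
             for start in range(0, num_keys, keys_per_page)]
    pages_before = len(pages)

    # Apply the delete, then recycle only the completely empty pages.
    survivors = [[k for k in page if keep(k)] for page in pages]
    remaining = [page for page in survivors if page]

    live = sum(len(page) for page in remaining)
    avg_fill = live / (len(remaining) * keys_per_page) if remaining else 0.0
    return pages_before, len(remaining), avg_fill


# Hypothetical timestamp-like index: 10,000 keys, 100 keys per page.
# Pattern 1: drop everything older than a cutoff (keys below 9000).
print(simulate_index(10_000, 100, lambda k: k >= 9_000))   # (100, 10, 1.0)

# Pattern 2: keep only one entry in ten across the whole range.
print(simulate_index(10_000, 100, lambda k: k % 10 == 0))  # (100, 100, 0.1)
```

Under these assumptions both patterns keep 1,000 keys, but the range delete frees 90 whole pages and leaves the rest full, while the one-in-ten pattern frees no pages at all and leaves every page 10% full.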


Re: Thoughts on maintaining 7.3

From
Tom Lane
Date:
Robert Treat <xzilla@users.sourceforge.net> writes:
> and the question as i thought was being discussed (or should be
> discussed) was what is the level of interest in having this work kept in
> the community cvs tree vs. someone else's quasi-forked branch... 

I see no reason that the maintenance shouldn't be done in the community
CVS archive.  The problem is where to find the people who want to do it.
Of course we have to trust those people enough to give them write access
to the community archive, but if they can't be trusted with that, one
wonders who's going to trust their work product either.
        regards, tom lane


Re: Thoughts on maintaining 7.3

From
Rod Taylor
Date:
> For example, if you have a timestamp index and you routinely clean out
> all entries older than N-days-ago, you won't have a problem in 7.4.
> If your pattern is to delete nine out of every ten entries (maybe you
> drop minute-by-minute entries and keep only hourly entries after awhile)
> then you might find the index loading getting unpleasantly low.  We'll
> have to see whether it's a problem in practice.  I'm willing to revisit
> the page-merging problem if it's proven to be a real practical problem,
> but it looked hard enough that I think it's more profitable to spend the
> development effort elsewhere until it's proven necessary.

A pattern I have on a few tables is to record daily data.  After a
period of time, create an entry for a week that is the sums of 7 days,
after another period of time compress 4 weeks into a month.

Index is on the date representing the block. It's a new insert, but
would go onto the old page. Anyway, I don't have that much data (~20M
rows) -- but I believe it is a real-world example of this pattern.
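
The roll-up pattern described here can be sketched in a few lines of Python. The function name `roll_up`, the start date, and the one-unit-per-day data are hypothetical; the sketch only shows how many index keys survive each compression step, not how the real tables are maintained.

```python
from datetime import date, timedelta


def roll_up(rows, group_size):
    """Collapse consecutive rows into blocks of group_size.

    rows is a list of (block_start_date, total) sorted by date. Each
    output row is keyed by the first date in its block and carries the
    sum of the totals in that block.
    """
    out = []
    for i in range(0, len(rows), group_size):
        chunk = rows[i:i + group_size]
        out.append((chunk[0][0], sum(total for _, total in chunk)))
    return out


start = date(2003, 1, 1)
daily = [(start + timedelta(days=n), 1) for n in range(28)]  # 28 daily rows
weekly = roll_up(daily, 7)     # 7 days summed into each weekly row
monthly = roll_up(weekly, 4)   # 4 weeks summed into one monthly row
print(len(daily), len(weekly), len(monthly))  # 28 4 1
```

Each step keeps roughly one date key out of several in the same key range, which is the kind of thinning delete that leaves partially filled index pages behind.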

Re: Thoughts on maintaining 7.3

From
"Joshua D. Drake"
Date:
Hello,
 Possible scenario for maintaining 7.3:

 Only one or two committers using a two-stage cvs... one stage for
testing (not including sandbox), one stage for commit.

 Scheduled releases based on non-critical fixes. Quarterly? Of course
critical fixes should be released as soon as plausible.

 Separate mailing list for 7.3 issues, concerns etc... Which would help
develop its own temporary community.

Thoughts?

Joshua D. Drake


Tom Lane wrote:

>Robert Treat <xzilla@users.sourceforge.net> writes:
>  
>
>>and the question as i thought was being discussed (or should be
>>discussed) was what is the level of interest in having this work kept in
>>the community cvs tree vs. someone else's quasi-forked branch... 
>>    
>>
>
>I see no reason that the maintenance shouldn't be done in the community
>CVS archive.  The problem is where to find the people who want to do it.
>Of course we have to trust those people enough to give them write access
>to the community archive, but if they can't be trusted with that, one
>wonders who's going to trust their work product either.
>
>            regards, tom lane
>

-- 
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-222-2783 - jd@commandprompt.com - http://www.commandprompt.com
Editor-N-Chief - PostgreSQl.Org - http://www.postgresql.org




Re: Thoughts on maintaining 7.3

From
Bruno Wolff III
Date:
On Thu, Oct 02, 2003 at 10:47:06 -0700, "Joshua D. Drake" <jd@commandprompt.com> wrote:
> Hello,
> 
>  Possible scenario for maintaining 7.3:
> 
>  Only one or two committers  using a two stage cvs... one stage for 
> testing (not including sandbox), one stage for commit.
>  Scheduled releases based on non-critical fixes. Quarterly? Of course 
> critical fixes should be released as soon as plausible.
> 
>  Separate mailing list for 7.3 issues, concerns etc... Which would help 
> develop its own temporary community.
> 
> Thoughts?

It might be better to split into two different trees. One just gets bug fixes,
the other gets bug fixes plus enhancements that won't require an initdb.


Re: Thoughts on maintaining 7.3

From
Andrew Sullivan
Date:
On Thu, Oct 02, 2003 at 02:15:33PM -0500, Bruno Wolff III wrote:
> It might be better to split into two different trees. One just gets bug fixes,
> the other gets bug fixes plus enhancements that won't require an initdb.

Yes, please.  Please, please do not force all users to accept new
features in "stable" trees.  

-- 
----
Andrew Sullivan                         204-4141 Yonge Street
Afilias Canada                        Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110



Re: Thoughts on maintaining 7.3

From
"Joshua D. Drake"
Date:
>Yes, please.  Please, please do not force all users to accept new
>features in "stable" trees.  
>  
>
What if the feature does break compatibility with old features?
What if it is "truly" a new feature?

One example would be that we are considering reworking
pg_dump/restore a bit to support batch uploads and interactive mode.
It would not break compatibility with anything but would
greatly enhance one's ability to actually backup and restore
large volume sets.

Sincerely,

Joshua Drake





-- 
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC - S/JDBC
Postgresql support, programming, shared hosting and dedicated hosting.
+1-503-222-2783 - jd@commandprompt.com - http://www.commandprompt.com
PostgreSQL.Org - Editor-N-Chief - http://www.postgresql.org




Re: Thoughts on maintaining 7.3

From
Doug McNaught
Date:
"Joshua D. Drake" <jd@commandprompt.com> writes:

> >Yes, please.  Please, please do not force all users to accept new
> > features in "stable" trees.
> What if the feature does break compatibility with old features?
> What if it is "truly" a new feature?
> 
> One example would be that we are considering reworking
> pg_dump/restore a bit to support batch uploads and interactive mode.
> It would not break compatibility with anything but would
> greatly enhance one's ability to actually backup and restore
> large volume sets.

Well, since those are separate programs and not intimately tied to the
backend, you could distribute them separately for people who need
them...

-Doug


Re: Thoughts on maintaining 7.3

From
"Marc G. Fournier"
Date:

On Fri, 3 Oct 2003, Joshua D. Drake wrote:

>
> >Yes, please.  Please, please do not force all users to accept new
> >features in "stable" trees.
> >
> >
> What if the feature does break compatibility with old features?
> What if it is "truly" a new feature?
>
> One example would be that we are considering reworking
> pg_dump/restore a bit to support batch uploads and interactive mode.
> It would not break compatibility with anything but would
> greatly enhance one's ability to actually backup and restore
> large volume sets.

for stuff like this, why not just break off a gborg project for it,
separate from the distros?  We could pull in the changes as beta starts on
a dev cycle, but then pg_dump/pg_restore could be maintained on its own
release cycle, and you could easily get 'back features' in like this ...



Re: Thoughts on maintaining 7.3

From
Bruce Momjian
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> > I think what Tom is concerned about is that this hasn't been tested
> > enough with big datasets.  Also there is a little loss of index pages but
> > it's much less (orders of magnitude, I think) than what was before.
> > This is because the index won't shrink "vertically".
> 
> The fact that we won't remove levels shouldn't be meaningful at all ---
> I mean, if the index was once big enough to require a dozen btree
> levels, and you delete everything, are you going to be upset that it
> drops to 13 pages rather than 2?  I doubt it.
> 
> The reason I'm waffling about whether the problem is completely fixed or
> not is that the existing code will only remove-and-recycle completely
> empty btree pages.  As long as you have one key left on a page it will
> stay there.  So you could end up with ridiculously low percentage-filled
> situations.  This could be fixed by collapsing together adjacent
> more-than-half-empty pages, but we ran into a lot of problems trying to
> do that in a concurrent fashion.  So I'm waiting to find out if real
> usage patterns have a significant issue with this or not.

Though the new code will put empty index pages into the free-space map,
will it also shrink the index file to remove those pages?  For example,
if I have 200M rows in a table, and I delete all of them except 100,
does the index shrink, or do the pages just become available for reuse?
With VACUUM FULL, we have a way to shrink the heap.  Do we shrink the
index?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073


Re: Thoughts on maintaining 7.3

From
Bruce Momjian
Date:
Andrew Sullivan wrote:
> On Thu, Oct 02, 2003 at 02:15:33PM -0500, Bruno Wolff III wrote:
> > It might be better to split into two different trees. One just gets bug fixes,
> > the other gets bug fixes plus enhancements that won't require an initdb.
> 
> Yes, please.  Please, please do not force all users to accept new
> features in "stable" trees.  

One word of warning --- PostgreSQL has grown partially because we gain
people but rarely lose them, and our stable releases help that.  I was
talking to someone about OS/X recently and the frequent breakage in
their OS releases is hurting their adoption rate --- you hit one or two
buggy releases in a row, and you start thinking about using something
else --- same is true for buggy Linux kernels, which Andrew described
earlier.

If we are going to back-patch more aggressively, we _have_ to be sure
that those back-patched releases have the same quality as all our other
releases.

I know people already know this, but it is worth mentioning specifically
--- my point is that more aggressive backpatching has risks.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073


Re: Thoughts on maintaining 7.3

From
"Nigel J. Andrews"
Date:
On Fri, 3 Oct 2003, Andrew Sullivan wrote:

> On Thu, Oct 02, 2003 at 02:15:33PM -0500, Bruno Wolff III wrote:
> > It might be better to split into two different trees. One just gets bug fixes,
> > the other gets bug fixes plus enhancements that won't require an initdb.
> 
> Yes, please.  Please, please do not force all users to accept new
> features in "stable" trees.  

I wanted to say something similar earlier in this thread.

To me the stable branches are not for feature introduction. If features are
going to be introduced it is better to not have them applied in a manner which
means a pure bug fix only version can't be obtained. Obviously this means
having two branches if features are going to be introduced.

I agree sometimes one looks at new developments and thinks how good it would be
to have that feature, imagine what it'll be like when tablespaces are
introduced and you're using the previous stable version, but those features
need to be kept separate from the version that fixes that particularly nasty
index corruption someone only provided a fix for 12 months after the version
you have based your system around was released. One could argue that what is
really needed is a collection of patches providing a pick and choose facility
for features, with dependencies where unavoidable of course. The patches being
applicable to the latest bug patched version of the stable branch.

As an example take tsearch2. If that were core code, not optional, contrib
material, and one was running a 7.3 series server but wanted the nifty features
of tsearch2 instead of tsearch, would you expect all people upgrading within
the stable 7.3 branch for bug fixes to be forced to use tsearch2 and not
tsearch?


-- 
Nigel J. Andrews



Re: Thoughts on maintaining 7.3

From
"Joshua D. Drake"
Date:
>If we are going to back-patch more aggressively, we _have_ to be sure
>that those back-patched releases have the same quality as all our other
>releases.
>  
>
I know that I am probably being semantic here, but I in no way want to
be more aggressive with back patching. My thoughts for 98% of things are
on bug fixes within the existing tree only, although I am sure for some
things we can use (at least as a guide) code being written in 7.4.  My
whole purpose in bringing the idea up is to increase the adoption rate.

My thought isn't to be more aggressive per se, but more responsible in
our releases. Like I said, I may be being semantic.

Sincerely,

Joshua Drake





>I know people already know this, but it is worth mentioning specifically
>--- my point is that more aggressive backpatching has risks.
>
>  
>

-- 
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-222-2783 - jd@commandprompt.com - http://www.commandprompt.com
Editor-N-Chief - PostgreSQl.Org - http://www.postgresql.org




Re: Thoughts on maintaining 7.3

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Though the new code will put empty index pages into the free-space map,
> will it also shrink the index file to remove those pages?

If there are free pages at the end, yes --- but it won't move pages
around.  This is about the same story as for plain VACUUM ...
        regards, tom lane
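
As a rough picture of the behaviour Tom describes (free pages are given back only when they sit at the end of the file, and nothing is moved to get them there), here is a small Python sketch. The function name `truncate_trailing_free` and the boolean page map are assumptions for illustration, not PostgreSQL structures.

```python
def truncate_trailing_free(pages):
    """Drop free pages from the end of the file without moving any page.

    pages is a list of booleans, True meaning the page is still in use.
    Free pages in the middle of the file stay where they are and are
    merely available for reuse.
    """
    end = len(pages)
    while end > 0 and not pages[end - 1]:
        end -= 1
    return pages[:end]


# Two free pages in the middle, two at the end: only the tail is released.
print(truncate_trailing_free([True, False, False, True, False, False]))
# [True, False, False, True]
```

In this toy case the file shrinks from six pages to four, and the two interior free pages remain part of the file.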


Re: Thoughts on maintaining 7.3

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Though the new code will put empty index pages into the free-space map,
> > will it also shrink the index file to remove those pages?
> 
> If there are free pages at the end, yes --- but it won't move pages
> around.  This is about the same story as for plain VACUUM ...

I know indexes behave the same as heap for vacuum.  My point was that
the vacuum full case is different.  Vacuum full moves heap tuples from
the end to fill slots and then frees the pages at the end via
truncation.  (100% compaction, guaranteed.)  We can't move index tuples
around like that, of course, so that leaves us with partially filled
pages.

Do we move empty index pages to the end before truncation during vacuum
full?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073


Re: Thoughts on maintaining 7.3

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Do we move empty index pages to the end before truncation during vacuum
> full?

No.  You'd be better off using REINDEX for that, I think.  IIRC we have
speculated about making VAC FULL fix the indexes via REINDEX rather than
indexbulkdelete.
        regards, tom lane


Re: Thoughts on maintaining 7.3

From
Alvaro Herrera
Date:
On Sat, Oct 04, 2003 at 11:41:17AM -0400, Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Do we move empty index pages to the end before truncation during vacuum
> > full?
> 
> No.  You'd be better off using REINDEX for that, I think.  IIRC we have
> speculated about making VAC FULL fix the indexes via REINDEX rather than
> indexbulkdelete.

I can't agree with that idea.  Imagine having to VACUUM FULL a huge
table.  Not only will it take the time required to do the VACUUM in the
heap itself, it will also have to rebuild all indexes from scratch.  I
think there are scenarios where the REINDEX will be much worse, say when
there are not too many deleted tuples (but in that case, why is the user
doing VACUUM FULL in the first place?).  Of course there are also
scenarios where the opposite is true.

I wonder if VACUUM FULL could choose what method to use based on some
statistics.

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Vivir y dejar de vivir son soluciones imaginarias.
La existencia está en otra parte" (Andre Breton)


Re: Thoughts on maintaining 7.3

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Do we move empty index pages to the end before truncation during vacuum
> > full?
> 
> No.  You'd be better off using REINDEX for that, I think.  IIRC we have
> speculated about making VAC FULL fix the indexes via REINDEX rather than
> indexbulkdelete.

I guess my point is that if you forget to run regular vacuum for a
month, then realize the problem, you can just do a VACUUM FULL and the
heap is back to a perfect state as if you had been running regular
vacuum all along.  That is not true of indexes.  It would be nice if it
would.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073


Re: Thoughts on maintaining 7.3

From
Alvaro Herrera
Date:
On Sat, Oct 04, 2003 at 11:17:09PM -0400, Bruce Momjian wrote:
> Tom Lane wrote:
> > Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > > Do we move empty index pages to the end before truncation during vacuum
> > > full?
> > 
> > No.  You'd be better off using REINDEX for that, I think.  IIRC we have
> > speculated about making VAC FULL fix the indexes via REINDEX rather than
> > indexbulkdelete.
> 
> I guess my point is that if you forget to run regular vacuum for a
> month, then realize the problem, you can just do a VACUUM FULL and the
> heap is back to a perfect state as if you had been running regular
> vacuum all along.  That is not true of indexes.  It would be nice if it
> would.

In this scenario, the VACUUM FULL-does-REINDEX idea would be the perfect
fit because it will probably be much faster than doing indexbulkdelete.

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Endurecerse, pero jamás perder la ternura" (E. Guevara)


Re: Thoughts on maintaining 7.3

From
Tom Lane
Date:
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> On Sat, Oct 04, 2003 at 11:41:17AM -0400, Tom Lane wrote:
>> No.  You'd be better off using REINDEX for that, I think.  IIRC we have
>> speculated about making VAC FULL fix the indexes via REINDEX rather than
>> indexbulkdelete.

> I can't agree with that idea.

Why not?  There is plenty of anecdotal evidence in the archives saying
that it's faster to drop indexes, VACUUM FULL, recreate indexes than
to VACUUM FULL with indexes in place.  Most of those reports date from
before we had the lazy-vacuum alternative, but I don't think that
renders them less relevant.

> Imagine having to VACUUM FULL a huge
> table.  Not only it will take the lot required to do the VACUUM in the
> heap itself, it will also have to rebuild all indexes from scratch.

A very large chunk of VACUUM FULL's runtime is spent fooling with the
indexes.  Have you looked at the code in any detail?  It goes like this:

1. Scan heap looking for dead tuples and free space.

2. Make a pass over the indexes to delete index entries for dead tuples.

3. Copy remaining live tuples to lower-numbered pages to compact heap.

3a.  Every time we copy a tuple, make new index entries pointing to its
     new location.  (The old index entries still remain, though.)

4. Commit transaction so that new copies of moved tuples are good and
   old ones are not.

5. Make a pass over the indexes to delete index entries for old copies
   of moved tuples.

When there are only a few tuples being moved, this isn't too bad of a
strategy.  But when there are lots, steps 2, 3a, and 5 represent a huge
amount of work.  What's worse, step 3a swells the index well beyond its
final size.  This used to mean permanent index bloat.  Nowadays step 5
will be able to recover some of that space --- but not at zero cost.

I think it's entirely plausible that dropping steps 2, 3a, and 5 in
favor of an index rebuild at the end could be a winner.

> I think there are scenarios where the REINDEX will be much worse, say when
> there are not too many deleted tuples (but in that case, why is the user
> doing VACUUM FULL in the first place?).

Yeah, I think that's exactly the important point.  These days there's
not a lot of reason to do VACUUM FULL unless you have a major amount of
restructuring to do.  I would once have favored maintaining two code
paths with two strategies, but now I doubt it's worth the trouble.
(Or I should say, we have two code paths, the other being lazy VACUUM
--- do we need three?)
        regards, tom lane
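
The trade-off between the step list above and a rebuild-at-the-end strategy can be put into rough numbers. The Python sketch below simply counts index entries touched per index; the function names and tuple counts are hypothetical, and real costs also depend on factors this ignores (full index scans in the delete passes, and the lower per-entry cost of a bulk build compared with individual insertions).

```python
def in_place_index_work(live, dead, moved):
    """Index entries touched by steps 2, 3a and 5, plus peak index size."""
    step2_deletes = dead     # remove entries that point at dead tuples
    step3a_inserts = moved   # add an entry for each relocated tuple
    step5_deletes = moved    # remove entries for the old tuple copies
    touched = step2_deletes + step3a_inserts + step5_deletes
    peak_entries = live + moved  # old and new entries coexist until step 5
    return touched, peak_entries


def rebuild_index_work(live):
    """Entries written, and peak size, if the index is rebuilt once at the end."""
    return live, live


# Few tuples moved: maintaining the index in place touches far fewer entries.
print(in_place_index_work(1_000_000, 1_000, 1_000))        # (3000, 1001000)
print(rebuild_index_work(1_000_000))                       # (1000000, 1000000)

# Heavy restructuring: in-place maintenance touches more entries than a rebuild.
print(in_place_index_work(1_000_000, 4_000_000, 900_000))  # (5800000, 1900000)
```

With these made-up figures the first case matches Alvaro's worry about rebuilding for a handful of deleted tuples, and the second matches Tom's argument that steps 2, 3a and 5 dominate when a lot of tuples move.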


Re: Thoughts on maintaining 7.3

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Tom Lane wrote:
>> No.  You'd be better off using REINDEX for that, I think.

> I guess my point is that if you forget to run regular vacuum for a
> month, then realize the problem, you can just do a VACUUM FULL and the
> heap is back to a perfect state as if you had been running regular
> vacuum all along.  That is not true of indexes.  It would be nice if it
> would.

A VACUUM FULL that invoked REINDEX would accomplish that *better* than
one that didn't, because of the problem of duplicate entries for moved
tuples.  See my response just now to Alvaro.
        regards, tom lane


Re: Thoughts on maintaining 7.3

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Tom Lane wrote:
> >> No.  You'd be better off using REINDEX for that, I think.
> 
> > I guess my point is that if you forget to run regular vacuum for a
> > month, then realize the problem, you can just do a VACUUM FULL and the
> > heap is back to a perfect state as if you had been running regular
> > vacuum all along.  That is not true of indexes.  It would be nice if it
> > would.
> 
> A VACUUM FULL that invoked REINDEX would accomplish that *better* than
> one that didn't, because of the problem of duplicate entries for moved
> tuples.  See my response just now to Alvaro.

Right, REINDEX is closer to what you expect VACUUM FULL to be doing ---
it mimics the heap result of full compaction.

I think Alvaro's point is that if you are doing VACUUM FULL on a large
table with only a few expired tuples, the REINDEX could take a while,
which would seem strange considering you only have a few expired tuples
--- maybe we should reindex only if +10% of the heap rows are expired,
or the index contains +10% empty space, or something like that.

Of course, that is very arbitrary, but only VACUUM knows how many rows it
is moving --- the user typically will not know that.

In an extreme case with always REINDEX, I can imagine a site that is
doing only VACUUM FULL at night, but no regular vacuums, and they find
they can't do VACUUM FULL at night anymore because it is taking too
long.  By doing REINDEX always, we eliminate some folks who are happy
doing VACUUM FULL at night, because very few tuples are expired.
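
The threshold idea floated above could look something like the following Python sketch. The function name `should_reindex`, its parameters, and the way the two fractions are obtained are all assumptions for illustration; nothing like this exists in VACUUM as discussed in this thread.

```python
def should_reindex(expired_rows, total_rows, empty_index_fraction,
                   threshold=0.10):
    """Decide whether a hypothetical VACUUM FULL should rebuild indexes.

    Rebuild when more than `threshold` of the heap rows are expired, or
    when more than `threshold` of the index space is empty.
    """
    expired_fraction = expired_rows / total_rows if total_rows else 0.0
    return expired_fraction > threshold or empty_index_fraction > threshold


print(should_reindex(5, 1000, 0.02))    # False: few expired rows, dense index
print(should_reindex(200, 1000, 0.02))  # True: 20% of heap rows expired
print(should_reindex(5, 1000, 0.30))    # True: index is 30% empty
```

The 10% cutoff is the arbitrary figure from the message; the point of the sketch is only that the decision needs numbers VACUUM has and the user typically does not.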

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073


Re: Thoughts on maintaining 7.3

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> By doing REINDEX always, we eliminate some folks who are happy
> doing VACUUM FULL at night, because very few tuples are expired.

But if they have very few tuples expired, why do they need VACUUM FULL?
Seems to me that VACUUM FULL should be designed to cater to the case
of significant updates.
        regards, tom lane


Re: Thoughts on maintaining 7.3

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > By doing REINDEX always, we eliminate some folks who are happy
> > doing VACUUM FULL at night, because very few tuples are expired.
> 
> But if they have very few tuples expired, why do they need VACUUM FULL?
> Seems to me that VACUUM FULL should be designed to cater to the case
> of significant updates.

Right, they could just run vacuum, and my 10% idea was bad because the
vacuum full would take an unpredictable amount of time to run depending
on whether it does a reindex.

One idea would be to allow VACUUM, VACUUM DATA (no reindex), and VACUUM
FULL (reindex).  However, as you said, we might not need VACUUM DATA ---
I am just not sure.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073


Re: Thoughts on maintaining 7.3

From
Alvaro Herrera
Date:
On Sat, Oct 04, 2003 at 11:53:49PM -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@dcc.uchile.cl> writes:

> > Imagine having to VACUUM FULL a huge
> > table.  Not only it will take the lot required to do the VACUUM in the
> > heap itself, it will also have to rebuild all indexes from scratch.
> 
> A very large chunk of VACUUM FULL's runtime is spent fooling with the
> indexes.  Have you looked at the code in any detail?  It goes like this:

Hmm.  No, I haven't looked at that code too much.  You are probably
right, of course.  Maybe the indexes could be dropped altogether and
then recreated after the vacuum is over, similar to what the cluster
code does.  This would be similar to REINDEX, I suppose.  (I haven't
actually looked at the REINDEX code either.)


> > I think there are scenarios where the REINDEX will be much worse, say when
> > there are not too many deleted tuples (but in that case, why is the user
> > doing VACUUM FULL in the first place?).
> 
> Yeah, I think that's exactly the important point.  These days there's
> not a lot of reason to do VACUUM FULL unless you have a major amount of
> restructuring to do.  I would once have favored maintaining two code
> paths with two strategies, but now I doubt it's worth the trouble.
> (Or I should say, we have two code paths, the other being lazy VACUUM
> --- do we need three?)

There are two points that could be made here:

1. We do not want users having to think too hard about what kind of
VACUUM they want.  This probably botches Bruce's idea of an additional
VACUUM DATA command.

2. We do not want to expose the VACUUM command family at all.  The
decisions about what code paths should be taken are best left to the
backend-integrated vacuum daemon, which has probably much better
information than users.

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"You knock on that door or the sun will be shining on places inside you
that the sun doesn't usually shine" (en Death: "The High Cost of Living")


Re: Thoughts on maintaining 7.3

From
Bruce Momjian
Date:
Alvaro Herrera wrote:
> > Yeah, I think that's exactly the important point.  These days there's
> > not a lot of reason to do VACUUM FULL unless you have a major amount of
> > restructuring to do.  I would once have favored maintaining two code
> > paths with two strategies, but now I doubt it's worth the trouble.
> > (Or I should say, we have two code paths, the other being lazy VACUUM
> > --- do we need three?)
> 
> There are two points that could be made here:
> 
> 1. We do not want users having to think too hard about what kind of
> VACUUM they want.  This probably botches Bruce's idea of an additional
> VACUUM DATA command.
> 
> 2. We do not want to expose the VACUUM command family at all.  The
> decisions about what code paths should be taken are best left to the
> backend-integrated vacuum daemon, which has probably much better
> information than users.

Agreed.  We need to head in a direction where vacuum is automatic.  I
guess the question is whether an automatic method would ever use VACUUM
DATA?

I just did a simple test.  I did:

    test=> CREATE TABLE test (x INT, y TEXT);
    CREATE TABLE
    test=> INSERT INTO test VALUES (1,'lk;jasdflkjlkjawsiopfjqwerfokjasdflkj');
    INSERT 17147 1
    test=> INSERT INTO test SELECT * FROM test;
    { repeat until 65k rows are inserted, so there are 131k rows}
    test=> INSERT INTO test SELECT 2, y FROM test;
    INSERT 0 131072
    test=> DELETE FROM test WHERE x=1;
    DELETE 131072
    test=> \timing
    Timing is on.
    test=> VACUUM FULL;
    VACUUM
    Time: 4661.82 ms
    test=> INSERT INTO test SELECT 3, y FROM test;
    INSERT 0 131072
    Time: 7925.57 ms
    test=> CREATE INDEX i ON test(x);
    CREATE INDEX
    Time: 3337.96 ms
    test=> DELETE FROM test WHERE x=2;
    DELETE 131072
    Time: 3204.18 ms
    test=> VACUUM FULL;
    VACUUM
    Time: 10523.69 ms
    test=> REINDEX TABLE test;
    REINDEX
    Time: 2193.14 ms
 


Now, as I understand it, this is the worst case for VACUUM FULL.  What
we have here is 4661.82 ms for VACUUM FULL without an index, 10523.69 ms
for VACUUM FULL with an index, and 2193.14 ms for REINDEX.  If we assume
VACUUM FULL with REINDEX will equal the time of VACUUM FULL without the
index plus the REINDEX time, we have 4661.82 + 2193.14, or 6854.96 ms vs.
10523.69 ms, so clearly VACUUM FULL with REINDEX is a win for this case.
What I don't know is what percentage of a table has to be expired for
REINDEX to be a win.  I assume if only one row is expired, you get
4661.82 + 2193.14 vs. just 4661.82, roughly.
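
The comparison above can be written out as a quick calculation.  What
follows is only a minimal Python sketch, not part of the original test:
it takes the three timings from the psql session and assumes, as the
paragraph above does, that VACUUM FULL with REINDEX would cost about as
much as VACUUM FULL on the unindexed table plus a separate REINDEX.  The
function and variable names are invented for illustration.

```python
# Timings in milliseconds, copied from the psql session in this message.
VACUUM_FULL_NO_INDEX_MS = 4661.82
VACUUM_FULL_WITH_INDEX_MS = 10523.69
REINDEX_MS = 2193.14


def estimate_vacuum_full_with_reindex(vacuum_no_index_ms, reindex_ms):
    """Assumed cost model: heap-only VACUUM FULL plus a separate REINDEX."""
    return vacuum_no_index_ms + reindex_ms


estimate_ms = estimate_vacuum_full_with_reindex(VACUUM_FULL_NO_INDEX_MS, REINDEX_MS)
saving_ms = VACUUM_FULL_WITH_INDEX_MS - estimate_ms
speedup = VACUUM_FULL_WITH_INDEX_MS / estimate_ms

print(f"estimated VACUUM FULL + REINDEX: {estimate_ms:.2f} ms")
print(f"measured VACUUM FULL with index: {VACUUM_FULL_WITH_INDEX_MS:.2f} ms")
print(f"estimated saving: {saving_ms:.2f} ms ({speedup:.2f}x)")
```

This model covers only the one measured case, where half of the table
was expired.  It says nothing about where the break-even point lies when
only a few rows are expired, which is the open question above.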

 


Re: Thoughts on maintaining 7.3

From
Andrew Sullivan
Date:
On Fri, Oct 03, 2003 at 09:17:16AM -0700, Joshua D. Drake wrote:
> >
> What if the feature does break compatibility with old features?
> What if it is "truly" a new feature?

There is _no_ mechanism in the community right now for testing all
these new features in the so-called stable tree.

I have lately been taking the position that Linux is only a
second-best choice for production use, precisely because of the
constant introduction of shiny new features in the supposed stable
branch.  Without using something like RHAS or Debian stable, I think
one is asking for trouble.  One needs to do a great deal of testing
on any new kernel release -- even a dot release -- just to be
reasonably confident that it won't eat filesystems, introduce some
new incompatibility, &c.  From my point of view, this is similar to
the position one is in with Windows: you need to quintuple-check
every security patch and hot fix, because it is as likely as not to
break something very badly.

One of the things I have always liked about PostgreSQL is that a
stable release really is stable.  Except for mighty serious, low risk
items, nothing gets backported to the stable releases.  If _you_ want
to back port things, go nuts: the source is there.  But the main
release does not get changed that way.

That's a good thing.  I happen to oversee one of those installations
you mentioned, where it costs us lots of money to upgrade.  If people
start adding features to the stable tree, it will cost me almost as
much to keep up with the small, important, must-be-applied fixes as
it would to upgrade.  Because I know those features won't receive the
testing they really need, I'll have no choice but to hammer on them
all myself.  In the current situation, those happen infrequently
enough that I can do it.  But if one starts introducing all sorts of
extra features, I'll have to test _all_ of it.  Or start maintaining
a completely separate tree into which I put only the few patches I
want.

Why should everyone pay that cost for the sake of those people who
want to eat their cake and have it too?  If you want the new
features, you gotta pay the cost of the upgrade, or pay someone else
to support the new features for you.

A

-- 
----
Andrew Sullivan                         204-4141 Yonge Street
Afilias Canada                        Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110



Re: Thoughts on maintaining 7.3

From
Bruce Momjian
Date:
Andrew Sullivan wrote:
> On Fri, Oct 03, 2003 at 09:17:16AM -0700, Joshua D. Drake wrote:
> > >
> > What if the feature does break compatibility with old features?
> > What if it is "truly" a new feature?
> 
> There is _no_ mechanism in the community right now for testing all
> these new features in the so-called stable tree.
> 
> I have lately been taking the position that Linux is only a
> second-best choice for production use, precisely because of the
> constant introduction of shiny new features in the supposed stable
> branch.  Without using something like RHAS or Debian stable, I think
> one is asking for trouble.  One needs to do a great deal of testing

Agreed.  Great Bridge was going to test our releases and only distribute
the good ones --- obviously they were thinking of Linux kernels and not
PostgreSQL.  You almost need a commercial company to do testing with
Linux kernels.   PostgreSQL doesn't require this, and I think Linux is
popular _in_ _spite_ of their buggy backported kernels (odd numbers?),
not because of it.

 


Re: Thoughts on maintaining 7.3

From
Andrew Dunstan
Date:
Bruce Momjian wrote:

>Andrew Sullivan wrote:
>
>>On Fri, Oct 03, 2003 at 09:17:16AM -0700, Joshua D. Drake wrote:
>>
>>>What if the feature does break compatibility with old features?
>>>What if it is "truly" a new feature?
>>>
>>There is _no_ mechanism in the community right now for testing all
>>these new features in the so-called stable tree.
>>
>>I have lately been taking the position that Linux is only a
>>second-best choice for production use, precisely because of the
>>constant introduction of shiny new features in the supposed stable
>>branch.  Without using something like RHAS or Debian stable, I think
>>one is asking for trouble.  One needs to do a great deal of testing
>>
>
>Agreed.  Great Bridge was going to test our releases and only distribute
>the good ones --- obviously they were thinking of Linux kernels and not
>PostgreSQL.  You almost need a commercial company to do testing with
>Linux kernels.   PostgreSQL doesn't require this, and I think Linux is
>popular _in_ _spite_ of their buggy backported kernels (odd numbers?),
>not because of it.
>
The reason there is a lot of backporting in Linux kernels is that there 
is such a lot of time (2 years or more) between major kernel releases. 
This is not surprising given the kernel's complexity, but it is not the 
case here, with releases every 6 months or so.

In general I agree that only true bug fixes should go in later versions 
of official releases after they are out - if anyone wants to backpatch 
features they can, but then they wear the risk. Do it on GBorg if you 
like, but not in the main tree.

cheers

andrew



Re: Thoughts on maintaining 7.3

From
Bruce Momjian
Date:
Andrew Dunstan wrote:
> >Agreed.  Great Bridge was going to test our releases and only distribute
> >the good ones --- obviously they were thinking of Linux kernels and not
> >PostgreSQL.  You almost need a commercial company to do testing with
> >Linux kernels.   PostgreSQL doesn't require this, and I think Linux is
> >popular _in_ _spite_ of their buggy backported kernels (odd numbers?),
> >not because of it.
> >
> >  
> >
> The reason there is a lot of backporting in Linux kernels is that there 
> is such a lot of time (2 years or more) between major kernel releases. 
> This is not surprising given the kernel's complexity, but it is not the 
> case here, with releases every 6 months or so.

But the kernel goes through this reliable/unreliable cycle --- they
would be better off just making the old kernel more and more reliable
and focusing on the new kernel for features.

The reliable/unreliable cycle will kill your user base.

 


Re: Thoughts on maintaining 7.3

From
Bruce Momjian
Date:
Joshua D. Drake wrote:
> >But the kernel goes through this reliable/unreliable cycle --- they
> >would be better off just making the old kernel more and more reliable
> >and focusing on the new kernel for features.
> >
> >The reliable/unreliable cycle will kill your user base.
> >
> The popularity of Linux would argue against that statement a great deal.

Fine, let it argue.

I said _in_ _spite_ of the backpatching, not because of it.

 


Re: Thoughts on maintaining 7.3

From
"Joshua D. Drake"
Date:
Hello,
 O.k. so everyone is basically in agreement that "no new features" should
be backported.
How do we implement a stable release maintainer for back releases? I assume
we set a scope of what would go in: security/bug fixes only?

Sincerely,

Joshua Drake


-- 
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC - S/JDBC
Postgresql support, programming, shared hosting and dedicated hosting.
+1-503-222-2783 - jd@commandprompt.com - http://www.commandprompt.com
PostgreSQL.Org - Editor-N-Chief - http://www.postgresql.org




Re: Thoughts on maintaining 7.3

From
"Joshua D. Drake"
Date:
Bruce Momjian wrote:

>But the kernel goes through this reliable/unreliable cycle --- they
>would be better off just making the old kernel more and more reliable
>and focusing on the new kernel for features.
>
>The reliable/unreliable cycle will kill your user base.
>
The popularity of Linux would argue against that statement a great deal.

Sincerely,

Joshua Drake





-- 
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC - S/JDBC
Postgresql support, programming, shared hosting and dedicated hosting.
+1-503-222-2783 - jd@commandprompt.com - http://www.commandprompt.com
PostgreSQL.Org - Editor-N-Chief - http://www.postgresql.org




Re: Thoughts on maintaining 7.3

From
Hans-Jürgen Schönig
Date:
Joshua D. Drake wrote:
>>eh.. i could see some things, like tsearch2 or pg_autovacuum, which
>>afaik are almost if not completely compatible with 7.3, which will not
>>get back ported. Also fixes in some of the extra tools like psql could
>>be very doable, I know I had a custom psql for 7.2 that back patched the
>>\timing option and some of the pager fixes. now, whether that could be
>>done with stuff closer to core, i don't know...
> 
> 
> Sure but businesses don't like to upgrade unless they have to. If we 
> really want to attract more business to using PostgreSQL then they need
> to feel like they don't have to upgrade every 12 months. Upgrading is 
> expensive and it rarely goes as smoothly as a dump/restore.


My experience has been the following:

If a new application is deployed and it stays unchanged, 99% of all 
bugs in the database or the software itself will be found within a 
comparatively short amount of time.
If a business partner decides to continue to work on his application 
(which means changing it), he will accept new PostgreSQL releases.
Up to now upgrading PostgreSQL has never been a problem because we have 
expected major releases to be stable. In addition to that, dump/restore 
worked nicely.
I remember having slightly more work when we switched to 7.3 because 
type casts are handled somewhat differently (fewer implicit casts - I 
think that was the problem), but for that purpose intelligent customers 
have testing environments so that nothing evil can happen on the 
production system.

I don't think back porting features is a good idea. As Marc said: 
PostgreSQL is the kernel and not an ordinary package.
Personally I think that a database product should always be a rock solid 
product. Unlike applications such as, let's say, xclock, databases are 
truly critical, and customers won't forget about releases eating data. 
However, in my opinion they can understand that maintenance is necessary.

> When you deal with the systems I do, the cost to a customer to migrate 
> to 7.4 would be in the minimum of 10,000-20,000 dollars.
> They start to ask why we're upgrading with those numbers.

What did you do to cause these costs?????
We have several huge and critical customers as well but none of them 
would cause costs like that.

If everything works nicely: Why would you change the release anyway? Why 
would you back-port new features if you don't accept downtimes?

If something has been working for months there are not that many bugs 
you can expect. In case of disaster there are still options to fix bugs. 
That's what commercial guys are here for.
Fortunately we haven't ever seen a situation in which something really 
severe has been broken.

Buffer overflows:
Usually this kind of bug can be fixed within just a few lines.

I have been working with PostgreSQL for 4 years now. Altogether I have 
encountered 3-4 bugs which caused me some headache and which I hadn't 
known about. I guess 1 per year is more than acceptable.

Regards,
    Hans

-- 
Cybertec Geschwinde u Schoenig
Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria
Tel: +43/2952/30706 or +43/660/816 40 77
www.cybertec.at, www.postgresql.at, kernel.cybertec.at




OT: Re: Thoughts on maintaining 7.3

From
Christopher Kings-Lynne
Date:
> I have lately been taking the position that Linux is only a
> second-best choice for production use, precisely because of the
> constant introduction of shiny new features in the supposed stable
> branch. 

That's what all us FreeBSD users learnt a long time ago :P

Chris




Re: Thoughts on maintaining 7.3

From
Bruce Momjian
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> > I think what Tom is concerned about is that this hasn't been tested
> > enough with big datasets.  Also there is a little loss of index pages but
> > it's much less (orders of magnitude, I think) than what was before.
> > This is because the index won't shrink "vertically".
> 
> The fact that we won't remove levels shouldn't be meaningful at all ---
> I mean, if the index was once big enough to require a dozen btree
> levels, and you delete everything, are you going to be upset that it
> drops to 13 pages rather than 2?  I doubt it.
> 
> The reason I'm waffling about whether the problem is completely fixed or
> not is that the existing code will only remove-and-recycle completely
> empty btree pages.  As long as you have one key left on a page it will
> stay there.  So you could end up with ridiculously low percentage-filled
> situations.  This could be fixed by collapsing together adjacent
> more-than-half-empty pages, but we ran into a lot of problems trying to
> do that in a concurrent fashion.  So I'm waiting to find out if real
> usage patterns have a significant issue with this or not.

If we have an exclusive lock during VACUUM FULL, should we just collapse
the pages rather than REINDEX?  I realize we might have lots of expired
index tuples because VACUUM FULL creates new ones as part of
reorganizing the heap.
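
To make the percentage-filled concern concrete, here is a toy Python
model, offered only as a sketch under stated assumptions.  It is not the
btree code: it treats leaf pages as plain lists of sorted keys, ignores
the upper tree levels, and ignores concurrency entirely, which is
exactly the part described above as hard.  PAGE_CAPACITY and all
function names are hypothetical.

```python
PAGE_CAPACITY = 4  # hypothetical keys-per-page limit for this toy model


def recycle_empty_pages(pages):
    """Drop only pages with no keys left, as the quoted text describes."""
    return [list(page) for page in pages if page]


def collapse_sparse_neighbours(pages, capacity=PAGE_CAPACITY):
    """Merge each page into its left neighbour when both are under half full."""
    merged = []
    for page in pages:
        if not page:
            continue  # empty pages are simply recycled
        if merged and len(merged[-1]) < capacity / 2 and len(page) < capacity / 2:
            merged[-1].extend(page)  # combined size stays below capacity
        else:
            merged.append(list(page))
    return merged


def fill_ratio(pages, capacity=PAGE_CAPACITY):
    """Fraction of key slots in use across all remaining pages."""
    return sum(len(page) for page in pages) / (len(pages) * capacity)


# Sorted leaf pages after heavy deletion: most hold a single key.
leaf_pages = [[1], [], [9], [13], [], [21]]

recycled = recycle_empty_pages(leaf_pages)
collapsed = collapse_sparse_neighbours(leaf_pages)

print(len(recycled), fill_ratio(recycled))
print(len(collapsed), fill_ratio(collapsed))
```

In this toy case, recycling only the empty pages leaves four pages with
one key each, while collapsing sparse neighbours leaves two pages with
two keys each.  Whether real usage patterns produce such sparse pages is
the open question raised in the quoted text.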

 


Re: Thoughts on maintaining 7.3

From
Bruce Momjian
Date:
Bruce Momjian wrote:
> > The reason I'm waffling about whether the problem is completely fixed or
> > not is that the existing code will only remove-and-recycle completely
> > empty btree pages.  As long as you have one key left on a page it will
> > stay there.  So you could end up with ridiculously low percentage-filled
> > situations.  This could be fixed by collapsing together adjacent
> > more-than-half-empty pages, but we ran into a lot of problems trying to
> > do that in a concurrent fashion.  So I'm waiting to find out if real
> > usage patterns have a significant issue with this or not.
> 
> If we have an exclusive lock during VACUUM FULL, should we just collapse
> the pages rather than REINDEX?  I realize we might have lots of expired
> index tuples because VACUUM FULL creates new ones as part of
> reorganizing the heap.

Never mind --- I remember now that we are going to use VACUUM for a few
updates, and VACUUM FULL for big updates.

 


Re: Thoughts on maintaining 7.3

From
Bruce Momjian
Date:
Added to TODO:
* Have VACUUM FULL use REINDEX rather than index vacuum

---------------------------------------------------------------------------

Alvaro Herrera wrote:
> On Sat, Oct 04, 2003 at 11:53:49PM -0400, Tom Lane wrote:
> > Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> 
> > > Imagine having to VACUUM FULL a huge
> > > table.  Not only it will take the lot required to do the VACUUM in the
> > > heap itself, it will also have to rebuild all indexes from scratch.
> > 
> > A very large chunk of VACUUM FULL's runtime is spent fooling with the
> > indexes.  Have you looked at the code in any detail?  It goes like this:
> 
> Hmm.  No, I haven't looked at that code too much.  You are probably
> right, of course.  Maybe the indexes could be dropped altogether and
> then recreated after the vacuum is over, similar to what the cluster
> code does.  This would be similar to REINDEX, I suppose.  (I haven't
> actually looked at the REINDEX code either.)
> 
> 
> > > I think there are scenarios where the REINDEX will be much worse, say when
> > > there are not too many deleted tuples (but in that case, why is the user
> > > doing VACUUM FULL in the first place?).
> > 
> > Yeah, I think that's exactly the important point.  These days there's
> > not a lot of reason to do VACUUM FULL unless you have a major amount of
> > restructuring to do.  I would once have favored maintaining two code
> > paths with two strategies, but now I doubt it's worth the trouble.
> > (Or I should say, we have two code paths, the other being lazy VACUUM
> > --- do we need three?)
> 
> There are two points that could be made here:
> 
> 1. We do not want users having to think too hard about what kind of
> VACUUM they want.  This probably botches Bruce's idea of an additional
> VACUUM DATA command.
> 
> 2. We do not want to expose the VACUUM command family at all.  The
> decisions about what code paths should be taken are best left to the
> backend-integrated vacuum daemon, which has probably much better
> information than users.
> 
> -- 
> Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
> "You knock on that door or the sun will be shining on places inside you
> that the sun doesn't usually shine" (en Death: "The High Cost of Living")
> 
