Thread: Buglist

Buglist

From
Bo Lorentsen
Date:
Hi ...

I'm trying to convince my boss to use PostgreSQL (I need RI, transactions
and views), but he keeps comparing the project to MySQL. Until now, I
found the answers to his questions on the www.postgresql.org page, but
now I'm lost :-)

Where do I find a list of bugs, both found and solved, or will I need to
ask on the pgsql-bugs list to know the answer?

Also, has anyone tried to compare the new transaction model in MySQL 4.x
to PostgreSQL?

I'm looking forward to receiving even more constructive arguments :-)

/BL


Re: Buglist

From
"Roderick A. Anderson"
Date:
On 19 Aug 2003, Bo Lorentsen wrote:

> Also, has anyone tried to compare the new transaction model in MySQL 4.x
> to PostgreSQL?

Bo,  I've recently started having to deal with MySQL. (Web sites
wanting/using PHP _need/have-to-have_ MySQL. Their words, not mine.)  And
going from "I dislike MySQL" to "I'm really hating MySQL" has been
getting easier and easier.
   My dealings with MySQL are with the 3.xx version, but I semi-followed a
thread on this several months ago so feel fully qualified to throw in
my views.  :-)  My take on others' research was that the MySQL transaction
model is a bubble-gum-and-baling-wire add-on, not an integral part of
MySQL.  It _was_ tacked onto the top of the database, so if either it or
MySQL failed you were likely to lose data.

> I'm looking forward to receiving even more constructive arguments :-)

How about "Friends don't let friends use MySQL"?

Hopefully others with a stronger knowledge will provide this.


Rod
--
  "Open Source Software - Sometimes you get more than you paid for..."


Re: Buglist

From
"Shridhar Daithankar"
Date:
On 19 Aug 2003 at 13:32, Bo Lorentsen wrote:

> Hi ...
>
> I'm trying to convince my boss to use PostgreSQL (I need RI, transactions
> and views), but he keeps comparing the project to MySQL. Until now, I
> found the answers to his questions on the www.postgresql.org page, but
> now I'm lost :-)
>
> Where do I find a list of bugs, both found and solved, or will I need to
> ask on the pgsql-bugs list to know the answer?

Well, you could look at the release notes. They contain a lot of information. Of
course, the archives of pgsql-bugs and pgsql-patches are the ultimate source, unless
you plan to delve into CVS history and sources..


> Also, has anyone tried to compare the new transaction model in MySQL 4.x
> to PostgreSQL?

Check this: http://www.mysql.com/doc/en/ANSI_diff_Transactions.html

To me, it seems that their definition of transaction is limited to preventing
two clients writing to the same row simultaneously, which is of course a limited view
of things.

A few major differences I can see right here. Correct me on the MySQL side.

- WAL log
- Transactional DDLs
- Nested transactions coming soon to PG
- PITR coming soon to PG

I would love to see the entire checklist but don't have any time to devote to
MySQL.

Bye
 Shridhar

--
Vulcans never bluff.        -- Spock, "The Doomsday Machine", stardate 4202.1


Re: Buglist

From
Bo Lorentsen
Date:
On Tue, 2003-08-19 at 14:31, Shridhar Daithankar wrote:
> Well, you could look at the release notes. They contain a lot of information. Of
> course, the archives of pgsql-bugs and pgsql-patches are the ultimate source, unless
> you plan to delve into CVS history and sources..
OK, I'd just like to find something like Bugzilla, or an explanation of
how bugs are guaranteed to be visible. My boss likes to compare this to
the MySQL model found at: http://bugs.mysql.com/bugstats

> Check this: http://www.mysql.com/doc/en/ANSI_diff_Transactions.html
Hmm, it sounds like they have transactions on InnoDB tables, but these
are really slow, and therefore they put much energy into advertising the
MyISAM format (non-transactional). Or am I missing something?

> To me, it seems that their definition of transaction is limited to preventing
> two clients writing to the same row simultaneously, which is of course a limited view
> of things.
This sounds like their MyISAM tables, or ...?

> A few major differences I can see right here. Correct me on the MySQL side.
>
> - WAL log
> - Transactional DDLs
Yes, and let's add:
- Views
- subselects
- plperl, plsql, plpython, plXXX

> - Nested transactions coming soon to PG
> - PITR coming soon to PG
Not much good for arguing with my boss about the future :-)

> I would love to see the entire checklist but don't have any time to devote to
> MySQL.
I do understand, and it's no pleasure either :-)

/BL


Re: Buglist

From
Bo Lorentsen
Date:
On Tue, 2003-08-19 at 14:37, Roderick A. Anderson wrote:

> Bo,  I've recently started having to deal with MySQL. (Web sites
> wanting/using PHP _need/have-to-have_ MySQL. Their words, not mine.)  And
> going from "I dislike MySQL" to "I'm really hating MySQL" has been
> getting easier and easier.
Been there too :-)

>    My dealings with MySQL are with the 3.xx version, but I semi-followed a
> thread on this several months ago so feel fully qualified to throw in
> my views.  :-)  My take on others' research was that the MySQL transaction
> model is a bubble-gum-and-baling-wire add-on, not an integral part of
> MySQL.  It _was_ tacked onto the top of the database, so if either it or
> MySQL failed you were likely to lose data.
But this goes for 3.x; have you tried 4.x and its InnoDB tables?

> How about "Friends don't let friends use MySQL"?
Nice, thanks, but my boss doesn't buy that ... yet.

/BL


Re: Buglist

From
Tom Lane
Date:
"Shridhar Daithankar" <shridhar_daithankar@persistent.co.in> writes:
> On 19 Aug 2003 at 13:32, Bo Lorentsen wrote:
>> Where do I find a list of bugs, both found and solved, or will I need to
>> ask on the pgsql-bugs list to know the answer?

> Well, you could look at the release notes. They contain a lot of information. Of
> course, the archives of pgsql-bugs and pgsql-patches are the ultimate source, unless
> you plan to delve into CVS history and sources..

Also the pgsql-committers archives.  (Personally, when I want to look at
a change history, I use cvs2cl to extract one from the CVS server.)  The
release notes are a good high-level view, but if you want details you
need to look to the mailing list archives or CVS logs.

BTW, lots of people have the bad habit of reporting bugs on -general or
-hackers; there are significant bug fixes that never get a mention in
the -bugs list.  So you really have to troll all the archives if you are
going to use the archives as your primary source.  CVS change history
might be a better starting point.

            regards, tom lane

Re: Buglist

From
Shridhar Daithankar
Date:
On Tuesday 19 August 2003 18:59, Bo Lorentsen wrote:
> >    My dealings with MySQL are with the 3.xx version, but I semi-followed a
> > thread on this several months ago so feel fully qualified to throw in
> > my views.  :-)  My take on others' research was that the MySQL transaction
> > model is a bubble-gum-and-baling-wire add-on, not an integral part of
> > MySQL.  It _was_ tacked onto the top of the database, so if either it or
> > MySQL failed you were likely to lose data.
>
> But this goes for 3.x; have you tried 4.x and its InnoDB tables?

Well, if you have some time and hardware to play around with, I am sure we can
easily test them side by side.

Interested? This could take a couple of weeks, taking things to real-world
workloads, but could provide some insight to the community.

If you have MySQL 4.x and PostgreSQL 7.4beta installed, you can easily set up
these things..

 Shridhar


Re: Buglist

From
Shridhar Daithankar
Date:
On Tuesday 19 August 2003 19:12, Tom Lane wrote:
> BTW, lots of people have the bad habit of reporting bugs on -general or
> -hackers; there are significant bug fixes that never get a mention in
> the -bugs list.  So you really have to troll all the archives if you are
> going to use the archives as your primary source.  CVS change history
> might be a better starting point.

Making pgsql-bugs open to non-subscribers but moderated might be a
good idea. It really does not matter if a bug gets filed a couple of days late,
but having to subscribe to yet another list could be a deterrent.

Or have Bugzilla set up somewhere. That way the tracking will be a hell of a lot more
visible to the outside world..

 Shridhar


Re: Buglist

From
Robert Treat
Date:
On Tue, 2003-08-19 at 07:32, Bo Lorentsen wrote:
> Hi ...
>
> I'm trying to convince my boss to use PostgreSQL (I need RI, transactions
> and views), but he keeps comparing the project to MySQL. Until now, I
> found the answers to his questions on the www.postgresql.org page, but
> now I'm lost :-)
>
> Where do I find a list of bugs, both found and solved, or will I need to
> ask on the pgsql-bugs list to know the answer?

search the bugs forum. for a list of bugs/missing features, the TODO
list is your best bet.

>
> Also, has anyone tried to compare the new transaction model in MySQL 4.x
> to PostgreSQL?
>

The biggest problem I see with the mysql transactional model is that it
violates relational theory, namely the separation of the physical and
logical model, by allowing both myisam and innodb tables to coexist.
I think it was stephan szabo who posted the best example of this, which
involves starting a transaction, updating both a myisam table and an
innodb table, then rolling back the transaction. in this case the myisam
table will be updated and the innodb table won't, which means that app
writers can't simply know what information is contained in a table; they
have to know how tables are physically created and stored in order to
work with them, which tends to increase development time.
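
as a minimal sketch of that failure mode (mysql 4.x syntax; the table
names here are made up for illustration):

create table audit_log (msg varchar(100)) type=myisam;
create table account (balance int) type=innodb;

begin;
insert into audit_log values ('debited 100');
update account set balance = balance - 100;
rollback;

-- the innodb update is undone, but the myisam insert survives the
-- rollback, so the two tables now disagree about what happened.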

> I'm looking forward to receiving even more constructive arguments :-)
>

a few links for your perusal:
http://faemalia.org/wiki/view/Technical/PostgreSQLvsMySQL
http://forums.devshed.com/archive/46/2002/10/3/35171
http://www.phpbuilder.com/columns/tim20001112.php3?page=1
http://openacs.org/philosophy/why-not-mysql.html

hope this helps,

Robert Treat
--
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL


Re: Buglist

From
Bo Lorentsen
Date:
On Tue, 2003-08-19 at 15:45, Shridhar Daithankar wrote:

> Well, if you have some time and hardware to play around with, I am sure we can
> easily test them side by side.
Interesting project; is this allowed? :-)

> Interested? This could take a couple of weeks, taking things to real-world
> workloads, but could provide some insight to the community.
Yep, but this is not the proper time, and I am not sure I have the
knowledge it takes.

> If you have MySQL 4.x and PostgreSQL 7.4beta installed, you can easily set up
> these things..
Yep, but how to test? And how does one test stability? :-)

/BL


Re: Buglist

From
Bo Lorentsen
Date:
On Tue, 2003-08-19 at 15:47, Shridhar Daithankar wrote:

> Or have Bugzilla set up somewhere. That way the tracking will be a hell of a lot more
> visible to the outside world..
I agree on this, as it seems messy from the outside not to be able to get an
overview of both solved and unsolved bugs.

I know that as a developer this may not seem like a big problem, but it
will help non-hackers to get an overview.

/BL


Re: Buglist

From
Tom Lane
Date:
Bo Lorentsen <bl@netgroup.dk> writes:
> On Tue, 2003-08-19 at 14:37, Roderick A. Anderson wrote:
>> My take on others research was that MySQL transaction
>> model is a bubble gum and bailing wire add on not an integral part of
>> MySQL.  It _was_ tacked onto the top of the database so if either it or
>> MySQL failed you were likely to loose data.

> But this goes for 3.x have you tried 4.x and there InnoDB tables ?

It's still bolted on.  The entire concept that "transactional integrity
is optional" is ludicrous, IMHO.  "Integrity" and "optional" are
contradictory.

One thing you should ask about MySQL is where they keep the system's
metadata (catalog data).  In Postgres it's under transactional control
just like everything else, which means it's (a) crash-safe and (b)
rollback-able.  This is why all DDL changes are rollback-able in PG.
I honestly don't know what the corresponding arrangements are in MySQL
... but I suspect that even in an all-InnoDB database, there is critical
system data that is outside the InnoDB table handler and thus not
transaction-safe.
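
For instance, here is the DDL-rollback point in a nutshell (a minimal
psql sketch; the table name is made up):

BEGIN;
CREATE TABLE scratch (id INT);
INSERT INTO scratch VALUES (1);
ROLLBACK;

SELECT * FROM scratch;
-- fails: the relation no longer exists, because the CREATE TABLE was
-- a catalog change inside the aborted transaction.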

            regards, tom lane

Re: Buglist

From
Lincoln Yeoh
Date:
At 03:56 PM 8/19/2003 +0200, Bo Lorentsen wrote:

>On Tue, 2003-08-19 at 15:45, Shridhar Daithankar wrote:
>
> > Well, if you have some time and hardware to play around with, I am sure we can
> > easily test them side by side.
>Interesting project; is this allowed? :-)
>
> > Interested? This could take a couple of weeks, taking things to real-world
> > workloads, but could provide some insight to the community.
>Yep, but this is not the proper time, and I am not sure I have the
>knowledge it takes.

Install an application that can use both DBs. Muck around with it. If you
can't tell the difference, then I'd say go with PostgreSQL - transactions
aren't bolted on, and there are quite a number of other design wins too. If you can
tell the difference and MySQL is better, many of us here would be interested to
know.


> > If you have MySQL 4.x and PostgreSQL 7.4beta installed, you can easily set up
> > these things..
>Yep, but how to test? And how does one test stability? :-)

Do lots of concurrent updates and inserts and selects for a long time?

While you are testing and have no critical data, you can do stuff like
pressing the reset button midway through a transaction, or having someone trip
over the power cord. Someone on this list was doing stuff like that to
PostgreSQL and he said it did pretty well.
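
A crude smoke test along those lines might look something like this
(just a sketch; the table name is arbitrary, and the same SQL should run
on either database):

CREATE TABLE stress (id INT PRIMARY KEY, counter INT NOT NULL);
INSERT INTO stress VALUES (1, 0);

-- from several concurrent sessions, run this in a loop:
BEGIN;
UPDATE stress SET counter = counter + 1 WHERE id = 1;
COMMIT;

-- pull the plug mid-run; after restart, the table should still be
-- readable and counter should reflect only committed increments:
SELECT counter FROM stress;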

I'm not saying PostgreSQL will save you from that, but that's one way to
learn how fault-tolerant a product is before the uncontrollable faults
appear. So far there have been quite a number of people with flaky RAM or
hardware, and a large percentage of them don't seem to have lost much data
(well, it's hard to be 100% sure ;) ).

Have fun!
Link.




Re: Buglist

From
Lincoln Yeoh
Date:
At 03:59 PM 8/19/2003 +0200, Bo Lorentsen wrote:

>On Tue, 2003-08-19 at 15:47, Shridhar Daithankar wrote:
>
> > Or have Bugzilla set up somewhere. That way the tracking will be a hell of a lot more
> > visible to the outside world..
>I agree on this, as it seems messy from the outside not to be able to get an
>overview of both solved and unsolved bugs.
>
>I know that as a developer this may not seem like a big problem, but it
>will help non-hackers to get an overview.

AFAIK Bugzilla requires MySQL (for now).

I've recently installed it, and if it can easily be made to work with
PostgreSQL I'd like to know.

Link.

Re: Buglist

From
Devrim GUNDUZ
Date:
Hi,

On Tue, 19 Aug 2003, Lincoln Yeoh wrote:

> AFAIK Bugzilla requires MySQL (for now).
>
> I've recently installed it, and if it can easily be made to work with
> PostgreSQL I'd like to know.

https://bugzilla.redhat.com/bugzilla/index.cgi
 Bugzilla News
===
January 1st, 2003
Current Red Hat version of Bugzilla using PostgreSQL code available for
download.
===

AFAIK RH runs bugzilla on PostgreSQL (or RHDB, whatever). The code is
available from there.

Regards,



--
Devrim GUNDUZ
devrim@gunduz.org                devrim.gunduz@linux.org.tr
            http://www.tdmsoft.com
            http://www.gunduz.org


Re: Buglist

From
Bo Lorentsen
Date:
On Tue, 2003-08-19 at 16:03, Tom Lane wrote:

> It's still bolted on.  The entire concept that "transactional integrity
> is optional" is ludicrous, IMHO.  "Integrity" and "optional" are
> contradictory.
Good point. Also the problem of RI between MyISAM and InnoDB tables :-)

> One thing you should ask about MySQL is where they keep the system's
> metadata (catalog data).  In Postgres it's under transactional control
> just like everything else, which means it's (a) crash-safe and (b)
> rollback-able.  This is why all DDL changes are rollback-able in PG.
> I honestly don't know what the corresponding arrangements are in MySQL
> ... but I suspect that even in an all-InnoDB database, there is critical
> system data that is outside the InnoDB table handler and thus not
> transaction-safe.
That's a really nice thing for temporary tables, but "point in time"
backup is a much stronger argument :-)

/BL


Re: Buglist

From
Bo Lorentsen
Date:
On Tue, 2003-08-19 at 16:20, Lincoln Yeoh wrote:

> Install an application that can use both DBs. Muck around with it. If you
> can't tell the difference, then I'd say go with PostgreSQL - transactions
> aren't bolted on, and there are quite a number of other design wins too. If you can
> tell the difference and MySQL is better, many of us here would be interested to
> know.
OK, thanks. We may need to make a test utility that is open and fair,
but that's a little hard, as PG has some features that MySQL does not.

> Do lots of concurrent updates and inserts and selects for a long time?
I do think I know why you say this :-)

> Have fun!
I'd like to do this, but I'm not sure that I have the time needed. If I
have the time, and I get some results, I'll let you know.

/BL


Re: Buglist

From
Vivek Khera
Date:
>>>>> "BL" == Bo Lorentsen <bl@netgroup.dk> writes:

BL> Hi ...
BL> I'm trying to convince my boss to use PostgreSQL (I need RI, transactions
BL> and views), but he keeps comparing the project to MySQL. Until now, I
BL> found the answers to his questions on the www.postgresql.org page, but
BL> now I'm lost :-)

My big reason to choose postgres was concurrency.  My application has
transactions and updates all over the place, and even as a single
developer in the early stages, I could already see problems with
table-level locking that mysql was giving me.  Who knows what would
have happened in production with hundreds of people hitting the db
simultaneously!  The row-level locking in Postgres has made it
possible for an incredible number of simultaneous actions to be
carried out without any waiting for the users.
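
As a sketch of the difference (two concurrent sessions; the table here
is made up for illustration):

-- session A:
BEGIN;
UPDATE accounts SET balance = balance - 10 WHERE id = 1;

-- session B, before A commits:
BEGIN;
UPDATE accounts SET balance = balance + 10 WHERE id = 2;
COMMIT;
-- in Postgres, B proceeds immediately, since A only locks row 1;
-- with MyISAM's table-level locking, B would wait on the whole table.

-- session A:
COMMIT;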

Try making a table grow beyond your file size limit in MySQL.  You
can't.  Even if you use an OS with 64-bit file pointers (such as
FreeBSD) you can't grow your file beyond 4GB in MySQL (at least with
MyISAM tables -- dunno about InnoDB tables).  In Postgres, it is
automagically handled for you.

The *only* drawbacks I find with Postgres are the need to dump/restore
on major version updates and the need to vacuum the tables
regularly...

Tops on my wish list is that postgres automatically notice when a row
is no longer needed (all transactional references to it are gone) and
'free' it at that time, rather than needing a special scan to
determine the row is no longer needed and freeing it.

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.                Khera Communications, Inc.
Internet: khera@kciLink.com       Rockville, MD       +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/

Re: Buglist

From
Shridhar Daithankar
Date:
On Tuesday 19 August 2003 21:03, Vivek Khera wrote:
> Tops on my wish list is that postgres automatically notice when a row
> is no longer needed (all transactional references to it are gone) and
> 'free' it at that time, rather than needing a special scan to
> determine the row is no longer needed and freeing it.

Heh.. we have autovacuum, right? Well, it does not work the way you want, but it
works automatically at least.

A couple of releases down the line it will be good enough, and vacuum problems
will be history, if people take care to set up autovacuum and FSM parameters
correctly..

 Shridhar


Re: Buglist

From
Bruno Wolff III
Date:
On Tue, Aug 19, 2003 at 19:17:31 +0530,
  Shridhar Daithankar <shridhar_daithankar@persistent.co.in> wrote:
>
> Making pgsql-bugs open to non-subscribers but moderated might be a
> good idea. It really does not matter if a bug gets filed a couple of days late,
> but having to subscribe to yet another list could be a deterrent.

All of the pgsql lists including bugs already work this way.

Re: Buglist

From
"Joshua D. Drake"
Date:
>
>
>It's still bolted on.  The entire concept that "transactional integrity
>is optional" is ludicrous, IMHO.  "Integrity" and "optional" are
>contradictory.
>

Obviously you have never voted in a major election ;)

>            regards, tom lane
>



Re: Buglist

From
Bruno Wolff III
Date:
On Tue, Aug 19, 2003 at 18:01:33 +0530,
  Shridhar Daithankar <shridhar_daithankar@persistent.co.in> wrote:
>
> A few major differences I can see right here. Correct me on the MySQL side.
>
> - WAL log
> - Transactional DDLs
> - Nested transactions coming soon to PG
> - PITR coming soon to PG

Note that the last two are not coming soon as in 7.4, but as in maybe for 7.5,
which I can't see being out in less than 6 months.

Re: Buglist

From
Shridhar Daithankar
Date:
On Tuesday 19 August 2003 21:24, Bruno Wolff III wrote:
> On Tue, Aug 19, 2003 at 19:17:31 +0530,
>
>   Shridhar Daithankar <shridhar_daithankar@persistent.co.in> wrote:
> > Making pgsql-bugs open to non-subscribers but moderated might be
> > a good idea. It really does not matter if a bug gets filed a couple of days
> > late, but having to subscribe to yet another list could be a deterrent.
>
> All of the pgsql lists including bugs already work this way.

No.. Occasionally when somebody cross-posts and I reply, I get a 'Post stalled
for moderator because you are not a member' type of error. IIRC, the SQL list is
the most frequent one. I haven't seen any other list doing such stuff..

Anyway, good to know that it's done that way..

 Shridhar


Re: Buglist

From
Shridhar Daithankar
Date:
On Tuesday 19 August 2003 21:30, Bruno Wolff III wrote:
> On Tue, Aug 19, 2003 at 18:01:33 +0530,
>
>   Shridhar Daithankar <shridhar_daithankar@persistent.co.in> wrote:
> > A few major differences I can see right here. Correct me on the MySQL side.
> >
> > - WAL log
> > - Transactional DDLs
> > - Nested transactions coming soon to PG
> > - PITR coming soon to PG
>
> Note that the last two are not coming soon as in 7.4, but as in maybe for
> 7.5, which I can't see being out in less than 6 months.

Right. But as of now, we know that some code exists for each of these. Given
how long transactions in MySQL were vapourware (what, 2 years?), I think
what I am making is pretty much a statement and not a claim.

Of course, by PostgreSQL standards, it may be a bit too early to announce, I
admit... :-)

 Shridhar


Re: Buglist

From
Bruno Wolff III
Date:
On Tue, Aug 19, 2003 at 21:27:15 +0530,
  Shridhar Daithankar <shridhar_daithankar@persistent.co.in> wrote:
> On Tuesday 19 August 2003 21:24, Bruno Wolff III wrote:
> > On Tue, Aug 19, 2003 at 19:17:31 +0530,
> >
> >   Shridhar Daithankar <shridhar_daithankar@persistent.co.in> wrote:
> > > Making pgsql-bugs open to non-subscribers but moderated might be
> > > a good idea. It really does not matter if a bug gets filed a couple of days
> > > late, but having to subscribe to yet another list could be a deterrent.
> >
> > All of the pgsql lists including bugs already work this way.
>
> No.. Occasionally when somebody cross-posts and I reply, I get a 'Post stalled
> for moderator because you are not a member' type of error. IIRC, the SQL list is
> the most frequent one. I haven't seen any other list doing such stuff..

All that message means is that your message won't be posted until a moderator
approves it.

Re: Buglist

From
Vivek Khera
Date:
>>>>> "SD" == Shridhar Daithankar <shridhar_daithankar@persistent.co.in> writes:

SD> On Tuesday 19 August 2003 21:03, Vivek Khera wrote:
>> Tops on my wish list is that postgres automatically notice when a row
>> is no longer needed (all transactional references to it are gone) and
>> 'free' it at that time, rather than needing a special scan to
>> determine the row is no longer needed and freeing it.

SD> Heh.. we have autovacuum, right? Well, it does not work the way you
SD> want, but it works automatically at least.


There's a big difference between "noticing that a table needs to be
vacuumed and running it" and "automatically having the backend free a
row as soon as we know it is eligible to be (as would normally be
determined by vacuum)".

One of these days when I can afford a 14-hour dump/restore, I'll
upgrade to 7.4 and try autovacuum :-)

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.                Khera Communications, Inc.
Internet: khera@kciLink.com       Rockville, MD       +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/

Re: Buglist

From
Vivek Khera
Date:
>>>>> "VK" == Vivek Khera <khera@kcilink.com> writes:

One more nit:

Since the beginning of time (at least MySQL v3.22) MySQL has silently
ignored the foreign key references in table create statements.  Now
that they have foreign key support (version 4.x), do they honor those
statements?  Nope.  You have to use their own syntax to declare your
FKs.  They still silently ignore the references in the table create
statements.

Standards are a good thing, especially when everyone does NOT have
their own.



--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.                Khera Communications, Inc.
Internet: khera@kciLink.com       Rockville, MD       +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/

Re: Buglist

From
Shridhar Daithankar
Date:
On Tuesday 19 August 2003 21:43, Vivek Khera wrote:
> There's a big difference between "noticing that a table needs to be
> vacuumed and running it" and "automatically having the backend free a
> row as soon as we know it is eligible to be (as would normally be
> determined by vacuum)".

Agreed, and it would be nice to have. But functionally it's *almost* the same..

> One of these days when I can afford a 14-hour dump/restore, I'll
> upgrade to 7.4 and try autovacuum :-)

FYI, it runs perfectly on 7.3.x, IIRC from some posts earlier.. So you don't
have to wait too long..

 Shridhar


Re: Buglist

From
"scott.marlowe"
Date:
On Tue, 19 Aug 2003, Bruno Wolff III wrote:

> On Tue, Aug 19, 2003 at 18:01:33 +0530,
>   Shridhar Daithankar <shridhar_daithankar@persistent.co.in> wrote:
> >
> > A few major differences I can see right here. Correct me on the MySQL side.
> >
> > - WAL log
> > - Transactional DDLs
> > - Nested transactions coming soon to PG
> > - PITR coming soon to PG
>
> Note that the last two are not coming soon as in 7.4, but as in maybe for 7.5,
> which I can't see being out in less than 6 months.

Good point.  We in no way need to point to future improvements anymore to
say PostgreSQL is a better database than MySQL.  PostgreSQL 7.3.4 is the
best Open Source database in my humble opinion, and the things MySQL is
missing are complex features that need to be well thought out and well
implemented.  Given MySQL's haphazard approach to standards
compliance, you are gambling your company on its non-standard SQL
sticking around if you use it, or on hundreds of man-hours replacing
MySQL-specific SQL if you ever convert.

Examples:

In PostgreSQL, || is the concatenation operator.  Why, in God's name,
is || the concatenation operator?  Because the spec says so.  In MySQL
it's the OR operator, contrary to spec.
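
For instance (a quick sketch, assuming MySQL's default mode, i.e. not
started with --ansi):

select 'post' || 'gres';
-- PostgreSQL: returns 'postgres', per the spec.
-- MySQL: treats || as a logical OR of two strings and returns 0.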

Their FKs silently fail without so much as a notice.  Poof, no FKs, but
they swallow the syntax as if they do.

They get precision wrong all the time.  The spec says numeric(x1,y1) *
numeric(x2,y2) yields numeric(size_of_int_portion, y1+y2).

But in MySQL you get numeric(size_of_int_portion, max(y1,y2)).

PostgreSQL gives you the correct answer:

create table mult (i1 numeric(10,2), i2 numeric(10,3));
insert into mult values (123.33,123.354);
select i1*i2 from mult;


MySQL output:
+-----------+
| i1*i2     |
+-----------+
| 15213.249 |
+-----------+

PostgreSQL output:
  ?column?
-------------
 15213.24882

I don't know if that's changed since 4.x came out; I gave up on MySQL for
anything other than a text storage / content management engine long ago.


Re: Buglist

From
Bo Lorentsen
Date:
On Tue, 2003-08-19 at 18:17, Vivek Khera wrote:

> Since the beginning of time (at least MySQL v3.22) MySQL has silently
> ignored the foreign key references in table create statements.  Now
> that they have foreign key support (version 4.x), do they honor those
> statements?  Nope.  You have to use their own syntax to declare your
> FKs.  They still silently ignore the references in the table create
> statements.
Is this really true? Does 4.x still not support FKs? Then how about
transactions, do they not work either?

Is it not just the MyISAM tables that still have the problem (they are
version 3.x)?

> Standards are a good thing, especially when everyone does NOT have
> their own.
:-)

/BL


Re: Buglist

From
"scott.marlowe"
Date:
On 19 Aug 2003, Bo Lorentsen wrote:

> On Tue, 2003-08-19 at 18:17, Vivek Khera wrote:
>
> > Since the beginning of time (at least MySQL v3.22) MySQL has silently
> > ignored the foreign key references in table create statements.  Now
> > that they have foreign key support (version 4.x), do they honor those
> > statements?  Nope.  You have to use their own syntax to declare your
> > FKs.  They still silently ignore the references in the table create
> > statements.
> Is this really true? Does 4.x still not support FKs? Then how about
> transactions, do they not work either?
>
> Is it not just the MyISAM tables that still have the problem (they are
> version 3.x)?

No, the problem is that in the SQL spec, you do it with the foreign key
declaration inside parens in the create statement, like:

create table abc123
(
  id serial unique,
  info text);
create table abc1234
(
  moreinfo text,
  ref_id int,
  foreign key (ref_id)
  references abc123(id)
  on delete cascade
);

In MySQL this syntax is silently swallowed, while their own "proper"
syntax is like this:

create table abc123
(
  id serial unique,
  info text
) type=innodb;
create table abc1234
(
  moreinfo text,
  ref_id int,
  foreign key (ref_id) references abc123(id)
    on delete CASCADE
) type=innodb;

So the syntaxes are different, and one is apparently swallowed without
error or anything, but in fact you have no FKs in place.



Re: Buglist

From
Bo Lorentsen
Date:
On Tue, 2003-08-19 at 23:10, scott.marlowe wrote:

> So the syntaxes are different, and one is apparently swallowed without
> error or anything, but in fact you have no FKs in place.
Thanks, that helped.

/BL


Re: Buglist

From
Ian Barwick
Date:
On Tuesday 19 August 2003 23:10, scott.marlowe wrote:
> On 19 Aug 2003, Bo Lorentsen wrote:
> > On Tue, 2003-08-19 at 18:17, Vivek Khera wrote:
> > > Since the beginning of time (at least MySQL v3.22) MySQL has silently
> > > ignored the foreign key references in table create statements.  Now
> > > that they have foreign key support (version 4.x), do they honor those
> > > statements?  Nope.  You have to use their own syntax to declare your
> > > FKs.  They still silently ignore the references in the table create
> > > statements.
> >
> > Is this really true? Does 4.x still not support FKs? Then how about
> > transactions, do they not work either?
> >
> > Is it not just the MyISAM tables that still have the problem (they are
> > version 3.x)?
>
> No, the problem is that in the SQL spec, you do it with the foreign key
> declaration inside parens in the create statement, like:
>
> create table abc123
> (
>   id serial unique,
>   info text);
> create table abc1234
> (
>   moreinfo text,
>   ref_id int,
>   foreign key (ref_id)
>   references abc123(id)
>   on delete cascade
> );
>
> In MySQL this syntax is silently swallowed, while their own "proper"
> syntax is like this:
>
> create table abc123
> (
>   id serial unique,
>   info text
> ) type=innodb;
> create table abc1234
> (
>   moreinfo text,
>   ref_id int,
>   foreign key (ref_id) references abc123(id)
>     on delete CASCADE
> ) type=innodb;

(To be precise, this will fail with an obscure message; an index
must be created on ref_id.)

> So the syntaxes are different, and one is apparently swallowed without
> error or anything, but in fact you have no FKs in place.

Just to confuse things further:
1: if the MySQL version running is not configured for innodb tables,
tables created with type=innodb will be silently converted to
MyISAM;

2: These statements will succeed:
create table abc123 (
  id INT unique,
  info text
) type=innodb;

create table abc1234  (
  moreinfo text,
  ref_id int REFERENCES abc123(id)
)  type=innodb;

but the foreign key defined on ref_id is (I presume)
transported to a remote forest in Sweden and eaten
by goats ;-)

Ian Barwick
barwick@gmx.net


Re: Buglist

From
"Matthew T. O'Connor"
Date:
On Tue, 2003-08-19 at 12:13, Vivek Khera wrote:
> There's a big difference between "noticing that a table needs to be
> vacuumed and running it" and "automatically having the backend free a
> row as soon as we know it is eligible to be (as would normally be
> determined by vacuum)".

<talking beyond my real knowledge>
Changing Postgres to perform as mentioned above is non-trivial; it would
basically change the entire core of the system.  I think this is due to
the fact that Postgres uses a non-overwriting storage manager.  This has
many benefits, including MVCC; the primary disadvantage is that you need
a vacuum-type process
</talking beyond my real knowledge>

> One of these days when I can afford a 14-hour dump/restore, I'll
> upgrade to 7.4 and try autovacuum :-)

pg_autovacuum works with 7.3.x, but the source is only included in
7.4.  Just get the pg_autovacuum directory from contrib and use it.

Matthew


Re: Buglist

From
Vivek Khera
Date:
>>>>> "MTO" == Matthew T O'Connor <matthew@zeut.net> writes:

MTO> <talking beyond my real knowledge>
MTO> Changing Postgres to perform as mentioned above is non-trivial; it would
MTO> basically change the entire core of the system.  I think this is due to
MTO> the fact that Postgres uses a non-overwriting storage manager.  This has
MTO> many benefits, including MVCC; the primary disadvantage is that you need
MTO> a vacuum-type process
MTO> </talking beyond my real knowledge>

I'm not promoting any change in the MVCC.  What I'm saying is that it
would be really cool if the backend process itself could recognize
that a row is no longer referenced by any transactions upon
termination of the transaction, and release it back to the system.
This is just what vacuum does but in a batched manner.  I would love
to see it incremental.  This would result in pretty much near zero
internal fragmentation, I think.

How hard that is, I have no clue.  Right now my DB is saturating the
disk and having to squeeze vacuum into a saturated disk bandwidth is
not pleasant.  Luckily, the 14-disk raid array just
arrived... hopefully that will have higher bandwidth than the 4-disk
array... ;-)



Re: Buglist

From
Bruno Wolff III
Date:
On Tue, Aug 19, 2003 at 21:51:14 -0400,
  Vivek Khera <khera@kcilink.com> wrote:
>
> I'm not promoting any change in the MVCC.  What I'm saying is that it
> would be really cool if the backend process itself could recognize
> that a row is no longer referenced by any transactions upon
> termination of the transaction, and release it back to the system.
> This is just what vacuum does but in a batched manner.  I would love
> to see it incremental.  This would result in pretty much near zero
> internal fragmentation, I think.

Why do you care about the details of the implementation (rather than
the performance)? If it were faster to do it that way, that's how it would
have been done in the first place. The cost of doing the above is almost
certainly going to be an overall performance loser.

Re: Buglist

From
Vivek Khera
Date:
>>>>> "BW" == Bruno Wolff, <Bruno> writes:

>> to see it incremental.  This would result in pretty much near zero
>> internal fragmentation, I think.

BW> Why do you care about the details of the implementation (rather than
BW> the performance)? If it were faster to do it that way, that's how it would
BW> have been done in the first place. The cost of doing the above is almost
BW> certainly going to be an overall performance loser.

I care for the performance.  And how are you so sure that it was
faster the way it is now?  Are you sure it was not done this way
because of ease of implementation?

Seriously, how much slower can it be if the backend were to do the
checking for external references upon updating/deleting a row?  The
cost would be distributed across time as opposed to concentrated at
once within a vacuum process.  I am fairly certain it would reduce
disk bandwidth requirements since at least one necessary page will
already be in memory.

Re: Buglist

From
Bruno Wolff III
Date:
On Wed, Aug 20, 2003 at 10:31:25 -0400,
  Vivek Khera <khera@kcilink.com> wrote:
>
> I care for the performance.  And how are you so sure that it was
> faster the way it is now?  Are you sure it was not done this way
> because of ease of implementation?
>
> Seriously, how much slower can it be if the backend were to do the
> checking for external references upon updating/deleting a row?  The
> cost would be distributed across time as opposed to concentrated at
> once within a vacuum process.  I am fairly certain it would reduce
> disk bandwidth requirements since at least one necessary page will
> already be in memory.

It would probably be a lot slower. Any transaction that has started
but not yet finished would need to lock all rows that exist during
the transaction (for serialized transaction isolation you would only
need to worry about rows that existed at the start of the transaction
or that were modified by the transaction). Immediately that is a big
deal since a typical query may need to lock a bunch of rows that it
will never actually touch (but you don't know that when the transaction
starts). Managing all those locks would take up a lot of semiconductor
memory or a lot of disk writes and be a major source of contention.
The current system just has to mark rows when they are created and
deleted (an update does both operations). The normal vacuum clean up
actually isn't going to be much worse than what you would need to do
at both the start and end of each transaction. The overhead of letting
dead rows hang around for a while after they aren't needed isn't that high.

Also, since at least 7.3, normal vacuums aren't normally going to affect the
performance of your database server that much.

The main issue against the current vacuum system is that it requires the
DBA knowing what vacuum does and figuring out how it should be used in
their situation to get reasonable performance. This makes it a bit harder
for non-DBAs to jump right in to Postgres without running into problems.

However, the work on autovacuum seems to be providing a reasonable solution
to that problem.

Re: Buglist

From
Alvaro Herrera
Date:
On Wed, Aug 20, 2003 at 10:31:25AM -0400, Vivek Khera wrote:

> Seriously, how much slower can it be if the backend were to do the
> checking for external references upon updating/deleting a row?  The
> cost would be distributed across time as opposed to concentrated at
> once within a vacuum process.  I am fairly certain it would reduce
> disk bandwidth requirements since at least one necessary page will
> already be in memory.

There's no way to check for "external references", because said
references are actually the running transactions.  So you can't drop a
row until all the transactions that were active during your transaction
are finished.  Certainly your own backend can't do the checking, because
there's no way to even assert that it will be live when those other
transactions finish.  Who will?  The last of said transactions?  It
certainly will be expensive for a backend to keep track of the
deleted/updated tuples by all _other_ backends, just in case...
Long running transactions can't be neglected, so you can't keep it in
memory.

IMVHO the solution to this problem will come in the form of an
autovacuum daemon integrated into the backend...

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Ninguna manada de bestias tiene una voz tan horrible como la humana" (Orual)

Re: Buglist

From
"scott.marlowe"
Date:
On Wed, 20 Aug 2003, Vivek Khera wrote:

> >>>>> "BW" == Bruno Wolff, <Bruno> writes:
>
> >> to see it incremental.  This would result in pretty much near zero
> >> internal fragmentation, I think.
>
> BW> Why do you care about the details of the implementation (rather than
> BW> the performance)? If it were faster to do it that way, that's how it would
> BW> have been done in the first place. The cost of doing the above is almost
> BW> certainly going to be an overall performance loser.
>
> I care for the performance.  And how are you so sure that it was
> faster the way it is now?  Are you sure it was not done this way
> because of ease of implementation?
>
> Seriously, how much slower can it be if the backend were to do the
> checking for external references upon updating/deleting a row?  The
> cost would be distributed across time as opposed to concentrated at
> once within a vacuum process.  I am fairly certain it would reduce
> disk bandwidth requirements since at least one necessary page will
> already be in memory.

Time for a mental exercise.

Our server has 2 users.  Each backend has to check with all the other
backends when it deletes a tuple (every update is also a delete; remember,
every change in an MVCC database is a create / delete cycle).  Let's create
names for the time it takes to do the update / mark deleted versus the
time it takes to contact each of those other backends.  Tw is the time to
do the work here, and Tc is the time to do the cleanup (i.e. vacuum the
tuple).  Note that we'd also need a Ta for answering the requests of all
the other backends, but we can assume that on average, for each request a
child process makes, it will receive exactly that many from each other
backend running.  Let x represent the number of backends.  So the
answering time is Ta = x*Tc:

Time = Tw + Tc + Ta

Time = Tw + Tc + (x * Tc)

Time = Tw + ((x+1) * Tc)

and our cleanup time starts to grow at an ugly rate as the number of
backends increases.  Lazy vacuuming allows the database to reclaim lost
space in the background, as the newer non-full vacuum does.

Many folks mistake this vacuum process for its older, slower cousin, full
vacuum, which does eat a lot more disk bandwidth and slow the machine
down.

On a Dual CPU X86 box a lazy vacuum running in a continuous loop will eat
about 5% of one CPU and drop pgbench scores by 10 to 15%.  The important
thing here, is that the machine will still run quite snappily when you
throw several hundred clients at it, since the lazy vacuum just sits in
the background using the spare cycles and not much more.

That means your storage usage may balloon somewhat under intense usage, but
you won't have an IPC storm kill the performance of the PostgreSQL server.

Knowing the PostgreSQL development team, I'm sure the reasons they chose
this design are clearly stated in the hackers mailing list somewhere in time, so I'm
gonna go look. But trust me on one thing: the guys programming this
database don't do much just because it's easier / faster to implement, without
putting something in the TODO list about making it better some day.


Re: Buglist

From
Stephan Szabo
Date:
On Wed, 20 Aug 2003, Vivek Khera wrote:

> >>>>> "BW" == Bruno Wolff, <Bruno> writes:
>
> >> to see it incremental.  This would result in pretty much near zero
> >> internal fragmentation, I think.
>
> BW> Why do you care about the details of the implementation (rather than
> BW> the performance)? If it were faster to do it that way, that's how it would
> BW> have been done in the first place. The cost of doing the above is almost
> BW> certainly going to be an overall performance loser.
>
> I care for the performance.  And how are you so sure that it was
> faster the way it is now?  Are you sure it was not done this way
> because of ease of implementation?
>
> Seriously, how much slower can it be if the backend were to do the
> checking for external references upon updating/deleting a row?  The
> cost would be distributed across time as opposed to concentrated at
> once within a vacuum process.  I am fairly certain it would reduce
> disk bandwidth requirements since at least one necessary page will
> already be in memory.

In general, excepting a case with only one session running, I would expect
there to exist one or more transactions that can see the old state of the
row.

I would think it'd be something like:

T1: begin;
T1: update foo set bar=cheese where id=3;
T2: begin;
T2: select * from foo;
 -- this should see the old state of the row, so that update better
 -- not have removed the old one.
T1: end;
 -- at this point read committed transactions may see the new
 -- state, unless it's currently running a query which is accessing
 -- that table, and in practice it's probably more complicated than
 -- that (since you might run functions that access the table).
 -- You'd presumably need to wait until say the end of that full
 -- statement at the very least.  And any serializable transactions
 -- would probably mean you can't do it either.

Under some circumstances it might be better, yes.  And there are ways to
do things that don't involve non-overwriting (like Oracle's system) but
they all have downsides. This might not be the best solution, but you're
going to have to put a lot more work into showing that your solution is
better. ;)



Re: Buglist

From
Jan Wieck
Date:
Alvaro Herrera wrote:

> On Wed, Aug 20, 2003 at 10:31:25AM -0400, Vivek Khera wrote:
>
>> Seriously, how much slower can it be if the backend were to do the
>> checking for external references upon updating/deleting a row?  The
>> cost would be distributed across time as opposed to concentrated at
>> once within a vacuum process.  I am fairly certain it would reduce
>> disk bandwidth requirements since at least one necessary page will
>> already be in memory.
>
> There's no way to check for "external references", because said
> references are actually the running transactions.  So you can't drop a
> row until all the transactions that were active during your transaction
> are finished.  Certainly your own backend can't do the checking, because
> there's no way to even assert that it will be live when those other
> transactions finish.  Who will?  The last of said transactions?  It
> certainly will be expensive for a backend to keep track of the
> deleted/updated tuples by all _other_ backends, just in case...
> Long running transactions can't be neglected, so you can't keep it in
> memory.

This isn't so. It's regular backends that turn on the flag in a tuple's
header telling that it's committed-deleted. And they do so during a
normal scan, so even during a SELECT. It's not the backend that actually
deleted a row that knows; it's a later one checking the tuple visibility
that can tell "this one's dead for all of us and will not come alive via
rollback any more".

But, that doesn't help for the index removal explained earlier.


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


Re: Buglist

From
Vivek Khera
Date:
>>>>> "BW" == Bruno Wolff, <Bruno> writes:

BW> It would probably be a lot slower. Any transaction that has started
BW> but not yet finished would need to lock all rows that exist during
BW> the transaction (for serialized transaction isolation you would only

Why would you need to lock rows?  Does the current vacuum need it?  I
don't think it does.  Why can't the functionality of vacuum be made to
operate incrementally per row delete/update?  I don't know if it is
possible.

BW> Also, since at least 7.3, normal vacuums aren't normally going to
BW> affect the performance of your database server that much.

I disagree.  Triggering a vacuum on a db that is nearly saturating the
disk bandwidth has a significant impact.

BW> The main issue against the current vacuum system is that it requires the
BW> DBA knowing what vacuum does and figuring out how it should be used in
BW> their situation to get reasonable performance. This makes it a bit harder
BW> for non-DBAs to jump right in to Postgres without running into problems.

BW> However, the work on autovacuum seems to be providing a reasonable solution
BW> to that problem.

Yes, this is a good thing.

Re: Buglist

From
Vivek Khera
Date:
>>>>> "AH" == Alvaro Herrera <alvherre@dcc.uchile.cl> writes:

>> once within a vacuum process.  I am fairly certain it would reduce
>> disk bandwidth requirements since at least one necessary page will
>> already be in memory.

AH> There's no way to check for "external references", because said
AH> references are actually the running transactions.  So you can't drop a
AH> row until all the transactions that were active during your transaction

How does the backend running vacuum detect that a row has no active
references?  Why can't the current backend know that same information?

Re: Buglist

From
Jan Wieck
Date:
Vivek Khera wrote:

>>>>>> "BW" == Bruno Wolff, <Bruno> writes:
>
>>> to see it incremental.  This would result in pretty much near zero
>>> internal fragmentation, I think.
>
> BW> Why do you care about the details of the implementation (rather than
> BW> the performance)? If it were faster to do it that way, that's how it would
> BW> have been done in the first place. The cost of doing the above is almost
> BW> certainly going to be an overall performance loser.
>
> I care for the performance.  And how are you so sure that it was
> faster the way it is now?  Are you sure it was not done this way
> because of ease of implementation?

Among some locking issues when doing btree deletes as opposed to scan
and insert operations, there is no direct pointer from a data (heap) row
to its index entries. VACUUM remembers all the ctids it removed from
the heap in its batch run and then does a full scan of the indexes to
remove all the index entries pointing to these ctids. Your idea is (so
far) lacking a place to remember all the individually removed rows, and I
assume you're not planning to pay the cost of a full scan over all
indexes of a table to reclaim the space of one data row, are you?

>
> Seriously, how much slower can it be if the backend were to do the
> checking for external references upon updating/deleting a row?  The
> cost would be distributed across time as opposed to concentrated at
> once within a vacuum process.  I am fairly certain it would reduce
> disk bandwidth requirements since at least one necessary page will
> already be in memory.

I am fairly certain that holds true for tables without indexes only.


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


Re: Buglist

From
Bruno Wolff III
Date:
On Wed, Aug 20, 2003 at 12:40:03 -0400,
  Vivek Khera <khera@kcilink.com> wrote:
> >>>>> "BW" == Bruno Wolff, <Bruno> writes:
>
> BW> It would probably be a lot slower. Any transaction that has started
> BW> but not yet finished would need to lock all rows that exist during
> BW> the transaction (for serialized transaction isolation you would only
>
> Why would you need to lock rows?  Does the current vacuum need it?  I
> don't think it does.  Why can't the functionality of vacuum be made to
> operate incrementally per row delete/update?  I don't know if it is
> possible.

How do you plan on knowing when no one is using a row any more?
The approach vacuum uses is not suitable for deleting the row after the
last transaction that can see the row completes. When rows are created
they are marked with the transaction id of the transaction that created
the row. When they are deleted they are marked with the transaction id
of the transaction that deleted the row. Any transaction with an id
between those two ids can see the row. So it isn't the transaction that
deletes a row that needs to worry about marking its space as available,
but the last transaction that a row is visible to that would have to
do it. Naively, this entails scanning the whole database, just like a vacuum
would, after each completed transaction.
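
Incidentally, you can see those markers via the system columns (a quick
sketch; the table name is arbitrary):

CREATE TABLE foo (id INT);
INSERT INTO foo VALUES (1);

-- xmin is the id of the creating transaction; xmax becomes non-zero
-- once some transaction has deleted (or updated) the row:
SELECT xmin, xmax, id FROM foo;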

> BW> Also, since at least 7.3, normal vacuums aren't normally going to
> BW> affect the performance of your database server that much.
>
> I disagree.  Triggering a vacuum on a db that is nearly saturating the
> disk bandwidth has a significant impact.

If you are running that close to the edge you have potential problems
of which running vacuum is just one.

Re: Buglist

From
Vivek Khera
Date:
>>>>> "JW" == Jan Wieck <JanWieck@Yahoo.com> writes:

JW> remove all the index entries pointing to these ctids. Your idea is (so
JW> far) lacking a place to remember all the individually removed rows, and I
JW> assume you're not planning to pay the cost of a full scan over all
JW> indexes of a table to reclaim the space of one data row, are you?

Well, that pretty much kills my idea...  back to autovacuum ;-)


Re: Buglist

From
Dennis Gearon
Date:
I don't think the problem will be deleting the tuple itself, if I've been
following this correctly. If every transaction left its signature in some
common, global array, where each transaction finishing its work could find
out 'who's out there', killing unneeded tuples could be part of the exit
routine.

But getting the indexes cleaned up, that's another story, from what I've been listening to.

Bruno Wolff III wrote:

> On Wed, Aug 20, 2003 at 12:40:03 -0400,
>   Vivek Khera <khera@kcilink.com> wrote:
>
>>>>>>>"BW" == Bruno Wolff, <Bruno> writes:
>>
>>BW> It would probably be a lot slower. Any transaction that has started
>>BW> but not yet finished would need to lock all rows that exist during
>>BW> the transaction (for serialized transaction isolation you would only
>>
>>Why would you need to lock rows?  Does the current vacuum need it?  I
>>don't think it does.  Why can't the functionality of vacuum be made to
>>operate incrementally per row delete/update?  I don't know if it is
>>possible.
>
>
> How do you plan on knowing when no one is using a row any more?
> The approach vacuum uses is not suitable for deleting the row after the
> last transaction that can see the row completes. When rows are created
> they are marked with the transaction id of the transaction that created
> the row. When they are deleted they are marked with the transaction id
> of the transaction that deleted the row. Any transaction with an id
> between those two ids can see the row. So it isn't the transaction that
> deletes a row that needs to worry about marking its space as available,
> but the last transaction that a row is visible to that would have to
> do it. Naively, this entails scanning the whole database, just like a vacuum
> would, after each completed transaction.
>
>
>>BW> Also, since at least 7.3, normal vacuums aren't normally going to
>>BW> affect the performance of your database server that much.
>>
>>I disagree.  Triggering a vacuum on a db that is nearly saturating the
>>disk bandwidth has a significant impact.
>
>
> If you are running that close to the edge you have potential problems
> of which running vacuum is just one.


Re: Buglist

From
Vivek Khera
Date:
>>>>> "BW" == Bruno Wolff, <Bruno> writes:

>> I disagree.  Triggering a vacuum on a db that is nearly saturating the
>> disk bandwidth has a significant impact.

BW> If you are running that close to the edge you have potential problems
BW> of which running vacuum is just one.

Which is why i just laid down some serious buckage for a 14-disk
array...

Re: Buglist

From
Jan Wieck
Date:
Vivek Khera wrote:

>>>>>> "JW" == Jan Wieck <JanWieck@Yahoo.com> writes:
>
> JW> remove all the index entries pointing to these ctid's. Your idea is (so
> JW> far) lacking a place where to remember all the single removed rows and I
> JW> assume you're not planning to pay the cost of a full scan over all
> JW> indexes of a table to reclaim the space of one data row, are you?
>
> Well, that pretty much kills my idea...  back to autovacuum ;-)

Sorry about that.

While we're on the "reclaim" and [non-]overwriting topic, the other day I had
a brilliant idea about index entries, so here comes your chance for
revenge ...

Assume we had multiple ctid slots per index entry. An UPDATE on a
row that doesn't change indexed fields always inserts the same index
value with a new ctid, so it'll come across this index tuple anyway. If
said index tuple has an empty slot, just put the ctid there and be done. If
it does not, check whether the ctids that are there point to "known
dead and ready to vacuum" tuples and make space. If that fails too, well,
then insert another index tuple.

This has some advantages. Updating rows without changing key fields does
not result in constant index growth. An index scan will be fast even for
a value with lots of dead heap tuples. For non-unique indexes, multiple
data rows can "share" one index entry.

Index cleanup for vacuum remains pretty easy. Clear the ctid slots
pointing to removed tuples. Remove index tuples that have all slots cleared.



Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #

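For readers who like to see the shape of the thing, the index tuple Jan is describing might look roughly like this. The struct and the slot count are invented for illustration; nothing like this exists in the PostgreSQL sources:

#include <stdint.h>

typedef struct
{
    uint32_t blkno;             /* heap block number */
    uint16_t offno;             /* line pointer within the block */
} Ctid;

#define CTID_SLOTS 4            /* assumed trade-off, purely hypothetical */

/*
 * One index entry carrying several heap pointers for the same key value.
 * An UPDATE that does not change the indexed field fills an empty slot
 * (or one pointing to a known-dead tuple) instead of inserting a whole
 * new index tuple.
 */
typedef struct
{
    uint16_t nslots_used;
    Ctid     slots[CTID_SLOTS];
    /* ... the key value itself follows here ... */
} MultiCtidIndexTuple;

Tom's objection further down is visible right in the struct: the slot array costs space in every index entry, whether or not the key is ever updated.
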

Re: Buglist

From
Stephan Szabo
Date:
On Wed, 20 Aug 2003, Jan Wieck wrote:

> Alvaro Herrera wrote:
>
> > On Wed, Aug 20, 2003 at 10:31:25AM -0400, Vivek Khera wrote:
> >
> >> Seriously, how much slower can it be if the backend were to do the
> >> checking for external references upon updating/deleting a row?  The
> >> cost would be distributed across time as opposed to concentrated at
> >> once within a vacuum process.  I am fairly certian it would reduce
> >> disk bandwidth requirements since at least one necessary page will
> >> already be in memory.
> >
> > There's no way to check for "external references", because said
> > references are actually the running transactions.  So you can't drop a
> > row until all the transactions that were active during your transaction
> > are finished.  Certainly your own backend can't do the checking, because
> > there's no way to even assert that it will be live when those other
> > transactions finish.  Who will?  The last of said transactions?  It
> > certainly will be expensive for a backend to keep track of the
> > deleted/updated tuples by all _other_ backends, just in case...
> > Long running transactions can't be neglected, so you can't keep it in
> > memory.
>
> This isn't so. It's regular backends that turn on the flag in a tuple's
> header telling that it's committed deleted. And they do so during a
> normal scan, so even during a SELECT. It's not the backend who actually
> deleted a row that knows, it's a later one checking the tuple visibility
> that can tell "this one's dead for all of us and will not come alive via
> rollback any more".

True, but the message being responded to was specifically "if the backend
were to do the checking for external references upon updating/deleting a
row".  In any case, I thought it only does the committed deleted stuff
when it comes upon a row in a scan, which means that it's still not
automatic cleanup in general, since any particular deleted row may not get
looked at for some amount of time after all possible viewers are gone.



Re: Buglist

From
Tom Lane
Date:
Vivek Khera <khera@kcilink.com> writes:
> "JW" == Jan Wieck <JanWieck@Yahoo.com> writes:
> JW> remove all the index entries pointing to these ctid's. Your idea is (so
> JW> far) lacking a place where to remember all the single removed rows and I
> JW> assume you're not planning to pay the cost of a full scan over all
> JW> indexes of a table to reclaim the space of one data row, are you?

> Well, that pretty much kills my idea...  back to autovacuum ;-)

In addition to the index-cleanup issue that Jan explained, there are
locking problems.  The tuple-is-dead hint bit mechanism is very
carefully designed so that a backend can set the hint bits while holding
only a shared lock on the page containing the tuple.  Physically
removing a tuple requires a far stronger lock (see the source code for
details).  Thus, having ordinary interactive backends remove tuples
would have bad consequences for concurrent performance.

But I think the real point here is that there's no reason to think that
doing tuple deletion on-the-fly in foreground transactions is superior
to doing it in background with a vacuum process.  You're taking what
should be noncritical maintenance work and moving it into the critical
paths of your foreground applications.  Not only that, but you're
probably doing more total work per tuple --- VACUUM "batches" its work
in more ways than just the index cleanup aspect, IIRC.

            regards, tom lane

Re: Buglist

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> True, but the message being responded to was specifically "if the backend
> were to do the checking for external references upon updating/deleting a
> row".

It's clearly impossible for a backend to remove a row immediately upon
updating/deleting it, since it cannot know whether it will succeed in
committing its transaction.  The implementable variant of this would
correspond to extending the check-whether-committed-deleted code to see
whether a previously deleted tuple is now removable --- that is, moving
VACUUM's processing of the tuple into the main line.

> In any case, I thought it only does the committed deleted stuff
> when it comes upon a row in a scan, which means that it's still not
> automatic clean up in general since any particular deleted row may not get
> looked at for some amount of time after all possible viewers are gone.

Recall also that "committed deleted" does not mean "safe to remove".
There may still be live transactions that could see the tuple.  The
"committed deleted" bit just exists to allow subsequent visitors to the
row to skip one of the more expensive steps in deciding whether they can
see it or not.

To determine that a row is removable requires additional tests above and
beyond what backends normally do.

            regards, tom lane
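
Reusing the TupleHeader sketch from earlier, the additional test Tom alludes to is essentially a comparison against the oldest transaction id still running anywhere, the "global xmin". Again heavily simplified; compare HeapTupleSatisfiesVacuum in the sources, which also has to deal with in-progress and aborted transactions:

/* A deleted tuple's space may be reclaimed only once its deleting
 * transaction precedes every transaction still running, i.e. once no
 * live snapshot can see the tuple any more. */
static bool
tuple_removable(const TupleHeader *tup, TransactionId global_xmin)
{
    return tup->xmax != InvalidTransactionId   /* actually deleted */
        && tup->xmax < global_xmin;            /* ...before all live xacts */
}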

Re: Buglist

From
Jan Wieck
Date:
Tom Lane wrote:

> Recall also that "committed deleted" does not mean "safe to remove".
> There may still be live transactions that could see the tuple.  The
> "committed deleted" bit just exists to allow subsequent visitors to the
> row to skip one of the more expensive steps in deciding whether they can
> see it or not.
>
> To determine that a row is removable requires additional tests above and
> beyond what backends normally do.

Aah - there is the first bullet hole in my multi-ctid-index idea. Now
the question becomes how expensive these tests are (if a normal backend
can do them at all within reason)?


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


Re: Buglist

From
Tom Lane
Date:
Jan Wieck <JanWieck@Yahoo.com> writes:
> Aah - there is the first bullet hole in my multi-ctid-index-idea. Now
> the question becomes how expensive these tests are (if a normal backend
> can do them at all within reason)?

It's not hugely expensive, IIRC, you just need to make some additional
checks against global xmin (compare HeapTupleSatisfiesVacuum against
the others).  We're already doing something similar for the optimization
that suppresses subsequent heap lookups for globally-dead index tuples.

I'm dubious about the multi-ctid idea though because it would mean
bloating the index tuple header, whether there was any use for multiple
entries or not.  (Could we make the header variable size?  Not sure it's
worth the trouble.)

            regards, tom lane

Re: Buglist

From
Andrew Sullivan
Date:
On Wed, Aug 20, 2003 at 12:40:03PM -0400, Vivek Khera wrote:
> >>>>> "BW" == Bruno Wolff, <Bruno> writes:
> BW> Also, since at least 7.3, normal vacuums aren't normally going to
> BW> affect the performance of your database server that much.
>
> I disagree.  Triggering a vacuum on a db that is nearly saturating the
> disk bandwidth has a significant impact.

Vivek is right about this.  If your system is already very busy, then
a vacuum on a largish table is painful.

I don't actually think having the process done in real time will
help, though -- it seems to me what would be more useful is an even
lazier vacuum: something that could be told "clean up as cycles are
available, but make sure you stay out of the way."  Of course, that's
easy to say glibly, and mighty hard to do, I expect.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Re: Buglist

From
Jan Wieck
Date:
Andrew Sullivan wrote:

> On Wed, Aug 20, 2003 at 12:40:03PM -0400, Vivek Khera wrote:
>> >>>>> "BW" == Bruno Wolff, <Bruno> writes:
>> BW> Also, since at least 7.3, normal vacuums aren't normally going to
>> BW> affect the performance of your database server that much.
>>
>> I disagree.  Triggering a vacuum on a db that is nearly saturating the
>> disk bandwidth has a significant impact.
>
> Vivek is right about this.  If your system is already very busy, then
> a vacuum on a largish table is painful.
>
> I don't actually think having the process done in real time will
> help, though -- it seems to me what would be more useful is an even
> lazier vacuum: something that could be told "clean up as cycles are
> available, but make sure you stay out of the way."  Of course, that's
> easy to say glibly, and mighty hard to do, I expect.

What about a little hint to the buffer management that if it has to
evict another buffer to physically read this one (meaning the buffer
pool was full already) then it will not put this buffer at the top of
the LRU chain but rather at its end? This way a vacuum on a large table
will not cause a complete cache eviction.

Might be a useful hint for sequential scans too.


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #

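In buffer-manager terms the hint could be as small as this: a toy LRU list, not the actual PostgreSQL buffer code, with the hint flag being the assumption here:

#include <stdbool.h>
#include <stddef.h>

typedef struct Buffer
{
    struct Buffer *prev, *next;
    /* ... page contents, pin count, etc. ... */
} Buffer;

typedef struct
{
    Buffer *head;               /* hot end: evicted last */
    Buffer *tail;               /* cold end: evicted first */
} LruList;

/*
 * Insert a freshly read buffer.  A caller doing a big scan (VACUUM,
 * seqscan) passes evict_soon = true, so its pages go to the cold end
 * and cannot flush the whole cache.
 */
static void
lru_insert(LruList *lru, Buffer *buf, bool evict_soon)
{
    if (evict_soon)
    {
        buf->prev = lru->tail;
        buf->next = NULL;
        if (lru->tail)
            lru->tail->next = buf;
        else
            lru->head = buf;
        lru->tail = buf;
    }
    else
    {
        buf->next = lru->head;
        buf->prev = NULL;
        if (lru->head)
            lru->head->prev = buf;
        else
            lru->tail = buf;
        lru->head = buf;
    }
}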

Re: Buglist

From
Tom Lane
Date:
Andrew Sullivan <andrew@libertyrms.info> writes:
>> I disagree.  Triggering a vacuum on a db that is nearly saturating the
>> disk bandwidth has a significant impact.

> Vivek is right about this.  If your system is already very busy, then
> a vacuum on a largish table is painful.

> I don't actually think having the process done in real time will
> help, though -- it seems to me what would be more useful is an even
> lazier vacuum: something that could be told "clean up as cycles are
> available, but make sure you stay out of the way."  Of course, that's
> easy to say glibly, and mighty hard to do, I expect.

I'd love to be able to do that, but I can't think of a good way.

Just nice'ing the VACUUM process is likely to be counterproductive
because of locking issues (priority inversion).  Though if anyone cares
to try it on a heavily-loaded system, I'd be interested to hear the
results...

            regards, tom lane

Re: Buglist

From
Karsten Hilbert
Date:
> it seems to me what would be more useful is an even
> lazier vacuum: something that could be told "clean up as cycles are
> available, but make sure you stay out of the way."  Of course, that's
> easy to say glibly, and mighty hard to do, I expect.
You mean, like, "nice 19" or so?

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346

Re: Buglist

From
Edmund Dengler
Date:
What about the use of priority inheritance to deal with the issue of
priority inversion (a standard methodology within the real-time world)?

Then we could have priorities, but still have low priority processes
bumped up if a high level one is waiting on them.

Regards,
Ed
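
POSIX does expose this for mutexes. A minimal sketch in standard pthreads, on platforms that support _POSIX_THREAD_PRIO_INHERIT; nothing in PostgreSQL currently works this way:

#include <pthread.h>

static pthread_mutex_t lock;

/*
 * Create a mutex using the priority-inheritance protocol: a low-priority
 * thread holding the lock is temporarily boosted to the priority of the
 * highest-priority thread blocked on it, avoiding priority inversion.
 */
static int
init_pi_mutex(void)
{
    pthread_mutexattr_t attr;

    if (pthread_mutexattr_init(&attr) != 0)
        return -1;
    if (pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT) != 0)
        return -1;              /* protocol not supported here */
    return pthread_mutex_init(&lock, &attr);
}

Whether any of this maps onto PostgreSQL's own spinlocks and lightweight locks, let alone onto disk I/O, is of course the real question.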



Re: Buglist

From
Karsten Hilbert
Date:
> Just nice'ing the VACUUM process is likely to be counterproductive
> because of locking issues (priority inversion).
OK, getting out the brown paper bag |-(

Is there any concept of db engine idleness, obtainable from the
state of PG internal variables, that might be leveraged?

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346

Re: Buglist

From
Tom Lane
Date:
Jan Wieck <JanWieck@Yahoo.com> writes:
> What about a little hint to the buffer management that if it has to
> evict another buffer to physically read this one (meaning the buffer
> pool was full already) then it will not put this buffer at the top of
> the LRU chain but rather at its end? This way a vacuum on a large table
> will not cause a complete cache eviction.

I think what we really need is a way to schedule VACUUM's I/O at a lower
priority than normal I/Os.  Wouldn't be very portable :-( ... but if the
OS offers a facility for requesting this, it'd be worth experimenting
with.

            regards, tom lane

Re: Buglist

From
Alvaro Herrera
Date:
On Wed, Aug 20, 2003 at 05:21:35PM -0400, Jan Wieck wrote:

> What about a little hint to the buffer management that if it has to
> evict another buffer to physically read this one (meaning the buffer
> pool was full already) then it will not put this buffer at the top of
> the LRU chain but rather at its end? This way a vacuum on a large table
> will not cause a complete cache eviction.
>
> Might be a useful hint for sequential scans too.

Somebody was playing with using LRU-2 or ARC or some other algorithm for
page replacement instead of the current LRU... wouldn't it help with
this too?

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"La vida es para el que se aventura"

Re: Buglist

From
Martijn van Oosterhout
Date:
On Wed, Aug 20, 2003 at 05:41:18PM -0400, Tom Lane wrote:
> Andrew Sullivan <andrew@libertyrms.info> writes:
> > I don't actually think having the process done in real time will
> > help, though -- it seems to me what would be more useful is an even
> > lazier vacuum: something that could be told "clean up as cycles are
> > available, but make sure you stay out of the way."  Of course, that's
> > easy to say glibly, and mighty hard to do, I expect.
>
> I'd love to be able to do that, but I can't think of a good way.
>
> Just nice'ing the VACUUM process is likely to be counterproductive
> because of locking issues (priority inversion).  Though if anyone cares
> to try it on a heavily-loaded system, I'd be interested to hear the
> results...

How about the really simple solution: explicit yields. Along the lines of:

for each page
  vacuum page
  sleep 5ms

As long as you sleep without holding any locks, this means it would (very
slowly) iterate over the table. If the auto-vacuum can target tables it
would be even more effective. A slightly more sophisticated version would
be:

VACUUM MAX 10MB/s  (or some such syntax)

Then you just keep a count of the number of processed pages and the amount
of time and if you're getting ahead of yourself, sleep a while.

Given lazy vacuum doesn't hold locks for long periods, it could be an idea
to continuously spend 1% of your disk bandwidth on a background vacuum. As
for vacuum full, I don't know if you could do the same thing.

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> "All that is needed for the forces of evil to triumph is for enough good
> men to do nothing." - Edmond Burke
> "The penalty good people pay for not being interested in politics is to be
> governed by people worse than themselves." - Plato

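The bandwidth cap is straightforward to sketch in C: count pages, check the clock, and sleep whenever the scan gets ahead of its budget. The 8kB page size and the 10MB/s figure are just Martijn's example numbers; this is plain POSIX, not actual VACUUM code:

#include <time.h>

#define PAGE_SIZE          8192
#define MAX_BYTES_PER_SEC  (10 * 1024 * 1024)   /* "VACUUM MAX 10MB/s" */

/* Call once per page processed; sleeps if we are ahead of budget.
 * Must be called while holding no locks. */
static void
throttle(long pages_done, const struct timespec *start)
{
    struct timespec now, nap;
    double elapsed, earliest, wait;

    clock_gettime(CLOCK_MONOTONIC, &now);
    elapsed = (now.tv_sec - start->tv_sec)
            + (now.tv_nsec - start->tv_nsec) / 1e9;

    /* earliest time at which this many pages fits the budget */
    earliest = (double) pages_done * PAGE_SIZE / MAX_BYTES_PER_SEC;
    if (earliest > elapsed)
    {
        wait = earliest - elapsed;
        nap.tv_sec  = (time_t) wait;
        nap.tv_nsec = (long) ((wait - (double) nap.tv_sec) * 1e9);
        nanosleep(&nap, NULL);  /* yield without holding any locks */
    }
}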

Re: Buglist

From
"Shridhar Daithankar"
Date:
On 19 Aug 2003 at 15:26, Bo Lorentsen wrote:

> On Tue, 2003-08-19 at 14:31, Shridhar Daithankar wrote:
> > Well, you could look at release notes. That contains lot of information. Of
> > course archives of pgsql-bugs and pgsql-patches are the ultimate unless you
> > plan to delve into CVS history and sources..
> Ok, I just wanted to find something like bugzilla, or an explanation of
> how bugs are guaranteed to be visible. My boss likes to compare this to
> the Mysql model found at: http://bugs.mysql.com/bugstats
>
> > Check this.. http://www.mysql.com/doc/en/ANSI_diff_Transactions.html
> Hmm, it sounds like they have transactions on InnoDB tables, but these
> are really slow, and therefore they put much energy into advertising
> the MyISAM files (non-transactional). Or am I missing something?

Pretty much true, but hard to prove with evidence. I would say drop that line
of argument. Mysql has transactions. Period.

OK, if you talk about transactions across two table types, InnoDB not being
the default and possibly not being free, then it is valid.

First rule of argument: always talk facts.

> > To me, it seems that their definition of transaction is limited to preventing
> > two guys writing to the same row simultaneously, which is of course a limited
> > view of things.
> This sounds like their MyISAM tables, or ???

I haven't used mysql enough to be that much of an expert.. sorry..

>
> > Few major differences I can see right here. Correct me on mysql side.
> >
> > - WAL log
> > - Transactable DDLs
> Yes and lets add :
> - Views
> - subselects
> - plperl, plsql, plpython, plXXX

Extensible operator and type definitions
Table files split on 1GB boundaries
Rules
Inheritance
True foreign keys
Data integrity (you should watch some mysql excerpts produced on advocacy)

Here is one for your reference...
-------------
> * PROPER USAGE OF NULL
>
> mysql> select * from ai_test where id is null;
> +----+-------+
> | id | txt   |
> +----+-------+
> |  1 | hello |
> +----+-------+
> 1 row in set (0.00 sec)
>
> ;-). I digress. Off the top of my head, in no particular order:

You're not trying hard enough:

mysql> create table test3 (a date);
Query OK, 0 rows affected (0.00 sec)

mysql> insert into test3 values (-1);
Query OK, 1 row affected (0.01 sec)

mysql> insert into test3 values ('1996-02-31');
Query OK, 1 row affected (0.00 sec)

mysql> insert into test3 values ('1996-67-31');
Query OK, 1 row affected (0.00 sec)

mysql> select * from test3;
+------------+
| a          |
+------------+
| 0000-00-00 |
| 1996-02-31 |
| 0000-00-00 |
+------------+
3 rows in set (0.00 sec)
-------------

I wouldn't bet my shoe on such a database..

>
> > - Nested transactions coming soon to PG
> > - PITR coming soon to PG
> Not good for arguing with my boss about the future :-)

Maybe so. But a decision maker needs to know the roadmap as well. As I said in
another mail, there exists real-world code for all of this and it is going to
happen. It's not vapourware. If you can press it, I think talking about the
future is a good idea..

> > I would love to see entire checklist but don't have any time to devote to
> > mysql.
> I do understand, and its no pleasure either :-)

Just to be clear, my unenthusiasm about studying mysql has nothing to do with my
time shortage. If I needed to study mysql to understand it better and to promote
postgresql, I would happily do that. But I seriously don't have the time..

Bye
 Shridhar

--
Feel free to contact me (flames about my english and the uselessness of this
driver will be redirected to /dev/null, oh no, it's full...). (Michael Beck,
describing the PC-speaker sound device)


Re: Buglist

From
Andrew Sullivan
Date:
On Wed, Aug 20, 2003 at 05:58:32PM -0400, Tom Lane wrote:
> Jan Wieck <JanWieck@Yahoo.com> writes:
> > the LRU chain but rather at its end? This way a vacuum on a large table
> > will not cause a complete cache eviction.
>
> I think what we really need is a way to schedule VACUUM's I/O at a lower
> priority than normal I/Os.  Wouldn't be very portable :-( ... but if the

Hey, they both sound like nifty ideas to me!  The portability sure
worries me, though, on that I/O trick.  Still, Oracle (f'rinstance)
made all kinds of optimisations for Sun (and conversely) partly
because, I expect, that's where a lot of their users were, and the
performance or reliability gains were significant.  Whether that is
worth doing for PostgreSQL, when there are probably lots of other
targets to aim at, is an open question.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Re: Buglist

From
Andrew Sullivan
Date:
On Thu, Aug 21, 2003 at 03:40:29PM +1000, Martijn van Oosterhout wrote:
> Given lazy vacuum doesn't hold locks for long periods, it could be
> an idea to continuously spend 1% of your disk bandwidth on a
> background vacuum. As for vacuum full, I don't know if you could do
> the same thing.

Assuming that one can keep up with the dust bunnies this way, though,
one wouldn't need to do vacuum full.  This would definitely be a way
cool feature, if implementable.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Re: Buglist

From
Andrew Sullivan
Date:
On Wed, Aug 20, 2003 at 11:41:41PM +0200, Karsten Hilbert wrote:
> You mean, like, "nice 19" or so ?

ISTR someone reporting problems with locking on the performance list
from doing exactly that.  The problem is that the vacuum back end
might take a lock and then not get any processor time -- in which
case everybody else gets their processor slice but can't do anything,
because they have to wait until the niced vacuum process gets back in
line.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Re: Buglist

From
"Shridhar Daithankar"
Date:
On 21 Aug 2003 at 10:59, Andrew Sullivan wrote:

> On Wed, Aug 20, 2003 at 05:58:32PM -0400, Tom Lane wrote:
> > Jan Wieck <JanWieck@Yahoo.com> writes:
> > > the LRU chain but rather at its end? This way a vacuum on a large table
> > > will not cause a complete cache eviction.
> >
> > I think what we really need is a way to schedule VACUUM's I/O at a lower
> > priority than normal I/Os.  Wouldn't be very portable :-( ... but if the
>
> Hey, they both sounds like nifty ideas to me!  The portability sure
> worries me, though, on that I/O trick.  Still, Oracle (f'rinstance)
> made all kinds of optimisations for Sun (and conversely) partly
> because, I expect, that's where a lot of their users were, and the
> performance or reliability gains were significant.  Whether that is
> worth doing for PostgreSQL, when there are probably lots of other
> targets to aim at, is an open question.

Well, if you guys remember my posts on performance recently, the project in
question will probably drift to mysql, as the performance requirement on the
solaris platform seems pretty steep for postgresql.

Personally I think inserting 5M rows into an 11-column table should not take
more than an hour (that's the performance criterion). But apparently postgresql
is not making it, no matter what..

Just an FYI, I think we need to do something for solaris. If a horse does not
drink despite being taken to water, throw it in the water.. After all, it's the
database users who are stuck. Not Sun..

Bye
 Shridhar

--
Fun Facts, #14:    In table tennis, whoever gets 21 points first wins.  That's how
it once was in baseball -- whoever got 21 runs first won.


Re: Buglist

From
"Shridhar Daithankar"
Date:
On 21 Aug 2003 at 11:01, Andrew Sullivan wrote:

> On Thu, Aug 21, 2003 at 03:40:29PM +1000, Martijn van Oosterhout wrote:
> > Given lazy vacuum doesn't hold locks for long periods, it could be
> > an idea to continuously spend 1% of your disk bandwidth on a
> > background vacuum. As for vacuum full, I don't know if you could do
> > the same thing.
>
> Assuming that one can keep up with the dust bunnies this way, though,
> one wouldn't need to do vacuum full.  This would definitely be a way
> cool feature, if implementable.

If a database is clean, i.e. has no dead tuples, an autovacuum daemon with a 1 min
interval can achieve pretty much the same result, can't it?

Bye
 Shridhar

--
Drew's Law of Highway Biology:    The first bug to hit a clean windshield lands
directly in front    of your eyes.


Re: Buglist

From
Andrew Sullivan
Date:
On Thu, Aug 21, 2003 at 08:38:14PM +0530, Shridhar Daithankar wrote:
> If a database is clean i.e. no dead tuple, an autovacuum daemon with 1 min
> interval can achieve pretty much same result, isn't it?

But we're talking about the case of large, busy databases that have
already choked their disks.  We have the same problem here in our
test machines.  We start running load tests, and with vacuums nicely
scheduled and everything we start topping out on the performance
pretty quickly, because of I/O bottlenecks on the database.  We know
the difference in I/O bandwidth between our test env. and the
production env., so we can put in a fudge factor for this; but that's
it.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Re: Buglist

From
"Shridhar Daithankar"
Date:
On 21 Aug 2003 at 11:26, Andrew Sullivan wrote:

> On Thu, Aug 21, 2003 at 08:38:14PM +0530, Shridhar Daithankar wrote:
> > If a database is clean i.e. no dead tuple, an autovacuum daemon with 1 min
> > interval can achieve pretty much same result, isn't it?
>
> But we're talking about the case of large, busy databases that have
> already choked their disks.  We have the same problem here in our
> test machines.  We start running load tests, and with vacuums nicely
> scheduled and everything we start topping out on the performance
> pretty quickly, because of I/O bottlenecks on the database.  We know
> the difference in I/O bandwidth between our test env. and the
> production env., so we can put in a fudge factor for this; but that's
> it.

Well, nothing can help if the database has dead tuples already. Sometimes
somebody has to take the time to run a vacuum full and/or a database reload to
get to a clean state.

The point I am trying to make is to tune the FSM and the autovacuum frequency
such that you catch all the dead tuples in RAM, which is a non-blocking
operation at the expense of some CPU power. I am sure the 1 min autovacuum I
suggested is waaay too aggressive for any scheduled vacuum, isn't it?

It would be really good if vacuum analyze took a lock only on the pages it is
dealing with. That way there would be minimal impact on the rest of the system.
I don't know how it is done as of now.

Ideally a vacuum analyze could run in tight loops wasting minimal CPU. But that
is like making the two poles of the earth hug each other..

Bye
 Shridhar

--
Bolub's Fourth Law of Computerdom:    Project teams detest weekly progress
reporting because it so    vividly manifests their lack of progress.


Re: Buglist

From
Edmund Dengler
Date:
Well, if they are locked waiting on vacuum, then vacuum should upgrade
its priority to that of the highest waiting process (priority inheritance).
This way, vacuum will be running at a priority level equivalent to that of
whoever is waiting on it.

Regards,
Ed



Re: Buglist

From
Andrew Sullivan
Date:
On Thu, Aug 21, 2003 at 12:05:28PM -0400, Edmund Dengler wrote:
> Well, if they are locked waiting on vacuum, then vacuum should upgrade
> its priority to that of the highest waiting process (priority inheritance).
> This way, vacuum will be running at a priority level equivalent to that of
> whoever is waiting on it.

Right, but all that intelligence is something that isn't in there
now.  And anyway, the real issue is I/O, not processor.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Re: Buglist

From
Andrew Sullivan
Date:
On Thu, Aug 21, 2003 at 09:10:34PM +0530, Shridhar Daithankar wrote:
> Well, nothing can help if the database has dead tuples already.
> Sometime somebody has to take time to run vacuum full and/or
> database reload to get a clean state.

But if you have a busy system, you'll have new dead tuples.

> Point I am trying to make is to tune FSM and autovacuum frequency
> such that you catch all the dead tuples in RAM, which is
> non-blocking operation at the expense of some CPU power. I am sure
> 1 min autovacuum I suggested is waaay too aggressive for any
> scheduled vacuum isn't it?

Not for some cases.  In a (say) 40% write situation, you have _lots_ of
dead tuples.  Perhaps you can make the application more efficient,
but that's not always an option (maybe you don't have the code).

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Re: Buglist

From
Edmund Dengler
Date:
What I am pointing out is that this is all the same issue, and that
solutions to the "we can't do priorities because of locking issues" problem
have existed for many years. I/O is the same as processors: it is a resource
that needs managing. So the intelligence can be made to exist, it just
needs to be built.

Now onto other questions: can vacuuming be done without locks? Can it be
done in parts (i.e., lock only a bit)? Can the I/O be better managed? Is
this a general model that would work well?

I have plenty of queries that I would love to run on an "as the system
allows" basis, or on a "keep a bit of spare cycles or I/O for the
important stuff" basis, but which I cannot specify. So a vote from me for any
mechanism that allows priorities to be specified. If this is a desired
feature, then comes the hard part of deciding what is feasible, what can be
done in a reasonable amount of time, and of doing it.

Regards!
Ed



Decent VACUUM (was: Buglist)

From
Manfred Koizar
Date:
On Wed, 20 Aug 2003 15:39:26 -0400, Tom Lane <tgl@sss.pgh.pa.us>
wrote:
>But I think the real point here is that there's no reason to think that
>doing tuple deletion on-the-fly in foreground transactions is superior
>to doing it in background with a vacuum process.  You're taking what
>should be noncritical maintenance work and moving it into the critical
>paths of your foreground applications.  Not only that, but you're
>probably doing more total work per tuple --- VACUUM "batches" its work
>in more ways than just the index cleanup aspect, IIRC.

Yes, I'll sign that, 100%.  That doesn't mean that we couldn't do any
better.  AFAICS Vivek's problem is that it is hard enough to hold a
good part of the working set in the cache, and still his disks are
saturated.  Now a VACUUM not only adds one more process to the disk I/O
contention, but also makes sure that the working set pages are *not*
in memory, which leads to higher I/O rates after the VACUUM.

I can imagine several use cases where only a small part of a large
relation is subject to DELETEs/UPDATEs.  Maybe Vivek's application
falls into this category.

If we teach VACUUM to not read pages that don't contain any dead
tuples, this could be a significant improvement.  I'm envisioning a
data structure (reclaimable space map, RSM) similar to the FSM.
Whenever a backend encounters a dead tuple it inserts a reference to
its page into the RSM.  Dead tuple detection is no problem, it is
already implemented for marking dead index tuples.  VACUUM, when run
in a new mode ("decent"), only checks pages that are listed in the
RSM.  To get full advantage of not doing unnecessary page reads, we'll
also need to redesign the index bulk delete routines.

The autovaccum daemon will watch the RSM and when the number of
entries is above a configurable threshold, it will start a
VACUUM DECENT ...

Servus
 Manfred
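
A first cut at the RSM could be as simple as a fixed-size, deliberately lossy hash of block numbers, filled by backends and drained by the new vacuum mode. All names and sizes below are invented for illustration:

#include <stdint.h>

typedef uint32_t BlockNumber;
#define InvalidBlockNumber 0xFFFFFFFF

#define RSM_SLOTS 8192          /* assumed size; would live in shared memory */

static BlockNumber rsm[RSM_SLOTS];

static void
rsm_init(void)
{
    for (int i = 0; i < RSM_SLOTS; i++)
        rsm[i] = InvalidBlockNumber;
}

/* Backend side: remember that this page may contain reclaimable space.
 * Collisions simply overwrite; the map is allowed to be lossy. */
static void
rsm_note_dead_tuple(BlockNumber blkno)
{
    rsm[blkno % RSM_SLOTS] = blkno;
}

/* VACUUM DECENT side: visit only the remembered pages. */
static void
rsm_drain(void (*vacuum_page)(BlockNumber))
{
    for (int i = 0; i < RSM_SLOTS; i++)
    {
        if (rsm[i] != InvalidBlockNumber)
        {
            vacuum_page(rsm[i]);
            rsm[i] = InvalidBlockNumber;
        }
    }
}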

Re: Decent VACUUM (was: Buglist)

From
Manfred Koizar
Date:
[ still brainstorming ... ]

On Thu, 21 Aug 2003 17:16:50 -0400, Tom Lane <tgl@sss.pgh.pa.us>
wrote:
>> Whenever a backend encounters a dead tuple it inserts a reference to
>> its page into the RSM.
>
>This assumes that backends will visit dead tuples with significant
>probability.  I doubt that assumption is tenable;

Good point.  What about:  Whenever a backend *deletes* a tuple it
inserts a reference to its page into the RSM?  Then an entry in the
RSM doesn't necessarily mean that the referenced page has reclaimable
space, but it would still be valuable information.

Servus
 Manfred

Re: Buglist

From
Manfred Koizar
Date:
On Thu, 21 Aug 2003 21:10:34 +0530, "Shridhar Daithankar"
<shridhar_daithankar@persistent.co.in> wrote:
>Point I am trying to make is to tune FSM and autovacuum frequency
>such that you catch all the dead tuples in RAM

You might be able to catch the pages with dead tuples in RAM, but
currently there's no way to keep VACUUM from reading in all the clean
pages, which can be far more ...

Servus
 Manfred

Re: Decent VACUUM (was: Buglist)

From
Manfred Koizar
Date:
On Thu, 21 Aug 2003 17:56:02 -0400, Tom Lane <tgl@sss.pgh.pa.us>
wrote:
>Conceivably it could be a win, though,
>if you could do frequent "vacuum decent"s and only a full-scan vacuum
>once in awhile (once a day maybe).

That's what I had in mind; similar to the current situation where you
can avoid expensive VACUUM FULL by doing lazy VACUUM frequently
enough.

Servus
 Manfred

Re: Buglist

From
Jan Wieck
Date:
Manfred Koizar wrote:
> On Thu, 21 Aug 2003 21:10:34 +0530, "Shridhar Daithankar"
> <shridhar_daithankar@persistent.co.in> wrote:
>>Point I am trying to make is to tune FSM and autovacuum frequency
>>such that you catch all the dead tuples in RAM
>
> You might be able to catch the pages with dead tuples in RAM, but
> currently there's no way to keep VACUUM from reading in all the clean
> pages, which can be far more ...

Which leads us to a zero gravity vacuum that does the lazy vacuum for
pages currently available in the buffer cache only. And another pg_stat
column telling the number of tuples vacuumed, so that an autovac has a
chance to avoid IO-consuming vacuum runs for relations where 99% of the
dead tuples have been caught in memory.


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


Re: Buglist

From
"Shridhar Daithankar"
Date:
On 21 Aug 2003 at 18:46, Jan Wieck wrote:

> Which leads us to a zero gravity vacuum, that does the lazy vacuum for
> pages currently available in the buffer cache only. And another pg_stat
> column telling the number of tuples vacuumed so that an autovac has a
> chance to avoid IO consuming vacuum runs for relations where 99% of the
> dead tuples have been caught in memory.

Since autovacuum issues vacuum analyze only, is it acceptable to say that this
is taken care of already?

Bye
 Shridhar

--
"One size fits all":    Doesn't fit anyone.


Re: Buglist

From
"Shridhar Daithankar"
Date:
On 21 Aug 2003 at 16:22, Andrew Sullivan wrote:

> On Thu, Aug 21, 2003 at 09:10:34PM +0530, Shridhar Daithankar wrote:
> > Well, nothing can help if the database has dead tuples already.
> > Sometime somebody has to take time to run vacuum full and/or
> > database reload to get a clean state.
>
> But if you have a busy system, you'll have new dead tuples.

Yes, but if you have a big enough FSM, you can afford them, right? At least till
the next vacuum runs..

>
> > Point I am trying to make is to tune FSM and autovacuum frequency
> > such that you catch all the dead tuples in RAM, which is
> > non-blocking operation at the expense of some CPU power. I am sure
> > 1 min autovacuum I suggested is waaay too aggressive for any
> > scheduled vacuum isn't it?
>
> Not for some cases.  In (say) 40% write situation, you have _lots_ of
> dead tuples.  Perhaps you can make the application more efficient,
> but that's not always an option (maybe you don't have the code).

The idea of autovacuum is to reduce the load on vacuum full. If you set
shared_buffers higher and the FSM properly for the update/delete load,
autovacuum is expected to catch most of the dead tuples in the shared cache
only. If it is successful in halving the frequency of vacuum full, that's a
big win, isn't it?


Bye
 Shridhar

--
QOTD:    "It's sort of a threat, you see.  I've never been very good at    them myself, but I'm told they can be very
effective."


Re: Buglist

From
Manfred Koizar
Date:
On Fri, 22 Aug 2003 12:15:33 +0530, "Shridhar Daithankar"
<shridhar_daithankar@persistent.co.in> wrote:
>> Which leads us to a zero gravity vacuum, that does the lazy vacuum for
>> pages currently available in the buffer cache only. [...]
>
>Since autovacuum issues vacuum analyze only, is it acceptable to say that this
>is taken care of already?

Even a plain VACUUM (without FULL) scans the whole relation to find
the (possibly few) pages that need to be changed.  We are trying to
find a way to avoid those needless reads of clean pages, because (a)
they are IOs competing with other disk operations, (b) they push
useful pages out of the OS cache and (c) out of PG shared buffers.  The
latter might become a non-issue with LRU-k, 2Q or ARC.  But (a) and (b)
remain.

Servus
 Manfred

Re: Buglist

From
Shridhar Daithankar
Date:
On Friday 22 August 2003 16:23, Manfred Koizar wrote:
>
> Even a plain VACUUM (without FULL) scans the whole relation to find
> the (possibly few) pages that need to be changed.  We are trying to
> find a way to avoid those needless reads of clean pages, because (a)
> they are IOs competing with other disk operations and (b) they push
> useful pages out of OS cache and (c) of PG shared buffers.  The latter
> might become a non-issue with LRU-k, 2Q or ARC.  But (a) and (b)
> remain.

Umm.. What does the FSM do then? I was under the impression that the FSM stores
page pointers and vacuum works on FSM information only. In that case, it wouldn't
have to waste time finding out which pages to clean.

 Shridhar


Re: Buglist

From
Bruno Wolff III
Date:
On Fri, Aug 22, 2003 at 12:17:41 +0530,
  Shridhar Daithankar <shridhar_daithankar@persistent.co.in> wrote:
>
> Idea of autovacuum is to reduce load on vacuum full. If you set shared_buffers
> higher and FSM properly for he update/delete load, autovacuum is expected to
> catch most of the dead tuples in shared cache only. If it is successful in
> doubling the frequency on vacuum full, that's a big win, isn't it?

If you run a normal vacuum often enough, you shouldn't need to regularly
run vacuum full.

Re: Buglist

From
Manfred Koizar
Date:
On Fri, 22 Aug 2003 16:27:53 +0530, Shridhar Daithankar
<shridhar_daithankar@persistent.co.in> wrote:
>What does the FSM do then?

FSM = Free Space Map.  VACUUM writes information into the FSM, INSERTs
consult the FSM to find pages with free space for new tuples.

> I was under the impression that the FSM stores page
>pointers and vacuum works on FSM information only. In that case, it wouldn't
>have to waste time finding out which pages to clean.

This has been discussed yesterday here and on -hackers ("Decent VACUUM
(was: Buglist)").  We were talking about inventing a second data
structure: RSM.

Servus
 Manfred
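
So in toy form, the two halves of the FSM protocol look like this (invented structure; the real FSM lives in shared memory and is considerably smarter):

#include <stdint.h>

typedef uint32_t BlockNumber;
#define InvalidBlockNumber 0xFFFFFFFF

typedef struct
{
    BlockNumber blkno;
    uint16_t    freebytes;
} FsmEntry;

#define FSM_SLOTS 4096          /* assumed, in the spirit of max_fsm_pages */

static FsmEntry fsm[FSM_SLOTS];
static int      fsm_used;

/* VACUUM side: record free space it found on a page. */
static void
fsm_record(BlockNumber blkno, uint16_t freebytes)
{
    if (fsm_used < FSM_SLOTS)   /* lossy: information is dropped when full */
        fsm[fsm_used++] = (FsmEntry){ blkno, freebytes };
}

/* INSERT side: find a page with enough room, or give up and let the
 * caller extend the relation with a new page at the end. */
static BlockNumber
fsm_find(uint16_t need)
{
    for (int i = 0; i < fsm_used; i++)
        if (fsm[i].freebytes >= need)
            return fsm[i].blkno;
    return InvalidBlockNumber;
}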

Re: Buglist

From
Jan Wieck
Date:
Shridhar Daithankar wrote:

>
> Umm.. What does the FSM do then? I was under the impression that the FSM stores
> page pointers and vacuum works on FSM information only. In that case, it wouldn't
> have to waste time finding out which pages to clean.

It's the other way around! VACUUM scans the tables to find and reclaim
free space and remembers that free space in the FSM. The regular
backends that need storage space to insert tuples then use the free
space in the pages that are recorded in the FSM instead of adding new
pages at the end of the relations.


Jan


--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


Re: [HACKERS] Buglist

From
Tom Lane
Date:
Jan Wieck <JanWieck@Yahoo.com> writes:
> Shridhar Daithankar wrote:
>> Umm.. What does the FSM do then? I was under the impression that the FSM stores
>> page pointers and vacuum works on FSM information only. In that case, it wouldn't
>> have to waste time finding out which pages to clean.

> It's the other way around! VACUUM scans the tables to find and reclaim
> free space and remembers that free space in the FSM.

Right.  One big question mark in my mind about these "partial vacuum"
proposals is whether they'd still allow adequate FSM information to be
maintained.  If VACUUM isn't looking at most of the pages, there's no
very good way to acquire info about where there's free space.

            regards, tom lane

Re: [HACKERS] Buglist

From
"Shridhar Daithankar"
Date:
On 22 Aug 2003 at 10:45, Tom Lane wrote:

> Right.  One big question mark in my mind about these "partial vacuum"
> proposals is whether they'd still allow adequate FSM information to be
> maintained.  If VACUUM isn't looking at most of the pages, there's no
> very good way to acquire info about where there's free space.

Somehow it needs to get two types of information:

A. whether any transaction is accessing a page
B. whether a page contains any free space.

Vacuum needs to look for pages not in A but in B. Can the storage manager
maintain two lists/hashes with minimal cost? In that case, all unlocked,
not-in-transaction pages could be a much smaller subset.

Does it sound bizarre?


Bye
 Shridhar

--
Chemicals, n.:    Noxious substances from which modern foods are made.


Re: [HACKERS] Buglist

From
Jan Wieck
Date:
Tom Lane wrote:

> Right.  One big question mark in my mind about these "partial vacuum"
> proposals is whether they'd still allow adequate FSM information to be
> maintained.  If VACUUM isn't looking at most of the pages, there's no
> very good way to acquire info about where there's free space.

That's why I think it needs one more pg_stat column to count the number
of vacuumed tuples. If one does

     tuples_updated + tuples_deleted - tuples_vacuumed

he'll get approximately the number of tuples a regular vacuum might be
able to reclaim. If that number is really small, there is no need for
autovacuum to cause any big trouble by scanning the relation.

Another way to give autovacuum some hints would be to return some number
as the command's tuple count from vacuum, like the number of tuples actually
vacuumed. That together with the new number of reltuples in pg_class
will tell autovacuum how frequently a relation really needs scanning.


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #
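
On the autovacuum side, Jan's heuristic would reduce to a few lines. tuples_vacuumed is the proposed counter, not an existing pg_stat field, and the 5% threshold is an arbitrary number for illustration:

/* Estimate how many tuples a vacuum could reclaim and skip the
 * relation if a scan is not worth the IO. */
static int
relation_worth_vacuuming(long tuples_updated,
                         long tuples_deleted,
                         long tuples_vacuumed,   /* the proposed counter */
                         long reltuples)
{
    long reclaimable = tuples_updated + tuples_deleted - tuples_vacuumed;

    /* vacuum only if more than 5% of the relation is reclaimable */
    return reclaimable > 0 && reclaimable * 20 > reltuples;
}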


Re: [HACKERS] Buglist

From
Jan Wieck
Date:
Jan Wieck wrote:

> Another way to give autovacuum some hints would be to return some number
> as the command's tuple count from vacuum, like the number of tuples actually
> vacuumed. That together with the new number of reltuples in pg_class
> will tell autovacuum how frequently a relation really needs scanning.

Which actually would be much better because it'd work without the
statistics collector configured for gathering IO stats.


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


Re: [HACKERS] Buglist

From
"Shridhar Daithankar"
Date:
On 22 Aug 2003 at 11:03, Jan Wieck wrote:

> Tom Lane wrote:
>
> > Right.  One big question mark in my mind about these "partial vacuum"
> > proposals is whether they'd still allow adequate FSM information to be
> > maintained.  If VACUUM isn't looking at most of the pages, there's no
> > very good way to acquire info about where there's free space.
>
> That's why I think it needs one more pg_stat column to count the number
> of vacuumed tuples. If one does
>
>      tuples_updated + tuples_deleted - tuples_vacuumed
>
> he'll get approximately the number of tuples a regular vacuum might be
> able to reclaim. If that number is really small, there is no need for
> autovacuum to cause any big trouble by scanning the relation.
>
> Another way to give autovacuum some hints would be to return some number
> as the command's tuple count from vacuum, like the number of tuples actually
> vacuumed. That together with the new number of reltuples in pg_class
> will tell autovacuum how frequently a relation really needs scanning.

This kind of information does not really help autovacuum. If we are talking
about modifying the backend stat collection algorithm so that vacuum does
minimum work, it has to translate to a cheaper vacuum analyze, so that
autovacuum can fire it at will any time. In the best case, another resident
process like the stat collector can keep cleaning the dead tuples.

This information must be in terms of pages and actually be maintained as
per-page stats. Looking at tuple counts does not give vacuum any idea of how
it is going to flush cache lines, either in postgresql or in the OS. I doubt
it will help the vacuum command in itself to be any lighter or more efficient.

If it is easy to do, I would favour maintaining two page maps as I mentioned
in another mail: one for pages in cache but not locked by any transaction and
another for pages which have some free space. If it is rare for a page to be
full, we can skip the latter one. I think that could be good enough.




Bye
 Shridhar

--
Office Automation:    The use of computers to improve efficiency in the office    by
removing anyone you would want to talk with over coffee.


Re: Buglist

From
"Claudio Lapidus"
Date:
Bruno Wolff III wrote:
> If you run a normal vacuum often enough, you shouldn't need to regularly
> run vacuum full.

Hmm, here we have a certain table, sort of a FIFO: rows get inserted all the
time, lie there for a couple of hours and get deleted "the other end
around". We run normal vacuum almost constantly, but the table keeps
growing. We had to implement a 'vacuum full' once a week to keep it under
control.

cl.


Re: Buglist

From
"Shridhar Daithankar"
Date:
On 22 Aug 2003 at 12:35, Claudio Lapidus wrote:

> Bruno Wolff III wrote:
> > If you run a normal vacuum often enough, you shouldn't need to regularly
> > run vacuum full.
>
> Hmm, here we have a certain table, sort of a FIFO: rows get inserted all the
> time, lie there for a couple of hours and get deleted "the other end
> around". We run normal vacuum almost constantly, but the table keeps
> growing. We had to implement a 'vacuum full' once a week to keep it under
> control.

I think you could benefit from some kind of emulated partitioning. If there
is a large number of rows getting deleted at the end of two hours, you could
just drop that sub-table..

Bye
 Shridhar

--
We come to bury DOS, not to praise it.(Paul Vojta, vojta@math.berkeley.edu,
paraphrasing a quote of Shakespeare)


Re: [HACKERS] Buglist

From
"Matthew T. O'Connor"
Date:
On Fri, 2003-08-22 at 11:08, Jan Wieck wrote:
> > Another way to give autovacuum some hints would be to return some number
> > as commandtuples from vacuum. like the number of tuples actually
> > vacuumed. That together with the new number of reltuples in pg_class
> > will tell autovacuum how frequent a relation really needs scanning.
>
> Which actually would be much better because it'd work without the
> statistics collector configured for gathering IO stats.

Which is certainly a good thing.  Using the stats system is a measurable
performance hit.

I still want to play with pg_autovacuum ignoring the stats system and
just looking at the FSM data.


Re: Buglist

From
"scott.marlowe"
Date:
On Fri, 22 Aug 2003, Claudio Lapidus wrote:

> Bruno Wolff III wrote:
> > If you run a normal vacuum often enough, you shouldn't need to regularly
> > run vacuum full.
>
> Hmm, here we have a certain table, sort of a FIFO: rows get inserted all the
> time, lie there for a couple of hours and get deleted "the other end
> around". We run normal vacuum almost constantly, but the table keeps
> growing. We had to implement a 'vacuum full' once a week to keep it under
> control.

You may just need to increase your fsm settings in postgresql.conf to get
a regular vacuum to work the way you want.


Re: [HACKERS] Buglist

From
"Matthew T. O'Connor"
Date:
On Fri, 2003-08-22 at 11:17, Shridhar Daithankar wrote:
> On 22 Aug 2003 at 11:03, Jan Wieck wrote:
> > That's why I think it needs one more pg_stat column to count the number
> > of vacuumed tuples. If one does
> >
> >      tuples_updated + tuples_deleted - tuples_vacuumed
> >
> > he'll get approximately the number of tuples a regular vacuum might be
> > able to reclaim. If that number is really small, no need for autovacuum
> > to cause any big trouble by scanning the relation.
> >
> > Another way to give autovacuum some hints would be to return some number
> > as commandtuples from vacuum, like the number of tuples actually
> > vacuumed. That together with the new number of reltuples in pg_class
> > will tell autovacuum how frequently a relation really needs scanning.
>
> This kind of information does not really help autovacuum. If we are talking
> about modifying the backend stat collection algorithm so that vacuum does
> minimum work, it has to translate to a cheaper vacuum analyze that
> autovacuum can fire at will any time. In the best case, another resident
> process like the stats collector can keep cleaning out the dead tuples.

I believe what Jan is talking about is knowing when to use a normal
vacuum, and when to do a vacuum decent.  So his proposal is working
under the assumption that there would be a cheaper vacuum analyze that
can be run most of the time.
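
To make Jan's arithmetic concrete, this is roughly what autovacuum could
compute per table. Note that n_tup_upd and n_tup_del are real columns of
pg_stat_user_tables, while tuples_vacuumed is only the proposed counter
and does not exist yet:

  -- Hypothetical: estimate how many tuples a plain VACUUM could reclaim.
  SELECT relname,
         n_tup_upd + n_tup_del - tuples_vacuumed AS reclaimable_estimate
    FROM pg_stat_user_tables
   ORDER BY reclaimable_estimate DESC;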


Re: [HACKERS] Buglist

From
"Matthew T. O'Connor"
Date:
On Fri, 2003-08-22 at 10:45, Tom Lane wrote:
> Jan Wieck <JanWieck@Yahoo.com> writes:
> Right.  One big question mark in my mind about these "partial vacuum"
> proposals is whether they'd still allow adequate FSM information to be
> maintained.  If VACUUM isn't looking at most of the pages, there's no
> very good way to acquire info about where there's free space.

Well, pg_autovacuum really needs to be looking at the FSM anyway.  It
could look at the FSM, and choose to do a normal vacuum when the amount
of FSM data becomes inadequate.  Of course I'm not sure how you would
differentiate a busy table with "inadequate" FSM data from an inactive
table that doesn't even register in the FSM.  Perhaps you would still
need to consult the stats system.


Re: [HACKERS] Buglist

From
Manfred Koizar
Date:
On Fri, 22 Aug 2003 10:45:50 -0400, Tom Lane <tgl@sss.pgh.pa.us>
wrote:
>One big question mark in my mind about these "partial vacuum"
>proposals is whether they'd still allow adequate FSM information to be
>maintained.  If VACUUM isn't looking at most of the pages, there's no
>very good way to acquire info about where there's free space.

VACUUM has accurate information about the pages it just visited.  Free
space information for pages not touched by VACUUM is still in the FSM,
unless free space on a page is too low to be interesting.  VACUUM has
to merge these two lists and throw away entries with little free space
if running out of room.

Thus we might end up with new, almost-full pages in the FSM while pages
with more free space lie around unregistered, because a previous VACUUM
threw their entries away when other pages had more free space to offer.

Considering that
 .  the FSM is lossy per definitionem
 .  we are targeting relations with large passive areas
 .  a decent VACUUM shall not replace lazy VACUUM
I see no problem here.

Future advice could be: "VACUUM DECENT every hour, VACUUM daily,
VACUUM FULL once a year", where the first two could be scheduled by
autovacuum ...
autovacuum ...

Servus
 Manfred

Re: [HACKERS] Buglist

From
Jan Wieck
Date:
Okay, my proposal would be to have a VACUUM mode where it tells the
buffer manager to only return a page if it is already in memory, and
some "not cached" if it would have to read it from disk, and simply skip
the page in that case. Probably needs some modifications in VACUUM's
FSM handling, but basically that's it. It'll still cause IO for the
resulting index bulk cleaning, so I don't know how efficient it'll be
after all.
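
A minimal sketch of the buffer manager entry point this would need (the
function and helper names are invented for illustration; this is not an
actual PostgreSQL API):

  /*
   * Hypothetical ReadBuffer() variant for a CACHEONLY vacuum: return
   * the page only if it is already in the shared buffer pool, never
   * going to disk for it.
   */
  Buffer
  ReadBufferIfCached(Relation reln, BlockNumber blockNum)
  {
      /* invented helper: probe the buffer hash table without doing IO */
      Buffer      buf = LookupSharedBuffer(reln, blockNum);

      if (BufferIsValid(buf))
          return buf;             /* resident: vacuum this page */

      return InvalidBuffer;       /* "not cached": caller skips the page */
  }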

The number of vacuumed tuples returned will tell autovacuum how useful
this vacuum scan was. The less useful it is, the less frequently it'll
be scheduled. There is no point in vacuuming a 50M row table every
hour when the average number of tuples reclaimed is in the hundreds. I
don't intend to avoid a full table scan completely, only to lower the
frequency of them. It will require some fuzzy logic in autovacuum to
figure out whether a CACHEONLY vacuum for a table needs to be more or
less frequent to find more tuples, though.

So much for what I have in mind. Now what are you proposing down there?
Where do you intend to hold that "per page stat", and what exactly is
maintaining it? And please don't give us any vague "some other resident
process". That only indicates you don't really know what it requires for
a process to be able to read or write data in PostgreSQL.


Jan

Shridhar Daithankar wrote:

> On 22 Aug 2003 at 11:03, Jan Wieck wrote:
>
>> Tom Lane wrote:
>>
>> > Jan Wieck <JanWieck@Yahoo.com> writes:
>> >> Shridhar Daithankar wrote:
>> >>> Umm.. What does FSM do then? I was under the impression that FSM stores
>> >>> page pointers and vacuum works on FSM information only. In that case, it
>> >>> wouldn't have to waste time finding out which pages to clean.
>> >
>> >> It's the other way around! VACUUM scans the tables to find and reclaim
>> >> free space and remembers that free space in the FSM.
>> >
>> > Right.  One big question mark in my mind about these "partial vacuum"
>> > proposals is whether they'd still allow adequate FSM information to be
>> > maintained.  If VACUUM isn't looking at most of the pages, there's no
>> > very good way to acquire info about where there's free space.
>>
>> That's why I think it needs one more pg_stat column to count the number
>> of vacuumed tuples. If one does
>>
>>      tuples_updated + tuples_deleted - tuples_vacuumed
>>
>> he'll get approximately the number of tuples a regular vacuum might be
>> able to reclaim. If that number is really small, no need for autovacuum
>> to cause any big trouble by scanning the relation.
>>
>> Another way to give autovacuum some hints would be to return some number
>> as commandtuples from vacuum, like the number of tuples actually
>> vacuumed. That together with the new number of reltuples in pg_class
>> will tell autovacuum how frequently a relation really needs scanning.
>
> This kind of information does not really help autovacuum. If we are talking
> about modifying the backend stat collection algorithm so that vacuum does
> minimum work, it has to translate to a cheaper vacuum analyze that
> autovacuum can fire at will any time. In the best case, another resident
> process like the stats collector can keep cleaning out the dead tuples.
>
> This information must be in terms of pages and actually be maintained as a
> per-page stat. Looking at tuple counts does not give vacuum any idea how it
> is going to flush cache lines, either in postgresql or in the OS. I doubt it
> will help the vacuum command in itself to be any lighter or more efficient.
>
> If it is easy to do, I would favour maintaining two page maps, as I mentioned
> in another mail: one for pages in cache but not locked by any transaction and
> another for pages which have some free space. If it is rare for a page to be
> full, we can skip the latter one. I think that could be good enough.
>
>
>
>
> Bye
>  Shridhar
>
> --
> Office Automation:    The use of computers to improve efficiency in the office    by
> removing anyone you would want to talk with over coffee.


--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


Re: [HACKERS] Buglist

From
Manfred Koizar
Date:
On Fri, 22 Aug 2003 12:18:02 -0400, Jan Wieck <JanWieck@Yahoo.com>
wrote:
>Okay, my proposal would be to have a VACUUM mode where it tells the
>buffer manager to only return a page if it is already in memory

But how can it know?  Yes, we know exactly what we have in PG shared
buffers.  OTOH we keep telling people that they should configure
moderate values for shared_buffers because the OS is better at
caching.  Your CACHEONLY VACUUM wouldn't catch those pages that are in
the OS cache but not in the shared buffers, although they are
retrievable at almost the same low cost.

We should not try to avoid _any_ physical disk access.  It's good
enough to avoid useless reads.  Hence my proposal for a reclaimable
space list ...
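
For illustration, a rough sketch of what such a structure might look like,
by analogy with the FSM (all names here are invented; nothing like this
exists in the source tree):

  /*
   * Hypothetical reclaimable space map (RSM): a small, lossy,
   * shared-memory list of pages believed to contain dead tuples.
   * Backends would add entries as they delete tuples; a "decent"
   * VACUUM would visit only these pages.
   */
  typedef struct RSMEntry
  {
      Oid         relid;          /* which relation */
      BlockNumber blkno;          /* page that may hold dead tuples */
  } RSMEntry;

  typedef struct RSM
  {
      int         numEntries;     /* current fill */
      int         maxEntries;     /* fixed size: the map is lossy */
      RSMEntry    entries[1];     /* variable-length array, FSM-style */
  } RSM;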

Servus
 Manfred

Re: [HACKERS] Buglist

From
Tom Lane
Date:
Jan Wieck <JanWieck@Yahoo.com> writes:
> Okay, my proposal would be to have a VACUUM mode where it tells the
> buffer manager to only return a page if it is already in memory, and
> some "not cached" if it would have to read it from disk, and simply skip
> the page in that case.

Since no such call is available at the OS level, this would only work
well with very large shared_buffers settings (ie, you try to rely on
PG shared buffers to the exclusion of kernel disk cache).  AFAIK the
general consensus is that that's not a good way to run Postgres.

            regards, tom lane

Re: [HACKERS] Buglist

From
Jan Wieck
Date:
Tom Lane wrote:

> Jan Wieck <JanWieck@Yahoo.com> writes:
>> Okay, my proposal would be to have a VACUUM mode where it tells the
>> buffer manager to only return a page if it is already in memory, and
>> some "not cached" if it would have to read it from disk, and simply skip
>> the page in that case.
>
> Since no such call is available at the OS level, this would only work
> well with very large shared_buffers settings (ie, you try to rely on
> PG shared buffers to the exclusion of kernel disk cache).  AFAIK the
> general consensus is that that's not a good way to run Postgres.

Oh-kay ... so yes Manfred, your RSM is probably the better way.


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


Re: Buglist

From
Jan Wieck
Date:
Claudio Lapidus wrote:

> Bruno Wolff III wrote:
>> On Fri, Aug 22, 2003 at 12:17:41 +0530,
>>   Shridhar Daithankar <shridhar_daithankar@persistent.co.in> wrote:
>> >
>> > Idea of autovacuum is to reduce load on vacuum full. If you set
>> > shared_buffers higher and FSM properly for the update/delete load,
>> > autovacuum is expected to catch most of the dead tuples in shared
>> > cache only. If it succeeds in halving the frequency of vacuum full,
>> > that's a big win, isn't it?
>>
>> If you run a normal vacuum often enough, you shouldn't need to regularly
>> run vacuum full.
>
> Hmm, here we have a certain table, sort of a FIFO: rows get inserted all
> the time, lie there for a couple of hours, and then get deleted from the
> other end. We run normal vacuum almost constantly, but the table keeps
> growing. We had to implement a 'vacuum full' once a week to keep it under
> control.

What is the size of your database, how many tables do you have and what
are your FSM settings?


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


Re: Buglist

From
David Siebert
Date:
I learned MySQL, then went on to Postgres. I chose Postgres for my
in-house project just because of the row locking and transactions.
Looking back, I could have used MySQL. I have yet to use stored
procedures or many of the high-level functions of Postgres; however,
transactions make things so much cleaner. I do not think MySQL is a bad
system. It works well for many people in many situations. I think that
MySQL and SAP getting together could be very exciting. When it comes to
SQL databases, I would say we have a wealth of good choices. This "if I
use PHP I have to use MySQL" is a load of tripe. PHP can work just fine
with Postgres. I hate to even suggest this, but has anyone thought of
adding PHP to the languages that you can use to write stored procedures
in Postgres?


Roderick A. Anderson wrote:

> [...]



Re: Buglist

From
Franco Bruno Borghesi
Date:
Initial beta release of plPHP http://www.postgresql.org/news/143.html

On Tue, 2003-08-19 at 10:46, David Siebert wrote:
> I hate to even suggest this, but has anyone thought of adding PHP to the
> languages that you can use to write stored procedures in Postgres?
>
> [...]




Re: Buglist

From
Bruce Momjian
Date:
Jan Wieck wrote:
> Manfred Koizar wrote:
> > On Thu, 21 Aug 2003 21:10:34 +0530, "Shridhar Daithankar"
> > <shridhar_daithankar@persistent.co.in> wrote:
> >>Point I am trying to make is to tune FSM and autovacuum frequency
> >>such that you catch all the dead tuples in RAM
> >
> > You might be able to catch the pages with dead tuples in RAM, but
> > currently there's no way to keep VACUUM from reading in all the clean
> > pages, which can be far more ...
>
> Which leads us to a zero gravity vacuum, that does the lazy vacuum for
> pages currently available in the buffer cache only. And another pg_stat
> column telling the number of tuples vacuumed so that an autovac has a
> chance to avoid IO consuming vacuum runs for relations where 99% of the
> dead tuples have been caught in memory.

What would be really interesting is to look for dead tuples when you
write/discard a buffer page and add them to the FSM --- that is probably
the latest time you still have access to the page, and the moment when
the space has the highest probability of being recyclable.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: Buglist

From
Jan Wieck
Date:
Bruce Momjian wrote:

> Jan Wieck wrote:
>> Manfred Koizar wrote:
>> > On Thu, 21 Aug 2003 21:10:34 +0530, "Shridhar Daithankar"
>> > <shridhar_daithankar@persistent.co.in> wrote:
>> >>Point I am trying to make is to tune FSM and autovacuum frequency
>> >>such that you catch all the dead tuples in RAM
>> >
>> > You might be able to catch the pages with dead tuples in RAM, but
>> > currently there's no way to keep VACUUM from reading in all the clean
>> > pages, which can be far more ...
>>
>> Which leads us to a zero gravity vacuum, that does the lazy vacuum for
>> pages currently available in the buffer cache only. And another pg_stat
>> column telling the number of tuples vacuumed so that an autovac has a
>> chance to avoid IO consuming vacuum runs for relations where 99% of the
>> dead tuples have been caught in memory.
>
> What would be really interesting is to look for dead tuples when you
> write/discard a buffer page and add them to the FSM --- that is probably
> the latest time you still have access to the page, and the moment when
> the space has the highest probability of being recyclable.
>

True, but it's again in the time-critical path of a foreground
application, because it's done by a backend that has to read another page
on behalf of a waiting client right now. Also, there is only a small
probability that all the pages required to do the index purge for the
reclaimed tuples are in memory too. Plus, there is still no direct
connection between a heap tuple's ctid and the physical location of its
index tuples, so purging an index requires a full scan of it, which is
best done in bulk operations.


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #


Re: Buglist

From
Vivek Khera
Date:
>>>>> "TL" == Tom Lane <tgl@sss.pgh.pa.us> writes:

TL> Just nice'ing the VACUUM process is likely to be counterproductive
TL> because of locking issues (priority inversion).  Though if anyone cares
TL> to try it on a heavily-loaded system, I'd be interested to hear the
TL> results...

Tried it once.  Didn't make much difference, except that vacuum took
longer than normal.  I didn't see any deadlocks.

I actually figured out what my main problem was: vacuum every 6 hours
on my two busiest tables was taking longer than 6 hours when we were
very busy...

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.                Khera Communications, Inc.
Internet: khera@kciLink.com       Rockville, MD       +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/

Re: Buglist

From
Jan Wieck
Date:
Tom Lane wrote:
> Jan Wieck <JanWieck@Yahoo.com> writes:
>
>>What about a little hint to the buffer management that if it has to
>>evict another buffer to physically read this one (meaning the buffer
>>pool was full already) then it will not put this buffer at the top of
>>the LRU chain but rather at its end? This way a vacuum on a large table
>>will not cause a complete cache eviction.
>
>
> I think what we really need is a way to schedule VACUUM's I/O at a lower
> priority than normal I/Os.  Wouldn't be very portable :-( ... but if the
> OS offers a facility for requesting this, it'd be worth experimenting
> with.

Whatever priority it has, I think the fact that a VACUUM kicks
everything out of a carefully populated buffer cache, possibly
replacing it with data of little to no interest at all, leaves some
room for improvement. And that one single optimizer mistake, choosing a
seqscan over an index scan for a huge table, does the same doesn't
strike me as smart either.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



Re: Decent VACUUM (was: Buglist)

From
Tom Lane
Date:
Manfred Koizar <mkoi-pg@aon.at> writes:
> better.  AFAICS Vivek's problem is that it is hard enough to hold a
> good part of the working set in the cache, and still his disks are
> saturated.  Now a VACUUM not only adds one more process to disk I/O
> contention, but also makes sure that the working set pages are *not*
> in memory which leads to higher I/O rates after the VACUUM.

We have had some people looking at improved buffer management
algorithms; LRU-2 or something smarter would help.  I dunno whether
we can dissuade the kernel from flushing its cache though.

> If we teach VACUUM to not read pages that don't contain any dead
> tuples, this could be a significant improvement.  I'm envisioning a
> data structure (reclaimable space map, RSM) similar to the FSM.
> Whenever a backend encounters a dead tuple it inserts a reference to
> its page into the RSM.

This assumes that backends will visit dead tuples with significant
probability.  I doubt that assumption is tenable; it's certainly not
if you assume that no backend is doing seqscans.  (And if they are,
then VACUUM is not the only I/O culprit...)

            regards, tom lane

Re: Decent VACUUM (was: Buglist)

From
Tom Lane
Date:
Manfred Koizar <mkoi-pg@aon.at> writes:
> Good point.  What about:  Whenever a backend *deletes* a tuple it
> inserts a reference to its page into the RSM?  Then an entry in the
> RSM doesn't necessarily mean that the referenced page has reclaimable
> space, but it would still be valuable information.

That might work if the RSM were lossless, but in practice I think it'd
have to be lossy, like the FSM.  Which would mean that you'd still have
to do full-scan vacuums fairly regularly to make sure you hadn't
forgotten any freeable tuples.  Conceivably it could be a win, though,
if you could do frequent "vacuum decent"s and only a full-scan vacuum
once in a while (once a day maybe).

            regards, tom lane

Re: Buglist

From
Bruce Momjian
Date:
Vivek Khera wrote:
> >>>>> "TL" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>
> TL> Just nice'ing the VACUUM process is likely to be counterproductive
> TL> because of locking issues (priority inversion).  Though if anyone cares
> TL> to try it on a heavily-loaded system, I'd be interested to hear the
> TL> results...
>
> Tried it once.  Didn't make much difference, except that vacuum took
> longer than normal.  I didn't see any deadlocks.
>
> I actually figured out what my main problem was: vacuum every 6 hours
> on my two busiest tables was taking longer than 6 hours when we were
> very busy...

Hold --- a non-FULL vacuum is taking 6+ hours on two tables?  That seems
impossible.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: Buglist

From
Bruce Momjian
Date:
Jan Wieck wrote:
> Tom Lane wrote:
> > Jan Wieck <JanWieck@Yahoo.com> writes:
> >
> >>What about a little hint to the buffer management that if it has to
> >>evict another buffer to physically read this one (meaning the buffer
> >>pool was full already) then it will not put this buffer at the top of
> >>the LRU chain but rather at its end? This way a vacuum on a large table
> >>will not cause a complete cache eviction.
> >
> >
> > I think what we really need is a way to schedule VACUUM's I/O at a lower
> > priority than normal I/Os.  Wouldn't be very portable :-( ... but if the
> > OS offers a facility for requesting this, it'd be worth experimenting
> > with.
>
> Whatever priority it has, I think the fact that a VACUUM kicks
> everything out of a carefully populated buffer cache, possibly
> replacing it with data of little to no interest at all, leaves some
> room for improvement. And that one single optimizer mistake, choosing a
> seqscan over an index scan for a huge table, does the same doesn't
> strike me as smart either.

I am hoping a smarter cache replacement policy, hopefully for 7.5, will
prevent VACUUM from pushing out frequently accessed pages.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: [HACKERS] Decent VACUUM (was: Buglist)

From
Curt Sampson
Date:
On Thu, 21 Aug 2003, Tom Lane wrote:

> We have had some people looking at improved buffer management
> algorithms; LRU-2 or something smarter would help.  I dunno whether
> we can dissuade the kernel from flushing its cache though.

Using open/read/write system calls, you can't.  You could always use an
OS like Solaris, though, which should detect the sequential read and
avoid blowing out the buffer cache.

Most operating systems support the madvise system call, which does let
you say things like "I'm accessing this sequentially" and "I don't
need this to be buffered any more," though how much attention operating
systems pay to this advice varies with the OS. However, it turns out to
be non-trivial to get postgres to use mmap for data blocks, since you
can't actually write any data to an mmap'd block until you've confirmed
that the log entry has been written, because once you've touched data in
an mmap'd block you have no way of stopping it from being written to
the disk right away.
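
For the curious, a minimal example of the calls in question (standard
POSIX API; the file name is made up and error handling is trimmed):

  #include <sys/mman.h>
  #include <sys/stat.h>
  #include <fcntl.h>
  #include <unistd.h>

  int main(void)
  {
      int fd = open("datafile", O_RDONLY);
      struct stat st;
      fstat(fd, &st);

      /* Map the file and declare the access pattern up front. */
      void *p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
      madvise(p, st.st_size, MADV_SEQUENTIAL); /* "I'll read this in order" */

      /* ... scan the mapping here ... */

      /* Tell the kernel it may drop these pages from its cache. */
      madvise(p, st.st_size, MADV_DONTNEED);

      munmap(p, st.st_size);
      close(fd);
      return 0;
  }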

cjs
--
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.NetBSD.org
    Don't you know, in this new Dark Age, we're all light.  --XTC

Re: Buglist

From
Vivek Khera
Date:
>>>>> "BM" == Bruce Momjian <pgman@candle.pha.pa.us> writes:


BM> Hold --- a non-FULL vacuum is taking 6+ hours on two tables?  That seems
BM> impossible.

Well, did I mention I'm saturating my disk I/O bandwidth at the same
time with other queries? ;-)

Re: Buglist

From
Vivek Khera
Date:
>>>>> "BM" == Bruce Momjian <pgman@candle.pha.pa.us> writes:

>>
>> Well, did I mention I'm saturating my disk I/O bandwidth at the same
>> time with other queries? ;-)

BM> But six hours.  It is my understanding that a sequential scan is roughly
BM> the same load as a non-FULL vacuum.  Are you saying a sequential scan
BM> takes 6+ hours too?  How can any work get done?

Well, it is a big downward spiral once you saturate your disks.  You
can't get queries done quickly enough, table size increases with all
those updates, and you have to run vacuum to help that, which slows the
queries even more.  Lather, rinse, repeat.

The new server, with 14 ultra-160 SCSI disks on a hardware RAID10 is
going to go to the data center tomorrow, and hopefully will be live by
the weekend (if I can get eRServer to replicate the data to the slave
without having to shutdown the whole system).

Then I'll let ya know how long the vacuum takes ;-)

PS: Last time I tried a vacuum full on my largest table, I gave up
after 14 hours of downtime.  That was not good for our business...

Re: Buglist

From
Bruce Momjian
Date:
Vivek Khera wrote:
> >>>>> "BM" == Bruce Momjian <pgman@candle.pha.pa.us> writes:
>
>
> BM> Hold --- a non-FULL vacuum is taking 6+ hours on two tables?  That seems
> BM> impossible.
>
> Well, did I mention I'm saturating my disk I/O bandwidth at the same
> time with other queries? ;-)

But six hours.  It is my understanding that a sequential scan is roughly
the same load as a non-FULL vacuum.  Are you saying a sequential scan
takes 6+ hours too?  How can any work get done?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: Buglist

From
Christopher Browne
Date:
After a long battle with technology, khera@kcilink.com (Vivek Khera), an earthling, wrote:
>>>>>> "TL" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>
> TL> Just nice'ing the VACUUM process is likely to be counterproductive
> TL> because of locking issues (priority inversion).  Though if anyone cares
> TL> to try it on a heavily-loaded system, I'd be interested to hear the
> TL> results...
>
> Tried it once.  Didn't make much difference, except that vacuum took
> longer than normal.  I didn't see any deadlocks.
>
> I actually figured out what my main problem was: vacuum every 6 hours
> on my two busiest tables was taking longer than 6 hours when we were
> very busy...

I "wedged" a database server once that way; it was busy, busy, busy
with a multiplicity of processes trying to simultaneously vacuum the
same table.

The "new generation" resolution to that is pg_autovacuum; if you're
running a pre-7.3 version, a good idea is basically to have a vacuum
script that checks a "lock file" and exits if it sees that another
process is already busy vacuuming.
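
A minimal sketch of such a wrapper (the paths, database name, and lock
handling are illustrative only; a production script would also want to
detect and clear stale locks):

  #!/bin/sh
  # Run a VACUUM, but never let two of them pile up on each other.
  LOCKFILE=/var/run/vacuum.lock

  if [ -e "$LOCKFILE" ]; then
      echo "another vacuum is still running, exiting" >&2
      exit 0
  fi

  trap 'rm -f "$LOCKFILE"' EXIT
  echo $$ > "$LOCKFILE"

  psql -d mydb -c 'VACUUM ANALYZE;'
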
--
output = reverse("gro.mca" "@" "enworbbc")
http://www.ntlug.org/~cbbrowne/postgresql.html
"I am aware of the benefits  of a micro kernel approach.  However, the
fact remains  that Linux is  here, and GNU  isn't --- and  people have
been working on Hurd for a lot longer than Linus has been working on
Linux." -- Ted T'so, 1992.

Re: Buglist

From
Christopher Browne
Date:
After a long battle with technology, JanWieck@Yahoo.com (Jan Wieck), an earthling, wrote:
> Tom Lane wrote:
>> Jan Wieck <JanWieck@Yahoo.com> writes:
>>
>>> What about a little hint to the buffer management that if it has to
>>> evict another buffer to physically read this one (meaning the
>>> buffer pool was full already) then it will not put this buffer at
>>> the top of the LRU chain but rather at its end? This way a vacuum
>>> on a large table will not cause a complete cache eviction.
>> I think what we really need is a way to schedule VACUUM's I/O at a
>> lower
>> priority than normal I/Os.  Wouldn't be very portable :-( ... but if the
>> OS offers a facility for requesting this, it'd be worth experimenting
>> with.
>
> Whatever priority it has, I think the fact that a VACUUM kicks
> everything out of a carefully populated buffer cache, possibly
> replacing it with data of little to no interest at all, leaves some
> room for improvement. And that one single optimizer mistake, choosing
> a seqscan over an index scan for a huge table, does the same doesn't
> strike me as smart either.

[Thinking out loud...]

There wouldn't be some way of marking the pages that are read in for
a VACUUM as having the lowest possible priority, would there?

It's no grand insult if VACUUM consumes _some_ buffer cache pages, but
if it were set at low priority, once read in, would it not be possible
for those blocks to quickly get reused by the later VACUUM work, thus
limiting the degree to which the cache of _useful_ data got spilled
out?

I suppose this might turn a simple LRU queue into something resembling
a priority queue, but it's a thought...
--
If this was helpful, <http://svcs.affero.net/rm.php?r=cbbrowne> rate me
http://www3.sympatico.ca/cbbrowne/sap.html
"He who  writes the code gets  to choose his license,  and nobody else
gets to complain." -- Linus Torvalds

Re: [HACKERS] Decent VACUUM (was: Buglist)

From
Bruce Momjian
Date:
Manfred Koizar wrote:
> [ still brainstorming ... ]
>
> On Thu, 21 Aug 2003 17:16:50 -0400, Tom Lane <tgl@sss.pgh.pa.us>
> wrote:
> >> Whenever a backend encounters a dead tuple it inserts a reference to
> >> its page into the RSM.
> >
> >This assumes that backends will visit dead tuples with significant
> >probability.  I doubt that assumption is tenable;
>
> Good point.  What about:  Whenever a backend *deletes* a tuple it
> inserts a reference to its page into the RSM?  Then an entry in the
> RSM doesn't necessarily mean that the referenced page has reclaimable
> space, but it would still be valuable information.

Added to TODO:

    * Maintain a map of recently-expired pages so vacuum can reclaim
      free space without a sequential scan

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: [HACKERS] Buglist

From
Bruce Momjian
Date:
Jan Wieck wrote:
> Tom Lane wrote:
>
> > Jan Wieck <JanWieck@Yahoo.com> writes:
> >> Okay, my proposal would be to have a VACUUM mode where it tells the
> >> buffer manager to only return a page if it is already in memory, and
> >> some "not cached" if it would have to read it from disk, and simply skip
> >> the page in that case.
> >
> > Since no such call is available at the OS level, this would only work
> > well with very large shared_buffers settings (ie, you try to rely on
> > PG shared buffers to the exclusion of kernel disk cache).  AFAIK the
> > general consensus is that that's not a good way to run Postgres.
>
> Oh-kay ... so yes Manfred, your RSM is probably the better way.

Added to TODO.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073