Thread: Autovacuum improvements

Autovacuum improvements

From
Alvaro Herrera
Date:
I've been thinnking how to improve autovacuum so that we can convince
more people that it can be enabled by default.  Here are my thoughts.
There are two areas of improvements:

1. scheduling, and
2. process handling, i.e., how to have multiple vacuum processes running  at any time.

I ripped out the part about having multiple "vacuum queues", as it was
incomplete and it was also getting too complex.  We need to discuss how
to do that, because it's a fundamental part of this proposal; the idea
is to be able to have several vacuums running at any time, but we need
to find a way to specify a policy for it.


Process Handling
================

My idea here is to morph the current autovacuum daemon from an agent
that itself runs a vacuum command, into something that launches other
processes to run those vacuum commands.  I'll call this "the autovacuum
launcher process", or the launcher for short.  The idea here is that the
launcher can take care of the scheduling while the worker processes do
their work.  If the launcher then determines that a particular instant
there should be two vacuums running, then it simply starts two worker
processes.

The launcher would be running continuously, akin to the postmaster, but
would be obviously under control of the latter, so it's postmaster's
responsability to start and stop the launcher.  The launcher would be
connected to shared memory, so it can scan system catalogs to load the
schedule (stored in system catalogs) into memory.  If the launcher dies,
the postmaster should treat it like any other process' crash and cause a
restart cycle.

The workers would not be postmaster's direct children, which could be a
problem.  I'm open to ideas here, but I don't like using the postmaster
directly as a launcher, because of the shmem connection, which would
take robustness away from the postmaster.  One idea to solve this is to
have the launcher process communicate child process IDs to the
postmaster, so that when it (the postmaster) wants to stop, it has those
additional PIDs in its process list and can signal them to stop.  The
launcher process would also signal when it detects that one of the
workers stopped, and the postmaster would remove that process from the
list.  This communication could be made to happen via named pipes, and
since the messages are so simple, there's no reliability concern for the
postmaster; it's very easy to verify that a message is correct by
checking whether the process is actually killable by kill(0).

Another idea that I discarded was to have the launcher communicate back
to the postmaster when new workers should be started.  My fear is that
this type of communication (a lot more complex that just sending a PID)
could be a cause for postmaster instability.


Scheduling
==========

We introduce the following concepts:

1. table groups.  We'll have a system catalog for storing OID and group
name, and another catalog for membership, linking relid to group OID.

pg_av_tablegroup tgrname    name

pg_av_tgroupmembers groupid    oid relid        oid


2. interval groups.  We'll have a catalog for storing igroup name and
OID, and another catalog for membership.  We identify an interval by:  - month of year  - day of month  - day of week
-start time of day  - end time of day
 

This is modelled after crontabs.

pg_av_intervalgroupigrname    name

pg_av_igroupmembersgroupid    oidmonth        intdom        intdow        intstarttime    timetzendtime    timetz

Additionally, we'll have another catalog on which we'll store table
groups to interval groups relationships.  On that catalog we'll also
store those autovacuum settings that we want to be able to override:
whether to disable it for this interval group, or the values for the
vacuum/analyze equations.

pg_av_scheduletgroup                oidigroup                oidenabled            boolqueue
intvac_base_thresh   intvac_scale_factor    floatanl_base_thresh    intanl_scal_factor    floatvac_cost_delay
intvac_cost_limit       intfreeze_min_age        intfreeze_max_age        int
 


So the scheduler, at startup, loads the whole schedule in memory, and
then wakes up at reasonable intervals and checks whether these equations
hold for some of the tables it's monitoring.  If they do, then launch a
new worker process to do the job.

We need a mechanism for having the scheduler rescan the schedule when a
user modifies the catalog -- maybe having a trigger that sends a signal
to the process is good enough (implementation detail: the signal must be
routed via the postmaster, since the backend cannot hope to know the
scheduler's PID.  This is easy enough to do.)

-- 
Alvaro Herrera                          Developer, http://www.PostgreSQL.org/
"The problem with the facetime model is not just that it's demoralizing, but
that the people pretending to work interrupt the ones actually working."
         (Paul Graham)
 


Re: Autovacuum improvements

From
Darcy Buskermolen
Date:
On Sunday 14 January 2007 05:18, Alvaro Herrera wrote:
> I've been thinnking how to improve autovacuum so that we can convince
> more people that it can be enabled by default.  Here are my thoughts.
> There are two areas of improvements:
>
> 1. scheduling, and
> 2. process handling, i.e., how to have multiple vacuum processes running
>    at any time.
>
> I ripped out the part about having multiple "vacuum queues", as it was
> incomplete and it was also getting too complex.  We need to discuss how
> to do that, because it's a fundamental part of this proposal; the idea
> is to be able to have several vacuums running at any time, but we need
> to find a way to specify a policy for it.
====8<  snip >8====
> So the scheduler, at startup, loads the whole schedule in memory, and
> then wakes up at reasonable intervals and checks whether these equations
> hold for some of the tables it's monitoring.  If they do, then launch a
> new worker process to do the job.
>
> We need a mechanism for having the scheduler rescan the schedule when a
> user modifies the catalog -- maybe having a trigger that sends a signal
> to the process is good enough (implementation detail: the signal must be
> routed via the postmaster, since the backend cannot hope to know the
> scheduler's PID.  This is easy enough to do.)

While we are talking autovacuum improvements, I'd like to also see some better 
logging, something that is akin to the important information of vacuum 
verbose being logged to a table or baring that the error_log.  I'd like to be 
able to see what was done, and how long it took to do for each relation 
touched by av.  A thought, having this information may even be usefull for 
the above thought of scheduler because we may be able to build some sort of 
predictive scheduling into this.


Re: Autovacuum improvements

From
"Joshua D. Drake"
Date:
> While we are talking autovacuum improvements, I'd like to also see some better 
> logging, something that is akin to the important information of vacuum 
> verbose being logged to a table or baring that the error_log.  I'd like to be 
> able to see what was done, and how long it took to do for each relation 
> touched by av.  A thought, having this information may even be usefull for 
> the above thought of scheduler because we may be able to build some sort of 
> predictive scheduling into this.

This plays back to the vacuum summary idea that I requested:

http://archives.postgresql.org/pgsql-hackers/2005-07/msg00451.php

(Man our new search engine is so much better than the old one :))

Joshua D. Drake



> 
> ---------------------------(end of broadcast)---------------------------
> TIP 7: You can help support the PostgreSQL project by donating at
> 
>                 http://www.postgresql.org/about/donate
> 


-- 
     === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997            http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/



Re: Autovacuum improvements

From
"Matthew T. O'Connor"
Date:
First, thanks for working on this.  I hope to be helpful with the design 
discussion and possibly some coding if I can find the time.

My initial reaction to this proposal is that it seems overly complex, 
however I don't see a more elegant solution.  I'm a bit concerned that 
most users won't figure out all the knobs.

Alvaro Herrera wrote:
> I've been thinking how to improve autovacuum so that we can convince
> more people that it can be enabled by default.  

I would like to see it enabled by default too, however the reason it 
isn't already enabled by default is that it caused failures in the 
regression test when we tried to turn it on during the 8.2 dev cycle and 
it was too close to beta to fix everything.  All this new machinery is 
great, but it doesn't address that problem.

> Here are my thoughts.
> There are two areas of improvements:
> 
> 1. scheduling, and
> 2. process handling, i.e., how to have multiple vacuum processes running
>    at any time.

Fail enough, but I would say the two biggest area for improvement are 
scheduling and preventing "HOT" tables from becoming vacuum starved 
(essentially what you said, but with a different emphasis).

[snip]

> Process Handling
> ================
> 
> My idea here is to morph the current autovacuum daemon from an agent
> that itself runs a vacuum command, into something that launches other
> processes to run those vacuum commands.  I'll call this "the autovacuum
> launcher process", or the launcher for short.  The idea here is that the
> launcher can take care of the scheduling while the worker processes do
> their work.  If the launcher then determines that a particular instant
> there should be two vacuums running, then it simply starts two worker
> processes.

How about calling it the autovacuum_master process?

[snip autovacuum launcher process description]

That all sounds reasonable to me.  I think the harder part is what you 
are getting at below (how to get the launcher to figure out what to 
vacuum when).

> Scheduling
> ==========
> We introduce the following concepts:
> 
> 1. table groups.  We'll have a system catalog for storing OID and group
> name, and another catalog for membership, linking relid to group OID.
> 
> pg_av_tablegroup
>   tgrname    name
> 
> pg_av_tgroupmembers
>   groupid    oid
>   relid        oid>
> 2. interval groups.  We'll have a catalog for storing igroup name and
> OID, and another catalog for membership.  We identify an interval by:
>    - month of year
>    - day of month
>    - day of week
>    - start time of day
>    - end time of day
> 
> This is modelled after crontabs.
> 
> pg_av_intervalgroup
>  igrname    name
> 
> pg_av_igroupmembers
>  groupid    oid
>  month        int
>  dom        int
>  dow        int
>  starttime    timetz
>  endtime    timetz

This seems to assume that the start and end time for an interval will be 
on the same day, you probably need to specify a start month, dom, dow, 
time and an end month, dom, dow and time.

Since this is modeled after cron, do we allow wild-cards, or any of the 
other cron tricks like */20 or 1-3,5,7,9-11?

Also your notation above is ambiguous, it took me a while to realize 
that pg_av_igroupmembers.groupid wasn't referencing the id from 
pg_av_tablegroup.

> Additionally, we'll have another catalog on which we'll store table
> groups to interval groups relationships.  On that catalog we'll also
> store those autovacuum settings that we want to be able to override:
> whether to disable it for this interval group, or the values for the
> vacuum/analyze equations.
> 
> pg_av_schedule
>  tgroup                oid
>  igroup                oid
>  enabled            bool
>  queue                int
>  vac_base_thresh    int
>  vac_scale_factor    float
>  anl_base_thresh    int
>  anl_scal_factor    float
>  vac_cost_delay        int
>  vac_cost_limit        int
>  freeze_min_age        int
>  freeze_max_age        int
> 

What is queue for?

> So the scheduler, at startup, loads the whole schedule in memory, and
> then wakes up at reasonable intervals and checks whether these equations
> hold for some of the tables it's monitoring.  If they do, then launch a
> new worker process to do the job.
> 
> We need a mechanism for having the scheduler rescan the schedule when a
> user modifies the catalog -- maybe having a trigger that sends a signal
> to the process is good enough (implementation detail: the signal must be
> routed via the postmaster, since the backend cannot hope to know the
> scheduler's PID.  This is easy enough to do.)

This all looks reasonable if not a bit complex.  Question, what happens 
to the current pg_autovacuum relation?

Also what about system defaults, will we have a hard coded default 
interval of always on, and one default table group that contains all the 
tables with one default entry in pg_av_schedule?

I think we need more discussion on scheduling, we need to make sure this 
solves the vacuum starvation problem.  Does the launcher process 
consider each row in pg_av_schedule that applies at the current time 
separately?  That is say there are three entries in pg_av_schedule that 
apply right now, does that mean that the launcher can fire off three 
different vacuums?  Perhaps we need to add a column to pg_av_tablegroup 
that specifies the max number of concurrent worker processes for this 
table group.

Also, I don't think we need the concept of queues as described in recent 
threads.  I think the idea of the queues was the the system would be 
able to automatically find small tables and vacuum them frequently, in 
this proposal the admin would have to create a group for small tables 
and manually add tables to the group and make sure that there are enough 
worker processes for that group to prevent vacuum starvation.  Perhaps 
we can create a dynamic group that includes all tables with less than a 
certain number of rows or blocks?

Thanks for working on this!

Matt O'Connor





Re: Autovacuum improvements

From
Tom Lane
Date:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> I've been thinnking how to improve autovacuum so that we can convince
> more people that it can be enabled by default.  Here are my thoughts.
> There are two areas of improvements:

> 1. scheduling, and
> 2. process handling, i.e., how to have multiple vacuum processes running
>    at any time.

Actually the reason it's not enabled by default today has nothing to do
with either of those; it's

3. Unexpected side effects on foreground processes, such as surprising
failures of DROP DATABASE commands.  (See archives for details.)

Until (3) is addressed I don't think there is any chance of having
autovac on by default, and so worrying about (1) and (2) seems a bit
premature.  Or at least, if you want to work on those fine, but don't
expect that it will alter the fact that the factory default is "off".
        regards, tom lane


Re: Autovacuum improvements

From
Alvaro Herrera
Date:
Tom Lane wrote:

> Actually the reason it's not enabled by default today has nothing to do
> with either of those; it's
> 
> 3. Unexpected side effects on foreground processes, such as surprising
> failures of DROP DATABASE commands.  (See archives for details.)

The referred to thread starts here:

http://archives.postgresql.org/pgsql-hackers/2006-08/msg01814.php

> Until (3) is addressed I don't think there is any chance of having
> autovac on by default, and so worrying about (1) and (2) seems a bit
> premature.  Or at least, if you want to work on those fine, but don't
> expect that it will alter the fact that the factory default is "off".

Hmm, right.  The mentioned problems are:

> * manual ANALYZE issued by regression tests fails because autovac is
> analyzing the same table concurrently.

> * contrib tests fail in their repeated drop/create database operations
> because autovac is connected to that database.  (pl tests presumably
> have same issue.)

I suggest we should fix at least the second problem and then turn
autovac on by default, to see if there are more hurdles (and to get more
autovacuum testing during this development cycle, at least as far as
regression tests are concerned).  We can turn it back off after the 8.3
cycle is done, if we don't find it living up to expectations.

I'm not sure how to fix the second problem.  If it was autovac's ANALYZE
that was failing, ISTM it would be a simple problem, but we don't have
much control over the regression test's own ANALYZEs.

One idea would be to introduce the concept of launcher process I
mentioned, and have it act like the bgwriter for checkpoints: have it
start the analyze when backends request it, and then inform when the
analyze is done.  So if an analyze is already running, then the launcher
does nothing except inform the backend when the analyze is finished.


So a sort of roadmap for my proposal would be to first introduce the
autovacuum launcher, and have backends communicate with it instead of
doing the work by themselves; and then introduce the scheduling concept
into the launcher.

In fact, if we have the scheduler be a separate process from the
launcher, the scheduler could be pluggable: sites for which the current
autovacuum is enough just use today's autovacuum as scheduler, and sites
which need more elaborate configuration just turn on the advanced
module.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: Autovacuum improvements

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Hmm, right.  The mentioned problems are:

>> * manual ANALYZE issued by regression tests fails because autovac is
>> analyzing the same table concurrently.

This problem might have gone away since then --- I think we are now
taking a lock to ensure only one ANALYZE per table at a time; so the
manual ANALYZEs should only be delayed a bit not report errors.

>> * contrib tests fail in their repeated drop/create database operations
>> because autovac is connected to that database.  (pl tests presumably
>> have same issue.)

The DROP is at risk, but CREATE is also at risk because autovac feels
free to connect to template0.  (One of the reasons we invented template0
was to prevent CREATE DATABASE failures due to someone-else-connected,
but autovac has broken that idea.)

Possibly we could handle these by extending create/drop db to check
whether a process-connected-to-the-target-db is an autovac, and if so
send it a SIGINT and wait for the process to terminate, instead of
failing.
        regards, tom lane


Re: Autovacuum improvements

From
Alvaro Herrera
Date:
Matthew T. O'Connor wrote:

> Alvaro Herrera wrote:
> >I've been thinking how to improve autovacuum so that we can convince
> >more people that it can be enabled by default.  
> 
> I would like to see it enabled by default too, however the reason it 
> isn't already enabled by default is that it caused failures in the 
> regression test when we tried to turn it on during the 8.2 dev cycle and 
> it was too close to beta to fix everything.  All this new machinery is 
> great, but it doesn't address that problem.

See my reply to Tom on this topic.


> >pg_av_igroupmembers
> > groupid    oid
> > month        int
> > dom        int
> > dow        int
> > starttime    timetz
> > endtime    timetz
> 
> This seems to assume that the start and end time for an interval will be 
> on the same day, you probably need to specify a start month, dom, dow, 
> time and an end month, dom, dow and time.

Actually, I was thinking that if you want intervals that cross day
boundaries, you just add more tuples (one which finishes at 23:59:59 and
another which starts at 00:00:00 the next day).

> Since this is modeled after cron, do we allow wild-cards, or any of the 
> other cron tricks like */20 or 1-3,5,7,9-11?

Wildcards yes (using NULL), but not the rest because it would make the
autovacuum code responsible for parsing the values which I don't think
is a good idea.  And it's not normalized anyway.

> Also your notation above is ambiguous, it took me a while to realize 
> that pg_av_igroupmembers.groupid wasn't referencing the id from 
> pg_av_tablegroup.

Hmm, yeah, that one is referencing pg_av_intervalgroup.


> >pg_av_schedule
> > tgroup                oid
> > igroup                oid
> > enabled            bool
> > queue                int
> > vac_base_thresh    int
> > vac_scale_factor    float
> > anl_base_thresh    int
> > anl_scal_factor    float
> > vac_cost_delay        int
> > vac_cost_limit        int
> > freeze_min_age        int
> > freeze_max_age        int
> >
> 
> What is queue for?

Sorry, that was part of the queue stuff which I then deleted :-)


> >So the scheduler, at startup, loads the whole schedule in memory, and
> >then wakes up at reasonable intervals and checks whether these equations
> >hold for some of the tables it's monitoring.  If they do, then launch a
> >new worker process to do the job.
> >
> >We need a mechanism for having the scheduler rescan the schedule when a
> >user modifies the catalog -- maybe having a trigger that sends a signal
> >to the process is good enough (implementation detail: the signal must be
> >routed via the postmaster, since the backend cannot hope to know the
> >scheduler's PID.  This is easy enough to do.)
> 
> This all looks reasonable if not a bit complex.  Question, what happens 
> to the current pg_autovacuum relation?

I had two ideas: one was to make pg_autovacuum hold default config for
all tables not mentioned in any group, so sites which are OK with 8.2's
representation can still use it.  The other idea was to remove it and
replace it with this mechanism.

> Also what about system defaults, will we have a hard coded default 
> interval of always on, and one default table group that contains all the 
> tables with one default entry in pg_av_schedule?

Yes, that's what I had in mind.

> I think we need more discussion on scheduling, we need to make sure this 
> solves the vacuum starvation problem.  Does the launcher process 
> consider each row in pg_av_schedule that applies at the current time 
> separately?  That is say there are three entries in pg_av_schedule that 
> apply right now, does that mean that the launcher can fire off three 
> different vacuums?  Perhaps we need to add a column to pg_av_tablegroup 
> that specifies the max number of concurrent worker processes for this 
> table group.

My idea was to assign each table, or maybe each group, to a queue, and
then have as much workers as there are queues.  So you could put them
all in a single queue and it would mean there can be at most one vacuum
running at any time.  Or you could put each group in a queue, and then
there could be as many workers as there are groups.  Or you could mix.

And also there would be a "autovac concurrency limit", which would be
a GUC var saying how many vacuums to have at any time.

> Also, I don't think we need the concept of queues as described in recent 
> threads.  I think the idea of the queues was the the system would be 
> able to automatically find small tables and vacuum them frequently, in 
> this proposal the admin would have to create a group for small tables 
> and manually add tables to the group and make sure that there are enough 
> worker processes for that group to prevent vacuum starvation.  Perhaps 
> we can create a dynamic group that includes all tables with less than a 
> certain number of rows or blocks?

Yeah, my idea of "queues" was slightly different than the queues that
were being discussed.  I was thinking that queues would just be a means
to group the groups to limit concurrency while at the same time prevent
starvation.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Autovacuum improvements

From
Alvaro Herrera
Date:
Tom Lane wrote:

> >> * contrib tests fail in their repeated drop/create database operations
> >> because autovac is connected to that database.  (pl tests presumably
> >> have same issue.)
> 
> The DROP is at risk, but CREATE is also at risk because autovac feels
> free to connect to template0.  (One of the reasons we invented template0
> was to prevent CREATE DATABASE failures due to someone-else-connected,
> but autovac has broken that idea.)
> 
> Possibly we could handle these by extending create/drop db to check
> whether a process-connected-to-the-target-db is an autovac, and if so
> send it a SIGINT and wait for the process to terminate, instead of
> failing.

Hmm, I can see having DROP DATABASE just stopping the autovacuum (since
the work will be thrown away), but is a good idea to stop it on CREATE
DATABASE?  I think it may be better to have CREATE DATABASE wait until
the vacuum is finished.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: Autovacuum improvements

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> Possibly we could handle these by extending create/drop db to check
>> whether a process-connected-to-the-target-db is an autovac, and if so
>> send it a SIGINT and wait for the process to terminate, instead of
>> failing.

> Hmm, I can see having DROP DATABASE just stopping the autovacuum (since
> the work will be thrown away), but is a good idea to stop it on CREATE
> DATABASE?  I think it may be better to have CREATE DATABASE wait until
> the vacuum is finished.

It can always be done again later.  I think that the arguments of (1)
only one code path needed and (2) not making the user wait should win
out over concerns about possible wasted autovac effort.  (The wasted
effort should generally be pretty small anyway, since a template
database probably doesn't contain any large tables.)
        regards, tom lane


Re: Autovacuum improvements

From
Peter Eisentraut
Date:
Alvaro Herrera wrote:
> Hmm, I can see having DROP DATABASE just stopping the autovacuum
> (since the work will be thrown away),

For that same reason DROP DATABASE could just cut all connections to the 
database.  Or at least queue up and wait until the session is over.  
(The latter would correspond to what DROP TABLE does on a table that is 
in use.)

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Re: Autovacuum improvements

From
"Matthew T. O'Connor"
Date:
Alvaro Herrera wrote:
> Matthew T. O'Connor wrote:
>   
>> Alvaro Herrera wrote:
>>     
>>> pg_av_igroupmembers
>>> groupid    oid
>>> month        int
>>> dom        int
>>> dow        int
>>> starttime    timetz
>>> endtime    timetz
>>>       
>> This seems to assume that the start and end time for an interval will be 
>> on the same day, you probably need to specify a start month, dom, dow, 
>> time and an end month, dom, dow and time.
>>     
>
> Actually, I was thinking that if you want intervals that cross day
> boundaries, you just add more tuples (one which finishes at 23:59:59 and
> another which starts at 00:00:00 the next day).
>   

This still seems ambiguous to me, how would I handle a maintenance 
window of Weekends from Friday at 8PM though Monday morning at 6AM? My 
guess from what said is:
mon dom dow starttime endtime
null  null    6      20:00      null
null  null    1      null          06:00

So how do we know to vacuum on Saturday or Sunday?  I think clearly 
defined intervals with explicit start and stop times is cleaner.

>> This all looks reasonable if not a bit complex.  Question, what happens 
>> to the current pg_autovacuum relation?
>>     
>
> I had two ideas: one was to make pg_autovacuum hold default config for
> all tables not mentioned in any group, so sites which are OK with 8.2's
> representation can still use it.  The other idea was to remove it and
> replace it with this mechanism.
>
>   

Probably best to just get rid of it.  GUC variables hold the defaults 
and if we create a default interval / group, it will also have defaults.

>> I think we need more discussion on scheduling, we need to make sure this 
>> solves the vacuum starvation problem.  Does the launcher process 
>> consider each row in pg_av_schedule that applies at the current time 
>> separately?  That is say there are three entries in pg_av_schedule that 
>> apply right now, does that mean that the launcher can fire off three 
>> different vacuums?  Perhaps we need to add a column to pg_av_tablegroup 
>> that specifies the max number of concurrent worker processes for this 
>> table group.
>>     
>
> My idea was to assign each table, or maybe each group, to a queue, and
> then have as much workers as there are queues.  So you could put them
> all in a single queue and it would mean there can be at most one vacuum
> running at any time.  Or you could put each group in a queue, and then
> there could be as many workers as there are groups.  Or you could mix.
>
> And also there would be a "autovac concurrency limit", which would be
> a GUC var saying how many vacuums to have at any time.

Hmm... this seems like queue is nearly a synonym for group.  Can't we 
just add num_workers property to table groups?  That seems to accomplish 
the same thing.  And yes, a GUC variable to limits the total number of 
concurrent autovacuums is probably a good idea.




Re: Autovacuum improvements

From
Alvaro Herrera
Date:
Tom Lane wrote:

> The DROP is at risk, but CREATE is also at risk because autovac feels
> free to connect to template0.  (One of the reasons we invented template0
> was to prevent CREATE DATABASE failures due to someone-else-connected,
> but autovac has broken that idea.)

ALTER DATABASE RENAME also needs the same treatment.

> Possibly we could handle these by extending create/drop db to check
> whether a process-connected-to-the-target-db is an autovac, and if so
> send it a SIGINT and wait for the process to terminate, instead of
> failing.

I'm cooking a patch for this which seems pretty reasonable, but I'm
having a problem: what mechanism do we have for waiting until a process
exits?  Maybe make autovacuum acquire an LWLock at start, which it then
keeps until it's gone, but it seems wasteful to have a lwlock just for
that purpose.

Another idea is to do kill(0, AutoVacPID); sleep(); in a loop, but that
seems pretty stupid.

Better ideas anyone?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: Autovacuum improvements

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> I'm cooking a patch for this which seems pretty reasonable, but I'm
> having a problem: what mechanism do we have for waiting until a process
> exits?

None, and I think you probably don't want to sit on the database lock
while waiting, either.  I was envisioning a simple sleep loop, viz
for(;;){    acquire database lock;    foreach(PGPROC entry in that database)    {        if (it's autovac)
sendsigint;        else            fail;    }    if (found any autovacs)    {        release database lock;
sleep(100msor so);        /* loop back and try again */    }    else        break;}
 

Also see Peter's nearby suggestion that we ought to wait instead of fail
for *all* cases of somebody attached to the database.  This would adapt
readily enough to that.

I was complaining elsewhere that I didn't want to use a sleep loop
for fixing the fsync-synchronization issue, but CREATE/DROP DATABASE
seems a much heavier-weight operation, so I don't feel that a sleep
is inappropriate here.

> Maybe make autovacuum acquire an LWLock at start, which it then
> keeps until it's gone, but it seems wasteful to have a lwlock just for
> that purpose.

And it doesn't scale to multiple autovacs anyway, much less the wait-for-
everybody variant.
        regards, tom lane


Re: Autovacuum improvements

From
Alvaro Herrera
Date:
Tom Lane wrote:

> Also see Peter's nearby suggestion that we ought to wait instead of fail
> for *all* cases of somebody attached to the database.  This would adapt
> readily enough to that.
> 
> I was complaining elsewhere that I didn't want to use a sleep loop
> for fixing the fsync-synchronization issue, but CREATE/DROP DATABASE
> seems a much heavier-weight operation, so I don't feel that a sleep
> is inappropriate here.

Note that currently there's no way for a backend to know whether another
backend is autovacuum or not.  I thought about adding a flag to PGPROC,
but eventually considered it ugly, so I started coding it as a shared
memory area instead, similar to what the bgwriter uses (storing the PID
there, etc).  After that was almost done I noticed that it's not a very
good idea either because there's no way to clean the shmem if autovacuum
dies -- the only one who knows about it, postmaster, does not want to
have access to shmem, so it can't do it.

So I'm reverting to using the flag in PGPROC for now, with an eye
towards using shmem eventually if we decide that using an always-running
autovacuum launcher is a good idea.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: Autovacuum improvements

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Note that currently there's no way for a backend to know whether another
> backend is autovacuum or not.  I thought about adding a flag to PGPROC,
> but eventually considered it ugly,

No, that was exactly the way I thought we'd do it.  One thing to note is
that to avoid race conditions, the PGPROC entry has to be marked as
autovac from the instant it's inserted into the array --- with a
separate area I think you'd have difficulty avoiding the race condition.
        regards, tom lane


Re: Autovacuum improvements

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Here it is.
> 
> I'd drop the InitProcess API change, which touches many more places than
> you really need, and just have InitProcess check IsAutoVacuumProcess(),
> which should be valid by the time control gets to it.  This is more like
> the way that InitPostgres handles it, anyway.

Hmm, the problem is SubPostmasterMain, which is in the EXEC_BACKEND
path.  It hasn't reached the autovacuum.c code yet, so it hasn't had the
chance to set the am_autovacuum static variable (in autovacuum.c).  I
guess the answer here is to allow that variable to be set from the
outside.

> > Note that I used the same DatabaseHasActiveBackends() function to do the
> > kill.
> 
> Agreed; maybe change the name to something that sounds less like a
> side-effect-free function?

I'm short on ideas on how to name it ...
DatabaseHasActiveBackendsAndKillAutovac() sounds a bit too much :-(
Maybe DatabaseCancelAutovacuumActivity()?  (but then it's not obvious
that it counts other processes at all)  And make it kill all autovac
processes inconditionally, which also fixes thing per your comment
below:

> > Another point to make is that it only kills autovacuum, and only if no
> > other process is found.  So if there are two processes and autovacuum is
> > one of them, it will be allowed to continue.
> 
> What if there are two autovac processes, which seems likely to be
> possible soon enough?

On the other hand, I was thinking that if we're going to have an autovacuum
launcher that's continuously running, we're going to have a lot of API
changes in this area anyway, so I wasn't in a hurry to consider the
posibility of two autovacuum processes.  But I don't think it's very
important anyway.


PS -- first time I try to be strict about switching between
pgsql-hackers and pgsql-patches and already I find it a bit annoying ...
not to mention that this is probably going to look weird on the
archives.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: Autovacuum improvements

From
Darcy Buskermolen
Date:
On Sunday 14 January 2007 08:45, Joshua D. Drake wrote:
> > While we are talking autovacuum improvements, I'd like to also see some
> > better logging, something that is akin to the important information of
> > vacuum verbose being logged to a table or baring that the error_log.  I'd
> > like to be able to see what was done, and how long it took to do for each
> > relation touched by av.  A thought, having this information may even be
> > usefull for the above thought of scheduler because we may be able to
> > build some sort of predictive scheduling into this.
>
> This plays back to the vacuum summary idea that I requested:
>
> http://archives.postgresql.org/pgsql-hackers/2005-07/msg00451.php

Well the fsm information is available in the pg_freespace contrib module, 
however it does not help with the "how long does it take to maintian XZY, or 
vacuum of relfoo did ABC".

I'm thinking a logtable of something like the following:

relid
starttime
elapsed_time
rows
rows_removed
pages
pages_removed
reusable_pages
cputime

This information then could be statisticaly used to ballance N queues to 
provide optimal vacuuming performance. 

Josh, is this more of what you were thinking as well ?

>
> (Man our new search engine is so much better than the old one :))
>
> Joshua D. Drake
>
> > ---------------------------(end of broadcast)---------------------------
> > TIP 7: You can help support the PostgreSQL project by donating at
> >
> >                 http://www.postgresql.org/about/donate


Re: Autovacuum improvements

From
"Joshua D. Drake"
Date:
Darcy Buskermolen wrote:
> On Sunday 14 January 2007 08:45, Joshua D. Drake wrote:
>>> While we are talking autovacuum improvements, I'd like to also see some
>>> better logging, something that is akin to the important information of
>>> vacuum verbose being logged to a table or baring that the error_log.  I'd
>>> like to be able to see what was done, and how long it took to do for each
>>> relation touched by av.  A thought, having this information may even be
>>> usefull for the above thought of scheduler because we may be able to
>>> build some sort of predictive scheduling into this.
>> This plays back to the vacuum summary idea that I requested:
>>
>> http://archives.postgresql.org/pgsql-hackers/2005-07/msg00451.php
> 
> Well the fsm information is available in the pg_freespace contrib module, 
> however it does not help with the "how long does it take to maintian XZY, or 
> vacuum of relfoo did ABC".
> 
> I'm thinking a logtable of something like the following:
> 
> relid
> starttime
> elapsed_time
> rows
> rows_removed
> pages
> pages_removed
> reusable_pages
> cputime
> 
> This information then could be statisticaly used to ballance N queues to 
> provide optimal vacuuming performance. 
> 
> Josh, is this more of what you were thinking as well ?


My original thought with Vacuum summary was that it would only give me
the information I need from vacuum verbose. Vacuum Verbose is great if
you want all the info, but normally you just want the last 5 lines :)

If there were functions along with the log table that would give me the
same info that would be great! Something like:

select show_omg_vacuum_now_tables() ;)

Seriously though...

select show_fsm_summary() which would show information over the last 12
hours, 24 hours or since last vacuum or something.

Sincerely,

Joshua D. Drake



> 
>> (Man our new search engine is so much better than the old one :))
>>
>> Joshua D. Drake
>>
>>> ---------------------------(end of broadcast)---------------------------
>>> TIP 7: You can help support the PostgreSQL project by donating at
>>>
>>>                 http://www.postgresql.org/about/donate
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>        choose an index scan if your joining column's datatypes do not
>        match
> 


-- 
     === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997            http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/



Re: Autovacuum improvements

From
Darcy Buskermolen
Date:
On Monday 15 January 2007 15:23, Joshua D. Drake wrote:
> Darcy Buskermolen wrote:
> > On Sunday 14 January 2007 08:45, Joshua D. Drake wrote:
> >>> While we are talking autovacuum improvements, I'd like to also see some
> >>> better logging, something that is akin to the important information of
> >>> vacuum verbose being logged to a table or baring that the error_log. 
> >>> I'd like to be able to see what was done, and how long it took to do
> >>> for each relation touched by av.  A thought, having this information
> >>> may even be usefull for the above thought of scheduler because we may
> >>> be able to build some sort of predictive scheduling into this.
> >>
> >> This plays back to the vacuum summary idea that I requested:
> >>
> >> http://archives.postgresql.org/pgsql-hackers/2005-07/msg00451.php
> >
> > Well the fsm information is available in the pg_freespace contrib module,
> > however it does not help with the "how long does it take to maintian XZY,
> > or vacuum of relfoo did ABC".
> >
> > I'm thinking a logtable of something like the following:
> >
> > relid
> > starttime
> > elapsed_time
> > rows
> > rows_removed
> > pages
> > pages_removed
> > reusable_pages
> > cputime
> >
> > This information then could be statisticaly used to ballance N queues to
> > provide optimal vacuuming performance.
> >
> > Josh, is this more of what you were thinking as well ?
>
> My original thought with Vacuum summary was that it would only give me
> the information I need from vacuum verbose. Vacuum Verbose is great if
> you want all the info, but normally you just want the last 5 lines :)
>
> If there were functions along with the log table that would give me the
> same info that would be great! Something like:
>
> select show_omg_vacuum_now_tables() ;)
>
> Seriously though...
>
> select show_fsm_summary() which would show information over the last 12
> hours, 24 hours or since last vacuum or something.

If it's only fsm you are thinking of then the contrib module is probably good 
enough for you.

>
> Sincerely,
>
> Joshua D. Drake
>
> >> (Man our new search engine is so much better than the old one :))
> >>
> >> Joshua D. Drake
> >>
> >>> ---------------------------(end of
> >>> broadcast)--------------------------- TIP 7: You can help support the
> >>> PostgreSQL project by donating at
> >>>
> >>>                 http://www.postgresql.org/about/donate
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 9: In versions below 8.0, the planner will ignore your desire to
> >        choose an index scan if your joining column's datatypes do not
> >        match


Enabling autovacuum by default (was Re: Autovacuum improvements)

From
Alvaro Herrera
Date:
Tom Lane wrote:

> Actually the reason it's not enabled by default today has nothing to do
> with either of those; it's
>
> 3. Unexpected side effects on foreground processes, such as surprising
> failures of DROP DATABASE commands.  (See archives for details.)
>
> Until (3) is addressed I don't think there is any chance of having
> autovac on by default, and so worrying about (1) and (2) seems a bit
> premature.  Or at least, if you want to work on those fine, but don't
> expect that it will alter the fact that the factory default is "off".

With that taken care of, do I dare propose enabling autovacuum by
default, so that further changes will be picked up quickly by the
buildfarm?

Attached is a patch to do so, based on Peter's last attempt at doing
this.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Attachment

Re: Autovacuum improvements

From
Alvaro Herrera
Date:
Matthew T. O'Connor wrote:

> This still seems ambiguous to me, how would I handle a maintenance 
> window of Weekends from Friday at 8PM though Monday morning at 6AM? My 
> guess from what said is:
> mon dom dow starttime endtime
> null  null    6      20:00      null
> null  null    1      null          06:00
> 
> So how do we know to vacuum on Saturday or Sunday?  I think clearly 
> defined intervals with explicit start and stop times is cleaner.

mon    dom    dow    start    end
null    null    5    20:00    23:59:59
null    null    6    00:00    23:59:59
null    null    7    00:00    23:59:59
null    null    1    00:00    06:00

(1 = monday, 5 = friday)

Now I'm starting to wonder what will happen between 23:59:59 of day X
and 00:00:00 of day (X+1) ...  Maybe what we should do is not specify
an end time, but a duration as an interval:

month        int
dom        int
dow        int
start        time
duration    interval

That way you can specify the above as
mon    dom    dow    start    duration
null    null    5    20:00    (4 hours + 2 days + 6 hours)

Now, if a DST boundary happens to fall in that interval you'll be an
hour short, or it'll last an hour too long :-)


> >I had two ideas: one was to make pg_autovacuum hold default config for
> >all tables not mentioned in any group, so sites which are OK with 8.2's
> >representation can still use it.  The other idea was to remove it and
> >replace it with this mechanism.
> 
> Probably best to just get rid of it.  GUC variables hold the defaults 
> and if we create a default interval / group, it will also have defaults.

Yeah, maybe.

> >My idea was to assign each table, or maybe each group, to a queue, and
> >then have as much workers as there are queues.  So you could put them
> >all in a single queue and it would mean there can be at most one vacuum
> >running at any time.  Or you could put each group in a queue, and then
> >there could be as many workers as there are groups.  Or you could mix.
> >
> >And also there would be a "autovac concurrency limit", which would be
> >a GUC var saying how many vacuums to have at any time.
> 
> Hmm... this seems like queue is nearly a synonym for group.  Can't we 
> just add num_workers property to table groups?  That seems to accomplish 
> the same thing.  And yes, a GUC variable to limits the total number of 
> concurrent autovacuums is probably a good idea.

queue = group of groups.  But I'm not sure about this at all, which is
why I took it away from the proposal.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Enabling autovacuum by default (was Re: Autovacuum

From
Andrew Dunstan
Date:
Alvaro Herrera wrote:
> Tom Lane wrote:
>
>   
>> Actually the reason it's not enabled by default today has nothing to do
>> with either of those; it's
>>
>> 3. Unexpected side effects on foreground processes, such as surprising
>> failures of DROP DATABASE commands.  (See archives for details.)
>>
>> Until (3) is addressed I don't think there is any chance of having
>> autovac on by default, and so worrying about (1) and (2) seems a bit
>> premature.  Or at least, if you want to work on those fine, but don't
>> expect that it will alter the fact that the factory default is "off".
>>     
>
> With that taken care of, do I dare propose enabling autovacuum by
> default, so that further changes will be picked up quickly by the
> buildfarm?
>
>   

I should have thought most autovacuum problems would only become evident 
after significant running time, while buildfarm runs are quite short.

Of course, some will be apparent right away.

cheers

andrew




Re: Enabling autovacuum by default (was Re: Autovacuum

From
Alvaro Herrera
Date:
Andrew Dunstan wrote:
> Alvaro Herrera wrote:
> >Tom Lane wrote:
> >
> >  
> >>Actually the reason it's not enabled by default today has nothing to do
> >>with either of those; it's
> >>
> >>3. Unexpected side effects on foreground processes, such as surprising
> >>failures of DROP DATABASE commands.  (See archives for details.)
> >>
> >>Until (3) is addressed I don't think there is any chance of having
> >>autovac on by default, and so worrying about (1) and (2) seems a bit
> >>premature.  Or at least, if you want to work on those fine, but don't
> >>expect that it will alter the fact that the factory default is "off".
> >>    
> >
> >With that taken care of, do I dare propose enabling autovacuum by
> >default, so that further changes will be picked up quickly by the
> >buildfarm?
> 
> I should have thought most autovacuum problems would only become evident 
> after significant running time, while buildfarm runs are quite short.

Well, the last time we enabled autovacuum by default (during 8.2's beta
period) there were some buildfarm failures right away, which is why it
was disabled.

On the other hand, it will help uncover possible portability problems in
the code that will be newly written.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: Autovacuum improvements

From
Darcy Buskermolen
Date:
On Monday 15 January 2007 15:13, Darcy Buskermolen wrote:
> On Sunday 14 January 2007 08:45, Joshua D. Drake wrote:
> > > While we are talking autovacuum improvements, I'd like to also see some
> > > better logging, something that is akin to the important information of
> > > vacuum verbose being logged to a table or baring that the error_log. 
> > > I'd like to be able to see what was done, and how long it took to do
> > > for each relation touched by av.  A thought, having this information
> > > may even be usefull for the above thought of scheduler because we may
> > > be able to build some sort of predictive scheduling into this.
> >
> > This plays back to the vacuum summary idea that I requested:
> >
> > http://archives.postgresql.org/pgsql-hackers/2005-07/msg00451.php
>
> Well the fsm information is available in the pg_freespace contrib module,
> however it does not help with the "how long does it take to maintian XZY,
> or vacuum of relfoo did ABC".
>
> I'm thinking a logtable of something like the following:
>
> relid
> starttime
> elapsed_time
> rows
> rows_removed
> pages
> pages_removed
> reusable_pages
> cputime

I suppose that we may also want to track if FULL was done or not.


>
> This information then could be statisticaly used to ballance N queues to
> provide optimal vacuuming performance.
>
> Josh, is this more of what you were thinking as well ?
>
> > (Man our new search engine is so much better than the old one :))
> >
> > Joshua D. Drake
> >
> > > ---------------------------(end of
> > > broadcast)--------------------------- TIP 7: You can help support the
> > > PostgreSQL project by donating at
> > >
> > >                 http://www.postgresql.org/about/donate
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>        choose an index scan if your joining column's datatypes do not
>        match


Re: Enabling autovacuum by default (was Re: Autovacuum improvements)

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> With that taken care of, do I dare propose enabling autovacuum by
> default, so that further changes will be picked up quickly by the
> buildfarm?

Sure, let's try it and see what else breaks ;-)
        regards, tom lane


Re: Autovacuum improvements

From
"Matthew T. O'Connor"
Date:
Alvaro Herrera wrote:
> Matthew T. O'Connor wrote:
>   
>> This still seems ambiguous to me, how would I handle a maintenance 
>> window of Weekends from Friday at 8PM though Monday morning at 6AM? My 
>> guess from what said is:
>> mon dom dow starttime endtime
>> null  null    6      20:00      null
>> null  null    1      null          06:00
>>
>> So how do we know to vacuum on Saturday or Sunday?  I think clearly 
>> defined intervals with explicit start and stop times is cleaner.
>>     
>
> mon    dom    dow    start    end
> null    null    5    20:00    23:59:59
> null    null    6    00:00    23:59:59
> null    null    7    00:00    23:59:59
> null    null    1    00:00    06:00
>
> (1 = monday, 5 = friday)
>   

So it takes 4 lines to handle one logical interval, I don't really like 
that.  I know that your concept of interval groups will help mask this 
but still.

> Now I'm starting to wonder what will happen between 23:59:59 of day X
> and 00:00:00 of day (X+1) ...  Maybe what we should do is not specify
> an end time, but a duration as an interval:
>
> month        int
> dom        int
> dow        int
> start        time
> duration    interval
>
> That way you can specify the above as
> mon    dom    dow    start    duration
> null    null    5    20:00    (4 hours + 2 days + 6 hours)
>
> Now, if a DST boundary happens to fall in that interval you'll be an
> hour short, or it'll last an hour too long :-)
>   

I certainly like this better than the first proposal, but I still don't 
see how it's better than a  full set of columns for start and end 
times.  Can you tell me why you are trying to avoid that design? 

>> Hmm... this seems like queue is nearly a synonym for group.  Can't we 
>> just add num_workers property to table groups?  That seems to accomplish 
>> the same thing.  And yes, a GUC variable to limits the total number of 
>> concurrent autovacuums is probably a good idea.
>>     
>
> queue = group of groups.  But I'm not sure about this at all, which is
> why I took it away from the proposal.

I think we can live without the groups of groups, at least for now. 



Re: Autovacuum improvements

From
tomas@tuxteam.de
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tue, Jan 16, 2007 at 11:23:36AM -0300, Alvaro Herrera wrote:
> Matthew T. O'Connor wrote:
> 
[...]
> Now I'm starting to wonder what will happen between 23:59:59 of day X
> and 00:00:00 of day (X+1) ...  Maybe what we should do is not specify
> an end time, but a duration as an interval:

+1

regards
- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFFrdUcBcgs9XrR2kYRAlWCAJ9XDNi/oNDfhyWcrnrDAvig/LFs1wCfayC8
9UJy+4XT/n9G4YZ5vG+Fdgg=
=5h9A
-----END PGP SIGNATURE-----



Re: With Function 'Chr', is it a bug?

From
Wang Haiyong
Date:
There are two select statement  with using Function chr(0), I don't know, are they both right ? I think that they are
inconsistent.

Thanks 

[postgres@db2 ~]$ psql
Welcome to psql 8.1.3, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms      \h for help with SQL commands      \? for help with psql commands      \g
orterminate with semicolon to execute query      \q to quit
 

postgres=# select 'abc' || chr(0) || 'abc'  as col;col 
-----abc
(1 row)

postgres=# select length('abc' || chr(0) || 'abc')  as col; col 
-----  7
(1 row)


----------------------------------------------------------------------------------------------
Confidentiality Notice: The information contained in this e-mail and any accompanying attachment(s) is intended only
forthe use of the intended recipient and may be confidential and/or privileged of Neusoft Group Ltd., its subsidiaries
and/orits affiliates. If any reader of this communication is not the intended recipient, unauthorized use, forwarding,
printing,storing, disclosure or copying is strictly prohibited, and may be unlawful. If you have received this
communicationin error, please immediately notify the sender by return e-mail, and delete the original message and all
copiesfrom your system. Thank you. 
 
-----------------------------------------------------------------------------------------------



Re: With Function 'Chr', is it a bug?

From
Andrew Dunstan
Date:
Wang Haiyong wrote:
> There are two select statement  with using Function chr(0), I don't know, are they both right ? I think that they are
inconsistent.
>
>   

Off the top of my head I would have thought there was a good case for 
raising an error on chr(0). Aren't null bytes forbidden in text values?

cheers

andrew


Re: With Function 'Chr', is it a bug?

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Off the top of my head I would have thought there was a good case for 
> raising an error on chr(0). Aren't null bytes forbidden in text values?

They're not supported, but we don't make any strenuous efforts to
prevent them.  A watertight prohibition would require extra effort in a
lot of places, not only chr().  The string literal parser and text_recv
and friends come to mind immediately; there are probably some others.

Maybe we should lock all that down, but I don't see any percentage in
fixing just one place.

btw, I just noticed that chr() doesn't complain about arguments
exceeding 255 ...
        regards, tom lane