Thread: Re: BUG #14941: Vacuum crashes

Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 12/1/17, 11:16 AM, "Tomas Vondra" <tomas.vondra@2ndquadrant.com> wrote:
>> The behavior I would like to see is that the void ignores this table and
>> moves to another instead of being blocked. 
>> 
>
> I believe autovacuum should not block waiting for a heavy-weight lock on
> a table since this commit that went into 9.1:
>
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=32896c40ca766146312b28a5a0eb3f66ca0300ed
>
> So I'm wondering what problem you're running into.

It sounds like Lyes is doing a periodic database-wide VACUUM that is
getting blocked on certain relations that are already locked (perhaps
because autovacuum is already working on it).  IIUC, the ask is to provide
a way to skip vacuuming relations that cannot be immediately locked.

There is already an internal flag called VACOPT_NOWAIT that gives
autovacuum this ability in certain cases (introduced in the aforementioned
commit), and I recently brought up this potential use-case as
justification for a patch [0].  I'd be happy to submit a patch for
providing VACOPT_NOWAIT to users if others think it's something we should
do.

As a side note, this seems more like a feature request than a bug report,
so I'm adding pgsql-hackers@ here, too.

Nathan

[0] https://www.postgresql.org/message-id/875815E8-7A99-4488-AA07-F254C404E2CF%40amazon.com


Re: BUG #14941: Vacuum crashes

From
Robert Haas
Date:
On Fri, Dec 1, 2017 at 1:53 PM, Bossart, Nathan <bossartn@amazon.com> wrote:
> There is already an internal flag called VACOPT_NOWAIT that gives
> autovacuum this ability in certain cases (introduced in the aforementioned
> commit), and I recently brought up this potential use-case as
> justification for a patch [0].  I'd be happy to submit a patch for
> providing VACOPT_NOWAIT to users if others think it's something we should
> do.

Seems entirely reasonable to me, provided that we only update the
extensible-options syntax:

VACUUM [ ( option [, ...] ) ] [ table_and_columns [, ...] ]

I don't want to add any more options to the older syntax:

VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ table_and_columns [, ...] ]

I am slightly confused as to how we got on to this topic since the
subject is "Vacuum crashes", but perhaps Lyes was simply speaking
imprecisely.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: BUG #14941: Vacuum crashes

From
Robert Haas
Date:
On Fri, Dec 1, 2017 at 1:53 PM, Bossart, Nathan <bossartn@amazon.com> wrote:
> There is already an internal flag called VACOPT_NOWAIT that gives
> autovacuum this ability in certain cases (introduced in the aforementioned
> commit), and I recently brought up this potential use-case as
> justification for a patch [0].  I'd be happy to submit a patch for
> providing VACOPT_NOWAIT to users if others think it's something we should
> do.

Seems entirely reasonable to me, provided that we only update the
extensible-options syntax:

VACUUM [ ( option [, ...] ) ] [ table_and_columns [, ...] ]

I don't want to add any more options to the older syntax:

VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ table_and_columns [, ...] ]

I am slightly confused as to how we got on to this topic since the
subject is "Vacuum crashes", but perhaps Lyes was simply speaking
imprecisely.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 12/1/17, 3:04 PM, "Robert Haas" <robertmhaas@gmail.com> wrote:
> On Fri, Dec 1, 2017 at 1:53 PM, Bossart, Nathan <bossartn@amazon.com> wrote:
>> There is already an internal flag called VACOPT_NOWAIT that gives
>> autovacuum this ability in certain cases (introduced in the aforementioned
>> commit), and I recently brought up this potential use-case as
>> justification for a patch [0].  I'd be happy to submit a patch for
>> providing VACOPT_NOWAIT to users if others think it's something we should
>> do.
>
> Seems entirely reasonable to me, provided that we only update the
> extensible-options syntax:
>
> VACUUM [ ( option [, ...] ) ] [ table_and_columns [, ...] ]
>
> I don't want to add any more options to the older syntax:
>
> VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ table_and_columns [, ...] ]

Right.  This seems like the right path forward to me, as the
VACUUM documentation states that the unparenthesized syntax is
deprecated, and the DISABLE_PAGE_SKIPPING option was not added
to the old syntax, either.

> I am slightly confused as to how we got on to this topic since the
> subject is "Vacuum crashes", but perhaps Lyes was simply speaking
> imprecisely.

I'm hoping Lyes chimes in soon to clarify if I am interpreting
the original report incorrectly.

Nathan


Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 12/1/17, 3:04 PM, "Robert Haas" <robertmhaas@gmail.com> wrote:
> On Fri, Dec 1, 2017 at 1:53 PM, Bossart, Nathan <bossartn@amazon.com> wrote:
>> There is already an internal flag called VACOPT_NOWAIT that gives
>> autovacuum this ability in certain cases (introduced in the aforementioned
>> commit), and I recently brought up this potential use-case as
>> justification for a patch [0].  I'd be happy to submit a patch for
>> providing VACOPT_NOWAIT to users if others think it's something we should
>> do.
>
> Seems entirely reasonable to me, provided that we only update the
> extensible-options syntax:
>
> VACUUM [ ( option [, ...] ) ] [ table_and_columns [, ...] ]
>
> I don't want to add any more options to the older syntax:
>
> VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ table_and_columns [, ...] ]

Right.  This seems like the right path forward to me, as the
VACUUM documentation states that the unparenthesized syntax is
deprecated, and the DISABLE_PAGE_SKIPPING option was not added
to the old syntax, either.

> I am slightly confused as to how we got on to this topic since the
> subject is "Vacuum crashes", but perhaps Lyes was simply speaking
> imprecisely.

I'm hoping Lyes chimes in soon to clarify if I am interpreting
the original report incorrectly.

Nathan


Re: BUG #14941: Vacuum crashes

From
Lyes Ameddah
Date:
sorry guys, yes I'm talking about a FULL VACUUM and not about Auto-Vacuum. Thank you very match for your feedback. That's Waht I do : vacuum FULL VERBOSE ANALYZE; reindex database postgres; I would be happy if there is a patch to fix that ! 2017-12-01 22:16 GMT+01:00 Bossart, Nathan : > On 12/1/17, 3:04 PM, "Robert Haas" wrote: > > On Fri, Dec 1, 2017 at 1:53 PM, Bossart, Nathan > wrote: > >> There is already an internal flag called VACOPT_NOWAIT that gives > >> autovacuum this ability in certain cases (introduced in the > aforementioned > >> commit), and I recently brought up this potential use-case as > >> justification for a patch [0]. I'd be happy to submit a patch for > >> providing VACOPT_NOWAIT to users if others think it's something we > should > >> do. > > > > Seems entirely reasonable to me, provided that we only update the > > extensible-options syntax: > > > > VACUUM [ ( option [, ...] ) ] [ table_and_columns [, ...] ] > > > > I don't want to add any more options to the older syntax: > > > > VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ table_and_columns > [, ...] ] > > Right. This seems like the right path forward to me, as the > VACUUM documentation states that the unparenthesized syntax is > deprecated, and the DISABLE_PAGE_SKIPPING option was not added > to the old syntax, either. > > > I am slightly confused as to how we got on to this topic since the > > subject is "Vacuum crashes", but perhaps Lyes was simply speaking > > imprecisely. > > I'm hoping Lyes chimes in soon to clarify if I am interpreting > the original report incorrectly. > > Nathan > > -- *Lyes AMEDDAH* *Téléphone portable : 06 66 24 50 70* *Titre RNCP I - Développement Web* *HiTema*

Re: BUG #14941: Vacuum crashes

From
Lyes Ameddah
Date:
sorry guys, yes I'm talking about a FULL VACUUM and not about Auto-Vacuum. Thank you very match for your feedback. That's Waht I do : vacuum FULL VERBOSE ANALYZE; reindex database postgres; I would be happy if there is a patch to fix that ! 2017-12-01 22:16 GMT+01:00 Bossart, Nathan : > On 12/1/17, 3:04 PM, "Robert Haas" wrote: > > On Fri, Dec 1, 2017 at 1:53 PM, Bossart, Nathan > wrote: > >> There is already an internal flag called VACOPT_NOWAIT that gives > >> autovacuum this ability in certain cases (introduced in the > aforementioned > >> commit), and I recently brought up this potential use-case as > >> justification for a patch [0]. I'd be happy to submit a patch for > >> providing VACOPT_NOWAIT to users if others think it's something we > should > >> do. > > > > Seems entirely reasonable to me, provided that we only update the > > extensible-options syntax: > > > > VACUUM [ ( option [, ...] ) ] [ table_and_columns [, ...] ] > > > > I don't want to add any more options to the older syntax: > > > > VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ table_and_columns > [, ...] ] > > Right. This seems like the right path forward to me, as the > VACUUM documentation states that the unparenthesized syntax is > deprecated, and the DISABLE_PAGE_SKIPPING option was not added > to the old syntax, either. > > > I am slightly confused as to how we got on to this topic since the > > subject is "Vacuum crashes", but perhaps Lyes was simply speaking > > imprecisely. > > I'm hoping Lyes chimes in soon to clarify if I am interpreting > the original report incorrectly. > > Nathan > > -- *Lyes AMEDDAH* *Téléphone portable : 06 66 24 50 70* *Titre RNCP I - Développement Web* *HiTema*

Re: BUG #14941: Vacuum crashes

From
Robert Haas
Date:
On Mon, Dec 4, 2017 at 8:59 AM, Lyes Ameddah <lyes.amd@gmail.com> wrote:
> sorry guys, yes I'm talking about a FULL VACUUM and not about Auto-Vacuum.
> Thank you very match for your feedback.

OK, but that's not the confusion.  What you said is that it CRASHES,
but the behavior you described is that it BLOCKS waiting for a lock.
Blocking and crashing are not the same thing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: BUG #14941: Vacuum crashes

From
Robert Haas
Date:
On Mon, Dec 4, 2017 at 8:59 AM, Lyes Ameddah <lyes.amd@gmail.com> wrote:
> sorry guys, yes I'm talking about a FULL VACUUM and not about Auto-Vacuum.
> Thank you very match for your feedback.

OK, but that's not the confusion.  What you said is that it CRASHES,
but the behavior you described is that it BLOCKS waiting for a lock.
Blocking and crashing are not the same thing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 12/4/17, 8:52 AM, "Robert Haas" <robertmhaas@gmail.com> wrote:
> On Mon, Dec 4, 2017 at 8:59 AM, Lyes Ameddah <lyes.amd@gmail.com> wrote:
>> sorry guys, yes I'm talking about a FULL VACUUM and not about Auto-Vacuum.
>> Thank you very match for your feedback.
>
> OK, but that's not the confusion.  What you said is that it CRASHES,
> but the behavior you described is that it BLOCKS waiting for a lock.
> Blocking and crashing are not the same thing.

Provided that Lyes seems to have described wanting to avoid the blocking
behavior (and since I'm interested in adding this functionality anyway),
here are some patches that open up the VACOPT_NOWAIT option to users for
both VACUUM and ANALYZE commands.

---

0001_fix_unparenthesized_vacuum_grammar_v1.patch

One VacuumStmt grammar rule allows users to specify the VACOPT_ANALYZE
option by including an AnalyzeStmt at the end of the command.  This
appears to have been added as part of the introduction of the ANALYZE
command (f905d65e).  However, this means that users can call VACUUM
commands like

        VACUUM VERBOSE ANALYZE VERBOSE pg_class;

Besides being disallowed according to the documentation, I think this
may give users the idea that there is a difference between the VERBOSE
options for VACUUM versus ANALYZE.  In fact, they are the same option,
and specifying it twice is redundant.

This change fixes the unparenthesized VACUUM syntax to disallow the out-
of-order VACUUM options as described above.  This is a prerequisite
change for opening up VACOPT_NOWAIT to users, as adding it to the
grammar as-is would cause statements like

        VACUUM VERBOSE NOWAIT ANALYZE VERBOSE NOWAIT pg_class;

to be allowed.  Since I am also looking to introduce a parenthesized
syntax for ANALYZE, this patch would prevent statements like

        VACUUM VERBOSE ANALYZE (VERBOSE, NOWAIT) pg_class;

from being accepted as well.

---

0002_add_parenthesized_analyze_syntax_v1.patch

As a prerequisite to giving users access to the VACOPT_NOWAIT option, I
am introducing a parenthesized syntax for ANALYZE that is similar to
VACUUM's.  Following VACUUM's example, any new options will be added to
this syntax, and the old style will become deprecated.

Adding a parenthesized syntax for ANALYZE is not strictly necessary for
providing the NOWAIT option, as NOWAIT is already a keyword in gram.y.
However, I thought it would be good to match VACUUM (since the command
structure is so similar), and this work would be required at some point
anyway if ANALYZE ever accepts options that we do not want to make
reserved keywords.

---

0003_add_nowait_vacuum_option_v1.patch

This change provides the existing VACOPT_NOWAIT option to users.  This
option was previously only used by autovacuum in special cases, but it
seems like a potentially valuable option in other cases as well.  For
example, perhaps a user wants to run a nightly VACUUM job that skips
tables that cannot be locked due to autovacuum (or something else)
already working on it.

I chose to use NOWAIT as the option name because it is already a keyword
for the LOCK command.

This patch follows the existing logging precedent added by 11d8d72c and
ab6eaee8: if a relation is explicitly specified in the command, a log
statement will be emitted if it is skipped.  If the relation is not
specified (e.g. "VACUUM FULL;"), no such log statements will be emitted.

---

I'll be adding an entry to the next commitfest for these patches soon.

Nathan


Attachment

Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 12/4/17, 8:52 AM, "Robert Haas" <robertmhaas@gmail.com> wrote:
> On Mon, Dec 4, 2017 at 8:59 AM, Lyes Ameddah <lyes.amd@gmail.com> wrote:
>> sorry guys, yes I'm talking about a FULL VACUUM and not about Auto-Vacuum.
>> Thank you very match for your feedback.
>
> OK, but that's not the confusion.  What you said is that it CRASHES,
> but the behavior you described is that it BLOCKS waiting for a lock.
> Blocking and crashing are not the same thing.

Provided that Lyes seems to have described wanting to avoid the blocking
behavior (and since I'm interested in adding this functionality anyway),
here are some patches that open up the VACOPT_NOWAIT option to users for
both VACUUM and ANALYZE commands.

---

0001_fix_unparenthesized_vacuum_grammar_v1.patch

One VacuumStmt grammar rule allows users to specify the VACOPT_ANALYZE
option by including an AnalyzeStmt at the end of the command.  This
appears to have been added as part of the introduction of the ANALYZE
command (f905d65e).  However, this means that users can call VACUUM
commands like

        VACUUM VERBOSE ANALYZE VERBOSE pg_class;

Besides being disallowed according to the documentation, I think this
may give users the idea that there is a difference between the VERBOSE
options for VACUUM versus ANALYZE.  In fact, they are the same option,
and specifying it twice is redundant.

This change fixes the unparenthesized VACUUM syntax to disallow the out-
of-order VACUUM options as described above.  This is a prerequisite
change for opening up VACOPT_NOWAIT to users, as adding it to the
grammar as-is would cause statements like

        VACUUM VERBOSE NOWAIT ANALYZE VERBOSE NOWAIT pg_class;

to be allowed.  Since I am also looking to introduce a parenthesized
syntax for ANALYZE, this patch would prevent statements like

        VACUUM VERBOSE ANALYZE (VERBOSE, NOWAIT) pg_class;

from being accepted as well.

---

0002_add_parenthesized_analyze_syntax_v1.patch

As a prerequisite to giving users access to the VACOPT_NOWAIT option, I
am introducing a parenthesized syntax for ANALYZE that is similar to
VACUUM's.  Following VACUUM's example, any new options will be added to
this syntax, and the old style will become deprecated.

Adding a parenthesized syntax for ANALYZE is not strictly necessary for
providing the NOWAIT option, as NOWAIT is already a keyword in gram.y.
However, I thought it would be good to match VACUUM (since the command
structure is so similar), and this work would be required at some point
anyway if ANALYZE ever accepts options that we do not want to make
reserved keywords.

---

0003_add_nowait_vacuum_option_v1.patch

This change provides the existing VACOPT_NOWAIT option to users.  This
option was previously only used by autovacuum in special cases, but it
seems like a potentially valuable option in other cases as well.  For
example, perhaps a user wants to run a nightly VACUUM job that skips
tables that cannot be locked due to autovacuum (or something else)
already working on it.

I chose to use NOWAIT as the option name because it is already a keyword
for the LOCK command.

This patch follows the existing logging precedent added by 11d8d72c and
ab6eaee8: if a relation is explicitly specified in the command, a log
statement will be emitted if it is skipped.  If the relation is not
specified (e.g. "VACUUM FULL;"), no such log statements will be emitted.

---

I'll be adding an entry to the next commitfest for these patches soon.

Nathan


Attachment

Re: BUG #14941: Vacuum crashes

From
Michael Paquier
Date:
On Sat, Dec 2, 2017 at 6:16 AM, Bossart, Nathan <bossartn@amazon.com> wrote:
> On 12/1/17, 3:04 PM, "Robert Haas" <robertmhaas@gmail.com> wrote:
>> Seems entirely reasonable to me, provided that we only update the
>> extensible-options syntax:
>>
>> VACUUM [ ( option [, ...] ) ] [ table_and_columns [, ...] ]
>>
>> I don't want to add any more options to the older syntax:
>>
>> VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ table_and_columns [, ...] ]
>
> Right.  This seems like the right path forward to me, as the
> VACUUM documentation states that the unparenthesized syntax is
> deprecated, and the DISABLE_PAGE_SKIPPING option was not added
> to the old syntax, either.

Yeah, the exact same discussion has happened when what has become
DISABLE_PAGE_SKIPPING was discussed for the freeze map.
-- 
Michael


Re: BUG #14941: Vacuum crashes

From
Michael Paquier
Date:
On Sat, Dec 2, 2017 at 6:16 AM, Bossart, Nathan <bossartn@amazon.com> wrote:
> On 12/1/17, 3:04 PM, "Robert Haas" <robertmhaas@gmail.com> wrote:
>> Seems entirely reasonable to me, provided that we only update the
>> extensible-options syntax:
>>
>> VACUUM [ ( option [, ...] ) ] [ table_and_columns [, ...] ]
>>
>> I don't want to add any more options to the older syntax:
>>
>> VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ table_and_columns [, ...] ]
>
> Right.  This seems like the right path forward to me, as the
> VACUUM documentation states that the unparenthesized syntax is
> deprecated, and the DISABLE_PAGE_SKIPPING option was not added
> to the old syntax, either.

Yeah, the exact same discussion has happened when what has become
DISABLE_PAGE_SKIPPING was discussed for the freeze map.
-- 
Michael


Re: BUG #14941: Vacuum crashes

From
Michael Paquier
Date:
On Wed, Dec 6, 2017 at 1:52 AM, Bossart, Nathan <bossartn@amazon.com> wrote:
> 0001_fix_unparenthesized_vacuum_grammar_v1.patch
>
> One VacuumStmt grammar rule allows users to specify the VACOPT_ANALYZE
> option by including an AnalyzeStmt at the end of the command.  This
> appears to have been added as part of the introduction of the ANALYZE
> command (f905d65e).  However, this means that users can call VACUUM
> commands like
>
>         VACUUM VERBOSE ANALYZE VERBOSE pg_class;
>
> Besides being disallowed according to the documentation, I think this
> may give users the idea that there is a difference between the VERBOSE
> options for VACUUM versus ANALYZE.  In fact, they are the same option,
> and specifying it twice is redundant.

Hmm. Yeah. If adding a parenthesis grammar to ANALYZE this is a good
move for long purposes. If we don't do that now, or at least for the
purpose of this patch set, then a AnalyzeStmt node embedded directly
in the grammar of VACUUM is likely to bite in the future.

> This change fixes the unparenthesized VACUUM syntax to disallow the out-
> of-order VACUUM options as described above.  This is a prerequisite
> change for opening up VACOPT_NOWAIT to users, as adding it to the
> grammar as-is would cause statements like
>
>         VACUUM VERBOSE NOWAIT ANALYZE VERBOSE NOWAIT pg_class;
>
> to be allowed.  Since I am also looking to introduce a parenthesized
> syntax for ANALYZE, this patch would prevent statements like

If you add only a parenthesized grammar of ANALYZE, it seems to me
that this would not be an allowed query anyway.

>         VACUUM VERBOSE ANALYZE (VERBOSE, NOWAIT) pg_class;
>
> from being accepted as well.

This could also be equivalent to ANALYZE(VERBOSE, NOWAIT). I can see
pros and cons by keeping the complete extension of AnalyzeStmt in the
VACUUM grammar, but as the documentation does not mention VACUUM
VERBOSE ANALYZE VERBOSE as a legit query I guess that ripping it out
is not going to annoy many users. Still let's see opinions from other
folks on -hackers as another approach to the feature you want to add
here would be to just implement the grammar for VACUUM but *not*
ANALYZE, but I'd like to think that we still want ANALYZE to be
supported, and also we want it to be extensible as a different thing
than what VACUUM is.

> 0002_add_parenthesized_analyze_syntax_v1.patch
>
> As a prerequisite to giving users access to the VACOPT_NOWAIT option, I
> am introducing a parenthesized syntax for ANALYZE that is similar to
> VACUUM's.  Following VACUUM's example, any new options will be added to
> this syntax, and the old style will become deprecated.
>
> Adding a parenthesized syntax for ANALYZE is not strictly necessary for
> providing the NOWAIT option, as NOWAIT is already a keyword in gram.y.
> However, I thought it would be good to match VACUUM (since the command
> structure is so similar), and this work would be required at some point
> anyway if ANALYZE ever accepts options that we do not want to make
> reserved keywords.

Yep, agreed with the contents of this patch.

> 0003_add_nowait_vacuum_option_v1.patch
>
> This change provides the existing VACOPT_NOWAIT option to users.  This
> option was previously only used by autovacuum in special cases, but it
> seems like a potentially valuable option in other cases as well.  For
> example, perhaps a user wants to run a nightly VACUUM job that skips
> tables that cannot be locked due to autovacuum (or something else)
> already working on it.
>
> I chose to use NOWAIT as the option name because it is already a keyword
> for the LOCK command.

Makes sense to me.

> This patch follows the existing logging precedent added by 11d8d72c and
> ab6eaee8: if a relation is explicitly specified in the command, a log
> statement will be emitted if it is skipped.  If the relation is not
> specified (e.g. "VACUUM FULL;"), no such log statements will be emitted.

+WARNING:  skipping analyze of "test1" --- lock not available
+step analyze_specified: ANALYZE (NOWAIT) test1, test2;
+step commit:
+   COMMIT;
OK for a WARNING for me in this case. We already do that for the other entries.

Let's see what other folks think first about the ANALYZE grammar in
VACUUM, as well as having ANALYZE use a parenthesized grammer. FWIW, I
like the contents of this patch set, and the thing is non-intrusive. I
think that NOWAIT gains more value now that multiple relations can be
vacuumed or analyzed with a single manual command.
-- 
Michael


Re: BUG #14941: Vacuum crashes

From
Michael Paquier
Date:
On Wed, Dec 6, 2017 at 1:52 AM, Bossart, Nathan <bossartn@amazon.com> wrote:
> 0001_fix_unparenthesized_vacuum_grammar_v1.patch
>
> One VacuumStmt grammar rule allows users to specify the VACOPT_ANALYZE
> option by including an AnalyzeStmt at the end of the command.  This
> appears to have been added as part of the introduction of the ANALYZE
> command (f905d65e).  However, this means that users can call VACUUM
> commands like
>
>         VACUUM VERBOSE ANALYZE VERBOSE pg_class;
>
> Besides being disallowed according to the documentation, I think this
> may give users the idea that there is a difference between the VERBOSE
> options for VACUUM versus ANALYZE.  In fact, they are the same option,
> and specifying it twice is redundant.

Hmm. Yeah. If adding a parenthesis grammar to ANALYZE this is a good
move for long purposes. If we don't do that now, or at least for the
purpose of this patch set, then a AnalyzeStmt node embedded directly
in the grammar of VACUUM is likely to bite in the future.

> This change fixes the unparenthesized VACUUM syntax to disallow the out-
> of-order VACUUM options as described above.  This is a prerequisite
> change for opening up VACOPT_NOWAIT to users, as adding it to the
> grammar as-is would cause statements like
>
>         VACUUM VERBOSE NOWAIT ANALYZE VERBOSE NOWAIT pg_class;
>
> to be allowed.  Since I am also looking to introduce a parenthesized
> syntax for ANALYZE, this patch would prevent statements like

If you add only a parenthesized grammar of ANALYZE, it seems to me
that this would not be an allowed query anyway.

>         VACUUM VERBOSE ANALYZE (VERBOSE, NOWAIT) pg_class;
>
> from being accepted as well.

This could also be equivalent to ANALYZE(VERBOSE, NOWAIT). I can see
pros and cons by keeping the complete extension of AnalyzeStmt in the
VACUUM grammar, but as the documentation does not mention VACUUM
VERBOSE ANALYZE VERBOSE as a legit query I guess that ripping it out
is not going to annoy many users. Still let's see opinions from other
folks on -hackers as another approach to the feature you want to add
here would be to just implement the grammar for VACUUM but *not*
ANALYZE, but I'd like to think that we still want ANALYZE to be
supported, and also we want it to be extensible as a different thing
than what VACUUM is.

> 0002_add_parenthesized_analyze_syntax_v1.patch
>
> As a prerequisite to giving users access to the VACOPT_NOWAIT option, I
> am introducing a parenthesized syntax for ANALYZE that is similar to
> VACUUM's.  Following VACUUM's example, any new options will be added to
> this syntax, and the old style will become deprecated.
>
> Adding a parenthesized syntax for ANALYZE is not strictly necessary for
> providing the NOWAIT option, as NOWAIT is already a keyword in gram.y.
> However, I thought it would be good to match VACUUM (since the command
> structure is so similar), and this work would be required at some point
> anyway if ANALYZE ever accepts options that we do not want to make
> reserved keywords.

Yep, agreed with the contents of this patch.

> 0003_add_nowait_vacuum_option_v1.patch
>
> This change provides the existing VACOPT_NOWAIT option to users.  This
> option was previously only used by autovacuum in special cases, but it
> seems like a potentially valuable option in other cases as well.  For
> example, perhaps a user wants to run a nightly VACUUM job that skips
> tables that cannot be locked due to autovacuum (or something else)
> already working on it.
>
> I chose to use NOWAIT as the option name because it is already a keyword
> for the LOCK command.

Makes sense to me.

> This patch follows the existing logging precedent added by 11d8d72c and
> ab6eaee8: if a relation is explicitly specified in the command, a log
> statement will be emitted if it is skipped.  If the relation is not
> specified (e.g. "VACUUM FULL;"), no such log statements will be emitted.

+WARNING:  skipping analyze of "test1" --- lock not available
+step analyze_specified: ANALYZE (NOWAIT) test1, test2;
+step commit:
+   COMMIT;
OK for a WARNING for me in this case. We already do that for the other entries.

Let's see what other folks think first about the ANALYZE grammar in
VACUUM, as well as having ANALYZE use a parenthesized grammer. FWIW, I
like the contents of this patch set, and the thing is non-intrusive. I
think that NOWAIT gains more value now that multiple relations can be
vacuumed or analyzed with a single manual command.
-- 
Michael


Re: BUG #14941: Vacuum crashes

From
Masahiko Sawada
Date:
On Mon, Dec 11, 2017 at 2:13 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Wed, Dec 6, 2017 at 1:52 AM, Bossart, Nathan <bossartn@amazon.com> wrote:
>> 0001_fix_unparenthesized_vacuum_grammar_v1.patch
>>
>> One VacuumStmt grammar rule allows users to specify the VACOPT_ANALYZE
>> option by including an AnalyzeStmt at the end of the command.  This
>> appears to have been added as part of the introduction of the ANALYZE
>> command (f905d65e).  However, this means that users can call VACUUM
>> commands like
>>
>>         VACUUM VERBOSE ANALYZE VERBOSE pg_class;
>>
>> Besides being disallowed according to the documentation, I think this
>> may give users the idea that there is a difference between the VERBOSE
>> options for VACUUM versus ANALYZE.  In fact, they are the same option,
>> and specifying it twice is redundant.
>
> Hmm. Yeah. If adding a parenthesis grammar to ANALYZE this is a good
> move for long purposes. If we don't do that now, or at least for the
> purpose of this patch set, then a AnalyzeStmt node embedded directly
> in the grammar of VACUUM is likely to bite in the future.
>
>> This change fixes the unparenthesized VACUUM syntax to disallow the out-
>> of-order VACUUM options as described above.  This is a prerequisite
>> change for opening up VACOPT_NOWAIT to users, as adding it to the
>> grammar as-is would cause statements like
>>
>>         VACUUM VERBOSE NOWAIT ANALYZE VERBOSE NOWAIT pg_class;
>>
>> to be allowed.  Since I am also looking to introduce a parenthesized
>> syntax for ANALYZE, this patch would prevent statements like
>
> If you add only a parenthesized grammar of ANALYZE, it seems to me
> that this would not be an allowed query anyway.
>
>>         VACUUM VERBOSE ANALYZE (VERBOSE, NOWAIT) pg_class;
>>
>> from being accepted as well.
>
> This could also be equivalent to ANALYZE(VERBOSE, NOWAIT). I can see
> pros and cons by keeping the complete extension of AnalyzeStmt in the
> VACUUM grammar, but as the documentation does not mention VACUUM
> VERBOSE ANALYZE VERBOSE as a legit query I guess that ripping it out
> is not going to annoy many users. Still let's see opinions from other
> folks on -hackers as another approach to the feature you want to add
> here would be to just implement the grammar for VACUUM but *not*
> ANALYZE, but I'd like to think that we still want ANALYZE to be
> supported, and also we want it to be extensible as a different thing
> than what VACUUM is.
>
>> 0002_add_parenthesized_analyze_syntax_v1.patch
>>
>> As a prerequisite to giving users access to the VACOPT_NOWAIT option, I
>> am introducing a parenthesized syntax for ANALYZE that is similar to
>> VACUUM's.  Following VACUUM's example, any new options will be added to
>> this syntax, and the old style will become deprecated.
>>
>> Adding a parenthesized syntax for ANALYZE is not strictly necessary for
>> providing the NOWAIT option, as NOWAIT is already a keyword in gram.y.
>> However, I thought it would be good to match VACUUM (since the command
>> structure is so similar), and this work would be required at some point
>> anyway if ANALYZE ever accepts options that we do not want to make
>> reserved keywords.
>
> Yep, agreed with the contents of this patch.
>
>> 0003_add_nowait_vacuum_option_v1.patch
>>
>> This change provides the existing VACOPT_NOWAIT option to users.  This
>> option was previously only used by autovacuum in special cases, but it
>> seems like a potentially valuable option in other cases as well.  For
>> example, perhaps a user wants to run a nightly VACUUM job that skips
>> tables that cannot be locked due to autovacuum (or something else)
>> already working on it.
>>
>> I chose to use NOWAIT as the option name because it is already a keyword
>> for the LOCK command.
>
> Makes sense to me.
>
>> This patch follows the existing logging precedent added by 11d8d72c and
>> ab6eaee8: if a relation is explicitly specified in the command, a log
>> statement will be emitted if it is skipped.  If the relation is not
>> specified (e.g. "VACUUM FULL;"), no such log statements will be emitted.
>
> +WARNING:  skipping analyze of "test1" --- lock not available
> +step analyze_specified: ANALYZE (NOWAIT) test1, test2;
> +step commit:
> +   COMMIT;
> OK for a WARNING for me in this case. We already do that for the other entries.

I also reviewed the patches.

For 0003 patch, if we do VACUUM (NOWAIT) to whole database we don't
get the above WARNING messages, but I think we should get them. The
reported issue also did VACUUM FULL and REINDEX to whole database.
Especially when VACUUM to whole database the information of skipped
relations would be useful for users.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: BUG #14941: Vacuum crashes

From
Masahiko Sawada
Date:
On Mon, Dec 11, 2017 at 2:13 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Wed, Dec 6, 2017 at 1:52 AM, Bossart, Nathan <bossartn@amazon.com> wrote:
>> 0001_fix_unparenthesized_vacuum_grammar_v1.patch
>>
>> One VacuumStmt grammar rule allows users to specify the VACOPT_ANALYZE
>> option by including an AnalyzeStmt at the end of the command.  This
>> appears to have been added as part of the introduction of the ANALYZE
>> command (f905d65e).  However, this means that users can call VACUUM
>> commands like
>>
>>         VACUUM VERBOSE ANALYZE VERBOSE pg_class;
>>
>> Besides being disallowed according to the documentation, I think this
>> may give users the idea that there is a difference between the VERBOSE
>> options for VACUUM versus ANALYZE.  In fact, they are the same option,
>> and specifying it twice is redundant.
>
> Hmm. Yeah. If adding a parenthesis grammar to ANALYZE this is a good
> move for long purposes. If we don't do that now, or at least for the
> purpose of this patch set, then a AnalyzeStmt node embedded directly
> in the grammar of VACUUM is likely to bite in the future.
>
>> This change fixes the unparenthesized VACUUM syntax to disallow the out-
>> of-order VACUUM options as described above.  This is a prerequisite
>> change for opening up VACOPT_NOWAIT to users, as adding it to the
>> grammar as-is would cause statements like
>>
>>         VACUUM VERBOSE NOWAIT ANALYZE VERBOSE NOWAIT pg_class;
>>
>> to be allowed.  Since I am also looking to introduce a parenthesized
>> syntax for ANALYZE, this patch would prevent statements like
>
> If you add only a parenthesized grammar of ANALYZE, it seems to me
> that this would not be an allowed query anyway.
>
>>         VACUUM VERBOSE ANALYZE (VERBOSE, NOWAIT) pg_class;
>>
>> from being accepted as well.
>
> This could also be equivalent to ANALYZE(VERBOSE, NOWAIT). I can see
> pros and cons by keeping the complete extension of AnalyzeStmt in the
> VACUUM grammar, but as the documentation does not mention VACUUM
> VERBOSE ANALYZE VERBOSE as a legit query I guess that ripping it out
> is not going to annoy many users. Still let's see opinions from other
> folks on -hackers as another approach to the feature you want to add
> here would be to just implement the grammar for VACUUM but *not*
> ANALYZE, but I'd like to think that we still want ANALYZE to be
> supported, and also we want it to be extensible as a different thing
> than what VACUUM is.
>
>> 0002_add_parenthesized_analyze_syntax_v1.patch
>>
>> As a prerequisite to giving users access to the VACOPT_NOWAIT option, I
>> am introducing a parenthesized syntax for ANALYZE that is similar to
>> VACUUM's.  Following VACUUM's example, any new options will be added to
>> this syntax, and the old style will become deprecated.
>>
>> Adding a parenthesized syntax for ANALYZE is not strictly necessary for
>> providing the NOWAIT option, as NOWAIT is already a keyword in gram.y.
>> However, I thought it would be good to match VACUUM (since the command
>> structure is so similar), and this work would be required at some point
>> anyway if ANALYZE ever accepts options that we do not want to make
>> reserved keywords.
>
> Yep, agreed with the contents of this patch.
>
>> 0003_add_nowait_vacuum_option_v1.patch
>>
>> This change provides the existing VACOPT_NOWAIT option to users.  This
>> option was previously only used by autovacuum in special cases, but it
>> seems like a potentially valuable option in other cases as well.  For
>> example, perhaps a user wants to run a nightly VACUUM job that skips
>> tables that cannot be locked due to autovacuum (or something else)
>> already working on it.
>>
>> I chose to use NOWAIT as the option name because it is already a keyword
>> for the LOCK command.
>
> Makes sense to me.
>
>> This patch follows the existing logging precedent added by 11d8d72c and
>> ab6eaee8: if a relation is explicitly specified in the command, a log
>> statement will be emitted if it is skipped.  If the relation is not
>> specified (e.g. "VACUUM FULL;"), no such log statements will be emitted.
>
> +WARNING:  skipping analyze of "test1" --- lock not available
> +step analyze_specified: ANALYZE (NOWAIT) test1, test2;
> +step commit:
> +   COMMIT;
> OK for a WARNING for me in this case. We already do that for the other entries.

I also reviewed the patches.

For 0003 patch, if we do VACUUM (NOWAIT) to whole database we don't
get the above WARNING messages, but I think we should get them. The
reported issue also did VACUUM FULL and REINDEX to whole database.
Especially when VACUUM to whole database the information of skipped
relations would be useful for users.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 12/10/17, 11:13 PM, "Michael Paquier" <michael.paquier@gmail.com> wrote:
> Let's see what other folks think first about the ANALYZE grammar in
> VACUUM, as well as having ANALYZE use a parenthesized grammer. FWIW, I
> like the contents of this patch set, and the thing is non-intrusive. I
> think that NOWAIT gains more value now that multiple relations can be
> vacuumed or analyzed with a single manual command.

Thanks for taking a look.

Nathan


Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 12/10/17, 11:13 PM, "Michael Paquier" <michael.paquier@gmail.com> wrote:
> Let's see what other folks think first about the ANALYZE grammar in
> VACUUM, as well as having ANALYZE use a parenthesized grammer. FWIW, I
> like the contents of this patch set, and the thing is non-intrusive. I
> think that NOWAIT gains more value now that multiple relations can be
> vacuumed or analyzed with a single manual command.

Thanks for taking a look.

Nathan


Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 12/11/17, 9:41 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:
> I also reviewed the patches.

Thanks for taking a look.

> For 0003 patch, if we do VACUUM (NOWAIT) to whole database we don't
> get the above WARNING messages, but I think we should get them. The
> reported issue also did VACUUM FULL and REINDEX to whole database.
> Especially when VACUUM to whole database the information of skipped
> relations would be useful for users.

Perhaps we could attempt to construct the RangeVar just before
logging in this case.

Nathan


Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 12/11/17, 9:41 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:
> I also reviewed the patches.

Thanks for taking a look.

> For 0003 patch, if we do VACUUM (NOWAIT) to whole database we don't
> get the above WARNING messages, but I think we should get them. The
> reported issue also did VACUUM FULL and REINDEX to whole database.
> Especially when VACUUM to whole database the information of skipped
> relations would be useful for users.

Perhaps we could attempt to construct the RangeVar just before
logging in this case.

Nathan


Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 12/11/17, 9:41 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:
> For 0003 patch, if we do VACUUM (NOWAIT) to whole database we don't
> get the above WARNING messages, but I think we should get them. The
> reported issue also did VACUUM FULL and REINDEX to whole database.
> Especially when VACUUM to whole database the information of skipped
> relations would be useful for users.

Here is a new set of patches.  0001 and 0002 are essentially the same
as before except for a rebase and some small indentation fixes.  0003
now includes logic to log all skipped relations regardless of whether
they were specified in the command.  I've also modified the isolation
test slightly to use partitioned tables instead of running VACUUM and
ANALYZE on the whole database.  This helps prevent timeouts on build-
farm machines with CLOBBER_CACHE_ALWAYS enabled (like with 0a3edbb3).

Nathan


Attachment

Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 12/11/17, 9:41 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:
> For 0003 patch, if we do VACUUM (NOWAIT) to whole database we don't
> get the above WARNING messages, but I think we should get them. The
> reported issue also did VACUUM FULL and REINDEX to whole database.
> Especially when VACUUM to whole database the information of skipped
> relations would be useful for users.

Here is a new set of patches.  0001 and 0002 are essentially the same
as before except for a rebase and some small indentation fixes.  0003
now includes logic to log all skipped relations regardless of whether
they were specified in the command.  I've also modified the isolation
test slightly to use partitioned tables instead of running VACUUM and
ANALYZE on the whole database.  This helps prevent timeouts on build-
farm machines with CLOBBER_CACHE_ALWAYS enabled (like with 0a3edbb3).

Nathan


Attachment

Re: BUG #14941: Vacuum crashes

From
Masahiko Sawada
Date:
On Tue, Dec 19, 2017 at 2:35 AM, Bossart, Nathan <bossartn@amazon.com> wrote:
> On 12/11/17, 9:41 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:
>> For 0003 patch, if we do VACUUM (NOWAIT) to whole database we don't
>> get the above WARNING messages, but I think we should get them. The
>> reported issue also did VACUUM FULL and REINDEX to whole database.
>> Especially when VACUUM to whole database the information of skipped
>> relations would be useful for users.
>
> Here is a new set of patches.  0001 and 0002 are essentially the same
> as before except for a rebase and some small indentation fixes.  0003
> now includes logic to log all skipped relations regardless of whether
> they were specified in the command.  I've also modified the isolation
> test slightly to use partitioned tables instead of running VACUUM and
> ANALYZE on the whole database.  This helps prevent timeouts on build-
> farm machines with CLOBBER_CACHE_ALWAYS enabled (like with 0a3edbb3).
>

Thank you for updating the patches.

According to the following old comment, there might be reason why we
didn't pass the information to vacuum_rel(). But your patch fetches
the relation
name even if the "relation" is not provided. I wonder if it can be
problem in some cases.

-                * If the RangeVar is not defined, we do not have
enough information
-                * to provide a meaningful log statement.  Chances are that
-                * vacuum_rel's caller has intentionally not provided
this information
-                * so that this logging is skipped, anyway.
+                * If relname is NULL, we do not have enough
information to provide a
+                * meaningful log statement.  Chances are that this
information was
+                * intentionally not provided so that this logging is
skipped, anyway.


It would be an another discussion but autovacuum logs usually use
explicit names. Since the following two logs could be emitted by
autovacuum I wonder if we can make an explicit relation name if we're
autovacuum.

     if (elevel != 0)
     {
         if (!rel_lock)
             ereport(elevel,
                     (errcode(ERRCODE_LOCK_NOT_AVAILABLE),
                      errmsg("skipping vacuum of \"%s\" --- lock not available",
                            relname)));
         else
             ereport(elevel,
                     (errcode(ERRCODE_UNDEFINED_TABLE),
                      errmsg("skipping vacuum of \"%s\" --- relation
no longer exists",
                            relname)));
     }

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: BUG #14941: Vacuum crashes

From
Masahiko Sawada
Date:
On Tue, Dec 19, 2017 at 2:35 AM, Bossart, Nathan <bossartn@amazon.com> wrote:
> On 12/11/17, 9:41 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:
>> For 0003 patch, if we do VACUUM (NOWAIT) to whole database we don't
>> get the above WARNING messages, but I think we should get them. The
>> reported issue also did VACUUM FULL and REINDEX to whole database.
>> Especially when VACUUM to whole database the information of skipped
>> relations would be useful for users.
>
> Here is a new set of patches.  0001 and 0002 are essentially the same
> as before except for a rebase and some small indentation fixes.  0003
> now includes logic to log all skipped relations regardless of whether
> they were specified in the command.  I've also modified the isolation
> test slightly to use partitioned tables instead of running VACUUM and
> ANALYZE on the whole database.  This helps prevent timeouts on build-
> farm machines with CLOBBER_CACHE_ALWAYS enabled (like with 0a3edbb3).
>

Thank you for updating the patches.

According to the following old comment, there might be reason why we
didn't pass the information to vacuum_rel(). But your patch fetches
the relation
name even if the "relation" is not provided. I wonder if it can be
problem in some cases.

-                * If the RangeVar is not defined, we do not have
enough information
-                * to provide a meaningful log statement.  Chances are that
-                * vacuum_rel's caller has intentionally not provided
this information
-                * so that this logging is skipped, anyway.
+                * If relname is NULL, we do not have enough
information to provide a
+                * meaningful log statement.  Chances are that this
information was
+                * intentionally not provided so that this logging is
skipped, anyway.


It would be an another discussion but autovacuum logs usually use
explicit names. Since the following two logs could be emitted by
autovacuum I wonder if we can make an explicit relation name if we're
autovacuum.

     if (elevel != 0)
     {
         if (!rel_lock)
             ereport(elevel,
                     (errcode(ERRCODE_LOCK_NOT_AVAILABLE),
                      errmsg("skipping vacuum of \"%s\" --- lock not available",
                            relname)));
         else
             ereport(elevel,
                     (errcode(ERRCODE_UNDEFINED_TABLE),
                      errmsg("skipping vacuum of \"%s\" --- relation
no longer exists",
                            relname)));
     }

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 12/18/17, 3:30 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:
> According to the following old comment, there might be reason why we
> didn't pass the information to vacuum_rel(). But your patch fetches
> the relation
> name even if the "relation" is not provided. I wonder if it can be
> problem in some cases.

Thanks for taking another look.

I've thought through a few edge cases here, but I haven't noticed
anything that I think is a problem.  If an unspecified relation is
renamed prior to get_rel_name(), we'll use the updated name, which
doesn't seem like an issue.  If an unspecified relation is renamed
between get_rel_name() and the log statement, we'll use the old name,
which seems possible in the current logging logic for VACUUM/ANALYZE.
And if an unspecified relation is dropped just prior to
get_rel_name(), the result will be NULL, and the logging will be
skipped (as it already is for concurrently dropped relations that are
not specified in the command).

Nathan


Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 12/18/17, 3:30 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:
> According to the following old comment, there might be reason why we
> didn't pass the information to vacuum_rel(). But your patch fetches
> the relation
> name even if the "relation" is not provided. I wonder if it can be
> problem in some cases.

Thanks for taking another look.

I've thought through a few edge cases here, but I haven't noticed
anything that I think is a problem.  If an unspecified relation is
renamed prior to get_rel_name(), we'll use the updated name, which
doesn't seem like an issue.  If an unspecified relation is renamed
between get_rel_name() and the log statement, we'll use the old name,
which seems possible in the current logging logic for VACUUM/ANALYZE.
And if an unspecified relation is dropped just prior to
get_rel_name(), the result will be NULL, and the logging will be
skipped (as it already is for concurrently dropped relations that are
not specified in the command).

Nathan


Re: BUG #14941: Vacuum crashes

From
Masahiko Sawada
Date:
On Tue, Dec 19, 2017 at 7:18 AM, Bossart, Nathan <bossartn@amazon.com> wrote:
> On 12/18/17, 3:30 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:
>> According to the following old comment, there might be reason why we
>> didn't pass the information to vacuum_rel(). But your patch fetches
>> the relation
>> name even if the "relation" is not provided. I wonder if it can be
>> problem in some cases.
>
> Thanks for taking another look.
>
> I've thought through a few edge cases here, but I haven't noticed
> anything that I think is a problem.  If an unspecified relation is
> renamed prior to get_rel_name(), we'll use the updated name, which
> doesn't seem like an issue.  If an unspecified relation is renamed
> between get_rel_name() and the log statement, we'll use the old name,
> which seems possible in the current logging logic for VACUUM/ANALYZE.
> And if an unspecified relation is dropped just prior to
> get_rel_name(), the result will be NULL, and the logging will be
> skipped (as it already is for concurrently dropped relations that are
> not specified in the command).
>

Thank you for explanation.

There are three cases where "relation" is set NULL:
 * Vacuum to whole database
 * Child tables when vacuum to parent table
 * TOAST tables when vacuum to normal table
As current related comment says, it would not be appropriate to
complain if we could not open such unspecified relations later. But
with you patch, we would complain it only if we could not  take a lock
on these relations. It's fine with me.

On the latest patch, it looks good to me except for a following comment.

-                * If the RangeVar is not defined, we do not have
enough information
-                * to provide a meaningful log statement.  Chances are that
-                * vacuum_rel's caller has intentionally not provided
this information
-                * so that this logging is skipped, anyway.
+                * If relname is NULL, we do not have enough
information to provide a
+                * meaningful log statement.  Chances are that this
information was
+                * intentionally not provided so that this logging is
skipped, anyway.

This comment looks odd to me because we ourselves didn't set relname
just before. For example, we can move the sentence to above comment
block as follows. Thoughts?

        /*
         * If relation is NULL, we do not have enough information to provide a
         * meaningful log statement. Chances are that this information was
         * intentionally not provided so that this logging is skipped, anyway.
         * However, iff VACOPT_NOWAIT is specified, we always try to emit
         * a log statement even if relation is NULL.
         */

(snip)

        /*
         * Determine the log level.
         *
         * For autovacuum logs, we emit a LOG if
         * log_autovacuum_min_duration is not diabled. For manual VACUUM, we
         * emit a WARNING to match the log statements in the permission
         * checks.
         */

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: BUG #14941: Vacuum crashes

From
Masahiko Sawada
Date:
On Tue, Dec 19, 2017 at 7:18 AM, Bossart, Nathan <bossartn@amazon.com> wrote:
> On 12/18/17, 3:30 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:
>> According to the following old comment, there might be reason why we
>> didn't pass the information to vacuum_rel(). But your patch fetches
>> the relation
>> name even if the "relation" is not provided. I wonder if it can be
>> problem in some cases.
>
> Thanks for taking another look.
>
> I've thought through a few edge cases here, but I haven't noticed
> anything that I think is a problem.  If an unspecified relation is
> renamed prior to get_rel_name(), we'll use the updated name, which
> doesn't seem like an issue.  If an unspecified relation is renamed
> between get_rel_name() and the log statement, we'll use the old name,
> which seems possible in the current logging logic for VACUUM/ANALYZE.
> And if an unspecified relation is dropped just prior to
> get_rel_name(), the result will be NULL, and the logging will be
> skipped (as it already is for concurrently dropped relations that are
> not specified in the command).
>

Thank you for explanation.

There are three cases where "relation" is set NULL:
 * Vacuum to whole database
 * Child tables when vacuum to parent table
 * TOAST tables when vacuum to normal table
As current related comment says, it would not be appropriate to
complain if we could not open such unspecified relations later. But
with you patch, we would complain it only if we could not  take a lock
on these relations. It's fine with me.

On the latest patch, it looks good to me except for a following comment.

-                * If the RangeVar is not defined, we do not have
enough information
-                * to provide a meaningful log statement.  Chances are that
-                * vacuum_rel's caller has intentionally not provided
this information
-                * so that this logging is skipped, anyway.
+                * If relname is NULL, we do not have enough
information to provide a
+                * meaningful log statement.  Chances are that this
information was
+                * intentionally not provided so that this logging is
skipped, anyway.

This comment looks odd to me because we ourselves didn't set relname
just before. For example, we can move the sentence to above comment
block as follows. Thoughts?

        /*
         * If relation is NULL, we do not have enough information to provide a
         * meaningful log statement. Chances are that this information was
         * intentionally not provided so that this logging is skipped, anyway.
         * However, iff VACOPT_NOWAIT is specified, we always try to emit
         * a log statement even if relation is NULL.
         */

(snip)

        /*
         * Determine the log level.
         *
         * For autovacuum logs, we emit a LOG if
         * log_autovacuum_min_duration is not diabled. For manual VACUUM, we
         * emit a WARNING to match the log statements in the permission
         * checks.
         */

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 12/21/17, 11:07 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:
> On the latest patch, it looks good to me except for a following comment.
>
> -                * If the RangeVar is not defined, we do not have
> enough information
> -                * to provide a meaningful log statement.  Chances are that
> -                * vacuum_rel's caller has intentionally not provided
> this information
> -                * so that this logging is skipped, anyway.
> +                * If relname is NULL, we do not have enough
> information to provide a
> +                * meaningful log statement.  Chances are that this
> information was
> +                * intentionally not provided so that this logging is
> skipped, anyway.
>
> This comment looks odd to me because we ourselves didn't set relname
> just before. For example, we can move the sentence to above comment
> block as follows. Thoughts?
>
>         /*
>          * If relation is NULL, we do not have enough information to provide a
>          * meaningful log statement. Chances are that this information was
>          * intentionally not provided so that this logging is skipped, anyway.
>          * However, iff VACOPT_NOWAIT is specified, we always try to emit
>          * a log statement even if relation is NULL.
>          */
>
> (snip)
>
>         /*
>          * Determine the log level.
>          *
>          * For autovacuum logs, we emit a LOG if
>          * log_autovacuum_min_duration is not diabled. For manual VACUUM, we
>          * emit a WARNING to match the log statements in the permission
>          * checks.
>          */

I agree, this makes more sense.  I've made this change in v3 of 0003.

Nathan


Attachment

Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 12/21/17, 11:07 PM, "Masahiko Sawada" <sawada.mshk@gmail.com> wrote:
> On the latest patch, it looks good to me except for a following comment.
>
> -                * If the RangeVar is not defined, we do not have
> enough information
> -                * to provide a meaningful log statement.  Chances are that
> -                * vacuum_rel's caller has intentionally not provided
> this information
> -                * so that this logging is skipped, anyway.
> +                * If relname is NULL, we do not have enough
> information to provide a
> +                * meaningful log statement.  Chances are that this
> information was
> +                * intentionally not provided so that this logging is
> skipped, anyway.
>
> This comment looks odd to me because we ourselves didn't set relname
> just before. For example, we can move the sentence to above comment
> block as follows. Thoughts?
>
>         /*
>          * If relation is NULL, we do not have enough information to provide a
>          * meaningful log statement. Chances are that this information was
>          * intentionally not provided so that this logging is skipped, anyway.
>          * However, iff VACOPT_NOWAIT is specified, we always try to emit
>          * a log statement even if relation is NULL.
>          */
>
> (snip)
>
>         /*
>          * Determine the log level.
>          *
>          * For autovacuum logs, we emit a LOG if
>          * log_autovacuum_min_duration is not diabled. For manual VACUUM, we
>          * emit a WARNING to match the log statements in the permission
>          * checks.
>          */

I agree, this makes more sense.  I've made this change in v3 of 0003.

Nathan


Attachment

Re: BUG #14941: Vacuum crashes

From
Michael Paquier
Date:
On Thu, Dec 28, 2017 at 10:46:18PM +0000, Bossart, Nathan wrote:
> I agree, this makes more sense.  I've made this change in v3 of 0003.

Based on the opinions gathered on this thread, 0001 and 0002 seem to be
in the shape wanted, and those look good for shipping. Now for 0003 we
are not there yet.

-       if (relation == NULL)
+       if (relation != NULL)
+           relname = relation->relname;
+       else if (!rel_lock)
+           relname = get_rel_name(relid);
+
+       if (relname == NULL)
I think that you are doing it wrong here. In get_all_vacuum_rels() you
should build a RangeVar to be reused in the context of this error
message, and hence you'll save an extra lookup based on the relation
OID here, saving from any timing issues that you have overseen as in
this code path a lock on the relation whose name is looked at is not
taken. Relying on the RangeVar being NULL to not generate any logs is
fine as a concept to me, but let's fill it where it is needed, and in
the case of this patch a VACUUM NOWAIT on the whole database is such a
case.
--
Michael

Attachment

Re: BUG #14941: Vacuum crashes

From
Michael Paquier
Date:
On Thu, Dec 28, 2017 at 10:46:18PM +0000, Bossart, Nathan wrote:
> I agree, this makes more sense.  I've made this change in v3 of 0003.

Based on the opinions gathered on this thread, 0001 and 0002 seem to be
in the shape wanted, and those look good for shipping. Now for 0003 we
are not there yet.

-       if (relation == NULL)
+       if (relation != NULL)
+           relname = relation->relname;
+       else if (!rel_lock)
+           relname = get_rel_name(relid);
+
+       if (relname == NULL)
I think that you are doing it wrong here. In get_all_vacuum_rels() you
should build a RangeVar to be reused in the context of this error
message, and hence you'll save an extra lookup based on the relation
OID here, saving from any timing issues that you have overseen as in
this code path a lock on the relation whose name is looked at is not
taken. Relying on the RangeVar being NULL to not generate any logs is
fine as a concept to me, but let's fill it where it is needed, and in
the case of this patch a VACUUM NOWAIT on the whole database is such a
case.
--
Michael

Attachment

Re: BUG #14941: Vacuum crashes

From
Robert Haas
Date:
On Mon, Jan 8, 2018 at 11:27 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Thu, Dec 28, 2017 at 10:46:18PM +0000, Bossart, Nathan wrote:
>> I agree, this makes more sense.  I've made this change in v3 of 0003.
>
> Based on the opinions gathered on this thread, 0001 and 0002 seem to be
> in the shape wanted, and those look good for shipping.

Committed 0001.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: BUG #14941: Vacuum crashes

From
Robert Haas
Date:
On Mon, Jan 8, 2018 at 11:27 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Thu, Dec 28, 2017 at 10:46:18PM +0000, Bossart, Nathan wrote:
>> I agree, this makes more sense.  I've made this change in v3 of 0003.
>
> Based on the opinions gathered on this thread, 0001 and 0002 seem to be
> in the shape wanted, and those look good for shipping.

Committed 0001.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 1/8/18, 10:28 PM, "Michael Paquier" <michael.paquier@gmail.com> wrote:
> Based on the opinions gathered on this thread, 0001 and 0002 seem to be
> in the shape wanted, and those look good for shipping. Now for 0003 we
> are not there yet.

Thanks for taking a look.

> -       if (relation == NULL)
> +       if (relation != NULL)
> +           relname = relation->relname;
> +       else if (!rel_lock)
> +           relname = get_rel_name(relid);
> +
> +       if (relname == NULL)
> I think that you are doing it wrong here. In get_all_vacuum_rels() you
> should build a RangeVar to be reused in the context of this error
> message, and hence you'll save an extra lookup based on the relation
> OID here, saving from any timing issues that you have overseen as in
> this code path a lock on the relation whose name is looked at is not
> taken. Relying on the RangeVar being NULL to not generate any logs is
> fine as a concept to me, but let's fill it where it is needed, and in
> the case of this patch a VACUUM NOWAIT on the whole database is such a
> case.

I understand what you are saying here.  I think there are two competing
logging behaviors:

  1. If a relation is concurrently dropped, we should skip logging if
     the relation was not specified in the original VACUUM/ANALYZE
     command [0]
  2. If a relation is skipped because the lock is not available, we
     should never skip logging

Since we rely on the RangeVar being NULL to determine whether or not
we should log for case 1, filling in the RangeVar during
get_all_vacuum_rels() would add complexity.  Any time VACOPT_NOWAIT is
specified, we must construct a RangeVar for all relations.  Also, we
must find a new way to track whether or not the relation was specified
in the original command in order to maintain the behavior of case 1.

IMO it is simpler to call get_rel_name() to discover the relation name
in this specific case (NOWAIT is specified, lock is not available, and
the relation was not specified in the original command).  I detailed
some potential edge cases of this approach earlier [1].  Even if I am
missing some, I think the worst case is that get_rel_name() returns
NULL and the logging is skipped.

What do you think?  I will take a deeper look into how your suggested
approach might be achieved.

Nathan

[0] https://postgr.es/m/28748.1507071576%40sss.pgh.pa.us
[1] https://postgr.es/m/32F4B778-292B-41AF-891A-497B8127CD59%40amazon.com


Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 1/8/18, 10:28 PM, "Michael Paquier" <michael.paquier@gmail.com> wrote:
> Based on the opinions gathered on this thread, 0001 and 0002 seem to be
> in the shape wanted, and those look good for shipping. Now for 0003 we
> are not there yet.

Thanks for taking a look.

> -       if (relation == NULL)
> +       if (relation != NULL)
> +           relname = relation->relname;
> +       else if (!rel_lock)
> +           relname = get_rel_name(relid);
> +
> +       if (relname == NULL)
> I think that you are doing it wrong here. In get_all_vacuum_rels() you
> should build a RangeVar to be reused in the context of this error
> message, and hence you'll save an extra lookup based on the relation
> OID here, saving from any timing issues that you have overseen as in
> this code path a lock on the relation whose name is looked at is not
> taken. Relying on the RangeVar being NULL to not generate any logs is
> fine as a concept to me, but let's fill it where it is needed, and in
> the case of this patch a VACUUM NOWAIT on the whole database is such a
> case.

I understand what you are saying here.  I think there are two competing
logging behaviors:

  1. If a relation is concurrently dropped, we should skip logging if
     the relation was not specified in the original VACUUM/ANALYZE
     command [0]
  2. If a relation is skipped because the lock is not available, we
     should never skip logging

Since we rely on the RangeVar being NULL to determine whether or not
we should log for case 1, filling in the RangeVar during
get_all_vacuum_rels() would add complexity.  Any time VACOPT_NOWAIT is
specified, we must construct a RangeVar for all relations.  Also, we
must find a new way to track whether or not the relation was specified
in the original command in order to maintain the behavior of case 1.

IMO it is simpler to call get_rel_name() to discover the relation name
in this specific case (NOWAIT is specified, lock is not available, and
the relation was not specified in the original command).  I detailed
some potential edge cases of this approach earlier [1].  Even if I am
missing some, I think the worst case is that get_rel_name() returns
NULL and the logging is skipped.

What do you think?  I will take a deeper look into how your suggested
approach might be achieved.

Nathan

[0] https://postgr.es/m/28748.1507071576%40sss.pgh.pa.us
[1] https://postgr.es/m/32F4B778-292B-41AF-891A-497B8127CD59%40amazon.com


Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 1/9/18, 9:24 AM, "Robert Haas" <robertmhaas@gmail.com> wrote:
> On Mon, Jan 8, 2018 at 11:27 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> On Thu, Dec 28, 2017 at 10:46:18PM +0000, Bossart, Nathan wrote:
>>> I agree, this makes more sense.  I've made this change in v3 of 0003.
>>
>> Based on the opinions gathered on this thread, 0001 and 0002 seem to be
>> in the shape wanted, and those look good for shipping.
>
> Committed 0001.

Thanks!

Nathan


Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 1/9/18, 9:24 AM, "Robert Haas" <robertmhaas@gmail.com> wrote:
> On Mon, Jan 8, 2018 at 11:27 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> On Thu, Dec 28, 2017 at 10:46:18PM +0000, Bossart, Nathan wrote:
>>> I agree, this makes more sense.  I've made this change in v3 of 0003.
>>
>> Based on the opinions gathered on this thread, 0001 and 0002 seem to be
>> in the shape wanted, and those look good for shipping.
>
> Committed 0001.

Thanks!

Nathan


Re: BUG #14941: Vacuum crashes

From
Andres Freund
Date:
On 2018-01-09 10:23:12 -0500, Robert Haas wrote:
> On Mon, Jan 8, 2018 at 11:27 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
> > On Thu, Dec 28, 2017 at 10:46:18PM +0000, Bossart, Nathan wrote:
> >> I agree, this makes more sense.  I've made this change in v3 of 0003.
> >
> > Based on the opinions gathered on this thread, 0001 and 0002 seem to be
> > in the shape wanted, and those look good for shipping.
> 
> Committed 0001.

FWIW, I think in the future it'd be good to develop new features outside
of a thread about "vacuum crashes".

Greetings,

Andres Freund


Re: BUG #14941: Vacuum crashes

From
Andres Freund
Date:
On 2018-01-09 10:23:12 -0500, Robert Haas wrote:
> On Mon, Jan 8, 2018 at 11:27 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
> > On Thu, Dec 28, 2017 at 10:46:18PM +0000, Bossart, Nathan wrote:
> >> I agree, this makes more sense.  I've made this change in v3 of 0003.
> >
> > Based on the opinions gathered on this thread, 0001 and 0002 seem to be
> > in the shape wanted, and those look good for shipping.
> 
> Committed 0001.

FWIW, I think in the future it'd be good to develop new features outside
of a thread about "vacuum crashes".

Greetings,

Andres Freund


Re: BUG #14941: Vacuum crashes

From
Michael Paquier
Date:
On Tue, Jan 09, 2018 at 09:40:50PM +0000, Bossart, Nathan wrote:
> On 1/8/18, 10:28 PM, "Michael Paquier" <michael.paquier@gmail.com> wrote:
>> I think that you are doing it wrong here. In get_all_vacuum_rels() you
>> should build a RangeVar to be reused in the context of this error
>> message, and hence you'll save an extra lookup based on the relation
>> OID here, saving from any timing issues that you have overseen as in
>> this code path a lock on the relation whose name is looked at is not
>> taken. Relying on the RangeVar being NULL to not generate any logs is
>> fine as a concept to me, but let's fill it where it is needed, and in
>> the case of this patch a VACUUM NOWAIT on the whole database is such a
>> case.
>
> I understand what you are saying here.  I think there are two competing
> logging behaviors:
>
>   1. If a relation is concurrently dropped, we should skip logging if
>      the relation was not specified in the original VACUUM/ANALYZE
>      command [0]
>   2. If a relation is skipped because the lock is not available, we
>      should never skip logging

At the end this comes back to if the relation is explicitely listed or
not in the command specified by the user..

> What do you think?  I will take a deeper look into how your suggested
> approach might be achieved.

Backpedalling a bit on this point and coming back to this message from
Tom (https://www.postgresql.org/message-id/28748.1507071576%40sss.pgh.pa.us)
which you just cited. Why do we actually need to issue any WARNING
messages for unlisted relations? Contrary to what Sawada-san complained
upthread, it looks sane to me to not log anything if a relation is not
explicitely listed. So you should not get any warnings for a
database-wide VACUUM if a relation is dropped while the thing is
running, and what you proposed initially in
https://www.postgresql.org/message-id/D3FC73E2-9B1A-4DB4-8180-55F57D116B4E@amazon.com
is more simple, does not need to worry about any kind of timing issues,
and is consistent with autovacuum.
--
Michael

Attachment

Re: BUG #14941: Vacuum crashes

From
Michael Paquier
Date:
On Tue, Jan 09, 2018 at 09:40:50PM +0000, Bossart, Nathan wrote:
> On 1/8/18, 10:28 PM, "Michael Paquier" <michael.paquier@gmail.com> wrote:
>> I think that you are doing it wrong here. In get_all_vacuum_rels() you
>> should build a RangeVar to be reused in the context of this error
>> message, and hence you'll save an extra lookup based on the relation
>> OID here, saving from any timing issues that you have overseen as in
>> this code path a lock on the relation whose name is looked at is not
>> taken. Relying on the RangeVar being NULL to not generate any logs is
>> fine as a concept to me, but let's fill it where it is needed, and in
>> the case of this patch a VACUUM NOWAIT on the whole database is such a
>> case.
>
> I understand what you are saying here.  I think there are two competing
> logging behaviors:
>
>   1. If a relation is concurrently dropped, we should skip logging if
>      the relation was not specified in the original VACUUM/ANALYZE
>      command [0]
>   2. If a relation is skipped because the lock is not available, we
>      should never skip logging

At the end this comes back to if the relation is explicitely listed or
not in the command specified by the user..

> What do you think?  I will take a deeper look into how your suggested
> approach might be achieved.

Backpedalling a bit on this point and coming back to this message from
Tom (https://www.postgresql.org/message-id/28748.1507071576%40sss.pgh.pa.us)
which you just cited. Why do we actually need to issue any WARNING
messages for unlisted relations? Contrary to what Sawada-san complained
upthread, it looks sane to me to not log anything if a relation is not
explicitely listed. So you should not get any warnings for a
database-wide VACUUM if a relation is dropped while the thing is
running, and what you proposed initially in
https://www.postgresql.org/message-id/D3FC73E2-9B1A-4DB4-8180-55F57D116B4E@amazon.com
is more simple, does not need to worry about any kind of timing issues,
and is consistent with autovacuum.
--
Michael

Attachment

Re: BUG #14941: Vacuum crashes

From
Michael Paquier
Date:
On Tue, Jan 09, 2018 at 01:46:55PM -0800, Andres Freund wrote:
> On 2018-01-09 10:23:12 -0500, Robert Haas wrote:
> > On Mon, Jan 8, 2018 at 11:27 PM, Michael Paquier
> > <michael.paquier@gmail.com> wrote:
> > > On Thu, Dec 28, 2017 at 10:46:18PM +0000, Bossart, Nathan wrote:
> > >> I agree, this makes more sense.  I've made this change in v3 of 0003.
> > >
> > > Based on the opinions gathered on this thread, 0001 and 0002 seem to be
> > > in the shape wanted, and those look good for shipping.
> >
> > Committed 0001.
>
> FWIW, I think in the future it'd be good to develop new features outside
> of a thread about "vacuum crashes".

You are right atht this is bad practice. Partially my fault for not
recommending the move to a new thread.
--
Michael

Attachment

Re: BUG #14941: Vacuum crashes

From
Michael Paquier
Date:
On Tue, Jan 09, 2018 at 01:46:55PM -0800, Andres Freund wrote:
> On 2018-01-09 10:23:12 -0500, Robert Haas wrote:
> > On Mon, Jan 8, 2018 at 11:27 PM, Michael Paquier
> > <michael.paquier@gmail.com> wrote:
> > > On Thu, Dec 28, 2017 at 10:46:18PM +0000, Bossart, Nathan wrote:
> > >> I agree, this makes more sense.  I've made this change in v3 of 0003.
> > >
> > > Based on the opinions gathered on this thread, 0001 and 0002 seem to be
> > > in the shape wanted, and those look good for shipping.
> >
> > Committed 0001.
>
> FWIW, I think in the future it'd be good to develop new features outside
> of a thread about "vacuum crashes".

You are right atht this is bad practice. Partially my fault for not
recommending the move to a new thread.
--
Michael

Attachment

Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 1/9/18, 8:54 PM, "Michael Paquier" <michael.paquier@gmail.com> wrote:
> Backpedalling a bit on this point and coming back to this message from
> Tom (https://www.postgresql.org/message-id/28748.1507071576%40sss.pgh.pa.us)
> which you just cited. Why do we actually need to issue any WARNING
> messages for unlisted relations? Contrary to what Sawada-san complained
> upthread, it looks sane to me to not log anything if a relation is not
> explicitely listed. So you should not get any warnings for a
> database-wide VACUUM if a relation is dropped while the thing is
> running, and what you proposed initially in
> https://www.postgresql.org/message-id/D3FC73E2-9B1A-4DB4-8180-55F57D116B4E@amazon.com
> is more simple, does not need to worry about any kind of timing issues,
> and is consistent with autovacuum.

Right.  I don't have a terribly strong opinion either way.  I think the
counter-argument is that logging skipped relations might provide
valuable feedback to users.  If I ran a database-wide VACUUM and a
relation was skipped due to lock contention, it might be nice to know
that so I can handle the relation individually.

Perhaps any logging changes for VACOPT_NOWAIT could be handled in a
separate thread.  For now, I've updated 0003 to remove the logging
changes.

Nathan


Attachment

Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 1/9/18, 8:54 PM, "Michael Paquier" <michael.paquier@gmail.com> wrote:
> Backpedalling a bit on this point and coming back to this message from
> Tom (https://www.postgresql.org/message-id/28748.1507071576%40sss.pgh.pa.us)
> which you just cited. Why do we actually need to issue any WARNING
> messages for unlisted relations? Contrary to what Sawada-san complained
> upthread, it looks sane to me to not log anything if a relation is not
> explicitely listed. So you should not get any warnings for a
> database-wide VACUUM if a relation is dropped while the thing is
> running, and what you proposed initially in
> https://www.postgresql.org/message-id/D3FC73E2-9B1A-4DB4-8180-55F57D116B4E@amazon.com
> is more simple, does not need to worry about any kind of timing issues,
> and is consistent with autovacuum.

Right.  I don't have a terribly strong opinion either way.  I think the
counter-argument is that logging skipped relations might provide
valuable feedback to users.  If I ran a database-wide VACUUM and a
relation was skipped due to lock contention, it might be nice to know
that so I can handle the relation individually.

Perhaps any logging changes for VACOPT_NOWAIT could be handled in a
separate thread.  For now, I've updated 0003 to remove the logging
changes.

Nathan


Attachment

Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 1/9/18, 8:55 PM, "Michael Paquier" <michael.paquier@gmail.com> wrote:
> On Tue, Jan 09, 2018 at 01:46:55PM -0800, Andres Freund wrote:
>> On 2018-01-09 10:23:12 -0500, Robert Haas wrote:
>> > On Mon, Jan 8, 2018 at 11:27 PM, Michael Paquier
>> > <michael.paquier@gmail.com> wrote:
>> > > On Thu, Dec 28, 2017 at 10:46:18PM +0000, Bossart, Nathan wrote:
>> > >> I agree, this makes more sense.  I've made this change in v3 of 0003.
>> > >
>> > > Based on the opinions gathered on this thread, 0001 and 0002 seem to be
>> > > in the shape wanted, and those look good for shipping.
>> > 
>> > Committed 0001.
>> 
>> FWIW, I think in the future it'd be good to develop new features outside
>> of a thread about "vacuum crashes".
>
> You are right atht this is bad practice. Partially my fault for not
> recommending the move to a new thread.

Yes, sorry about that.  I should have opened a new thread.

Nathan


Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 1/9/18, 8:55 PM, "Michael Paquier" <michael.paquier@gmail.com> wrote:
> On Tue, Jan 09, 2018 at 01:46:55PM -0800, Andres Freund wrote:
>> On 2018-01-09 10:23:12 -0500, Robert Haas wrote:
>> > On Mon, Jan 8, 2018 at 11:27 PM, Michael Paquier
>> > <michael.paquier@gmail.com> wrote:
>> > > On Thu, Dec 28, 2017 at 10:46:18PM +0000, Bossart, Nathan wrote:
>> > >> I agree, this makes more sense.  I've made this change in v3 of 0003.
>> > >
>> > > Based on the opinions gathered on this thread, 0001 and 0002 seem to be
>> > > in the shape wanted, and those look good for shipping.
>> > 
>> > Committed 0001.
>> 
>> FWIW, I think in the future it'd be good to develop new features outside
>> of a thread about "vacuum crashes".
>
> You are right atht this is bad practice. Partially my fault for not
> recommending the move to a new thread.

Yes, sorry about that.  I should have opened a new thread.

Nathan


Re: BUG #14941: Vacuum crashes

From
Michael Paquier
Date:
On Wed, Jan 10, 2018 at 05:26:43PM +0000, Bossart, Nathan wrote:
> Right.  I don't have a terribly strong opinion either way.  I think the
> counter-argument is that logging skipped relations might provide
> valuable feedback to users.  If I ran a database-wide VACUUM and a
> relation was skipped due to lock contention, it might be nice to know
> that so I can handle the relation individually.

Or users may not care about getting information they don't care
about. When running a database-wide VACUUM or ANALYZE you don't know
what is exactly the list of relations this is working on. Getting
information about things you don't know beforehand can be considered as
misinformation.

Perhaps others have different opinions. Sawada-san?

> Perhaps any logging changes for VACOPT_NOWAIT could be handled in a
> separate thread.  For now, I've updated 0003 to remove the logging
> changes.

Thanks. I am marking those as ready for committer, you are providing the
set patch patch which offer the most consistent experience.
--
Michael

Attachment

Re: BUG #14941: Vacuum crashes

From
Michael Paquier
Date:
On Wed, Jan 10, 2018 at 05:26:43PM +0000, Bossart, Nathan wrote:
> Right.  I don't have a terribly strong opinion either way.  I think the
> counter-argument is that logging skipped relations might provide
> valuable feedback to users.  If I ran a database-wide VACUUM and a
> relation was skipped due to lock contention, it might be nice to know
> that so I can handle the relation individually.

Or users may not care about getting information they don't care
about. When running a database-wide VACUUM or ANALYZE you don't know
what is exactly the list of relations this is working on. Getting
information about things you don't know beforehand can be considered as
misinformation.

Perhaps others have different opinions. Sawada-san?

> Perhaps any logging changes for VACOPT_NOWAIT could be handled in a
> separate thread.  For now, I've updated 0003 to remove the logging
> changes.

Thanks. I am marking those as ready for committer, you are providing the
set patch patch which offer the most consistent experience.
--
Michael

Attachment

Re: BUG #14941: Vacuum crashes

From
Masahiko Sawada
Date:
On Thu, Jan 11, 2018 at 8:14 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Wed, Jan 10, 2018 at 05:26:43PM +0000, Bossart, Nathan wrote:
>> Right.  I don't have a terribly strong opinion either way.  I think the
>> counter-argument is that logging skipped relations might provide
>> valuable feedback to users.  If I ran a database-wide VACUUM and a
>> relation was skipped due to lock contention, it might be nice to know
>> that so I can handle the relation individually.
>
> Or users may not care about getting information they don't care
> about. When running a database-wide VACUUM or ANALYZE you don't know
> what is exactly the list of relations this is working on. Getting
> information about things you don't know beforehand can be considered as
> misinformation.
>
> Perhaps others have different opinions. Sawada-san?

Hmm, I agree that v4 patch is more simple and consistent with current
behavior. The logging is not necessary if relation has been dropped
but I think that it's necessary if a relation exists but database-wide
VACUUM or ANALYZE could not take the lock on it. I can image a use
case where a user don't rely on autovacuum and do manual vacuums in a
maintenance window (per-week or per-day). In this case the log message
of skipping vacuum would be useful for user. The user might not be
interested in what is exactly the list of relations but might be
interested in the relations that were skipped to vacuum. IIUC since
autovacuum always passes a list of VacuumRelation (length is 1 and
having non-NULL RangeVar) to vacuum(), if autovacuum could not take
the lock on the relation it certainly emits that log. But with v4
patch, that log message is not emitted only if user do database-wide
VACUUM or ANALYZE and could not take the lock. On the other hand, the
downside of that we emits that log even if relations == NULL is we
emits that log even for toast tables and child tables. It would be
annoy user and might be unnecessary information. To deal with it, if
NOWAIT is specified we can fill RangeVar in get_all_vacuum_rels and
make vacuum_rel not emit "relation no longer exists" log. That way we
can emits that log by database-wide VACUUM/ANALYZE, while not emitting
for toast table and child tables. But I'm not sure it's worth to
effort for this direction (v3patch + tricks) going so far as making
such tricks. I'd like to hear opinions.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: BUG #14941: Vacuum crashes

From
Masahiko Sawada
Date:
On Thu, Jan 11, 2018 at 8:14 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Wed, Jan 10, 2018 at 05:26:43PM +0000, Bossart, Nathan wrote:
>> Right.  I don't have a terribly strong opinion either way.  I think the
>> counter-argument is that logging skipped relations might provide
>> valuable feedback to users.  If I ran a database-wide VACUUM and a
>> relation was skipped due to lock contention, it might be nice to know
>> that so I can handle the relation individually.
>
> Or users may not care about getting information they don't care
> about. When running a database-wide VACUUM or ANALYZE you don't know
> what is exactly the list of relations this is working on. Getting
> information about things you don't know beforehand can be considered as
> misinformation.
>
> Perhaps others have different opinions. Sawada-san?

Hmm, I agree that v4 patch is more simple and consistent with current
behavior. The logging is not necessary if relation has been dropped
but I think that it's necessary if a relation exists but database-wide
VACUUM or ANALYZE could not take the lock on it. I can image a use
case where a user don't rely on autovacuum and do manual vacuums in a
maintenance window (per-week or per-day). In this case the log message
of skipping vacuum would be useful for user. The user might not be
interested in what is exactly the list of relations but might be
interested in the relations that were skipped to vacuum. IIUC since
autovacuum always passes a list of VacuumRelation (length is 1 and
having non-NULL RangeVar) to vacuum(), if autovacuum could not take
the lock on the relation it certainly emits that log. But with v4
patch, that log message is not emitted only if user do database-wide
VACUUM or ANALYZE and could not take the lock. On the other hand, the
downside of that we emits that log even if relations == NULL is we
emits that log even for toast tables and child tables. It would be
annoy user and might be unnecessary information. To deal with it, if
NOWAIT is specified we can fill RangeVar in get_all_vacuum_rels and
make vacuum_rel not emit "relation no longer exists" log. That way we
can emits that log by database-wide VACUUM/ANALYZE, while not emitting
for toast table and child tables. But I'm not sure it's worth to
effort for this direction (v3patch + tricks) going so far as making
such tricks. I'd like to hear opinions.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: BUG #14941: Vacuum crashes

From
Andres Freund
Date:
On 2018-01-11 08:14:42 +0900, Michael Paquier wrote:
> On Wed, Jan 10, 2018 at 05:26:43PM +0000, Bossart, Nathan wrote:
> > Perhaps any logging changes for VACOPT_NOWAIT could be handled in a
> > separate thread.  For now, I've updated 0003 to remove the logging
> > changes.
> 
> Thanks. I am marking those as ready for committer, you are providing the
> set patch patch which offer the most consistent experience.

I was working on committing 0002 and 0003, when I noticed that the
second patch doesn't actually fully works.  NOWAIT does what it says on
the tin iff the table is locked with a lower lock level than access
exclusive.  But if AEL is used, the command is blocked in

static List *
expand_vacuum_rel(VacuumRelation *vrel)
...
        /*
         * We transiently take AccessShareLock to protect the syscache lookup
         * below, as well as find_all_inheritors's expectation that the caller
         * holds some lock on the starting relation.
         */
        relid = RangeVarGetRelid(vrel->relation, AccessShareLock, false);

ISTM has been added after the patches initially were proposed. See
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=19de0ab23ccba12567c18640f00b49f01471018d

I'm a bit disappointed that the tests didn't catch this, they're clearly
not fully there. They definitely should test the AEL case, as
demonstrated here.


Independent of that, I'm also concerned that NOWAIT isn't implemented
consistently with other commands. Aren't we erroring out in other uses
of NOWAIT?  ISTM a more appropriate keyword would have been SKIP
LOCKED.  I think the behaviour makes sense, but I'd rename the internal
flag and the grammar to use SKIP LOCKED.

Lightly edited patches attached. Please preserve commit messages while
fixing these issues.

Greetings,

Andres Freund

Attachment

Re: BUG #14941: Vacuum crashes

From
Andres Freund
Date:
On 2018-01-11 08:14:42 +0900, Michael Paquier wrote:
> On Wed, Jan 10, 2018 at 05:26:43PM +0000, Bossart, Nathan wrote:
> > Perhaps any logging changes for VACOPT_NOWAIT could be handled in a
> > separate thread.  For now, I've updated 0003 to remove the logging
> > changes.
> 
> Thanks. I am marking those as ready for committer, you are providing the
> set patch patch which offer the most consistent experience.

I was working on committing 0002 and 0003, when I noticed that the
second patch doesn't actually fully works.  NOWAIT does what it says on
the tin iff the table is locked with a lower lock level than access
exclusive.  But if AEL is used, the command is blocked in

static List *
expand_vacuum_rel(VacuumRelation *vrel)
...
        /*
         * We transiently take AccessShareLock to protect the syscache lookup
         * below, as well as find_all_inheritors's expectation that the caller
         * holds some lock on the starting relation.
         */
        relid = RangeVarGetRelid(vrel->relation, AccessShareLock, false);

ISTM has been added after the patches initially were proposed. See
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=19de0ab23ccba12567c18640f00b49f01471018d

I'm a bit disappointed that the tests didn't catch this, they're clearly
not fully there. They definitely should test the AEL case, as
demonstrated here.


Independent of that, I'm also concerned that NOWAIT isn't implemented
consistently with other commands. Aren't we erroring out in other uses
of NOWAIT?  ISTM a more appropriate keyword would have been SKIP
LOCKED.  I think the behaviour makes sense, but I'd rename the internal
flag and the grammar to use SKIP LOCKED.

Lightly edited patches attached. Please preserve commit messages while
fixing these issues.

Greetings,

Andres Freund

Attachment

Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
Thanks for taking a look.

On 3/3/18, 6:13 PM, "Andres Freund" <andres@anarazel.de> wrote:
> I was working on committing 0002 and 0003, when I noticed that the
> second patch doesn't actually fully works.  NOWAIT does what it says on
> the tin iff the table is locked with a lower lock level than access
> exclusive.  But if AEL is used, the command is blocked in
>
> static List *
> expand_vacuum_rel(VacuumRelation *vrel)
> ...
>         /*
>          * We transiently take AccessShareLock to protect the syscache lookup
>          * below, as well as find_all_inheritors's expectation that the caller
>          * holds some lock on the starting relation.
>          */
>         relid = RangeVarGetRelid(vrel->relation, AccessShareLock, false);
>
> ISTM has been added after the patches initially were proposed. See
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=19de0ab23ccba12567c18640f00b49f01471018d
>
> I'm a bit disappointed that the tests didn't catch this, they're clearly
> not fully there. They definitely should test the AEL case, as
> demonstrated here.

Sorry about that.  I've added logic to handle ACCESS EXCLUSIVE in v6 of 0002,
and I've extended the tests to cover that case.

> Independent of that, I'm also concerned that NOWAIT isn't implemented
> consistently with other commands. Aren't we erroring out in other uses
> of NOWAIT?  ISTM a more appropriate keyword would have been SKIP
> LOCKED.  I think the behaviour makes sense, but I'd rename the internal
> flag and the grammar to use SKIP LOCKED.

Agreed.  I've updated 0002 to use SKIP LOCKED instead of NOWAIT.

Nathan


Attachment

Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
Thanks for taking a look.

On 3/3/18, 6:13 PM, "Andres Freund" <andres@anarazel.de> wrote:
> I was working on committing 0002 and 0003, when I noticed that the
> second patch doesn't actually fully works.  NOWAIT does what it says on
> the tin iff the table is locked with a lower lock level than access
> exclusive.  But if AEL is used, the command is blocked in
>
> static List *
> expand_vacuum_rel(VacuumRelation *vrel)
> ...
>         /*
>          * We transiently take AccessShareLock to protect the syscache lookup
>          * below, as well as find_all_inheritors's expectation that the caller
>          * holds some lock on the starting relation.
>          */
>         relid = RangeVarGetRelid(vrel->relation, AccessShareLock, false);
>
> ISTM has been added after the patches initially were proposed. See
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=19de0ab23ccba12567c18640f00b49f01471018d
>
> I'm a bit disappointed that the tests didn't catch this, they're clearly
> not fully there. They definitely should test the AEL case, as
> demonstrated here.

Sorry about that.  I've added logic to handle ACCESS EXCLUSIVE in v6 of 0002,
and I've extended the tests to cover that case.

> Independent of that, I'm also concerned that NOWAIT isn't implemented
> consistently with other commands. Aren't we erroring out in other uses
> of NOWAIT?  ISTM a more appropriate keyword would have been SKIP
> LOCKED.  I think the behaviour makes sense, but I'd rename the internal
> flag and the grammar to use SKIP LOCKED.

Agreed.  I've updated 0002 to use SKIP LOCKED instead of NOWAIT.

Nathan


Attachment

Re: BUG #14941: Vacuum crashes

From
Andres Freund
Date:
On 2018-03-03 16:12:52 -0800, Andres Freund wrote:
> On 2018-01-11 08:14:42 +0900, Michael Paquier wrote:
> > On Wed, Jan 10, 2018 at 05:26:43PM +0000, Bossart, Nathan wrote:
> > > Perhaps any logging changes for VACOPT_NOWAIT could be handled in a
> > > separate thread.  For now, I've updated 0003 to remove the logging
> > > changes.
> > 
> > Thanks. I am marking those as ready for committer, you are providing the
> > set patch patch which offer the most consistent experience.
> 
> I was working on committing 0002 and 0003, when I noticed that the
> second patch doesn't actually fully works.  NOWAIT does what it says on
> the tin iff the table is locked with a lower lock level than access
> exclusive.  But if AEL is used, the command is blocked in

I've pushed the first one now. There seems to be no reason to wait with
it, even it takes a bit till we can get the second part right.

- Andres


Re: BUG #14941: Vacuum crashes

From
Andres Freund
Date:
On 2018-03-03 16:12:52 -0800, Andres Freund wrote:
> On 2018-01-11 08:14:42 +0900, Michael Paquier wrote:
> > On Wed, Jan 10, 2018 at 05:26:43PM +0000, Bossart, Nathan wrote:
> > > Perhaps any logging changes for VACOPT_NOWAIT could be handled in a
> > > separate thread.  For now, I've updated 0003 to remove the logging
> > > changes.
> > 
> > Thanks. I am marking those as ready for committer, you are providing the
> > set patch patch which offer the most consistent experience.
> 
> I was working on committing 0002 and 0003, when I noticed that the
> second patch doesn't actually fully works.  NOWAIT does what it says on
> the tin iff the table is locked with a lower lock level than access
> exclusive.  But if AEL is used, the command is blocked in

I've pushed the first one now. There seems to be no reason to wait with
it, even it takes a bit till we can get the second part right.

- Andres


Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 3/5/18, 7:58 PM, "Andres Freund" <andres@anarazel.de> wrote:
> I've pushed the first one now. There seems to be no reason to wait with
> it, even it takes a bit till we can get the second part right.

Thanks!

Nathan


Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 3/5/18, 7:58 PM, "Andres Freund" <andres@anarazel.de> wrote:
> I've pushed the first one now. There seems to be no reason to wait with
> it, even it takes a bit till we can get the second part right.

Thanks!

Nathan


Re: BUG #14941: Vacuum crashes

From
Michael Paquier
Date:
On Mon, Mar 05, 2018 at 09:55:13PM +0000, Bossart, Nathan wrote:
> On 3/3/18, 6:13 PM, "Andres Freund" <andres@anarazel.de> wrote:
>> I was working on committing 0002 and 0003, when I noticed that the
>> second patch doesn't actually fully works.  NOWAIT does what it says on
>> the tin iff the table is locked with a lower lock level than access
>> exclusive.  But if AEL is used, the command is blocked in
>>
>> static List *
>> expand_vacuum_rel(VacuumRelation *vrel)
>> ...
>>         /*
>>          * We transiently take AccessShareLock to protect the syscache lookup
>>          * below, as well as find_all_inheritors's expectation that the caller
>>          * holds some lock on the starting relation.
>>          */
>>         relid = RangeVarGetRelid(vrel->relation, AccessShareLock, false);
>>
>> ISTM has been added after the patches initially were proposed. See
>> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=19de0ab23ccba12567c18640f00b49f01471018d
>>
>> I'm a bit disappointed that the tests didn't catch this, they're clearly
>> not fully there. They definitely should test the AEL case, as
>> demonstrated here.
>
> Sorry about that.  I've added logic to handle ACCESS EXCLUSIVE in v6 of 0002,
> and I've extended the tests to cover that case.

Hm, yes.  I have also let those patches rot a bit myself.  Sorry for
that.

>> Independent of that, I'm also concerned that NOWAIT isn't implemented
>> consistently with other commands. Aren't we erroring out in other uses
>> of NOWAIT?  ISTM a more appropriate keyword would have been SKIP
>> LOCKED.  I think the behaviour makes sense, but I'd rename the internal
>> flag and the grammar to use SKIP LOCKED.
>
> Agreed.  I've updated 0002 to use SKIP LOCKED instead of NOWAIT.

So, NOWAIT means "please try to take a lock, don't wait for it and issue
an error immediately if the lock cannot be taken".  SKIP_LOCKED means
"please try to take a lock, don't wait for it, but do not issue an error
if the lock cannot be taken and move on to next steps".  I agree that
this is more consistent.

+       if (!(options & VACOPT_SKIP_LOCKED))
+           relid = RangeVarGetRelid(vrel->relation, AccessShareLock,
false);
+       else
+       {
+           relid = RangeVarGetRelid(vrel->relation, NoLock, false);
Yeah, I agree with Andres that getting all this logic done in
RangeVarGetRelidExtended would be cleaner.  Let's see where the other
thread leads us to:
https://www.postgresql.org/message-id/20180306005349.b65whmvj7z6hbe2y%40alap3.anarazel.de

+       /*
+        * We must downcase the statement type for log message
consistency
+        * between expand_vacuum_rel(), vacuum_rel(), and analyze_rel().
+        */
+       stmttype_lower = asc_tolower(stmttype, strlen(stmttype));
This blows up on multi-byte characters and may report incorrect relation
name if double quotes are used, no?

+       /*
+        * Since autovacuum workers supply OIDs when calling vacuum(), no
+        * autovacuum worker should reach this code, and we can make
+        * assumptions about the logging levels we should use in the
checks
+        * below.
+        */
+       Assert(!IsAutoVacuumWorkerProcess());
This is a good idea, should be a separate patch as other folks tend to
be confused with relid handling in expand_vacuum_rel().

+      Specifies that <command>VACUUM</command> should not wait for any
+      conflicting locks to be released: if a relation cannot be locked
+      immediately without waiting, the relation is skipped
Should this mention as well that no errors are raised, allowing the
process to move on with the next relations?
--
Michael

Attachment

Re: BUG #14941: Vacuum crashes

From
Michael Paquier
Date:
On Mon, Mar 05, 2018 at 09:55:13PM +0000, Bossart, Nathan wrote:
> On 3/3/18, 6:13 PM, "Andres Freund" <andres@anarazel.de> wrote:
>> I was working on committing 0002 and 0003, when I noticed that the
>> second patch doesn't actually fully works.  NOWAIT does what it says on
>> the tin iff the table is locked with a lower lock level than access
>> exclusive.  But if AEL is used, the command is blocked in
>>
>> static List *
>> expand_vacuum_rel(VacuumRelation *vrel)
>> ...
>>         /*
>>          * We transiently take AccessShareLock to protect the syscache lookup
>>          * below, as well as find_all_inheritors's expectation that the caller
>>          * holds some lock on the starting relation.
>>          */
>>         relid = RangeVarGetRelid(vrel->relation, AccessShareLock, false);
>>
>> ISTM has been added after the patches initially were proposed. See
>> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=19de0ab23ccba12567c18640f00b49f01471018d
>>
>> I'm a bit disappointed that the tests didn't catch this, they're clearly
>> not fully there. They definitely should test the AEL case, as
>> demonstrated here.
>
> Sorry about that.  I've added logic to handle ACCESS EXCLUSIVE in v6 of 0002,
> and I've extended the tests to cover that case.

Hm, yes.  I have also let those patches rot a bit myself.  Sorry for
that.

>> Independent of that, I'm also concerned that NOWAIT isn't implemented
>> consistently with other commands. Aren't we erroring out in other uses
>> of NOWAIT?  ISTM a more appropriate keyword would have been SKIP
>> LOCKED.  I think the behaviour makes sense, but I'd rename the internal
>> flag and the grammar to use SKIP LOCKED.
>
> Agreed.  I've updated 0002 to use SKIP LOCKED instead of NOWAIT.

So, NOWAIT means "please try to take a lock, don't wait for it and issue
an error immediately if the lock cannot be taken".  SKIP_LOCKED means
"please try to take a lock, don't wait for it, but do not issue an error
if the lock cannot be taken and move on to next steps".  I agree that
this is more consistent.

+       if (!(options & VACOPT_SKIP_LOCKED))
+           relid = RangeVarGetRelid(vrel->relation, AccessShareLock,
false);
+       else
+       {
+           relid = RangeVarGetRelid(vrel->relation, NoLock, false);
Yeah, I agree with Andres that getting all this logic done in
RangeVarGetRelidExtended would be cleaner.  Let's see where the other
thread leads us to:
https://www.postgresql.org/message-id/20180306005349.b65whmvj7z6hbe2y%40alap3.anarazel.de

+       /*
+        * We must downcase the statement type for log message
consistency
+        * between expand_vacuum_rel(), vacuum_rel(), and analyze_rel().
+        */
+       stmttype_lower = asc_tolower(stmttype, strlen(stmttype));
This blows up on multi-byte characters and may report incorrect relation
name if double quotes are used, no?

+       /*
+        * Since autovacuum workers supply OIDs when calling vacuum(), no
+        * autovacuum worker should reach this code, and we can make
+        * assumptions about the logging levels we should use in the
checks
+        * below.
+        */
+       Assert(!IsAutoVacuumWorkerProcess());
This is a good idea, should be a separate patch as other folks tend to
be confused with relid handling in expand_vacuum_rel().

+      Specifies that <command>VACUUM</command> should not wait for any
+      conflicting locks to be released: if a relation cannot be locked
+      immediately without waiting, the relation is skipped
Should this mention as well that no errors are raised, allowing the
process to move on with the next relations?
--
Michael

Attachment

Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 3/6/18, 11:04 PM, "Michael Paquier" <michael@paquier.xyz> wrote:
> +       if (!(options & VACOPT_SKIP_LOCKED))
> +           relid = RangeVarGetRelid(vrel->relation, AccessShareLock,
> false);
> +       else
> +       {
> +           relid = RangeVarGetRelid(vrel->relation, NoLock, false);
> Yeah, I agree with Andres that getting all this logic done in
> RangeVarGetRelidExtended would be cleaner.  Let's see where the other
> thread leads us to:
> https://www.postgresql.org/message-id/20180306005349.b65whmvj7z6hbe2y%40alap3.anarazel.de

I had missed that thread.  Thanks for pointing it out.

> +       /*
> +        * We must downcase the statement type for log message
> consistency
> +        * between expand_vacuum_rel(), vacuum_rel(), and analyze_rel().
> +        */
> +       stmttype_lower = asc_tolower(stmttype, strlen(stmttype));
> This blows up on multi-byte characters and may report incorrect relation
> name if double quotes are used, no?

At the moment, stmttype is either "VACUUM" or "ANALYZE", but I suppose
there still might be multi-byte risk in translations.

> +       /*
> +        * Since autovacuum workers supply OIDs when calling vacuum(), no
> +        * autovacuum worker should reach this code, and we can make
> +        * assumptions about the logging levels we should use in the
> checks
> +        * below.
> +        */
> +       Assert(!IsAutoVacuumWorkerProcess());
> This is a good idea, should be a separate patch as other folks tend to
> be confused with relid handling in expand_vacuum_rel().

Will do.

> +      Specifies that <command>VACUUM</command> should not wait for any
> +      conflicting locks to be released: if a relation cannot be locked
> +      immediately without waiting, the relation is skipped
> Should this mention as well that no errors are raised, allowing the
> process to move on with the next relations?

IMO that is already covered by saying the relation is skipped,
although I'm not against explicitly stating it, too.  Perhaps we could
outline the logging behavior as well.

Nathan


Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 3/6/18, 11:04 PM, "Michael Paquier" <michael@paquier.xyz> wrote:
> +       if (!(options & VACOPT_SKIP_LOCKED))
> +           relid = RangeVarGetRelid(vrel->relation, AccessShareLock,
> false);
> +       else
> +       {
> +           relid = RangeVarGetRelid(vrel->relation, NoLock, false);
> Yeah, I agree with Andres that getting all this logic done in
> RangeVarGetRelidExtended would be cleaner.  Let's see where the other
> thread leads us to:
> https://www.postgresql.org/message-id/20180306005349.b65whmvj7z6hbe2y%40alap3.anarazel.de

I had missed that thread.  Thanks for pointing it out.

> +       /*
> +        * We must downcase the statement type for log message
> consistency
> +        * between expand_vacuum_rel(), vacuum_rel(), and analyze_rel().
> +        */
> +       stmttype_lower = asc_tolower(stmttype, strlen(stmttype));
> This blows up on multi-byte characters and may report incorrect relation
> name if double quotes are used, no?

At the moment, stmttype is either "VACUUM" or "ANALYZE", but I suppose
there still might be multi-byte risk in translations.

> +       /*
> +        * Since autovacuum workers supply OIDs when calling vacuum(), no
> +        * autovacuum worker should reach this code, and we can make
> +        * assumptions about the logging levels we should use in the
> checks
> +        * below.
> +        */
> +       Assert(!IsAutoVacuumWorkerProcess());
> This is a good idea, should be a separate patch as other folks tend to
> be confused with relid handling in expand_vacuum_rel().

Will do.

> +      Specifies that <command>VACUUM</command> should not wait for any
> +      conflicting locks to be released: if a relation cannot be locked
> +      immediately without waiting, the relation is skipped
> Should this mention as well that no errors are raised, allowing the
> process to move on with the next relations?

IMO that is already covered by saying the relation is skipped,
although I'm not against explicitly stating it, too.  Perhaps we could
outline the logging behavior as well.

Nathan


Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 3/7/18, 9:34 AM, "Bossart, Nathan" <bossartn@amazon.com> wrote:
> On 3/6/18, 11:04 PM, "Michael Paquier" <michael@paquier.xyz> wrote:
>> +       if (!(options & VACOPT_SKIP_LOCKED))
>> +           relid = RangeVarGetRelid(vrel->relation, AccessShareLock,
>> false);
>> +       else
>> +       {
>> +           relid = RangeVarGetRelid(vrel->relation, NoLock, false);
>> Yeah, I agree with Andres that getting all this logic done in
>> RangeVarGetRelidExtended would be cleaner.  Let's see where the other
>> thread leads us to:
>> https://www.postgresql.org/message-id/20180306005349.b65whmvj7z6hbe2y%40alap3.anarazel.de
>
> I had missed that thread.  Thanks for pointing it out.

I've noticed one more problem with ACCESS EXCLUSIVE.  If an ACCESS
EXCLUSIVE lock is held on a child relation of a partitioned table,
an ANALYZE on the partitioned table will be blocked at
acquire_inherited_sample_rows().

static int
acquire_inherited_sample_rows(Relation onerel, int elevel,
                              HeapTuple *rows, int targrows,
                              double *totalrows, double *totaldeadrows)
{
...
    /*
     * Find all members of inheritance set.  We only need AccessShareLock on
     * the children.
     */
    tableOIDs =
        find_all_inheritors(RelationGetRelid(onerel), AccessShareLock, NULL);

It also seems possible for the call to vac_open_indexes() in
do_analyze_rel() to block.

    /*
     * Open all indexes of the relation, and see if there are any analyzable
     * columns in the indexes.  We do not analyze index columns if there was
     * an explicit column list in the ANALYZE command, however.  If we are
     * doing a recursive scan, we don't want to touch the parent's indexes at
     * all.
     */
    if (!inh)
        vac_open_indexes(onerel, AccessShareLock, &nindexes, &Irel);
    else
    {
        Irel = NULL;
        nindexes = 0;
    }

I think the most straightforward approach for fixing this is to add
skip-locked functionality in find_all_inheritors(),
find_inheritance_children(), and vac_open_indexes(), but I am curious
what others think.

Nathan


Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 3/7/18, 9:34 AM, "Bossart, Nathan" <bossartn@amazon.com> wrote:
> On 3/6/18, 11:04 PM, "Michael Paquier" <michael@paquier.xyz> wrote:
>> +       if (!(options & VACOPT_SKIP_LOCKED))
>> +           relid = RangeVarGetRelid(vrel->relation, AccessShareLock,
>> false);
>> +       else
>> +       {
>> +           relid = RangeVarGetRelid(vrel->relation, NoLock, false);
>> Yeah, I agree with Andres that getting all this logic done in
>> RangeVarGetRelidExtended would be cleaner.  Let's see where the other
>> thread leads us to:
>> https://www.postgresql.org/message-id/20180306005349.b65whmvj7z6hbe2y%40alap3.anarazel.de
>
> I had missed that thread.  Thanks for pointing it out.

I've noticed one more problem with ACCESS EXCLUSIVE.  If an ACCESS
EXCLUSIVE lock is held on a child relation of a partitioned table,
an ANALYZE on the partitioned table will be blocked at
acquire_inherited_sample_rows().

static int
acquire_inherited_sample_rows(Relation onerel, int elevel,
                              HeapTuple *rows, int targrows,
                              double *totalrows, double *totaldeadrows)
{
...
    /*
     * Find all members of inheritance set.  We only need AccessShareLock on
     * the children.
     */
    tableOIDs =
        find_all_inheritors(RelationGetRelid(onerel), AccessShareLock, NULL);

It also seems possible for the call to vac_open_indexes() in
do_analyze_rel() to block.

    /*
     * Open all indexes of the relation, and see if there are any analyzable
     * columns in the indexes.  We do not analyze index columns if there was
     * an explicit column list in the ANALYZE command, however.  If we are
     * doing a recursive scan, we don't want to touch the parent's indexes at
     * all.
     */
    if (!inh)
        vac_open_indexes(onerel, AccessShareLock, &nindexes, &Irel);
    else
    {
        Irel = NULL;
        nindexes = 0;
    }

I think the most straightforward approach for fixing this is to add
skip-locked functionality in find_all_inheritors(),
find_inheritance_children(), and vac_open_indexes(), but I am curious
what others think.

Nathan


Re: BUG #14941: Vacuum crashes

From
Andres Freund
Date:
Hi,

On 2018-03-30 18:39:01 +0000, Bossart, Nathan wrote:
> I've noticed one more problem with ACCESS EXCLUSIVE.  If an ACCESS
> EXCLUSIVE lock is held on a child relation of a partitioned table,
> an ANALYZE on the partitioned table will be blocked at
> acquire_inherited_sample_rows().
> 
> static int
> acquire_inherited_sample_rows(Relation onerel, int elevel,
>                               HeapTuple *rows, int targrows,
>                               double *totalrows, double *totaldeadrows)
> {
> ...
>     /*
>      * Find all members of inheritance set.  We only need AccessShareLock on
>      * the children.
>      */
>     tableOIDs =
>         find_all_inheritors(RelationGetRelid(onerel), AccessShareLock, NULL);

Right.


> It also seems possible for the call to vac_open_indexes() in
> do_analyze_rel() to block.

Yup.


> I think the most straightforward approach for fixing this is to add
> skip-locked functionality in find_all_inheritors(),
> find_inheritance_children(), and vac_open_indexes(), but I am curious
> what others think.

I'm actually wondering if we shouldn't just ignore this problem. While
the other cases VACUUM (SKIP LOCKED) are intended to solve seem common,
these seem less so. But it'd be a bit weird, too..

Could you post a rebased version of the patch, with an incremental
addition of what you propose in a separate patch?

Greetings,

Andres Freund


Re: BUG #14941: Vacuum crashes

From
Andres Freund
Date:
Hi,

On 2018-03-30 18:39:01 +0000, Bossart, Nathan wrote:
> I've noticed one more problem with ACCESS EXCLUSIVE.  If an ACCESS
> EXCLUSIVE lock is held on a child relation of a partitioned table,
> an ANALYZE on the partitioned table will be blocked at
> acquire_inherited_sample_rows().
> 
> static int
> acquire_inherited_sample_rows(Relation onerel, int elevel,
>                               HeapTuple *rows, int targrows,
>                               double *totalrows, double *totaldeadrows)
> {
> ...
>     /*
>      * Find all members of inheritance set.  We only need AccessShareLock on
>      * the children.
>      */
>     tableOIDs =
>         find_all_inheritors(RelationGetRelid(onerel), AccessShareLock, NULL);

Right.


> It also seems possible for the call to vac_open_indexes() in
> do_analyze_rel() to block.

Yup.


> I think the most straightforward approach for fixing this is to add
> skip-locked functionality in find_all_inheritors(),
> find_inheritance_children(), and vac_open_indexes(), but I am curious
> what others think.

I'm actually wondering if we shouldn't just ignore this problem. While
the other cases VACUUM (SKIP LOCKED) are intended to solve seem common,
these seem less so. But it'd be a bit weird, too..

Could you post a rebased version of the patch, with an incremental
addition of what you propose in a separate patch?

Greetings,

Andres Freund


Re: BUG #14941: Vacuum crashes

From
Michael Paquier
Date:
On Fri, Mar 30, 2018 at 05:19:46PM -0700, Andres Freund wrote:
> On 2018-03-30 18:39:01 +0000, Bossart, Nathan wrote:
>> I think the most straightforward approach for fixing this is to add
>> skip-locked functionality in find_all_inheritors(),
>> find_inheritance_children(), and vac_open_indexes(), but I am curious
>> what others think.

For vac_open_indexes() it may make the most sense to introduce a new
try_open_index which wraps on top of try_relation_open with checks on
the opened relation's relkind.

> I'm actually wondering if we shouldn't just ignore this problem. While
> the other cases VACUUM (SKIP LOCKED) are intended to solve seem common,
> these seem less so. But it'd be a bit weird, too..

Wouldn't that cause deadlocks if one session has VACUUM (SKIP_LOCKED)
and another session running vacuum or autovacuum if simply ignored?
That looks dangerous to me than simply skipping a lock that cannot be
taken.
--
Michael

Attachment

Re: BUG #14941: Vacuum crashes

From
Michael Paquier
Date:
On Fri, Mar 30, 2018 at 05:19:46PM -0700, Andres Freund wrote:
> On 2018-03-30 18:39:01 +0000, Bossart, Nathan wrote:
>> I think the most straightforward approach for fixing this is to add
>> skip-locked functionality in find_all_inheritors(),
>> find_inheritance_children(), and vac_open_indexes(), but I am curious
>> what others think.

For vac_open_indexes() it may make the most sense to introduce a new
try_open_index which wraps on top of try_relation_open with checks on
the opened relation's relkind.

> I'm actually wondering if we shouldn't just ignore this problem. While
> the other cases VACUUM (SKIP LOCKED) are intended to solve seem common,
> these seem less so. But it'd be a bit weird, too..

Wouldn't that cause deadlocks if one session has VACUUM (SKIP_LOCKED)
and another session running vacuum or autovacuum if simply ignored?
That looks dangerous to me than simply skipping a lock that cannot be
taken.
--
Michael

Attachment

Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 3/30/18, 7:19 PM, "Andres Freund" <andres@anarazel.de> wrote:
> Could you post a rebased version of the patch, with an incremental
> addition of what you propose in a separate patch?

Here is a new patch set.  I have attempted to add skip-locked
functionality everywhere it is needed, including VACUUM FULL.

On 4/3/18, 12:12 AM, "Michael Paquier" <michael@paquier.xyz> wrote:
> On Fri, Mar 30, 2018 at 05:19:46PM -0700, Andres Freund wrote:
>> On 2018-03-30 18:39:01 +0000, Bossart, Nathan wrote:
>>> I think the most straightforward approach for fixing this is to add
>>> skip-locked functionality in find_all_inheritors(),
>>> find_inheritance_children(), and vac_open_indexes(), but I am curious
>>> what others think.
>
> For vac_open_indexes() it may make the most sense to introduce a new
> try_open_index which wraps on top of try_relation_open with checks on
> the opened relation's relkind. 

I looked into this, but I noticed that the analogous
try_relation_open() implements missing-ok for relation_open().  So, I
thought it might be confusing to have try_index_open() be a skip-
locked version of index_open().

The new tests are now passing as expected, but I am still doing some
manual testing.  Since there is quite a bit of added complexity in
this new patch set, I will certainly not be surprised if this gets
moved to the next commitfest.

Nathan


Attachment

Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 3/30/18, 7:19 PM, "Andres Freund" <andres@anarazel.de> wrote:
> Could you post a rebased version of the patch, with an incremental
> addition of what you propose in a separate patch?

Here is a new patch set.  I have attempted to add skip-locked
functionality everywhere it is needed, including VACUUM FULL.

On 4/3/18, 12:12 AM, "Michael Paquier" <michael@paquier.xyz> wrote:
> On Fri, Mar 30, 2018 at 05:19:46PM -0700, Andres Freund wrote:
>> On 2018-03-30 18:39:01 +0000, Bossart, Nathan wrote:
>>> I think the most straightforward approach for fixing this is to add
>>> skip-locked functionality in find_all_inheritors(),
>>> find_inheritance_children(), and vac_open_indexes(), but I am curious
>>> what others think.
>
> For vac_open_indexes() it may make the most sense to introduce a new
> try_open_index which wraps on top of try_relation_open with checks on
> the opened relation's relkind. 

I looked into this, but I noticed that the analogous
try_relation_open() implements missing-ok for relation_open().  So, I
thought it might be confusing to have try_index_open() be a skip-
locked version of index_open().

The new tests are now passing as expected, but I am still doing some
manual testing.  Since there is quite a bit of added complexity in
this new patch set, I will certainly not be surprised if this gets
moved to the next commitfest.

Nathan


Attachment

Re: BUG #14941: Vacuum crashes

From
Michael Paquier
Date:
On Thu, Apr 05, 2018 at 08:29:38PM +0000, Bossart, Nathan wrote:
> The new tests are now passing as expected, but I am still doing some
> manual testing.  Since there is quite a bit of added complexity in
> this new patch set, I will certainly not be surprised if this gets
> moved to the next commitfest.

That would be wiser.  We are two days away from the end of the CF and
this patch gets quite invasive with a set of new concepts, so my
recommendation would be to do so.  There has been already much done with
the introduction of the new grammar for ANALYZE, and the introduction of
RangeVarGetRelidExtended().
--
Michael

Attachment

Re: BUG #14941: Vacuum crashes

From
Michael Paquier
Date:
On Thu, Apr 05, 2018 at 08:29:38PM +0000, Bossart, Nathan wrote:
> The new tests are now passing as expected, but I am still doing some
> manual testing.  Since there is quite a bit of added complexity in
> this new patch set, I will certainly not be surprised if this gets
> moved to the next commitfest.

That would be wiser.  We are two days away from the end of the CF and
this patch gets quite invasive with a set of new concepts, so my
recommendation would be to do so.  There has been already much done with
the introduction of the new grammar for ANALYZE, and the introduction of
RangeVarGetRelidExtended().
--
Michael

Attachment

Re: BUG #14941: Vacuum crashes

From
Michael Paquier
Date:
On Fri, Apr 06, 2018 at 06:41:14AM +0900, Michael Paquier wrote:
> That would be wiser.  We are two days away from the end of the CF and
> this patch gets quite invasive with a set of new concepts, so my
> recommendation would be to do so.  There has been already much done with
> the introduction of the new grammar for ANALYZE, and the introduction of
> RangeVarGetRelidExtended().

By the way, let's do that on a new thread next time :)
--
Michael

Attachment

Re: BUG #14941: Vacuum crashes

From
Michael Paquier
Date:
On Fri, Apr 06, 2018 at 06:41:14AM +0900, Michael Paquier wrote:
> That would be wiser.  We are two days away from the end of the CF and
> this patch gets quite invasive with a set of new concepts, so my
> recommendation would be to do so.  There has been already much done with
> the introduction of the new grammar for ANALYZE, and the introduction of
> RangeVarGetRelidExtended().

By the way, let's do that on a new thread next time :)
--
Michael

Attachment

Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 4/5/18, 9:48 PM, "Michael Paquier" <michael@paquier.xyz> wrote:
> On Fri, Apr 06, 2018 at 06:41:14AM +0900, Michael Paquier wrote:
>> That would be wiser.  We are two days away from the end of the CF and
>> this patch gets quite invasive with a set of new concepts, so my
>> recommendation would be to do so.  There has been already much done with
>> the introduction of the new grammar for ANALYZE, and the introduction of
>> RangeVarGetRelidExtended().
>
> By the way, let's do that on a new thread next time :)

Sounds good.  I've bumped it to the next commitfest.

Nathan


Re: BUG #14941: Vacuum crashes

From
"Bossart, Nathan"
Date:
On 4/5/18, 9:48 PM, "Michael Paquier" <michael@paquier.xyz> wrote:
> On Fri, Apr 06, 2018 at 06:41:14AM +0900, Michael Paquier wrote:
>> That would be wiser.  We are two days away from the end of the CF and
>> this patch gets quite invasive with a set of new concepts, so my
>> recommendation would be to do so.  There has been already much done with
>> the introduction of the new grammar for ANALYZE, and the introduction of
>> RangeVarGetRelidExtended().
>
> By the way, let's do that on a new thread next time :)

Sounds good.  I've bumped it to the next commitfest.

Nathan