Thread: BUG #18310: Some SQL commands fail to process duplicate objects with error: tuple already updated by self
BUG #18310: Some SQL commands fail to process duplicate objects with error: tuple already updated by self
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 18310 Logged by: Alexander Lakhin Email address: exclusion@gmail.com PostgreSQL version: 16.1 Operating system: Ubuntu 22.04 Description: [ this bug reported to fix defects discovered while working on bug #18297 ] The following query: CREATE ROLE u; DROP ROLE u, u; fails with ERROR: tuple already updated by self Whilst all the other DROP commands, accepting a list of objects (namely, AGGREGATE, DOMAIN, EXTENSION, FOREIGN DATA WRAPPER, FOREIGN TABLE, FUNCTION, INDEX, MATERIALIZED VIEW, OPERATOR, PROCEDURE, ROUTINE, SEQUENCE, SERVER, STATISTICS, TABLE, TYPE, VIEW), handle such duplicates with no error. Also, the following ALTER query: CREATE TEXT SEARCH CONFIGURATION ispell_tst (COPY=english); CREATE TEXT SEARCH DICTIONARY ispell (Template=ispell, DictFile=ispell_sample, AffFile=ispell_sample); ALTER TEXT SEARCH CONFIGURATION ispell_tst ALTER MAPPING FOR word, word WITH ispell; fails with the same ERROR: tuple already updated by self Again, I re-checked all the other ALTER commands, that accept a list of objects, and could not find other ones producing the same error.
Re: BUG #18310: Some SQL commands fail to process duplicate objects with error: tuple already updated by self
From
Michael Paquier
Date:
On Thu, Jan 25, 2024 at 04:00:01AM +0000, PG Bug reporting form wrote: > The following query: > CREATE ROLE u; > DROP ROLE u, u; > fails with > ERROR: tuple already updated by self Pas glop. > Whilst all the other DROP commands, accepting a list of objects (namely, > AGGREGATE, DOMAIN, EXTENSION, FOREIGN DATA WRAPPER, FOREIGN TABLE, > FUNCTION, INDEX, MATERIALIZED VIEW, OPERATOR, PROCEDURE, ROUTINE, > SEQUENCE, SERVER, STATISTICS, TABLE, TYPE, VIEW), handle such duplicates > with no error. DROP ROLE has it own parsing node and its own dropping logic because of ACL checks and such, while the others used the unified route with dependency.c & friends. For this job, I think user.c is just overengineered and could be simplified to use a list of OIDs. There is no need for ObjectAddress or even ObjectAddresses: all of them are created with AuthIdRelationId as class ID, and the second loop just fetches the role IDs retrieved from the ObjectAddresses created in the first loop. We don't use the individual ObjectAddress either. On top of all that, it makes the code a bit easier to follow. > Also, the following ALTER query: > CREATE TEXT SEARCH CONFIGURATION ispell_tst (COPY=english); > CREATE TEXT SEARCH DICTIONARY ispell (Template=ispell, > DictFile=ispell_sample, AffFile=ispell_sample); > ALTER TEXT SEARCH CONFIGURATION ispell_tst ALTER MAPPING FOR word, word > WITH ispell; > fails with the same > ERROR: tuple already updated by self > > Again, I re-checked all the other ALTER commands, that accept a list of > objects, and could not find other ones producing the same error. This one comes down to the list of manipulation of the list for token names, where tsearchcmds.c has a few assumptions about the way to do its job across multiple catalog scans. We could adjust the code so as the list of String nodes is rebuilt with some list_append_unique(), but I cannot convince myself that this is worth the complication. -- Michael
Attachment
Re: BUG #18310: Some SQL commands fail to process duplicate objects with error: tuple already updated by self
From
Tender Wang
Date:
Michael Paquier <michael@paquier.xyz> 于2024年1月26日周五 10:51写道:
On Thu, Jan 25, 2024 at 04:00:01AM +0000, PG Bug reporting form wrote:
> The following query:
> CREATE ROLE u;
> DROP ROLE u, u;
> fails with
> ERROR: tuple already updated by self
Pas glop.
> Whilst all the other DROP commands, accepting a list of objects (namely,
> AGGREGATE, DOMAIN, EXTENSION, FOREIGN DATA WRAPPER, FOREIGN TABLE,
> FUNCTION, INDEX, MATERIALIZED VIEW, OPERATOR, PROCEDURE, ROUTINE,
> SEQUENCE, SERVER, STATISTICS, TABLE, TYPE, VIEW), handle such duplicates
> with no error.
DROP ROLE has it own parsing node and its own dropping logic because
of ACL checks and such, while the others used the unified route with
dependency.c & friends.
For this job, I think user.c is just overengineered and could be
simplified to use a list of OIDs. There is no need for ObjectAddress
or even ObjectAddresses: all of them are created with
AuthIdRelationId as class ID, and the second loop just fetches the
role IDs retrieved from the ObjectAddresses created in the first loop.
We don't use the individual ObjectAddress either. On top of all that,
it makes the code a bit easier to follow.
Agreed. +1
> Also, the following ALTER query:
> CREATE TEXT SEARCH CONFIGURATION ispell_tst (COPY=english);
> CREATE TEXT SEARCH DICTIONARY ispell (Template=ispell,
> DictFile=ispell_sample, AffFile=ispell_sample);
> ALTER TEXT SEARCH CONFIGURATION ispell_tst ALTER MAPPING FOR word, word
> WITH ispell;
> fails with the same
> ERROR: tuple already updated by self
>
> Again, I re-checked all the other ALTER commands, that accept a list of
> objects, and could not find other ones producing the same error.
This one comes down to the list of manipulation of the list for token
names, where tsearchcmds.c has a few assumptions about the way to do
its job across multiple catalog scans. We could adjust the code so as
the list of String nodes is rebuilt with some list_append_unique(),
but I cannot convince myself that this is worth the complication.
--
Michael
Tender Wang
OpenPie: https://en.openpie.com/Re: BUG #18310: Some SQL commands fail to process duplicate objects with error: tuple already updated by self
From
Tender Wang
Date:
Michael Paquier <michael@paquier.xyz> 于2024年1月26日周五 10:51写道:
On Thu, Jan 25, 2024 at 04:00:01AM +0000, PG Bug reporting form wrote:
> The following query:
> CREATE ROLE u;
> DROP ROLE u, u;
> fails with
> ERROR: tuple already updated by self
Pas glop.
> Whilst all the other DROP commands, accepting a list of objects (namely,
> AGGREGATE, DOMAIN, EXTENSION, FOREIGN DATA WRAPPER, FOREIGN TABLE,
> FUNCTION, INDEX, MATERIALIZED VIEW, OPERATOR, PROCEDURE, ROUTINE,
> SEQUENCE, SERVER, STATISTICS, TABLE, TYPE, VIEW), handle such duplicates
> with no error.
DROP ROLE has it own parsing node and its own dropping logic because
of ACL checks and such, while the others used the unified route with
dependency.c & friends.
For this job, I think user.c is just overengineered and could be
simplified to use a list of OIDs. There is no need for ObjectAddress
or even ObjectAddresses: all of them are created with
AuthIdRelationId as class ID, and the second loop just fetches the
role IDs retrieved from the ObjectAddresses created in the first loop.
We don't use the individual ObjectAddress either. On top of all that,
it makes the code a bit easier to follow.
> Also, the following ALTER query:
> CREATE TEXT SEARCH CONFIGURATION ispell_tst (COPY=english);
> CREATE TEXT SEARCH DICTIONARY ispell (Template=ispell,
> DictFile=ispell_sample, AffFile=ispell_sample);
> ALTER TEXT SEARCH CONFIGURATION ispell_tst ALTER MAPPING FOR word, word
> WITH ispell;
> fails with the same
> ERROR: tuple already updated by self
>
> Again, I re-checked all the other ALTER commands, that accept a list of
> objects, and could not find other ones producing the same error.
This one comes down to the list of manipulation of the list for token
names, where tsearchcmds.c has a few assumptions about the way to do
its job across multiple catalog scans. We could adjust the code so as
the list of String nodes is rebuilt with some list_append_unique(),
but I cannot convince myself that this is worth the complication.
I have an idea.
How about using list *res replace int *res in getTokenTypes(), so we can use list_append_unique().
In outer function, use list *tokens to replace int *tokens, and ntokens = list_length(tokens) not list_length(stmt->tokentype).
--
Michael
Tender Wang
OpenPie: https://en.openpie.com/Re: BUG #18310: Some SQL commands fail to process duplicate objects with error: tuple already updated by self
From
Michael Paquier
Date:
On Fri, Jan 26, 2024 at 11:33:24AM +0800, Tender Wang wrote: > How about using list *res replace int *res in getTokenTypes(), so we can > use list_append_unique(). > In outer function, use list *tokens to replace int *tokens, and ntokens = > list_length(tokens) not list_length(stmt->tokentype). Yeah, I was wondering about this code path. If you feel strongly about that, would you like to write a patch? -- Michael
Attachment
Re: BUG #18310: Some SQL commands fail to process duplicate objects with error: tuple already updated by self
From
Tender Wang
Date:
Michael Paquier <michael@paquier.xyz> 于2024年1月26日周五 11:41写道:
On Fri, Jan 26, 2024 at 11:33:24AM +0800, Tender Wang wrote:
> How about using list *res replace int *res in getTokenTypes(), so we can
> use list_append_unique().
> In outer function, use list *tokens to replace int *tokens, and ntokens =
> list_length(tokens) not list_length(stmt->tokentype).
Yeah, I was wondering about this code path. If you feel strongly
about that, would you like to write a patch?
Ok, I try to fix it.
--
Michael
Tender Wang
OpenPie: https://en.openpie.com/Re: BUG #18310: Some SQL commands fail to process duplicate objects with error: tuple already updated by self
From
Tender Wang
Date:
Tender Wang <tndrwang@gmail.com> 于2024年1月26日周五 12:54写道:
Michael Paquier <michael@paquier.xyz> 于2024年1月26日周五 11:41写道:On Fri, Jan 26, 2024 at 11:33:24AM +0800, Tender Wang wrote:
> How about using list *res replace int *res in getTokenTypes(), so we can
> use list_append_unique().
> In outer function, use list *tokens to replace int *tokens, and ntokens =
> list_length(tokens) not list_length(stmt->tokentype).
Yeah, I was wondering about this code path. If you feel strongly
about that, would you like to write a patch?Ok, I try to fix it.
As I said before, return List looks like not complicated to solve this issue.
I found another problem, it didn't report NOTICE if SQL has IF EXISTS, for example:
postgres=# alter text search configuration ispell_tst drop mapping if exists for test;
ERROR: token type "test" does not exist
ERROR: token type "test" does not exist
So I change the func getTokenTypes() interface and remove DropConfigurationMapping() error report into getTokenTypes().
Tender Wang
OpenPie: https://en.openpie.com/Attachment
Re: BUG #18310: Some SQL commands fail to process duplicate objects with error: tuple already updated by self
From
Yongtao Huang
Date:
Hi Tender Wang,
I think your code changes LGTM. Thanks for your contribution.
But I meet these errors when I git apply your patch, so I do some tiny formate modification.
``` bash
$ git apply 0001-Fix-reporting-error-when-process-duplicate-token-typ.patch
0001-Fix-reporting-error-when-process-duplicate-token-typ.patch:38: indent with spaces.
List *tokennames = stmt->tokentype;
0001-Fix-reporting-error-when-process-duplicate-token-typ.patch:73: indent with spaces.
{
0001-Fix-reporting-error-when-process-duplicate-token-typ.patch:74: indent with spaces.
if (!stmt->missing_ok)
0001-Fix-reporting-error-when-process-duplicate-token-typ.patch:75: indent with spaces.
ereport(ERROR,
0001-Fix-reporting-error-when-process-duplicate-token-typ.patch:76: indent with spaces.
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
warning: squelched 9 whitespace errors
warning: 14 lines add whitespace errors.
```
I think your code changes LGTM. Thanks for your contribution.
But I meet these errors when I git apply your patch, so I do some tiny formate modification.
Please refer to the patch which is still authored by you.
``` bash
$ git apply 0001-Fix-reporting-error-when-process-duplicate-token-typ.patch
0001-Fix-reporting-error-when-process-duplicate-token-typ.patch:38: indent with spaces.
List *tokennames = stmt->tokentype;
0001-Fix-reporting-error-when-process-duplicate-token-typ.patch:73: indent with spaces.
{
0001-Fix-reporting-error-when-process-duplicate-token-typ.patch:74: indent with spaces.
if (!stmt->missing_ok)
0001-Fix-reporting-error-when-process-duplicate-token-typ.patch:75: indent with spaces.
ereport(ERROR,
0001-Fix-reporting-error-when-process-duplicate-token-typ.patch:76: indent with spaces.
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
warning: squelched 9 whitespace errors
warning: 14 lines add whitespace errors.
```
Regrads
Yongtao Huang
Broadcom Greenplum
Tender Wang <tndrwang@gmail.com> 于2024年1月26日周五 14:21写道:
Tender Wang <tndrwang@gmail.com> 于2024年1月26日周五 12:54写道:Michael Paquier <michael@paquier.xyz> 于2024年1月26日周五 11:41写道:On Fri, Jan 26, 2024 at 11:33:24AM +0800, Tender Wang wrote:
> How about using list *res replace int *res in getTokenTypes(), so we can
> use list_append_unique().
> In outer function, use list *tokens to replace int *tokens, and ntokens =
> list_length(tokens) not list_length(stmt->tokentype).
Yeah, I was wondering about this code path. If you feel strongly
about that, would you like to write a patch?Ok, I try to fix it.As I said before, return List looks like not complicated to solve this issue.I found another problem, it didn't report NOTICE if SQL has IF EXISTS, for example:postgres=# alter text search configuration ispell_tst drop mapping if exists for test;
ERROR: token type "test" does not existSo I change the func getTokenTypes() interface and remove DropConfigurationMapping() error report into getTokenTypes().--Tender WangOpenPie: https://en.openpie.com/
Attachment
Re: BUG #18310: Some SQL commands fail to process duplicate objects with error: tuple already updated by self
From
Michael Paquier
Date:
On Sun, Jan 28, 2024 at 01:32:08AM +0800, Yongtao Huang wrote: > Hi Tender Wang, > > I think your code changes *LGTM*. Thanks for your contribution. Yes, incorrect indentation is something I have noticed in the previous patches you have sent to the lists: all the new lines have an incorrect indentation. See here, where it is mentioned that the code base uses 4 column tab spacing with tabs preserved: https://www.postgresql.org/docs/devel/source-format.html FWIW, I am OK for simple patches, but this is going to make reviews harder when it comes to read more complicated patches that touch more code paths. > Broadcom Greenplum Glad to see you're still alive and around. -- Michael
Attachment
Re: BUG #18310: Some SQL commands fail to process duplicate objects with error: tuple already updated by self
From
Michael Paquier
Date:
On Fri, Jan 26, 2024 at 11:23:08AM +0800, Tender Wang wrote: > Michael Paquier <michael@paquier.xyz> 于2024年1月26日周五 10:51写道: >> For this job, I think user.c is just overengineered and could be >> simplified to use a list of OIDs. There is no need for ObjectAddress >> or even ObjectAddresses: all of them are created with >> AuthIdRelationId as class ID, and the second loop just fetches the >> role IDs retrieved from the ObjectAddresses created in the first loop. >> We don't use the individual ObjectAddress either. On top of all that, >> it makes the code a bit easier to follow. > > Agreed. +1 Thanks, I've applied this one down v16 as of 50b797dc99ec. v15 and older do not use a two-pass logic in DropRole() so one would fail with a simpler "role does not exist", which is kind of correct in its own way because the second role cannot be scanned in pg_authid after the CCI of the first one, so I have let it be in older versions. It's not like somebody complained about that in the past, either, as far as I recall. -- Michael
Attachment
Re: BUG #18310: Some SQL commands fail to process duplicate objects with error: tuple already updated by self
From
Tender Wang
Date:
Michael Paquier <michael@paquier.xyz> 于2024年1月29日周一 07:18写道:
On Sun, Jan 28, 2024 at 01:32:08AM +0800, Yongtao Huang wrote:
> Hi Tender Wang,
>
> I think your code changes *LGTM*. Thanks for your contribution.
Thanks for reviewing this patch and fixing the indentation.
Yes, incorrect indentation is something I have noticed in the previous
patches you have sent to the lists: all the new lines have an
incorrect indentation. See here, where it is mentioned that the code
base uses 4 column tab spacing with tabs preserved:
https://www.postgresql.org/docs/devel/source-format.html
FWIW, I am OK for simple patches, but this is going to make reviews
harder when it comes to read more complicated patches that touch more
code paths.
Sorry, my mistake. Thanks for reminding. Yongtao has help me fix the indentation issue in his patch,
and I don't have new code to add.
> Broadcom Greenplum
Glad to see you're still alive and around.
--
Michael
Tender Wang
OpenPie: https://en.openpie.com/Re: BUG #18310: Some SQL commands fail to process duplicate objects with error: tuple already updated by self
From
Michael Paquier
Date:
On Mon, Jan 29, 2024 at 11:27:15AM +0800, Tender Wang wrote: > Michael Paquier <michael@paquier.xyz> 于2024年1月29日周一 07:18写道: >> Yes, incorrect indentation is something I have noticed in the previous >> patches you have sent to the lists: all the new lines have an >> incorrect indentation. See here, where it is mentioned that the code >> base uses 4 column tab spacing with tabs preserved: >> https://www.postgresql.org/docs/devel/source-format.html >> >> FWIW, I am OK for simple patches, but this is going to make reviews >> harder when it comes to read more complicated patches that touch more >> code paths. >> > > Sorry, my mistake. Thanks for reminding. Yongtao has help me fix the > indentation issue in his patch, > and I don't have new code to add. The latest version posted on [1] was still not right. I got a lot of complaints from a simple `git diff --check`. I am just looking at the patch, no need to send something else. [1]: https://www.postgresql.org/message-id/CAHewXNk5yeUo2XPaqTYSfbkKhEBfKd4VYa7NHv2MzV-mJ9ULpA@mail.gmail.com -- Michael
Attachment
Re: BUG #18310: Some SQL commands fail to process duplicate objects with error: tuple already updated by self
From
Michael Paquier
Date:
On Fri, Jan 26, 2024 at 02:21:12PM +0800, Tender Wang wrote: > As I said before, return List looks like not complicated to solve this > issue. > I found another problem, it didn't report NOTICE if SQL has IF EXISTS, for > example: > > postgres=# alter text search configuration ispell_tst drop mapping if > exists for test; > ERROR: token type "test" does not exist It seems to me that there is a misunderstanding here, because this query is correct depending on the parser used by a tsearch configuration, and the code. See details below. > So I change the func getTokenTypes() interface and remove > DropConfigurationMapping() error report into getTokenTypes(). + if (!stmt->missing_ok) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("token type \"%s\" does not exist", + strVal(val)))); + else + { + ereport(NOTICE, + (errmsg("token type \"%s\" does not exist, skipping", + strVal(val)))); + } [ ... ] - if (!found) - { - if (!stmt->missing_ok) - { - ereport(ERROR, - (errcode(ERRCODE_UNDEFINED_OBJECT), - errmsg("mapping for token type \"%s\" does not exist", - strVal(val)))); - } - else - { - ereport(NOTICE, - (errmsg("mapping for token type \"%s\" does not exist, skipping", - strVal(val)))); - } - } Your patch looks incorrect to me, and changes existing behaviors because of the changes you have done in both getTokenTypes() *and* DropConfigurationMapping() because the error handling you have removed can be reached when attempting to drop a token that is included in the parser but has *no* mapping. For example, reusing the test of your patch, this would include a bunch of tokens: CREATE TEXT SEARCH CONFIGURATION dup_ispell_tst (COPY=english); \dF+ dup_ispell_tst On HEAD, attempting to drop a token that does not exist for the configuration fails, because the defined token is not part of the parser: =# ALTER TEXT SEARCH CONFIGURATION dup_ispell_tst DROP MAPPING IF EXISTS FOR not_here; ERROR: 22023: token type "not_here" does not exist But if a token is dropped the configuration with the parser supporting it, we'd get that: =# ALTER TEXT SEARCH CONFIGURATION dup_ispell_tst DROP MAPPING FOR word; ALTER TEXT SEARCH CONFIGURATION =# ALTER TEXT SEARCH CONFIGURATION dup_ispell_tst DROP MAPPING IF EXISTS FOR word; NOTICE: 00000: mapping for token type "word" does not exist, skipping With the patch and the same queries as previously, we now get that: =# ALTER TEXT SEARCH CONFIGURATION dup_ispell_tst DROP MAPPING IF EXISTS FOR not_here; NOTICE: 00000: token type "not_here" does not exist, skipping =# ALTER TEXT SEARCH CONFIGURATION dup_ispell_tst DROP MAPPING IF EXISTS FOR word; ALTER TEXT SEARCH CONFIGURATION =# ALTER TEXT SEARCH CONFIGURATION dup_ispell_tst DROP MAPPING IF EXISTS FOR word; ALTER TEXT SEARCH CONFIGURATION This is incorrect for two reasons: - In the first query, the mapping "not_here" is not part of the parser, and this pattern *has to* be a hard ERROR even on IF EXISTS. - The token "word" is part in the parser, and we finish by incorrectly reporting that it gets dropped, missing the NOTICE from the IF EXISTS. So the patch makes a worse experience the user, because he/she then does not know if the mapping has been dropped or not, while not knowing anymore if the mapping was part of the parser used by the configuration. The root of the problem is that we should fail for an unknown token name and we skip a failure if the mapping does not exist with IF EXISTS, while still knowing the name of the token to be able to report it correctly, and while handling duplicates in getTokenTypes(). So the patch should *not* change the error checks at all. I think that we should tweak getTokenTypes() so as we return *two* Lists, one for the IDs and a second with the token names, then use forboth() in DropConfigurationMapping() with the two lists and a simple foreach in MakeConfigurationMapping() when overriding the mappings, while getTokenTypes() checks if a number is in the first list before adding the name of a token in the second list. Or we could use a simple list with pointers to a local structure, but the second list is only needed by DropConfigurationMapping(). That would enforce a correct order of the token names and numbers, at least. I would be tempted to just use one List with a structure (number, token_name). It makes the checks for duplicates O(N^2) but we will never have hundreds of mapping entries in these DDL queries. What's your pick? -- Michael
Attachment
Re: BUG #18310: Some SQL commands fail to process duplicate objects with error: tuple already updated by self
From
Tender Wang
Date:
Michael Paquier <michael@paquier.xyz> 于2024年1月30日周二 14:28写道:
On Fri, Jan 26, 2024 at 02:21:12PM +0800, Tender Wang wrote:
> As I said before, return List looks like not complicated to solve this
> issue.
> I found another problem, it didn't report NOTICE if SQL has IF EXISTS, for
> example:
>
> postgres=# alter text search configuration ispell_tst drop mapping if
> exists for test;
> ERROR: token type "test" does not exist
It seems to me that there is a misunderstanding here, because this
query is correct depending on the parser used by a tsearch
configuration, and the code. See details below.
Yeah, you are right.
> So I change the func getTokenTypes() interface and remove
> DropConfigurationMapping() error report into getTokenTypes().
+ if (!stmt->missing_ok)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("token type \"%s\" does not exist",
+ strVal(val))));
+ else
+ {
+ ereport(NOTICE,
+ (errmsg("token type \"%s\" does not exist, skipping",
+ strVal(val))));
+ }
[ ... ]
- if (!found)
- {
- if (!stmt->missing_ok)
- {
- ereport(ERROR,
- (errcode(ERRCODE_UNDEFINED_OBJECT),
- errmsg("mapping for token type \"%s\" does not exist",
- strVal(val))));
- }
- else
- {
- ereport(NOTICE,
- (errmsg("mapping for token type \"%s\" does not exist, skipping",
- strVal(val))));
- }
- }
Your patch looks incorrect to me, and changes existing behaviors
because of the changes you have done in both getTokenTypes() *and*
DropConfigurationMapping() because the error handling you have removed
can be reached when attempting to drop a token that is included in the
parser but has *no* mapping. For example, reusing the test of your
patch, this would include a bunch of tokens:
CREATE TEXT SEARCH CONFIGURATION dup_ispell_tst (COPY=english);
\dF+ dup_ispell_tst
On HEAD, attempting to drop a token that does not exist for the
configuration fails, because the defined token is not part of the
parser:
=# ALTER TEXT SEARCH CONFIGURATION dup_ispell_tst
DROP MAPPING IF EXISTS FOR not_here;
ERROR: 22023: token type "not_here" does not exist
But if a token is dropped the configuration with the parser supporting
it, we'd get that:
=# ALTER TEXT SEARCH CONFIGURATION dup_ispell_tst
DROP MAPPING FOR word;
ALTER TEXT SEARCH CONFIGURATION
=# ALTER TEXT SEARCH CONFIGURATION dup_ispell_tst
DROP MAPPING IF EXISTS FOR word;
NOTICE: 00000: mapping for token type "word" does not exist, skipping
With the patch and the same queries as previously, we now get that:
=# ALTER TEXT SEARCH CONFIGURATION dup_ispell_tst
DROP MAPPING IF EXISTS FOR not_here;
NOTICE: 00000: token type "not_here" does not exist, skipping
=# ALTER TEXT SEARCH CONFIGURATION dup_ispell_tst
DROP MAPPING IF EXISTS FOR word;
ALTER TEXT SEARCH CONFIGURATION
=# ALTER TEXT SEARCH CONFIGURATION dup_ispell_tst
DROP MAPPING IF EXISTS FOR word;
ALTER TEXT SEARCH CONFIGURATION
Thanks for your explanation.
This is incorrect for two reasons:
- In the first query, the mapping "not_here" is not part of the
parser, and this pattern *has to* be a hard ERROR even on IF EXISTS.
- The token "word" is part in the parser, and we finish by incorrectly
reporting that it gets dropped, missing the NOTICE from the IF EXISTS.
So the patch makes a worse experience the user, because he/she then
does not know if the mapping has been dropped or not, while not
knowing anymore if the mapping was part of the parser used by the
configuration.
The root of the problem is that we should fail for an unknown token
name and we skip a failure if the mapping does not exist with IF
EXISTS, while still knowing the name of the token to be able to report
it correctly, and while handling duplicates in getTokenTypes(). So
the patch should *not* change the error checks at all.
Yeah, it works for MakeConfigurationMapping() when just returns list, but for DropConfigurationMapping(),
it need to know the non-exist token name. So it doesn't work if we just return oid list in getTokenTypes().
When I try to fix the problem, I found that getTokenType() will report ERROR and DropConfigurationMapping()
will report ERROR too. I felt a little confused at that time.
I misunderstund the behavior in getTokenTypes() and in DropConfigurationMapping(). I only realized it can fix the reported issue.
I think that we should tweak getTokenTypes() so as we return *two*
Lists, one for the IDs and a second with the token names, then use
forboth() in DropConfigurationMapping() with the two lists and a
simple foreach in MakeConfigurationMapping() when overriding the
mappings, while getTokenTypes() checks if a number is in the first
list before adding the name of a token in the second list. Or we
could use a simple list with pointers to a local structure, but the
second list is only needed by DropConfigurationMapping(). That would
enforce a correct order of the token names and numbers, at least.
I would be tempted to just use one List with a structure (number,
token_name). It makes the checks for duplicates O(N^2) but we will
never have hundreds of mapping entries in these DDL queries.
Hmm, I agree with you.
What's your pick?
--
Michael
Tender Wang
OpenPie: https://en.openpie.com/Re: BUG #18310: Some SQL commands fail to process duplicate objects with error: tuple already updated by self
From
Michael Paquier
Date:
On Tue, Jan 30, 2024 at 03:29:46PM +0800, Tender Wang wrote: > Michael Paquier <michael@paquier.xyz> 于2024年1月30日周二 14:28写道: > I misunderstund the behavior in getTokenTypes() and in > DropConfigurationMapping(). I only realized it can fix the reported issue. Well, I don't think you should blame yourself, so no worries we are all here to learn :) tsdicts.sql has little to no coverage of the various grammar flavors we are discussing here, so we should try to close the hole with all the expectations I've just guessed while analyzing the code and the patch. So it is a problem of fixing the root issue as much as expanding the regression tests with token names and IF EXISTS. >> I think that we should tweak getTokenTypes() so as we return *two* >> Lists, one for the IDs and a second with the token names, then use >> forboth() in DropConfigurationMapping() with the two lists and a >> simple foreach in MakeConfigurationMapping() when overriding the >> mappings, while getTokenTypes() checks if a number is in the first >> list before adding the name of a token in the second list. Or we >> could use a simple list with pointers to a local structure, but the >> second list is only needed by DropConfigurationMapping(). That would >> enforce a correct order of the token names and numbers, at least. >> I would be tempted to just use one List with a structure (number, >> token_name). It makes the checks for duplicates O(N^2) but we will >> never have hundreds of mapping entries in these DDL queries. >> > > Hmm, I agree with you. Perhaps you'd like to give it a second try? Based on my suggestion, you just need to englobe the tokens retrieved into a single structure like that: typedef struct { int num; char *name; } TSTokenTypes; Then handle a List of these (could be as well an array with its length returned as an output argument of getTokenTypes(), to keep a non-duplicated list of tokens with a strict mapping between token name and token number. -- Michael
Attachment
Re: BUG #18310: Some SQL commands fail to process duplicate objects with error: tuple already updated by self
From
Tender Wang
Date:
Michael Paquier <michael@paquier.xyz> 于2024年1月30日周二 15:42写道:
On Tue, Jan 30, 2024 at 03:29:46PM +0800, Tender Wang wrote:
> Michael Paquier <michael@paquier.xyz> 于2024年1月30日周二 14:28写道:
> I misunderstund the behavior in getTokenTypes() and in
> DropConfigurationMapping(). I only realized it can fix the reported issue.
Well, I don't think you should blame yourself, so no worries we are
all here to learn :)
tsdicts.sql has little to no coverage of the various grammar flavors
we are discussing here, so we should try to close the hole with all
the expectations I've just guessed while analyzing the code and the
patch. So it is a
problem of fixing the root issue as much as expanding the regression
tests with token names and IF EXISTS.
>> I think that we should tweak getTokenTypes() so as we return *two*
>> Lists, one for the IDs and a second with the token names, then use
>> forboth() in DropConfigurationMapping() with the two lists and a
>> simple foreach in MakeConfigurationMapping() when overriding the
>> mappings, while getTokenTypes() checks if a number is in the first
>> list before adding the name of a token in the second list. Or we
>> could use a simple list with pointers to a local structure, but the
>> second list is only needed by DropConfigurationMapping(). That would
>> enforce a correct order of the token names and numbers, at least.
>> I would be tempted to just use one List with a structure (number,
>> token_name). It makes the checks for duplicates O(N^2) but we will
>> never have hundreds of mapping entries in these DDL queries.
>>
>
> Hmm, I agree with you.
Perhaps you'd like to give it a second try? Based on my suggestion,
you just need to englobe the tokens retrieved into a single structure
like that:
typedef struct
{
int num;
char *name;
} TSTokenTypes;
Then handle a List of these (could be as well an array with its length
returned as an output argument of getTokenTypes(), to keep a
non-duplicated list of tokens with a strict mapping between token name
and token number.
Thanks a lot. I want to give it a second try. I will send a patch based on your suggestion.
--
Michael
Tender Wang
OpenPie: https://en.openpie.com/Re: BUG #18310: Some SQL commands fail to process duplicate objects with error: tuple already updated by self
From
Michael Paquier
Date:
On Tue, Jan 30, 2024 at 04:02:06PM +0800, Tender Wang wrote: > Thanks a lot. I want to give it a second try. I will send a patch based on > your suggestion. Sure, feel free. -- Michael
Attachment
Re: BUG #18310: Some SQL commands fail to process duplicate objects with error: tuple already updated by self
From
Tender Wang
Date:
I refactor previes patch based on your suggestion, please review the new attached patch.
Michael Paquier <michael@paquier.xyz> 于2024年1月30日周二 16:15写道:
On Tue, Jan 30, 2024 at 04:02:06PM +0800, Tender Wang wrote:
> Thanks a lot. I want to give it a second try. I will send a patch based on
> your suggestion.
Sure, feel free.
--
Michael
Tender Wang
OpenPie: https://en.openpie.com/Attachment
Re: BUG #18310: Some SQL commands fail to process duplicate objects with error: tuple already updated by self
From
Michael Paquier
Date:
On Tue, Jan 30, 2024 at 07:51:45PM +0800, Tender Wang wrote: > I refactor previes patch based on your suggestion, please review the new > attached patch. Thanks for the new patch. I've found the logic to be basically OK, minus edits with the structure, the routine names and some comments. Unfortunately, the set of regression tests was too shy because this only tested for the case of duplicated tokens when overridding mappings, while missing: - ADD MAPPING that failed on a constraint failure. - DROP MAPPING that failed with an deletion failure. - The case of IF EXISTS with tokens supported or not supported by a configuration's parser. That was not related to the report of this patch, but as we've discussed it is very easy to miss the historical distinction between the way this clause is handled. In short, there was zero coverage for these code paths. See DropConfigurationMapping() that was completely red, for one: https://coverage.postgresql.org/src/backend/commands/tsearchcmds.c.gcov.html This should be marked as covered pretty soon. I was also hesitating about the addition of a test with REPLACE, actually.. -- Michael
Attachment
Re: BUG #18310: Some SQL commands fail to process duplicate objects with error: tuple already updated by self
From
Tender Wang
Date:
Thanks for your work on this issue, and thanks for pushing it, again.
Michael Paquier <michael@paquier.xyz> 于2024年1月31日周三 12:25写道:
On Tue, Jan 30, 2024 at 07:51:45PM +0800, Tender Wang wrote:
> I refactor previes patch based on your suggestion, please review the new
> attached patch.
Thanks for the new patch. I've found the logic to be basically OK,
minus edits with the structure, the routine names and some comments.
Unfortunately, the set of regression tests was too shy because this
only tested for the case of duplicated tokens when overridding
mappings, while missing:
- ADD MAPPING that failed on a constraint failure.
- DROP MAPPING that failed with an deletion failure.
- The case of IF EXISTS with tokens supported or not supported by a
configuration's parser. That was not related to the report of this
patch, but as we've discussed it is very easy to miss the historical
distinction between the way this clause is handled.
In short, there was zero coverage for these code paths. See
DropConfigurationMapping() that was completely red, for one:
https://coverage.postgresql.org/src/backend/commands/tsearchcmds.c.gcov.html
This should be marked as covered pretty soon. I was also hesitating
about the addition of a test with REPLACE, actually..
--
Michael
Tender Wang
OpenPie: https://en.openpie.com/