Thread: Re: system catalog relation of a table and a serial sequence

Re: system catalog relation of a table and a serial sequence

From
Brent Verner
Date:
[2001-12-14 22:53] Brent Verner said:
| I propose adding a function to pg_dump.c for now.

Patch adding a getSerialSequenceName() function to pg_dump.[ch] is
attached.

I'm aware that this is not the /best/ solution to this problem, but
it is better than the current breakage in pg_dump.

feedback appreciated.

cheers.
  brent

--
"Develop your talent, man, and leave the world something. Records are
really gifts from people. To think that an artist would love you enough
to share his music with anyone is a beautiful thing."  -- Duane Allman

Re: system catalog relation of a table and a serial sequence

From
Tom Lane
Date:
Brent Verner <brent@rcfile.org> writes:
> I'm aware that this is not the /best/ solution to this problem, but
> it is better than the current breakage in pg_dump.

I'd dispute that, primarily because the patch blithely assumes that
there is no other kind of default value than a serial-created nextval().
It looks to me like it will either coredump or do the wrong thing
with other default-value strings.

            regards, tom lane

Re: system catalog relation of a table and a serial sequence

From
Brent Verner
Date:
[2001-12-15 21:02] Tom Lane said:
| Brent Verner <brent@rcfile.org> writes:
| > I'm aware that this is not the /best/ solution to this problem, but
| > it is better than the current breakage in pg_dump.
|
| I'd dispute that, primarily because the patch blithely assumes that
| there is no other kind of default value than a serial-created nextval().
| It looks to me like it will either coredump or do the wrong thing
| with other default-value strings.

monkey me!  Yes, quite a nasty oversight!  I'll clean this up.

thanks.
  brent

--
"Develop your talent, man, and leave the world something. Records are
really gifts from people. To think that an artist would love you enough
to share his music with anyone is a beautiful thing."  -- Duane Allman

Re: system catalog relation of a table and a serial sequence

From
Brent Verner
Date:
[2001-12-15 21:12] Tom Lane said:
| While you're at it, why not fix the code so that it can deal with
| multiple SERIALs attached to a table?

will do.  I'd appreciate a bit of advice on both of the issues to
be addressed.

1) Is a strcmp(firststrtok,"nextval('") == 0  sufficient to determine
   that the adsrc is indeed one that we're looking for?  If not,
   suggestions are greatly appreciated :-)

2) Should this function now look like .. ?
     char** getSerialSequenceNames(const char* table)
   Or would you suggest it return a smarter struct?

thanks.
  brent

--
"Develop your talent, man, and leave the world something. Records are
really gifts from people. To think that an artist would love you enough
to share his music with anyone is a beautiful thing."  -- Duane Allman

Re: system catalog relation of a table and a serial sequence

From
Tom Lane
Date:
While you're at it, why not fix the code so that it can deal with
multiple SERIALs attached to a table?

            regards, tom lane

Re: system catalog relation of a table and a serial sequence

From
Tom Lane
Date:
Brent Verner <brent@rcfile.org> writes:
> 1) Is a strcmp(firststrtok,"nextval('") == 0  sufficient to determine
>    that the adsrc is indeed one that we're looking for?  If not,
>    suggestions are greatly appreciated :-)

I would not use strtok at all, but look for nextval('" at the start
of the string and "'::text) at the end.  If both match, and there's
at least one character between, then the intervening text can be
presumed to be a sequence name.  You might further check that the
apparent sequence name ends with _seq --- if not, it wasn't generated
by SERIAL.

> 2) Should this function now look like .. ?
>      char** getSerialSequenceNames(const char* table)
>    Or would you suggest it return a smarter struct?

char** (null-terminated vector) would probably work.

BTW, don't forget you have the OID of the table available from the table
list, so you can avoid the subselect, as well as the relname quoting
issues that you didn't take care of.  When a tablename argument is
provided, I'd be inclined to make a pre-pass over the table list to see
if it matches any non-sequence table names, and if so build a list of
their associated sequence name(s).  Keep in mind that we'll probably
generalize the tablename argument to support wildcarding someday soon,
so it'd be good if the code could cope with more than one matching
table.

            regards, tom lane

Re: system catalog relation of a table and a serial sequence

From
Brent Verner
Date:
[2001-12-15 21:43] Tom Lane said:
| Brent Verner <brent@rcfile.org> writes:
| > 1) Is a strcmp(firststrtok,"nextval('") == 0  sufficient to determine
| >    that the adsrc is indeed one that we're looking for?  If not,
| >    suggestions are greatly appreciated :-)
|
| I would not use strtok at all, but look for nextval('" at the start
| of the string and "'::text) at the end.  If both match, and there's
| at least one character between, then the intervening text can be
| presumed to be a sequence name.  You might further check that the
| apparent sequence name ends with _seq --- if not, it wasn't generated
| by SERIAL.

Why not use strtok?  The following should be safe, no?

  t1 = strtok(adsrc,"\"");
  t2 = strtok(NULL,"\"");
  t3 = strtok(NULL,"\"");

  if( t0 && t2
      && strcmp(t0,"nextval('") == 0
      && strcmp(t2,"'::text)") == 0 ){
    /* this is a call to nextval, check for t1 =~ /_seq$/ */

  }

| > 2) Should this function now look like .. ?
| >      char** getSerialSequenceNames(const char* table)
| >    Or would you suggest it return a smarter struct?
|
| char** (null-terminated vector) would probably work.
|
| BTW, don't forget you have the OID of the table available from the table
| list, so you can avoid the subselect, as well as the relname quoting
| issues that you didn't take care of.  When a tablename argument is
| provided, I'd be inclined to make a pre-pass over the table list to see
| if it matches any non-sequence table names, and if so build a list of
| their associated sequence name(s).  Keep in mind that we'll probably
| generalize the tablename argument to support wildcarding someday soon,
| so it'd be good if the code could cope with more than one matching
| table.

gotcha.

thanks.
  brent

--
"Develop your talent, man, and leave the world something. Records are
really gifts from people. To think that an artist would love you enough
to share his music with anyone is a beautiful thing."  -- Duane Allman

Re: [HACKERS] system catalog relation of a table and a serial sequence

From
Brent Verner
Date:
[2001-12-15 22:26] Rod Taylor said:
|
| > Brent Verner <brent@rcfile.org> writes:
| > > 1) Is a strcmp(firststrtok,"nextval('") == 0  sufficient to
| determine
| > >    that the adsrc is indeed one that we're looking for?  If not,
| > >    suggestions are greatly appreciated :-)
| >
| > I would not use strtok at all, but look for nextval('" at the start
| > of the string and "'::text) at the end.  If both match, and there's
| > at least one character between, then the intervening text can be
| > presumed to be a sequence name.  You might further check that the
| > apparent sequence name ends with _seq --- if not, it wasn't
| generated
| > by SERIAL.
|
| Wouldn't you want to include user sequences that are required for
| using the table?  If someone has used their own sequence as the
| default value for a column it would be nice to have it dumped as well.

This is my thought as well.  Hopefully Tom will concur.

cheers.
  brent

--
"Develop your talent, man, and leave the world something. Records are
really gifts from people. To think that an artist would love you enough
to share his music with anyone is a beautiful thing."  -- Duane Allman

Re: system catalog relation of a table and a serial sequence

From
Tom Lane
Date:
Brent Verner <brent@rcfile.org> writes:
> Why not use strtok?

Well, it's ugly (I don't like non-reentrant library routines), it's
not really buying anything, and I don't think you've got the corner
cases right anyway.  I'd go for something like

    if (strlen(adsrc) > 19 &&
        strncmp(adsrc, "nextval('\"", 10) == 0 &&
        strcmp(adsrc + strlen(adsrc) - 9, "\"'::text)") == 0)

            regards, tom lane

Re: [HACKERS] system catalog relation of a table and a serial sequence

From
Tom Lane
Date:
> | > You might further check that the
> | > apparent sequence name ends with _seq --- if not, it wasn't
> | > generated by SERIAL.
> |
> | Wouldn't you want to include user sequences that are required for
> | using the table?  If someone has used their own sequence as the
> | default value for a column it would be nice to have it dumped as well.

> This is my thought as well.  Hopefully Tom will concur.

Well, that's why I said "might".  I'm not sure what the correct behavior
is here.  If we had an actual SERIAL datatype --- that is, we could
unambiguously tell that a given column was SERIAL --- then a case could
be made that "pg_dump -t table" should dump only those sequences
associated with table's SERIAL columns.

I think it'd be a bit surprising if "pg_dump -t table" would dump
sequences declared independently of the table.  An example where you'd
likely not be happy with that is if the same sequence is being used to
feed multiple tables.

I agree that dumping all such sequences will often be the desired
behavior, but that doesn't leave me convinced that it's the right
thing to do.

Any comments out there?

            regards, tom lane

Re: [HACKERS] system catalog relation of a table and a serial sequence

From
Brent Verner
Date:
[2001-12-15 23:17] Tom Lane said:
| > | > You might further check that the
| > | > apparent sequence name ends with _seq --- if not, it wasn't
| > | > generated by SERIAL.
| > |
| > | Wouldn't you want to include user sequences that are required for
| > | using the table?  If someone has used their own sequence as the
| > | default value for a column it would be nice to have it dumped as well.
|
| > This is my thought as well.  Hopefully Tom will concur.
|
| Well, that's why I said "might".  I'm not sure what the correct behavior
| is here.  If we had an actual SERIAL datatype --- that is, we could
| unambiguously tell that a given column was SERIAL --- then a case could
| be made that "pg_dump -t table" should dump only those sequences
| associated with table's SERIAL columns.
|
| I think it'd be a bit surprising if "pg_dump -t table" would dump
| sequences declared independently of the table.  An example where you'd
| likely not be happy with that is if the same sequence is being used to
| feed multiple tables.
|
| I agree that dumping all such sequences will often be the desired
| behavior, but that doesn't leave me convinced that it's the right
| thing to do.
|
| Any comments out there?

sure :-)  What we can do is determine /any/ sequence referenced
by a nextval(..) attribute default with the following SELECT query.

create sequence non_serial_sequence;
create table aaa (
  id serial,
  nonid int default nextval('non_serial_sequence')
);
SELECT adsrc FROM pg_attrdef WHERE adrelid=(
  SELECT oid FROM pg_class WHERE relname='aaa'
);

                adsrc
--------------------------------------
 nextval('"aaa_id_seq"'::text)
 nextval('non_serial_sequence'::text)

We get the nextval(..) calls to both of the referenced sequences,
and the strtok code I'm using extracts the proper sequence names.
Am I overlooking something here?  Is there any other way a nextval(..)
adsrc would appear not containing a sequence related to this relation?

cheers.
  brent

--
"Develop your talent, man, and leave the world something. Records are
really gifts from people. To think that an artist would love you enough
to share his music with anyone is a beautiful thing."  -- Duane Allman

Re: system catalog relation of a table and a serial sequence

From
Brent Verner
Date:
[2001-12-15 23:25] Tom Lane said:
| Brent Verner <brent@rcfile.org> writes:
| > Why not use strtok?
|
| Well, it's ugly (I don't like non-reentrant library routines), it's
| not really buying anything, and I don't think you've got the corner
| cases right anyway.  I'd go for something like

How about strtok_r?  I /really/ like the fact that strtok will
eat either of the tokens ['"] that might be around the sequence
name... just call me lazy :-)

thanks.
  brent

--
"Develop your talent, man, and leave the world something. Records are
really gifts from people. To think that an artist would love you enough
to share his music with anyone is a beautiful thing."  -- Duane Allman

Re: system catalog relation of a table and a serial sequence

From
Tom Lane
Date:
Brent Verner <brent@rcfile.org> writes:
> How about strtok_r?  I /really/ like the fact that strtok will
> eat either of the tokens ['"] that might be around the sequence
> name... just call me lazy :-)

That behavior creates one of the "corner cases" I was alluding to.
Shall I leave the difficulty as an exercise for the student?

            regards, tom lane

Re: system catalog relation of a table and a serial sequence

From
Brent Verner
Date:
[2001-12-16 00:42] Tom Lane said:
| Brent Verner <brent@rcfile.org> writes:
| > How about strtok_r?  I /really/ like the fact that strtok will
| > eat either of the tokens ['"] that might be around the sequence
| > name... just call me lazy :-)
|
| That behavior creates one of the "corner cases" I was alluding to.
| Shall I leave the difficulty as an exercise for the student?

sure.  I'm assuming the strtok_r is not acceptable :-)  I'll work on
this a bit more tonight.  Expect a better patch sometiime tomorrow.

thanks.
  brent

--
"Develop your talent, man, and leave the world something. Records are
really gifts from people. To think that an artist would love you enough
to share his music with anyone is a beautiful thing."  -- Duane Allman

Re: system catalog relation of a table and a serial

From
Philip Warner
Date:
At 01:20 16/12/01 -0500, Brent Verner wrote:
>
> Expect a better patch sometiime tomorrow.
>

The hard part in this might be getting pg_restore to know that the sequence
is part of the table definition; perhaps it is adequate to use the 'deps'
field.

I think it is currently unused for SEQUENCE (and SEQUENCE SET) entries, so
we could assume that if it is set, the sequence is logically part of a
table. You could set the deps to the table OID, then the restore operation
(_tocEntryRequired) could scan the TOC for a matching table, and see if the
matching table is being restored (ie. _tocEntryRequired would, in the case
of 'SEQUENCE' and 'SEQUENCE SET' entries, scan the entire TOC for the
matching table then call itself recursively on the table entry.




----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/

Re: system catalog relation of a table and a serial sequence

From
Brent Verner
Date:
[2001-12-16 00:42] Tom Lane said:
| Brent Verner <brent@rcfile.org> writes:
| > How about strtok_r?  I /really/ like the fact that strtok will
| > eat either of the tokens ['"] that might be around the sequence
| > name... just call me lazy :-)
|
| That behavior creates one of the "corner cases" I was alluding to.
| Shall I leave the difficulty as an exercise for the student?

Ok... I ended up working longer than I'd thought :-)

* no strtok were used in this patch. ;-)
* Handles both serial-sequences and user-sequences referenced in
  nextval(...) default column defs.
* Loop over tables so we can check wildcard table name in the future
  per your suggestion.  I've only noted a TODO: regarding the wildcard
  matching.
* Instead of using a NULL terminated char** array to hold the collected
  sequence names, I put in a simple strarray ADT -- mostly so I could
  have the strarrayContains() test to call from the conditional around
  dumpSequence().  If this is just dumb, I'll replace it with a simple
  char** implementation.  Did I overlook some utility funcs in the
  PG source that already does this?  If so, I'll gladly use those.
* Patch is really attached :-P

comments?

tired.
  b

--
"Develop your talent, man, and leave the world something. Records are
really gifts from people. To think that an artist would love you enough
to share his music with anyone is a beautiful thing."  -- Duane Allman

Attachment

Re: system catalog relation of a table and a serial sequence

From
Brent Verner
Date:
[2001-12-16 06:30] Brent Verner said:
| [2001-12-16 00:42] Tom Lane said:
| | Brent Verner <brent@rcfile.org> writes:
| | > How about strtok_r?  I /really/ like the fact that strtok will
| | > eat either of the tokens ['"] that might be around the sequence
| | > name... just call me lazy :-)
| |
| | That behavior creates one of the "corner cases" I was alluding to.
| | Shall I leave the difficulty as an exercise for the student?
|
| Ok... I ended up working longer than I'd thought :-)
|
| * no strtok were used in this patch. ;-)
| * Handles both serial-sequences and user-sequences referenced in
|   nextval(...) default column defs.
| * Loop over tables so we can check wildcard table name in the future
|   per your suggestion.  I've only noted a TODO: regarding the wildcard
|   matching.
| * Instead of using a NULL terminated char** array to hold the collected
|   sequence names, I put in a simple strarray ADT -- mostly so I could
|   have the strarrayContains() test to call from the conditional around
|   dumpSequence().  If this is just dumb, I'll replace it with a simple
|   char** implementation.  Did I overlook some utility funcs in the
|   PG source that already does this?  If so, I'll gladly use those.
| * Patch is really attached :-P

This patch needs a fix already...  I just realized (while playing with
this code in a different context) that I forgot to change the malloc
line in strarrayInit() after typedef'ing strarray as pointer to struct,
instead of just the struct.

-  strarray _ary = (strarray)malloc(sizeof(strarray));
+  strarray _ary = (strarray)malloc(sizeof(struct strarray));

cheers.
  brent

--
"Develop your talent, man, and leave the world something. Records are
really gifts from people. To think that an artist would love you enough
to share his music with anyone is a beautiful thing."  -- Duane Allman

Re: [HACKERS] system catalog relation of a table and a

From
Andrew McMillan
Date:
On Sun, 2001-12-16 at 17:17, Tom Lane wrote:
> > | > You might further check that the
> > | > apparent sequence name ends with _seq --- if not, it wasn't
> > | > generated by SERIAL.
> > |
> > | Wouldn't you want to include user sequences that are required for
> > | using the table?  If someone has used their own sequence as the
> > | default value for a column it would be nice to have it dumped as well.
>
> > This is my thought as well.  Hopefully Tom will concur.
>
> Well, that's why I said "might".  I'm not sure what the correct behavior
> is here.  If we had an actual SERIAL datatype --- that is, we could
> unambiguously tell that a given column was SERIAL --- then a case could
> be made that "pg_dump -t table" should dump only those sequences
> associated with table's SERIAL columns.
>
> I think it'd be a bit surprising if "pg_dump -t table" would dump
> sequences declared independently of the table.  An example where you'd
> likely not be happy with that is if the same sequence is being used to
> feed multiple tables.
>
> I agree that dumping all such sequences will often be the desired
> behavior, but that doesn't leave me convinced that it's the right
> thing to do.
>
> Any comments out there?

Along with "DROP COLUMN" this is probably one of the biggest "I can't
believe it doesn't" things out there.

I would tend to say that Brent's patch, in dumping all of the sequences
used by a table, is erring on the _correct_ side of caution.

Remember that someone who this is a problem for can easily post-process
the sequence out of the dump with sed or something, but someone for whom
the opposite is true doesn't have anything like as trivial a job to put
it back in there.

Cheers,
                    Andrew.
--
--------------------------------------------------------------------
Andrew @ Catalyst .Net.NZ Ltd, PO Box 11-053, Manners St, Wellington
WEB: http://catalyst.net.nz/        PHYS: Level 2, 150-154 Willis St
DDI: +64(4)916-7201    MOB: +64(21)635-694    OFFICE: +64(4)499-2267


Re: system catalog relation of a table and a serial sequence

From
Bruce Momjian
Date:
Brent, do you have a new, final patch that you want to submit for this?


---------------------------------------------------------------------------

Brent Verner wrote:
> [2001-12-16 06:30] Brent Verner said:
> | [2001-12-16 00:42] Tom Lane said:
> | | Brent Verner <brent@rcfile.org> writes:
> | | > How about strtok_r?  I /really/ like the fact that strtok will
> | | > eat either of the tokens ['"] that might be around the sequence
> | | > name... just call me lazy :-)
> | |
> | | That behavior creates one of the "corner cases" I was alluding to.
> | | Shall I leave the difficulty as an exercise for the student?
> |
> | Ok... I ended up working longer than I'd thought :-)
> |
> | * no strtok were used in this patch. ;-)
> | * Handles both serial-sequences and user-sequences referenced in
> |   nextval(...) default column defs.
> | * Loop over tables so we can check wildcard table name in the future
> |   per your suggestion.  I've only noted a TODO: regarding the wildcard
> |   matching.
> | * Instead of using a NULL terminated char** array to hold the collected
> |   sequence names, I put in a simple strarray ADT -- mostly so I could
> |   have the strarrayContains() test to call from the conditional around
> |   dumpSequence().  If this is just dumb, I'll replace it with a simple
> |   char** implementation.  Did I overlook some utility funcs in the
> |   PG source that already does this?  If so, I'll gladly use those.
> | * Patch is really attached :-P
>
> This patch needs a fix already...  I just realized (while playing with
> this code in a different context) that I forgot to change the malloc
> line in strarrayInit() after typedef'ing strarray as pointer to struct,
> instead of just the struct.
>
> -  strarray _ary = (strarray)malloc(sizeof(strarray));
> +  strarray _ary = (strarray)malloc(sizeof(struct strarray));
>
> cheers.
>   brent
>
> --
> "Develop your talent, man, and leave the world something. Records are
> really gifts from people. To think that an artist would love you enough
> to share his music with anyone is a beautiful thing."  -- Duane Allman
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026