Thread: [HACKERS] outfuncs.c utility statement support
While debugging some other stuff, I was wondering to what extent node types supporting utility statements should be supported in outfuncs.c. Running with --debug-print-parse=on, executing create table test1 (a int, b text); gives output that is truncated somewhere in the middle (possibly a null byte), and drop table test1; shows (among other things) :utilityStmt ? Is there a way to check that everything that needs to be there is there and that the stuff that is there works correctly or is reachable at all? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: > While debugging some other stuff, I was wondering to what extent node > types supporting utility statements should be supported in outfuncs.c. We've largely not tried too hard in that department. From a debugging standpoint it'd be useful to have more complete coverage, but I don't know how much additional code and maintenance effort would be required to get to full coverage. (Hmm .. I think copyfuncs.c does have complete coverage, and it's about 5500 lines vs. 4200 lines for outfuncs.c. So perhaps this would mean about 30% growth of outfuncs.c and another 20KB or so in the executable.) regards, tom lane
On 6/13/17 11:25, Peter Eisentraut wrote: > Running with --debug-print-parse=on, executing > > create table test1 (a int, b text); > > gives output that is truncated somewhere in the middle (possibly a null > byte) So this seems to be a pretty basic bug. Some node fields of type char may be zero, and so printing them as a zero byte just truncates the whole output string. This could be fixed by printing chars like strings with the full escaping mechanism. See attached patch. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On 2017/06/14 12:49, Peter Eisentraut wrote: > On 6/13/17 11:25, Peter Eisentraut wrote: >> Running with --debug-print-parse=on, executing >> >> create table test1 (a int, b text); >> >> gives output that is truncated somewhere in the middle (possibly a null >> byte) > > So this seems to be a pretty basic bug. Some node fields of type char > may be zero, and so printing them as a zero byte just truncates the > whole output string. This could be fixed by printing chars like strings > with the full escaping mechanism. See attached patch. +1. I've been meaning to report about zero-byte-truncating-the-whole-output-string thing for some time now. Thanks, Amit
On Tue, Jun 13, 2017 at 11:59 PM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote: >> So this seems to be a pretty basic bug. Some node fields of type char >> may be zero, and so printing them as a zero byte just truncates the >> whole output string. This could be fixed by printing chars like strings >> with the full escaping mechanism. See attached patch. > > +1. I've been meaning to report about > zero-byte-truncating-the-whole-output-string thing for some time now. I've been bitten by this, too. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: > So this seems to be a pretty basic bug. Some node fields of type char > may be zero, and so printing them as a zero byte just truncates the > whole output string. This could be fixed by printing chars like strings > with the full escaping mechanism. See attached patch. +1 for fixing this, but you have not handled the read side correctly. pg_strtok doesn't promise to return a null-terminated string, so without changes, readfuncs.c would not successfully decode a zero-byte char field. Also it would do the wrong thing for any character code that outToken had decided to prefix with a backslash. I think you could fix both problems like this: /* Read a char field (ie, one ascii character) */#define READ_CHAR_FIELD(fldname) \ token = pg_strtok(&length); /* skip :fldname */ \ token = pg_strtok(&length); /* get field value */ \ - local_node->fldname = token[0] + local_node->fldname = debackslash(token, length)[0] although that's annoyingly expensive for the normal case where no special processing is needed. Maybe better + local_node->fldname = (length == 0) ? '\0' : (token[0] == '\\') ? token[1] : token[0] regards, tom lane
On 6/14/17 12:05, Tom Lane wrote: > Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: >> So this seems to be a pretty basic bug. Some node fields of type char >> may be zero, and so printing them as a zero byte just truncates the >> whole output string. This could be fixed by printing chars like strings >> with the full escaping mechanism. See attached patch. > > +1 for fixing this, but you have not handled the read side correctly. > pg_strtok doesn't promise to return a null-terminated string, so without > changes, readfuncs.c would not successfully decode a zero-byte char field. > Also it would do the wrong thing for any character code that outToken had > decided to prefix with a backslash. I think you could fix both problems > like this: > > /* Read a char field (ie, one ascii character) */ > #define READ_CHAR_FIELD(fldname) \ > token = pg_strtok(&length); /* skip :fldname */ \ > token = pg_strtok(&length); /* get field value */ \ > - local_node->fldname = token[0] > + local_node->fldname = debackslash(token, length)[0] An empty token produces "<>", so just debackslashing wouldn't work. Maybe local_node->fldname = length ? token[0] : '\0'; ? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: > An empty token produces "<>", so just debackslashing wouldn't work. pg_strtok recognizes "<>" and returns length = 0, so debackslash() would produce the right answer AFAICS (admittedly, I haven't tested). But I don't really want to do it like that because of the wasted palloc space and cycles. > Maybe > local_node->fldname = length ? token[0] : '\0'; > ? Doesn't cope with backslash-quoted characters. If we're going to bother to do anything here, I think we ought to make it reversible for all possible characters. regards, tom lane
On 6/18/17 10:14, Tom Lane wrote: > pg_strtok recognizes "<>" and returns length = 0, so debackslash() > would produce the right answer AFAICS (admittedly, I haven't tested). > But I don't really want to do it like that because of the wasted > palloc space and cycles. > >> Maybe >> local_node->fldname = length ? token[0] : '\0'; >> ? > > Doesn't cope with backslash-quoted characters. If we're going to bother > to do anything here, I think we ought to make it reversible for all > possible characters. Makes sense. Updated patch attached. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: > On 6/18/17 10:14, Tom Lane wrote: >> Doesn't cope with backslash-quoted characters. If we're going to bother >> to do anything here, I think we ought to make it reversible for all >> possible characters. > Makes sense. Updated patch attached. That looks sane to me, though I've still not actually tested any interesting cases. regards, tom lane
On 6/21/17 23:40, Tom Lane wrote: > Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: >> On 6/18/17 10:14, Tom Lane wrote: >>> Doesn't cope with backslash-quoted characters. If we're going to bother >>> to do anything here, I think we ought to make it reversible for all >>> possible characters. > >> Makes sense. Updated patch attached. > > That looks sane to me, though I've still not actually tested any > interesting cases. I have committed this. I had to build a bunch more stuff around it to be able to test it, which I will tidy up and submit at some later point. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services