Re: [BUGS] Nasty tsvector can make dumps unrestorable - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: [BUGS] Nasty tsvector can make dumps unrestorable
Date
Msg-id 200711100319.lAA3Ji219846@momjian.us
Responses Re: [BUGS] Nasty tsvector can make dumps unrestorable  (Andrew Dunstan <andrew@dunslane.net>)
Re: [BUGS] Nasty tsvector can make dumps unrestorable  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane wrote:
> Stuart Bishop <stuart@stuartbishop.net> writes:
> > The attached script creates a tsvector with a value that can be dumped using
> > pg_dump, but not loaded again using pg_restore. This causes restores of a
> > dump containing this value to fail.
>
> Hmm, sorta looks like tsvectorout should be doubling backslashes?

I think the larger question is why tsvectorin() requires double
backslashes.  As far as I can tell from the code and the regression
tests, they are there to escape single quotes inside phrases:

    SELECT E'''1 \\''2'' 3'::tsvector;
      tsvector
    -------------
     '3' '1 ''2'
    (1 row)

My guess is that the '' is used to start/stop phrases, and \\'' puts a
literal '' in the phrase.
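
To see what tsvectorin() actually receives in the example above, it helps to strip the SQL string-literal layer first. The following is a minimal sketch of that step only, not PostgreSQL code: unescape_e_literal is a hypothetical name, and it assumes the only escapes in play are the three used in this message ('' and \\ and \').

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of PostgreSQL: undo the quoting of the
 * body of an E'...' literal (the text between the outer quotes).
 * Only '' and backslash-plus-character are handled; escapes such as \n,
 * \t or octal sequences are out of scope for this sketch and would be
 * copied as the bare character after the backslash.
 * dst needs room for strlen(src) + 1 bytes.
 */
static void
unescape_e_literal(const char *src, char *dst)
{
    assert(src != NULL && dst != NULL);

    while (*src)
    {
        if (src[0] == '\'' && src[1] == '\'')
        {
            *dst++ = '\'';      /* '' is one literal single quote */
            src += 2;
        }
        else if (src[0] == '\\' && src[1] != '\0')
        {
            *dst++ = src[1];    /* \\ gives \, and \' gives ' */
            src += 2;
        }
        else
            *dst++ = *src++;    /* ordinary character */
    }
    *dst = '\0';
}
```

Under these assumptions, the literal E'''1 \\''2'' 3' reaches the tsvector input function as the text '1 \'2' 3, so only one backslash and single (not doubled) quotes arrive at the type's own parser.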

I have developed the attached patch which doubles backslashes on output:

    test=> INSERT INTO Foo(bar) VALUES (E'\\\\x');
    INSERT 0 1
    test=> select * from foo;
      bar
    -------
     '\\x'
    (1 row)
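
The output rule the patch aims for can be sketched outside the backend. This is a standalone illustration under simplifying assumptions, not the code in tsvector.c: escape_lexeme is a hypothetical name, and it compares plain single-byte chars instead of going through t_iseq() and the multibyte length handling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of PostgreSQL: write the NUL-terminated
 * lexeme src to dst in quoted form, doubling every single quote and
 * every backslash.  dst needs room for 2 * strlen(src) + 3 bytes.
 * Returns the number of bytes written, excluding the terminating NUL.
 */
static size_t
escape_lexeme(const char *src, char *dst)
{
    char       *out = dst;
    size_t      len;
    size_t      i;

    assert(src != NULL && dst != NULL);
    len = strlen(src);

    *out++ = '\'';              /* opening quote */
    for (i = 0; i < len; i++)
    {
        if (src[i] == '\'' || src[i] == '\\')
            *out++ = src[i];    /* emit the special character one extra time */
        *out++ = src[i];
    }
    *out++ = '\'';              /* closing quote */
    *out = '\0';

    return (size_t) (out - dst);
}
```

With this rule a stored lexeme consisting of a backslash and x is written as '\\x', matching the output shown above, and a lexeme consisting of a single quote and x is written as '''x'.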

However, I am still unclear whether the dump code is correct, because I
don't see the backslash preserved in the \\'' case, only in the \\\\ case:

    test=> CREATE TABLE Foo(bar tsvector);
    CREATE TABLE
    test=> INSERT INTO Foo(bar) VALUES (E'\\''x');
    INSERT 0 1
    test=> select * from foo;
      bar
    -------
     '''x'
    (1 row)

and pg_dump outputs:

    COPY foo (bar) FROM stdin;
    '''x'
    \.


While the COPY will load into the table, this INSERT doesn't:

    test=> INSERT INTO Foo(bar) VALUES (E'''''x');
    ERROR:  syntax error in tsvector: "''x"

I am confused.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: src/backend/utils/adt/tsvector.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/adt/tsvector.c,v
retrieving revision 1.6
diff -c -c -r1.6 tsvector.c
*** src/backend/utils/adt/tsvector.c    23 Oct 2007 00:51:23 -0000    1.6
--- src/backend/utils/adt/tsvector.c    9 Nov 2007 23:59:06 -0000
***************
*** 345,350 ****
--- 345,352 ----

              if (t_iseq(curin, '\''))
                  *curout++ = '\'';
+             else if (t_iseq(curin, '\\'))
+                 *curout++ = '\\';

              while (len--)
                  *curout++ = *curin++;
