Thread: PQescapeStringConn

PQescapeStringConn

From: Scott Frankel
Date:

Hi all,

What's the best way to insert long strings that contain numerous special characters into a PG database?  

I'm assuming that importing an SQL script with prepared statements is the way to go.  If so, how to escape all the special characters?

I've found documentation on PQescapeStringConn but haven't found any examples of it in use. 

I have a number of very long strings that each contain many instances of semi-colons, single quotes, forward and back slashes, etc.  I'm looking for an efficient and safe way to write them to my db using a prepared statement.

An example follows.

Thanks in advance!
Scott


CREATE TABLE foo (
foo_id SERIAL PRIMARY KEY,
name VARCHAR(32) UNIQUE NOT NULL,
description TEXT,
body TEXT DEFAULT NULL,
created timestamp DEFAULT CURRENT_TIMESTAMP);


PREPARE fooprep (VARCHAR(32), text, text) AS
    INSERT INTO foo (name, description, body) VALUES ($1, $2, $3);
EXECUTE fooprep('foo1', 'this is foo1', 

'#!()[]{};
qwe'poi'asdlkj"zxcmnb";
/\1\2\3\4\5\6\7\8\9/'

);



Re: PQescapeStringConn

From: Richard Huxton
Date:
On 30/07/10 07:52, Scott Frankel wrote:
> I have a number of very long strings that each contain many instances of
> semi-colons, single quotes, forward and back slashes, etc. I'm looking
> for an efficient and safe way to write them to my db using a prepared
> statement.

What language? From "C"?

> PREPARE fooprep (VARCHAR(32), text, text) AS
> INSERT INTO foo (name, description, body) VALUES ($1, $2, $3);
> EXECUTE fooprep('foo1', 'this is foo1',

This is basically PQprepare+PQexecPrepared, or PQexecParams if you want
to do both in one step. There is no need to escape strings if they are
passed as parameters - the library knows it's a string and handles that
for you.
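
For instance, from Python it might look something like this sketch with
psycopg2 (one library built on libpq; the connection details here are
assumed):

import psycopg2  # Python's libpq-based driver (assumed available)

conn = psycopg2.connect("dbname=test")  # connection string assumed
cur = conn.cursor()

# The values travel as parameters, so nothing needs escaping by hand.
cur.execute(
    "INSERT INTO foo (name, description, body) VALUES (%s, %s, %s)",
    ("foo1", "this is foo1", "#!()[]{}; qwe'poi'asdlkj\"zxcmnb\"; /\\1\\2\\3/"),
)
conn.commit()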

Where you *do* have to worry about escaping strings is if you are
building up a query and have e.g. a varying table-name. It's legal for
table names to contain spaces etc. but they need to be quoted correctly.
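
Again from Python, psycopg2's sql module (assuming a version that
includes it) can build such a query safely; a sketch:

import psycopg2
from psycopg2 import sql

conn = psycopg2.connect("dbname=test")  # connection string assumed
cur = conn.cursor()

table = "my odd table"  # contains spaces, so it needs quoting

# sql.Identifier quotes the table name; the value still goes in as a parameter.
cur.execute(
    sql.SQL("INSERT INTO {} (name) VALUES (%s)").format(sql.Identifier(table)),
    ("foo1",),
)
conn.commit()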

Every application language will have its own library, but they all have
a similar prepare+exec option (and I think most use the "C" libpq
interface underneath).

--
   Richard Huxton
   Archonet Ltd

Re: PQescapeStringConn

From: Scott Frankel
Date:
On Jul 30, 2010, at 1:13 AM, Richard Huxton wrote:

> On 30/07/10 07:52, Scott Frankel wrote:
>> I have a number of very long strings that each contain many
>> instances of
>> semi-colons, single quotes, forward and back slashes, etc. I'm
>> looking
>> for an efficient and safe way to write them to my db using a prepared
>> statement.
>
> What language? From "C"?

Importing an SQL script, e.g.: \i my_script_of_prepared_statements.sql


>
>> PREPARE fooprep (VARCHAR(32), text, text) AS
>> INSERT INTO foo (name, description, body) VALUES ($1, $2, $3);
>> EXECUTE fooprep('foo1', 'this is foo1',

The full statement (below) illustrates the problem I'm encountering.
The text I'm trying to insert has single quotes and semi-colons in
it.  These get interpreted, causing errors.  I'm looking for a way to
insert strings with special characters into my db, hopefully avoiding
having to escape each one by hand.  (They are numerous and the strings
quite long.)  For example:

    INSERT INTO foo (name, body) VALUES ('foo', 'this will fail 'fer
sher;' on the characters inside the string');

Thanks again!
Scott


PREPARE fooprep (VARCHAR(32), text, text) AS
     INSERT INTO foo (name, description, body) VALUES ($1, $2, $3);
EXECUTE fooprep('foo1', 'this is foo1',
'#!()[]{};
qwe'poi'asdlkj"zxcmnb";
/\1\2\3\4\5\6\7\8\9/'
);




> This is basically PQprepare+PQexecPrepared, or PQexecParams if you
> want to do both in one step. There is no need to escape strings if
> they are passed as parameters - the library knows it's a string and
> handles that for you.
>
> Where you *do* have to worry about escaping strings is if you are
> building up a query and have e.g. a varying table-name. It's legal
> for table names to contain spaces etc. but they need to be quoted
> correctly.
>
> Every application language will have its own library, but they all
> have a similar prepare+exec option (and I think most use the "C"
> libpq interface underneath).
>
> --
>  Richard Huxton
>  Archonet Ltd
>


Re: PQescapeStringConn

From: Richard Huxton
Date:
On 30/07/10 16:57, Scott Frankel wrote:
>
> On Jul 30, 2010, at 1:13 AM, Richard Huxton wrote:
>
>> On 30/07/10 07:52, Scott Frankel wrote:
>>> I have a number of very long strings that each contain many instances of
>>> semi-colons, single quotes, forward and back slashes, etc. I'm looking
>>> for an efficient and safe way to write them to my db using a prepared
>>> statement.
>>
>> What language? From "C"?
>
> Importing an SQL script, e.g.: \i my_script_of_prepared_statements.sql

> The full statement (below) illustrates the problem I'm encountering.

> INSERT INTO foo (name, body) VALUES ('foo', 'this will fail 'fer sher;'
> on the characters inside the string');

Ah - the solution is: don't do that.

You're going to have to pre-process the strings in some way, or there
will always be the chance of problems. Probably the best way to handle a
bulk insert is through the COPY command:

BEGIN;

COPY foo (name, body) FROM stdin;
n1    b1
n2    b2
foo    this will fail 'fer sher;' on the characters inside the string
\.

COMMIT;

By default COPY expects one line per row, with columns separated by tab
characters. You can also have '/path/to/file/name' instead of stdin, but
the file will need to be accessible from the backend process. If that's
not the case (and it probably isn't) then you want to use psql's "\copy"
variant which views the world from the client end of things.

COPY is faster than separate inserts, and the only characters you need to
worry about are backslash, tab, carriage-return and newline. These would
be replaced by the sequences "\\", "\t", "\r", "\n".

I don't know what format your strings are in initially, but a bit of
perl/python/ruby can easily tidy them up.
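
For instance, a minimal Python sketch (the rows and output path here are
made up for illustration):

def copy_escape(value):
    """Escape a string for COPY's text format: backslash first, then
    tab, newline and carriage-return."""
    return (value.replace("\\", "\\\\")
                 .replace("\t", "\\t")
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))

rows = [("foo", "this will fail 'fer sher;' on the characters inside the string")]

with open("foo.copy", "w") as f:  # output path assumed
    for name, body in rows:
        f.write("\t".join(copy_escape(col) for col in (name, body)) + "\n")

The resulting file can then be loaded with COPY, or with psql's \copy
from the client side.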

Finally, more recent versions of PG have a COPY that supports CSV
formatting too. See the manuals for more details on this.
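
If CSV is easier to produce, Python's csv module takes care of the
quoting; a sketch (the path is assumed):

import csv

rows = [("foo", "a body with 'quotes';\ttabs and\nnewlines")]

with open("foo.csv", "w", newline="") as f:  # output path assumed
    csv.writer(f).writerows(rows)  # embedded specials get quoted automatically

which should load with something like: COPY foo (name, body) FROM
'/path/to/foo.csv' WITH CSV;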

--
   Richard Huxton
   Archonet Ltd

Re: PQescapeStringConn

From: Scott Frankel
Date:

On Jul 30, 2010, at 9:11 AM, Richard Huxton wrote:

> On 30/07/10 16:57, Scott Frankel wrote:
>
>> On Jul 30, 2010, at 1:13 AM, Richard Huxton wrote:
>>
>>> On 30/07/10 07:52, Scott Frankel wrote:
>>>> I have a number of very long strings that each contain many instances of
>>>> semi-colons, single quotes, forward and back slashes, etc. I'm looking
>>>> for an efficient and safe way to write them to my db using a prepared
>>>> statement.
>
> You're going to have to pre-process the strings in some way, or there
> will always be the chance of problems. Probably the best way to handle a
> bulk insert is through the COPY command:

Excellent!  Thanks for the lead.  I see from the docs:
COPY weather FROM '/home/user/weather.txt';

I am using Python to create the strings; and Python is the ultimate consumer of the strings after storage in the db.  Thus I have a fair degree of control over the strings' formatting.  COPY from a plain text file on my server looks very promising.
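
Since Python is already in the loop, maybe something like psycopg2's
copy_from would even let me skip the intermediate file (a sketch,
assuming psycopg2 and a tab-separated buffer):

import io
import psycopg2

conn = psycopg2.connect("dbname=test")  # connection details assumed
cur = conn.cursor()

# COPY from an in-memory, tab-separated buffer on the client side.
buf = io.StringIO("foo\tthis will fail 'fer sher;' on the characters inside the string\n")
cur.copy_from(buf, "foo", sep="\t", columns=("name", "body"))
conn.commit()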

Thanks!
Scott


