Thread: ORDER BY in prepared statements
In a number of places on the web I've seen it claimed that ordering can be set via prepared statements. Indeed, the expected syntax is accepted on my 9.3 server without errors:
sandbox=# CREATE TABLE test (
id serial PRIMARY KEY,
gender char
);
sandbox=# INSERT INTO test(gender) VALUES('m') VALUES('f') VALUES('m') VALUES('f') VALUES('m');
sandbox=# PREPARE testplan(text) AS
SELECT * FROM test ORDER BY $1;
But the output is not what one would expect:
sandbox=# EXECUTE testplan('gender');
id | gender
----+--------
1 | m
2 | f
3 | m
4 | f
5 | m
6 | f
(6 rows)
As opposed to:
sandbox=# SELECT * FROM test ORDER BY gender;
id | gender
----+--------
2 | f
4 | f
6 | f
1 | m
3 | m
5 | m
(6 rows)
It would seem that the ORDER BY clause is simply ignored in the prepared statement. Is this deliberate behaviour? I can well understand that supporting this kind of query would be tricky, but it would be very handy.
Many thanks,
Bryn
sandbox=# CREATE TABLE test (
id serial PRIMARY KEY,
gender char
);
sandbox=# INSERT INTO test(gender) VALUES('m') VALUES('f') VALUES('m') VALUES('f') VALUES('m');
sandbox=# PREPARE testplan(text) AS
SELECT * FROM test ORDER BY $1;
But the output is not what one would expect:
sandbox=# EXECUTE testplan('gender');
id | gender
----+--------
1 | m
2 | f
3 | m
4 | f
5 | m
6 | f
(6 rows)
As opposed to:
sandbox=# SELECT * FROM test ORDER BY gender;
id | gender
----+--------
2 | f
4 | f
6 | f
1 | m
3 | m
5 | m
(6 rows)
It would seem that the ORDER BY clause is simply ignored in the prepared statement. Is this deliberate behaviour? I can well understand that supporting this kind of query would be tricky, but it would be very handy.
Many thanks,
Bryn
> In a number of places on the web I've seen it claimed that ordering can be > set via prepared statements. > ... > sandbox=# PREPARE testplan(text) AS > SELECT * FROM test ORDER BY $1; > > But the output is not what one would expect: > > sandbox=# EXECUTE testplan('gender'); > ... > As opposed to: > sandbox=# SELECT * FROM test ORDER BY gender; Your prepared statement version is actually comparable to this SQL: SELECT * FROM test ORDER BY 'gender' which is effectually ordering by random. I'm not sure how to make a prepared statement that lets you name a column when you execute it. Maybe someone else can chime in if that's possible. Paul -- _________________________________ Pulchritudo splendor veritatis.
On 01/21/2015 12:51 PM, Bryn Jeffries wrote: > In a number of places on the web I've seen it claimed that ordering can > be set via prepared statements. Can you give a link to one of those examples? > > Many thanks, > > Bryn -- Adrian Klaver adrian.klaver@aklaver.com
Paul Jungwirth wrote >> In a number of places on the web I've seen it claimed that ordering can >> be >> set via prepared statements. >> ... >> sandbox=# PREPARE testplan(text) AS >> SELECT * FROM test ORDER BY $1; >> >> But the output is not what one would expect: >> >> sandbox=# EXECUTE testplan('gender'); >> ... >> As opposed to: >> sandbox=# SELECT * FROM test ORDER BY gender; > > Your prepared statement version is actually comparable to this SQL: > > SELECT * FROM test ORDER BY 'gender' > > which is effectually ordering by random. > > I'm not sure how to make a prepared statement that lets you name a > column when you execute it. Maybe someone else can chime in if that's > possible. > > Paul You cannot. By definition parameters, in this context, are values - not identifiers. Queries with variable identifiers are called "dynamic SQL" and can only be realized via the EXECUTE statement in pl/pgsql. Yes, same name different behavior because it is a different language. http://www.postgresql.org/docs/9.4/interactive/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN It too has a "prepare" capability (USING) that is also limited to data values and not identifiers. Basically what this gives you is an easy-to-access language and structure (i.e., function) to execute dynamic SQL. You can accomplish the same thing in whatever language and client library you are using by creating a dynamic SQL statement to pass to SQL PREPARE. In both situations there is no way for the planner to plan and cache a single query whose order by column varies. No matter what you do at best you can have a single plan for each explicit order by column that you wish to specify. David J. -- View this message in context: http://postgresql.nabble.com/ORDER-BY-in-prepared-statements-tp5834944p5834948.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
Sorry, I can't find any now. It's cropped up in a few forums, in the context of executing queries from web services. Clearly not significantly enough to show up in Google...
----- Reply message -----
From: "Adrian Klaver" <adrian.klaver@aklaver.com>
To: "Bryn Jeffries" <bryn.jeffries@sydney.edu.au>, "pgsql-general@postgresql.org" <pgsql-general@postgresql.org>
Subject: [GENERAL] ORDER BY in prepared statements
Date: Thu, Jan 22, 2015 08:18
From: "Adrian Klaver" <adrian.klaver@aklaver.com>
To: "Bryn Jeffries" <bryn.jeffries@sydney.edu.au>, "pgsql-general@postgresql.org" <pgsql-general@postgresql.org>
Subject: [GENERAL] ORDER BY in prepared statements
Date: Thu, Jan 22, 2015 08:18
On 01/21/2015 12:51 PM, Bryn Jeffries wrote:
> In a number of places on the web I've seen it claimed that ordering can
> be set via prepared statements.
Can you give a link to one of those examples?
>
> Many thanks,
>
> Bryn
--
Adrian Klaver
adrian.klaver@aklaver.com
> In a number of places on the web I've seen it claimed that ordering can
> be set via prepared statements.
Can you give a link to one of those examples?
>
> Many thanks,
>
> Bryn
--
Adrian Klaver
adrian.klaver@aklaver.com
Paul Jungwirth wrote > I'm not sure how to make a prepared statement that lets you name a > column when you execute it. Maybe someone else can chime in if that's > possible. David J. responded > You cannot. By definition parameters, in this context, are values - not > identifiers. > [...] > In both situations there is no way for the planner to plan and cache a > single query whose order by column varies. No matter what you do at best > you can have a single plan for each explicit order by column that you wish > to specify. That's what I'd figured. The motivation to use prepared statements in application layers is not so much having a single plan but more the insulation from SQL injection. The intent of the given ORDER BY example was to restricts inputs to valid identifiers rather than part of the query expression. Maybe what we need in ODBC libs and the like is a "protected statement" that follows the same construction as a prepared statement but additionally checks catalogs to validate identifiers. Bryn
Paul Jungwirth wrote
> I'm not sure how to make a prepared statement that lets you name a
> column when you execute it. Maybe someone else can chime in if that's
> possible.
David J. responded
> You cannot. By definition parameters, in this context, are values - not
> identifiers.
> [...]
> In both situations there is no way for the planner to plan and cache a
> single query whose order by column varies. No matter what you do at best
> you can have a single plan for each explicit order by column that you wish
> to specify.
That's what I'd figured. The motivation to use prepared statements in
application layers is not so much having a single plan but more the
insulation from SQL injection. The intent of the given ORDER BY example was
to restricts inputs to valid identifiers rather than part of the query
expression.
Maybe what we need in ODBC libs and the like is a "protected
statement" that follows the same construction as a prepared statement but
additionally checks catalogs to validate identifiers.
Bryn
The canonical way to do this, in reasonably recent PostgreSQL versions, is to wrap your desired dynamic SQL statement in a function. Within that function construct the SQL string with the assistance of the "format(...)" function. That function has specific placeholders for literals and identifiers that will ensure that the constructed SQL string is built in a safe manner.
Then you call the function and pass in the arguments are value parameters; which the function then converts into either literal or identifiers as instructed to by the format expression.
David J.
On 01/21/2015 03:09 PM, Bryn Jeffries wrote: > Paul Jungwirth wrote >> I'm not sure how to make a prepared statement that lets you name a >> column when you execute it. Maybe someone else can chime in if that's >> possible. > > David J. responded >> You cannot. By definition parameters, in this context, are values - not >> identifiers. >> [...] >> In both situations there is no way for the planner to plan and cache a >> single query whose order by column varies. No matter what you do at best >> you can have a single plan for each explicit order by column that you wish >> to specify. > > That's what I'd figured. The motivation to use prepared statements in > application layers is not so much having a single plan but more the > insulation from SQL injection. The intent of the given ORDER BY example was > to restricts inputs to valid identifiers rather than part of the query > expression. In addition to what David said, applications/frameworks may provide that functionality. For example in Django: https://docs.djangoproject.com/en/1.7/ref/models/querysets/#order-by > > Maybe what we need in ODBC libs and the like is a "protected > statement" that follows the same construction as a prepared statement but > additionally checks catalogs to validate identifiers. > > Bryn > -- Adrian Klaver adrian.klaver@aklaver.com
Maybe what we need in ODBC libs and the like is a "protected
statement" that follows the same construction as a prepared statement but
additionally checks catalogs to validate identifiers.
I'm not sure whether this would actually be a feasible solution to the problem. Note that most frameworks (well, the format solution I outlined at least) for doing identifier replacement safely require that you actually tell the system what is expected to be an identifier and what is expected to be a data value. The general implementation is that, in the case of PostgreSQL, double-quotes will be added to the identifier value if required to make it a valid identifier. Since any injection would rely on supply mandatory quote identifiers this solves the problem quite neatly.
The one part I am not positive on is dealing with case-folding when using format's %I placeholder; this seems to be a documentation deficiency though I may just not have found it yet...or reasoned out the logical outcome (which I shouldn't need to do)...
Catalog lookups would be expensive to do pro-actively. The goal is to form a safe query for the parser and let the planner deal with any identifiers that end up being invalid either through attempted injection or simply usage errors.
David J.