Thread: ORDER BY in prepared statements

ORDER BY in prepared statements

From
Bryn Jeffries
Date:
In a number of places on the web I've seen it claimed that ordering can be set via prepared statements. Indeed, the expected syntax is accepted on my 9.3 server without errors:

sandbox=# CREATE TABLE test (
id serial PRIMARY KEY,
gender char
);

sandbox=# INSERT INTO test(gender)  VALUES('m') VALUES('f') VALUES('m') VALUES('f') VALUES('m');

sandbox=# PREPARE testplan(text) AS
SELECT * FROM test ORDER BY $1;

But the output is not what one would expect:

sandbox=# EXECUTE testplan('gender');
id | gender
----+--------
  1 | m
  2 | f
  3 | m
  4 | f
  5 | m
  6 | f
(6 rows)

As opposed to:
sandbox=# SELECT * FROM test ORDER BY gender;
 id | gender
----+--------
  2 | f
  4 | f
  6 | f
  1 | m
  3 | m
  5 | m
(6 rows)

It would seem that the ORDER BY clause is simply ignored in the prepared statement. Is this deliberate behaviour? I can well understand that supporting this kind of query would be tricky, but it would be very handy.

Many thanks,

Bryn

Re: ORDER BY in prepared statements

From
Paul Jungwirth
Date:
> In a number of places on the web I've seen it claimed that ordering can be
> set via prepared statements.
> ...
> sandbox=# PREPARE testplan(text) AS
> SELECT * FROM test ORDER BY $1;
>
> But the output is not what one would expect:
>
> sandbox=# EXECUTE testplan('gender');
> ...
> As opposed to:
> sandbox=# SELECT * FROM test ORDER BY gender;

Your prepared statement version is actually comparable to this SQL:

    SELECT * FROM test ORDER BY 'gender'

which is effectually ordering by random.

I'm not sure how to make a prepared statement that lets you name a
column when you execute it. Maybe someone else can chime in if that's
possible.

Paul


--
_________________________________
Pulchritudo splendor veritatis.


Re: ORDER BY in prepared statements

From
Adrian Klaver
Date:
On 01/21/2015 12:51 PM, Bryn Jeffries wrote:
> In a number of places on the web I've seen it claimed that ordering can
> be set via prepared statements.

Can you give a link to one of those examples?

>
> Many thanks,
>
> Bryn


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: ORDER BY in prepared statements

From
David G Johnston
Date:
Paul Jungwirth wrote
>> In a number of places on the web I've seen it claimed that ordering can
>> be
>> set via prepared statements.
>> ...
>> sandbox=# PREPARE testplan(text) AS
>> SELECT * FROM test ORDER BY $1;
>>
>> But the output is not what one would expect:
>>
>> sandbox=# EXECUTE testplan('gender');
>> ...
>> As opposed to:
>> sandbox=# SELECT * FROM test ORDER BY gender;
>
> Your prepared statement version is actually comparable to this SQL:
>
>     SELECT * FROM test ORDER BY 'gender'
>
> which is effectually ordering by random.
>
> I'm not sure how to make a prepared statement that lets you name a
> column when you execute it. Maybe someone else can chime in if that's
> possible.
>
> Paul

You cannot.  By definition parameters, in this context, are values - not
identifiers.  Queries with variable identifiers are called "dynamic SQL" and
can only be realized via the EXECUTE statement in pl/pgsql.  Yes, same name
different behavior because it is a different language.

http://www.postgresql.org/docs/9.4/interactive/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN

It too has a "prepare" capability (USING) that is also limited to data
values and not identifiers.  Basically what this gives you is an
easy-to-access language and structure (i.e., function) to execute dynamic
SQL.  You can accomplish the same thing in whatever language and client
library you are using by creating a dynamic SQL statement to pass to SQL
PREPARE.

In both situations there is no way for the planner to plan and cache a
single query whose order by column varies.  No matter what you do at best
you can have a single plan for each explicit order by column that you wish
to specify.

David J.







--
View this message in context: http://postgresql.nabble.com/ORDER-BY-in-prepared-statements-tp5834944p5834948.html
Sent from the PostgreSQL - general mailing list archive at Nabble.com.


Re: ORDER BY in prepared statements

From
Bryn Jeffries
Date:
Sorry, I can't find any now. It's cropped up in a few forums, in the context of executing queries from web services. Clearly not significantly enough to show up in Google...
----- Reply message -----
From: "Adrian Klaver" <adrian.klaver@aklaver.com>
To: "Bryn Jeffries" <bryn.jeffries@sydney.edu.au>, "pgsql-general@postgresql.org" <pgsql-general@postgresql.org>
Subject: [GENERAL] ORDER BY in prepared statements
Date: Thu, Jan 22, 2015 08:18

On 01/21/2015 12:51 PM, Bryn Jeffries wrote:
> In a number of places on the web I've seen it claimed that ordering can
> be set via prepared statements.

Can you give a link to one of those examples?

>
> Many thanks,
>
> Bryn


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: ORDER BY in prepared statements

From
Bryn Jeffries
Date:
Paul Jungwirth wrote
> I'm not sure how to make a prepared statement that lets you name a
> column when you execute it. Maybe someone else can chime in if that's
> possible.

David J. responded
> You cannot.  By definition parameters, in this context, are values - not
> identifiers.
> [...]
> In both situations there is no way for the planner to plan and cache a
> single query whose order by column varies.  No matter what you do at best
> you can have a single plan for each explicit order by column that you wish
> to specify.

That's what I'd figured. The motivation to use prepared statements in
application layers is not so much having a single plan but more the
insulation from SQL injection. The intent of the given ORDER BY example was
to restricts inputs to valid identifiers rather than part of the query
expression.

Maybe what we need in ODBC libs and the like is a "protected
statement" that follows the same construction as a prepared statement but
additionally checks catalogs to validate identifiers.

Bryn

Re: ORDER BY in prepared statements

From
David Johnston
Date:
On Wed, Jan 21, 2015 at 4:09 PM, Bryn Jeffries <bryn.jeffries@sydney.edu.au> wrote:
Paul Jungwirth wrote
> I'm not sure how to make a prepared statement that lets you name a
> column when you execute it. Maybe someone else can chime in if that's
> possible.

David J. responded
> You cannot.  By definition parameters, in this context, are values - not
> identifiers.
> [...]
> In both situations there is no way for the planner to plan and cache a
> single query whose order by column varies.  No matter what you do at best
> you can have a single plan for each explicit order by column that you wish
> to specify.

That's what I'd figured. The motivation to use prepared statements in
application layers is not so much having a single plan but more the
insulation from SQL injection. The intent of the given ORDER BY example was
to restricts inputs to valid identifiers rather than part of the query
expression.

Maybe what we need in ODBC libs and the like is a "protected
statement" that follows the same construction as a prepared statement but
additionally checks catalogs to validate identifiers.

Bryn

​The canonical way to do this, in reasonably recent PostgreSQL versions, is to wrap your desired dynamic SQL statement in a function.  Within that function construct the SQL string with the assistance of the "format(...)" function.  That function has specific placeholders for literals and identifiers that will ensure that the constructed SQL string is built in a safe manner.


Then you call the function and pass in the arguments are value parameters; which the function then converts into either literal or identifiers as instructed to by the format expression.

David J.

Re: ORDER BY in prepared statements

From
Adrian Klaver
Date:
On 01/21/2015 03:09 PM, Bryn Jeffries wrote:
> Paul Jungwirth wrote
>> I'm not sure how to make a prepared statement that lets you name a
>> column when you execute it. Maybe someone else can chime in if that's
>> possible.
>
> David J. responded
>> You cannot.  By definition parameters, in this context, are values - not
>> identifiers.
>> [...]
>> In both situations there is no way for the planner to plan and cache a
>> single query whose order by column varies.  No matter what you do at best
>> you can have a single plan for each explicit order by column that you wish
>> to specify.
>
> That's what I'd figured. The motivation to use prepared statements in
> application layers is not so much having a single plan but more the
> insulation from SQL injection. The intent of the given ORDER BY example was
> to restricts inputs to valid identifiers rather than part of the query
> expression.


In addition to what David said, applications/frameworks may provide that
functionality. For example in Django:

https://docs.djangoproject.com/en/1.7/ref/models/querysets/#order-by


>
> Maybe what we need in ODBC libs and the like is a "protected
> statement" that follows the same construction as a prepared statement but
> additionally checks catalogs to validate identifiers.
>
> Bryn
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: ORDER BY in prepared statements

From
David Johnston
Date:

On Wed, Jan 21, 2015 at 4:09 PM, Bryn Jeffries
<bryn.jeffries@sydney.edu.au> wrote:

Maybe what we need in ODBC libs and the like is a "protected
statement" that follows the same construction as a prepared statement but
additionally checks catalogs to validate identifiers.

​I'm not sure whether this would actually be a feasible solution to the problem.  Note that most frameworks (well, the format solution I outlined at least) for doing identifier replacement safely require that you actually tell the system what is expected to be an identifier and what is expected to be a data value.​  The general implementation is that, in the case of PostgreSQL, double-quotes will be added to the identifier value if required to make it a valid identifier. Since any injection would rely on supply mandatory quote identifiers this solves the problem quite neatly.

​The one part I am not positive on is dealing with case-folding when using format's %I placeholder; this seems to be a documentation deficiency though I may just not have found it yet...or reasoned out the logical outcome (which I shouldn't need to do)...

Catalog lookups would be expensive to do pro-actively.  The goal is to form a safe query for the parser and let the planner deal with any identifiers that end up being invalid either through attempted injection or simply usage errors.

David J.