Re: Fetch zero result rows when executing a query? - Mailing list pgsql-hackers

From Shay Rojansky
Subject Re: Fetch zero result rows when executing a query?
Date
Msg-id CADT4RqDTQrvHVwFG1Y0K33gESzzxUiNkO69NTf2no=NiWQfjgA@mail.gmail.com
Whole thread Raw
In response to Re: Fetch zero result rows when executing a query?  (Marko Tiikkaja <marko@joh.to>)
Responses Re: Fetch zero result rows when executing a query?  (David G Johnston <david.g.johnston@gmail.com>)
List pgsql-hackers
Sorry everyone, I was unexpectedly very busy and couldn't respond earlier... My apologies.

I'll clarify just a little... I am indeed talking about the PostgreSQL network protocol, and not about query optimization (with LIMIT or omitting RETURNING etc.). I am implementing ADO.NET's ExecuteNonQuery, through which the user indicates they're not interested in any result rows whether those exist or not. For the situation where a user does ExecuteNonQuery but the query returns result rows, the driver can save the needless network transfers. We can definitely say it's the user's fault for providing a query with a resultset to ExecuteNonQuery, but we *do* have the user's clear intention that no rows be fetched so why not act on it. I agree this isn't a terribly important optimization, the trigger for this question was first and foremost curiosity: it seems strange the protocol allows you to specify max_rows for any value other than 0.

Here's a possible believable use-case which doesn't involve user neglect: imagine some server-side function which has side-effects and also returns some rows. In some situations the user is interested in the result rows, but in others they only want the side-effect. The user would probably have no control over the function, and their only way to *not* transfer the result rows would be with a mechanism such as max_rows.

Marko, regarding your general criticism of max_rows:

> This seems to be a common pattern, and I think it's a *huge* mistake to specify maxrows=1 and/or ignore rows after the first one in
> the driver layer.  If the user says "give me the only row returned by this query", the interface should check that only one row is in
> reality returned by the query.  If the query returns more than one row, the user made a mistake in formulating the query and she
> probably wants to know about it.  If she genuinely doesn't care about the rows after the first one, she can always specify LIMIT 1.

> For a sad example, look at PL/PgSQL's  SELECT .. INTO ..; it's not terribly difficult to write a query which returns more than one
> row *by mistake* and have something really bad happen later on since it went undetected during testing because you just
> happened to get the expected row back first.  And when you do want to specifically enforce it for e.g. security critical code,
> you have to resort to really ugly hacks like window functions.

There are some problems with what you say... First, the ADO.NET API provides a SingleRow API option which explicitly provides exactly this. This API option doesn't at all mean that there *should* be only one row (i.e. an error should be raised if otherwise), but simply that any other rows beyond the first should be discarded. So regardless of what we think best practices are on this, this behavior is mandated/specified by a major API.

More to the point, doesn't max_rows=1 have exactly the same dangers as LIMIT 1? The two seem to be identical, except that one is expressed in the SQL query and the other at the network protocol level. The way I see it, if the user specifies one of them without specifying ORDER BY, they are explicitly saying they don't care which row comes out. And if their testing code fails because this is wrong, then they've made a mistake - IMHo this isn't a reason to kill the entire feature.

In general, in my view it's beneficial to separate between the SQL queries and the features that the driver is supposed to provide in its API. The SQL may be written or managed by one entity, reused in many places (some of which want all rows and others which want only 1).

pgsql-hackers by date:

Previous
From: Corey Huinker
Date:
Subject: Re: GRANT USAGE on FOREIGN SERVER exposes passwords
Next
From: David G Johnston
Date:
Subject: Re: Fetch zero result rows when executing a query?