Thread: Hmm, nodeUnique doesn't really support backwards scan too well

Hmm, nodeUnique doesn't really support backwards scan too well

From

Tom Lane

Date:

05 August 2008, 14:07:16

In the regression database:

regression=# select distinct on (ten) ten, thousand from tenk1 order by ten, thousand;
 ten | thousand
-----+----------
   0 |        0
   1 |        1
   2 |        2
   3 |        3
   4 |        4
   5 |        5
   6 |        6
   7 |        7
   8 |        8
   9 |        9
(10 rows)

This is correct, but watch this:

regression=# begin;
BEGIN
regression=# declare c cursor for
regression-# select distinct on (ten) ten, thousand from tenk1 order by ten, thousand;
DECLARE CURSOR
regression=# fetch forward all in c;
 ten | thousand
-----+----------
   0 |        0
   1 |        1
   2 |        2
   3 |        3
   4 |        4
   5 |        5
   6 |        6
   7 |        7
   8 |        8
   9 |        9
(10 rows)

regression=# fetch backward all in c;
 ten | thousand
-----+----------
   9 |      999
   8 |      998
   7 |      997
   6 |      996
   5 |      995
   4 |      994
   3 |      993
   2 |      992
   1 |      991
   0 |      990
(10 rows)

This happens in all supported releases (and even further back;
it's broken in 7.1 which is the oldest release I have running
at the moment).

The reason is that nodeUnique claims to support backwards scan, but
what it actually delivers during backwards scanning is the last
tuple (the first-encountered one) from each group, not the first
tuple (the last-encountered one) as would be needed to maintain
consistency with the forward scan direction.

We could probably fix this by complicating the logic in ExecUnique,
but I wonder whether it wouldn't be better to just stop treating
Unique nodes as backwards-scannable.  The only reason for that
node type to exist (as opposed to using Group nodes) is that it's
simple and low-overhead.  So complicating it to support a corner case
that no one has noticed in many years might be counterproductive.
Thoughts?

            regards, tom lane

Re: Hmm, nodeUnique doesn't really support backwards scan too well

From

Simon Riggs

Date:

05 August 2008, 18:56:44

On Tue, 2008-08-05 at 13:07 -0400, Tom Lane wrote:

> We could probably fix this by complicating the logic in ExecUnique,
> but I wonder whether it wouldn't be better to just stop treating
> Unique nodes as backwards-scannable.

No problem there.

>  The only reason for that
> node type to exist (as opposed to using Group nodes) is that it's
> simple and low-overhead.  So complicating it to support a corner case
> that no one has noticed in many years might be counterproductive.
> Thoughts?

I've never seen anyone scan backwards like this at all in practical use.

I knew it was possible, but never seen it done.

It seems entirely probable nobody else has either. It's a PostgreSQL
extension, so people arriving from outside don't even know it exists,
plus its always had bugs so those in-the-know don't use it either:
http://archives.postgresql.org/pgsql-bugs/1998-06/msg00049.php

My perceptions may not match others...

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support

Re: Hmm, nodeUnique doesn't really support backwards scan too well

From

Tom Lane

Date:

05 August 2008, 19:40:30

Simon Riggs <simon@2ndquadrant.com> writes:
> I've never seen anyone scan backwards like this at all in practical use.

> I knew it was possible, but never seen it done.

> It seems entirely probable nobody else has either. It's a PostgreSQL
> extension, so people arriving from outside don't even know it exists,

Say again?  Both the SCROLL option to DECLARE CURSOR and FETCH PRIOR are
straight out of the SQL92 spec.

            regards, tom lane

Re: Hmm, nodeUnique doesn't really support backwards scan too well

From

"Jaime Casanova"

Date:

05 August 2008, 20:00:15

On 8/5/08, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > I've never seen anyone scan backwards like this at all in practical use.
>
> > I knew it was possible, but never seen it done.
>
> > It seems entirely probable nobody else has either. It's a PostgreSQL
> > extension, so people arriving from outside don't even know it exists,
>
> Say again?  Both the SCROLL option to DECLARE CURSOR and FETCH PRIOR are
> straight out of the SQL92 spec.
>

i think Simon is talking about DISTINCT ON

--=20
regards,
Jaime Casanova
Soporte y capacitaci=F3n de PostgreSQL
Guayaquil - Ecuador
Cel. (593) 87171157

Re: Hmm, nodeUnique doesn't really support backwards scan too well

From

Simon Riggs

Date:

06 August 2008, 01:11:53

On Tue, 2008-08-05 at 18:00 -0500, Jaime Casanova wrote:
> On 8/5/08, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Simon Riggs <simon@2ndquadrant.com> writes:
> > > I've never seen anyone scan backwards like this at all in practical use.
> >
> > > I knew it was possible, but never seen it done.
> >
> > > It seems entirely probable nobody else has either. It's a PostgreSQL
> > > extension, so people arriving from outside don't even know it exists,
> >
> > Say again?  Both the SCROLL option to DECLARE CURSOR and FETCH PRIOR are
> > straight out of the SQL92 spec.

Yep. I was talking about FETCH BACKWARDS n | ALL
though forgot that this is the same thing as FETCH PRIOR.

But even use of FETCH PRIOR is fairly rare, so IMHO your proposal is
still safe.

> i think Simon is talking about DISTINCT ON

No, don't confuse things!

That is rare also, but I know of various places that is used.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support

Re: Hmm, nodeUnique doesn't really support backwards scan too well

From

Gregory Stark

Date:

06 August 2008, 11:21:34

"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> We could probably fix this by complicating the logic in ExecUnique,
> but I wonder whether it wouldn't be better to just stop treating
> Unique nodes as backwards-scannable.  The only reason for that
> node type to exist (as opposed to using Group nodes) is that it's
> simple and low-overhead.  So complicating it to support a corner case
> that no one has noticed in many years might be counterproductive.
> Thoughts?

Hm, that has the nasty side effect that someone who uses SCROLL but doesn't
fetch backwards much or at all suddenly gets a much more expensive plan than
if they didn't.

On the other hand someone who does actually use the scrollability of the
cursor to fetch forward and backwards a lot, repeatedly fetching the same
records, would actually get significantly better performance out of a
materialized result than having to skip over the duplicates repeatedly.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!

Re: Hmm, nodeUnique doesn't really support backwards scan too well

From

Tom Lane

Date:

06 August 2008, 11:40:04

Gregory Stark <stark@enterprisedb.com> writes:
> Hm, that has the nasty side effect that someone who uses SCROLL but doesn't
> fetch backwards much or at all suddenly gets a much more expensive plan than
> if they didn't.

Well, what are they using SCROLL for if they don't need it?

A more plausible objection is that previously, (some) cursors using
SELECT DISTINCT would support backwards fetch even if you hadn't said
SCROLL.  The bug I'm concerned about is only manifest with SELECT
DISTINCT ON, so someone might well be happily using DISTINCT in a way
that is affected.  So there might be apps out there that are working
today and will stop working after this change.  But they are very
clearly breaking the rules so I don't have a huge amount of sympathy for
them.  If we were to take this argument seriously, it would mean that
we'd have to not only complicate ExecUnique but back-patch the result
clear back to 7.4.  I'm not even sure how to fix it (the nasty case
is changing directions partway through the scan); let alone how to
fix it in a way that's obviously enough right to make me feel
comfortable in back-patching.

            regards, tom lane

Re: Hmm, nodeUnique doesn't really support backwards scan too well

From

Gregory Stark

Date:

06 August 2008, 16:28:26

"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> Gregory Stark <stark@enterprisedb.com> writes:
>> Hm, that has the nasty side effect that someone who uses SCROLL but doesn't
>> fetch backwards much or at all suddenly gets a much more expensive plan than
>> if they didn't.
>
> Well, what are they using SCROLL for if they don't need it?
>
> A more plausible objection is that previously, (some) cursors using
> SELECT DISTINCT would support backwards fetch even if you hadn't said
> SCROLL.  The bug I'm concerned about is only manifest with SELECT
> DISTINCT ON, so someone might well be happily using DISTINCT in a way
> that is affected.  So there might be apps out there that are working
> today and will stop working after this change.

I must be missing something. Couldn't we just make the paths non-reversible if
there's a DISTINCT ON?

> But they are very clearly breaking the rules so I don't have a huge amount
> of sympathy for them. If we were to take this argument seriously, it would
> mean that we'd have to not only complicate ExecUnique but back-patch the
> result clear back to 7.4. I'm not even sure how to fix it (the nasty case is
> changing directions partway through the scan); let alone how to fix it in a
> way that's obviously enough right to make me feel comfortable in
> back-patching.

It seems like the obvious fix is to just reverse the behaviour -- keep reading
backwards until you see the level break then return the previous record from a
second slot.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

Re: Hmm, nodeUnique doesn't really support backwards scan too well

From

Tom Lane

Date:

06 August 2008, 16:38:00

Gregory Stark <stark@enterprisedb.com> writes:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>> ...  I'm not even sure how to fix it (the nasty case is
>> changing directions partway through the scan); let alone how to fix it in a
>> way that's obviously enough right to make me feel comfortable in
>> back-patching.

> It seems like the obvious fix is to just reverse the behaviour -- keep
> reading backwards until you see the level break then return the
> previous record from a second slot.

Well, if you think it's easy, the best form of criticism is a patch.
The change-of-direction problem seems to me to be messy --- not
insoluble, but messy enough to need beta testing.

            regards, tom lane

Re: Hmm, nodeUnique doesn't really support backwards scan too well

From

Gregory Stark

Date:

07 August 2008, 16:46:53

"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> Gregory Stark <stark@enterprisedb.com> writes:
>> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>>> ...  I'm not even sure how to fix it (the nasty case is
>>> changing directions partway through the scan); let alone how to fix it in a
>>> way that's obviously enough right to make me feel comfortable in
>>> back-patching.
>
>> It seems like the obvious fix is to just reverse the behaviour -- keep
>> reading backwards until you see the level break then return the
>> previous record from a second slot.
>
> Well, if you think it's easy, the best form of criticism is a patch.
> The change-of-direction problem seems to me to be messy --- not
> insoluble, but messy enough to need beta testing.

Hm, I must have misunderstood the bug because there's a comment in nodeUnique
which claims it already does precisely what I was suggesting:

     * We return the first tuple from each group of duplicates (or the last
     * tuple of each group, when moving backwards).  At either end of the
     * subplan, clear the result slot so that we correctly return the
     * first/last tuple when reversing direction.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

Re: Hmm, nodeUnique doesn't really support backwards scan too well

From

Tom Lane

Date:

07 August 2008, 16:50:44

Gregory Stark <stark@enterprisedb.com> writes:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>> Well, if you think it's easy, the best form of criticism is a patch.
>> The change-of-direction problem seems to me to be messy --- not
>> insoluble, but messy enough to need beta testing.

> Hm, I must have misunderstood the bug because there's a comment in nodeUnique
> which claims it already does precisely what I was suggesting:

>      * We return the first tuple from each group of duplicates (or the last
>      * tuple of each group, when moving backwards).  At either end of the
>      * subplan, clear the result slot so that we correctly return the
>      * first/last tuple when reversing direction.

That's what it *used* to say.  But the problem is that that's the wrong
behavior, because you get different tuples returned depending on which way
you are traveling.  It's only workable if the tuples in a group are
completely indistinguishable.

            regards, tom lane

Re: Hmm, nodeUnique doesn't really support backwards scan too well

From

Gregory Stark

Date:

07 August 2008, 16:59:54

"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> Gregory Stark <stark@enterprisedb.com> writes:
>> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>>> Well, if you think it's easy, the best form of criticism is a patch.
>>> The change-of-direction problem seems to me to be messy --- not
>>> insoluble, but messy enough to need beta testing.
>
>> Hm, I must have misunderstood the bug because there's a comment in nodeUnique
>> which claims it already does precisely what I was suggesting:
>
>>      * We return the first tuple from each group of duplicates (or the last
>>      * tuple of each group, when moving backwards).  At either end of the
>>      * subplan, clear the result slot so that we correctly return the
>>      * first/last tuple when reversing direction.
>
> That's what it *used* to say.  But the problem is that that's the wrong
> behavior, because you get different tuples returned depending on which way
> you are traveling.  It's only workable if the tuples in a group are
> completely indistinguishable.

Oh egads. I see what it's trying to say now. I assumed it meant it worked
*properly* meaning it returned the "last tuple of each group" returned by the
child node as it scanned backwards.

What it actually means it say is that it is *intentionally* behaving
incorrectly! It's returning the last tuple of the set as it scans backward
meaning the first tuple that comes out scanning backwards.

Sigh.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!