Re: FETCH FIRST clause PERCENT option - Mailing list pgsql-hackers

From Surafel Temesgen
Subject Re: FETCH FIRST clause PERCENT option
Date
Msg-id CALAY4q950E1veZENqCB9akHqK8KbC0KkCjOXVWRZaa0-ku8R7w@mail.gmail.com
Whole thread Raw
In response to Re: FETCH FIRST clause PERCENT option  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
List pgsql-hackers


On Fri, Mar 1, 2019 at 4:33 AM Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
Hello.

At Thu, 28 Feb 2019 21:16:25 +0100, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in <fbd08ad3-5dd8-3169-6cba-38d610d7be7f@2ndquadrant.com>
> > One biggest issue seems to be we don't know the total number of

# One *of* the biggest *issues*?

> > outer tuples before actually reading a null tuple. I doubt of
> > general shortcut for that. It also seems preventing limit node
> > from just using materialized outer.
> >
>
> Sure, if you actually want all tuples, you'll have to execute the outer
> plan till completion. But that's not what I'm talking about - what if we
> only ever need to read one row from the limit?

We have no choice than once reading all tuples just to find we
are to return just one row, since estimator is not guaranteed to
be exact as required for this purpose.

> To give you a (admittedly, somewhat contrived and artificial example):
>
>     SELECT * FROM t1 WHERE id IN (
>       SELECT id FROM t2 ORDER BY x FETCH FIRST 10 PERCENT ROWS ONLY
>     );
>
> Maybe this example is bogus and/or does not really matter in practice. I
> don't know, but I've been unable to convince myself that's the case.

I see such kind of idiom common. Even in the quite simple example
above, *we* cannot tell how many tuples the inner should return
unless we actually fetch all tuples in t2. This is the same
problem with count(*).

The query is equivalent to the folloing one.

 SELECT * FROM t1 WHERE id IN (
   SELECT id FROM t2 ORDER BY x
     FETCH FIRST (SELECT ceil(count(*) * 0.1) FROM t2) ROWS ONLY
 );

This scans t2 twice, but this patch does only one full scan
moving another partial scan to tuplestore. We would win if the
outer is complex enough.

Okay here is the previous implementation with uptread review comment
included and it also consider OFFSET clause in percentage calculation

regards
Surafel 
Attachment

pgsql-hackers by date:

Previous
From: Etsuro Fujita
Date:
Subject: Re: Problems with plan estimates in postgres_fdw
Next
From: David Rowley
Date:
Subject: Re: NOT IN subquery optimization