Thread: FETCH FIRST clause PERCENT option

FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:

FETCH FIRST with the PERCENT option is SQL standard: it allows the limit count to be specified as a percentage in addition to specifying it as an exact count, and it is listed as one of the "Major features simply not implemented yet" on the recent wiki page [1].

I implemented it by executing the query plan twice: once without the limit filter, to find the total number of rows that will be fed to the limit node so that the exact limit count can be calculated, and then again with the limit filter using the newly calculated exact count.


[1] https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL_Standard


Regards

Surafel

Attachment

Re: FETCH FIRST clause PERCENT option

From
Andres Freund
Date:
Hi,

On 2018-08-16 17:27:45 +0300, Surafel Temesgen wrote:
> FETCH FIRST with PERCENT option is SQL standard that state limit count to
> be specified in a percentage in addition to specify it in exact count and
> listed as one of Major features simply not implemented yet in recent wiki
> page [1].
> 
> I implemented it by executing the query plan twice. One without limit
> filter to find the total number of row that will be feed to limit node so
> the exact limit count can be calculated and the query plan execute again
> with limit filter with newly calculated exact count .

Won't that have rather massive issues with multiple evaluations of
clauses in the query?  Besides being really expensive?

I think you'd have to instead spill the query results into a tuplestore
(like we do for FOR HOLD queries at end of xact), and then do the
computations based on that.
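The spill-then-count approach could be sketched roughly like this in Python (a hypothetical illustration, with a plain list standing in for the tuplestore, and using CEILING for the count as the standard turns out to require later in the thread):

```python
import math

def percent_limit_spill(source, pct):
    """Materialize the whole input first (as the tuplestore would),
    then compute the exact row count and return that prefix."""
    store = list(source)                         # stand-in for a tuplestore
    count = math.ceil(len(store) * pct / 100.0)  # CEILING(FFP * OCT / 100)
    return store[:count]
```

So `percent_limit_spill(range(1, 1001), 1)` would return the first 10 rows, at the cost of materializing all 1000 first.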

- Andres


Re: FETCH FIRST clause PERCENT option

From
Andrew Gierth
Date:
>>>>> "Andres" == Andres Freund <andres@anarazel.de> writes:

 Andres> I think you'd have to instead spill the query results into a
 Andres> tuplestore (like we do for FOR HOLD queries at end of xact),
 Andres> and then do the computations based on that.

The approach I've suggested to some people who wanted to look into this
was to inject a window function call into the query, and teach the Limit
node to be able to stop on a boolean condition (which would be based on
that function result). This method should also work for WITH TIES by
using a different window function.
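Very roughly, and with entirely hypothetical names, the mechanism could look like this: a window pass annotates every row with a boolean, and the Limit node scans until the first false value (for WITH TIES the boolean would instead come from a rank-style window function):

```python
import math

def window_annotate(rows, pct):
    # Simulates count(*) OVER () plus row_number() OVER (),
    # reduced to a single boolean "keep" column per row.
    total = len(rows)
    cutoff = math.ceil(total * pct / 100.0)
    return [(row, rn <= cutoff) for rn, row in enumerate(rows, start=1)]

def limit_until_false(annotated):
    # The Limit node stops as soon as the boolean condition turns false.
    out = []
    for row, keep in annotated:
        if not keep:
            break
        out.append(row)
    return out
```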

-- 
Andrew (irc:RhodiumToad)


Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:


On Thu, Aug 16, 2018 at 5:34 PM, Andres Freund <andres@anarazel.de> wrote:

Won't that have rather massive issues with multiple evaluations of
clauses in the query?  Besides being really expensive?

The plan is rescanned after the first execution; I can't see an issue with multiple executions of a clause in this case.


I think you'd have to instead spill the query results into a tuplestore

The downside is that all the results have to be stored even if the needed tuples are only a fraction of them, and stored for longer, so the choice becomes memory versus CPU utilization.


- Andres
Regards
Surafel

Re: FETCH FIRST clause PERCENT option

From
Andres Freund
Date:
On 2018-08-21 15:47:07 +0300, Surafel Temesgen wrote:
> On Thu, Aug 16, 2018 at 5:34 PM, Andres Freund <andres@anarazel.de> wrote:
> 
> >
> > Won't that have rather massive issues with multiple evaluations of
> > clauses in the query?  Besides being really expensive?
> >
> The plan re-scane after first execution I can’t see issue for multiple
> execution of a clause in this case

Imagine volatile functions being used. That can be problematic because
of repeated side-effects, but also will lead to plainly wrong
results. Consider what happens with your approach where somebody does
something like WHERE value < random().

Greetings,

Andres Freund


Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:


On Tue, Aug 21, 2018 at 3:50 PM Andres Freund <andres@anarazel.de> wrote:

Imagine volatile functions being used. That can be problematic because
of repeated side-effects, but also will lead to plainly wrong
results. Consider what happens with your approach where somebody does
something like WHERE value < random().
Thanks for explaining. The attached patch uses a tuplestore instead.
Regards
Surafel
Attachment

Re: FETCH FIRST clause PERCENT option

From
Erik Rijkers
Date:
On 2018-08-28 14:14, Surafel Temesgen wrote:
> On Tue, Aug 21, 2018 at 3:50 PM Andres Freund <andres@anarazel.de> 
> wrote:
> 
>> 
>> Imagine volatile functions being used. That can be problematic because
>> of repeated side-effects, but also will lead to plainly wrong
>> results. Consider what happens with your approach where somebody does
>> something like WHERE value < random().
>> 
> Thanks for explaining . The attach patch use tuplestore instead

> [fetch-wth-percent-v2.patch]

I had a quick look at this.  Apply, compile, make check are ok.

In straightforward usage it seems to work ok.


But I also ran across this crash:

create table if not exists t as
    select i from generate_series(1, 100) as f(i);

select * from t
   offset 60 rows
   fetch next 3 percent rows only
   FOR UPDATE
;

TRAP: FailedAssertion("!(slot != ((void *)0))", File: "execTuples.c", 
Line: 428)




Erik Rijkers

Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:


On Tue, Aug 28, 2018 at 7:33 PM Erik Rijkers <er@xs4all.nl> wrote:
;

TRAP: FailedAssertion("!(slot != ((void *)0))", File: "execTuples.c",
Line: 428)
The attached patch includes a fix for the crash. Can you check it again?
Regards
Surafel
Attachment

Re: FETCH FIRST clause PERCENT option

From
Thomas Munro
Date:
On Fri, Aug 31, 2018 at 11:42 PM Surafel Temesgen <surafel3000@gmail.com> wrote:
> On Tue, Aug 28, 2018 at 7:33 PM Erik Rijkers <er@xs4all.nl> wrote:
>>
>> ;
>>
>> TRAP: FailedAssertion("!(slot != ((void *)0))", File: "execTuples.c",
>> Line: 428)
>
> The attache patch include a fix for the crash .can you check it again?

FYI this fails[1] in src/test/modules/test_ddl_deparse's create_table
test because of the keyword:

  CREATE TABLE stud_emp (
  percent int4
  ) INHERITS (emp, student);
! ERROR:  syntax error at or near "percent"

[1] https://travis-ci.org/postgresql-cfbot/postgresql/builds/430777123

--
Thomas Munro
http://www.enterprisedb.com


Re: FETCH FIRST clause PERCENT option

From
Mark Dilger
Date:

> On Aug 16, 2018, at 7:34 AM, Andres Freund <andres@anarazel.de> wrote:
> 
> Hi,
> 
> On 2018-08-16 17:27:45 +0300, Surafel Temesgen wrote:
>> FETCH FIRST with PERCENT option is SQL standard that state limit count to
>> be specified in a percentage in addition to specify it in exact count and
>> listed as one of Major features simply not implemented yet in recent wiki
>> page [1].
>> 
>> I implemented it by executing the query plan twice. One without limit
>> filter to find the total number of row that will be feed to limit node so
>> the exact limit count can be calculated and the query plan execute again
>> with limit filter with newly calculated exact count .

Surafel, there are no regression tests that I can see in your patch.  It
would help if you added some, as then I could see precisely what behavior
you are expecting.  As it is, I'm just guessing here, but here goes....


> Won't that have rather massive issues with multiple evaluations of
> clauses in the query?  Besides being really expensive?
> 
> I think you'd have to instead spill the query results into a tuplestore
> (like we do for FOR HOLD queries at end of xact), and then do the
> computations based on that.


I should think that spilling anything to a tuplestore would only be needed
if the query contains an ORDER BY expression.  If you query

    FETCH FIRST 50 PERCENT * FROM foo;

you should just return every other row, discarding the rest, right?  It's
only when an explicit ordering is given that the need to store the results
arises.  Even with

    FETCH FIRST 50 PERCENT name FROM foo ORDER BY name;

you can return one row for every two rows that you get back from the
sort node, reducing the maximum number you need to store at any time to
no more than 25% of all rows.
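The every-other-row idea for the no-ORDER-BY case can be sketched like this (a hypothetical illustration for exactly 50 percent, not code from the patch):

```python
def fetch_half_streaming(source):
    """Return every other row without materializing the input:
    for 50 percent, one row out per two rows in."""
    out = []
    for i, row in enumerate(source):
        if i % 2 == 0:        # keep rows 0, 2, 4, ...; discard the rest
            out.append(row)
    return out
```

Note that keeping the even positions yields ceil(n / 2) rows for n input rows, so odd-sized inputs round up.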

Or am I missing something?

mark


Re: FETCH FIRST clause PERCENT option

From
Andres Freund
Date:
Hi,

On 2018-09-20 17:06:36 -0700, Mark Dilger wrote:
> I should think that spilling anything to a tuplestore would only be needed
> if the query contains an ORDER BY expression.  If you query
> 
>     FETCH FIRST 50 PERCENT * FROM foo;
> 
> you should just return every other row, discarding the rest, right?  It's
> only when an explicit ordering is given that the need to store the results
> arises.  Even with
> 
>     FETCH FIRST 50 PERCENT name FROM foo ORDER BY name;
> 
> you can return one row for every two rows that you get back from the
> sort node, reducing the maximum number you need to store at any time to
> no more than 25% of all rows.

I'm doubtful about the validity of these optimizations, particularly
around being surprising. But I think more importantly, we should focus
on the basic implementation that's needed anyway.

Greetings,

Andres Freund


Re: FETCH FIRST clause PERCENT option

From
Mark Dilger
Date:

> On Sep 20, 2018, at 5:29 PM, Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2018-09-20 17:06:36 -0700, Mark Dilger wrote:
>> I should think that spilling anything to a tuplestore would only be needed
>> if the query contains an ORDER BY expression.  If you query
>>
>>     FETCH FIRST 50 PERCENT * FROM foo;
>>
>> you should just return every other row, discarding the rest, right?  It's
>> only when an explicit ordering is given that the need to store the results
>> arises.  Even with
>>
>>     FETCH FIRST 50 PERCENT name FROM foo ORDER BY name;
>>
>> you can return one row for every two rows that you get back from the
>> sort node, reducing the maximum number you need to store at any time to
>> no more than 25% of all rows.
>
> I'm doubtful about the validity of these optimizations, particularly
> around being surprising. But I think more importantly, we should focus
> on the basic implementation that's needed anyway.

You may be right that getting the basic implementation finished first
is better than optimizing at this stage.  So the rest of what I'm going
to say is just in defense of the optimization, and not an argument for
needing to optimize right away.

As for reducing the surprise factor, I think that it would be surprising
if I ask for a smallish percentage of rows and it takes significantly longer
and significantly more memory or disk than asking for all the rows takes.
If I'm including an explicit ORDER BY, then that explains it, but otherwise,
I'd be surprised.  Note that I'm not saying I'd be surprised by it taking
roughly the same length of time / memory / disk.  I'd only be surprised if
it took a lot more.

There are plenty of SQL generation engines that people put in their software.
I'd expect something like

    sprintf("FETCH FIRST %d PERCENT %s FROM %s", percentage, columns, tablename)

to show up in such engines, and percentage to sometimes be 100.  At least
in that case you should just return all rows rather than dumping them into
a tuplestore.  Likewise, if the percentage is 0, you'll want to finish quickly.
Actually, I don't know if the SQL spec would require side effects to still
happen, in which case you'd still have to generate all rows for their side
effects to happen, and then just not return them.  But still, no tuplestore.
So the implementation of FETCH FIRST would at least need to think about what
percentage is being requested, rather than just mindlessly adding a node to
the tree for storing everything, then computing the LIMIT based on the number
of rows stored, and then returning that number of rows.

mark

Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:
hey

On 9/21/18, Mark Dilger <hornschnorter@gmail.com> wrote:

> Surafel, there are no regression tests that I can see in your patch.  It
> would help if you added some, as then I could precisely what behavior you
> are expecting.
Thank you for looking at it. The attached patch adds regression tests.

Attachment

Re: FETCH FIRST clause PERCENT option

From
Mark Dilger
Date:

> On Sep 25, 2018, at 5:07 AM, Surafel Temesgen <surafel3000@gmail.com> wrote:
>
> hey
>
> On 9/21/18, Mark Dilger <hornschnorter@gmail.com> wrote:
>
>> Surafel, there are no regression tests that I can see in your patch.  It
>> would help if you added some, as then I could precisely what behavior you
>> are expecting.
> thank you for looking at it .the attach patch add regression tests
> <fetch-wth-percent-v4.patch>

Surafel,

I messed around with your changes to the grammar and it seems you don't
need to add PERCENT as a reserved keyword.  Moving this to the unreserved
keyword section does not cause any shift/reduce errors, and the regression
tests still pass.  Relative to your patch v4, these changes help:

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index a97ee30924..9872cf7257 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -15186,6 +15186,7 @@ unreserved_keyword:
                        | PARTITION
                        | PASSING
                        | PASSWORD
+                       | PERCENT
                        | PLANS
                        | POLICY
                        | PRECEDING
@@ -15465,7 +15466,6 @@ reserved_keyword:
                        | ONLY
                        | OR
                        | ORDER
-                       | PERCENT
                        | PLACING
                        | PRIMARY
                        | REFERENCES
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 150d51df8f..f23fee0653 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -300,7 +300,7 @@ PG_KEYWORD("partial", PARTIAL, UNRESERVED_KEYWORD)
 PG_KEYWORD("partition", PARTITION, UNRESERVED_KEYWORD)
 PG_KEYWORD("passing", PASSING, UNRESERVED_KEYWORD)
 PG_KEYWORD("password", PASSWORD, UNRESERVED_KEYWORD)
-PG_KEYWORD("percent", PERCENT, RESERVED_KEYWORD)
+PG_KEYWORD("percent", PERCENT, UNRESERVED_KEYWORD)
 PG_KEYWORD("placing", PLACING, RESERVED_KEYWORD)
 PG_KEYWORD("plans", PLANS, UNRESERVED_KEYWORD)
 PG_KEYWORD("policy", POLICY, UNRESERVED_KEYWORD)



Re: FETCH FIRST clause PERCENT option

From
Mark Dilger
Date:

> On Sep 25, 2018, at 8:08 AM, Mark Dilger <hornschnorter@gmail.com> wrote:
>
>
>
>> On Sep 25, 2018, at 5:07 AM, Surafel Temesgen <surafel3000@gmail.com> wrote:
>>
>> hey
>>
>> On 9/21/18, Mark Dilger <hornschnorter@gmail.com> wrote:
>>
>>> Surafel, there are no regression tests that I can see in your patch.  It
>>> would help if you added some, as then I could precisely what behavior you
>>> are expecting.
>> thank you for looking at it .the attach patch add regression tests
>> <fetch-wth-percent-v4.patch>
>
> Surafel,
>
> I messed around with your changes to the grammar and it seems you don't
> need to add PERCENT as a reserved keyword.  Moving this to the unreserved
> keyword section does not cause any shift/reduce errors, and the regression
> tests still pass.  Relative to your patch v4, these changes help:

I spoke too soon.  The main regression tests pass, but your change to
src/test/modules/test_ddl_deparse/sql/create_table.sql per Thomas's
suggestion is no longer needed, since PERCENT no longer needs to be
quoted.

I recommend you also apply the following to your v4 patch, which just
rolls back that one change you made, and at least for me, is enough
to get `make check-world` to pass:

diff --git a/src/test/modules/test_ddl_deparse/sql/create_table.sql
b/src/test/modules/test_ddl_deparse/sql/create_table.sql
index 4325de2d04..5e78452729 100644
--- a/src/test/modules/test_ddl_deparse/sql/create_table.sql
+++ b/src/test/modules/test_ddl_deparse/sql/create_table.sql
@@ -94,7 +94,7 @@ CREATE TABLE student (
 ) INHERITS (person);

 CREATE TABLE stud_emp (
-       "percent"       int4
+       percent         int4
 ) INHERITS (emp, student);




Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:




I messed around with your changes to the grammar and it seems you don't
need to add PERCENT as a reserved keyword.  Moving this to the unreserved
keyword section does not cause any shift/reduce errors, and the regression
tests still pass.  Relative to your patch v4, these changes help:


In the SQL standard PERCENT is listed as a reserved word, so I don't think it is something I can change.
I think the main reason create_table fails in test_ddl_deparse is that I didn't quote the PERCENT keyword in
/src/test/regress/expected/create_table.out too

regards
Surafel
Attachment

Re: FETCH FIRST clause PERCENT option

From
Vik Fearing
Date:
On 25/11/2018 10:00, Surafel Temesgen wrote:
> 
> 
> 
> 
>     I messed around with your changes to the grammar and it seems you don't
>     need to add PERCENT as a reserved keyword.  Moving this to the
>     unreserved
>     keyword section does not cause any shift/reduce errors, and the
>     regression
>     tests still pass.  Relative to your patch v4, these changes help:
> 
> 
> In sql standard PERCENT list as reserved world so i don't think it is a
> thing that can change by me.

Yes it absolutely is.  In PostgreSQL we only make words as reserved as
they need to be, and PERCENT doesn't need to be reserved at all.

Also, this query returns 210 rows instead of the expected 208:

    select *
    from generate_series(1, 1000)
    fetch first 20.8 percent rows only

As for the design, I think you should put some serious consideration
into Andrew Gierth's approach.  I would rather see something like that.
-- 
Vik Fearing                                          +33 6 46 75 15 36
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:


On Sun, Nov 25, 2018 at 1:24 PM Vik Fearing <vik.fearing@2ndquadrant.com> wrote:

Also, this query returns 210 rows instead of the expected 208:

    select *
    from generate_series(1, 1000)
    fetch first 20.8 percent rows only

 

This is because FETCH FIRST values work with integers, and a fractional number is changed to the nearest integer. For example, select * from generate_series(1, 1000) fetch first 20.3 rows only; is not an error; rather, it returns 20 rows.

 
As for the design, I think you should put some serious consideration
into Andrew Gierth's approach.  I would rather see something like that.

 

I didn't go in that direction because I don't understand why this approach is inferior to using a window function, and I don't fully understand how to implement that.

regards

Surafel
 

Re: FETCH FIRST clause PERCENT option

From
Vik Fearing
Date:
On 25/11/2018 12:49, Surafel Temesgen wrote:
> 
> 
> On Sun, Nov 25, 2018 at 1:24 PM Vik Fearing <vik.fearing@2ndquadrant.com
> <mailto:vik.fearing@2ndquadrant.com>> wrote:
> 
> 
>     Also, this query returns 210 rows instead of the expected 208:
> 
>         select *
>         from generate_series(1, 1000)
>         fetch first 20.8 percent rows only
> 
> this is because fetch first values work with integer and it change
> fractional number to nearest integer number like select * from
> generate_series(1, 1000) fetch first 20.3 rows only; is not an error
> rather it return 20 rows.

I don't see how this behavior is justified by reading the SQL standard.
 Obviously only an integer number of rows is going to be returned, but
the percentage should be calculated correctly.

I assume you meant 200 rows there, but the correct number of rows to
return is 203 for that query.  Please fix it.
-- 
Vik Fearing                                          +33 6 46 75 15 36
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: FETCH FIRST clause PERCENT option

From
Andrew Gierth
Date:
>>>>> "Surafel" == Surafel Temesgen <surafel3000@gmail.com> writes:

 Surafel> this is because fetch first values work with integer and it
 Surafel> change fractional number to nearest integer number like select
 Surafel> * from generate_series(1, 1000) fetch first 20.3 rows only; is
 Surafel> not an error rather it return 20 rows.

 31) The declared type of the <simple value specification> simply
     contained in <fetch first percentage> shall be numeric.

Note that unlike <fetch first row count> there is no requirement for the
type to be _exact_ numeric or to have scale 0.

later:

    ii) If <fetch first percentage> is specified, then let FFP be the
        <simple value specification> simply contained in <fetch first
        percentage>, and let LOCT be a <literal> whose value is OCT. Let
        FFRC be the value of

          CEILING ( FFP * LOCT / 100.0E0 )

                NOTE 333 -- The percentage is computed using the number
                of rows before removing the rows specified by <offset
                row count>.

ceil(20.3 * 1000 / 100.0E0) is definitely 203, not 200.
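The standard's formula is easy to check directly; a quick sketch (the function name is just for illustration):

```python
import math

def fetch_first_percent_count(ffp, oct_):
    """FFRC per the standard: CEILING(FFP * LOCT / 100.0E0)."""
    return math.ceil(ffp * oct_ / 100.0)

# 20.3 percent of 1000 rows is 203, not 200; 20.8 percent gives 208.
```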

-- 
Andrew.


Re: FETCH FIRST clause PERCENT option

From
Tomas Vondra
Date:

On 11/25/18 1:05 PM, Vik Fearing wrote:
> On 25/11/2018 12:49, Surafel Temesgen wrote:
>>
>>
>> On Sun, Nov 25, 2018 at 1:24 PM Vik Fearing <vik.fearing@2ndquadrant.com
>> <mailto:vik.fearing@2ndquadrant.com>> wrote:
>>
>>
>>     Also, this query returns 210 rows instead of the expected 208:
>>
>>         select *
>>         from generate_series(1, 1000)
>>         fetch first 20.8 percent rows only
>>
>> this is because fetch first values work with integer and it change
>> fractional number to nearest integer number like select * from
>> generate_series(1, 1000) fetch first 20.3 rows only; is not an error
>> rather it return 20 rows.
> 
> I don't see how this behavior is justified by reading the SQL standard.
>  Obviously only an integer number of rows is going to be returned, but
> the percentage should be calculated correctly.
> 

Right. My draft of SQL standard says this:

<fetch first clause> ::=
  FETCH { FIRST | NEXT } [ <fetch first quantity> ] { ROW | ROWS }
  { ONLY | WITH TIES }

<fetch first quantity> ::= <fetch first row count>
                           | <fetch first percentage>

<fetch first percentage> ::= <simple value specification> PERCENT

and then

30) The declared type of <fetch first row count> shall be an exact
numeric with scale 0 (zero).

31) The declared type of the <simple value specification> simply
contained in <fetch first percentage> shall
be numeric.

So the standard pretty much requires treating the value as numeric.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: FETCH FIRST clause PERCENT option

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On 11/25/18 1:05 PM, Vik Fearing wrote:
>> I don't see how this behavior is justified by reading the SQL standard.
>> Obviously only an integer number of rows is going to be returned, but
>> the percentage should be calculated correctly.

> Right. My draft of SQL standard says this:
> 31) The declared type of the <simple value specification> simply
> contained in <fetch first percentage> shall be numeric.

> So the standard pretty much requires treating the value as numeric.

Note that this should not be read as "it must be of the type that PG
calls numeric".  I'd read it as requiring a value that's in the numeric
type hierarchy.  float8 would be a reasonable implementation choice
if we're going to coerce to a specific type.

            regards, tom lane


Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:


On Sun, Nov 25, 2018 at 3:05 PM Vik Fearing <vik.fearing@2ndquadrant.com> wrote:

I assume you meant 200 rows there, but the correct number of rows to
return is 203 for that query.  Please fix it.

Attached is a patch that includes a fix for it.
regards
Surafel
Attachment

Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:


On Fri, Nov 30, 2018 at 10:23 AM Surafel Temesgen <surafel3000@gmail.com> wrote:
 
Attached is a patch that includes a fix for it.


By mistake the patch coerced the OFFSET clause to float8 too, which is not necessary.
The attached patch corrects it.
regards
Surafel
 
Attachment

Re: FETCH FIRST clause PERCENT option

From
Tomas Vondra
Date:
Hi,

I've been looking at this patch today, and I think there's a couple of
issues with the costing and execution. Consider a simple table with 1M rows:

  create table t as select i from generate_series(1,1000000) s(i);

and these two queries, that both select 1% of rows

  select * from t fetch first 10000 rows only;

  select * from t fetch first 1 percent rows only;

which are costed like this

  test=# explain select * from t fetch first 10000 rows only;
                             QUERY PLAN
  -----------------------------------------------------------------
   Limit  (cost=0.00..144.25 rows=10000 width=4)
     ->  Seq Scan on t  (cost=0.00..14425.00 rows=1000000 width=4)
  (2 rows)

  test=# explain select * from t fetch first 1 percent rows only;
                             QUERY PLAN
  -----------------------------------------------------------------
   Limit  (cost=0.00..14425.00 rows=1000000 width=4)
     ->  Seq Scan on t  (cost=0.00..14425.00 rows=1000000 width=4)
  (2 rows)

There are two issues with the costs in the "percent" case.

Firstly, the estimated number of rows should be about the same (10k) in
both cases, but the "percent" case incorrectly uses the total row count
(i.e. as if 100% rows were returned).

It's correct that the total cost for the "percent" case is based on 100%
of rows, of course, because the code has to fetch everything and stash
it into the tuplestore in LIMIT_INITIAL phase.

But that implies the startup cost for the "percent" case can't be 0,
because all this work has to be done before emitting the first row.

So these costing aspects need fixing, IMHO.


The execution part of the patch seems to be working correctly, but I
think there's an improvement - we don't need to execute the outer plan
to completion before emitting the first row. For example, let's say the
outer plan produces 10000 rows in total and we're supposed to return the
first 1% of those rows. We can emit the first row after fetching the
first 100 rows, we don't have to wait for fetching all 10k rows.

This may seem pointless at first because we eventually have to scan all
the rows anyway (as we don't know we're done otherwise). But it may be
nested in another early-terminating node in which case it will matter,
e.g. like this:

  SELECT * FROM t1 WHERE EXISTS (
    SELECT * FROM t2 FETCH FIRST 1 PERCENT ROWS ONLY
  );

I admit this example is somewhat artificial, but it's pretty easy to
end up with queries like this with views etc. Overall, fast startup
seems like quite a valuable feature and we should try to keep it.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:
Hi

On Tue, Jan 1, 2019 at 10:08 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
Hi,

I've been looking at this patch today, and I think there's a couple of
issues with the costing and execution. Consider a simple table with 1M rows:

  create table t as select i from generate_series(1,1000000) s(i);

and these two queries, that both select 1% of rows

  select * from t fetch first 10000 rows only;

  select * from t fetch first 1 percent rows only;

which are costed like this

  test=# explain select * from t fetch first 10000 rows only;
                             QUERY PLAN
  -----------------------------------------------------------------
   Limit  (cost=0.00..144.25 rows=10000 width=4)
     ->  Seq Scan on t  (cost=0.00..14425.00 rows=1000000 width=4)
  (2 rows)

  test=# explain select * from t fetch first 1 percent rows only;
                             QUERY PLAN
  -----------------------------------------------------------------
   Limit  (cost=0.00..14425.00 rows=1000000 width=4)
     ->  Seq Scan on t  (cost=0.00..14425.00 rows=1000000 width=4)
  (2 rows)

There are two issues with the costs in the "percent" case.

Firstly, the estimated number of rows should be about the same (10k) in
both cases, but the "percent" case incorrectly uses the total row count
(i.e. as if 100% rows were returned).

Ok I will correct it


It's correct that the total cost for the "percent" case is based on 100%
of rows, of course, because the code has to fetch everything and stash
it into the tuplestore in LIMIT_INITIAL phase.

But that implies the startup cost for the "percent" case can't be 0,
because all this work has to be done before emitting the first row.


Correct; the startup cost must be equal to the total cost of the source data path, and the total cost has to be slightly greater than the startup cost. I am planning to increase the total cost by limitCount * 0.1 (similar to parallel_tuple_cost), because getting a tuple from the tuplestore is an operation almost similar to passing a tuple from a worker to the master backend.
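A rough sketch of that costing plan (the names and the 0.1 per-tuple factor are illustrative, not the actual PostgreSQL cost model):

```python
import math

TUPLESTORE_TUPLE_COST = 0.1   # assumed, by analogy with parallel_tuple_cost

def percent_limit_costs(sub_startup, sub_total, sub_rows, pct):
    """Startup pays for consuming the whole input into the tuplestore;
    total adds a small per-tuple charge for reading back the limited rows."""
    limit_rows = math.ceil(sub_rows * pct / 100.0)
    startup_cost = sub_total                       # full input read first
    total_cost = sub_total + limit_rows * TUPLESTORE_TUPLE_COST
    return startup_cost, total_cost, limit_rows
```

For the earlier 1M-row example at 1 percent, this would give startup 14425, total 15425, and a row estimate of 10000 instead of 1000000.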

So these costing aspects need fixing, IMHO.


The execution part of the patch seems to be working correctly, but I
think there's an improvement - we don't need to execute the outer plan
to completion before emitting the first row. For example, let's say the
outer plan produces 10000 rows in total and we're supposed to return the
first 1% of those rows. We can emit the first row after fetching the
first 100 rows, we don't have to wait for fetching all 10k rows.


But the total row count is not known in advance, so how can we determine that a row is safe to return?

Regards
Surafel

 

Re: FETCH FIRST clause PERCENT option

From
Tomas Vondra
Date:
On 1/3/19 1:00 PM, Surafel Temesgen wrote:
> Hi
> 
> On Tue, Jan 1, 2019 at 10:08 PM Tomas Vondra
> <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote:
> 
>     ...
> 
>     Firstly, the estimated number of rows should be about the same (10k) in
>     both cases, but the "percent" case incorrectly uses the total row count
>     (i.e. as if 100% rows were returned).
> 
> Ok I will correct it
> 
> 
>     It's correct that the total cost for the "percent" case is based on 100%
>     of rows, of course, because the code has to fetch everything and stash
>     it into the tuplestore in LIMIT_INITIAL phase.
> 
>     But that implies the startup cost for the "percent" case can't be 0,
>     because all this work has to be done before emitting the first row.
> 
> 
> Correct, startup cost must be equal to total cost of source data path
> and total cost have to be slightly greater than startup cost . I am
> planing to increase the total cost by limitCount * 0.1(similar to the
> parallel_tuple_cost) because getting tuple from tuplestore are almost
> similar operation to passing a tuple from worker to master backend.
> 

OK, sounds good in principle.

> 
>     So these costing aspects need fixing, IMHO.
> 
> 
>     The execution part of the patch seems to be working correctly, but I
>     think there's an improvement - we don't need to execute the outer plan
>     to completion before emitting the first row. For example, let's say the
>     outer plan produces 10000 rows in total and we're supposed to return the
>     first 1% of those rows. We can emit the first row after fetching the
>     first 100 rows, we don't have to wait for fetching all 10k rows.
> 
> 
> but total rows count is not given how can we determine safe to return row
> 

But you know how many rows were fetched from the outer plan, and this
number only grows. So the number of rows returned by FETCH FIRST
... PERCENT also only grows. For example with 10% of rows, you know that
once you reach 100 rows you should emit ~10 rows, with 200 rows you know
you should emit ~20 rows, etc. So you may track how many rows we're
supposed to return / returned so far, and emit them early.
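That bookkeeping can be sketched as a generator (a hypothetical illustration, relying on the CEILING semantics discussed upthread: since ceil is non-decreasing, ceil(n * pct / 100) for the n rows fetched so far is a safe lower bound on the final count):

```python
import math

def percent_limit_stream(source, pct):
    """Emit rows as soon as they are provably part of the result:
    after n rows have been fetched, ceil(n * pct / 100) of them are
    guaranteed to be in the final output."""
    buf = []           # stand-in for the tuplestore
    emitted = 0
    for row in source:
        buf.append(row)
        safe = math.ceil(len(buf) * pct / 100.0)
        while emitted < safe:
            yield buf[emitted]
            emitted += 1
```

The input must still be scanned to completion, but a parent node that stops early (like the EXISTS example upthread) can bail out after the first yielded row.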

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:


On Thu, Jan 3, 2019 at 4:51 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

On 1/3/19 1:00 PM, Surafel Temesgen wrote:
> Hi
>
> On Tue, Jan 1, 2019 at 10:08 PM Tomas Vondra
> <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote:

>     The execution part of the patch seems to be working correctly, but I
>     think there's an improvement - we don't need to execute the outer plan
>     to completion before emitting the first row. For example, let's say the
>     outer plan produces 10000 rows in total and we're supposed to return the
>     first 1% of those rows. We can emit the first row after fetching the
>     first 100 rows, we don't have to wait for fetching all 10k rows.
>
>
> but total rows count is not given how can we determine safe to return row
>

But you know how many rows were fetched from the outer plan, and this
number only grows. So the number of rows returned by FETCH FIRST
... PERCENT also only grows. For example with 10% of rows, you know that
once you reach 100 rows you should emit ~10 rows, with 200 rows you know
you should emit ~20 rows, etc. So you may track how many rows we're
supposed to return / returned so far, and emit them early.




Yes, that is clear, but I don't find it easy to put that into a formula. Maybe someone with better mathematics will help.

regards
Surafel
 

Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:



On Tue, Jan 1, 2019 at 10:08 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
The execution part of the patch seems to be working correctly, but I
think there's an improvement - we don't need to execute the outer plan
to completion before emitting the first row. For example, let's say the
outer plan produces 10000 rows in total and we're supposed to return the
first 1% of those rows. We can emit the first row after fetching the
first 100 rows, we don't have to wait for fetching all 10k rows.


My other concern is that in this case we are going to determine the limit
and which rows are safe to return on the fly, which requires additional
computation, and I fear it will not perform better in most cases.

regards
Surafel

Re: FETCH FIRST clause PERCENT option

From
Tomas Vondra
Date:

On 1/4/19 7:40 AM, Surafel Temesgen wrote:
> 
> 
> On Thu, Jan 3, 2019 at 4:51 PM Tomas Vondra
> <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote:
> 
> 
>     On 1/3/19 1:00 PM, Surafel Temesgen wrote:
>     > Hi
>     >
>     > On Tue, Jan 1, 2019 at 10:08 PM Tomas Vondra
>     > <tomas.vondra@2ndquadrant.com
>     <mailto:tomas.vondra@2ndquadrant.com>
>     <mailto:tomas.vondra@2ndquadrant.com
>     <mailto:tomas.vondra@2ndquadrant.com>>> wrote:
> 
>     >     The execution part of the patch seems to be working correctly,
>     but I
>     >     think there's an improvement - we don't need to execute the
>     outer plan
>     >     to completion before emitting the first row. For example,
>     let's say the
>     >     outer plan produces 10000 rows in total and we're supposed to
>     return the
>     >     first 1% of those rows. We can emit the first row after
>     fetching the
>     >     first 100 rows, we don't have to wait for fetching all 10k rows.
>     >
>     >
>     > but total rows count is not given how can we determine safe to
>     return row
>     >
> 
>     But you know how many rows were fetched from the outer plan, and this
>     number only grows grows. So the number of rows returned by FETCH FIRST
>     ... PERCENT also only grows. For example with 10% of rows, you know that
>     once you reach 100 rows you should emit ~10 rows, with 200 rows you know
>     you should emit ~20 rows, etc. So you may track how many rows we're
>     supposed to return / returned so far, and emit them early.
> 
> 
> 
> 
> yes that is clear but i don't find it easy to put that in formula. may
> be someone with good mathematics will help
> 

What formula? All the math remains exactly the same, you just need to
update the number of rows to return and track how many rows are already
returned.

I haven't tried doing that, but AFAICS you'd need to tweak how/when
node->count is computed - instead of computing it only once it needs to
be updated after fetching each row from the subplan.

Furthermore, you'll need to stash the subplan rows somewhere (into a
tuplestore probably), and whenever the node->count value increments,
you'll need to grab a row from the tuplestore and return that (i.e.
tweak node->position and set node->subSlot).

I hope that makes sense. The one thing I'm not quite sure about is
whether tuplestore allows adding and getting rows at the same time.

Does that make sense?
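To make the scheme concrete, here is a minimal sketch in Python (illustrative only; the real implementation would live in C in nodeLimit.c, and the deque here merely stands in for the tuplestore and the node fields it mimics):

```python
import math
from collections import deque

def limit_percent(outer_plan, percent):
    store = deque()   # stand-in for the tuplestore holding subplan rows
    position = 0      # rows emitted so far (cf. node->position)
    count = 0         # rows we may emit so far (cf. node->count)
    for slot in outer_plan:
        store.append(slot)
        # count is recomputed after fetching each subplan row
        count = math.ceil(percent * len(store) / 100.0)
        # whenever count overtakes position, emit a buffered row early
        while position < count:
            yield store[position]
            position += 1

# 5% of 1000 rows: the first row comes out immediately, not after all 1000
print(list(limit_percent(range(1000), 5)) == list(range(50)))  # → True
```

Note the first row is produced after a single fetch from the outer plan, which is exactly the fast-startup property discussed here; whether the tuplestore allows interleaved writes and reads remains the open question from the text above.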

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: FETCH FIRST clause PERCENT option

From
Tomas Vondra
Date:
On 1/4/19 8:08 AM, Surafel Temesgen wrote:
> 
> 
> 
> On Tue, Jan 1, 2019 at 10:08 PM Tomas Vondra
> <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote:
> 
>     The execution part of the patch seems to be working correctly, but I
>     think there's an improvement - we don't need to execute the outer plan
>     to completion before emitting the first row. For example, let's say the
>     outer plan produces 10000 rows in total and we're supposed to return the
>     first 1% of those rows. We can emit the first row after fetching the
>     first 100 rows, we don't have to wait for fetching all 10k rows.
> 
> 
> my other concern is in this case we are going to determine limit and
> safe row to return on the fly
> which have additional computation and i fair that it will not perform
> better on most of the case
> 

I very much doubt recomputing the number of rows to return will be
measurable at all, considering those are int64 values. It's likely
negligible compared to all the other overhead (reading rows from the
subplan, etc.).

And even if it turns out expensive, there are ways to make it less
expensive (e.g. not computing it every time, but only when it actually
can increment the value - for example with 1% rows to return, it's
enough to recompute it every 100 rows).

It also very much depends on which use cases you consider important.
Clearly, you're only thinking about the case that actually requires
fetching everything. But there's also the use case where you have
another limit node somewhere above.

So let's do a simple "cost" analysis of these two use cases:

1) case #1 (no LIMIT above, requires fetching everything)

Let's assume the "cost" of the current approach (fetching everything in
advance, computing node->count once) is the baseline here.

With the incremental approach, there likely is some extra overhead.
Let's overshoot it a bit - with 5% overhead, it's 1.05 of the baseline.

2) case #2 (LIMIT/EXISTS above, only needs to fetch the first few rows,
perhaps just a single one)

Here, we take the incremental approach as the baseline. With the "fetch
everything first" approach we need to evaluate the whole subplan and
fetch all the results, instead of just some small fraction. But the
exact ratio is pretty much arbitrary - I can easily construct queries
that need to fetch 100x or 100000x more data from the subplan.

So essentially using the "incremental fetch" approach means we may pay
5% overhead in cases where we need to fetch everything, while the "fetch
everything first" approach means we'll add an arbitrary amount of overhead
to cases where we don't need to fetch everything.

Of course, one way to deal with this would be to argue that the cases
benefiting from the incremental approach are very rare / non-existent.
Perhaps that's the case, although I don't see why it would be.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:


On Fri, Jan 4, 2019 at 5:27 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

What formula? All the math remains exactly the same, you just need to
update the number of rows to return and track how many rows are already
returned.

I haven't tried doing that, but AFAICS you'd need to tweak how/when
node->count is computed - instead of computing it only once it needs to
be updated after fetching each row from the subplan.

Furthermore, you'll need to stash the subplan rows somewhere (into a
tuplestore probably), and whenever the node->count value increments,
you'll need to grab a row from the tuplestore and return that (i.e.
tweak node->position and set node->subSlot).

I hope that makes sense. The one thing I'm not quite sure about is
whether tuplestore allows adding and getting rows at the same time.

Does that make sense


In the current implementation, in the LIMIT_INITIAL state we execute the outer plan to the end, store the resulting tuples in a tuplestore, and calculate limitCount as an exact number. We are in this state only once and do not return any tuple. Once we are in the LIMIT_INWINDOW state and the inner node execution asks for a tuple, it is returned from the tuplestore immediately.

In order to do a fast startup, what I was thinking was dividing the work done in the LIMIT_INITIAL state by limitCount. For example, if we want 0.5 percent of the result, then in the LIMIT_INWINDOW state we execute the outer node 200 times, store the result in the tuplestore, and return the first tuple. If we don't get that many tuples, it means we have reached the end of the limit. But I think I can't do it this way, because a percentage only has meaning relative to the total number of rows, so the LIMIT_INITIAL work can't be avoided.
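For reference, the two-phase behaviour described here can be condensed into a short sketch (illustrative only; the real code buffers into a tuplestore and runs inside the executor state machine):

```python
import math

def limit_percent_eager(outer_plan, percent):
    # LIMIT_INITIAL: run the outer plan to completion and buffer everything,
    # so the total row count (and hence the percentage) is known exactly.
    store = list(outer_plan)
    count = math.ceil(percent * len(store) / 100.0)
    # LIMIT_INWINDOW: serve rows straight from the buffer.
    for i in range(count):
        yield store[i]

# 0.5 percent of 1000 rows yields 5 rows, but only after all 1000 are fetched
print(list(limit_percent_eager(range(1000), 0.5)))  # → [0, 1, 2, 3, 4]
```

The `list(outer_plan)` line is precisely the "execute outer plan to the end" step that delays the first returned row.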

regards

Surafel
 

Re: FETCH FIRST clause PERCENT option

From
Tomas Vondra
Date:
On 1/6/19 12:40 PM, Surafel Temesgen wrote:
> 
> 
> On Fri, Jan 4, 2019 at 5:27 PM Tomas Vondra
> <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote:
> 
> 
>     What formula? All the math remains exactly the same, you just need to
>     update the number of rows to return and track how many rows are already
>     returned.
> 
>     I haven't tried doing that, but AFAICS you'd need to tweak how/when
>     node->count is computed - instead of computing it only once it needs to
>     be updated after fetching each row from the subplan.
> 
>     Furthermore, you'll need to stash the subplan rows somewhere (into a
>     tuplestore probably), and whenever the node->count value increments,
>     you'll need to grab a row from the tuplestore and return that (i.e.
>     tweak node->position and set node->subSlot).
> 
>     I hope that makes sense. The one thing I'm not quite sure about is
>     whether tuplestore allows adding and getting rows at the same time.
> 
>     Does that make sense
> 
> 
> 
> In current implementation in LIMIT_INITIAL state we execute outer
> plan to the end , store the resulting tuples in tuplestore and
> calculate limitCount in number .We are in this state only once and
> did not return any tuple. Once we are at LIMIT_INWINDOW state and
> inner node execution asking for tuple it return from tuple store
> immediately.
> 

Right.

> Inorder to do fast startup what I was thinking was dividing the work 
> done at LIMIT_INITIAL state in to limitCount. For example we want
> 0.5 percent of the result which means in LIMIT_INWINDOW state we
> execute outer node 200 times ,store the result in tuplestore and
> return the first tuple. if we don’t get that much tuple that means we
> reach end of the limit.
Yes, pretty much.

> But I think i can’t do it in this way because percent have meaning
> only with total number so LIMIT_INITIAL work can’t be avoid.
> 

I'm not sure I understand what you mean by "percent would have meaning
only with the total number". The important thing is that you can track
how many tuples you're supposed to return (based on the current number
of rows from the outer plan) and how many you've produced so far. And
those numbers are only growing.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:


On Sun, Jan 6, 2019 at 5:51 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

On 1/6/19 12:40 PM, Surafel Temesgen wrote:
>
>
> On Fri, Jan 4, 2019 at 5:27 PM Tomas Vondra
> <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote:
>
>
>     What formula? All the math remains exactly the same, you just need to
>     update the number of rows to return and track how many rows are already
>     returned.
>
>     I haven't tried doing that, but AFAICS you'd need to tweak how/when
>     node->count is computed - instead of computing it only once it needs to
>     be updated after fetching each row from the subplan.
>
>     Furthermore, you'll need to stash the subplan rows somewhere (into a
>     tuplestore probably), and whenever the node->count value increments,
>     you'll need to grab a row from the tuplestore and return that (i.e.
>     tweak node->position and set node->subSlot).
>
>     I hope that makes sense. The one thing I'm not quite sure about is
>     whether tuplestore allows adding and getting rows at the same time.
>
>     Does that make sense
>
>
>
> In current implementation in LIMIT_INITIAL state we execute outer
> plan to the end , store the resulting tuples in tuplestore and
> calculate limitCount in number .We are in this state only once and
> did not return any tuple. Once we are at LIMIT_INWINDOW state and
> inner node execution asking for tuple it return from tuple store
> immediately.
>

Right.

> Inorder to do fast startup what I was thinking was dividing the work
> done at LIMIT_INITIAL state in to limitCount. For example we want
> 0.5 percent of the result which means in LIMIT_INWINDOW state we
> execute outer node 200 times ,store the result in tuplestore and
> return the first tuple. if we don’t get that much tuple that means we
> reach end of the limit.
Yes, pretty much.


While working on this method I found an issue: it does not work well with a
percentage that has a fractional part, like 2.6 percent, which means we have
to emit one row per every 38.4615 tuples, so the interval has to be rounded
up to 39 or down to 38, and either way it leads to an incorrect result. Even
with an integer percentage the interval sometimes comes out with a fractional
part and needs rounding.
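The problem is easy to reproduce with a bit of arithmetic (a sketch mirroring the 2.6 percent example above; the target uses the CEILING rounding the standard requires):

```python
import math

percent = 2.6
print(100.0 / percent)  # one row per ~38.4615 input rows, not an integer

# Rounding that stride to a fixed 38 or 39 drifts away from the target:
n = 10000
target = math.ceil(percent * n / 100.0)   # 260 rows
print(n // 38, n // 39, target)           # → 263 256 260
```

Neither fixed stride reproduces the correct row count, which is why the interval cannot simply be rounded once up front.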

regards
Surafel 

Re: FETCH FIRST clause PERCENT option

From
Tomas Vondra
Date:
On 1/9/19 7:09 AM, Surafel Temesgen wrote:
> 
> 
> On Sun, Jan 6, 2019 at 5:51 PM Tomas Vondra
> <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote:
> 
> 
>     On 1/6/19 12:40 PM, Surafel Temesgen wrote:
>     >
>     >
>     > On Fri, Jan 4, 2019 at 5:27 PM Tomas Vondra
>     > <tomas.vondra@2ndquadrant.com
>     <mailto:tomas.vondra@2ndquadrant.com>
>     <mailto:tomas.vondra@2ndquadrant.com
>     <mailto:tomas.vondra@2ndquadrant.com>>> wrote:
>     >
>     >
>     >     What formula? All the math remains exactly the same, you just
>     need to
>     >     update the number of rows to return and track how many rows
>     are already
>     >     returned.
>     >
>     >     I haven't tried doing that, but AFAICS you'd need to tweak
>     how/when
>     >     node->count is computed - instead of computing it only once it
>     needs to
>     >     be updated after fetching each row from the subplan.
>     >
>     >     Furthermore, you'll need to stash the subplan rows somewhere
>     (into a
>     >     tuplestore probably), and whenever the node->count value
>     increments,
>     >     you'll need to grab a row from the tuplestore and return that
>     (i.e.
>     >     tweak node->position and set node->subSlot).
>     >
>     >     I hope that makes sense. The one thing I'm not quite sure about is
>     >     whether tuplestore allows adding and getting rows at the same
>     time.
>     >
>     >     Does that make sense
>     >
>     >
>     >
>     > In current implementation in LIMIT_INITIAL state we execute outer
>     > plan to the end , store the resulting tuples in tuplestore and
>     > calculate limitCount in number .We are in this state only once and
>     > did not return any tuple. Once we are at LIMIT_INWINDOW state and
>     > inner node execution asking for tuple it return from tuple store
>     > immediately.
>     >
> 
>     Right.
> 
>     > Inorder to do fast startup what I was thinking was dividing the work
>     > done at LIMIT_INITIAL state in to limitCount. For example we want
>     > 0.5 percent of the result which means in LIMIT_INWINDOW state we
>     > execute outer node 200 times ,store the result in tuplestore and
>     > return the first tuple. if we don’t get that much tuple that means we
>     > reach end of the limit.
>     Yes, pretty much.
> 
> 
> 
> While working on this method i found an issue that did not work will
> on percent with fractional part like 2.6 percent which means we have
> to emit per every 38.4615 tuple so it have to be round up to 39 or
> round down to 38 either way it leads to incorrect result  .Even with
> integer percentage sometimes the result came out with fractional
> part and needs rounding
> 

It's hard to say what exactly you are doing, because you haven't shared
any code or examples. But it sounds as if you're computing how often to
produce an additional row at the beginning, and then simply multiplying
it. I'm not sure that's quite possible, because there will be issues due
to rounding.

Firstly, I see your last patch (from 30/11) does this:

    node->count = DatumGetInt64((DatumGetFloat8(val) *
                  tuplestore_tuple_count(node->totalTuple)) / 100);

but the SQL standard I have says this in the <query expression> section:

    If <fetch first percentage> is specified, then let FFP be the
    <simple value specification> simply contained in <fetch first
    percentage>, and let LOCT be a <literal> whose value is OCT.
    Let FFRC be the value of

        CEILING ( FFP * LOCT / 100.0E0 )

That's not what your patch does, though, because it relies on the automatic
float->int conversion, which does floor() and not ceil(). FWIW the final
DatumGetInt64 seems unnecessary, because the expression is float8 and
not a Datum.

The patch should do something like this instead:

    node->count = ceil((DatumGetFloat8(val) *
                  tuplestore_tuple_count(node->totalTuple)) / 100);
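The difference between the two roundings is easy to see on a small input (plain arithmetic standing in for the float8-to-int64 cast):

```python
import math

rows, percent = 30, 5                    # 5% of 30 rows is 1.5
print(int(percent * rows / 100.0))       # implicit truncation (floor): 1
print(math.ceil(percent * rows / 100.0)) # CEILING per the standard:   2
```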

Now, back to the row count issue - let's assume the user requests

    FETCH FIRST 3 PERCENT ROWS ONLY

This means we should produce 1 row every 33 rows, but only very
roughly. In reality it's a bit less, because 3% is not exactly 1/33.
Consider this query, which computes the number of rows to return for a
given row count:

    select sample_cnt, l - lag(l) over (order by sample_cnt) AS d from (
        select sample_cnt, min(nrows) AS l from (
          select
              nrows,
              (nrows * 3.0 / 100.0) AS raw_sample_cnt,
              ceil(nrows * 3.0 / 100.0) AS sample_cnt
          from generate_series(1,10000) s(nrows)
        ) foo group by sample_cnt order by sample_cnt
    ) bar;

The inner-most query simply computes how many rows we're supposed to
return for a given row count. Then we compute the points at which
sample_cnt actually changes. And finally we compute the distances between
those points. You should get something like

 sample_cnt | d
------------+----
          1 |
          2 | 33
          3 | 33
          4 | 34
          5 | 33
          6 | 33
          7 | 34
          8 | 33
          9 | 33
         10 | 34
         11 | 33
         ...

This shows that the distance oscillates between 33 and 34, so no matter
what value you pick, you can't simply multiply it to get the desired
sample size. It needs to be re-computed as you go.
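The same oscillation can be reproduced outside SQL (a quick check mirroring the query above):

```python
import math

def sample_cnt(nrows, pct=3.0):
    return math.ceil(nrows * pct / 100.0)

# first row count at which each sample size is reached
firsts = {}
for n in range(1, 1001):
    firsts.setdefault(sample_cnt(n), n)

# distances between those points oscillate between 33 and 34
dists = [firsts[k] - firsts[k - 1] for k in sorted(firsts) if k > 1]
print(dists[:10])  # → [33, 33, 34, 33, 33, 34, 33, 33, 34, 33]
```

No single integer stride matches this pattern, so the count has to be recomputed incrementally as rows arrive.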

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:


On Wed, Jan 9, 2019 at 5:38 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

It's hard to say what exactly you are doing, because you haven't shared
any code or examples.

Okay, I attach the in-progress patch.

regards
Surafel

Attachment

Re: FETCH FIRST clause PERCENT option

From
Tomas Vondra
Date:
On 1/9/19 4:43 PM, Surafel Temesgen wrote:
> 
> 
> On Wed, Jan 9, 2019 at 5:38 PM Tomas Vondra
> <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote:
> 
> 
>     It's hard to say what exactly are you doing, because you haven't shared
>     any code nor examples. 
> 
> 
> okay i attach in progress patch
> 

Yeah, that's what I thought - the patch computes

    node->count = DatumGetInt64(100 / DatumGetFloat8(val));

and then always fetches this number of records before emitting the next
row from the tuplestore. That's wrong, as I explained before, because
the distance does change, due to rounding.

See the attached patch, which recomputes the count regularly. I don't
claim the patch is committable or that it has no other bugs, but
hopefully it shows what I meant.

FWIW you don't need to create any slots - the two already created are
enough. You certainly don't need to create the slots within a loop.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:


On Wed, Jan 9, 2019 at 8:18 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
 

See the attached patch, which recomputes the count regularly. I don't
claim the patch is committable or that it has no other bugs, but
hopefully it shows what I meant.



I found only one issue: it does not work well with cursors.

postgres=# START TRANSACTION;

START TRANSACTION

postgres=# create table t as select i from generate_series(1,1000) s(i);

SELECT 1000

postgres=# declare c cursor for select * from t fetch first 5 percent rows only;

DECLARE CURSOR

postgres=# fetch all in c;

ERROR: trying to store a minimal tuple into wrong type of slot


I am looking at it.

Meanwhile I fixed the row estimation and cost, and made create_ordered_paths create paths with no LIMIT consideration in the PERCENTAGE case.


regards

Surafel  

Attachment

Re: FETCH FIRST clause PERCENT option

From
Tomas Vondra
Date:

On 1/24/19 10:57 AM, Surafel Temesgen wrote:
> 
> 
> On Wed, Jan 9, 2019 at 8:18 PM Tomas Vondra
> <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote:
>  
> 
> 
>     See the attached patch, which recomputes the count regularly. I don't
>     claim the patch is committable or that it has no other bugs, but
>     hopefully it shows what I meant.
> 
> 
> 
> I got only one issue it is not work well with cursor
> 
> postgres=# START TRANSACTION;
> 
> START TRANSACTION
> 
> postgres=# create table t as select i from generate_series(1,1000) s(i);
> 
> SELECT 1000
> 
> postgres=# declare c cursor for select * from t fetch first 5 percent
> rows only;
> 
> DECLARE CURSOR
> 
> postgres=# fetch all in c;
> 
> ERROR: trying to store a minimal tuple into wrong type of slot
> 
> 
> I am looking at it .
> 

OK. Does that mean you agree the incremental approach is reasonable?

> meanwhile i fix row estimation and cost and make create_ordered_paths
> creation with no LIMIT consideration in PERCENTAGE case
> 

I haven't reviewed the costing code yet, but linear interpolation seems
reasonable in principle. I find the create_limit_path changes somewhat
sloppy and difficult to comprehend (particularly without comments).

For example, the fact that the code computes the total cost based on
count_rows, which may be tweaked later, seems suspicious. I mean, the code
now does this:

    if (limitOption == PERCENTAGE)
    {
        ... update count_rows
        ... compute total_cost
    }

    if (count_rows > pathnode->path.rows)
        count_rows = pathnode->path.rows;

    ... compute total_cost for the EXACT_NUMBER case

    pathnode->path.rows = count_rows;

But I need to do think about this a bit more.

FWIW I see there are no regression tests for the PERCENT option.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:


On Mon, Jan 28, 2019 at 1:28 AM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:


OK. Does that mean you agree the incremental approach is reasonable?

There is no noticeable performance difference, but I prefer the previous
approach with regard to cursor operations: it fetches tuples forward and
backward from the tuplestore only, whereas in the incremental approach we
have to re-execute the outer node on every forward and backward fetch.

regards
Surafel

Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:
Here is a rebased version of the previous patch with the planner
comment included.

regards
Surafel


On Wed, Jan 30, 2019 at 9:07 AM Surafel Temesgen <surafel3000@gmail.com> wrote:


On Mon, Jan 28, 2019 at 1:28 AM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:


OK. Does that mean you agree the incremental approach is reasonable?

There is no noticeable performance difference, but I prefer the previous
approach with regard to cursor operations: it fetches tuples forward and
backward from the tuplestore only, whereas in the incremental approach we
have to re-execute the outer node on every forward and backward fetch.

regards
Surafel
Attachment

Re: FETCH FIRST clause PERCENT option

From
Tomas Vondra
Date:

On 1/30/19 7:07 AM, Surafel Temesgen wrote:
> 
> 
> On Mon, Jan 28, 2019 at 1:28 AM Tomas Vondra
> <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote:
> 
> 
> 
>     OK. Does that mean you agree the incremental approach is reasonable?
> 
> 
> there are no noticeable performance difference but i love previous
> approach more regarding cursor operation it fetch tuple forward and
> backward from tuplestore only but in incremental approach we have to 
> re execute outer node in every forward and backward fetching operation
> 

I'm not sure I understand - are you saying every time the user does a
FETCH, we have to run the outer plan from scratch? I don't see why would
that be necessary? And if it is, how come there's no noticeable
performance difference?

Can you share a patch implementing the incremental approach, and a query
demonstrating the issue?

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: FETCH FIRST clause PERCENT option

From
Tomas Vondra
Date:
On 2/1/19 12:10 PM, Surafel Temesgen wrote:
> here is a rebased version of  previous  patch with planner
> comment included
> 

It's really hard to say which comment that is...

FWIW I find the changes in nodeLimit lacking sufficient comments. The
comments present are somewhat obvious - what we need are comments
explaining why things happen. For example the LIMIT_INITIAL now includes
this chunk of code:

    case LIMIT_INITIAL:

        if (node->limitOption == PERCENTAGE)
        {

        /*
         * Find all rows the plan will return.
         */
            for (;;)
            {
                slot = ExecProcNode(outerPlan);
                if (TupIsNull(slot))
                {
                    break;
                }
                tuplestore_puttupleslot(node->totalTuple, slot);
            }
        }

Ignoring the fact that the comment is incorrectly indented, it states a
rather obvious fact - that we fetch all rows from a plan and stash them
into a tuplestore. What is missing is a comment explaining why we need
to do that.

This applies to other changes in nodeLimit too, and elsewhere.

Another detail is that we generally leave a free line before "if", i.e.
instead of

    }
    if (node->limitOption == PERCENTAGE)
    {

it should be

    }

    if (node->limitOption == PERCENTAGE)
    {

because it's fairly easy to misread as "else if". Even better, there
should be a comment before the "if" explaining what the branch does.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:


On Sun, Feb 10, 2019 at 2:22 AM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
 

I'm not sure I understand - are you saying every time the user does a
FETCH, we have to run the outer plan from scratch? I don't see why would
that be necessary? And if it is, how come there's no noticeable
performance difference?

Can you share a patch implementing the incremental approach, and a query
demonstrating the issue?


I didn't implement it, but it's obvious that it doesn't work the same way as the previous approach.

We need a different implementation. My plan was to use a tuplestore per call and clear it after returning a tuple, but I see that this plan will not go far, mainly because the last returned slot is not the last slot we got from the outerPlan execution.


regards

Surafel 

Re: FETCH FIRST clause PERCENT option

From
Tomas Vondra
Date:

On 2/23/19 8:53 AM, Surafel Temesgen wrote:
> 
> 
> On Sun, Feb 10, 2019 at 2:22 AM Tomas Vondra
> <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote:
>  
> 
> 
>     I'm not sure I understand - are you saying every time the user does a
>     FETCH, we have to run the outer plan from scratch? I don't see why would
>     that be necessary? And if it is, how come there's no noticeable
>     performance difference?
> 
>     Can you share a patch implementing the incremental approach, and a query
>     demonstrating the issue?
> 
> 
> I didn't implement it but its obvious that it doesn't work similarly
> with previous approach.
> 

Sure, but that's hardly a sufficient argument for the current approach.

> We need different implementation and my plan was to use tuplestore per
> call and clear
> 
> it after returning tuple but I see that the plan will not go far because
> mainly the last returned
> 
> slot is not the last slot we get from outerPlan execution
> 

I'm sorry, I still don't understand what the supposed problem is. I
don't think it's all that different from what nodeMaterial.c does, for
example.

As I explained before, having to execute the outer plan till completion
before returning any tuples is an issue. So either it needs fixing or an
explanation why it's not an issue.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:


On Sun, Feb 24, 2019 at 12:27 AM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

I'm sorry, I still don't understand what the supposed problem is. I
don't think it's all that different from what nodeMaterial.c does, for
example.


Sorry for the noise. Attached is the complete patch for the incremental approach.

regards
Surafel 
Attachment

Re: FETCH FIRST clause PERCENT option

From
Kyotaro HORIGUCHI
Date:
Hello.

At Sat, 23 Feb 2019 22:27:44 +0100, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in
<81a5c0e9-c17d-28f3-4647-8a4659cdfdb1@2ndquadrant.com>
>
>
> On 2/23/19 8:53 AM, Surafel Temesgen wrote:
> >
> >
> > On Sun, Feb 10, 2019 at 2:22 AM Tomas Vondra
> > <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote:
> >  
> >
> >
> >     I'm not sure I understand - are you saying every time the user does a
> >     FETCH, we have to run the outer plan from scratch? I don't see why would
> >     that be necessary? And if it is, how come there's no noticeable
> >     performance difference?
> >
> >     Can you share a patch implementing the incremental approach, and a query
> >     demonstrating the issue?
> >
> >
> > I didn't implement it but its obvious that it doesn't work similarly
> > with previous approach.
> >
>
> Sure, but that's hardly a sufficient argument for the current approach.
>
> > We need different implementation and my plan was to use tuplestore per
> > call and clear
> >
> > it after returning tuple but I see that the plan will not go far because
> > mainly the last returned
> >
> > slot is not the last slot we get from outerPlan execution
> >
>
> I'm sorry, I still don't understand what the supposed problem is. I
> don't think it's all that different from what nodeMaterial.c does, for
> example.
>
> As I explained before, having to execute the outer plan till completion
> before returning any tuples is an issue. So either it needs fixing or an
> explanation why it's not an issue.

One biggest issue seems to be we don't know the total number of
outer tuples before actually reading a null tuple. I doubt of
general shortcut for that. It also seems preventing limit node
from just using materialized outer.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center



Re: FETCH FIRST clause PERCENT option

From
Kyotaro HORIGUCHI
Date:
At Thu, 28 Feb 2019 13:32:17 +0300, Surafel Temesgen <surafel3000@gmail.com> wrote in
<CALAY4q_TrFN2z1oQcpxCKB5diy+A7m5hhG4VOV5d1BaWpf+qqA@mail.gmail.com>
> On Sun, Feb 24, 2019 at 12:27 AM Tomas Vondra <tomas.vondra@2ndquadrant.com>
> wrote:
> 
> >
> > I'm sorry, I still don't understand what the supposed problem is. I
> > don't think it's all that different from what nodeMaterial.c does, for
> > example.
> >
> >
> sorry for the noise .Attache is complete patch for incremental approach

Mmm. It seems to just move the
read-everything-before-returning-the-first-tuple behavior from
LIMIT_INITIAL to LIMIT_RESCAN. If I read it correctly, node->count is
not correct, since it is calculated as "the percentage of the number
of tuples read *so far*" (excluding the offset portion).


|  while (node->position - node->offset >= node->count)
|  {
|     slot = ExecProcNode(outerPlan);
|     (if tuple is NULL then break;)
|     tuplestore_puttupleslot(node->tuple_store, slot);
|     cnt = tuplestore_tuple_count(node->tuple_store);
|     node->count = ceil(node->percent * cnt / 100.0);
|  }

If I'm not missing anything, this seems almost the same as the following.

  while ((position - offset) * percent / 100 >=
         (position - offset) * percent / 100) {}


In the first place, the patch breaks nodeLimit.c and the build fails.

====
-     * previous time we got a different result.
+     * previous time we got a different result.In PERCENTAGE option there are 
+     * no bound on the number of output tuples */
      */
====

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: FETCH FIRST clause PERCENT option

From
Tomas Vondra
Date:
On 2/28/19 12:26 PM, Kyotaro HORIGUCHI wrote:
> Hello.
> 
> At Sat, 23 Feb 2019 22:27:44 +0100, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in
<81a5c0e9-c17d-28f3-4647-8a4659cdfdb1@2ndquadrant.com>
>>
>>
>> On 2/23/19 8:53 AM, Surafel Temesgen wrote:
>>>
>>>
>>> On Sun, Feb 10, 2019 at 2:22 AM Tomas Vondra
>>> <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote:
>>>  
>>>
>>>
>>>     I'm not sure I understand - are you saying every time the user does a
>>>     FETCH, we have to run the outer plan from scratch? I don't see why would
>>>     that be necessary? And if it is, how come there's no noticeable
>>>     performance difference?
>>>
>>>     Can you share a patch implementing the incremental approach, and a query
>>>     demonstrating the issue?
>>>
>>>
>>> I didn't implement it but its obvious that it doesn't work similarly
>>> with previous approach.
>>>
>>
>> Sure, but that's hardly a sufficient argument for the current approach.
>>
>>> We need different implementation and my plan was to use tuplestore per
>>> call and clear
>>>
>>> it after returning tuple but I see that the plan will not go far because
>>> mainly the last returned
>>>
>>> slot is not the last slot we get from outerPlan execution
>>>
>>
>> I'm sorry, I still don't understand what the supposed problem is. I
>> don't think it's all that different from what nodeMaterial.c does, for
>> example.
>>
>> As I explained before, having to execute the outer plan till completion
>> before returning any tuples is an issue. So either it needs fixing or an
>> explanation why it's not an issue.
> 
> One biggest issue seems to be we don't know the total number of
> outer tuples before actually reading a null tuple. I doubt of
> general shortcut for that. It also seems preventing limit node
> from just using materialized outer.
> 

Sure, if you actually want all tuples, you'll have to execute the outer
plan till completion. But that's not what I'm talking about - what if we
only ever need to read one row from the limit?

To give you a (admittedly, somewhat contrived and artificial example):

    SELECT * FROM t1 WHERE id IN (
      SELECT id FROM t2 ORDER BY x FETCH FIRST 10 PERCENT ROWS ONLY
    );

Maybe this example is bogus and/or does not really matter in practice. I
don't know, but I've been unable to convince myself that's the case.


regards


-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: FETCH FIRST clause PERCENT option

From
Kyotaro HORIGUCHI
Date:
Hello.

At Thu, 28 Feb 2019 21:16:25 +0100, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in
<fbd08ad3-5dd8-3169-6cba-38d610d7be7f@2ndquadrant.com>
> > One biggest issue seems to be we don't know the total number of

# One *of* the biggest *issues*?

> > outer tuples before actually reading a null tuple. I doubt of
> > general shortcut for that. It also seems preventing limit node
> > from just using materialized outer.
> > 
> 
> Sure, if you actually want all tuples, you'll have to execute the outer
> plan till completion. But that's not what I'm talking about - what if we
> only ever need to read one row from the limit?

We have no choice but to read all the tuples once, just to find that
we are to return only one row, since the estimator is not guaranteed
to be as exact as required for this purpose.

> To give you a (admittedly, somewhat contrived and artificial example):
> 
>     SELECT * FROM t1 WHERE id IN (
>       SELECT id FROM t2 ORDER BY x FETCH FIRST 10 PERCENT ROWS ONLY
>     );
> 
> Maybe this example is bogus and/or does not really matter in practice. I
> don't know, but I've been unable to convince myself that's the case.

I see that kind of idiom commonly. Even in the quite simple example
above, *we* cannot tell how many tuples the inner query should return
unless we actually fetch all the tuples in t2. This is the same
problem as with count(*).

The query is equivalent to the following one.

 SELECT * FROM t1 WHERE id IN (
   SELECT id FROM t2 ORDER BY x
     FETCH FIRST (SELECT ceil(count(*) * 0.1) FROM t2) ROWS ONLY
 );

This scans t2 twice, but this patch does only one full scan, moving
the second, partial scan to a tuplestore. We would win if the
outer plan is complex enough.

Anyway, even excluding the count(*) issue, it seems that we are
not alone on that point. (That is, at least Oracle shows
different E-Rows and A-Rows for PERCENT.)

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:


On Fri, Mar 1, 2019 at 4:33 AM Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
Hello.

At Thu, 28 Feb 2019 21:16:25 +0100, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in <fbd08ad3-5dd8-3169-6cba-38d610d7be7f@2ndquadrant.com>
> > One biggest issue seems to be we don't know the total number of

# One *of* the biggest *issues*?

> > outer tuples before actually reading a null tuple. I doubt of
> > general shortcut for that. It also seems preventing limit node
> > from just using materialized outer.
> >
>
> Sure, if you actually want all tuples, you'll have to execute the outer
> plan till completion. But that's not what I'm talking about - what if we
> only ever need to read one row from the limit?

We have no choice than once reading all tuples just to find we
are to return just one row, since estimator is not guaranteed to
be exact as required for this purpose.

> To give you a (admittedly, somewhat contrived and artificial example):
>
>     SELECT * FROM t1 WHERE id IN (
>       SELECT id FROM t2 ORDER BY x FETCH FIRST 10 PERCENT ROWS ONLY
>     );
>
> Maybe this example is bogus and/or does not really matter in practice. I
> don't know, but I've been unable to convince myself that's the case.

I see such kind of idiom common. Even in the quite simple example
above, *we* cannot tell how many tuples the inner should return
unless we actually fetch all tuples in t2. This is the same
problem with count(*).

The query is equivalent to the folloing one.

 SELECT * FROM t1 WHERE id IN (
   SELECT id FROM t2 ORDER BY x
     FETCH FIRST (SELECT ceil(count(*) * 0.1) FROM t2) ROWS ONLY
 );

This scans t2 twice, but this patch does only one full scan
moving another partial scan to tuplestore. We would win if the
outer is complex enough.

Okay, here is the previous implementation with the upthread review comments
included; it also considers the OFFSET clause in the percentage calculation.

regards
Surafel 
Attachment

Re: FETCH FIRST clause PERCENT option

From
Tomas Vondra
Date:
On 3/1/19 2:31 AM, Kyotaro HORIGUCHI wrote:
> Hello.
> 
> At Thu, 28 Feb 2019 21:16:25 +0100, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in
<fbd08ad3-5dd8-3169-6cba-38d610d7be7f@2ndquadrant.com>
>>> One biggest issue seems to be we don't know the total number of
> 
> # One *of* the biggest *issues*?
> 
>>> outer tuples before actually reading a null tuple. I doubt of
>>> general shortcut for that. It also seems preventing limit node
>>> from just using materialized outer.
>>>
>>
>> Sure, if you actually want all tuples, you'll have to execute the outer
>> plan till completion. But that's not what I'm talking about - what if we
>> only ever need to read one row from the limit?
> 
> We have no choice than once reading all tuples just to find we
> are to return just one row, since estimator is not guaranteed to
> be exact as required for this purpose.
> 

I don't follow - I'm not suggesting to base this on row estimates at
all. I'm saying that we can read in rows, and decide how many rows we
are expected to produce.

For example, with 1% sample, we'll produce about 1 row for each 100 rows
read from the outer plan. So we read the first row, and compute

  rows_to_produce = ceil(0.01 * rows_read)

and every time this increases, we emit another row from a tuplestore.


>> To give you a (admittedly, somewhat contrived and artificial example):
>>
>>     SELECT * FROM t1 WHERE id IN (
>>       SELECT id FROM t2 ORDER BY x FETCH FIRST 10 PERCENT ROWS ONLY
>>     );
>>
>> Maybe this example is bogus and/or does not really matter in practice. I
>> don't know, but I've been unable to convince myself that's the case.
> 
> I see such kind of idiom common. Even in the quite simple example
> above, *we* cannot tell how many tuples the inner should return
> unless we actually fetch all tuples in t2. This is the same
> problem with count(*).
> 
> The query is equivalent to the folloing one.
> 
>  SELECT * FROM t1 WHERE id IN (
>    SELECT id FROM t2 ORDER BY x
>      FETCH FIRST (SELECT ceil(count(*) * 0.1) FROM t2) ROWS ONLY
>  );
> 
> This scans t2 twice, but this patch does only one full scan
> moving another partial scan to tuplestore. We would win if the
> outer is complex enough.
> 

But what if all t1 rows match the very first row in the subselect? In
that case we don't need to read all rows from the subplan - we only need
to read the first chunk (until we emit the first row).

> Anyway, even excluding the count(*) issue, it seems that we are
> not alone in that point. (That is, at least Oracle shows
> different E-Rows and A-Rows for PERCENT).
> 

I'm not sure what E-Rows and A-Rows are? I suppose that's estimated and
actual - but as I explained above, that's not what I propose the
incremental approach to use.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Re: FETCH FIRST clause PERCENT option

From
David Steele
Date:
Hi Surafel,

On 3/1/19 8:18 PM, Tomas Vondra wrote:
>>
> 
> I'm not sure hat E-Rows and A-Rows are? I suppose that's estimated and
> actual - but as I explained above, that's not what I propose the
> incremental approach to use.

Marked as Waiting for Author since it appears Tomas is waiting for a 
response from you.

Regards,
-- 
-David
david@pgmasters.net


Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:



On Thu, Feb 28, 2019 at 11:16 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

To give you a (admittedly, somewhat contrived and artificial example):

    SELECT * FROM t1 WHERE id IN (
      SELECT id FROM t2 ORDER BY x FETCH FIRST 10 PERCENT ROWS ONLY
    );

Maybe this example is bogus and/or does not really matter in practice. I
don't know, but I've been unable to convince myself that's the case.

Does this mean we abandon the incremental approach? And I am not sure whether
calculating the percentage after the OFFSET clause is acceptable or not.

regards
Surafel

Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:

Hi,
On Thu, Feb 28, 2019 at 2:50 PM Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
====
-        * previous time we got a different result.
+        * previous time we got a different result.In PERCENTAGE option there are
+        * no bound on the number of output tuples */
         */
====


 Thank you for testing. Fixed.
Attachment

Re: FETCH FIRST clause PERCENT option

From
Andres Freund
Date:
Hi,

On 2019-03-29 12:04:50 +0300, Surafel Temesgen wrote:

> +                if (node->limitOption == PERCENTAGE)
> +                {
> +                    while (node->position - node->offset < node->count)
> +                    {
> +                        slot = MakeSingleTupleTableSlot(tupleDescriptor, &TTSOpsMinimalTuple);
> +                        if (tuplestore_gettupleslot(node->tupleStore, true, true, slot))

There's several blocks like this - how's this not going to be a resource
leak? As far as I can tell you're just going to create new slots, and
their previous contents aren't going to be cleared?   I think there very
rarely are cases where an executor node's *Next function (or its
subsidiaries) creates slots. Normally you ought to create them *once* on
the *Init function.

You might not directly leak memory, because this is probably running in
a short lived context, but you'll leak tuple desc references etc. (Or if
it were a heap buffer slot, buffer pins).


> +            /* In PERCENTAGE case the result is already in tuplestore */
> +            if (node->limitOption == PERCENTAGE)
> +            {
> +                slot = MakeSingleTupleTableSlot(tupleDescriptor, &TTSOpsMinimalTuple);
> +                if (tuplestore_gettupleslot(node->tupleStore, false, false, slot))
> +                {
> +                    node->subSlot = slot;
> +                    node->lstate = LIMIT_INWINDOW;
> +                }
> +                else
> +                    elog(ERROR, "LIMIT subplan failed to run backwards");
> +            }
> +            else if (node->limitOption == EXACT_NUMBER)
> +            {
>              /*
>               * Backing up from subplan EOF, so re-fetch previous tuple; there
>               * should be one!  Note previous tuple must be in window.
> @@ -194,6 +329,7 @@ ExecLimit(PlanState *pstate)
>              node->subSlot = slot;
>              node->lstate = LIMIT_INWINDOW;
>              /* position does not change 'cause we didn't advance it before */
> +            }

The indentation here looks wrong...

>              break;
>  
>          case LIMIT_WINDOWEND:
> @@ -278,17 +414,29 @@ recompute_limits(LimitState *node)
>          /* Interpret NULL count as no count (LIMIT ALL) */
>          if (isNull)
>          {
> -            node->count = 0;
> +            node->count = 1;
>              node->noCount = true;

Huh?


>          }
>          else
>          {
> -            node->count = DatumGetInt64(val);
> -            if (node->count < 0)
> -                ereport(ERROR,
> -                        (errcode(ERRCODE_INVALID_ROW_COUNT_IN_LIMIT_CLAUSE),
> -                         errmsg("LIMIT must not be negative")));
> -            node->noCount = false;
> +            if (node->limitOption == PERCENTAGE)
> +            {
> +                /*
> +                 * We expect to return at least one row (unless there
> +                 * are no rows in the subplan), and we'll update this
> +                 * count later as we go.
> +                 */
> +                node->count = 0;
> +                node->percent = DatumGetFloat8(val);
> +            }
> +            else
> +            {
> +                node->count = DatumGetInt64(val);
> +                if (node->count < 0)
> +                    ereport(ERROR,
> +                            (errcode(ERRCODE_INVALID_ROW_COUNT_IN_LIMIT_CLAUSE),
> +                             errmsg("LIMIT must not be negative")));
> +            }

Shouldn't we error out with a negative count, regardless of PERCENT? Or
is that prevented elsewhere?

>          }
>      }
>      else
> @@ -299,8 +447,10 @@ recompute_limits(LimitState *node)
>      }
>  
>      /* Reset position to start-of-scan */
> -    node->position = 0;
> +    node->position = 0;;

unnecessary


>      if (parse->sortClause)
>      {
> -        current_rel = create_ordered_paths(root,
> -                                           current_rel,
> -                                           final_target,
> -                                           final_target_parallel_safe,
> -                                           have_postponed_srfs ? -1.0 :
> -                                           limit_tuples);
> +
> +        /* In PERCENTAGE option there are no bound on the number of output tuples */
> +        if (parse->limitOption == PERCENTAGE)
> +            current_rel = create_ordered_paths(root,
> +                                               current_rel,
> +                                               final_target,
> +                                               final_target_parallel_safe,
> +                                               have_postponed_srfs ? -1.0 :
> +                                               -1.0);
> +        else
> +            current_rel = create_ordered_paths(root,
> +                                               current_rel,
> +                                               final_target,
> +                                               final_target_parallel_safe,
> +                                               have_postponed_srfs ? -1.0 :
> +                                               limit_tuples);

I'd rather not duplicate those two calls, and just have the limit_tuples
computation inside an if.


>  offset_clause:
> @@ -15435,6 +15442,7 @@ reserved_keyword:
>              | ONLY
>              | OR
>              | ORDER
> +            | PERCENT
>              | PLACING
>              | PRIMARY
>              | REFERENCES

Are we really ok with adding a new reserved keyword 'PERCENT' for this?
That seems awfully likely to already exist in columns etc. Is there a
chance we can use a less strong category?

Greetings,

Andres Freund



Re: FETCH FIRST clause PERCENT option

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
>> offset_clause:
>> @@ -15435,6 +15442,7 @@ reserved_keyword:
>>             | ONLY
>>             | OR
>>             | ORDER
>> +            | PERCENT
>>             | PLACING
>>             | PRIMARY
>>             | REFERENCES

> Are we really ok with adding a new reserved keyword 'PERCENT' for this?

I'm not.  It doesn't look like it ought to be necessary to reserve it,
either, given that we don't have to reduce the production right there.

(If that doesn't work right away, try getting rid of row_or_rows
in favor of spelling out those alternatives in separate productions.)

More generally: using an undocumented list as the data structure for
select_limit's result was already a pretty bad idea, I think, and
this patch certainly makes it totally unreadable.  Probably ought
to refactor that to use some sort of struct.

            regards, tom lane



Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:


On Tue, Mar 26, 2019 at 5:46 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
On Tue, Mar 26, 2019 at 03:06:52PM +0300, Surafel Temesgen wrote:

As for the OFFSET, I don't see why that would be incompatible with PERCENT
clause. I'm not sure if it should compute the percentage from the total
number of rows (including those skipped by the OFFSET) or the rest, but I
assume SQL standard says something about that.


I checked the SQL standard and it is different in the PERCENT case. It states
that the percentage is computed using the number of rows before removing the
rows specified by the offset row count.

regards
Surafel

Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:

Thank you for looking at it
On Mon, Apr 1, 2019 at 12:34 AM Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2019-03-29 12:04:50 +0300, Surafel Temesgen wrote:

> +                             if (node->limitOption == PERCENTAGE)
> +                             {
> +                                     while (node->position - node->offset < node->count)
> +                                     {
> +                                             slot = MakeSingleTupleTableSlot(tupleDescriptor, &TTSOpsMinimalTuple);
> +                                             if (tuplestore_gettupleslot(node->tupleStore, true, true, slot))

There's several blocks like this - how's this not going to be a resource
leak? As far as I can tell you're just going to create new slots, and
their previous contents aren't going to be cleared?   I think there very
rarely are cases where an executor node's *Next function (or its
subsidiaries) creates slots. Normally you ought to create them *once* on
the *Init function.

 
Creating it in the init stage is a good idea. I made it this way because
tuplestore_gettupleslot works with TTSOpsMinimalTuple.
 
You might not directly leak memory, because this is probably running in
a short lived context, but you'll leak tuple desc references etc. (Or if
it were a heap buffer slot, buffer pins).


> +                     /* In PERCENTAGE case the result is already in tuplestore */
> +                     if (node->limitOption == PERCENTAGE)
> +                     {
> +                             slot = MakeSingleTupleTableSlot(tupleDescriptor, &TTSOpsMinimalTuple);
> +                             if (tuplestore_gettupleslot(node->tupleStore, false, false, slot))
> +                             {
> +                                     node->subSlot = slot;
> +                                     node->lstate = LIMIT_INWINDOW;
> +                             }
> +                             else
> +                                     elog(ERROR, "LIMIT subplan failed to run backwards");
> +                     }
> +                     else if (node->limitOption == EXACT_NUMBER)
> +                     {
>                       /*
>                        * Backing up from subplan EOF, so re-fetch previous tuple; there
>                        * should be one!  Note previous tuple must be in window.
> @@ -194,6 +329,7 @@ ExecLimit(PlanState *pstate)
>                       node->subSlot = slot;
>                       node->lstate = LIMIT_INWINDOW;
>                       /* position does not change 'cause we didn't advance it before */
> +                     }

The indentation here looks wrong...

>                       break;

>               case LIMIT_WINDOWEND:
> @@ -278,17 +414,29 @@ recompute_limits(LimitState *node)
>               /* Interpret NULL count as no count (LIMIT ALL) */
>               if (isNull)
>               {
> -                     node->count = 0;
> +                     node->count = 1;
>                       node->noCount = true;

Huh?


>               }
>               else
>               {
> -                     node->count = DatumGetInt64(val);
> -                     if (node->count < 0)
> -                             ereport(ERROR,
> -                                             (errcode(ERRCODE_INVALID_ROW_COUNT_IN_LIMIT_CLAUSE),
> -                                              errmsg("LIMIT must not be negative")));
> -                     node->noCount = false;
> +                     if (node->limitOption == PERCENTAGE)
> +                     {
> +                             /*
> +                              * We expect to return at least one row (unless there
> +                              * are no rows in the subplan), and we'll update this
> +                              * count later as we go.
> +                              */
> +                             node->count = 0;
> +                             node->percent = DatumGetFloat8(val);
> +                     }
> +                     else
> +                     {
> +                             node->count = DatumGetInt64(val);
> +                             if (node->count < 0)
> +                                     ereport(ERROR,
> +                                                     (errcode(ERRCODE_INVALID_ROW_COUNT_IN_LIMIT_CLAUSE),
> +                                                      errmsg("LIMIT must not be negative")));
> +                     }

Shouldn't we error out with a negative count, regardless of PERCENT? Or
is that prevented elsewhere?

Yes, it should error out. I will fix it in the next patch.


>               }
>       }
>       else
> @@ -299,8 +447,10 @@ recompute_limits(LimitState *node)
>       }

>       /* Reset position to start-of-scan */
> -     node->position = 0;
> +     node->position = 0;;

unnecessary


>       if (parse->sortClause)
>       {
> -             current_rel = create_ordered_paths(root,
> -                                                                                current_rel,
> -                                                                                final_target,
> -                                                                                final_target_parallel_safe,
> -                                                                                have_postponed_srfs ? -1.0 :
> -                                                                                limit_tuples);
> +
> +             /* In PERCENTAGE option there are no bound on the number of output tuples */
> +             if (parse->limitOption == PERCENTAGE)
> +                     current_rel = create_ordered_paths(root,
> +                                                                                        current_rel,
> +                                                                                        final_target,
> +                                                                                        final_target_parallel_safe,
> +                                                                                        have_postponed_srfs ? -1.0 :
> +                                                                                        -1.0);
> +             else
> +                     current_rel = create_ordered_paths(root,
> +                                                                                        current_rel,
> +                                                                                        final_target,
> +                                                                                        final_target_parallel_safe,
> +                                                                                        have_postponed_srfs ? -1.0 :
> +                                                                                        limit_tuples);

I'd rather not duplicate those two calls, and just have the limit_tuples
computation inside an if.


-1.0 means there is no limit. In the percentage case the query executes until the end.


>  offset_clause:
> @@ -15435,6 +15442,7 @@ reserved_keyword:
>                       | ONLY
>                       | OR
>                       | ORDER
> +                     | PERCENT
>                       | PLACING
>                       | PRIMARY
>                       | REFERENCES

Are we really ok with adding a new reserved keyword 'PERCENT' for this?
That seems awfully likely to already exist in columns etc. Is there a
chance we can use a less strong category?


There is already such a case in the regression tests. I will make it an
unreserved keyword in the next patch.

regards
Surafel

Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:


On Tue, Mar 26, 2019 at 5:46 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
On Tue, Mar 26, 2019 at 03:06:52PM +0300, Surafel Temesgen wrote:
>On Thu, Feb 28, 2019 at 11:16 PM Tomas Vondra <tomas.vondra@2ndquadrant.com>
>wrote:
>
>>
>> To give you a (admittedly, somewhat contrived and artificial example):
>>
>>     SELECT * FROM t1 WHERE id IN (
>>       SELECT id FROM t2 ORDER BY x FETCH FIRST 10 PERCENT ROWS ONLY
>>     );
>>
>> Maybe this example is bogus and/or does not really matter in practice. I
>> don't know, but I've been unable to convince myself that's the case.
>
>
>does this means we abandon incremental approach? and am not sure of
>calculating
>percentage after OFFSET clause is acceptable or not
>

I don't follow. I don't think I've suggested to abandon the incremental
approach - I've explained why I think it's what the patch should be doing,
and illustrated that with an example.

But it is more complex than the previous approach, and it will be more complex
still to start calculating the percentage before the offset row count. Isn't
simplicity preferable?

regards
Surafel

Re: FETCH FIRST clause PERCENT option

From
Tomas Vondra
Date:
On Fri, Apr 05, 2019 at 03:14:56PM +0300, Surafel Temesgen wrote:
>On Tue, Mar 26, 2019 at 5:46 PM Tomas Vondra <tomas.vondra@2ndquadrant.com>
>wrote:
>
>> On Tue, Mar 26, 2019 at 03:06:52PM +0300, Surafel Temesgen wrote:
>> >On Thu, Feb 28, 2019 at 11:16 PM Tomas Vondra <
>> tomas.vondra@2ndquadrant.com>
>> >wrote:
>> >
>> >>
>> >> To give you a (admittedly, somewhat contrived and artificial example):
>> >>
>> >>     SELECT * FROM t1 WHERE id IN (
>> >>       SELECT id FROM t2 ORDER BY x FETCH FIRST 10 PERCENT ROWS ONLY
>> >>     );
>> >>
>> >> Maybe this example is bogus and/or does not really matter in practice. I
>> >> don't know, but I've been unable to convince myself that's the case.
>> >
>> >
>> >does this means we abandon incremental approach? and am not sure of
>> >calculating
>> >percentage after OFFSET clause is acceptable or not
>> >
>>
>> I don't follow. I don't think I've suggested to abandon the incremental
>> approach - I've explained why I think it's what the patch should be doing,
>> and illustrated that with an example.
>>
>
>but it is more complex than the previous approach and it will be more
>complex
>in starting calculating percentage before offset row count. Doesn't
>simplicity prefer?
>

Sure, simplicity has its value. And often the simplest solution is the
best one. But in plenty of cases it's a tradeoff between simplicity
and efficiency, i.e. the simplest solution may perform poorly. I've tried
to explain why I think the incremental solution is the right approach
here - not having to always execute the subplan to completion. I still
haven't heard reasons why that argument is irrelevant (it may be).

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:
Hello,
The attached patch includes fixes for all the comments given.
regards
Surafel

On Mon, Apr 1, 2019 at 12:34 AM Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2019-03-29 12:04:50 +0300, Surafel Temesgen wrote:

> +                             if (node->limitOption == PERCENTAGE)
> +                             {
> +                                     while (node->position - node->offset < node->count)
> +                                     {
> +                                             slot = MakeSingleTupleTableSlot(tupleDescriptor, &TTSOpsMinimalTuple);
> +                                             if (tuplestore_gettupleslot(node->tupleStore, true, true, slot))

There's several blocks like this - how's this not going to be a resource
leak? As far as I can tell you're just going to create new slots, and
their previous contents aren't going to be cleared?   I think there very
rarely are cases where an executor node's *Next function (or its
subsidiaries) creates slots. Normally you ought to create them *once* on
the *Init function.

You might not directly leak memory, because this is probably running in
a short lived context, but you'll leak tuple desc references etc. (Or if
it were a heap buffer slot, buffer pins).
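
The slot-lifecycle pattern being described here (create the slot once in the node's *Init function and reuse it on every call, rather than allocating a fresh one per tuple) can be illustrated with a loose Python analogy; this is not executor code and the names are invented:

```python
class LimitNodeSketch:
    """Analogy for slot reuse: allocate once at init, clear and
    refill per tuple, so nothing leaks across calls."""

    def __init__(self):
        # Created once, like a slot made in the Init function.
        self.slot = {}

    def next_tuple(self, raw):
        self.slot.clear()      # drop the previous tuple's contents
        self.slot.update(raw)  # store the new tuple in the same slot
        return self.slot
```

The leaky pattern corresponds to building a brand-new slot inside `next_tuple` on every call, which is what accumulates tuple desc references (or buffer pins) in the patch.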


> +                     /* In PERCENTAGE case the result is already in tuplestore */
> +                     if (node->limitOption == PERCENTAGE)
> +                     {
> +                             slot = MakeSingleTupleTableSlot(tupleDescriptor, &TTSOpsMinimalTuple);
> +                             if (tuplestore_gettupleslot(node->tupleStore, false, false, slot))
> +                             {
> +                                     node->subSlot = slot;
> +                                     node->lstate = LIMIT_INWINDOW;
> +                             }
> +                             else
> +                                     elog(ERROR, "LIMIT subplan failed to run backwards");
> +                     }
> +                     else if (node->limitOption == EXACT_NUMBER)
> +                     {
>                       /*
>                        * Backing up from subplan EOF, so re-fetch previous tuple; there
>                        * should be one!  Note previous tuple must be in window.
> @@ -194,6 +329,7 @@ ExecLimit(PlanState *pstate)
>                       node->subSlot = slot;
>                       node->lstate = LIMIT_INWINDOW;
>                       /* position does not change 'cause we didn't advance it before */
> +                     }

The indentation here looks wrong...

>                       break;

>               case LIMIT_WINDOWEND:
> @@ -278,17 +414,29 @@ recompute_limits(LimitState *node)
>               /* Interpret NULL count as no count (LIMIT ALL) */
>               if (isNull)
>               {
> -                     node->count = 0;
> +                     node->count = 1;
>                       node->noCount = true;

Huh?


>               }
>               else
>               {
> -                     node->count = DatumGetInt64(val);
> -                     if (node->count < 0)
> -                             ereport(ERROR,
> -                                             (errcode(ERRCODE_INVALID_ROW_COUNT_IN_LIMIT_CLAUSE),
> -                                              errmsg("LIMIT must not be negative")));
> -                     node->noCount = false;
> +                     if (node->limitOption == PERCENTAGE)
> +                     {
> +                             /*
> +                              * We expect to return at least one row (unless there
> +                              * are no rows in the subplan), and we'll update this
> +                              * count later as we go.
> +                              */
> +                             node->count = 0;
> +                             node->percent = DatumGetFloat8(val);
> +                     }
> +                     else
> +                     {
> +                             node->count = DatumGetInt64(val);
> +                             if (node->count < 0)
> +                                     ereport(ERROR,
> +                                                     (errcode(ERRCODE_INVALID_ROW_COUNT_IN_LIMIT_CLAUSE),
> +                                                      errmsg("LIMIT must not be negative")));
> +                     }

Shouldn't we error out with a negative count, regardless of PERCENT? Or
is that prevented elsewhere?

>               }
>       }
>       else
> @@ -299,8 +447,10 @@ recompute_limits(LimitState *node)
>       }

>       /* Reset position to start-of-scan */
> -     node->position = 0;
> +     node->position = 0;;

unnecessary


>       if (parse->sortClause)
>       {
> -             current_rel = create_ordered_paths(root,
> -                                                                                current_rel,
> -                                                                                final_target,
> -                                                                                final_target_parallel_safe,
> -                                                                                have_postponed_srfs ? -1.0 :
> -                                                                                limit_tuples);
> +
> +             /* In PERCENTAGE option there are no bound on the number of output tuples */
> +             if (parse->limitOption == PERCENTAGE)
> +                     current_rel = create_ordered_paths(root,
> +                                                                                        current_rel,
> +                                                                                        final_target,
> +                                                                                        final_target_parallel_safe,
> +                                                                                        have_postponed_srfs ? -1.0 :
> +                                                                                        -1.0);
> +             else
> +                     current_rel = create_ordered_paths(root,
> +                                                                                        current_rel,
> +                                                                                        final_target,
> +                                                                                        final_target_parallel_safe,
> +                                                                                        have_postponed_srfs ? -1.0 :
> +                                                                                        limit_tuples);

I'd rather not duplicate those two calls, and just have the limit_tuples
computation inside an if.


>  offset_clause:
> @@ -15435,6 +15442,7 @@ reserved_keyword:
>                       | ONLY
>                       | OR
>                       | ORDER
> +                     | PERCENT
>                       | PLACING
>                       | PRIMARY
>                       | REFERENCES

Are we really ok with adding a new reserved keyword 'PERCENT' for this?
That seems awfully likely to already exist in columns etc. Is there a
chance we can use a less strong category?

Greetings,

Andres Freund
Attachment

Re: FETCH FIRST clause PERCENT option

From
Thomas Munro
Date:
On Thu, Jun 27, 2019 at 9:06 PM Surafel Temesgen <surafel3000@gmail.com> wrote:
> The attached patch includes fixes for all the comments given.

Hi Surafel,

There's a call to adjust_limit_rows_costs() hiding under
contrib/postgres_fdw, so this fails check-world.

-- 
Thomas Munro
https://enterprisedb.com



Re: FETCH FIRST clause PERCENT option

From
Ryan Lambert
Date:
The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:       tested, passed
Spec compliant:           not tested
Documentation:            not tested

The basic functionality works as I expect.  In the following example I would have guessed it would return 4 rows
instead of 5.  I don't mind that it uses ceil here, but I think that deserves a mention in the documentation.
 

CREATE TABLE r100 (id INT);
INSERT INTO r100 SELECT generate_series(1, 100);

SELECT * FROM r100 FETCH FIRST 4.01 PERCENT ROWS ONLY;
 id
----
  1
  2
  3
  4
  5
(5 rows)
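
The extra row is consistent with the patch taking the ceiling of the fractional row count. A minimal sketch of that computation (assuming ceil semantics, which the output above suggests but the docs do not yet state):

```python
import math

def percent_limit(total_rows, percent):
    # Rows returned by FETCH FIRST p PERCENT, assuming ceil rounding.
    return math.ceil(total_rows * percent / 100.0)

print(percent_limit(100, 4.01))  # 5, matching the observed result
print(percent_limit(100, 4))     # 4
```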

There's a missing space between the period and following sentence in src\backend\executor\nodeLimit.c
    "previous time we got a different result.In PERCENTAGE option there are"


There's a missing space and the beginning "w" should be capitalized in doc\src\sgml\ref\select.sgml
    with <literal>PERCENT</literal> count specifies the maximum number of rows to return 
    in percentage.<literal>ROW</literal>



Ryan Lambert

The new status of this patch is: Waiting on Author

Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:
Hi Thomas,
Thank you for informing me

Hi Surafel,

There's a call to adjust_limit_rows_costs() hiding under
contrib/postgres_fdw, so this fails check-world.

Fixed. I also included the fixes for Ryan's review in the attached patch.

regards
Surafel
Attachment

Re: FETCH FIRST clause PERCENT option

From
Ryan Lambert
Date:
Hi Surafel,

The v5 patch [1] applies cleanly and passes make installcheck-world.

My concern with this is its performance.  As a user I would expect using this feature to enable queries to run faster than they would simply pulling the full table.  I tested on tables ranging from 10k rows up to 10 million with the same basic result: using FETCH FIRST N PERCENT is slower than using the full table.  At best it ran slightly slower than querying the full table; at worst it increased execution times by 1400% when using a high percentage (95%).

Testing was on a DigitalOcean 2 CPU, 2GB RAM droplet.  All queries were executed multiple times (6+) with EXPLAIN (ANALYZE, COSTS) and \timing enabled; the query plans provided are from the faster end of each query.

Create the test table:

DROP TABLE IF EXISTS r10mwide;
CREATE TABLE r10mwide AS
SELECT generate_series(1, 10000000)::INT AS id,
    random() AS v1,
    random() AS v2,
    random() AS v3,
    random() AS v4,
    random() AS v5,
    random() AS v6,
    random() AS v7,
    random() AS v8,
    random() AS v9,
    random() AS v10
    ;
VACUUM ANALYZE;


Start with a baseline query from the full table, the limiting will be done within the CTE as I would typically do in a production query.  The outer query does basic aggregates.  Below I provide each full query for easy copy/paste but the only real change is how the inner results are limited.  This runs in 2.2s off the full table.

EXPLAIN (ANALYZE, COSTS)
WITH t AS (
SELECT id, v1, v2
    FROM r10mwide
) SELECT AVG(v1), MIN(v1), AVG(v1 + v2) FROM t
;
       
                                                                   QUERY PLAN                                                                    
-------------------------------------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=227192.07..227192.08 rows=1 width=24) (actual time=2230.419..2230.419 rows=1 loops=1)
   ->  Gather  (cost=227191.84..227192.05 rows=2 width=72) (actual time=2228.321..2231.225 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=226191.84..226191.85 rows=1 width=72) (actual time=2218.754..2218.754 rows=1 loops=3)
               ->  Parallel Seq Scan on r10mwide  (cost=0.00..184524.92 rows=4166692 width=16) (actual time=0.032..934.652 rows=3333333 loops=3)
 Planning Time: 0.116 ms
 Execution Time: 2231.935 ms
(8 rows)

Time: 2232.459 ms (00:02.232)


It did use parallel; since the FETCH FIRST queries apparently won't, I turned that off and ran it again.

SET max_parallel_workers_per_gather = 0;

New query plan for full table, no parallel, just over 4 seconds.

                                                          QUERY PLAN                                                          
-------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=342859.21..342859.22 rows=1 width=24) (actual time=4253.861..4253.862 rows=1 loops=1)
   ->  Seq Scan on r10mwide  (cost=0.00..242858.60 rows=10000060 width=16) (actual time=0.062..1728.346 rows=10000000 loops=1)
 Planning Time: 0.082 ms
 Execution Time: 4253.918 ms
(4 rows)


The following uses an explicit row count to pull 100k rows (1%) and gives a massive improvement in speed; this query ranged from 55-120 ms.  This is the type of speedup I would expect, and it beats the full-table query hands down, regardless of parallel query.

EXPLAIN (ANALYZE, COSTS)
WITH t AS (
SELECT id, v1, v2
    FROM r10mwide
    FETCH FIRST 100000 ROWS ONLY
) SELECT AVG(v1), MIN(v1), AVG(v1 + v2) FROM t
;
       
                                                           QUERY PLAN                                                            
---------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=4428.58..4428.59 rows=1 width=24) (actual time=55.963..55.963 rows=1 loops=1)
   ->  Limit  (cost=0.00..2428.57 rows=100000 width=20) (actual time=0.041..35.725 rows=100000 loops=1)
         ->  Seq Scan on r10mwide  (cost=0.00..242858.60 rows=10000060 width=20) (actual time=0.039..24.796 rows=100000 loops=1)
 Planning Time: 0.134 ms
 Execution Time: 56.032 ms
(5 rows)

Time: 57.262 ms



Using FETCH FIRST 1 PERCENT it takes over 7 seconds, 71% slower than querying the full table.

EXPLAIN (ANALYZE, COSTS)
WITH t AS (
SELECT id, v1, v2
    FROM r10mwide
    FETCH FIRST 1 PERCENT ROWS ONLY
) SELECT AVG(v1), MIN(v1), AVG(v1 + v2) FROM t
;
                                                             QUERY PLAN                                                              
-------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=6857.22..6857.23 rows=1 width=24) (actual time=7112.205..7112.206 rows=1 loops=1)
   ->  Limit  (cost=2428.60..4857.19 rows=100001 width=20) (actual time=0.036..7047.394 rows=100000 loops=1)
         ->  Seq Scan on r10mwide  (cost=0.00..242858.60 rows=10000060 width=20) (actual time=0.033..3072.881 rows=10000000 loops=1)
 Planning Time: 0.132 ms
 Execution Time: 7214.302 ms
(5 rows)

Time: 7215.278 ms (00:07.215)


Cranking the percentage up to 95% took 59-75 seconds.

EXPLAIN (ANALYZE, COSTS)
WITH t AS (
SELECT id, v1, v2
    FROM r10mwide
    FETCH FIRST 95 PERCENT ROWS ONLY
) SELECT AVG(v1), MIN(v1), AVG(v1 + v2) FROM t
;
                                                               QUERY PLAN                                                              
-------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=651432.48..651432.49 rows=1 width=24) (actual time=58981.043..58981.044 rows=1 loops=1)
   ->  Limit  (cost=230715.67..461431.34 rows=9500057 width=20) (actual time=0.017..55799.389 rows=9500000 loops=1)
         ->  Seq Scan on r10mwide  (cost=0.00..242858.60 rows=10000060 width=20) (actual time=0.014..3847.146 rows=10000000 loops=1)
 Planning Time: 0.117 ms
 Execution Time: 59079.680 ms
(5 rows)


Even taking it way down to .001% (100 rows!) didn't run fast by any measure; it still took more than 6 seconds.

EXPLAIN (ANALYZE, COSTS)
WITH t AS (
SELECT id, v1, v2
    FROM r10mwide
    FETCH FIRST .001 PERCENT ROWS ONLY
) SELECT AVG(v1), MIN(v1), AVG(v1 + v2) FROM t
;


                                                             QUERY PLAN                                                              
-------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=6.86..6.87 rows=1 width=24) (actual time=6414.406..6414.406 rows=1 loops=1)
   ->  Limit  (cost=2.43..4.86 rows=100 width=20) (actual time=0.021..6413.981 rows=100 loops=1)
         ->  Seq Scan on r10mwide  (cost=0.00..242858.60 rows=10000060 width=20) (actual time=0.017..3086.504 rows=10000000 loops=1)
 Planning Time: 0.168 ms
 Execution Time: 6495.686 ms



[1] https://www.postgresql.org/message-id/attachment/102333/percent-incremental-v5.patch


Ryan Lambert
RustProof Labs


On Mon, Jul 8, 2019 at 1:09 AM Surafel Temesgen <surafel3000@gmail.com> wrote:
Hi Thomas,
Thank you for informing me

Hi Surafel,

There's a call to adjust_limit_rows_costs() hiding under
contrib/postgres_fdw, so this fails check-world.

Fixed. I also included the fixes for Ryan's review in the attached patch.

regards
Surafel

Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:

Hi Ryan,
On Tue, Jul 9, 2019 at 1:27 AM Ryan Lambert <ryan@rustprooflabs.com> wrote:
Hi Surafel,

The v5 patch [1] applies cleanly and passes make installcheck-world.

My concern with this is its performance.  As a user I would expect using this feature to enable queries to run faster than they would simply pulling the full table.  I tested on tables ranging from 10k rows up to 10 million with the same basic result: using FETCH FIRST N PERCENT is slower than using the full table.  At best it ran slightly slower than querying the full table; at worst it increased execution times by 1400% when using a high percentage (95%).

 

The cost of FETCH FIRST N PERCENT execution in the current implementation is the cost of pulling the full table plus the cost of storing and fetching the tuples from the tuplestore, so it cannot perform better than pulling the full table in any case. This is because we can't determine the number of rows to return without executing the plan to the end. We can find an estimate of the number of rows that will be returned in the planner, but that is not exact.
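
The execution strategy described here, where the subplan must be drained completely before the limit can even be computed, can be modeled in a few lines (a simplified sketch, not the actual executor code):

```python
import math

def fetch_first_percent(subplan, percent):
    # Materialize the entire subplan first (the tuplestore step)...
    store = list(subplan)
    # ...only then is the exact row count known.
    limit = math.ceil(len(store) * percent / 100.0)
    return store[:limit]

print(fetch_first_percent(range(10), 30))  # [0, 1, 2]
```

Regardless of how small the percentage is, `list(subplan)` runs to completion, which is why even FETCH FIRST .001 PERCENT pays the full-table cost.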


regards

Surafel

Re: FETCH FIRST clause PERCENT option

From
Ryan Lambert
Date:
Surafel,

> The cost of FITCH FIRST N PERCENT execution in current implementation is the cost of pulling the full table plus the cost of storing and fetching the tuple from tuplestore so it can > not perform better than pulling the full table in any case . This is because we can't determined the number of rows to return without executing the plan until the end. We can find the > estimation of rows that will be return in planner estimation but that is not exact.

Ok, I can live with that for the normal use cases.  The example from the end of my previous message using 95% still seems like a problem; I don't like syntax that unexpectedly kills performance like this one.  If this can't be improved in the initial release of the feature, I'd suggest we at least make a strong disclaimer in the docs, along the lines of:

"It is possible for FETCH FIRST N PERCENT to create poorly performing query plans when the N supplied exceeds 50 percent.  In these cases query execution can take an order of magnitude longer to execute than simply returning the full table.  If performance is critical using an explicit row count for limiting is recommended."

I'm not certain that 50 percent is the true threshold of where things start to fall apart; I just used that as a likely guess for now.  I can do some more testing this week to identify where things start falling apart performance-wise.  Thanks,

EXPLAIN (ANALYZE, COSTS)
WITH t AS (
SELECT id, v1, v2
    FROM r10mwide
    FETCH FIRST 95 PERCENT ROWS ONLY
) SELECT AVG(v1), MIN(v1), AVG(v1 + v2) FROM t
;
                                                               QUERY PLAN                                                              
-------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=651432.48..651432.49 rows=1 width=24) (actual time=58981.043..58981.044 rows=1 loops=1)
   ->  Limit  (cost=230715.67..461431.34 rows=9500057 width=20) (actual time=0.017..55799.389 rows=9500000 loops=1)
         ->  Seq Scan on r10mwide  (cost=0.00..242858.60 rows=10000060 width=20) (actual time=0.014..3847.146 rows=10000000 loops=1)
 Planning Time: 0.117 ms
 Execution Time: 59079.680 ms
(5 rows)  

Ryan Lambert

Re: FETCH FIRST clause PERCENT option

From
Ryan Lambert
Date:

I did some more testing.  I initialized a database with 1 million rows with indexes and joins to test against and ran pgbench with a few different settings for % to return.  I started with a base query not utilizing the new functionality. The queries used are similar to my prior examples, code at [1].

createdb bench_test
psql -d bench_test -f init/reporting.sql -v scale=10

The following provided 3.21 TPS and an average latency of 623 ms.  The "per_change_" columns in the table below use those values.

pgbench -c 2 -j 2 -T 600 -P 60 -s 10 \
   -f tests/reporting1.sql bench_test

The remainder of the tests use the following, only adjusting fetch_percent value:

pgbench -c 2 -j 2 -T 600 -P 60 -s 10 \
   --define=fetch_percent=1 \
   -f tests/reporting_fetch_percent.sql \
   bench_test


Returning 1% it runs well.  By 10% the TPS drops by 30% while the average latency increases by 43%.  When returning 95% of the table latency has increased by 548%.


 fetch_percent | tps  | latency_avg_ms | per_change_tps | per_change_latency
---------------+------+----------------+----------------+--------------------
             1 | 3.37 |            593 |           0.05 |              -0.05
             5 | 2.85 |            700 |          -0.11 |               0.12
            10 | 2.24 |            891 |          -0.30 |               0.43
            25 | 1.40 |           1423 |          -0.56 |               1.28
            45 | 0.93 |           2147 |          -0.71 |               2.45
            95 | 0.49 |           4035 |          -0.85 |               5.48
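
The "per_change" columns are just relative changes against the 3.21 TPS / 623 ms baseline; for example, the 95-percent row works out as:

```python
base_tps, base_latency_ms = 3.21, 623

def per_change(value, base):
    # Relative change vs. the no-PERCENT baseline, rounded as in the table.
    return round(value / base - 1, 2)

print(per_change(0.49, base_tps))        # -0.85
print(per_change(4035, base_latency_ms)) # 5.48
```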


I manually tested the inner select queries without the outer aggregation thinking it might be a different story with a simple select and no CTE.  Unfortunately it showed the same overall characteristics.  1% returns in about 550 ms, 45% took 1950, and 95% took 4050.

[1] https://github.com/rustprooflabs/pgbench-tests

Ryan


Re: FETCH FIRST clause PERCENT option

From
Kyotaro Horiguchi
Date:
Hello.

At Tue, 9 Jul 2019 21:56:32 -0600, Ryan Lambert <ryan@rustprooflabs.com> wrote in
<CAN-V+g-rwFp=xQEjOwbJuggNLegMi1qDhaJt3h1Eqm16yqwqmw@mail.gmail.com>
> I did some more testing.  I initialized a database with 1 million rows with
> indexes and joins to test against and ran pgbench with a few different
> settings for % to return.  I started with a base query not utilizing the
> new functionality. The queries used are similar to my prior examples, code
> at [1].
> 
> createdb bench_test
> psql -d bench_test -f init/reporting.sql -v scale=10
> 
> The following provided 3.21 TPS and an average latency of 623.  The
> "per_change_" columns in the table below use those values.
> 
> pgbench -c 2 -j 2 -T 600 -P 60 -s 10 \
>    -f tests/reporting1.sql bench_test
> 
> The remainder of the tests use the following, only adjusting fetch_percent
> value:
> 
> pgbench -c 2 -j 2 -T 600 -P 60 -s 10 \
>    --define=fetch_percent=1 \
>    -f tests/reporting_fetch_percent.sql \
>    bench_test
> 
> 
> Returning 1% it runs well.  By 10% the TPS drops by 30% while the average
> latency increases by 43%.  When returning 95% of the table latency has
> increased by 548%.
> 
> 
>  fetch_percent | tps  | latency_avg_ms | per_change_tps | per_change_latency
> ---------------+------+----------------+----------------+--------------------
>              1 | 3.37 |            593 |           0.05 |              -0.05
>              5 | 2.85 |            700 |          -0.11 |               0.12
>             10 | 2.24 |            891 |          -0.30 |               0.43
>             25 | 1.40 |           1423 |          -0.56 |               1.28
>             45 | 0.93 |           2147 |          -0.71 |               2.45
>             95 | 0.49 |           4035 |          -0.85 |               5.48
> 
> 
> I manually tested the inner select queries without the outer aggregation
> thinking it might be a different story with a simple select and no CTE.
> Unfortunately it showed the same overall characteristics.  1% returns in
> about 550 ms, 45% took 1950, and 95% took 4050.
> 
> [1] https://github.com/rustprooflabs/pgbench-tests

It can be seen with a simpler test.

create table t as select a from generate_series(0, 99999) a;
analyze t;
explain analyze select * from t order by a desc;
 Execution Time: 116.613 ms
explain analyze select * from t order by a desc fetch first 1 percent rows only;
 Execution Time: 158.458 ms
explain analyze select * from t order by a desc fetch first 100 percent rows only;
 Execution Time: 364.442 ms

I didn't look closer at the version. Fetching from the tuplestore
and returning all tuples costs 206ms, which exceeds the cost
of fetching the whole table and returning all tuples. I don't
believe a tuplestore that isn't spilling out to disk is that much slower
than (cached) table access.
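
The 206 ms figure is the gap between the 100-percent and 1-percent timings above (both pay for materializing the whole result in the tuplestore; only the 100-percent case reads all of it back out):

```python
t_1pct_ms = 158.458    # fetch first 1 percent rows only
t_100pct_ms = 364.442  # fetch first 100 percent rows only
print(round(t_100pct_ms - t_1pct_ms))  # 206
```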

Other than that, we can rip the clause if it is 100%

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: FETCH FIRST clause PERCENT option

From
Kyotaro Horiguchi
Date:
Hello.

At Wed, 10 Jul 2019 15:02:57 +0900 (Tokyo Standard Time), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
<20190710.150257.260806103.horikyota.ntt@gmail.com>
> It is seen by a simpler test.
> 
> create table t as select a from generate_series(0, 99999) a;
> analyze t;
> explain analyze select * from t order by a desc;
>  Execution Time: 116.613 ms
> explain analyze select * from t order by a desc fetch first 1 percent rows only;
>  Execution Time: 158.458 ms
> explain analyze select * from t order by a desc fetch first 100 percent rows only;
>  Execution Time: 364.442 ms
> 
> I didn't look closer at the version. Fetching from the tuplestore
> and returning all tuples costs 206ms, which exceeds the cost
> of fetching the whole table and returning all tuples. I don't
> believe a tuplestore that isn't spilling out to disk is that much slower
> than (cached) table access.
> 
> Other than that, we can rip the clause if it is 100%

As a more significant point, I found that the first query in the
above runs faster by about 10-18% on master (unpatched).

explain analyze select * from t order by a desc;
 Execution Time: 96.690 ms

But perf didn't give me useful information.

patched:

11857    11.7065  postgres                 qsort_ssup
9026      8.9114  postgres                 ApplySortComparator
6443      6.3612  [vdso] (tgid:8388 range:0x7ffed49ed000-0x7ffed49eefff) [vdso] (tgid:8388
range:0x7ffed49ed000-0x7ffed49eefff)
5826      5.7520  postgres                 btint4fastcmp
4699      4.6393  no-vmlinux               /no-vmlinux
3451      3.4072  libc-2.17.so             __memcpy_ssse3_back
3270      3.2285  postgres                 LogicalTapeWrite
2972      2.9343  postgres                 copytup_heap
2961      2.9234  postgres                 readtup_heap
2769      2.7338  postgres                 LogicalTapeRead
2457      2.4258  postgres                 GetMemoryChunkContext
2147      2.1197  postgres                 InstrStopNode
2021      1.9953  postgres                 heapgettup_pagemode
1583      1.5629  postgres                 writetup_heap
1555      1.5353  postgres                 tuplesort_gettuple_common
1508      1.4889  postgres                 AllocSetAlloc
...

master:

12932    12.0168  postgres                 qsort_ssup
9491      8.8193  postgres                 ApplySortComparator
6705      6.2305  postgres                 btint4fastcmp
6557      6.0930  [vdso] (tgid:6341 range:0x7ffdd0315000-0x7ffdd0316fff) [vdso] (tgid:6341
range:0x7ffdd0315000-0x7ffdd0316fff)
4874      4.5291  no-vmlinux               /no-vmlinux
4059      3.7717  postgres                 readtup_heap
3707      3.4447  libc-2.17.so             __memcpy_ssse3_back
3583      3.3294  postgres                 LogicalTapeWrite
3382      3.1427  postgres                 LogicalTapeRead
3001      2.7886  postgres                 copytup_heap
2522      2.3435  postgres                 GetMemoryChunkContext
2464      2.2896  postgres                 heapgettup_pagemode
2115      1.9653  postgres                 InstrStopNode
1847      1.7163  postgres                 tuplesort_gettuple_common
1652      1.5351  postgres                 writetup_heap
1565      1.4542  postgres                 AllocSetAlloc

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: FETCH FIRST clause PERCENT option

From
Alvaro Herrera
Date:
On 2019-Jul-08, Surafel Temesgen wrote:

> Hi Thomas,
> Thank you for informing me
> 
> Hi Surafel,
> >
> > There's a call to adjust_limit_rows_costs() hiding under
> > contrib/postgres_fdw, so this fails check-world.
> >
> 
> Fixed . I also include the review given by Ryan in attached patch

What's with the new tuplestore function for getting heap tuples?  That
looks really odd.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:

Hi Alvaro,
On Wed, Jul 10, 2019 at 6:44 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
What's with the new tuplestore function for getting heap tuples?  That
looks really odd.
 

Previously I created a new TTSOpsMinimalTuple slot for every tuple returned in order to fetch it from the tuplestore, because the tuplestore stores tuples in MinimalTuple format and the slot format changes in subsequent operations. This caused a resource leak, as Andres mentioned, so I created this function to remove the need for creating a new slot for every returned tuple because of the format difference.


regards

Surafel

Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:

Hi Ryan,
On Tue, Jul 9, 2019 at 4:13 PM Ryan Lambert <ryan@rustprooflabs.com> wrote:

"It is possible for FETCH FIRST N PERCENT to create poorly performing query plans when the N supplied exceeds 50 percent.  In these cases query execution can take an order of magnitude longer to execute than simply returning the full table.  If performance is critical using an explicit row count for limiting is recommended."

I don't understand how the FETCH FIRST N PERCENT functionality can be replaced with explicit row-count limiting. There may be a way to do it on the client side, but we cannot be sure of its performance advantage.


regards

Surafel

  

Re: FETCH FIRST clause PERCENT option

From
Ryan Lambert
Date:
Surafel,

On Wed, Jul 17, 2019 at 3:45 AM Surafel Temesgen <surafel3000@gmail.com> wrote:

Hi Ryan,
On Tue, Jul 9, 2019 at 4:13 PM Ryan Lambert <ryan@rustprooflabs.com> wrote:

"It is possible for FETCH FIRST N PERCENT to create poorly performing query plans when the N supplied exceeds 50 percent.  In these cases query execution can take an order of magnitude longer to execute than simply returning the full table.  If performance is critical using an explicit row count for limiting is recommended."

I don't understand how the FETCH FIRST N PERCENT functionality can be replaced with explicit row-count limiting. There may be a way to do it on the client side, but we cannot be sure of its performance advantage.


regards 

Surafel 

  

I was suggesting a warning in the documentation so users aren't caught unaware about the performance characteristics.  My first version was very rough, how about adding this in doc/src/sgml/ref/select.sgml?

"Using <literal>PERCENT</literal> is best suited to returning single-digit percentages of the query's total row count."

The following paragraphs in that same section give suggestions and warnings regarding LIMIT and OFFSET usage, I think this is more in line with the wording of those existing warnings.

Other than that, we can rip the clause if it is 100%

You mean if PERCENT=100 it should short circuit and run the query normally?  I like that.
That got me thinking, I didn't check what happens with PERCENT>100, I'll try to test that soon.

Thanks,
Ryan


Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:

Hi Ryan,
sorry for not being fast to reply


I was suggesting a warning in the documentation so users aren't caught unaware about the performance characteristics.  My first version was very rough, how about adding this in doc/src/sgml/ref/select.sgml?

"Using <literal>PERCENT</literal> is best suited to returning single-digit percentages of the query's total row count."

The following paragraphs in that same section give suggestions and warnings regarding LIMIT and OFFSET usage, I think this is more in line with the wording of those existing warnings.


Agreed and done
 
Other than that, we can rip the clause if it is 100%

You mean if PERCENT=100 it should short circuit and run the query normally?  I like that.

The patch does not do it automatically. It is currently the query writer's obligation to do that.
 
regards
Surafel
Attachment

Re: FETCH FIRST clause PERCENT option

From
Ryan Lambert
Date:
Surafel,

The patch does not do it automatically. It is currently the query writer's obligation to do that.

Ok.

Your latest patch [1] passes make installcheck-world, I didn't test the actual functionality this round.

   plan = (Plan *) make_limit(plan,
    subparse->limitOffset,
-   subparse->limitCount);
+   subparse->limitCount,
+   subparse->limitOption);
 

I assume the limit percentage number goes into subparse->limitCount?  If so, I don't see that documented.  Or does this truly only store the count?

The remainder of the code seems to make sense.  I attached a patch with a few minor changes in the comments.  I need to go back to my notes and see if I covered all the testing I had thought of, I should get to that later this week.


Ryan Lambert


Attachment

Re: FETCH FIRST clause PERCENT option

From
Kyotaro Horiguchi
Date:
Hello.

At Thu, 1 Aug 2019 15:54:11 +0300, Surafel Temesgen <surafel3000@gmail.com> wrote in
<CALAY4q_icXPTcAVxnPckpLGrZgER9-FWM0VPFvvOsPiJrnK6Dg@mail.gmail.com>
> > Other than that, we can rip the clause if it is 100%
> >
> >
> > You mean if PERCENT=100 it should short circuit and run the query
> > normally?  I like that.
> >
> 
> The patch did not did it automatically. Its query writer obligation to do
> that currently

I have some comments.

This patch uses distinct parameters for exact number and
percentage. On the other hand the planner has a notion of
tuple_fraction covering both. The same handling is also used
for tuple number estimation. Using the same scheme will make
things far simpler. See the comment of grouping_planner().

In the executor part, in addition to LimitState.position, this patch
uses LimitState.backwardPosition to count how many tuples we're
back from the end of the current tuplestore. But things will get
simpler by just asking the tuplestore for the number of tuples it
holds.


+    slot = node->subSlot;

Why is this needed? The variable is properly set before use and
the assignment is bogus in the first place.

The new code block in LIMIT_RESCAN in ExecLimit is useless since
it is exactly the same as the existing code block. Why didn't you
use the existing if block?

>                if (node->limitOption == PERCENTAGE)
+                {
+                    node->perExactCount = ceil(node->percent * node->position / 100.0);
+
+

node->position is the number of tuples returned to upper node so
far (not the number of tuples this node has read from the lower
node so far).  I don't understand what the expression means.


+                    if (node->perExactCount == node->perExactCount + 1)
+                        node->perExactCount++;

What? The condition never be true. As the result, the following
if block below won't run.

>                /*
+                 * Return the tuple up to the number of exact count in OFFSET
+                 * clause without percentage value consideration.
+                 */
+                if (node->perExactCount > 0)
+                {
+




+            /*
+             * We may needed this tuple in backward scan so put it into
+             * tuplestore.
+             */
+            if (node->limitOption == PERCENTAGE)
+            {
+                tuplestore_puttupleslot(node->tupleStore, slot);
+                tuplestore_advance(node->tupleStore, true);
+            }

"needed"->"need" ? The comment says that this is needed for
backward scan, but it is actually required even in forward
scan. More to the point, tuplestore_advance lacks comment.

Anyway, the code in LIMIT_RESCAN is broken in some ways. For
example, if it is used as the inner scan of a NestLoop, the
tuplestore accumulates tuples by every scan. You will see that
the tuplestore stores up to 1000 tuples (10 times of the inner)
by the following query.

create table t0 as (select a from generate_series(0, 99) a);
create table t1 as (select a from generate_series(0, 9) a);
analyze;
set enable_hashjoin to false;
set enable_mergejoin to false;
set enable_material to false;
explain analyze select t.*, t1.* from (select a from t0 fetch first 10 percent rows only) as t join t1 on (t.a = t1.a);
                               QUERY PLAN                             
-------------------------------------------------------------------------------
 Nested Loop  (cost=0.20..7.35 rows=10 width=8) (actual ...)
   ->  Seq Scan on t1  (cost=0.00..1.10 rows=10 width=4) (actual ...)
   ->  Limit  (cost=0.20..0.40 rows=10 width=4) (actual ...)
         ->  Seq Scan on t0  (cost=0.00..2.00 rows=100 width=4) (actual ...)

That's it for now.


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:

Hi
On Wed, Aug 7, 2019 at 6:11 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:

I have some comments.

This patch uses distinct parameters for exact number and
percentage. On the othe hand planner has a notion of
tuple_fraction covering the both. The same handling is also used
for tuple number estimation. Using the same scheme will make
things far simpler. See the comment of grouping_planner().


It's because of the data type difference. In the planner the data type is the same.

In executor part, addition to LimiteState.position, this patch
uses LimiteState.backwardPosition to count how many tuples we're
back from the end of the current tuplestore. But things will get
simpler by just asking the tuplestore for the number of holding
tuples.


backwardPosition holds how many tuples have been returned in a backward scan
 

+    slot = node->subSlot;

Why is this needed? The variable is properly set before use and
the assignment is bogus in the first place.


It's because the tuplestore needs an initialized slot.
 
The new code block in LIMIT_RESCAN in ExecLimit is useless since
it is exatctly the same with existing code block. Why didn't you
use the existing if block?

But they test different scenarios
 

>                if (node->limitOption == PERCENTAGE)
+                {
+                    node->perExactCount = ceil(node->percent * node->position / 100.0);
+
+

node->position is the number of tuples returned to upper node so
far (not the number of tuples this node has read from the lower
node so far).  I don't understand what the expression means.

node->position holds the number of tuples this node has read from the lower
node so far. See the LIMIT_RESCAN state.
 


+                    if (node->perExactCount == node->perExactCount + 1)
+                        node->perExactCount++;

What? The condition never be true. As the result, the following
if block below won't run.

It becomes true according to the number of tuples returned from the lower node so far
and the percentage specification.
 

>                /*
+                 * Return the tuple up to the number of exact count in OFFSET
+                 * clause without percentage value consideration.
+                 */
+                if (node->perExactCount > 0)
+                {
+




+            /*
+             * We may needed this tuple in backward scan so put it into
+             * tuplestore.
+             */
+            if (node->limitOption == PERCENTAGE)
+            {
+                tuplestore_puttupleslot(node->tupleStore, slot);
+                tuplestore_advance(node->tupleStore, true);
+            }

"needed"->"need" ? The comment says that this is needed for
backward scan, but it is actually required even in forward
scan. More to the point, tuplestore_advance lacks comment.

ok
 

Anyway, the code in LIMIT_RESCAN is broken in some ways. For
example, if it is used as the inner scan of a NestLoop, the
tuplestore accumulates tuples by every scan. You will see that
the tuplestore stores up to 1000 tuples (10 times of the inner)
by the following query.

This is because with a percentage we scan the whole table
 

regards
Surafel

Re: FETCH FIRST clause PERCENT option

From
Ryan Lambert
Date:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           not tested
Documentation:            tested, passed

The latest patch [1] and the cleanup patch [2] apply cleanly to the latest master (5f110933).  I reviewed the
conversation and don't see any outstanding questions or concerns.  Updating status to ready for committer.
 

[1] https://www.postgresql.org/message-id/attachment/103028/percent-incremental-v6.patch
[2] https://www.postgresql.org/message-id/attachment/103157/percent-incremental-v6-comment-cleanup.patch

Ryan Lambert

The new status of this patch is: Ready for Committer

Re: FETCH FIRST clause PERCENT option

From
Erik Rijkers
Date:
On 2019-08-19 01:33, Ryan Lambert wrote:
> The following review has been posted through the commitfest 
> application:
> make installcheck-world:  tested, passed
> Implements feature:       tested, passed
> Spec compliant:           not tested
> Documentation:            tested, passed
> 
> The latest patch [1] and the cleanup patch [2] apply cleanly to the
> latest master (5f110933).  I reviewed the conversation and don't see
> any outstanding questions or concerns.  Updating status to ready for
> committer.
> 
> [1] > 
> https://www.postgresql.org/message-id/attachment/103028/percent-incremental-v6.patch
> [2] > 
> https://www.postgresql.org/message-id/attachment/103157/percent-incremental-v6-comment-cleanup.patch
> 
> Ryan Lambert
> 
> The new status of this patch is: Ready for Committer

Hi,

(running with those two patches applied)

   select * from onek where thousand < 5 order by thousand fetch first -1 
percent rows only
is correctly caught (with "ERROR:  PERCENT must not be negative") but:

   select * from  onek where thousand < 5 order by thousand fetch first 
101 percent rows only
is not. It doesn't return, and cannot be Ctrl-C'ed out of, which I guess 
is another bug?


thanks,

Erik Rijkers






Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:

Hi Erik
On Mon, Aug 19, 2019 at 10:47 AM Erik Rijkers <er@xs4all.nl> wrote:
Hi,

(running with those two patches applied)

   select * from onek where thousand < 5 order by thousand fetch first -1
percent rows only
is correctly caught (with "ERROR:  PERCENT must not be negative") but:

   select * from  onek where thousand < 5 order by thousand fetch first
101 percent rows only
is not. It doesn't return, and cannot be Ctrl-C'ed out of, which I guess
is another bug?

Fixed

regards
Surafel
Attachment

Re: FETCH FIRST clause PERCENT option

From
Erik Rijkers
Date:
On 2019-08-19 11:18, Surafel Temesgen wrote:
> 
> [..]
> 
> [percent-incremental-v7.patch]

Thanks.


Another little thing, not sure it's a bug:

limit interprets its argument by rounding up or down as one would 
expect:

table onek limit 10.4;  --> gives 10 rows
table onek limit 10.6;  --> gives 11 rows

but  FETCH count PERCENT  does not do that; it rounds always up.

select * from (table onek limit 100) f fetch first 10.4 percent rows 
only; --> gives 11 rows
select * from (table onek limit 100) f fetch first 10.6 percent rows 
only; --> gives 11 rows

I see that it's documented in the .sgml to behave as it does, but it 
seems wrong to me;
shouldn't that 10.4-percent-query yield 10 rows instead of 11?


thanks,

Erik Rijkers











Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:


On Mon, Aug 19, 2019 at 1:55 PM Erik Rijkers <er@xs4all.nl> wrote:
Another little thing, not sure it's a bug:

limit interprets its argument by rounding up or down as one would
expect:

table onek limit 10.4;  --> gives 10 rows
table onek limit 10.6;  --> gives 11 rows

but  FETCH count PERCENT  does not do that; it rounds always up.

select * from (table onek limit 100) f fetch first 10.4 percent rows
only; --> gives 11 rows
select * from (table onek limit 100) f fetch first 10.6 percent rows
only; --> gives 11 rows

I see that it's documented in the .sgml to behave as it does, but it
seems wrong to me;
shouldn't that 10.4-percent-query yield 10 rows instead of 11?

 

According to the SQL standard we have to use the ceiling of the result value

regards
Surafel

Re: FETCH FIRST clause PERCENT option

From
Kyotaro Horiguchi
Date:
Hi,

At Wed, 7 Aug 2019 10:20:09 +0300, Surafel Temesgen <surafel3000@gmail.com> wrote in
<CALAY4q98xbVHtZ4yj9DcCMG2-s1_JuRr7fYaNfW+bKmr22OiVQ@mail.gmail.com>
> Hi
> On Wed, Aug 7, 2019 at 6:11 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com>
> wrote:
> 
> >
> > I have some comments.
> >
> > This patch uses distinct parameters for exact number and
> > percentage. On the othe hand planner has a notion of
> > tuple_fraction covering the both. The same handling is also used
> > for tuple number estimation. Using the same scheme will make
> > things far simpler. See the comment of grouping_planner().
> >
> >
> Its because of data type difference .In planner the data type is the same

I meant that, with the usage of tuple_fraction, one variable can
represent both the absolute limit and relative limit. This
simplifies parse structs.

> > In executor part, addition to LimiteState.position, this patch
> > uses LimiteState.backwardPosition to count how many tuples we're
> > back from the end of the current tuplestore. But things will get
> > simpler by just asking the tuplestore for the number of holding
> > tuples.
> >
> >
> backwardPosition hold how many tuple returned in backward scan

I meant that we don't need to hold the number in estate.

> > +    slot = node->subSlot;
> >
> > Why is this needed? The variable is properly set before use and
> > the assignment is bogus in the first place.
> >
> >
> its because Tuplestore needs initialized slot.

I meant that the initialized slot is overwritten before first use.

> > The new code block in LIMIT_RESCAN in ExecLimit is useless since
> > it is exatctly the same with existing code block. Why didn't you
> > use the existing if block?
> >
> 
> But they test different scenario

I meant that the two different scenario can share the code block.

> > >                if (node->limitOption == PERCENTAGE)
> > +                {
> > +                    node->perExactCount = ceil(node->percent *
> > node->position / 100.0);
> > +
> > +
> >
> > node->position is the number of tuples returned to upper node so
> > far (not the number of tuples this node has read from the lower
> > node so far).  I don't understand what the expression means.
> >
> 
> node->position hold the number of tuples this node has read from the lower
> node so far. see LIMIT_RESCAN state

Really? node->position is incremented when
tuplestore_gettupleslot_heaptuple() succeeds and returns the
tuple to the caller immediately...

> > +                    if (node->perExactCount == node->perExactCount + 1)
> > +                        node->perExactCount++;
> >
> > What? The condition never be true. As the result, the following
> > if block below won't run.
> >
> 
> it became true according to the number of tuple returned in from the lower
> node so far
> and percentage specification.

Mmm. How do you think X can be equal to  (X + 1)?

> > >                /*
> > +                 * Return the tuple up to the number of exact count in
> > OFFSET
> > +                 * clause without percentage value consideration.
> > +                 */
> > +                if (node->perExactCount > 0)
> > +                {
> > +
> >
> >
> >
> >
> > +            /*
> > +             * We may needed this tuple in backward scan so put it into
> > +             * tuplestore.
> > +             */
> > +            if (node->limitOption == PERCENTAGE)
> > +            {
> > +                tuplestore_puttupleslot(node->tupleStore, slot);
> > +                tuplestore_advance(node->tupleStore, true);
> > +            }
> >
> > "needed"->"need" ? The comment says that this is needed for
> > backward scan, but it is actually required even in forward
> > scan. More to the point, tuplestore_advance lacks comment.
> 
> 
> ok
> 
> 
> >
> > Anyway, the code in LIMIT_RESCAN is broken in some ways. For
> > example, if it is used as the inner scan of a NestLoop, the
> > tuplestore accumulates tuples by every scan. You will see that
> > the tuplestore stores up to 1000 tuples (10 times of the inner)
> > by the following query.
> >
> 
> It this because in percentage we scan the whole table

It's useless and rather harmful that the tuplestore holds an
indefinite number of duplicate sets of the whole tuples from the
lower node. We must reuse tuples already stored in the tuplestore
or clear it before the next round.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:


On Tue, Aug 20, 2019 at 9:10 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
Hi,

At Wed, 7 Aug 2019 10:20:09 +0300, Surafel Temesgen <surafel3000@gmail.com> wrote in <CALAY4q98xbVHtZ4yj9DcCMG2-s1_JuRr7fYaNfW+bKmr22OiVQ@mail.gmail.com>
> Hi
> On Wed, Aug 7, 2019 at 6:11 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com>
> wrote:
>
> >
> > I have some comments.
> >
> > This patch uses distinct parameters for exact number and
> > percentage. On the othe hand planner has a notion of
> > tuple_fraction covering the both. The same handling is also used
> > for tuple number estimation. Using the same scheme will make
> > things far simpler. See the comment of grouping_planner().
> >
> >
> Its because of data type difference .In planner the data type is the same

I meant that, with the usage of tuple_fraction, one variable can
represent both the absolute limit and relative limit. This
simplifies parse structs.

In grouping_planner the patch sets the tuple bound to -1 in create_ordered_paths because with a percentage it iterates until the end. Does that answer your question?


> > In executor part, addition to LimiteState.position, this patch
> > uses LimiteState.backwardPosition to count how many tuples we're
> > back from the end of the current tuplestore. But things will get
> > simpler by just asking the tuplestore for the number of holding
> > tuples.
> >
> >
> backwardPosition hold how many tuple returned in backward scan

I meant that we don't need to hold the number in estate.

I did it this way because I didn't find an API in tuplestore to do this

 

> > +    slot = node->subSlot;
> >
> > Why is this needed? The variable is properly set before use and
> > the assignment is bogus in the first place.
> >
> >
> its because Tuplestore needs initialized slot.

I meant that the initilized slot is overwritten before first use.

> > The new code block in LIMIT_RESCAN in ExecLimit is useless since
> > it is exatctly the same with existing code block. Why didn't you
> > use the existing if block?
> >
>
> But they test different scenario

I meant that the two different scenario can share the code block.

Sorry for not clarifying well. One has to be checked before offsetting and the other after offsetting.

 

> > >                if (node->limitOption == PERCENTAGE)
> > +                {
> > +                    node->perExactCount = ceil(node->percent *
> > node->position / 100.0);
> > +
> > +
> >
> > node->position is the number of tuples returned to upper node so
> > far (not the number of tuples this node has read from the lower
> > node so far).  I don't understand what the expression means.
> >
>
> node->position hold the number of tuples this node has read from the lower
> node so far. see LIMIT_RESCAN state

Reallly? node->position is incremented when
tuplestore_gettupleslot_heaptuple() succeeded and reutnrs the
tuple to the caller immediately...

> > +                    if (node->perExactCount == node->perExactCount + 1)
> > +                        node->perExactCount++;
> >
> > What? The condition never be true. As the result, the following
> > if block below won't run.
> >
>
> it became true according to the number of tuple returned in from the lower
> node so far
> and percentage specification.

Mmm. How do you think X can be equal to  (X + 1)?

Oops, my bad. The attached patch removes the need to do that.

> > >                /*
> > +                 * Return the tuple up to the number of exact count in
> > OFFSET
> > +                 * clause without percentage value consideration.
> > +                 */
> > +                if (node->perExactCount > 0)
> > +                {
> > +
> >
> >
> >
> >
> > +            /*
> > +             * We may needed this tuple in backward scan so put it into
> > +             * tuplestore.
> > +             */
> > +            if (node->limitOption == PERCENTAGE)
> > +            {
> > +                tuplestore_puttupleslot(node->tupleStore, slot);
> > +                tuplestore_advance(node->tupleStore, true);
> > +            }
> >
> > "needed"->"need" ? The comment says that this is needed for
> > backward scan, but it is actually required even in forward
> > scan. More to the point, tuplestore_advance lacks comment.
>
>
> ok
>
>
> >
> > Anyway, the code in LIMIT_RESCAN is broken in some ways. For
> > example, if it is used as the inner scan of a NestLoop, the
> > tuplestore accumulates tuples by every scan. You will see that
> > the tuplestore stores up to 1000 tuples (10 times of the inner)
> > by the following query.
> >
>
> It this because in percentage we scan the whole table

It's useless and rather harmful that the tuplestore holds
indefinite number of duplicate set of the whole tuples from the
lower node. We must reuse tuples already stored in the tuplestore
or clear it before the next round.
 

I agree with this optimization but it doesn't have to be in the first version

regards
Surafel
Attachment

Re: FETCH FIRST clause PERCENT option

From
Ryan Lambert
Date:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            tested, passed

Hi everyone,

The v8 patch [1] applies cleanly and passes tests.  I reviewed the conversation to date and from what I can see, all
identified bugs have been fixed and concerns either fixed or addressed.  Marking as ready for committer.
 

[1] https://www.postgresql.org/message-id/attachment/103486/percent-incremental-v8.patch

Ryan Lambert

The new status of this patch is: Ready for Committer

Re: FETCH FIRST clause PERCENT option

From
Tom Lane
Date:
Surafel Temesgen <surafel3000@gmail.com> writes:
> [ percent-incremental-v8.patch ]

I took a quick look through this.

* Why is this adding new functionality in tuplestore.c?  Especially
functionality to get out a different tuple representation than what was
put in?  I can't see a reason why nodeLimit should need that, and for
sure I don't see a reason why it would need it for this feature if it
didn't before.

* The business with query.limitCount having different data types
depending on LimitOption seems entirely bletcherous and bug-inducing.
(There are live bugs of that sort in what you have now: notably, that
kluge in adjust_limit_rows_costs will crash and burn if float8 is
pass-by-reference.  Breaking the Datum abstraction is bad practice.)
I wonder if we could get away with making limitCount be float8 all the
time.  This would mean that you'd lose exactness for counts above 2^53
(or so, depending on whether float8 is IEEE or not), but I doubt that
anybody would ever notice.

* Names like "PERCENTAGE" seem mighty generic to be exposing as globally
known symbols; also you're disregarding a fairly widespread PG
convention that members of an enum should have names derived from the
enum type's name.  So I'd be inclined to make the values of enum
LimitOption be LIMIT_OPTION_COUNT and LIMIT_OPTION_PERCENT.  (I don't
especially like EXACT_NUMBER, first because it's going to be quite long
with the prefix, and second because it suggests that maybe there's an
INEXACT_NUMBER alternative.)

* The "void *limitOption" business in gram.y also seems pretty ugly;
it'd be better for that to be enum LimitOption.  I gather you want
the null to indicate "no option given", but likely it'd be better
to invent a LIMIT_OPTION_DEFAULT enum alternative for that purpose.

* An alternative to the three points above is just to separate
Query.limitCount (an int8) from Query.limitPercent (a float8), with the
understanding that at most one of the two can be non-null, and use
similar representations at other steps of the processing chain.  Then
you don't need enum limitOption at all.  That might not scale especially
nicely to additional variants, but trying to overload limitCount even
further isn't going to be nice either.

* The proposed change in grouping_planner() seems like a complete hack.
Why would you not change the way that limit_tuples was computed earlier,
instead?  That is, it seems like the part of the planner to teach about
this feature is preprocess_limit (and limit_needed), not someplace else.
It is surely not sane that only this one planner decision would react to
a percentage-based limit.

* I didn't really study the changes in nodeLimit.c, but doing
"tuplestore_rescan" in ExecReScanLimit is surely just wrong.  You
probably want to delete and recreate the tuplestore, instead, since
whatever data you already collected is of no further use.  Maybe, in
the case where no rescan of the child node is needed, you could re-use
the data already collected; but that would require a bunch of additional
logic.  I'm inclined to think that v1 of the patch shouldn't concern
itself with that sort of optimization.

* I don't see any delta in ruleutils.c, but surely that needs to know
about this feature so that it can correctly print views using it.

            regards, tom lane



Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:

Hi Tom,
On Fri, Sep 6, 2019 at 1:26 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Surafel Temesgen <surafel3000@gmail.com> writes:
> [ percent-incremental-v8.patch ]

I took a quick look through this.

* Why is this adding new functionality in tuplestore.c?  Especially
functionality to get out a different tuple representation than what was
put in? 

If I am not mistaken, tuplestore stores a MinimalTuple (a tuple without transaction status information) for minimal memory usage. It converts what we put in into a MinimalTuple, so we need the same format for getting it out. The other way I can think of doing it is allocating a new MinimalTuple slot every time to get the tuple, which has a resource cost. The operation is very similar to RunFromStore, but without the ability to reuse a once-allocated slot.

 
I can't see a reason why nodeLimit should need that, and for
sure I don't see a reason why it would need it for this feature if it
didn't before.

Only this feature needs tuple storage



* The business with query.limitCount having different data types
depending on LimitOption seems entirely bletcherous and bug-inducing.
(There are live bugs of that sort in what you have now: notably, that
kluge in adjust_limit_rows_costs will crash and burn if float8 is
pass-by-reference.  Breaking the Datum abstraction is bad practice.)
I wonder if we could get away with making limitCount be float8 all the
time.  This would mean that you'd lose exactness for counts above 2^53
(or so, depending on whether float8 is IEEE or not), but I doubt that
anybody would ever notice.

* Names like "PERCENTAGE" seem mighty generic to be exposing as globally
known symbols; also you're disregarding a fairly widespread PG
convention that members of an enum should have names derived from the
enum type's name.  So I'd be inclined to make the values of enum
LimitOption be LIMIT_OPTION_COUNT and LIMIT_OPTION_PERCENT.  (I don't
especially like EXACT_NUMBER, first because it's going to be quite long
with the prefix, and second because it suggests that maybe there's an
INEXACT_NUMBER alternative.)

* The "void *limitOption" business in gram.y also seems pretty ugly;
it'd be better for that to be enum LimitOption.  I gather you want
the null to indicate "no option given", but likely it'd be better
to invent a LIMIT_OPTION_DEFAULT enum alternative for that purpose.


I would prefer to fix the above three points, because of scaling 

* An alternative to the three points above is just to separate
Query.limitCount (an int8) from Query.limitPercent (a float8), with the
understanding that at most one of the two can be non-null, and use
similar representations at other steps of the processing chain.  Then
you don't need enum limitOption at all.  That might not scale especially
nicely to additional variants, but trying to overload limitCount even
further isn't going to be nice either.

* The proposed change in grouping_planner() seems like a complete hack.
Why would you not change the way that limit_tuples was computed earlier,
instead?
That is, it seems like the part of the planner to teach about
this feature is preprocess_limit (and limit_needed), not someplace else.
It is surely not sane that only this one planner decision would react to
a percentage-based limit.

* I didn't really study the changes in nodeLimit.c, but doing
"tuplestore_rescan" in ExecReScanLimit is surely just wrong.  You
probably want to delete and recreate the tuplestore, instead, since
whatever data you already collected is of no further use.  Maybe, in
the case where no rescan of the child node is needed, you could re-use
the data already collected; but that would require a bunch of additional
logic.  I'm inclined to think that v1 of the patch shouldn't concern
itself with that sort of optimization.


ok
 
* I don't see any delta in ruleutils.c, but surely that needs to know
about this feature so that it can correctly print views using it.

ok

regards
Surafel

Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:
Hi Tom,
In the attached patch I included the comments given

regards
Surafel
Attachment

Re: FETCH FIRST clause PERCENT option

From
Ryan Lambert
Date:
On Thu, Sep 19, 2019 at 6:52 AM Surafel Temesgen <surafel3000@gmail.com> wrote:
Hi Tom,
In the attached patch i include the comments given

regards
Surafel

Patch v9 applies and passes make installcheck-world.  

> From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> Date: 2019-09-05 22:26:29

> * I didn't really study the changes in nodeLimit.c, but doing
> "tuplestore_rescan" in ExecReScanLimit is surely just wrong.  You
> probably want to delete and recreate the tuplestore, instead, since
> whatever data you already collected is of no further use.  Maybe, in
> the case where no rescan of the child node is needed, you could re-use
> the data already collected; but that would require a bunch of additional
> logic.  I'm inclined to think that v1 of the patch shouldn't concern
> itself with that sort of optimization.

I don't think this was addressed.  

Ryan Lambert


Re: FETCH FIRST clause PERCENT option

From
Kyotaro Horiguchi
Date:
At Thu, 26 Sep 2019 15:13:42 -0600, Ryan Lambert <ryan@rustprooflabs.com> wrote in
<CAN-V+g_cG4_R9q25nccNoMrtwA7DsCuHh-G5_ygaj8PBFTWUDw@mail.gmail.com>
> On Thu, Sep 19, 2019 at 6:52 AM Surafel Temesgen <surafel3000@gmail.com>
> wrote:
> 
> > Hi Tom,
> > In the attached patch i include the comments given
> >
> > regards
> > Surafel
> >
> 
> Patch v9 applies and passes make installcheck-world.
> 
> > From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> > Date: 2019-09-05 22:26:29
> 
> > * I didn't really study the changes in nodeLimit.c, but doing
> > "tuplestore_rescan" in ExecReScanLimit is surely just wrong.  You
> > probably want to delete and recreate the tuplestore, instead, since
> > whatever data you already collected is of no further use.  Maybe, in
> > the case where no rescan of the child node is needed, you could re-use
> > the data already collected; but that would require a bunch of additional
> > logic.  I'm inclined to think that v1 of the patch shouldn't concern
> > itself with that sort of optimization.
> 
> I don't think this was addressed.

It seems to me that Tom is suggesting we consider that kind of
optimization only after completing the patch without it. (What the
first patch did was just wrong, and more work would be needed to
achieve that.)

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: FETCH FIRST clause PERCENT option

From
Ryan Lambert
Date:
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           not tested
Documentation:            not tested

The latest patch applies cleanly and passes all tests.  I believe all feedback has been addressed except for the one
case that can be improved upon in later work.  Updating status as ready for committer.
 

Ryan Lambert

The new status of this patch is: Ready for Committer

Re: FETCH FIRST clause PERCENT option

From
Tomas Vondra
Date:
Hi,

This patch is marked as RFC since September. Since then there was no
discussion on this thread, but Andrew proposed an alternative approach
based on window functions in a separate thread [1] (both for this and
for the WITH TIES case).

I'll set this patch back to "needs review" - at this point we need to
decide which of the approaches is the right one.


regards

[1] https://www.postgresql.org/message-id/flat/87o8wvz253.fsf@news-spur.riddles.org.uk

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: FETCH FIRST clause PERCENT option

From
Kyotaro Horiguchi
Date:
Thank you for the notification, Tomas.

At Thu, 16 Jan 2020 22:33:58 +0100, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in 
> This patch is marked as RFC since September. Since then there was no
> discussion on this thread, but Andrew proposed an alternative approach
> based on window functions in a separate thread [1] (both for this and
> for the WITH TIES case).
> 
> I'll set this patch back to "needs review" - at this point we need to
> decide which of the approaches is the right one.

Even considering that we have only two uses of the mechanism for now,
WITH TIES and PERCENT, I like [1] for its generic nature, simplicity,
and smaller invasiveness to the executor. The fact that a cursor needs
to materialize to allow a backward scan would not be a big problem.
It seems that PERCENT would be easily implemented on top of that.

However, isn't there a possibility that we allow FETCH FIRST x PERCENT
WITH TIES, though I'm not sure what the SQL spec says about that?  I
can't come up right now with how to do that using window functions.
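
To pin down the semantics in question, here is a small Python sketch
(illustrative only, not tied to either patch, and assuming the
standard's ceiling rounding of the row count) of what FETCH FIRST x
PERCENT [WITH TIES] would mean over rows already sorted by the ORDER
BY key:

```python
import math

def fetch_first_percent(rows, percent, key, with_ties=False):
    """Emulate FETCH FIRST <percent> PERCENT [WITH TIES] on rows that are
    already sorted by `key`.  The kept row count is assumed to round up,
    i.e. ceil(percent/100 * total)."""
    n = math.ceil(percent / 100.0 * len(rows))
    if not with_ties or n == 0 or n >= len(rows):
        return rows[:n]
    boundary = key(rows[n - 1])            # sort key of the last row kept
    while n < len(rows) and key(rows[n]) == boundary:
        n += 1                             # extend through the peer group
    return rows[:n]
```

For example, over [1, 2, 2, 3, 4], 40 PERCENT keeps 2 rows, while 40
PERCENT WITH TIES keeps 3 because the third row ties with the second.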

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: FETCH FIRST clause PERCENT option

From
Vik Fearing
Date:
On 17/01/2020 07:20, Kyotaro Horiguchi wrote:
> Thank you for the notification, Tomas.
>
> At Thu, 16 Jan 2020 22:33:58 +0100, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in 
>> This patch is marked as RFC since September. Since then there was no
>> discussion on this thread, but Andrew proposed an alternative approach
>> based on window functions in a separate thread [1] (both for this and
>> for the WITH TIES case).
>>
>> I'll set this patch back to "needs review" - at this point we need to
>> decide which of the approaches is the right one.
> Even consdiering that we have only two usage , WITH TIES and PERCENT,
> of the mechanism for now, I like [1] for the generic nature,
> simpleness and smaller invasiveness to executor. The fact that cursor
> needs materialize to allow backward scan would not be a big problem.
> It seems that PERCENT will be easily implemented on that.
>
> However, isn't there a possibility that we allow FETCH FIRST x PERCENT
> WITH TIES, though I'm not sure what SQL spec tells about that?  I
> don't come up right now how to do that using window functions..


PERCENT and WITH TIES can play together, per spec.

-- 

Vik Fearing




Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:

Hi


PERCENT and WITH TIES can play together, per spec.

I also implemented the PERCENT WITH TIES option; the patch is attached.
I didn't start a new thread because the patches share common code.

regards
Surafel    
Attachment

Re: FETCH FIRST clause PERCENT option

From
Michael Paquier
Date:
On Mon, Aug 10, 2020 at 01:23:44PM +0300, Surafel Temesgen wrote:
> I also Implement PERCENT WITH TIES option. patch is attached
> i don't start a new tread because the patches share common code

This fails to apply per the CF bot.  Could you send a rebase?
--
Michael

Attachment

Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:

Hi Michael
On Thu, Sep 24, 2020 at 6:58 AM Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Aug 10, 2020 at 01:23:44PM +0300, Surafel Temesgen wrote:
> I also Implement PERCENT WITH TIES option. patch is attached
> i don't start a new tread because the patches share common code

This fails to apply per the CF bot.  Could you send a rebase?
--


Attached are the rebased patches.
 
regards
Surafel
Attachment

Re: FETCH FIRST clause PERCENT option

From
Kyotaro Horiguchi
Date:
Sorry for the delay. I started to look at this.

At Fri, 25 Sep 2020 12:25:24 +0300, Surafel Temesgen <surafel3000@gmail.com> wrote in 
> Hi Michael
> On Thu, Sep 24, 2020 at 6:58 AM Michael Paquier <michael@paquier.xyz> wrote:
> 
> > On Mon, Aug 10, 2020 at 01:23:44PM +0300, Surafel Temesgen wrote:
> > > I also Implement PERCENT WITH TIES option. patch is attached
> > > i don't start a new tread because the patches share common code
> >
> > This fails to apply per the CF bot.  Could you send a rebase?

This still applies on the master HEAD.

percent-incremental-v11.patch

The existing nodeLimit passes the slot of the subnode to the
caller, but this patch changes that behavior. You added a new function
to tuplestore.c to avoid storing a minimal tuple into the slot passed
from the subnode, but we should refrain from scribbling on that slot
at all. Instead, the PERCENT path of the limit node should use its own
ResultTupleSlot for the purpose. See nodeSort for a concrete example.


+    LIMIT_OPTION_PER_WITH_TIES, /* FETCH FIRST... PERCENT WITH TIES */

That name is a bit hard to read. We should spell it out in complete
words.

        case LIMIT_INWINDOW:
...
+                if (IsPercentOption(node->limitOption) && node->backwardPosition
...
+                if (IsPercentOption(node->limitOption) && node->reachEnd)
...
+                if (IsPercentOption(node->limitOption))

I think we can use a separate lstate for each condition above,
since IsPercentOption() gives a constant result throughout
execution. For example, LIMIT_PERCENT_TUPLESLOT_NOT_FILLED and
LIMIT_PERCENT_TUPLESLOT_FILLED, plus some derived states similar to
the non-percent path.  I *feel* that makes the code simpler.

What do you think about this?
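
For intuition, the flow that these states drive can be modeled in a
few lines of Python (a toy sketch, not the executor code): the percent
path must drain its subplan into a buffer before the percent-derived
count is known, and only then can it start returning rows.

```python
import math

def limit_percent(source, percent):
    """Toy model of nodeLimit's PERCENT path: spool every row from the
    subplan into a buffer (the real code uses a tuplestore), since
    ceil(percent/100 * total) is unknown until the input is exhausted."""
    buffered = list(source)              # drain the subplan completely
    n = math.ceil(percent / 100.0 * len(buffered))
    yield from buffered[:n]
```

Any state machine for the node then splits naturally into a
"still filling the buffer" phase and an "emitting from the buffer"
phase, which is roughly what the state names above suggest.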

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: FETCH FIRST clause PERCENT option

From
Surafel Temesgen
Date:


On Mon, Jan 25, 2021 at 2:39 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
Sorry for the dealy. I started to look this.


Hi Kyotaro,
Thanks for looking into this, but did we agree to proceed
with this approach? I fear that it will be a waste of effort if
Andrew comes up with a patch for his approach. 
Andrew Gierth: Are you working on your approach?

regards
Surafel

Re: FETCH FIRST clause PERCENT option

From
David Steele
Date:
On 1/26/21 11:29 AM, Surafel Temesgen wrote:
> 
> On Mon, Jan 25, 2021 at 2:39 PM Kyotaro Horiguchi 
> <horikyota.ntt@gmail.com <mailto:horikyota.ntt@gmail.com>> wrote:
> 
>     Sorry for the dealy. I started to look this.
> 
> Hi kyotaro,
> Thanks for looking into this but did we agree to proceed
> on this approach? I fear that it will be the west of effort if
> Andrew comes up with the patch for his approach.
> Andrew Gierth: Are you working on your approach?

[Added Andrew to recipients]

Andrew, would you care to comment?

Regards,
-- 
-David
david@pgmasters.net



Re: FETCH FIRST clause PERCENT option

From
Jaime Casanova
Date:
On Thu, Mar 25, 2021 at 11:15:10AM -0400, David Steele wrote:
> On 1/26/21 11:29 AM, Surafel Temesgen wrote:
> > 
> > On Mon, Jan 25, 2021 at 2:39 PM Kyotaro Horiguchi
> > <horikyota.ntt@gmail.com <mailto:horikyota.ntt@gmail.com>> wrote:
> > 
> >     Sorry for the dealy. I started to look this.
> > 
> > Hi kyotaro,
> > Thanks for looking into this but did we agree to proceed
> > on this approach? I fear that it will be the west of effort if
> > Andrew comes up with the patch for his approach.
> > Andrew Gierth: Are you working on your approach?
> 
> [Added Andrew to recipients]
> 
> Andrew, would you care to comment?
> 

Hi everyone,

This is still marked as needs review, but I think it should be marked
as "Returned with feedback" or "waiting on author".

Suggestions?

-- 
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL



Re: FETCH FIRST clause PERCENT option

From
Jaime Casanova
Date:
On Tue, Jan 26, 2021 at 07:29:11PM +0300, Surafel Temesgen wrote:
> On Mon, Jan 25, 2021 at 2:39 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com>
> wrote:
> 
> > Sorry for the dealy. I started to look this.
> >
> >
> Hi kyotaro,
> Thanks for looking into this but did we agree to proceed
> on this approach? I fear that it will be the west of effort if
> Andrew comes up with the patch for his approach.
> Andrew Gierth: Are you working on your approach?
> 

Hi Surafel,

I'm marking this one as "returned with feedback". Then you or Andrew can
submit a new version of the patch for the next CF.

-- 
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL