Thread: Re: sqlsmith crash incremental sort

Re: sqlsmith crash incremental sort

From: Justin Pryzby
Adding -hackers, originally forgotten.

On Sat, Apr 11, 2020 at 10:26:39PM +0200, Tomas Vondra wrote:
> Thanks! I'll investigate.
> 
> On Sat, Apr 11, 2020 at 02:19:52PM -0500, Justin Pryzby wrote:
> > frequent crash looks like:
> > 
> > #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
> > #1  0x00007fb0a0cda801 in __GI_abort () at abort.c:79
> > #2  0x00007fb0a21ec425 in ExceptionalCondition (conditionName=conditionName@entry=0x7fb0a233a2ed "relid > 0", errorType=errorType@entry=0x7fb0a224701d "FailedAssertion",
> >    fileName=fileName@entry=0x7fb0a2340ce8 "relnode.c", lineNumber=lineNumber@entry=379) at assert.c:67
> > #3  0x00007fb0a2010d3a in find_base_rel (root=root@entry=0x7fb0a2de2d00, relid=<optimized out>) at relnode.c:379
> > #4  0x00007fb0a2199666 in examine_variable (root=root@entry=0x7fb0a2de2d00, node=node@entry=0x7fb0a2e65eb8, varRelid=varRelid@entry=0, vardata=vardata@entry=0x7ffe7b549e60) at selfuncs.c:4600
> > #5  0x00007fb0a219e2ed in estimate_num_groups (root=root@entry=0x7fb0a2de2d00, groupExprs=0x7fb0a2e69118, input_rows=input_rows@entry=2, pgset=pgset@entry=0x0) at selfuncs.c:3279
> > #6  0x00007fb0a1fc198b in cost_incremental_sort (path=path@entry=0x7fb0a2e69080, root=root@entry=0x7fb0a2de2d00, pathkeys=pathkeys@entry=0x7fb0a2e68b28, presorted_keys=presorted_keys@entry=3,
> >    input_startup_cost=103.73424154497742, input_total_cost=<optimized out>, input_tuples=2, width=480, comparison_cost=comparison_cost@entry=0, sort_mem=4096, limit_tuples=-1) at costsize.c:1832
> > #7  0x00007fb0a2007f63 in create_incremental_sort_path (root=root@entry=0x7fb0a2de2d00, rel=rel@entry=0x7fb0a2e67a38, subpath=subpath@entry=0x7fb0a2e681a0, pathkeys=0x7fb0a2e68b28,
> >    presorted_keys=3, limit_tuples=limit_tuples@entry=-1) at pathnode.c:2793
> > #8  0x00007fb0a1fe97cb in create_ordered_paths (limit_tuples=-1, target_parallel_safe=true, target=0x7fb0a2e65568, input_rel=<optimized out>, root=0x7fb0a2de2d00) at planner.c:5029
> > #9  grouping_planner (root=root@entry=0x7fb0a2de2d00, inheritance_update=inheritance_update@entry=false, tuple_fraction=<optimized out>, tuple_fraction@entry=0) at planner.c:2254
> > #10 0x00007fb0a1fecd5c in subquery_planner (glob=<optimized out>, parse=parse@entry=0x7fb0a2db7840, parent_root=parent_root@entry=0x7fb0a2dad650, hasRecursion=hasRecursion@entry=false,
> >    tuple_fraction=0) at planner.c:1015
> > #11 0x00007fb0a1fbc286 in set_subquery_pathlist (rte=<optimized out>, rti=<optimized out>, rel=0x7fb0a2db3598, root=0x7fb0a2dad650) at allpaths.c:2303
> > #12 set_rel_size (root=root@entry=0x7fb0a2dad650, rel=rel@entry=0x7fb0a2db1670, rti=rti@entry=2, rte=<optimized out>) at allpaths.c:422
> > #13 0x00007fb0a1fbecad in set_base_rel_sizes (root=<optimized out>) at allpaths.c:323
> > #14 make_one_rel (root=root@entry=0x7fb0a2dad650, joinlist=joinlist@entry=0x7fb0a2db76b8) at allpaths.c:185
> > #15 0x00007fb0a1fe4a2b in query_planner (root=root@entry=0x7fb0a2dad650, qp_callback=qp_callback@entry=0x7fb0a1fe52c0 <standard_qp_callback>, qp_extra=qp_extra@entry=0x7ffe7b54a510)
> >    at planmain.c:269
> > #16 0x00007fb0a1fea0b8 in grouping_planner (root=root@entry=0x7fb0a2dad650, inheritance_update=inheritance_update@entry=false, tuple_fraction=<optimized out>, tuple_fraction@entry=0)
> >    at planner.c:2058
> > #17 0x00007fb0a1fecd5c in subquery_planner (glob=glob@entry=0x7fb0a2dab480, parse=parse@entry=0x7fb0a2d48410, parent_root=parent_root@entry=0x0, hasRecursion=hasRecursion@entry=false,
> >    tuple_fraction=tuple_fraction@entry=0) at planner.c:1015
> > #18 0x00007fb0a1fee1df in standard_planner (parse=0x7fb0a2d48410, query_string=<optimized out>, cursorOptions=256, boundParams=<optimized out>) at planner.c:405
 
> > 
> > Minimal query like:
> > 
> > explain SELECT * FROM information_schema.transforms AS ref_1 RIGHT JOIN (SELECT 1 FROM pg_catalog.pg_namespace TABLESAMPLE SYSTEM (7.2)) AS sample_2 ON (ref_1.specific_name is NULL);
> > 
> > -- 
> > Justin



Re: sqlsmith crash incremental sort

From: Tomas Vondra
Hi,

I've looked into this a bit, and at first I thought that maybe the issue 
is in how cost_incremental_sort picks the EC members. It simply does this:

     EquivalenceMember *member = (EquivalenceMember *)
         linitial(key->pk_eclass->ec_members);

so I was speculating that maybe there are multiple EC members and the
one we need is not the first one. That would have been easy to fix.

But that doesn't seem to be the case - in this example the EC only has a
single member anyway.

     (gdb) p key->pk_eclass->ec_members
     $14 = (List *) 0x12eb958
     (gdb) p *key->pk_eclass->ec_members
     $15 = {type = T_List, length = 1, max_length = 5, elements = 0x12eb970, initial_elements = 0x12eb970}

and the member is a Var with varno=0 (with a RelabelType on top, but 
that's irrelevant).

     (gdb) p *(Var*)((RelabelType*)member->em_expr)->arg
     $12 = {xpr = {type = T_Var}, varno = 0, varattno = 1, vartype = 12445, vartypmod = -1, varcollid = 950, varlevelsup = 0, varnosyn = 0, varattnosyn = 1, location = -1}

which then triggers the assert in find_base_rel. When looking for other
places calling estimate_num_groups I found this in prepunion.c:

      * XXX you don't really want to know about this: we do the estimation
      * using the subquery's original targetlist expressions, not the
      * subroot->processed_tlist which might seem more appropriate.  The
      * reason is that if the subquery is itself a setop, it may return a
      * processed_tlist containing "varno 0" Vars generated by
      * generate_append_tlist, and those would confuse estimate_num_groups
      * mightily.  We ought to get rid of the "varno 0" hack, but that
      * requires a redesign of the parsetree representation of setops, so
      * that there can be an RTE corresponding to each setop's output.

which seems pretty similar to the issue at hand, because the subpath is
T_UpperUniquePath (not sure if that passes as setop, but the symptoms 
match nicely).

Not sure what to do about it in cost_incremental_sort, though :-(


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: sqlsmith crash incremental sort

From: Tomas Vondra
On Sun, Apr 12, 2020 at 12:44:45AM +0200, Tomas Vondra wrote:
>Hi,
>
>I've looked into this a bit, and at first I thought that maybe the 
>issue is in how cost_incremental_sort picks the EC members. It simply 
>does this:
>
>    EquivalenceMember *member = (EquivalenceMember *)
>        linitial(key->pk_eclass->ec_members);
>
>so I was speculating that maybe there are multiple EC members and the
>one we need is not the first one. That would have been easy to fix.
>
>But that doesn't seem to be the case - in this example the EC only has a
>single member anyway.
>
>    (gdb) p key->pk_eclass->ec_members
>    $14 = (List *) 0x12eb958
>    (gdb) p *key->pk_eclass->ec_members
>    $15 = {type = T_List, length = 1, max_length = 5, elements = 0x12eb970, initial_elements = 0x12eb970}
>
>and the member is a Var with varno=0 (with a RelabelType on top, but 
>that's irrelevant).
>
>    (gdb) p *(Var*)((RelabelType*)member->em_expr)->arg
>    $12 = {xpr = {type = T_Var}, varno = 0, varattno = 1, vartype = 12445, vartypmod = -1, varcollid = 950, varlevelsup = 0, varnosyn = 0, varattnosyn = 1, location = -1}
>
>which then triggers the assert in find_base_rel. When looking for other
>places calling estimate_num_groups I found this in prepunion.c:
>
>     * XXX you don't really want to know about this: we do the estimation
>     * using the subquery's original targetlist expressions, not the
>     * subroot->processed_tlist which might seem more appropriate.  The
>     * reason is that if the subquery is itself a setop, it may return a
>     * processed_tlist containing "varno 0" Vars generated by
>     * generate_append_tlist, and those would confuse estimate_num_groups
>     * mightily.  We ought to get rid of the "varno 0" hack, but that
>     * requires a redesign of the parsetree representation of setops, so
>     * that there can be an RTE corresponding to each setop's output.
>
>which seems pretty similar to the issue at hand, because the subpath is
>T_UpperUniquePath (not sure if that passes as setop, but the symptoms 
>match nicely).
>
>Not sure what to do about it in cost_incremental_sort, though :-(
>

I've been messing with this the whole day, without much progress :-(

I'm 99.9999% sure it's the same issue described by the quoted comment,
because the plan looks like this:

  Nested Loop Left Join
    ->  Sample Scan on pg_namespace
          Sampling: system ('7.2'::real)
    ->  Incremental Sort
          Sort Key: ...
          Presorted Key: ...
          ->  Unique
                ->  Sort
                      Sort Key: ...
                      ->  Append
                            ->  Nested Loop
                                ...
                            ->  Nested Loop
                                ...

so yeah, the plan does have set operations, and generate_append_tlist
does generate Vars with varno == 0, causing this issue.

But I'm not entirely sure what to do about it in cost_incremental_sort.
The comment (introduced by 89deca582a in 2017) suggests a proper fix
would require redesigning the parsetree representation of setops, and
it's a bit too late for that.

So I wonder what a possible solution might look like. I was hoping we
might grab the original target list and use that, similarly to
recurse_set_operations, but I'm not sure how/where to get it.

Another option is to use something as simple as checking for Vars with
varno==0 in cost_incremental_sort() and ignoring them somehow. We could
simply use some arbitrary estimate - by assuming the rows are unique or
something like that. Yes, I agree it's pretty ugly and I'd much rather
find a way to generate something sensible, but I'm not even sure we can
generate good estimate when doing UNION of data from different relations
and so on. The attached (ugly) patch does this.
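For readers without the attachment, a sketch of roughly that shape (hypothetical; the actual patch may differ in details, and the variable names here just follow the cost_incremental_sort() snippet quoted earlier) could look like this:

```c
    /* Hypothetical sketch only, not necessarily the attached patch.
     * While collecting the presorted-key expressions in
     * cost_incremental_sort(), detect EC members whose expression
     * references a "varno 0" Var; if any key is like that, skip
     * estimate_num_groups() and fall back to an arbitrary estimate. */
    List       *presortedExprs = NIL;
    ListCell   *l;
    bool        unknown_varno = false;
    double      input_groups;

    foreach(l, pathkeys)
    {
        PathKey            *key = (PathKey *) lfirst(l);
        EquivalenceMember  *member = (EquivalenceMember *)
            linitial(key->pk_eclass->ec_members);

        /* bms_is_member(0, ...) spots Vars with varno == 0 */
        if (bms_is_member(0, pull_varnos((Node *) member->em_expr)))
            unknown_varno = true;   /* cannot estimate this key */
        else
            presortedExprs = lappend(presortedExprs, member->em_expr);
    }

    if (unknown_varno)
        input_groups = DEFAULT_NUM_DISTINCT;    /* arbitrary fallback */
    else
        input_groups = estimate_num_groups(root, presortedExprs,
                                           input_tuples, NULL);
```

(DEFAULT_NUM_DISTINCT is the stock selfuncs.h fallback; whether that or "assume unique" is the right arbitrary choice is exactly the open question.)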

Justin, can you try if this resolves the crashes or if there's something
else going on?


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: sqlsmith crash incremental sort

From: Justin Pryzby
On Mon, Apr 13, 2020 at 02:09:43AM +0200, Tomas Vondra wrote:
> Justin, can you try if this resolves the crashes or if there's something
> else going on?

With your patch, this no longer crashes:
|explain SELECT * FROM information_schema.transforms AS ref_1 RIGHT JOIN (SELECT 1 FROM pg_catalog.pg_namespace TABLESAMPLE SYSTEM (7.2)) AS sample_2 ON (ref_1.specific_name is NULL);
..and sqlsmith survived 20min, which is a good sign.

pg_ctl -c -D pgsql13.dat start -o '-c port=1234 -c log_min_messages=fatal'
sqlsmith --target='host=/tmp port=1234 dbname=postgres' --verbose

Previously, I changed find_base_rel()'s Assert to an if(){elog}, so I know from
another 12 sqlsmith-hours that there's no other crash occurring frequently.
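(The kind of local change I mean is roughly the following sketch against the assertion in find_base_rel(); the exact error text I used is made up here:)

```c
    /* hypothetical debugging aid in find_base_rel(), relnode.c:
     * replace Assert(relid > 0) so the backend reports an ERROR
     * and keeps running instead of aborting on a core dump */
    if (relid <= 0)
        elog(ERROR, "find_base_rel: invalid relid %d", relid);
```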

(I hadn't used sqlsmith before this weekend, and was excited when I first saw
that it'd crashed overnight, and very surprised (after enabling core dumps) to
see that it crashed within ~10min.)

-- 
Justin



Re: sqlsmith crash incremental sort

From: James Coleman
On Sun, Apr 12, 2020 at 8:09 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
>
> On Sun, Apr 12, 2020 at 12:44:45AM +0200, Tomas Vondra wrote:
> >Hi,
> >
> >I've looked into this a bit, and at first I thought that maybe the
> >issue is in how cost_incremental_sort picks the EC members. It simply
> >does this:
> >
> >    EquivalenceMember *member = (EquivalenceMember *)
> >        linitial(key->pk_eclass->ec_members);
> >
> >so I was speculating that maybe there are multiple EC members and the
> >one we need is not the first one. That would have been easy to fix.
> >
> >But that doesn't seem to be the case - in this example the EC only has a
> >single member anyway.
> >
> >    (gdb) p key->pk_eclass->ec_members
> >    $14 = (List *) 0x12eb958
> >    (gdb) p *key->pk_eclass->ec_members
> >    $15 = {type = T_List, length = 1, max_length = 5, elements = 0x12eb970, initial_elements = 0x12eb970}
> >
> >and the member is a Var with varno=0 (with a RelabelType on top, but
> >that's irrelevant).
> >
> >    (gdb) p *(Var*)((RelabelType*)member->em_expr)->arg
> >    $12 = {xpr = {type = T_Var}, varno = 0, varattno = 1, vartype = 12445, vartypmod = -1, varcollid = 950, varlevelsup = 0, varnosyn = 0, varattnosyn = 1, location = -1}
> >
> >which then triggers the assert in find_base_rel. When looking for other
> >places calling estimate_num_groups I found this in prepunion.c:
> >
> >     * XXX you don't really want to know about this: we do the estimation
> >     * using the subquery's original targetlist expressions, not the
> >     * subroot->processed_tlist which might seem more appropriate.  The
> >     * reason is that if the subquery is itself a setop, it may return a
> >     * processed_tlist containing "varno 0" Vars generated by
> >     * generate_append_tlist, and those would confuse estimate_num_groups
> >     * mightily.  We ought to get rid of the "varno 0" hack, but that
> >     * requires a redesign of the parsetree representation of setops, so
> >     * that there can be an RTE corresponding to each setop's output.
> >
> >which seems pretty similar to the issue at hand, because the subpath is
> >T_UpperUniquePath (not sure if that passes as setop, but the symptoms
> >match nicely).
> >
> >Not sure what to do about it in cost_incremental_sort, though :-(
> >
>
> I've been messing with this the whole day, without much progress :-(
>
> I'm 99.9999% sure it's the same issue described by the quoted comment,
> because the plan looks like this:
>
>   Nested Loop Left Join
>     ->  Sample Scan on pg_namespace
>           Sampling: system ('7.2'::real)
>     ->  Incremental Sort
>           Sort Key: ...
>           Presorted Key: ...
>           ->  Unique
>                 ->  Sort
>                       Sort Key: ...
>                       ->  Append
>                             ->  Nested Loop
>                                 ...
>                             ->  Nested Loop
>                                 ...
>
> so yeah, the plan does have set operations, and generate_append_tlist
> does generate Vars with varno == 0, causing this issue.

This is a bit of an oddly shaped plan anyway, right? In an ideal world
the sort for the unique would have knowledge about what would be
useful for the parent node, and we wouldn't need the incremental sort
at all.

I'm not sure that that kind of thing is really a new problem, though,
and it might not even be entirely possible to fix directly by trying
to push down knowledge about useful sort keys to whatever created that
sort path; it might only be fixable by having the incremental sort (or
even regular sort) path creation know to "subsume" a sort underneath
it.

Anyway, I think that's a bit off topic, but it stood out to me.

> But I'm not entirely sure what to do about it in cost_incremental_sort.
> The comment (introduced by 89deca582a in 2017) suggests a proper fix
> would require redesigning the parsetree representation of setops, and
> it's a bit too late for that.
>
> So I wonder what a possible solution might look like. I was hoping we
> might grab the original target list and use that, similarly to
> recurse_set_operations, but I'm not sure how/where to get it.

This is also not an area I'm familiar with. Reading through the
prepunion.c code alongside cost_incremental_sort, it seems that we
don't have access to the same level of information as the prepunion
code (i.e., we're only looking at the result of the union, not its
components), and trying to descend into it seems even more
gross, so, see below...

> Another option is to use something as simple as checking for Vars with
> varno==0 in cost_incremental_sort() and ignoring them somehow. We could
> simply use some arbitrary estimate - by assuming the rows are unique or
> something like that. Yes, I agree it's pretty ugly and I'd much rather
> find a way to generate something sensible, but I'm not even sure we can
> generate good estimate when doing UNION of data from different relations
> and so on. The attached (ugly) patch does this.

...therefore I think this is worth proceeding with.

James



Re: sqlsmith crash incremental sort

From: Tomas Vondra
On Tue, Apr 14, 2020 at 01:16:33PM -0400, James Coleman wrote:
>On Sun, Apr 12, 2020 at 8:09 PM Tomas Vondra
><tomas.vondra@2ndquadrant.com> wrote:
>>
>> On Sun, Apr 12, 2020 at 12:44:45AM +0200, Tomas Vondra wrote:
>> >Hi,
>> >
>> >I've looked into this a bit, and at first I thought that maybe the
>> >issue is in how cost_incremental_sort picks the EC members. It simply
>> >does this:
>> >
>> >    EquivalenceMember *member = (EquivalenceMember *)
>> >        linitial(key->pk_eclass->ec_members);
>> >
>> >so I was speculating that maybe there are multiple EC members and the
>> >one we need is not the first one. That would have been easy to fix.
>> >
> >> >But that doesn't seem to be the case - in this example the EC only has a
> >> >single member anyway.
>> >
>> >    (gdb) p key->pk_eclass->ec_members
>> >    $14 = (List *) 0x12eb958
>> >    (gdb) p *key->pk_eclass->ec_members
>> >    $15 = {type = T_List, length = 1, max_length = 5, elements = 0x12eb970, initial_elements = 0x12eb970}
>> >
>> >and the member is a Var with varno=0 (with a RelabelType on top, but
>> >that's irrelevant).
>> >
>> >    (gdb) p *(Var*)((RelabelType*)member->em_expr)->arg
> >> >    $12 = {xpr = {type = T_Var}, varno = 0, varattno = 1, vartype = 12445, vartypmod = -1, varcollid = 950, varlevelsup = 0, varnosyn = 0, varattnosyn = 1, location = -1}
>> >
>> >which then triggers the assert in find_base_rel. When looking for other
>> >places calling estimate_num_groups I found this in prepunion.c:
>> >
>> >     * XXX you don't really want to know about this: we do the estimation
>> >     * using the subquery's original targetlist expressions, not the
>> >     * subroot->processed_tlist which might seem more appropriate.  The
>> >     * reason is that if the subquery is itself a setop, it may return a
>> >     * processed_tlist containing "varno 0" Vars generated by
>> >     * generate_append_tlist, and those would confuse estimate_num_groups
>> >     * mightily.  We ought to get rid of the "varno 0" hack, but that
>> >     * requires a redesign of the parsetree representation of setops, so
>> >     * that there can be an RTE corresponding to each setop's output.
>> >
>> >which seems pretty similar to the issue at hand, because the subpath is
>> >T_UpperUniquePath (not sure if that passes as setop, but the symptoms
>> >match nicely).
>> >
>> >Not sure what to do about it in cost_incremental_sort, though :-(
>> >
>>
>> I've been messing with this the whole day, without much progress :-(
>>
>> I'm 99.9999% sure it's the same issue described by the quoted comment,
>> because the plan looks like this:
>>
>>   Nested Loop Left Join
>>     ->  Sample Scan on pg_namespace
>>           Sampling: system ('7.2'::real)
>>     ->  Incremental Sort
>>           Sort Key: ...
>>           Presorted Key: ...
>>           ->  Unique
>>                 ->  Sort
>>                       Sort Key: ...
>>                       ->  Append
>>                             ->  Nested Loop
>>                                 ...
>>                             ->  Nested Loop
>>                                 ...
>>
>> so yeah, the plan does have set operations, and generate_append_tlist
>> does generate Vars with varno == 0, causing this issue.
>
>This is a bit of an oddly shaped plan anyway, right? In an ideal world
>the sort for the unique would have knowledge about what would be
>useful for the parent node, and we wouldn't need the incremental sort
>at all.
>

Well, yeah. The problem is the Unique simply compares the columns in the
order it sees them, and it does not match the column order desired by
incremental sort. But we don't push down this information at all :-(

In fact, there may be other reasons to reorder the comparisons, e.g.
when the cost differs between columns. There was a patch by
Teodor doing exactly that, IIRC.

>I'm not sure that that kind of thing is really a new problem, though,
>and it might not even be entirely possible to fix directly by trying
>to push down knowledge about useful sort keys to whatever created that
>sort path; it might only be fixable by having the incremental sort (or
>even regular sort) path creation know to "subsume" a sort underneath
>it.
>
>Anyway, I think that's a bit off topic, but it stood out to me.
>

It's not a new problem. It's an optimization we don't have.

>> But I'm not entirely sure what to do about it in cost_incremental_sort.
>> The comment (introduced by 89deca582a in 2017) suggests a proper fix
>> would require redesigning the parsetree representation of setops, and
>> it's a bit too late for that.
>>
>> So I wonder what a possible solution might look like. I was hoping we
>> might grab the original target list and use that, similarly to
>> recurse_set_operations, but I'm not sure how/where to get it.
>
>This is also not an area I'm familiar with. Reading through the
>prepunion.c code alongside cost_incremental_sort, it seems that we
>don't have access to the same level of information as the prepunion
>code (i.e., we're only looking at the result of the union, not the
>components of it), and trying to descend into it seems even more
>gross, so, see below...
>

Yeah. And I'm not even sure having that information would allow good
estimates e.g. for UNIONs of multiple relations etc.

>> Another option is to use something as simple as checking for Vars with
>> varno==0 in cost_incremental_sort() and ignoring them somehow. We could
>> simply use some arbitrary estimate - by assuming the rows are unique or
>> something like that. Yes, I agree it's pretty ugly and I'd much rather
>> find a way to generate something sensible, but I'm not even sure we can
>> generate good estimate when doing UNION of data from different relations
>> and so on. The attached (ugly) patch does this.
>
>...therefore I think this is worth proceeding with.
>

OK, then the question is what estimate to use in this case. Should we
assume 1 group or uniqueness? I'd assume a single group produces costs
slightly above regular sort, right?

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: sqlsmith crash incremental sort

From: James Coleman
On Wed, Apr 15, 2020 at 10:47 AM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
>
> On Tue, Apr 14, 2020 at 01:16:33PM -0400, James Coleman wrote:
> >On Sun, Apr 12, 2020 at 8:09 PM Tomas Vondra
> ><tomas.vondra@2ndquadrant.com> wrote:
> >>
> >> On Sun, Apr 12, 2020 at 12:44:45AM +0200, Tomas Vondra wrote:
> >> >Hi,
> >> >
> >> >I've looked into this a bit, and at first I thought that maybe the
> >> >issue is in how cost_incremental_sort picks the EC members. It simply
> >> >does this:
> >> >
> >> >    EquivalenceMember *member = (EquivalenceMember *)
> >> >        linitial(key->pk_eclass->ec_members);
> >> >
> >> >so I was speculating that maybe there are multiple EC members and the
> >> >one we need is not the first one. That would have been easy to fix.
> >> >
> >> >But that doesn't seem to be the case - in this example the EC only has a
> >> >single member anyway.
> >> >
> >> >    (gdb) p key->pk_eclass->ec_members
> >> >    $14 = (List *) 0x12eb958
> >> >    (gdb) p *key->pk_eclass->ec_members
> >> >    $15 = {type = T_List, length = 1, max_length = 5, elements = 0x12eb970, initial_elements = 0x12eb970}
> >> >
> >> >and the member is a Var with varno=0 (with a RelabelType on top, but
> >> >that's irrelevant).
> >> >
> >> >    (gdb) p *(Var*)((RelabelType*)member->em_expr)->arg
> >> >    $12 = {xpr = {type = T_Var}, varno = 0, varattno = 1, vartype = 12445, vartypmod = -1, varcollid = 950, varlevelsup = 0, varnosyn = 0, varattnosyn = 1, location = -1}
> >> >
> >> >which then triggers the assert in find_base_rel. When looking for other
> >> >places calling estimate_num_groups I found this in prepunion.c:
> >> >
> >> >     * XXX you don't really want to know about this: we do the estimation
> >> >     * using the subquery's original targetlist expressions, not the
> >> >     * subroot->processed_tlist which might seem more appropriate.  The
> >> >     * reason is that if the subquery is itself a setop, it may return a
> >> >     * processed_tlist containing "varno 0" Vars generated by
> >> >     * generate_append_tlist, and those would confuse estimate_num_groups
> >> >     * mightily.  We ought to get rid of the "varno 0" hack, but that
> >> >     * requires a redesign of the parsetree representation of setops, so
> >> >     * that there can be an RTE corresponding to each setop's output.
> >> >
> >> >which seems pretty similar to the issue at hand, because the subpath is
> >> >T_UpperUniquePath (not sure if that passes as setop, but the symptoms
> >> >match nicely).
> >> >
> >> >Not sure what to do about it in cost_incremental_sort, though :-(
> >> >
> >>
> >> I've been messing with this the whole day, without much progress :-(
> >>
> >> I'm 99.9999% sure it's the same issue described by the quoted comment,
> >> because the plan looks like this:
> >>
> >>   Nested Loop Left Join
> >>     ->  Sample Scan on pg_namespace
> >>           Sampling: system ('7.2'::real)
> >>     ->  Incremental Sort
> >>           Sort Key: ...
> >>           Presorted Key: ...
> >>           ->  Unique
> >>                 ->  Sort
> >>                       Sort Key: ...
> >>                       ->  Append
> >>                             ->  Nested Loop
> >>                                 ...
> >>                             ->  Nested Loop
> >>                                 ...
> >>
> >> so yeah, the plan does have set operations, and generate_append_tlist
> >> does generate Vars with varno == 0, causing this issue.
> >
> >This is a bit of an oddly shaped plan anyway, right? In an ideal world
> >the sort for the unique would have knowledge about what would be
> >useful for the parent node, and we wouldn't need the incremental sort
> >at all.
> >
>
> Well, yeah. The problem is the Unique simply compares the columns in the
> order it sees them, and it does not match the column order desired by
> incremental sort. But we don't push down this information at all :-(
>
> In fact, there may be other reasons to reorder the comparisons, e.g.
> when the cost is different for different columns. There was a patch by
> Teodor doing exactly that, IIRC.
>
> >I'm not sure that that kind of thing is really a new problem, though,
> >and it might not even be entirely possible to fix directly by trying
> >to push down knowledge about useful sort keys to whatever created that
> >sort path; it might only be fixable by having the incremental sort (or
> >even regular sort) path creation know to "subsume" a sort underneath
> >it.
> >
> >Anyway, I think that's a bit off topic, but it stood out to me.
> >
>
> It's not a new problem. It's an optimization we don't have.
>
> >> But I'm not entirely sure what to do about it in cost_incremental_sort.
> >> The comment (introduced by 89deca582a in 2017) suggests a proper fix
> >> would require redesigning the parsetree representation of setops, and
> >> it's a bit too late for that.
> >>
> >> So I wonder what a possible solution might look like. I was hoping we
> >> might grab the original target list and use that, similarly to
> >> recurse_set_operations, but I'm not sure how/where to get it.
> >
> >This is also not an area I'm familiar with. Reading through the
> >prepunion.c code alongside cost_incremental_sort, it seems that we
> >don't have access to the same level of information as the prepunion
> >code (i.e., we're only looking at the result of the union, not the
> >components of it), and trying to descend into it seems even more
> >gross, so, see below...
> >
>
> Yeah. And I'm not even sure having that information would allow good
> estimates e.g. for UNIONs of multiple relations etc.
>
> >> Another option is to use something as simple as checking for Vars with
> >> varno==0 in cost_incremental_sort() and ignoring them somehow. We could
> >> simply use some arbitrary estimate - by assuming the rows are unique or
> >> something like that. Yes, I agree it's pretty ugly and I'd much rather
> >> find a way to generate something sensible, but I'm not even sure we can
> >> generate good estimate when doing UNION of data from different relations
> >> and so on. The attached (ugly) patch does this.
> >
> >...therefore I think this is worth proceeding with.
> >
>
> OK, then the question is what estimate to use in this case. Should we
> assume 1 group or uniqueness? I'd assume a single group produces costs
> slightly above regular sort, right?

Originally I'd intuitively leaned towards assuming they were unique.
But that would be the best case for memory/disk space usage, for
example, and the costing for incremental sort is always (at least
mildly) higher than regular sort if the number of groups is 1. That
also guarantees the startup cost is higher than regular sort's.

So I think with a group-count estimate of 1, we just wouldn't ever
choose an incremental sort in this case.

Maybe that's the right choice? It'd certainly be the conservative
choice. What are your thoughts on the trade-offs there?

James



Re: sqlsmith crash incremental sort

From: Richard Guo

On Mon, Apr 13, 2020 at 8:09 AM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

I've been messing with this the whole day, without much progress :-(

I'm 99.9999% sure it's the same issue described by the quoted comment,
because the plan looks like this:

  Nested Loop Left Join
    ->  Sample Scan on pg_namespace
          Sampling: system ('7.2'::real)
    ->  Incremental Sort
          Sort Key: ...
          Presorted Key: ...
          ->  Unique
                ->  Sort
                      Sort Key: ...
                      ->  Append
                            ->  Nested Loop
                                ...
                            ->  Nested Loop
                                ...

so yeah, the plan does have set operations, and generate_append_tlist
does generate Vars with varno == 0, causing this issue.

After some digging I believe here is what happened.

1. For the UNION query, we build an upper rel of UPPERREL_SETOP and
generate Append path for it. Since Append doesn't actually evaluate its
targetlist, we generate 'varno 0' Vars for its targetlist. (setrefs.c
would just replace them with OUTER_VAR when adjusting the final plan so
this usually does not cause problems.)

2. To remove duplicates for UNION, we use hash/sort to unique-ify the
result. If sort is chosen, we add Sort path and then Unique path above
Append path, with pathkeys made from Append's targetlist.

3. Also the Append's targetlist would be built into
root->processed_tlist and with that we calculate root->sort_pathkeys.

4. When handling the ORDER BY clause, we figure out that the pathkeys of
the Unique->Sort->Append path share a common prefix with
root->sort_pathkeys, and thus an incremental sort is considered.

5. When calculating cost for incremental sort, estimate_num_groups does
not cope with 'varno 0' Vars extracted from root->sort_pathkeys. 


With this scenario, here is a simple recipe:

create table foo(a int, b int, c int);
set enable_hashagg to off;
explain select * from foo union select * from foo order by 1,3;

Thanks
Richard

Re: sqlsmith crash incremental sort

From
Richard Guo
Date:

On Wed, Apr 15, 2020 at 10:47 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

Well, yeah. The problem is the Unique simply compares the columns in the
order it sees them, and it does not match the column order desired by
incremental sort. But we don't push down this information at all :-(

This is a nice optimization to have. Since the 'Sort and Unique' would
unique-ify the result of a UNION by sorting on all columns, why not
adjust the sort order to match parse->sortClause so that we can avoid
the final sort node?

Doing that, we can transform the plan from:

# explain (costs off) select * from foo union select * from foo order by 1,3;
                  QUERY PLAN
-----------------------------------------------
 Incremental Sort
   Sort Key: foo.a, foo.c
   Presorted Key: foo.a
   ->  Unique
         ->  Sort
               Sort Key: foo.a, foo.b, foo.c
               ->  Append
                     ->  Seq Scan on foo
                     ->  Seq Scan on foo foo_1
(9 rows)

To:

# explain (costs off) select * from foo union select * from foo order by 1,3;
               QUERY PLAN
-----------------------------------------
 Unique
   ->  Sort
         Sort Key: foo.a, foo.c, foo.b
         ->  Append
               ->  Seq Scan on foo
               ->  Seq Scan on foo foo_1
(6 rows)
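In toy form, the transformation above is a stable reordering of the unique-ification sort keys so that the keys requested by ORDER BY come first. A minimal sketch over attribute numbers (illustrative only; the actual patch would operate on pathkey/SortGroupClause lists):

```c
#include <assert.h>
#include <string.h>

/* Reorder keys[] in place so that the keys listed in wanted[] (the
 * ORDER BY prefix) come first, in the requested order, while the
 * remaining keys keep their relative order. With columns
 * (a,b,c) = (1,2,3) and ORDER BY 1,3 this turns (a,b,c) into (a,c,b),
 * matching the transformed plan shown above. */
static void
reorder_sort_keys(int *keys, int nkeys, const int *wanted, int nwanted)
{
    int pos = 0;

    for (int w = 0; w < nwanted; w++)
    {
        for (int i = pos; i < nkeys; i++)
        {
            if (keys[i] == wanted[w])
            {
                int found = keys[i];

                /* shift the skipped keys right, then place the match */
                memmove(&keys[pos + 1], &keys[pos],
                        (i - pos) * sizeof(int));
                keys[pos++] = found;
                break;
            }
        }
    }
}
```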

Thanks
Richard 

Re: sqlsmith crash incremental sort

From
Richard Guo
Date:

On Thu, Apr 16, 2020 at 6:35 PM Richard Guo <guofenglinux@gmail.com> wrote:

On Wed, Apr 15, 2020 at 10:47 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

Well, yeah. The problem is the Unique simply compares the columns in the
order it sees them, and it does not match the column order desired by
incremental sort. But we don't push down this information at all :-(

This is a nice optimization to have. Since the 'Sort and Unique' would
unique-ify the result of a UNION by sorting on all columns, why not
adjust the sort order to match parse->sortClause so that we can avoid
the final sort node?

Doing that, we can transform the plan from:

# explain (costs off) select * from foo union select * from foo order by 1,3;
                  QUERY PLAN
-----------------------------------------------
 Incremental Sort
   Sort Key: foo.a, foo.c
   Presorted Key: foo.a
   ->  Unique
         ->  Sort
               Sort Key: foo.a, foo.b, foo.c
               ->  Append
                     ->  Seq Scan on foo
                     ->  Seq Scan on foo foo_1
(9 rows)

To:

# explain (costs off) select * from foo union select * from foo order by 1,3;
               QUERY PLAN
-----------------------------------------
 Unique
   ->  Sort
         Sort Key: foo.a, foo.c, foo.b
         ->  Append
               ->  Seq Scan on foo
               ->  Seq Scan on foo foo_1
(6 rows)


Attached is what I'm thinking about this optimization. Does it make any
sense?

Thanks
Richard 
Attachment

Re: sqlsmith crash incremental sort

From
Tomas Vondra
Date:
On Thu, Apr 16, 2020 at 04:44:10PM +0800, Richard Guo wrote:
>On Mon, Apr 13, 2020 at 8:09 AM Tomas Vondra <tomas.vondra@2ndquadrant.com>
>wrote:
>
>>
>> I've been messing with this the whole day, without much progress :-(
>>
>> I'm 99.9999% sure it's the same issue described by the quoted comment,
>> because the plan looks like this:
>>
>>   Nested Loop Left Join
>>     ->  Sample Scan on pg_namespace
>>           Sampling: system ('7.2'::real)
>>     ->  Incremental Sort
>>           Sort Key: ...
>>           Presorted Key: ...
>>           ->  Unique
>>                 ->  Sort
>>                       Sort Key: ...
>>                       ->  Append
>>                             ->  Nested Loop
>>                                 ...
>>                             ->  Nested Loop
>>                                 ...
>>
>> so yeah, the plan does have set operations, and generate_append_tlist
>> does generate Vars with varno == 0, causing this issue.
>>
>
>After some digging I believe here is what happened.
>
>1. For the UNION query, we build an upper rel of UPPERREL_SETOP and
>generate Append path for it. Since Append doesn't actually evaluate its
>targetlist, we generate 'varno 0' Vars for its targetlist. (setrefs.c
>would just replace them with OUTER_VAR when adjusting the final plan so
>this usually does not cause problems.)
>
>2. To remove duplicates for UNION, we use hash/sort to unique-ify the
>result. If sort is chosen, we add Sort path and then Unique path above
>Append path, with pathkeys made from Append's targetlist.
>
>3. The Append's targetlist is also built into
>root->processed_tlist, from which we calculate root->sort_pathkeys.
>
>4. When handling the ORDER BY clause, we find that the pathkeys of the
>Unique->Sort->Append path share a common prefix with
>root->sort_pathkeys, and thus an incremental sort is considered.
>
>5. When calculating the cost of the incremental sort,
>estimate_num_groups cannot cope with the 'varno 0' Vars extracted from
>root->sort_pathkeys.
>

Right.

>
>With this scenario, here is a simple recipe:
>
>create table foo(a int, b int, c int);
>set enable_hashagg to off;
>explain select * from foo union select * from foo order by 1,3;
>

Yep, that's a much simpler query / plan. Thanks.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



Re: sqlsmith crash incremental sort

From
James Coleman
Date:
On Thu, Apr 16, 2020 at 8:22 AM Richard Guo <guofenglinux@gmail.com> wrote:
>
>
> On Thu, Apr 16, 2020 at 6:35 PM Richard Guo <guofenglinux@gmail.com> wrote:
>>
>>
>> On Wed, Apr 15, 2020 at 10:47 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>>
>>>
>>> Well, yeah. The problem is the Unique simply compares the columns in the
>>> order it sees them, and it does not match the column order desired by
>>> incremental sort. But we don't push down this information at all :-(
>>
>>
>> This is a nice optimization better to have. Since the 'Sort and Unique'
>> would unique-ify the result of a UNION by sorting on all columns, why
>> not adjust the sort order to match parse->sortClause so that
>> we can avoid the final sort node?
>>
>> Doing that we can transform plan from:
>>
>> # explain (costs off) select * from foo union select * from foo order by 1,3;
>>                   QUERY PLAN
>> -----------------------------------------------
>>  Incremental Sort
>>    Sort Key: foo.a, foo.c
>>    Presorted Key: foo.a
>>    ->  Unique
>>          ->  Sort
>>                Sort Key: foo.a, foo.b, foo.c
>>                ->  Append
>>                      ->  Seq Scan on foo
>>                      ->  Seq Scan on foo foo_1
>> (9 rows)
>>
>> To:
>>
>> # explain (costs off) select * from foo union select * from foo order by 1,3;
>>                QUERY PLAN
>> -----------------------------------------
>>  Unique
>>    ->  Sort
>>          Sort Key: foo.a, foo.c, foo.b
>>          ->  Append
>>                ->  Seq Scan on foo
>>                ->  Seq Scan on foo foo_1
>> (6 rows)
>>
>
> Attached is what I'm thinking about this optimization. Does it make any
> sense?

Shouldn't this go on either a new thread or on the thread for the
patch Tomas was referencing (by Teodor I believe)?

Or are you saying you believe this patch guarantees we never see this
problem in incremental sort costing?

James



Re: sqlsmith crash incremental sort

From
Tomas Vondra
Date:
On Thu, Apr 16, 2020 at 12:04:03PM -0400, James Coleman wrote:
>On Thu, Apr 16, 2020 at 8:22 AM Richard Guo <guofenglinux@gmail.com> wrote:
>>
>>
>> On Thu, Apr 16, 2020 at 6:35 PM Richard Guo <guofenglinux@gmail.com> wrote:
>>>
>>>
>>> On Wed, Apr 15, 2020 at 10:47 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>>>
>>>>
>>>> Well, yeah. The problem is the Unique simply compares the columns in the
>>>> order it sees them, and it does not match the column order desired by
>>>> incremental sort. But we don't push down this information at all :-(
>>>
>>>
>>> This is a nice optimization better to have. Since the 'Sort and Unique'
>>> would unique-ify the result of a UNION by sorting on all columns, why
>>> not adjust the sort order to match parse->sortClause so that
>>> we can avoid the final sort node?
>>>
>>> Doing that we can transform plan from:
>>>
>>> # explain (costs off) select * from foo union select * from foo order by 1,3;
>>>                   QUERY PLAN
>>> -----------------------------------------------
>>>  Incremental Sort
>>>    Sort Key: foo.a, foo.c
>>>    Presorted Key: foo.a
>>>    ->  Unique
>>>          ->  Sort
>>>                Sort Key: foo.a, foo.b, foo.c
>>>                ->  Append
>>>                      ->  Seq Scan on foo
>>>                      ->  Seq Scan on foo foo_1
>>> (9 rows)
>>>
>>> To:
>>>
>>> # explain (costs off) select * from foo union select * from foo order by 1,3;
>>>                QUERY PLAN
>>> -----------------------------------------
>>>  Unique
>>>    ->  Sort
>>>          Sort Key: foo.a, foo.c, foo.b
>>>          ->  Append
>>>                ->  Seq Scan on foo
>>>                ->  Seq Scan on foo foo_1
>>> (6 rows)
>>>
>>
>> Attached is what I'm thinking about this optimization. Does it make any
>> sense?
>
>Shouldn't this go on either a new thread or on the thread for the
>patch Tomas was referencing (by Teodor I believe)?
>

FWIW the optimization I had in mind is this:

   https://commitfest.postgresql.org/21/1651/

I now realize that was about GROUP BY, but it's not all that different
and the concerns will / should be fairly similar, I think.

IMO simply tweaking the sort keys to match the upper parts of the plan
is probably way too simplistic, I'm afraid. For example, if the Unique
significantly reduces cardinality, then the cost of the additional sort
is much less important. It may be much better to optimize the "large"
sort of the whole data set, either by reordering the columns as proposed
by Teodor in his patch (by number of distinct values and/or cost of the
comparison function).

Furthermore, this is one of the places that is not using incremental
sort yet - I can easily imagine that something like this:


    Sort
      -> Unique
         -> Incremental Sort
            -> ...

could be a massive win. So I think we can't just rejigger the sort keys
arbitrarily; we should / need to consider those alternatives.

>Or are you saying you believe this patch guarantees we never see this
>problem in incremental sort costing?
>

Yeah, that's not entirely clear to me. But maybe it shows us where we
could get the unprocessed target list?

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: sqlsmith crash incremental sort

From
Tomas Vondra
Date:
On Thu, Apr 16, 2020 at 08:44:16PM +0200, Tomas Vondra wrote:
>On Thu, Apr 16, 2020 at 12:04:03PM -0400, James Coleman wrote:
>>On Thu, Apr 16, 2020 at 8:22 AM Richard Guo <guofenglinux@gmail.com> wrote:
>>>
>>>
>>>On Thu, Apr 16, 2020 at 6:35 PM Richard Guo <guofenglinux@gmail.com> wrote:
>>>>
>>>>
>>>>On Wed, Apr 15, 2020 at 10:47 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>>>>
>>>>>
>>>>>Well, yeah. The problem is the Unique simply compares the columns in the
>>>>>order it sees them, and it does not match the column order desired by
>>>>>incremental sort. But we don't push down this information at all :-(
>>>>
>>>>
>>>>This is a nice optimization better to have. Since the 'Sort and Unique'
>>>>would unique-ify the result of a UNION by sorting on all columns, why
>>>>not adjust the sort order to match parse->sortClause so that
>>>>we can avoid the final sort node?
>>>>
>>>>Doing that we can transform plan from:
>>>>
>>>># explain (costs off) select * from foo union select * from foo order by 1,3;
>>>>                  QUERY PLAN
>>>>-----------------------------------------------
>>>> Incremental Sort
>>>>   Sort Key: foo.a, foo.c
>>>>   Presorted Key: foo.a
>>>>   ->  Unique
>>>>         ->  Sort
>>>>               Sort Key: foo.a, foo.b, foo.c
>>>>               ->  Append
>>>>                     ->  Seq Scan on foo
>>>>                     ->  Seq Scan on foo foo_1
>>>>(9 rows)
>>>>
>>>>To:
>>>>
>>>># explain (costs off) select * from foo union select * from foo order by 1,3;
>>>>               QUERY PLAN
>>>>-----------------------------------------
>>>> Unique
>>>>   ->  Sort
>>>>         Sort Key: foo.a, foo.c, foo.b
>>>>         ->  Append
>>>>               ->  Seq Scan on foo
>>>>               ->  Seq Scan on foo foo_1
>>>>(6 rows)
>>>>
>>>
>>>Attached is what I'm thinking about this optimization. Does it make any
>>>sense?
>>
>>Shouldn't this go on either a new thread or on the thread for the
>>patch Tomas was referencing (by Teodor I believe)?
>>
>
>FWIW the optimization I had in mind is this:
>
>  https://commitfest.postgresql.org/21/1651/
>
>I now realize that was about GROUP BY, but it's not all that different
>and the concerns will / should be fairly similar, I think.
>
>IMO simply tweaking the sort keys to match the upper parts of the plan
>is probably way too simplistic, I'm afraid. For example, if the Unique
>significantly reduces cardinality, then the cost of the additional sort
>is much less important. It may be much better to optimize the "large"
>sort of the whole data set, either by reordering the columns as proposed
>by Teodor in his patch (by number of distinct values and/or cost of the
>comparison function).
>
>Furthermore, this is one of the places that is not using incremental
>sort yet - I can easily imagine doing something like this:
>
>
>   Sort
>     -> Unique
>        -> Incremental Sort
>           -> ...
>
>could be a massive win. So I think we can't just rejigger the sort keys
>arbitrarily, we should / need to consider those alternatives.
>
>>Or are you saying you believe this patch guarantees we never see this
>>problem in incremental sort costing?
>>
>
>Yeah, that's not entirely clear to me. But maybe it shows us where we
>could get the unprocessed target list?
>

I think at the very least this needs to apply the same change also to
generate_nonunion_paths, because otherwise this fails because of the
same issue:

   set enable_hashagg = off;
   explain select * from foo except select * from foo order by 1, 3;

I'm still of the opinion that this is really an optimization and
behavior change, and I feel rather uneasy about pushing it post feature
freeze without appropriate review. I also think we really ought to
consider how this would work with the other optimizations I outlined
elsewhere in this thread (comparison costs, ...).


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: sqlsmith crash incremental sort

From
Tomas Vondra
Date:
On Wed, Apr 15, 2020 at 11:26:12AM -0400, James Coleman wrote:
>On Wed, Apr 15, 2020 at 10:47 AM Tomas Vondra
><tomas.vondra@2ndquadrant.com> wrote:
>>
>> ...
>>
>> Yeah. And I'm not even sure having that information would allow good
>> estimates e.g. for UNIONs of multiple relations etc.
>>
>> >> Another option is to use something as simple as checking for Vars with
>> >> varno==0 in cost_incremental_sort() and ignoring them somehow. We could
>> >> simply use some arbitrary estimate - by assuming the rows are unique or
>> >> something like that. Yes, I agree it's pretty ugly and I'd much rather
>> >> find a way to generate something sensible, but I'm not even sure we can
>> >> generate good estimate when doing UNION of data from different relations
>> >> and so on. The attached (ugly) patch does this.
>> >
>> >...therefore I think this is worth proceeding with.
>> >
>>
>> OK, then the question is what estimate to use in this case. Should we
>> assume 1 group or uniqueness? I'd assume a single group produces costs
>> slightly above regular sort, right?
>
>Originally I'd intuitively leaned towards assuming they were unique.
>But that would be the best case for memory/disk space usage, for
>example, and the costing for incremental sort is always (at least
>mildly) higher than regular sort if the number of groups is 1. That
>also guarantees the startup cost is higher than regular sort.
>
>So I think using a number of groups estimate of 1, we just wouldn't
>choose an incremental sort ever in this case.
>
>Maybe that's the right choice? It'd certainly be the conservative
>choice. What are your thoughts on the trade-offs there?
>

I think we have essentially three options:

1) assuming there's just a single group

This should produce a cost estimate higher than plain sort, disabling
incremental sort. I'd say this is "worst case" assumption. I think this
might be overly pessimistic, though.

2) assuming each row is a separate group

If (1) is worst case scenario, then this is probably the best case,
particularly when the query is sensitive to startup cost.


3) something in between

If (1) and (2) are worst/best-case scenarios, maybe we should pick
something in between. We have DEFAULT_NUM_DISTINCT (200) which
essentially says "we don't know what the number of groups is" so maybe
we should use that. Another option would be nrows/10, which is the cap
we use in estimate_num_groups without extended stats.
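Reduced to plain arithmetic, the three candidates look like this (a hypothetical sketch with invented helper names; combining DEFAULT_NUM_DISTINCT with the nrows/10 cap in option 3 is one possible blend of the two sub-options, not established behavior):

```c
#include <assert.h>

#define MOCK_DEFAULT_NUM_DISTINCT 200.0

static double
mock_min(double a, double b)
{
    return (a < b) ? a : b;
}

/* Option 1: one big group - the pessimistic "worst case". */
static double
groups_single(double input_rows)
{
    (void) input_rows;
    return 1.0;
}

/* Option 2: every row its own group - the optimistic "best case". */
static double
groups_all_distinct(double input_rows)
{
    return input_rows;
}

/* Option 3: "we don't know", here capped at nrows/10 in the spirit of
 * the fallback estimate_num_groups uses without extended stats. */
static double
groups_default(double input_rows)
{
    return mock_min(MOCK_DEFAULT_NUM_DISTINCT, input_rows / 10.0);
}
```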


I was leaning towards (1) as "worst case" choice seems natural to
prevent possible regressions. But consider this:

   create table small (i int);
   create table large (i int);
   
   insert into small select mod(i,10) from generate_series(1,100) s(i);
   insert into large select mod(i,10) from generate_series(1,100000) s(i);

   analyze small;
   analyze large;

   explain select i from large union select i from small;

                                     QUERY PLAN
   -------------------------------------------------------------------------------
    Unique  (cost=11260.35..11760.85 rows=100100 width=4)
      ->  Sort  (cost=11260.35..11510.60 rows=100100 width=4)
            Sort Key: large.i
            ->  Append  (cost=0.00..2946.50 rows=100100 width=4)
                  ->  Seq Scan on large  (cost=0.00..1443.00 rows=100000 width=4)
                  ->  Seq Scan on small  (cost=0.00..2.00 rows=100 width=4)
   (6 rows)

The estimate of the number of groups is clearly bogus - we know there
are only 10 groups in each relation, but here we end up with 100100.

So perhaps we should do (2) to keep the behavior consistent?


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: sqlsmith crash incremental sort

From
James Coleman
Date:
On Thu, Apr 16, 2020 at 8:54 PM Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
>
> On Wed, Apr 15, 2020 at 11:26:12AM -0400, James Coleman wrote:
> >On Wed, Apr 15, 2020 at 10:47 AM Tomas Vondra
> ><tomas.vondra@2ndquadrant.com> wrote:
> >>
> >> ...
> >>
> >> Yeah. And I'm not even sure having that information would allow good
> >> estimates e.g. for UNIONs of multiple relations etc.
> >>
> >> >> Another option is to use something as simple as checking for Vars with
> >> >> varno==0 in cost_incremental_sort() and ignoring them somehow. We could
> >> >> simply use some arbitrary estimate - by assuming the rows are unique or
> >> >> something like that. Yes, I agree it's pretty ugly and I'd much rather
> >> >> find a way to generate something sensible, but I'm not even sure we can
> >> >> generate good estimate when doing UNION of data from different relations
> >> >> and so on. The attached (ugly) patch does this.
> >> >
> >> >...therefore I think this is worth proceeding with.
> >> >
> >>
> >> OK, then the question is what estimate to use in this case. Should we
> >> assume 1 group or uniqueness? I'd assume a single group produces costs
> >> slightly above regular sort, right?
> >
> >Originally I'd intuitively leaned towards assuming they were unique.
> >But that would be the best case for memory/disk space usage, for
> >example, and the costing for incremental sort is always (at least
> >mildly) higher than regular sort if the number of groups is 1. That
> >also guarantees the startup cost is higher than regular sort.
> >
> >So I think using a number of groups estimate of 1, we just wouldn't
> >choose an incremental sort ever in this case.
> >
> >Maybe that's the right choice? It'd certainly be the conservative
> >choice. What are your thoughts on the trade-offs there?
> >
>
> I think we have essentially three options:
>
> 1) assuming there's just a single group
>
> This should produce cost estimate higher than plain sort, disabling
> incremental sort. I'd say this is "worst case" assumption. I think this
> might be overly pessimistic, though.
>
> 2) assuming each row is a separate group
>
> If (1) is worst case scenario, then this is probably the best case,
> particularly when the query is sensitive to startup cost.
>
>
> 3) something in between
>
> If (1) and (2) are worst/best-case scenarios, maybe we should pick
> something in between. We have DEFAULT_NUM_DISTINCT (200) which
> essentially says "we don't know what the number of groups is" so maybe
> we should use that. Another option would be nrows/10, which is the cap
> we use in estimate_num_groups without extended stats.
>
>
> I was leaning towards (1) as "worst case" choice seems natural to
> prevent possible regressions. But consider this:
>
>    create table small (i int);
>    create table large (i int);
>
>    insert into small select mod(i,10) from generate_series(1,100) s(i);
>    insert into large select mod(i,10) from generate_series(1,100000) s(i);
>
>    analyze small;
>    analyze large;
>
>    explain select i from large union select i from small;
>
>                                      QUERY PLAN
>    -------------------------------------------------------------------------------
>     Unique  (cost=11260.35..11760.85 rows=100100 width=4)
>       ->  Sort  (cost=11260.35..11510.60 rows=100100 width=4)
>             Sort Key: large.i
>             ->  Append  (cost=0.00..2946.50 rows=100100 width=4)
>                   ->  Seq Scan on large  (cost=0.00..1443.00 rows=100000 width=4)
>                   ->  Seq Scan on small  (cost=0.00..2.00 rows=100 width=4)
>    (6 rows)
>
> The estimate of the number of groups is clearly bogus - we know there
> are only 10 groups in each relation, but here we end up with 100100.
>
> So perhaps we should do (2) to keep the behavior consistent?

First of all, I agree that we shouldn't (at this point in the cycle)
try to apply pathkey reordering etc. We already have ideas about
additional ways to apply and improve usage of incremental sort, and it
seems like this naturally fits into that list.

Second of all, I like the idea of keeping it consistent (even if
consistency boils down to "we're just guessing").

James



Re: sqlsmith crash incremental sort

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> I think we have essentially three options:
> 1) assuming there's just a single group
> 2) assuming each row is a separate group
> 3) something in between
> If (1) and (2) are worst/best-case scenarios, maybe we should pick
> something in between. We have DEFAULT_NUM_DISTINCT (200) which
> essentially says "we don't know what the number of groups is" so maybe
> we should use that.

I wouldn't recommend picking either the best or worst cases.

Possibly DEFAULT_NUM_DISTINCT is a sane choice, though it's fair to
wonder if it's quite applicable to the case where we already know
we've grouped by some columns.

            regards, tom lane



Re: sqlsmith crash incremental sort

From
Richard Guo
Date:

On Fri, Apr 17, 2020 at 2:44 AM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
On Thu, Apr 16, 2020 at 12:04:03PM -0400, James Coleman wrote:
>On Thu, Apr 16, 2020 at 8:22 AM Richard Guo <guofenglinux@gmail.com> wrote:
>> On Thu, Apr 16, 2020 at 6:35 PM Richard Guo <guofenglinux@gmail.com> wrote:
>>
>> Attached is what I'm thinking about this optimization. Does it make any
>> sense?
>
>Shouldn't this go on either a new thread or on the thread for the
>patch Tomas was referencing (by Teodor I believe)?
>

FWIW the optimization I had in mind is this:

   https://commitfest.postgresql.org/21/1651/

I now realize that was about GROUP BY, but it's not all that different
and the concerns will / should be fairly similar, I think.

Thanks for pointing out this thread. Very helpful.
 

IMO simply tweaking the sort keys to match the upper parts of the plan
is probably way too simplistic, I'm afraid. For example, if the Unique
significantly reduces cardinality, then the cost of the additional sort
is much less important. It may be much better to optimize the "large"
sort of the whole data set, either by reordering the columns as proposed
by Teodor in his patch (by number of distinct values and/or cost of the
comparison function).

Since we don't have Teodor's patch for now, I think it is a clear win if
we can reorder the sort keys in a 'Unique/SetOp->Sort->Append' path to
avoid a final Sort/Incremental Sort node, because currently the 'Sort
and Unique/SetOp' above Append simply sorts in the order of the columns
it sees. I think this is the same logic we use for mergejoin paths,
where we try to match the requested query_pathkeys to avoid a second
sort.
 

Furthermore, this is one of the places that is not using incremental
sort yet - I can easily imagine doing something like this:


    Sort
      -> Unique
         -> Incremental Sort
            -> ...

could be a massive win. So I think we can't just rejigger the sort keys
arbitrarily, we should / need to consider those alternatives.

This optimization would only apply to a 'Unique/SetOp->Sort->Append'
path. I don't think it will affect our choice of incremental sort in
other cases. For example, with this optimization, we can still choose
an incremental sort:

# explain (costs off)
select * from
    (select distinct a, c from (select * from foo order by 1,3) as sub) as sub1
order by 2,1;
                      QUERY PLAN
-------------------------------------------------------
 Sort
   Sort Key: sub.c, sub.a
   ->  Unique
         ->  Subquery Scan on sub
               ->  Incremental Sort
                     Sort Key: foo.a, foo.c
                     Presorted Key: foo.a
                     ->  Index Scan using foo_a on foo
(8 rows)
 

>Or are you saying you believe this patch guarantees we never see this
>problem in incremental sort costing?
>

Yeah, that's not entirely clear to me. But maybe it shows us where we
could get the unprocessed target list?


I'm not sure if there are other cases where we would build
targetlists/pathkeys out of 'varno 0' Vars. But in this case, since the
'Unique/SetOp->Sort' above Append sorts the result of Append on all
columns in whatever arbitrary order it sees them (not based on any
statistics, as Teodor's patch does), we can always reorder the sort
keys to match the requested result ordering and thus avoid the final
Sort/Incremental Sort node. That way we prevent this problem in
incremental sort costing for this case.

Am I missing something? Please correct me.

Thanks
Richard 

Re: sqlsmith crash incremental sort

From
Richard Guo
Date:

On Fri, Apr 17, 2020 at 7:13 AM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
On Thu, Apr 16, 2020 at 08:44:16PM +0200, Tomas Vondra wrote:
>
>Yeah, that's not entirely clear to me. But maybe it shows us where we
>could get the unprocessed target list?
>

I think at the very least this needs to apply the same change also to
generate_nonunion_paths, because otherwise this fails because of the
same issue:

   set enable_hashagg = off;
   explain select * from foo except select * from foo order by 1, 3;

Ah yes, that's what I'll have to do to cope with EXCEPT/INTERSECT.

Thanks
Richard

Re: sqlsmith crash incremental sort

From
James Coleman
Date:
On Thu, Apr 16, 2020 at 9:26 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> > I think we have essentially three options:
> > 1) assuming there's just a single group
> > 2) assuming each row is a separate group
> > 3) something in between
> > If (1) and (2) are worst/best-case scenarios, maybe we should pick
> > something in between. We have DEFAULT_NUM_DISTINCT (200) which
> > essentially says "we don't know what the number of groups is" so maybe
> > we should use that.
>
> I wouldn't recommend picking either the best or worst cases.
>
> Possibly DEFAULT_NUM_DISTINCT is a sane choice, though it's fair to
> wonder if it's quite applicable to the case where we already know
> we've grouped by some columns.

Do you think defining a new default, say,
DEFAULT_NUM_DISTINCT_PRESORTED is preferred then? And choose some
value like "1/2 of the normal DEFAULT_NUM_DISTINCT groups" or some
such?

James



Re: sqlsmith crash incremental sort

From
Tomas Vondra
Date:
On Sat, Apr 18, 2020 at 02:23:25PM -0400, James Coleman wrote:
>On Thu, Apr 16, 2020 at 9:26 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> > I think we have essentially three options:
>> > 1) assuming there's just a single group
>> > 2) assuming each row is a separate group
>> > 3) something in between
>> > If (1) and (2) are worst/best-case scenarios, maybe we should pick
>> > something in between. We have DEFAULT_NUM_DISTINCT (200) which
>> > essentially says "we don't know what the number of groups is" so maybe
>> > we should use that.
>>
>> I wouldn't recommend picking either the best or worst cases.
>>
>> Possibly DEFAULT_NUM_DISTINCT is a sane choice, though it's fair to
>> wonder if it's quite applicable to the case where we already know
>> we've grouped by some columns.
>
>Do you think defining a new default, say,
>DEFAULT_NUM_DISTINCT_PRESORTED is preferred then? And choose some
>value like "1/2 of the normal DEFAULT_NUM_DISTINCT groups" or some
>such?
>

If we had a better intuition about what a better value would be, maybe.
But I don't think we have that at all, so I'd just use the existing one.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: sqlsmith crash incremental sort

From
Tomas Vondra
Date:
On Mon, Apr 20, 2020 at 01:47:29AM +0200, Tomas Vondra wrote:
>On Sat, Apr 18, 2020 at 02:23:25PM -0400, James Coleman wrote:
>>On Thu, Apr 16, 2020 at 9:26 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>
>>>Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>>>> I think we have essentially three options:
>>>> 1) assuming there's just a single group
>>>> 2) assuming each row is a separate group
>>>> 3) something in between
>>>> If (1) and (2) are worst/best-case scenarios, maybe we should pick
>>>> something in between. We have DEFAULT_NUM_DISTINCT (200) which
>>>> essentially says "we don't know what the number of groups is" so maybe
>>>> we should use that.
>>>
>>>I wouldn't recommend picking either the best or worst cases.
>>>
>>>Possibly DEFAULT_NUM_DISTINCT is a sane choice, though it's fair to
>>>wonder if it's quite applicable to the case where we already know
>>>we've grouped by some columns.
>>
>>Do you think defining a new default, say,
>>DEFAULT_NUM_DISTINCT_PRESORTED is preferred then? And choose some
>>value like "1/2 of the normal DEFAULT_NUM_DISTINCT groups" or some
>>such?
>>
>
>If we had a better intuition about what a better value would be, maybe.
>But I don't think we have that at all, so I'd just use the existing one.
>

I've pushed a fix with DEFAULT_NUM_DISTINCT. The input comes from a
set operation (which is where we call generate_append_tlist), so it's
probably fairly unique, so maybe we should use input_tuples. But that's
not guaranteed, so DEFAULT_NUM_DISTINCT seems reasonably defensive.

One detail I've changed is that instead of matching the expression
directly to a Var, it now calls pull_varnos() to also detect Vars
nested somewhere deeper. Looking at examine_variable(), it calls
find_base_rel in such a case too, but I haven't tried constructing a
query that triggers the issue.

One improvement I can think of is handling lists where only some
expressions contain varno 0. We could still call estimate_num_groups
for the expressions with varno != 0, and multiply that by the estimate
for the other part (be it DEFAULT_NUM_DISTINCT). This might produce a
higher estimate than using DEFAULT_NUM_DISTINCT directly, resulting in
a lower incremental sort cost. But it's not clear to me whether this
can even happen - AFAICS either all Vars have varno 0 or none do, so I
haven't done this.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: sqlsmith crash incremental sort

From
Richard Guo
Date:

On Thu, Apr 23, 2020 at 6:59 AM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
I've pushed a fix using DEFAULT_NUM_DISTINCT. The input comes from a
set operation (which is where we call generate_append_tlist), so it's
probably fairly unique, and maybe we could use input_tuples instead. But
that's not guaranteed, so DEFAULT_NUM_DISTINCT seems reasonably defensive.

Thanks for the fix. Verified that the crash has been fixed.
 

One detail I've changed is that instead of matching the expression
directly to a Var, it now calls pull_varnos() to also detect Vars
nested somewhere deeper. Looking at examine_variable(), it calls
find_base_rel in such a case too, but I haven't tried constructing a
query that triggers the issue.

A minor comment is that I don't think we need to strip the RelabelType
explicitly before calling pull_varnos(), because that function recurses
into T_RelabelType nodes anyway.

Also, do we need to call bms_free(varnos) for each pathkey here, to
avoid wasting memory?
 

One improvement I can think of is handling lists where only some
expressions contain varno 0. We could still call estimate_num_groups
for the expressions with varno != 0, and multiply that by the estimate
for the other part (be it DEFAULT_NUM_DISTINCT). This might produce a
higher estimate than using DEFAULT_NUM_DISTINCT directly, resulting in
a lower incremental sort cost. But it's not clear to me whether this
can even happen - AFAICS either all Vars have varno 0 or none do, so I
haven't done this.

I don't think this case would happen either. 

Thanks
Richard

Re: sqlsmith crash incremental sort

From
Tomas Vondra
Date:
On Thu, Apr 23, 2020 at 03:28:21PM +0800, Richard Guo wrote:
>On Thu, Apr 23, 2020 at 6:59 AM Tomas Vondra <tomas.vondra@2ndquadrant.com>
>wrote:
>
>> I've pushed a fix using DEFAULT_NUM_DISTINCT. The input comes from a
>> set operation (which is where we call generate_append_tlist), so it's
>> probably fairly unique, and maybe we could use input_tuples instead. But
>> that's not guaranteed, so DEFAULT_NUM_DISTINCT seems reasonably defensive.
>>
>
>Thanks for the fix. Verified that the crash has been fixed.
>
>
>>
>> One detail I've changed is that instead of matching the expression
>> directly to a Var, it now calls pull_varnos() to also detect Vars
>> nested somewhere deeper. Looking at examine_variable(), it calls
>> find_base_rel in such a case too, but I haven't tried constructing a
>> query that triggers the issue.
>>
>
>A minor comment is that I don't think we need to strip the RelabelType
>explicitly before calling pull_varnos(), because that function recurses
>into T_RelabelType nodes anyway.
>

Hmmm, yeah. I think you're right that it's unnecessary. I misread the
walker function, I think.

>Also, do we need to call bms_free(varnos) for each pathkey here, to
>avoid wasting memory?
>

I don't think so. It wouldn't hurt, but we don't do that for other
pull_varnos calls either, AFAICS.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: sqlsmith crash incremental sort

From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> On Thu, Apr 23, 2020 at 03:28:21PM +0800, Richard Guo wrote:
>> A minor comment is that I don't think we need to strip the RelabelType
>> explicitly before calling pull_varnos(), because that function recurses
>> into T_RelabelType nodes anyway.

> Hmmm, yeah. I think you're right that it's unnecessary. I misread the
> walker function, I think.

+1, might as well simplify the code.

>> Also, do we need to call bms_free(varnos) for each pathkey here, to
>> avoid wasting memory?

> I don't think so. It wouldn't hurt, but we don't do that for other
> pull_varnos calls either, AFAICS.

Yeah, the planner is generally pretty profligate of memory, and these
bitmaps aren't likely to be huge anyway.

            regards, tom lane