Re: POC: converting Lists into arrays - Mailing list pgsql-hackers
From | David Rowley |
---|---|
Subject | Re: POC: converting Lists into arrays |
Date | |
Msg-id | CAKJS1f9-p4R_b2__ugSpiJR3h_TRTw4pb+q8jk=yhEhnT5AMYQ@mail.gmail.com Whole thread Raw |
In response to | Re: POC: converting Lists into arrays (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: POC: converting Lists into arrays
|
List | pgsql-hackers |
On Wed, 17 Jul 2019 at 11:06, Tom Lane <tgl@sss.pgh.pa.us> wrote: > 0002 changes some additional places where it's maybe a bit less safe, > ie there's a potential for user-visible behavioral change because > processing will occur in a different order. In particular, the proposed > change in execExpr.c causes aggregates and window functions that are in > the same plan node to be executed in a different order than before --- > but it seems to me that this order is saner. (Note the change in the > expected regression results, in a test that's intentionally sensitive to > the execution order.) And anyway when did we guarantee anything about > that? I've only looked at 0002. Here are my thoughts: get_tables_to_cluster: Looks fine. It's a heap scan. Any previous order was accidental, so if it causes issues then we might need to think of using a more well-defined order for CLUSTER; get_rels_with_domain: This is a static function. Changing the order of the list seems to only really affect the error message that a failed domain constraint validation could emit. Perhaps this might break someone else's tests, but they should just be able to update their expected results. ExecInitExprRec: As you mention, the order of aggregate evaluation is reversed. I agree that the new order is saner. I can't think why we'd be doing it in backwards order beforehand. get_relation_statistics: RelationGetStatExtList does not seem to pay much attention to the order it returns its results, so I don't think the order we apply extended statistics was that well defined before. We always attempt to use the stats with the most matching columns in choose_best_statistics(), so I think for people to be affected they'd either multiple stats with the same sets of columns or a complex clause that equally well matches two sets of stats, and in that case the other columns would be matched to the other stats later... I'd better check that... erm... actually that's not true. I see statext_mcv_clauselist_selectivity() makes no attempt to match the clause list to another set of stats after finding the first best match. I think it likely should do that. estimate_multivariate_ndistinct() seems to have an XXX comment mentioning thoughts about the stability of which stats are used, but nothing is done. parseCheckAggregates: I can't see any user-visible change to this one. Not even in error messages. estimate_num_groups: Similar to get_relation_statistics(), I see that estimate_multivariate_ndistinct() is only called once and we don't attempt to match up the remaining clauses with more stats. I can't imagine swapping lcons for lappend here will upset anyone. The behaviour does not look well defined already. I think we should likely change the "if (estimate_multivariate_ndistinct(root, rel, &relvarinfos," to "while ...", then drop the else. Not for this patch though... -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
pgsql-hackers by date: