Re: [PERFORM] Query planner gaining the ability to replanning afterstart of query execution.

From: Arne Roland
Subject: Re: [PERFORM] Query planner gaining the ability to replanning afterstart of query execution.
Date: ,
Msg-id: 252cab3895894337a3a88275f423fe0b@index.de
(view: Whole thread, Raw)
In response to: [PERFORM] Query planner gaining the ability to replanning after start of query execution.  (Oliver Mattos)
Responses: [PERFORM] Re: Query planner gaining the ability to replanning after start ofquery execution.  (Gunter)
List: pgsql-performance


Hello,

I'd love to have some sort of dynamic query feedback, yet it's very complicated to do it right. I am not convinced that
changingthe plan during a single execution is the right way to do it, not only because it sounds intrusive to do crazy
thingsin the executor, but also because don't understand why the new plan should be any better than the old one. Can
yoube more elaborate how you'd want to go about it?
 

In your example (which presumably just has a single relation), we have no notion of whether the scan returns no rows
becausewe were unlucky, because just the first few pages were empty of matching rows (which in my experience happens
moreoften), or because the cardinality estimation is wrong. Even if the cardinality estimation is wrong, we have no
notionof which predicate or predicate combination actually caused the misestimation. If the first few pages where
empty,the same might happen with every order (so also with every available indexscan). Imagine a very simple seqscan
planof 
 
select * from mytab where a = 3 and b = 40 limit 1
Even if we know the cardinality is overestimated, we have no idea whether the cardinality of a = 3 or b = 40 is wrong
orthey just correlate, so there is no notion of which is actually the cheapest plan. Usual workaround for most of these
queriesis to add an order by (which has the nice addition of having a deterministic result) with an appropriate complex
index,usually resulting in indexscans.
 

While we actually know more after the first execution of a nodes like materialize, sort or hash nodes, I rarely
encountermaterialize nodes in the wild. Consequently that is the place where the work is usually already done, which is
especiallytrue with the hash node. Even though it still might be more optimal to switch from a mergejoin to a hashjoin
insome cases, I doubt that's worth any work (and even less the maintenance).
 

Best regards
Arne Roland

-----Original Message-----
From:  [mailto:] On Behalf Of Oliver
Mattos
Sent: Monday, November 13, 2017 5:45 PM
To: 
Subject: [PERFORM] Query planner gaining the ability to replanning after start of query execution.

I am interested in giving the query planner the ability to replan (or re-rank plans) after query execution has begun,
basedon the progression of the query so far.
 

Example use case:

*  A LIMIT 1 query is planned using an expensive scan which the planner expects to return a large number of results,
andto terminate
 
early.   The reality is the query actually produces no results, and
the scan must run to completion, potentially taking thousands of times longer than expected.

*  If this plans costs were adjusted mid-execution to reflect the fact that the scan is producing far fewer rows than
expected,then another query plan might come out ahead, which would complete far faster.
 


Has this been done before?   Are there any pitfalls to beware of?


--
Sent via pgsql-performance mailing list ()
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance




-----Original Message-----
From:  [mailto:] On Behalf Of Oliver
Mattos
Sent: Monday, November 13, 2017 5:45 PM
To: 
Subject: [PERFORM] Query planner gaining the ability to replanning after start of query execution.

I am interested in giving the query planner the ability to replan (or re-rank plans) after query execution has begun,
basedon the progression of the query so far.
 

Example use case:

*  A LIMIT 1 query is planned using an expensive scan which the planner expects to return a large number of results,
andto terminate
 
early.   The reality is the query actually produces no results, and
the scan must run to completion, potentially taking thousands of times longer than expected.

*  If this plans costs were adjusted mid-execution to reflect the fact that the scan is producing far fewer rows than
expected,then another query plan might come out ahead, which would complete far faster.
 


Has this been done before?   Are there any pitfalls to beware of?


--
Sent via pgsql-performance mailing list ()
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance




--
Sent via pgsql-performance mailing list ()
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


pgsql-performance by date:

From: Oliver Mattos
Date:
Subject: Re: [PERFORM] Query planner gaining the ability to replanning afterstart of query execution.
From: legrand legrand
Date:
Subject: [PERFORM] Re: Query planner gaining the ability to replanning after start ofquery execution.