Re: Possible bug in CASE evaluation - Mailing list pgsql-hackers
From | Pavel Stehule |
---|---|
Subject | Re: Possible bug in CASE evaluation |
Date | |
Msg-id | CAFj8pRDmqx8HgsYPT9h-5gQ_T0ZhCZTbzeA1+TJocKCijaj4xw@mail.gmail.com Whole thread Raw |
In response to | Re: Possible bug in CASE evaluation (Andres Freund <andres@2ndquadrant.com>) |
List | pgsql-hackers |
2013/6/25 Andres Freund <andres@2ndquadrant.com>: > On 2013-06-24 21:35:53 -0400, Noah Misch wrote: >> On Sat, Jun 22, 2013 at 04:54:50PM +0200, Andres Freund wrote: >> > On 2013-06-21 16:45:28 +0200, Andres Freund wrote: >> > > On 2013-06-21 09:51:05 -0400, Noah Misch wrote: >> > > > That being said, if we discover a simple-enough fix that performs well, we may >> > > > as well incorporate it. >> > > >> > > What about passing another parameter down eval_const_expressions_mutator >> > > (which is static, so changing the API isn't a problem) that basically >> > > tells us whether we actually should evaluate expressions or just perform >> > > the transformation? >> > > There's seems to be basically a couple of places where we call dangerous >> > > things: >> > > * simplify_function (via ->evaluate_function->evaluate_expr) which is >> > > called in a bunch of places >> > > * evaluate_expr which is directly called in T_ArrayExpr >> > > T_ArrayCoerceExpr >> > > >> > > All places I've inspected so far need to deal with simplify_function >> > > returning NULL anyway, so that seems like a viable fix. >> > >> > *Prototype* patch - that seems simple enough - attached. Opinions? >> >> Simple enough, yes. The other point still stands. > > You mean performance? Primarily I still think we should first worry > about correctness first and then about performance. And CASE is the > documented (and really only, without writing procedual code) solution to > use for the cases where evaluation order actually *is* important. > > But anyway, the question is to find realistic cases to measure the > performance of. Obviously you can just make arbitrarily expensive > expressions that can be computed full during constant folding. Which I > don't find very interesting, do you? > > So, what I've done is to measure the performance difference when doing > full table queries of some CASE containing system views. > > best of 5 everytime: > SELECT * FROM pg_stats; > master: 28.287 patched: 28.565 > > SELECT * FROM information_schema.applicable_roles; > master: 0.757 patched: 0.755 > > regression=# SELECT * FROM information_schema.attributes: > master: 8.392 patched: 8.555 > > SELECT * FROM information_schema.column_privileges; > master: 90.853 patched: 88.551 > > SELECT * FROM information_schema.columns; > master: 259.436 patched: 274.145 > > SELECT * FROM information_schema.constraint_column_usage ; > master: 14.736 patched 15.005 > > SELECT * FROM information_schema.parameters; > master: 76.173 patched: 79.850 > > SELECT * FROM information_schema.routines; > master: 45.102 patched: 46.517 ms > > ... > > So, on those queries there's some difference (I've left out the ones > which are too short), but it's not big. > > Now, for the other extreme, the following completely random query I just > typed out: > SELECT f FROM (SELECT (CASE g.i WHEN -1 THEN 0 WHEN 1 THEN 3.0/1 WHEN g.i THEN 2.0/3 END) f FROM generate_series(1, 1000000)g(i)) s WHERE f = 0; > master: 491.931 patched: 943.629 > > suffers way much worse because the division is so expensive... :-( it is too high price Pavel > > Greetings, > > Andres Freund > > -- > Andres Freund http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers
pgsql-hackers by date: