Thread: Does FILTER in SEQSCAN short-circuit AND?
Sample code: SELECT * FROM MyTable WHERE foo = 'bar' AND MySlowFunc('foo') = 'bar' Let's say this required a SEQSCAN because there were no indexes to support column foo. For every row where foo <> 'bar' would the filter on the SEQSCAN short-circuit the AND return false right away, or would it still execute MySlowFunc('foo') ? Thanks! Carlo
"Carlo Stonebanks" <stonec.register@sympatico.ca> wrote: > SELECT * > FROM MyTable > WHERE foo = 'bar' AND MySlowFunc('foo') = 'bar' > > Let's say this required a SEQSCAN because there were no indexes to > support column foo. For every row where foo <> 'bar' would the > filter on the SEQSCAN short-circuit the AND return false right > away, or would it still execute MySlowFunc('foo') ? For that example, I'm pretty sure it will skip the slow function for rows which fail the first test. A quick test confirmed that for me. If you create a sufficiently slow function, you shouldn't have much trouble testing that yourself. :-) -Kevin
On 5/18/10 3:28 PM, Carlo Stonebanks wrote: > Sample code: > > SELECT * > FROM MyTable > WHERE foo = 'bar' AND MySlowFunc('foo') = 'bar' > > Let's say this required a SEQSCAN because there were no indexes to > support column foo. For every row where foo <> 'bar' would the filter on > the SEQSCAN short-circuit the AND return false right away, or would it > still execute MySlowFunc('foo') ? I asked a similar question a few years back, and the answer is that the planner just makes a guess and applies it to allfunctions. It has no idea whether your function is super fast or incredibly slow, they're all assigned the same cost. In this fairly simple case, the planner might reasonably guess that "foo = 'bar'" will always be faster than "AnyFunc(foo)= 'bar'". But for real queries, that might not be the case. In my case, I have a function that is so slow that it ALWAYS is good to avoid it. Unfortunately, there's no way to explainthat to Postgres, so I have to use other tricks to force the planner not to use it. select * from (select * from MyTable where foo = 'bar' offset 0) where MySlowFunc(foo) = 'bar'; The "offset 0" prevents the planner from collapsing this query back into your original syntax. It will only apply MySlowFunc()to rows where you already know that foo = 'bar'. It would be nice if Postgres had a way to assign a cost to every function. Until then, you have to use convoluted SQL ifyou have a really slow function. Craig
Craig James wrote on 27.05.2010 23:13: > It would be nice if Postgres had a way to assign a cost to every > function. Isn't that what the COST parameter is intended to be: http://www.postgresql.org/docs/current/static/sql-createfunction.html Thomas
Craig James <craig_james@emolecules.com> wrote: > It would be nice if Postgres had a way to assign a cost to every > function. The COST clause of CREATE FUNCTION doesn't do what you want? http://www.postgresql.org/docs/8.4/interactive/sql-createfunction.html -Kevin
On 5/27/10 2:28 PM, Kevin Grittner wrote: > Craig James<craig_james@emolecules.com> wrote: > >> It would be nice if Postgres had a way to assign a cost to every >> function. > > The COST clause of CREATE FUNCTION doesn't do what you want? > > http://www.postgresql.org/docs/8.4/interactive/sql-createfunction.html Cool ... I must have missed it when this feature was added. Nice! Craig
We are currently using pltclu as our PL of choice AFTER plpgSql. I'd like to know if anyone can comment on the performance costs of the various PL languages BESIDES C. For example, does pltclu instantiate faster than pltcl (presumably because it uses a shared interpreter?) Is Perl more lightweight? I know that everything depends on context - what you are doing with it, e.g. choose Tcl for string handling vs. Perl for number crunching - but for those who know about this, is there a clear performance advantage for any of the various PL languages - and if so, is it a difference so big to be worth switching? I ask this because I had expected to see pl/pgsql as a clear winner in terms of performance over pltclu, but my initial test showed the opposite. I know this may be an apples vs oranges problem and I will test further, but if anyone has any advice or insight, I would appreciate it so I can tailor my tests accordingly. Thanks, Carlo
Hello 2011/12/27 Carlo Stonebanks <stonec.register@sympatico.ca>: > We are currently using pltclu as our PL of choice AFTER plpgSql. > > I'd like to know if anyone can comment on the performance costs of the > various PL languages BESIDES C. For example, does pltclu instantiate faster > than pltcl (presumably because it uses a shared interpreter?) Is Perl more > lightweight? > > I know that everything depends on context - what you are doing with it, e.g. > choose Tcl for string handling vs. Perl for number crunching - but for those > who know about this, is there a clear performance advantage for any of the > various PL languages - and if so, is it a difference so big to be worth > switching? > > I ask this because I had expected to see pl/pgsql as a clear winner in terms > of performance over pltclu, but my initial test showed the opposite. I know > this may be an apples vs oranges problem and I will test further, but if > anyone has any advice or insight, I would appreciate it so I can tailor my > tests accordingly. > A performance strongly depends on use case. PL/pgSQL has fast start but any expression is evaluated as simple SQL expression - and some repeated operation should be very expensive - array update, string update. PL/pgSQL is best as SQL glue. Positive to performance is type compatibility between plpgsql and Postgres. Interpret plpgsql is very simply - there are +/- zero optimizations - plpgsql code should be minimalistic, but when you don't do some really wrong, then a speed is comparable with PHP. http://www.pgsql.cz/index.php/PL/pgSQL_%28en%29#Inappropriate_use_of_the_PL.2FpgSQL_language PL/Perl has slower start - but string or array operations are very fast. Perl has own expression evaluator - faster than expression evaluation in plpgsql. On second hand - any input must be transformed from postgres format to perl format and any result must be transformed too. Perl and other languages doesn't use data type compatible with Postgres. Regards Pavel Stehule > > Thanks, > > Carlo > > > > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance
On Tue, Dec 27, 2011 at 4:20 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > Hello > > 2011/12/27 Carlo Stonebanks <stonec.register@sympatico.ca>: >> We are currently using pltclu as our PL of choice AFTER plpgSql. >> >> I'd like to know if anyone can comment on the performance costs of the >> various PL languages BESIDES C. For example, does pltclu instantiate faster >> than pltcl (presumably because it uses a shared interpreter?) Is Perl more >> lightweight? >> >> I know that everything depends on context - what you are doing with it, e.g. >> choose Tcl for string handling vs. Perl for number crunching - but for those >> who know about this, is there a clear performance advantage for any of the >> various PL languages - and if so, is it a difference so big to be worth >> switching? >> >> I ask this because I had expected to see pl/pgsql as a clear winner in terms >> of performance over pltclu, but my initial test showed the opposite. I know >> this may be an apples vs oranges problem and I will test further, but if >> anyone has any advice or insight, I would appreciate it so I can tailor my >> tests accordingly. >> > > A performance strongly depends on use case. > > PL/pgSQL has fast start but any expression is evaluated as simple SQL > expression - and some repeated operation should be very expensive - > array update, string update. PL/pgSQL is best as SQL glue. Positive to > performance is type compatibility between plpgsql and Postgres. > Interpret plpgsql is very simply - there are +/- zero optimizations - > plpgsql code should be minimalistic, but when you don't do some really > wrong, then a speed is comparable with PHP. > > http://www.pgsql.cz/index.php/PL/pgSQL_%28en%29#Inappropriate_use_of_the_PL.2FpgSQL_language > > PL/Perl has slower start - but string or array operations are very > fast. Perl has own expression evaluator - faster than expression > evaluation in plpgsql. On second hand - any input must be transformed > from postgres format to perl format and any result must be transformed > too. Perl and other languages doesn't use data type compatible with > Postgres. One big advantage pl/pgsql has over scripting languages is that it understands postgresql types natively. It knows what a postgres array is, and can manipulate one directly. pl/perl would typically have to have the database convert it to a string, parse it into a perl structure, do the manipulation, then send it to the database to be parsed again. If your procedure code is mainly moving data between tables and doing minimal intermediate heavy processing, this adds up to a big advantage. Which pl to go with really depends on what you need to do. pl/pgsql is always my first choice though. perl and tcl are not particularly fast languages in the general case -- you are largely at the mercy of how well the language's syntax or library features map to the particular problem you're solving. if you need a fast general purpose language in the backend and are (very understandably) skeptical about C, I'd look at pl/java. merlin
On 12/27/2011 05:54 PM, Merlin Moncure wrote: > On Tue, Dec 27, 2011 at 4:20 PM, Pavel Stehule<pavel.stehule@gmail.com> wrote: >> Hello >> >> 2011/12/27 Carlo Stonebanks<stonec.register@sympatico.ca>: >>> We are currently using pltclu as our PL of choice AFTER plpgSql. >>> >>> I'd like to know if anyone can comment on the performance costs of the >>> various PL languages BESIDES C. For example, does pltclu instantiate faster >>> than pltcl (presumably because it uses a shared interpreter?) Is Perl more >>> lightweight? >>> >>> I know that everything depends on context - what you are doing with it, e.g. >>> choose Tcl for string handling vs. Perl for number crunching - but for those >>> who know about this, is there a clear performance advantage for any of the >>> various PL languages - and if so, is it a difference so big to be worth >>> switching? >>> >>> I ask this because I had expected to see pl/pgsql as a clear winner in terms >>> of performance over pltclu, but my initial test showed the opposite. I know >>> this may be an apples vs oranges problem and I will test further, but if >>> anyone has any advice or insight, I would appreciate it so I can tailor my >>> tests accordingly. >>> >> A performance strongly depends on use case. >> >> PL/pgSQL has fast start but any expression is evaluated as simple SQL >> expression - and some repeated operation should be very expensive - >> array update, string update. PL/pgSQL is best as SQL glue. Positive to >> performance is type compatibility between plpgsql and Postgres. >> Interpret plpgsql is very simply - there are +/- zero optimizations - >> plpgsql code should be minimalistic, but when you don't do some really >> wrong, then a speed is comparable with PHP. >> >> http://www.pgsql.cz/index.php/PL/pgSQL_%28en%29#Inappropriate_use_of_the_PL.2FpgSQL_language >> >> PL/Perl has slower start - but string or array operations are very >> fast. Perl has own expression evaluator - faster than expression >> evaluation in plpgsql. On second hand - any input must be transformed >> from postgres format to perl format and any result must be transformed >> too. Perl and other languages doesn't use data type compatible with >> Postgres. > One big advantage pl/pgsql has over scripting languages is that it > understands postgresql types natively. It knows what a postgres array > is, and can manipulate one directly. pl/perl would typically have to > have the database convert it to a string, parse it into a perl > structure, do the manipulation, then send it to the database to be > parsed again. If your procedure code is mainly moving data between > tables and doing minimal intermediate heavy processing, this adds up > to a big advantage. Which pl to go with really depends on what you > need to do. pl/pgsql is always my first choice though. > > perl and tcl are not particularly fast languages in the general case > -- you are largely at the mercy of how well the language's syntax or > library features map to the particular problem you're solving. if you > need a fast general purpose language in the backend and are (very > understandably) skeptical about C, I'd look at pl/java. > PLV8, which is not yet ready for prime time, maps many common Postgres types into native JS types without the use of Input/Output functions, which means the conversion is very fast. It's work which could very well do with repeating for the other PL's. cheers andrew
Thanks guys. Ah, Pl/java - of course. I would miss writing the code right in the SQL script, but that would have been true of C as well. None of these procedures really qualify as stored procs that move data; rather they are scalar functions used for fuzzy string comparisons based on our own domain logic - imagine something like, SELECT * FROM fathers AS f, sons AS s WHERE same_name(f.last_name, s.last_name) ... and same_name had business logic that corrected for O'reilly vs oreilly, Van De Lay vs Vandelay, etc. The point is that as we learn about the domain, we would add the rules into the function same_name() so that all apps would benefit from the new rules. Some of the functions are data-driven, for example a table of common abbreviations with regex or LIKE expressions that would be run against both strings so that each string is reduced to common abbreviations (i.e. lowest common denominator) then compared, e.g. SELECT * FROM companies AS c WHERE same_business_name(s, 'ACME Business Supplies, Incorporated') Would reduce both parameters down to the most common abbreviation and then compare again with fuzzy logic. Of course, even if this was written in C, the function would be data-bound as it read from the abbreviation table - unless you guys tell that there is a not inconsiderable cost involved in type conversion from PG to internal vars. Carlo -----Original Message----- From: Merlin Moncure [mailto:mmoncure@gmail.com] Sent: December 27, 2011 5:54 PM To: Pavel Stehule Cc: Carlo Stonebanks; pgsql-performance@postgresql.org Subject: Re: [PERFORM] Performance costs of various PL languages On Tue, Dec 27, 2011 at 4:20 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > Hello > > 2011/12/27 Carlo Stonebanks <stonec.register@sympatico.ca>: >> We are currently using pltclu as our PL of choice AFTER plpgSql. >> >> I'd like to know if anyone can comment on the performance costs of the >> various PL languages BESIDES C. For example, does pltclu instantiate faster >> than pltcl (presumably because it uses a shared interpreter?) Is Perl more >> lightweight? >> >> I know that everything depends on context - what you are doing with it, e.g. >> choose Tcl for string handling vs. Perl for number crunching - but for those >> who know about this, is there a clear performance advantage for any of the >> various PL languages - and if so, is it a difference so big to be worth >> switching? >> >> I ask this because I had expected to see pl/pgsql as a clear winner in terms >> of performance over pltclu, but my initial test showed the opposite. I know >> this may be an apples vs oranges problem and I will test further, but if >> anyone has any advice or insight, I would appreciate it so I can tailor my >> tests accordingly. >> > > A performance strongly depends on use case. > > PL/pgSQL has fast start but any expression is evaluated as simple SQL > expression - and some repeated operation should be very expensive - > array update, string update. PL/pgSQL is best as SQL glue. Positive to > performance is type compatibility between plpgsql and Postgres. > Interpret plpgsql is very simply - there are +/- zero optimizations - > plpgsql code should be minimalistic, but when you don't do some really > wrong, then a speed is comparable with PHP. > > http://www.pgsql.cz/index.php/PL/pgSQL_%28en%29#Inappropriate_use_of_the_PL. 2FpgSQL_language > > PL/Perl has slower start - but string or array operations are very > fast. Perl has own expression evaluator - faster than expression > evaluation in plpgsql. On second hand - any input must be transformed > from postgres format to perl format and any result must be transformed > too. Perl and other languages doesn't use data type compatible with > Postgres. One big advantage pl/pgsql has over scripting languages is that it understands postgresql types natively. It knows what a postgres array is, and can manipulate one directly. pl/perl would typically have to have the database convert it to a string, parse it into a perl structure, do the manipulation, then send it to the database to be parsed again. If your procedure code is mainly moving data between tables and doing minimal intermediate heavy processing, this adds up to a big advantage. Which pl to go with really depends on what you need to do. pl/pgsql is always my first choice though. perl and tcl are not particularly fast languages in the general case -- you are largely at the mercy of how well the language's syntax or library features map to the particular problem you're solving. if you need a fast general purpose language in the backend and are (very understandably) skeptical about C, I'd look at pl/java. merlin