Re: Performance costs of various PL languages - Mailing list pgsql-performance
From | Carlo Stonebanks |
---|---|
Subject | Re: Performance costs of various PL languages |
Date | |
Msg-id | 1C5807A6935044BEB4E24BCD0201B350@CAPRICA Whole thread Raw |
In response to | Re: Performance costs of various PL languages (Merlin Moncure <mmoncure@gmail.com>) |
List | pgsql-performance |
Thanks guys. Ah, Pl/java - of course. I would miss writing the code right in the SQL script, but that would have been true of C as well. None of these procedures really qualify as stored procs that move data; rather they are scalar functions used for fuzzy string comparisons based on our own domain logic - imagine something like, SELECT * FROM fathers AS f, sons AS s WHERE same_name(f.last_name, s.last_name) ... and same_name had business logic that corrected for O'reilly vs oreilly, Van De Lay vs Vandelay, etc. The point is that as we learn about the domain, we would add the rules into the function same_name() so that all apps would benefit from the new rules. Some of the functions are data-driven, for example a table of common abbreviations with regex or LIKE expressions that would be run against both strings so that each string is reduced to common abbreviations (i.e. lowest common denominator) then compared, e.g. SELECT * FROM companies AS c WHERE same_business_name(s, 'ACME Business Supplies, Incorporated') Would reduce both parameters down to the most common abbreviation and then compare again with fuzzy logic. Of course, even if this was written in C, the function would be data-bound as it read from the abbreviation table - unless you guys tell that there is a not inconsiderable cost involved in type conversion from PG to internal vars. Carlo -----Original Message----- From: Merlin Moncure [mailto:mmoncure@gmail.com] Sent: December 27, 2011 5:54 PM To: Pavel Stehule Cc: Carlo Stonebanks; pgsql-performance@postgresql.org Subject: Re: [PERFORM] Performance costs of various PL languages On Tue, Dec 27, 2011 at 4:20 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > Hello > > 2011/12/27 Carlo Stonebanks <stonec.register@sympatico.ca>: >> We are currently using pltclu as our PL of choice AFTER plpgSql. >> >> I'd like to know if anyone can comment on the performance costs of the >> various PL languages BESIDES C. For example, does pltclu instantiate faster >> than pltcl (presumably because it uses a shared interpreter?) Is Perl more >> lightweight? >> >> I know that everything depends on context - what you are doing with it, e.g. >> choose Tcl for string handling vs. Perl for number crunching - but for those >> who know about this, is there a clear performance advantage for any of the >> various PL languages - and if so, is it a difference so big to be worth >> switching? >> >> I ask this because I had expected to see pl/pgsql as a clear winner in terms >> of performance over pltclu, but my initial test showed the opposite. I know >> this may be an apples vs oranges problem and I will test further, but if >> anyone has any advice or insight, I would appreciate it so I can tailor my >> tests accordingly. >> > > A performance strongly depends on use case. > > PL/pgSQL has fast start but any expression is evaluated as simple SQL > expression - and some repeated operation should be very expensive - > array update, string update. PL/pgSQL is best as SQL glue. Positive to > performance is type compatibility between plpgsql and Postgres. > Interpret plpgsql is very simply - there are +/- zero optimizations - > plpgsql code should be minimalistic, but when you don't do some really > wrong, then a speed is comparable with PHP. > > http://www.pgsql.cz/index.php/PL/pgSQL_%28en%29#Inappropriate_use_of_the_PL. 2FpgSQL_language > > PL/Perl has slower start - but string or array operations are very > fast. Perl has own expression evaluator - faster than expression > evaluation in plpgsql. On second hand - any input must be transformed > from postgres format to perl format and any result must be transformed > too. Perl and other languages doesn't use data type compatible with > Postgres. One big advantage pl/pgsql has over scripting languages is that it understands postgresql types natively. It knows what a postgres array is, and can manipulate one directly. pl/perl would typically have to have the database convert it to a string, parse it into a perl structure, do the manipulation, then send it to the database to be parsed again. If your procedure code is mainly moving data between tables and doing minimal intermediate heavy processing, this adds up to a big advantage. Which pl to go with really depends on what you need to do. pl/pgsql is always my first choice though. perl and tcl are not particularly fast languages in the general case -- you are largely at the mercy of how well the language's syntax or library features map to the particular problem you're solving. if you need a fast general purpose language in the backend and are (very understandably) skeptical about C, I'd look at pl/java. merlin
pgsql-performance by date: