Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables) - Mailing list pgsql-hackers
| From | Daniel Farina |
|---|---|
| Subject | Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables) |
| Date | |
| Msg-id | CAAZKuFa0H8XxRF9PgKcHj=ffBZNHibcBQJLJBX67nc0ozu-wMA@mail.gmail.com Whole thread Raw |
| In response to | Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3] writable foreign tables) (Daniel Farina <daniel@heroku.com>) |
| Responses |
Re: postgres_fdw vs data formatting GUCs (was Re: [v9.3]
writable foreign tables)
|
| List | pgsql-hackers |
On Tue, Mar 19, 2013 at 3:06 PM, Daniel Farina <daniel@heroku.com> wrote:
> On Tue, Mar 19, 2013 at 2:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Daniel Farina <daniel@heroku.com> writes:
>>> On Tue, Mar 19, 2013 at 1:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> I'd be inclined to eat the cost of calling PQparameterStatus every time
>>>> (which won't be that much) and instead try to optimize by avoiding the
>>>> GUC-setting overhead if the current value matches the local setting.
>>>> But even that might be premature optimization. Did you do any
>>>> performance testing to see if there was a problem worth avoiding?
>>
>>> Nope; should I invent a new way to do that, or would it be up to
>>> commit standard based on inspection alone? I'm okay either way, let
>>> me know.
>>
>> Doesn't seem that hard to test: run a dblink query that pulls back a
>> bunch of data under best-case conditions (ie, local not remote server),
>> and time it with and without the proposed fix. If the difference
>> is marginal then it's not worth working hard to optimize.
>
> Okay, will do, and here's the shorter and less mechanically intensive
> naive version that I think is the baseline: it doesn't try to optimize
> out any GUC settings and sets up the GUCs before the two
> materialization paths in dblink.
The results. Summary: seems like grabbing the GUC and strcmp-ing is
worthwhile, but the amount of ping-ponging between processes adds some
noise to the timing results: utilization is far short of 100% on
either processor involved. Attached is a cumulative diff of the new
version, and also reproduced below are the changes to v2 that make up
v3.
## Benchmark
SELECT dblink_connect('benchconn','dbname=contrib_regression');
CREATE FUNCTION bench() RETURNS integer AS $$
DECLARE
iterations integer;
BEGIN
iterations := 0;
WHILE iterations < 300000 LOOP
PERFORM * FROM dblink('benchconn', 'SELECT 1') AS t(a int);
iterations := iterations + 1;
END LOOP;
RETURN iterations;
END;
$$ LANGUAGE plpgsql;
SELECT clock_timestamp() INTO TEMP beginning;
SELECT bench();
SELECT clock_timestamp() INTO TEMP ending;
SELECT 'dblink-benchmark-lines';
SELECT ending.clock_timestamp - beginning.clock_timestamp
FROM beginning, ending;
## Data Setup
CREATE TEMP TABLE bench_results (version text, span interval);
COPY bench_results FROM STDIN WITH CSV;
no-patch,@ 41.308122 secs
no-patch,@ 36.63597 secs
no-patch,@ 34.264119 secs
no-patch,@ 34.760179 secs
no-patch,@ 32.991257 secs
no-patch,@ 34.538258 secs
no-patch,@ 42.576354 secs
no-patch,@ 39.335557 secs
no-patch,@ 37.493206 secs
no-patch,@ 37.812205 secs
v2-applied,@ 36.550553 secs
v2-applied,@ 38.608723 secs
v2-applied,@ 39.415744 secs
v2-applied,@ 46.091052 secs
v2-applied,@ 45.425438 secs
v2-applied,@ 48.219506 secs
v2-applied,@ 43.514878 secs
v2-applied,@ 45.892302 secs
v2-applied,@ 48.479335 secs
v2-applied,@ 47.632041 secs
v3-strcmp,@ 32.524385 secs
v3-strcmp,@ 34.982098 secs
v3-strcmp,@ 34.487222 secs
v3-strcmp,@ 44.394681 secs
v3-strcmp,@ 44.638309 secs
v3-strcmp,@ 44.113088 secs
v3-strcmp,@ 45.497769 secs
v3-strcmp,@ 33.530176 secs
v3-strcmp,@ 32.9704 secs
v3-strcmp,@ 40.84764 secs
\.
=> SELECT version, avg(extract(epoch from span)), stddev(extract(epoch
from span))
FROM bench_results
GROUP BY version;
version | avg | stddev
------------+------------+------------------
no-patch | 37.1715227 | 3.17076487912923
v2-applied | 43.9829572 | 4.30572672565711
v3-strcmp | 38.7985768 | 5.54760393720725
## Changes to v2:
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -2981,9 +2981,11 @@ initRemoteGucs(remoteGucs *rgs, PGconn *conn)
static void
applyRemoteGucs(remoteGucs *rgs)
{
- int i;
const int numGucs = sizeof parseAffectingGucs / sizeof *parseAffectingGucs;
+ int i;
+ int addedGucNesting = false;
+
/*
* Affected types require local GUC manipulations. Create a new
* GUC NestLevel to overlay the remote settings.
@@ -2992,14 +2994,27 @@ applyRemoteGucs(remoteGucs *rgs)
* structure, so expect it to come with an invalid NestLevel.
*/
Assert(rgs->localGUCNestLevel == -1);
- rgs->localGUCNestLevel = NewGUCNestLevel();
for (i = 0; i < numGucs; i += 1)
{
+ int gucApplyStatus;
+
const char *gucName = parseAffectingGucs[i];
const char *remoteVal = PQparameterStatus(rgs->conn, gucName);
+ const char *localVal = GetConfigOption(gucName, true, true);
- int gucApplyStatus;
+ /*
+ * Attempt to avoid GUC setting if the remote and local GUCs
+ * already have the same value.
+ */
+ if (strcmp(remoteVal, localVal) == 0)
+ continue;
+
+ if (!addedGucNesting)
+ {
+ rgs->localGUCNestLevel = NewGUCNestLevel();
+ addedGucNesting = true;
+ }
gucApplyStatus = set_config_option(gucName, remoteVal,
PGC_USERSET, PGC_S_SESSION,
--
fdr
Attachment
pgsql-hackers by date: