Re: DBLink: interesting issue - Mailing list pgsql-general

From Joe Conway
Subject Re: DBLink: interesting issue
Date
Msg-id 3D8BB0FE.2010206@joeconway.com
Whole thread Raw
In response to DBLink: interesting issue  ("Oleg Lebedev" <oleg.lebedev@waterford.org>)
List pgsql-general
Oleg Lebedev wrote:
> I am using dbLink library to synchronize tables in two databases. Below
> are two identical in terms of their results queries, one is using
> NOT EXISTS and another NOT IN. The later one is 100 time slower then the
> former one according to EXPLAIN. However, the former one throws an
> interesting exception every time I run it. Interestingly, it works fine
> if the local table is empty. Any ideas why it happens?

I need more information. What version of PostgreSQL, and is dblink the version
distributed with Postgres, or is it a later copy from CVS?

I have a machine running 7.2.2 and a CVS (Version 0.4, ~mid April, 2002) copy
of dblink, and cannot reproduce the failure:

testmst=# select version();
                            version
-------------------------------------------------------------
  PostgreSQL 7.2.2 on i686-pc-linux-gnu, compiled by GCC 2.96
(1 row)

testslv=# SELECT objectid FROM activity WHERE NOT EXISTS (SELECT remoteid FROM
(SELECT CAST (dblink_tok(t1.dblink_p, 0) AS int8) AS remoteid FROM (SELECT
dblink('dbname=testmst','SELECT objectid FROM activity') AS dblink_p) t1) a1
WHERE remoteid=activity.objectid);
  objectid
----------
(0 rows)

Early in 7.3 development I updated dblink in a 7.2 backward compatible way.
Unfortunately, it wasn't practical to keep it backward compatible to 7.2.x, so
the current CVS HEAD version will no longer work with 7.2.x. If you would like
to try dblink Version 0.4, let me know and I'll send you a patch.

As a side note, 7.3 (currently in beta) has much improved semantics for use of
dblink, made possible by the new table function capability. E.g.:

regression_slave=# select version();
                                                     version
----------------------------------------------------------------------------------------------------------------
  PostgreSQL 7.3b1 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.1
20020205 (Red Hat Linux Rawhide 3.1-0.21)
(1 row)

regression_slave=# SELECT objectid FROM activity WHERE NOT EXISTS (SELECT
remoteid FROM dblink('dbname=regression','SELECT objectid FROM activity') AS
t1(remoteid int8) WHERE remoteid=activity.objectid);
  objectid
----------
         5
(1 row)

Or even better performance-wise:

regression_slave=# SELECT objectid FROM activity WHERE NOT EXISTS (SELECT
remoteid FROM dblink('dbname=regression','SELECT objectid FROM activity WHERE
objectid = ' || activity.objectid) AS t1(remoteid int8));
  objectid
----------
         5
(1 row)

If you can, please give the beta (Postgres and dblink) a try (not for
production yet, of course, but for feedback).

Thanks,

Joe


pgsql-general by date:

Previous
From: "Oleg Lebedev"
Date:
Subject: DBLink: interesting issue
Next
From: kbase
Date:
Subject: User Defined Functions