Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta) - Mailing list pgsql-hackers

From Stephan Szabo
Subject Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)
Date
Msg-id 20030929112706.U94710@megazone.bigpanda.com
Whole thread Raw
In response to Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)
Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)
List pgsql-hackers
On Sun, 28 Sep 2003, Tom Lane wrote:

> Stephan Szabo <sszabo@megazone.bigpanda.com> writes:
> > On Sat, 27 Sep 2003, Tom Lane wrote:
> >> I thought of what seems to be a better design for the check query: use
> >> a LEFT JOIN and check for NULL in the righthand joined column.
>
> > Hmm, my initial testing showed that it really was a little slower
> > than a more complicated one with NOT EXISTS so I'd abandoned it. How does
> > it fare for you compared to:
> > select f1, f2 from fk where not exists (select 1 from pk where pk.f1=fk.f1
> > and pk.f2=pk.f2) where fk.f1 is not null and fk.f2 is not null;
>
> Were you testing against 7.3 or 7.4?  On what kinds of tables?
>
> In 7.4 I think that the JOIN would yield as good or better a plan.  The
> best possible plan for the NOT EXISTS query is effectively a nestloop
> with inner indexscan, which is great if the FK table is small and the
> PK table is large, but it sucks otherwise.  The planner should choose a
> plan of this form for the LEFT JOIN given that combination of table
> sizes, and so there shouldn't be any great difference in runtime in that
> case.  But in other combinations, such as large FK and small PK, other
> plan types will beat the pants off nestloop.

As an update, so far I still am getting better results with NOT EXISTS
than the left join.  For a 50m row fk, 10k row pk where the rows are just
the keys, I'm getting a plan like
Merge JoinIndex scan on pktableSort Seqscan on fktable

which is taking about 2-4 times longer for me than the not exists
depending on sort_mem (at 4096,64000,128000).

When I lowered random_page_cost to 1, I got an indexscan on fktable, but
that hadn't seemed to finish after about 2 hours (as opposed to about
30-35 minutes for the not exists and about 30 min - 1 1/2 hours for the
sort+seqscan version.

I want to do some more tests where there's extraneous data in both tables
and see what that does to the results.



pgsql-hackers by date:

Previous
From: "Marc G. Fournier"
Date:
Subject: Re: 7.4 status
Next
From: Stephan Szabo
Date:
Subject: Re: ADD FOREIGN KEY (was Re: [GENERAL] 7.4Beta)