Re: nested loop semijoin estimates - Mailing list pgsql-hackers

From Tom Lane
Subject Re: nested loop semijoin estimates
Date
Msg-id 28688.1433255879@sss.pgh.pa.us
In response to Re: nested loop semijoin estimates  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: nested loop semijoin estimates  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Re: nested loop semijoin estimates  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> OK, so I did the testing today - with TPC-H and TPC-DS benchmarks. The 
> results are good, IMHO.

> With TPC-H, I've used 1GB and 4GB datasets, and I've seen no plan
> changes at all. I don't plan to run the tests on larger data sets; I
> expect the behavior to remain the same, considering the uniformity of
> TPC-H data sets.

> With TPC-DS (using the 63 queries supported by PostgreSQL), I've seen
> two cases of plan changes - see the plans attached. In both cases,
> however, the plan change results in much better performance. While on
> master the queries took 23 and 18 seconds, with the two patches they
> take only 7 and 3 seconds. This is just the 1GB dataset. I'll repeat
> the test with the 4GB dataset and post an update if there are any changes.

I'm a bit disturbed by that, because AFAICS from the plans, these queries
did not involve any semi or anti joins, which should mean that the patch
would not have changed the planner's behavior.  You were using the second
patch as-posted, right, without further hacking on
compare_path_costs_fuzzily?

It's possible that the change was due to random variation in ANALYZE
statistics, in which case it was just luck.
        regards, tom lane


