Re: Nested loop performance - Mailing list pgsql-performance

From Nick Fankhauser
Subject Re: Nested loop performance
Date
Msg-id NEBBLAAHGLEEPCGOBHDGMEPAKBAA.nickf@ontko.com
In response to Re: Nested loop performance  (Richard Poole <richard@ruthie.org>)
List pgsql-performance
> It seems that your basic problem is that you're fetching lots of rows
> from two big ol' tables.

> It doesn't seem to me that there would be a substantially better plan
> for this query with your tables as they stand.

That's more or less the conclusion I had come to. I was just hoping someone
else could point out an approach I've been missing. (sigh!)



> If your data were more
> normalised, then your big scans might be quicker (because their rows
> would be smaller so they would hit fewer disk pages),

This started off as a 5-table join on well-normalized data. Unfortunately,
the actor table doesn't get any smaller, and the work involved in
calculating the "case_count" information on the fly was clearly becoming a
problem, particularly with actors that had a heavy caseload. (Busy attorneys
and judges.) The actor_summary approach makes these previous problem cases
go away, but the payback is that (as you correctly pointed out) queries on
average citizens who only have one case suffer from the de-normalized
approach.
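
For illustration, the summary-table idea boils down to something like the
following (table and column names here are made up for the example; our real
schema differs):

  -- Pre-compute the per-actor case count so the aggregate doesn't have
  -- to run at query time against the big tables.
  CREATE TABLE actor_summary AS
      SELECT a.actor_id,
             a.actor_full_name,
             count(c.case_id) AS case_count
        FROM actor a
        LEFT JOIN case_data c USING (actor_id)
       GROUP BY a.actor_id, a.actor_full_name;

  CREATE INDEX actor_summary_actor_id_idx ON actor_summary (actor_id);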

We're currently considering the approach of just returning all of the rows
to our application, and doing the aggregation and limit work in the app. The
inconsistency of the data makes it very tough for the query planner to come
up with a strategy that is always a winner.
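
Roughly, that would mean running just the raw join and doing the counting and
LIMIT in the application, along the lines of (again, names are only
illustrative):

  -- Fetch the raw joined rows; the application groups by actor and
  -- applies the limit, instead of asking the planner to pick one
  -- strategy that works for both one-case actors and busy attorneys.
  SELECT a.actor_id,
         a.actor_full_name,
         c.case_id
    FROM actor a
    JOIN case_data c USING (actor_id)
   WHERE a.actor_full_name LIKE 'SMITH%'
   ORDER BY a.actor_id;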

Thanks for your thoughts!

-Nick


