Re: inheritance: planning time vs children number vs column number - Mailing list pgsql-performance

From Marc Cousin
Subject Re: inheritance: planning time vs children number vs column number
Date
Msg-id 201103010842.42128.cousinmarc@gmail.com
Whole thread Raw
In response to Re: inheritance: planning time vs children number vs column number  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: inheritance: planning time vs children number vs column number
List pgsql-performance
Le mardi 01 mars 2011 07:20:19, Tom Lane a écrit :
> Marc Cousin <cousinmarc@gmail.com> writes:
> > The Monday 28 February 2011 16:35:37, Tom Lane wrote :
> >> Could we see a concrete example demonstrating that?  I agree with Heikki
> >> that it's not obvious what you are testing that would have such
> >> behavior. I can think of places that would have O(N^2) behavior in the
> >> length of the targetlist, but it seems unlikely that they'd come to
> >> dominate runtime at a mere 1000 columns.
> >
> > I feel a little silly not having provided a test case from the startق€�
> >
> > A script doing a complete test is attached to this email.
>
> I did some oprofile analysis of this test case.  It's spending
> essentially all its time in SearchCatCache, on failed searches of
> pg_statistic.  The cache accumulates negative entries for each probed
> column, and then the searches take time proportional to the number of
> entries, so indeed there is an O(N^2) behavior --- but N is the number
> of columns times number of tables in your test case, not just the number
> of columns.
>
> The cache is a hash table, so ideally the search time would be more or
> less constant as the table grows, but to make that happen we'd need to
> reallocate with more buckets as the table grows, and catcache.c doesn't
> do that yet.  We've seen a few cases that make that look worth doing,
> but they tend to be pretty extreme, like this one.
>
> It's worth pointing out that the only reason this effect is dominating
> the runtime is that you don't have any statistics for these toy test
> tables.  If you did, the cycles spent using those entries would dwarf
> the lookup costs, I think.  So it's hard to get excited about doing
> anything based on this test case --- it's likely the bottleneck would be
> somewhere else entirely if you'd bothered to load up some data.
>
>             regards, tom lane

Yes, for the same test case, with a bit of data in every partition and
statistics up to date, planning time goes from 20 seconds to 125ms for the 600
children/1000 columns case. Which is of course more than acceptable.

Now I've got to check it's the same problem on the real environment. I think
it has quite a few empty partitions, so no statistics for them…

Thanks a lot.

Marc

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: inheritance: planning time vs children number vs column number
Next
From: Joby Joba
Date:
Subject: Two different execution plans for similar requests