Re: inheritance: planning time vs children number vs column number - Mailing list pgsql-performance

From Tom Lane
Subject Re: inheritance: planning time vs children number vs column number
Date
Msg-id 4188.1298960419@sss.pgh.pa.us
Whole thread Raw
In response to Re: inheritance: planning time vs children number vs column number  (Marc Cousin <cousinmarc@gmail.com>)
Responses Re: inheritance: planning time vs children number vs column number  (Marc Cousin <cousinmarc@gmail.com>)
List pgsql-performance
Marc Cousin <cousinmarc@gmail.com> writes:
> The Monday 28 February 2011 16:35:37, Tom Lane wrote :
>> Could we see a concrete example demonstrating that?  I agree with Heikki
>> that it's not obvious what you are testing that would have such behavior.
>> I can think of places that would have O(N^2) behavior in the length of
>> the targetlist, but it seems unlikely that they'd come to dominate
>> runtime at a mere 1000 columns.

> I feel a little silly not having provided a test case from the start…

> A script doing a complete test is attached to this email.

I did some oprofile analysis of this test case.  It's spending
essentially all its time in SearchCatCache, on failed searches of
pg_statistic.  The cache accumulates negative entries for each probed
column, and then the searches take time proportional to the number of
entries, so indeed there is an O(N^2) behavior --- but N is the number
of columns times number of tables in your test case, not just the number
of columns.

The cache is a hash table, so ideally the search time would be more or
less constant as the table grows, but to make that happen we'd need to
reallocate with more buckets as the table grows, and catcache.c doesn't
do that yet.  We've seen a few cases that make that look worth doing,
but they tend to be pretty extreme, like this one.

It's worth pointing out that the only reason this effect is dominating
the runtime is that you don't have any statistics for these toy test
tables.  If you did, the cycles spent using those entries would dwarf
the lookup costs, I think.  So it's hard to get excited about doing
anything based on this test case --- it's likely the bottleneck would be
somewhere else entirely if you'd bothered to load up some data.

            regards, tom lane

pgsql-performance by date:

Previous
From: Selva manickaraja
Date:
Subject: Re: Load and Stress on PostgreSQL 9.0
Next
From: Marc Cousin
Date:
Subject: Re: inheritance: planning time vs children number vs column number