Home > mailing lists

Re: Performance query about large tables, lots of concurrent access - Mailing list pgsql-performance

From	Tom Lane
Subject	Re: Performance query about large tables, lots of concurrent access
Date	June 19, 2007 11:56:26
Msg-id	28036.1182264975@sss.pgh.pa.us Whole thread Raw
In response to	Re: Performance query about large tables, lots of concurrent access (Karl Wright <kwright@metacarta.com>)
Responses	Re: Performance query about large tables, lots of concurrent access
List	pgsql-performance

Tree view

Karl Wright <kwright@metacarta.com> writes:
> [2007-06-18 09:39:49,797]ERROR  Plan: Index Scan using i1181764142395 on
> intrinsiclink  (cost=0.00..14177.29 rows=5 width=253)
> [2007-06-18 09:39:49,797]ERROR  Plan:   Index Cond: ((jobid = $2) AND
> ((childidhash)::text = ($3)::text))
> [2007-06-18 09:39:49,797]ERROR  Plan:   Filter: ((childid = ($4)::text)
> AND ((isnew = ($5)::bpchar) OR (isnew = ($6)::bpchar)))

>> In this case it looks like the planner is afraid that that's exactly
>> what will happen --- a cost of 14177 suggests that several thousand row
>> fetches are expected to happen, and yet it's only predicting 5 rows out
>> after the filter.

> Well, that's odd, because the hash in question that it is using is the
> SHA-1 hash of a URL.  There's essentially one row per URL in this table.

What about isnew?

Also, how many rows do *you* expect out of the query?  The planner is
not going to be aware of the hashed relationship between childidhash
and childid --- it'll think those are independent conditions which they
evidently aren't.  So it may be that the query really does retrieve
thousands of rows, and the rows=5 estimate is bogus because it's
double-counting the selectivity of the childid condition.

            regards, tom lane

pgsql-performance by date:

From: Karl Wright
Date: 19 June 2007, 11:48:33
Subject: Re: Performance query about large tables, lots of concurrent access

From: Chris Browne
Date: 19 June 2007, 12:24:00
Subject: Re: Maintenance question / DB size anomaly...

Re: Performance query about large tables, lots of concurrent access - Mailing list pgsql-performance

Previous

Next