Re: A very long running query.... - Mailing list pgsql-performance

From Ioannis Anagnostopoulos
Subject Re: A very long running query....
Date
Msg-id 5009F032.40002@anatec.com
Whole thread Raw
In response to Re: A very long running query....  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
On 21/07/2012 00:10, Tom Lane wrote:
> Claudio Freire <klaussfreire@gmail.com> writes:
>> Looking at this:
>> "                    ->  Index Scan using
>> idx_message_copies_wk2_date_src_pos_partial on message_copies_wk2
>> message_copies  (cost=0.00..19057.93 rows=52 width=32) (actual
>> time=62.124..5486270.845 rows=387524 loops=1)"
>> "                          Index Cond: ((date_trunc('day'::text,
>> msg_date_rec) = '2012-07-17 00:00:00'::timestamp without time zone)
>> AND (src_id = 1))"
>> "                          Filter: ((date_part('day'::text,
>> msg_date_rec) = 17::double precision) AND (NOT (((((pos_georef1)::text
>> || (pos_georef2)::text) || (pos_georef3)::text) ||
>> (pos_georef4)::text) IS NULL)) AND (((((pos_georef1)::text ||
>> (pos_georef2)::text) || (pos_georef3)::text) || (pos_georef4)::text)
>> <> ''::text))"
> I think the real problem is that the planner has no hope of doing
> anything very accurate with such an unwieldy filter condition.  I'd look
> at ways of making the filter conditions simpler, perhaps by recasting
> the data representation.  In particular, that's a horridly bad way of
> asking whether some columns are empty, which I gather is the intent.
> If you really want to do it just like that, creating an index on the
> concatenation expression would guide ANALYZE to collect some stats about
> it, but it would probably be a lot more efficient to put together an AND
> or OR of tests on the individual columns.
>
>             regards, tom lane
So what you suggest is to forget all together the concatenation of the
georef1/2/3/4 and instead alter my query with something like:

georef1 is not null and not georeg1 = ''....etc for georef2 3 and 4

That would require to alter my index and have the four georef columns
separately in it and not as a concatenation and so on for the partial
index part. And a final thing, you seem to imply that the indexes are
used by the analyser to collect statistics even if they are not used. So
an index serves not only as a way to speed up targeted queries but also
to provide better statistics to the analyzer?

Kind Regards
Yiannis

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: A very long running query....
Next
From: Craig Ringer
Date:
Subject: Re: A very long running query....