Re: arrays and indexes - Mailing list pgsql-performance

From Greg Stark
Subject Re: arrays and indexes
Date
Msg-id 87ekmyul22.fsf@stark.xeocode.com
Whole thread Raw
In response to Re: arrays and indexes  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Tom Lane <tgl@sss.pgh.pa.us> writes:

> > I still don't really know why it failed, but after two days building the
> > index I gave up.
>
> Sounds like a bug to me.  Could you put together a test case?

At the time I contacted one of the GiST authors and we went over things for a
while. They diagnosed the problem as being caused by having a poor selectivity
GiST btree as the leading column in the index.

He seemed to think this was fairly fundamental and wasn't something they were
going to be able to address. And I was fairly certain I didn't want to turn
the index upside down to have the more selective columns first (as is usually
normal) for various reasons.

So I gave it up as a lost cause. In any case in my application it was unlikely
to really help. I expect that leading btree index to narrow the search to only
a few hundred or few thousand records in the normal case. So the access times
are already within reason even having to dig through all the records. And
since other queries are likely to need other records from that set I'll need
them all in cache eventually. There are a lot of array columns to search
through, so the added i/o to read all those indexes would probably be a net
loss when they push other things out of cache.

I could try setting up a test case, but I think all it took was having a
btree-gist index that was insufficiently selective. In my case I had about 900
integer values each on the order of 100-1000 records.

--
greg

pgsql-performance by date:

Previous
From: Christopher Kings-Lynne
Date:
Subject: Re: Timestamp-based indexing
Next
From: Hervé Piedvache
Date:
Subject: Little understanding for tuning ...