Re: PostgreSQL using the wrong Index - Mailing list pgsql-performance

From Alex Stapleton
Subject Re: PostgreSQL using the wrong Index
Date
Msg-id A07C7E6A-6E84-476A-AEA3-1FE9344870BF@advfn.com
Whole thread Raw
In response to Re: PostgreSQL using the wrong Index  (John A Meinel <john@arbash-meinel.com>)
List pgsql-performance
On 13 Jun 2005, at 15:47, John A Meinel wrote:

> Alex Stapleton wrote:
>
>
>> Oh, we are running 7.4.2 btw. And our random_page_cost = 1
>>
>>
> Which is only correct if your entire db fits into memory. Also, try
> updating to a later 7.4 version if at all possible.
>

I am aware of this, I didn't configure this machine though
unfortuantely.

>> On 13 Jun 2005, at 14:02, Alex Stapleton wrote:
>>
>>
>>> We have two index's like so
>>>
>>> l1_historical=# \d "N_intra_time_idx"
>>>        Index "N_intra_time_idx"
>>> Column |            Type
>>> --------+-----------------------------
>>> time   | timestamp without time zone
>>> btree
>>>
>>>
> Just so you are aware, writing this as: "We have an index on
> N_intra(time) and one on N_Intra(symbol, time)" is a lot more
> succinct.
>

Sorry, I happened to have them there in my clipboard at the time so I
just blindly pasted them in.

>>>
>>> l1_historical=# \d "N_intra_pkey"
>>>          Index "N_intra_pkey"
>>> Column |            Type
>>> --------+-----------------------------
>>> symbol | text
>>> time   | timestamp without time zone
>>> unique btree (primary key)
>>>
>>> and on queries like this
>>>
>>> select * from "N_intra" where symbol='SOMETHING WHICH DOESNT EXIST'
>>> order by time desc limit 1;
>>>
>>> PostgreSQL takes a very long time to complete, as it effectively
>>> scans the entire table, backwards. And the table is huge, about 450
>>> million rows. (btw, there are no triggers or any other exciting
>>> things like that on our tables in this db.)
>>>
>>> but on things where the symbol does exist in the table, it's
>>> more  or
>>> less fine, and nice and fast.
>>>
>>
>>
> What happens if you do:
> SELECT * FROM "N_intra" WHERE symbol='doesnt exist' ORDER BY symbol,
> time DESC LIMIT 1;

Hurrah! I should of thought of this, considering i've done it in the
past :) Thanks a lot, that's great.

> Yes, symbol is constant, but it frequently helps the planner
> realize it
> can use an index scan if you include all terms in the index in the
> ORDER
> BY clause.



>
>>>
>>> Whilst the option the planner has taken might be faster most of the
>>> time, the worst case scenario is unacceptable for obvious reasons.
>>> I've googled for trying to force the use of a specific index, but
>>> can't find anything relevant. Does anyone have any suggestions on
>>> getting it to use an index which hopefully will have better worst
>>> case performance?
>>>
>>
>>
> Try the above first. You could also create a new index on symbol
>    CREATE INDEX "N_intra_symbol_idx" ON "N_intra"(symbol);
>
> Then the WHERE clause should use the symbol index, which means it can
> know quickly that an entry doesn't exist. I'm not sure how many
> entries
> you have per symbol, though, so this might cause problems in the ORDER
> BY time portion.
>
> I'm guessing what you really want is to just do the ORDER BY
> symbol, time.
>
> John
> =:->
>
>


pgsql-performance by date:

Previous
From: John A Meinel
Date:
Subject: Re: Index ot being used
Next
From: Wei Weng
Date:
Subject: Re: PostgreSQL using the wrong Index