Thread: Insight into indexes? (or inverting then externally)

Insight into indexes? (or inverting then externally)

From
"Joshua b. Jore"
Date:
So far as I can tell indexes in PostgreSQL are fairly opaque things. The
planner may or may not decide to use them, exactly what they contain is
largely unknown (unless I want to read the source), how PostgreSQL uses
them and in which cases a btree index is better than a hash index.

Foo. If I've managed to just skip over a manual section please whack me
with a 2x4. If I haven't then any links would be appreciated. I'm
considering how best to provide a fast index on a 2.5 million row table
(all the voters in Minnesota). Since I can't see what PostgreSQL is doing
with the indexes I have to wonder if I can eke out some more performance
by just indexing the record's id numbers (a usually numeric char(10) -
blame the MN Secretary of State for the "mostly" part) in PostgreSQL and
then sticking an inverted index into say... BerkeleyDB.

Records might look like where it's obvious that the ids store nicely..

lname.db
'Jore' => '12345678901234567890'

So before I jump down this rabbit hole is there anything I can do to
PostgreSQL to see what is in the indexes and if there is any way to tweak
them? And when is a hash index preferrable to a btree index? Or that other
index type that I don't remember the name of.

Joshua b. Jore ; http://www.greentechnologist.org ; 1121 1233 1311 200
1201 1302 1211 200 1201 1303 200 1300 1233 1313 1211 1302 1212 1311 1230
200 1201 1303 200 1321 1233 1311 1302 200 1211 1232 1211 1231 1321 200
1310 1220 1221 1232 1223 1303 200 1321 1233 1311 200 1201 1302 1211 232
200 1112 1233 1310 1211 200 1013 1302 1211 1211 1232 201 22


Re: Insight into indexes? (or inverting then externally)

From
Oleg Bartunov
Date:
Joshua,

what kind of queries do you need ? Do you need to perform full text search ?

    Oleg
On Wed, 5 Jun 2002, Joshua b. Jore wrote:

> So far as I can tell indexes in PostgreSQL are fairly opaque things. The
> planner may or may not decide to use them, exactly what they contain is
> largely unknown (unless I want to read the source), how PostgreSQL uses
> them and in which cases a btree index is better than a hash index.
>
> Foo. If I've managed to just skip over a manual section please whack me
> with a 2x4. If I haven't then any links would be appreciated. I'm
> considering how best to provide a fast index on a 2.5 million row table
> (all the voters in Minnesota). Since I can't see what PostgreSQL is doing
> with the indexes I have to wonder if I can eke out some more performance
> by just indexing the record's id numbers (a usually numeric char(10) -
> blame the MN Secretary of State for the "mostly" part) in PostgreSQL and
> then sticking an inverted index into say... BerkeleyDB.
>
> Records might look like where it's obvious that the ids store nicely..
>
> lname.db
> 'Jore' => '12345678901234567890'
>
> So before I jump down this rabbit hole is there anything I can do to
> PostgreSQL to see what is in the indexes and if there is any way to tweak
> them? And when is a hash index preferrable to a btree index? Or that other
> index type that I don't remember the name of.
>
> Joshua b. Jore ; http://www.greentechnologist.org ; 1121 1233 1311 200
> 1201 1302 1211 200 1201 1303 200 1300 1233 1313 1211 1302 1212 1311 1230
> 200 1201 1303 200 1321 1233 1311 1302 200 1211 1232 1211 1231 1321 200
> 1310 1220 1221 1232 1223 1303 200 1321 1233 1311 200 1201 1302 1211 232
> 200 1112 1233 1310 1211 200 1013 1302 1211 1211 1232 201 22
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>

    Regards,
        Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


Re: Insight into indexes? (or inverting then externally)

From
"Peter A. Daly"
Date:
What's the performance like?  I have a similar table (about 3 million
names, and I get great (sub second) performance when searching on a name
doing a select distinct name where name like 'smith john%'.

Have you run an EXPLAIN on the query you are doing?  Is it even using
your indexes?

-Pete

Oleg Bartunov wrote:

>Joshua,
>
>what kind of queries do you need ? Do you need to perform full text search ?
>
>    Oleg
>On Wed, 5 Jun 2002, Joshua b. Jore wrote:
>
>>So far as I can tell indexes in PostgreSQL are fairly opaque things. The
>>planner may or may not decide to use them, exactly what they contain is
>>largely unknown (unless I want to read the source), how PostgreSQL uses
>>them and in which cases a btree index is better than a hash index.
>>
>>Foo. If I've managed to just skip over a manual section please whack me
>>with a 2x4. If I haven't then any links would be appreciated. I'm
>>considering how best to provide a fast index on a 2.5 million row table
>>(all the voters in Minnesota). Since I can't see what PostgreSQL is doing
>>with the indexes I have to wonder if I can eke out some more performance
>>by just indexing the record's id numbers (a usually numeric char(10) -
>>blame the MN Secretary of State for the "mostly" part) in PostgreSQL and
>>then sticking an inverted index into say... BerkeleyDB.
>>
>>Records might look like where it's obvious that the ids store nicely..
>>
>>lname.db
>>'Jore' => '12345678901234567890'
>>
>>So before I jump down this rabbit hole is there anything I can do to
>>PostgreSQL to see what is in the indexes and if there is any way to tweak
>>them? And when is a hash index preferrable to a btree index? Or that other
>>index type that I don't remember the name of.
>>
>>Joshua b. Jore ; http://www.greentechnologist.org ; 1121 1233 1311 200
>>1201 1302 1211 200 1201 1303 200 1300 1233 1313 1211 1302 1212 1311 1230
>>200 1201 1303 200 1321 1233 1311 1302 200 1211 1232 1211 1231 1321 200
>>1310 1220 1221 1232 1223 1303 200 1321 1233 1311 200 1201 1302 1211 232
>>200 1112 1233 1310 1211 200 1013 1302 1211 1211 1232 201 22
>>
>>
>>---------------------------(end of broadcast)---------------------------
>>TIP 2: you can get off all lists at once with the unregister command
>>    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>>
>
>    Regards,
>        Oleg
>_____________________________________________________________
>Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
>Sternberg Astronomical Institute, Moscow University (Russia)
>Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
>phone: +007(095)939-16-83, +007(095)939-23-83
>
>
>---------------------------(end of broadcast)---------------------------
>TIP 4: Don't 'kill -9' the postmaster
>




Re: Insight into indexes? (or inverting then externally)

From
Neil Conway
Date:
On Wed, 5 Jun 2002 09:07:37 -0500 (CDT)
"Joshua b. Jore" <josh@greentechnologist.org> wrote:
> So before I jump down this rabbit hole is there anything I can do to
> PostgreSQL to see what is in the indexes and if there is any way to tweak
> them?

Exactly what would you like to "tweak"? I'm not really sure what the
problem is...

> And when is a hash index preferrable to a btree index?

Never (if you know of a situation in which hash has a significant
advantage over btree, let me know -- I haven't been able to find one).
The current hash index code doesn't handle multiple columns, unique
indexes, tends to deadlock under heavy concurrent access, and is VERY
slow to create.

> Or that other index type that I don't remember the name of.

You're probably thinking of rtree, which is for non-scalar data (and
thus, isn't an option for you AFAICT).

Cheers,

Neil

--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC