Thread: hash taboo?

hash taboo?

From
admin
Date:
I've been reading the postgresql manual and I find there is very little
discussion about hash compared to btree. Most of the focus seems to be on
using btree indices even that the default for 'create index' is btree
also. From the documentation, it seems the only difference between either
searching method is that btree can be used with multiple operators whilst
hash can only be used with '='. Furthermore, hash seems to be contained in
memory, so should be limited to small queries or, in my case, queries
using limit (without using sort which would need to retrieve the entire
data anyways).

My conclusion is that if I can live with just using '=' and using slightly
more memory, I should be using hash. Unfortunately, there is very little
sign in the documentation that I should be using hash at all. Perhaps I
have missed something important.

If someone could help me make a more rational decision on using searching
methods, I'd appreciate.

Thanks in advance,
Marc


Re: [GENERAL] hash taboo?

From
Bruce Momjian
Date:
Run some performace tests and let us know.

> I've been reading the postgresql manual and I find there is very little
> discussion about hash compared to btree. Most of the focus seems to be on
> using btree indices even that the default for 'create index' is btree
> also. From the documentation, it seems the only difference between either
> searching method is that btree can be used with multiple operators whilst
> hash can only be used with '='. Furthermore, hash seems to be contained in
> memory, so should be limited to small queries or, in my case, queries
> using limit (without using sort which would need to retrieve the entire
> data anyways).
>
> My conclusion is that if I can live with just using '=' and using slightly
> more memory, I should be using hash. Unfortunately, there is very little
> sign in the documentation that I should be using hash at all. Perhaps I
> have missed something important.
>
> If someone could help me make a more rational decision on using searching
> methods, I'd appreciate.
>
> Thanks in advance,
> Marc
>
>
> ************
>
>


--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: [GENERAL] hash taboo?

From
admin
Date:
My results were exactly the same for btree and hash, even when vacumming
between each index creation. Here's my query:
SELECT * FROM prod_base WHERE mid='2';

Here's my result:
Index Scan using prod_mid_idx on prod_base (cost=2.05 rows=2 width=120)

My database is perhaps not big enough to run some relevant tests, so
please let me know if there's another way I could get a better idea of the
resources used for using each searching method.

> Run some performace tests and let us know.
>
> > I've been reading the postgresql manual and I find there is very little
> > discussion about hash compared to btree. Most of the focus seems to be on
> > using btree indices even that the default for 'create index' is btree
> > also. From the documentation, it seems the only difference between either
> > searching method is that btree can be used with multiple operators whilst
> > hash can only be used with '='. Furthermore, hash seems to be contained in
> > memory, so should be limited to small queries or, in my case, queries
> > using limit (without using sort which would need to retrieve the entire
> > data anyways).
> >
> > My conclusion is that if I can live with just using '=' and using slightly
> > more memory, I should be using hash. Unfortunately, there is very little
> > sign in the documentation that I should be using hash at all. Perhaps I
> > have missed something important.
> >
> > If someone could help me make a more rational decision on using searching
> > methods, I'd appreciate.
> >
> > Thanks in advance,
> > Marc
> >
> >
> > ************
> >
> >
>
>
> --
>   Bruce Momjian                        |  http://www.op.net/~candle
>   maillist@candle.pha.pa.us            |  (610) 853-3000
>   +  If your life is a hard drive,     |  830 Blythe Avenue
>   +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
>


Re: [GENERAL] hash taboo?

From
Bruce Momjian
Date:
> My results were exactly the same for btree and hash, even when vacumming
> between each index creation. Here's my query:
> SELECT * FROM prod_base WHERE mid='2';
>
> Here's my result:
> Index Scan using prod_mid_idx on prod_base (cost=2.05 rows=2 width=120)
>
> My database is perhaps not big enough to run some relevant tests, so
> please let me know if there's another way I could get a better idea of the
> resources used for using each searching method.

You have to look at index creation speed and index access speed.

Not sure which one wins in each category.  Also, index modification
speed may be important.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

************


Re: [GENERAL] hash taboo?

From
admin
Date:
Excellent point, your last comment gives me a tangible incentive for using
hash instead of btree. Since I don't need to use other operators than '=',
there is really no need to spend extra time creating a btree while all I
need is a hash table. In the end, both are as fast for searching, but I
gain some additional speed for inserting and removing entries.

> > My results were exactly the same for btree and hash, even when vacumming
> > between each index creation. Here's my query:
> > SELECT * FROM prod_base WHERE mid='2';
> >
> > Here's my result:
> > Index Scan using prod_mid_idx on prod_base (cost=2.05 rows=2 width=120)
> >
> > My database is perhaps not big enough to run some relevant tests, so
> > please let me know if there's another way I could get a better idea of the
> > resources used for using each searching method.
>
> You have to look at index creation speed and index access speed.
>
> Not sure which one wins in each category.  Also, index modification
> speed may be important.

Thanks again,
Marc


Re: [GENERAL] hash taboo?

From
Bruce Momjian
Date:
> Excellent point, your last comment gives me a tangible incentive for using
> hash instead of btree. Since I don't need to use other operators than '=',
> there is really no need to spend extra time creating a btree while all I
> need is a hash table. In the end, both are as fast for searching, but I
> gain some additional speed for inserting and removing entries.

Is the hash faster to create?

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: [GENERAL] hash taboo?

From
Bruce Momjian
Date:
> My results were exactly the same for btree and hash, even when vacumming
> between each index creation. Here's my query:
> SELECT * FROM prod_base WHERE mid='2';
>
> Here's my result:
> Index Scan using prod_mid_idx on prod_base (cost=2.05 rows=2 width=120)
>
> My database is perhaps not big enough to run some relevant tests, so
> please let me know if there's another way I could get a better idea of the
> resources used for using each searching method.

You have to look at index creation speed and index access speed.

Not sure which one wins in each category.  Also, index modification
speed may be important.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

************