Re: An idea on faster CHAR field indexing - Mailing list pgsql-hackers

From	Giles Lean
Subject	Re: An idea on faster CHAR field indexing
Date	June 22, 2000 07:45:53
Msg-id	12346.961663663@nemeton.com.au
In response to	Re: An idea on faster CHAR field indexing (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

> Interesting.  That certainly suggests strxfrm could be a loser for
> a database index too, but I agree it'd be nice to see some actual
> measurements rather than speculation.
> 
> What locale(s) were you using when testing your sort code?  I suspect
> the answers might depend on locale quite a bit...

I did a little more measurement today.  It's still only anecdotal
evidence -- I wasn't terribly rigorous -- but here are my results.

My data file consisted of ~660,000 lines and a total size of ~200MB.
Each line had part descriptions in German and some uninteresting
fields.  I stripped out the uninteresting fields and read the file
calling strxfrm() for each line.  I recorded the total input
bytes and the total bytes returned by strxfrm().
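
A minimal sketch of that kind of measurement loop follows.  It is not
the original test program: the function name measure_strxfrm(), the
4096-byte line buffer, and the choice to strip the trailing newline
before measuring are all assumptions made for illustration.  It uses
the standard idiom of calling strxfrm() with a null destination and a
size of zero, which returns the length the transformed string would
need without storing it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: read `in` line by line and accumulate
 *   *in_bytes  - total length of the input lines (newline stripped)
 *   *out_bytes - total length strxfrm() reports for those lines
 * Returns the number of chunks read.  Lines longer than the buffer
 * are counted as several chunks; good enough for a rough measurement.
 */
long measure_strxfrm(FILE *in,
                     unsigned long long *in_bytes,
                     unsigned long long *out_bytes)
{
    char line[4096];            /* assumed maximum line length */
    long nlines = 0;

    assert(in != NULL && in_bytes != NULL && out_bytes != NULL);
    *in_bytes = 0;
    *out_bytes = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        line[strcspn(line, "\n")] = '\0';   /* drop trailing newline */
        *in_bytes += strlen(line);
        /* n == 0 with a null destination: only the length is computed */
        *out_bytes += strxfrm(NULL, line, 0);
        nlines++;
    }
    return nlines;
}
```

To measure a particular locale, the caller would first select it with
setlocale(LC_COLLATE, ...) from <locale.h>; without that call the
program runs in the "C" locale, where little or no expansion is
expected.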

HP-UX 11.00 de_DE.roman8 locale:
input bytes:   179647811
result bytes: 1447833496 (increase factor 8.05)

Solaris 2.6 de_CH locale:
input bytes:   179647811 
result bytes: 1085875122 (increase factor 6.04)

I didn't time the test program on Solaris, but on HP-UX this program
took longer to run than a simplistic qsort() using strcoll() does, and
my comparison sort program has to write the data out as well, which
the strxfrm() calling program didn't do.
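
For comparison, a simplistic qsort() using strcoll() might look like
the sketch below.  The names cmp_strcoll() and sort_lines() are
hypothetical, and the lines are assumed to be already loaded into an
array of pointers; reading the file and writing the sorted output are
left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort() comparator: elements are pointers to C strings, compared
 * with strcoll() so that the current LC_COLLATE locale decides order. */
int cmp_strcoll(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;
    return strcoll(*sa, *sb);
}

/* Hypothetical helper: sort an array of n string pointers in place. */
void sort_lines(char **lines, size_t n)
{
    assert(lines != NULL || n == 0);
    qsort(lines, n, sizeof *lines, cmp_strcoll);
}
```

Each comparison here pays the collation cost again, whereas the
strxfrm() approach pays it once per string but has to store the much
larger transformed keys.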

Regards,

Giles
