Thread: index use

index use

From

Arshavir Grigorian

Date:

19 November 2004, 19:19:10

Hi,

I have a query that when run on similar tables in 2 different databases
either uses the index on the column (primary key) in the where clause or
does a full table scan. The structure of the tables is the same, except
that the table where the index does not get used has an extra million
rows (22mil vs 23mil).

The 2 boxes where these database run are very different (Sparc with scsi
disks and 2G RAM running Solaris 8 AND a PC with 128M RAM running and an
IDE drive running Linux RH9 2.4.20-20.9). I am not sure why that would
make a difference, but maybe it does.
Also, according to our dba both tables have been analyzed about the same
time.

Any pointers would be much appreciated.


Arshavir



WORKS:

=> explain analyze select num from document where num like 'EP1000000%';

                                                       QUERY PLAN
 

-------------------------------------------------------------------------------------------------------------------------
 Index Scan using document_pkey on document  (cost=0.00..5.77 rows=1 width=14) (actual time=0.147..0.166 rows=2
loops=1)
   Index Cond: (((num)::text >= 'EP1000000'::character varying) AND ((num)::text < 'EP1000001'::character varying))
   Filter: ((num)::text ~~ 'EP1000000%'::text)
 Total runtime: 0.281 ms
(4 rows)

=> \d document
            Table "public.document"
  Column   |          Type          | Modifiers
-----------+------------------------+-----------
 num       | character varying(30)  | not null
 titl      | character varying(500) |
 isscntry  | character varying(50)  |
 issdate   | date                   |
 filedate  | date                   |
 appnum    | character varying(20)  |
 clnum     | integer                |
 exnum     | integer                |
 exmnr     | character varying(300) |
 agent     | character varying(300) |
 priodate  | date                   |
 prionum   | character varying(100) |
 priocntry | character varying(50)  |
 legalstat | integer                |
Indexes:
    "document_pkey" primary key, btree (num)
Check constraints:
    "document_legalstat" CHECK (legalstat > 0 AND legalstat < 6)



DOES NOT WORK:

d5=> EXPLAIN ANALYZE select num from document where num like 'EP1000000%';
                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Seq Scan on document  (cost=0.00..804355.12 rows=1 width=14) (actual time=97.235..353286.781 rows=2 loops=1)
   Filter: ((num)::text ~~ 'EP1000000%'::text)
 Total runtime: 353286.907 ms
(3 rows)

d5=> \d document
            Table "public.document"
  Column   |          Type          | Modifiers
-----------+------------------------+-----------
 num       | character varying(30)  | not null
 titl      | character varying(500) |
 isscntry  | character varying(50)  |
 issdate   | date                   |
 filedate  | date                   |
 clnum     | integer                |
 exnum     | integer                |
 exmnr     | character varying(300) |
 agent     | character varying(300) |
 priodate  | date                   |
 prionum   | character varying(100) |
 priocntry | character varying(50)  |
 legalstat | integer                |
 appnum    | character varying(20)  |
Indexes:
    "document_pkey" primary key, btree (num)
Check constraints:
    "$1" CHECK (legalstat > 0 AND legalstat < 6)

Re: index use

From

"Steinar H. Gunderson"

Date:

19 November 2004, 20:00:30

On Fri, Nov 19, 2004 at 02:18:55PM -0500, Arshavir Grigorian wrote:
> The 2 boxes where these database run are very different (Sparc with scsi
> disks and 2G RAM running Solaris 8 AND a PC with 128M RAM running and an
> IDE drive running Linux RH9 2.4.20-20.9). I am not sure why that would
> make a difference, but maybe it does.

Are you having different locales on your systems?

/* Steinar */
--
Homepage: http://www.sesse.net/

Re: index use

From

Josh Berkus

Date:

19 November 2004, 22:16:18

Arshavir,

> I have a query that when run on similar tables in 2 different databases
> either uses the index on the column (primary key) in the where clause or
> does a full table scan. The structure of the tables is the same, except
> that the table where the index does not get used has an extra million
> rows (22mil vs 23mil).

Are both using the same version of PostgreSQL?   If so, what version?

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

Re: index use

From

Arshavir Grigorian

Date:

19 November 2004, 22:30:44

Thanks for all the replies. It actually has to do with the locales. The
db where the index gets used is running on C vs the the other one that
uses en_US.UTF-8. I guess the db with the wrong locale will need to be
waxed and recreated with correct locale settings. I wonder if there are
any plans to make LIKE work with all locales.

Again, many thanks. You guys are great!



Arshavir

Re: index use

From

Stephan Szabo

Date:

19 November 2004, 23:26:58

On Fri, 19 Nov 2004, Arshavir Grigorian wrote:

> Hi,
>
> I have a query that when run on similar tables in 2 different databases
> either uses the index on the column (primary key) in the where clause or
> does a full table scan. The structure of the tables is the same, except
> that the table where the index does not get used has an extra million
> rows (22mil vs 23mil).
>
> The 2 boxes where these database run are very different (Sparc with scsi
> disks and 2G RAM running Solaris 8 AND a PC with 128M RAM running and an
> IDE drive running Linux RH9 2.4.20-20.9). I am not sure why that would
> make a difference, but maybe it does.

Is the second server running in "C" locale or a different locale? The
optimization for LIKE to use indexes involves either making an index with
a *_pattern_ops operator class or being in "C" locale.

Re: index use

From

Tom Lane

Date:

19 November 2004, 23:28:31

Arshavir Grigorian <ag@m-cam.com> writes:
> I have a query that when run on similar tables in 2 different databases
> either uses the index on the column (primary key) in the where clause or
> does a full table scan. The structure of the tables is the same, except
> that the table where the index does not get used has an extra million
> rows (22mil vs 23mil).

I'd say you initialized the second database in a non-C locale.  The
planner is clearly well aware that the seqscan is going to be expensive,
so the explanation has to be that it does not have a usable index available.

            regards, tom lane

Re: index use

From

Josh Berkus

Date:

21 November 2004, 01:43:08

Arshavir,

> Thanks for all the replies. It actually has to do with the locales. The
> db where the index gets used is running on C vs the the other one that
> uses en_US.UTF-8. I guess the db with the wrong locale will need to be
> waxed and recreated with correct locale settings. I wonder if there are
> any plans to make LIKE work with all locales.

I thought there were some fixes for this in 8.0, but I can't find anything in
the release notes.

--
Josh Berkus
Aglio Database Solutions
San Francisco