Thread: BIGINT indexes still with problems

BIGINT indexes still with problems

From
"Dan Ruthers"
Date:
Hi, I have read in this list and elsewhere the problem with indexes and big int.
Still, I have an index that is used or not, depending of the parameter value used in the query.
I am using PostgreSQL 7.4.3 on Linux RH ES3.

Here's the table:
test=> \d dmaildatum
                Table "public.dmaildatum"
     Column     |            Type             | Modifiers
----------------+-----------------------------+-----------
 id             | bigint                      | not null
 type           | integer                     | not null
 idowneruser    | bigint                      | not null
 idparent       | bigint                      | not null
 creationdate   | timestamp without time zone | not null
 name           | character varying(1024)     | not null
 deletedate     | timestamp without time zone | not null
 lastmodifydate | timestamp without time zone |
 ffrom          | character varying(255)      |
 fto            | text                        |
 fcc            | text                        |
 fbcc           | text                        |
 fsubject       | text                        |
 fdate          | timestamp without time zone |
 fread          | boolean                     |
 nattachments   | integer                     |
 fsize          | bigint                      |
 hashtml        | boolean                     |
 replyto        | text                        |
 pop3id         | character varying(255)      |
 nvattachments  | integer                     |
Indexes:
    "dmaildatum_pkey" primary key, btree (id)
    "ix_dmaildatum_fdate" btree (fdate)
    "ix_dmaildatum_idowneruser" btree (idowneruser)
    "ix_dmaildatum_idparent" btree (idparent)
    "ix_dmaildatum_pop3id" btree (pop3id)
Foreign-key constraints:
    "fk_dmaildatum_dmailfolderdatum" FOREIGN KEY (idparent) REFERENCES dmailfold
erdatum(id) ON DELETE CASCADE

Now, if I run this query (note the int8 cast - also tried with the '' cast to String, same results):
test=> explain select * from dmaildatum where idparent=int8(783219);
                            QUERY PLAN
------------------------------------------------------------------
 Seq Scan on dmaildatum  (cost=0.00..2241.71 rows=2229 width=272)
   Filter: (idparent = 783219::bigint)
(2 rows)

The index is not used. But with an identical query, only different parameter value:
desknow=> explain select * from dmaildatum where idparent=int8(1187838);
                                          QUERY PLAN

--------------------------------------------------------------------------------
---------------
 Index Scan using ix_dmaildatum_idparent on dmaildatum  (cost=0.00..284.05 rows=
102 width=272)
   Index Cond: (idparent = 1187838::bigint)
(2 rows)

The index is used!
I also did a vacuum analyze, and restarted Postgres and it did not make any difference.
I tried many other ID values (ex 783218 and 783220), and they seem to use the index correctly. Only that value doesn't.

Can anyone explain why Postgres behaves differently in these two cases, or at least point to some hints?

Thanks in advance for all your help

--
_______________________________________________
Find what you are looking for with the Lycos Yellow Pages
http://r.lycos.com/r/yp_emailfooter/http://yellowpages.lycos.com/default.asp?SRC=lycos10


Re: BIGINT indexes still with problems

From
Gaetano Mendola
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Dan Ruthers wrote:


| Now, if I run this query (note the int8 cast - also tried with the '' cast to String, same results):
| test=> explain select * from dmaildatum where idparent=int8(783219);
|                             QUERY PLAN
| ------------------------------------------------------------------
|  Seq Scan on dmaildatum  (cost=0.00..2241.71 rows=2229 width=272)
|    Filter: (idparent = 783219::bigint)
| (2 rows)
|
| The index is not used. But with an identical query, only different parameter value:
| desknow=> explain select * from dmaildatum where idparent=int8(1187838);
|                                           QUERY PLAN
|
| --------------------------------------------------------------------------------
| ---------------
|  Index Scan using ix_dmaildatum_idparent on dmaildatum  (cost=0.00..284.05 rows=
| 102 width=272)
|    Index Cond: (idparent = 1187838::bigint)
| (2 rows)
|
| The index is used!
| I also did a vacuum analyze, and restarted Postgres and it did not make any difference.
| I tried many other ID values (ex 783218 and 783220), and they seem to use the index correctly. Only that value
doesn't.
|
| Can anyone explain why Postgres behaves differently in these two cases, or at least point to some hints?

Because this means that a sequential scan is better for that value.

Perform this selects:


(1) select count(*) from dmaildatum;
(2) select count(*) from dmaildatum where idparent=int8(783219);
(3) select count(*) from dmaildatum where idparent=int8(1187838);


I bet that the ratio  (2)/(1) is greater then (3)/(1).


Now show us the following results:


explain analyze select * from dmaildatum where idparent=int8(783219);
explain analyze select * from dmaildatum where idparent=int8(1187838);

and repeat it again but executing before:

set enable_seqscan = off;



Depending on the results that you get may be you need to lower the index
scan cost tuning the cpu related GUC variables.



Regards
Gaetano Mendola














-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBGSfL7UpzwH2SGd4RAgBsAKCXvs2L/XUEmSGxBzEiAHmWasgShACeLvjp
9m12DSnj2tBuGSgldr4D9Po=
=KTil
-----END PGP SIGNATURE-----


Re: BIGINT indexes still with problems

From
Stephan Szabo
Date:
On Mon, 9 Aug 2004, Dan Ruthers wrote:

> Now, if I run this query (note the int8 cast - also tried with the '' cast to String, same results):
> test=> explain select * from dmaildatum where idparent=int8(783219);
>                             QUERY PLAN
> ------------------------------------------------------------------
>  Seq Scan on dmaildatum  (cost=0.00..2241.71 rows=2229 width=272)
>    Filter: (idparent = 783219::bigint)
> (2 rows)
>
> The index is not used. But with an identical query, only different parameter value:
> desknow=> explain select * from dmaildatum where idparent=int8(1187838);
>                                           QUERY PLAN
>
> --------------------------------------------------------------------------------
> ---------------
>  Index Scan using ix_dmaildatum_idparent on dmaildatum  (cost=0.00..284.05 rows=
> 102 width=272)
>    Index Cond: (idparent = 1187838::bigint)
> (2 rows)


Look at the row estimates for the two cases.  How many rows are actually
returned and how long the queries take (explain analyze will give that
information)?


Re: BIGINT indexes still with problems

From
Peter Eisentraut
Date:
Dan Ruthers wrote:
> The index is used!
> I also did a vacuum analyze, and restarted Postgres and it did not
> make any difference. I tried many other ID values (ex 783218 and
> 783220), and they seem to use the index correctly. Only that value
> doesn't.

Possibly, that is the most common value and the cost calculation yields
that it would be more efficient to not use the index.  If you disagree,
please show the timings generated by EXPLAIN ANALYZE.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/