Thread: primary key scans in sequence
hey i have the following table def (834.000 rows, vaccum analyze'd): dl_online=# \d mitglied Table "mitglied" Attribute | Type | Modifier ----------------+--------------+----------------------------mitgliedid | bigint | not nulldlnummer | varchar(30) |vorname | varchar(50) |zuname | varchar(50) | not nullgeburtsdatum | varchar(20) |aktiv | boolean | not null default 't'::boolstrasse | varchar(255) |plz | varchar(25) |ort | varchar(255) |telefon | varchar(255) |eintrittsdatum | varchar(20) |geschlechtid | bigint | not null default 3treuelevelid | bigint | not null default 1clubmitglied | boolean | not null default'f'::boolbemerkungen | text |mid | bigint | Indices: mitglied_dlnummer_idx, [on dlnummer] mitglied_pkey [on mitgliedid] -------------------- ok; i use 2 querys: 1) get one row over dlnummer:dl_online=# explain select * from mitglied where dlnummer = '098765432';NOTICE: QUERY PLAN:IndexScan using mitglied_dlnummer_idx on mitglied (cost=0.00..4.77 rows=1 width=154)EXPLAIN 2) get one row over the primatry key (mitgliedid): dl_online=# explain select * from mitglied where mitgliedid = 833228; NOTICE: QUERY PLAN: Seq Scan on mitglied (cost=0.00..18674.74 rows=1 width=154) EXPLAIN why doesn't use postrges in (2) the primary-key-index?? take a look at the cost! and both queries returns only ONE row (the optimizer knows that fact). and the worst. in(2) the query take ~3sec. in this time the cpu works on 99.9% (rh-linux 7 on a compac dc10 -alpha). the other works fine (no time to calculate, "no" use of the cpu!). PS: i tried also "set ENABLE_SEQSCAN to OFF". no chance, (2) will work with Seq Scan. thx to ANY hint!! bernd.
On Thu, Mar 29, 2001 at 03:47:58PM +0200, bernd wrote: > hey i have the following table def (834.000 rows, vaccum analyze'd): > dl_online=# \d mitglied > Table "mitglied" > Attribute | Type | Modifier > ----------------+--------------+---------------------------- > mitgliedid | bigint | not null > dlnummer | varchar(30) | > vorname | varchar(50) | > zuname | varchar(50) | not null > geburtsdatum | varchar(20) | > aktiv | boolean | not null default 't'::bool > strasse | varchar(255) | > plz | varchar(25) | > ort | varchar(255) | > telefon | varchar(255) | > eintrittsdatum | varchar(20) | > geschlechtid | bigint | not null default 3 > treuelevelid | bigint | not null default 1 > clubmitglied | boolean | not null default 'f'::bool > bemerkungen | text | > mid | bigint | > > Indices: mitglied_dlnummer_idx, [on dlnummer] > mitglied_pkey [on mitgliedid] > -------------------- > ok; i use 2 querys: > > 1) get one row over dlnummer: > dl_online=# explain select * from mitglied where dlnummer = '098765432'; > NOTICE: QUERY PLAN: > Index Scan using mitglied_dlnummer_idx on mitglied (cost=0.00..4.77 rows=1 > width=154) > EXPLAIN > > 2) get one row over the primatry key (mitgliedid): > dl_online=# explain select * from mitglied where mitgliedid = 833228; > NOTICE: QUERY PLAN: > Seq Scan on mitglied (cost=0.00..18674.74 rows=1 width=154) > EXPLAIN > > why doesn't use postrges in (2) the primary-key-index?? take a look at the > cost! and both queries returns only ONE row (the optimizer knows that fact). Because the type of the "mitgliedid" is "bigint", but the type of the constant "833228" is "integer" (I think; certainly it isn't "bigint"). Postgres doesn't realise that it can use an index on a bigint to do comparisons to an integer. If you explicitly cast the constant to a bigint, it should be willing to do an index scan, like so: select * from mitglied where mitgliedid = 833228::bigint Yes, this is a bit unpleasant to have to in your client code, and no, I don't know if there's a neater way to let Postgres know it can use this index for this query. But what I've just described does work. Richard
bernd writes: > hey i have the following table def (834.000 rows, vaccum analyze'd): > dl_online=# \d mitglied > Table "mitglied" > Attribute | Type | Modifier > ----------------+--------------+---------------------------- > mitgliedid | bigint | not null > dlnummer | varchar(30) | > vorname | varchar(50) | > zuname | varchar(50) | not null > geburtsdatum | varchar(20) | > aktiv | boolean | not null default 't'::bool > strasse | varchar(255) | > plz | varchar(25) | > ort | varchar(255) | > telefon | varchar(255) | > eintrittsdatum | varchar(20) | > geschlechtid | bigint | not null default 3 > treuelevelid | bigint | not null default 1 > clubmitglied | boolean | not null default 'f'::bool > bemerkungen | text | > mid | bigint | > > Indices: mitglied_dlnummer_idx, [on dlnummer] > mitglied_pkey [on mitgliedid] > -------------------- > 2) get one row over the primatry key (mitgliedid): > dl_online=# explain select * from mitglied where mitgliedid = 833228; > NOTICE: QUERY PLAN: > Seq Scan on mitglied (cost=0.00..18674.74 rows=1 width=154) > EXPLAIN This is because the system is not smart enough to match up a bigint = integer comparison with an index scan on a bigint column. Quoting the number, '833228', should fool it sufficiently to make this work. Incidentally, it seems unlikely that you need to use bigint for membership ids, unless you plan on more than 2*10^9 members. -- Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/
I have the same problem, my primary key is defined as a serial though. Other tables use tables are defined as serials as well, but DO use Index Scans some tables do, some tables don't, even when creating 'my own' index on the primary key, it still uses sequencial scans! This one works fine: Table "sponsors"Attribute | Type | Modifier -----------+---------+-------------------------------------------------- ---id | integer | not null default nextval('sponsors_id_seq'::text)name | text | not nullemail | text | logo | text | not null default 'images/sponsors/logo_default.gif'url | text | qoute | text | active | boolean | default 't'::boolmain | boolean | default 'f'::bool Indices: index_sponsors_main, index_sponsors_name, sponsors_pkey, unq_sponosrs_name dsc_competition=# explain select * from sponsors where id = 4; NOTICE: QUERY PLAN: Index Scan using sponsors_pkey on sponsors (cost=0.00..2.01 rows=1 width=66) EXPLAIN Now this one doesn't: Table "teams"Attribute | Type | Modifier -----------+---------+--------------------------------------------------id | integer | not null default nextval('teams_id_seq'::text)name | text | not nullmgr_name | text | address | text | zipcode | text | city | text | country | text | email | text | telnr | text | mobnr | text | faxnr | text | logo | text | not null default 'images/teams/logo_default.gif'movie | text | url | text | qoute | text | active | boolean | default 't'::bool Indices: index_teams_id, <=!!! 'my own' index index_teams_name, teams_pkey, <=normal pkey index unq_teams_name NOTICE: QUERY PLAN: Seq Scan on teams (cost=0.00..1.09 rows=1 width=173) EXPLAIN I really don't understand the difference between the two, and it didn't work before i created an extra index on id... Kind regards, Koen Antonissen -----Original Message----- From: Richard Poole [mailto:richard.poole@vi.net] Sent: vrijdag 30 maart 2001 18:12 To: bernd Cc: pgsql-sql@postgresql.org Subject: Re: [SQL] primary key scans in sequence Because the type of the "mitgliedid" is "bigint", but the type of the constant "833228" is "integer" (I think; certainly it isn't "bigint"). Postgres doesn't realise that it can use an index on a bigint to do comparisons to an integer. If you explicitly cast the constant to a bigint, it should be willing to do an index scan, like so: select * from mitglied where mitgliedid = 833228::bigint Yes, this is a bit unpleasant to have to in your client code, and no, I don't know if there's a neater way to let Postgres know it can use this index for this query. But what I've just described does work. Richard ---------------------------(end of broadcast)--------------------------- TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
On Wed, 30 May 2001, Koen Antonissen wrote: > Now this one doesn't: > Table "teams" > Attribute | Type | Modifier > -----------+---------+-------------------------------------------------- > id | integer | not null default nextval('teams_id_seq'::text) > name | text | not null > mgr_name | text | > address | text | > zipcode | text | > city | text | > country | text | > email | text | > telnr | text | > mobnr | text | > faxnr | text | > logo | text | not null default 'images/teams/logo_default.gif' > movie | text | > url | text | > qoute | text | > active | boolean | default 't'::bool > Indices: index_teams_id, <=!!! 'my own' index > index_teams_name, > teams_pkey, <=normal pkey index > unq_teams_name > > NOTICE: QUERY PLAN: > > Seq Scan on teams (cost=0.00..1.09 rows=1 width=173) Looking at that cost, I have to wonder, how many rows are in the table and has vacuum analyze been run on it? If the number of rows in the table is small, the seq scan is definately better than having to read from both the index and heap.
actually the serials are declared int4 (integer), I tried to use your work around anyway, but it didn't work...: dsc_competition=# \d classes Table "classes"Attribute | Type | Modifier ------------+---------+------------------------------------------------- -id | integer | not null default nextval('classes_id_seq'::text)name | text | not nulldefinition | text | active | boolean | default 't'::bool Indices: classes_pkey, index_classes_name, unq_classes_name dsc_competition=# explain select * from classes where id = int8(4); NOTICE: QUERY PLAN: Seq Scan on classes (cost=0.00..1.07 rows=1 width=29) EXPLAIN dsc_competition=# explain select * from classes where id = int4(4); NOTICE: QUERY PLAN: Seq Scan on classes (cost=0.00..1.07 rows=1 width=29) EXPLAIN dsc_competition=# explain select * from classes where id = int2(4); NOTICE: QUERY PLAN: Seq Scan on classes (cost=0.00..1.07 rows=1 width=29) EXPLAIN Thing I descovered after i posted to the group was that after creating the scheme again, the indexes are used! after vacuum (analyze) the use of indexes was gone again on certain tables... Any other suggestions? -----Original Message----- From: bernd pinter [mailto:bernd@uptime.at] Sent: donderdag 31 mei 2001 8:53 To: Koen Antonissen Subject: Re: [SQL] primary key scans in sequence the problem is the optimizer. you use a int8 as primary key. so if you do somthing like "select * from sponsors where id = 34;" then the optimizer interprets the 34 as an int4 => id is int8, so postgres cant use the primary key-index. the workaround is simple. say that you are lookiong for an int8 (instead of an int4): "select * from sponsors where id = int8(34);" thats it! now you convert 34 to an int8, postgres can use the index...... bernd. Koen Antonissen wrote: > I have the same problem, my primary key is defined as a serial though. > Other tables use tables are defined as serials as well, but DO use Index > Scans > some tables do, some tables don't, even when creating 'my own' index on > the primary key, it still uses sequencial scans! > > > This one works fine: > Table "sponsors" > Attribute | Type | Modifier > > -----------+---------+-------------------------------------------------- > --- > id | integer | not null default nextval('sponsors_id_seq'::text) > name | text | not null > email | text | > logo | text | not null default > 'images/sponsors/logo_default.gif' > url | text | > qoute | text | > active | boolean | default 't'::bool > main | boolean | default 'f'::bool > Indices: index_sponsors_main, > index_sponsors_name, > sponsors_pkey, > unq_sponosrs_name > > dsc_competition=# explain select * from sponsors where id = 4; > NOTICE: QUERY PLAN: > > Index Scan using sponsors_pkey on sponsors (cost=0.00..2.01 rows=1 > width=66) > > EXPLAIN > > Now this one doesn't: > Table "teams" > Attribute | Type | Modifier > -----------+---------+-------------------------------------------------- > id | integer | not null default nextval('teams_id_seq'::text) > name | text | not null > mgr_name | text | > address | text | > zipcode | text | > city | text | > country | text | > email | text | > telnr | text | > mobnr | text | > faxnr | text | > logo | text | not null default 'images/teams/logo_default.gif' > movie | text | > url | text | > qoute | text | > active | boolean | default 't'::bool > Indices: index_teams_id, <=!!! 'my own' index > index_teams_name, > teams_pkey, <=normal pkey index > unq_teams_name > > NOTICE: QUERY PLAN: > > Seq Scan on teams (cost=0.00..1.09 rows=1 width=173) > > EXPLAIN > > > I really don't understand the difference between the two, and it didn't > work before i created an extra index on id... > > Kind regards, > Koen Antonissen > > > -----Original Message----- > From: Richard Poole [mailto:richard.poole@vi.net] > Sent: vrijdag 30 maart 2001 18:12 > To: bernd > Cc: pgsql-sql@postgresql.org > Subject: Re: [SQL] primary key scans in sequence > > > Because the type of the "mitgliedid" is "bigint", but the type of the > constant "833228" is "integer" (I think; certainly it isn't "bigint"). > Postgres doesn't realise that it can use an index on a bigint to do > comparisons to an integer. If you explicitly cast the constant to a > bigint, it should be willing to do an index scan, like so: > > select * from mitglied where mitgliedid = 833228::bigint > > Yes, this is a bit unpleasant to have to in your client code, and no, > I don't know if there's a neater way to let Postgres know it can use > this index for this query. But what I've just described does work. > > Richard > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
It really depends on the number of rows. If the number of rows in the tables are small or the number of rows returned is a reasonable percentage, the index scan is currently more expensive. What does (for example) select count(*) from classes; give? On Thu, 31 May 2001, Koen Antonissen wrote: > actually the serials are declared int4 (integer), > I tried to use your work around anyway, but it didn't work...: > > dsc_competition=# \d classes > Table "classes" > Attribute | Type | Modifier > > ------------+---------+------------------------------------------------- > - > id | integer | not null default nextval('classes_id_seq'::text) > name | text | not null > definition | text | > active | boolean | default 't'::bool > Indices: classes_pkey, > index_classes_name, > unq_classes_name > > dsc_competition=# explain select * from classes where id = int8(4); > NOTICE: QUERY PLAN: > > Seq Scan on classes (cost=0.00..1.07 rows=1 width=29) > > EXPLAIN > dsc_competition=# explain select * from classes where id = int4(4); > NOTICE: QUERY PLAN: > > Seq Scan on classes (cost=0.00..1.07 rows=1 width=29) > > EXPLAIN > dsc_competition=# explain select * from classes where id = int2(4); > NOTICE: QUERY PLAN: > > Seq Scan on classes (cost=0.00..1.07 rows=1 width=29) > > EXPLAIN > > > > Thing I descovered after i posted to the group was that after creating > the scheme again, the indexes are used! after vacuum (analyze) the use > of indexes was gone again on certain tables... > > Any other suggestions?
I think you're just witnessing the optimizer at work. If it thinks that doing sequential scans is faster, it will ignore the indices. At 11:03 AM 5/31/2001 +0200, Koen Antonissen wrote: >Thing I descovered after i posted to the group was that after creating >the scheme again, the indexes are used! after vacuum (analyze) the use >of indexes was gone again on certain tables... > >Any other suggestions?