Thread: INDEX and JOINs
Hi all. I have a setup like this: CREATE TABLE T_FIELDS ( TABL_ID TEXT NOT NULL, COLU_ID TEXT NOT NULL, FIEL_ID TEXT PRIMARY KEY, UNIQUE( TABL_ID,COLU_ID ) ); -- < 200 ROWS CREATE TABLE T_DATA ( ITEM_ID INT8 NOT NULL, FIEL_ID TEXT NOT NULL REFERENCES T_FIELDS, DATA_T TEXT NOT NULL, PRIMARY( FIEL_ID,ITEM_ID ) ); -- > 10M ROWS When I run SELECT * FROM T_DATA WHERE FIEL_ID='TABL.FIEL'; it's very fast (of course!). But when I run: SELECT * FROM T_DATA NATURAL JOIN T_FIELDS WHERE TABL_ID='TABL'; it's very slow. The EXPLAIN says that in the second case it has to do a sequential scan on T_DATA. And this explains the timing. Is there a way to avoid such a behaviour by acting on indexes? Thanks.
On Fri, Oct 26, 2007 at 12:34:06PM +0200, Reg Me Please wrote: > it's very fast (of course!). But when I run: > > SELECT * FROM T_DATA NATURAL JOIN T_FIELDS WHERE TABL_ID='TABL'; > > it's very slow. > The EXPLAIN says that in the second case it has to do a sequential > scan on T_DATA. And this explains the timing. > Is there a way to avoid such a behaviour by acting on indexes? Firstly, have you run ANALYZE recently. Secondly, you'll have to show us the output of EXPLAIN ANALYZE if you want some useful help. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
Il Friday 26 October 2007 13:05:10 Martijn van Oosterhout ha scritto: > On Fri, Oct 26, 2007 at 12:34:06PM +0200, Reg Me Please wrote: > > it's very fast (of course!). But when I run: > > > > SELECT * FROM T_DATA NATURAL JOIN T_FIELDS WHERE TABL_ID='TABL'; > > > > it's very slow. > > The EXPLAIN says that in the second case it has to do a sequential > > scan on T_DATA. And this explains the timing. > > Is there a way to avoid such a behaviour by acting on indexes? > > Firstly, have you run ANALYZE recently. Secondly, you'll have to show > us the output of EXPLAIN ANALYZE if you want some useful help. > > Have a nice day, Yes, I'm often runing analyze while trying to sort this kind of things out. This is the output: prove=# explain analyze SELECT * from t_dati natural left join t_campi where tabe_id='CONTE'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------ Hash Join (cost=3.95..382140.91 rows=274709 width=91) (actual time=1.929..57713.305 rows=92 loops=1) Hash Cond: (t_dati.camp_id = t_campi.camp_id) -> Seq Scan on t_dati (cost=0.00..326851.72 rows=14010172 width=73) (actual time=0.028..43814.946 rows=14011712 loops=1) -> Hash (cost=3.91..3.91 rows=3 width=33) (actual time=0.129..0.129 rows=3 loops=1) -> Seq Scan on t_campi (cost=0.00..3.91 rows=3 width=33) (actual time=0.040..0.121 rows=3 loops=1) Filter: (tabe_id = 'CONTE'::text) Total runtime: 57713.449 ms (I translated the table and column names. The substance is the same.)
On Fri, Oct 26, 2007 at 01:10:42PM +0200, Reg Me Please wrote: > prove=# explain analyze SELECT * from t_dati natural left join t_campi where > tabe_id='CONTE'; > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------ > Hash Join (cost=3.95..382140.91 rows=274709 width=91) (actual > time=1.929..57713.305 rows=92 loops=1) > Hash Cond: (t_dati.camp_id = t_campi.camp_id) Neither of the columns are indexed according to the schema you sent so that's the problem. Or you broke something while "translating". > (I translated the table and column names. The substance is the same.) Try not translating, and we might get somewhere... Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
Il Friday 26 October 2007 13:56:20 Martijn van Oosterhout ha scritto: > On Fri, Oct 26, 2007 at 01:10:42PM +0200, Reg Me Please wrote: > > prove=# explain analyze SELECT * from t_dati natural left join t_campi > > where tabe_id='CONTE'; > > QUERY PLAN > > ------------------------------------------------------------------------- > >----------------------------------------------------- Hash Join > > (cost=3.95..382140.91 rows=274709 width=91) (actual > > time=1.929..57713.305 rows=92 loops=1) > > Hash Cond: (t_dati.camp_id = t_campi.camp_id) > > Neither of the columns are indexed according to the schema you sent so > that's the problem. Or you broke something while "translating". > > > (I translated the table and column names. The substance is the same.) > > Try not translating, and we might get somewhere... > > Have a nice day, prove=# \d t_dati Tabella "public.t_dati" Colonna | Tipo | Modificatori ---------------+--------------------------+---------------------------------------------------------- elem_id | bigint | not null camp_id | text | not null dato_t | text | not null dato_r | double precision | dato_validita | timestamp with time zone | not null default '-infinity'::timestamp with time zone dato_scadenza | timestamp with time zone | not null default 'infinity'::timestamp with time zone dato_flag | boolean | not null default true dato_data | timestamp with time zone | not null default now() dato_id | bigint | not null default nextval('t_dati_dato_id_seq'::regclass) Indici: "t_dati_pkey" PRIMARY KEY, btree (dato_id) "i_dati_0" btree (elem_id) "i_dati_1" btree (camp_id) "i_dati_2" btree (dato_t text_pattern_ops) "i_dati_3" btree (dato_flag, dato_validita, dato_scadenza) "i_dati_4" btree (dato_data) Vincoli di integrità referenziale "t_dati_camp_id_fkey" FOREIGN KEY (camp_id) REFERENCES t_campi(camp_id) prove=# \d t_campi Tabella "public.t_campi" Colonna | Tipo | Modificatori ---------+------+-------------- tabe_id | text | not null colo_id | text | not null camp_id | text | not null Indici: "t_campi_pkey" PRIMARY KEY, btree (camp_id) "i_t_campi_0" btree (tabe_id) Vincoli di integrità referenziale "t_campi_colo_id_fkey" FOREIGN KEY (colo_id) REFERENCES t_colonne(colo_id) "t_campi_tabe_id_fkey" FOREIGN KEY (tabe_id) REFERENCES t_tabelle(tabe_id) They seems to be indexed.
On Fri, Oct 26, 2007 at 02:39:28PM +0200, Reg Me Please wrote: > > >----------------------------------------------------- Hash Join > > > (cost=3.95..382140.91 rows=274709 width=91) (actual > > > time=1.929..57713.305 rows=92 loops=1) > > > Hash Cond: (t_dati.camp_id = t_campi.camp_id) > > prove=# \d t_dati Ok, my suggestion would be to run it with "enable_seqscan=off" and if that is indeed faster, then try reducing random_page_cost. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
Reg Me Please <regmeplease@gmail.com> writes: >>> (cost=3.95..382140.91 rows=274709 width=91) (actual >>> time=1.929..57713.305 rows=92 loops=1) >>> Hash Cond: (t_dati.camp_id = t_campi.camp_id) I think the reason it doesn't want to use an indexed join is the large estimate of the number of join result rows. You need to try to get that number down to something nearer the reality. Increasing the statistics target for the larger table might help. regards, tom lane
Il Friday 26 October 2007 15:18:04 Tom Lane ha scritto: > Reg Me Please <regmeplease@gmail.com> writes: > >>> (cost=3.95..382140.91 rows=274709 width=91) (actual > >>> time=1.929..57713.305 rows=92 loops=1) > >>> Hash Cond: (t_dati.camp_id = t_campi.camp_id) > > I think the reason it doesn't want to use an indexed join is the large > estimate of the number of join result rows. You need to try to get that > number down to something nearer the reality. Increasing the statistics > target for the larger table might help. > > regards, tom lane How can I "Increasing the statistics target for the larger table"? I'ìm sorry for asking, but I'm not that deep into RDBMS. Thanks.
Reg Me Please <regmeplease@gmail.com> schrieb: > How can I "Increasing the statistics target for the larger table"? > I'ìm sorry for asking, but I'm not that deep into RDBMS. alter table <table> alter column <column> SET STATISTICS <value>; Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknow) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
Il Saturday 27 October 2007 08:51:09 Andreas Kretschmer ha scritto: > Reg Me Please <regmeplease@gmail.com> schrieb: > > How can I "Increasing the statistics target for the larger table"? > > I'ìm sorry for asking, but I'm not that deep into RDBMS. > > alter table <table> alter column <column> SET STATISTICS <value>; > > Andreas How can I display the statistics for a table/column/index/whatever applies? Thanks again.
Reg Me Please <regmeplease@gmail.com> schrieb: > Il Saturday 27 October 2007 08:51:09 Andreas Kretschmer ha scritto: > > Reg Me Please <regmeplease@gmail.com> schrieb: > > > How can I "Increasing the statistics target for the larger table"? > > > I'ìm sorry for asking, but I'm not that deep into RDBMS. > > > > alter table <table> alter column <column> SET STATISTICS <value>; > > > > Andreas > > How can I display the statistics for a table/column/index/whatever applies? select attstattarget from pg_attribute where attname = <column> and attrelid=<table>::regclass; for instance: select attstattarget from pg_attribute where attname = 'ts' and attrelid='foo'::regclass; -1 means the default statistics value. Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknow) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°