Re: Optimization of this SQL sentence - Mailing list pgsql-performance

From Alexander Staubo
Subject Re: Optimization of this SQL sentence
Date
Msg-id 181E6D21-8326-4ED6-A9AE-6F9917C1C319@purefiction.net
Whole thread Raw
In response to Re: Optimization of this SQL sentence  (Ruben Rubio <ruben@rentalia.com>)
Responses Re: Optimization of this SQL sentence  ("Craig A. James" <cjames@modgraph-usa.com>)
Re: Optimization of this SQL sentence  (Mario Weilguni <mweilguni@sime.com>)
List pgsql-performance
On Oct 17, 2006, at 11:33 , Ruben Rubio wrote:

> CREATE TABLE "comment"
> (
>   idcomment int4 NOT NULL DEFAULT
> nextval('comment_idcomment_seq'::regclass),
[snip 28 columns]
>   CONSTRAINT comment_pkey PRIMARY KEY (idcomment)
> )
>
> Ficha structure:
> No indexes in ficha
> Ficha rows: 17.850
>
> CREATE TABLE ficha
> (
>   idficha int4 NOT NULL DEFAULT nextval
> ('ficha_idficha_seq'::regclass),
[snip 67 (!) columns]
>   CONSTRAINT pk_ficha PRIMARY KEY (idficha),
>   CONSTRAINT fk_ficha_geonivel6 FOREIGN KEY (idlocacion) REFERENCES
> geonivel6 (idgeonivel6) ON UPDATE NO ACTION ON DELETE NO ACTION
> )

  These tables are particularly egregious examples of ignorant
database design. You need to understand the relational model (http://
en.wikipedia.org/wiki/Relational_model), specifically data
normalization (http://en.wikipedia.org/wiki/Database_normalization)
and 3NF (http://en.wikipedia.org/wiki/3NF).

These columns are particularly telling:

   searchengine1 int4,
   searchengine2 int4,
   searchengine3 int4,
   searchengine4 int4,
   searchengine5 int4,
   searchengine6 int4,
   deseo1 int4,
   deseo2 int4,
   deseo3 int4,
   deseo4 int4,
   deseo5 int4,
   deseo6 int4,
   titulomapa_l0 varchar(255),
   titulomapa_l1 varchar(255),
   titulomapa_l2 varchar(255),
   titulomapa_l3 varchar(255),
   titulomapa_l4 varchar(255),
   titulomapa_l5 varchar(255),
   titulomapa_l6 varchar(255),
   titulomapa_l7 varchar(255),
   titulomapa_l8 varchar(255),
   titulomapa_l9 varchar(255),

Refactor into three separate tables:

   create table searchengine (
     idficha int references ficha (idficha),
     searchengine int,
     primary key (idficha, searchengine)
   );

   create table deseo (
     idficha int references ficha (idficha),
     deseo int,
     primary key (idficha, deseo)
   );

   create table titulomapa (
     idficha int references ficha (idficha),
     titulomapa int,
     primary key (idficha, titulomapa)
   );

Now you can find all search engines for a single ficha row:

   select searchengine from searchengine where idficha = n

This design allows for more than 5 search engines per ficha row, and
allows expressive joins such as:

   select ficha.idficha, searchengine.searchengine
   inner join searchengine on searchengine.idfciha = ficha.idficha

Also, most of your columns are nullable. This alone shows that you
don't understand your own data.

Lastly, note that in PostgreSQL these length declarations are not
necessary:

   contacto varchar(255),
   fuente varchar(512),
   prefijopais varchar(10)

Instead, use:

   contacto text,
   fuente text,
   prefijopais text

See the PostgreSQL manual for an explanation of varchar vs. text.

Alexander.


pgsql-performance by date:

Previous
From: "A. Kretschmer"
Date:
Subject: Re: Optimization of this SQL sentence
Next
From: "Heikki Linnakangas"
Date:
Subject: Re: Optimization of this SQL sentence