Problem when doing join from multiple views - Mailing list pgsql-general

From Denis Perchine
Subject Problem when doing join from multiple views
Date
Msg-id 0006110131070Z.00505@dyp
Whole thread Raw
List pgsql-general
Hello all,

I have quite strange problem.
I have one table for lots of dicts:

CREATE TABLE dicts (
        id serial,
        type int4,
        name text
);

And dicts_type table with textual description of types. Don't blame me... I know that this is not
fast. I just try to be as close as the situation is.

CREATE TABLE dicts_type (
        id serial,
        name text
);

And I created lots of view for each of them:
CREATE VIEW country AS
    select d.id, d.name from dicts d, dicts_type t where d.type=t.id and t.name='country';
CREATE VIEW gender AS
    select d.id, d.name from dicts d, dicts_type t where d.type=t.id and t.name='gender';
CREATE VIEW income AS
    select d.id, d.name from dicts d, dicts_type t where d.type=t.id and t.name='income';
CREATE VIEW occupation AS
    select d.id, d.name from dicts d, dicts_type t where d.type=t.id and t.name='occupation';
CREATE VIEW question AS
    select d.id, d.name from dicts d, dicts_type t where d.type=t.id and t.name='question';
CREATE VIEW state AS
    select d.id, d.name from dicts d, dicts_type t where d.type=t.id and t.name='state';
CREATE VIEW born_year AS
    select d.id, d.name from dicts d, dicts_type t where d.type=t.id and t.name='born_year';

Then I have table users which use all of these dicts:
CREATE TABLE "users" (
        "id" serial,
        "account_name" text,
        "account_pass" text,
        "blocked" bool DEFAULT 'f',
        "commercial" bool DEFAULT 'f',
        "expire_date" timestamp,
        "num_of_pop3" int4 DEFAULT 1,
        "num_of_local" int4 DEFAULT 1,
        "first_name" text,
        "last_name" text,
        "bd_year" int4,
        "gender" int4,
        "occupation" int4,
        "income" int4,
        "alternate_email" character varying(129),
        "state" int4,
        "country" int4,
        "phone" text,
        "password_question" int4,
        "password_answer" text,
        "crypt" character(13),
        "last_login_ip" character(31),
        "last_seen" timestamp,
        "registered" timestamp,
        "in_limit" int4 DEFAULT 10 NOT NULL,
        "out_limit" int4 DEFAULT 10 NOT NULL,
        "max_msg_size" int4 DEFAULT 64 NOT NULL,
        "max_reply_size" int4 DEFAULT 16 NOT NULL,
        "max_attach_size" int4 DEFAULT 64 NOT NULL,
        "max_replies" int4 DEFAULT 3 NOT NULL
);

And finaly I try to exec explain for join query (this query is incorrect... I know):

explain  select
    account_name, account_pass, blocked, commercial, expire_date, num_of_pop3,
    num_of_local, first_name, last_name, born_year.born_year, gender.gender, occupation.occupation,
    income.income, alternate_email, state.state, country.country, phone, question.question,
    password_answer, last_login_ip, last_seen, registered, in_limit, out_limit, max_msg_size,
    max_reply_size, max_attach_size, max_replies
 from users, born_year, gender, country, income, occupation, state
 where
    born_year.id=bd_year and gender.id=users.gender and occupation.id=users.occupation and
    income.id=users.income and state.id=users.state and country.id=users.country and
    question.id=users.password_question;

And get:
psql:test.sql:11: pqReadData() -- backend closed the channel unexpectedly.
        This probably means the backend terminated abnormally
        before or while processing the request.
psql:test.sql:11: connection to server was lost

Perfect!!!
PostgreSQL 7.0.2
Linux 2.2.15pre8
GLIBC 2.1.3

If I issue correct query:

explain select
    account_name, account_pass, blocked, commercial, expire_date, num_of_pop3,
    num_of_local, first_name, last_name, born_year.name, gender.name, occupation.name,
    income.name, alternate_email, state.name, country.name, phone, question.name,
    password_answer, last_login_ip, last_seen, registered, in_limit, out_limit, max_msg_size,
    max_reply_size, max_attach_size, max_replies
 from users, born_year, gender, country, income, occupation, state
 where
    born_year.id=bd_year and gender.id=users.gender and occupation.id=users.occupation and
    income.id=users.income and state.id=users.state and country.id=users.country and
    question.id=users.password_question;

After 13 seconds wait(why???)... I get the following plan:
psql:test.sql:11: NOTICE:  QUERY PLAN:

psql:test.sql:11: NOTICE:  QUERY PLAN:

Nested Loop  (cost=0.00..22.89 rows=1 width=350)
  ->  Nested Loop  (cost=0.00..20.86 rows=1 width=330)
        ->  Nested Loop  (cost=0.00..19.76 rows=1 width=326)
              ->  Nested Loop  (cost=0.00..17.74 rows=1 width=306)
                    ->  Nested Loop  (cost=0.00..16.64 rows=1 width=302)
                          ->  Nested Loop  (cost=0.00..15.54 rows=1 width=298)
                                ->  Nested Loop  (cost=0.00..13.51 rows=1 width=278)
                                      ->  Nested Loop  (cost=0.00..11.48 rows=1 width=258)
                                            ->  Nested Loop  (cost=0.00..9.45 rows=1 width=238)
                                                  ->  Nested Loop  (cost=0.00..8.36 rows=1 width=234)
                                                        ->  Nested Loop  (cost=0.00..6.33 rows=1 width=214)
                                                              ->  Nested Loop  (cost=0.00..4.30 rows=1 width=194)
                                                                    ->  Nested Loop  (cost=0.00..3.28 rows=1 width=12)
                                                                          ->  Nested Loop  (cost=0.00..2.18 rows=1
width=8)
                                                                                ->  Seq Scan on dicts_type t
(cost=0.00..1.09rows=1 width=4) 
                                                                                ->  Seq Scan on dicts_type t
(cost=0.00..1.09rows=1 width=4) 
                                                                          ->  Seq Scan on dicts_type t
(cost=0.00..1.09rows=1 width=4) 
                                                                    ->  Seq Scan on users  (cost=0.00..1.01 rows=1
width=182)
                                                              ->  Index Scan using dicts_id_key on dicts d
(cost=0.00..2.01rows=1 width=20) 
                                                        ->  Index Scan using dicts_id_key on dicts d  (cost=0.00..2.01
rows=1width=20) 
                                                  ->  Seq Scan on dicts_type t  (cost=0.00..1.09 rows=1 width=4)
                                            ->  Index Scan using dicts_id_key on dicts d  (cost=0.00..2.01 rows=1
width=20)
                                      ->  Index Scan using dicts_id_key on dicts d  (cost=0.00..2.01 rows=1 width=20)
                                ->  Index Scan using dicts_id_key on dicts d  (cost=0.00..2.01 rows=1 width=20)
                          ->  Seq Scan on dicts_type t  (cost=0.00..1.09 rows=1 width=4)
                    ->  Seq Scan on dicts_type t  (cost=0.00..1.09 rows=1 width=4)
              ->  Index Scan using dicts_id_key on dicts d  (cost=0.00..2.01 rows=1 width=20)
        ->  Seq Scan on dicts_type t  (cost=0.00..1.09 rows=1 width=4)
  ->  Index Scan using dicts_id_key on dicts d  (cost=0.00..2.01 rows=1 width=20)

EXPLAIN

All is perfect :-((( Except small issue. It would be much better to do only one seq (or index) scan
on dicts and the do join... Why they are SO much scans on dicts???? And why they are so much type
spent for prepare?

OK. I know that I am bad guy... Let's do more simple thing:
CREATE VIEW country AS select id, name from dicts where type = 1;
CREATE VIEW gender AS select id, name from dicts where type = 2;
CREATE VIEW income AS select id, name from dicts where type = 3;
CREATE VIEW occupation AS select id, name from dicts where type = 4;
CREATE VIEW question AS select id, name from dicts where type = 5;
CREATE VIEW state AS select id, name from dicts where type = 6;
CREATE VIEW born_year AS select id, name from dicts where type = 7;

And the same query's explain:
psql:test.sql:11: NOTICE:  QUERY PLAN:

Nested Loop  (cost=0.00..15.21 rows=1 width=294)
  ->  Nested Loop  (cost=0.00..13.18 rows=1 width=278)
        ->  Nested Loop  (cost=0.00..11.15 rows=1 width=262)
              ->  Nested Loop  (cost=0.00..9.12 rows=1 width=246)
                    ->  Nested Loop  (cost=0.00..7.10 rows=1 width=230)
                          ->  Nested Loop  (cost=0.00..5.07 rows=1 width=214)
                                ->  Nested Loop  (cost=0.00..3.04 rows=1 width=198)
                                      ->  Seq Scan on users  (cost=0.00..1.01 rows=1 width=182)
                                      ->  Index Scan using dicts_id_key on dicts  (cost=0.00..2.02 rows=1 width=16)
                                ->  Index Scan using dicts_id_key on dicts  (cost=0.00..2.02 rows=1 width=16)
                          ->  Index Scan using dicts_id_key on dicts  (cost=0.00..2.02 rows=1 width=16)
                    ->  Index Scan using dicts_id_key on dicts  (cost=0.00..2.02 rows=1 width=16)
              ->  Index Scan using dicts_id_key on dicts  (cost=0.00..2.02 rows=1 width=16)
        ->  Index Scan using dicts_id_key on dicts  (cost=0.00..2.02 rows=1 width=16)
  ->  Index Scan using dicts_id_key on dicts  (cost=0.00..2.02 rows=1 width=16)

EXPLAIN

Again... Lot's of index scans on the same table... Is it possible to do one scan and
avoid lot's of index scans? Sorry to bother, but I created views to avoid multiple scans...
But...

DISCLAMER: after any change vacuum analyze was made. :-))) Just to avoid some questions.

--
Sincerely Yours,
Denis Perchine

----------------------------------
E-Mail: dyp@perchine.com
HomePage: http://www.perchine.com/dyp/
FidoNet: 2:5000/120.5
----------------------------------

pgsql-general by date:

Previous
From: Travis Bauer
Date:
Subject: Hiding table structure
Next
From: Marcos Barreto de Castro
Date:
Subject: Cursor Problems