Problem when doing join from multiple views - Mailing list pgsql-general
From | Denis Perchine |
---|---|
Subject | Problem when doing join from multiple views |
Date | |
Msg-id | 0006110131070Z.00505@dyp Whole thread Raw |
List | pgsql-general |
Hello all, I have quite strange problem. I have one table for lots of dicts: CREATE TABLE dicts ( id serial, type int4, name text ); And dicts_type table with textual description of types. Don't blame me... I know that this is not fast. I just try to be as close as the situation is. CREATE TABLE dicts_type ( id serial, name text ); And I created lots of view for each of them: CREATE VIEW country AS select d.id, d.name from dicts d, dicts_type t where d.type=t.id and t.name='country'; CREATE VIEW gender AS select d.id, d.name from dicts d, dicts_type t where d.type=t.id and t.name='gender'; CREATE VIEW income AS select d.id, d.name from dicts d, dicts_type t where d.type=t.id and t.name='income'; CREATE VIEW occupation AS select d.id, d.name from dicts d, dicts_type t where d.type=t.id and t.name='occupation'; CREATE VIEW question AS select d.id, d.name from dicts d, dicts_type t where d.type=t.id and t.name='question'; CREATE VIEW state AS select d.id, d.name from dicts d, dicts_type t where d.type=t.id and t.name='state'; CREATE VIEW born_year AS select d.id, d.name from dicts d, dicts_type t where d.type=t.id and t.name='born_year'; Then I have table users which use all of these dicts: CREATE TABLE "users" ( "id" serial, "account_name" text, "account_pass" text, "blocked" bool DEFAULT 'f', "commercial" bool DEFAULT 'f', "expire_date" timestamp, "num_of_pop3" int4 DEFAULT 1, "num_of_local" int4 DEFAULT 1, "first_name" text, "last_name" text, "bd_year" int4, "gender" int4, "occupation" int4, "income" int4, "alternate_email" character varying(129), "state" int4, "country" int4, "phone" text, "password_question" int4, "password_answer" text, "crypt" character(13), "last_login_ip" character(31), "last_seen" timestamp, "registered" timestamp, "in_limit" int4 DEFAULT 10 NOT NULL, "out_limit" int4 DEFAULT 10 NOT NULL, "max_msg_size" int4 DEFAULT 64 NOT NULL, "max_reply_size" int4 DEFAULT 16 NOT NULL, "max_attach_size" int4 DEFAULT 64 NOT NULL, "max_replies" int4 DEFAULT 3 NOT NULL ); And finaly I try to exec explain for join query (this query is incorrect... I know): explain select account_name, account_pass, blocked, commercial, expire_date, num_of_pop3, num_of_local, first_name, last_name, born_year.born_year, gender.gender, occupation.occupation, income.income, alternate_email, state.state, country.country, phone, question.question, password_answer, last_login_ip, last_seen, registered, in_limit, out_limit, max_msg_size, max_reply_size, max_attach_size, max_replies from users, born_year, gender, country, income, occupation, state where born_year.id=bd_year and gender.id=users.gender and occupation.id=users.occupation and income.id=users.income and state.id=users.state and country.id=users.country and question.id=users.password_question; And get: psql:test.sql:11: pqReadData() -- backend closed the channel unexpectedly. This probably means the backend terminated abnormally before or while processing the request. psql:test.sql:11: connection to server was lost Perfect!!! PostgreSQL 7.0.2 Linux 2.2.15pre8 GLIBC 2.1.3 If I issue correct query: explain select account_name, account_pass, blocked, commercial, expire_date, num_of_pop3, num_of_local, first_name, last_name, born_year.name, gender.name, occupation.name, income.name, alternate_email, state.name, country.name, phone, question.name, password_answer, last_login_ip, last_seen, registered, in_limit, out_limit, max_msg_size, max_reply_size, max_attach_size, max_replies from users, born_year, gender, country, income, occupation, state where born_year.id=bd_year and gender.id=users.gender and occupation.id=users.occupation and income.id=users.income and state.id=users.state and country.id=users.country and question.id=users.password_question; After 13 seconds wait(why???)... I get the following plan: psql:test.sql:11: NOTICE: QUERY PLAN: psql:test.sql:11: NOTICE: QUERY PLAN: Nested Loop (cost=0.00..22.89 rows=1 width=350) -> Nested Loop (cost=0.00..20.86 rows=1 width=330) -> Nested Loop (cost=0.00..19.76 rows=1 width=326) -> Nested Loop (cost=0.00..17.74 rows=1 width=306) -> Nested Loop (cost=0.00..16.64 rows=1 width=302) -> Nested Loop (cost=0.00..15.54 rows=1 width=298) -> Nested Loop (cost=0.00..13.51 rows=1 width=278) -> Nested Loop (cost=0.00..11.48 rows=1 width=258) -> Nested Loop (cost=0.00..9.45 rows=1 width=238) -> Nested Loop (cost=0.00..8.36 rows=1 width=234) -> Nested Loop (cost=0.00..6.33 rows=1 width=214) -> Nested Loop (cost=0.00..4.30 rows=1 width=194) -> Nested Loop (cost=0.00..3.28 rows=1 width=12) -> Nested Loop (cost=0.00..2.18 rows=1 width=8) -> Seq Scan on dicts_type t (cost=0.00..1.09rows=1 width=4) -> Seq Scan on dicts_type t (cost=0.00..1.09rows=1 width=4) -> Seq Scan on dicts_type t (cost=0.00..1.09rows=1 width=4) -> Seq Scan on users (cost=0.00..1.01 rows=1 width=182) -> Index Scan using dicts_id_key on dicts d (cost=0.00..2.01rows=1 width=20) -> Index Scan using dicts_id_key on dicts d (cost=0.00..2.01 rows=1width=20) -> Seq Scan on dicts_type t (cost=0.00..1.09 rows=1 width=4) -> Index Scan using dicts_id_key on dicts d (cost=0.00..2.01 rows=1 width=20) -> Index Scan using dicts_id_key on dicts d (cost=0.00..2.01 rows=1 width=20) -> Index Scan using dicts_id_key on dicts d (cost=0.00..2.01 rows=1 width=20) -> Seq Scan on dicts_type t (cost=0.00..1.09 rows=1 width=4) -> Seq Scan on dicts_type t (cost=0.00..1.09 rows=1 width=4) -> Index Scan using dicts_id_key on dicts d (cost=0.00..2.01 rows=1 width=20) -> Seq Scan on dicts_type t (cost=0.00..1.09 rows=1 width=4) -> Index Scan using dicts_id_key on dicts d (cost=0.00..2.01 rows=1 width=20) EXPLAIN All is perfect :-((( Except small issue. It would be much better to do only one seq (or index) scan on dicts and the do join... Why they are SO much scans on dicts???? And why they are so much type spent for prepare? OK. I know that I am bad guy... Let's do more simple thing: CREATE VIEW country AS select id, name from dicts where type = 1; CREATE VIEW gender AS select id, name from dicts where type = 2; CREATE VIEW income AS select id, name from dicts where type = 3; CREATE VIEW occupation AS select id, name from dicts where type = 4; CREATE VIEW question AS select id, name from dicts where type = 5; CREATE VIEW state AS select id, name from dicts where type = 6; CREATE VIEW born_year AS select id, name from dicts where type = 7; And the same query's explain: psql:test.sql:11: NOTICE: QUERY PLAN: Nested Loop (cost=0.00..15.21 rows=1 width=294) -> Nested Loop (cost=0.00..13.18 rows=1 width=278) -> Nested Loop (cost=0.00..11.15 rows=1 width=262) -> Nested Loop (cost=0.00..9.12 rows=1 width=246) -> Nested Loop (cost=0.00..7.10 rows=1 width=230) -> Nested Loop (cost=0.00..5.07 rows=1 width=214) -> Nested Loop (cost=0.00..3.04 rows=1 width=198) -> Seq Scan on users (cost=0.00..1.01 rows=1 width=182) -> Index Scan using dicts_id_key on dicts (cost=0.00..2.02 rows=1 width=16) -> Index Scan using dicts_id_key on dicts (cost=0.00..2.02 rows=1 width=16) -> Index Scan using dicts_id_key on dicts (cost=0.00..2.02 rows=1 width=16) -> Index Scan using dicts_id_key on dicts (cost=0.00..2.02 rows=1 width=16) -> Index Scan using dicts_id_key on dicts (cost=0.00..2.02 rows=1 width=16) -> Index Scan using dicts_id_key on dicts (cost=0.00..2.02 rows=1 width=16) -> Index Scan using dicts_id_key on dicts (cost=0.00..2.02 rows=1 width=16) EXPLAIN Again... Lot's of index scans on the same table... Is it possible to do one scan and avoid lot's of index scans? Sorry to bother, but I created views to avoid multiple scans... But... DISCLAMER: after any change vacuum analyze was made. :-))) Just to avoid some questions. -- Sincerely Yours, Denis Perchine ---------------------------------- E-Mail: dyp@perchine.com HomePage: http://www.perchine.com/dyp/ FidoNet: 2:5000/120.5 ----------------------------------
pgsql-general by date: