Re: query optimization scenarios 17,701 times faster!!! - Mailing list pgsql-hackers

From Robert Dyas
Subject Re: query optimization scenarios 17,701 times faster!!!
Date
Msg-id MGEFJOBFIEAIADIKAMEKOEJECIAA.rdyas@adelphia.net
Whole thread Raw
In response to Re: query optimization scenarios 17,701 times faster!!!  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
OK, so you lack mind-reading skills ;-) I wasn't sure if anybody would be
interested in exploring this, but here's the full monty. (full EXPLAIN
ANALYZE output + entire schema) I can't provide the actual data, sorry. But
each table has at most a couple of hundred records. The numbers are actually
a little worse this time (33,623 times faster!!!) because there is slightly
more data in the tables. Performance degrades very rapidly as table size
increases.



EXPLAIN ANALYZE SELECT DISTINCT organizations.city, organizations.inactive,
organizations.java_developers, organizations.name, organizations.org_id,
organizations.overview, organizations.phone, organizations.salesperson,
organizations.state, organizations.time_zone FROM organizations LEFT OUTER
JOIN org_milestones ON (org_milestones.org_id = organizations.org_id) LEFT
OUTER JOIN contacts ON (contacts.org_id = organizations.org_id) LEFT OUTER
JOIN tasks ON (tasks.org_id = organizations.org_id) WHERE
(organizations.org_id = 71) ORDER BY organizations.name
covont_production-# ;

QUERY PLAN
----------------------------------------------------------------------------
----------------------------------------------------------------------------
----------------------------------------------------------------------------
-------------------------Unique  (cost=53.32..54.08 rows=3 width=884) (actual
time=19200.24..24870.87 rows=1 loops=1)  ->  Sort  (cost=53.32..53.39 rows=27 width=884) (actual
time=19200.19..19315.98 rows=840 loops=1)        Sort Key: organizations.name, organizations.city,
organizations.inactive, organizations.java_developers, organizations.org_id,
organizations.overview, organizations.phone, organizations.salesperson,
organizations.state, organizations.time_zone        ->  Hash Join  (cost=43.68..52.67 rows=27 width=884) (actual
time=18.42..170.69 rows=840 loops=1)              Hash Cond: ("outer".org_id = "inner".org_id)              ->  Hash
Join (cost=12.38..20.75 rows=5 width=880) (actual
 
time=7.18..19.42 rows=42 loops=1)                    Hash Cond: ("outer".org_id = "inner".org_id)                    ->
Nested Loop  (cost=0.00..8.24 rows=2 width=876)
 
(actual time=1.03..6.16 rows=7 loops=1)                          Join Filter: ("inner".org_id = "outer".org_id)
                ->  Index Scan using organizations_pkey on
 
organizations  (cost=0.00..4.66 rows=1 width=872) (actual time=0.32..0.34
rows=1 loops=1)                                Index Cond: (org_id = 71)                          ->  Seq Scan on
org_milestones (cost=0.00..2.15
 
rows=115 width=4) (actual time=0.04..2.97 rows=116 loops=1)                    ->  Hash  (cost=8.36..8.36 rows=136
width=4)(actual
 
time=5.44..5.44 rows=0 loops=1)                          ->  Seq Scan on contacts  (cost=0.00..8.36
rows=136 width=4) (actual time=0.05..3.21 rows=136 loops=1)              ->  Hash  (cost=19.92..19.92 rows=292 width=4)
(actual
time=10.50..10.50 rows=0 loops=1)                    ->  Seq Scan on tasks  (cost=0.00..19.92 rows=292
width=4) (actual time=0.06..6.38 rows=294 loops=1)Total runtime: 24881.28 msec
(17 rows)

covont_production=# EXPLAIN ANALYZE SELECT organizations.city,
organizations.inactive, organizations.java_developers, organizations.name,
organizations.org_id, organizations.overview, organizations.phone,
organizations.salesperson, organizations.state, organizations.time_zone FROM
organizations WHERE (organizations.org_id = 71)
covont_production-# ;                                                           QUERY PLAN
----------------------------------------------------------------------------
------------------------------------------------------Index Scan using organizations_pkey on organizations
(cost=0.00..4.66
rows=1 width=872) (actual time=0.28..0.29 rows=1 loops=1)  Index Cond: (org_id = 71)Total runtime: 0.74 msec
(3 rows)

covont_production=#



****************************************************************************
**
Here is the complete schema:

CREATE TABLE organizations (org_id              SERIAL PRIMARY KEY,salesperson         INTEGER REFERENCES
users(user_id),name               VARCHAR(30) NOT NULL,phone               VARCHAR(30),city
VARCHAR(30),state              VARCHAR(2),time_zone           VARCHAR(3),overview            TEXT,java_developers
INTEGER,inactive           BOOLEAN NOT NULL
 
);

CREATE INDEX organizations_salesperson ON organizations(salesperson);

CREATE TABLE milestones (milestone_id        SERIAL PRIMARY KEY,name                VARCHAR(60) NOT NULL,red_flag_days
    INTEGER NOT NULL
 
);

CREATE TABLE org_milestones (id                  SERIAL PRIMARY KEY,milestone_id          INTEGER NOT NULL REFERENCES
milestones(milestone_id),org_id             INTEGER NOT NULL REFERENCES organizations(org_id),completed_on        DATE
NOTNULL,notes               VARCHAR(250)
 
);

CREATE TABLE contacts (contact_id          SERIAL PRIMARY KEY,org_id              INTEGER NOT NULL REFERENCES
organizations(org_id),first_name         VARCHAR(30) NOT NULL,last_name           VARCHAR(30),title
VARCHAR(60),phone              VARCHAR(30),email_address       VARCHAR(120),assistant_name
VARCHAR(30),assistant_phone    VARCHAR(30),assistant_email     VARCHAR(120),functional_role     VARCHAR(30),reports_to
       INTEGER REFERENCES contacts(contact_id),notes               TEXT,pain                VARCHAR(120),reasons
    VARCHAR(250),needs               VARCHAR(250),address_1           VARCHAR(60),address_2           VARCHAR(60),city
             VARCHAR(30),state               VARCHAR(2),zip_code            VARCHAR(10),time_zone           VARCHAR(3)
 
);

CREATE INDEX contacts_org_id ON contacts(org_id);

CREATE TABLE tasks (task_id             SERIAL PRIMARY KEY,org_id              INTEGER NOT NULL REFERENCES
organizations(org_id),contact_id         INTEGER NOT NULL REFERENCES contacts(contact_id),created_by          INTEGER
NOTNULL REFERENCES users(user_id),start_date          DATE NOT NULL,start_time          TIME,completed
BOOLEANNOT NULL,task_type           VARCHAR(30) NOT NULL,objective           VARCHAR(120) NOT NULL,outcome
VARCHAR(120),notes              TEXT,priority            INTEGER NOT NULL
 
);

CREATE INDEX tasks_org_id ON tasks(org_id);
CREATE INDEX tasks_contact_id ON tasks(contact_id);
CREATE INDEX tasks_start_date ON tasks(start_date);

CREATE TABLE attachments (attach_id           SERIAL PRIMARY KEY,task_id             INTEGER NOT NULL REFERENCES
tasks(task_id)ON DELETE
 
CASCADE,attachment          BYTEA
);

CREATE INDEX attachments_task_id ON attachments(task_id);




-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Wednesday, April 23, 2003 11:00 PM
To: Robert Dyas
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] query optimization scenarios 17,701 times
faster!!!


"Robert Dyas" <rdyas@adelphia.net> writes:
> The following is a list of query pairs (one fast, one slow) that must
> produce identical results by definition (and do), but have very different
> execution times.

With no details about the table schemas, nor the EXPLAIN ANALYZE output
data, I really wonder how you expect any intelligent comments.  We are
programmers, not mind-readers.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Hannu Krosing
Date:
Subject: Re: Unrelated question
Next
From: "Dave Page"
Date:
Subject: Re: putting new pdf docs on ftp site