Thread: Question on pgsql optimisation of SQL and structure (index, etc)

Question on pgsql optimisation of SQL and structure (index, etc)

From
Alexandre Leclerc
Date:
Good day,

I use pgsql 7.4: I would like to know if indexes will solve my problem
(I fear the system will become slow with the time). And also some
questions on how pgsql optimise for speed.

*Database*

-- Assuming those tables (not original, but enought to get the point):

CREATE TABLE prod.jobs (
job_id          serial   PRIMARY KEY,
order_id        integer  NOT NULL REFERENCES sales.orders,
);

CREATE TABLE design.products (
product_id          serial      PRIMARY KEY,
company_id          integer     NOT NULL REFERENCES sales.companies ON
UPDATE CASCADE,
product_code        varchar(24) NOT NULL,
CONSTRAINT product_code_already_used_for_this_company UNIQUE
(company_id, product_code)
);

CREATE TABLE prod.jobs_products (
product_id    integer     REFERENCES design.products ON UPDATE CASCADE,
) INHERITS (prod.jobs);

-- Assuming this view:

CREATE VIEW prod.orders_jobs_view AS
  SELECT job_id, order_id, product_code
    FROM (
      SELECT *, NULL AS product_id FROM ONLY prod.jobs
      UNION
      SELECT * FROM prod.jobs_products
    ) AS alljobs LEFT JOIN design.products ON alljobs.product_id =
products.product_id;

*Question 1*

Assuming this request:
SELECT * FROM prod.orders_jobs_view WHERE order_id = 1;

I imagine that somewhere down the road, this will get slow since there
is no index on the order_id. I tought of creating two indexes... With
the previous VIEW and database schema, will the following boost the
DB; as I don't know how PostgreSQL works internally:

CREATE UNIQUE INDEX order_jobs ON prod.jobs(order_id);
CREATE UNIQUE INDEX order_jobs_products ON prod.jobs_products(order_id);

*Question 2*

If no to question 1, what can I do to boost the database speed. I do
have prety heavy views on data, and I would like to get some speed as
the DB will get filled up quickly.

*Question 3*

When creating a wien with linked "UNION" tables as previous... when we
do a SELECT with a WHERE clause, will the database act efficiently by
adding the WHERE clause to the UNIONed tables in the FROM clause?

Example:

SELECT * FROM prod.orders_jobs_view WHERE order_id = 1;

whould cause something like

  SELECT job_id, order_id, product_code
    FROM (
      SELECT *, NULL AS product_id FROM ONLY prod.jobs WHERE order_id = 1
      UNION
      SELECT * FROM prod.jobs_products WHERE order_id = 1
    ) AS alljobs LEFT JOIN design.products ON alljobs.product_id =
products.product_id;

in order to speed the union processing?

Thank you for any help on this.

--
Alexandre Leclerc

Re: Question on pgsql optimisation of SQL and structure (index, etc)

From
Josh Berkus
Date:
Alexandre,

> -- Assuming those tables (not original, but enought to get the point):
>
> CREATE TABLE prod.jobs (
> job_id          serial   PRIMARY KEY,
> order_id        integer  NOT NULL REFERENCES sales.orders,
> );
>
> CREATE TABLE design.products (
> product_id          serial      PRIMARY KEY,
> company_id          integer     NOT NULL REFERENCES sales.companies ON
> UPDATE CASCADE,
> product_code        varchar(24) NOT NULL,
> CONSTRAINT product_code_already_used_for_this_company UNIQUE
> (company_id, product_code)
> );
>
> CREATE TABLE prod.jobs_products (
> product_id    integer     REFERENCES design.products ON UPDATE CASCADE,
> ) INHERITS (prod.jobs);

First off, let me say that I find this schema rather bizarre.  The standard
way to handle your situation would be to add a join table instead of
inheritance for jobs_products:

CREATE TABLE jobs_products (
    job_id INT NOT NULL REFERENCES prod.jobs(job_id) ON DELETE CASCADE,
    product_id    INT NOT NULL  REFERENCES design.products(product_id) ON UPDATE
CASCADE,
    CONSTRAINT jobs_products_pk PRIMARY KEY (job_id, product_id)
);

Then this view:

> CREATE VIEW prod.orders_jobs_view AS
>   SELECT job_id, order_id, product_code
>     FROM (
>       SELECT *, NULL AS product_id FROM ONLY prod.jobs
>       UNION
>       SELECT * FROM prod.jobs_products
>     ) AS alljobs LEFT JOIN design.products ON alljobs.product_id =
> products.product_id;

Becomes much simpler, and better performance:

CREATE VIEW prod.orders_jobs_view AS
SELECT job_id, order_id, product_code
FROM prod.jobs LEFT JOIN prod.jobs_products ON prod.jobs.job_id =
prod.jobs_products.job_id
    LEFT JOIN design.products ON prod.jobs_products.product_id =
design.products.product_id;

> I imagine that somewhere down the road, this will get slow since there
> is no index on the order_id. I tought of creating two indexes... With
> the previous VIEW and database schema, will the following boost the
> DB; as I don't know how PostgreSQL works internally:

Yes.  Any time you have a foreign key, you should index it unless you have a
really good reason not to.

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco