PostgreSQL and VIEWS - Mailing list pgsql-general
From | Misa Simic |
---|---|
Subject | PostgreSQL and VIEWS |
Date | |
Msg-id | CAH3i69=L2=LXE4YTxwJg3brC_Pv703+Mz2vH504qxcHWF3t5aw@mail.gmail.com Whole thread Raw |
Responses |
Re: PostgreSQL and VIEWS
Re: PostgreSQL and VIEWS |
List | pgsql-general |
HI,
When I have met PostgreSQL for a first time - I have been really amazed - with many things...
But how we started to use it - and data jumps in - we meet performance problems...
Now, it is a bit tricky... any concrete performance problem - can be solved on some way...
However, I am more concerned, that "our way" how we do the things - is not best way for Postgres...
The thing is - our tables are highly normalised - so lot of joins... (NULLS do not exist in the tables)
to make things simplier we use VIEWS to "denormalise" data - though it is not that much we care about denormalisation - it is more to make things simplier and less error prone...
So every "thing" (Customer, product, transaction, employee, whatever) is built up from more tables... I am not sure we have even one "thing" built up just from 1 table...
Lot of "thing properties" are actually calculations: i.e. Invoice Amount, or for Employee:
We have in the one table: first_name and last_name fields, but Full name is concatented as:
last_name || ',' || first_name
So, whenever we need employee full name - is it on Employee Info window - or in an Invoice as Salesperson, or in Payslip report...instead of to everywhere have above formula - we have function...
but again, instead of to developer think about each possible function for concrete thing - we have made for each entity - the view... what have all relevant joins - and all relevant calculated properties... about the thing...
I have had a thought - somewhere read (but now I am more convenient I have read it wrong) that planner is that smart that if we we have:
CREATE VIEW person_view AS
SELECT person_id, first_name, last_name, fn_full_name(id) as full_name, date_of_birth, age(date_of_birth) from person LEFT JOIN person_date_of_birth USING (person_id)
SELECT first_name FROM person_view WHERE id = 1
Planner is smart and:
-will not care about joins - you ask for the field(s) - what do not belong to other tables - both belong to 1 - and it is pk!
-will not care about functions - you havent asked for any field what is function in your query...
However - how we met more and more problems with our approach... and spending more time on
EXPLAIN ANALYZE - instead of on business problems... It seems things are not that good...
for simple questions as above - results are acceptable - even looking into EXPLAIN ANALYZE i would not say it is the best possible plan... (i.e. planner spending time on Seq Scan on person_date_of_birth_table - and filter it - even no need to think about that table at all - LEFT JOIN (show me columns - if there is a matching row for pk column) - so could be check via index -however - there is no any column from that table in the query - I would simple discard that table from plan....
So query
SELECT id FROM view WHERE id = 5 (view is SELECT * FROM table1 LEFT JOIN table2)
I would treat the same as:
SELECT id FROM table1 = 5
ok in INNER JOIN it requires additional confimration - but even there is FK to PK join - that confirmation is not needed iether - but in our cases it is always FK to PK...
However - if we need to involve more "entities"/views - from some unknown reason to me - postgres always picks bad plan...
i.e. list of employees what work in New York
we have employees_contract table:
contract_id, person_id, department_id,
a lot of others tables, but to make it shorter:
Department_view
Buidlings_view
and now query:
SELECT full_name FROM person_view INNER JOIN emplyee_contract USING (person_id) INNER JOIN department_view USING (department_id) INNER JOIN buildings_view USING (building_id) WHERE city_id = 'NY'
from some unknown reason - gives bad plan - then if we "refactor" query and send different question - we get good result... I am pretty sure planner should be capable to "rephrase" my question instead of me...
I would like to hear your expirience with VIEWS in postgres... And some kind of best practice/advice for described situation... So far it looks to me there is no way - to make things ready for any specific question - every "request" will need specific SQL syntax to drive planner in acceptable direction...
Many thanks,
Misa...
pgsql-general by date: