Thread: Problem with left join when moving a column to another table
I am having some problems moving a column to another table and fixing some views that rely on it. I want to move the area_id column from t_offerprice_pipe to t_offerprice and then left join the results.
When I have only one table I get the correct results. area_id is currently in the t_offerprice_pipe. The working portion on the query is below.
I am joining the price.t_offerprice_pipe twice because I am looking for a wild card with manufacturer_id=-100 that has lower precedence than a specific manufacturer_id
LEFT JOIN t_offerprice_pipe opp ON opp.size_id = st.size_id AND
opp.manufacturer_id = st.manufacturer_id AND
opp.area_id = c.area_id
LEFT JOIN price.t_offerprice_pipe opam ON opam.size_id = st.size_id AND
opam.manufacturer_id = (-100) AND
opam.area_id = c.area_id
After moving the column to t_offerprice I am attempting to add a second left join, but is not working as I expected. I am getting multiple results from this query.
LEFT JOIN t_offerprice_pipe opp ON opp.size_id = st.size_id AND
opp.manufacturer_id = st.manufacturer_id
LEFT JOIN t_offerprice op ON op.id = opp.id AND
op.area_id = c.area_id
LEFT JOIN price.t_offerprice_pipe oppam ON oppam.size_id = st.size_id AND
oppam.manufacturer_id = (-100)
LEFT JOIN t_offerprice opam ON opam.id = oppam.id AND
opam.area_id = c.area_id
This is a stripped down version of the query for clarity.
I tried moving the condition into the where clause with no success.
I would greatly appreciate any advice on rewriting this query.
When I have only one table I get the correct results. area_id is currently in the t_offerprice_pipe. The working portion on the query is below.
I am joining the price.t_offerprice_pipe twice because I am looking for a wild card with manufacturer_id=-100 that has lower precedence than a specific manufacturer_id
LEFT JOIN t_offerprice_pipe opp ON opp.size_id = st.size_id AND
opp.manufacturer_id = st.manufacturer_id AND
opp.area_id = c.area_id
LEFT JOIN price.t_offerprice_pipe opam ON opam.size_id = st.size_id AND
opam.manufacturer_id = (-100) AND
opam.area_id = c.area_id
After moving the column to t_offerprice I am attempting to add a second left join, but is not working as I expected. I am getting multiple results from this query.
LEFT JOIN t_offerprice_pipe opp ON opp.size_id = st.size_id AND
opp.manufacturer_id = st.manufacturer_id
LEFT JOIN t_offerprice op ON op.id = opp.id AND
op.area_id = c.area_id
LEFT JOIN price.t_offerprice_pipe oppam ON oppam.size_id = st.size_id AND
oppam.manufacturer_id = (-100)
LEFT JOIN t_offerprice opam ON opam.id = oppam.id AND
opam.area_id = c.area_id
This is a stripped down version of the query for clarity.
I tried moving the condition into the where clause with no success.
I would greatly appreciate any advice on rewriting this query.
Jason Long-2 wrote > I am having some problems moving a column to another table and fixing > some views that rely on it. I want to move the area_id column from > t_offerprice_pipe to t_offerprice and then left join the results. > > When I have only one table I get the correct results. area_id is > currently in the t_offerprice_pipe. The working portion on the query is > below. Maybe someone else can make sense of your partial examples but I cannot. I'd suggest creating self-contained queries that exhibit both the correct and incorrect behavior. Use the following template: WITH from_table_not_specified (col1, col2) AS ( VALUES (1, 1), (2, 2) ) , t_offerprice_pipe (....) AS ( VALUES (...), (....) ) , to_offerprice (...) AS ( VALUES (...), (...) ) /* working query */ SELECT * FROM from_table_not_specified LEFT JOIN t_offerprice_pipe op1 ON ... LEFT JOIN t_offerprice_pipe op2 ON ... /* not working query using same or similar CTEs where possible. */ SELECT * FROM ... LEFT JOIN ... LEFT JOIN ... LEFT JOIN ... LEFT JOIN ... Without a working query it is really hard (impossible really) to debug "wrong number of rows" problems. Especially since the query itself is possibly not the problem but rather your data model is flawed. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Problem-with-left-join-when-moving-a-column-to-another-table-tp5760187p5760192.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
David,
Thank you very much for your response.
Below is a script that will reproduce the problem with comments included.
/*******************************************************/
--drop table t_item;
--drop table t_price_base_table;
--drop table t_price_original_with_area_id;
--this table represents inventory line items
CREATE TABLE t_item
(
id bigint NOT NULL,
size_id bigint NOT NULL,
area_id bigint NOT NULL,
CONSTRAINT pk_t_item PRIMARY KEY (id)
);
INSERT INTO t_item VALUES (1, 1, 10);
INSERT INTO t_item VALUES (2, 4, 1);
INSERT INTO t_item VALUES (3, 19, 1);
-- I want to move the area_id(and other columns not listed here) to another base table and left join it
CREATE TABLE t_price_original_with_area_id
(
id bigint NOT NULL,
size_id bigint NOT NULL,
area_id bigint NOT NULL,
CONSTRAINT pk_t_price_original_with_area_id PRIMARY KEY (id)
);
INSERT INTO t_price_original_with_area_id VALUES (162, 4, 6);
INSERT INTO t_price_original_with_area_id VALUES (161, 4, 2);
INSERT INTO t_price_original_with_area_id VALUES (159, 4, 1);
INSERT INTO t_price_original_with_area_id VALUES (638, 19, 9);
INSERT INTO t_price_original_with_area_id VALUES (633, 19, 14);
INSERT INTO t_price_original_with_area_id VALUES (675, 19, 45);
INSERT INTO t_price_original_with_area_id VALUES (64, 19, 1);
-- My simplified base table
CREATE TABLE t_price_base_table
(
id bigint NOT NULL,
area_id bigint NOT NULL,
CONSTRAINT pk_t_price_base_table PRIMARY KEY (id)
);
-- insert to add the information I want to transfer to the base table so I can drop the area_id column
insert into t_price_base_table (id, area_id) (select id, area_id from t_price_original_with_area_id);
/*
This is the working query. Note it joins size_id and area_id in one left join.
It produces 1 row for each item. There is no match for item 1. Item 2 and 3
match the price table.
*/
select it.*,
pwoa.*
from t_item it
left join t_price_original_with_area_id pwoa on it.size_id=pwoa.size_id and
it.area_id=pwoa.area_id
order by it.id;
/*
This is the new query that is not working correctly.
I am trying to left join the base table by its id and area_id.
I need a left join because there is no guarantee that there is a matching price.
The where claues seems to work, but I the orginal query is much more complicated,
and I will be needed to do a simiar join in may views.
*/
select it.*,
pwoa.*,
pbt.*
from t_item it
left join t_price_original_with_area_id pwoa on it.size_id=pwoa.size_id
left join t_price_base_table pbt on pbt.id=pwoa.id and
it.area_id=pbt.area_id
/*
where (pwoa.id is not null and pbt.id is not null) or
(pwoa.id is null and pbt.id is null)
*/
order by it.id;
/*******************************************************/
On Thu, 2013-06-20 at 12:29 -0700, David Johnston wrote:
Thank you very much for your response.
Below is a script that will reproduce the problem with comments included.
/*******************************************************/
--drop table t_item;
--drop table t_price_base_table;
--drop table t_price_original_with_area_id;
--this table represents inventory line items
CREATE TABLE t_item
(
id bigint NOT NULL,
size_id bigint NOT NULL,
area_id bigint NOT NULL,
CONSTRAINT pk_t_item PRIMARY KEY (id)
);
INSERT INTO t_item VALUES (1, 1, 10);
INSERT INTO t_item VALUES (2, 4, 1);
INSERT INTO t_item VALUES (3, 19, 1);
-- I want to move the area_id(and other columns not listed here) to another base table and left join it
CREATE TABLE t_price_original_with_area_id
(
id bigint NOT NULL,
size_id bigint NOT NULL,
area_id bigint NOT NULL,
CONSTRAINT pk_t_price_original_with_area_id PRIMARY KEY (id)
);
INSERT INTO t_price_original_with_area_id VALUES (162, 4, 6);
INSERT INTO t_price_original_with_area_id VALUES (161, 4, 2);
INSERT INTO t_price_original_with_area_id VALUES (159, 4, 1);
INSERT INTO t_price_original_with_area_id VALUES (638, 19, 9);
INSERT INTO t_price_original_with_area_id VALUES (633, 19, 14);
INSERT INTO t_price_original_with_area_id VALUES (675, 19, 45);
INSERT INTO t_price_original_with_area_id VALUES (64, 19, 1);
-- My simplified base table
CREATE TABLE t_price_base_table
(
id bigint NOT NULL,
area_id bigint NOT NULL,
CONSTRAINT pk_t_price_base_table PRIMARY KEY (id)
);
-- insert to add the information I want to transfer to the base table so I can drop the area_id column
insert into t_price_base_table (id, area_id) (select id, area_id from t_price_original_with_area_id);
/*
This is the working query. Note it joins size_id and area_id in one left join.
It produces 1 row for each item. There is no match for item 1. Item 2 and 3
match the price table.
*/
select it.*,
pwoa.*
from t_item it
left join t_price_original_with_area_id pwoa on it.size_id=pwoa.size_id and
it.area_id=pwoa.area_id
order by it.id;
/*
This is the new query that is not working correctly.
I am trying to left join the base table by its id and area_id.
I need a left join because there is no guarantee that there is a matching price.
The where claues seems to work, but I the orginal query is much more complicated,
and I will be needed to do a simiar join in may views.
*/
select it.*,
pwoa.*,
pbt.*
from t_item it
left join t_price_original_with_area_id pwoa on it.size_id=pwoa.size_id
left join t_price_base_table pbt on pbt.id=pwoa.id and
it.area_id=pbt.area_id
/*
where (pwoa.id is not null and pbt.id is not null) or
(pwoa.id is null and pbt.id is null)
*/
order by it.id;
/*******************************************************/
On Thu, 2013-06-20 at 12:29 -0700, David Johnston wrote:
Jason Long-2 wrote > I am having some problems moving a column to another table and fixing > some views that rely on it. I want to move the area_id column from > t_offerprice_pipe to t_offerprice and then left join the results. > > When I have only one table I get the correct results. area_id is > currently in the t_offerprice_pipe. The working portion on the query is > below. Maybe someone else can make sense of your partial examples but I cannot. I'd suggest creating self-contained queries that exhibit both the correct and incorrect behavior. Use the following template: WITH from_table_not_specified (col1, col2) AS ( VALUES (1, 1), (2, 2) ) , t_offerprice_pipe (....) AS ( VALUES (...), (....) ) , to_offerprice (...) AS ( VALUES (...), (...) ) /* working query */ SELECT * FROM from_table_not_specified LEFT JOIN t_offerprice_pipe op1 ON ... LEFT JOIN t_offerprice_pipe op2 ON ... /* not working query using same or similar CTEs where possible. */ SELECT * FROM ... LEFT JOIN ... LEFT JOIN ... LEFT JOIN ... LEFT JOIN ... Without a working query it is really hard (impossible really) to debug "wrong number of rows" problems. Especially since the query itself is possibly not the problem but rather your data model is flawed. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Problem-with-left-join-when-moving-a-column-to-another-table-tp5760187p5760192.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
Jason Long-2 wrote > David, > > Thank you very much for your response. > Below is a script that will reproduce the problem with comments > included. > > .... > > /* > This is the new query that is not working correctly. > I am trying to left join the base table by its id and area_id. > I need a left join because there is no guarantee that there is a > matching price. The query I am pretty sure you want is: WITH item (i_id, size_id, area_id) AS ( VALUES (1,1,10),(2,4,1),(3,19,1) ) , price_orig (p_id, size_id, area_id) AS ( VALUES (162,4,6),(161,4,2),(159,4,1),(638,19,9),(633,19,14),(675,19,45),(64,19,1) ) , simple_base (p_id, area_id) AS ( SELECT p_id, area_id FROm price_orig ) --SELECT * FROM item LEFT JOIN price_orig USING (size_id, area_id) --original /* your problem query SELECT * FROM item LEFT JOIN price_orig USING (size_id) LEFT JOIN simple_base ON (price_orig.p_id = simple_base.p_id AND item.area_id = simple_base.area_id) */ -- the correct query SELECT * FROM item LEFT JOIN (SELECT p_id, price_orig.size_id, simple_base.area_id FROM price_orig JOIN simple_base USING (p_id)) rebuild USING (size_id, area_id) In the original query you used both size and area to link to the price table. Even though you have moved the area to a different table in order to keep the same semantics you have to continue performing the same relational join. If you intend something different then you are not providing enough information since neither size_id nor area_id are unique within the price table. Because the combination of the two just happens to not be duplicated in the supplied data the correct queries only return a single result per item. In the "correct" query I am providing I am first re-joining (with an inner join) the two tables so that they appear just like the original table appeared. Then I am joining the "view" to the items table using both size and area. The fundamental problem is that you really do not want right-hand tables in left joins to refer to each other. FROM item LEFT JOIN price_orig ON item = price_orig LEFT JOIN price_base ON item = price_base AND price_orig = price_base --< the second AND expression is the problem. I do not even try to remember nesting rules for JOIN generally. My basic form is: FROM INNER* LEFT* (with the ON clause only referring to tables joined via INNER) if my solution requires a different usage I either move parts of the query into CTEs or I start explicitly adding parenthesis to explicitly group the different pieces - and adding INNER JOIN where necessary like I did for your example. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Problem-with-left-join-when-moving-a-column-to-another-table-tp5760187p5760210.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
On Thu, 2013-06-20 at 15:37 -0700, David Johnston wrote:
size_id and area_id will have to be unique across both t_price_base_table and t_price_original_with_area_id. I will want to drop area_id from t_price_original_with_area_id.
What is the best way to implement the cross table unique constraint?
There is a unique constraint on the real price table. I hadn't thought of how I will enforce the constraint across two tables.Jason Long-2 wrote > David, > > Thank you very much for your response. > Below is a script that will reproduce the problem with comments > included. > > .... > > /* > This is the new query that is not working correctly. > I am trying to left join the base table by its id and area_id. > I need a left join because there is no guarantee that there is a > matching price. The query I am pretty sure you want is: WITH item (i_id, size_id, area_id) AS ( VALUES (1,1,10),(2,4,1),(3,19,1) ) , price_orig (p_id, size_id, area_id) AS ( VALUES (162,4,6),(161,4,2),(159,4,1),(638,19,9),(633,19,14),(675,19,45),(64,19,1) ) , simple_base (p_id, area_id) AS ( SELECT p_id, area_id FROm price_orig ) --SELECT * FROM item LEFT JOIN price_orig USING (size_id, area_id) --original /* your problem query SELECT * FROM item LEFT JOIN price_orig USING (size_id) LEFT JOIN simple_base ON (price_orig.p_id = simple_base.p_id AND item.area_id = simple_base.area_id) */ -- the correct query SELECT * FROM item LEFT JOIN (SELECT p_id, price_orig.size_id, simple_base.area_id FROM price_orig JOIN simple_base USING (p_id)) rebuild USING (size_id, area_id) In the original query you used both size and area to link to the price table. Even though you have moved the area to a different table in order to keep the same semantics you have to continue performing the same relational join. If you intend something different then you are not providing enough information since neither size_id nor area_id are unique within the price table. Because the combination of the two just happens to not be duplicated in the supplied data the correct queries only return a single result per item.
size_id and area_id will have to be unique across both t_price_base_table and t_price_original_with_area_id. I will want to drop area_id from t_price_original_with_area_id.
What is the best way to implement the cross table unique constraint?
In the "correct" query I am providing I am first re-joining (with an inner join) the two tables so that they appear just like the original table appeared. Then I am joining the "view" to the items table using both size and area. The fundamental problem is that you really do not want right-hand tables in left joins to refer to each other. FROM item LEFT JOIN price_orig ON item = price_orig LEFT JOIN price_base ON item = price_base AND price_orig = price_base --< the second AND expression is the problem. I do not even try to remember nesting rules for JOIN generally. My basic form is: FROM INNER* LEFT* (with the ON clause only referring to tables joined via INNER) if my solution requires a different usage I either move parts of the query into CTEs or I start explicitly adding parenthesis to explicitly group the different pieces - and adding INNER JOIN where necessary like I did for your example. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Problem-with-left-join-when-moving-a-column-to-another-table-tp5760187p5760210.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
Jason Long-2 wrote >> Jason Long-2 wrote > > > There is a unique constraint on the real price table. I hadn't thought > of how I will enforce the constraint across two tables. > size_id and area_id will have to be unique across both > t_price_base_table and t_price_original_with_area_id. I will want to > drop area_id from t_price_original_with_area_id. > > What is the best way to implement the cross table unique constraint? Don't. If size+area is a unique constraint then there should be a table that defines valid pairs and creates a PRIMARY KEY over them. Per my original comment your issue isn't JOINs (well, your biggest issue anyway) but your model. The fact that you couldn't write a good query simply exposed the problems in the model. This is not uncommon. I would need a lot more information (and time) than I have now to offer any design thoughts on your schema; though I do find the unique constraint over size+area to be unusual - as well as using that as a foreign key from the item table. You haven't specified the domain for this model but using homes as an example I would use a 'model' table with "model_id, size, area" as columns. A particular house would then link in "model" and "price". You could possibly further restrict that certain models can only sell for certain prices if necessary - in which case you would have "model_price" and possibly "house_model_price" tables (the later could be an FK). David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Problem-with-left-join-when-moving-a-column-to-another-table-tp5760187p5760220.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
On Thu, 2013-06-20 at 16:22 -0700, David Johnston wrote:
David,Jason Long-2 wrote >> Jason Long-2 wrote > > > There is a unique constraint on the real price table. I hadn't thought > of how I will enforce the constraint across two tables. > size_id and area_id will have to be unique across both > t_price_base_table and t_price_original_with_area_id. I will want to > drop area_id from t_price_original_with_area_id. > > What is the best way to implement the cross table unique constraint? Don't. If size+area is a unique constraint then there should be a table that defines valid pairs and creates a PRIMARY KEY over them. Per my original comment your issue isn't JOINs (well, your biggest issue anyway) but your model. The fact that you couldn't write a good query simply exposed the problems in the model. This is not uncommon. I would need a lot more information (and time) than I have now to offer any design thoughts on your schema; though I do find the unique constraint over size+area to be unusual - as well as using that as a foreign key from the item table. You haven't specified the domain for this model but using homes as an example I would use a 'model' table with "model_id, size, area" as columns. A particular house would then link in "model" and "price". You could possibly further restrict that certain models can only sell for certain prices if necessary - in which case you would have "model_price" and possibly "house_model_price" tables (the later could be an FK). David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Problem-with-left-join-when-moving-a-column-to-another-table-tp5760187p5760220.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
I really appreciate your help. I had not used WITH or any of the syntax you showed me before. Pretty cool.
I normally just write a bunch of views to build complex queries.
Does the syntax you showed me have performance benefits vs joining a bunch of views together?
I spent way to much time trying to get the query to work, and all I needed to do was write a view
create or replace view price.v_offerprice_pipe as
select op.id,
op.price,
op.active,
op.stditem,
op.version,
opp.size_id,
opp.weight_id,
opp.grade_id,
opp.endfinish_id,
opp.manufacturer_id,
opp.condition_id,
opp.area_id
from price.t_offerprice_pipe opp
join price.t_offerprice op on op.id=opp.id;
This allowed me to move (price, active, stditem, version) to the base table without breaking any of my views with very minimal change to the view definitions.
I just had to replace any references to price.t_offerprice_pipe with the view price.v_offerprice_pipe in any of the views that were complaining about dropping the columns.
I decided not to move area_id to the base table for now. Without being able to properly do a cross table unique constraint, it will stay where it is currently.
Jason Long-2 wrote > Does the syntax you showed me have performance benefits vs joining a > bunch of views together? As a general rule CTE/WITH is going to be worse performing than the equivalent view definition - depending on the view is actually used in the query of course. They both have their place. A CTE/WITH is basically a per-query VIEW though there is an optimization barrier that doesn't allow the main query WHERE clause to limit the queries like it a view would normally allow if possible. Because of this I'll occasionally find need to specify redundant where clauses inside the CTE to get decent performance on large tables - mostly for my interactive queries. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Problem-with-left-join-when-moving-a-column-to-another-table-tp5760187p5760255.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.