Thread: Table transform query
Hi, I'm trying to find out how to transform this kind of table data (history of rental data in a firm): date serial delivery ---------------------- 1 1 L 1 2 L 1 3 L 2 1 R 2 2 R 2 4 L 3 5 L 3 3 R 3 4 R 4 1 L ... into this: serial dateL dateR -------------------- 1 1 2 1 4 2 1 2 3 1 3 4 2 3 5 3 Basically, data on table 1 means: - on date 1, product with serial 1 is sent to the customer - on date 1, product with serial 2 is sent to the customer - on date 1, product with serial 3 is sent to the customer - on date 2, product with serial 1 comes back ... On table 2, data means: - Product with serial 1 is sent to the customer on date 1, and comes back on date 2 - Product with serial 1 is sent to the customer on date 4, and hasn't come back yet ... Do you think there is a generic SQL solution to this problem, like crosstab or pivot table? I'm thinking of doing things in a "procedural" plpgsql manner... Any idea is welcome. Thanks! Philippe
pgsql-sql-owner@postgresql.org wrote: > Hi, > > I'm trying to find out how to transform this kind of table data > (history of rental data in a firm): ... I have answred my own question: yes, there is a pure SQL solution, with a subselect: CREATE TABLE foo ( serial integer, delivery character(1), date integer ); INSERT INTO foo (serial, delivery, date) VALUES (1, 'L', 1); INSERT INTO foo (serial, delivery, date) VALUES (2, 'L', 1); INSERT INTO foo (serial, delivery, date) VALUES (3, 'L', 1); INSERT INTO foo (serial, delivery, date) VALUES (1, 'R', 2); INSERT INTO foo (serial, delivery, date) VALUES (2, 'R', 2); INSERT INTO foo (serial, delivery, date) VALUES (4, 'L', 2); INSERT INTO foo (serial, delivery, date) VALUES (5, 'L', 3); INSERT INTO foo (serial, delivery, date) VALUES (3, 'R', 3); INSERT INTO foo (serial, delivery, date) VALUES (4, 'R', 3); INSERT INTO foo (serial, delivery, date) VALUES (1, 'L', 4); ------------- select f.serial, f.date as dateL, ( select f2.date from foo as f2 where f2.serial = f.serial and f2.date > f.date and f2.delivery = 'R' order by f2.dateasc limit 1 ) as dateR from foo as f where f.delivery = 'L' order by f.serial, f.date ------------- I'm not sure if we could use a self-join here... Cheers, Philippe Lang
On 9/18/07, Philippe Lang <philippe.lang@attiksystem.ch> wrote: > ... into this: > > > serial dateL dateR > -------------------- > 1 1 2 > 1 4 > 2 1 2 > 3 1 3 > 4 2 3 > 5 3 SELECT t1.serial, t1.DATE AS datel, t2.DATE AS dater FROM t t1 LEFT JOIN t t2 ON( t1.serial = t2.serial AND t1.DATE < t2.DATE) WHERE t1.delivery = 'L' AND ( t2.delivery = 'R' OR t2.delivery ISNULL) ORDER BY t1.serial
Rodrigo De León skrev: > On 9/18/07, Philippe Lang <philippe.lang@attiksystem.ch> wrote: >> ... into this: >> >> >> serial dateL dateR >> -------------------- >> 1 1 2 >> 1 4 >> 2 1 2 >> 3 1 3 >> 4 2 3 >> 5 3 > > SELECT t1.serial, t1.DATE AS datel, t2.DATE AS dater > FROM t t1 LEFT JOIN t t2 ON( t1.serial = t2.serial > AND t1.DATE < t2.DATE) > WHERE t1.delivery = 'L' > AND ( t2.delivery = 'R' > OR t2.delivery IS NULL) > ORDER BY t1.serial This only works if (serial, delivery) is unique - which it doesn't appear to be, from the solution posted by Philippe himself (which does a LIMIT 1 in the subquery). A take on a self-join: SELECT t1.serial, t1.date as dateL, MIN(t2.date) as dateR FROM t t1 LEFT JOIN t t2 ON t1.serial = t2.serial AND t1.date < t2.date AND t2.delivery = 'R' WHERE t1.delivery = 'L' GROUP BY t1.serial, t1.date Whether this is any clearer, or runs faster, than the correlated subquery (which could be simplified by using MIN instead of LIMIT 1) is up for debate and test, respectively. Nis
> A take on a self-join: > > SELECT t1.serial, t1.date as dateL, MIN(t2.date) as dateR FROM t t1 > LEFT JOIN t t2 ON t1.serial = t2.serial AND t1.date < t2.date AND > t2.delivery = 'R' > WHERE t1.delivery = 'L' > GROUP BY t1.serial, t1.date > > Whether this is any clearer, or runs faster, than the correlated > subquery (which could be simplified by using MIN instead of LIMIT 1) > is up for debate and test, respectively. Hi Nis, Thanks for your tip with the "MIN" operator. I always imagined a self-join solution was faster than a query with a subselect. With a quick test, it seems to be the casehere. CREATE TABLE foo ( serial integer, delivery character(1), date integer ); INSERT INTO foo (serial, delivery, date) VALUES (1, 'L', 1); INSERT INTO foo (serial, delivery, date) VALUES (2, 'L', 1);INSERT INTO foo (serial, delivery, date) VALUES (3, 'L', 1); INSERT INTO foo (serial, delivery, date) VALUES (1, 'R',2); INSERT INTO foo (serial, delivery, date) VALUES (2, 'R', 2); INSERT INTO foo (serial, delivery, date) VALUES (4,'L', 2); INSERT INTO foo (serial, delivery, date) VALUES (5, 'L', 3); INSERT INTO foo (serial, delivery, date) VALUES(3, 'R', 3); INSERT INTO foo (serial, delivery, date) VALUES (4, 'R', 3); INSERT INTO foo (serial, delivery, date)VALUES (1, 'L', 4); -- Subselect SELECT f.serial, f.date as dateL, ( SELECT MIN(f2.date) FROM foo AS f2 WHERE f2.serial = f.serial AND f2.date > f.date AND f2.delivery = 'R' ) AS dateR FROM foo AS f WHERE f.delivery = 'L' ORDER BY f.serial, f.date -- Self-join SELECT t1.serial, t1.date as dateL, MIN(t2.date) as dateR FROM foo t1 LEFT JOIN foo t2 ON t1.serial = t2.serial AND t1.date < t2.date AND t2.delivery = 'R' WHERE t1.delivery = 'L' GROUP BY t1.serial, t1.date ORDER BY t1.serial, t1.date