Thread: Table transform query

Table transform query

From

"Philippe Lang"

Date:

18 September 2007, 10:26:53

Hi,

I'm trying to find out how to transform this kind of table data (history
of rental data in a firm):


date  serial  delivery
----------------------
1     1       L
1     2       L
1     3       L
2     1       R
2     2       R
2     4       L
3     5       L
3     3       R
3     4       R
4     1       L


... into this:


serial  dateL  dateR
--------------------
1       1      2
1       4
2       1      2
3       1      3
4       2      3
5       3


Basically, data on table 1 means:

- on date 1, product with serial 1 is sent to the customer
- on date 1, product with serial 2 is sent to the customer
- on date 1, product with serial 3 is sent to the customer
- on date 2, product with serial 1 comes back
...

On table 2, data means:

- Product with serial 1 is sent to the customer on date 1, and comes
back on date 2
- Product with serial 1 is sent to the customer on date 4, and hasn't
come back yet
...


Do you think there is a generic SQL solution to this problem, like
crosstab or pivot table?

I'm thinking of doing things in a "procedural" plpgsql manner...

Any idea is welcome.

Thanks!

Philippe

Re: Table transform query

From

"Philippe Lang"

Date:

18 September 2007, 10:50:08

pgsql-sql-owner@postgresql.org wrote:

> Hi,
>
> I'm trying to find out how to transform this kind of table data
> (history of rental data in a firm):

...

I have answred my own question: yes, there is a pure SQL solution, with
a subselect:

CREATE TABLE foo (   serial integer,   delivery character(1),   date integer
);

INSERT INTO foo (serial, delivery, date) VALUES (1, 'L', 1);
INSERT INTO foo (serial, delivery, date) VALUES (2, 'L', 1);
INSERT INTO foo (serial, delivery, date) VALUES (3, 'L', 1);
INSERT INTO foo (serial, delivery, date) VALUES (1, 'R', 2);
INSERT INTO foo (serial, delivery, date) VALUES (2, 'R', 2);
INSERT INTO foo (serial, delivery, date) VALUES (4, 'L', 2);
INSERT INTO foo (serial, delivery, date) VALUES (5, 'L', 3);
INSERT INTO foo (serial, delivery, date) VALUES (3, 'R', 3);
INSERT INTO foo (serial, delivery, date) VALUES (4, 'R', 3);
INSERT INTO foo (serial, delivery, date) VALUES (1, 'L', 4);


-------------
select

f.serial,
f.date as dateL,
( select f2.date from foo as f2  where f2.serial = f.serial  and f2.date > f.date  and f2.delivery = 'R'  order by
f2.dateasc  limit 1 
) as dateR

from foo as f
where f.delivery = 'L'
order by f.serial, f.date
-------------


I'm not sure if we could use a self-join here...


Cheers,


Philippe Lang

Re: Table transform query

From

"Rodrigo De León"

Date:

18 September 2007, 10:51:22

On 9/18/07, Philippe Lang <philippe.lang@attiksystem.ch> wrote:
> ... into this:
>
>
> serial  dateL  dateR
> --------------------
> 1       1      2
> 1       4
> 2       1      2
> 3       1      3
> 4       2      3
> 5       3

SELECT   t1.serial, t1.DATE AS datel, t2.DATE AS dater   FROM t t1 LEFT JOIN t t2 ON(    t1.serial = t2.serial
                    AND t1.DATE < t2.DATE)  WHERE t1.delivery = 'L'    AND (   t2.delivery = 'R'         OR t2.delivery
ISNULL)
 
ORDER BY t1.serial

Re: Table transform query

From

Nis Jørgensen

Date:

19 September 2007, 05:13:54

Rodrigo De León skrev:
> On 9/18/07, Philippe Lang <philippe.lang@attiksystem.ch> wrote:
>> ... into this:
>>
>>
>> serial  dateL  dateR
>> --------------------
>> 1       1      2
>> 1       4
>> 2       1      2
>> 3       1      3
>> 4       2      3
>> 5       3
> 
> SELECT   t1.serial, t1.DATE AS datel, t2.DATE AS dater
>     FROM t t1 LEFT JOIN t t2 ON(    t1.serial = t2.serial
>                                 AND t1.DATE < t2.DATE)
>    WHERE t1.delivery = 'L'
>      AND (   t2.delivery = 'R'
>           OR t2.delivery IS NULL)
> ORDER BY t1.serial

This only works if (serial, delivery) is unique - which it doesn't
appear to be, from the solution posted by Philippe himself (which does a
LIMIT 1 in the subquery).

A take on a self-join:

SELECT t1.serial, t1.date as dateL, MIN(t2.date) as dateR
FROM t t1 LEFT JOIN t t2
ON t1.serial = t2.serial AND t1.date < t2.date AND t2.delivery = 'R'
WHERE t1.delivery = 'L'
GROUP BY t1.serial, t1.date

Whether this is any clearer, or runs faster, than the correlated
subquery (which could be  simplified by using MIN instead of LIMIT 1) is
up for debate and test, respectively.

Nis

Re: Table transform query

From

"Philippe Lang"

Date:

21 September 2007, 05:25:53

> A take on a self-join:
>
> SELECT t1.serial, t1.date as dateL, MIN(t2.date) as dateR FROM t t1
> LEFT JOIN t t2 ON t1.serial = t2.serial AND t1.date < t2.date AND
> t2.delivery = 'R'
> WHERE t1.delivery = 'L'
> GROUP BY t1.serial, t1.date
>
> Whether this is any clearer, or runs faster, than the correlated
> subquery (which could be  simplified by using MIN instead of LIMIT 1)
> is up for debate and test, respectively.

Hi Nis,

Thanks for your tip with the "MIN" operator.


I always imagined a self-join solution was faster than a query with a subselect. With a quick test, it seems to be the
casehere. 



CREATE TABLE foo (   serial integer,   delivery character(1),   date integer
);

INSERT INTO foo (serial, delivery, date) VALUES (1, 'L', 1); INSERT INTO foo (serial, delivery, date) VALUES (2, 'L',
1);INSERT INTO foo (serial, delivery, date) VALUES (3, 'L', 1); INSERT INTO foo (serial, delivery, date) VALUES (1,
'R',2); INSERT INTO foo (serial, delivery, date) VALUES (2, 'R', 2); INSERT INTO foo (serial, delivery, date) VALUES
(4,'L', 2); INSERT INTO foo (serial, delivery, date) VALUES (5, 'L', 3); INSERT INTO foo (serial, delivery, date)
VALUES(3, 'R', 3); INSERT INTO foo (serial, delivery, date) VALUES (4, 'R', 3); INSERT INTO foo (serial, delivery,
date)VALUES (1, 'L', 4); 


-- Subselect
SELECT
f.serial,
f.date as dateL,
( SELECT MIN(f2.date) FROM foo AS f2 WHERE f2.serial = f.serial AND f2.date > f.date AND f2.delivery = 'R'
) AS dateR
FROM foo AS f
WHERE f.delivery = 'L'
ORDER BY f.serial, f.date

-- Self-join
SELECT
t1.serial,
t1.date as dateL,
MIN(t2.date) as dateR
FROM foo t1
LEFT JOIN foo t2
ON t1.serial = t2.serial
AND t1.date < t2.date
AND t2.delivery = 'R'
WHERE t1.delivery = 'L'
GROUP BY t1.serial, t1.date
ORDER BY t1.serial, t1.date