Thread: Difference in columns

Difference in columns

From
"Mag Gam"
Date:
Hi All,<br /><br />I have a view that generates output similar to this.<br /><br />select * from foo.view;<br /><br
/><preclass="PROGRAMLISTING">       ts          | size<br />-------------------+-----<br /> 2002-03-16        | 11 <br
/>2002-03-17        | 16 <br />2002-03-18        | 18<br /> 2002-03-19        | 12 
 
<br /></pre>I am trying to find the difference between the size column. So the desired output would be<br /><br /><pre
class="PROGRAMLISTING">      ts          | size| Diff<br />-------------------+-----+------<br /> 2002-03-16        |
11 | 0<br />2002-03-17        | 15  | 4<br /> 2002-03-18        | 18  | 3<br /> 2002-03-19        | 12  | -6<br
/></pre><br/>I need the first column to be 0, since it will be 11-11. The second colum is 15-11. The third column is
18-15.The fourth column is 12-18. <br /><br />Any thoughts about this?<br /><br />TIA<br /> 

Re: Difference in columns

From
chester c young
Date:
--- Mag Gam <magawake@gmail.com> wrote:

> Hi All,
> 
> I have a view that generates output similar to this.
> 
> select * from foo.view;
> 
>        ts          | size
> -------------------+-----
>  2002-03-16        | 11
>  2002-03-17        | 16
>  2002-03-18        | 18
>  2002-03-19        | 12
> 
> I am trying to find the difference between the size column. So the
> desired
> output would be
> 
>        ts          | size| Diff
> -------------------+-----+------
>  2002-03-16        | 11  | 0
>  2002-03-17        | 15  | 4
>  2002-03-18        | 18  | 3
>  2002-03-19        | 12  | -6
> 
> 
> I need the first column to be 0, since it will be 11-11. The second
> colum is
> 15-11. The third column is 18-15. The fourth column is 12-18.
> 
> Any thoughts about this?
> 

select cur.ts, cur.size,  cur.size - coalesce(    (select size from view next      where next.ts = cur.ts - '1
day'::interval),    cur.size ) as diff
 
from view cur;

alternately:

select cur.ts, cur.size,  case when cur.ts = '2002-03-16' then 0 else cur.size - coalesce(    (select size from view
next     where next.ts = cur.ts - '1 day'::interval),     cur.size ) end as diff
 
from view;


     ____________________________________________________________________________________
Be a better friend, newshound, and 
know-it-all with Yahoo! Mobile.  Try it now.  http://mobile.yahoo.com/;_ylt=Ahu06i62sR8HDtDypao8Wcj9tAcJ


Re: Difference in columns

From
"Gurjeet Singh"
Date:
On Sun, May 11, 2008 at 11:07 PM, Mag Gam <magawake@gmail.com> wrote:
Hi All,

I have a view that generates output similar to this.

select * from foo.view;

       ts          | size
-------------------+-----
2002-03-16 | 11
2002-03-17 | 16
2002-03-18 | 18
 2002-03-19 | 12
I am trying to find the difference between the size column. So the desired output would be

       ts          | size| Diff
-------------------+-----+------
2002-03-16 | 11 | 0
2002-03-17 | 15 | 4
2002-03-18 | 18 | 3
 2002-03-19 | 12 | -6

I need the first column to be 0, since it will be 11-11. The second colum is 15-11. The third column is 18-15. The fourth column is 12-18.

Any thoughts about this?

Try this:

select   ts,
            size,
            t1.size - (select t2.size
                            from foo.view as t2
                            where t2.ts < t1.ts
                            order by ts desc
                            limit 1) as diff
from foo.view as t1
order by ts asc;

HTH,

--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB http://www.enterprisedb.com

Mail sent from my BlackLaptop device

Re: Difference in columns

From
hubert depesz lubaczewski
Date:
On Sun, May 11, 2008 at 01:37:52PM -0400, Mag Gam wrote:
> Any thoughts about this?

1. will there be any gaps in between dates? if yes, what should be diff
be then?
2. can't you calculate it in client application?
3. is usage of pl/pgsql acceptable (i think this will be the fastest way
to do it in postgresql itself).

depesz

-- 
quicksil1er: "postgres is excellent, but like any DB it requires a
highly paid DBA.  here's my CV!" :)
http://www.depesz.com/ - blog dla ciebie (i moje CV)


Re: Difference in columns

From
Craig Ringer
Date:
Mag Gam wrote:

> 
>        ts          | size| Diff
> -------------------+-----+------
>  2002-03-16        | 11  | 0
> 
>  2002-03-17        | 15  | 4
>  2002-03-18        | 18  | 3
>  2002-03-19        | 12  | -6
> 
> 
> I need the first column to be 0, since it will be 11-11. The second
> colum is 15-11. The third column is 18-15. The fourth column is 12-18.
> 
> Any thoughts about this?

Without making any comments on the advisability of the structure you're
trying to use, here are a few ideas.

The easy way is to use PL/PgSQL and FOR EACH .. SELECT . It's probably
going to be rather fast too as it can use a single sequential scan.

Otherwise (all examples use the following code):

CREATE TABLE x (ts timestamp, size int);
INSERT INTO x (ts, size) VALUES
('2002-03-16',11),
('2002-03-17',15),
('2002-03-18',18),
('2002-03-19',12);

If you can assume that there is always exactly 1 day between entries
then it's easy enough with a self join.

If you cannot assume that, you can use a subquery with limit and order
by to obtain the next record:

SELECT a.ts, (SELECT b.size FROM x b WHERE b.ts > a.ts ORDER BY b.ts ASC LIMIT 1)   - a.size AS difference
FROM x a;

... but that'll be really slow for any significant number of entries.


Another alternative if you can't assume each record is always exactly 1
day apart is to populate a temp table with the values and add a serial
column that guarantees a 1 offset between values, then do a self join. I
have no idea whether or not this might be faster, but thought I'd throw
it out there as an alternative:

CREATE TEMPORARY SEQUENCE x_seq;

SELECT nextval('x_seq') AS id, ts, size
INTO TEMPORARY TABLE x_temp
FROM x
ORDER BY ts ASC;

SELECT a.ts, a.size - b.size AS diff
FROM x_temp a, x_temp b
WHERE a.id = b.id + 1;

Note that this query doesn't give you the first record with zero
difference; it returns only true differences. Here's one possible way to
add your initial record:

SELECT a.ts, b.size - a.size AS diff
FROM x_temp a, x_temp b
WHERE b.id = a.id + 1
OR (b.id = (SELECT min(id) FROM x_temp) AND a.id = b.id);

--
Craig Ringer


Re: Difference in columns

From
"Gurjeet Singh"
Date:
On Sun, May 11, 2008 at 11:47 PM, Craig Ringer <craig@postnewspapers.com.au> wrote:
Mag Gam wrote:

>
>        ts          | size| Diff
> -------------------+-----+------
>  2002-03-16        | 11  | 0
>
>  2002-03-17        | 15  | 4
>  2002-03-18        | 18  | 3
>  2002-03-19        | 12  | -6
>
>
> I need the first column to be 0, since it will be 11-11. The second
> colum is 15-11. The third column is 18-15. The fourth column is 12-18.
>
> Any thoughts about this?

Without making any comments on the advisability of the structure you're
trying to use, here are a few ideas.

The easy way is to use PL/PgSQL and FOR EACH .. SELECT . It's probably
going to be rather fast too as it can use a single sequential scan.

Otherwise (all examples use the following code):

CREATE TABLE x (ts timestamp, size int);
INSERT INTO x (ts, size) VALUES
('2002-03-16',11),
('2002-03-17',15),
('2002-03-18',18),
('2002-03-19',12);

If you can assume that there is always exactly 1 day between entries
then it's easy enough with a self join.

If you cannot assume that, you can use a subquery with limit and order
by to obtain the next record:

SELECT
 a.ts,
 (SELECT b.size FROM x b WHERE b.ts > a.ts ORDER BY b.ts ASC LIMIT 1)
   - a.size AS difference
FROM x a;

... but that'll be really slow for any significant number of entries.

not really... if you have an index on the TS column.

Best regards,
--
gurjeet[.singh]@EnterpriseDB.com
singh.gurjeet@{ gmail | hotmail | indiatimes | yahoo }.com

EnterpriseDB http://www.enterprisedb.com

Mail sent from my BlackLaptop device

Re: Difference in columns

From
Tom Lane
Date:
"Gurjeet Singh" <singh.gurjeet@gmail.com> writes:
> On Sun, May 11, 2008 at 11:47 PM, Craig Ringer <craig@postnewspapers.com.au>
>> If you cannot assume that, you can use a subquery with limit and order
>> by to obtain the next record:
>> 
>> SELECT
>> a.ts,
>> (SELECT b.size FROM x b WHERE b.ts > a.ts ORDER BY b.ts ASC LIMIT 1)
>> - a.size AS difference
>> FROM x a;
>> 
>> ... but that'll be really slow for any significant number of entries.

> not really... if you have an index on the TS column.

The OP said this was a view, so it may well not have any easy way to
provide such an index.
        regards, tom lane


Re: Difference in columns

From
Craig Ringer
Date:
Mag Gam wrote:

> I am trying to find the difference between the size column. So the
> desired output would be
> 
>        ts          | size| Diff
> -------------------+-----+------
>  2002-03-16        | 11  | 0
> 
>  2002-03-17        | 15  | 4
>  2002-03-18        | 18  | 3
>  2002-03-19        | 12  | -6
> 
> 
> I need the first column to be 0, since it will be 11-11. The second
> colum is 15-11. The third column is 18-15. The fourth column is 12-18.
> 
> Any thoughts about this?

Here's one way to do this with PL/PgSQL. It's probably not the most
efficient, but it does work. For this code to be safe `size' must never
be NULL and `ts' must be unique across all records in the input set.

CREATE OR REPLACE FUNCTION x_diff(   OUT ts TIMESTAMP,   OUT size INTEGER,   OUT diff INTEGER)
RETURNS SETOF record AS $$
DECLARE   cur_x x;   last_size INTEGER := null;
BEGIN   FOR cur_x IN SELECT * FROM x ORDER BY ts ASC LOOP       ts := cur_x.ts;       size := cur_x.size;       IF
last_sizeIS NULL THEN           -- First record in set has diff `0' because the differences           -- are defined
againstthe previous, rather than next,           -- record.           diff := 0;       ELSE           diff :=
cur_x.size- last_size;       END IF;       last_size := cur_x.size;       RETURN NEXT;   END LOOP;   RETURN;
 
END;
$$ LANGUAGE 'plpgsql' STRICT;

If you need to constrain the range of values processed that's not too
tricky - either feed the function a refcursor for a query result set to
iterate over, or pass it parameters to constrain the query with a WHERE
clause. The former is more flexible, the latter is easier to use.

--
Craig Ringer


Re: Difference in columns

From
"Mag Gam"
Date:
Thanks all. I will try some of these suggestions. <br /><br /><br /><div class="gmail_quote">On Sun, May 11, 2008 at
3:58PM, Craig Ringer <<a href="mailto:craig@postnewspapers.com.au">craig@postnewspapers.com.au</a>> wrote:<br
/><blockquoteclass="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex;
padding-left:1ex;"><div class="Ih2E3d">Mag Gam wrote:<br /><br /> > I am trying to find the difference between the
sizecolumn. So the<br /> > desired output would be<br /> ><br /> >        ts          | size| Diff<br /> >
-------------------+-----+------<br/> >  2002-03-16        | 11  | 0<br /> ><br /> >  2002-03-17        | 15
 |4<br /> >  2002-03-18        | 18  | 3<br /> >  2002-03-19        | 12  | -6<br /> ><br /> ><br /> > I
needthe first column to be 0, since it will be 11-11. The second<br /> > colum is 15-11. The third column is 18-15.
Thefourth column is 12-18.<br /> ><br /> > Any thoughts about this?<br /><br /></div>Here's one way to do this
withPL/PgSQL. It's probably not the most<br /> efficient, but it does work. For this code to be safe `size' must
never<br/> be NULL and `ts' must be unique across all records in the input set.<br /><br /> CREATE OR REPLACE FUNCTION
x_diff(<br/>    OUT ts TIMESTAMP,<br />    OUT size INTEGER,<br />    OUT diff INTEGER)<br /> RETURNS SETOF record AS
$$<br/> DECLARE<br />    cur_x x;<br />    last_size INTEGER := null;<br /> BEGIN<br />    FOR cur_x IN SELECT * FROM x
ORDERBY ts ASC LOOP<br />        ts := cur_x.ts;<br />        size := cur_x.size;<br />        IF last_size IS NULL
THEN<br/>            -- First record in set has diff `0' because the differences<br />            -- are defined
againstthe previous, rather than next,<br />            -- record.<br />            diff := 0;<br />        ELSE<br />
          diff := cur_x.size - last_size;<br />        END IF;<br />        last_size := cur_x.size;<br />      
 RETURNNEXT;<br />    END LOOP;<br />    RETURN;<br /> END;<br /> $$ LANGUAGE 'plpgsql' STRICT;<br /><br /> If you need
toconstrain the range of values processed that's not too<br /> tricky - either feed the function a refcursor for a
queryresult set to<br /> iterate over, or pass it parameters to constrain the query with a WHERE<br /> clause. The
formeris more flexible, the latter is easier to use.<br /><br /> --<br /><font color="#888888">Craig Ringer<br
/></font></blockquote></div><br/>