Thread: Using generate_series to create a unique ID in a query?

Using generate_series to create a unique ID in a query?

From
Sarah Dougherty
Date:
Hello,

I am trying to create a view that will contain a generated sequence
(unique ID), and am running into problems doing so.

For some context, I am trying to create a report that provides a list of
client charges and payments and a "running balance" after each
transaction. Because  we often have multiple charges and/or payments on
the same day, we can't use the transaction date to calculate this
balance.  Instead, I want to calculate our running balance by assigning
a transaction ID to each transaction a d then having the query sum up
transaction amounts for all transactions with an equal or lower ID.

I can use generate_series to produce a set of IDs, but can't get it to
join properly to the rest of my query.  For example, if I had 10 rows in
my query, I would get a series of 1 to 10, but would then get 100 rows
(10x10) in my result.  Ultimately the results of this query are going to
be used as a view, so I'd like to avoid creating a temp table, sequence,
etc. Does anyone know how to use generate_series in this manner, or know
of some other way I can go about this?  Thanks in advance!

To recap with an example, the query below works fine, but how do I add a
series to it?

SELECT * FROM (

SELECT
          client_id,
          effective_date AS transaction_date,
          amount AS charge_amount,
          0 AS payment_amount
   FROM    charge
   UNION
   SELECT
           client_id,
           payment_date AS transaction_date,
           0 as charge_amount,
           amount AS payment_amount
   FROM payment

) AS tmp
ORDER BY
   transaction_date,
   charge_amount<>0 /* order charges before payments */

Thanks,
Sarah Dougherty

Attachment

Re: Using generate_series to create a unique ID in a query?

From
"Pavel Stehule"
Date:
Hello

use temporary sequence instead.
postgres=#create temp sequence a;
CREATE SEQUENCE
postgres=# select nextval('a'), b FROM (values(1),(2),(10),(20)) b(b);
 nextval | b
---------+----
       1 |  1
       2 |  2
       3 | 10
       4 | 20
(4 rows)

Regards
Pavel Stehule

On 13/11/2007, Sarah Dougherty <sdougherty@desc.org> wrote:
> Hello,
>
> I am trying to create a view that will contain a generated sequence
> (unique ID), and am running into problems doing so.
>
> For some context, I am trying to create a report that provides a list of
> client charges and payments and a "running balance" after each
> transaction. Because  we often have multiple charges and/or payments on
> the same day, we can't use the transaction date to calculate this
> balance.  Instead, I want to calculate our running balance by assigning
> a transaction ID to each transaction a d then having the query sum up
> transaction amounts for all transactions with an equal or lower ID.
>
> I can use generate_series to produce a set of IDs, but can't get it to
> join properly to the rest of my query.  For example, if I had 10 rows in
> my query, I would get a series of 1 to 10, but would then get 100 rows
> (10x10) in my result.  Ultimately the results of this query are going to
> be used as a view, so I'd like to avoid creating a temp table, sequence,
> etc. Does anyone know how to use generate_series in this manner, or know
> of some other way I can go about this?  Thanks in advance!
>
> To recap with an example, the query below works fine, but how do I add a
> series to it?
>
> SELECT * FROM (
>
> SELECT
>           client_id,
>           effective_date AS transaction_date,
>           amount AS charge_amount,
>           0 AS payment_amount
>    FROM    charge
>    UNION
>    SELECT
>            client_id,
>            payment_date AS transaction_date,
>            0 as charge_amount,
>            amount AS payment_amount
>    FROM payment
>
> ) AS tmp
> ORDER BY
>    transaction_date,
>    charge_amount<>0 /* order charges before payments */
>
> Thanks,
> Sarah Dougherty
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
>
>
>

Re: Using generate_series to create a unique ID in a query?

From
Decibel!
Date:
On Nov 12, 2007, at 5:11 PM, Sarah Dougherty wrote:
> For some context, I am trying to create a report that provides a
> list of client charges and payments and a "running balance" after
> each transaction. Because  we often have multiple charges and/or
> payments on the same day, we can't use the transaction date to
> calculate this balance.  Instead, I want to calculate our running
> balance by assigning a transaction ID to each transaction a d then
> having the query sum up transaction amounts for all transactions
> with an equal or lower ID.

I think you'd be much better off writing a function that does this
for you... it'd have to accept and amount and then keep a running
total. Wouldn't be a bad idea to have a boolean you can pass in that
will reset the total, too... that would allow using it in a grouping
scenario. Erm, I guess you'd actually have to accept something like a
hash; pass in a hash of all the grouping fields and whenever that
changes you reset the total.
--
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828



Attachment

Re: Using generate_series to create a unique ID in a query?

From
hubert depesz lubaczewski
Date:
On Mon, Nov 12, 2007 at 03:11:50PM -0800, Sarah Dougherty wrote:
> To recap with an example, the query below works fine, but how do I add a
> series to it?

generate_series will not help with this.
try the sequence approach, or this:
http://www.depesz.com/index.php/2007/08/17/rownum-anyone-cumulative-sum-in-one-query/

best regards,

depesz

--
quicksil1er: "postgres is excellent, but like any DB it requires a
highly paid DBA.  here's my CV!" :)
http://www.depesz.com/ - blog dla ciebie (i moje CV)

Re: Using generate_series to create a unique ID in a query?

From
Tom Lane
Date:
hubert depesz lubaczewski <depesz@depesz.com> writes:
> On Mon, Nov 12, 2007 at 03:11:50PM -0800, Sarah Dougherty wrote:
>> To recap with an example, the query below works fine, but how do I add a
>> series to it?

> generate_series will not help with this.
> try the sequence approach, or this:
> http://www.depesz.com/index.php/2007/08/17/rownum-anyone-cumulative-sum-in-one-query/

That's a fairly ugly/messy way of doing it.  If you're going to need a C
function anyway, why not just do it directly?  As in the attachment.

regression=# create function rownum() returns int as '/home/tgl/pgsql/rownum'
regression-# language c;
CREATE FUNCTION

One thing you have to watch out for is that per spec, ORDER BY happens
after evaluation of the SELECT's targetlist, and in fact PG will usually
do it that way if an explicit sort is needed.  So for example, this
works fine:

regression=# select rownum(),* from int8_tbl;
 rownum |        q1        |        q2
--------+------------------+-------------------
      1 |              123 |               456
      2 |              123 |  4567890123456789
      3 | 4567890123456789 |               123
      4 | 4567890123456789 |  4567890123456789
      5 | 4567890123456789 | -4567890123456789
(5 rows)

but this will not give the desired results:

regression=# select rownum(),* from int8_tbl order by q2;
 rownum |        q1        |        q2
--------+------------------+-------------------
      5 | 4567890123456789 | -4567890123456789
      3 | 4567890123456789 |               123
      1 |              123 |               456
      2 |              123 |  4567890123456789
      4 | 4567890123456789 |  4567890123456789
(5 rows)

You can work around it with a subselect:

regression=# select rownum(),* from (select * from int8_tbl order by q2) ss;
 rownum |        q1        |        q2
--------+------------------+-------------------
      1 | 4567890123456789 | -4567890123456789
      2 | 4567890123456789 |               123
      3 |              123 |               456
      4 |              123 |  4567890123456789
      5 | 4567890123456789 |  4567890123456789
(5 rows)

However, that bit of ugliness is enough to dissuade me from wanting to
put this into core PG ...

            regards, tom lane

#include "postgres.h"

#include "fmgr.h"

#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

Datum rownum(PG_FUNCTION_ARGS);

PG_FUNCTION_INFO_V1(rownum);

Datum
rownum(PG_FUNCTION_ARGS)
{
    int32       *ptr;

    ptr = (int32 *) fcinfo->flinfo->fn_extra;
    if (ptr == NULL)
    {
        /* First time through for the current query: allocate storage */
        fcinfo->flinfo->fn_extra = MemoryContextAlloc(fcinfo->flinfo->fn_mcxt,
                                                      sizeof(int32));
        ptr = (int32 *) fcinfo->flinfo->fn_extra;
        /* ... and initialize counter */
        *ptr = 1;
    }
    else
        (*ptr)++;

    PG_RETURN_INT32(*ptr);
}

Re: Using generate_series to create a unique ID in a query?

From
hubert depesz lubaczewski
Date:
On Wed, Nov 14, 2007 at 10:26:52AM -0500, Tom Lane wrote:
> That's a fairly ugly/messy way of doing it.  If you're going to need a C
> function anyway, why not just do it directly?  As in the attachment.

actually you dont have to do it in c.
alec pointed (in comments) that there already is statement_timestamp()
function, so you can remove the c code, and use statement_timestamp()
instead of get_statement_timestamp().

depesz

--
quicksil1er: "postgres is excellent, but like any DB it requires a
highly paid DBA.  here's my CV!" :)
http://www.depesz.com/ - blog dla ciebie (i moje CV)

Re: Using generate_series to create a unique ID in a query?

From
Tom Lane
Date:
hubert depesz lubaczewski <depesz@depesz.com> writes:
> On Wed, Nov 14, 2007 at 10:26:52AM -0500, Tom Lane wrote:
>> That's a fairly ugly/messy way of doing it.  If you're going to need a C
>> function anyway, why not just do it directly?  As in the attachment.

> actually you dont have to do it in c.
> alec pointed (in comments) that there already is statement_timestamp()
> function, so you can remove the c code, and use statement_timestamp()
> instead of get_statement_timestamp().

Using statement_timestamp that way at all is pretty horrid, because
it has approximately zip to do with the concept of a query.  For
instance your approach would fail in a query used inside a function that
is called more than once in a user-issued command.  Nor do I care for
the idea that the user should have to assign a distinct name to each
use of the function.  Lastly, statement_timestamp isn't there at all
before 8.2 ...

            regards, tom lane

Re: Using generate_series to create a unique ID in a query?

From
"Jan de Visser"
Date:
On 11/14/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> hubert depesz lubaczewski <depesz@depesz.com> writes:
> > On Mon, Nov 12, 2007 at 03:11:50PM -0800, Sarah Dougherty wrote:
> >> To recap with an example, the query below works fine, but how do I add a
> >> series to it?
>
> > generate_series will not help with this.
> > try the sequence approach, or this:
> > http://www.depesz.com/index.php/2007/08/17/rownum-anyone-cumulative-sum-in-one-query/
>
> That's a fairly ugly/messy way of doing it.  If you're going to need a C
> function anyway, why not just do it directly?  As in the attachment.
>
> regression=# create function rownum() returns int as '/home/tgl/pgsql/rownum'
> regression-# language c;
> CREATE FUNCTION

Any reason why this couldn't appear in the core of some future
version? I've been wanting something like this a couple of times
before. Note that Oracle has it as well.

jan

Re: Using generate_series to create a unique ID in a query?

From
Tom Lane
Date:
"Jan de Visser" <jdevisser@digitalfairway.com> writes:
> Any reason why this couldn't appear in the core of some future
> version?

You didn't read to the end of my post ;-).  If a rownum() function
like this didn't have any gotchas, I'd be in favor of putting it in,
but I don't really want to set the behavior in stone just yet.

            regards, tom lane

Re: Using generate_series to create a unique ID in a query?

From
"Jan de Visser"
Date:
On 11/14/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Jan de Visser" <jdevisser@digitalfairway.com> writes:
> > Any reason why this couldn't appear in the core of some future
> > version?
>
> You didn't read to the end of my post ;-).  If a rownum() function
> like this didn't have any gotchas, I'd be in favor of putting it in,
> but I don't really want to set the behavior in stone just yet.

<g> That's me, the ADHD getting the better off.... Oh look, waffles!

:)

jan