Thread: Using generate_series to create a unique ID in a query?
Hello, I am trying to create a view that will contain a generated sequence (unique ID), and am running into problems doing so. For some context, I am trying to create a report that provides a list of client charges and payments and a "running balance" after each transaction. Because we often have multiple charges and/or payments on the same day, we can't use the transaction date to calculate this balance. Instead, I want to calculate our running balance by assigning a transaction ID to each transaction a d then having the query sum up transaction amounts for all transactions with an equal or lower ID. I can use generate_series to produce a set of IDs, but can't get it to join properly to the rest of my query. For example, if I had 10 rows in my query, I would get a series of 1 to 10, but would then get 100 rows (10x10) in my result. Ultimately the results of this query are going to be used as a view, so I'd like to avoid creating a temp table, sequence, etc. Does anyone know how to use generate_series in this manner, or know of some other way I can go about this? Thanks in advance! To recap with an example, the query below works fine, but how do I add a series to it? SELECT * FROM ( SELECT client_id, effective_date AS transaction_date, amount AS charge_amount, 0 AS payment_amount FROM charge UNION SELECT client_id, payment_date AS transaction_date, 0 as charge_amount, amount AS payment_amount FROM payment ) AS tmp ORDER BY transaction_date, charge_amount<>0 /* order charges before payments */ Thanks, Sarah Dougherty
Attachment
Hello use temporary sequence instead. postgres=#create temp sequence a; CREATE SEQUENCE postgres=# select nextval('a'), b FROM (values(1),(2),(10),(20)) b(b); nextval | b ---------+---- 1 | 1 2 | 2 3 | 10 4 | 20 (4 rows) Regards Pavel Stehule On 13/11/2007, Sarah Dougherty <sdougherty@desc.org> wrote: > Hello, > > I am trying to create a view that will contain a generated sequence > (unique ID), and am running into problems doing so. > > For some context, I am trying to create a report that provides a list of > client charges and payments and a "running balance" after each > transaction. Because we often have multiple charges and/or payments on > the same day, we can't use the transaction date to calculate this > balance. Instead, I want to calculate our running balance by assigning > a transaction ID to each transaction a d then having the query sum up > transaction amounts for all transactions with an equal or lower ID. > > I can use generate_series to produce a set of IDs, but can't get it to > join properly to the rest of my query. For example, if I had 10 rows in > my query, I would get a series of 1 to 10, but would then get 100 rows > (10x10) in my result. Ultimately the results of this query are going to > be used as a view, so I'd like to avoid creating a temp table, sequence, > etc. Does anyone know how to use generate_series in this manner, or know > of some other way I can go about this? Thanks in advance! > > To recap with an example, the query below works fine, but how do I add a > series to it? > > SELECT * FROM ( > > SELECT > client_id, > effective_date AS transaction_date, > amount AS charge_amount, > 0 AS payment_amount > FROM charge > UNION > SELECT > client_id, > payment_date AS transaction_date, > 0 as charge_amount, > amount AS payment_amount > FROM payment > > ) AS tmp > ORDER BY > transaction_date, > charge_amount<>0 /* order charges before payments */ > > Thanks, > Sarah Dougherty > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: Don't 'kill -9' the postmaster > > >
On Nov 12, 2007, at 5:11 PM, Sarah Dougherty wrote: > For some context, I am trying to create a report that provides a > list of client charges and payments and a "running balance" after > each transaction. Because we often have multiple charges and/or > payments on the same day, we can't use the transaction date to > calculate this balance. Instead, I want to calculate our running > balance by assigning a transaction ID to each transaction a d then > having the query sum up transaction amounts for all transactions > with an equal or lower ID. I think you'd be much better off writing a function that does this for you... it'd have to accept and amount and then keep a running total. Wouldn't be a bad idea to have a boolean you can pass in that will reset the total, too... that would allow using it in a grouping scenario. Erm, I guess you'd actually have to accept something like a hash; pass in a hash of all the grouping fields and whenever that changes you reset the total. -- Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828
Attachment
On Mon, Nov 12, 2007 at 03:11:50PM -0800, Sarah Dougherty wrote: > To recap with an example, the query below works fine, but how do I add a > series to it? generate_series will not help with this. try the sequence approach, or this: http://www.depesz.com/index.php/2007/08/17/rownum-anyone-cumulative-sum-in-one-query/ best regards, depesz -- quicksil1er: "postgres is excellent, but like any DB it requires a highly paid DBA. here's my CV!" :) http://www.depesz.com/ - blog dla ciebie (i moje CV)
hubert depesz lubaczewski <depesz@depesz.com> writes: > On Mon, Nov 12, 2007 at 03:11:50PM -0800, Sarah Dougherty wrote: >> To recap with an example, the query below works fine, but how do I add a >> series to it? > generate_series will not help with this. > try the sequence approach, or this: > http://www.depesz.com/index.php/2007/08/17/rownum-anyone-cumulative-sum-in-one-query/ That's a fairly ugly/messy way of doing it. If you're going to need a C function anyway, why not just do it directly? As in the attachment. regression=# create function rownum() returns int as '/home/tgl/pgsql/rownum' regression-# language c; CREATE FUNCTION One thing you have to watch out for is that per spec, ORDER BY happens after evaluation of the SELECT's targetlist, and in fact PG will usually do it that way if an explicit sort is needed. So for example, this works fine: regression=# select rownum(),* from int8_tbl; rownum | q1 | q2 --------+------------------+------------------- 1 | 123 | 456 2 | 123 | 4567890123456789 3 | 4567890123456789 | 123 4 | 4567890123456789 | 4567890123456789 5 | 4567890123456789 | -4567890123456789 (5 rows) but this will not give the desired results: regression=# select rownum(),* from int8_tbl order by q2; rownum | q1 | q2 --------+------------------+------------------- 5 | 4567890123456789 | -4567890123456789 3 | 4567890123456789 | 123 1 | 123 | 456 2 | 123 | 4567890123456789 4 | 4567890123456789 | 4567890123456789 (5 rows) You can work around it with a subselect: regression=# select rownum(),* from (select * from int8_tbl order by q2) ss; rownum | q1 | q2 --------+------------------+------------------- 1 | 4567890123456789 | -4567890123456789 2 | 4567890123456789 | 123 3 | 123 | 456 4 | 123 | 4567890123456789 5 | 4567890123456789 | 4567890123456789 (5 rows) However, that bit of ugliness is enough to dissuade me from wanting to put this into core PG ... regards, tom lane #include "postgres.h" #include "fmgr.h" #ifdef PG_MODULE_MAGIC PG_MODULE_MAGIC; #endif Datum rownum(PG_FUNCTION_ARGS); PG_FUNCTION_INFO_V1(rownum); Datum rownum(PG_FUNCTION_ARGS) { int32 *ptr; ptr = (int32 *) fcinfo->flinfo->fn_extra; if (ptr == NULL) { /* First time through for the current query: allocate storage */ fcinfo->flinfo->fn_extra = MemoryContextAlloc(fcinfo->flinfo->fn_mcxt, sizeof(int32)); ptr = (int32 *) fcinfo->flinfo->fn_extra; /* ... and initialize counter */ *ptr = 1; } else (*ptr)++; PG_RETURN_INT32(*ptr); }
On Wed, Nov 14, 2007 at 10:26:52AM -0500, Tom Lane wrote: > That's a fairly ugly/messy way of doing it. If you're going to need a C > function anyway, why not just do it directly? As in the attachment. actually you dont have to do it in c. alec pointed (in comments) that there already is statement_timestamp() function, so you can remove the c code, and use statement_timestamp() instead of get_statement_timestamp(). depesz -- quicksil1er: "postgres is excellent, but like any DB it requires a highly paid DBA. here's my CV!" :) http://www.depesz.com/ - blog dla ciebie (i moje CV)
hubert depesz lubaczewski <depesz@depesz.com> writes: > On Wed, Nov 14, 2007 at 10:26:52AM -0500, Tom Lane wrote: >> That's a fairly ugly/messy way of doing it. If you're going to need a C >> function anyway, why not just do it directly? As in the attachment. > actually you dont have to do it in c. > alec pointed (in comments) that there already is statement_timestamp() > function, so you can remove the c code, and use statement_timestamp() > instead of get_statement_timestamp(). Using statement_timestamp that way at all is pretty horrid, because it has approximately zip to do with the concept of a query. For instance your approach would fail in a query used inside a function that is called more than once in a user-issued command. Nor do I care for the idea that the user should have to assign a distinct name to each use of the function. Lastly, statement_timestamp isn't there at all before 8.2 ... regards, tom lane
On 11/14/07, Tom Lane <tgl@sss.pgh.pa.us> wrote: > hubert depesz lubaczewski <depesz@depesz.com> writes: > > On Mon, Nov 12, 2007 at 03:11:50PM -0800, Sarah Dougherty wrote: > >> To recap with an example, the query below works fine, but how do I add a > >> series to it? > > > generate_series will not help with this. > > try the sequence approach, or this: > > http://www.depesz.com/index.php/2007/08/17/rownum-anyone-cumulative-sum-in-one-query/ > > That's a fairly ugly/messy way of doing it. If you're going to need a C > function anyway, why not just do it directly? As in the attachment. > > regression=# create function rownum() returns int as '/home/tgl/pgsql/rownum' > regression-# language c; > CREATE FUNCTION Any reason why this couldn't appear in the core of some future version? I've been wanting something like this a couple of times before. Note that Oracle has it as well. jan
"Jan de Visser" <jdevisser@digitalfairway.com> writes: > Any reason why this couldn't appear in the core of some future > version? You didn't read to the end of my post ;-). If a rownum() function like this didn't have any gotchas, I'd be in favor of putting it in, but I don't really want to set the behavior in stone just yet. regards, tom lane
On 11/14/07, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Jan de Visser" <jdevisser@digitalfairway.com> writes: > > Any reason why this couldn't appear in the core of some future > > version? > > You didn't read to the end of my post ;-). If a rownum() function > like this didn't have any gotchas, I'd be in favor of putting it in, > but I don't really want to set the behavior in stone just yet. <g> That's me, the ADHD getting the better off.... Oh look, waffles! :) jan