Thread: Special grouping on sorted data.
Hi, I was looking for a simple solution to this problem. I can't find a way to group on b and n by just collapsing sequential n's (identical n's right next to each other) the sorting condition is the timestamp. b | n | stamp ----------------------------------------A | 1 | 2008-09-20 06:07:47.981445A | 1 | 2008-09-20 06:08:13.294306A | 1 | 2008-09-20 06:12:02.046596A | 2 | 2008-09-20 06:12:26.267786A | 2 | 2008-09-20 06:12:47.750429A | 1 |2008-09-20 06:13:12.152512A | 2 | 2008-09-20 06:13:39.052528A | 2 | 2008-09-20 06:14:12.875389B | 1 | 2008-09-2006:14:29.963352B | 1 | 2008-09-20 06:14:52.247307B | 3 | 2008-09-20 06:15:13.358151B | 3 | 2008-09-2006:15:44.307792B | 3 | 2008-09-20 06:16:17.32131B | 2 | 2008-09-20 06:16:44.030435B | 2 | 2008-09-2006:17:00.140907C | 1 | 2008-09-20 06:17:50.067258C | 1 | 2008-09-20 06:18:22.280218C | 1 | 2008-09-2006:18:41.661213C | 1 | 2008-09-20 06:19:07.920327C | 3 | 2008-09-20 06:19:26.166675C | 2 | 2008-09-2006:19:46.459439C | 2 | 2008-09-20 06:20:04.634328 Here I give an example of the output I'm looking for, And I can find a way to do that in crystal report, but I would like postgresql to send it that way. If the next n is different create a new row. b | n | min_stamp | max_stamp --------------------------------------------------------------------- A | 1 | 2008-09-20 06:07:47.981445 | 2008-09-2006:12:02.046596 A | 2 | 2008-09-20 06:12:26.267786 | 2008-09-20 06:12:47.750429 A | 1 | 2008-09-20 06:13:12.152512| 2008-09-20 06:13:12.152512 A | 2 | 2008-09-20 06:13:39.052528 | 2008-09-20 06:14:12.875389 B | 1 | 2008-09-20 06:14:29.963352 | 2008-09-20 06:14:52.247307 B | 3 | 2008-09-20 06:15:13.358151 | 2008-09-20 06:16:17.32131B | 2 | 2008-09-20 06:16:44.030435 | 2008-09-20 06:17:00.140907 C | 1 | 2008-09-20 06:17:50.067258| 2008-09-20 06:19:07.920327 C | 3 | 2008-09-20 06:19:26.166675 | 2008-09-20 06:19:26.166675 C | 2 | 2008-09-20 06:19:46.459439 | 2008-09-20 06:20:04.634328 If I write a GROUP BY b,n there is going to be only two rows for b = 'A', if I use distinct on i get the same thing is there an easy way to do that ? Nicolas
Nicolas Beuzeboc <nicolasb@norchemlab.com> writes: > I was looking for a simple solution to this problem. I can't find a way > to group on b and n by just collapsing sequential n's (identical n's > right next to each other) the sorting condition is the timestamp. I'm not totally certain that I understand what you need, but it sounds like the SELECT DISTINCT ON feature might solve it for you. Look into our SELECT reference page at the "weather reports" example. regards, tom lane
Nicolas Beuzeboc wrote: > Hi, > > I was looking for a simple solution to this problem. I can't find a way > to group on b and n by just collapsing sequential n's (identical n's > right next to each other) the sorting condition is the timestamp. > > b | n | stamp > ---------------------------------------- > A | 1 | 2008-09-20 06:07:47.981445 [1] > A | 1 | 2008-09-20 06:08:13.294306 [1] > A | 1 | 2008-09-20 06:12:02.046596 [1] > A | 2 | 2008-09-20 06:12:26.267786 [2] > A | 2 | 2008-09-20 06:12:47.750429 [2] > A | 1 | 2008-09-20 06:13:12.152512 [3] > A | 2 | 2008-09-20 06:13:39.052528 [4] > A | 2 | 2008-09-20 06:14:12.875389 [4] > B | 1 | 2008-09-20 06:14:29.963352 [5] > B | 1 | 2008-09-20 06:14:52.247307 [5] > B | 3 | 2008-09-20 06:15:13.358151 [6] > B | 3 | 2008-09-20 06:15:44.307792 [6] > B | 3 | 2008-09-20 06:16:17.32131 [6] > B | 2 | 2008-09-20 06:16:44.030435 [7] > B | 2 | 2008-09-20 06:17:00.140907 [7] > C | 1 | 2008-09-20 06:17:50.067258 [8] > C | 1 | 2008-09-20 06:18:22.280218 [8] > C | 1 | 2008-09-20 06:18:41.661213 [8] > C | 1 | 2008-09-20 06:19:07.920327 [8] > C | 3 | 2008-09-20 06:19:26.166675 [9] > C | 2 | 2008-09-20 06:19:46.459439 [10] > C | 2 | 2008-09-20 06:20:04.634328 [10] I'd be tempted to use a set-returning PL/PgSQL function to process an input set ordered by stamp and return a result whenever the (b,n) pair changed. I'm sure there's a cleverer set-oriented approach, but it's eluding me at present. You need a way to express the notion of "contiguous runs of (b,n)" which doesn't really exist in (set-oriented) SQL. > Here I give an example of the output I'm looking for, And I can find a > way to do that in crystal report, but I would like postgresql to send it > that way. If the next n is different create a new row. I suspect that Crystal Reports may be pulling the whole data set from PostgreSQL then doing its processing client-side. Try turning on query logging in the server and running your report. See what SQL Crystal Reports actually executes. -- Craig Ringer
Craig Ringer wrote: >> >> b | n | stamp >> ---------------------------------------- >> A | 1 | 2008-09-20 06:07:47.981445 [1] >> A | 1 | 2008-09-20 06:08:13.294306 [1] >> A | 1 | 2008-09-20 06:12:02.046596 [1] >> A | 2 | 2008-09-20 06:12:26.267786 [2] >> A | 2 | 2008-09-20 06:12:47.750429 [2] >> A | 1 | 2008-09-20 06:13:12.152512 [3] >> A | 2 | 2008-09-20 06:13:39.052528 [4] >> A | 2 | 2008-09-20 06:14:12.875389 [4] >> > > I'd be tempted to use a set-returning PL/PgSQL function to process an > input set ordered by stamp and return a result whenever the (b,n) pair > changed. I'm sure there's a cleverer set-oriented approach, but it's > eluding me at present. > > You need a way to express the notion of "contiguous runs of (b,n)" > which doesn't really exist in (set-oriented) SQL. The numbers you have next to each row is exactly what I'm looking for. You mention PL/PgSQL, I'm familiar with creating triggered procedures so I'll look into that > I suspect that Crystal Reports may be pulling the whole data set from > PostgreSQL then doing its processing client-side. Crystal report is running a simple pass through query that I wrote, select b.n.stamp from table where stamp .... order by stamp Then I use its grouping features, I group by b, then n but when I group by n I don't specify ascending or descending order but "in original order" And it ends up doing what I'm looking for. I which distinct on was more flexible, it's not happy when the order by set is different than the distinct on set. I would like to be able to write select distinct on (b,n) b,n,stamp from table where ... order by stamp; Nicolas
I which distinct on ... I wish distinct on ...
Nicolas Beuzeboc <nicolasb@norchemlab.com> writes: > I which distinct on was more flexible, it's not happy when the order by > set is different than the distinct on set. > I would like to be able to write select distinct on (b,n) b,n,stamp from > table where ... order by stamp; Well, no, because it's defined to use the ORDER BY order to determine which row is "first" within each DISTINCT ON group. There is an easy workaround for this, which is to sort the rows again in an outer select: select * from ( select distinct on (b,n) b,n,stamp from table where ... order by b,n,stamp ) ss order by stamp; regards, tom lane