Thread: Special grouping on sorted data.

Special grouping on sorted data.

From
Nicolas Beuzeboc
Date:
Hi,

I was looking for a simple solution to this problem. I can't find a way 
to group on b and n by just collapsing sequential n's (identical n's 
right next to each other) the sorting condition is the timestamp.
b    | n  |         stamp
----------------------------------------A    | 1  | 2008-09-20 06:07:47.981445A    | 1  | 2008-09-20 06:08:13.294306A
| 1  | 2008-09-20 06:12:02.046596A    | 2  | 2008-09-20 06:12:26.267786A    | 2  | 2008-09-20 06:12:47.750429A    | 1
|2008-09-20 06:13:12.152512A    | 2  | 2008-09-20 06:13:39.052528A    | 2  | 2008-09-20 06:14:12.875389B    | 1  |
2008-09-2006:14:29.963352B    | 1  | 2008-09-20 06:14:52.247307B    | 3  | 2008-09-20 06:15:13.358151B    | 3  |
2008-09-2006:15:44.307792B    | 3  | 2008-09-20 06:16:17.32131B    | 2  | 2008-09-20 06:16:44.030435B    | 2  |
2008-09-2006:17:00.140907C    | 1  | 2008-09-20 06:17:50.067258C    | 1  | 2008-09-20 06:18:22.280218C    | 1  |
2008-09-2006:18:41.661213C    | 1  | 2008-09-20 06:19:07.920327C    | 3  | 2008-09-20 06:19:26.166675C    | 2  |
2008-09-2006:19:46.459439C    | 2  | 2008-09-20 06:20:04.634328
 

Here I give an example of the output I'm looking for, And I can find a 
way to do that in crystal report, but I would like postgresql to send it 
that way. If the next n is different create a new row.
 b    | n  |        min_stamp           | max_stamp
--------------------------------------------------------------------- A    | 1  | 2008-09-20 06:07:47.981445 |
2008-09-2006:12:02.046596 A    | 2  | 2008-09-20 06:12:26.267786 | 2008-09-20 06:12:47.750429 A    | 1  | 2008-09-20
06:13:12.152512| 2008-09-20 06:13:12.152512 A    | 2  | 2008-09-20 06:13:39.052528 | 2008-09-20 06:14:12.875389 B    |
1 | 2008-09-20 06:14:29.963352 | 2008-09-20 06:14:52.247307 B    | 3  | 2008-09-20 06:15:13.358151 | 2008-09-20
06:16:17.32131B    | 2  | 2008-09-20 06:16:44.030435 | 2008-09-20 06:17:00.140907 C    | 1  | 2008-09-20
06:17:50.067258| 2008-09-20 06:19:07.920327 C    | 3  | 2008-09-20 06:19:26.166675 | 2008-09-20 06:19:26.166675 C    |
2 | 2008-09-20 06:19:46.459439 | 2008-09-20 06:20:04.634328
 

If I write a GROUP BY b,n there is going to be only two rows for b = 
'A', if I use distinct on i get the same thing

is there an easy way to do that ?

Nicolas





Re: Special grouping on sorted data.

From
Tom Lane
Date:
Nicolas Beuzeboc <nicolasb@norchemlab.com> writes:
> I was looking for a simple solution to this problem. I can't find a way 
> to group on b and n by just collapsing sequential n's (identical n's 
> right next to each other) the sorting condition is the timestamp.

I'm not totally certain that I understand what you need, but it sounds
like the SELECT DISTINCT ON feature might solve it for you.  Look into
our SELECT reference page at the "weather reports" example.
        regards, tom lane


Re: Special grouping on sorted data.

From
Craig Ringer
Date:
Nicolas Beuzeboc wrote:
> Hi,
> 
> I was looking for a simple solution to this problem. I can't find a way 
> to group on b and n by just collapsing sequential n's (identical n's 
> right next to each other) the sorting condition is the timestamp.
> 
> b    | n  |         stamp
> ----------------------------------------
> A    | 1  | 2008-09-20 06:07:47.981445 [1]
> A    | 1  | 2008-09-20 06:08:13.294306 [1]
> A    | 1  | 2008-09-20 06:12:02.046596 [1]
> A    | 2  | 2008-09-20 06:12:26.267786 [2]
> A    | 2  | 2008-09-20 06:12:47.750429 [2]
> A    | 1  | 2008-09-20 06:13:12.152512 [3]
> A    | 2  | 2008-09-20 06:13:39.052528 [4]
> A    | 2  | 2008-09-20 06:14:12.875389 [4]
> B    | 1  | 2008-09-20 06:14:29.963352 [5]
> B    | 1  | 2008-09-20 06:14:52.247307 [5]
> B    | 3  | 2008-09-20 06:15:13.358151 [6]
> B    | 3  | 2008-09-20 06:15:44.307792 [6]
> B    | 3  | 2008-09-20 06:16:17.32131  [6]
> B    | 2  | 2008-09-20 06:16:44.030435 [7] 
> B    | 2  | 2008-09-20 06:17:00.140907 [7]
> C    | 1  | 2008-09-20 06:17:50.067258 [8]
> C    | 1  | 2008-09-20 06:18:22.280218 [8]
> C    | 1  | 2008-09-20 06:18:41.661213 [8]
> C    | 1  | 2008-09-20 06:19:07.920327 [8]
> C    | 3  | 2008-09-20 06:19:26.166675 [9]
> C    | 2  | 2008-09-20 06:19:46.459439 [10]
> C    | 2  | 2008-09-20 06:20:04.634328 [10]

I'd be tempted to use a set-returning PL/PgSQL function to process an 
input set ordered by stamp and return a result whenever the (b,n) pair 
changed. I'm sure there's a cleverer set-oriented approach, but it's 
eluding me at present.

You need a way to express the notion of "contiguous runs of (b,n)" which 
doesn't really exist in (set-oriented) SQL.

> Here I give an example of the output I'm looking for, And I can find a 
> way to do that in crystal report, but I would like postgresql to send it 
> that way. If the next n is different create a new row.

I suspect that Crystal Reports may be pulling the whole data set from 
PostgreSQL then doing its processing client-side.

Try turning on query logging in the server and running your report. See 
what SQL Crystal Reports actually executes.

--
Craig Ringer


Re: Special grouping on sorted data.

From
Nicolas Beuzeboc
Date:
Craig Ringer wrote:
>>
>> b    | n  |         stamp
>> ----------------------------------------
>> A    | 1  | 2008-09-20 06:07:47.981445 [1]
>> A    | 1  | 2008-09-20 06:08:13.294306 [1]
>> A    | 1  | 2008-09-20 06:12:02.046596 [1]
>> A    | 2  | 2008-09-20 06:12:26.267786 [2]
>> A    | 2  | 2008-09-20 06:12:47.750429 [2]
>> A    | 1  | 2008-09-20 06:13:12.152512 [3]
>> A    | 2  | 2008-09-20 06:13:39.052528 [4]
>> A    | 2  | 2008-09-20 06:14:12.875389 [4]
>>
>
> I'd be tempted to use a set-returning PL/PgSQL function to process an 
> input set ordered by stamp and return a result whenever the (b,n) pair 
> changed. I'm sure there's a cleverer set-oriented approach, but it's 
> eluding me at present.
>
> You need a way to express the notion of "contiguous runs of (b,n)" 
> which doesn't really exist in (set-oriented) SQL.

The numbers you have next to each row is exactly what I'm looking for. 
You mention PL/PgSQL, I'm familiar with creating triggered procedures so 
I'll look into that

> I suspect that Crystal Reports may be pulling the whole data set from 
> PostgreSQL then doing its processing client-side.
Crystal report is running a simple pass through query that I wrote, 
select b.n.stamp from table where stamp .... order by stamp
Then I use its grouping features, I group by b, then n but when I group 
by n I don't specify ascending or descending order but "in original order"
And it ends up doing what I'm looking for.

I which distinct on was more flexible, it's not happy when the order by 
set is different than the distinct on set.

I would like to be able to write select distinct on (b,n) b,n,stamp from 
table where ... order by stamp;

Nicolas




Re: Special grouping on sorted data.

From
Nicolas Beuzeboc
Date:
I which distinct on ...
I wish distinct on ...


Re: Special grouping on sorted data.

From
Tom Lane
Date:
Nicolas Beuzeboc <nicolasb@norchemlab.com> writes:
> I which distinct on was more flexible, it's not happy when the order by 
> set is different than the distinct on set.
> I would like to be able to write select distinct on (b,n) b,n,stamp from 
> table where ... order by stamp;

Well, no, because it's defined to use the ORDER BY order to determine
which row is "first" within each DISTINCT ON group.

There is an easy workaround for this, which is to sort the rows again in
an outer select:

select * from
( select distinct on (b,n) b,n,stamp from  table where ... order by b,n,stamp ) ss
order by stamp;
        regards, tom lane