Thread: Consecutive row count query

Consecutive row count query

From

Leon Stringer

Date:

17 March 2005, 20:18:58

Hi,

I wondered if anyone could answer the following question:

If I have a table such as the one below:

col1   col_order
-----------
Apple  1
Apple  2
Orange 3
Banana 4
Apple  5

Is there a way I can get the following results:

Apple  2
Orange 1
Banana 1
Apple  1

i.e. Each row is printed ordered by col_order but consecutive
appearances of the same col1 result in only a single line in the result
with the number of consecutive appearances.

Obviously I could store the table as:

col1   col_order col_count
--------------------------
Apple  1         2
Orange 2         1
Banana 3         1
Apple  4         1

But since (in my intended table) most rows will have col_count = 1, this
seems like unnecessary normalization (and semantically "wrong").

Thanks in advance for any help,

Leon Stringer

Re: Consecutive row count query

From

Bruno Wolff III

Date:

17 March 2005, 21:00:07

On Thu, Mar 17, 2005 at 20:21:24 +0000, Leon Stringer <leon.stringer@ntlworld.com> wrote:
> Hi,
> 
> I wondered if anyone could answer the following question:
> 
> If I have a table such as the one below:
> 
> col1   col_order
> -----------
> Apple  1
> Apple  2
> Orange 3
> Banana 4
> Apple  5
> 
> Is there a way I can get the following results:
> 
> Apple  2
> Orange 1
> Banana 1
> Apple  1
> 
> i.e. Each row is printed ordered by col_order but consecutive
> appearances of the same col1 result in only a single line in the result
> with the number of consecutive appearances.

Which col_order value do you expect to use in ordering?

You can probably join a grouped version of the table to get the counts
with a distinct on version of the table to do what you want.

Re: Consecutive row count query

From

Jaime Casanova

Date:

17 March 2005, 21:33:11

On Thu, 17 Mar 2005 20:21:24 +0000, Leon Stringer
<leon.stringer@ntlworld.com> wrote:
> Hi,
> 
> I wondered if anyone could answer the following question:
> 
> If I have a table such as the one below:
> 
> col1   col_order
> -----------
> Apple  1
> Apple  2
> Orange 3
> Banana 4
> Apple  5
> 
> Is there a way I can get the following results:
> 
> Apple  2
> Orange 1
> Banana 1
> Apple  1
> 
A function?

regards,
Jaime Casanova

Re: Consecutive row count query

From

Greg Stark

Date:

17 March 2005, 23:03:48

Leon Stringer <leon.stringer@ntlworld.com> writes:

> Hi,
> 
> I wondered if anyone could answer the following question:
> 
> If I have a table such as the one below:
> 
> col1   col_order
> -----------
> Apple  1
> Apple  2
> Orange 3
> Banana 4
> Apple  5
> 
> Is there a way I can get the following results:
> 
> Apple  2
> Orange 1
> Banana 1
> Apple  1

Maybe. But not easily or efficiently.

How about this:

SELECT a.col1, a.col_order FROM tab as a LEFT OUTER JOIN tab as b    ON (b.col_order = a.col_order+1 AND
b.col1=a.col1)WHEREb.col1 IS NULL

> But since (in my intended table) most rows will have col_count = 1, this
> seems like unnecessary normalization (and semantically "wrong").

I think this looks like a better option. "unnecessary normalization" is an odd
phrase. Unless you can point at some reason that the denormalized seems *more*
convenient --and much *more* convenient at that-- not less convenient then you
should go for it.

Besides, that col_count column's only going to be four bytes. Unless the
"Apple" data is really short it'll only take a few col_count>1 to make it
worthwhile.

The only reason you might have a problem is if it's really "semantically
wrong" which would be if there's data attached to Apple or Orange that might
be different from one streak of results to the other.

-- 
greg

Re: Consecutive row count query

From

Andrew Hammond

Date:

17 March 2005, 23:37:17

You could hack it using a custom aggregate. NB: you'll want to reset the  categorizer_seq every now and then. And this
isn'tsafe for concurrent 
 
queries. You could make it safe for concurrent queries by using a 
complex type for STYPE, but I didn't bother. I also haven't debugged 
this, but I think it expresses the concept.


CREATE SEQUENCE categorizer_seq;

CREATE OR REPLACE FUNCTION categorizer_func (string, string) RETURNS 
bigint VOLATILE CALLED ON NULL INPUT AS '
SELECT CASE WHEN $1 = $2    THEN (SELECT last_value FROM categorizer_seq)    ELSE nextval(''categorizer_seq'')    END
AScategory
 
' LANGUAGE SQL;

CREATE AGGREGATE categorizer (    BASETYPE = text,    SFUNC = categorizer_func,    STYPE = text,    INITCOND = ''
);

SELECT col1, count(*)
FROM (    SELECT col1, cagetorizer(col1) AS category    FROM mytable    ORDER BY col_order
) tmp
GROUP BY (col1, category);

Leon Stringer wrote:
> Hi,
> 
> I wondered if anyone could answer the following question:
> 
> If I have a table such as the one below:
> 
> col1   col_order
> -----------
> Apple  1
> Apple  2
> Orange 3
> Banana 4
> Apple  5
> 
> Is there a way I can get the following results:
> 
> Apple  2
> Orange 1
> Banana 1
> Apple  1
> 
> i.e. Each row is printed ordered by col_order but consecutive
> appearances of the same col1 result in only a single line in the result
> with the number of consecutive appearances.
> 
> Obviously I could store the table as:
> 
> col1   col_order col_count
> --------------------------
> Apple  1         2
> Orange 2         1
> Banana 3         1
> Apple  4         1
> 
> But since (in my intended table) most rows will have col_count = 1, this
> seems like unnecessary normalization (and semantically "wrong").
> 
> Thanks in advance for any help,
> 
> Leon Stringer
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo@postgresql.org so that your
>       message can get through to the mailing list cleanly

Re: Consecutive row count query

From

Leon Stringer

Date:

18 March 2005, 21:05:22

> I wondered if anyone could answer the following question:

Thanks for your responses, I think I'll just add the extra column in as
Greg suggests. (BTW: I mean "unnecessary *denormalization*" which I hope
is less odd!).

Leon...