Thread: Consecutive row count query
Hi, I wondered if anyone could answer the following question: If I have a table such as the one below: col1 col_order ----------- Apple 1 Apple 2 Orange 3 Banana 4 Apple 5 Is there a way I can get the following results: Apple 2 Orange 1 Banana 1 Apple 1 i.e. Each row is printed ordered by col_order but consecutive appearances of the same col1 result in only a single line in the result with the number of consecutive appearances. Obviously I could store the table as: col1 col_order col_count -------------------------- Apple 1 2 Orange 2 1 Banana 3 1 Apple 4 1 But since (in my intended table) most rows will have col_count = 1, this seems like unnecessary normalization (and semantically "wrong"). Thanks in advance for any help, Leon Stringer
On Thu, Mar 17, 2005 at 20:21:24 +0000, Leon Stringer <leon.stringer@ntlworld.com> wrote: > Hi, > > I wondered if anyone could answer the following question: > > If I have a table such as the one below: > > col1 col_order > ----------- > Apple 1 > Apple 2 > Orange 3 > Banana 4 > Apple 5 > > Is there a way I can get the following results: > > Apple 2 > Orange 1 > Banana 1 > Apple 1 > > i.e. Each row is printed ordered by col_order but consecutive > appearances of the same col1 result in only a single line in the result > with the number of consecutive appearances. Which col_order value do you expect to use in ordering? You can probably join a grouped version of the table to get the counts with a distinct on version of the table to do what you want.
On Thu, 17 Mar 2005 20:21:24 +0000, Leon Stringer <leon.stringer@ntlworld.com> wrote: > Hi, > > I wondered if anyone could answer the following question: > > If I have a table such as the one below: > > col1 col_order > ----------- > Apple 1 > Apple 2 > Orange 3 > Banana 4 > Apple 5 > > Is there a way I can get the following results: > > Apple 2 > Orange 1 > Banana 1 > Apple 1 > A function? regards, Jaime Casanova
Leon Stringer <leon.stringer@ntlworld.com> writes: > Hi, > > I wondered if anyone could answer the following question: > > If I have a table such as the one below: > > col1 col_order > ----------- > Apple 1 > Apple 2 > Orange 3 > Banana 4 > Apple 5 > > Is there a way I can get the following results: > > Apple 2 > Orange 1 > Banana 1 > Apple 1 Maybe. But not easily or efficiently. How about this: SELECT a.col1, a.col_order FROM tab as a LEFT OUTER JOIN tab as b ON (b.col_order = a.col_order+1 AND b.col1=a.col1)WHEREb.col1 IS NULL > But since (in my intended table) most rows will have col_count = 1, this > seems like unnecessary normalization (and semantically "wrong"). I think this looks like a better option. "unnecessary normalization" is an odd phrase. Unless you can point at some reason that the denormalized seems *more* convenient --and much *more* convenient at that-- not less convenient then you should go for it. Besides, that col_count column's only going to be four bytes. Unless the "Apple" data is really short it'll only take a few col_count>1 to make it worthwhile. The only reason you might have a problem is if it's really "semantically wrong" which would be if there's data attached to Apple or Orange that might be different from one streak of results to the other. -- greg
You could hack it using a custom aggregate. NB: you'll want to reset the categorizer_seq every now and then. And this isn'tsafe for concurrent queries. You could make it safe for concurrent queries by using a complex type for STYPE, but I didn't bother. I also haven't debugged this, but I think it expresses the concept. CREATE SEQUENCE categorizer_seq; CREATE OR REPLACE FUNCTION categorizer_func (string, string) RETURNS bigint VOLATILE CALLED ON NULL INPUT AS ' SELECT CASE WHEN $1 = $2 THEN (SELECT last_value FROM categorizer_seq) ELSE nextval(''categorizer_seq'') END AScategory ' LANGUAGE SQL; CREATE AGGREGATE categorizer ( BASETYPE = text, SFUNC = categorizer_func, STYPE = text, INITCOND = '' ); SELECT col1, count(*) FROM ( SELECT col1, cagetorizer(col1) AS category FROM mytable ORDER BY col_order ) tmp GROUP BY (col1, category); Leon Stringer wrote: > Hi, > > I wondered if anyone could answer the following question: > > If I have a table such as the one below: > > col1 col_order > ----------- > Apple 1 > Apple 2 > Orange 3 > Banana 4 > Apple 5 > > Is there a way I can get the following results: > > Apple 2 > Orange 1 > Banana 1 > Apple 1 > > i.e. Each row is printed ordered by col_order but consecutive > appearances of the same col1 result in only a single line in the result > with the number of consecutive appearances. > > Obviously I could store the table as: > > col1 col_order col_count > -------------------------- > Apple 1 2 > Orange 2 1 > Banana 3 1 > Apple 4 1 > > But since (in my intended table) most rows will have col_count = 1, this > seems like unnecessary normalization (and semantically "wrong"). > > Thanks in advance for any help, > > Leon Stringer > > > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly
> I wondered if anyone could answer the following question: Thanks for your responses, I think I'll just add the extra column in as Greg suggests. (BTW: I mean "unnecessary *denormalization*" which I hope is less odd!). Leon...