Thread: counting distinct rows on more than one column
Hi, on 7.0.3 want to COUNT SELECT DISTINCT a,b FROM t; I can't find a solution because any combination with count with more than one column gives syntax errors. One solution would be to set a view: CREATE VIEW v AS SELECT DISTINCT a,b FROM t; and then SELECT count(a) FROM v but views don't support distinct in v7.0.3 Ok I could use a temporary table but my select distinct tends to give large results. Any clues? Dirk
<p><font size="2">SELECT count(*) FROM (SELECT DISTINCT a, b FROM t) AS t2;</font><p><font size="2">should give you whatyou want.</font><br /><p><font size="2">MikeA</font><br /><p><font size="2">>> -----Original Message-----</font><br/><font size="2">>> From: Dirk Lutzebaeck [<a href="mailto:lutzeb@aeccom.com">mailto:lutzeb@aeccom.com</a>]</font><br/><font size="2">>> Sent: 28 March 2001 16:11</font><br/><font size="2">>> To: pgsql-sql@postgresql.org</font><br /><font size="2">>> Subject: [SQL]counting distinct rows on more than one column</font><br /><font size="2">>> </font><br /><font size="2">>></font><br /><font size="2">>> </font><br /><font size="2">>> Hi,</font><br /><font size="2">>></font><br /><font size="2">>> on 7.0.3 want to COUNT</font><br /><font size="2">>> </font><br/><font size="2">>> SELECT DISTINCT a,b FROM t;</font><br /><font size="2">>> </font><br /><fontsize="2">>> I can't find a solution because any combination with</font><br /><font size="2">>> count withmore than one column gives syntax errors.</font><br /><font size="2">>> </font><br /><font size="2">>> Onesolution would be to set a view:</font><br /><font size="2">>> </font><br /><font size="2">>> CREATE VIEWv AS SELECT DISTINCT a,b FROM t;</font><br /><font size="2">>> </font><br /><font size="2">>> and then</font><br/><font size="2">>> </font><br /><font size="2">>> SELECT count(a) FROM v</font><br /><font size="2">>></font><br /><font size="2">>> but views don't support distinct in v7.0.3</font><br /><font size="2">>></font><br /><font size="2">>> Ok I could use a temporary table but my select distinct tends to give</font><br/><font size="2">>> large results.</font><br /><font size="2">>> </font><br /><font size="2">>>Any clues?</font><br /><font size="2">>> </font><br /><font size="2">>> Dirk</font><br /><fontsize="2">>> </font><br /><font size="2">>> </font><br /><font size="2">>> </font><br /><font size="2">>>---------------------------(end of </font><br /><font size="2">>> broadcast)---------------------------</font><br/><font size="2">>> TIP 1: subscribe and unsubscribe commands go to</font><br /><font size="2">>> majordomo@postgresql.org</font><br /><font size="2">>> </font><code><font size="3"><br/><br /> _________________________________________________________________________<br /> This e-mail and anyattachments are confidential and may also be privileged and/or copyright <br /> material of Intec Telecom Systems PLC(or its affiliated companies). If you are not an <br /> intended or authorised recipient of this e-mail or have receivedit in error, please delete <br /> it immediately and notify the sender by e-mail. In such a case, reading, reproducing,<br /> printing or further dissemination of this e-mail is strictly prohibited and may be unlawful. <br /> IntecTelecom Systems PLC. does not represent or warrant that an attachment hereto is free <br /> from computer viruses orother defects. The opinions expressed in this e-mail and any <br /> attachments may be those of the author and are notnecessarily those of Intec Telecom <br /> Systems PLC. <br /><br /> This footnote also confirms that this email messagehas been swept by<br /> MIMEsweeper for the presence of computer viruses. <br /> __________________________________________________________________________<br/></font></code>
In 7.0.3, I believe the following would work: SELECT count(distinct(a || b)) FROM t; if subselects in from were supported in 7.0.3 as they are in 7.1, you could do: SELECT count(*) FROM (SELECT DISTINCT a,b FROM t) FROM x Michael Fork - CCNA - MCP - A+ Network Support - Toledo Internet Access - Toledo Ohio On Wed, 28 Mar 2001, Dirk Lutzebaeck wrote: > > Hi, > > on 7.0.3 want to COUNT > > SELECT DISTINCT a,b FROM t; > > I can't find a solution because any combination with > count with more than one column gives syntax errors. > > One solution would be to set a view: > > CREATE VIEW v AS SELECT DISTINCT a,b FROM t; > > and then > > SELECT count(a) FROM v > > but views don't support distinct in v7.0.3 > > Ok I could use a temporary table but my select distinct tends to give > large results. > > Any clues? > > Dirk > > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org >
Dirk Lutzebaeck <lutzeb@aeccom.com> writes: > on 7.0.3 want to COUNT > SELECT DISTINCT a,b FROM t; In 7.1 you could do select count(*) from (select distinct a,b from t) as t1; In 7.0 and before I think you have no choice but to use a temp table. regards, tom lane
Michael Fork writes:> In 7.0.3, I believe the following would work:> > SELECT count(distinct(a || b)) FROM t; Great, this works! I don't quite get it why... Dirk
Dirk Lutzebaeck <lutzeb@aeccom.com> writes: > Michael Fork writes: >>> In 7.0.3, I believe the following would work: >>> >>> SELECT count(distinct(a || b)) FROM t; > Great, this works! I don't quite get it why... Michael really should not have proposed that solution without mentioning its limitations: it's not actually counting distinct values of the column pair a,b, but only of their textual concatenation. For example a = 'xy' and b = 'z' will look the same as a = 'x' and b = 'yz'. If there is some character you never use in column A, say '|', you could do count(distinct(a || '|' || b)) with some safety, but this strikes me as still a pretty fragile approach. regards, tom lane
I don't think this will necessarily work: field1 | field2 aa | ab a | aab These are two distinct rows, so should be counted as two. The proposed method would count them as one. You can get around this problem by doing: count (distinct (a || x || b)) where x is some character not found in your data. > -----Original Message----- > From: Dirk Lutzebaeck [SMTP:lutzeb@aeccom.com] > Sent: Wednesday, March 28, 2001 1:32 PM > To: Michael Fork > Cc: pgsql-sql@postgresql.org > Subject: Re: counting distinct rows on more than one column > > Michael Fork writes: > > In 7.0.3, I believe the following would work: > > > > SELECT count(distinct(a || b)) FROM t; > > Great, this works! I don't quite get it why... > > Dirk > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly
If the fields are fixed length character type, then the simpler concatenation should work. Actually, the requirement is only that all but the final field be fixed length. And if they aren't fixed length, you can cast them to be such, as long as you know the maximum length of the string values, as in the following where that maximum is 20 and we are looking at 3 fields: select count(distinct (cast(field1 as char(20)) || cast(field2 as char(20)) || field3)) from ... Jim Ballard ----- Original Message ----- From: "Jeff Eckermann" <jeckermann@verio.net> To: "'Dirk Lutzebaeck'" <lutzeb@aeccom.com>; "Michael Fork" <mfork@toledolink.com> Cc: <pgsql-sql@postgresql.org> Sent: Wednesday, March 28, 2001 3:43 PM Subject: [SQL] RE: counting distinct rows on more than one column > I don't think this will necessarily work: > > field1 | field2 > aa | ab > a | aab > > These are two distinct rows, so should be counted as two. > The proposed method would count them as one. > You can get around this problem by doing: > count (distinct (a || x || b)) > where x is some character not found in your data. > > > -----Original Message----- > > From: Dirk Lutzebaeck [SMTP:lutzeb@aeccom.com] > > Sent: Wednesday, March 28, 2001 1:32 PM > > To: Michael Fork > > Cc: pgsql-sql@postgresql.org > > Subject: Re: counting distinct rows on more than one column > > > > Michael Fork writes: > > > In 7.0.3, I believe the following would work: > > > > > > SELECT count(distinct(a || b)) FROM t; > > > > Great, this works! I don't quite get it why... > > > > Dirk > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 3: if posting/reading through Usenet, please send an appropriate > > subscribe-nomail command to majordomo@postgresql.org so that your > > message can get through to the mailing list cleanly > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org >