Thread: counting distinct rows on more than one column

counting distinct rows on more than one column

From

Dirk Lutzebaeck

Date:

28 March 2001, 10:16:30

Hi,

on 7.0.3 want to COUNT
 SELECT DISTINCT a,b FROM t;

I can't find a solution because any combination with
count with more than one column gives syntax errors.

One solution would be to set a view:

CREATE VIEW v AS SELECT DISTINCT a,b FROM t;

and then

SELECT count(a) FROM v

but views don't support distinct in v7.0.3

Ok I could use a temporary table but my select distinct tends to give
large results.

Any clues?

Dirk

RE: counting distinct rows on more than one column

From

Michael Ansley

Date:

28 March 2001, 10:54:15

<p><font size="2">SELECT count(*) FROM (SELECT DISTINCT a, b FROM t) AS t2;</font><p><font size="2">should give you
whatyou want.</font><br /><p><font size="2">MikeA</font><br /><p><font size="2">>> -----Original
Message-----</font><br/><font size="2">>> From: Dirk Lutzebaeck [<a
href="mailto:lutzeb@aeccom.com">mailto:lutzeb@aeccom.com</a>]</font><br/><font size="2">>> Sent: 28 March 2001
16:11</font><br/><font size="2">>> To: pgsql-sql@postgresql.org</font><br /><font size="2">>> Subject:
[SQL]counting distinct rows on more than one column</font><br /><font size="2">>> </font><br /><font
size="2">>></font><br /><font size="2">>> </font><br /><font size="2">>> Hi,</font><br /><font
size="2">>></font><br /><font size="2">>> on 7.0.3 want to COUNT</font><br /><font size="2">>>
</font><br/><font size="2">>>   SELECT DISTINCT a,b FROM t;</font><br /><font size="2">>> </font><br
/><fontsize="2">>> I can't find a solution because any combination with</font><br /><font size="2">>> count
withmore than one column gives syntax errors.</font><br /><font size="2">>> </font><br /><font size="2">>>
Onesolution would be to set a view:</font><br /><font size="2">>> </font><br /><font size="2">>> CREATE
VIEWv AS SELECT DISTINCT a,b FROM t;</font><br /><font size="2">>> </font><br /><font size="2">>> and
then</font><br/><font size="2">>> </font><br /><font size="2">>> SELECT count(a) FROM v</font><br /><font
size="2">>></font><br /><font size="2">>> but views don't support distinct in v7.0.3</font><br /><font
size="2">>></font><br /><font size="2">>> Ok I could use a temporary table but my select distinct tends to
give</font><br/><font size="2">>> large results.</font><br /><font size="2">>> </font><br /><font
size="2">>>Any clues?</font><br /><font size="2">>> </font><br /><font size="2">>> Dirk</font><br
/><fontsize="2">>> </font><br /><font size="2">>> </font><br /><font size="2">>> </font><br /><font
size="2">>>---------------------------(end of </font><br /><font size="2">>>
broadcast)---------------------------</font><br/><font size="2">>> TIP 1: subscribe and unsubscribe commands go
to</font><br /><font size="2">>> majordomo@postgresql.org</font><br /><font size="2">>> </font><code><font
size="3"><br/><br /> _________________________________________________________________________<br /> This e-mail and
anyattachments are confidential and may also be privileged and/or copyright <br /> material of Intec Telecom Systems
PLC(or its affiliated companies). If you are not an <br /> intended or authorised recipient of this e-mail or have
receivedit in error, please delete <br /> it immediately and notify the sender by e-mail. In such a case, reading,
reproducing,<br /> printing or further dissemination of this e-mail is strictly prohibited and may be unlawful. <br />
IntecTelecom Systems PLC. does not represent or warrant that an attachment hereto is free <br /> from computer viruses
orother defects. The opinions expressed in this e-mail and any <br /> attachments may be those of the author and are
notnecessarily those of Intec Telecom <br /> Systems PLC. <br /><br /> This footnote also confirms that this email
messagehas been swept by<br /> MIMEsweeper for the presence of computer viruses. <br />
__________________________________________________________________________<br/></font></code>

Re: counting distinct rows on more than one column

From

Michael Fork

Date:

28 March 2001, 11:42:21

In 7.0.3, I believe the following would work:

SELECT count(distinct(a || b)) FROM t;

if subselects in from were supported in 7.0.3 as they are in 7.1, you
could do:

SELECT count(*) FROM (SELECT DISTINCT a,b FROM t) FROM x

Michael Fork - CCNA - MCP - A+
Network Support - Toledo Internet Access - Toledo Ohio

On Wed, 28 Mar 2001, Dirk Lutzebaeck wrote:

> 
> Hi,
> 
> on 7.0.3 want to COUNT
> 
>   SELECT DISTINCT a,b FROM t;
> 
> I can't find a solution because any combination with
> count with more than one column gives syntax errors.
> 
> One solution would be to set a view:
> 
> CREATE VIEW v AS SELECT DISTINCT a,b FROM t;
> 
> and then
> 
> SELECT count(a) FROM v
> 
> but views don't support distinct in v7.0.3
> 
> Ok I could use a temporary table but my select distinct tends to give
> large results.
> 
> Any clues?
> 
> Dirk
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>

Re: counting distinct rows on more than one column

From

Tom Lane

Date:

28 March 2001, 12:05:45

Dirk Lutzebaeck <lutzeb@aeccom.com> writes:
> on 7.0.3 want to COUNT
>   SELECT DISTINCT a,b FROM t;

In 7.1 you could do
select count(*) from (select distinct a,b from t) as t1;

In 7.0 and before I think you have no choice but to use a temp table.
        regards, tom lane

Re: counting distinct rows on more than one column

From

Dirk Lutzebaeck

Date:

28 March 2001, 14:33:12

Michael Fork writes:> In 7.0.3, I believe the following would work:> > SELECT count(distinct(a || b)) FROM t;

Great, this works! I don't quite get it why...

Dirk

Re: counting distinct rows on more than one column

From

Tom Lane

Date:

28 March 2001, 14:52:04

Dirk Lutzebaeck <lutzeb@aeccom.com> writes:
> Michael Fork writes:
>>> In 7.0.3, I believe the following would work:
>>> 
>>> SELECT count(distinct(a || b)) FROM t;

> Great, this works! I don't quite get it why...

Michael really should not have proposed that solution without mentioning
its limitations: it's not actually counting distinct values of the column
pair a,b, but only of their textual concatenation.  For example a = 'xy'
and b = 'z' will look the same as a = 'x' and b = 'yz'.

If there is some character you never use in column A, say '|', you
could do count(distinct(a || '|' || b)) with some safety, but this
strikes me as still a pretty fragile approach.
        regards, tom lane

RE: counting distinct rows on more than one column

From

Jeff Eckermann

Date:

28 March 2001, 15:43:35

I don't think this will necessarily work:

field1 | field2
aa     |  ab
a       |  aab

These are two distinct rows, so should be counted as two.
The proposed method would count them as one.
You can get around this problem by doing:
count (distinct (a || x || b))
where x is some character not found in your data.

> -----Original Message-----
> From:    Dirk Lutzebaeck [SMTP:lutzeb@aeccom.com]
> Sent:    Wednesday, March 28, 2001 1:32 PM
> To:    Michael Fork
> Cc:    pgsql-sql@postgresql.org
> Subject:    Re: counting distinct rows on more than one column
> 
> Michael Fork writes:
>  > In 7.0.3, I believe the following would work:
>  > 
>  > SELECT count(distinct(a || b)) FROM t;
> 
> Great, this works! I don't quite get it why...
> 
> Dirk
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly

Re: RE: counting distinct rows on more than one column

From

"Jim Ballard"

Date:

28 March 2001, 17:28:09

If the fields are fixed length character type, then the simpler
concatenation should work.

Actually, the requirement is only that all but the final field be fixed
length.

And if they aren't fixed length, you can cast them to be such, as long as
you know the maximum length of the string values, as in the following where
that maximum is 20 and we are looking at 3 fields:

select count(distinct (cast(field1 as char(20)) || cast(field2 as char(20))
|| field3)) from ...

Jim Ballard

----- Original Message -----
From: "Jeff Eckermann" <jeckermann@verio.net>
To: "'Dirk Lutzebaeck'" <lutzeb@aeccom.com>; "Michael Fork"
<mfork@toledolink.com>
Cc: <pgsql-sql@postgresql.org>
Sent: Wednesday, March 28, 2001 3:43 PM
Subject: [SQL] RE: counting distinct rows on more than one column


> I don't think this will necessarily work:
>
> field1 | field2
> aa     |  ab
> a       |  aab
>
> These are two distinct rows, so should be counted as two.
> The proposed method would count them as one.
> You can get around this problem by doing:
> count (distinct (a || x || b))
> where x is some character not found in your data.
>
> > -----Original Message-----
> > From: Dirk Lutzebaeck [SMTP:lutzeb@aeccom.com]
> > Sent: Wednesday, March 28, 2001 1:32 PM
> > To: Michael Fork
> > Cc: pgsql-sql@postgresql.org
> > Subject: Re: counting distinct rows on more than one column
> >
> > Michael Fork writes:
> >  > In 7.0.3, I believe the following would work:
> >  >
> >  > SELECT count(distinct(a || b)) FROM t;
> >
> > Great, this works! I don't quite get it why...
> >
> > Dirk
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 3: if posting/reading through Usenet, please send an appropriate
> > subscribe-nomail command to majordomo@postgresql.org so that your
> > message can get through to the mailing list cleanly
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>