Thread: Counting distinct names

Counting distinct names

From

"G. Anthony Reina"

Date:

17 April 2000, 21:32:49

Let's say I have a table:

table1
=====

subject_name         text
subject_age            int4
test_trial                int4
test_score               int4


I'd like to find out how many times a unique 'subject_name' is in the
table. My SQL book says that I should be able to do this:

select COUNT(DISTINCT subject_name) from table1;

However, the psql program is giving me an "ERROR:  parser: parse error
at or near "distinct"".

Is there another way to do this (or a problem with my query)?

Thanks.
-Tony Reina

Using [PostgreSQL 6.5.1 on i686-pc-linux-gnu, compiled by gcc
egcs-2.91.66]

Re: Counting distinct names

From

"Brett W. McCoy"

Date:

17 April 2000, 22:08:50

On Mon, 17 Apr 2000, G. Anthony Reina wrote:

> Let's say I have a table:
> 
> table1
> =====
> 
> subject_name         text
> subject_age            int4
> test_trial                int4
> test_score               int4
> 
> 
> I'd like to find out how many times a unique 'subject_name' is in the
> table. My SQL book says that I should be able to do this:
> 
> select COUNT(DISTINCT subject_name) from table1;

SELECT DISTINCT(subject_name), COUNT(subject_name) 
FROM table1 
GROUP BY subject_name
;

Brett W. McCoy                                                                     http://www.chapelperilous.net
---------------------------------------------------------------------------
Anything that is good and useful is made of chocolate.

Re: Counting distinct names

From

Tom Lane

Date:

17 April 2000, 23:26:50

"G. Anthony Reina" <reina@nsi.edu> writes:
> I'd like to find out how many times a unique 'subject_name' is in the
> table. My SQL book says that I should be able to do this:
> select COUNT(DISTINCT subject_name) from table1;
> However, the psql program is giving me an "ERROR:  parser: parse error
> at or near "distinct"".

6.5 doesn't have aggregate(DISTINCT ...).  7.0 does, though.

If you don't want to upgrade right now, you could do something like

SELECT DISTINCT subject_name INTO temp_table FROM table1;
SELECT count(*) FROM temp_table;
DROP TABLE temp_table;
        regards, tom lane

Re: Counting distinct names

From

"G. Anthony Reina"

Date:

18 April 2000, 15:53:02

Tom Lane wrote:

> "G. Anthony Reina" <reina@nsi.edu> writes:
> > I'd like to find out how many times a unique 'subject_name' is in the
> > table. My SQL book says that I should be able to do this:
> > select COUNT(DISTINCT subject_name) from table1;
> > However, the psql program is giving me an "ERROR:  parser: parse error
> > at or near "distinct"".
>
> 6.5 doesn't have aggregate(DISTINCT ...).  7.0 does, though.
>
> If you don't want to upgrade right now, you could do something like
>
> SELECT DISTINCT subject_name INTO temp_table FROM table1;
> SELECT count(*) FROM temp_table;
> DROP TABLE temp_table;
>
>                         regards, tom lane

Thanks Tom. I'll eventually upgrade, but still am a little wary considering
how many messages hackers continues to get on beta5. Perhaps, when 7.1 is
out I'll feel a little more secure about the upgrade.

-Tony

Re: Counting distinct names

From

Tom Lane

Date:

19 April 2000, 01:06:08

"G. Anthony Reina" <reina@nsi.edu> writes:
> Thanks Tom. I'll eventually upgrade, but still am a little wary considering
> how many messages hackers continues to get on beta5. Perhaps, when 7.1 is
> out I'll feel a little more secure about the upgrade.

You meant 7.0.1, no?  7.1 will be a new major release with its own bugs...

Seriously, despite the amount of traffic about 7.0 issues, I believe
that 7.0 beta is already considerably more reliable than any prior
release ever was; and I believe the other developers share that opinion.
If you think 6.5.* is bug-free then you haven't been paying attention ;-).
We've fixed a huge number of problems since 6.5.  Furthermore, how are
we going to flush out remaining issues if people don't try it?  You're
not doing yourself or anyone else a service by staying away from 7.0.
        regards, tom lane

Re: Counting distinct names

From

Christopher Sawtell

Date:

19 April 2000, 22:38:22

On Wed, 19 Apr 2000, Tom Lane wrote:
> "G. Anthony Reina" <reina@nsi.edu> writes:
> > Thanks Tom. I'll eventually upgrade, but still am a little wary considering
> > how many messages hackers continues to get on beta5. Perhaps, when 7.1 is
> > out I'll feel a little more secure about the upgrade.
> 
> You meant 7.0.1, no?  7.1 will be a new major release with its own bugs...
> 
> Seriously, despite the amount of traffic about 7.0 issues, I believe
> that 7.0 beta is already considerably more reliable than any prior
> release ever was; and I believe the other developers share that opinion.
> If you think 6.5.* is bug-free then you haven't been paying attention ;-).
> We've fixed a huge number of problems since 6.5.  Furthermore, how are
> we going to flush out remaining issues if people don't try it?  You're
> not doing yourself or anyone else a service by staying away from 7.0.

OK, here are my comments and questions after just a very short play
yesterday evening with 7.0RC1. ( should I be getting the snapshot? )

1) There are files on the ftp sites which seem to be the "whole thing"
and what I suppose is the "whole thing" cut up into smaller sections.
 Am I correct? Or do I need to d/l the smaller files as well? A README_for_Download might be a good idea.

2) I couldn't find a --database-dir=[/var/lib/pgsql/data] option in the output
from ./configure --help.  How can one configure pg to place it's database in a
defined place in the file system? btw the default of /var/lib/pgsql/data seems
pretty sensible.

3) Some of the scripts and binary files for the subsiduary tools have to be
installed by hand. If you need to know what, I'd be happy to re-install pg and
take notes.

4) For my tiny ultra-simple database it "just goes". ( On SuSE-6.3 )

5) Is the idea of putting the data on a raw partition on the horizon?

Thanks for a wonderful product.

--
Sincerely etc.,
NAME       Christopher Sawtell - iOpen Technologies Ltd.CELL PHONE 021 257 4451ICQ UIN    45863470EMAIL      chris @
iopen. co . nz,  csawtell @ xtra . co . nzWWW        http://www.iopen.co.nzCNOTES
ftp://ftp.funet.fi/pub/languages/C/tutorials/sawtell_C.tar.gz
-->> Please refrain from using HTML or WORD attachments in e-mails to me <<--

Re: Counting distinct names

From

Tom Lane

Date:

19 April 2000, 22:55:25

Christopher Sawtell <csawtell@xtra.co.nz> writes:
> 1) There are files on the ftp sites which seem to be the "whole thing"
> and what I suppose is the "whole thing" cut up into smaller sections.

Yes.

>   Am I correct? Or do I need to d/l the smaller files as well?
>   A README_for_Download might be a good idea.

Yo, Marc ... where's the documentation about the new tarfile sections?

> 2) I couldn't find a --database-dir=[/var/lib/pgsql/data] option in
> the output from ./configure --help.  How can one configure pg to place
> it's database in a defined place in the file system? btw the default
> of /var/lib/pgsql/data seems pretty sensible.

The default location is PREFIX/data; I don't think there is a configure
option to change that without setting --prefix.  However, you can set it
at runtime with the -D options to initdb and postmaster, so adding yet
another configure switch might be gilding the lily.

> 3) Some of the scripts and binary files for the subsiduary tools have
> to be installed by hand. If you need to know what, I'd be happy to
> re-install pg and take notes.

News to me, other than that the perl interface module usually has to be
installed separately because of permission issues.  What did you see
missing?

> 5) Is the idea of putting the data on a raw partition on the horizon?

Not really.  There are portability issues with that; furthermore, it's
not obvious that it'd be better.  The system's been developed and tuned
to sit atop a Unix kernel's disk cache.  Writing our own disk management
code would be a large investment of work and a large loss of portability
for very questionable benefits.  It might happen someday, but I think
we have many higher-priority tasks ahead...
        regards, tom lane

Re: Counting distinct names

From

Peter Eisentraut

Date:

20 April 2000, 09:08:09

On Wed, 19 Apr 2000, Tom Lane wrote:

> > 2) I couldn't find a --database-dir=[/var/lib/pgsql/data] option in
> > the output from ./configure --help.

> The default location is PREFIX/data; I don't think there is a configure
> option to change that without setting --prefix.

That's news to me. There is no default for the data storage location
anywhere that I can see. You always need to use the -D option or set
PGDATA in the environment. Of course PREFIX/data is the commonly used
place but it isn't any more special than any other location.

-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden

Re: Counting distinct names

From

Tom Lane

Date:

20 April 2000, 10:05:31

Peter Eisentraut <e99re41@DoCS.UU.SE> writes:
> On Wed, 19 Apr 2000, Tom Lane wrote:
>> The default location is PREFIX/data; I don't think there is a configure
>> option to change that without setting --prefix.

> That's news to me. There is no default for the data storage location
> anywhere that I can see. You always need to use the -D option or set
> PGDATA in the environment.

Oh, you're right of course.  My system behaves like there's a default,
but that's because I have PGDATA set in my .profile ;-)

*Should* there be a default location established by configure?
I think I'd prefer that to having the dependence on an environment var
... but on the other hand we couldn't really remove the envar without
breaking a lot of installations...
        regards, tom lane

Re: Counting distinct names

From

SL Baur

Date:

20 April 2000, 22:31:18

Tom Lane <tgl@sss.pgh.pa.us> writes:

> *Should* there be a default location established by configure?

I think so.

> I think I'd prefer that to having the dependence on an environment var
> ... but on the other hand we couldn't really remove the envar without
> breaking a lot of installations...

Keep the environment variable and let it override the default.

The test suite could use a sane default for PGLIB too.  It seems
strange to compile and install postgres into a default location and
then have `make runtest' fail due to an environment variable not being
set.  All the more so because the README file doesn't mention it.

Re: Counting distinct names

From

Peter Eisentraut

Date:

21 April 2000, 18:01:54

SL Baur writes:

> Tom Lane <tgl@sss.pgh.pa.us> writes:
> 
> > *Should* there be a default location established by configure?
> 
> I think so.

I don't think so, see separate message.

> The test suite could use a sane default for PGLIB too.

That's a completely different issue, because the files obtained that way
are necessarily fixed for a given installation (unless you are *very*
crafty). Peeking at the GNU/Autoconf recommendations, the files in
question should be installed into PREFIX/share by default and that default
noted inside the using program (initdb), with overriding possible. Of
course the name of the corresponding configure option --datadir isn't
going to make our life easier vis a vis the first issue above, but it's
still the way to go somehow.

-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden

Re: Counting distinct names

From

Peter Eisentraut

Date:

21 April 2000, 18:02:54

Tom Lane writes:

> *Should* there be a default location established by configure?

Not IMO. Firstly (broken record time), configure should not be used to set
run-time options period. (And there's a single good reason for that: when
users build a package they won't know what options they need later on. I
get really annoyed at packages that do that.) Secondly, this will lead to
all kinds of funny problem reports because everyone will have a different
default. Worse yet, people might not even know what their default is
because they use a binary package. I think the current setup is the
cleanest by far because it emphasizes the independence of the data storage
location and the fact that multiple locations are equally valid.

> I think I'd prefer that to having the dependence on an environment var

Environment variables are a common way of pointing programs to their data.
Personally, I rarely bother with it and start the postmaster with -D
instead, then I know what I'm getting. I suppose there's no reason that
users couldn't do the same. The environment variable is only wildly useful
if people start and stop the server all the time.

-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden