Re: Aggregate functions on groups - Mailing list pgsql-general
From | Morris de Oryx |
---|---|
Subject | Re: Aggregate functions on groups |
Date | |
Msg-id | CAKqncciB=yzfCpcbANb20NBxxVuQcmWOkUi8=2fYuaKuP-w9fQ@mail.gmail.com Whole thread Raw |
In response to | Aggregate functions on groups (Rich Shepard <rshepard@appl-ecosys.com>) |
Responses |
Re: Aggregate functions on groups
|
List | pgsql-general |
Your tributaries and fish master tables make sense. If I read your code right, you're grouping by too many columns. I flattened the data into a survey table for this simple example:
select tributary,
common_name,
scientific_name,
sum(count_value) as fish_seen,
count(count_value) as observations_made
from survey
group by 1,2,3 -- The GROUP BY clause can use positions on the select list, if you feel like typing less.
tributary common_name scientific_name fish_seen observations_made
Anderson Creek trib to Nehalem River Black crappie Pomoxis nigromaculatus 3 2
Anderson Creek trib to Nehalem River Brook trout Salvelinus fontinalis 3 2
Anderson Creek trib to Nehalem River Bluegill Lepomis macrochirus 3 2
Anderson Creek trib to Nehalem River Brown bullhead Ameiurus nebulosus 3 2
But this is not why I'm answering. I'm responding as I wanted to make sure that you're aware of the pg-similarity extension:
This tool implements a *lot* of similarity measures for fuzzy cmparisons. Some are sting-oriented algorithms (Jaro-Winkler, Soundex, Levenshtein, etc.), and others derive from and/or apply to field population comparisons, like the Jaccard and Dice Coefficients. There's a lot of great stuff in the package.
On Sat, Aug 31, 2019 at 3:14 AM Rich Shepard <rshepard@appl-ecosys.com> wrote:
Tables hold data on fish counts by stream name, species, and (unreported)
collection dates. I'm trying to write a query that returns the total number
of each species in each stream.
The latest attempt is (lines wrapped by alpine; submitted as one line):
\copy (select f.stream_tribs, f.count_value, sum(f.count_value),
i.common_name, i.sci_name from fish_counts as f, itis as i where
f.stream_tribs like '%Nehalem River%' group by f.stream_tribs,
i.common_name, i.sci_name, f.count_value order by f.stream_tribs,
i.common_name, i.sci_name, f.count_value) to
'/home/rshepard/projects/oregon/mohler_sand/data/fish/fishes.dat';
The returned set starts this way:
Anderson Creek trib to Nehalem River 0 0 Black crappie Pomoxis nigromaculatus
Anderson Creek trib to Nehalem River 3 3 Black crappie Pomoxis nigromaculatus
Anderson Creek trib to Nehalem River 0 0 Bluegill Lepomis macrochirus
Anderson Creek trib to Nehalem River 3 3 Bluegill Lepomis macrochirus
Anderson Creek trib to Nehalem River 0 0 Brook trout Salvelinus fontinalis
Anderson Creek trib to Nehalem River 3 3 Brook trout Salvelinus fontinalis
Anderson Creek trib to Nehalem River 0 0 Brown bullhead Ameiurus nebulosus
Anderson Creek trib to Nehalem River 3 3 Brown bullhead Ameiurus nebulosus
What I want returned would look like this:
Anderson Creek trib to Nehalem River Black crappie Pomoxis nigromaculatus 3
Anderson Creek trib to Nehalem River Bluegill Lepomis macrochirus 3
Anderson Creek trib to Nehalem River Brook trout Salvelinus fontinalis 3
Anderson Creek trib to Nehalem River Brown bullhead Ameiurus nebulosus 3
I've read the manual yet must have not seen the section explaining how to
apply aggregate functions to groups.
Thanks in advance,
Rich
pgsql-general by date: