Thread: Conditional return of aggregated data
Hi, I have some aggregated statistics which are currently being queried as follows: create view stats_age as SELECT a.category, a.statcount, b.a_desc FROM reg_aggregated_stats a,regs_select_age b where a.category='age' and a.datapoint::smallint=b.a_val order by a.datapointasc; However, as these relate to event registrations, a suggestion has been made that the database really should be returningnothing until a certain number of registrations has been met (in order to avoid privacy infringing inferrence fromwhat should be an otherwise fully irreversibly aggregated dataset). Now, the queries are wrapped in PL/PGSQL functions anyway, so I could make a second call to Postgres to find out sum(statcount)and then conditionally return based on that. But is there a smarter way to do this out of a single SQL query ? My initial idea was something along the lines of : select (select sum(statcount) from stats_residence) as aggstat,statcount,short_name_en from stats_residence where aggstat>some_number; But as I soon discovered that's not valid syntax! Hence ideas welcome from those smarter than me. Thanks ! Laura
Hallo Laura, i don't know if i understand your question fully, but this might be helpfull?: FILTER SELECT count(*) AS unfiltered, count(*) FILTER (WHERE i < 5) AS filtered FROM generate_series(1,10) AS s(i); unfiltered | filtered ------------+---------- 10 | 4 (1 row) https://www.postgresql.org/docs/current/sql-expressions.html hth, Wim Laura Smith schreef op ma 02-12-2019 om 11:10 [+0000]: > Hi, > > I have some aggregated statistics which are currently being queried > as follows: > > create view stats_age as > SELECT a.category, > a.statcount, > b.a_desc > FROM reg_aggregated_stats a,regs_select_age b where a.category='age' > and a.datapoint::smallint=b.a_val order by a.datapoint asc; > > However, as these relate to event registrations, a suggestion has > been made that the database really should be returning nothing until > a certain number of registrations has been met (in order to avoid > privacy infringing inferrence from what should be an otherwise fully > irreversibly aggregated dataset). > > Now, the queries are wrapped in PL/PGSQL functions anyway, so I could > make a second call to Postgres to find out sum(statcount) and then > conditionally return based on that. > > But is there a smarter way to do this out of a single SQL query ? > > My initial idea was something along the lines of : > select (select sum(statcount) from stats_residence) as > aggstat,statcount,short_name_en from stats_residence where > aggstat>some_number; > > But as I soon discovered that's not valid syntax! Hence ideas welcome > from those smarter than me. > > Thanks ! > > Laura > > -- mvg, Wim Bertels -- Lector UC Leuven-Limburg -- My only love sprung from my only hate! Too early seen unknown, and known too late! -- William Shakespeare, "Romeo and Juliet"
> My initial idea was something along the lines of : > select (select sum(statcount) from stats_residence) as aggstat,statcount,short_name_en from stats_residence where aggstat>some_number; Am I missing something basic. The above can be done using GROUP BY and HAVING clause.
> > > My initial idea was something along the lines of : > > select (select sum(statcount) from stats_residence) as aggstat,statcount,short_name_en from stats_residence where aggstat>some_number; > > Am I missing something basic. The above can be done using > GROUP BY and HAVING clause. or this with t as (select (select sum(statcount) from stats_residence) as aggstat, statcount,short_name_en from stats_residence ) select * from t where aggstat > some_number Apology if I did not understand the question correctly.
> > or this > with t as > (select (select sum(statcount) from stats_residence) as aggstat, > statcount,short_name_en from stats_residence > ) > select * from t where aggstat > some_number > > Apology if I did not understand the question correctly. Hi Ravi, Thanks for helping show me the way. You're quite right, a CTE did the trick Laura
On Mon, 2 Dec 2019 at 12:11, Laura Smith <n5d9xq3ti233xiyif2vp@protonmail.ch> wrote:
My initial idea was something along the lines of :
select (select sum(statcount) from stats_residence) as aggstat,statcount,short_name_en from stats_residence where aggstat>some_number;
One option is to move the aggregate to the where-clause. If you also need the value in your select-list, you can just repeat the subselect there, usually the planner is smart enough to figure out that it can just re-use the result.
--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.
Cut the trees and you'll see there is no forest.