Re: Measure Theoretic Data Types in Postgresql - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Measure Theoretic Data Types in Postgresql
Date
Msg-id 5077CB47.90605@vmware.com
Whole thread Raw
In response to Measure Theoretic Data Types in Postgresql  (Aaron Sheldon <aaron.sheldon@gmail.com>)
Responses Re: Measure Theoretic Data Types in Postgresql  (Josh Berkus <josh@agliodbs.com>)
Re: Measure Theoretic Data Types in Postgresql  (Aaron Sheldon <aaron.sheldon@gmail.com>)
List pgsql-hackers
On 11.10.2012 07:37, Aaron Sheldon wrote:
> This would allow for a succinct syntax to do calculations such as
> finding the daily unique patient count given the intervals of their
> attendance in particular programs; a computation I encounter
> routinely as a statistician for a health services provider.

Hmm. It's easy to get the count of unique patients on a particular date 
with something like:

select count(distinct patient) from attendance where interval && 
'2012-10-12'::date

I guess what you're after is to get that count for a range of days, in 
one query, so that the result looks something like this:
   date    |  patients
-----------+------------
2012-10-05 |       20
2012-10-06 |       24
2012-10-07 |       30
2012-10-08 |       29

The way I think of that problem is that you need to join the dates 
you're interested in with the attendance table.

select date, count (distinct patientid)
from attendance
inner join (  select '2012-10-04'::date + a AS date from generate_series(1,20) a
) dates on interval @> date
group by date;    date    | count
------------+------- 2012-10-05 |    11 2012-10-06 |    27 2012-10-07 |    47 2012-10-08 |    63 2012-10-09 |    83
2012-10-10|    95 2012-10-11 |    80 2012-10-12 |    60 2012-10-13 |    35 2012-10-14 |    13
 
(10 rows)

I created the test table for that with:

create table attendance (patientid int4 , interval daterange)
insert into attendance select id, daterange('2012-10-05'::date + 
(random()*5)::int4, '2012-10-10'::date + (random()*5)::int4) from 
generate_series(1,100) id;


So, I think the current range types already cover that use case pretty 
well. I can't imagine how the proposed measure theoretic concepts would 
make that simpler. Can you give some more complicated problem, perhaps, 
that the proposed measure theoretic concepts would make simpler than the 
current tools?

- Heikki



pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: Deprecating RULES
Next
From: Hannu Krosing
Date:
Subject: Re: Deprecating RULES