Re: Populating missing dates in postgresql data - Mailing list pgsql-general
| From | Alban Hertroys |
|---|---|
| Subject | Re: Populating missing dates in postgresql data |
| Date | |
| Msg-id | D5409921-71A5-40EC-B63D-EA50DD2DC329@gmail.com Whole thread Raw |
| In response to | Populating missing dates in postgresql data ("Lavrenz, Steven M" <slavrenz@purdue.edu>) |
| List | pgsql-general |
> On 26 Mar 2015, at 1:25, Lavrenz, Steven M <slavrenz@purdue.edu> wrote:
>
> Alright everyone, this is a doozy of a problem. I am new to Postgres so I appreciate patience/understanding. I have a
databaseof hardware objects, each of which has several different “channels”. Once per day, these channels are supposed
>
>
>
> object_id channel check-in date
>
> ****************************************
>
> 990 1 2014-12-01
>
> 990 1 2014-12-02
>
> 990 2 2014-12-01
>
> 990 2 2014-12-02
>
> 286 2 2014-12-01
>
> 286 2 2014-12-02
>
> 286 5 2014-12-01
>
> 286 5 2014-12-02
>
> 4507 1 2014-12-01
>
> 4507 1 2014-12-02
>
> 4507 2 2014-12-01
>
> 4507 2 2014-12-02
>
>
>
> And so on. Occasionally, communications will break down to the hardware, such that no reporting occurs. object 286
losescommunications on 12/1/2014. Then the table might look like:
>
>
>
> object_id channel check-in date
>
> ****************************************
>
> 990 1 2014-12-01
>
> 990 1 2014-12-02
>
> 990 2 2014-12-01
>
> 990 2 2014-12-02
>
>
>
> 286 2 2014-12-02
>
>
>
> 286 5 2014-12-02
>
> 4507 1 2014-12-01
>
> 4507 1 2014-12-02
>
> 4507 2 2014-12-01
>
> 4507 2 2014-12-02
>
>
>
> Or let’s say that for some reason, just channel 2 loses reporting for a day. Then we would have:
>
>
>
> object_id channel check-in date
>
> ****************************************
>
> 990 1 2014-12-01
>
> 990 1 2014-12-02
>
> 990 2 2014-12-01
>
> 990 2 2014-12-02
>
>
>
> 286 2 2014-12-02
>
> 286 5 2014-12-01
>
> 286 5 2014-12-02
>
> 4507 1 2014-12-01
>
> 4507 1 2014-12-02
>
> 4507 2 2014-12-01
>
> 4507 2 2014-12-02
>
>
>
> I have a second table (TABLE B) with all of the object_ids and channels that are supposed to be reporting in each
day.For cases where a certain channel does not check in, I want to add a column that indicates the comm failure. So,
for
>
>
>
> object_id channel check-in date comm failure
>
> **********************************************************
>
> 990 1 2014-12-01 No
>
> 990 1 2014-12-02 No
>
> 990 2 2014-12-01 No
>
> 990 2 2014-12-02 No
>
> 286 2 2014-12-01 Yes
>
> 286 2 2014-12-02 No
>
> 286 5 2014-12-01 Yes
>
> 286 5 2014-12-02 No
>
> 4507 1 2014-12-01 No
>
> 4507 1 2014-12-02 No
>
> 4507 2 2014-12-01 No
>
> 4507 2 2014-12-02 No
>
>
>
>
>
> I have been racking my mind for the better part of a day on how to do this. The thing is that I can do a right join
ofTABLE B on TABLE A, and this will populate the missing object ids and channels. However, this only works for a single
>
>
>
> object_id channel check-in date comm failure
>
> **********************************************************
>
> 990 1 2014-12-01 No
>
> 990 1 2014-12-02 No
>
> 990 2 2014-12-01 No
>
> 990 2 2014-12-02 No
>
> 286 2 Yes
>
> 286 2 2014-12-02 No
>
> 286 5 Yes
>
> 286 5 2014-12-02 No
>
> 4507 1 2014-12-01 No
>
> 4507 1 2014-12-02 No
>
> 4507 2 2014-12-01 No
>
> 4507 2 2014-12-02 No
>
>
>
> I need to do a count of comm failures by day, so I need to populate the check-in date field. Please help!
Easiest would be to insert the missing values in your table, something like:
WITH RECURSIVE calendar (missing_date) AS (
SELECT MAX(check_in_date) FROM table_a WHERE comm_failure = 'YES'
UNION ALL
SELECT missing_date + interval '1 day' FROM calendar
WHERE missing_date < CURRENT_DATE
)
INSERT INTO table_a (object_id, channel, check_in_date, comm_failure)
SELECT b.object_id, b.channel. c.missing_date, 'YES'
FROM table_b b, calendar c
WHERE NOT EXISTS (
SELECT 1
FROM table_a a
WHERE a.object_id = b.object_id
AND a.channel = b.channel
AND a.check_in_date = c.missing_date
);
That's off the top of my head, untested, etc, but I think I got that mostly right.
Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.
pgsql-general by date: