Re: Populating missing dates in postgresql data - Mailing list pgsql-general

From David G. Johnston
Subject Re: Populating missing dates in postgresql data
Date
Msg-id CAKFQuwaaw27EoO7M0z5FemJxHD39RGxhbeixZRY5vPKvExFPiQ@mail.gmail.com
Whole thread Raw
In response to Populating missing dates in postgresql data  ("Lavrenz, Steven M" <slavrenz@purdue.edu>)
Responses Re: Populating missing dates in postgresql data  (Mitu Verma <mitu.verma@ericsson.com>)
List pgsql-general
On Wed, Mar 25, 2015 at 5:25 PM, Lavrenz, Steven M <slavrenz@purdue.edu> wrote:

Alright everyone, this is a doozy of a problem. I am new to Postgres so I appreciate patience/understanding. I have a database of hardware objects, each of which has several different “channels”. Once per day, these channels are supposed to check in with a central server, generating an event log table (TABLE A) like the following:

 

object_id             channel                check-in date

****************************************

990                         1                              2014-12-01

990                         1                              2014-12-02

990                         2                              2014-12-01

990                         2                              2014-12-02

286                         2                              2014-12-01

286                         2                              2014-12-02

286                         5                              2014-12-01

286                         5                              2014-12-02

4507                       1                              2014-12-01

4507                       1                              2014-12-02

4507                       2                              2014-12-01

4507                       2                              2014-12-02

 

And so on. Occasionally, communications will break down to the hardware, such that no reporting occurs. For example, let’s say that object 286 loses communications on 12/1/2014. Then the table might look like:   

 

object_id             channel                check-in date

****************************************

990                         1                              2014-12-01

990                         1                              2014-12-02

990                         2                              2014-12-01

990                         2                              2014-12-02

 

286                         2                              2014-12-02

 

286                         5                              2014-12-02

4507                       1                              2014-12-01

4507                       1                              2014-12-02

4507                       2                              2014-12-01

4507                       2                              2014-12-02

 

Or let’s say that for some reason, just channel 2 loses reporting for a day. Then we would have:

 

object_id             channel                check-in date

****************************************

990                         1                              2014-12-01

990                         1                              2014-12-02

990                         2                              2014-12-01

990                         2                              2014-12-02

 

286                         2                              2014-12-02

286                         5                              2014-12-01

286                         5                              2014-12-02

4507                       1                              2014-12-01

4507                       1                              2014-12-02

4507                       2                              2014-12-01

4507                       2                              2014-12-02

 

I have a second table (TABLE B) with all of the object_ids and channels that are supposed to be reporting in each day. For cases where a certain channel does not check in, I want to add a column that indicates the comm failure. So, for the example where all channels on object 286 do not check in, I would like to get is something like this:

 

object_id             channel                check-in date                     comm failure

**********************************************************

990                         1                              2014-12-01                          No

990                         1                              2014-12-02                          No

990                         2                              2014-12-01                          No

990                         2                              2014-12-02                          No

286                         2                              2014-12-01                          Yes

286                         2                              2014-12-02                          No

286                         5                              2014-12-01                          Yes

286                         5                              2014-12-02                          No

4507                       1                              2014-12-01                          No

4507                       1                              2014-12-02                          No

4507                       2                              2014-12-01                          No

4507                       2                              2014-12-02                          No

 

 

I have been racking my mind for the better part of a day on how to do this. The thing is that I can do a right join of TABLE B on TABLE A, and this will populate the missing object ids and channels. However, this only works for a single day, and it gives me something like:

 

object_id             channel                check-in date                     comm failure

**********************************************************

990                         1                              2014-12-01                          No

990                         1                              2014-12-02                          No

990                         2                              2014-12-01                          No

990                         2                              2014-12-02                          No

286                         2                                                                              Yes

286                         2                              2014-12-02                          No

286                         5                                                                              Yes

286                         5                              2014-12-02                          No

4507                       1                              2014-12-01                          No

4507                       1                              2014-12-02                          No

4507                       2                              2014-12-01                          No

4507                       2                              2014-12-02                          No

 

I need to do a count of comm failures by day, so I need to populate the check-in date field. Please help!

               


​People are much more inclined to help if you make it easy for them.

Read up on "CTE" (WITH clause) in the documentation.  And "VALUES".  Both in the "SELECT" command section.

Use those to create inline versions of your two tables and then provide the query you can get working.

The basic solution is:

WITH actual_checkins (id, channel, date, count_for_day) AS (...)
, expected_checkins (id, channel, date) AS (...)
SELECT id, channel, date, COALESCE(count_for_day, 0) AS number_of_checkins
FROM expected_checkins
LEFT JOIN actual_checkins USING (id, channel, date)

You can convert zero/non-true to "no/yes" via a CASE WHEN count = 0 THEN 'no' ELSE 'yes' END or similar.

David J.

​P.S. the function "generate_series(date, date)" may serve you well

pgsql-general by date:

Previous
From: Adrian Klaver
Date:
Subject: Re: Populating missing dates in postgresql data
Next
From: Mitu Verma
Date:
Subject: Autovacuum query