Thread: Time series aggregation

Time series aggregation

From
"Yelai, Ramkumar IN BLR STS"
Date:
Hi,
 
In my current project,  we are keep inserting  device related data in 5 minutes basis  in our archiving tables. Soon we added more devices, more number of  rows will be inserted.
 
When we are generating report for 1000 devices for around 1 week of date range, It takes more time to aggregate the data and compute the business logic. Hence we thought to go for time series aggregation like, aggregate the device data by Hourly, daily and monthly and store in to another table.  Since Postgresql is supporting cron job in pgAgent, so I created a jobs to run hourly , daily and monthly basis to aggregate the data.  The jobs are working fine, but if postgresql server is shutdown and resume after some time like 2 hours or 1 day then job are not running from the  last run instead it is running from the current time and return to the normal schedule.
 
Here, how do I run the jobs from last run to current time as per the schedules based on hourly, daily and monthly.
 
Thanks & Regards,
Ramkumar
 
 
 

Re: Time series aggregation

From
Dave Page
Date:
On Wed, Jun 27, 2012 at 7:15 AM, Yelai, Ramkumar IN BLR STS
<ramkumar.yelai@siemens.com> wrote:
> Hi,
>
> In my current project,  we are keep inserting  device related data in 5
> minutes basis  in our archiving tables. Soon we added more devices, more
> number of  rows will be inserted.
>
> When we are generating report for 1000 devices for around 1 week of date
> range, It takes more time to aggregate the data and compute the business
> logic. Hence we thought to go for time series aggregation like, aggregate
> the device data by Hourly, daily and monthly and store in to another table.
> Since Postgresql is supporting cron job in pgAgent, so I created a jobs to
> run hourly , daily and monthly basis to aggregate the data.  The jobs are
> working fine, but if postgresql server is shutdown and resume after some
> time like 2 hours or 1 day then job are not running from the  last run
> instead it is running from the current time and return to the normal
> schedule.

That's by design (following a long discussion here many years ago).

> Here, how do I run the jobs from last run to current time as per the
> schedules based on hourly, daily and monthly.

The easiest way would probably be to reset the
pgagent.pga_job.jobnextrun value to whatever you want it to be when
the server restarts, before pgAgent is restarted.

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Time series aggregation

From
"Yelai, Ramkumar IN BLR STS"
Date:
Thanks Dave.

I have reset the pgagent.pga_job.jobnextrun when server restarts, however it will be reset by pgAgent, when it starts.

Following are the steps, I have done.

1. Around 2:45 PM, I have stopped the server
2. I have started the server at 4:03 PM and I reset the jobnextrun ( from pgAdmin UI ) to 3:00PM.
3. Soon I started the pgAgent at 4:04PM, the jobnextrun is changed to 4:04 PM and it run once.

Here, I lost the 2:00 PM to 3:00 PM aggregation.

Please let me know is there any additional steps I have to be done.

Thanks & Regards,
Ramkumar


-----Original Message-----
From: Dave Page [mailto:dpage@pgadmin.org]
Sent: Wednesday, June 27, 2012 2:14 PM
To: Yelai, Ramkumar IN BLR STS
Cc: pgadmin-support@postgresql.org
Subject: Re: [pgadmin-support] Time series aggregation

On Wed, Jun 27, 2012 at 7:15 AM, Yelai, Ramkumar IN BLR STS <ramkumar.yelai@siemens.com> wrote:
> Hi,
>
> In my current project,  we are keep inserting  device related data in
> 5 minutes basis  in our archiving tables. Soon we added more devices,
> more number of  rows will be inserted.
>
> When we are generating report for 1000 devices for around 1 week of
> date range, It takes more time to aggregate the data and compute the
> business logic. Hence we thought to go for time series aggregation
> like, aggregate the device data by Hourly, daily and monthly and store in to another table.
> Since Postgresql is supporting cron job in pgAgent, so I created a
> jobs to run hourly , daily and monthly basis to aggregate the data. 
> The jobs are working fine, but if postgresql server is shutdown and
> resume after some time like 2 hours or 1 day then job are not running
> from the  last run instead it is running from the current time and
> return to the normal schedule.

That's by design (following a long discussion here many years ago).

> Here, how do I run the jobs from last run to current time as per the
> schedules based on hourly, daily and monthly.

The easiest way would probably be to reset the pgagent.pga_job.jobnextrun value to whatever you want it to be when the
serverrestarts, before pgAgent is restarted. 

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company


Re: Time series aggregation

From
Dave Page
Date:
On Wed, Jun 27, 2012 at 11:44 AM, Yelai, Ramkumar IN BLR STS
<ramkumar.yelai@siemens.com> wrote:
> Thanks Dave.
>
> I have reset the pgagent.pga_job.jobnextrun when server restarts, however it will be reset by pgAgent, when it
starts.
>
> Following are the steps, I have done.
>
> 1. Around 2:45 PM, I have stopped the server
> 2. I have started the server at 4:03 PM and I reset the jobnextrun ( from pgAdmin UI ) to 3:00PM.
> 3. Soon I started the pgAgent at 4:04PM, the jobnextrun is changed to 4:04 PM and it run once.
>
> Here, I lost the 2:00 PM to 3:00 PM aggregation.

The only way I can see for that to happen is if the agent is killed
whilst the job is being processed. That will cause the job's nextrun
time to be reset during zombie clearance when the agent restarts. In
the normal case, the job will be run as soon as the agent enters the
main processing loop, where it will select all jobs with nextrun <=
now() and process them immediately.

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Time series aggregation

From
"Yelai, Ramkumar IN BLR STS"
Date:
Hi Dave,

I use to shutdown the postgresql server by stopping the service, followed by that pgAgent server is automatically
stopped. 

However I am getting the following error message in service logger, while shutdown server.

Postgresql Service error message

The description for Event ID ( 0 ) in Source ( PostgreSQL ) cannot be found. The local computer may not have the
necessaryregistry information or message DLL files to display messages from a remote computer. You may be able to use
the/AUXSOURCE= flag to retrieve this description; see Help and Support for details. The following information is part
ofthe event: FATAL:  terminating connection due to administrator command 

Pgadmin error message

Failed to query jobs table!

The PostgreSQL Scheduling Agent - pgAgent service terminated unexpectedly.  It has done this 7 time(s).

It looks like pgAgent is not properly shutdown as you said.

Would you please help me how do I resolve this issue.

Thanks & Regards,
Ramkumar


-----Original Message-----
From: Dave Page [mailto:dpage@pgadmin.org]
Sent: Wednesday, June 27, 2012 9:44 PM
To: Yelai, Ramkumar IN BLR STS
Cc: pgadmin-support@postgresql.org
Subject: Re: [pgadmin-support] Time series aggregation

On Wed, Jun 27, 2012 at 11:44 AM, Yelai, Ramkumar IN BLR STS <ramkumar.yelai@siemens.com> wrote:
> Thanks Dave.
>
> I have reset the pgagent.pga_job.jobnextrun when server restarts, however it will be reset by pgAgent, when it
starts.
>
> Following are the steps, I have done.
>
> 1. Around 2:45 PM, I have stopped the server 2. I have started the
> server at 4:03 PM and I reset the jobnextrun ( from pgAdmin UI ) to 3:00PM.
> 3. Soon I started the pgAgent at 4:04PM, the jobnextrun is changed to 4:04 PM and it run once.
>
> Here, I lost the 2:00 PM to 3:00 PM aggregation.

The only way I can see for that to happen is if the agent is killed whilst the job is being processed. That will cause
thejob's nextrun time to be reset during zombie clearance when the agent restarts. In the normal case, the job will be
runas soon as the agent enters the main processing loop, where it will select all jobs with nextrun <= 
now() and process them immediately.

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company


Re: Time series aggregation

From
"Yelai, Ramkumar IN BLR STS"
Date:
Hi Dave,

I have stopped the pgAgent service first and stooped the postgres server next then restarted server first and pgAgent
nextafter some time. In this case, I have not seen any issue in pgAgnet service logger. 

However, still I am facing the same problem ( jobnextrun is resetting by pgAgent).

Thanks & Regards,
Ramkumar

-----Original Message-----
From: Yelai, Ramkumar IN BLR STS
Sent: Thursday, June 28, 2012 11:51 AM
To: 'Dave Page'
Cc: pgadmin-support@postgresql.org
Subject: RE: [pgadmin-support] Time series aggregation

Hi Dave,

I use to shutdown the postgresql server by stopping the service, followed by that pgAgent server is automatically
stopped. 

However I am getting the following error message in service logger, while shutdown server.

Postgresql Service error message

The description for Event ID ( 0 ) in Source ( PostgreSQL ) cannot be found. The local computer may not have the
necessaryregistry information or message DLL files to display messages from a remote computer. You may be able to use
the/AUXSOURCE= flag to retrieve this description; see Help and Support for details. The following information is part
ofthe event: FATAL:  terminating connection due to administrator command 

Pgadmin error message

Failed to query jobs table!

The PostgreSQL Scheduling Agent - pgAgent service terminated unexpectedly.  It has done this 7 time(s).

It looks like pgAgent is not properly shutdown as you said.

Would you please help me how do I resolve this issue.

Thanks & Regards,
Ramkumar


-----Original Message-----
From: Dave Page [mailto:dpage@pgadmin.org]
Sent: Wednesday, June 27, 2012 9:44 PM
To: Yelai, Ramkumar IN BLR STS
Cc: pgadmin-support@postgresql.org
Subject: Re: [pgadmin-support] Time series aggregation

On Wed, Jun 27, 2012 at 11:44 AM, Yelai, Ramkumar IN BLR STS <ramkumar.yelai@siemens.com> wrote:
> Thanks Dave.
>
> I have reset the pgagent.pga_job.jobnextrun when server restarts, however it will be reset by pgAgent, when it
starts.
>
> Following are the steps, I have done.
>
> 1. Around 2:45 PM, I have stopped the server 2. I have started the
> server at 4:03 PM and I reset the jobnextrun ( from pgAdmin UI ) to 3:00PM.
> 3. Soon I started the pgAgent at 4:04PM, the jobnextrun is changed to 4:04 PM and it run once.
>
> Here, I lost the 2:00 PM to 3:00 PM aggregation.

The only way I can see for that to happen is if the agent is killed whilst the job is being processed. That will cause
thejob's nextrun time to be reset during zombie clearance when the agent restarts. In the normal case, the job will be
runas soon as the agent enters the main processing loop, where it will select all jobs with nextrun <= 
now() and process them immediately.

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company


Re: Time series aggregation

From
Dave Page
Date:
On Thu, Jun 28, 2012 at 7:21 AM, Yelai, Ramkumar IN BLR STS
<ramkumar.yelai@siemens.com> wrote:
> Hi Dave,
>
> I use to shutdown the postgresql server by stopping the service, followed by that pgAgent server is automatically
stopped.
>
> However I am getting the following error message in service logger, while shutdown server.
>
> Postgresql Service error message
>
> The description for Event ID ( 0 ) in Source ( PostgreSQL ) cannot be found. The local computer may not have the
necessaryregistry information or message DLL files to display messages from a remote computer. You may be able to use
the/AUXSOURCE= flag to retrieve this description; see Help and Support for details. The following information is part
ofthe event: FATAL:  terminating connection due to administrator command 
>
> Pgadmin error message
>
> Failed to query jobs table!
>
> The PostgreSQL Scheduling Agent - pgAgent service terminated unexpectedly.  It has done this 7 time(s).
>
> It looks like pgAgent is not properly shutdown as you said.

Well that's what happened this time - you killed the database
underneath it which it doesn't particularly like. That's not quite the
issue I was describing though; I was saying that the only time the
agent resets the nextrun time at startup is if it gets shutdown when
the job is running (which can be because the database server dies
underneath it, or the agent itself is shutdown). That results in a
"zombie", ie. a job which is assigned to an agent that's no longer
running. We really have no choice but to reset in that case because
it's safer not to re-run a job that may have partially completed, than
to run it and potentially damage data. This way, the user can manually
cleanup any fallout from the partially completed job.

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company