Thread: Time series aggregation
Hi,
In my current project, we are keep inserting device related data in 5 minutes basis in our archiving tables. Soon we added more devices, more number of rows will be inserted.
When we are generating report for 1000 devices for around 1 week of date range, It takes more time to aggregate the data and compute the business logic. Hence we thought to go for time series aggregation like, aggregate the device data by Hourly, daily and monthly and store in to another table. Since Postgresql is supporting cron job in pgAgent, so I created a jobs to run hourly , daily and monthly basis to aggregate the data. The jobs are working fine, but if postgresql server is shutdown and resume after some time like 2 hours or 1 day then job are not running from the last run instead it is running from the current time and return to the normal schedule.
Here, how do I run the jobs from last run to current time as per the schedules based on hourly, daily and monthly.
Thanks & Regards,
Ramkumar
On Wed, Jun 27, 2012 at 7:15 AM, Yelai, Ramkumar IN BLR STS <ramkumar.yelai@siemens.com> wrote: > Hi, > > In my current project, we are keep inserting device related data in 5 > minutes basis in our archiving tables. Soon we added more devices, more > number of rows will be inserted. > > When we are generating report for 1000 devices for around 1 week of date > range, It takes more time to aggregate the data and compute the business > logic. Hence we thought to go for time series aggregation like, aggregate > the device data by Hourly, daily and monthly and store in to another table. > Since Postgresql is supporting cron job in pgAgent, so I created a jobs to > run hourly , daily and monthly basis to aggregate the data. The jobs are > working fine, but if postgresql server is shutdown and resume after some > time like 2 hours or 1 day then job are not running from the last run > instead it is running from the current time and return to the normal > schedule. That's by design (following a long discussion here many years ago). > Here, how do I run the jobs from last run to current time as per the > schedules based on hourly, daily and monthly. The easiest way would probably be to reset the pgagent.pga_job.jobnextrun value to whatever you want it to be when the server restarts, before pgAgent is restarted. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Thanks Dave. I have reset the pgagent.pga_job.jobnextrun when server restarts, however it will be reset by pgAgent, when it starts. Following are the steps, I have done. 1. Around 2:45 PM, I have stopped the server 2. I have started the server at 4:03 PM and I reset the jobnextrun ( from pgAdmin UI ) to 3:00PM. 3. Soon I started the pgAgent at 4:04PM, the jobnextrun is changed to 4:04 PM and it run once. Here, I lost the 2:00 PM to 3:00 PM aggregation. Please let me know is there any additional steps I have to be done. Thanks & Regards, Ramkumar -----Original Message----- From: Dave Page [mailto:dpage@pgadmin.org] Sent: Wednesday, June 27, 2012 2:14 PM To: Yelai, Ramkumar IN BLR STS Cc: pgadmin-support@postgresql.org Subject: Re: [pgadmin-support] Time series aggregation On Wed, Jun 27, 2012 at 7:15 AM, Yelai, Ramkumar IN BLR STS <ramkumar.yelai@siemens.com> wrote: > Hi, > > In my current project, we are keep inserting device related data in > 5 minutes basis in our archiving tables. Soon we added more devices, > more number of rows will be inserted. > > When we are generating report for 1000 devices for around 1 week of > date range, It takes more time to aggregate the data and compute the > business logic. Hence we thought to go for time series aggregation > like, aggregate the device data by Hourly, daily and monthly and store in to another table. > Since Postgresql is supporting cron job in pgAgent, so I created a > jobs to run hourly , daily and monthly basis to aggregate the data. > The jobs are working fine, but if postgresql server is shutdown and > resume after some time like 2 hours or 1 day then job are not running > from the last run instead it is running from the current time and > return to the normal schedule. That's by design (following a long discussion here many years ago). > Here, how do I run the jobs from last run to current time as per the > schedules based on hourly, daily and monthly. The easiest way would probably be to reset the pgagent.pga_job.jobnextrun value to whatever you want it to be when the serverrestarts, before pgAgent is restarted. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Jun 27, 2012 at 11:44 AM, Yelai, Ramkumar IN BLR STS <ramkumar.yelai@siemens.com> wrote: > Thanks Dave. > > I have reset the pgagent.pga_job.jobnextrun when server restarts, however it will be reset by pgAgent, when it starts. > > Following are the steps, I have done. > > 1. Around 2:45 PM, I have stopped the server > 2. I have started the server at 4:03 PM and I reset the jobnextrun ( from pgAdmin UI ) to 3:00PM. > 3. Soon I started the pgAgent at 4:04PM, the jobnextrun is changed to 4:04 PM and it run once. > > Here, I lost the 2:00 PM to 3:00 PM aggregation. The only way I can see for that to happen is if the agent is killed whilst the job is being processed. That will cause the job's nextrun time to be reset during zombie clearance when the agent restarts. In the normal case, the job will be run as soon as the agent enters the main processing loop, where it will select all jobs with nextrun <= now() and process them immediately. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Hi Dave, I use to shutdown the postgresql server by stopping the service, followed by that pgAgent server is automatically stopped. However I am getting the following error message in service logger, while shutdown server. Postgresql Service error message The description for Event ID ( 0 ) in Source ( PostgreSQL ) cannot be found. The local computer may not have the necessaryregistry information or message DLL files to display messages from a remote computer. You may be able to use the/AUXSOURCE= flag to retrieve this description; see Help and Support for details. The following information is part ofthe event: FATAL: terminating connection due to administrator command Pgadmin error message Failed to query jobs table! The PostgreSQL Scheduling Agent - pgAgent service terminated unexpectedly. It has done this 7 time(s). It looks like pgAgent is not properly shutdown as you said. Would you please help me how do I resolve this issue. Thanks & Regards, Ramkumar -----Original Message----- From: Dave Page [mailto:dpage@pgadmin.org] Sent: Wednesday, June 27, 2012 9:44 PM To: Yelai, Ramkumar IN BLR STS Cc: pgadmin-support@postgresql.org Subject: Re: [pgadmin-support] Time series aggregation On Wed, Jun 27, 2012 at 11:44 AM, Yelai, Ramkumar IN BLR STS <ramkumar.yelai@siemens.com> wrote: > Thanks Dave. > > I have reset the pgagent.pga_job.jobnextrun when server restarts, however it will be reset by pgAgent, when it starts. > > Following are the steps, I have done. > > 1. Around 2:45 PM, I have stopped the server 2. I have started the > server at 4:03 PM and I reset the jobnextrun ( from pgAdmin UI ) to 3:00PM. > 3. Soon I started the pgAgent at 4:04PM, the jobnextrun is changed to 4:04 PM and it run once. > > Here, I lost the 2:00 PM to 3:00 PM aggregation. The only way I can see for that to happen is if the agent is killed whilst the job is being processed. That will cause thejob's nextrun time to be reset during zombie clearance when the agent restarts. In the normal case, the job will be runas soon as the agent enters the main processing loop, where it will select all jobs with nextrun <= now() and process them immediately. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Hi Dave, I have stopped the pgAgent service first and stooped the postgres server next then restarted server first and pgAgent nextafter some time. In this case, I have not seen any issue in pgAgnet service logger. However, still I am facing the same problem ( jobnextrun is resetting by pgAgent). Thanks & Regards, Ramkumar -----Original Message----- From: Yelai, Ramkumar IN BLR STS Sent: Thursday, June 28, 2012 11:51 AM To: 'Dave Page' Cc: pgadmin-support@postgresql.org Subject: RE: [pgadmin-support] Time series aggregation Hi Dave, I use to shutdown the postgresql server by stopping the service, followed by that pgAgent server is automatically stopped. However I am getting the following error message in service logger, while shutdown server. Postgresql Service error message The description for Event ID ( 0 ) in Source ( PostgreSQL ) cannot be found. The local computer may not have the necessaryregistry information or message DLL files to display messages from a remote computer. You may be able to use the/AUXSOURCE= flag to retrieve this description; see Help and Support for details. The following information is part ofthe event: FATAL: terminating connection due to administrator command Pgadmin error message Failed to query jobs table! The PostgreSQL Scheduling Agent - pgAgent service terminated unexpectedly. It has done this 7 time(s). It looks like pgAgent is not properly shutdown as you said. Would you please help me how do I resolve this issue. Thanks & Regards, Ramkumar -----Original Message----- From: Dave Page [mailto:dpage@pgadmin.org] Sent: Wednesday, June 27, 2012 9:44 PM To: Yelai, Ramkumar IN BLR STS Cc: pgadmin-support@postgresql.org Subject: Re: [pgadmin-support] Time series aggregation On Wed, Jun 27, 2012 at 11:44 AM, Yelai, Ramkumar IN BLR STS <ramkumar.yelai@siemens.com> wrote: > Thanks Dave. > > I have reset the pgagent.pga_job.jobnextrun when server restarts, however it will be reset by pgAgent, when it starts. > > Following are the steps, I have done. > > 1. Around 2:45 PM, I have stopped the server 2. I have started the > server at 4:03 PM and I reset the jobnextrun ( from pgAdmin UI ) to 3:00PM. > 3. Soon I started the pgAgent at 4:04PM, the jobnextrun is changed to 4:04 PM and it run once. > > Here, I lost the 2:00 PM to 3:00 PM aggregation. The only way I can see for that to happen is if the agent is killed whilst the job is being processed. That will cause thejob's nextrun time to be reset during zombie clearance when the agent restarts. In the normal case, the job will be runas soon as the agent enters the main processing loop, where it will select all jobs with nextrun <= now() and process them immediately. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Jun 28, 2012 at 7:21 AM, Yelai, Ramkumar IN BLR STS <ramkumar.yelai@siemens.com> wrote: > Hi Dave, > > I use to shutdown the postgresql server by stopping the service, followed by that pgAgent server is automatically stopped. > > However I am getting the following error message in service logger, while shutdown server. > > Postgresql Service error message > > The description for Event ID ( 0 ) in Source ( PostgreSQL ) cannot be found. The local computer may not have the necessaryregistry information or message DLL files to display messages from a remote computer. You may be able to use the/AUXSOURCE= flag to retrieve this description; see Help and Support for details. The following information is part ofthe event: FATAL: terminating connection due to administrator command > > Pgadmin error message > > Failed to query jobs table! > > The PostgreSQL Scheduling Agent - pgAgent service terminated unexpectedly. It has done this 7 time(s). > > It looks like pgAgent is not properly shutdown as you said. Well that's what happened this time - you killed the database underneath it which it doesn't particularly like. That's not quite the issue I was describing though; I was saying that the only time the agent resets the nextrun time at startup is if it gets shutdown when the job is running (which can be because the database server dies underneath it, or the agent itself is shutdown). That results in a "zombie", ie. a job which is assigned to an agent that's no longer running. We really have no choice but to reset in that case because it's safer not to re-run a job that may have partially completed, than to run it and potentially damage data. This way, the user can manually cleanup any fallout from the partially completed job. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company