Thread: PostgreSQL intenal scheduler?

PostgreSQL intenal scheduler?

From

Thiemo Kellner

Date:

05 September 2018, 18:06:07

Hi all

I am designing a framework for historisation implementation (SCD). One  
feature I would like to provide is a table in that the actual state of  
an entity is put and if this is complete, this history table is  
"updated":

       ------------         -------------
==>  | ENTITY_ACT |  ==>  | ENTITY_HIST |
       ------------         -------------

I plan to use instead-of-triggers on the hist table that read the  
actual table and perfoms all necessary inserts und updates on the  
history table. If I want the termination of a record version (actually  
the record of a specific business key with a specific payload) to get  
propagated up and/or down referential integrities (no overlapping  
validities) I have to make sure that only one of those processes is  
modifying a table. I was thinking of a scheduler queue where the  
trigger would put a process request and PostgreSQL would work through.  
Is there a scheduler within PostgreSQL? I read the documentation and  
searched the web but could not find a hint. But before going another  
road or implementing something myself, I ask. Maybe this design is no  
good at all.

Kind regards

Thiemo

Re: PostgreSQL intenal scheduler?

From

Tim Clarke

Date:

05 September 2018, 18:09:04

On 05/09/18 16:06, Thiemo Kellner wrote:
>
> Hi all
>
> I am designing a framework for historisation implementation (SCD). One
> feature I would like to provide is a table in that the actual state of
> an entity is put and if this is complete, this history table is
> "updated":
>
>       ------------         -------------
> ==>  | ENTITY_ACT |  ==>  | ENTITY_HIST |
>       ------------         -------------
>
> I plan to use instead-of-triggers on the hist table that read the
> actual table and perfoms all necessary inserts und updates on the
> history table. If I want the termination of a record version (actually
> the record of a specific business key with a specific payload) to get
> propagated up and/or down referential integrities (no overlapping
> validities) I have to make sure that only one of those processes is
> modifying a table. I was thinking of a scheduler queue where the
> trigger would put a process request and PostgreSQL would work through.
> Is there a scheduler within PostgreSQL? I read the documentation and
> searched the web but could not find a hint. But before going another
> road or implementing something myself, I ask. Maybe this design is no
> good at all.
>
> Kind regards
>
> Thiemo
>
>

Why not just call your "do a scheduled run" code from cron?

Tim Clarke

Re: PostgreSQL intenal scheduler?

From

Thomas Kellerer

Date:

05 September 2018, 18:31:42

Thiemo Kellner schrieb am 05.09.2018 um 17:06:
> I am designing a framework for historisation implementation (SCD).
> One feature I would like to provide is a table in that the actual
> state of an entity is put and if this is complete, this history table
> is "updated":
> 
>       ------------         -------------
> ==>  | ENTITY_ACT |  ==>  | ENTITY_HIST |
>       ------------         -------------
> 

> I plan to use instead-of-triggers on the hist table that read the
> actual table and perfoms all necessary inserts und updates on the
> history table. If I want the termination of a record version
> (actually the record of a specific business key with a specific
> payload) to get propagated up and/or down referential integrities (no
> overlapping validities) I have to make sure that only one of those
> processes is modifying a table. I was thinking of a scheduler queue
> where the trigger would put a process request and PostgreSQL would
> work through. Is there a scheduler within PostgreSQL? I read the
> documentation and searched the web but could not find a hint. But
> before going another road or implementing something myself, I ask.
> Maybe this design is no good at all.


There is no built-in scheduler, but there is an extension that supplies that

https://github.com/citusdata/pg_cron

Re: PostgreSQL intenal scheduler?

From

Thiemo Kellner

Date:

05 September 2018, 22:35:49

I have seen pg_cron but it is not what I am looking for. It schedules  
tasks only by time. I am looking for a fifo queue. pg_cron neither  
prevents from simultaneous runs I believe.

Quoting Thomas Kellerer <spam_eater@gmx.net>:

> There is no built-in scheduler, but there is an extension that supplies that
>
> https://github.com/citusdata/pg_cron

Re: PostgreSQL intenal scheduler?

From

Ron

Date:

06 September 2018, 00:01:03

Maybe https://github.com/chanks/que is what you need.

On 09/05/2018 02:35 PM, Thiemo Kellner wrote:
> I have seen pg_cron but it is not what I am looking for. It schedules 
> tasks only by time. I am looking for a fifo queue. pg_cron neither 
> prevents from simultaneous runs I believe.
>
> Quoting Thomas Kellerer <spam_eater@gmx.net>:
>
>> There is no built-in scheduler, but there is an extension that supplies that
>>
>> https://github.com/citusdata/pg_cron
>
>
>
>

-- 
Angular momentum makes the world go 'round.

Re: PostgreSQL intenal scheduler?

From

Thiemo Kellner

Date:

06 September 2018, 01:03:31

This is a queue but if I am not mistaken, it is outside PostgreSQL  
where to go I am very reluctant. I will look at it in more depth.  
Thanks!

Quoting Ron <ronljohnsonjr@gmail.com>:

> Maybe https://github.com/chanks/que is what you need.
>
> On 09/05/2018 02:35 PM, Thiemo Kellner wrote:
>> I have seen pg_cron but it is not what I am looking for. It  
>> schedules tasks only by time. I am looking for a fifo queue.  
>> pg_cron neither prevents from simultaneous runs I believe.

Re: PostgreSQL intenal scheduler?

From

Thiemo Kellner

Date:

06 September 2018, 01:04:58

Because I am looking for a fifo queue and not for time schedule.  
Thanks anyway.

Quoting Tim Clarke <tim.clarke@minerva-analytics.info>:

>
> Why not just call your "do a scheduled run" code from cron?
>
> Tim Clarke

Re: PostgreSQL intenal scheduler?

From

Jeremy Finzel

Date:

06 September 2018, 08:17:02

If I follow your use case, we have written something that just may fit your scenario and plan to open source it rather soon.

It has several layers but let me boil it down. First we use an open sourced auditing system to log changes to the source tables. This becomes your queue. A postgres background worker will asynchronously process these changes based on your configuration, which is highly configurable. It also handles the concurrency you are questioning.

This allows you to build history tables without requiring you for example to do it directly via a trigger. It also removes redundancy if you have the same key updated multiple times. It assumes we are fine with the data built not being 100% up to date data because these updates obviously don’t all happen in the same transaction as the source data change.

Let me know if this interests you and I can share more.

Thanks,

Jeremy

On Wed, Sep 5, 2018 at 10:07 AM Thiemo Kellner <thiemo@gelassene-pferde.biz> wrote:

Hi all

I am designing a framework for historisation implementation (SCD). One
feature I would like to provide is a table in that the actual state of
an entity is put and if this is complete, this history table is
"updated":

------------ -------------
==> | ENTITY_ACT | ==> | ENTITY_HIST |
------------ -------------

I plan to use instead-of-triggers on the hist table that read the
actual table and perfoms all necessary inserts und updates on the
history table. If I want the termination of a record version (actually
the record of a specific business key with a specific payload) to get
propagated up and/or down referential integrities (no overlapping
validities) I have to make sure that only one of those processes is
modifying a table. I was thinking of a scheduler queue where the
trigger would put a process request and PostgreSQL would work through.
Is there a scheduler within PostgreSQL? I read the documentation and
searched the web but could not find a hint. But before going another
road or implementing something myself, I ask. Maybe this design is no
good at all.

Kind regards

Thiemo

Re: PostgreSQL intenal scheduler?

From

Thiemo Kellner

Date:

06 September 2018, 09:28:37

Thanks for the hint. Your solution seems to be good. However, I am  
designing my framework for fun, to give back something to the  
community and because I think it's time that historisation is not  
reinvented and reimplemented again and again.

Having that said, I think I can get rid of the need to cascade changes  
up and down the refential constraints by the following modelling. I am  
aware that I would restrict the use of the framework to this model.

                                   -------------        --------------------
                             ==>  | ENTITY_HEAD |>--..| PARENT_ENTITIY_HEAD |
                          //       -------------        --------------------
                         //              |
       ------------     //               |
==>  | ENTITY_ACT |  ==                 |
       ------------     \\               |
                         \\             /|\
                          \\      -------------
                            ==>  | ENTITY_HIST |
                                  -------------

The head table only contains the business key, the business surrogate  
key and foreign key attributes to parent tables. Strictly, I can  
contain also payload attributes that never ever ever ever ever ever  
change their value. But who is to guarantee that? However, It is SCD0  
so once a value is entered it never changes.

The history table contains all the other attributes the payload so to  
speak and get the historisation. Version matching between related  
entities could be done by a generated view so no one would have to  
re-invent the wheel.

One could generate the head table from the business key in the actual  
table and generate the hist table from the actual table without the  
business key and a template table with history attributes. Can one  
exclude attributes from inheritance?

Quoting Jeremy Finzel <finzelj@gmail.com>:

> If I follow your use case, we have written something that just may  
> fit your scenario and plan to open source it rather soon.
>
>    
>   It has several layers but let me boil it down. First we use an  
> open sourced auditing system to log changes to the source tables.  
> This becomes your queue. A postgres background worker will  
> asynchronously process these changes based on your configuration,  
> which is highly configurable. It also handles the concurrency you  
> are questioning.
>    
>   This allows you to build history tables without requiring you for  
> example to do it directly via a trigger. It also removes redundancy  
> if you have the same key updated multiple times. It assumes we are  
> fine with the data built not being 100% up to date data because  
> these updates obviously don’t all happen in the same transaction as  
> the source data change.
>    
>   Let me know if this interests you and I can share more.
>    
>   Thanks,
>   Jeremy