Thread: Incremental aggregate/rollup strategy advice

Incremental aggregate/rollup strategy advice

From
Morris de Oryx
Date:
I'm researching strategies for incrementally updating aggregate/rollup tables. The problem is how to do so without double-counting changes, and not skipping changes. I know enough about concurrency issues to ask the question, but do not know enough about the features and details of Postgres' concurrency management to figure out a 100% reliable solution without some help. And, with concurrency-related stuff, you're either 100% right or you're buggy.

And thanks in advance to anyone who can help out. I'm not good at writing short :( I've tried to put in enough detail to get to the point, which is "how do I find unprocessed records without missing any."

Okay, the setup is that we've got a lot of tables where we would like to do incremental aggregates. To simplify things, mostly these are INSERT-only tables, sometimes UPDATE, not worrying about DELETE yet. A couple of strategies I'd like to avoid:

* Full queries will take too long and will scale poorly, so a MATERIALIZED VIEW is unappealing. That leaves rollup tables, since it's possible to update them incrementally.

* We may have multiple aggregates off the same base data, and may change them over time. So, putting some kind of flag field in the source table doesn't really fit. 

* I was thinking about a posting/diff/delta/audit-like table, but that's a pretty "heavy" solution. You need some kind of statement-level AFTER INSERT/UPDATE trigger to push over the data that's needed to update the aggregates. Which, again, means the source table needs to know what aggregations are going to take place. Plus, it's just a ton of churn and extra data...when all of the necessary data exists in the source table already.

* I saw one strategy that looks good from the folks at CitusData:

Briefly, they use a bigserial counter which, I guess, is not transaction-bound so that record insertions have a chronological stamp. 1, 2, 3, etc. This is a design familiar to me from other environments and is sometimes called a "concurrency ID." In our case, we need to support UPDATE as well, so I don't think the sequence idea will work (?) To make this more concrete, here's a simplified table with source data:

CREATE TABLE "error_report" (
"id" uuid NOT NULL DEFAULT extensions.gen_random_uuid(), -- We've got distributed sources, so UUIDs for IDs. 
"error_name" text NOT NULL DEFAULT false,                -- Something we'll summarize by.
"facility_id" uuid NOT NULL DEFAULT NULL,                -- Something we'll summarize by.
"error_dts" timestamptz NOT NULL DEFAULT NULL,           -- Set on the source machine in UTC
"last_updated_dts" timestamptz NOT NULL DEFAULT NULL);   -- Set on Postgres after INSERT or UPDATE.

The idea is that you have a stable number line, as a number or a timestamp. We use timestamptz and store everything in UTC. Otherwise, it's the same basic idea as what the CitusData folks describe: you have an ever-increasing number line so that you can mark where you've processed to. This way, you can fetch unprocessed rows without missing any, without a flag field in the source table, and without an audit table/change queue of any kind. I've simplified the timestamps below for legibility, as this is the crux of my question about Postgres specifics. And just pretend that these rows are all on page 0...I've faked the ctid values to make the rows easier to keep track of.

ctid   last_updated_dts
(0,1)  2018-09-25 05:53:00 
(0,2)  2018-09-25 05:54:00
(0,3)  2018-09-25 05:55:00
(0,4)  2018-09-26 02:23:00
(0,5)  2018-09-26 03:14:00
(0,6)  2018-09-26 03:15:00
(0,7)  2018-09-28 05:10:00
(0,8)  2018-09-28 05:14:00
(0,9)  2018-09-28 05:15:00
(0,10) 2018-09-28 05:15:00

You need a small utility table to hold details about which records you've aggregated or processed.

CREATE TABLE "rollup_status" (
"id" uuid NOT NULL DEFAULT extensions.gen_random_uuid(), -- We use UUIDs, not necessary here, but it's what we use. 
"rollup_name" text NOT NULL DEFAULT false,               
"last_processed_dts" timestamptz NOT NULL DEFAULT NULL); -- Marks the last timestamp processed.

Now imagine that I've got a rollup_status record

rollup_name         last_processed_dts
error_name_counts   2018-09-26 02:23:00

If I search for rows that were modified after that "processed until" timestamp, I get these:

ctid   last_updated_dts
(0,5)  2018-09-26 03:14:00
(0,6)  2018-09-26 03:15:00
(0,7)  2018-09-28 05:10:00
(0,8)  2018-09-28 05:14:00
(0,9)  2018-09-28 05:15:00
(0,10) 2018-09-28 05:15:00

And then update the rollup_status record with max(last_updated_dts) from the batch:

rollup_name         last_processed_dts
error_name_counts   2018-09-28 05:15:00
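
In SQL, I'm picturing each sweep as something roughly like this. This is only a sketch: error_name_rollup is a hypothetical target table with a unique key on error_name, and the literal timestamp stands in for the value read back from rollup_status.

BEGIN;

-- 1. Read the high-water mark for this rollup.
SELECT last_processed_dts
  FROM rollup_status
 WHERE rollup_name = 'error_name_counts';   -- say it returns 2018-09-26 02:23:00

-- 2. Fold everything after that mark into the rollup table.
--    (Counting like this assumes INSERT-only rows; UPDATEs would need more care.)
INSERT INTO error_name_rollup (error_name, error_count)
SELECT error_name, count(*)
  FROM error_report
 WHERE last_updated_dts > '2018-09-26 02:23:00'
 GROUP BY error_name
ON CONFLICT (error_name) DO UPDATE
   SET error_count = error_name_rollup.error_count + EXCLUDED.error_count;

-- 3. Advance the mark to the newest timestamp just processed.
UPDATE rollup_status
   SET last_processed_dts = (SELECT max(last_updated_dts)
                               FROM error_report
                              WHERE last_updated_dts > '2018-09-26 02:23:00')
 WHERE rollup_name = 'error_name_counts';

COMMIT;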

So, I got a chunk of the timeline, recorded how far I went, and processed those records. The beauty part of this technique, if I can get it implemented correctly, is that it doesn't have to block new records. While I'm processing those 5 (or 5K) rows, new records can be added onto the end of error_report and, so long as they have a timestamp later than rollup_status.last_processed_dts, I'll find them on the next sweep. And if I want to add a different rollup? There's no need to do _anything_ to the error_report table. Imagine two new rows have come into the error_report table:

(0,11) 2018-09-28 05:17:00
(0,12) 2018-09-28 05:18:00

The new rollup runs, processes all 12 rows, and now there are two rollup_status entries:

rollup_name         last_processed_dts
error_name_counts   2018-09-28 05:15:00
facility_avgs       2018-09-28 05:18:00

Same timeline, different uses, different last-position-processed stored in rollup_status.last_processed_dts.

Okay, this finally brings me to the Postgres-specific question.

     How do I make sure that a transaction can't commit with a timestamp earlier than one I've already processed past?

I've been digging into this and have seen and considered a few things, but I'm not sure which, if any, will work.

* Use a statement-level AFTER trigger with a transition table (cool feature!) to stamp last_updated_dts with clock_timestamp(). That function in particular, as I've now learned that most timestamp functions in a trigger return the transaction _start_ time, not the _commit_ time. I need the transaction commit time, which brings us to the next idea. (There's a rough sketch of a stamping trigger after this list.)

* pg_xact_commit_timestamp, which, if I understand it correctly, acts like an additional, invisible system column that stores the transaction commit timestamp as a timestamptz. That sounds perfectly matched to my requirements, but:

-- I've failed to find _any_ detailed discussion of this feature, which is unusual with Postgres.

-- I'm not 100% sure it will work. Do I need some kind of lock or notification while running the aggregate to button things down?

-- I'm not wild about invisible columns as they're obscure to anyone else.

-- I'm not wild about bolting an extra 8 bytes onto every row in every table. And can this value be indexed? I think it would need to be.

-- I've seen mention that the timestamps aren't retained indefinitely.

I don't need the timestamp as such; it's just meant to order things along the number line. I'll put a BRIN index on error_report.last_updated_dts (and on similar fields in other tables). I don't think I would need an index on the pg_xact_commit_timestamp value, and it isn't directly supported. But if it were necessary, I guess you could build one with an AT TIME ZONE cast to convert the value to something immutable for the index engine.
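
For reference, the BRIN index I have in mind is just this (the index name is made up):

CREATE INDEX error_report_last_updated_brin
    ON error_report USING brin (last_updated_dts);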

* The xmin system column. I don't think that this would work. While transaction IDs always increase, they aren't going to commit in that order. 

* The ctid system column. I've seen this mentioned, but I honestly don't follow how you could use this to reliably detect all new and modified rows.
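
And for completeness, here's roughly what I mean by the stamping trigger in the first idea above. This is a sketch with made-up names. Note that I've written it as a row-level BEFORE trigger, since transition tables in AFTER triggers are read-only and can't stamp the rows directly:

CREATE OR REPLACE FUNCTION stamp_last_updated()
RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- clock_timestamp() keeps advancing during a transaction, unlike
    -- now()/transaction_timestamp(), which are pinned to transaction start.
    NEW.last_updated_dts := clock_timestamp();
    RETURN NEW;
END;
$$;

CREATE TRIGGER error_report_stamp
    BEFORE INSERT OR UPDATE ON error_report
    FOR EACH ROW
    EXECUTE FUNCTION stamp_last_updated();

-- Caveat: this is still the row-write time, not the commit time, so the
-- question above remains: a long transaction can commit rows stamped
-- earlier than timestamps a sweep has already processed past.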

I will be very grateful for any help or suggestions. I'm out over my skis on some of the details on Postgres specifics, so I won't take corrections the wrong way.

Re: Incremental aggregate/rollup strategy advice

From
Tatsuo Ishii
Date:
> * Full queries will take too long, and will scale poorly. So, MATERIALIZED
> VIEW is unappealing. So, rollup tables as it's possible to update them
> incrementally.

F.Y.I. There is a proposal to implement incremental updates of
MATERIALIZED VIEWs. It is still a WIP patch, but currently it supports
count and sum.

https://www.postgresql.org/message-id/20190628195620.c306e3003a83bb85a12f54c5%40sraoss.co.jp

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: Incremental aggregate/rollup strategy advice

From
Morris de Oryx
Date:
Tatsuo,

Thank you for your response; I have followed the discussion on Hackers with interest. I hope that your efforts are a great success! In my case, I need to find a solution available in shipping versions of Postgres. But, since you've joined in, I'm curious: What is the advantage of a materialized view over a real table? It seems like the update semantics and mechanics are more straightforward with a table.


RE: Incremental aggregate/rollup strategy advice

From
Steven Winfield
Date:

There are a couple of extensions that might help you:

 

PipelineDB[1]: Their “Continuous Views” could be useful. A big caveat here is that PipelineDB’s future is uncertain[2], though.

I haven’t used it myself, but it looks like you could put triggers onto your existing tables to insert data into PipelineDB “Streams”, then build whatever continuous views are needed around those streams.

 

The other is TimescaleDB[3], which has “Continuous Aggregates”, but they are fairly new and currently have some limitations.

 

Steve.

 

[1] https://www.pipelinedb.com/

[2] https://www.pipelinedb.com/blog/pipelinedb-is-joining-confluent

[3] https://www.timescale.com/

 






Re: Incremental aggregate/rollup strategy advice

From
Morris de Oryx
Date:
Thanks Steven, nice suggestions. I should have mentioned that the deployment setup is on RDS on PG 11.x, which rules out those extensions. I've looked at TimescaleDB several times, and it looks pretty great.

I've now read through some of the archives from years back, when pg_xact_commit_timestamp was still in development, and I'm thinking it might be the right solution. I'm still not clear on how long the timestamps are retained, or what the overhead is. It sounds like multiple commits might get exactly the same timestamp, but that's fine for me. So long as they're never in the past, it doesn't matter how many timestamps are identical.
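
In case it helps anyone searching later, this is the sort of query I'm imagining. A sketch only: it assumes track_commit_timestamp = on was already set (it needs a server restart, and on RDS it's changed through the parameter group) before the rows of interest were committed.

-- Commit timestamps are looked up through the row's xmin. Rows committed
-- before track_commit_timestamp was enabled, or whose xids have since been
-- frozen away, won't have a timestamp available.
SELECT pg_xact_commit_timestamp(e.xmin) AS committed_at,
       e.*
  FROM error_report e
 WHERE pg_xact_commit_timestamp(e.xmin) > '2018-09-26 02:23:00'
 ORDER BY committed_at;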


Re: Incremental aggregate/rollup strategy advice

From
Tatsuo Ishii
Date:
> Thank you for your response, I have followed the discussion on Hackers with
> interest. I hope that your efforts are a great success! In my case, I need
> to find a solution available in shipping versions of Postgres. But, since
> you've joined in, I'm curious: What is the advantage of a materialized view
> over a real table? It seems like the update semantics and mechanics are
> more straightforward with a table.

In my understanding, views and materialized views give users a more
flexible and easy way to access base tables. In an RDB, base tables are
usually heavily normalized, and it may not be easy for applications to
extract information from them. By defining views, apps have a convenient
and intuitive way to get information from the base tables.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp