Re: Implementing Incremental View Maintenance - Mailing list pgsql-hackers

From Takuma Hoshiai
Subject Re: Implementing Incremental View Maintenance
Date
Msg-id CAB3DWdenkcLuyLgkN7pSObkJNzN0R=jo=E3M2r9hj4dr3vbauw@mail.gmail.com
Whole thread Raw
In response to Re: Implementing Incremental View Maintenance  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: Implementing Incremental View Maintenance  (Takuma Hoshiai <hoshiai@sraoss.co.jp>)
List pgsql-hackers
Hi Thomas,

2019年7月8日(月) 15:32 Thomas Munro <thomas.munro@gmail.com>:
On Fri, Jun 28, 2019 at 10:56 PM Yugo Nagata <nagata@sraoss.co.jp> wrote:
> Attached is a WIP patch of IVM which supports some aggregate functions.

Hi Nagata-san and Hoshiai-san,

Thank you for working on this.  I enjoyed your talk at PGCon.  I've
added Kevin Grittner just in case he missed this thread; he has talked
often about implementing the counting algorithm, and he wrote the
"trigger transition tables" feature to support exactly this.  While
integrating trigger transition tables with the new partition features,
we had to make a number of decisions about how that should work, and
we tried to come up with answers that would work for IMV, and I hope
we made the right choices!

I am quite interested to learn how IVM interacts with SERIALIZABLE.
 
 Nagata-san has been studying this. Nagata-san, any comment?

A couple of superficial review comments:
 
Thank you for your review comments.
Please find attached patches. The some of your review is reflected in patch too.

We manage and update IVM on following github repository.
https://github.com/sraoss/pgsql-ivm
you also can found latest WIP patch here.
 
+            const char *aggname = get_func_name(aggref->aggfnoid);
...
+            else if (!strcmp(aggname, "sum"))

I guess you need a more robust way to detect the supported aggregates
than their name, or I guess some way for aggregates themselves to
specify that they support this and somehow supply the extra logic.
Perhaps I just waid what Greg Stark already said, except not as well.

We have recognized the issue and are welcome any input.

+                            elog(ERROR, "Aggrege function %s is not
supported", aggname);

s/Aggrege/aggregate/
 
I fixed this typo. 

Of course it is not helpful to comment on typos at this early stage,
it's just that this one appears many times in the test output :-)

+static bool
+isIvmColumn(const char *s)
+{
+    char pre[7];
+
+     strlcpy(pre, s, sizeof(pre));
+    return (strcmp(pre, "__ivm_") == 0);
+}

What about strncmp(s, "__ivm_", 6) == 0)? 

I agree with you, I fixed it.

As for the question of how
to reserve a namespace for system columns that won't clash with user
columns, according to our manual the SQL standard doesn't allow $ in
identifier names, and according to my copy SQL92 "intermediate SQL"
doesn't allow identifiers that end in an underscore.  I don't know
what the best answer is but we should probably decide on a something
based the standard.

As for how to make internal columns invisible to SELECT *, previously
there have been discussions about doing that using a new flag in
pg_attribute:

https://www.postgresql.org/message-id/flat/CAEepm%3D3ZHh%3Dp0nEEnVbs1Dig_UShPzHUcMNAqvDQUgYgcDo-pA%40mail.gmail.com

+                            "WITH t AS ("
+                            "  SELECT diff.__ivm_count__,
(diff.__ivm_count__ = mv.__ivm_count__) AS for_dlt, mv.ctid"
+                            ", %s"
+                            "  FROM %s AS mv, %s AS diff WHERE (%s) = (%s)"
+                            "), updt AS ("
+                            "  UPDATE %s AS mv SET __ivm_count__ =
mv.__ivm_count__ - t.__ivm_count__"
+                            ", %s "
+                            "  FROM t WHERE mv.ctid = t.ctid AND NOT for_dlt"
+                            ") DELETE FROM %s AS mv USING t WHERE
mv.ctid = t.ctid AND for_dlt;",

I fully understand that this is POC code, but I am curious about one
thing.  These queries that are executed by apply_delta() would need to
be converted to C, or at least used reusable plans, right?  Hmm,
creating and dropping temporary tables every time is a clue that the
ultimate form of this should be tuplestores and C code, I think,
right?
 
Nagata-san is investing the issue.
 
> Moreover, some regression test are added for aggregate functions support.
> This is Hoshiai-san's work.

Great.  Next time you post a WIP patch, could you please fix this
small compiler warning?

describe.c: In function ‘describeOneTableDetails’:
describe.c:3270:55: error: ‘*((void *)&tableinfo+48)’ may be used
uninitialized in this function [-Werror=maybe-uninitialized]
if (verbose && tableinfo.relkind == RELKIND_MATVIEW && tableinfo.isivm)
^
describe.c:1495:4: note: ‘*((void *)&tableinfo+48)’ was declared here
} tableinfo;
^
 
It is fixed too.

Then our unofficial automatic CI system[1] will run these tests every
day, which sometimes finds problems.

[1] cfbot.cputube.org

--
Thomas Munro
https://enterprisedb.com

 
Best regards,
 
Takuma Hoshiai

pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)
Next
From: Tom Lane
Date:
Subject: Re: coypu: "FATAL: sorry, too many clients already"