Re: Implementing Incremental View Maintenance - Mailing list pgsql-hackers

From Takuma Hoshiai
Subject Re: Implementing Incremental View Maintenance
Date
Msg-id 20190710112938.802310778628d1c6f45d8689@sraoss.co.jp
Whole thread Raw
In response to Re: Implementing Incremental View Maintenance  (Takuma Hoshiai <takuma.hoshiai@gmail.com>)
Responses Re: Implementing Incremental View Maintenance
List pgsql-hackers
On Wed, 10 Jul 2019 11:07:15 +0900
Takuma Hoshiai <takuma.hoshiai@gmail.com> wrote:

> Hi Thomas,
> 
> 2019年7月8日(月) 15:32 Thomas Munro <thomas.munro@gmail.com>:
> 
> > On Fri, Jun 28, 2019 at 10:56 PM Yugo Nagata <nagata@sraoss.co.jp> wrote:
> > > Attached is a WIP patch of IVM which supports some aggregate functions.
> >
> > Hi Nagata-san and Hoshiai-san,
> >
> > Thank you for working on this.  I enjoyed your talk at PGCon.  I've
> > added Kevin Grittner just in case he missed this thread; he has talked
> > often about implementing the counting algorithm, and he wrote the
> > "trigger transition tables" feature to support exactly this.  While
> > integrating trigger transition tables with the new partition features,
> > we had to make a number of decisions about how that should work, and
> > we tried to come up with answers that would work for IMV, and I hope
> > we made the right choices!
> >
> > I am quite interested to learn how IVM interacts with SERIALIZABLE.
> >
> 
>  Nagata-san has been studying this. Nagata-san, any comment?
> 
> > A couple of superficial review comments:
> 
> Thank you for your review comments.
> Please find attached patches. The some of your review is reflected in patch
> too.

Sorry, I forgot attaching patch.
In addition, avg() function is supported newly. We found a issue
when use avg() with IVM, added its reproduction case in
regressio test. We are being to fix now.

> We manage and update IVM on following github repository.
> https://github.com/sraoss/pgsql-ivm
> you also can found latest WIP patch here.
> 
> 
> > +            const char *aggname = get_func_name(aggref->aggfnoid);
> > ...
> > +            else if (!strcmp(aggname, "sum"))
> >
> > I guess you need a more robust way to detect the supported aggregates
> > than their name, or I guess some way for aggregates themselves to
> > specify that they support this and somehow supply the extra logic.
> > Perhaps I just waid what Greg Stark already said, except not as well.
> >
> 
> We have recognized the issue and are welcome any input.
> 
> > +                            elog(ERROR, "Aggrege function %s is not
> > supported", aggname);
> >
> > s/Aggrege/aggregate/
> >
> 
> I fixed this typo.
> 
> > Of course it is not helpful to comment on typos at this early stage,
> > it's just that this one appears many times in the test output :-)
> >
> > +static bool
> > +isIvmColumn(const char *s)
> > +{
> > +    char pre[7];
> > +
> > +     strlcpy(pre, s, sizeof(pre));
> > +    return (strcmp(pre, "__ivm_") == 0);
> > +}
> >
> > What about strncmp(s, "__ivm_", 6) == 0)?
> 
> 
> I agree with you, I fixed it.
> 
> > As for the question of how
> > to reserve a namespace for system columns that won't clash with user
> > columns, according to our manual the SQL standard doesn't allow $ in
> > identifier names, and according to my copy SQL92 "intermediate SQL"
> > doesn't allow identifiers that end in an underscore.  I don't know
> > what the best answer is but we should probably decide on a something
> > based the standard.
> >
> > As for how to make internal columns invisible to SELECT *, previously
> > there have been discussions about doing that using a new flag in
> > pg_attribute:
> >
> >
> >
https://www.postgresql.org/message-id/flat/CAEepm%3D3ZHh%3Dp0nEEnVbs1Dig_UShPzHUcMNAqvDQUgYgcDo-pA%40mail.gmail.com
> >
> > +                            "WITH t AS ("
> > +                            "  SELECT diff.__ivm_count__,
> > (diff.__ivm_count__ = mv.__ivm_count__) AS for_dlt, mv.ctid"
> > +                            ", %s"
> > +                            "  FROM %s AS mv, %s AS diff WHERE (%s) =
> > (%s)"
> > +                            "), updt AS ("
> > +                            "  UPDATE %s AS mv SET __ivm_count__ =
> > mv.__ivm_count__ - t.__ivm_count__"
> > +                            ", %s "
> > +                            "  FROM t WHERE mv.ctid = t.ctid AND NOT
> > for_dlt"
> > +                            ") DELETE FROM %s AS mv USING t WHERE
> > mv.ctid = t.ctid AND for_dlt;",
> >
> > I fully understand that this is POC code, but I am curious about one
> > thing.  These queries that are executed by apply_delta() would need to
> > be converted to C, or at least used reusable plans, right?  Hmm,
> > creating and dropping temporary tables every time is a clue that the
> > ultimate form of this should be tuplestores and C code, I think,
> > right?
> >
> 
> Nagata-san is investing the issue.
> 
> 
> > > Moreover, some regression test are added for aggregate functions support.
> > > This is Hoshiai-san's work.
> >
> > Great.  Next time you post a WIP patch, could you please fix this
> > small compiler warning?
> >
> > describe.c: In function ‘describeOneTableDetails’:
> > describe.c:3270:55: error: ‘*((void *)&tableinfo+48)’ may be used
> > uninitialized in this function [-Werror=maybe-uninitialized]
> > if (verbose && tableinfo.relkind == RELKIND_MATVIEW && tableinfo.isivm)
> > ^
> > describe.c:1495:4: note: ‘*((void *)&tableinfo+48)’ was declared here
> > } tableinfo;
> > ^
> >
> 
> It is fixed too.
> 
> > Then our unofficial automatic CI system[1] will run these tests every
> > day, which sometimes finds problems.
> >
> > [1] cfbot.cputube.org
> >
> > --
> > Thomas Munro
> > https://enterprisedb.com
> >
> >
> Best regards,
> 
> Takuma Hoshiai


-- 
Takuma Hoshiai <hoshiai@sraoss.co.jp>

Attachment

pgsql-hackers by date:

Previous
From: Thomas Munro
Date:
Subject: Re: Index Skip Scan
Next
From: Ryan Lambert
Date:
Subject: Re: [Proposal] Table-level Transparent Data Encryption (TDE) and KeyManagement Service (KMS)