Re: More efficient RI checks - take 2 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: More efficient RI checks - take 2
Date
Msg-id 20200630011729.mr25bmmbvsattxe2@alap3.anarazel.de
Whole thread Raw
In response to Re: More efficient RI checks - take 2  (Antonin Houska <ah@cybertec.at>)
List pgsql-hackers
Hi,

I was looking at this patch with Corey during a patch-review session. So
these are basically our "combined" comments.


On 2020-06-05 17:16:43 +0200, Antonin Houska wrote:
> From 6c1cb8ae7fbf0a8122d8c6637c61b9915bc57223 Mon Sep 17 00:00:00 2001
> From: Antonin Houska <ah@cybertec.at>
> Date: Fri, 5 Jun 2020 16:42:34 +0200
> Subject: [PATCH 1/5] Check for RI violation outside ri_PerformCheck().

Probably good to add a short comment to the commit explaining why you're
doing this.

The change makes sense to me. Unless somebody protests I think we should
just apply it regardless of the rest of the series - the code seems
clearer afterwards.


> From 6b09e5598553c8e57b4ef9342912f51adb48f8af Mon Sep 17 00:00:00 2001
> From: Antonin Houska <ah@cybertec.at>
> Date: Fri, 5 Jun 2020 16:42:34 +0200
> Subject: [PATCH 2/5] Changed ri_GenerateQual() so it generates the whole
>  qualifier.
> 
> This way we can use the function to reduce the amount of copy&pasted code a
> bit.


>  /*
> - * ri_GenerateQual --- generate a WHERE clause equating two variables
> + * ri_GenerateQual --- generate WHERE/ON clause.
> + *
> + * Note: to avoid unnecessary explicit casts, make sure that the left and
> + * right operands match eq_oprs expect (ie don't swap the left and right
> + * operands accidentally).
> + */
> +static void
> +ri_GenerateQual(StringInfo buf, char *sep, int nkeys,
> +                const char *ltabname, Relation lrel,
> +                const int16 *lattnums,
> +                const char *rtabname, Relation rrel,
> +                const int16 *rattnums,
> +                const Oid *eq_oprs,
> +                GenQualParams params,
> +                Oid *paramtypes)
> +{
> +    for (int i = 0; i < nkeys; i++)
> +    {
> +        Oid            ltype = RIAttType(lrel, lattnums[i]);
> +        Oid            rtype = RIAttType(rrel, rattnums[i]);
> +        Oid            lcoll = RIAttCollation(lrel, lattnums[i]);
> +        Oid            rcoll = RIAttCollation(rrel, rattnums[i]);
> +        char        paramname[16];
> +        char       *latt,
> +                   *ratt;
> +        char       *sep_current = i > 0 ? sep : NULL;
> +
> +        if (params != GQ_PARAMS_NONE)
> +            sprintf(paramname, "$%d", i + 1);
> +
> +        if (params == GQ_PARAMS_LEFT)
> +        {
> +            latt = paramname;
> +            paramtypes[i] = ltype;
> +        }
> +        else
> +            latt = ri_ColNameQuoted(ltabname, RIAttName(lrel, lattnums[i]));
> +
> +        if (params == GQ_PARAMS_RIGHT)
> +        {
> +            ratt = paramname;
> +            paramtypes[i] = rtype;
> +        }
> +        else
> +            ratt = ri_ColNameQuoted(rtabname, RIAttName(rrel, rattnums[i]));


Why do we need support for having params on left or right side, instead
of just having them on one side?


> +        ri_GenerateQualComponent(buf, sep_current, latt, ltype, eq_oprs[i],
> +                                 ratt, rtype);
> +
> +        if (lcoll != rcoll)
> +            ri_GenerateQualCollation(buf, lcoll);
> +    }
> +}
> +
> +/*
> + * ri_GenerateQual --- generate a component of WHERE/ON clause equating two
> + * variables, to be AND-ed to the other components.
>   *
>   * This basically appends " sep leftop op rightop" to buf, adding casts
>   * and schema qualification as needed to ensure that the parser will select
> @@ -1828,17 +1802,86 @@ quoteRelationName(char *buffer, Relation rel)
>   * if they aren't variables or parameters.
>   */
>  static void
> -ri_GenerateQual(StringInfo buf,
> -                const char *sep,
> -                const char *leftop, Oid leftoptype,
> -                Oid opoid,
> -                const char *rightop, Oid rightoptype)
> +ri_GenerateQualComponent(StringInfo buf,
> +                         const char *sep,
> +                         const char *leftop, Oid leftoptype,
> +                         Oid opoid,
> +                         const char *rightop, Oid rightoptype)
>  {
> -    appendStringInfo(buf, " %s ", sep);
> +    if (sep)
> +        appendStringInfo(buf, " %s ", sep);
>      generate_operator_clause(buf, leftop, leftoptype, opoid,
>                               rightop, rightoptype);
>  }

Why is this handled inside ri_GenerateQualComponent() instead of
ri_GenerateQual()? Especially because the latter now has code to pass in
a different sep into ri_GenerateQualComponent().


> +/*
> + * ri_ColNameQuoted() --- return column name, with both table and column name
> + * quoted.
> + */
> +static char *
> +ri_ColNameQuoted(const char *tabname, const char *attname)
> +{
> +    char        quoted[MAX_QUOTED_NAME_LEN];
> +    StringInfo    result = makeStringInfo();
> +
> +    if (tabname && strlen(tabname) > 0)
> +    {
> +        quoteOneName(quoted, tabname);
> +        appendStringInfo(result, "%s.", quoted);
> +    }
> +
> +    quoteOneName(quoted, attname);
> +    appendStringInfoString(result, quoted);
> +
> +    return result->data;
> +}

Why does this new function accept a NULL / zero length string? I guess
that's because we currently don't qualify in all places?


> +/*
> + * Check that RI trigger function was called in expected context
> + */
> +static void
> +ri_CheckTrigger(FunctionCallInfo fcinfo, const char *funcname, int tgkind)
> +{
> +    TriggerData *trigdata = (TriggerData *) fcinfo->context;
> +
> +    if (!CALLED_AS_TRIGGER(fcinfo))
> +        ereport(ERROR,
> +                (errcode(ERRCODE_E_R_I_E_TRIGGER_PROTOCOL_VIOLATED),
> +                 errmsg("function \"%s\" was not called by trigger manager", funcname)));
> +
> +    /*
> +     * Check proper event
> +     */
> +    if (!TRIGGER_FIRED_AFTER(trigdata->tg_event) ||
> +        !TRIGGER_FIRED_FOR_ROW(trigdata->tg_event))
> +        ereport(ERROR,
> +                (errcode(ERRCODE_E_R_I_E_TRIGGER_PROTOCOL_VIOLATED),
> +                 errmsg("function \"%s\" must be fired AFTER ROW", funcname)));
> +
> +    switch (tgkind)
> +    {
> +        case RI_TRIGTYPE_INSERT:
> +            if (!TRIGGER_FIRED_BY_INSERT(trigdata->tg_event))
> +                ereport(ERROR,
> +                        (errcode(ERRCODE_E_R_I_E_TRIGGER_PROTOCOL_VIOLATED),
> +                         errmsg("function \"%s\" must be fired for INSERT", funcname)));
> +            break;
> +        case RI_TRIGTYPE_UPDATE:
> +            if (!TRIGGER_FIRED_BY_UPDATE(trigdata->tg_event))
> +                ereport(ERROR,
> +                        (errcode(ERRCODE_E_R_I_E_TRIGGER_PROTOCOL_VIOLATED),
> +                         errmsg("function \"%s\" must be fired for UPDATE", funcname)));
> +            break;
> +
> +        case RI_TRIGTYPE_DELETE:
> +            if (!TRIGGER_FIRED_BY_DELETE(trigdata->tg_event))
> +                ereport(ERROR,
> +                        (errcode(ERRCODE_E_R_I_E_TRIGGER_PROTOCOL_VIOLATED),
> +                         errmsg("function \"%s\" must be fired for DELETE", funcname)));
> +            break;
> +    }
> +}
> +

Why did you move this around, as part of this commit?




> From 208c733d759592402901599446b4f7e7197c1777 Mon Sep 17 00:00:00 2001
> From: Antonin Houska <ah@cybertec.at>
> Date: Fri, 5 Jun 2020 16:42:34 +0200
> Subject: [PATCH 4/5] Introduce infrastructure for batch processing RI events.
> 
> Separate storage is used for the RI trigger events because the "transient
> table" that we provide to statement triggers would not be available for
> deferred constraints. Also, the regular statement level trigger is not ideal
> for the RI checks because it requires the query execution to complete before
> the RI checks even start. On the other hand, if we use batches of row trigger
> events, we only need to tune the batch size so that user gets RI violation
> error rather soon.
> 
> This patch only introduces the infrastructure, however the trigger function is
> still called per event. This is just to reduce the size of the diffs.
> ---
>  src/backend/commands/tablecmds.c    |   68 +-
>  src/backend/commands/trigger.c      |  406 ++++++--
>  src/backend/executor/spi.c          |   16 +-
>  src/backend/utils/adt/ri_triggers.c | 1385 +++++++++++++++++++--------
>  src/include/commands/trigger.h      |   25 +
>  5 files changed, 1381 insertions(+), 519 deletions(-)

My first comment here is that this is too large a change and should be
broken up.

I think there's also not enough explanation in here what the new design
is. I can infer some of that from the code, but that's imo shifting work
to the reviewer / reader unnecessarily.



> +static void AfterTriggerExecuteRI(EState *estate,
> +                                  ResultRelInfo *relInfo,
> +                                  FmgrInfo *finfo,
> +                                  Instrumentation *instr,
> +                                  TriggerData *trig_last,
> +                                  MemoryContext batch_context);
>  static AfterTriggersTableData *GetAfterTriggersTableData(Oid relid,
>                                                           CmdType cmdType);
>  static void AfterTriggerFreeQuery(AfterTriggersQueryData *qs);
> @@ -3807,13 +3821,16 @@ afterTriggerDeleteHeadEventChunk(AfterTriggersQueryData *qs)
>   *    fmgr lookup cache space at the caller level.  (For triggers fired at
>   *    the end of a query, we can even piggyback on the executor's state.)
>   *
> - *    event: event currently being fired.
> + *    event: event currently being fired. Pass NULL if the current batch of RI
> + *        trigger events should be processed.
>   *    rel: open relation for event.
>   *    trigdesc: working copy of rel's trigger info.
>   *    finfo: array of fmgr lookup cache entries (one per trigger in trigdesc).
>   *    instr: array of EXPLAIN ANALYZE instrumentation nodes (one per trigger),
>   *        or NULL if no instrumentation is wanted.
> + *    trig_last: trigger info used for the last trigger execution.
>   *    per_tuple_context: memory context to call trigger function in.
> + *    batch_context: memory context to store tuples for RI triggers.
>   *    trig_tuple_slot1: scratch slot for tg_trigtuple (foreign tables only)
>   *    trig_tuple_slot2: scratch slot for tg_newtuple (foreign tables only)
>   * ----------
> @@ -3824,39 +3841,55 @@ AfterTriggerExecute(EState *estate,
>                      ResultRelInfo *relInfo,
>                      TriggerDesc *trigdesc,
>                      FmgrInfo *finfo, Instrumentation *instr,
> +                    TriggerData *trig_last,
>                      MemoryContext per_tuple_context,
> +                    MemoryContext batch_context,
>                      TupleTableSlot *trig_tuple_slot1,
>                      TupleTableSlot *trig_tuple_slot2)
>  {
>      Relation    rel = relInfo->ri_RelationDesc;
>      AfterTriggerShared evtshared = GetTriggerSharedData(event);
>      Oid            tgoid = evtshared->ats_tgoid;
> -    TriggerData LocTriggerData = {0};
>      HeapTuple    rettuple;
> -    int            tgindx;
>      bool        should_free_trig = false;
>      bool        should_free_new = false;
> +    bool        is_new = false;
>  
> -    /*
> -     * Locate trigger in trigdesc.
> -     */
> -    for (tgindx = 0; tgindx < trigdesc->numtriggers; tgindx++)
> +    if (trig_last->tg_trigger == NULL)
>      {
> -        if (trigdesc->triggers[tgindx].tgoid == tgoid)
> +        int            tgindx;
> +
> +        /*
> +         * Locate trigger in trigdesc.
> +         */
> +        for (tgindx = 0; tgindx < trigdesc->numtriggers; tgindx++)
>          {
> -            LocTriggerData.tg_trigger = &(trigdesc->triggers[tgindx]);
> -            break;
> +            if (trigdesc->triggers[tgindx].tgoid == tgoid)
> +            {
> +                trig_last->tg_trigger = &(trigdesc->triggers[tgindx]);
> +                trig_last->tgindx = tgindx;
> +                break;
> +            }
>          }
> +        if (trig_last->tg_trigger == NULL)
> +            elog(ERROR, "could not find trigger %u", tgoid);
> +
> +        if (RI_FKey_trigger_type(trig_last->tg_trigger->tgfoid) !=
> +            RI_TRIGGER_NONE)
> +            trig_last->is_ri_trigger = true;
> +
> +        is_new = true;
>      }
> -    if (LocTriggerData.tg_trigger == NULL)
> -        elog(ERROR, "could not find trigger %u", tgoid);
> +
> +    /* trig_last for non-RI trigger should always be initialized again. */
> +    Assert(trig_last->is_ri_trigger || is_new);
>  
>      /*
>       * If doing EXPLAIN ANALYZE, start charging time to this trigger. We want
>       * to include time spent re-fetching tuples in the trigger cost.
>       */
> -    if (instr)
> -        InstrStartNode(instr + tgindx);
> +    if (instr && !trig_last->is_ri_trigger)
> +        InstrStartNode(instr + trig_last->tgindx);

I'm pretty unhappy about the amount of new infrastructure this adds to
trigger.c. We're now going to have a third copy of the tuples (for a
time). trigger.c is already a pretty complicated / archaic piece of
infrastructure, and this patchset seems to make that even worse. We'll
grow yet another separate representation of tuples, there's a lot new
branches (less concerned about the runtime costs, more about the code
complexity) etc.



> +/* ----------
> + * Construct the query to check inserted/updated rows of the FK table.
> + *
> + * If "insert" is true, the rows are inserted, otherwise they are updated.
> + *
> + * The query string built is
> + *    SELECT t.fkatt1 [, ...]
> + *        FROM <tgtable> t LEFT JOIN LATERAL
> + *            (SELECT t.fkatt1 [, ...]
> + *               FROM [ONLY] <pktable> p
> + *                 WHERE t.fkatt1 = p.pkatt1 [AND ...]
> + *                 FOR KEY SHARE OF p) AS m
> + *             ON t.fkatt1 = m.fkatt1 [AND ...]
> + *        WHERE m.fkatt1 ISNULL
> + *        LIMIT 1
> + *

Why do we need the lateral query here?


Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Madan Kumar
Date:
Subject: Re: A patch for get origin from commit_ts.
Next
From: Tom Lane
Date:
Subject: Re: pg_bsd_indent compiles bytecode