Re: Problem with accessing TOAST data in stored procedures - Mailing list pgsql-hackers

From Konstantin Knizhnik
Subject Re: Problem with accessing TOAST data in stored procedures
Date
Msg-id e152c4d7-ea1b-d143-f61d-d72f795238f7@postgrespro.ru
Whole thread Raw
In response to Re: Problem with accessing TOAST data in stored procedures  (Pavel Stehule <pavel.stehule@gmail.com>)
Responses Re: Problem with accessing TOAST data in stored procedures  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-hackers


On 19.02.2021 11:12, Pavel Stehule wrote:


pá 19. 2. 2021 v 9:08 odesílatel Konstantin Knizhnik <k.knizhnik@postgrespro.ru> napsal:


On 19.02.2021 10:47, Pavel Stehule wrote:


pá 19. 2. 2021 v 8:39 odesílatel Konstantin Knizhnik <k.knizhnik@postgrespro.ru> napsal:


On 19.02.2021 10:14, Pavel Stehule wrote:


pá 19. 2. 2021 v 7:51 odesílatel Konstantin Knizhnik <k.knizhnik@postgrespro.ru> napsal:


On 18.02.2021 20:10, Pavel Stehule wrote:
This has a negative impact on performance - and a lot of users use procedures without transaction control. So it doesn't look like a good solution.

I am more concentrated on the Pg 14 release, where the work with SPI is redesigned, and I hope so this issue is fixed there. For older releases, I don't know. Is this issue related to Postgres or it is related to PgPro only? If it is related to community pg, then we should fix and we should accept not too good performance, because there is no better non invasive solution. If it is PgPro issue (because there are ATX support) you can fix it (or you can try backport the patch https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ee895a655ce4341546facd6f23e3e8f2931b96bf ). You have more possibilities on PgPro code base.

Sorry, it is not PgPro specific problem and recent master suffers from this bug as well.
In the original bug report there was simple scenario of reproducing the problem:

CREATE TABLE toasted(id serial primary key, data text);
INSERT INTO toasted(data) VALUES((SELECT string_agg(random()::text,':') FROM generate_series(1, 1000)));
INSERT INTO toasted(data) VALUES((SELECT string_agg(random()::text,':') FROM generate_series(1, 1000)));
DO $$ DECLARE v_r record; BEGIN FOR v_r in SELECT data FROM toasted LOOP INSERT INTO toasted(data) VALUES(v_r.data);COMMIT;END LOOP;END;$$;

can you use new procedure_resowner?

Sorry, I do not understanf your suggestion.
How procedure_resowner can help to solve this problem?

This is just an idea - I think the most correct with zero performance impact is keeping snapshot, and this can be stored in procedure_resowner.

The fundamental question is if we want or allow more snapshots per query. The implementation is a secondary issue.

I wonder if it is correct from logical point of view.
If we commit transaction in stored procedure, then we actually implicitly start new transaction.
And new transaction should have new snapshot. Otherwise its behavior will change.

I have no problem with this. I have a problem with cycle implementation - when I iterate over some result, then this result should be consistent over all cycles.  In other cases, the behaviour is not deterministic.

I have investigated more the problem with toast data in  stored procedures and come to very strange conclusion:
to fix the problem it is enough to pass expand_external=false to expanded_record_set_tuple instead of !estate->atomic:

                                {
                                        /* Only need to assign a new tuple value */
                                        expanded_record_set_tuple(rec->erh, tuptab->vals[i],
-                                                                                         true, !estate->atomic);
+                                                                                         true, false);
                                }

Why it is correct?
Because in assign_simple_var we already forced detoasting for data:

    /*
     * In non-atomic contexts, we do not want to store TOAST pointers in
     * variables, because such pointers might become stale after a commit.
     * Forcibly detoast in such cases.  We don't want to detoast (flatten)
     * expanded objects, however; those should be OK across a transaction
     * boundary since they're just memory-resident objects.  (Elsewhere in
     * this module, operations on expanded records likewise need to request
     * detoasting of record fields when !estate->atomic.  Expanded arrays are
     * not a problem since all array entries are always detoasted.)
     */
    if (!estate->atomic && !isnull && var->datatype->typlen == -1 &&
        VARATT_IS_EXTERNAL_NON_EXPANDED(DatumGetPointer(newvalue)))
    {
        MemoryContext oldcxt;
        Datum        detoasted;

        /*
         * Do the detoasting in the eval_mcontext to avoid long-term leakage
         * of whatever memory toast fetching might leak.  Then we have to copy
         * the detoasted datum to the function's main context, which is a
         * pain, but there's little choice.
         */
        oldcxt = MemoryContextSwitchTo(get_eval_mcontext(estate));
        detoasted = PointerGetDatum(detoast_external_attr((struct varlena *) DatumGetPointer(newvalue)));


So, there is no need to initialize TOAST snapshot and "no known snapshots" error is false alarm.
What do you think about it?

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

pgsql-hackers by date:

Previous
From: Jan Wieck
Date:
Subject: Re: Extensibility of the PostgreSQL wire protocol
Next
From: Tom Lane
Date:
Subject: Re: Some regular-expression performance hacking