Thread: bug: copy progress reporting of backends which run multiple COPYs

bug: copy progress reporting of backends which run multiple COPYs

From
Justin Pryzby
Date:
pg_stat_progress_copy was added in v14 (8a4f618e7, 9d2d45700).

But if a command JOINs file_fdw tables, the progress report gets bungled
up.  This will warn/assert during file_fdw tests.

diff --git a/src/backend/utils/activity/backend_progress.c b/src/backend/utils/activity/backend_progress.c
index 6743e68cef6..7abcb4f60db 100644
--- a/src/backend/utils/activity/backend_progress.c
+++ b/src/backend/utils/activity/backend_progress.c
@@ -10,6 +10,7 @@
  */
 #include "postgres.h"
 
+#include "commands/progress.h"
 #include "port/atomics.h"        /* for memory barriers */
 #include "utils/backend_progress.h"
 #include "utils/backend_status.h"
@@ -105,6 +106,20 @@ pgstat_progress_end_command(void)
     if (beentry->st_progress_command == PROGRESS_COMMAND_INVALID)
         return;
 
+// This currently fails file_fdw tests, since pgstat_progress evidently fails
+// to support simultaneous copy commands, as happens during JOIN.
+    /* bytes progress is not available in all cases */
+    if (beentry->st_progress_command == PROGRESS_COMMAND_COPY &&
+            beentry->st_progress_param[PROGRESS_COPY_BYTES_TOTAL] > 0)
+    {
+        volatile int64 *a = beentry->st_progress_param;
+        if (a[PROGRESS_COPY_BYTES_PROCESSED] > a[PROGRESS_COPY_BYTES_TOTAL])
+            elog(WARNING, "PROGRESS_COPY_BYTES_PROCESSED %ld %ld",
+                    a[PROGRESS_COPY_BYTES_PROCESSED],
+                    a[PROGRESS_COPY_BYTES_TOTAL]);
+        // Assert(a[PROGRESS_COPY_BYTES_PROCESSED] <= a[PROGRESS_COPY_BYTES_TOTAL]);
+    }
+
     PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
     beentry->st_progress_command = PROGRESS_COMMAND_INVALID;
     beentry->st_progress_command_target = InvalidOid;



Re: bug: copy progress reporting of backends which run multiple COPYs

From
Matthias van de Meent
Date:
On Thu, 19 Jan 2023 at 06:47, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> pg_stat_progress_copy was added in v14 (8a4f618e7, 9d2d45700).
>
> But if a command JOINs file_fdw tables, the progress report gets bungled
> up.  This will warn/assert during file_fdw tests.

I don't know what to do with that other than disabling COPY progress
reporting for file_fdw, i.e. calls to BeginCopyFrom that don't supply
a pstate. This is probably the best option, because a table backed by
file_fdw would also interfere with COPY TO's progress reporting.

Attached a patch that solves this specific issue in a
binary-compatible way. I'm not super happy about relying on behavior
of callers of BeginCopyFrom (assuming that users that run copy
concurrently will not provide a ParseState* to BeginCopyFrom), but it
is what it is.

Kind regards,

Matthias van de Meent

Attachment

Re: bug: copy progress reporting of backends which run multiple COPYs

From
Ted Yu
Date:


On Fri, Jan 20, 2023 at 4:51 PM Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:
On Thu, 19 Jan 2023 at 06:47, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> pg_stat_progress_copy was added in v14 (8a4f618e7, 9d2d45700).
>
> But if a command JOINs file_fdw tables, the progress report gets bungled
> up.  This will warn/assert during file_fdw tests.

I don't know what to do with that other than disabling COPY progress
reporting for file_fdw, i.e. calls to BeginCopyFrom that don't supply
a pstate. This is probably the best option, because a table backed by
file_fdw would also interfere with COPY TO's progress reporting.

Attached a patch that solves this specific issue in a
binary-compatible way. I'm not super happy about relying on behavior
of callers of BeginCopyFrom (assuming that users that run copy
concurrently will not provide a ParseState* to BeginCopyFrom), but it
is what it is.

Kind regards,

Matthias van de Meent
Hi,
In `BeginCopyFrom`, I see the following :

        if (pstate)
        {
                cstate->range_table = pstate->p_rtable;
                cstate->rteperminfos = pstate->p_rteperminfos; 

Is it possible to check range_table / rteperminfos so that we don't introduce the bool field ?

Cheers

Re: bug: copy progress reporting of backends which run multiple COPYs

From
Matthias van de Meent
Date:
On Sat, 21 Jan 2023 at 02:04, Ted Yu <yuzhihong@gmail.com> wrote:
>
>
>
> On Fri, Jan 20, 2023 at 4:51 PM Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:
>>
>> Attached a patch that solves this specific issue in a
>> binary-compatible way. I'm not super happy about relying on behavior
>> of callers of BeginCopyFrom (assuming that users that run copy
>> concurrently will not provide a ParseState* to BeginCopyFrom), but it
>> is what it is.
>
> Is it possible to check range_table / rteperminfos so that we don't introduce the bool field ?

I think yes, but I'm not sure we can depend on rteperminfos to be set,
and the same for p_rtable. I also don't think it's a good idea for
code clarity: there is no good reason why the (un)availability of
either range_table or rteperminfos would allow progress reporting - it
would require additional extensive documentation around both the usage
sites and the field itself. Adding a well-named field provides a much
better experience in my opinion.

If someone were opposed to adding that field in backbranches I'm fine
with using one of these instead, assuming additional clear
documentation is added as well.

- Matthias



Re: bug: copy progress reporting of backends which run multiple COPYs

From
Justin Pryzby
Date:
On Sat, Jan 21, 2023 at 01:51:28AM +0100, Matthias van de Meent wrote:
> On Thu, 19 Jan 2023 at 06:47, Justin Pryzby <pryzby@telsasoft.com> wrote:
> >
> > pg_stat_progress_copy was added in v14 (8a4f618e7, 9d2d45700).
> >
> > But if a command JOINs file_fdw tables, the progress report gets bungled
> > up.  This will warn/assert during file_fdw tests.
> 
> I don't know what to do with that other than disabling COPY progress
> reporting for file_fdw, i.e. calls to BeginCopyFrom that don't supply
> a pstate. This is probably the best option, because a table backed by
> file_fdw would also interfere with COPY TO's progress reporting.
> 
> Attached a patch that solves this specific issue in a
> binary-compatible way. I'm not super happy about relying on behavior
> of callers of BeginCopyFrom (assuming that users that run copy
> concurrently will not provide a ParseState* to BeginCopyFrom), but it
> is what it is.

Thanks for looking.  Maybe another option is to avoid progress reporting
in 2nd and later CopyFrom() if another COPY was already running in that
backend.

Would you do anything different in the master branch, with no
compatibility constraints ?  I think the progress reporting would still
be limited to one row per backend, not one per CopyFrom().

-- 
Justin



Re: bug: copy progress reporting of backends which run multiple COPYs

From
Matthias van de Meent
Date:
On Sat, 21 Jan 2023 at 02:28, Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> On Sat, Jan 21, 2023 at 01:51:28AM +0100, Matthias van de Meent wrote:
> > On Thu, 19 Jan 2023 at 06:47, Justin Pryzby <pryzby@telsasoft.com> wrote:
> > >
> > > pg_stat_progress_copy was added in v14 (8a4f618e7, 9d2d45700).
> > >
> > > But if a command JOINs file_fdw tables, the progress report gets bungled
> > > up.  This will warn/assert during file_fdw tests.
> >
> > I don't know what to do with that other than disabling COPY progress
> > reporting for file_fdw, i.e. calls to BeginCopyFrom that don't supply
> > a pstate. This is probably the best option, because a table backed by
> > file_fdw would also interfere with COPY TO's progress reporting.
> >
> > Attached a patch that solves this specific issue in a
> > binary-compatible way. I'm not super happy about relying on behavior
> > of callers of BeginCopyFrom (assuming that users that run copy
> > concurrently will not provide a ParseState* to BeginCopyFrom), but it
> > is what it is.
>
> Thanks for looking.  Maybe another option is to avoid progress reporting
> in 2nd and later CopyFrom() if another COPY was already running in that
> backend.

Let me think about it. I think it would work, but I'm not sure that's
a great option - it adds backend state that we would need to add
cleanup handles for. But then again, COPY ... TO could use TRIGGER to
trigger actual COPY FROM statements, which would also break progress
reporting in a vanilla instance without extensions.

I'm not sure what the right thing to do is here.

> Would you do anything different in the master branch, with no
> compatibility constraints ?  I think the progress reporting would still
> be limited to one row per backend, not one per CopyFrom().

I think I would at least introduce another parameter to BeginCopyFrom
for progress reporting (instead of relying on pstate != NULL), like
how we have a bit in reindex_index's params->options that specifies
whether we want progress reporting (which is unset for parallel
workers iirc).

- Matthias



Re: bug: copy progress reporting of backends which run multiple COPYs

From
Michael Paquier
Date:
On Sat, Jan 21, 2023 at 02:45:40AM +0100, Matthias van de Meent wrote:
> Let me think about it. I think it would work, but I'm not sure that's
> a great option - it adds backend state that we would need to add
> cleanup handles for. But then again, COPY ... TO could use TRIGGER to
> trigger actual COPY FROM statements, which would also break progress
> reporting in a vanilla instance without extensions.
>
> I'm not sure what the right thing to do is here.

Simply disabling COPY reporting for file_fdw does not sound appealing
to me, because that can be really useful for users.  As long as you
rely on two code paths that call separately the progress reporting,
things are doomed with the current infrastructure.  For example,
thinking about some fancy cases, you could you create an index that
uses a function as expression, function that includes utility commands
that do progress reporting.  Things would equally go wrong in the
progress view.

What are the assertions/warnings that get triggered in file_fdw?
Joining together file_fdw with a plain COPY is surely a fancy case,
even if COPY TO would allow that.

>> Would you do anything different in the master branch, with no
>> compatibility constraints ?  I think the progress reporting would still
>> be limited to one row per backend, not one per CopyFrom().
>
> I think I would at least introduce another parameter to BeginCopyFrom
> for progress reporting (instead of relying on pstate != NULL), like
> how we have a bit in reindex_index's params->options that specifies
> whether we want progress reporting (which is unset for parallel
> workers iirc).

This makes sense, independently of the discussion about what should be
done with cross-runs of these APIs.  This could be extended with
user-controllable options for each one of them.  It does not take care
of the root of the problem, just bypasses it.
--
Michael

Attachment
On Sat, Jan 21, 2023 at 02:45:40AM +0100, Matthias van de Meent wrote:
> > Would you do anything different in the master branch, with no
> > compatibility constraints ?  I think the progress reporting would still
> > be limited to one row per backend, not one per CopyFrom().
> 
> I think I would at least introduce another parameter to BeginCopyFrom
> for progress reporting (instead of relying on pstate != NULL), like
> how we have a bit in reindex_index's params->options that specifies
> whether we want progress reporting (which is unset for parallel
> workers iirc).

This didn't get fixed for v16, and it seems unlikely that it'll be
addressed in back branches.

But while I was reviewing forgotten threads, it occurred to me to raise
the issue in time to fix it for v17.

-- 
Justin



Re: bug: copy progress reporting of backends which run multiple COPYs

From
Michael Paquier
Date:
On Tue, May 07, 2024 at 07:27:54AM -0500, Justin Pryzby wrote:
> This didn't get fixed for v16, and it seems unlikely that it'll be
> addressed in back branches.
>
> But while I was reviewing forgotten threads, it occurred to me to raise
> the issue in time to fix it for v17.

Thanks for the reminder.

FWIW, I'm rather annoyed by the fact that we rely on the ParseState to
decide if reporting should happen or not.  file_fdw tells, even if
that's accidental, that status reporting can be useful if working on a
single table.  So, shutting down the whole reporting if a caller if
BeginCopyFrom() does not give a ParseState is too heavy-handed, IMO.

The addition of report_progress in the COPY FROM state data is a good
idea, though isn't that something we should give as an argument of
BeginCopyFrom() instead if sticking this knowledge in COPY FROM?

Different idea: could it be something worth controlling with a
query-level option?  It would then be possible to provide the same
level of control for COPY TO which has reporting paths, given the
application control over the reporting even with file_fdw, and store
the value controlling the reporting in CopyFormatOptions.  I am
wondering if there would be a case where someone wants to do a COPY
but hide entirely the reporting done.

The query-level option has some appeal.
--
Michael

Attachment
On Tue, May 7, 2024 at 9:12 PM Michael Paquier <michael@paquier.xyz> wrote:
> FWIW, I'm rather annoyed by the fact that we rely on the ParseState to
> decide if reporting should happen or not.  file_fdw tells, even if
> that's accidental, that status reporting can be useful if working on a
> single table.  So, shutting down the whole reporting if a caller if
> BeginCopyFrom() does not give a ParseState is too heavy-handed, IMO.

I think you're hoping for too much. The progress reporting
infrastructure is fundamentally designed around the idea that there
can only be one progress-reporting operation in progress at a time.
For COPY, that is, I believe, true, but for file_fdw, it's false. If
we want to do any kind of progress reporting from within plannable
queries, we need some totally different and much more complex
infrastructure that can report progress for, probably, each plan node
individually. I think it's likely a mistake to try to shoehorn cases
like this into the infrastructure
that we have today. It will just encourage people to try to use the
current infrastructure in ways that are less and less like what it was
actually designed to do; whereas what we should be doing if we want
this kind of functionality, at least IMHO, is building infrastructure
that's actually fit for purpose.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: bug: copy progress reporting of backends which run multiple COPYs

From
Michael Paquier
Date:
On Wed, May 08, 2024 at 10:07:15AM -0400, Robert Haas wrote:
> I think you're hoping for too much. The progress reporting
> infrastructure is fundamentally designed around the idea that there
> can only be one progress-reporting operation in progress at a time.
> For COPY, that is, I believe, true, but for file_fdw, it's false. If
> we want to do any kind of progress reporting from within plannable
> queries, we need some totally different and much more complex
> infrastructure that can report progress for, probably, each plan node
> individually. I think it's likely a mistake to try to shoehorn cases
> like this into the infrastructure
> that we have today. It will just encourage people to try to use the
> current infrastructure in ways that are less and less like what it was
> actually designed to do; whereas what we should be doing if we want
> this kind of functionality, at least IMHO, is building infrastructure
> that's actually fit for purpose.

Hmm.  OK.  I have been looking around for cases out there where
BeginCopyFrom() could be called with a pstate where the reporting
could matter, and could not find anything worth worrying about.  It
still makes me a bit uneasy to not have a separate argument in the
function to control that.  Now, if you, Justin and Matthias agree with
this approach, I won't stand in the way either.
--
Michael

Attachment