Re: [PATCH] Initial progress reporting for COPY command - Mailing list pgsql-hackers

From Josef Šimánek
Subject Re: [PATCH] Initial progress reporting for COPY command
Date
Msg-id CAFp7QwoXHPNC16fzh1OEvSH3rKJe-AptAE3oSjbjMUpF_ugX9Q@mail.gmail.com
Whole thread Raw
In response to [PATCH] Initial progress reporting for COPY command  (Josef Šimánek <josef.simanek@gmail.com>)
Responses Re: [PATCH] Initial progress reporting for COPY command
Re: [PATCH] Initial progress reporting for COPY command
List pgsql-hackers
Thanks for all comments. I have updated code to support more options (including STDIN/STDOUT) and added some documentation.


I'm also attaching screenshot of HTML documentation and html documentation file.

I'll do my best to get this to commitfest now.

ne 14. 6. 2020 v 14:32 odesílatel Josef Šimánek <josef.simanek@gmail.com> napsal:
Hello, as proposed by Pavel Stěhule and discussed on local czech PostgreSQL maillist (https://groups.google.com/d/msgid/postgresql-cz/CAFj8pRCZ42CBCa1bPHr7htffSV%2BNAcgcHHG0dVqOog4bsu2LFw%40mail.gmail.com?utm_medium=email&utm_source=footer), I have prepared an initial patch for COPY command progress reporting.

Few examples first:

"COPY (SELECT * FROM test) TO '/tmp/ids';"

yr=# SELECT * from pg_stat_progress_copy;
   pid   | datid | datname | relid | direction | file | program | lines_processed | file_bytes_processed
---------+-------+---------+-------+-----------+------+---------+-----------------+----------------------
 3347126 | 16384 | yr      |     0 | TO        | t    | f       |         3529943 |             24906226
(1 row)
 
"COPY test FROM '/tmp/ids';

yr=# SELECT * from pg_stat_progress_copy;
   pid   | datid | datname | relid | direction | file | program | lines_processed | file_bytes_processed
---------+-------+---------+-------+-----------+------+---------+-----------------+----------------------
 3347126 | 16384 | yr      | 16385 | FROM      | t    | f       |       121591999 |            957218816
(1 row)

Columns are inspired by CREATE INDEX progress report system view.

pid - integer - PID of backend
datid - oid - OID of related database
datname - name - name of related database (this seems redundant, since oid should be enough, but it is the same in CREATE INDEX)
relid - oid - oid of table related to COPY command, when not known (for example when copying to file, it is 0)
direction - text - one of "FROM" or "TO" depends on command used
file - bool - is file is used?
program - bool - is program used?
lines_processed - bigint - amount of processed lines, works for both directions (FROM/TO)
file_bytes_processed - amount of bytes processed when file is used (otherwise 0), works for both direction (
FROM/TO) when file is used (file = t)

Patch is attached and can be found also at https://github.com/simi/postgres/pull/5.

Diff version: https://github.com/simi/postgres/pull/5.diff
Patch version: https://github.com/simi/postgres/pull/5.patch

I havefew initial notes and questions.

I'm using ftell to get current position in file to populate file_bytes_processed without error handling (ftell can return -1L and also populate errno on problems).

1. Is that a good way to get progress of file processing?
2. Is it safe in given context to not care about errors? If not, what to do on error?

Some columns are not populated on certain COPY commands. For example when a file is not used, file_bytes_processed is set to 0. Would it be better to use NULL instead when the column is not related to the current command? Same problem is for relid column.

I have not found any tests for progress reporting. Are there any? It would need two backends running (one running COPY, one checking output of report view). Is there any similar test I can inspire at? In theory, it should be possible to use dblink_send_query to run async COPY command in the background.

My initial (attached) patch also doesn't introduce documentation for this system view. I can add that later once this patch is finalized (if that happens).
Attachment

pgsql-hackers by date:

Previous
From: Josef Šimánek
Date:
Subject: Re: [PATCH] Initial progress reporting for COPY command
Next
From: Ranier Vilela
Date:
Subject: Re: Possible NULL pointer deferenced (src/interfaces/libpq/fe-exec.c(line 563)