Re: [PATCH] Initial progress reporting for COPY command - Mailing list pgsql-hackers

From vignesh C
Subject Re: [PATCH] Initial progress reporting for COPY command
Date
Msg-id CALDaNm1wePVSpgGVTT628mLxZg51yBYzyBJcPPhVeKg7-hPF=g@mail.gmail.com
In response to Re: [PATCH] Initial progress reporting for COPY command  (Josef Šimánek <josef.simanek@gmail.com>)
Responses Re: [PATCH] Initial progress reporting for COPY command
List pgsql-hackers
On Mon, Jun 22, 2020 at 4:28 PM Josef Šimánek <josef.simanek@gmail.com> wrote:
>
> Thanks for the hint regarding "CopyReadLineText". I'll take a look.
>
> For now I have tested those cases:
>
> CREATE TABLE test(id int);
> INSERT INTO test SELECT 1 FROM generate_series(1, 1000000);
> COPY (SELECT * FROM test) TO '/tmp/ids';
> COPY test FROM '/tmp/ids';
>
> psql -h /tmp yr -c 'COPY (SELECT 1 from generate_series(1,100000000)) TO STDOUT;' > /tmp/ryba.txt
> cat /tmp/ryba.txt | psql -h /tmp yr -c 'COPY test FROM STDIN'
>
> It is easy to check lines count and bytes count are in sync (since 1 line is 2 bytes here - "1" and newline character).
> I'll try to check more complex COPY commands to ensure everything is in sync.
>
> If you have any ideas for testing queries, feel free to suggest.

For the COPY FROM statement, you could attach a debugger to the session
and put a breakpoint at CopyReadLineText. Execution will hit this
breakpoint for every record that COPY FROM processes, so in parallel
you can check whether pg_stat_progress_copy is being updated correctly.
I noticed it was showing the file read size instead of the actual
processed bytes.
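
As a rough sketch of that check, a query like the following could be run
from a second session while the COPY backend is stopped at the breakpoint.
It assumes the column names from the view definition in the patch quoted
below; the "relation" and "in_sync" aliases are only illustrative, and the
2-bytes-per-line comparison holds only for the single-digit test data
described above.

```sql
-- Run from a second psql session while COPY FROM is paused in the debugger.
-- Column names follow the pg_stat_progress_copy definition in the quoted patch.
SELECT pid,
       relid::regclass AS relation,       -- illustrative alias
       direction,
       lines_processed,
       file_bytes_processed,
       -- Assumption: every input line is "1" plus a newline, i.e. 2 bytes.
       file_bytes_processed = 2 * lines_processed AS in_sync
FROM pg_stat_progress_copy;
```

In psql, ending the query with \watch 1 instead of a semicolon re-runs it
every second, which makes it easier to follow the counters while stepping
through the breakpoint.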

>>  +pg_stat_progress_copy| SELECT s.pid,
>> +    s.datid,
>> +    d.datname,
>> +    s.relid,
>> +        CASE s.param1
>> +            WHEN 0 THEN 'TO'::text
>> +            WHEN 1 THEN 'FROM'::text
>> +            ELSE NULL::text
>> +        END AS direction,
>> +    ((s.param2)::integer)::boolean AS file,
>> +    ((s.param3)::integer)::boolean AS program,
>> +    s.param4 AS lines_processed,
>> +    s.param5 AS file_bytes_processed
>>
>> You could include pg_size_pretty for s.param5 like
>> pg_size_pretty(S.param5) AS bytes_processed, it will be easier for
>> users to understand bytes_processed when the data size increases.
>
>
> I was looking at the rest of reporting views and for me those seem to be just basic ones providing just raw data to be used later in custom nice friendly human-readable views built on the client side.
> For example "pg_stat_progress_basebackup" also reports "backup_streamed" in raw form.
>
> Anyway if you would like to make this view more user-friendly, I can add that. Just ping me.

I felt we could add pg_size_pretty to make the view more user friendly.
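
To illustrate the difference, here is a minimal sketch that shows the raw
value next to a formatted one. It assumes the file_bytes_processed column
from the patch as quoted above; the "bytes_processed_pretty" alias is only
illustrative.

```sql
-- Raw byte counter next to a human-readable rendering of the same value.
-- pg_size_pretty() accepts bigint, which matches the progress parameter type.
SELECT pid,
       lines_processed,
       file_bytes_processed,
       pg_size_pretty(file_bytes_processed) AS bytes_processed_pretty  -- illustrative alias
FROM pg_stat_progress_copy;
```

The same expression could either live in the view definition or be applied
by the client on top of the raw column, which is the trade-off discussed
above.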

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com


