Re: ToDo: allow to get a number of processed rows by COPY statement [Review of Patch] - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: ToDo: allow to get a number of processed rows by COPY statement [Review of Patch]
Date
Msg-id 004e01cd9d34$59b6b9e0$0d242da0$@kapila@huawei.com
Whole thread Raw
In response to Re: ToDo: allow to get a number of processed rows by COPY statement [Review of Patch]  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-hackers
> On Friday, September 28, 2012 2:28 AM Pavel Stehule wrote:
> Hello
>
> I reduced this patch as just plpgsql (SPI) problem solution.
>
> Correct generic solution needs a discussion about enhancing our libpq
> interface - we can take a number of returned rows, but we should to get
> number of processed rows too. A users should to parse this number from
> completationTag, but this problem is not too hot and usually is not
> blocker, because anybody can process completationTag usually.
>
> So this issue is primary for PL/pgSQL, where this information is not
> possible. There is precedent - CREATE AS SELECT (thanks for sample :)),
> and COPY should to have same impact on ROW_COUNT like this statement
> (last patch implements it).

IMO now this patch is okay. I have marked it as Ready For Committer.

With Regards,
Amit Kapila.


>
>
> 2012/9/21 Amit Kapila <amit.kapila@huawei.com>:
> > On Friday, September 21, 2012 1:23 AM Pavel Stehule wrote:
> >
> >>> Basic stuff:
> >>> ------------
> >>> - Patch applies OK. but offset difference in line numbers.
> >>> - Compiles with errors in contrib [pg_stat_statements, sepgsql]
> >>> modules
> >>> - Regression failed; one test-case in COPY due to incomplete
> >>> test-case attached patch. – same as reported by Heikki
> >>
> >>fixed patch is in attachment
> >
> > After modifications:
> > ---------------------------
> >     - Patch applies OK
> >     - Compiles cleanly without any errors/warnings
> >     - Regression tests pass.
> >
> >>>
> >>> What it does:
> >>> --------------
> >>> Modification to get the number of processed rows evaluated via SPI.
> >>> The changes are to add extra parameter in ProcessUtility to get the
> >>> number of rows processed by COPY command.
> >>>
> >>> Code Review Comments:
> >>> ---------------------
> >>> 1. New parameter is added to ProcessUtility_hook_type function
> >>>    but the functions which get assigned to these functions like
> >>>    sepgsql_utility_command, pgss_ProcessUtility, prototype &
> >>> definition is not modified.
> >
> > Functionality is not fixed correctly for hook functions, In function
> > pgss_ProcessUtility for bellow snippet of code processed parameter is
> passed NULL, as well as not initialized.
> > because of this when "pg_stat_statements" extention is utilized COPY
> command is giving garbage values.
> >         if (prev_ProcessUtility)
> >                 prev_ProcessUtility(parsetree, queryString, params,
> >                                                         dest,
> completionTag, context, NULL);
> >         else
> >                 standard_ProcessUtility(parsetree, queryString,
> params,
> >                                                         dest,
> > completionTag, context, NULL);
> >
> >     Testcase is attached.
> >     In this testcase table has only 1000 records but it show garbage
> value.
> >         postgres=# show shared_preload_libraries ;
> >          shared_preload_libraries
> >         --------------------------
> >          pg_stat_statements
> >         (1 row)
> >         postgres=# CREATE TABLE tbl (a int);
> >         CREATE TABLE
> >         postgres=# INSERT INTO tbl VALUES(generate_series(1,1000));
> >         INSERT 0 1000
> >         postgres=# do $$
> >         declare r int;
> >         begin
> >           copy tbl to '/home/kiran/copytest.csv' csv;
> >           get diagnostics r = row_count;
> >           raise notice 'exported % rows', r;
> >           truncate tbl;
> >           copy tbl from '/home/kiran/copytest.csv' csv;
> >           get diagnostics r = row_count;
> >           raise notice 'imported % rows', r;
> >         end;
> >         $$ language plpgsql;
> >         postgres$#
> >         NOTICE:  exported 13281616 rows
> >         NOTICE:  imported 13281616 rows
> >         DO
> >
> >>>
> >>> 2. Why to add the new parameter if completionTag hold the number of
> >>> processed tuple information; can be extracted
> >>>
> >>>    from it as follows:
> >>>                     _SPI_current->processed = strtoul(completionTag
> >>> + 7, NULL, 10);
> >>
> >>this is basic question. I prefer a natural type for counter - uint64
> >>instead text. And there are no simply way to get offset (7 in this
> >>case)
> >
> >     I agree with your point, but currently in few other places we are
> > parsing the completion tag for getting number of tuples processed. So
> > may be in future we can change those places as well. For example
>
> yes, this step can be done in future - together with libpq enhancing.
> Is not adequate change API (and break lot of extensions) for this
> isolated problem. So I propose fix plpgsql issue, and add to ToDo -
> "enhance libpq to return a number of processed rows as number" - but
> this change breaking compatibility.
>
> Pavel
>
> >
> > pgss_ProcessUtility
> > {
> > ..
> >
> > /* parse command tag to retrieve the number of affected rows. */ if
> > (completionTag &&
> >         sscanf(completionTag, "COPY " UINT64_FORMAT, &rows) != 1)
> >         rows = 0;
> >
> > }
> >
> > _SPI_execute_plan
> > {
> > ..
> > ..
> > if (IsA(stmt, CreateTableAsStmt))
> > {
> >         Assert(strncmp(completionTag, "SELECT ", 7) == 0);
> >         _SPI_current->processed = strtoul(completionTag + 7,
> >
> > NULL, 10);
> >
> > ..
> > }
> >
> >
> > With Regards,
> > Amit Kapila.
> >
> >
> >
> > --
> > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To
> > make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-hackers




pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: setting per-database/role parameters checks them against wrong context
Next
From: Pavel Stehule
Date:
Subject: Re: ToDo: allow to get a number of processed rows by COPY statement [Review of Patch]