Thread: [BUGS] data inconsistency between client and server at pg_ctl stop -m fast

[BUGS] data inconsistency between client and server at pg_ctl stop -m fast

From: SCHAUHUBER Hermann

Hello

 

PostgreSQL version: psql (PostgreSQL) 9.6.0

System: Red Hat Enterprise Linux Server release 6.8 (Santiago)


Test method:

- The client (using libpq in C++, or psql driven from a Perl script) tries to insert a counter into testtable

  (insert was chosen for easy checks, but mostly we use updates and functions)

- Whenever an error is returned to the client, the client tries to write it again

 

Problem:

- the psql client receives an error, but the record was inserted into the table!

- So the record is finally inserted twice
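The sequence can be reproduced without a running server; a minimal Python simulation of the race (FlakyConnection and naive_retry_insert are hypothetical stand-ins, not libpq calls):

```python
# Simulation of the failure mode: the server commits the INSERT, but the
# connection is torn down before the client sees success, so a naive
# "retry on any error" loop inserts the same counter twice.

class FlakyConnection:
    """Commits each row, then reports a broken connection exactly once."""
    def __init__(self, table, fail_on):
        self.table = table        # shared list standing in for testtable
        self.fail_on = fail_on    # counter at which the shutdown "hits"
        self.failed = False

    def insert(self, counter):
        self.table.append(counter)            # server-side commit succeeds
        if counter == self.fail_on and not self.failed:
            self.failed = True                # client never sees the success
            raise ConnectionError("terminating connection due to administrator command")

def naive_retry_insert(conn, counter, retries=3):
    for _ in range(retries):
        try:
            conn.insert(counter)
            return
        except ConnectionError:
            continue  # blind retry: repeats the insert even though it committed

table = []
conn = FlakyConnection(table, fail_on=57121)
for counter in (57120, 57121):
    naive_retry_insert(conn, counter)
print(table.count(57121))  # 2
```

The blind retry is exactly what the test loop does; the duplicate comes from a lost acknowledgement, not from the server writing the row twice.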


I can imagine several methods by which the problem could be solved; maybe some (hidden) method already exists.

 

1) A shutdown method like pg_ctl stop -m smart would be helpful:

  

   a) which does not allow new connections (as it is now)

   b) which finishes active transactions but does not allow new WRITING transactions (or also no reading transactions)

 

2) A function similar to pg_cancel_backend would be helpful, which also DISALLOWS starting new "writing" transactions

   (or inhibits any new transaction), and which guarantees that the cancelled transactions are reported with an error

    and the successful transactions are reported with OK to the client.

 

3) The problem could be overcome if there were a command which disallows any new writing (or also reading) transactions.

   This command could be called, e.g., a few seconds before pg_ctl stop -m fast is executed - so all transactions

   which finish within these few seconds will have correct results for the client.

 

4) The problem would be solved if pg_ctl stop -m fast behaved a little differently:

   if a transaction has already finished (example below), the server should send success to the client

   and terminate the connection afterwards;

   if a transaction is really aborted, an error shall be sent to the client
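Independently of any server-side change, the client can make its retry loop idempotent: after a connection loss the outcome of the insert is in doubt, so the client should (after reconnecting) check whether the row already exists before inserting again. A minimal sketch with simulated stand-ins (FlakyConnection, exists and idempotent_insert are hypothetical, not libpq calls):

```python
class FlakyConnection:
    """Commits each row, then reports a broken connection exactly once."""
    def __init__(self, table, fail_on):
        self.table = table
        self.fail_on = fail_on
        self.failed = False

    def insert(self, counter):
        self.table.append(counter)            # server-side commit succeeds
        if counter == self.fail_on and not self.failed:
            self.failed = True
            raise ConnectionError("server closed the connection unexpectedly")

    def exists(self, counter):
        # stands in for: SELECT 1 FROM testtable WHERE counter = %s
        return counter in self.table

def idempotent_insert(conn, counter, retries=3):
    for _ in range(retries):
        try:
            conn.insert(counter)
            return
        except ConnectionError:
            # connection lost mid-request: the commit is in doubt, so verify
            # (after reconnecting, in a real client) before retrying
            if conn.exists(counter):
                return

table = []
conn = FlakyConnection(table, fail_on=57121)
for counter in (57120, 57121):
    idempotent_insert(conn, counter)
print(table)  # [57120, 57121] -- no duplicate
```

This only works because the counter makes the insert naturally verifiable; for non-unique writes the check would need a client-supplied request id.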


   pg_ctl stop -m fast was executed at this time:

  

   In a test case where about 1,000,000 records were inserted, 29 entries were inserted twice; the first error was at this time:

 

   <2017-01-26 14:21:31.837 CET H:[local] U:postgres P:103530> LOG:  00000: duration: 1.308 ms  statement: select * from insert_testtable(57121);

   <2017-01-26 14:21:31.837 CET H:[local] U:postgres P:103530> LOCATION:  exec_simple_query, postgres.c:1171

   <2017-01-26 14:21:31.837 CET H:[local] U:postgres P:103530> FATAL:  57P01: terminating connection due to administrator command

      -------------------------------------------------------------------------------------------------------------------------------

   <2017-01-26 14:21:31.837 CET H:[local] U:postgres P:103530> LOCATION:  ProcessInterrupts, postgres.c:2875

   <2017-01-26 14:21:31.837 CET H:[local] U:postgres P:103530> STATEMENT:  select * from insert_testtable(57121);

   <2017-01-26 14:21:31.837 CET H:[local] U:postgres P:103530> DEBUG:  00000: shmem_exit(1): 1 before_shmem_exit callbacks to make


   At the next start of the server (pg_ctl start), the client (psql) inserted the same value (57121) again, as it had got an error before.

 

Readout of the table for this counter:

db1=# select * from testtable where counter = 57121;

 counter |        insert_time

---------+----------------------------

   57121 | 2017-01-26 14:21:31.835887

   57121 | 2017-01-26 14:21:35.967155

(2 rows)

 

The perl program output at this time was:

-------------------------------------------

 

inserted 57119

insert_testtable

------------------

            57120

(1 row)

 

inserted 57120

FATAL:  terminating connection due to administrator command

server closed the connection unexpectedly

        This probably means the server terminated abnormally

        before or while processing the request.

connection to server was lost

*** error insert 57121

psql: FATAL:  the database system is shutting down

*** error insert 57121

psql: FATAL:  the database system is shutting down

*** error insert 57121

 

....

....

*** error insert 57121

psql: FATAL:  the database system is starting up

*** error insert 57121

insert_testtable

------------------

            57121

(1 row)


Attachments:

=============

 

1) postgresql_pgctl_stop_mfast_during_transaction.log ... part where the problem occurred

2) database dump (dbname: db1, used table test_table) used for this test (including data content after test)

   db1.dump.sql ( pg_dump -p 5445 -U postgres db1 > db1.dump.sql)

3) server config files (postgresql.conf, pg_hba.conf)

4) Perl program - writing incremented counter to testtable: insert_counter.pl

5) script which periodically stops and starts the db server: periodic_start_stop_db.sh

Note: the perl and shell script were executed on database server locally.

 

Environment:

============

 

PG_VERSION=9.6

Installed Packages:

 

version of psql (PostgreSQL) 9.6.0

 

postgresql96-libs-9.6.0-1PGDG.rhel6.x86_64

postgresql96-contrib-9.6.0-1PGDG.rhel6.x86_64

postgresql96-server-9.6.0-1PGDG.rhel6.x86_64

postgresql96-9.6.0-1PGDG.rhel6.x86_64


Kind regards

 

Hermann Schauhuber

Sonorys Technology

A-1210 Vienna

 


Re: [BUGS] data inconsistency between client and server at pg_ctl stop -m fast

From: Kevin Grittner
On Thu, Jan 26, 2017 at 11:16 PM, SCHAUHUBER Hermann
<hermann.schauhuber@sonorys.at> wrote:

> - The client (used libpq in c++ or also psql in a perl) tries to
> insert a counter into testtable

> - Whenever an error is returned to the client, the client tries
> to write it again

> - the psql client receives an error but the record was inserted
> into the table!
>
> - So the record is finally inserted twice

As I understand your complaint, there is a short time between when
the request to commit modifications to the database is sent and
when the indication of successful commit is received by the client,
during which a broken connection leaves the status of the commit in
doubt. This much is unavoidable.

You further note that `pg_ctl stop -m fast` has more of a tendency to
cause this indeterminate state than is necessary.  That may be
true.  There may be room for a feature enhancement here, although I
don't see where it rises to the level of a bug.  Such a feature
could never eliminate occurrences of the indeterminate state; it
could only reduce the frequency of them.

Meanwhile, there is a lot you could do to reduce the frequency of
this issue.

  -  If you use explicit transactions, the indeterminate state
could only occur on a COMMIT request, never on any other request.
You could safely assume that an error on any transaction not
waiting on a COMMIT request was indeed not committed.

  -  Off-hand, I can't think of where you could see this except
when the error level at the client side was FATAL.  I think you
could safely assume that if the level is anything else, the work
was not committed.

  -  You might try performing a graceful termination of client
applications before shutting down the database.

  -  I understand that the so-called "smart" shutdown is often
impractical for production environments; however where it is used
the shutdown cannot cause the problem you describe.

  -  Finally, if you use two-phase commit with a high-quality
transaction manager (which independently keeps track of prepared
transactions, for retry as needed) you should automatically recover
from this condition.
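The first two points amount to a simple classification rule: a commit is in doubt only when the connection was lost (a FATAL-class error) while a COMMIT, or a single statement carrying an implicit commit, was in flight; any other error means the work was not committed and a blind retry is safe. A sketch of that rule in Python (the SQLSTATE values are real PostgreSQL codes; commit_in_doubt itself is a hypothetical helper):

```python
# SQLSTATE class 57 is operator intervention, e.g. 57P01
# "terminating connection due to administrator command" from the log above;
# class 08 covers connection failures.
FATAL_DISCONNECT_STATES = {"57P01", "57P02", "57P03", "08006", "08003"}

def commit_in_doubt(sqlstate, awaiting_commit):
    """True only when the outcome of the transaction is genuinely unknown.

    awaiting_commit: the client had sent COMMIT (or a single statement with
    an implicit commit) and had not yet received a reply when the error hit.
    """
    if not awaiting_commit:
        return False          # error arrived before COMMIT: not committed
    return sqlstate in FATAL_DISCONNECT_STATES

# An ordinary unique violation mid-transaction is always safe to retry:
print(commit_in_doubt("23505", awaiting_commit=False))   # False
# The case from the log: 57P01 while the implicit commit was in flight:
print(commit_in_doubt("57P01", awaiting_commit=True))    # True
```

A client that tracks awaiting_commit per request can retry blindly in the False case and fall back to a verification query only in the True case.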

What this definitely does not seem to me to be is a bug, which
makes this the wrong forum for further discussion of the issue.

--
Kevin Grittner


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs