Home > mailing lists

Thread: [BUGS] data inconsistency between client and server at pg_ctl stop -m fast

[BUGS] data inconsistency between client and server at pg_ctl stop -m fast

From

SCHAUHUBER Hermann

Date:

27 January 2017, 08:16:57

Hello

Postgresql version of psql (PostgreSQL) 9.6.0

System: Red Hat Enterprise Linux Server release 6.8 (Santiago)

Test method:

- The client (used libpq in c++ or also psql in a perl) tries to insert a counter into testtable

(insert was chosen for easy checks, but mostly we use updates and functions)

- Whenever an error is returned to the client, the client tries is to write it again

Problem:

- the psql client receives an error but the record was inserted into the table!

- So the record is finally insertet twice

I could imagine one of several methods how the problem could be solved, may be some (hidden) method already exists.

1) A shutdown method like pg_ctl stop -m smart would be helpful:

a) which does not allow new connections (as it is now)

b) which finishes active transations but does not allow new WRITING transactions (or also no reading transctions)

2) A similar function like pg_cancel_backup would be helpful, which also DISALLOWS to start new "writing" transactions

(or inhibits any new transaction), and which guarantees that the cancelled transactions are reported with error

and the successful transactions are reported with OK to the client.

3) The problem could be overcome, if the would be a command which disallows any new writing (or also reading) transactions.

This command could be called e.g few seconds before a pg_ctl stop -m fast is executed - so all transactions

which will be finished within these few seconds will have correct results for the client.

4) The problem would be solved if the pg_ctl stop -m fast behaves a little bit different:

if a transaction is finished here (example below), the server should send success to the client

and terminate the connection afterwards

if a transaction is really aborted an error shall be sent to the client

pg_ctl stop -m fast was executed at this time:

At test case when about 1000000 records were insertet 29 entries were inserted twice, the first error was at this time:

<2017-01-26 14:21:31.837 CET H:[local] U:postgres P:103530> LOG: 00000: duration: 1.308 ms statement: select * from insert_testtable(57121);

<2017-01-26 14:21:31.837 CET H:[local] U:postgres P:103530> LOCATION: exec_simple_query, postgres.c:1171

<2017-01-26 14:21:31.837 CET H:[local] U:postgres P:103530> FATAL: 57P01: terminating connection due to administrator command

-------------------------------------------------------------------------------------------------------------------------------

<2017-01-26 14:21:31.837 CET H:[local] U:postgres P:103530> LOCATION: ProcessInterrupts, postgres.c:2875

<2017-01-26 14:21:31.837 CET H:[local] U:postgres P:103530> STATEMENT: select * from insert_testtable(57121);

<2017-01-26 14:21:31.837 CET H:[local] U:postgres P:103530> DEBUG: 00000: shmem_exit(1): 1 before_shmem_exit callbacks to make

At next start of the server (pg_ctl start) the client (psql) inserted the same value (57121) again, as it got an error before.

Readout of the table of this counter:

db1=# select * from testtable where counter = 57121;

counter | insert_time

---------+----------------------------

57121 | 2017-01-26 14:21:31.835887

57121 | 2017-01-26 14:21:35.967155

(2 rows)

The perl program output at this time was:

-------------------------------------------

inserted 57119

insert_testtable

------------------

57120

(1 row)

inserted 57120

FATAL: terminating connection due to administrator command

server closed the connection unexpectedly

This probably means the server terminated abnormally

before or while processing the request.

connection to server was lost

*** error insert 57121

psql: FATAL: the database system is shutting down

*** error insert 57121

psql: FATAL: the database system is shutting down

*** error insert 57121

....

*** error insert 57121

psql: FATAL: the database system is starting up

*** error insert 57121

insert_testtable

------------------

57121

(1 row)

Attachments:

=============

1) postgresql_pgctl_stop_mfast_during_transaction.log ... part where the problem occured

2) database dump (dbname: db1, used table test_table) used for this test (including data content after test)

db1.dump.sql ( pg_dump -p 5445 -U postgres db1 > db1.dump.sql)

3) server config files (postgresql.conf, pg_hba.conf)

4) Perl program - writing incremented counter to testtable: insert_counter.pl

5) script which periodically stops and starts the db server: periodic_start_stop_db.sh

Note: the perl and shell script were executed on database server locally.

Environment:

============

PG_VERSION=9.6

Installed Packages:

version of psql (PostgreSQL) 9.6.0

postgresql96-libs-9.6.0-1PGDG.rhel6.x86_64

postgresql96-contrib-9.6.0-1PGDG.rhel6.x86_64

postgresql96-server-9.6.0-1PGDG.rhel6.x86_64

postgresql96-9.6.0-1PGDG.rhel6.x86_64

Kind regards

Hermann Schauhuber

Sonorys Technology

A-1210 Vienna

Attachment

test_report.tar.gz

Re: [BUGS] data inconsistency between client and server at pg_ctlstop -m fast

From

Kevin Grittner

Date:

07 February 2017, 22:16:10

On Thu, Jan 26, 2017 at 11:16 PM, SCHAUHUBER Hermann
<hermann.schauhuber@sonorys.at> wrote:

> - The client (used libpq in c++ or also psql in a perl) tries to
> insert a counter into testtable

> - Whenever an error is returned to the client, the client tries
> is to write it again

> - the psql client receives an error but the record was inserted
> into the table!
>
> - So the record is finally insertet twice

As I understand your complaint, there is a short time between when
the request to commit modifications to the database is sent and
when the indication of successful commit is received by the client,
during which a broken connection leaves the status of the commit in
doubt. This much is unavoidable.

You further note that `pg_ctl -m fast` has more of a tendency to
cause this indeterminate state than is necessary.  That may be
true.  There may be room for a feature enhancement here, although I
don't see where it rises to the level of a bug.  Such a feature
could never eliminate occurrences of the indeterminate state; it
could only reduce the frequency of them.

Meanwhile, there is a lot you could do to reduce the frequency of
this issue.

  -  If you use explicit transactions, the indeterminate state
could only occur on a COMMIT request, never on any other request.
You could safely assume that an error on any transaction not
waiting on a COMMIT request was indeed not committed.

  -  Off-hand, I can't think of where you could see this except
when the error level at the client side was FATAL.  I think you
could safely assume that if the level is anything else that the
work was not committed.

  -  You might try performing a graceful termination of client
applications before shutting down the database.

  -  I understand that the so-called "smart" shutdown is often
impractical for production environments; however where it is used
the shutdown cannot cause the problem you describe.

  -  Finally, if you use two-phase commit with a high-quality
transaction manager (which independently keeps track of prepared
transactions, for retry as needed) you should automatically recover
from this condition.

What this definitely does not seem to me to be is a bug, which
makes this the wrong forum for further discussion of the issue.

--
Kevin Grittner

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs