We have a .NET application which subscribes to a logical replication slot using wal2json. The purpose of the application is to publish events to AWS SQS. We are required to first "enrich" these events by querying the database.
We have found that these queries will often find old data (i.e. the data that existed prior to the update) which is unexpected.
For example, when I use a psql terminal to update a user's last name from "Jones" to "Smith" then I would expect the enrichment query to find "Smith" but it will sometimes still find "Jones". It finds the old data perhaps 1 in 50 times.
To reproduce this I use a psql terminal to execute an update statement which changes a user's last name with \watch 1.5 and monitor our application logs for cases where the wal2json output and the enrichment query output have different last names.
We have compared transaction ids by adding include-xids to pg_recvlogical and adding txid_current() to to the enrich query and the txid_current() is always the xid + 1.
We have found two things that appear to resolve the problem -
- Using a more powerful EC2 instance. We can reproduce the issue with a r7a.medium instance but not with a r7a.large EC2 instance.
- Changing the Postgres synchronous_commit parameter from "on" to "off". We cannot reproduce the issue with synchronous_commit set to "off".
We need help to understand this unexpected behaviour.
We are using Postgres 14.10.
Thanks,