Thread: BUG #16561: timestamp with time zone microseconds inconsistently recorded correctly and incorrectly
BUG #16561: timestamp with time zone microseconds inconsistently recorded correctly and incorrectly
From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      16561
Logged by:          Peter Thomas
Email address:      peter@mccarthy.co.nz
PostgreSQL version: 9.5.9
Operating system:   Windows 10 x64
Description:

Windows C++ code issues an UPDATE query to a local Windows PostgreSQL 9.5.9 server over libpq. One field being updated is a timestamp provided as a UTC text representation (e.g. '2020-07-29T22:30:00.124248Z') but stored as timestamp with time zone. The sub-second component of the timestamp is not consistently written - sometimes it is stored correctly, sometimes incorrectly. It is always the sub-second part of the time that differs (including its more significant digits), never the date/time from seconds upwards.

The query is verified client side and then logged by PostgreSQL, confirming that the client query is received by PostgreSQL verbatim. Interestingly, the same query applied manually through pgAdmin III never stores the microsecond component incorrectly, even after numerous attempts. The query is not performed within a transaction and occurs immediately after creating the record that is then updated. This is the test case that revealed the problem. The manual test through pgAdmin III differs in that the record had already been updated at least once.
The query is similar to the following ("locked_tz" and "uid" vary):

UPDATE "tnt_res_b195217c_cfc8_11ea_8c1b_00155dce25bc".job
SET
  locked_tz = CASE WHEN locked_tz IS NULL
      THEN '2020-07-29T22:30:00.124248Z' ELSE locked_tz END,
  locked_by_operative_uid = CASE WHEN locked_by_operative_uid IS NULL
      THEN 'b32ffd2c-cfc8-11ea-987d-00155dce25bc' ELSE locked_by_operative_uid END,
  version = CASE WHEN locked_tz IS NULL THEN version + 1 ELSE version END,
  description = '2020-07-29T22:30:00.124248Z' -- added for debugging
WHERE uid = '09dbe5d6-d1eb-11ea-9185-00155dce25bc'
RETURNING locked_tz, locked_by_operative_uid;

The query is written this way to ensure that locked_tz and locked_by_operative_uid are returned regardless of whether the values are updated. It is a "job locking" query. When executed in the test code, the pre-existing values of locked_tz and locked_by_operative_uid are both NULL. The time was also written into the "description" column as a sanity check (debugging).

The output from two back-to-back test runs shows the first succeeding and the second failing:

"07ff87e0-d1eb-11ea-93e6-00155dce25bc";"b32ffd2c-cfc8-11ea-987d-00155dce25bc";"New Job";"2020-07-29T22:29:56.992816Z";"";"";"";"JOTRAINING";"";FALSE;"b32ffd2c-cfc8-11ea-987d-00155dce25bc";"2020-07-30 10:29:56.992816+12";"";"";"";"";"2020-07-30 10:29:56.999488+12";1
"09dbe5d6-d1eb-11ea-9185-00155dce25bc";"b32ffd2c-cfc8-11ea-987d-00155dce25bc";"New Job";"2020-07-29T22:30:00.124248Z";"";"";"";"JOTRAINING";"";FALSE;"b32ffd2c-cfc8-11ea-987d-00155dce25bc";"2020-07-30 10:30:00.114735+12";"";"";"";"";"2020-07-30 10:30:00.125314+12";1

The description column, containing the supplied time as a (debug) string, follows the name column containing "New Job". In the second record you will see that the time in the description has a different microsecond component from that in the "+12" timestamp with time zone field. I plan to upgrade and see whether this goes away, but will wait in case more info is required.
As an FYI, the table definition is as follows:

CREATE TABLE tnt_res_b195217c_cfc8_11ea_8c1b_00155dce25bc.job
(
  uid uuid NOT NULL DEFAULT uuid_generate_v4(),
  creator_operative_uid uuid NOT NULL,
  name character varying(1024),
  description character varying(16384),
  scenario_uid uuid,
  scenario_jdoc json,
  folder_uid uuid,
  job_type_code character varying(10) NOT NULL,
  time_zone character varying(64),
  allow_lc_res boolean NOT NULL DEFAULT false,
  locked_by_operative_uid uuid,
  locked_tz timestamp with time zone,
  closed_tz timestamp with time zone,
  purged_tz timestamp with time zone,
  execution_jdoc json,
  extension_jdoc json,
  created_tz timestamp with time zone NOT NULL DEFAULT now(),
  version integer NOT NULL DEFAULT 0,
  CONSTRAINT job_pkey PRIMARY KEY (uid)
)
WITH (
  OIDS=FALSE,
  autovacuum_enabled=true
);

postgresql.conf:

timezone = 'Pacific/Auckland'
Re: BUG #16561: timestamp with time zone microseconds inconsistently recorded correctly and incorrectly
From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes:
> One field being updated is a timestamp provided as UTC text representation
> (e.g. '2020-07-29T22:30:00.124248Z') but stored as timestamp with time
> zone. The timestamp sub-second component is not consistently written -
> sometimes it is stored correctly, sometimes it is stored incorrectly. Always
> the sub-second part of the time (including more significant digits) and
> never the date/time from seconds upwards.

Given the described query:

> UPDATE "tnt_res_b195217c_cfc8_11ea_8c1b_00155dce25bc".job
> SET
> locked_tz = CASE WHEN locked_tz IS NULL THEN '2020-07-29T22:30:00.124248Z'
> ELSE locked_tz END,
> locked_by_operative_uid = CASE WHEN locked_by_operative_uid IS NULL THEN
> 'b32ffd2c-cfc8-11ea-987d-00155dce25bc' ELSE locked_by_operative_uid END,
> version = CASE WHEN locked_tz IS NULL THEN version + 1 ELSE version END,
> description='2020-07-29T22:30:00.124248Z' -- added for debugging
> WHERE uid = '09dbe5d6-d1eb-11ea-9185-00155dce25bc'
> RETURNING locked_tz, locked_by_operative_uid;

what seems far more likely than random data corruption is that some other transaction updated this same row slightly earlier, setting the locked_tz value that you are reading back. The CASE in this query would then have preserved that value, but the description field would get updated anyway.

In the specific example you show, if I'm not confused, the reported locked_tz value is a bit older than the description value, so this sequence of events seems very plausible. But even if the order were reversed, that wouldn't immediately destroy this theory, because you haven't said where the timestamps are coming from. The transaction that got to the row first could possibly try to store a "newer" timestamp than the one that got there second, unless there's some guarantee about how those timestamps are computed that you've not described.

In short, I think you need to take a hard look at whatever logic you think is preventing concurrent selection of the same job row by multiple transactions, because from here it looks like that's not working reliably. I won't completely deny that there could be a Postgres bug here, but there are enough moving parts missing from this bug report that it can't be investigated usefully.

			regards, tom lane
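Tom's theory can be illustrated with a toy, stand-alone simulation (a sketch written here for this summary, not from the thread, and in Python rather than SQL): two writers apply the same CASE-based "job lock" update to one row. The second writer's CASE preserves the first writer's locked_tz, but its unconditional description assignment still lands, so the two columns end up with different microsecond values without any corruption.

```python
def lock_update(row: dict, ts: str) -> None:
    # Mirrors: locked_tz = CASE WHEN locked_tz IS NULL THEN ts ELSE locked_tz END
    if row["locked_tz"] is None:
        row["locked_tz"] = ts
        row["version"] += 1  # version = CASE WHEN locked_tz IS NULL THEN version + 1 ...
    # Mirrors the unconditional debug column: description = ts
    row["description"] = ts

row = {"locked_tz": None, "description": None, "version": 0}
lock_update(row, "2020-07-30T22:30:00.114735Z")  # earlier writer wins the "lock"
lock_update(row, "2020-07-29T22:30:00.124248Z")  # later writer, as in the report
print(row["locked_tz"])    # the first writer's timestamp is preserved
print(row["description"])  # the second writer's timestamp lands anyway
```

The point of the sketch: a timestamp mismatch between locked_tz and description is exactly what a lost race looks like under this query shape, with no storage bug required.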
Re: BUG #16561: timestamp with time zone microseconds inconsistently recorded correctly and incorrectly
From
Peter Thomas
Date:
Hi Tom,
Thanks for the detailed consideration.
I can confirm that there is only one connection/transaction accessing the job record on the given database.
The flow is simply:
initiate test
create job
update job (using provided query)
compare timestamps
No other logic has access to the "job" record that is created in the isolated test (on the given database) other than an AFTER ROW trigger that simply does a NOTIFY (no modification of data).
The timestamp was generated by determining "now" in C++ code and provided as an ISO8601 string in the query.
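For reference, a minimal sketch of producing that kind of string (in Python rather than the reporter's C++; the function name is assumed, not from the thread):

```python
from datetime import datetime, timezone

def iso8601_utc_now() -> str:
    # %f emits exactly six fractional digits (microseconds), matching the
    # precision PostgreSQL stores for timestamp with time zone.
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ")

print(iso8601_utc_now())  # e.g. '2020-07-29T22:30:00.124248Z'
```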
When I say "on the given database" that is because the same logic is executed concurrently on two databases hosted by the same PostgreSQL server. Two separate connections. Two independent threads. Each thread using its own connection for both operations (create then update). Each thread using the same query.
So, the same update query (hence time) gets applied to two databases on the same server in parallel. In the real (production) world these will be separate servers.
Regards,
Peter Thomas
Re: BUG #16561: timestamp with time zone microseconds inconsistently recorded correctly and incorrectly
From
Thomas Munro
Date:
On Thu, Jul 30, 2020 at 11:52 AM PG Bug reporting form <noreply@postgresql.org> wrote:
> One field being updated is a timestamp provided as UTC text representation
> (e.g. '2020-07-29T22:30:00.124248Z') but stored as timestamp with time
> zone. The timestamp sub-second component is not consistently written -
> sometimes it is stored correctly, sometime it is stored incorrectly. Always
> the sub second part of the time (including more significant digits) and
> never the date/time from seconds upwards.

Kia ora,

Just to rule out another theory: if you run pg_controldata -D pgdata, you can see which storage format is used for timestamps:

Date/time type storage: 64-bit integers

Before release 10, it was possible for it to use floating point storage instead of integers; I wonder if that could be a factor here. There's a note about that here:

https://www.postgresql.org/docs/9.5/datatype-datetime.html
Re: BUG #16561: timestamp with time zone microseconds inconsistently recorded correctly and incorrectly
From
Peter Thomas
Date:
Hi Thomas,
pg_controldata -D pgdata
returns
Date/time type storage: 64-bit integers
Regarding the parallel execution on two same-server databases: I can also reproduce the problem running sequentially (same thread) across the two databases.
(Note: in both the parallel and sequential cases, the same-query inserts on both databases complete before the same-query updates are executed.)
What is interesting is that in either threading case the incorrect timestamp values are written consistently in both databases, which is a down-vote for random corruption.
Regards,
Peter Thomas
Re: BUG #16561: timestamp with time zone microseconds inconsistently recorded correctly and incorrectly
From
Peter Thomas
Date:
Hi Tom, Thomas,
Tom was correct. You may mark this as closed.
Highly embarrassing.
Default C++ param propagation caused locked_tz to be set on create.
If it's any consolation I've been staring at this for two days.
Sorry to have wasted everybody's time.
Regards,
Peter Thomas
Re: BUG #16561: timestamp with time zone microseconds inconsistently recorded correctly and incorrectly
From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> Before release 10, it was possible for it to use floating point
> storage instead of integers; I wonder if that could be a factor here.

I wondered about that briefly as well. It doesn't seem to fit the facts though. (1) Integer timestamps have been the default since 8.4. (2) If the OP were using float timestamps, he'd have to be working with dates millions of years away from the epoch to lose any substantial fraction of a second in precision. 2^53 seconds is ~285 million years, if I've not miscounted. Even dropping the last microsecond digit won't happen for dates within a couple hundred years of AD 2000.

			regards, tom lane
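A quick way to check that arithmetic (a sketch assuming, as the thread does, that pre-10 float timestamps are C doubles counting seconds from the 2000-01-01 epoch, i.e. 53 significant bits):

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year

# Whole seconds remain exactly representable out to 2^53 seconds from the epoch:
print(2.0 ** 53 / SECONDS_PER_YEAR / 1e6)  # ~285 (million years)

# Resolution of a double about 20 years from the epoch (roughly the 2020 case):
print(math.ulp(20 * SECONDS_PER_YEAR))  # ~1.2e-07 s, still finer than a microsecond
```

So for a 2020 timestamp even float storage keeps full microsecond precision, which is why the theory doesn't fit the facts.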