Re: READ COMMITTED vs. index-only scans - Mailing list pgsql-general

From Melvin Davidson
Subject Re: READ COMMITTED vs. index-only scans
Date
Msg-id CANu8Fiwc7mdZBLxPaWMtcjkYxBiQMm_afe4RVUGSFBa+9sXwHA@mail.gmail.com
Whole thread Raw
In response to Re: READ COMMITTED vs. index-only scans  (Jacek Kołodziej <kolodziejj@gmail.com>)
List pgsql-general

On Wed, Jan 17, 2018 at 10:45 AM, Jacek Kołodziej <kolodziejj@gmail.com> wrote:


17.01.2018 3:30 PM "Melvin Davidson" <melvin6925@gmail.com> napisał(a):


On Wed, Jan 17, 2018 at 8:08 AM, Jacek Kołodziej <kolodziejj@gmail.com> wrote:
Hello everyone,

this is my first post here and I'm starting with asking a question about data consistency between two consecutive SELECTs using PostgreSQL 9.6.

I'm sorry if that's something that was already discussed - I couldn't find it either in archives, nor in _general internet_. If it is, I would appreciate pointing it out.

I have an "append-only" events table - only INSERT and SELECT queries are issued to it. It has an integer (sequence) ID as a primary key.

I'm issuing following two queries (within the same READ COMMITTED transaction) to that table:
- first query (A): SELECT id FROM events ORDER BY events.id DESC LIMIT 1 - I'm saving the result as a "max ID" for the second query
- second query (B): SELECT id, ... FROM events WHERE id > "min ID" AND id <= "max ID" AND ...
  - I won't dig into what's "min ID" but you can assume it's at most 100 less than "max ID"

Concurrently, rows are being added to that table.

Please note that there's enough data in the index for the first query (A) to perform an index-only scan. I'm not sure if that's relevant but "B" query does an index scan because of other conditions I haven't mentioned but still needs to fetch data from the table's heap.

Here's what happening to me: the "A" query occasionally (in my case: on the order of tenths per day) returns an ID _higher_ than any ID present in second query's result (other conditions I haven't specified do _not_ filter any more rows than "id <= max ID") - as if some entries were visible for the first query, but not for the second one. This is an inconsistency that is very problematic for me.

All I can think of is that it might be caused by the index-only-scan and READ COMMITTED transaction isolation level but from what I gather from documentation, it should not be possible due to the use of visibility map[0][1][2].

And yet it happens, likely for some other reason but I can't think of any. I've tried switching transaction isolation level to REPEATABLE READ (on the basis that it's an faulty phenomenon occurring during to some bug) but that didn't help.

Where am I wrong? What am I missing? What information may I provide to help with investigating this?


--
Kind regards,
Jacek Kołodziej
http://kolodziejj.info


>- first query (A): SELECT id FROM events ORDER BY events.id DESC LIMIT 1 - I'm saving the result as a "max ID" for the second query

Just a suggestion. The first query is not really needed.
You can simply do:
second query (B):

SELECT id
, ...
  FROM events
 WHERE id > MIN(ID)
   AND id <= MAX(ID)
   AND ...

MAX and MIN functions

Hi Melvin, thank you for the suggestion. Unfortunately it won't do in my case. Sorry for not providing enough context in the first place.

After making the first query (A), I'm iterating over that table (with LIMIT 100 and increasing OFFSET) - using a query "B" - until another condition is met; overall, code is supposed to gather a number of rows from the table. I'm also using the "max ID" for another purpose. 



--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.



Oops, my bad. I didn't have my morning coffee yet.
Try this instead:

WITH minmax AS
  (SELECT MIN(id) as min_id,
                   MAX(id AS max_id)
         FROM events
 )        
SELECT id
, ...
  FROM events e,
              minmax m
 WHERE e.id > m.min_id
   AND e.id <= m.max_id
   AND ...



--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

pgsql-general by date:

Previous
From: Brysounds
Date:
Subject: Re: Configuration of pgaudit settings in postgreSQL.conf causespostgreSQL to fail to start
Next
From: Tom Lane
Date:
Subject: Re: ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100