An issue with missing rows - Mailing list pgsql-bugs

From Viktor Semykin
Subject An issue with missing rows
Date
Msg-id CACaAQPXmQJR-XtMGsuXRBAq21GYWxzPaWKHTDELmq5OWFYDiFA@mail.gmail.com
Whole thread Raw
Responses Re: An issue with missing rows  (Viktor Semykin <thesame.ml@gmail.com>)
List pgsql-bugs
I have faced a weird issue with PostgreSQL database in conjunction with
nodejs+Sequelize. I saw this issue happening in 13.1 and 10.5. nodejs is 12.16.
Please let me know if I have to re-format this email somehow for
comfortable reading.

I'll try to explain step by step what's going on. Sequelize
library test suite includes a test that have been passing successfully until I
broke it. After my changes, the following queries fail about once in 20 times.

In a nutshell, my queries look like this (here and below all createdAt and
updatedAt fields may be inaccurate):

```sql
  CREATE SCHEMA IF NOT EXISTS archive;
  DROP TABLE IF EXISTS "archive"."Tasks" CASCADE;
  DROP TABLE IF EXISTS "archive"."Users" CASCADE;
  CREATE TABLE IF NOT EXISTS "archive"."Users" ("id"   SERIAL ,
"createdAt" TIMESTAMP WITH TIME ZONE NOT NULL, "updatedAt" TIMESTAMP
WITH TIME ZONE NOT NULL, PRIMARY KEY ("id"));
  CREATE TABLE IF NOT EXISTS "archive"."Tasks" ("id" INTEGER , "title"
VARCHAR(255), "createdAt" TIMESTAMP WITH TIME ZONE NOT NULL,
"updatedAt" TIMESTAMP WITH TIME ZONE NOT NULL, "UserId" INTEGER
REFERENCES "archive"."Users" ("id") ON DELETE SET NULL ON UPDATE
CASCADE, PRIMARY KEY ("id"));
  INSERT INTO "archive"."Users" ("id","createdAt","updatedAt") VALUES
(1,'...','...') RETURNING "id","createdAt","updatedAt";
  INSERT INTO "archive"."Users" ("id","createdAt","updatedAt") VALUES
(2,'...','...') RETURNING "id","createdAt","updatedAt";
  INSERT INTO "archive"."Tasks"
("id","title","createdAt","updatedAt","UserId") VALUES
(1,'b','...','...',1) RETURNING
"id","title","createdAt","updatedAt","UserId";
  INSERT INTO "archive"."Tasks"
("id","title","createdAt","updatedAt","UserId") VALUES
(2,'d','...','...',1) RETURNING
"id","title","createdAt","updatedAt","UserId";
  INSERT INTO "archive"."Tasks"
("id","title","createdAt","updatedAt","UserId") VALUES
(3,'c','...','...',1) RETURNING
"id","title","createdAt","updatedAt","UserId";
  INSERT INTO "archive"."Tasks"
("id","title","createdAt","updatedAt","UserId") VALUES
(4,'a','...','...',1) RETURNING
"id","title","createdAt","updatedAt","UserId";
  INSERT INTO "archive"."Tasks"
("id","title","createdAt","updatedAt","UserId") VALUES
(5,'a','...','...',2) RETURNING
"id","title","createdAt","updatedAt","UserId";
  INSERT INTO "archive"."Tasks"
("id","title","createdAt","updatedAt","UserId") VALUES
(6,'c','...','...',2) RETURNING
"id","title","createdAt","updatedAt","UserId";
  INSERT INTO "archive"."Tasks"
("id","title","createdAt","updatedAt","UserId") VALUES
(7,'b','...','...',2) RETURNING
"id","title","createdAt","updatedAt","UserId";
  SELECT "User"."id", "User"."createdAt", "User"."updatedAt" FROM
"archive"."Users" AS "User" ORDER BY "User"."id" ASC;
  SELECT "Task".* FROM (SELECT * FROM (SELECT "id", "title",
"createdAt", "updatedAt", "UserId" FROM "archive"."Tasks" AS "Task"
WHERE "Task"."UserId" = 1 ORDER BY "Task"."id" ASC LIMIT 2) AS sub
UNION ALL SELECT * FROM (SELECT "id", "title", "createdAt",
"updatedAt", "UserId" FROM "archive"."Tasks" AS "Task" WHERE
"Task"."UserId" = 2 ORDER BY "Task"."id" ASC LIMIT 2) AS sub) AS
"Task";
  DROP SCHEMA IF EXISTS archive CASCADE;
```

in other words, in instantiates a test data like this:

```
sequelize_test=# select * from "archive"."Users";
 id |           createdAt           |           updatedAt
----+-------------------------------+-------------------------------
  1 | 2021-02-13 16:59:04.190643-03 | 2021-02-13 16:59:04.190643-03
  2 | 2021-02-13 16:59:04.217696-03 | 2021-02-13 16:59:04.217696-03
(2 rows)

sequelize_test=# select * from "archive"."Tasks";
 id | title |           createdAt           |           updatedAt
     | UserId
----+-------+-------------------------------+-------------------------------+--------
  1 | b     | 2021-02-13 16:59:04.222593-03 | 2021-02-13
16:59:04.222593-03 |      1
  2 | d     | 2021-02-13 16:59:04.238252-03 | 2021-02-13
16:59:04.238252-03 |      1
  3 | c     | 2021-02-13 16:59:04.24318-03  | 2021-02-13
16:59:04.24318-03  |      1
  4 | a     | 2021-02-13 16:59:04.256812-03 | 2021-02-13
16:59:04.256812-03 |      1
  5 | a     | 2021-02-13 16:59:04.262364-03 | 2021-02-13
16:59:04.262364-03 |      2
  6 | c     | 2021-02-13 16:59:04.277525-03 | 2021-02-13
16:59:04.277525-03 |      2
  7 | b     | 2021-02-13 16:59:04.281861-03 | 2021-02-13
16:59:04.281861-03 |      2
```

Then I'm trying to fetch all Users, and for each one of them only two tasks
ordered by "id". This is what my query (the Faulty Query) looks like:

```sql
SELECT "User"."id",
       "User"."createdAt",
       "User"."updatedAt",
       "tasks"."id" AS "tasks.id",
       "tasks"."title" AS "tasks.title",
       "tasks"."createdAt" AS "tasks.createdAt",
       "tasks"."updatedAt" AS "tasks.updatedAt",
       "tasks"."UserId" AS "tasks.UserId"
FROM "archive"."Users" AS "User"
LEFT OUTER JOIN LATERAL
  (SELECT *
   FROM "archive"."Tasks" AS "tasks"
   WHERE "User"."id" = "tasks"."UserId"
   LIMIT 2) AS "tasks" ON TRUE
ORDER BY "User"."id" ASC;
```

and most of the time it returns result as expected:

```
 id |           createdAt           |           updatedAt           |
tasks.id | tasks.title |        tasks.createdAt        |
tasks.updatedAt        | tasks.UserId

----+-------------------------------+-------------------------------+----------+-------------+-------------------------------+-------------------------------+--------------
  1 | 2021-02-13 16:59:04.190643-03 | 2021-02-13 16:59:04.190643-03 |
      1 | b           | 2021-02-13 16:59:04.222593-03 | 2021-02-13
16:59:04.222593-03 | 1
  1 | 2021-02-13 16:59:04.190643-03 | 2021-02-13 16:59:04.190643-03 |
      2 | d           | 2021-02-13 16:59:04.238252-03 | 2021-02-13
16:59:04.238252-03 | 1
  2 | 2021-02-13 16:59:04.217696-03 | 2021-02-13 16:59:04.217696-03 |
      5 | a           | 2021-02-13 16:59:04.262364-03 | 2021-02-13
16:59:04.262364-03 | 2
  2 | 2021-02-13 16:59:04.217696-03 | 2021-02-13 16:59:04.217696-03 |
      6 | c           | 2021-02-13 16:59:04.277525-03 | 2021-02-13
16:59:04.277525-03 | 2
```

note that Tasks ids are 1,2,5,6. But there are chances that it returns
wrong tasks, for example 1,2,5,7. The problem is that whenever I add logging
it works correctly 100% of the time.

The good news I was lucky enough to catch the error with Wireshark. Sequelize
uses connection pooling so the queries are spread across 5 separate
tcp connections and
Wireshark output is quite messy. I had to parse the output with an ad-hoc
script that generates a visual graph of the underlying events. This graph is
attached as graph.svg. It's handy to open it in the browser and zoom
out.  Graphviz
is not my strongest point, so let me briefly explain what this graph means.
Each node is a pgsql-relevant network frame. Red edges denote TCP connections.
Black arrows denote sequence order. The red nodes are what matters. Each frame
begins with its sequence number and a direction (in/out). Body of each node
shows the list of commands sent/received.

Frames 1062-1084 are boring preludes. Then at 1086 comes the CREATE
SCHEMA query.
Then come some Sequelize noise like DROP TABLE IF EXISTS of tables that do not
exist. Then in 1098 and 1105 it actually queries table creation.
At 1109 it starts INSERTing data into "Users". Then the pool comes into play.
At 1122 it spawns one more connection and at 1130 it starts INSERTing the second
row into "Users". At 1142 and 1146 the Postgres reports both connections are
"Ready for query". Then at 1148 it starts INSERTing a row into "Tasks" in the
first connection, and at 1165 a row in the second connection.
At 1180 the fun begins, it spawns three more connections and keeps INSERTing
rows into "Tasks" using all the connections.
Around 1263-1270 all the INSERT queries are executed and the connections
reported "Ready for query".

At 1273 it starts SELECTing the Faulty Query and 1275 returns the seem-to-be
incorrect result. The Tasks rows in the response are
`(1,'b'),(2,'d'),(5,'a'),(7,'b')`. It looks like (6,'c') haven't landed yet.

So I come to the main question: did I find a bug or my assumptions about
row visibility are just wrong?

Attachment

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: Unexpected serialization error
Next
From: Viktor Semykin
Date:
Subject: Re: An issue with missing rows