Thread: Cache lookup failed for type 34813 (CREATE TYPE AS ENUM + P/B/E insert, processor-specific)

Hello,

I have a strange bug which is either processor-specific (occurs on AMD 
Ryzen 3 PRO 4350G processor, but not on Intel) or Linux kernel version 
specific. It occurs with PostgreSQL v16.3 as packaged by Debian: 16.3 
(Debian 16.3-1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 
13.2.0-24) 13.2.0, 64-bit, running on Linux 6.8.12-amd64 / Debian GLIBC 
2.38-13.

I see a PostgreSQL error "cache lookup failed for type 34813" after the 
following sequence of network requests over a single network connection 
to PostgreSQL (which is otherwise idle):

CREATE TYPE rating AS ENUM('ungood', 'good')
CREATE TABLE act(name TEXT, value RATING)
INSERT INTO act VALUES('thoughtcrime', 'ungood') -- OK
INSERT INTO act VALUES('blackwhite', $1) -- with $1 = "good" / type oid for "rating" / text format

The last line, which triggers the error, is using the 
prepare-bind-execute protocol, with a single parameter sent in text 
mode, with the oid corresponding to the "CREATE TYPE" (queried from the 
pg_type table). This is with an obscure Emacs Lisp implementation of the 
frontend/backend protocol. I have produced a standalone Emacs Lisp file 
which reproduces the problem, available as a gist here

    https://gist.github.com/emarsden/9bbaf02d62961cabaf5404f10da2c01c

To reproduce (requires Emacs to be installed), save as "cache-bug.el" 
and run

PGURI="postgresql://pgeltestuser:password@localhost/pgeltestdb" emacs 
--batch -Q -l cache-bug.el -f run

This produces a long backtrace of which the last line is (with FR locale)

    PostgreSQL error: "ERREUR : cache lookup failed for type 34841"

I have not made the effort to reproduce this with the psycopg Python 
library (the use of custom types is a little inconvenient).

This bug does not occur on the exact same version of PostgreSQL on 
Debian (same sha256sum on the postmaster binary) running on an Intel 
processor with an older kernel version. Nor does it occur on the problem 
machine when running the current Docker image for PostgreSQL (neither 
docker.io/library/postgres:16.3-bullseye nor the 16.3-alpine label).




Eric Marsden <eric.marsden@risk-engineering.org> writes:
> I see a PostgreSQL error "cache lookup failed for type 34813" after the
> following sequence of network requests over a single network connection
> to PostgreSQL (which is otherwise idle):

> CREATE TYPE rating AS ENUM('ungood', 'good')
> CREATE TABLE act(name TEXT, value RATING)
> INSERT INTO act VALUES('thoughtcrime', 'ungood') -- OK
> INSERT INTO act VALUES('blackwhite', $1) -- with $1 = "good" / type oid for "rating" / text format

Interesting.  To clarify, are these commands run within a single
transaction, and if not where are the transaction boundaries?

Would you run this under log_error_verbosity = verbose and post the
full postmaster log entry for the failure, especially the LOCATION
data?  (Even better would be a stack trace from the errfinish call,
but perhaps LOCATION will be enough.)

            regards, tom lane



On 21/06/2024 17:35, Tom Lane wrote:
Eric Marsden <eric.marsden@risk-engineering.org> writes:
I see a PostgreSQL error "cache lookup failed for type 34813" after the 
following sequence of network requests over a single network connection 
to PostgreSQL (which is otherwise idle):
CREATE TYPE rating AS ENUM('ungood', 'good')
CREATE TABLE act(name TEXT, value RATING)
INSERT INTO act VALUES('thoughtcrime', 'ungood') -- OK
INSERT INTO act VALUES('blackwhite', $1) -- with $1 = "good" / type oid for "rating" / text format
 Thanks to very useful discussions on #postgresql (IRC), I have found out that this is a user error rather than a PostgreSQL bug. My PostgreSQL network protocol library maintains an oid <-> type-name cache that is populated at connection startup time. Due to improper cleanup in the tests, a old value for the type oid was remaining when testing on my own computer (but not in Docker nor for clean test runs on other machines), so the PostgreSQL error message is perfectly valid. 

I have realized that I need to stop using this oid <-> type-name cache for types that are not PostgreSQL builtins, because there is no way of ensuring it's synchronized with the backend values.

Thanks, and apologies for the spurious report,
Eric