Thread: BUG #18875: COPY BINARY tsvector FROM file leads to misaligned memory access
BUG #18875: COPY BINARY tsvector FROM file leads to misaligned memory access
From: PG Bug reporting form
The following bug has been logged on the website:

Bug reference:      18875
Logged by:          Alexander Lakhin
Email address:      exclusion@gmail.com
PostgreSQL version: 17.4
Operating system:   Ubuntu 24.04
Description:

The following script, executed against a build with sanitizers enabled:
CREATE TABLE test_tsvector(t text, a tsvector);
COPY test_tsvector FROM '.../src/test/regress/data/tsearch.data';
COPY BINARY test_tsvector TO '/tmp/t.data';
COPY BINARY test_tsvector FROM '/tmp/t.data';

triggers a runtime error:
2025-04-02 17:23:25.502 UTC [1721608] LOG: statement: COPY BINARY test_tsvector FROM '/tmp/t.data';
tsvector.c:90:59: runtime error: member access within misaligned address 0x52500005a23c for type 'const struct WordEntryIN', which requires 8 byte alignment
0x52500005a23c: note: pointer points here
 04 00 00 00 04 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ^
    #0 0x5e62469fe827 in compareentry .../src/backend/utils/adt/tsvector.c:90
    #1 0x5e62469fe85c in WordEntryCMP .../src/backend/utils/adt/tsvector.c:173
    #2 0x5e6246a025bd in tsvectorrecv .../src/backend/utils/adt/tsvector.c:514
    #3 0x5e6246af3bdc in ReceiveFunctionCall .../src/backend/utils/fmgr/fmgr.c:1715
    #4 0x5e6245c759b4 in CopyReadBinaryAttribute .../src/backend/commands/copyfromparse.c:2048
    #5 0x5e6245c7cd0c in CopyFromBinaryOneRow .../src/backend/commands/copyfromparse.c:1139
    #6 0x5e6245c79428 in NextCopyFrom .../src/backend/commands/copyfromparse.c:890
    #7 0x5e6245c6e429 in CopyFrom .../src/backend/commands/copyfrom.c:1149
    #8 0x5e6245c669e5 in DoCopy .../src/backend/commands/copy.c:306
    #9 0x5e62466463fd in standard_ProcessUtility .../src/backend/tcop/utility.c:738
    ...

Reproduced on REL_10_STABLE .. master.
Re: BUG #18875: COPY BINARY tsvector FROM file leads to misaligned memory access
From: Tom Lane
PG Bug reporting form <noreply@postgresql.org> writes:
> The following script, executed against a build with sanitizers enabled:
> CREATE TABLE test_tsvector(t text, a tsvector);
> COPY test_tsvector FROM '.../src/test/regress/data/tsearch.data';
> COPY BINARY test_tsvector TO '/tmp/t.data';
> COPY BINARY test_tsvector FROM '/tmp/t.data';

> triggers a runtime error:
> 2025-04-02 17:23:25.502 UTC [1721608] LOG: statement: COPY BINARY
> test_tsvector FROM '/tmp/t.data';
> tsvector.c:90:59: runtime error: member access within misaligned address
> 0x52500005a23c for type 'const struct WordEntryIN', which requires 8 byte
> alignment

Hmm. This is evidently because of the type pun involved: WordEntryCMP
is supposed to compare WordEntry structs, but it's turning around and
using compareentry which compares WordEntryIN structs. And those are
larger/better aligned. Now compareentry doesn't access anything outside
the WordEntry part, but it's theoretically possible that the compiler
could generate load instructions that depend on the larger alignment.

Given the lack of field reports, that's not happening on any platforms
where it would matter. But still we ought to clean it up.

ISTM this coding is basically backwards: compareentry should be coded
to work on WordEntry structs, and then if it's used to compare
WordEntry structs that are embedded in WordEntryIN there's no problem.
And then we don't need the WordEntryCMP wrapper at all.

Will fix, thanks for the report!

regards, tom lane
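
For illustration only, here is a minimal sketch of the restructuring described above. It is not the actual patch. DemoEntry and DemoEntryIN are simplified, hypothetical stand-ins for WordEntry and WordEntryIN (the real WordEntry uses bit-fields), and a plain memcmp stands in for the real lexeme comparison. The comparator is written against the smaller struct, so the larger struct reaches it only through the address of its embedded first member and no cast to a more strictly aligned type is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for WordEntry: only needs 4-byte alignment. */
typedef struct
{
    uint32_t len;            /* lexeme length in bytes */
    uint32_t pos;            /* lexeme offset into the string buffer */
} DemoEntry;

/* Hypothetical stand-in for WordEntryIN: the pointer member typically
 * raises the required alignment to 8 bytes on 64-bit platforms. */
typedef struct
{
    DemoEntry entry;         /* must be first */
    uint16_t *positions;     /* assumed extra payload, unused here */
    int       npositions;
} DemoEntryIN;

/* Comparator coded against the smaller struct.  It never assumes the
 * stricter alignment of DemoEntryIN. */
static int
demo_compare_entry(const DemoEntry *a, const DemoEntry *b, const char *buf)
{
    size_t minlen;
    int    cmp;

    assert(a != NULL && b != NULL && buf != NULL);

    minlen = a->len < b->len ? a->len : b->len;
    cmp = memcmp(buf + a->pos, buf + b->pos, minlen);
    if (cmp != 0)
        return cmp;
    /* Equal prefixes: the shorter lexeme sorts first. */
    return (a->len > b->len) - (a->len < b->len);
}

/* Comparing the larger structs just passes the embedded member along;
 * no pointer cast, hence no alignment assumption. */
static int
demo_compare_entry_in(const DemoEntryIN *a, const DemoEntryIN *b,
                      const char *buf)
{
    return demo_compare_entry(&a->entry, &b->entry, buf);
}
```

Going in this direction (from the larger struct to its first member) is always safe, whereas casting a pointer to a bare DemoEntry array element into a DemoEntryIN pointer is what a sanitizer would flag when the element is only 4-byte aligned.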