Thread: BUG #16082: TOAST's pglz_decompress access to uninitialized data, if the database is corrupted.
BUG #16082: TOAST's pglz_decompress access to uninitialized data, if the database is corrupted.
From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      16082
Logged by:          cili
Email address:      cilizili@protonmail.com
PostgreSQL version: 12.0
Operating system:   Microsoft Windows [Version 10.0.18362.418]
Description:

The function pglz_decompress in src/common/pg_lzcompress.c may read invalid data when the database file is corrupted. Below are two bad cases, together with the corrupted database file contents and the steps to reproduce them.

The first byte of the TOAST structure is a control byte. If the LSB of the control byte is set, the 2nd byte is the length and the 3rd byte is an offset of repeating bytes in the dest block.
There are two cases in which invalid data is accepted as valid. In case 1, it reads uninitialized data in dest. In case 2, it reads uninitialized or out-of-bound data in dest. Both accesses are invalid.
I'll show the setup and one normal case, and then the two bad cases.

Detail of case 1:
The first byte of the TOAST structure in the corrupted database file has its LSB set. Line 741 in pg_lzcompress.c reads a byte from dp[-off] with off = 0 and writes it to *dp. It refers to an uninitialized byte in dest.

Detail of case 2:
The first byte of the TOAST structure in the corrupted database file does not have its LSB set, and the third byte (the control byte of the second tag) has its LSB set. Similar to case 1, line 741 in pg_lzcompress.c reads a byte from dp[-off] with off > 0 and writes it to *dp. It refers to out-of-bound bytes in dest.

Expected:
The program prevents the invalid accesses and reports an error.

Examples:
=   is an informal comment.
%   is a CLI command in the OS shell.
#   is a SQL command in the SQL shell.
%%% is a reminder.
*** is an instruction to corrupt the database.

===============================================
= [setup]
=  create database file.
=  insert data into database.
=  find the database filename. Then we modify the database file.
===============================================
% initdb testdb
% psql -d testdb
# CREATE TABLE test(id INTEGER, body BYTEA, PRIMARY KEY (id));
# INSERT INTO test values(1, convert_to(repeat('A', 32768), 'UTF8'));
# SELECT relfilenode from pg_class where relname like 'test';
 relfilenode
-------------
       16401

===============================================
= [normal case]
===============================================
# SELECT body from test;
\x41414141...
# \q

%%% Please remember the result of the normal case.
%%% We modify the database file, and show two bad cases.

===============================================
= [bad case 1]
=  corrupt the database file
===============================================
*** Edit the database file '16401' with case 1. Restart PostgreSQL.
% psql -d testdb
# SELECT body from test;
\x000000... ...0041
# \q

===============================================
= [bad case 2]
=  corrupt the database file
===============================================
*** Edit the database file '16401' with case 2. Restart PostgreSQL.
% psql -d testdb
# SELECT body from test;
\x410000... ...0000
# \q

================================================
= example of database file '16401'
================================================
original:
00001E80 00 80 00 00 FE 41 0F 01 FF 0F 01 FF 0F 01 FF 0F
case 1:
00001E80 00 80 00 00 FD 0F FF FF 41 0F 01 FF 0F 01 FF 0F
case 2:
00001E80 00 80 00 00 FE 41 0F FF FF 0F 01 FF 0F 01 FF 0F
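The check asked for under "Expected" can be sketched as follows. This is a minimal, self-contained illustration of a pglz-style decoder, not the PostgreSQL source: the name checked_decompress, its signature, and the strict rejection of over-long matches are assumptions made for the example. The tag layout used here (low nibble of the first tag byte is the length minus 3, the high nibble plus the second byte form the offset, and one extra length byte follows when the length is 18) is the one implemented in pg_lzcompress.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical bounds-checked decoder for pglz-style input.
 * Returns the number of bytes written to dest, or -1 if the input is
 * corrupt (truncated, or a back-reference points outside the bytes
 * that have already been produced).
 */
long
checked_decompress(const uint8_t *src, size_t slen,
                   uint8_t *dest, size_t rawsize)
{
    const uint8_t *sp = src;
    const uint8_t *srcend = src + slen;
    uint8_t *dp = dest;
    uint8_t *destend = dest + rawsize;

    assert(src != NULL && dest != NULL);

    while (sp < srcend && dp < destend)
    {
        unsigned ctrl = *sp++;      /* one control byte governs up to 8 items */

        for (int bit = 0; bit < 8 && sp < srcend && dp < destend;
             bit++, ctrl >>= 1)
        {
            if (ctrl & 1)
            {
                size_t len, off;

                if (srcend - sp < 2)
                    return -1;      /* truncated tag */
                len = (size_t) (sp[0] & 0x0f) + 3;
                off = ((size_t) (sp[0] & 0xf0) << 4) | sp[1];
                sp += 2;
                if (len == 18)
                {
                    if (sp >= srcend)
                        return -1;  /* truncated length extension */
                    len += *sp++;
                }

                /* Reject offsets that do not land on already written bytes. */
                if (off == 0 || off > (size_t) (dp - dest))
                    return -1;
                /* Reject matches that would run past the output buffer. */
                if (len > (size_t) (destend - dp))
                    return -1;

                while (len--)
                {
                    *dp = *(dp - off);  /* byte-wise copy, may overlap */
                    dp++;
                }
            }
            else
                *dp++ = *sp++;      /* literal byte */
        }
    }

    /* Input and output must both be consumed exactly. */
    if (sp != srcend || dp != destend)
        return -1;
    return (long) (dp - dest);
}
```

With the leading bytes of the case 1 dump line (FD 0F FF FF), the first item is a back-reference while nothing has been written to dest yet, so the sketch returns -1 instead of reading outside the buffer. A caller could turn that return value into a data corruption error.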
Re: BUG #16082: TOAST's pglz_decompress access to uninitialized data, if the database is corrupted.
From
Tomas Vondra
Date:
On Sat, Oct 26, 2019 at 07:46:25AM +0000, PG Bug reporting form wrote:
>The following bug has been logged on the website:
>
>Bug reference:      16082
>Logged by:          cili
>Email address:      cilizili@protonmail.com
>PostgreSQL version: 12.0
>Operating system:   Microsoft Windows [Version 10.0.18362.418]
>Description:
>
>The function pglz_decompress in src/common/pg_lzcompress.c may read
>invalid data when the database file is corrupted.
>Below are two bad cases, together with the corrupted database file
>contents and the steps to reproduce them.
>
>The first byte of the TOAST structure is a control byte. If the LSB of the
>control byte is set, the 2nd byte is the length and the 3rd byte is an
>offset of repeating bytes in the dest block.
>There are two cases in which invalid data is accepted as valid. In case 1,
>it reads uninitialized data in dest. In case 2, it reads uninitialized or
>out-of-bound data in dest. Both accesses are invalid.
>I'll show the setup and one normal case, and then the two bad cases.
>

Well, failure like this after reading corrupted data from disk is not really surprising and it's hardly a bug. It's kinda intended to work that way, really.

Essentially, if something outside PostgreSQL corrupted the data file, then all bets are off. We have a protection against that in the form of data checksums, in which case we'd (very probably) identify that while reading the page from disk.

If the page was corrupted by PostgreSQL itself, we might not notice that, but then the thing that corrupted the data file is the bug, not that pglz_decompress fails. But AFAICS you have not demonstrated any such data corruption issue; you assume the data file is corrupted by something outside PostgreSQL (i.e. the first case).

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
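To illustrate the checksum argument above, here is a minimal sketch of how a checksum recorded at write time exposes a modified byte at read time. The FNV-1a hash and the names fnv1a32 and page_is_intact are stand-ins chosen for the example; PostgreSQL's data checksums use their own page checksum routine, and this sketch does not reproduce it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a over a buffer; a stand-in for a real page checksum. */
uint32_t
fnv1a32(const uint8_t *buf, size_t len)
{
    uint32_t h = 2166136261u;       /* FNV offset basis */

    assert(buf != NULL || len == 0);

    for (size_t i = 0; i < len; i++)
    {
        h ^= buf[i];
        h *= 16777619u;             /* FNV prime */
    }
    return h;
}

/*
 * Returns 1 if the buffer still hashes to the value recorded when it
 * was written, 0 if it has been modified since.
 */
int
page_is_intact(const uint8_t *page, size_t len, uint32_t stored)
{
    return fnv1a32(page, len) == stored;
}
```

Applied to the 16 bytes of the "original" dump line and of the "case 2" line from the report, which differ in a single byte, the recomputed value no longer matches the stored one, so the modification would be noticed before the data reaches the decompressor.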
Re: BUG #16082: TOAST's pglz_decompress access to uninitialized data, if the database is corrupted.
From
Alvaro Herrera
Date:
On 2019-Oct-26, Tomas Vondra wrote:

> On Sat, Oct 26, 2019 at 07:46:25AM +0000, PG Bug reporting form wrote:

> > There are two cases in which invalid data is accepted as valid. In case 1,
> > it reads uninitialized data in dest. In case 2, it reads uninitialized or
> > out-of-bound data in dest. Both accesses are invalid.

> Well, failure like this after reading corrupted data from disk is not
> really surprising and it's hardly a bug. It's kinda intended to work
> that way, really.

There's some weight to the argument that the server should not just crash but instead report an ERRCODE_DATA_CORRUPTED message, such as what happens with (say) invalid page headers. It would probably require a lot more branches in the detoasting code that might decrease performance, though. A patch would help to see how bad that would be, though offhand I would expect it to be very bad.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: BUG #16082: TOAST's pglz_decompress access to uninitialized data, if the database is corrupted.
From
Tomas Vondra
Date:
On Wed, Oct 30, 2019 at 05:30:14PM -0300, Alvaro Herrera wrote:
>On 2019-Oct-26, Tomas Vondra wrote:
>
>> On Sat, Oct 26, 2019 at 07:46:25AM +0000, PG Bug reporting form wrote:
>
>> > There are two cases in which invalid data is accepted as valid. In case 1,
>> > it reads uninitialized data in dest. In case 2, it reads uninitialized or
>> > out-of-bound data in dest. Both accesses are invalid.
>
>> Well, failure like this after reading corrupted data from disk is not
>> really surprising and it's hardly a bug. It's kinda intended to work
>> that way, really.
>
>There's some weight to the argument that the server should not just crash
>but instead report an ERRCODE_DATA_CORRUPTED message, such as what
>happens with (say) invalid page headers. It would probably require a
>lot more branches in the detoasting code that might decrease
>performance, though. A patch would help to see how bad that would be,
>though offhand I would expect it to be very bad.
>

That's true. I have to admit it wasn't really clear to me that the current behavior is a crash. If there's a reasonably simple and low-overhead way to detect these issues and report a data corruption, then sure - let's do that.

OTOH this is internal data, and I'm sure there are countless places where a bit of data corruption can cause issues. Checksums seem like a fairly reasonable solution, IMHO.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services