Re: pg_amcheck contrib application - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: pg_amcheck contrib application
Msg-id: CA+Tgmob88VSOSUxn2M=ajT0TrVF8KrLQtGBpUAEP2=geNSHJQw@mail.gmail.com
In response to: Re: pg_amcheck contrib application (Mark Dilger <mark.dilger@enterprisedb.com>)
List: pgsql-hackers

On Thu, Apr 8, 2021 at 3:02 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
> Imagine a toasted attribute with 18 chunks numbered [0..17]. Then we
> update the toast to have only 6 chunks numbered [0..5] except we
> corruptly keep chunks numbered [12..17] in the toast table. We'd rather
> see a report like this:
>
>   [ toast value NNN chunk NNN has sequence number NNN, but expected sequence number NNN ]
>
> than one like this:
>
>   [ toast value NNN contains chunk NNN where chunk NNN was expected ]
>
> because saying the toast value ended at "chunk 12" after saying that it
> contains "chunk 17" is contradictory. You need the distinction between
> the chunk number and the chunk sequence number, since in corrupt
> circumstances they may not be the same.

Hmm, I see your point, and that's a good example to illustrate it. But, with that example in front of me, I am rather doubtful that either of these is what users actually want. Consider the case where I should have chunks 0..17 and chunk 1 is just plain gone. This, by the way, seems like a pretty likely case to arise in practice, since all we need is for a block to get truncated away or zeroed erroneously, or for a tuple to get pruned that shouldn't. With either of the above schemes, I guess we're going to get a message about every chunk from 2 to 17, complaining that they're all misnumbered. We might also get a complaint that the last chunk is the wrong size, and that the total number of chunks isn't right. What we really want is a single complaint saying chunk 1 is missing.

Likewise, in your example, I sort of feel like what I really want, rather than either of the above outputs, is to get some messages like this:

  toast value NNN contains unexpected extra chunk [12-17]

Both your phrasing for those messages and what I suggested make it sound like the problem is that the chunk number is wrong. But that doesn't seem like it's taking the right view of the situation. Chunks 12-17 shouldn't exist at all, and if they do, we should say that, e.g. by complaining about something like "toast value 16444 chunk 12 follows last expected chunk 5".

In other words, I don't buy the idea that the user will accept that there's a chunk number and a chunk sequence number, and that they should know the difference between those things and what each of them is. They're entitled to imagine that there's just one thing, and that we're going to tell them about values that are extra or missing. The fact that we're not doing that seems like it's just a matter of missing code. If we start the index scan and get chunk 4, we can easily emit messages for chunks 0..3 right on the spot, declaring them missing. Things do get a bit hairy if the index scan returns values out of order: what if it gives us chunk_seq = 2 and then chunk_seq = 1? But I think we could handle that by just issuing a complaint in any such case that "toast index returns chunks out of order for toast value NNN" and stopping further checking of that toast value.
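
A standalone sketch of the kind of bookkeeping that implies (this is not code from the patch; the value id, chunk counts, and message wording are made up for illustration): walk the chunk_seq values the index scan hands back, report missing chunks on the spot, flag chunks past the expected end, and bail out if the scan goes backwards.

/*
 * Standalone sketch, not code from the patch: given the chunk_seq values
 * the toast index scan returns for one toast value, complain about chunks
 * that are missing, chunks past the expected end, and out-of-order chunks.
 * The value id 16444, the chunk counts, and the message wording are all
 * made up for illustration.
 */
#include <stdio.h>

static void
check_chunk_seqs(unsigned valueid, const int *seqs, int nseqs, int expected_chunks)
{
    int     expected_seq = 0;       /* the next chunk_seq we expect to see */

    for (int i = 0; i < nseqs; i++)
    {
        int     chunk_seq = seqs[i];

        if (chunk_seq < expected_seq)
        {
            /* complain once and stop checking this toast value */
            printf("toast index returns chunks out of order for toast value %u\n",
                   valueid);
            return;
        }

        /* every skipped sequence number below the expected end is a missing chunk */
        for (; expected_seq < chunk_seq && expected_seq < expected_chunks; expected_seq++)
            printf("toast value %u missing chunk %d\n", valueid, expected_seq);

        /* chunks at or beyond the expected count should not exist at all */
        if (chunk_seq >= expected_chunks)
            printf("toast value %u chunk %d follows last expected chunk %d\n",
                   valueid, chunk_seq, expected_chunks - 1);

        expected_seq = chunk_seq + 1;
    }

    /* expected chunks we never saw at the tail end are missing too */
    for (; expected_seq < expected_chunks; expected_seq++)
        printf("toast value %u missing chunk %d\n", valueid, expected_seq);
}

int
main(void)
{
    /* chunk 1 is gone; chunks 12..17 were corruptly left behind; 6 expected */
    int     seqs[] = {0, 2, 3, 4, 5, 12, 13, 14, 15, 16, 17};

    check_chunk_seqs(16444, seqs, 11, 6);
    return 0;
}

Fed the combined scenario (chunk 1 missing, chunks 12..17 left behind), this prints a single "missing chunk 1" complaint plus one "follows last expected chunk 5" complaint per extra chunk, rather than reporting every later chunk as misnumbered.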
> Purely as manual testing, and not part of the patch, I hacked the backend
> a bit to allow direct modification of the toast table. After corrupting
> the toast with the following bit of SQL:
>
> WITH chunk_limit AS (
>     SELECT chunk_id, MAX(chunk_seq) AS maxseq
>       FROM $toastname
>      GROUP BY chunk_id)
> INSERT INTO $toastname (chunk_id, chunk_seq, chunk_data)
>     (SELECT t.chunk_id,
>             t.chunk_seq + cl.maxseq + CASE WHEN t.chunk_seq < 3 THEN 1 ELSE 7 END,
>             t.chunk_data
>        FROM $toastname t
>       INNER JOIN chunk_limit cl
>          ON t.chunk_id = cl.chunk_id)
>
> pg_amcheck reports the following corruption messages:
>
> # heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
> #     toast value 16444 chunk 6 follows last expected chunk 5
> # heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
> #     toast value 16444 chunk 7 follows last expected chunk 5
> # heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
> #     toast value 16444 chunk 8 follows last expected chunk 5
> # heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
> #     toast value 16444 chunk 9 has sequence number 15, but expected sequence number 9
> # heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
> #     toast value 16444 chunk 10 has sequence number 16, but expected sequence number 10
> # heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
> #     toast value 16444 chunk 11 has sequence number 17, but expected sequence number 11
> # heap table "postgres"."public"."test", block 0, offset 1, attribute 2:
> #     toast value 16444 was expected to end at chunk 6, but ended at chunk 12
>
> I think if we'd left out the first three messages, it would read
> strangely. We would be complaining about three chunks with the wrong
> sequence number, then conclude that there were six extra chunks. A
> sufficiently savvy user might deduce the presence of chunks 6, 7, and 8,
> but the problem is more obvious (to my eyes, at least) if we keep the
> first three messages. This seems like a judgement call and not a clear
> argument either way, so if you still want me to change it, I guess I
> don't mind doing so.

I mean, looking at it, the question here is why it's not just using the same message for all of them. The fact that the chunk numbers are higher than 5 is the problem. The sequence numbers seem like just a distraction.

> > On a related note, as I think I said before, I still think we should
> > be rejiggering this so that we're not testing both the size of each
> > individual chunk and the total size, because that ought to be
> > redundant. That might be better done as a separate patch but I think
> > we should try to clean it up.
>
> Can you point me to the exact check you are mentioning, and with which
> patch applied? I don't see any examples of this after applying the
> v18-0003.

Hmm, my mistake, I think.

> > I'm a little worried about whether the additional test cases are
> > Endian-dependent at all. I don't immediately know what might be wrong
> > with them, but I'm going to think about that some more later. Any
> > chance you have access to a Big-endian box where you can test this?
>
> I don't have a Big-endian box, but I think one of them may be wrong now
> that you mention the issue:
>
> # Corrupt column c's toast pointer va_extinfo field
>
> The problem is that the 30-bit extsize and 2-bit cmid split is not being
> handled in the perl test, and I don't see an easy way to have perl's
> pack/unpack do that for us. There isn't any requirement that each
> possible corruption we check actually be manifested in the regression
> tests. The simplest solution is to remove this problematic test, so
> that's what I did. The other two new tests corrupt c_va_toastrelid and
> c_va_rawsize, both of which are read/written using unpack/pack, so perl
> should handle the endianness for us (I hope).

I don't immediately see why this particular thing should be an issue. The format of the varlena header itself is different on big-endian and little-endian machines, which is why postgres.h has all this stuff conditioned on WORDS_BIGENDIAN. But va_extinfo doesn't have any similar treatment, so I'm not sure what could go wrong there, as long as the 4-byte value as a whole is being packed and unpacked according to the machine's endian-ness.
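
For what it's worth, the split itself is just bit arithmetic on the host-order 32-bit value, so a sketch like the following behaves identically on either byte order. This is standalone illustrative code, not PostgreSQL's; it assumes the external size occupies the low 30 bits and the compression-method id the top two, with the authoritative definitions being the macros in postgres.h.

/*
 * Standalone sketch, not PostgreSQL code: a 30-bit external size and a
 * 2-bit compression-method id sharing one 32-bit word, on the assumption
 * that the size sits in the low 30 bits.  Once the 4 bytes have been
 * packed or unpacked as a whole in host order, the masking and shifting
 * is the same on big- and little-endian machines.
 */
#include <stdint.h>
#include <stdio.h>

#define EXTSIZE_BITS    30
#define EXTSIZE_MASK    ((UINT32_C(1) << EXTSIZE_BITS) - 1)

int
main(void)
{
    uint32_t    extsize = 123456;   /* made-up external size */
    uint32_t    cmid = 1;           /* made-up compression-method id */
    uint32_t    extinfo = (extsize & EXTSIZE_MASK) | (cmid << EXTSIZE_BITS);

    printf("extsize = %u, cmid = %u\n",
           (unsigned) (extinfo & EXTSIZE_MASK),
           (unsigned) (extinfo >> EXTSIZE_BITS));
    return 0;
}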
--
Robert Haas
EDB: http://www.enterprisedb.com