Thread: shared buffer manager problems and redesign

From: "Jamison, Kirk"
Date:

Hello, hackers

 

(Actually, I'm not sure whether I should post this here or on the pgsql-general mailing list.)

Several related problems have been discussed again recently: the time-consuming behavior when mapping shared buffers, which happens in TRUNCATE and in VACUUM when it deletes trailing empty pages from the shared buffers [1]; data corruption on file truncation error [2]; etc.

 

Buffer Manager redesign/restructure

 

Andres Freund has worked on this before and described a design and methods for implementing a buffer radix tree [4].

I think it is worth considering, because many solutions were proposed in previous threads [1] [2] [3] [4], but we could not reach consensus, as it was pointed out that some of the methods add complexity.

An ordered buffer mapping implementation (a change to the current buffer mapping implementation) keeps coming up in these discussions, because it could potentially address these problems.

 

But before we can work on a POC patch, I think we should start a common discussion on the potential:

a.)    data structure design of the modified buffer manager: open relations hash table, buffer radix tree, etc.

b.)    buffer tag, locks

c.)    implementation, operations (loading pages, flushing/writing out buffers), complexities, etc.
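
To make item (a) a bit more concrete, here is a rough sketch in Python (not PostgreSQL code) that contrasts two ways of dropping the buffers of a truncated relation. All names in it (FlatPool, OrderedPool, drop_from, demo) are hypothetical, and a per-relation sorted list stands in for the buffer radix tree purely to keep the example short. Locking, pinning, flushing of dirty buffers, and relation forks are left out entirely.

```python
from bisect import bisect_left, insort


class FlatPool:
    """Buffers carry only a (rel, block) tag; there is no per-relation order."""

    def __init__(self, nbuffers):
        self.tags = [None] * nbuffers  # buffer id -> (rel, block) or None

    def load(self, buf_id, rel, block):
        self.tags[buf_id] = (rel, block)

    def drop_from(self, rel, first_block):
        """Invalidate rel's buffers at or past first_block.

        Returns how many buffers had to be visited: always the whole pool.
        """
        visited = 0
        for buf_id, tag in enumerate(self.tags):
            visited += 1
            if tag is not None and tag[0] == rel and tag[1] >= first_block:
                self.tags[buf_id] = None
        return visited


class OrderedPool:
    """Adds an 'open relations' table: rel -> sorted list of (block, buf_id)."""

    def __init__(self, nbuffers):
        self.tags = [None] * nbuffers
        self.open_rels = {}

    def load(self, buf_id, rel, block):
        old = self.tags[buf_id]
        if old is not None:  # buffer is being reused: forget its old mapping
            self.open_rels[old[0]].remove((old[1], buf_id))
        self.tags[buf_id] = (rel, block)
        insort(self.open_rels.setdefault(rel, []), (block, buf_id))

    def drop_from(self, rel, first_block):
        """Same effect as FlatPool.drop_from, but visits only doomed buffers."""
        entries = self.open_rels.get(rel, [])
        # (first_block, -1) sorts before every real entry for first_block.
        start = bisect_left(entries, (first_block, -1))
        doomed = entries[start:]
        for _, buf_id in doomed:
            self.tags[buf_id] = None
        del entries[start:]
        return len(doomed)


def demo():
    flat, ordered = FlatPool(1024), OrderedPool(1024)
    for pool in (flat, ordered):
        for block in range(10):
            pool.load(block, "rel_a", block)        # buffers 0..9
        for block in range(5):
            pool.load(100 + block, "rel_b", block)  # buffers 100..104
    # Truncate rel_a so that blocks 7, 8 and 9 go away.
    return flat.drop_from("rel_a", 7), ordered.drop_from("rel_a", 7), flat, ordered


if __name__ == "__main__":
    print(demo()[:2])
```

In this toy model the flat pool's cost depends on the size of the pool, while the ordered one's depends on the number of buffers actually dropped. Whether a real radix tree keeps that advantage once locking and concurrent access are added is exactly the kind of question for (b) and (c).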

 

However, from what I understand, it is realistically not possible to get such a redesign committed for PG12, given the complexity and the time available.

There is also the question of whether some of these problems can be resolved without redesigning the shared buffer manager.

So I think it is really important to discuss this soon.

 

I hope to hear your insights, ideas, suggestions, truth-bombs, and so on. :)

 

Thank you very much.

 

Regards,

Kirk

 

References

[1] "reloption to prevent VACUUM from truncating empty pages at the end of relation"

https://www.postgresql.org/message-id/flat/CAHGQGwE5UqFqSq1%3DkV3QtTUtXphTdyHA-8rAj4A%3DY%2Be4kyp3BQ%40mail.gmail.com

[2] "Truncation failure in autovacuum results in data corruption (duplicate keys)"

https://www.postgresql.org/message-id/flat/5BBC590AE8DF4ED1A170E4D48F1B53AC@tunaPC

[3] "WIP: long transactions on hot standby feedback replica / proof of concept"

https://www.postgresql.org/message-id/flat/c9374921e50a5e8fb1ecf04eb8c6ebc3%40postgrespro.ru

[4] "Reducing the size of BufferTag & remodeling forks"

https://www.postgresql.org/message-id/flat/20150702133619.GB16267%40alap3.anarazel.de