Thread: [HACKERS] Should logtape.c blocks be of type long?
logtape.c stores block numbers on disk. These block numbers are represented in memory as being of type long. This is why BufFile clients that use the block-based interface (currently limited to logtape.c and gistbuildbuffers.c) must live with a specific limitation, as noted within buffile.c: /** BufFileSeekBlock --- block-oriented seek** Performs absolute seek to the start of the n'th BLCKSZ-sized block of* thefile. Note that users of this interface will fail if their files* exceed BLCKSZ * LONG_MAX bytes, but that is quite alot; we don't work* with tables bigger than that, either...** Result is 0 if OK, EOF if not. Logical position is not movedif an* impossible seek is attempted.*/ int BufFileSeekBlock(BufFile *file, long blknum) { ... } That restriction is fine on 64-bit Linux, where it is actually true that "we don't work with tables that big either", but on Win64 long is still only 4 bytes. 32-bit systems are similarly affected. Of course, MaxBlockNumber is 0xFFFFFFFE, whereas LONG_MAX is only 0x7FFFFFFF on affected systems. Is it worth changing this interface to use int64, which is already defined to be a "long int" on most real-world installations anyway? Though it hardly matters, this would have some cost: logtape.c temp files would have a higher storage footprint on win64 (i.e., the same overhead as elsewhere). I tend to be suspicious of use of the type "long" in general, because in general one should assume that it is no wider than "int". This calls into question why any code that uses "long" didn't just use "int", at least in my mind. -- Peter Geoghegan
On Sun, Feb 26, 2017 at 2:44 AM, Peter Geoghegan <pg@bowt.ie> wrote: > I tend to be suspicious of use of the type "long" in general, because > in general one should assume that it is no wider than "int". This > calls into question why any code that uses "long" didn't just use > "int", at least in my mind. Yeah. Using things that are guaranteed to be the size we want them to be (and the same size on all platforms) seems like a good plan. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Peter Geoghegan <pg@bowt.ie> writes: > logtape.c stores block numbers on disk. These block numbers are > represented in memory as being of type long. Yeah. This code is far older than our willingness to assume that every platform can support int64, and I'm pretty sure that use of "long" was just a compromise to get the widest values we could use portably and without a lot of notational hassle. (There are some similar choices in the area of memory usage, particularly calculations related to work_mem.) Having said that, I'm not sure it's worth the trouble of changing. The platforms where there's a difference are probably not muscular enough that anyone would ever get past 16TB in a temp file anyhow. regards, tom lane
On Sun, Feb 26, 2017 at 9:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Yeah. This code is far older than our willingness to assume that every > platform can support int64, and I'm pretty sure that use of "long" was > just a compromise to get the widest values we could use portably and > without a lot of notational hassle. (There are some similar choices in > the area of memory usage, particularly calculations related to work_mem.) I'm glad that you pointed out the history with work_mem calculations specifically, since I have found this confusing in the past. I was about to ask "what about 64-bit Windows?", but then remembered that we don't actually support large allocations on that platform (this is why MAX_KILOBYTES exists). > Having said that, I'm not sure it's worth the trouble of changing. > The platforms where there's a difference are probably not muscular > enough that anyone would ever get past 16TB in a temp file anyhow. As things stand, a 64-bit windows installation would have any CLUSTER of a table that exceeds 16TiB fail, possibly pretty horribly (I haven't thought through the consequences much). This is made more likely by the fact that we've made tuplesort faster in the past few releases (gains which the MAX_KILOBYTES restriction won't impinge on too much, particularly in Postgres 10). I find that unacceptable, at least for Postgres 10. -- Peter Geoghegan
On Sun, Feb 26, 2017 at 10:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Having said that, I'm not sure it's worth the trouble of changing. > The platforms where there's a difference are probably not muscular > enough that anyone would ever get past 16TB in a temp file anyhow. Is this comment a reflection of your generally-low opinion about Windows, or some specific limitation that I don't know about? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Peter Geoghegan <pg@bowt.ie> writes: > On Sun, Feb 26, 2017 at 9:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Having said that, I'm not sure it's worth the trouble of changing. >> The platforms where there's a difference are probably not muscular >> enough that anyone would ever get past 16TB in a temp file anyhow. > As things stand, a 64-bit windows installation would have any CLUSTER > of a table that exceeds 16TiB fail, possibly pretty horribly (I > haven't thought through the consequences much). This is made more > likely by the fact that we've made tuplesort faster in the past few > releases (gains which the MAX_KILOBYTES restriction won't impinge on > too much, particularly in Postgres 10). I find that unacceptable, at > least for Postgres 10. [ shrug... ] If you're excited enough about it to do the work, I won't stand in your way. But I don't find it to be a stop-ship issue. regards, tom lane
On Sun, Feb 26, 2017 at 9:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > [ shrug... ] If you're excited enough about it to do the work, I won't > stand in your way. But I don't find it to be a stop-ship issue. I'll add it to my todo list for Postgres 10. I think it's worth being consistent about a restriction like this, as Robert said. Given that fixing this issue will not affect the machine code generated by compilers for the majority of platforms we support, doing so seems entirely worthwhile to me. -- Peter Geoghegan
On Sun, Feb 26, 2017 at 10:04:20AM -0800, Peter Geoghegan wrote: > I think it's worth being consistent about a restriction like this, as > Robert said. Given that fixing this issue will not affect the machine > code generated by compilers for the majority of platforms we support, > doing so seems entirely worthwhile to me. (Reviving an old thread, fives years later..) As far as I can see, no patches have been sent to do that, and the changes required to replace long by int64 in the logtape and tuplesort code are rather minimal as long as some care is taken with all the internals of logtape (correct me of course if I'm wrong). I really don't see why we shouldn't do the switch based on the argument of Windows not being able to handle more than 16TB worth of blocks as things stand because of long in these code paths. So, attached is a patch to change these longs to int64. Any thoughts? -- Michael
Attachment
On Thu, Sep 21, 2023 at 9:46 PM Michael Paquier <michael@paquier.xyz> wrote: > So, attached is a patch to change these longs to int64. Any thoughts? No new thoughts. I'm still all in favor of this. Thanks for picking it up. At some point we should completely ban the use of "long". -- Peter Geoghegan
On Thu, Sep 21, 2023 at 09:53:02PM -0700, Peter Geoghegan wrote: > No new thoughts. I'm still all in favor of this. Thanks for picking it up. Okay, thanks. I guess that nobody would complain if I were to apply that.. > At some point we should completely ban the use of "long". Indeed, or Windows decides that making long 8-byte is wiser, but I doubt that's ever going to happen on backward-compatibility ground. -- Michael
Attachment
On Sun, Sep 24, 2023 at 10:42:49AM +0900, Michael Paquier wrote: > Indeed, or Windows decides that making long 8-byte is wiser, but I > doubt that's ever going to happen on backward-compatibility ground. While looking more at that, I've noticed that I missed BufFileAppend() and BufFileSeekBlock(), that themselves rely on long. The other code paths calling these two routines rely on BlockNumber (aka uint32), so that seems to be the bottom of it. For now, I have registered this patch to the next CF: https://commitfest.postgresql.org/45/4589/ Comments are welcome. -- Michael
Attachment
On 26/09/2023 07:15, Michael Paquier wrote: > On Sun, Sep 24, 2023 at 10:42:49AM +0900, Michael Paquier wrote: >> Indeed, or Windows decides that making long 8-byte is wiser, but I >> doubt that's ever going to happen on backward-compatibility ground. > > While looking more at that, I've noticed that I missed BufFileAppend() > and BufFileSeekBlock(), that themselves rely on long. The other code > paths calling these two routines rely on BlockNumber (aka uint32), so > that seems to be the bottom of it. BufFileTellBlock should be adjusted too. Or removed altogether; it's been commented out since year 2000. Other than that, looks good to me. -- Heikki Linnakangas Neon (https://neon.tech)
On Thu, Nov 16, 2023 at 03:03:46PM +0100, Heikki Linnakangas wrote: > BufFileTellBlock should be adjusted too. Or removed altogether; it's been > commented out since year 2000. Other than that, looks good to me. Yes, I recall wondering about what to do on this one so I just let it be when updating the last version of the patch as we have many NOT_USED stuff in the core code. After 23 years being around for no purpose, I have just applied a patch to remove it for the sake of this change, then applied the main change. Thanks for double-checking, Heikki! -- Michael