Thread: [HACKERS] Size vs size_t
Hi, Noticing that the assembled hackers don't seem to agree on $SUBJECT in new patches, I decided to plot counts of lines matching \<Size\> and \<size_t\> over time. After a very long run in the lead, size_t has recently been left in the dust by Size. -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
On Thu, Mar 16, 2017 at 4:40 PM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > Noticing that the assembled hackers don't seem to agree on $SUBJECT in > new patches, I decided to plot counts of lines matching \<Size\> and > \<size_t\> over time. After a very long run in the lead, size_t has > recently been left in the dust by Size. I guess I assumed that we wouldn't have defined PG-specific types if we wanted to just use the OS-supplied ones. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2017-03-16 16:59:29 -0400, Robert Haas wrote: > On Thu, Mar 16, 2017 at 4:40 PM, Thomas Munro > <thomas.munro@enterprisedb.com> wrote: > > Noticing that the assembled hackers don't seem to agree on $SUBJECT in > > new patches, I decided to plot counts of lines matching \<Size\> and > > \<size_t\> over time. After a very long run in the lead, size_t has > > recently been left in the dust by Size. > > I guess I assumed that we wouldn't have defined PG-specific types if > we wanted to just use the OS-supplied ones. I think, in this case, defining Size in the first place was a bad call on behalf of the project. It gains us absolutely nothing, but makes it harder to read for people that don't know PG all that well. I think we should slowly phase out Size usage, at least in new code. Greetings, Andres Freund
On Thu, Mar 16, 2017 at 5:01 PM, Andres Freund <andres@anarazel.de> wrote: > On 2017-03-16 16:59:29 -0400, Robert Haas wrote: >> On Thu, Mar 16, 2017 at 4:40 PM, Thomas Munro >> <thomas.munro@enterprisedb.com> wrote: >> > Noticing that the assembled hackers don't seem to agree on $SUBJECT in >> > new patches, I decided to plot counts of lines matching \<Size\> and >> > \<size_t\> over time. After a very long run in the lead, size_t has >> > recently been left in the dust by Size. >> >> I guess I assumed that we wouldn't have defined PG-specific types if >> we wanted to just use the OS-supplied ones. > > I think, in this case, defining Size in the first place was a bad call > on behalf of the project. It gains us absolutely nothing, but makes it > harder to read for people that don't know PG all that well. I think we > should slowly phase out Size usage, at least in new code. Well, I don't think we want to end up with a mix of Size and size_t in related code. That buys nobody anything. I'm fine with replacing Size with size_t if they are always equivalent, but there's no sense in having a jumble of styles. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Thu, Mar 16, 2017 at 5:01 PM, Andres Freund <andres@anarazel.de> wrote: >> On 2017-03-16 16:59:29 -0400, Robert Haas wrote: >>> I guess I assumed that we wouldn't have defined PG-specific types if >>> we wanted to just use the OS-supplied ones. >> I think, in this case, defining Size in the first place was a bad call >> on behalf of the project. The short answer to that is that "Size" predates the universal acceptance of size_t. If we were making these decisions today, or anytime since the early 2000s, we'd surely have just gone with size_t. But it wasn't a realistic option in the 90s. > Well, I don't think we want to end up with a mix of Size and size_t in > related code. That buys nobody anything. I'm fine with replacing > Size with size_t if they are always equivalent, but there's no sense > in having a jumble of styles. I'm not in a hurry to do "s/Size/size_t/g" because I'm afraid it'll create a lot of merge pain for back-patching, while not actually buying anything much concretely. I think this falls under the same policy we use for many other stylistic details, ie make new code look like the code right around it. But I'm fine with entirely-new files standardizing on size_t. regards, tom lane
On 2017-03-16 17:24:17 -0400, Tom Lane wrote: > Robert Haas <robertmhaas@gmail.com> writes: > > On Thu, Mar 16, 2017 at 5:01 PM, Andres Freund <andres@anarazel.de> wrote: > >> On 2017-03-16 16:59:29 -0400, Robert Haas wrote: > >>> I guess I assumed that we wouldn't have defined PG-specific types if > >>> we wanted to just use the OS-supplied ones. > > >> I think, in this case, defining Size in the first place was a bad call > >> on behalf of the project. > > The short answer to that is that "Size" predates the universal acceptance > of size_t. If we were making these decisions today, or anytime since the > early 2000s, we'd surely have just gone with size_t. But it wasn't a > realistic option in the 90s. Just out of curiosity I checked when we switched to backing Size with size_t: 1998 - 0ad5d2a3a886e72b429ea2b84bfcb36c0680f84d > > Well, I don't think we want to end up with a mix of Size and size_t in > > related code. That buys nobody anything. I'm fine with replacing > > Size with size_t if they are always equivalent, but there's no sense > > in having a jumble of styles. > > I'm not in a hurry to do "s/Size/size_t/g" because I'm afraid it'll create > a lot of merge pain for back-patching, while not actually buying anything > much concretely. I think this falls under the same policy we use for many > other stylistic details, ie make new code look like the code right around > it. But I'm fine with entirely-new files standardizing on size_t. That seems like sane policy. I'm a bit doubtful that the pain would be all that bad, but I'm also not wild about trying. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2017-03-16 17:24:17 -0400, Tom Lane wrote: >> The short answer to that is that "Size" predates the universal acceptance >> of size_t. If we were making these decisions today, or anytime since the >> early 2000s, we'd surely have just gone with size_t. But it wasn't a >> realistic option in the 90s. > Just out of curiosity I checked when we switched to backing Size with > size_t: > 1998 - 0ad5d2a3a886e72b429ea2b84bfcb36c0680f84d Yeah. We inherited the previous definition (as "unsigned int") from Berkeley. I wasn't involved then, of course, but I follow their reasoning perfectly because I remember fighting the same type of portability battles with libjpeg in the early 90s. "size_t" was invented by the ANSI C committee (hence, 1989 or 1990) and had only very haphazard penetration until the late 90s. If you wanted to write portable code you couldn't depend on it. regards, tom lane
On Fri, Mar 17, 2017 at 10:39 AM, Andres Freund <andres@anarazel.de> wrote: > On 2017-03-16 17:24:17 -0400, Tom Lane wrote: >> Robert Haas <robertmhaas@gmail.com> writes: >> > On Thu, Mar 16, 2017 at 5:01 PM, Andres Freund <andres@anarazel.de> wrote: >> >> On 2017-03-16 16:59:29 -0400, Robert Haas wrote: >> > Well, I don't think we want to end up with a mix of Size and size_t in >> > related code. That buys nobody anything. I'm fine with replacing >> > Size with size_t if they are always equivalent, but there's no sense >> > in having a jumble of styles. >> >> I'm not in a hurry to do "s/Size/size_t/g" because I'm afraid it'll create >> a lot of merge pain for back-patching, while not actually buying anything >> much concretely. I think this falls under the same policy we use for many >> other stylistic details, ie make new code look like the code right around >> it. But I'm fine with entirely-new files standardizing on size_t. > > That seems like sane policy. I'm a bit doubtful that the pain would be > all that bad, but I'm also not wild about trying. Naive replacement in new files (present in master but not in 9.6) with the attached script, followed by a couple of manual corrections where Size was really an English word in a comment, gets the attached diff. src/backend/access/hash/hash_xlog.c | 26 ++-- src/backend/replication/logical/launcher.c | 4 +- src/backend/utils/misc/backend_random.c | 4 +- src/backend/utils/mmgr/dsa.c | 94 ++++++------- src/backend/utils/mmgr/freepage.c | 202 ++++++++++++++-------------- src/backend/utils/mmgr/slab.c | 34 ++--- src/include/lib/simplehash.h | 6 +- src/include/replication/logicallauncher.h | 2 +- src/include/utils/backend_random.h | 2 +- src/include/utils/dsa.h | 10 +- src/include/utils/freepage.h | 24 ++-- src/include/utils/relptr.h | 4 +- 12 files changed, 206 insertions(+), 206 deletions(-) That might be just about enough for size_t to catch up... -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Thomas Munro <thomas.munro@enterprisedb.com> writes: > Naive replacement in new files (present in master but not in 9.6) with > the attached script, followed by a couple of manual corrections where > Size was really an English word in a comment, gets the attached diff. In the case of mmgr/slab.c, a lot of those uses of Size probably correspond to instantiations of the MemoryContext APIs; so blindly changing them to "size_t" seems like a bit of a type violation (and might indeed draw warnings from pickier compilers). Don't know if any of the other things you've identified here have similar entanglements. regards, tom lane
> On 16 Mar 2017, at 23:20, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Thomas Munro <thomas.munro@enterprisedb.com> writes: >> Naive replacement in new files (present in master but not in 9.6) with >> the attached script, followed by a couple of manual corrections where >> Size was really an English word in a comment, gets the attached diff. > > In the case of mmgr/slab.c, a lot of those uses of Size probably > correspond to instantiations of the MemoryContext APIs; so blindly > changing them to "size_t" seems like a bit of a type violation > (and might indeed draw warnings from pickier compilers). Don't > know if any of the other things you've identified here have similar > entanglements. While it might not be an issue that hits many developers, Size is also defined on macOS in the MacTypes.h header so using CoreFoundation when hacking on macOS port code will cause typedef redefinition errors. Not really a case for or against, but another datapoint. cheers ./daniel