Thread: [HACKERS] Size vs size_t

[HACKERS] Size vs size_t

From
Thomas Munro
Date:
Hi,

Noticing that the assembled hackers don't seem to agree on $SUBJECT in
new patches, I decided to plot counts of lines matching \<Size\> and
\<size_t\> over time.  After a very long run in the lead, size_t has
recently been left in the dust by Size.

-- 
Thomas Munro
http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] Size vs size_t

From
Robert Haas
Date:
On Thu, Mar 16, 2017 at 4:40 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> Noticing that the assembled hackers don't seem to agree on $SUBJECT in
> new patches, I decided to plot counts of lines matching \<Size\> and
> \<size_t\> over time.  After a very long run in the lead, size_t has
> recently been left in the dust by Size.

I guess I assumed that we wouldn't have defined PG-specific types if
we wanted to just use the OS-supplied ones.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Size vs size_t

From
Andres Freund
Date:
On 2017-03-16 16:59:29 -0400, Robert Haas wrote:
> On Thu, Mar 16, 2017 at 4:40 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
> > Noticing that the assembled hackers don't seem to agree on $SUBJECT in
> > new patches, I decided to plot counts of lines matching \<Size\> and
> > \<size_t\> over time.  After a very long run in the lead, size_t has
> > recently been left in the dust by Size.
> 
> I guess I assumed that we wouldn't have defined PG-specific types if
> we wanted to just use the OS-supplied ones.

I think, in this case, defining Size in the first place was a bad call
on behalf of the project.  It gains us absolutely nothing, but makes it
harder to read for people that don't know PG all that well.  I think we
should slowly phase out Size usage, at least in new code.

Greetings,

Andres Freund



Re: [HACKERS] Size vs size_t

From
Robert Haas
Date:
On Thu, Mar 16, 2017 at 5:01 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2017-03-16 16:59:29 -0400, Robert Haas wrote:
>> On Thu, Mar 16, 2017 at 4:40 PM, Thomas Munro
>> <thomas.munro@enterprisedb.com> wrote:
>> > Noticing that the assembled hackers don't seem to agree on $SUBJECT in
>> > new patches, I decided to plot counts of lines matching \<Size\> and
>> > \<size_t\> over time.  After a very long run in the lead, size_t has
>> > recently been left in the dust by Size.
>>
>> I guess I assumed that we wouldn't have defined PG-specific types if
>> we wanted to just use the OS-supplied ones.
>
> I think, in this case, defining Size in the first place was a bad call
> on behalf of the project.  It gains us absolutely nothing, but makes it
> harder to read for people that don't know PG all that well.  I think we
> should slowly phase out Size usage, at least in new code.

Well, I don't think we want to end up with a mix of Size and size_t in
related code.  That buys nobody anything.  I'm fine with replacing
Size with size_t if they are always equivalent, but there's no sense
in having a jumble of styles.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Size vs size_t

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Mar 16, 2017 at 5:01 PM, Andres Freund <andres@anarazel.de> wrote:
>> On 2017-03-16 16:59:29 -0400, Robert Haas wrote:
>>> I guess I assumed that we wouldn't have defined PG-specific types if
>>> we wanted to just use the OS-supplied ones.

>> I think, in this case, defining Size in the first place was a bad call
>> on behalf of the project.

The short answer to that is that "Size" predates the universal acceptance
of size_t.  If we were making these decisions today, or anytime since the
early 2000s, we'd surely have just gone with size_t.  But it wasn't a
realistic option in the 90s.

> Well, I don't think we want to end up with a mix of Size and size_t in
> related code.  That buys nobody anything.  I'm fine with replacing
> Size with size_t if they are always equivalent, but there's no sense
> in having a jumble of styles.

I'm not in a hurry to do "s/Size/size_t/g" because I'm afraid it'll create
a lot of merge pain for back-patching, while not actually buying anything
much concretely.  I think this falls under the same policy we use for many
other stylistic details, ie make new code look like the code right around
it.  But I'm fine with entirely-new files standardizing on size_t.
        regards, tom lane



Re: [HACKERS] Size vs size_t

From
Andres Freund
Date:
On 2017-03-16 17:24:17 -0400, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Thu, Mar 16, 2017 at 5:01 PM, Andres Freund <andres@anarazel.de> wrote:
> >> On 2017-03-16 16:59:29 -0400, Robert Haas wrote:
> >>> I guess I assumed that we wouldn't have defined PG-specific types if
> >>> we wanted to just use the OS-supplied ones.
> 
> >> I think, in this case, defining Size in the first place was a bad call
> >> on behalf of the project.
> 
> The short answer to that is that "Size" predates the universal acceptance
> of size_t.  If we were making these decisions today, or anytime since the
> early 2000s, we'd surely have just gone with size_t.  But it wasn't a
> realistic option in the 90s.

Just out of curiosity I checked when we switched to backing Size with
size_t:
1998 - 0ad5d2a3a886e72b429ea2b84bfcb36c0680f84d


> > Well, I don't think we want to end up with a mix of Size and size_t in
> > related code.  That buys nobody anything.  I'm fine with replacing
> > Size with size_t if they are always equivalent, but there's no sense
> > in having a jumble of styles.
> 
> I'm not in a hurry to do "s/Size/size_t/g" because I'm afraid it'll create
> a lot of merge pain for back-patching, while not actually buying anything
> much concretely.  I think this falls under the same policy we use for many
> other stylistic details, ie make new code look like the code right around
> it.  But I'm fine with entirely-new files standardizing on size_t.

That seems like sane policy.  I'm a bit doubtful that the pain would be
all that bad, but I'm also not wild about trying.

Greetings,

Andres Freund



Re: [HACKERS] Size vs size_t

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2017-03-16 17:24:17 -0400, Tom Lane wrote:
>> The short answer to that is that "Size" predates the universal acceptance
>> of size_t.  If we were making these decisions today, or anytime since the
>> early 2000s, we'd surely have just gone with size_t.  But it wasn't a
>> realistic option in the 90s.

> Just out of curiosity I checked when we switched to backing Size with
> size_t:
> 1998 - 0ad5d2a3a886e72b429ea2b84bfcb36c0680f84d

Yeah.  We inherited the previous definition (as "unsigned int") from
Berkeley.  I wasn't involved then, of course, but I follow their reasoning
perfectly because I remember fighting the same type of portability battles
with libjpeg in the early 90s.  "size_t" was invented by the ANSI C
committee (hence, 1989 or 1990) and had only very haphazard penetration
until the late 90s.  If you wanted to write portable code you couldn't
depend on it.
        regards, tom lane



Re: [HACKERS] Size vs size_t

From
Thomas Munro
Date:
On Fri, Mar 17, 2017 at 10:39 AM, Andres Freund <andres@anarazel.de> wrote:
> On 2017-03-16 17:24:17 -0400, Tom Lane wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>> > On Thu, Mar 16, 2017 at 5:01 PM, Andres Freund <andres@anarazel.de> wrote:
>> >> On 2017-03-16 16:59:29 -0400, Robert Haas wrote:
>> > Well, I don't think we want to end up with a mix of Size and size_t in
>> > related code.  That buys nobody anything.  I'm fine with replacing
>> > Size with size_t if they are always equivalent, but there's no sense
>> > in having a jumble of styles.
>>
>> I'm not in a hurry to do "s/Size/size_t/g" because I'm afraid it'll create
>> a lot of merge pain for back-patching, while not actually buying anything
>> much concretely.  I think this falls under the same policy we use for many
>> other stylistic details, ie make new code look like the code right around
>> it.  But I'm fine with entirely-new files standardizing on size_t.
>
> That seems like sane policy.  I'm a bit doubtful that the pain would be
> all that bad, but I'm also not wild about trying.

Naive replacement in new files (present in master but not in 9.6) with
the attached script, followed by a couple of manual corrections where
Size was really an English word in a comment, gets the attached diff.

 src/backend/access/hash/hash_xlog.c        |  26 ++--
 src/backend/replication/logical/launcher.c |   4 +-
 src/backend/utils/misc/backend_random.c    |   4 +-
 src/backend/utils/mmgr/dsa.c               |  94 ++++++-------
 src/backend/utils/mmgr/freepage.c          | 202 ++++++++++++++--------------
 src/backend/utils/mmgr/slab.c              |  34 ++---
 src/include/lib/simplehash.h               |   6 +-
 src/include/replication/logicallauncher.h  |   2 +-
 src/include/utils/backend_random.h         |   2 +-
 src/include/utils/dsa.h                    |  10 +-
 src/include/utils/freepage.h               |  24 ++--
 src/include/utils/relptr.h                 |   4 +-
 12 files changed, 206 insertions(+), 206 deletions(-)

That might be just about enough for size_t to catch up...

-- 
Thomas Munro
http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] Size vs size_t

From
Tom Lane
Date:
Thomas Munro <thomas.munro@enterprisedb.com> writes:
> Naive replacement in new files (present in master but not in 9.6) with
> the attached script, followed by a couple of manual corrections where
> Size was really an English word in a comment, gets the attached diff.

In the case of mmgr/slab.c, a lot of those uses of Size probably
correspond to instantiations of the MemoryContext APIs; so blindly
changing them to "size_t" seems like a bit of a type violation
(and might indeed draw warnings from pickier compilers).  Don't
know if any of the other things you've identified here have similar
entanglements.
        regards, tom lane



Re: [HACKERS] Size vs size_t

From
Daniel Gustafsson
Date:
> On 16 Mar 2017, at 23:20, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Thomas Munro <thomas.munro@enterprisedb.com> writes:
>> Naive replacement in new files (present in master but not in 9.6) with
>> the attached script, followed by a couple of manual corrections where
>> Size was really an English word in a comment, gets the attached diff.
>
> In the case of mmgr/slab.c, a lot of those uses of Size probably
> correspond to instantiations of the MemoryContext APIs; so blindly
> changing them to "size_t" seems like a bit of a type violation
> (and might indeed draw warnings from pickier compilers).  Don't
> know if any of the other things you've identified here have similar
> entanglements.

While it might not be an issue that hits many developers, Size is also defined
on macOS in the MacTypes.h header so using CoreFoundation when hacking on macOS
port code will cause typedef redefinition errors.

Not really a case for or against, but another datapoint.

cheers ./daniel