Thread: [HACKERS] Revisiting NAMEDATALEN

[HACKERS] Revisiting NAMEDATALEN

From
Emrul
Date:
Hi hackers,

This question came up again on Reddit:
https://www.reddit.com/r/PostgreSQL/comments/6kyyev/i_have_hit_the_table_name_length_limit_a_number/
and I thought I'd echo it here.

I totally am on board with short, descriptive names and a good convention. 
However, there are just so many cases where 63 characters can't
descriptively describe a column name.  I've been on projects where we have
one table maybe with only a few thousand records but hundreds of columns
each uniquely describing an attribute on the record.  It is a challenge
bordering on impossible to fit them into a consistently named field of <63
characters that someone can later refer to and know what piece of
information it actually refers to.

Is this something that can be revisited for an upcoming release? Also, are
there any technical problems that would be created by increasing this
attribute?



--
View this message in context: http://www.postgresql-archive.org/Revisiting-NAMEDATALEN-tp5969858.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.



Re: [HACKERS] Revisiting NAMEDATALEN

From
"Joshua D. Drake"
Date:
On 07/03/2017 11:31 AM, Emrul wrote:
> Hi hackers,
> 
> This question came up again on Reddit:
> https://www.reddit.com/r/PostgreSQL/comments/6kyyev/i_have_hit_the_table_name_length_limit_a_number/
> and I thought I'd echo it here.
> 
> I totally am on board with short, descriptive names and a good convention.
> However, there are just so many cases where 63 characters can't
> descriptively describe a column name.  I've been on projects where we have
> one table maybe with only a few thousand records but hundreds of columns
> each uniquely describing an attribute on the record.  It is a challenge
> bordering on impossible to fit them into a consistently named field of <63
> characters that someone can later refer to and know what piece of
> information it actually refers to.
> 
> Is this something that can be revisited for an upcoming release? Also, are
> there any technical problems that would be created by increasing this
> attribute?

Although I appreciate the sentiment this seems over the top:

datasystem_adjustmentmanagement_mm_datasystem_adjustmentmanagement_products

You can always use COMMENT ON to explode the actual meaning.

JD


-- 
Command Prompt, Inc. || http://the.postgres.company/ || @cmdpromptinc

PostgreSQL Centered full stack support, consulting and development.
Advocate: @amplifypostgres || Learn: https://pgconf.us
*****     Unless otherwise stated, opinions are my own.   *****



Re: [HACKERS] Revisiting NAMEDATALEN

From
Emrul
Date:
Yes, for the example given in the Reddit post I would tend to agree.  This is
one of those issues where for the most part the solution is better naming
conventions but for the few instances where this isn't possible it is a
right pain.




--
View this message in context: http://www.postgresql-archive.org/Revisiting-NAMEDATALEN-tp5969858p5969860.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.



Re: [HACKERS] Revisiting NAMEDATALEN

From
Tom Lane
Date:
Emrul <emrul@emrul.com> writes:
> Is this something that can be revisited for an upcoming release? Also, are
> there any technical problems that would be created by increasing this
> attribute?

This has been discussed before, eg here:
https://www.postgresql.org/message-id/23546.1353223443@sss.pgh.pa.us

You're free to build your own copy with whatever NAMEDATALEN you want,
but it seems unlikely to me that we'll change the default.
        regards, tom lane



Re: [HACKERS] Revisiting NAMEDATALEN

From
Emrul
Date:
Tom, thank you for that pointer.  I get now that it is not free and therefore
why its not something that should be changed by default.

I guess the problem is 'build your own copy' (i.e. compiling from source) is
something that sends most DB teams running into the hills.

A solution might be to make NAMEDATALEN configurable without having to
recompile source (perhaps a config variable or an initdb parameter). When I
have some free time I will investigate whether this is doable.




--
View this message in context: http://www.postgresql-archive.org/Revisiting-NAMEDATALEN-tp5969858p5970351.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.



Re: [HACKERS] Revisiting NAMEDATALEN

From
Robert Haas
Date:
On Fri, Jul 7, 2017 at 5:53 AM, Emrul <emrul@emrul.com> wrote:
> Tom, thank you for that pointer.  I get now that it is not free and therefore
> why its not something that should be changed by default.
>
> I guess the problem is 'build your own copy' (i.e. compiling from source) is
> something that sends most DB teams running into the hills.
>
> A solution might be to make NAMEDATALEN configurable without having to
> recompile source (perhaps a config variable or an initdb parameter). When I
> have some free time I will investigate whether this is doable.

Well, it wouldn't be free.  The problem is:

typedef struct nameData
{       char            data[NAMEDATALEN];
} NameData;

If it were not a compile-time constant, every bit of code that uses
NameData (or otherwise depends on NAMEDATALEN being constant) would
have to be changed.  That would be invasive and likely have at least a
minor performance cost.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Revisiting NAMEDATALEN

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Jul 7, 2017 at 5:53 AM, Emrul <emrul@emrul.com> wrote:
>> A solution might be to make NAMEDATALEN configurable without having to
>> recompile source (perhaps a config variable or an initdb parameter). When I
>> have some free time I will investigate whether this is doable.

> Well, it wouldn't be free.  The problem is:

> typedef struct nameData
> {
>         char            data[NAMEDATALEN];
> } NameData;

> If it were not a compile-time constant, every bit of code that uses
> NameData (or otherwise depends on NAMEDATALEN being constant) would
> have to be changed.  That would be invasive and likely have at least a
> minor performance cost.

It's a lot worse than just the code that accesses the names; if that were
all then we might hope to hide most of the issue underneath macros like
RelationGetRelationName.  The problem is that if NameData isn't constant
length, that also breaks the struct overlay mechanism for catalog rows,
introducing notational and performance costs for essentially every catalog
field access in the entire backend.  That is, in places like

CATALOG(pg_type,1247) BKI_BOOTSTRAP BKI_ROWTYPE_OID(71) BKI_SCHEMA_MACRO
{   NameData    typname;        /* type name */   Oid         typnamespace;   /* OID of namespace containing this type
*/  Oid         typowner;       /* type owner */   ...
 

typnamespace and following fields are no longer easily accessible
to C code.

You could partly get around that by relocating name columns to the ends
of their catalog rows --- but only partly, and it would be a darn odd
looking result.

It's interesting to speculate about NameData becoming some sort of short
fixed-length pointer to a variable-length string stored somewhere else
(like the end of the row).  But TBH I cannot see a scheme like that ever
getting out of the realm of speculation --- it would break too many
*other* assumptions, many of them in performance-critical places like
tuple assembly and disassembly.

In the end I just don't think this is worth the work it would take to
improve matters significantly over the current situation.  It's too much
work benefitting too few people.
        regards, tom lane



Re: [HACKERS] Revisiting NAMEDATALEN

From
Mark Dilger
Date:
> On Jul 7, 2017, at 2:53 AM, Emrul <emrul@emrul.com> wrote:
>
> Tom, thank you for that pointer.  I get now that it is not free and therefore
> why its not something that should be changed by default.
>
> I guess the problem is 'build your own copy' (i.e. compiling from source) is
> something that sends most DB teams running into the hills.

To make matters worse, if you change NAMEDATALEN, compile, and run
'make check', some of the tests will fail.  The tests are very sensitive to the
exact output of the sql they execute, and changing NAMEDATALEN, or
indeed any one of many other options, causes some of the test output to
change. Even configure's options, such as --with-blocksize, cannot be
changed from the default value without potentially breaking the regression
tests.

mark


Re: [HACKERS] Revisiting NAMEDATALEN

From
Bruce Momjian
Date:
On Mon, Jul  3, 2017 at 11:31:01AM -0700, Emrul wrote:
> Hi hackers,
> 
> This question came up again on Reddit:
> https://www.reddit.com/r/PostgreSQL/comments/6kyyev/i_have_hit_the_table_name_length_limit_a_number/
> and I thought I'd echo it here.
> 
> I totally am on board with short, descriptive names and a good convention. 
> However, there are just so many cases where 63 characters can't
> descriptively describe a column name.  I've been on projects where we have

I am coming in late on this, but just to clarify, the NAMEDATALEN is in
_bytes_, meaning multi-byte names are often less than 63 characters.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +