Thread: Path-length follies

Path-length follies

From: Tom Lane
Whilst cleaning up query-length dependencies, I noticed that our
handling of maximum file pathname lengths is awfully messy.

Different parts of the system rely on no fewer than four different
symbols that they import from several different system header
files (any one of which might not exist on a particular platform):
    MAXPATHLEN, _POSIX_PATH_MAX, MAX_PATH, PATH_MAX
And on top of that, postgres.h defines MAXPGPATH which is used
by yet other places.

On my system, _POSIX_PATH_MAX = 255, PATH_MAX = 1023, MAXPATHLEN = 1024
(a nearby Linux box is almost but not quite the same) whereas MAXPGPATH
is 128.  So there is absolutely no consistency to the pathname length
limits being imposed in different parts of Postgres.
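
As a rough illustration of the inconsistency described above, the following sketch prints whichever of the four symbols the local headers happen to define. The function name report_path_limits is made up for this example, and the choice of headers is an assumption about where a given platform keeps these macros.

```c
#include <limits.h>     /* PATH_MAX and _POSIX_PATH_MAX on POSIX systems */
#include <stdio.h>
#if defined(__unix__) || defined(__APPLE__)
#include <sys/param.h>  /* MAXPATHLEN on BSD-derived systems */
#endif

/*
 * Hypothetical helper: prints each path-length macro that is defined
 * in this compilation context and returns how many of the four were found.
 */
int
report_path_limits(FILE *out)
{
    int found = 0;

#ifdef MAXPATHLEN
    fprintf(out, "MAXPATHLEN      = %ld\n", (long) MAXPATHLEN);
    found++;
#endif
#ifdef _POSIX_PATH_MAX
    fprintf(out, "_POSIX_PATH_MAX = %ld\n", (long) _POSIX_PATH_MAX);
    found++;
#endif
#ifdef MAX_PATH
    fprintf(out, "MAX_PATH        = %ld\n", (long) MAX_PATH);
    found++;
#endif
#ifdef PATH_MAX
    fprintf(out, "PATH_MAX        = %ld\n", (long) PATH_MAX);
    found++;
#endif
    return found;
}
```

Calling report_path_limits(stdout) on two different machines is a quick way to see how far the values diverge; the numbers depend entirely on the platform and on which headers were included.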

AFAIK, most or all flavors of Unix have kernel limits on the maximum
length of a pathname that will be accepted by the kernel's file-access
calls (it's 1024 on my box).  So I don't feel any need to remove
hardwired limits on pathname lengths in favor of indefinitely-expansible
buffers.  But it does seem that a little more consistency in the
hardwired limits is called for.

From the information I have, it seems that the various
allegedly-standard #defines for max pathname length are not too standard,
and I don't think that Postgres internal buffers ought to constrain
path lengths to much less than the kernel limit (so using the
seemingly "standard" _POSIX_PATH_MAX symbol would be a loser).
So my inclination is to define MAXPGPATH as 1024 in config.h, and
remove all uses of the other four symbols in favor of MAXPGPATH.
That would at least provide a single point of tweaking for anyone
who didn't like the value of 1024.
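
A minimal sketch of what a single tweakable limit could look like in use, assuming MAXPGPATH is defined once as proposed. The helper join_path is hypothetical and not taken from the Postgres sources; it only shows a fixed-size buffer guarded by the one constant.

```c
#include <stdio.h>
#include <string.h>

/* In the proposal this would live in config.h; 1024 is the value suggested above. */
#define MAXPGPATH 1024

/*
 * Hypothetical helper: joins dir and file with a slash into dst, which
 * must point to at least MAXPGPATH bytes.  Returns 0 on success, or -1
 * if the result (including the terminating NUL) would not fit.
 */
int
join_path(char *dst, const char *dir, const char *file)
{
    size_t needed = strlen(dir) + 1 + strlen(file);

    if (needed >= MAXPGPATH)
        return -1;              /* refuse rather than truncate silently */
    snprintf(dst, MAXPGPATH, "%s/%s", dir, file);
    return 0;
}
```

Anyone unhappy with 1024 would then change the one #define, and every buffer and length check sized by it follows along.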

Does anyone have a better idea?  Is it worth trying to extract a
system limit on pathlength during configure, rather than leaving
MAXPGPATH as a manual configuration item --- and if so, exactly how
should configure go about it?
        regards, tom lane


Re: [HACKERS] Path-length follies

From: Bruce Momjian
> Whilst cleaning up query-length dependencies, I noticed that our
> handling of maximum file pathname lengths is awfully messy.
> 
> Different parts of the system rely on no fewer than four different
> symbols that they import from several different system header
> files (any one of which might not exist on a particular platform):
>     MAXPATHLEN, _POSIX_PATH_MAX, MAX_PATH, PATH_MAX
> And on top of that, postgres.h defines MAXPGPATH which is used
> by yet other places.
> 
> On my system, _POSIX_PATH_MAX = 255, PATH_MAX = 1023, MAXPATHLEN = 1024
> (a nearby Linux box is almost but not quite the same) whereas MAXPGPATH
> is 128.  So there is absolutely no consistency to the pathname length
> limits being imposed in different parts of Postgres.
> 
> AFAIK, most or all flavors of Unix have kernel limits on the maximum
> length of a pathname that will be accepted by the kernel's file-access
> calls (it's 1024 on my box).  So I don't feel any need to remove
> hardwired limits on pathname lengths in favor of indefinitely-expansible
> buffers.  But it does seem that a little more consistency in the
> hardwired limits is called for.
> 
> From the information I have, it seems that the various
> allegedly-standard #defines for max pathname length are not too standard,
> and I don't think that Postgres internal buffers ought to constrain
> path lengths to much less than the kernel limit (so using the
> seemingly "standard" _POSIX_PATH_MAX symbol would be a loser).
> So my inclination is to define MAXPGPATH as 1024 in config.h, and
> remove all uses of the other four symbols in favor of MAXPGPATH.
> That would at least provide a single point of tweaking for anyone
> who didn't like the value of 1024.
> 
> Does anyone have a better idea?  Is it worth trying to extract a
> system limit on pathlength during configure, rather than leaving
> MAXPGPATH as a manual configuration item --- and if so, exactly how
> should configure go about it?

I don't like the 128 or 256 numbers, but isn't there a predefined place
for this value in standard system headers?


-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: [HACKERS] Path-length follies

From: Tom Lane
Bruce Momjian <maillist@candle.pha.pa.us> writes:
>> Does anyone have a better idea?  Is it worth trying to extract a
>> system limit on pathlength during configure, rather than leaving
>> MAXPGPATH as a manual configuration item --- and if so, exactly how
>> should configure go about it?

> I don't like the 128 or 256 numbers, but isn't there a predefined place
> for this value in standard system headers?

There are too many of 'em, actually --- I had never realized this
before, but there are three or four *different* "standard" symbols that
all purport to be max pathlength.  On my box they actually have three
different values, which doesn't leave a warm feeling in the stomach.

As I was just commenting off-list, we do not need to enforce the local
kernel's pathlength limit --- it's perfectly capable of doing that for
itself.  All we really need to do is make sure we are not a bottleneck
preventing reasonable usage.  So, although I was thinking last night
that a configure test might be a good idea, I now believe it's a waste
of cycles.  (It could even be counterproductive, if it seized on a
bogusly small value, as _POSIX_PATH_MAX appears to be on both of the
systems I've checked.)  Let's just set the value at something generous
like 1K and forget it.  But we should use a consistent, tweakable-in-
one-place value, just in case.
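
To illustrate the point that the kernel enforces its own limit, here is a small sketch that hands open() a deliberately overlong path and reports the resulting errno. The helper name errno_for_long_path is invented for this example; on systems that have a kernel pathname limit one would expect ENAMETOOLONG, but the exact behavior is platform-dependent.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: builds a relative path of the given length out of
 * short "a/" components, tries to open it, and returns the errno reported
 * by the kernel (0 if the open unexpectedly succeeds, ENOMEM if the
 * buffer cannot be allocated).
 */
int
errno_for_long_path(size_t len)
{
    char   *path = malloc(len + 1);
    size_t  i;
    int     fd;
    int     err;

    if (path == NULL)
        return ENOMEM;
    for (i = 0; i < len; i++)
        path[i] = (i % 2 == 0) ? 'a' : '/';
    path[len] = '\0';

    fd = open(path, O_RDONLY);
    err = (fd < 0) ? errno : 0;
    if (fd >= 0)
        close(fd);
    free(path);
    return err;
}
```

The application buffer only needs to be large enough not to get in the way; rejecting paths the kernel cannot handle is left to the kernel.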
        regards, tom lane


Re: [HACKERS] Path-length follies

From: Bruce Momjian
> Bruce Momjian <maillist@candle.pha.pa.us> writes:
> >> Does anyone have a better idea?  Is it worth trying to extract a
> >> system limit on pathlength during configure, rather than leaving
> >> MAXPGPATH as a manual configuration item --- and if so, exactly how
> >> should configure go about it?
> 
> > I don't like the 128 or 256 numbers, but isn't there a predefined place
> > for this value in standard system headers?
> 
> There are too many of 'em, actually --- I had never realized this
> before, but there are three or four *different* "standard" symbols that
> all purport to be max pathlength.  On my box they actually have three
> different values, which doesn't leave a warm feeling in the stomach.

Couldn't we pick one of the standard ones for use in setting a value for
our own define, or at least test one of the standard ones against ours
to see that it is either equal or greater than the 1024 we chose?
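
One way such a cross-check might look, as a sketch only: it compares PATH_MAX, if the headers define it, against an assumed MAXPGPATH of 1024. The function name is hypothetical. Note that on the system described earlier in the thread, where PATH_MAX is 1023, a check like this would report a mismatch.

```c
#include <limits.h>     /* PATH_MAX, where the platform provides it */

#define MAXPGPATH 1024  /* the value chosen in this thread */

/*
 * Hypothetical check: returns 1 if the system's PATH_MAX is equal to or
 * greater than MAXPGPATH, or if there is no PATH_MAX to compare against;
 * returns 0 if PATH_MAX is smaller than MAXPGPATH.
 */
int
system_limit_covers_maxpgpath(void)
{
#ifdef PATH_MAX
    return PATH_MAX >= MAXPGPATH;
#else
    return 1;
#endif
}
```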

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: [HACKERS] Path-length follies

From: Peter Eisentraut
This came and went already but I did some research on it and it doesn't
look as bad as it seems.

On 1999-10-23, Tom Lane mentioned:

> Different parts of the system rely on no fewer than four different
> symbols that they import from several different system header
> files (any one of which might not exist on a particular platform):
>     MAXPATHLEN, _POSIX_PATH_MAX, MAX_PATH, PATH_MAX
> And on top of that, postgres.h defines MAXPGPATH which is used
> by yet other places.
> 
> On my system, _POSIX_PATH_MAX = 255, PATH_MAX = 1023, MAXPATHLEN = 1024
> (a nearby Linux box is almost but not quite the same) whereas MAXPGPATH
> is 128.  So there is absolutely no consistency to the pathname length
> limits being imposed in different parts of Postgres.

The Posix.1 symbol is PATH_MAX, which, in theory, describes the "uniform
system limit". The symbol _POSIX_PATH_MAX defines the minimum which
PATH_MAX is required to be on any Posix system, therefore that value
should be fixed at 255 in the whole world. (Which makes code such as
this:
    #ifndef MAXPATHLEN
    #define MAXPATHLEN _POSIX_PATH_MAX
    #endif
--from the actual source-- conceptually incorrect, since it treats the
guaranteed minimum as if it were the actual limit.)

From my linux/limits.h (which propagates through to limits.h):
    #define PATH_MAX        4095    /* # chars in a path name */

In addition there is FILENAME_MAX, which is defined even if there is, in
fact, no limit on the filename length, in which case it is set to some
really large number. (Thus it is no good for allocating fixed size
buffers.) This seems to be an ANSI C symbol for stdio sort of stuff, not a
kernel thing. (And of course in the GNU "Any Day Now" System, there is no
such limit. ;)

MAXPATHLEN is the BSD name for PATH_MAX. From my sys/param.h:
    /* BSD names for some <limits.h> values.  */
    . . .
    #define MAXPATHLEN      PATH_MAX

Although this seems to be the most popular thing to use, I can hardly see
it referenced in any documentation at all on this machine.

If one wishes to be anally proper one could use pathconf() to find out the
limits on the fly as they apply to a particular file system.
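
For reference, a minimal sketch of the pathconf() approach, assuming a POSIX system. The wrapper path_max_for is a made-up name; pathconf() and _PC_PATH_MAX are the standard POSIX interface.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: asks the system for the pathname limit that
 * applies relative to the given directory.  Returns the limit, or -1
 * when pathconf() returns -1.  In the -1 case *err holds errno, and a
 * value of 0 there means the limit is indeterminate (no fixed limit)
 * rather than an error.
 */
long
path_max_for(const char *path, int *err)
{
    long limit;

    errno = 0;                  /* needed to tell "no limit" from failure */
    limit = pathconf(path, _PC_PATH_MAX);
    *err = (limit == -1) ? errno : 0;
    return limit;
}
```

Because the answer can differ per file system and is only known at run time, it cannot size a compile-time buffer, which is presumably why a fixed constant remains attractive here.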

Finally, the symbol MAX_PATH is not described anywhere and I didn't find
it in the source either.

Which would lead one to suggest the following as portable as possible way
out:

#if defined(PATH_MAX)
  #define MAXPGPATH PATH_MAX
#else
  #if defined(MAXPATHLEN)
    #define MAXPGPATH MAXPATHLEN
  #else
    #define MAXPGPATH 255  /* because this is the lowest common
                  denominator on Posix systems */
  #endif
#endif

That ought to cover all bases really. And if your system doesn't have
either Posix or BSD includes (whoops!) you can tweak it yourself. Put that
in config.h and everyone is happy.

Then again, I would be even happier if we just used PATH_MAX and did not
invent a PostgreSQL-specific constant for everything in the world, but I'm
not sure about the Posix'ness of other systems in the crowd out there. How
about simply:

#ifndef PATH_MAX
#define PATH_MAX 255
#endif

in c.h (not config.h) -- end of story.

(Of course the code would actually have to use this as well. Currently,
MAXPATHLEN is most widespread.)
-Peter

-- 
Peter Eisentraut                  Sernanders vaeg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden



Re: [HACKERS] Path-length follies

From: Tom Lane
Peter Eisentraut <peter_e@gmx.net> writes:
> Which would lead one to suggest the following as portable as possible way
> out:

> #if defined(PATH_MAX)
>   #define MAXPGPATH PATH_MAX
> #else
>   #if defined(MAXPATHLEN)
>     #define MAXPGPATH MAXPATHLEN
>   #else
>     #define MAXPGPATH 255  /* because this is the lowest common
>                   denominator on Posix systems */
>   #endif
> #endif

I don't think this would be an improvement.  The main problem with it is
that the above code could yield different values for MAXPGPATH *on the
same system* depending on which system include file(s) you had included
before reading config.h.  Of course it would be a very bad thing if
different Postgres source files had different ideas about the value of
MAXPGPATH --- it could lead to different interpretations of a struct
layout, for example.  (I'm not sure that we actually have any such
structs, but there's obviously potential for trouble.)
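
The struct-layout hazard can be sketched in a single file by declaring the same hypothetical struct twice, once for each value of MAXPGPATH that two source files might have ended up with. The struct and function names are invented for this illustration and do not correspond to anything in the Postgres sources.

```c
#include <stddef.h>

/*
 * Hypothetical struct, stamped out twice to mimic two source files that
 * saw different values of MAXPGPATH when they read config.h.
 */
#define DECLARE_CONTROL_STRUCT(name, maxpath) \
    struct name { char path[maxpath]; int flags; }

DECLARE_CONTROL_STRUCT(control_seen_by_file_a, 1024);   /* PATH_MAX-like value */
DECLARE_CONTROL_STRUCT(control_seen_by_file_b, 255);    /* fallback value */

/* Offset of the flags member as file A would compute it. */
size_t
flags_offset_a(void)
{
    return offsetof(struct control_seen_by_file_a, flags);
}

/* Offset of the flags member as file B would compute it. */
size_t
flags_offset_b(void)
{
    return offsetof(struct control_seen_by_file_b, flags);
}
```

If file A fills in such a struct and file B reads it, B looks for flags at the wrong offset, and nothing at compile time would complain.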

If it were really important to have MAXPGPATH exactly equal to the
local filename length limit, I'd be more interested in trying to
configure it just so.  One possibility would be to have the configure
script do the equivalent of the above logic once at configure time,
and then put the nailed-down value into config.h.  But I can't see
that it's worth the trouble.  As long as we are not getting in people's
way with an unreasonably small limit on pathlengths, it doesn't much
matter exactly what the limit is.  IMHO anyway.
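
For what it's worth, the configure-time variant could be sketched as a tiny test program that evaluates the quoted #if chain once and emits a literal value. The helper name format_maxpgpath_define and the idea of writing its output into config.h are assumptions for illustration, not an existing configure test.

```c
#include <limits.h>
#include <stdio.h>
#if defined(__unix__) || defined(__APPLE__)
#include <sys/param.h>  /* MAXPATHLEN on BSD-derived systems */
#endif

/*
 * Hypothetical configure-time helper: picks the limit once, using the
 * same preference order as the quoted #if chain, formats the line that
 * would go into config.h, and returns the chosen value.
 */
long
format_maxpgpath_define(char *buf, size_t buflen)
{
    long value;

#if defined(PATH_MAX)
    value = PATH_MAX;
#elif defined(MAXPATHLEN)
    value = MAXPATHLEN;
#else
    value = 255;                /* fallback from the quoted proposal */
#endif
    snprintf(buf, buflen, "#define MAXPGPATH %ld\n", value);
    return value;
}
```

Since the emitted line contains a plain number, every source file then sees the same MAXPGPATH regardless of which system headers it included first.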

However, this line of thought does lead to something that maybe we
should change: right now, most of the source files are set up as
    #include <all necessary system header files>
    #include "postgres.h"
    #include "necessary postgres headers"

where config.h is read as part of postgres.h.  I wonder whether it's
such a good idea to have different source files reading different
sets of system headers before config.h.  Maybe the standard order
ought to be
    #include "postgres.h"
    #include <all necessary system header files>
    #include "necessary postgres headers"

so that config.h is always read in a uniform context.
        regards, tom lane


Re: [HACKERS] Path-length follies

From: Peter Eisentraut
On 1999-11-06, Tom Lane mentioned:

> Peter Eisentraut <peter_e@gmx.net> writes:
> > Which would lead one to suggest the following as portable as possible way
> > out:
> 
> > #if defined(PATH_MAX)
> >   #define MAXPGPATH PATH_MAX
> > #else
> >   #if defined(MAXPATHLEN)
> >     #define MAXPGPATH MAXPATHLEN
> >   #else
> >     #define MAXPGPATH 255  /* because this is the lowest common
> >                   denominator on Posix systems */
> >   #endif
> > #endif
> 
> I don't think this would be an improvement.  The main problem with it is

That's why I suggested:

#ifndef PATH_MAX
#define PATH_MAX 255
#endif

instead. Then remove all references to MAXPATHLEN and MAXPGPATH. That can
be done rather quickly. The above is standardized and then we'll have a
uniform limit throughout the source, that should be equal to the actual
system limit on 99% of all systems. And it makes the source simpler along
the way. As it is right now, the vast majority of files don't use
MAXPGPATH anyway.

Of course, this is a stupid topic to discuss, but please consider the
point.


> However, this line of thought does lead to something that maybe we
> should change: right now, most of the source files are set up as
> 
>     #include <all necessary system header files>
> 
>     #include "postgres.h"
> 
>     #include "necessary postgres headers"
> 
> where config.h is read as part of postgres.h.  I wonder whether it's
> such a good idea to have different source files reading different
> sets of system headers before config.h.  Maybe the standard order
> ought to be
> 
>     #include "postgres.h"
> 
>     #include <all necessary system header files>
> 
>     #include "necessary postgres headers"
> 
> so that config.h is always read in a uniform context.

Definitely.

-- 
Peter Eisentraut                  Sernanders vaeg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden



Re: [HACKERS] Path-length follies

From: Tom Lane
Peter Eisentraut <peter_e@gmx.net> writes:
> As it is right now, the vast majority of files don't use
> MAXPGPATH anyway.

??  I think you are looking at out-of-date sources, because I changed
everything to use MAXPGPATH a week or two ago...
        regards, tom lane