Thread: File count restriction of directory limits number of relations inside a database.

Hi, We have a requirement to have multiple schemas in a database.Each schema
will have all application tables and hold a set of users data. We have
segmented data across different schemas. We dynamically increase the schemas
as in when user signs up. So the table files created across all schemas
resides in a single folder under data/base/16546/***. We can't find any
restriction to number of relation(table,index) on a database in postgresql.
But Number of files in a folder in linux is restricted in our file system
what we use currently 32k/64K (ext2 / ext4). To overcome this we create  a
folders inside a database folder and associate as table space for each
schema but it breaks the streaming replication in standby server which we
use currently.
To overcome this We changed the postgresql-9.4.0  source code to create a
sub directory "my_<oid of tablespace>" inside the location given in create
tablespace statement  and take that location as tablespace location. Now we
specify one common location to create multiple tablespaces, on both slave
and master because postgresql creates a tablespace on subdirectory(my_<oid
of tablespace>). Since I'm not an expert in postgresql, i can't assure that it will not
affect other functionality of postgres. Please help me, i changed the method
named "create_tablespace_directories" in a file "tablespace.c".

Here's the modified method.
     static void       create_tablespace_directories(const char *location, const Oid
tablespaceoid)       {           char       *linkloc;           char       *location_with_version_dir;           char *
newlocation;          struct stat st;
 
           linkloc = psprintf("pg_tblspc/%u", tablespaceoid);
newlocation=psprintf("%s/my_%d",location,tablespaceoid);          location_with_version_dir = psprintf("%s/%s",
newlocation,                                                  
 
TABLESPACE_VERSION_DIRECTORY);

           /*            * Attempt to coerce target directory to safe permissions.  If
this fails,            * it doesn't exist or has the wrong owner.            */           if (chmod(location, S_IRWXU)
!=0)           {               if (errno == ENOENT)                   ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_FILE),                           errmsg("directory \"%s\" does not exist",
 
location),                            InRecovery ? errhint("Create this directory for
the tablespace before "                                                 "restarting the server.")
: 0));               else                   ereport(ERROR,                           (errcode_for_file_access(),
                errmsg("could not set permissions on directory
 
\"%s\": %m",                                location)));           }                    ereport(WARNING,(errmsg("Trying
tocreate :
 
\"%s\"",newlocation)));
           if(mkdir(newlocation,S_IRWXU)<0)           {               if (errno == EEXIST)
ereport(ERROR,                                  (errcode(ERRCODE_OBJECT_IN_USE),
errmsg("directory\"%s\" already in use
 
as a tablespace",                                           newlocation)));                       else
        ereport(ERROR,                                   (errcode_for_file_access(),
errmsg("couldnot create directory
 
\"%s\": %m",                                           newlocation)));           }                  
           if (InRecovery)           {               /*                * Our theory for replaying a CREATE is to
forciblydrop the
 
target                * subdirectory if present, and then recreate it. This may
be more                * work than needed, but it is simple to implement.                */               if
(stat(location_with_version_dir,&st) == 0 &&
 
S_ISDIR(st.st_mode))               {                   if (!rmtree(location_with_version_dir, true))
  /* If this failed, mkdir() below is going to error.
 
*/                       ereport(WARNING,                               (errmsg("some useless files may be left
behind in old database directory \"%s\"",                                       location_with_version_dir)));
   }           }
 
           /*            * The creation of the version directory prevents more than one
tablespace            * in a single location.            */           if (mkdir(location_with_version_dir, S_IRWXU) <
0)          {               if (errno == EEXIST)                   ereport(ERROR,
(errcode(ERRCODE_OBJECT_IN_USE),                           errmsg("directory \"%s\" already in use as a
 
tablespace",                                   location_with_version_dir)));               else
ereport(ERROR,                          (errcode_for_file_access(),                            errmsg("could not create
directory\"%s\": %m",                                   location_with_version_dir)));           }
 
           /*            * In recovery, remove old symlink, in case it points to the
wrong place.            *            * On Windows, junction points act like directories so we must
be able to            * apply rmdir; in general it seems best to make this code work
like the            * symlink removal code in destroy_tablespace_directories,
except that            * failure to remove is always an ERROR.            */           if (InRecovery)           {
        if (lstat(linkloc, &st) == 0 && S_ISDIR(st.st_mode))               {                   if (rmdir(linkloc) < 0)
                    ereport(ERROR,                               (errcode_for_file_access(),
   errmsg("could not remove directory \"%s\":
 
%m",                                       linkloc)));               }               else               {
   if (unlink(linkloc) < 0 && errno != ENOENT)                       ereport(ERROR,
(errcode_for_file_access(),                               errmsg("could not remove symbolic link
 
\"%s\": %m",                                       linkloc)));               }           }
           /*            * Create the symlink under PGDATA            */           if (symlink(newlocation, linkloc) <
0)              ereport(ERROR,                       (errcode_for_file_access(),                        errmsg("could
notcreate symbolic link \"%s\": %m",                               linkloc)));
 
           pfree(linkloc);           pfree(newlocation);           pfree(location_with_version_dir);       }





-----
sudalai
--
View this message in context:
http://postgresql.nabble.com/File-count-restriction-of-directory-limits-number-of-relations-inside-a-database-tp5844711.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.



Hi,

On 4.4.2015 21:24, sudalai wrote:
> Hi,
> We have a requirement to have multiple schemas in a database.Each 
> schema will have all application tables and hold a set of users
> data. We have segmented data across different schemas. We
> dynamically increase the schemas as in when user signs up. So the
> table files created across all schemas resides in a single folder
> under data/base/16546/***. We can't find any restriction to number of
> relation(table,index) on a database in postgresql.

PostgreSQL is using 32-bit object IDs, so technically you can have up to
2^32 objects (tables, indexes). You'll certainly run into other issues
before hitting this limit, though.

> But Number of files in a folder in linux is restricted in our file
> system what we use currently 32k/64K (ext2 / ext4). To overcome this
> we create a folders inside a database folder and associate as table
> space for each schema but it breaks the streaming replication in
> standby server which we use currently.

I'm pretty sure ext4 allows more directory entries. For example I just
did this on my ext4 partition, within a new directory:
 $ for i in `seq 1 1000000`; do touch $i.file; done

and guess what, I do have 1M files there:
 $ ls | wc -l 1000000

So either you're using some old version of ext4, or you've used some
strange parameters when creating the filesystem.

> To overcome this We changed the postgresql-9.4.0 source code to
> create a sub directory "my_<oid of tablespace>" inside the location
> given in create tablespace statement and take that location as
> tablespace location. Now we specify one common location to create
> multiple tablespaces, on both slave and master because postgresql
> creates a tablespace on subdirectory(my_<oid of tablespace>).
>
> Since I'm not an expert in postgresql, i can't assure that it will 
> not affect other functionality of postgres. Please help me, i
> changed the method named "create_tablespace_directories" in a file
> "tablespace.c".

This seems extremely foolish, IMNSHO.

Firstly, I'm not convinced there actually is a problem - you haven't
posted any error messages demonstrating the existence of such ext4
limits, you only claimed they exist. I've created a directory on ext4
with 1M files in it, without any problem.

Secondly, even if there is such problem on your system, it's most likely
by using improper options when creating the ext4 filesystem. Solving
this at the PostgreSQL level is a poor solution.

And finally - no one is going to confirm that your changes are safe,
without a detailed review of your patch. But that would take a lot of
time, and it seems you're solving an artificial problem, so I'd be
surprised if anyone invests the time into reviewing this.

In other words, you should probably fix the filesystem configuration,
not PostgreSQL code.

-- 
Tomas Vondra                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Hi,
     Ya you are right, ext4 allows more directory entries(more than 32000)
but we limited the number of files insides the directory to 32000 to get
better performance. Sorry i'm not mentioned that in my post.
That the reason we plan to use tablespace.  The problem we faced in
tablespace is, the location should be present on both master and slave and
we need to create multiple tablespaces. That why i changed the source, to
create a sub directory on the given location and take that location for
tablespace creation. So i can given one location (that present in both
master & slave) to create multiple tablespaces.





-----
sudalai
--
View this message in context:
http://postgresql.nabble.com/File-count-restriction-of-directory-limits-number-of-relations-inside-a-database-tp5844711p5845044.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.



On 4/7/15 10:49 PM, sudalai wrote:
>        Ya you are right, ext4 allows more directory entries(more than 32000)
> but we limited the number of files insides the directory to 32000 to get
> better performance. Sorry i'm not mentioned that in my post.
> That the reason we plan to use tablespace.  The problem we faced in
> tablespace is, the location should be present on both master and slave and
> we need to create multiple tablespaces. That why i changed the source, to
> create a sub directory on the given location and take that location for
> tablespace creation. So i can given one location (that present in both
> master & slave) to create multiple tablespaces.

Having run databases that probably had more than that number of files 
and not seeing any issues with that, why are you even bothering with 
this? We've gotten no reports from the field that this is actually a 
problem.

If you could provide some data that this was causing a real (not a 
hypothetical) issue it'd be a lot easier to get the community excited 
about it.
-- 
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




On 04/08/15 07:09, Jim Nasby wrote:
> On 4/7/15 10:49 PM, sudalai wrote:
>>        Ya you are right, ext4 allows more directory entries(more than
>> 32000)
>> but we limited the number of files insides the directory to 32000 to get
>> better performance. Sorry i'm not mentioned that in my post.
>> That the reason we plan to use tablespace.  The problem we faced in
>> tablespace is, the location should be present on both master and slave
>> and
>> we need to create multiple tablespaces. That why i changed the source, to
>> create a sub directory on the given location and take that location for
>> tablespace creation. So i can given one location (that present in both
>> master & slave) to create multiple tablespaces.
>
> Having run databases that probably had more than that number of files
> and not seeing any issues with that, why are you even bothering with
> this? We've gotten no reports from the field that this is actually a
> problem.
>
> If you could provide some data that this was causing a real (not a
> hypothetical) issue it'd be a lot easier to get the community excited
> about it.

Right. I was just going to write something along these lines. I've 
personally ran databases with far more objects, and the filesystem was 
never the main problem (unless going way back into past).

Sudalaj, I'd like to see some numbers showing that this indeed is a 
problem. Do you have any benchmark demonstrating the issue, or are you 
acting based on obsolete folk wisdom / assumptions?

Also, we already have a solution for that - use separate databases 
instead of schemas. That creates separate directory per database, and it 
also solves other issues (e.g. the statistics file will be split per 
database). It may not fit your application needs, though.

--
Tomas Vondra                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



On 2015-04-07 20:49:21 -0700, sudalai wrote:
> Hi,
> 
>       Ya you are right, ext4 allows more directory entries(more than 32000)
> but we limited the number of files insides the directory to 32000 to get
> better performance. Sorry i'm not mentioned that in my post.

There's actually a restriction to ~64k *subdirectories* in a
directory. But none on the number of files. And with the b-tree
directories ext3/4 have been using for a long while accessing a
directory entry is going to be cheaper than accessing a dirent in a
subdirectory.

So I think you're just prematurely optimizing at the wrong end.

Greetings,

Andres Freund