This is my first post on the list. My name is Antonio. I am a CS grad student, and my field of study is databases and information retrieval. To gain some practical knowledge, I've been studying the PostgreSQL codebase for a while.
Now I would like to contribute some code, and I've chosen the following item from the TODO list:
Allow reporting of which objects are in which tablespaces
This item is difficult because a tablespace can contain objects from multiple databases. There is a server-side function that returns the databases which use a specific tablespace, so this requires a tool that will call that function and connect to each database to find the objects in each database for that tablespace.
The item suggests using pg_tablespace_databases to discover which databases use a specific tablespace, and then connecting to each of those databases to find the objects stored in that tablespace.
I checked the code of pg_tablespace_databases, defined in src/backend/utils/adt/misc.c, and saw that it uses a much simpler approach: it just reads the tablespace directory and returns the names of the subdirectories that represent database OIDs.
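To check my understanding, here is a rough sketch of that idea in Python rather than backend C. It is only an illustration and not the code in misc.c: the function name database_oids_in_dir and the directory passed to it are my own assumptions, and it shows only the "keep the all-digit directory names" step.

```python
import os


def database_oids_in_dir(tablespace_dir):
    """Return the database OIDs found as subdirectory names.

    Hypothetical helper: it keeps only entries whose names consist
    entirely of ASCII digits and that are directories, on the assumption
    that each such directory belongs to one database.
    """
    oids = []
    for name in os.listdir(tablespace_dir):
        path = os.path.join(tablespace_dir, name)
        # Skip anything that is not a plain decimal number.
        if not (name.isascii() and name.isdigit()):
            continue
        # Skip regular files that merely happen to have a numeric name.
        if os.path.isdir(path):
            oids.append(int(name))
    return sorted(oids)
```

Called on a directory that contains one subdirectory per database, it would return the sorted list of OIDs taken from those names.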
Although the function works as expected, I can see some issues that the code does not address:
- It does not check permissions, so any user can execute it;
- It does not check whether the platform supports symlinks, which can cause an error because the function tries to follow the links defined in pg_tblspc.
I could use the same approach and write a function that goes down one more level in the directory structure and finds the objects' OIDs inside each database directory, but I don't know if this is the best way to do it.
Could someone please give me feedback and help me with this item?
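A minimal sketch of what "one more level" could look like, again in Python just for illustration. The function name file_numbers_by_database is hypothetical, and I am assuming that taking the leading digits of each file name is enough to identify a file. The numbers are only file names, so I assume they would still have to be looked up in the catalogs of each database to get actual object names.

```python
import os
import re

# Leading run of ASCII digits in a file name, e.g. "16385" or "16385.1".
_LEADING_DIGITS = re.compile(r"^[0-9]+")


def file_numbers_by_database(tablespace_dir):
    """Map each database OID directory to the numeric file names inside it.

    Hypothetical helper for illustration only: directories whose names are
    not all digits are ignored, and so are files without a numeric prefix.
    """
    result = {}
    for db_name in os.listdir(tablespace_dir):
        db_path = os.path.join(tablespace_dir, db_name)
        if not _LEADING_DIGITS.fullmatch(db_name) or not os.path.isdir(db_path):
            continue
        numbers = set()
        for file_name in os.listdir(db_path):
            match = _LEADING_DIGITS.match(file_name)
            if match:
                # Several files can share one prefix; count it once.
                numbers.add(int(match.group()))
        result[int(db_name)] = sorted(numbers)
    return result
```

The result is a dictionary keyed by database OID, which a tool could then use to decide which databases to connect to and which numbers to look up there.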