Review: Patch to compute Max LSN of Data Pages - Mailing list pgsql-hackers

From Muhammad Usama
Subject Review: Patch to compute Max LSN of Data Pages
Date
Msg-id CAEJvTzXArhYnmi8RBEbMOU9M9U+zVMs5aoSN19Tm_9SZ3PW_Qg@mail.gmail.com
Whole thread Raw
Responses Re: Review: Patch to compute Max LSN of Data Pages  (Amit Kapila <amit.kapila@huawei.com>)
List pgsql-hackers
Hi Amit

I have reviewed and tested the patch, Following are my observations and comments.

Basic stuff
-----------------
- Patch applies OK 
- Compiles cleanly with no warnings 
- All regression tests pass. 


Observations and Comments
-------------------------------------------

- If no option is given to pg_computemaxlsn utility then we get a wrong error message
./pg_computemaxlsn  ../data
pg_computemaxlsn: file or directory "(null)" exists
In my opinion we should consider the -P (Print max LSN from whole data directory) as default in case no option is specified, or otherwise throw a more descriptive error.

- Options -p and -P should be mutually exclusive

- Long options missing for -p and -P

- Help string for -P  should be something like "print max LSN from whole data directory" instead of  "print max LSN from whole database"

- Help menu is silent about -V or --version option

- I think when finding the max LSN of single file the utility should consider all relation segments.

- There is no check for extra arguments
e.g
 ./pg_computemaxlsn  -P . ../data/ ../data/base/ ../data2/ 

- For -p {file | dir}  option the utility expects the file path relative to the specified data directory path which makes thing little confusing
for example 
./pg_computemaxlsn -p data/base/12286/12200 data
pg_computemaxlsn: file or directory "data/base/12286/12200" does not exists
although data/base/12286/12200 is valid file path but we gets file does not exists error

- Utility should print the max LSN number in hexadecimal notation and LSN format (logid/segno) for consistency with PG, and may also print the file name which contains that LSN.

- When finding max LSN from directory, the utility does not considers the sub directories, which I think could be useful in some cases, like if we want to find the max LSN in tablespace directory. With the current implementation we would require to run the utility multiple times, once for each database.
  

Code Review
---------------------

Code generally looks good, I have few small points.

- In main() function variable maxLSN is defined as uint64, Although XLogRecPtr and uint64 are the same thing, but I think we should use XLogRecPtr instead.

- Just a small thing although will hardly improve anything but still for sake of saving few cycles, I think you should use XLByteLT ( < ) instead of XLByteLE ( <= )
to find if the previous max LSN is lesser then current. 

- '\n' is missing from one the printf in the code :-) 
fprintf(stderr, _("%s: file or directory \"%s\" does not exists"),
progname, LsnSearchPath);

- While finding the max LSN from single file, you are not verifying that the file actually exists inside the data directory path provided. and can end up looking
at wrong data directory for checking if the server is running.
For example:
bin/pg_computemaxlsn -p $PWD/data/base/12286/12200 $PWD/otherinstallation/data
Maximum LSN found is: 0, WAL segment file name (fileid,seg): 0000000000000000

- Code is not verifying, if the provided data directory is the valid data directory, This could make pg_computemaxlsn to compute max LSN from the running server file.
For example: 
bin/pg_computemaxlsn -p $PWD/data/base/12286/12200 NONDATADIR/
Maximum LSN found is: 0, WAL segment file name (fileid,seg): 0000000000000000

Questions
-----------------

- I am wondering why are you not following the link inside the "pg_tblspc" directory to get to the table space location instead of building the directory path. I am thinking if you follow the link instead, The utility can be used across the different versions of PG.



Regards,
Muhammad Usama

pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: the number of pending entries in GIN index with FASTUPDATE=on
Next
From: Fujii Masao
Date:
Subject: Re: PQconninfo function for libpq