Re: Review: Patch to compute Max LSN of Data Pages - Mailing list pgsql-hackers

From Amit kapila
Subject Re: Review: Patch to compute Max LSN of Data Pages
Date
Msg-id 6C0B27F7206C9E4CA54AE035729E9C383BE96305@szxeml509-mbs
Whole thread Raw
In response to Review: Patch to compute Max LSN of Data Pages  (Muhammad Usama <m.usama@gmail.com>)
Responses Re: Review: Patch to compute Max LSN of Data Pages  (Muhammad Usama <m.usama@gmail.com>)
List pgsql-hackers
Friday, November 23, 2012 5:38 AM Muhammad Usama wrote:
 
> Observations and Comments
> -------------------------------------------
> - If no option is given to pg_computemaxlsn utility then we get a wrong
> error message
> ./pg_computemaxlsn  ../data
> pg_computemaxlsn: file or directory "(null)" exists
> In my opinion we should consider the -P (Print max LSN from whole data
> directory) as default in case no option is specified, or otherwise throw a
> more descriptive error.

 

Fixed. By considering default option as Whole data directory.



> - Options -p and -P should be mutually exclusive
 
Fixed. Both options will not be allowed at the same time.



- Long options missing for -p and -P

 

Fixed. introduced long options as --path & --data-directory.



> - Help string for -P  should be something like "print max LSN from whole
> data directory" instead of  "print max LSN from whole database"
 
Fixed, by changing the message as suggested.

> - Help menu is silent about -V or --version option
 
Fixed.

> - I think when finding the max LSN of single file the utility should
> consider all relation segments.
 
IMHO, as this utility is not aware of relation or any other logical entity and deals with only file or directory,

  it is okay to find only for that file.




> - There is no check for extra arguments
> e.g
> ./pg_computemaxlsn  -P . ../data/ ../data/base/ ../data2/
 
Fixed.
        - Extra options gives error.
        - Multiple -p options also gives error.
        Eg:
        ./pg_computemaxlsn
-p base/12286/12200 -p base/12286/12201 data



> - For -p {file | dir}  option the utility expects the file path relative to
> the specified data directory path which makes thing little confusing
> for example
> ./pg_computemaxlsn -p data/base/12286/12200 data
> pg_computemaxlsn: file or directory "data/base/12286/12200" does not exists
> although data/base/12286/12200 is valid file path but we gets file does not
> exists error
As the path of file is relative to data directory, that's why in error it prints the path as "data/base/12286/12200".
Do you have any suggestion what should be done for this?

> - Utility should print the max LSN number in hexadecimal notation and LSN
> format (logid/segno) for consistency with PG,
 
Fixed
 
> and may also print the file name which contains that LSN.
 
It might be confusing to users for the cases where it has to follow link (pg_tblspc) and then print some absolute path which is not related to relative path given by user.
Also I am not sure if it will be of any use to user.
 


> - When finding max LSN from directory, the utility does not considers the
> sub directories, which I think could be useful in some cases, like if we
> want to find the max LSN in tablespace directory. With the current
> implementation we would require to run the utility multiple times, once for
> each database.
 
It will consider sub-directories as well.


> Code Review
> ---------------------

> Code generally looks good, I have few small points.

> - In main() function variable maxLSN is defined as uint64,
> Although XLogRecPtr and uint64 are the same thing, but I think we should
> use XLogRecPtr instead.
Changed as per suggestion.

> - Just a small thing although will hardly improve anything but still for
> sake of saving few cycles, I think you should use XLByteLT ( < ) instead
> of XLByteLE ( <= )
> to find if the previous max LSN is lesser then current.
 
Changed as per suggestion.

> - '\n' is missing from one the printf in the code :-)
> fprintf(stderr, _("%s: file or directory \"%s\" does not exists"),
> progname, LsnSearchPath);
 
Changed as per suggestion.

> - While finding the max LSN from single file, you are not verifying that
> the file actually exists inside the data directory path provided. and can
> end up looking
>at wrong data directory for checking if the server is running.
> For example:
> bin/pg_computemaxlsn -p $PWD/data/base/12286/12200
> $PWD/otherinstallation/data
> Maximum LSN found is: 0, WAL segment file name (fileid,seg):
> 0000000000000000

> - Code is not verifying, if the provided data directory is the valid data
> directory, This could make pg_computemaxlsn to compute max LSN from the
> running server file.
> For example:
> bin/pg_computemaxlsn -p $PWD/data/base/12286/12200 NONDATADIR/
> Maximum LSN found is: 0, WAL segment file name (fileid,seg):
>0000000000000000
 
Fixed, by allowing only relative path for file or directory and check if that is relative and below data directory with function path_is_relative_and_below_cwd().

> Questions
> -----------------

> - I am wondering why are you not following the link inside the "pg_tblspc"
> directory to get to the table space location instead of building the
> directory path. I am thinking if you follow the link instead, The utility
> can be used across the different versions of PG.
Changed as per suggestion.
 
 
Apart from above, updated the documentation.
With Regards,
Amit Kapila.
Attachment

pgsql-hackers by date:

Previous
From: Pavan Deolasee
Date:
Subject: Re: Re: Problem Observed in behavior of Create Index Concurrently and Hot Update
Next
From: Michael Paquier
Date:
Subject: Re: [WIP] pg_ping utility