On Wed, Nov 14, 2012 at 05:39:29PM -0500, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > In pg_upgrade, copy fsm, vm, and extent files by checking for file
> > existence via open(), rather than collecting a directory listing and
> > looking up matching relfilenode files with sequential scans of the
> > array.  This speeds up pg_upgrade by 2x for a large number of tables,
> > e.g. 16k.
>
> Uh ... you replaced a strcmp() with an open()?

Yes, a strcmp() against every element of the array, for each file lookup.

> I'm prepared to believe that's a win for sufficiently large N, if you
> assume that the filesystem is smart enough to have O(1) lookup time
> regardless of the directory size ... but that doesn't seem like a very
> good assumption, and in any case surely this loses badly for a smaller
> number of files.
>
> You would have been better off keeping the array and sorting it so you
> could use binary search, instead of passing the problem off to the
> filesystem.

Well, testing showed that using open() was a big win.  To do this with
the directory listing, as I mentioned, you would need to pull listings
from all tablespaces, sort them, and then do a binary search.  I thought
removing the directory array code was a win in itself, and I figured the
kernel's directory code was already doing a binary search.

Do you want me to code up a test?
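
For illustration, here is a minimal sketch of the two lookup strategies
under discussion: probing for a file with open(), and keeping a sorted
listing that is searched with bsearch().  This is not the pg_upgrade
code.  The function names (exists_via_open, sort_listing,
listing_contains) are hypothetical, and the sketch assumes a POSIX
system and a listing already held in memory as an array of name strings.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Strategy 1 (hypothetical helper): let the filesystem do the lookup.
 * Returns 1 if the file could be opened, 0 if it does not exist,
 * and -1 on any other error (for example, permission denied).
 */
int
exists_via_open(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd >= 0)
    {
        close(fd);
        return 1;
    }
    return (errno == ENOENT) ? 0 : -1;
}

/* Compare two array elements, each a pointer to a file name string. */
static int
cmp_names(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;

    return strcmp(*sa, *sb);
}

/*
 * Strategy 2, step 1 (hypothetical helper): sort the directory listing
 * once so that later lookups can use binary search.
 */
void
sort_listing(const char **names, size_t n)
{
    qsort(names, n, sizeof(names[0]), cmp_names);
}

/*
 * Strategy 2, step 2 (hypothetical helper): binary search of the sorted
 * listing.  Returns 1 if name is present, 0 otherwise.
 */
int
listing_contains(const char **names, size_t n, const char *name)
{
    return bsearch(&name, names, n, sizeof(names[0]), cmp_names) != NULL;
}
```

With a listing such as {"16385", "16384_vm", "16384", "16384_fsm"},
sort_listing() would be called once and listing_contains() once per
candidate name, while exists_via_open() needs no listing at all and
costs one system call per candidate path.  The sketch does not measure
either approach; which one wins for a given number of files would still
have to be tested.
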
--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com
  + It's impossible for everything to be true. +