On Wed, Nov 14, 2012 at 05:39:29PM -0500, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > In pg_upgrade, copy fsm, vm, and extent files by checking for file
> > existence via open(), rather than collecting a directory listing and
> > looking up matching relfilenode files with sequential scans of the
> > array. This speeds up pg_upgrade by 2x for a large number of tables,
> > e.g. 16k.
>
> Uh ... you replaced a strcmp() with an open()?

Yes, a strcmp() against every element of the array, for each lookup.

> I'm prepared to believe that's a win for sufficiently large N, if you
> assume that the filesystem is smart enough to have O(1) lookup time
> regardless of the directory size ... but that doesn't seem like a very
> good assumption, and in any case surely this loses badly for a smaller
> number of files.
>
> You would have been better off keeping the array and sorting it so you
> could use binary search, instead of passing the problem off to the
> filesystem.

Well, testing showed that using open() was a big win. To do this with
the directory listing, as I mentioned, you would need to pull listings
from all tablespaces, sort them, and then do a binary search. I thought
removing the directory array code was a win in itself, and I figured the
directory lookup code in the kernel was already doing a binary search.

Do you want me to code up a test?
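
For reference, here is a minimal sketch of the two lookup strategies
being compared. It is not the pg_upgrade code: the function names
(sort_listing, listing_contains, file_exists_via_open) are made up for
illustration, and it assumes the directory listing has already been read
into an array of file name strings.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Compare two array elements (each a char *) for qsort() and bsearch(). */
static int
cmp_names(const void *a, const void *b)
{
    return strcmp(*(const char *const *) a, *(const char *const *) b);
}

/*
 * Approach 1 (hypothetical helper): sort the directory listing once so
 * that later lookups can use a binary search instead of a linear scan.
 */
static void
sort_listing(const char **names, size_t n)
{
    qsort(names, n, sizeof(names[0]), cmp_names);
}

/* Binary search of a listing previously sorted by sort_listing(). */
static bool
listing_contains(const char **names, size_t n, const char *target)
{
    return bsearch(&target, names, n, sizeof(names[0]), cmp_names) != NULL;
}

/*
 * Approach 2 (hypothetical helper): skip the listing and let the
 * filesystem do the lookup by trying to open the file.
 */
static bool
file_exists_via_open(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return false;       /* missing, or not readable by us */
    close(fd);
    return true;
}
```

The first approach costs one sort per directory and then O(log N)
string comparisons per lookup; the second costs one system call per
lookup, and how that scales with directory size depends on the
filesystem.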
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +