Re: filesystem performance with lots of files - Mailing list pgsql-performance

From David Roussel
Subject Re: filesystem performance with lots of files
Date
Msg-id 43A80668.1040604@diroussel.xsmail.com
In response to filesystem performance with lots of files  (David Lang <dlang@invendra.net>)
Responses Re: filesystem performance with lots of files  ("Jim C. Nasby" <jnasby@pervasive.com>)
List pgsql-performance
David Lang wrote:

> ext3 has an option to make searching directories faster (htree), but enabling it kills performance when you create files. And this doesn't help with large files.

The ReiserFS white paper talks about the data structure used to store directories (some kind of tree), and it says it's quick to both read and write.  Don't forget, if you find ls slow, that could just be ls, since it's ls, not the fs, that sorts the files into alphabetical order.
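
A quick way to check whether the sorting in ls is the slow part is to compare it with an unsorted listing. This is only a sketch: the scratch directory and the file count of 2000 are made up for illustration, and it assumes an ls that supports "-f" (do not sort, which also shows "." and ".."); GNU ls offers "-U" for the same purpose.

```shell
# Scratch directory and file count are hypothetical, for illustration only.
dir=$(mktemp -d)
i=0
while [ "$i" -lt 2000 ]; do
    : > "$dir/file_$i"    # create an empty file
    i=$((i + 1))
done

# Sorted listing: ls reads every entry, then sorts before printing.
sorted_count=$(ls "$dir" | wc -l | tr -d ' ')

# Unsorted listing: -f prints entries in directory order and includes
# "." and "..", which are filtered out here.
unsorted_count=$(ls -f "$dir" | grep -v -e '^\.$' -e '^\.\.$' | wc -l | tr -d ' ')

echo "sorted: $sorted_count, unsorted: $unsorted_count"
```

Running each ls under time, with a much larger file count, should give a rough idea of how much of the delay is ls sorting rather than the filesystem.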

> how long would it take to do a tar-ftp-untar cycle with no smarts

Note that you can do the tarring, zipping, copying and untarring concurrently.  I can't remember the exact netcat command line options, but it goes something like this:

Box1:
tar czvf - myfiles/* | netcat myserver 12345

Box2 (start this side first; some netcat variants want "-l 12345" rather than "-l -p 12345"):
netcat -l -p 12345 | tar xzvf -

Not only do you gain from doing it all concurrently, but not writing a temp file means that disk seeks are reduced too if you have a one-spindle machine.
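
The no-temp-file part can be tried on a single machine before involving netcat at all. In the sketch below the two scratch directories and the file names are invented for illustration, and a plain pipe stands in for the network connection; the point is only that the archive is never written to disk.

```shell
# Hypothetical source and destination trees.
src=$(mktemp -d)
dst=$(mktemp -d)
printf 'hello\n' > "$src/a.txt"
mkdir "$src/sub"
printf 'world\n' > "$src/sub/b.txt"

# Pack, compress, decompress and unpack concurrently; the pipe replaces
# the netcat connection and the archive only ever exists as a stream.
tar czf - -C "$src" . | tar xzf - -C "$dst"

# Compare the two trees; diff prints nothing if they are identical.
diff -r "$src" "$dst" && echo "trees match"
```

Replacing the pipe with the sending and listening netcat commands gives the two-box version.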

Also consider just copying files onto a network mount.  It may not be as fast as the above, but it will be faster than rsync, which has high CPU usage and is thus not a good choice on a LAN.

Hmm, sorry this is not directly postgres anymore...

David
