On Oct 28, 2007, at 12:59 AM, Guy Rouillier wrote:
> Matthew Wilson wrote:
>> I have a lot of code -- millions of lines at this point, written
>> over the last 5 years. Everything is in a bunch of nested folders.
>> At least once a week, I want to find some code that uses a few
>> modules, so I have to launch a find + grep at the top of the tree
>> and then wait for it to finish.
>> I wonder if I could store our source code in a postgresql table and
>> then use full text searching to index. Then I hope I could run a
>> query where I ask for all files that use modules X, Y, and Z.
>
> DBMSs are great tools for the right job, but IMO this is not the
> right job. I can't see how a database engine, with all its
> transactional overhead and many other layers, will ever beat a
> simple grep performance-wise. I've used Eclipse for refactoring,
> but having done it once, I'm sticking with grep.
This is exactly what cscope is good for.
http://cscope.sourceforge.net/
I've used it since the early '90s. I do level 3 support for really
big companies. If you are an emacs fan, it's hooked into it as well.
You want to use the -q option, which builds an inverted index. On a
million lines of code, the initial build is going to take a while.
It pseudo-parses the code (some tricky constructs will confuse it)
and builds a very simple database file. I think it uses Berkeley's
DB file. After that, finding all the occurrences of foo takes a few
seconds.
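A rough sketch of that workflow, driving cscope from Python. The flags
(-b, -q, -R, -d, -L, -0) are cscope's own; the helper names below are
made up for illustration, and the parser assumes the usual
"file function line text" layout of line-oriented output with no
spaces in file paths.

```python
import subprocess

def build_index(root="."):
    # -b: build the cross-reference only, -q: also build the inverted
    # index, -R: recurse into subdirectories. Slow on a large tree.
    subprocess.run(["cscope", "-b", "-q", "-R"], cwd=root, check=True)

def parse_line(line):
    # Assumed layout: file, enclosing function, line number, source text.
    parts = line.rstrip("\n").split(" ", 3)
    path, func, lineno = parts[:3]
    text = parts[3] if len(parts) > 3 else ""
    return path, func, int(lineno), text

def find_symbol(symbol, root="."):
    # -d: do not rebuild the database, -L -0 SYMBOL: one-shot
    # "find this symbol" query with line-oriented output.
    result = subprocess.run(
        ["cscope", "-d", "-q", "-L", "-0", symbol],
        cwd=root, check=True, capture_output=True, text=True,
    )
    return [parse_line(l) for l in result.stdout.splitlines() if l.strip()]
```

Run build_index() once at the top of the tree, then call
find_symbol("foo") as often as you like; only the first step is slow.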
If you only want definitions (where is foo defined?), use ctags or
etags. Exuberant Ctags is available here:
http://ctags.sourceforge.net/
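For the definitions case, a minimal sketch of reading a tags file in
Python, assuming it was generated with "ctags -R" at the top of the
tree and uses the standard tab-separated layout (name, file, address).
The function name parse_tags is just a placeholder.

```python
def parse_tags(lines):
    # Map each tag name to the list of files that define it.
    defs = {}
    for line in lines:
        if line.startswith("!_TAG_"):
            continue  # metadata header lines written by ctags
        fields = line.rstrip("\n").split("\t", 2)
        if len(fields) < 3:
            continue  # not a well-formed tag entry
        name, path = fields[0], fields[1]
        defs.setdefault(name, []).append(path)
    return defs

def files_defining(name, tags_path="tags"):
    # Look up one name in a tags file on disk.
    with open(tags_path, encoding="utf-8", errors="replace") as f:
        return parse_tags(f).get(name, [])
```

files_defining("foo") then returns the files where foo is defined; for
a very large tags file you would want to look entries up rather than
load the whole thing each time.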
Perry Smith ( pedz@easesoftware.com )
Ease Software, Inc. ( http://www.easesoftware.com )
Low cost SATA Disk Systems for IBMs p5, pSeries, and RS/6000 AIX systems