Queries are a way of asking a computer system questions. (I heard you say ‘duh!’) You’ve probably done this lots in SQL to ask your database questions. How many members have this condition? How much money was sent to these people in August? But how about this one: Which file has the bean named ‘myCrazyService’ in it? Or, How many projects refer to the old version of ‘projectFastMoving’? For these questions you should turn to the *nix-like file utilities.

Standard *nix tools can be used to create queries that your filesystem can answer. Let’s start with find.

The find utility

The find command lets you locate files that are named in a specific way, and that have certain file attributes. Let’s look at an example:

find . -iname SpriNg.xml

This query searches the current directory and everything beneath it for files with names like: spring.xml, Spring.xml, or sprinG.xml. The i in -iname means that the pattern is compared case-insensitively against existing file names. Let’s try another:

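find . -maxdepth 2 -name 'proj*.xml'

This command starts in the current directory and descends at most 2 levels: it checks the files right here and the files one folder down, and goes no deeper. It will look for files named like: projector.xml, project.xml, and proj2.xml. The pattern is quoted so that the shell hands it to find untouched instead of expanding it against the current directory first, and -maxdepth comes before -name because GNU find warns when it is placed after a test. Limiting the depth can greatly speed up your search. So can starting deeper in the directory tree, closer to the files you’re after.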
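If you’d like to see name matching and depth limits in action without touching real files, here is a small sketch. It assumes a throwaway directory tree; every file and folder name in it is made up for the demo. It also uses -type f, one of the file-attribute tests mentioned earlier, to keep only regular files.

```shell
# Build a scratch tree (all names below are invented for this demo).
demo=$(mktemp -d)
mkdir -p "$demo/a/b" "$demo/docs/spring.xml"   # note: docs/spring.xml is a directory
touch "$demo/project.xml" "$demo/a/proj2.xml" "$demo/a/b/projector.xml" "$demo/a/Spring.xml"

# -maxdepth 2 sees ./project.xml (level 1) and ./a/proj2.xml (level 2),
# but never reaches ./a/b/projector.xml (level 3).
depth_limited=$(cd "$demo" && find . -maxdepth 2 -name 'proj*.xml' | LC_ALL=C sort)

# -iname ignores case; -type f drops the directory that happens to be named spring.xml.
files_only=$(cd "$demo" && find . -type f -iname 'spring.xml' | LC_ALL=C sort)

echo "$depth_limited"
echo "$files_only"

rm -rf "$demo"
```

Try bumping -maxdepth to 3 and projector.xml should show up as well.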

Add grep to your bucket

Now that you’ve located the set of files under suspicion, it’s time to execute your unwarranted search. For example, if you’re looking for the projects that depend on my-shared-code, your query might look something like this:

find . -maxdepth 2 -name project.xml -exec grep -l my-shared-code {} \;

There’s a lot going on here. The first half of the find command should be fairly familiar. The second half uses the -exec option to run a program once for each file found. Everything from -exec up to the semicolon is part of the executed command; the semicolon is escaped as \; so that the shell passes it along to find instead of treating it as a command separator. The {} marks where the filename is substituted at execution time. So the resulting command runs as if it were something like this:

grep -l my-shared-code ./myproject/project.xml

The option used in grep is a lowercase L. It tells grep to print only the names of the files whose contents match, and not the lines that triggered the match.
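Here is a self-contained sketch of the find-plus-grep query, run against two pretend projects. The folder names and file contents are invented for the demo. It also shows a variation worth knowing: ending -exec with + instead of \; asks find to pass many filenames to a single grep run, which is usually quicker on large trees.

```shell
# Two fake projects; only alpha depends on my-shared-code (contents are made up).
demo=$(mktemp -d)
mkdir -p "$demo/alpha" "$demo/beta"
printf '<dependency>my-shared-code</dependency>\n' > "$demo/alpha/project.xml"
printf '<dependency>something-else</dependency>\n' > "$demo/beta/project.xml"

# \; runs grep once for every project.xml that find locates.
one_by_one=$(cd "$demo" && find . -maxdepth 2 -name project.xml \
  -exec grep -l my-shared-code {} \; | LC_ALL=C sort)

# + collects the filenames and runs grep on them in batches.
batched=$(cd "$demo" && find . -maxdepth 2 -name project.xml \
  -exec grep -l my-shared-code {} + | LC_ALL=C sort)

echo "$one_by_one"
echo "$batched"

rm -rf "$demo"
```

Both forms should list the same files; the difference is only in how many times grep gets started.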

Stuck on Windows?

Try Console 2 or Cygwin. I’ve been exclusively on Mac for over 4 years, so I can’t help you much more than that. :-)

Go try it

Now go try it. See what kinds of queries you can write. See which take a long time, and how you can focus them to get the info you need in a reasonable time frame. Enjoy!
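As a starting point, here is one way you might answer the "how many projects refer to the old version" question from the top of the post. This is only a sketch: the project folders and the version strings (1.0 as the old one, 2.0 as the new one) are assumptions made up for the demo.

```shell
# Three fake projects; two of them still point at the pretend old version 1.0.
demo=$(mktemp -d)
mkdir -p "$demo/app1" "$demo/app2" "$demo/app3"
printf 'projectFastMoving 1.0\n' > "$demo/app1/project.xml"
printf 'projectFastMoving 2.0\n' > "$demo/app2/project.xml"
printf 'projectFastMoving 1.0\n' > "$demo/app3/project.xml"

# grep -l prints one filename per matching file, so counting lines counts projects.
# -F treats the search text as a literal string, so the dot in 1.0 is not a wildcard.
old_count=$(cd "$demo" && find . -maxdepth 2 -name project.xml \
  -exec grep -lF 'projectFastMoving 1.0' {} + | wc -l | tr -d ' ')

echo "$old_count"

rm -rf "$demo"
```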