Wednesday, October 31, 2007

Searching for files in Unix


A few days back, one of my friends asked me how to search for a file on Unix. I told him to use locate, but unfortunately he didn't have the rights to use it or to update its database (updatedb). That's when I remembered that we have the old and reliable tool called "find". Here I cover some cool things that "find" can do for you.


Introduction

The find command allows the Unix user to process a set of files and/or directories in a file subtree.

You can specify the following:

* where to search (pathname)
* what type of file to search for (-type: directories, data files, links)
* how to process the files (-exec: run a process against a selected file)
* the name of the file(s) (-name)
* how to combine selections with logical operations (-o and -a)

1. Search for a file with a specific name in a set of files (-name)

find . -name "rc.conf" -print

This command searches the current directory and all of its subdirectories for a file named rc.conf.

Note: The -print option prints the path of any file that is found with that name. In general, -print prints the path of any file that meets the find criteria.
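Here is a small sketch you can run to try this safely. It builds a throwaway tree (every path in it is made up for illustration) and shows -name with an exact name, -name with a quoted wildcard, and the effect of adding -type f so that a directory that happens to be called rc.conf is skipped.

```shell
# Build a scratch tree; all names below are hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/etc/defaults" "$demo/home/rc.conf"   # note: a *directory* named rc.conf
touch "$demo/etc/rc.conf" "$demo/etc/defaults/rc.conf" "$demo/etc/httpd.conf"

# Exact name, any file type: matches the two files and the directory.
any_type=$(find "$demo" -name "rc.conf" -print | wc -l | tr -d ' ')

# Exact name, regular files only.
files_only=$(find "$demo" -type f -name "rc.conf" -print | wc -l | tr -d ' ')

# Quoted wildcard: the quotes stop the shell from expanding *.conf itself.
conf_files=$(find "$demo" -type f -name "*.conf" -print | wc -l | tr -d ' ')

echo "any type: $any_type, files only: $files_only, *.conf files: $conf_files"
rm -rf "$demo"
```

Leaving the quotes off "*.conf" is a common mistake: the shell may expand the pattern against the current directory before find ever sees it.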

2. How to apply a Unix command to a set of files (-exec).

find . -name "rc.conf" -exec chmod o+r '{}' \;

This command searches the current directory and all of its subdirectories. Every file named rc.conf is processed by the chmod o+r command. The '{}' argument inserts each found file into the chmod command line. The \; argument indicates that the -exec command line has ended.

The end result of this command is that all rc.conf files have the "other" permissions set to read access (provided the user running it is the owner of the file).
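To see the effect without touching real files, here is a sketch that runs the same -exec chmod in a scratch directory. The file names are hypothetical, and reading the mode from the first ten characters of ls -l is just a convenient way to check the result.

```shell
# Scratch tree with hypothetical file names, all starting at mode 600.
demo=$(mktemp -d)
mkdir -p "$demo/a/b"
touch "$demo/a/rc.conf" "$demo/a/b/rc.conf" "$demo/a/other.conf"
chmod 600 "$demo/a/rc.conf" "$demo/a/b/rc.conf" "$demo/a/other.conf"

# chmod runs once per match; {} is replaced by the path, \; ends the -exec.
find "$demo" -type f -name "rc.conf" -exec chmod o+r '{}' \;

# Read back the permission strings before cleaning up.
matched=$(ls -l "$demo/a/b/rc.conf" | cut -c1-10)
untouched=$(ls -l "$demo/a/other.conf" | cut -c1-10)
echo "rc.conf: $matched   other.conf: $untouched"
rm -rf "$demo"
```

Only the files that match -name gain the read bit for others; other.conf keeps its original mode.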

3. How to apply a complex selection of files (-o and -a).

find /usr/src -not \( -name "*,v" -o -name ".*,v" \) -print

This command searches the /usr/src directory and all of its subdirectories. All files whose names match '*,v' or '.*,v' are excluded. Important arguments to note are:

* -not means the negation of the expression that follows (the POSIX spelling is !)
* \( means the start of a complex expression.
* \) means the end of a complex expression.
* -o means a logical OR of the expressions on either side of it.
* -a means a logical AND; it is implied when two tests appear side by side.

In this case the complex expression is all files like '*,v' or '.*,v'

The above example shows how to select all files that are not part of the RCS system. This is important when you want to go through a source tree and modify all the source files, but you don't want to affect the RCS version control files.
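The same selection can be tried on a tiny made-up source tree. This is only a sketch: the file names are hypothetical, it uses ! (the POSIX form of -not), and it adds -type f so that directories are left out of the output.

```shell
# Hypothetical source tree with one real source file and two RCS files.
demo=$(mktemp -d)
mkdir -p "$demo/src/RCS"
touch "$demo/src/main.c" "$demo/src/RCS/main.c,v" "$demo/src/RCS/.hidden,v"

# Regular files AND NOT (name matches *,v OR name matches .*,v).
# The -a between -type f and ! is implied.
sources=$(find "$demo" -type f ! \( -name "*,v" -o -name ".*,v" \) -print)

echo "$sources"
rm -rf "$demo"
```

Only main.c is listed; both ,v files are filtered out by the parenthesised expression.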

4. How to search for a string in a selection of files (-exec grep ...).

find . -exec grep "www.athabasca" '{}' \; -print

This command searches the current directory and all of its subdirectories. For every file that contains the string, grep prints the matching lines and find then prints the file's path to standard output.

If you just want to find each file and then pass it on for processing, use grep's -q option. With -q, grep prints nothing and stops at the first occurrence of the search string. It then signals success to find, which prints the path and continues searching for more files.

find . -exec grep -q "www.athabasca" '{}' \; -print

This command is very useful for processing a series of files that contain a specific string. You can then process each file appropriately. An example is finding all html files that contain the string "www.athabascau.ca". You can then process those files with a sed script to replace the occurrences of "www.athabascau.ca" with "intra.athabascau.ca".
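Here is one way that find, grep -q and sed could be chained, shown on a scratch directory. Treat it as a sketch under a few assumptions: the html files are invented for the demo, the second -exec wraps sed in a small sh -c script, and the script writes to a temporary file and renames it rather than using sed -i, because the -i option behaves differently between sed implementations.

```shell
# Two hypothetical html files; only the first contains the old host name.
demo=$(mktemp -d)
mkdir -p "$demo/site/docs"
printf '<a href="http://www.athabascau.ca/">home</a>\n' > "$demo/site/index.html"
printf '<p>nothing to change</p>\n' > "$demo/site/docs/about.html"

# grep -q acts as a filter: the second -exec and -print run only when it succeeds.
# Note that the rewritten file is a new file, so it gets default permissions.
changed=$(find "$demo" -type f -name "*.html" \
  -exec grep -q "www\.athabascau\.ca" '{}' \; \
  -exec sh -c 'sed "s/www\.athabascau\.ca/intra.athabascau.ca/g" "$1" > "$1.tmp" && mv "$1.tmp" "$1"' sh '{}' \; \
  -print)

result=$(cat "$demo/site/index.html")
other=$(cat "$demo/site/docs/about.html")
echo "changed: $changed"
echo "$result"
rm -rf "$demo"
```

Run the find with only the grep -q and -print parts first to review the list of files before letting sed rewrite anything.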

Special thanks to: http://www.athabascau.ca/html/depts/compserv/webunit/HOWTO/find.htm
