There’s nothing more frustrating than knowing that a file exists on your system but not knowing where it is. It’s like losing your car keys or misplacing your phone.
Sometimes you’ll have contextual clues – for example, system configuration files are usually in /etc. But even there, you’ve got 253 directories and subdirectories and over 1,730 files.
Let’s look at some tools to make finding files easier.
I’ve set up a Debian 10 VPS and done the following:
- created two users, named frank and mary
- in /home/mary, I’ve created a directory called diary, placed some files in it (entry1.txt, entry2.txt, etc.), and made it mode 700 so only mary (and root) could see files in that directory.
- in /home/frank, I created files called “plan1.txt” and “plan2.txt”, both of which are mode 600, so only frank (and root) can see them.
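If you want to see what those modes look like without creating real accounts, here is a minimal sketch. It assumes GNU coreutils (for stat -c) and uses a throwaway directory in place of /home; the demo variable and the layout under it are my own stand-ins for illustration, not the exact commands run on the VPS.

```shell
#!/bin/sh
# Sketch: recreate the modes described above inside a scratch directory.
# "$demo" stands in for /home; no real users are created.
demo=$(mktemp -d)

mkdir -p "$demo/mary/diary" "$demo/frank"
touch "$demo/mary/diary/entry1.txt" "$demo/mary/diary/entry2.txt"
touch "$demo/frank/plan1.txt" "$demo/frank/plan2.txt"

# 700: only the owner (and root) may list or enter the directory.
chmod 700 "$demo/mary/diary"
# 600: only the owner (and root) may read or write the files.
chmod 600 "$demo/frank/plan1.txt" "$demo/frank/plan2.txt"

# GNU stat prints the octal permission bits with %a.
diary_mode=$(stat -c '%a' "$demo/mary/diary")
plan_mode=$(stat -c '%a' "$demo/frank/plan1.txt")
echo "diary: $diary_mode  plan1.txt: $plan_mode"

rm -rf "$demo"
```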
mlocate (Merging Locate) is a package that builds a global database of files that can be queried to find files. locate has a long history in Unix (1982). slocate was an improvement on locate and mlocate is a further improvement which seeks to operate more quickly and not blow up the filesystem cache when it updates.
mlocate also only shows files that are available to the user, not all files. All these programs run as unprivileged users, so you needn’t worry that user A is going to see user B’s file called /home/userb/my-secret-diary.txt.
To install on Debian:
apt-get install mlocate
When installed, the database is empty, so trying to use the locate command produces an error:
root@vnc:~# locate hosts
locate: can not stat () `/var/lib/mlocate/mlocate.db': No such file or directory
Let’s get the database up to date:
# time updatedb
On an AMD EPYC 7601 with a 25GB RAID SSD disk that is 95% free, updating the database takes less than a second. For comparison, on 6TB of spinning disk RAID-1 on less-than-enterprise-grade storage with an i3 on my home fileserver, updating the database can take several minutes.
Let’s see what the mlocate database looks like:
root@vnc:~# locate -S
1,433,462 bytes in file names
640,990 bytes used to store database
Now that we’re up to date, I can query. As root:
root@vnc:~# locate hosts
If I wanted to do a case-insensitive locate, I could use locate -i.
Because I’m root, I can see files in mary’s home directory:
# locate entry1.txt
But if I’m frank, I cannot:
# su - frank
$ locate entry1.txt
updatedb runs nightly out of /etc/cron.daily/mlocate to keep the database up to date.
One weakness of locate is that if you create a file during the day, you have to wait until the overnight run of updatedb (or run it manually) before the file is in the database. For real-time queries, we can use find.
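To make the difference concrete, here is a small sketch showing that find sees a file the moment it exists, since it walks the directory tree rather than consulting a database. The scratch directory and the file name just_created.txt are hypothetical, chosen only for the demonstration.

```shell
#!/bin/sh
# Sketch: a file created a moment ago is visible to find right away,
# with no database update in between.
work=$(mktemp -d)
touch "$work/just_created.txt"

fresh=$(find "$work" -name just_created.txt)
echo "$fresh"

rm -rf "$work"
```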
find is a very old command that goes back to Unix version 5 (1978!). It does a real-time search of filesystems. Unlike locate, you can search with criteria besides just a name.
The general format for find is
find PATH EXPRESSIONS... ACTIONS...
Let’s say I wanted to find /etc/passwd. I would type:
find /etc -name passwd -print
- /etc: “go look in the /etc directory and all its subdirectories”
- -name passwd: “match files named ‘passwd’”
- -print: “print out each file you find”
Here are the results:
# find /etc -name passwd -print
The -print action is optional, so if you leave it off, you’ll get the same result.
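You can check that claim safely in a scratch directory. This is a sketch only: the work directory and the fake etc tree inside it are stand-ins I made up so the commands don't touch the real /etc.

```shell
#!/bin/sh
# Sketch: compare find output with and without the explicit -print action.
work=$(mktemp -d)
mkdir -p "$work/etc/sub"
touch "$work/etc/passwd" "$work/etc/sub/group"

with_print=$(find "$work/etc" -name passwd -print)
without_print=$(find "$work/etc" -name passwd)

# Both variables hold the same single path.
echo "with -print:    $with_print"
echo "without -print: $without_print"

rm -rf "$work"
```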
Of course, I may not know the directory, so I could run this against the root filesystem:
# find / -name passwd -print
-iname is the case-insensitive parallel to -name.
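Here is a quick sketch of the difference, assuming GNU find as shipped on Debian. The file names README.txt and notes.txt are hypothetical and live in a scratch directory.

```shell
#!/bin/sh
# Sketch: -name compares case-sensitively, -iname ignores case.
work=$(mktemp -d)
touch "$work/README.txt" "$work/notes.txt"

# No match: the pattern is lowercase, the file name is uppercase.
exact=$(find "$work" -name 'readme.txt')
# Match: -iname treats readme.txt and README.txt as the same name.
loose=$(find "$work" -iname 'readme.txt')

echo "-name:  ${exact:-<nothing>}"
echo "-iname: $loose"

rm -rf "$work"
```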
With find, I can also find based on other criteria. Some examples to whet your appetite:
# dd if=/dev/zero of=/root/bigfile bs=1048576 count=512
512+0 records in
512+0 records out
536870912 bytes (537 MB, 512 MiB) copied, 0.490124 s, 1.1 GB/s
# find / -size +100M -print
find: ‘/proc/2381/task/2381/fd/6’: No such file or directory
find: ‘/proc/2381/task/2381/fdinfo/6’: No such file or directory
find: ‘/proc/2381/fd/5’: No such file or directory
find: ‘/proc/2381/fdinfo/5’: No such file or directory
Here I’ve created a 512MB file and then searched for files bigger than 100M (“-size +100M”). The errors appear because I asked find to search from / and that includes /proc: some processes that were running when find started had exited by the time it reached their entries, so their files no longer existed.
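The same size test can be tried on a smaller scale. This sketch assumes GNU find (for the M size suffix) and uses made-up file names in a scratch directory; it also sends error messages to /dev/null, which is one way to quiet noise like the /proc messages.

```shell
#!/bin/sh
# Sketch: find files larger than 1 MiB in a scratch directory.
work=$(mktemp -d)

# One 2 MiB file and one 1 KiB file.
dd if=/dev/zero of="$work/bigfile" bs=1048576 count=2 2>/dev/null
dd if=/dev/zero of="$work/smallfile" bs=1024 count=1 2>/dev/null

# -size +1M matches files whose size, rounded up to whole MiB, exceeds 1.
# 2>/dev/null discards any error messages find prints on stderr.
big=$(find "$work" -type f -size +1M 2>/dev/null)
echo "$big"

rm -rf "$work"
```

When searching from /, GNU find's -xdev option is another way to avoid the noise: it keeps find on the starting filesystem, so it never descends into /proc.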
I can also find files based on date:
# mkdir /backup
# touch -t 201008201111 /backup/some_old_backup.tar.gz
# touch /backup/current_backup.tar.gz
# ll /backup
drwxr-xr-x 2 root root 4096 May 24 18:42 .
drwxr-xr-x 19 root root 4096 May 24 18:40 ..
-rw-r--r-- 1 root root 0 May 24 18:42 current_backup.tar.gz
-rw-r--r-- 1 root root 0 Aug 20 2010 some_old_backup.tar.gz
# find /backup -mtime +30 -print
That find expression (“-mtime +30”) means “last modified more than 30 days ago”.
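The sign in front of the number matters: a plus means "more than" and a minus means "less than". Here is a sketch of both, using the same backup file names in a scratch directory rather than a real /backup.

```shell
#!/bin/sh
# Sketch: select files by modification age with -mtime.
work=$(mktemp -d)
touch -t 201008201111 "$work/some_old_backup.tar.gz"   # dated 20 Aug 2010
touch "$work/current_backup.tar.gz"                     # dated now

# +30: last modified more than 30 days ago.
old=$(find "$work" -type f -mtime +30)
# -1: last modified less than one day ago.
recent=$(find "$work" -type f -mtime -1)

echo "old:    $old"
echo "recent: $recent"

rm -rf "$work"
```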
There is much more you can do with find; for example, printing is not the only action. There is also a galaxy of expressions you can use to search by: owner, group owner, newer than or older than other files, type of file, etc. Consult the find man page to learn all about find.
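As a taste of those options, here is a sketch that combines a few of them: -type, -newer, -user, and the -exec action. It assumes GNU find and GNU stat; every file and directory name in it is hypothetical and lives in a scratch directory.

```shell
#!/bin/sh
# Sketch: a few other find expressions, plus an action besides -print.
work=$(mktemp -d)
mkdir "$work/dir"
touch -t 201008201111 "$work/reference"      # an old reference file
touch "$work/newer.txt" "$work/dir/nested.txt"

# -type d: match directories only.
dirs=$(find "$work" -type d -name dir)

# -newer FILE: match files modified more recently than FILE.
newer=$(find "$work" -type f -newer "$work/reference" | sort)

# -user: match files owned by a user (a numeric ID is accepted too).
mine=$(find "$work" -type f -user "$(id -u)" | sort)

# -exec: run a command on the matches instead of printing them.
find "$work" -type f -name '*.txt' -exec chmod 600 {} +
modes=$(find "$work" -type f -name '*.txt' -exec stat -c '%a' {} + | sort -u)

echo "directories: $dirs"
echo "newer than reference:"; echo "$newer"
echo "modes after chmod: $modes"

rm -rf "$work"
```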