When working with Linux servers, it is often essential to monitor the disk space usage. One of the most common tasks is to find large files and directories that occupy too much disk space. To do this, we can use various command-line tools in Linux. In this tutorial, we will discuss some of the most effective methods for locating large files and directories, sorted by largest ones first.
3 Ways to Find Large Files and Directories in Linux
Method 1: Using the du Command
The du command is a popular Linux utility that estimates file space usage. It can display the sizes of individual files or directories on a Linux system. The following command shows the size of every file and directory in the current working directory (the * glob skips hidden entries whose names begin with a dot):
$ du -sh *
Here, the -s flag indicates that we want to display only the total size of each file or directory, and the -h flag specifies that we want the output to be human-readable (in KB, MB, or GB).
However, this command lists the entries in alphabetical order. To display them sorted by size, we can pipe the output of the du command to the sort command, like this:
$ du -sh * | sort -rh
Here, the -h flag tells sort to compare human-readable sizes (so that 2K sorts below 1M, which sorts below 1G), and the -r flag reverses the order so that the largest files and directories appear at the top of the list.
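The pipeline above can be wrapped in a small helper that also picks up hidden entries and limits the output to the top few results. This is only a sketch, assuming GNU or BSD find and a sort that supports -h; the function name top_usage and its arguments are hypothetical, not part of any standard tool.

```shell
# top_usage is a hypothetical helper name, not a standard command.
# Usage: top_usage [DIR] [COUNT]
top_usage() {
    dir="${1:-.}"
    count="${2:-10}"
    # -mindepth 1 -maxdepth 1 selects every direct child of DIR, hidden or not;
    # du -sh then prints one human-readable total per entry.
    find "$dir" -mindepth 1 -maxdepth 1 -exec du -sh {} + 2>/dev/null |
        sort -rh |
        head -n "$count"
}
```

For example, top_usage /var 5 would list the five largest entries directly under /var. Using find instead of the * glob is a design choice here: it avoids depending on the shell's dotfile matching rules.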
Method 2: Using the find Command
The find command is another powerful Linux utility that can be used to locate large files and directories on a Linux system. It is a versatile command that can search for files or directories based on various criteria, such as name, size, type, and time.
To find files larger than a specific size, we can use the find command with the -size option. For example, to find all files larger than 100 MB in the current directory and its subdirectories, we can run this command:
$ find . -type f -size +100M -exec ls -lh {} \; | awk '{ print $5 ": " $NF }' | sort -rh | head
Here, the . specifies that we want to search in the current directory and its subdirectories. The -type f option specifies that we only want to find files (not directories). The -size +100M option indicates that we want to find files larger than 100 MB.
The ls -lh {} \; command is used with the -exec option to display the file details for each file found. The awk command is then used to extract the file size and name from the output, separated by a colon. Because $NF is the last whitespace-separated field, a file name that contains spaces is cut down to its final word.
Finally, the output is sorted with sort -rh rather than sort -n -r: the sizes printed by ls -lh carry unit suffixes, and a plain numeric sort would rank 150M above 2.0G. The head command then keeps the first 10 lines, which are the 10 largest files.
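When file names may contain spaces, one alternative is to have find print the byte count and path itself instead of parsing ls output. The following is a minimal sketch that assumes GNU find (for -printf) and GNU coreutils (for numfmt); the function name largest_files and its parameters are hypothetical.

```shell
# largest_files is a hypothetical helper name. It assumes GNU find (-printf)
# and GNU coreutils (numfmt); names containing newlines are not handled.
# Usage: largest_files [DIR] [MIN_SIZE] [COUNT]
largest_files() {
    dir="${1:-.}"
    min_size="${2:-100M}"
    count="${3:-10}"
    tab="$(printf '\t')"
    # %s is the size in bytes and %p the path, separated by a tab, so the
    # numeric sort sees plain byte counts and spaces in names are preserved.
    find "$dir" -type f -size +"$min_size" -printf '%s\t%p\n' 2>/dev/null |
        sort -rn |
        head -n "$count" |
        # Convert only the first tab-separated field to a human-readable size.
        numfmt --to=iec --field=1 --delimiter="$tab"
}
```

Calling largest_files . 100M 10 would cover the same case as the command above. Sorting on raw byte counts and converting to human-readable units only at the end keeps the ordering exact.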
To find large directories rather than files, replacing -type f with -type d is not enough. For a directory, the -size test measures only the directory entry itself (typically a few kilobytes), not the files inside it, so -type d -size +1G matches almost nothing. Instead, we can let find list the directories and let du apply the size filter:
$ find . -type d -exec du -sh --threshold=1G {} \; | sort -rh
Here, du -sh {} \; is used with the -exec option to compute the total size of each directory found, and --threshold=1G (a GNU du option) suppresses any directory smaller than 1 GB. The output is then sorted by size in reverse order (the -r and -h flags).
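Since du already totals everything beneath each directory it visits, a depth-limited du scan is another way to list the largest directories without running du once per directory. This is a sketch only, assuming GNU du (for --max-depth); the function name largest_dirs and its arguments are hypothetical.

```shell
# largest_dirs is a hypothetical helper name; --max-depth is a GNU du option.
# Usage: largest_dirs [DIR] [DEPTH] [COUNT]
largest_dirs() {
    dir="${1:-.}"
    depth="${2:-2}"
    count="${3:-10}"
    # du totals each directory including everything beneath it; -x keeps the
    # scan on one filesystem so other mounted volumes are not counted.
    du -xh --max-depth="$depth" "$dir" 2>/dev/null |
        sort -rh |
        head -n "$count"
}
```

For example, largest_dirs /home 1 5 would print the total for /home followed by its four largest immediate subdirectories.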
Method 3: Using the ncdu Command
The ncdu command is a useful disk space analyzer tool that can help users monitor their disk usage and find large files and directories on their Linux servers. It offers an interactive text-based user interface that displays information on all files and directories.
To install ncdu on a Debian- or Ubuntu-based system, run the following command (other distributions provide ncdu through their own package managers):
$ sudo apt-get install ncdu
Once installed, we can run the ncdu command to analyze the disk usage in a directory:
$ ncdu /path/to/directory
This will show a detailed list of all files and directories in the specified directory, sorted by size in descending order. The largest files and directories appear at the top of the list. We can use the arrow keys to navigate through the list and press Enter to explore subdirectories.
Conclusion
In this tutorial, we have learned how to find large files and directories on a Linux server, sorted by largest ones first. We have explored various command-line tools, such as du, find, and ncdu, that can help us to monitor the disk space usage of our Linux systems. By using these tools, we can effectively manage the disk space of our servers and ensure that they are running optimally.