Comandos linux: find con ejemplos

El comando linux find es uno de esos comandos que todo usuario de linux debería saber “casi por obligación” por que simplemente es muy poderoso cuando se trata de tener la información del archivo exacto que se necesita, pero precisamente lo importante es poder hacer algo con el o los archivos que se encuentran y el comando linux find es el comando correcto para esto.

Hace algún tiempo escribí sobre el comando find y una segunda parte donde explicaba con ejemplos como se usa el comando linux find para encontrar diversos tipos de archivos, como buscar con diferentes condiciones, opciones y demás.

Pero una de las características mas interesantes es poder encontrar los archivos que se buscan y hacer algo con ellos mediante tuberías o pipes y es lo que te voy a compartir en este artículo.

Sigue leyendo

The du Command

The du (i.e., disk usage) command reports the sizes of directory trees inclusive of all of their contents and the sizes of individual files. This makes it useful for tracking down space hogs, i.e., directories and files that consume large or excessive amounts of space on a hard disk drive (HDD) or other storage media.

A directory tree is a hierarchy of directories that consists of a single directory, called the parent directory or top level directory, and all levels of its subdirectories (i.e., directories within a directory). Any directory can be regarded as being the start of its own directory tree, at least if it contains subdirectories. Thus, a typical computer contains a large number of directory trees.

du is commonly employed by system administrators as a supplement to automated monitoring and notification programs that help prevent key directories and partitions (i.e., logically independent sections of a HDD) from becoming full. Full, or even nearly full, directories and partitions can cause a system to slow down, prevent users from logging in and even result in a system crash. Although visually identifying heavy consumers of disk space can be practical if there are relatively few users on a system, it is clearly not efficient for large systems with hundreds or thousands of users.

A minor limitation of du is the fact that the sizes of directories and files it reports are approximations, not exact numbers, and there is frequently a small discrepancy between these sizes and the sizes reported by other commands. However, this rarely detracts from its usefulness.

Also, du can only be used to estimate space consumption for directories and files for which the user has reading permission. Thus, an ordinary user would generally not be able to use du to determine space consumption for files or directories belonging to other users, including those belonging to the root account (i.e., the system administrator). However, as du is used mainly by system administrators, this is usually not a problem.


The basic syntax for du is:

du [options] [directories and/or files]

The items in the square brackets are optional. When used with no options or arguments (i.e., names of directories or files), du lists the names and space consumption of each of the directories (including all levels of subdirectories) in the directory tree that begins with the current directory (i.e., the directory in which the user is currently working). The space consumption of any directory consists of the space occupied by all of the files in it and all of its subdirectories at all levels inclusive of all of the files in them. A final line at the end of the report gives the total space consumption for the directory tree.

du can provide information about any directory trees or files on the system whose names are given as arguments. For example, the following will report the names and sizes for each directory in the directory tree that begins with a directory named directory2 that resides in a directory named directory1, which, in turn, is located in the current directory:

du directory1/directory2

Likewise, the following will report the sizes of the two files named file1 and file2 that are located in the /sbin directory (which contains executable programs):

du /sbin/file1 /sbin/file2

du can accept any number of arguments, and they can be any combination of files and directories. When there are multiple arguments, no grand total is provided by default, although a total is still provided for each argument.


As is the case with most commands on Linux and other Unix-like operating systems, du has a number of options, a few of which are commonly used. The options can vary somewhat according to the particular operating system and the version of du.

One of the most useful options is -h (i.e., human readable), which can make the output easier to read by displaying it in kilobytes (K), megabytes (M) and gigabytes (G) rather than just in the default kilobytes. Thus, the following command can be used to show the sizes of all the subdirectories in the current directory as well as the total size of the current directory, all formatted with the appropriate K, M or G:

du -h

The -s (for suppress or summarize) option tells du to report only the total disk space occupied by a directory tree and to suppress individual reports for its subdirectories. Thus, for example, the following would provide the total disk space occupied by the current directory in an easy-to-read format:

du -sh

The output is the same as the last line of a report issued by du with only the -h option.

The -a (i.e., all) option tells du to report not just the total disk usage for each directory at every level in a directory tree but also to report the space consumption for each individual file anywhere within the tree. Thus, for example, the following would list the name and size of every directory and file in the /etc directory (which contains system configuration files) for which the user has reading permission:

du -a /etc

A somewhat similar report is provided by using the star ( * ) wildcard, which will match any character or characters. For example, the following command would list the sizes of all directories that are in the tree that begins with the current directory:

du *

However, the only files listed are those in the the parent directory, not those in its subdirectories. Also, no total for the directory tree as a whole is provided.

The use of the -s option and the star wildcard together would cause du to report the names and sizes of only the files and directories contained directly in the top level directory itself (and to not list the names of any of its subdirectories and the files in them). The size of each listed directory is, of course, inclusive of all of its files and subdirectories (including all of the files in them). For example, such a report about the directory tree beginning with the current directory would be provided by the following:

du -hs *

The wildcard can also be used to filter the output to list only those items whose names begin with, contain or end with certain characters or sequences of characters. For example, the following would report the names and sizes of all of the directories and files in the current directory whose names begin with the letter s as well as the names and sizes of all levels of subdirectories of those directories regardless of what their names begin with:

du -h s*

The -c option can be added to provide a grand total for all of the files and directories that are listed. In the case of the above example, this would be

du -hc s*

As another example of the use of the wildcard, the following command would report the name and size of each gif (one of the two most popular image formats) file in the current directory as well as a total for all of the gifs:

du -hc *.gif

Another useful option is --max-depth=, which instructs du to list its subdirectories and their sizes to any desired level of depth (i.e., to any level of subdirectories) in a directory tree. For example, the following would cause du to list only the first tier (i.e., layer) of directories in the current directory and their sizes (inclusive of all of their contents, including those of their subdirectories):

du --max-depth=1

The total space consumption for the current directory tree will also be reported, and it will, of course, be the same regardless of the depth of the files listed.

Setting --max-depth= to zero tells du to not list any of the subdirectories within the selected directory, i.e., to list only report the size of the selected directory itself. The result is the same as using the -s option.

Using du With Filters

As is the case with other commands on Unix-like operating systems, du can be linked with pipes to filters to create powerful pipelines of commands. A filter is a (usually) small and specialized program that transforms data in some meaningful way.

For example, to arrange the output items according to size, du can be piped to the sort command, whose -n option tells it to list the output in numeric order with the smallest files first, as follows:

du | sort -n

As du will often generate more output than can fit on the monitor screen at one time, the output will fly by at high speed and be virtually unreadable. Fortunately, it is easy to display the output one screenful at a time by piping it to the less filter, for example,

du -h | less

The output of less can be advanced one screenful at a time by pressing the space bar, and it can be moved backward one screenful at a time by pressing the b key.

The output of du can likewise be piped to less after it has been passed through one or more other filters, for example,

du -h | sort -n | less

The grep filter can be used to search through du's output for any desired string (i.e., sequence of characters). Thus, for example, the following will provide a list of the names and sizes of directories and files in the current directory that contain the word linux:

du -ah | grep linux

One way in which du can be used to produce a list of (mostly) directories and files in a directory tree that are consuming large amounts of disk space is to use grep to search for all the lines that contain the upper case letter M (i.e., for megabytes) or G (for gigabytes), such as

du -ah | grep M

The only problem with this approach is that it will also select directories and files that contain an upper case M or G in their names even if the file size is not measured in megabytes or gigabytes. (However, this problem could be overcome through the use of regular expressions, an advanced pattern matching technique).

Alternatives to du

There are several other ways of monitoring disk space consumption and reporting file sizes. Although very useful tools, they are generally not good substitutes for du.

Among them is the df command, which is likewise used by system administrators to monitor disk usage. However, unlike du, it can only show the space consumption on entire partitions, and it lacks du's fine-grained ability to track the space usage of individual directories and files.

du is not designed to show the space consumption of partitions. The closest that it can come is to show the sizes of the first tier of directories in the root directory (i.e., the directory which contains all other directories and which is represented by a forward slash), several of which may be on their own partitions (depending on how the system has been set up). This is accomplished by becoming the root user and issuing the following command:

du -h --max-depth=1 /

The ls (i.e., list) command can provide the sizes of individual files by using its -s option, and its -h option (which is similar to du's -h option) can be added to make the output easier to read. For example, the following would list the names and sizes of the files in the current directory:

ls -sh

Although the names of the first tier of directories within the current directory are also listed, the size data accompanying them does not represent their actual disk space consumption (i.e., inclusive of their contents). Nor does ls report the contents of any lower tiers of directories, unless such directories are specifically listed as arguments.

A convenient alternative for finding the sizes of files and directory trees when using a GUI (graphical user interface) is to click with the right mouse button on the icon (i.e., a small picture or symbol) for that item and then select Properties from the menu that appears. Although this is frequently sufficient, it does not provide the detailed control and reporting that du provides.



$ du -sm *


$ du -sm *
1172    Descargas
68855   Documentos
4084    Escritorio
22270   Imágenes
174192  Linux
50887   Música
3088    Proyectos
1379    Trabajo
219515  Videos

$ du -bsh Videos/


du -bsh Videos/
215G    Videos/

Si sólo quisiéramos ver cuáles son, por ejemplo, los 5 directorios más pesados en nuestro /home podríamos usa du con una serie de comandos extras, por ejemplo:

$ du -sm * | sort -nr | head -5

Lo cual devolvería:

$ du -sm * | sort -nr | head -5
219515  Videos
174192  Linux
68855   Documentos
50887   Música
22270   Imágenes

Pero los valores que nos devuelven no son “tan humanos” pues están representados en MB y son más engorrosos de entender. Es por ello que ejecutamos:

$ du -hs * | sort -nr | head -5

Como crear un administrador mediante linea de comandos en linux Ubuntu?

Para poder agregar un nuevo usuario administrador (sudo), estos son los comandos a ejecutarse:

sudo adduser nuevousuario

Donde nuevousuario es el nombre del usuario que desea crear. Este comando crea el usuario, pero todo esto aún no le configura los permisos de administrador. Para dar ese permiso al usuario recién creado, seguidamente ejecute:

sudo adduser nuevousuario sudo

Esto asignara el usuario al grupo sudo, lo cual le permitirá trabajar como un administrador.

Extraer parte de un archivo tar

Supongamos que tenemos esta estructura de directorios y archivos:

|-- subdirectorio-1
|   |-- archivo-11.txt
|   `-- archivo-12.dat
|-- subdirectorio-2
|   `-- archivo-20.ogg
`-- subdirectorio-3

Hemos creado un archivo TAR que contiene este árbol, y lo hemos guardado con el nombre directorio.tar, Posteriormente queremos mostrar el contenido de ese archivo *.tar, lo hacemos de la siguiente forma:

$ tar tvf directorio.tar
drwxr-xr-x usuario/grupo   0 2008-12-05 16:51 directorio/
drwxr-xr-x usuario/grupo   0 2008-12-05 16:51 directorio/subdirectorio-3/
drwxr-xr-x usuario/grupo   0 2008-12-05 16:52 directorio/subdirectorio-1/
-rw-r--r-- usuario/grupo   0 2008-12-05 16:51 directorio/subdirectorio-1/archivo-11.txt
-rw-r--r-- usuario/grupo   0 2008-12-05 16:51 directorio/subdirectorio-1/archivo-12.dat
drwxr-xr-x usuario/grupo   0 2008-12-05 16:52 directorio/subdirectorio-2/
-rw-r--r-- usuario/grupo   0 2008-12-05 16:52 directorio/subdirectorio-2/archivo-20.ogg

Y finalmente, si lo que queremos realmente es extraer sólo uno de los subdirectorios, haremos lo siguiente:

$ tar xvf directorio.tar directorio/subdirectorio-1

En este caso hemos extraido un subdirectorio entero, pero podemos extraer archivos concretos, además podemos usar wildcards como * o ? en el comando.

¿Y que pasa con los archivos comprimidos con Gzip o con Bzip2? Pues los comandos son prácticamente los mismos, sólo tenemos que añadir la letra z en caso de *.tar.gz o la letra j en el caso de *.tar.bz2.

Comando find

Para localizar archivos con un nombre especifico desde un ruta especifica en toda la estructura interna de carpetas se utiliza:

find /home -iname 'archivo.txt'

Este comando se puede utilizar para borrar el archivo 'archivo.txt' de forma recursiva en toda la estructura de directorios que se esta especificando.

find /home -iname 'archivo.txt' | xargs rm -rfv

Tuberías (pipes)

Podríamos representar cada programa como una «caja negra» que tiene una entrada y una salida que se pueden unir entre ellos.

El ejemplo que utilizamos se encuentra esquematizado en la figura 48 siendo la entrada estándar el teclado y la salida estándar «el terminal» o por simplicidad la pantalla.


Figura 48. Esquema de entrada y salida estándar del ejemplo

Vamos a suponer un caso ficticio donde necesitamos todas las definiciones de cada palabra en un texto. Primero las ordenamos alfabéticamente, luego utilizamos un comando ficticio llamado diccionario que toma palabras de la entrada estándar y las reescribe junto a su significado en la salida estándar.

Su esquema se ve en la figura 49. En este caso nombramos por separado las entradas y salidas estándares de los dos programas, pero la «unión» entre ambos programas se puede considerar como un sólo «tubo».


Figura 49. Esquema de entrada y salida estándar del ejemplo 2

En ese tubo, el flujo está en un estado intermedio, donde está ordenado, pero no tiene las definiciones de diccionario.

En la línea de comandos esto se escribe de la siguiente manera:

$ sort | dicccionario

Donde el caracter | representa la conexión entre la salida estándar de un programa y la entrada estándar de otro.

Con este fuerte y simple concepto se pueden concatenar gran cantidad de programas como si fuera una línea de producción en serie para generar resultados complejos.

Para mejorar nuestro ejemplo sacaremos las palabras repetidas, antes de mostrarlas con definiciones. Suponiendo que exista un programa llamado sacar-repetidas, la línea de comando sería:

$ sort | sacar-repetidas | diccionario

Simple, utilizando herramientas sencillas logramos algo un poco más complicado. El inconveniente que tenemos en este ejemplo es que hay que escribir aquello a procesar. Normalmente queremos utilizar archivos como entrada de nuestros datos. Es necesario un comando que envíe a salida estándar un archivo, así se procesa como la entrada estándar del sort y continúa el proceso normalmente. Este comando es cat. La sintaxis es simple cat nombre-de-archivo.

Quedando nuestro ejemplo:

$ cat archivo.txt | sort | sacar-repetidas | diccionario

... crea un glosario de las palabras que se encuentren en archivo.txt

La combinación de comandos es incalculable y brinda posibilidades enormes. Veamos algunos ejemplos.

Ejemplo 31. Uso de Tuberías

En el caso que se quieran buscar procesos con el string http:

$ ps ax | grep http
3343 ?        S      0:00 httpd -DPERLPROXIED -DHAV
3344 ?        S      0:00 httpd -DPERLPROXIED -DHAV
3975 ?        S      0:00 httpd -DPERLPROXIED -DHAV
12342 pts/6   S      0:00 grep http

Si queremos eliminar la ultima linea podemos volver a usar grep con la opcion -v

$ ps ax | grep http | grep -v grep
3343 ?        S      0:00 httpd -DPERLPROXIED -DHAV
3344 ?        S      0:00 httpd -DPERLPROXIED -DHAV
3975 ?        S      0:00 httpd -DPERLPROXIED -DHAV

Se pueden filtrar las líneas que contengan la palabra linux del archivo arch1.txt y luego mostrarlas en un paginador como less

$ cat arch1.txt | grep linux | less

Podemos enviar los resultados por correo a un amigo,

$ cat arch1.txt | grep linux | mail