ls is one of the most common commands in a Linux shell and probably the first command that you would use when learning Linux. Let’s take a look at what ls does when we enter it in a shell:
As you can see, it prints the files and folders in the current directory (except hidden ones, whose names start with a dot). Normally, this list is sorted alphabetically. However, you can sort it the way you want by using options. The ‘man’ command gives us access to the manual page of ls within the terminal (‘man ls’). It displays all the different options, or flags, that can be combined with ls for more advanced uses.
Getting back to the question, the last part of the command is ‘*.c’, which consists of a wildcard (*) and an extension (.c). An asterisk matches any sequence of zero or more characters. Files whose names end with “.c” are called C files, so the command simply lists all C files that are present in the current directory. The ‘-l’ flag asks for the long listing format, which shows details such as permissions, owner, size and modification time for each file.
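To make the wildcard rule concrete, here is a small Python sketch that applies the pattern ‘*.c’ to a list of names. The file names and the helper match_names are made up for illustration. Python's fnmatch module follows shell-style wildcard rules, with one difference worth noting: unlike the shell, it does not treat names starting with a dot specially.

```python
import fnmatch


def match_names(names, pattern):
    """Return the names that match a shell-style wildcard pattern."""
    # fnmatchcase compares case-sensitively, as a Linux shell does.
    return [name for name in names if fnmatch.fnmatchcase(name, pattern)]


# Hypothetical directory contents, for illustration only.
names = ["main.c", "util.c", "main.h", "notes.txt", "main.c.bak", "c"]
c_files = match_names(names, "*.c")
print(c_files)  # ['main.c', 'util.c']
```

Only names that end in “.c” survive: ‘main.c.bak’ ends in something else, and ‘c’ has no dot at all.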
In the picture above, after entering ‘ls -l *.c’, 11 files with the extension ‘.c’ have been listed in the current directory:
But what really happens when we type ‘ls -l *.c’ and press Enter?
- Feeding the command to bash: After the command is typed and Enter is pressed, the keyboard driver registers the characters and pushes them to the shell as one single string. The shell then splits this string into tokens at the white space. Our command now has three tokens, ‘ls’, ‘-l’ and ‘*.c’, which are placed in an array of strings. This whole process is known as tokenization.
- Now that the command is tokenized, the shell checks whether the first token, the command name, has an alias assigned. If an alias is found, its value is substituted, split into tokens like before, and checked for aliases again. Aliases are normally defined in one of the following files: ~/.bashrc, ~/.bash_profile, /etc/bashrc, /etc/profile.
- Next, the shell expands ‘*.c’ into the names of the matching files, so the command itself never sees the wildcard. It then checks whether the command is a built-in. If it is, the shell runs the command directly, without starting another program. For example, ‘cd’ is a built-in; ‘ls’ is not, so the shell now needs to find the executable program for ‘ls’.
- Interpreting and finding the executable for the command: After that, bash looks for the command ‘ls’ in $PATH. $PATH is an environment variable that holds a colon-separated list of directories in which the shell looks for executable programs. The search goes through a series of functions such as find_user_command(), find_user_command_in_path() and find_in_path_element(). Each directory in PATH is searched for an executable that corresponds to the command ‘ls’; bash calls stat() to check whether there is a matching executable in each directory.
- Pulling the executable into memory: Once the file is located, for example at ‘/usr/bin/ls’, bash creates a child process with fork(), and the child calls execve() to run the file. Plenty of other things need to happen before the binary /usr/bin/ls is executed: the program has to be read from the disk, its binary format has to be identified, and the proper handling code has to be called to load the binary into memory.
- And eventually executing the command: Last but not least, the program is in memory and runs as soon as it gets a chance. But how does ‘ls’ read directories and files from disk? The ls utility calls a library function to read the directory contents (readdir() on Linux), which in turn invokes a system call to read the list of entries in the directory from the underlying filesystem. Because of ‘-l’, it also looks up the details of each file, which come from the file's inode. How these calls are carried out depends on the filesystem the directory lives on. As soon as all the entries have been retrieved, the system call returns, and ls formats the list and writes it to the terminal. Finally, once ls has exited, the shell prints the prompt again. The format of the prompt is stored in a shell variable called PS1.
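The shell-side steps above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not how bash is implemented: the helper names (tokenize, expand_globs, find_in_path, run) are made up for illustration, alias and built-in handling are left out, and the wildcard is expanded by the sketch itself before the program is looked up, which is also what a real shell does. It assumes a Unix-like system, since it relies on fork().

```python
import glob
import os
import shlex
import stat
import sys


def tokenize(line):
    """Split a command line into tokens at white space, honoring quotes."""
    return shlex.split(line)


def expand_globs(tokens):
    """Replace each wildcard token with the sorted names that match it.

    Like bash by default, a pattern that matches nothing is kept as-is.
    """
    expanded = []
    for token in tokens:
        has_wildcard = any(ch in token for ch in "*?[")
        matches = sorted(glob.glob(token)) if has_wildcard else []
        expanded.extend(matches or [token])
    return expanded


def find_in_path(command, path=None):
    """Return the first executable regular file for `command`, or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        # An empty PATH entry means the current directory.
        candidate = os.path.join(directory or ".", command)
        try:
            info = os.stat(candidate)  # the same check the text describes
        except OSError:
            continue
        if stat.S_ISREG(info.st_mode) and os.access(candidate, os.X_OK):
            return candidate
    return None


def run(line):
    """Tokenize, expand, look up, then fork and exec; return exit status."""
    argv = expand_globs(tokenize(line))
    if not argv:
        return 0
    executable = find_in_path(argv[0])
    if executable is None:
        print(f"{argv[0]}: command not found", file=sys.stderr)
        return 127
    sys.stdout.flush()
    pid = os.fork()
    if pid == 0:
        # Child process: replace this process image with the program.
        try:
            os.execve(executable, argv, os.environ)
        finally:
            os._exit(126)  # reached only if execve() failed
    # Parent process: wait for the child, as the shell does before
    # printing the prompt again.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else 1
```

Calling run("ls -l *.c") in a directory with C files would go through the same stages as the list above: three tokens, the wildcard replaced by file names, a PATH search for ‘ls’, and finally a child process that executes it while the parent waits.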
I hope you enjoyed reading this blog, and remember:
Logic will take you from A to Z. Imagination will take you anywhere.
— Albert Einstein