Basic UNIX Concepts and Commands

This chapter is intended to be a quick primer on unix for those who have little or no previous experience with it. We will guide the new user through some very basic and essential concepts of the unix operating system. Simple examples will be used to illustrate these concepts, and the user will learn a small set of essential unix commands in the process. We will introduce the following concepts (not necessarily in this order):
Unix is a multi-user, multi-tasking operating system with roots going back to the 1960s. It was the first complete programming environment designed by programmers to make software development easier. Although it has a reputation for being user-unfriendly and archaic, people in research and academic circles have continued to thrive on its elegance and simplicity. A brief tour of unix should be sufficient to allay the apprehensions of the novice user.

Logging in to a UNIX System

Let's start at the very beginning, by connecting and gaining access to a unix system (also known as "logging in"). You can connect to one of our machines over the network by using one of several methods. You could use a telnet or ssh client, or an X-windows client, for instance. For more information on these different methods, see How to Access Our Machines Remotely. Regardless of which method you use, at some point during the process you will be presented with a screen similar in essence to the following:

    AIX Version 5
    (C) Copyrights by IBM and by others 1982, 2000.
    login: faisal
    faisal's Password:

Access to a unix system requires a login name and a password. UNIX login names (sometimes called "usernames") and passwords are case sensitive. When you attempt to log in, the system will first prompt you for your login name and then for your password; if both are correct, the system will grant access.
_______________________________________________________________________
| Welcome to agave, a 32-cpu IBM p690 Running AIX 5.1 |
| A Component of the Texas A&M University Supercomputing Facility |
| Website: http://sc.tamu.edu Consulting: help@sc.tamu.edu or 845-0219 |
|_______________________________________________________________________|
Jun 12 ------ Abaqus 6.3
Abaqus 6.3 is now installed on k2. Please consult the man page (man
abaqus) to find out how to use the new version. Note that Abaqus 6.2
remains the default version.
Jun 27 ------ PBS queue reconfiguration
We have eliminated the PBS extra small and extra large memory queues
(*_xsm and *_xlm). The large memory queues' memory limit has been
set to 8GB.
Jul 25 ------ Scheduled downtime
Agave will be brought down for hardware repairs on the morning of
July 28 (Monday). The machine will be unavailable for use most of the
day starting at 10 am. The longer cpu-time LSF queues are being
drained starting today.
Use the "msg" command to review the notices above
................................................................................
Last login: Wed Jul 23 16:17:23 CDT 2003 on ssh from jerusalem.tamu.edu
>
Once you successfully log on, the system will print some information to the screen. This information includes important messages from the TAMU-SC staff regarding issues such as software upgrades, system problems, and scheduled downtimes. It is referred to as the "motd" or the "message of the day", and it may change from day to day or whenever the staff needs to notify users about anything. The system also prints the date and time of your last login.

After all of this, the system will present you with a "prompt". By default, the prompt will be a symbol such as '>', '%', or '$'. The appearance of the prompt can be customized by the user, however, as we will show later. The prompt indicates that the system is now ready to accept commands from the user. Every time a command completes execution, the system returns with a new prompt, awaiting further commands.

The program that writes the prompt on the screen and waits for user input is called the unix shell. The shell acts mostly as a medium through which other programs are invoked. While it has a set of built-in functions which it performs directly, most commands cause it to execute programs that are, in fact, external to the shell. So the shell is essentially an intermediary between the user and the unix operating system, and it provides certain services to make the life of the user easier and more productive.

Different types of shells are available to the user. While for the most part the specific shell used will not make a big difference to most users, certain shells are recommended for certain types of work. We'll discuss this subject in greater detail later.

Changing Passwords

If you log in to any unix system for the very first time, chances are the system will force you to change your password for security purposes, so let's talk briefly about passwords.
It is good practice to change your login password from time to time, even if the system does not enforce regular changes as a matter of policy. You can change your password by issuing the "passwd" command at the prompt once you are logged in to a system. This command will ask you for your old password, and then will ask you to enter your proposed new password twice (for verification). Note that some systems are configured to require that users change their passwords after a certain period of time has elapsed since their last password was established.

Passwords should be easy enough for you to remember that you do not feel the need to write them down on paper (that defeats the purpose of passwords). Yet they should not be simple dictionary words that can be guessed by programs designed to crack passwords. Remember, passwords are case sensitive, so "PaSSworD" and "pAssWORd" are considered two different animals! Good passwords will contain a mixture of upper and lower case letters, as well as digits and special characters ( _ , $ @ ! # etc). Never share passwords with anyone, never send them in email messages, and never give them to anyone claiming to be a system administrator!

Anatomy of a UNIX Command

Going back to our session on the unix machine which we just logged into, we notice that the shell is presenting us with a prompt and is therefore awaiting our command. UNIX commands are executed by typing them at the prompt and pressing enter. The figure below illustrates the anatomy of a simple unix command.

[Figure: anatomy of a simple unix command]

A command is a sequence of "words" separated by blanks. The first word is the name of the command to be executed. The remaining words are passed as arguments to the command that is invoked. Unix commands can be thought of as being somewhat similar to function calls in programming languages, except that their arguments are not enclosed in parentheses nor separated by commas. In the example above we specify the [...]

While many arguments to commands specify file names or user names, some arguments rather specify an optional capability of the command which the user wishes to invoke. By convention, such arguments begin with the character '-' (hyphen) and are referred to as flag arguments, or simply "flags".
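As a rough illustration of how a command line breaks into words, the sketch below defines a throwaway shell function that reports how many arguments it received and which of them look like flags. The name show_words is made up for this example, and a Bourne-family shell is assumed.

```shell
# show_words is a hypothetical helper, not a standard unix command.
show_words() {
    echo "argument count: $#"
    for word in "$@"; do
        case $word in
            -*) echo "flag: $word" ;;            # begins with a hyphen
            *)  echo "plain argument: $word" ;;  # anything else
        esac
    done
}

# The shell splits this line into four words: the command name and three arguments.
show_words -a -l /tmp
```

The function itself never sees the blanks; the shell has already separated the words before the command starts running.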
In our example with the [...]

While many unix commands and utilities perform only one or two functions rather than having a large number of hard-to-remember options, there are many others that do offer many flag arguments. It is hard to remember the options of commands which are not used very frequently, so users should consult the relevant man pages whenever they need detailed documentation.

The UNIX Filesystem

Let's see the output of an ls -al command:

    > ls -al
    total 320
    drwxr-xr-x    4 faisal   staff      1024 Aug 04 16:08 .
    drwxr-xr-x  149 bin      bin        2560 Aug 02 13:27 ..
    -rw-r--r--    1 faisal   staff        18 Nov 06  2002 .login
    drwx------    2 faisal   staff       512 May 07 10:58 .lsbatch
    -rw-r--r--    1 faisal   staff        74 Sep 26  2002 .ncbirc
    -rwxr-----    1 faisal   staff       315 Jul 21 09:46 .profile
    drwxr-xr-x    2 faisal   staff       512 Jun 10 10:48 .qt
    -rw-------    1 faisal   staff       828 Nov 06  2002 .sh_history
    -rw-r--r--    1 faisal   staff       463 Nov 06  2002 .tcshrc
    -rw-r--r--    1 faisal   staff      5040 Jun 17 16:38 WebSM.pref
    -rw-r--r--    1 faisal   staff      5719 May 07 10:58 errfile
    drwxr-xr-x    2 faisal   staff       512 Aug 05 15:42 mysubdirectory
    -rw-r--r--    1 faisal   staff      3172 May 07 10:58 outfile.2835
    -rwxr--r--    1 faisal   staff     37799 Mar 27 10:22 regatta_intro.txt
    -rw-r--r--    1 faisal   staff       184 May 07 09:29 sample.job
    -rw-r--r--    1 faisal   staff       789 Jan 14  2003 smit.log
    -rw-r--r--    1 faisal   staff         0 Jan 14  2003 smit.script
    -rw-r--r--    1 faisal   staff     20404 Sep 26  2002 test.out
    -rw-r--r--    1 faisal   staff       574 Sep 26  2002 test.txt
    -rw-r--r--    1 faisal   staff     26965 Jun 17 16:15 websm.script
    -rw-r--r--    1 faisal   staff        84 Jun 17 14:28 wsmmonitoring.data

Here, we did not explicitly specify the name of the directory whose contents we wanted to view. By default, [...]

On UNIX, all files are alike and the system does not impose any constraints on the way a file can be used. Each file is simply a sequence of bytes, whether it contains text, program source code, or executable object code. This is different from some operating systems that force the user to specify the kind of file the user intends to work with before providing access to the file, and in which files of a certain type can only be accessed in pre-defined ways. Such an operating system may store random-access files differently than it stores sequential files or database files.

Another thing to note is that, as an operating system, UNIX does not impose any file naming conventions. A file with the suffix .txt need not contain ASCII text; it could be a binary executable file. UNIX does not care. Users and application developers, however, do voluntarily adhere to sensible naming conventions to maintain sanity, even though UNIX itself does not mandate any such measures.

Let us go back to our unix prompt and once again enter the ls command:

    > ls
    WebSM.pref          outfile.2835        smit.log            test.txt
    errfile             regatta_intro.txt   smit.script         websm.script
    mysubdirectory      sample.job          test.out            wsmmonitoring.data
    >

Now, the output of the command is missing much of the detail of the previous listing because the default behavior of the [...]

Let's now move about in the file system and explore what's out there. Using the [...]

    > cd mysubdirectory
    > ls
    file1  file2  file3  file4  file5
    > pwd
    /home/faisal/mysubdirectory
    >

Having changed our current working directory to mysubdirectory, we now have a different vantage point of the filesystem. So now when we issue the [...]

UNIX has what is called a hierarchical file system. All information on disk is organized into files, and files can be grouped together into directories. Directories are like folders on a Windows operating system. The top level directory is called the root directory and is represented by the slash (/) character. Within the root directory, there are many other directories, some of which contain executable programs and utilities, others of which contain configuration files and settings, and still others of which contain user data. Since directories can be created within other directories, we can have multiple levels of directories. An illustration of such a hierarchy of files and directories looks like an upside down tree structure.

[Figure: the hierarchy of files and directories as an upside down tree]

When a user logs into a UNIX system, she is placed in her home directory. When the user wants to create a new grouping of files that belong together, she simply issues the command to create a new directory and moves those files into it. The new directory looks almost like any other file except that the user can use the files placed in that directory any time. Or the user can "move down" to the new directory: move, because the new directory becomes the new vantage point for examining files, and down, because the new directory can be thought of as being below the home directory. This process can be repeated any reasonable number of times by creating new subdirectories one level below the home directory or at deeper and deeper levels. Each directory is logically distinct from all others, so two files with identical names can reside in separate directories without causing confusion.

Each file in a UNIX filesystem has a unique location that can be specified unambiguously by its pathname. If a file called unix resides in the top-level (root) directory, then its pathname is /unix. The slash (/) character in front of the filename signifies that it will be found under the root directory.
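The directory operations described above can be tried out safely in a scratch area. The following is only a sketch: the directory names project_a and project_b and the file name notes are invented for the illustration, and the commands used (mktemp, mkdir, cd, pwd) are assumed to be available, as they are on most unix systems.

```shell
# Work in a scratch directory so nothing in the real home directory is touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Create two sibling directories, each holding a file with the same name.
mkdir project_a project_b
echo "first"  > project_a/notes
echo "second" > project_b/notes

# "Move down" into one of them; pwd now reports the new vantage point.
cd project_a || exit 1
here=$(pwd)
echo "now in: $here"

# The two files named notes are distinct because their pathnames differ.
a_text=$(cat "$scratch/project_a/notes")
b_text=$(cat "$scratch/project_b/notes")
echo "$a_text / $b_text"

# Leave and remove the scratch directory.
cd / && rm -rf "$scratch"
```

Because each file is identified by its full pathname, the two notes files never collide even though their names are identical.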
Each level of descent into the filesystem is denoted by an additional slash, so the home directory of the user faisal would be /home/faisal, two levels down from the root, and one of his files would be called /home/faisal/.login. Putting all the slashes in the filename identifies the unique path followed down from the root and thus uniquely identifies the file itself. This is known as specifying the full path name.

Another type of path is known as the relative path. For instance, if the user's current working directory is /home/faisal and he wants to print the contents of the file named "file3" one level below in the subdirectory called mysubdirectory, he can do so by issuing the following command from his current location:

    > more mysubdirectory/file3
    This is a sample file. It consists of simple ASCII text typed using the vi editor. This message was created for use in examples used to demonstrate unix commands and concepts.
    >

Here, the name of the file to be viewed on screen (using the [...]

Filename Expansion

Most filenames consist of a number of alphanumeric characters and '.'s (periods). In fact, all printing characters except '/' (slash) may appear in filenames. It is inconvenient to have most non-alphabetic characters in filenames because many of these have special meaning to the shell. The character '.' (period) is not a shell metacharacter and is often used to separate the extension of a file name from the base of the name. Thus

    prog.c prog.o prog.errs prog.output

are four related files. They share a base portion of a name (a base portion being that part of the name that is left when a trailing '.' and the following characters are stripped off). The file 'prog.c' might be the source for a C program, the file 'prog.o' the corresponding object file, the file 'prog.errs' the errors resulting from a compilation of the program, and the file 'prog.output' the output of a run of the program.
If we wished to refer to all four of these files in a command, we could use the notation

    prog.*

This expression is expanded by the shell, before the command to which it is an argument is executed, into a list of names which begin with 'prog.'. The character '*' here matches any sequence (including the empty sequence) of characters in a file name. The names which match are alphabetically sorted and placed in the argument list of the command. Thus the command

    echo prog.*

will echo the names

    prog.c prog.errs prog.o prog.output

Note that the names are in sorted order here, and a different order than we listed them above. The echo command receives four words as arguments, even though we only typed one word as an argument directly. The four words were generated by filename expansion of the one input word.

Other notations for filename expansion are also available. The character '?' matches any single character in a filename. Thus

    echo ? ?? ???

will echo a line of filenames: first those with one character names, then those with two character names, and finally those with three character names. The names of each length will be independently sorted.

Another mechanism consists of a sequence of characters between '[' and ']'. This metasequence matches any single character from the enclosed set. Thus

    prog.[co]

will match

    prog.c prog.o

in the example above. We can also place two characters around a '-' in this notation to denote a range. Thus

    chap.[1-5]

might match files

    chap.1 chap.2 chap.3 chap.4 chap.5

if they existed. This is shorthand for

    chap.[12345]

and otherwise equivalent.

An important point to note is that files with the character '.' at the beginning are treated specially. Neither '*' nor '?' nor the '[' ']' mechanism will match it. This prevents accidental matching of the filenames '.' and '..' in the working directory, which have special meaning to the system, as well as other files such as .cshrc which are not normally visible.
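The expansion rules above can be checked with a handful of empty files. This is a minimal sketch that assumes a Bourne-family shell; the scratch directory and the extra names chap.7 and .hidden are invented for the example, and the C locale is selected only to make the sort order predictable.

```shell
# Use the C locale so the sort order of expanded names is predictable.
LC_ALL=C
export LC_ALL

scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch prog.c prog.o prog.errs prog.output chap.1 chap.2 chap.7 .hidden

star=$(echo prog.*)         # every name beginning with "prog."
bracket=$(echo prog.[co])   # one character from the set {c, o}
range=$(echo chap.[1-5])    # one character in the range 1 through 5
all=$(echo *)               # names beginning with '.' are not matched

echo "$star"
echo "$bracket"
echo "$range"
echo "$all"

cd / && rm -rf "$scratch"
```

In every case echo receives the already expanded list of names; it never sees the metacharacters themselves.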
Another filename expansion mechanism gives access to the pathname of the home directory of any user. This notation consists of the character '~' (tilde) followed by a user's login name. For instance, the word '~keith' would map to the pathname '/home/keith' if the home directory for 'keith' was '/home/keith'. The '~' on its own (without any user name) expands to the home directory pathname of the user executing the command.

Quotation

We have already seen a number of metacharacters used by the shell. These metacharacters pose a problem in that we cannot use them directly as parts of words. Thus the command

    echo *

will not echo the character '*'. It will either echo a sorted list of filenames in the current working directory or, if there are no files in the working directory, print the message 'No match' (this is the C shell's behavior; Bourne-family shells pass the '*' through unchanged).

The recommended mechanism for placing a character which is neither a letter, digit, '/', '.' or '-' in an argument word to a command is to enclose it with single quotation characters ('), i.e.

    echo '*'

The character ' itself can be preceded by a single '\' to prevent its special meaning. Thus

    echo \'

prints

    '

These two mechanisms suffice to place any printing character into a word which is an argument to a shell command. They can be combined, as in

    echo \''*'

which prints

    '*

since the first \ escaped (the special meaning of) the first ' and the * was enclosed between ' characters.

File Ownership and Permissions

Since UNIX is a multi-user operating system, it implements mechanisms to regulate access to files and directories to prevent unauthorized access. We would not want a situation where other users could simply move into our home directory and view our files or execute our programs. Let us take a few moments to understand unix filesystem security. Let's go back and re-examine the detailed output of the ls command we discussed earlier:

    > ls -al
    total 320
    drwxr-xr-x    4 faisal   staff      1024 Aug 04 16:08 .
    drwxr-xr-x  149 bin      bin        2560 Aug 02 13:27 ..
    -rw-r--r--    1 faisal   staff        18 Nov 06  2002 .login
    drwx------    2 faisal   staff       512 May 07 10:58 .lsbatch
    -rw-r--r--    1 faisal   staff        74 Sep 26  2002 .ncbirc
    -rwxr-----    1 faisal   staff       315 Jul 21 09:46 .profile
    drwxr-xr-x    2 faisal   staff       512 Jun 10 10:48 .qt
    -rw-------    1 faisal   staff       828 Nov 06  2002 .sh_history
    -rw-r--r--    1 faisal   staff       463 Nov 06  2002 .tcshrc
    -rw-r--r--    1 faisal   staff      5040 Jun 17 16:38 WebSM.pref
    -rw-r--r--    1 faisal   staff      5719 May 07 10:58 errfile
    drwxr-xr-x    2 faisal   staff       512 Aug 05 15:42 mysubdirectory
    -rw-r--r--    1 faisal   staff      3172 May 07 10:58 outfile.2835
    -rwxr--r--    1 faisal   staff     37799 Mar 27 10:22 regatta_intro.txt
    -rw-r--r--    1 faisal   staff       184 May 07 09:29 sample.job
    -rw-r--r--    1 faisal   staff       789 Jan 14  2003 smit.log
    -rw-r--r--    1 faisal   staff         0 Jan 14  2003 smit.script
    -rw-r--r--    1 faisal   staff     20404 Sep 26  2002 test.out
    -rw-r--r--    1 faisal   staff       574 Sep 26  2002 test.txt
    -rw-r--r--    1 faisal   staff     26965 Jun 17 16:15 websm.script
    -rw-r--r--    1 faisal   staff        84 Jun 17 14:28 wsmmonitoring.data

The first column in the output above displays file permissions. Each file (or directory) has an associated set of protection bits (also known as mode bits) which the owner of the file can control individually. If a particular bit is enabled, its value is visible in the output; otherwise only a hyphen (-) is displayed for that bit. For each file, the first bit in this ten bit field is called the directory bit, the next three are "user" bits, the following three are "group" bits, and the final three are "other" bits. An enabled directory bit means that the file in question is a directory: the bit will have the value of "d" (a value of "l" in this bit is also used to mark files that are links to other files). The following three sets of protection bits determine how the user (the owner of the file), his or her working group (a collection of other people wishing to share file access for a project), and all other system users can access the file.
These three sets can be thought of as representing three different classes of users whose access permissions can be defined separately. By default, a user is the owner of his or her own home directory as well as of every file and subdirectory that he or she creates. On our machines, every user automatically belongs to a group called "user". The staff members of our facility belong to a group called "staff". Our system administrators can define any number of groups and give these groups new names. For instance, if Dr. Jones is collaborating with a number of users on one of the machines and would like to share files with his co-researchers over the lifetime of his gene sequencing project, we could define a new group called "genes" and make these users part of this group. The group members could then set the group permissions on their files to give other members the ability to read, modify, or execute their files.

When the read or write bit is enabled for one of the three classes of users mentioned above, a user belonging to that class is permitted read or write access. When the read bit, but not the write bit, is set, a user can read the file but cannot add to, change, or destroy it. If the write bit, but not the read bit, is set, a user has write-only permission. When the execute bit is set, it means that the file may be executed as a program because it is either object code or a shell program.

The significance of the permission bits changes slightly when applied to directories. With directories, when the write bit is enabled, it means that a user may create or delete files within that directory. When the execute bit is turned on, the user may search through the listing of files in that directory and read, write, or execute them if allowed to do so by the permission bits on the files themselves. If only the read bit is enabled on a directory, a user may only produce a simple listing of the files contained therein.
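To make the three classes of mode bits concrete, the sketch below sets the bits on a scratch file and reads them back from a long listing. It relies on chmod with a numeric mode, which the text explains in the paragraphs that follow; the file name demo.sh is invented for the example.

```shell
scratch=$(mktemp -d)
file="$scratch/demo.sh"
echo 'echo hello' > "$file"

# 7 = read+write+execute (owner), 5 = read+execute (group), 1 = execute only (other).
chmod 751 "$file"

# The first ten characters of a long listing are the directory bit and the mode bits.
mode=$(ls -l "$file" | cut -c1-10)
echo "$mode"

# Remove every permission for group and other, leaving the owner with read and write.
chmod 600 "$file"
mode_private=$(ls -l "$file" | cut -c1-10)
echo "$mode_private"

rm -rf "$scratch"
```

Reading the ten characters left to right gives the directory bit first, then the user, group, and other triplets, in the same order as the listing discussed above.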
Keep in mind that the system administrator (otherwise known as the superuser) can bypass all file protections and has access to everything.

There are two different methods for setting file permissions. They can be set "relative" to their current settings or directly, by using a numeric code. Let's examine how to assign file permissions using a numeric code (the user may refer to the man page for [...])

    chmod nnn file_name

where each n is a number between 0 and 7. Each number represents the file permissions for a class of users ("owner", "group", "other", in that order) and is constructed from the sum of values assigned to the "read", "write", and "execute" attributes as shown: read=4 write=2 execute=1. This relationship between the numeric code and the associated permissions is illustrated in the figure below.

[Figure: relationship between the numeric code and the associated permissions]

So, to assign file permissions that would allow the owner to read, write, and execute, the group to read and execute, and the other users to only execute a file, we could use the following command:

    > chmod 751 file_name

Input and Output Re-direction

Most programs need input to complete their tasks. This input could be in the form of mouse clicks on a graphical user interface that tell a program how to behave, or it could be what a user types in at the system prompt in response to questions posed by the program, or it could be information contained in a data file. Similarly, the output of a program can be sent to the screen, or to another data file, or a number of other places. On unix, commands that normally read input or write output on the terminal can also be executed with this input and/or output done to a file. This is called input (or output) re-direction.

Let's look at how unix handles I/O re-direction. Using the [...]

    > date > now

The "greater than" (>) character sends the output of the [...]

    > PS1="agave% "
    agave%

"PS1" is a shell variable (more about these later) whose value the shell displays as the prompt; we set it to the character string "agave% " (agave happens to be the name of the machine on which we are logged in during our sample session). Now let's move on and view the contents of the file "now":

    agave% cat now
    Tue Aug 19 16:03:48 CDT 2003
    agave%

The output of the date command had been saved in the file and we used the cat command to view its contents. It is important to know that the date command was unaware that its output was going to a file rather than to the terminal. The shell performed this redirection before the command began executing.
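The same experiment can be repeated in a scratch directory. This is a minimal sketch assuming a Bourne-family shell with its default settings; the second redirection is an addition of ours, included to show that > replaces whatever the file held before.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# The file "now" does not exist yet; the shell creates it before date runs.
date > now
first=$(cat now)

# Redirecting with > again replaces the previous contents rather than adding to them.
echo "replaced" > now
second=$(cat now)

echo "$first"
echo "$second"

cd / && rm -rf "$scratch"
```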
One other thing to note here is that the file 'now' need not have existed before the [...]

It is also possible to redirect the standard input of a command from a file. For instance, we could say

    agave% sort < data

to run the sort command by connecting its input with the contents of the file called data.
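A small runnable version of the input redirection above might look like the following. The three lines written into the file data are invented for the illustration.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Build a small unsorted file named data (contents invented for this example).
printf 'pear\napple\nmango\n' > data

# sort reads its standard input, which the shell has connected to the file.
sorted=$(sort < data)
echo "$sorted"

cd / && rm -rf "$scratch"
```

As with output redirection, sort does not know that its input comes from a file rather than from the keyboard.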
A most useful capability is the ability to combine the standard output of one command with the standard input of another, i.e. to run the commands in a sequence known as a pipeline. For instance the command

    agave% ls -s

normally produces a list of the files in our directory along with the size of each file. If we are interested in learning which of our files is largest we may wish to have this output sorted by size rather than by name, which is the default way in which ls sorts. We can use a couple of simple options of the sort command, combining it with ls to get what we want.
The -n option of sort specifies a numeric sort rather than an alphabetic sort. Thus

    agave% ls -s | sort -n

specifies that the output of the ls command (run with the option -s) is to be piped to the command sort (run with the numeric sort option). This would give us a sorted list of our files by size, but with the smallest first. We could then use the -r reverse sort option and the head command in combination with the previous command as follows:

    agave% ls -s | sort -n -r | head -5

Here we have taken a list of our files sorted alphabetically, each with the size in blocks. We have run this into the standard input of the sort command, asking it to sort numerically in reverse order (largest first). This output has then been run into the command head, which gives us the first few lines. In this case we have asked head for the first 5 lines. Thus this command gives us the names and sizes of our 5 largest files.
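Since the real output of ls -s depends on whatever is in the directory, the sketch below first runs the same pipeline on a fixed stand-in listing so the result is predictable. The function name listing and the sizes and file names it prints are invented for the example.

```shell
# Stand-in for "ls -s" output: a size in blocks followed by a name (values invented).
listing() {
    printf '8 small.txt\n64 big.dat\n1 tiny.cfg\n16 medium.log\n'
}

# Numeric sort, reversed so the largest comes first, then keep the top two lines.
# head -n 2 is the POSIX spelling of head -2.
top_two=$(listing | sort -n -r | head -n 2)
echo "$top_two"

# The same pipeline against the real current directory; output depends on its contents.
ls -s | sort -n -r | head -n 5
```

Each stage only reads its standard input and writes its standard output, which is why the stand-in function can take the place of ls without changing the rest of the pipeline.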
The notation introduced above is called the pipe mechanism. Commands separated by '|' characters are connected together by the shell, and the standard output of each is run into the standard input of the next. The leftmost command in a pipeline will normally take its standard input from the terminal and the rightmost will place its standard output on the terminal.

Combining Standard Output and Error

By default, any process (which can be thought of as a program in execution) has three 'channels' of communication with the outside world. Through one of these channels, it receives information; we have been referring to this channel as the "standard input". Another channel is used to send out information; we have called this the "standard output". There is a third channel which is also used to send out information from the process, but by convention it is used to send out error messages or diagnostic messages as distinguished from ordinary output; this channel is called the "standard error". The technical unix term for these channels is "streams".

Each of these I/O streams is represented internally (within the operating system) by a "file descriptor" which is referenced by an integer: standard input (often written as "stdin") is assigned descriptor 0, standard output (stdout) is descriptor 1, and standard error (stderr) is descriptor 2. Initially, all three of these descriptors are connected to the terminal by default. When a file descriptor is assigned to something other than the terminal, redirection is said to have occurred.

Until now, we have not talked about the standard error stream. From the user's perspective, it is often useful to separate output that consists of error, warning, or diagnostic messages from that which constitutes the normal output of a program.
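In Bourne-family shells the descriptor numbers appear directly in the redirection syntax, which makes the two output streams easy to tell apart. The following is a sketch under that assumption (C shell syntax differs); the function emit and the file names are made up for the illustration.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# emit is a hypothetical command that writes one line to each output stream.
emit() {
    echo "regular output"          # goes to descriptor 1 (standard output)
    echo "diagnostic message" >&2  # goes to descriptor 2 (standard error)
}

# Send the two streams to two different files.
emit > out.txt 2> err.txt
out_only=$(cat out.txt)
err_only=$(cat err.txt)

# Send both streams to the same file: 2>&1 makes descriptor 2 a copy of descriptor 1.
emit > both.txt 2>&1
both_lines=$(wc -l < both.txt | tr -d ' ')

echo "$out_only | $err_only | $both_lines lines combined"

cd / && rm -rf "$scratch"
```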
This kind of separation can allow a user to redirect the regular output of a command to a file (or to another command) and yet still see error or warning messages appear on the terminal while the command is in operation. Alternatively, it can allow the user to see standard output on the terminal but redirect the diagnostic or warning messages to a file so they do not clutter and confuse the regular output on the screen. Occasionally, there are also circumstances in which it is more convenient to combine the standard output and standard error streams into one. Such a capability would allow a user to send both relgular output as well as error messages to one and the same file simeltaneously. Let's examine how we can manipulate the standard output and error streams in the following examples: agave% cd /tmp agave% find . -name faisal -print find: 0652-081 cannot change directory to : : The file access permissions do not allow the specified action. find: 0652-081 cannot change directory to : : The file access permissions do not allow the specified action. find: 0652-081 cannot change directory to : : The file access permissions do not allow the specified action. find: 0652-081 cannot change directory to : : The file access permissions do not allow the specified action. find: 0652-081 cannot change directory to : : The file access permissions do not allow the specified action. find: 0652-081 cannot change directory to : : The file access permissions do not allow the specified action. find: 0652-081 cannot change directory to : : The file access permissions do not allow the specified action. find: 0652-081 cannot change directory to : : The file access permissions do not allow the specified action. find: 0652-081 cannot change directory to : : The file access permissions do not allow the specified action. find: 0652-081 cannot change directory to : : The file access permissions do not allow the specified action. 
find: 0652-081 cannot change directory to : : The file access permissions do not allow the specified action. ./faisal find: 0652-081 cannot change directory to : : The file access permissions do not allow the specified action. find: 0652-081 cannot change directory to : : The file access permissions do not allow the specified action. find: 0652-081 cannot change directory to : : The file access permissions do not allow the specified action. agave% Here we invoked the agave% find . -name faisal -print 2> find-errors.txt ./faisal agave% In this example, all the warning messages of the previous run have been saved in the file we
called "find-errors.txt". The symbol Had we used the If we wanted to save both the standard output and the standard error in a single file for later examination, we could do the following: agave% find . -name faisal -print > find-errors.txt 2>&1 The The examples above show how to achieve re-direction under the bourne shell family. Some diffenences in syntax will apply when attempting the same things using one of the shells in the C shell family. See the section More on I/O Re-Direction for more information. The concept of shells is discussed next. The Shell and its EnvironmentWe have mentioned that the shell is a special program used as an interface between the user and the heart of the unix operating system, a program called the kernel. The kernel is loaded into memory at boot-up and manages the system until shutdown. It creates and controls processes, manages memory, file systems, communications, etc. All other programs, including the shell program, reside out on the disk. The kernel loads those programs into memory, executes them, and cleans up the system when they terminate. The shell is a utility program that starts up when you log in and it interprets commands that are typed either at the command line (the prompt on the screen) or in a script file. New users typically spend most of their time using the shell interactively. If a user types the same set of commands on a regular basis though, he would find it useful to automate those tasks by putting the commands in a script file and allowing the shell to execute that script. A script is analagous to a DOS batch file. More sophisticated scripts contain programming constructs for making decisions, looping, file testing, etc. When executing commands from within a script, the user has in fact begun to use the shell as a programming language. The three popular and supported shells on most unix systems are the Bourne shell (written by S. Bourne
at AT&T), the C shell (written by William Joy at Berkeley), and the Korn shell (written by David Korn at AT&T).
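As a minimal sketch of the two ideas above (Bourne-family re-direction, and a script that uses a loop and a file test), the following can be saved to a file and run with sh. The directory and file names are hypothetical and chosen only for illustration; ls is used in place of find simply because it is easy to make it produce both normal output and an error message.

```shell
#!/bin/sh
# Work in a throwaway directory (all names here are illustrative assumptions).
workdir="${TMPDIR:-/tmp}/redirect-demo.$$"
mkdir -p "$workdir" && cd "$workdir" || exit 1
touch present.txt

# 1. Separate files: standard output goes to out.txt, standard error to err.txt.
#    missing.txt does not exist, so ls writes an error message for it.
ls present.txt missing.txt > out.txt 2> err.txt

# 2. One file: send standard output to both.txt, then make standard error
#    (descriptor 2) point at the same place as standard output (descriptor 1).
ls present.txt missing.txt > both.txt 2>&1

# A loop with a file test: report which result files are non-empty.
for f in out.txt err.txt both.txt; do
    if [ -s "$f" ]; then
        echo "$f has content"
    else
        echo "$f is empty"
    fi
done
```

The order matters in the second form: `> both.txt` must come before `2>&1`, because `2>&1` copies wherever descriptor 1 points at that moment.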
The Bourne shell, otherwise known simply as

To the shell, commands are generally one of four types: aliases, functions, built-in commands, or external programs. Aliases are user-defined abbreviations or nicknames for existing commands and apply only to the C shell, tcsh, and Korn shell. Functions apply only to the Bourne and Korn shells; they are user-defined groups of commands organized as separate routines. Aliases and functions are defined within the shell's memory. Built-in commands are internal routines within the shell program itself (they do not require separate processes to be executed), and executable programs reside on disk. While built-in commands and external executables are invoked directly by the user, aliases and functions need to be defined before they can be used. Read the section The Alias Command for more information on how to define aliases (a brief note about functions is also included).

Apart from interpreting commands and executing them, a shell also serves to customize a user's working environment. This is normally done by setting what are called variables. The statements which set these variables can be saved in files that are read by the shell. Such files are called shell initialization files. They typically contain definitions for terminal and window characteristics, variables that define the search path, the appearance of the prompts, the terminal type, and variables required by the shell to locate specific applications and programming libraries within the filesystem, among other things.

Variables are of two types with regard to their scope: shell and environment. Shell variables are known only to the shell in which they are defined, while environment variables are passed down to and inherited by processes created by the shell. This distinction is analogous to the one between local and global variables in common programming languages.
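The difference between the two scopes can be sketched in a Bourne-family shell as follows. Both variable names are hypothetical, made up for this illustration; the child process here is simply a second sh started from the first.

```shell
#!/bin/sh
# Hypothetical variable names, used only for illustration.
local_note="visible only here"   # shell variable: not passed to child processes
PROJECT_DIR="/tmp/project"       # starts out as a shell variable...
export PROJECT_DIR               # ...and becomes an environment variable

# Start a child shell and record what it can see. The single quotes stop the
# parent from expanding the variables, so the child expands them itself.
child_view=$(sh -c 'echo "local_note=[$local_note] PROJECT_DIR=[$PROJECT_DIR]"')

echo "parent sees: local_note=[$local_note] PROJECT_DIR=[$PROJECT_DIR]"
echo "child sees:  $child_view"
```

The child reports an empty value for the shell variable and the exported value for the environment variable. Shells in the C shell family make the same distinction with different commands (set for shell variables, setenv for environment variables).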
Some shell variables are special in that they are created automatically when the shell starts up. Apart from such shell-defined variables, there are also user-defined shell variables. A user may create a variable of any name and assign it any value. Users may also define the scope of the variables they create (in other words, they can create both shell and environment variables). For more information on how to set and read variables, read the sections entitled:

Processes

While a program can be defined as an executable file, a process is an instance of a program that is being executed by the operating system. Some operating systems use the term "task" instead of process. Operating systems that are capable of executing more than one task (process) at a time are called multi-tasking systems.

The unix kernel (the core of the operating system) provides many "access points" through which an active process can obtain services from the kernel. These are called system calls. The standard unix C library provides a C interface to each system call; as a result, the actual system calls appear as normal C functions to the programmer.

A process consists of several components: the executable program code itself (referred to as the "code" or sometimes the "text" portion of the process), the data on which the program will operate, the resources required for the execution (such as memory workspace and access to various files), and information about the state of the process. The data portion contains items such as program variables and their values. Among the resources required for execution is memory space, which can be divided into two types: heap and stack.

The heap is a portion of memory allocated dynamically (as needed, at runtime) for the use of the process. Whenever the malloc or calloc functions are used in C, for instance, they reserve space in heap memory.
The stack portion of memory, on the other hand, is used by the process to support the invocation of functions. Every time a function is called, the process reserves a portion of stack memory to store the values of the parameters passed to the function, the results it returns, and the local variables used within it. The layout of this space for a function's locally declared variables and structures is determined at compile time.

Unix is a multi-tasking system because it allows the concurrent execution of multiple processes. In such a system, the CPU switches automatically from process to process, running each for tens or hundreds of milliseconds, giving the impression that more than one process is running simultaneously. Technically, though, a single CPU can work on only one program at any given instant in time. An operating system like unix allows the successful sharing of the CPU by ensuring that each process gets a chance to run and that no process is able to modify the state of another process.

In order to manage and control processes, the operating system must know certain specific information about each process (stored in what is called the Process Control Block). This information defines the state of the process execution. It allows the operating system to temporarily de-activate the process, giving some other process an opportunity to work on the CPU, and then at some point in the future to restore the state of the de-activated process, allowing it to resume where it previously stopped. This swapping out of one process and replacing it with another is called "context-switching", since the context of one process is saved for later restoration and the context of another process is loaded and activated. In this manner, the operating system keeps cycling through its list of "concurrently" running processes, giving each one a turn at the CPU.

All processes in unix exist in a hierarchy of parent-child relationships.
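A minimal sketch of this parent-child relationship, assuming a POSIX-style shell in which the special parameter `$$` holds the process ID of the current shell and `$PPID` holds the process ID of a shell's parent. The temporary file name is a hypothetical choice for this illustration.

```shell
#!/bin/sh
# $$ expands to the process ID of the shell running this script.
parent_pid=$$

# Start a child shell and have it write its parent's process ID to a
# temporary file (the file name is illustrative only).
pid_file="${TMPDIR:-/tmp}/child_ppid.$$"
sh -c 'echo $PPID' > "$pid_file"
child_ppid=$(cat "$pid_file")
rm -f "$pid_file"

echo "this shell's process ID:         $parent_pid"
echo "parent ID reported by the child: $child_ppid"
```

When the script is run directly, the two numbers are expected to match, since the script's shell is the process that created the child shell.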
Any process that creates or spawns another process becomes the parent of the created process. The created process itself is called the child of the creating process. A process can have multiple child processes, but a child process can have only one parent process.

Every process has a unique process ID (or PID). The PID is an integer that is assigned by the kernel when the process is created. The process with PID 0 is a special kernel process called the "swapper" (sometimes called the "scheduler"), which implements the concurrent execution of multiple processes on a single CPU as mentioned above. PID 1 is also a special process, called "init", which initializes the system and makes it ready for use by users. init is considered the ancestor of all other user processes, since it creates them either directly or through its descendants. The