Chapter 3: Introduction to UNIX and the Basic Commands
This chapter describes the basic UNIX commands with a particular emphasis on
the commands necessary to create and manipulate files and directories. The purpose
is to present a working set of UNIX commands for the beginning UNIX user rather
than an in-depth description of everything. If you are already acquainted with
UNIX, you should still at least read section
"Printer-Related Commands"
because it contains important information about the local characteristics of
our environment.
UNIX Help Docs - The man command
The man command is short for ``manual'' and invokes a ``manual'' page corresponding to the command, file format, or system call specified.
If you know exactly which command you want in the manual, it's very easy to find the page you want. For example,
% man cat
will return the man page for cat.
If you are looking for information about ``copying'' things in UNIX, you can do a keyword search of the manual by invoking man with the option -k. For example,
% man -k copy
will return a list of all of the commands whose name or short description contains the word ``copy''. As this will often return a long list of names, you might find it convenient to look at one "page" of responses at a time, using less.
% man -k copy | less
This will let you look at each in turn and decide which you want to see, which you can do using the exact command name, as described above.
For detailed information about man and how to use it, you can read the man page on man:
% man man
A man page can be rather intimidating at first glance. You needn't worry about the size of the documentation. It does help to know what the most common sections of man pages are, though. The following list should prove useful.
- Name
- The name of that which is being described, usually a command, system call, library, or file format.
- Synopsis
- How to invoke the command or call, which options are available.
- Availability
- The name of the package in which the command was included. For example, man is included in a package called SUNWdoc.
- Description
- A brief description of the purpose of the tool.
- Options
- Detailed description of each of the options listed in the Synopsis section.
- Environment
- How the program's behavior might be affected by settings in user's environment.
- Files
- Which files the program uses, including configuration files.
- See Also
- Other man pages that might be of interest to someone reading the current one; related topics for further reading.
- Notes
- Random notes that the author thought worth mentioning.
- Bugs
- Any known bugs should be listed here.
Remember that the man pages will provide you with information about particular individual commands, not with specific examples of how a command can be used in conjunction with other commands to accomplish whatever task you need to accomplish. You are encouraged and expected to learn as much as you can about the various commands (using this manual, the man pages, and other sources). Do not, however, be afraid to ask questions. The consultants and operators are there to help you. Please be considerate, however, and do not expect operators or consultants to do the ``leg work'' you should be doing yourself. A properly prepared user allows the consultant to get to the heart of the problem, thus freeing up time to work with other users.
Introduction to the C Shell
The program that accepts and executes UNIX commands is the ``command interpreter'' or ``shell.'' The default shell used under Berkeley UNIX is the C shell (csh). Under System V UNIX, the most commonly used shell is the Korn shell (ksh), a variant of the Bourne Shell (sh). New accounts in the CSE department are set up with tcsh as their default shell. tcsh is an extended version of csh. Most of the tcsh commands described here also work under csh.
The philosophy of UNIX is quite simple: provide small tools that do one job and one job only, but do that job very well. In order to accomplish jobs that require several tasks to be performed, the philosophy of UNIX tells us that we should do each of these specific tasks in turn, using the result of the first task as input for the next task, and so on, until the job at hand is finished. This is accomplished largely through input/output redirection and pipelining.
Standard Input, Standard Output, and Standard Error
UNIX commands are sometimes considered ``filters'' because they generally take input and spew a modified version of that input.
If a command is supposed to use a certain input file to execute that command, and you do not specify the particular input file it should use, the command assumes it should use ``standard input'' (also written ``stdin''). Likewise, if the command commonly produces output and you do not specify to which file the output should go, it will be written to ``standard output'' (also known as ``stdout''). Further, when a command generates an error message, it will go to what is known as ``standard error'' (also known as ``stderr''). Collectively, these three files are referred to as ``standard I/O''.
For example, let's consider the case of cat. If you type the command
% cat .login
the contents of the .login file will be displayed. If you don't specify an input file, however, cat will sit there waiting for input from stdin, which is your keyboard, unless otherwise specified.
The existence of standard input and standard output gives the UNIX shell its true strength, allowing the shell to connect the standard output of one program to the standard input of another.
Redirection
As described above, the shell can redirect a program's standard output to a file. If you type
> filename
at the end of a command, the shell will send the output of the command into the file you have indicated. You can, for instance, create a file listing all of the files in your directory by redirecting the output of ls. If you redirect the output of a command to an existing file while the shell's noclobber variable is set, you will get an error message. You can force the shell to overwrite the contents of an existing file with the standard output from a command by inserting an exclamation point between the redirection symbol and the name of the file like so:
>! file1
If you want to add the output of a command to an existing file,
use >> instead of >. If you try to
append to a file that does not exist while the shell's noclobber
variable is set, you will get an error message.
You can force the shell to create a file if none exists by inserting
an exclamation point between the redirection symbol and the name of
the file. For example,
>>! foobar
will send the output to foobar, irrespective of whether foobar already exists.
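As a minimal sketch of the difference between > and >>, the following lines create a file and then append to it. They are written without the % prompt so that they can be saved and run as a script; the directory name redirect-demo and the file name log.txt are made up for illustration, and the csh-specific forms >! and >>! are deliberately left out so that the same lines also run under sh.

```shell
# Work in a scratch directory (hypothetical name) so none of your files are touched.
mkdir -p redirect-demo
rm -f redirect-demo/log.txt

# > sends standard output to a new file.
echo "first line" > redirect-demo/log.txt

# >> appends standard output to the end of the existing file.
echo "second line" >> redirect-demo/log.txt

# Show the result: both lines, in the order they were written.
cat redirect-demo/log.txt
```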
Standard error can be redirected as well. To put the standard
error messages in the same file as the standard output, use
>& instead of >, and
>>& instead of >>.
Standard input can also be redirected. Many commands will assume
that input is coming from stdin if an input file is not specified.
However, for commands such as pine that cannot check for a
file name, you can redirect the standard input from the terminal to a
file by using <. For example,
% pine bob < letter.txt
will read letter.txt into a PINE composition buffer with Bob as the recipient.
The UNIX shell connects one program's standard output to
another program's standard input through the
use of ``pipes.'' A special character (the vertical bar or
|) tells the shell to connect one command's standard
output to another command's standard input. The second command
can then have its standard output piped to a third command's
standard input and so on.
For example, suppose you have a large data file called data.1 and you want to print the first one hundred lines so you can take it home for study. You could open up the file in an editor, make a copy of it, get rid of all but the first one hundred lines, save the copy, and then print the second file. But doing so would neither be terribly efficient nor a good use of the UNIX command set.
A better option would be to use the head command. You could dump the first one hundred lines into a file by entering
% head -100 data.1 >data.tmp
to put the first 100 lines of data.1 into data.tmp so you can then use the lp command to print data.tmp. But that still isn't such a great idea, since you've got to remember to delete data.tmp when you're done, and you still had to enter two commands separately.
An even better option would be to construct a single pipeline with everything you need in it.
% head -100 data.1 | lp
Further, if you suddenly remembered that you really wanted the first two hundred lines and you do not want to waste paper by printing the first one hundred over again, you could use the tail command to print just lines 101 through 200 by entering
% head -200 data.1 | tail -100 | lp
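The pipeline above can be tried without using any paper. In this sketch, which assumes a scratch directory called pipe-demo and a generated 300-line data file in which each line holds its own line number, wc -l stands in for lp and simply counts the lines that would have been printed. The -n spelling of the line count (head -n 200) is the POSIX form of head -200.

```shell
# Build a 300-line sample file; line N contains just the number N.
mkdir -p pipe-demo
awk 'BEGIN { for (i = 1; i <= 300; i++) print i }' > pipe-demo/data.1

# Keep the first 200 lines, then the last 100 of those: lines 101 through 200.
head -n 200 pipe-demo/data.1 | tail -n 100 > pipe-demo/middle.txt

# The same pipeline with wc -l in place of lp counts 100 lines.
head -n 200 pipe-demo/data.1 | tail -n 100 | wc -l
```

Looking at the first and last lines of pipe-demo/middle.txt confirms that the selection starts at 101 and ends at 200.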
Directory and File Names Under UNIX
Special Characters
UNIX is case sensitive; that is, Lab1 is a different name
from lab1. You may use letters, digits and special
characters in UNIX names. Aside from letters and digits, the hyphen
(-) and the underscore (_) are the best
characters to use in file names. Some other characters frequently
have special meanings and you should refrain from using these
characters in your file names to avoid problems. A few examples of
things especially to avoid are characters that have special meaning
to the shell such as <, >,
&, and |. Whitespace characters
(spaces, tabs, and returns) can be difficult to use unless you have
become fairly proficient in use of the shell, so you should avoid
these.
If you should find yourself in need of accessing a file with a
special character in it, the way to do this is by escaping the
special character. This is done by typing a backslash
(\). For example, to remove a file named
strange>name, you could try
% rm strange\>name
Another option would be to quote the name of the file, such as
% rm 'strange>name'
Directories
There are a few standard names for UNIX directories and subdirectories based on the directory's contents. Table 3.1 lists the standard directory names and their corresponding contents.
| Directory | Contents |
|---|---|
| bin | Executable programs |
| doc | Miscellaneous documentation |
| include | Files containing special constant, function, and procedure declarations for use in C programs |
| lib | Libraries of various procedures and functions which can be used when writing programs. |
| man | Manual pages |
| src | Stored source code |
| tmp | Temporarily stored files |
| /usr/local | Local software installed and maintained by the systems staff |
| /usr/contrib | Local software installed and maintained by volunteers in their spare time. |
Files
On some computer systems, the name of a file consists of two parts: the file name, which identifies the file itself; and the extension, which identifies the file's type. While this is not true for UNIX systems, it is a common practice to include an extension on file names. Some programs actually make use of the extensions. For example, XEmacs checks for extensions and can activate a special editing mode for that type of file. Table 3.2 lists some of the more common extensions and what they mean.
| Extension | Definition |
|---|---|
| .text, .txt | A plain text file. |
| .h | A C header file. |
| .c | C source code. |
| .cc | C++ source code. |
| .mod | Modula-2 source code. |
| .p .pas | Pascal source code. |
| .scm | Scheme source code. |
| .lisp | Common Lisp source code. |
| .tex | TeX or LaTeX source. |
| .mm | nroff source, using mm macros. |
| .o | An object file: the intermediate step between compiling source code and linking everything together into a program. |
| .so | A shared object, a library for dynamic linking. |
| .hqx | A Macintosh file converted from binary to text for transmission over a UNIX network. |
| .Z | A file that has been run through the compress program. |
| .gz | A file compressed with gzip. |
| .tar | A set of files that have been bundled together. Commonly found in combination with another extension, such as .tar.gz, which is sometimes written .tgz. |
| .ps | A PostScript file. |
| .pdf | A Portable Document Format file. |
Globbing
The UNIX shell supports the use of patterns, expressions that will match filenames inexactly. For example, if you wanted to look at the directory entry of the foo.tex file, you could do so thusly:
% ls -l foo.tex
When you don't know the exact file name, or you want to see all of the files matching a particular pattern (e.g., all of the files ending in .tex), you can use a pattern:
% ls -l *.tex
Some have fallen victim to thinking that these are ``wildcard'' characters. Wildcards are taken from UNIX patterns, but are significantly less powerful.
The two most commonly used characters in patterns are *
and ?. * will match zero or more
characters. ? will match exactly one character. It is
possible to specify ranges of characters, groups of characters to
match, characters not to match, etc. Consult the man page for your
shell for more detailed information.
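To see how * and ? differ, the sketch below creates a few empty files in a scratch directory and lists them with each pattern. The directory name glob-demo and the file names are invented for the example; the parentheses run each listing in a subshell so that your own current directory does not change.

```shell
# Create a scratch directory with a handful of empty files.
mkdir -p glob-demo
touch glob-demo/foo.tex glob-demo/bar.tex
touch glob-demo/lab1.mod glob-demo/lab2.mod glob-demo/lab10.mod

# * matches zero or more characters: every name ending in .tex.
( cd glob-demo && ls *.tex )

# ? matches exactly one character: lab1.mod and lab2.mod,
# but not lab10.mod, which has two characters between "lab" and ".mod".
( cd glob-demo && ls lab?.mod )
```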
Directory-Related Commands
Directory Shortcuts
There are four shortcuts you can use to refer to different directories. Any user's home directory can be referred to as ~username, instead of /home/?/username. Your own home directory can be called ~. The directory you are currently in can be written as . (pronounced ``dot''). The directory one level up from your current directory (known as the parent directory) can be written as .. (pronounced ``double dot'', or ``dot-dot''). For example, if you are in the directory /class/cse321, . refers to /class/cse321, and .. refers to /class.
Creating Directories
One of the main purposes for the structure of the UNIX file system is to let the user separate files into logical groups. For example you might want to create a subdirectory to hold all the files you will be using to do a particular lab. You can create directories using the mkdir command. The command
% mkdir Lab1
will create a Lab1 subdirectory in the current directory.
Moving About the Filesystem
The cd (think ``change directory'') command lets you move from one directory to another. You can move to a subdirectory of the current directory by entering
% cd nameofsubdirectory
To move to your Lab1 subdirectory, enter
% cd Lab1
You can also use the directory's full pathname. For example, to move to /class/cse321, you would enter the command
% cd /class/cse321
The directory shortcuts mentioned previously can be used here; for example,
% cd ~/Lab1
would take you back to your Lab1 directory, and then
% cd ..
would take you back to your home directory. While cd .. only takes
you back to your home directory if your current
directory is exactly one level down from it, cd (that is,
without a directory specified) will take you back to your home
directory from anywhere.
Another useful option to cd is the dash (-).
It will take you back to the directory from which you came. For
example, if Alice is in her personal source code directory, which she
can verify with the pwd command, then changes to
/tmp, she can return to her source directory with one easy
command:
% pwd
/home/0/alice/src
% cd /tmp
% pwd
/tmp
% cd -
% pwd
/home/0/alice/src
Listing the Contents of a Directory
The ls command shows filenames that are in a directory.
Consider the following examples:
- ls
- All files in the current directory that do not begin with a period are displayed.
- ls /class/cse321
- Lists the files in some directory other than the current one, in this case, /class/cse321.
- ls -a
- The -a switch means show ``all'' files, including those that start with a dot (.).
- ls -l
- A ``long'' listing is possible by using the -l switch. It displays the same files as ls, but more information is displayed about each file.
- ls -al
- Note that switches can be combined to produce the -al switch as shown here. This will cause the actions of both switches to be followed, showing a long listing of all of the files in the directory.
- ls -alF
- Here, a third switch, -F, has been added. This uses a special character to indicate whether a file is an executable program, a directory, or an ordinary file. In such a listing, directories like Lab1, Mail and News are indicated by a / following their names, and an executable program like new-prog is indicated by a *.
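The effect of the -a and -F switches is easy to check in a scratch directory. In the sketch below, the directory ls-demo and everything in it are made up for the example; new-prog is just an empty file with its execute bit turned on, which is all ls -F looks at.

```shell
# Two subdirectories, one ordinary file, one dot file, one "program".
mkdir -p ls-demo/Lab1 ls-demo/Mail
touch ls-demo/notes.txt ls-demo/.hidden ls-demo/new-prog
chmod u+x ls-demo/new-prog

# Plain ls leaves out .hidden because its name begins with a period.
ls ls-demo

# -a adds the dot files (and . and ..); -F marks directories with /
# and executable files with *.
ls -aF ls-demo
```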
When you try to list the contents of a directory other than your own (for example, /class/cse321), you might not be able to see anything. Each directory has a set of permissions that control who can see or change its contents. You can change the permissions on your files and directories (but not other users' files and directories) using the chmod command. See the man pages for more information. If you need to access someone else's directory or file and attempts to do so return errors like ``unreadable'', you do not have permission and you need to contact the owner and have her grant you permission.
There are a number of other switches for the ls command. You should not worry about memorizing a command's switches but instead just remember that switches do exist. In case you need to know the possible switches for a command, consult the man pages for that command.
Moving and Renaming Directories
The mv command will let you move directories and subdirectories, or rename them. You can move a directory into another directory by entering:
% mv directorytobemoved newparentdirectory
You can change the name of a directory by entering:
% mv oldname newname
Finally, you can move a directory and rename it at the same time by entering:
% mv directoryname newdirectory/newname
Removing Directories
The rmdir command removes a directory. To remove a directory, you can simply enter
% rmdir nameofdirectory
This command works only if the directory is empty, that is, it has no files in it. If a directory still has files in it, you can get rid of the files by using the rm command inside the directory containing the files you want deleted. There is also a shortcut: the following command removes all files and directories in nameofdirectory, as well as the nameofdirectory directory itself.
% rm -r nameofdirectory
NOTE: Be very careful when using rm -i * and rm -r. Once files and directories have been removed they cannot be easily recovered.
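The difference between rmdir and rm -r can be tried safely on a throwaway directory. This is only a sketch: rm-demo, olddir and a.txt are invented names, and the echo is there just to make the refusal visible.

```shell
# A scratch directory containing one non-empty subdirectory.
mkdir -p rm-demo/olddir
touch rm-demo/olddir/a.txt

# rmdir refuses to remove a directory that still has files in it.
rmdir rm-demo/olddir || echo "rmdir refused: olddir is not empty"

# rm -r removes olddir together with everything inside it.
rm -r rm-demo/olddir

# Only the now-empty rm-demo directory is left.
ls -a rm-demo
```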
File-Related Commands
Creating Files
The most common method for creating files is to use an editor. Some examples of editors on the system are XEmacs, vi, and dtpad.
You can create files without using an editor by using shell
redirection as explained earlier, in the section titled
"Redirection".
Remember to observe the
difference between creating a new file (with >) and
appending to an existing file (with >>).
Copying Files
The copy command cp can be used to make a copy of a file either in the same directory or in another.
Let us assume that we want to copy a file or files from directory Lab1 to directory Lab2. You might want to do this because the new lab assignment should be built around the old assignment. If the current directory is Lab1, then the command
% cp lab1.mod ~/Lab2/lab2.mod
would make a copy of lab1.mod in directory Lab2, and would rename it lab2.mod.
If you did not want to change the name of the file, simply enter
% cp lab1.mod ~/Lab2
This puts a copy of lab1.mod into the Lab2 directory, but retains the name lab1.mod.
The command
% cp * ~/Lab2
would put copies of all the files in Lab1 into Lab2, keeping all of the names the same as the originals.
If you want to make a copy of a file that is in your current working directory, and you want the copy to stay in the current directory as well, then pathnames are unnecessary; simply use
% cp filename nameofcopy
In each of the above cases, the original files are left untouched.
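The copying cases described in this section can be reproduced in a scratch area. In this sketch the cp-demo directory stands in for your home directory, and the one-line contents of lab1.mod are made up; paths are written out in full rather than relying on the current directory.

```shell
# Lab1 holds the original file; Lab2 starts out empty.
mkdir -p cp-demo/Lab1 cp-demo/Lab2
echo "MODULE lab1;" > cp-demo/Lab1/lab1.mod

# Copy into Lab2 under a new name.
cp cp-demo/Lab1/lab1.mod cp-demo/Lab2/lab2.mod

# Copy into Lab2 keeping the original name.
cp cp-demo/Lab1/lab1.mod cp-demo/Lab2

# Lab1 still has lab1.mod; Lab2 now has lab1.mod and lab2.mod.
ls cp-demo/Lab1 cp-demo/Lab2
```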
Moving and Renaming Files
The section "Moving and Renaming Directories" explains how to use the mv command to move or to rename directories. The command can also be used to move or to rename files. You can move a file to another directory by entering
% mv filename newdirectory
You change the name of the file by entering
% mv oldfilename newfilename
Displaying Entire Files
There are a couple of simple commands that display the contents of files: cat and less. These commands allow one to display the contents of a file in the terminal window in several convenient ways. You are strongly encouraged to use these commands to preview long files before you print them out in order to save time and paper.
The cat command will display a file on the terminal window without pausing. If the file is displayed too quickly to be read, you can cause it to pause by typing C-s (remember, that's ``Control-S''). In order to resume the scrolling, type C-q. This procedure of stopping and starting can be done as many times as you want during the execution of the cat command. You can also use the scroll bar on the right of the terminal window to scroll backward. In practice, cat is used to display short files, and less is preferred for longer files. Please note that if you hit C-s while the mouse pointer is in a window that is executing a program (like less), the program will pause, and it will appear that the window is frozen when really all you need to do is hit C-q for the program to resume displaying output.
The command
% less lab1.1st
displays the first screen's worth of the contents of lab1.1st in your terminal window. The file name is displayed as a prompt at the bottom of the window. Table 3.3 shows the most common ways to respond to this prompt.
| Keystroke | Meaning |
|---|---|
| Space Bar | Scroll to the next page. |
| Return | Scroll down one line. |
| q | Quit displaying the file and return to the UNIX prompt. |
| b | Scroll back one page. |
| y | Scroll back one line. |
| g | Jump back to the start of the file. |
| G | Jump to the end of the file. |
less offers the ability to search for patterns in the file. Check the man pages for details.
Displaying Part of a File
There are two commands that can be used to display part of a file: tail and head. The tail command displays the last ten lines of a file. The head command displays the first ten lines of a file. tail makes it easy to look at just the end of an output file to see how far a program executed. It is also useful for looking at program listing files when the compiler puts the syntax errors at the end of the listing. head can be helpful if you want to confirm that the contents of a file are what you think they are, by showing you the first few lines.
For both commands you may use a switch to specify the number of
lines displayed. Some examples are shown in table 3.5.
| Command | Meaning |
|---|---|
| tail -25 lab.1st | Display the last 25 lines of lab.1st |
| head -100 lab1.mod | Display the first 100 lines of lab1.mod. |
| tail +100 lab1.mod | Display lab1.mod starting at line 100 (that is, all but the first 99 lines). |
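The line counts are easy to check on a generated file. The sketch below assumes a scratch directory part-demo and a made-up 150-line lab1.mod whose lines are labelled with their own numbers; it uses the -n spelling of the switches (head -n 3, tail -n +100), which is the POSIX form of head -3 and tail +100.

```shell
# A 150-line sample file: "line 1", "line 2", ... "line 150".
mkdir -p part-demo
awk 'BEGIN { for (i = 1; i <= 150; i++) print "line " i }' > part-demo/lab1.mod

# The first three lines and the last three lines.
head -n 3 part-demo/lab1.mod
tail -n 3 part-demo/lab1.mod

# A plus sign counts from the top of the file: output begins AT line 100.
tail -n +100 part-demo/lab1.mod | head -n 1
```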
Deleting Files
In your programming work there will be times when you will want to delete files from your directories. Files which can be recreated easily (the output from your lab programs, the intermediate files created as your lab compiles, etc.) should be deleted once you have completed the lab they are associated with. Also, if you know you will not need information in a file again, you should delete it. Finally, certain programs leave special files lying around which can also be deleted. Three of the most basic are shown in table 3.6.
| Filename | Description |
|---|---|
| filename~ | A file created by XEmacs, reflecting the previous ``version'' of filename. |
| #filename# | An XEmacs autosave file. |
| core | A memory dump produced when a program crashes. |
The rm command removes files. You can use rm to remove a single file, several specific files, or all files matching a particular pattern. If you are removing a number of files at once you should use rm -i * as described in the section "Moving About the Filesystem". In addition, rm -r removes a directory and all of its contents. You need to be very careful when using rm. The rm command automatically deletes the files it is told to delete unless you add the -i switch, which makes rm ask for confirmation before removing each file. The confirmation process helps to protect you from accidentally deleting the wrong files. If you have deleted all the files you wanted and rm is still asking you whether it should delete other files, simply type C-c instead of an n. The current file, and any others still in the queue, will be left alone.
| Files to Delete | Command |
|---|---|
| Single file | rm filename |
| Several different files | rm filename1 filename2 etc. |
| Several files all ending in .txt | rm -i *.txt |
| All files in a directory | rm -i * |
gzip
The gzip command reduces the size of the named file using Lempel-Ziv coding. Normally, the file is replaced by a file with a .gz extension. Permissions are kept the same for any file compressed using gzip. gzip will only attempt to compress regular files and will ignore symbolic links. If the new file name is too long for its file system, gzip truncates it and keeps the original file name in the compressed file. gzip will restore the original file name once you uncompress the file.
To use gzip simply enter the following in the directory containing the file you want to compress.
% gzip file
To uncompress the file enter the following in the directory containing the compressed file:
% gunzip file
You should use gzip to help stay under quota (see section "Conserving Disk Space" and section "Disk Quotas"). gzip any large files you have, especially graphics, and then gunzip the files as you need them.
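A quick round trip shows what gzip and gunzip do to a file. This is a sketch under a few assumptions: the gzip-demo directory and the file names are invented, the sample file is deliberately repetitive so that it compresses well, and cmp is used only to confirm that the restored file matches a reference copy.

```shell
# A repetitive 1000-line file, plus a reference copy for comparison later.
mkdir -p gzip-demo
awk 'BEGIN { for (i = 1; i <= 1000; i++) print "the same line again" }' > gzip-demo/big.txt
cp gzip-demo/big.txt gzip-demo/copy.txt

# gzip replaces big.txt with the much smaller big.txt.gz.
gzip gzip-demo/big.txt
ls -l gzip-demo

# gunzip restores big.txt and removes big.txt.gz.
gunzip gzip-demo/big.txt.gz
cmp gzip-demo/big.txt gzip-demo/copy.txt && echo "contents unchanged"
```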
compress
Files and directories can also be compressed using the UNIX compress and compressdir commands, respectively. A .Z extension is added to compressed files.
For more information concerning these commands, see the man pages.
chmod
The chmod command controls the permissions for your directories and for your files. This means you can use chmod to control access to whatever you have in a particular directory. chmod literally means ``change file mode.'' chmod only works with the addition of a switch or a numeric mode added to the command. The first thing you need to know is how to check permissions.
Checking Permissions
To check the permissions for the files in a directory you need to do a ls -al. Let's check the permissions of our new-prog file:
% ls -l new-prog
-rwxr-xr-x 1 cmcurtin adm 845 Sep 19 21:47 new-prog
The leftmost part of the line is the permissions section. It's broken into four parts of significance, as shown in figure 3.1.
Figure 3.1: UNIX File Permissions
The initial - indicates that this is an ordinary file. Other file types are shown in table 3.8.
| ls Output | Filetype |
|---|---|
| d | Directory |
| l | Symbolic link |
| b | Block special file |
| c | Character special file |
| p | Named pipe special file |
| - | Ordinary file |
| \| | FIFO |
The remainder of the permissions can be broken into three sets of three bits. There is a read bit (r), a write bit (w), and an execute bit (x) for each of the user (that is, the user whose login appears in the ls output), the group name specified in the ls output, and the rest of the users on the system.
If a bit is set, an appropriate letter will appear. If the bit is not set, a dash will be in that bit's spot. So our new-prog's permissions can be read as: ``a regular file; user has read, write, and execute permission; group has read and execute permission; and others have read and execute permission.''
Using chmod (Octal Modes)
One way we could get new-prog to have the permissions it now has would be to use chmod:
% chmod 755 new-prog
The chmod man page has detailed information about the use of octal modes.
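As a brief illustration of how an octal mode is put together: each digit is the sum of 4 (read), 2 (write) and 1 (execute), with one digit each for the user, the group and the others. The sketch below applies two modes to an empty file in a scratch directory and shows the result with ls -l; the directory name chmod-demo is made up, and the new-prog here is only a stand-in for a real program.

```shell
# An empty file to experiment on.
mkdir -p chmod-demo
touch chmod-demo/new-prog

# 7 = 4+2+1 (rwx) for the user, 5 = 4+1 (r-x) for the group, 5 = 4+1 (r-x) for others.
chmod 755 chmod-demo/new-prog
ls -l chmod-demo/new-prog

# 6 = 4+2 (rw-) for the user, 4 (r--) for the group, 4 (r--) for others.
chmod 644 chmod-demo/new-prog
ls -l chmod-demo/new-prog
```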
Using chmod with Switches
There is also another way to use chmod that some users find easier. Rather than specifying the mode absolutely with an octal value, it's possible to turn individual bits on and off. To turn on the read bit for the others, the command
% chmod o+r file
will do the trick. To turn off read permission for the group, try
% chmod g-r file
To set the user's permission to be read and execute, try
% chmod u=rx file
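The three commands above can be followed step by step on a scratch file. In this sketch the directory chmod-sym-demo is an invented name and the starting mode of 640 is chosen arbitrarily so that each change is visible; the comments give the permission string expected after each step.

```shell
mkdir -p chmod-sym-demo
touch chmod-sym-demo/file

chmod 640 chmod-sym-demo/file     # starting point:         rw-r-----
chmod o+r chmod-sym-demo/file     # others gain read:       rw-r--r--
chmod g-r chmod-sym-demo/file     # group loses read:       rw----r--
chmod u=rx chmod-sym-demo/file    # user set to exactly rx: r-x---r--

# The listing should now begin with -r-x---r--
ls -l chmod-sym-demo/file
```

Note that + and - change only the named bit, while = replaces all three bits for that class of user, which is why the user's write bit disappears in the last step.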
Printer-Related Commands
Printing Files
To print out a hard copy of a file, use the lp command. Do not use lp to print binary files (such as any file ending in .o or a compiled program). These files cannot be printed with lp and if you try it you will get an error message. To use lp to print out any other file enter the following at any terminal prompt:
% lp nameoffile
lp can also be used to print several files at once. The following command:
% lp file1 file2...filen
would print the indicated files out as separate parts of a single print job. The effect is the same as if you had printed the files individually, except that it is faster and only one banner page (identifying the owner of the printout) is printed.
Printouts generated by the lp command are sent to your current default printer, which is generally the printer closest to you. In the labs this should be one of the printers in that lab. On the machines available for remote use the printouts are sent to a printer in room 118 Bolz Hall. If you want your printout sent to a different printer for some reason, you can use the -d switch. For example:
% lp -dlj111a lab1.1st
would print out a copy of lab1.1st on one of the LaserJet printers in Bolz Hall room 111. If you are printing to another printer, make sure that it is a printer in a public lab to which you have access.
You should be aware that there are several switches available to
set print options. These switches are generally prefaced with a
-o option. The most notable ones are listed in table
3.9.
| Switch | Meaning |
|---|---|
| -oduplex | Causes two-sided printing if the printer is capable. |
| -o2up | Causes output to print two logical pages per physical page. This is the default for text. |
| -o4up | Causes output to print four logical pages per physical page. |
| -olandscape | Causes the page to rotate 90 degrees. |
These options can be combined (except where they're mutually exclusive, of course, such as 2up and 4up). To print four logical pages per physical page and double-sided, try
% lp -oduplex -o4up
This combination should be used when printing code as it gets you a total of eight logical pages per sheet and saves paper.
Note: output in PostScript may have embedded commands which will override any of the above options.
Viewing Print Jobs
The Short Way
The lpq command will report the queue of your default printer.
The Long Way
When the labs are extremely busy, there can be a number of people who are all waiting for printouts. The lpstat command lets you see how many print jobs there are ahead of yours, and how large they are. If you type in lpstat at a terminal window prompt, lpstat will begin to list the printer status for every printer on the network.
To use lpstat effectively you must first determine the name of your default printer. You do so by using lpstat with the -d switch. Once you have identified the name of your default printer you can then use lpstat for that specific printer.
Canceling Print Jobs
If you have accidentally sent the wrong print job, or sent a duplicate request, you can use cancel to cancel that job request. First, do an lpq for the printer to which you have sent the job(s).
Once you have the job ID, you'll need to use that as an argument to cancel. For example, if your job ID is 718, you'd cancel it thusly:
% cancel 718
Once the job has been cancelled you will be notified via email and by a message in the terminal in which you have run cancel that the job was cancelled. For more information see the man pages.
Printer Etiquette
Be sure to read the CSE Policy on UNIX Printer Usage (in Appendix C). It covers what is considered appropriate use of our printers. In addition to the official policies, we ask that you be polite to other users by following a few basic rules.
- Remember that each user's output is considered private. Do not handle another person's output more than necessary. It is very bad manners to read someone else's output deliberately while it is printing or waiting to be collected.
- Each printout can be identified by its banner page. This page contains the user's account name, the name of the file printed, the date, and the time. Each page of the output will also have a time and date stamp. When you remove your output from the printer, check to make sure you are not accidentally taking someone else's output. If someone else's output is there, separate the other person's print jobs from yours and put them on the table.
- Remember that printing costs real money. You should make every effort to minimize the number of pages you print. Never print the output files from your labs without checking their contents first.