  1.  
  2. The unix programming environment
  3. Edition 2.1, Feb 1999
  4. Mark Burgess
  5. Centre of Science and Technology
  6. Faculty of Engineering, Oslo College
  7.  
  8. Copyright (C) 1996/7 Mark Burgess Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies. Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided also that the section entitled "GNU General Public License" is included exactly as in the original, and provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one. Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that the section entitled "GNU General Public License" may be included in a translation approved by the author instead of in the original English.
  9.  
  10. Foreword
  11.  
  12. This is a revised version of the UNIX compendium which is available in printed form and online via the WWW and info hypertext readers. It forms the basis for a one or two semester course in UNIX. The most up-to-date version of this manual can be found at
  13.  
  14. http://www.iu.hioslo.no/~mark/unix.html.
  15.  
  16. It is a reference guide which contains enough to help you to find what you need from other sources. It is not (and probably can never be) a complete and self-contained work. Certain topics are covered in more detail than others. Some topics are included for future reference and are not intended to be part of an introductory course, but will probably be useful later. The chapter on X11 programming has been deleted for the time being.
  17.  
  18. Comments to Mark.Burgess@iu.hioslo.no Oslo, June 1996
  19. Welcome
  20.  
  21. If you are coming to unix for the first time, from a Windows or Macintosh environment, be prepared for a rather different culture from the one you are used to. Unix is not about `products' and off-the-shelf software; it is about open standards, free software and the ability to change just about everything.
  22.  
  23.    What you personally might perceive as user friendliness in other systems, others might perceive as annoying time wasting. Unix offers you just about every level of friendliness and unfriendliness, if you choose your programs right. In this book, we take the programmer's point of view.
  24.     Unix is about functionality, not about simplicity. Be prepared for powerful, not necessarily `simple' solutions.
  25.  
  26. You should approach Unix the way you should approach any new system: with an open mind. The journey begins...
  27. Overview
  28.  
  29. In this manual the word "host" is used to refer to a single computer system -- i.e. a single machine which has a name termed its "hostname".
  30.  
  31. What is unix?
  32.  
  33. Unix is one of the most important operating systems in use today, perhaps even the most important. Since its invention around the beginning of the 1970s it has been an object of continual research and development. UNIX is not popular because it is the best operating system one could imagine, but because it is an extremely flexible system which is easy to extend and modify. It is an ideal platform for developing new ideas.
  34.  
  35. Much of the success of UNIX may be attributed to the rapid pace of its development (a development to which all of its users have been able to contribute), its efficiency at running programs, and the many powerful tools which have been written for it over the years, such as the C programming language, make, shell, lex, yacc and many others. UNIX was written by programmers for programmers. It is popular in situations where a lot of computing power is required and for database applications, where timesharing is critical. In contrast to some operating systems, UNIX performs equally well on large scale computers (with many processors) and small computers which fit in your suitcase!
  36.  
  37. All of the basic mechanisms required of a multi-user operating system are present in UNIX. During the last few years it has become ever more popular and has formed the basis of newer, though less mature, systems like NT. One reason for this is that computers have now become powerful enough to run UNIX effectively. UNIX places burdens on the resources of a computer, since it expects to be able to run potentially many programs simultaneously.
  38.  
  39. If you are coming to UNIX from DOS you may well be used to using applications software or helpful interactive utilities to solve every problem. UNIX is not usually like this: the operating system has much greater functionality and provides the possibilities for making your own, so it is less common to find applications software which implements the same things. UNIX has long been in the hands of academics who are used to making their own applications or writing their own programs, whereas the DOS world has been driven by businesses who are willing to spend money on software. For that reason commercial UNIX software is often very expensive and therefore not available at this college. On the other hand, the flexibility of UNIX means that it is easy to write programs and it is possible to fetch gigabytes of free software from the internet to suit your needs. It may not look like what you are used to on your PC, but then you have to remember that UNIX users are a different kind of animal altogether.
  40.  
  41. Like all operating systems, UNIX has many faults. The biggest problem for any operating system is that it evolves without being redesigned. Operating systems evolve as more and more patches and hacks are applied to solve day-to-day problems. The result is either a mess which works somehow (like UNIX) or a blank refusal to change (like DOS or Macintosh). From a practical perspective, Unix is important and successful because it is a multi-process system which
  42.  
  43.    has an enormous functionality built in, and the capacity to adapt itself to changing technologies,
  44.    is relatively portable,
  45.    is good at sharing resources (but not so good at security),
  46.    has tools which are each developed to do one thing well,
  47.    allows these tools to be combined in every imaginable way, using pipes and channeling of data streams,
  48.    incorporates networking almost trivially, because all the right mechanisms are already there for providing services and sharing, building client-server pairs etc.,
  49.    is very adaptable and is often used to develop new ideas because of the rich variety of tools it possesses.
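
As a small illustration of combining single-purpose tools with pipes, here is a sketch in Bourne shell. The word list is invented for the example; the tools themselves (sort, uniq, head, awk) are standard.

```shell
# Find the most frequent word in a list by chaining simple tools.
# printf emits one word per line, sort groups equal lines together,
# uniq -c counts each group, sort -rn orders by count (largest first),
# head keeps the top line and awk prints only the word.
most_common=$(printf '%s\n' pear apple pear fig apple pear |
    sort | uniq -c | sort -rn | head -n 1 | awk '{print $2}')
echo "$most_common"
```

None of the programs knows anything about the others; each simply reads its standard input and writes its standard output.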
  50.  
  51. Unix has some problems: it is old, and it contains a lot of rubbish which no one ever bothered to throw away. Although it develops quickly (at light speed compared to either DOS or Macintosh) the user interface has been the slowest thing to change. Unix is not user-friendly for beginners; it is user-friendly for advanced users: it is made for users who know about computing. It sometimes makes simple things difficult, but above all it makes things possible!
  52.  
  53. The aim of this introduction is to
  54.  
  55.    introduce the unix system basics and user interface,
  56.    develop the unix philosophy of using and combining tools,
  57.    learn how to make new tools and write software,
  58.    learn how to understand existing software.
  59.  
  60. To accomplish this task, we must first learn something about the shell (the way in which UNIX starts programs). Later we shall learn how to solve more complex problems using Perl and C. Each of these is a language which can be used to put UNIX to work. We must also learn when to use which tool, so that we do not waste time and effort. Typical uses for these different interfaces are
  61.  
  62. shell
  63.    Command line interaction, making scripts which perform simple jobs such as running programs in batch, installing new software, simple system configuration and administration.
  64. perl
  65.    Text interpretation, text formatting, output filters, mail robots, WWW cgi (common gateway interface) scripts in forms, password testing, simple database manipulation, simple client-server applications.
  66. C
  67.    Nearly all of UNIX is written in C. Any problem which cannot be solved quickly using shell or perl can be tackled in C. One advantage is that C is a compiled language and many simple errors can be caught at compile time.
  68.  
  69. Much of UNIX's recent popularity has been a result of its networking abilities: unix is the backbone of the internet. No other widely available system could keep the internet alive today.
  70.  
  71. Once you have mastered the unix interface and philosophy you will find that i) the PC and Macintosh window environments seem to be easy to use, but simplistic and primitive by comparison; ii) UNIX is far from being the perfect operating system--it has a whole different set of problems and flaws.
  72.  
  73. The operating system of the future will not be UNIX as we see it today, nor will it be DOS or Macintosh, but one thing is for certain: it will owe a lot to the UNIX operating system and will contain many of the tools and mechanisms we shall describe below.
  74.  
  75. Flavours of unix
  76.  
  77. Unix is not a single operating system. It has branched out in many different directions since it was introduced by AT&T. The most important `fork()' in its history happened early on when the University of California, Berkeley created the BSD (Berkeley Software Distribution), adding network support and the C-shell.
  78.  
  79. Here are some of the most common implementations of unix.
  80.  
  81. BSD:
  82.    Berkeley, BSD
  83. SunOS:
  84.    Sun Microsystems, BSD/sys 5
  85. Solaris:
  86.    Sun Microsystems, Sys 5/BSD
  87. Ultrix:
  88.    Digital Equipment Corporation, BSD
  89. OSF 1:
  90.    Digital Equipment Corporation, BSD/sys 5
  91. HPUX:
  92.    Hewlett-Packard, Sys 5
  93. AIX:
  94.    IBM, Sys 5 / BSD
  95. IRIX:
  96.    Silicon Graphics, Sys 5
  97. GNU/Linux:
  98.    GNU, BSD/Posix
  99.  
  100. How to use this reference guide
  101.  
  102. This programming guide is something between a user manual and a tutorial. The information contained here should be sufficient to get you started with the unix system, but it is far from complete.
  103.  
  104. To use this programming guide, you will need to work through the basics from each chapter. You will find that there is much more information here than you need straight away, so try not to be overwhelmed by the amount of material. Use the contents and the indices at the back to find the information you need. If you are following a one-semester UNIX course, you should probably concentrate on the following:
  105.  
  106.    The remainder of this introduction
  107.    A detailed knowledge of the C shell
  108.    An appreciation of the Bourne shell
  109.    A detailed knowledge of Perl, guided by chapter 6. This chapter provides pointers on how to get started in perl. It is not a substitute for the perl book.
  110.    Everything in chapter 7 about C programming. This chapter is written in note form, since it is assumed that you know a lot about C programming already.
  111.    A sound appreciation of chapter 8 on network programming.
  112.  
  113. The only way to learn UNIX is to sit down and try it. As with any new thing, it is a pain to get started, but once you are started, you will probably come to agree that UNIX contains a wealth of possibilities, perhaps more than you had ever thought was possible or useful!
  114.  
  115. One of the advantages of the UNIX system is that the entire UNIX manual is available on-line. You should get used to looking for information in the online manual pages. For instance, suppose you do not remember how to create a new directory. You could do the following:
  116.  
  117. nexus% man -k dir
  118.  
  119. dir             ls (1)          - list contents of directories
  120. dirname         dirname (1)     - strip non-directory suffix from file name
  121. dirs            bash (1)        - bash built-in commands, see bash(1)
  122. find            find (1)        - search for files in a directory hierarchy
  123. ls              ls (1)          - list contents of directories
  124. mkdir           mkdir (1)       - make directories
  125. pwd             pwd (1)         - print name of current/working directory
  126. rmdir           rmdir (1)       - remove empty directories
  127.  
  128. The `man -k' command looks for a keyword in the manual and lists all the references it finds. The command `apropos' is completely equivalent to `man -k'. Having discovered that the command to create a directory is `mkdir' you can now look up the specific manual page on `mkdir' to find out how to use it:
  129.  
  130. man mkdir
  131.  
  132. Some but not all of the UNIX commands also have a help option which is activated with the `-h' or `--help' command-line option.
  133.  
  134. dax% mkdir --help
  135. Usage: mkdir [OPTION] DIRECTORY...
  136.  
  137.   -p, --parents     no error if existing, make parent directories as needed
  138.   -m, --mode=MODE   set permission mode (as in chmod), not 0777 - umask
  139.       --help        display this help and exit
  140.       --version     output version information and exit
  141. dax%
  142.  
  143.  
  144. NEVER-DO's in UNIX
  145.  
  146. There are some things that you should never do in UNIX. Some of these will cause you more serious problems than others. You can make your own list as you discover more.
  147.  
  148.    You should NEVER EVER switch off the power on a Unix computer unless you know what you are doing. A Unix machine is not like a PC running DOS. Even when you are not doing anything, the system is working in the background. If you switch off the power, you could interrupt the system while it is writing to the disk drive and destroy your disk. You must also remember that several users might be using the system even though you cannot see them: they do not have to be sitting at the machine, they could be logged in over the network. If you switch off the power, you might ruin their valuable work.
  149.    Once you have deleted a UNIX file using rm it is impossible to recover it! Don't use wildcards with rm without thinking quite carefully about what you are doing! It has happened to very many users throughout the history of UNIX that one tries to type
  150.  
  151.     rm *~
  152.  
  153.     but instead, by a slip of the hand, one writes
  154.  
  155.     rm * ~
  156.  
  157.     The shell then expands these arguments in turn, so that the command begins with rm * which deletes all of the files in the current directory! BE CAREFUL!
  158.    Don't ever call a program or an important file `core'. Many scripts go around deleting files called `core' because, when a program crashes, UNIX dumps the program's entire memory image to a file called `core' and these files use up a lot of disk space. If you call a file `core' it might get deleted!
  159.    Don't call test programs test. There is a UNIX command which is already called test and chances are that when you try to run your program you will start the UNIX command instead. This can cause a lot of confusion because the UNIX command doesn't seem to do very much at all!
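
One cautious habit, sketched below in Bourne shell, is to let the shell show what a wildcard expands to before rm ever sees it. The file names and the scratch directory are invented for the example, and the mktemp command is assumed to be available (it is common but not universal).

```shell
# Work in a scratch directory so that nothing of value is at risk.
scratch=$(mktemp -d) && cd "$scratch" || exit 1
touch notes.txt notes.txt~ draft.c draft.c~

# Preview the expansion with echo instead of rm.
intended=$(echo *~)      # the backup files only
echo "rm *~ would remove: $intended"
slip=$(echo *)           # what the leading '*' in 'rm * ~' expands to
echo "rm * ~ would start with: $slip"

# Once the preview looks right, run the real command.
rm -- *~
remaining=$(echo *)
echo "left over: $remaining"
```

The option `rm -i', which asks for confirmation before each removal, is another safety net while you are learning.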
  160.  
  161. What you should know before starting
  162.  
  163. One library: several interfaces
  164.  
  165. The core of unix is the library of functions (written in C) which access the system. Everything you do on a unix system goes through this set of functions. However, you can choose your own interface to these library functions. Unix has very many different interfaces to its libraries in the form of languages and command interpreters.
  166.  
  167. You can use the functions directly in C, or you can use commands like `ls', `cd' etc. These commands just provide a simple user interface to the C calls. You can also use a variety of `script' languages: C-shell, Bourne shell, Perl, Tcl, scheme. You choose the interface which solves your problem most easily.
  168.  
  169. Unix commands are files
  170.  
  171. With the exception of a few simple commands which are built into the command interpreter (shell), all unix commands and programs consist of executable files. In other words, there is a separate executable file for each command. This makes it extremely simple to add new commands to the system. One simply makes a program with the desired name and places it in the appropriate directory.
  172.  
  173. Unix commands live in special directories (usually called bin for binary files). The location of these directories is recorded in a variable called path or PATH which is used by the system to search for binaries. We shall return to this in more detail in later chapters.
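
To make this concrete, here is a minimal sketch in Bourne shell syntax of adding a new command to the system. The command name `hello' and the temporary directory are invented for the example; mktemp is assumed to be available. In the C shell the search path is set through the variable path instead.

```shell
# Make a private bin directory and put a new command in it.
# (The directory and the command name 'hello' are hypothetical.)
mybin=$(mktemp -d) || exit 1
cat > "$mybin/hello" <<'EOF'
#!/bin/sh
echo "hello from a new command"
EOF
chmod +x "$mybin/hello"

# Put the directory at the front of the search path, then run the command by name.
PATH="$mybin:$PATH"
export PATH
found=$(command -v hello)    # where the shell finds the command
greeting=$(hello)
echo "$found"
echo "$greeting"
```

Nothing else needs to be registered anywhere: an executable file in a directory on the search path is a command.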
  174.  
  175. Kernel and Shell
  176.  
  177. Since users cannot command the kernel directly, UNIX has a command language known as the shell. The word shell implies a layer around the kernel. A shell is a user interface, or command interpreter.
  178.  
  179. There are two main versions of the shell, plus a number of enhancements.
  180.  
  181. /bin/sh
  182.     The Bourne Shell. This shell is most often used for writing system scripts. It is part of the original unix system.
  183. /bin/csh
  184.     The C-shell. This was added to unix by the Berkeley workers. The commands and syntax resemble C code. C-shell is better suited for interactive work than the Bourne shell.
  185.  
  186. The program tcsh is a public-domain enhancement of the csh and is in common use. Two improved versions of the Bourne shell also exist: ksh, the Korn shell and bash, the Bourne-again shell.
  187.  
  188. Although the shells are mainly tools for typing in commands (which are executable files to be loaded and run), they contain features such as aliases, a command history, wildcard-expansions and job control functions which provide a comfortable user environment.
  189.  
  190. The role of C
  191.  
  192. Most of the unix kernel and daemons are written in the C programming language (1). Calls to the kernel and to services are made through functions in the standard C library. The commands like chmod, mkdir and cd are all C functions. The binary files of the same name /bin/chmod, /bin/mkdir etc. are just trivial "wrapper" programs for these C functions.
  193.  
  194. Until Solaris 2, the C compiler was a standard part of the UNIX operating system; thus C is the most natural language in which to program in a UNIX environment. Some tools are provided for C programmers:
  195.  
  196. dbx
  197.     A symbolic debugger. Also gdb, xxgdb, ddd.
  198. make
  199.     A development tool for compiling large programs.
  200. lex
  201.     A `lexer'. A program which generates C code to recognize words of text.
  202. yacc
  203.    A `parser'. This is a tool which generates C code for checking the syntax of groups of textual words.
  204. rpcgen
  205.     A protocol compiler which generates C code from a higher level language, for programming RPC applications.
  206.  
  207. Stdin, stdout, stderr
  208.  
  209. Unix has three logical streams or files which are always open and are available to any program.
  210.  
  211. stdin
  212.     The standard input - file descriptor 0.
  213. stdout
  214.     The standard output - file descriptor 1.
  215. stderr
  216.     The standard error - file descriptor 2.
  217.  
  218. The names are a part of the C standard library and are defined as pointers of type FILE *.
  219.  
  220. #include <stdio.h>
  221.  
  222. /* FILE *stdin, *stdout, *stderr; */
  223.  
  224. int main(void) { fprintf(stderr,"This is an error message!\n"); return 0; }
  225.  
  226. The names are `logical' in the sense that they do not refer to a particular device, or a particular place for information to come from or go. Their role is analogous to the `.' and `..' directories in the filesystem. Programs can write to these files without worrying about where the information comes from or goes to. The user can personally define these places by redirecting standard I/O. This is discussed in the next chapter.
  227.  
  228. A separate stream is kept for error messages so that error output does not get mixed up with a program's intended output.
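
The value of keeping the two output streams apart can be seen from the shell. The following is only a sketch in Bourne shell syntax (the C shell handles redirection of stderr differently); the function name report and the temporary files are invented for the example, and mktemp is assumed to be available.

```shell
# A command that writes one line to each output stream.
report() {
    echo "result: 42"                       # goes to stdout (file descriptor 1)
    echo "warning: disk nearly full" >&2    # goes to stderr (file descriptor 2)
}

out=$(mktemp) && err=$(mktemp) || exit 1

# Send the two streams to different files.
report > "$out" 2> "$err"

echo "stdout file contains: $(cat "$out")"
echo "stderr file contains: $(cat "$err")"
```

The command itself is unchanged; only the places its streams are connected to differ.
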
  229. The superuser (root) and nobody
  230.  
  231. When logged onto a UNIX system directly, the user whose name is root has unlimited access to the files on the system. root can also become any other user without having to give a password. root is reserved for the system administrator or trusted users.
  232.  
  233. Certain commands are forbidden to normal users. For example, a regular user should not be able to halt the system, or change the ownership of files (see next paragraph). These things are reserved for the root or superuser.
  234.  
  235. In a networked environment, root has no automatic authority on remote machines. This is to prevent the system administrator of one machine in Canada from being able to edit files on another in China. He or she must log in directly and supply a password in order to gain access privileges. On a network where files are often accessible in principle to anyone, the username root gets mapped to the user nobody, who has no rights at all.
  236. The file hierarchy
  237.  
  238. Unix has a hierarchical filesystem, which makes use of directories and sub-directories to form a tree. The root of the tree is called the root filesystem or `/'. Although the details of where every file is located differ for different versions of unix, some basic features are the same. The main sub-directories of the root directory together with the most important files are shown in the figure. Their contents are as follows.
  239.  
  240. `/bin'
  241.     Executable (binary) programs. On most systems this is a separate directory to /usr/bin. In SunOS, this is a pointer (link) to /usr/bin.
  242. `/etc'
  243.    Miscellaneous programs and configuration files. This directory has become very messy over the history of UNIX and has become a dumping ground for almost anything. Recent versions of unix have begun to tidy up this directory by creating subdirectories `/etc/mail', `/etc/services' etc!
  244. `/usr'
  245.     This contains the main meat of UNIX. This is where application software lives, together with all of the basic libraries used by the OS.
  246. `/usr/bin'
  247.    More executables from the OS.
  248. `/usr/local'
  249.     This is where users' custom software is normally added.
  250. `/sbin'
  251.     A special area for statically linked system binaries. They are placed here to distinguish commands used solely by the system administrator from user commands and so that they lie on the system root partition where they are guaranteed to be accessible during booting.
  252. `/sys'
  253.    This holds the configuration data which go to build the system kernel. (See below.)
  254. `/export'
  255.     Network servers only use this. This contains the disk space set aside for client machines which do not have their own disks. It is like a `virtual disk' for diskless clients.
  256. `/dev, /devices'
  257.     A place where all the `logical devices' are collected. These are called `device nodes' in unix and are created by mknod. Logical devices are UNIX's official entry points for writing to devices. For instance, /dev/console is a route to the system console, while /dev/kmem is a route for reading kernel memory. Device nodes enable devices to be treated as though they were files.
  258. `/home'
  259.     (Called /users on some systems.) Each user has a separate login directory where files can be kept. These are normally stored under /home by some convention decided by the system administrator.
  260. `/var'
  261.    System 5 and mixed systems have a separate directory for spooling. Under old BSD systems, /usr/spool contains spool queues and system data. /var/spool and /var/adm etc are used for holding queues and system log files.
  262. `/vmunix'
  263.     This is the program code for the unix kernel (see below). On HPUX systems this file is called `hp-ux'. On linux it is called `linux'.
  264. `/kernel'
  265.    On newer systems the kernel is built up from a number of modules which are placed in this directory.
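
The remark above that device nodes let devices be treated as files can be tried out harmlessly with /dev/null, which exists on essentially every unix system. This is a small sketch in Bourne shell; the variable names are chosen only for the example.

```shell
# /dev/null is a device node: the -c test is true for character devices.
if [ -c /dev/null ]; then
    kind="character device"
else
    kind="something else"
fi
echo "/dev/null is a $kind"

# It can be written to and read from with ordinary file redirection.
echo "discarded" > /dev/null
bytes=$(wc -c < /dev/null | tr -d ' ')    # reading it yields nothing
echo "bytes read back: $bytes"
```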
  266.  
  267. Every unix directory contains two `virtual' directories marked by a single dot and two dots.
  268.  
  269. ls -a
  270. .   ..
  271.  
  272. The single dot represents the directory one is already in (the current directory). The double dots mean the directory one level up the tree from the current location. Thus, if one writes
  273.  
  274. cd /usr/local
  275. cd ..
  276.  
  277. the final directory is /usr. The single dot is very useful in C programming if one wishes to read `the current directory'. Since this is always called `.' there is no need to keep track of what the current directory really is.
  278.  
  279. `.' and `..' are `hard links' to the true directories.
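
One way to see this for yourself is to compare inode numbers, which identify a file or directory uniquely within a filesystem. The sketch below, in Bourne shell, builds a throwaway tree (the names are invented, and mktemp is assumed to be available) and shows that `.', `local/..' and the directory's full name all lead to the same inode.

```shell
# Build a small tree in a scratch directory: top/usr/local
top=$(mktemp -d) || exit 1
mkdir -p "$top/usr/local"
cd "$top/usr" || exit 1

# ls -di prints the inode number of a directory itself rather than its contents.
here=$(ls -di . | awk '{print $1}')
via_child=$(ls -di local/.. | awk '{print $1}')
by_name=$(ls -di "$top/usr" | awk '{print $1}')

echo "inode of .        : $here"
echo "inode of local/.. : $via_child"
echo "inode of top/usr  : $by_name"
```

The actual numbers vary from system to system, but the three should agree.
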
  280. Symbolic links
  281.  
  282. A symbolic link is a pointer or an alias to another file. The command
  283.  
  284. ln -s fromfile /other/directory/tolink
  285.  
  286. makes the file fromfile appear to exist at /other/directory/tolink simultaneously. The file is not copied, it merely appears to be a part of the file tree in two places. Symbolic links can be made to both files and directories.
  287.  
  288. A symbolic link is just a small file which contains the name of the real file one is interested in. Opening the link opens the file it points to; the link itself cannot be opened like an ordinary file, but the name it contains may be read with the C call readlink(). See section lstat and readlink. If we remove the file a symbolic link points to, the link remains -- it just points nowhere.
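
The behaviour described above can be checked with a short experiment. This is a sketch in Bourne shell using a scratch directory (mktemp is assumed to be available); the file names follow the ln example above.

```shell
cd "$(mktemp -d)" || exit 1
echo "real contents" > fromfile
mkdir other
ln -s "$PWD/fromfile" other/tolink     # absolute target, so it resolves from anywhere

via_link=$(cat other/tolink)           # reading through the link reads fromfile
echo "$via_link"

rm fromfile                            # remove the real file
# -L is true if the name is a symbolic link; -e follows the link to its target.
if [ -L other/tolink ] && [ ! -e other/tolink ]; then
    state="dangling"
else
    state="intact"
fi
echo "link is $state"
```

An absolute name is used for the target here because a relative name is stored as written and is resolved relative to the directory containing the link, not the directory in which ln was run.
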
  289. Hard links
  290.  
  291. A hard link is an additional directory entry (a name) which refers to the same inode as the original file name, and which is in every way equivalent to it. If a file has a second hard link, removing one name does not remove the file: the data remain until the last link is removed. If a file has n hard links, all of them must be removed before the file itself is removed. The number of hard links to a file is stored in the filesystem index node (inode) for the file.
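
The link count can be watched from the shell. The following sketch in Bourne shell uses invented file names in a scratch directory (mktemp is assumed to be available, and the filesystem is assumed to support hard links).

```shell
cd "$(mktemp -d)" || exit 1
echo "payload" > original
ln original second                     # a second name for the same file

# The second field of ls -l is the number of hard links to the file.
links_before=$(ls -l original | awk '{print $2}')

rm original                            # removes one name only
links_after=$(ls -l second | awk '{print $2}')
survivor=$(cat second)                 # the data are still there

echo "links before: $links_before, after: $links_after, data: $survivor"
```

Contrast this with a symbolic link, which is left pointing nowhere when the file it names is removed.
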
  292. Getting started
  293.  
  294. If you have never met unix, or another multiuser system before, then you might find the idea daunting. There are several things you should know.
  295. Logging in
  296.  
  297. Each time you use unix you must log on to the system by typing a username and a password. Your login name is sometimes called an `account' because some unix systems implement strict quotas for computer resources which have to be paid for with real money(2).
  298.  
  299.  
  300.   login: mark
  301.   password:
  302.  
  303. Once you have typed in your password, you are `logged on'. What happens then depends on what kind of system you are logged onto and how. If you have a colour monitor and keyboard in front of you, with a graphical user interface, you will see a number of windows appear, perhaps a menu bar. You then use a mouse and keyboard just like any other system.
  304.  
  305. This is not the only way to log onto unix. You can also log in remotely, from another machine, using the telnet or rlogin programs. If you use these programs, you will normally only get a text or command line interface (though graphics can still be arranged).
  306.  
  307. Once you have logged in, a short message will be printed (called Message of the Day or motd) and you will see the C-shell prompt: the name of the host you are logged onto followed by a percent sign, e.g.
  308.  
  309.  
  310.  SunOS Release 5.5 Version Generic [UNIX(R) System V Release 4.0]
  311.  Copyright (c) 1983-1995, Sun Microsystems, Inc.
  312.  
  313.  Please report problems to sysadm@iu.hioslo.no
  314.  
  315.  dax%
  316.  
  317. Remember that every unix machine is a separate entity: it is not like logging onto a PC system where you log onto the `network' i.e. the PC file server. Every unix machine is a server. The network, in unix-land, has lots of players.
  318.  
  319. The first thing you should do once you have logged on is to set a reliable password. A poor password might be okay on a PC which is not attached to a large network, but once you are attached to the internet, you have to remember that the whole world will be trying to crack your password. Don't think that no one will bother: some people really have nothing better to do. A password should not contain any word that could be in a list of words (in any language), or be a simple concatenation of a word and a number (e.g. mark123). It takes seconds to crack such a password. Choose instead something which is easy to remember. Feel free to use the PIN from your banker's card in your password (e.g. Ma9876rk)! This will leave you with fewer things to remember. Passwords can be up to eight characters long.
  320.  
  321. Some sites allow you to change your password anywhere. Other sites require you to log onto a special machine to change your password:
  322.  
  323.  
  324. dax%
  325. dax% passwd
  326. Change your password on host nexus
  327. You cannot change it here
  328. dax% rlogin nexus
  329. password: ******
  330.  
  331. nexus% passwd
  332. Changing password for mark
  333. Enter login password: ********
  334. Enter new password: ********
  335. Reenter new passwd: ********
  336.  
  337. You will be prompted for your old password and your new password twice. If your network is large, it might take the system up to an hour or two to register the change in your password, so don't forget the old one right away!
  338. Mouse buttons
  339.  
  340. Unix has three mouse buttons. On some PCs running GNU/Linux or some other PC unix, there are only two, but the middle mouse button can be simulated by pressing both mouse buttons simultaneously. The mouse buttons have the following general functions. They may also have additional functions in special software.
  341.  
  342. index finger
  343.    This is used to select and click on objects. It is also used to mark out areas and copy by dragging. This is the button you normally use.
  344. middle finger
  345.    Used to pull down menus. It is also used to paste a marked area somewhere at the mouse position.
  346. outer finger
  347.    Pulls down menus.
  348.  
  349. On a left-handed system right and left are reversed.
  350. E-mail
  351.  
  352. Reading electronic mail on unix is just like on any other system, but there are many programs to choose from. There are very old programs from the seventies such as
  353.  
  354. mail
  355.  
  356. and there are fully graphical mail programs such as
  357.  
  358. tkrat
  359. mailtool
  360.  
  361. Choose the program you like best. Not all of the programs support modern multimedia extensions because of their age. Some programs like tkrat have immediate mail notification alerts. To start a mail program you just type its name. If you have an icon-bar, you can click on the mail-icon.
  362. Simple commands
  363.  
  364. Inexperienced computer users often prefer to use file-manager programs to avoid typing anything. With a mouse you can click your way through directories and files without having to type anything (e.g. the fmgr or tkdesk programs). More experienced users generally find this to be slow and tedious after a while and prefer to use written commands. Unix has many short cuts and keyboard features which make typed commands extremely fast and much more powerful than use of the mouse.
  365.  
  366. If you come from a DOS environment, the unix commands can be a little strange. Because they stem from an era when keyboards had to be hit with hammer force, and machines were very slow, the command names are generally as short as possible, so they seem pretty cryptic. Some familiar ones which DOS borrowed from unix include,
  367.  
  368. cd
  369. mkdir
  370.  
  371. which change to a new directory and make a new directory respectively. To list the files in the current directory you use,
  372.  
  373. ls
  374.  
  375. To rename a file, you `move' it:
  376.  
  377. mv old-name new-name
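
The commands above can be tried out safely in a scratch directory. The following is only a sketch in the Bourne shell: the names notes, draft.txt and final.txt are made up for the illustration, and it assumes that the mktemp and touch programs are installed, as they are on most current systems.

```shell
# Work in a scratch directory so that nothing important is touched.
scratch=$(mktemp -d)      # $(...) inserts the output of a command
cd "$scratch"

mkdir notes               # make a new directory
cd notes                  # change to it
touch draft.txt           # create an empty file to play with
ls                        # list the files: prints draft.txt
mv draft.txt final.txt    # rename the file by "moving" it
listing=$(ls)             # keep the new listing in a variable
echo "$listing"           # prints final.txt
```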
  378.  
  379. Text editing and word processing
  380.  
Text editing is one of the things which people spend most time doing on any computer. It is important to distinguish text editing from word processing. On a PC or Macintosh, you are perhaps used to Word or WordPerfect for writing documents.
  382.  
Unix has a Word-like program called lyx, but for the most part Unix users do not use word processors. It is more common in the unix community to write all documents, regardless of whether they are letters, books or computer programs, using a non-formatting text editor. (Unix word processors like Framemaker do exist, but they are very expensive. A version of MS-Word also exists for some unices.) Once you have written a document in a normal text editor, you call up a text formatter to make it pretty. You might think this strange, but the truth of the matter is that this two-stage process gives you the most power and flexibility--and that is what most unix folks like.
  384.  
  385. For writing programs, or anything else, you edit a file by typing:
  386.  
  387.  
  388.  emacs myfile
  389.  
emacs is one of dozens of text editors. It is not the simplest or most intuitive, but it is almost certainly the most powerful text editor that exists on any system, and if you are going to spend time learning an editor, it wouldn't do any harm to make it this one. You could also click on the emacs icon if you are relying on a window system. Emacs is not a word processor and it is not for formatting printed documents, but it can be linked to almost any other program in order to format and print text. It contains a powerful programming language and has many intelligent features. We shall not go into the details of document formatting in this book, but only mention that programs like troff and TeX or LaTeX are used for this purpose to obtain typeset-quality printing. Text formatting is an area where Unix folks do things differently to PC folks.
  391. The login environment
  392.  
  393. Unix began as a timesharing mainframe system in the seventies, when the only terminals available were text based teletype terminals or tty-s. Later, the Massachusetts Institute of Technology (MIT) developed the X-windows interface which is now a standard across UNIX platforms. Because of this history, the X-window system works as a front end to the standard UNIX shell and interface, so to understand the user environment we must first understand the shell.
  394. Shells
  395.  
  396. A shell is a command interpreter. In the early days of unix, a shell was the only way of issuing commands to the system. Nowadays many window-based application programs provide menus and buttons to perform simple commands, but the UNIX shell remains the most powerful and flexible way of interacting with the system.
  397.  
  398. After logging in and entering a password, the unix process init starts a shell for the user logging in. Unix has several different kinds of shell to choose from, so that each user can pick his/her favourite command interface. The type of shell which the system starts at login is determined by the user's entry in the passwd database. On most systems, the standard login shell is a variant of the C-shell.
  399.  
  400. Shells provide facilities and commands which
  401.  
  402.    Start and stop processes (programs)
  403.    Allow two processes to communicate through a pipe
  404.    Allow the user to redirect the flow of input or output
  405.    Allow simple command line editing and command history
  406.    Define aliases to frequently used commands
   Define global "environment" variables which are used to configure the default behaviour of a variety of programs. These lie in an "associative array" for each process and may be seen with the `env' command. Environment variables are inherited by all processes which are started from a shell.
   Provide wildcard expansion (joker notation) of filenames using `*,?,[]'
  409.    Provide a simple script language, with tests and loops, so that users can combine system programs to create new programs of their own.
  410.    Change and remember the location of the current working directory, or location within the file hierarchy.
  411.  
  412. The shell does not contain any more specific functions--all other commands, such as programs which list files or create directories etc., are executable programs which are independent of the shell. When you type `ls', the shell looks for the executable file called `ls' in a special list of directories called the command path and attempts to start this program. This allows such programs to be developed and replaced independently of the actual command interpreter.
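
Two of the facilities listed above, pipes and redirection, can be sketched in a few lines of Bourne shell. This is only an illustration: the file name fruit.txt and its contents are invented, and mktemp is assumed to be available for making a scratch directory.

```shell
scratch=$(mktemp -d)      # a scratch directory; $(...) inserts a command's output
cd "$scratch"

# Output redirection: send three lines to a file instead of the screen.
printf 'pear\napple\nfig\n' > fruit.txt

# Pipe: the output of sort becomes the input of head.
first=$(sort fruit.txt | head -n 1)
echo "First in alphabetical order: $first"    # prints: First in alphabetical order: apple

# Input redirection: wc reads the file on its standard input.
lines=$(wc -l < fruit.txt)
echo "Number of lines:" $lines                # prints: Number of lines: 3
```

Note that sort, head and wc are ordinary programs found through the command path; only the symbols |, > and < are interpreted by the shell itself.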
  413.  
  414. Each shell which is started can be customized and configured by editing a setup file. For the C-shell and its variants this file is called `.cshrc', and for the Bourne shell and its variants it is called `.profile'. (Note that files which begin with leading dots are not normally visible with the `ls' command. Use `ls -a' to view these.) Any commands which are placed in these files are interpreted by the shell before the first command prompt is issued. These files are typically used to define a command search path and terminal characteristics.
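
As a rough sketch, a Bourne shell setup file which defines a search path and a terminal type might contain lines like the following. The directory $HOME/bin and the terminal type vt100 are only assumptions for the example; use whatever suits your own system.

```shell
# Sketch of a Bourne shell setup file ($HOME/.profile).

PATH=$PATH:$HOME/bin      # add a private directory to the command search path
TERM=vt100                # tell programs what kind of terminal is in use
export PATH TERM          # make both visible to programs started from the shell
```

In a `.cshrc' file the corresponding C-shell commands are set (for example set path = ( $path ~/bin )) and setenv (for example setenv TERM vt100).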
  415.  
On each new command line you can use the cursor keys to edit the line. The up-arrow browses back through earlier commands. CTRL-a takes you to the start of the line. CTRL-e takes you to the end of the line. The TAB key can be used to save typing with the `completion' facility. See section Command/filename completion.
  417.  
  418. Shell commands generally
  419.  
Shell commands are commands like cp, mv, passwd, cat, more, less, cc, grep, ps etc.
  421.  
Very few commands are actually built into the shell command line interpreter, in the way that they are in DOS -- commands are mostly programs which exist as files. When we type a command, the shell searches for a program with the same name and tries to execute it. The file must be executable, or a Command not found error will result. To see what actually happens when you type a command like gcc, try typing the following C-shell commands directly into a C-shell. (We shall discuss these commands soon.) The text after each # is a comment for explanation only; leave it out when typing at the prompt, because the C-shell recognizes comments only in scripts.
  423.  
  424.   foreach dir ( $path )       # for every directory in the list path
  425.     if ( -x $dir/gcc ) then   # if the file is executable
  426.       echo Found $dir/gcc     # Print message found!
  427.       break                   # break out of loop
  428.     else
  429.       echo Searching $dir/gcc
  430.     endif
  431.   end
  432.  
  433. The output of this command is something like
  434.  
  435.   Searching /usr/lang/gcc
  436.   Searching /usr/openwin/bin/gcc
  437.   Searching /usr/openwin/bin/xview/gcc
  438.   Searching /physics/lib/framemaker/bin/gcc
  439.   Searching /physics/motif/bin/gcc
  440.   Searching /physics/mutils/bin/gcc
  441.   Searching /physics/common/scripts/gcc
  442.   Found /physics/bin/gcc
  443.  
  444. If you type
  445.  
  446.   echo $path
  447.  
  448. you will see the entire list of directories which are searched by the shell. If we had left out the `break' command, we might have discovered that UNIX often has several programs with the same name, in different directories! For example,
  449.  
  450. /bin/mail
  451. /usr/ucb/mail
  452. /bin/Mail
  453.  
  454. /bin/make
  455. /usr/local/bin/make.
  456.  
  457. Also, different versions of unix have different conventions for placing the commands in directories, so the path list needs to be different for different types of unix machine. In the C-shell a few basic commands like cd and kill are built into the shell (as in DOS).
  458.  
  459. You can find out which directory a command is stored in using the
  460.  
  461. which
  462.  
  463. command. For example
  464.  
  465. nexus% which cd
  466. cd: shell built-in command.
  467. nexus% which cp
  468. /bin/cp
  469. nexus%
  470.  
  471. which only searches the directories in $path and quits after the first match, so if there are several commands with the same name, you will only see the first of them using which.
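
To see every match rather than just the first, the search loop shown earlier can be repeated without the `break'. Here is a sketch of the same idea in the Bourne shell, where the search path is the colon-separated variable PATH rather than the list $path. The command name ls is just an example, and an empty entry in PATH (which means the current directory) is not treated specially here.

```shell
# List every directory in PATH that contains an executable called $cmd.
cmd=ls                    # the command to look for (an example)
matches=0
old_ifs=$IFS
IFS=:                     # split $PATH at the colons
for dir in $PATH; do
  if [ -x "$dir/$cmd" ] && [ ! -d "$dir/$cmd" ]; then
    echo "Found $dir/$cmd"
    matches=$((matches + 1))
  fi
done
IFS=$old_ifs              # restore normal word splitting
echo "$matches match(es) for $cmd"
```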
  472.  
  473. Finally, in the C-shell, the which command is built in. In the Bourne shell it is a program:
  474.  
  475. nexus% which which
  476. which: shell built-in command.
  477. nexus% sh
  478. $ which which
  479. /bin/which
  480. $ exit
  481. nexus%
  482.  
  483. Take a look at the script /usr/ucb/which. It is a script written in the C-shell.
  484.  
  485. Environment and shell variables
  486.  
  487. Environment variables are variables which the shell keeps. They are normally used to configure the behaviour of utility programs like lpr (which sends a file to the printer) and mail (which reads and sends mail) so that special options do not have to be typed in every time you run these programs.
  488.  
Any program can read these variables to find out how you have configured your working environment. We shall meet these variables frequently. Here are some important variables:
  490.  
  491. PATH             # The search path for shell commands (sh)
  492. TERM             # The terminal type (sh and csh)
  493. DISPLAY          # X11 - the name of your display
  494. LD_LIBRARY_PATH  # Path to search for object and shared libraries
  495. HOST             # Name of this unix host
  496. PRINTER          # Default printer (lpr)
  497. HOME             # The path to your home directory (sh)
  498.  
  499. path             # The search path for shell commands (csh)
  500. term             # The terminal type (csh)
  501. noclobber        # See below under redirection
  502. prompt           # The default prompt for csh
  503. home             # The path to your home directory (csh)
  504.  
  505. These variables fall into two groups. Traditionally the first group always have names in uppercase letters and are called environment variables, whereas variables in the second group have names with lowercase letters and are called shell variables-- but this is only a convention. The uppercase variables are global variables, whereas the lower case variables are local variables. Local variables are not defined for programs or sub-shells started by the current shell, while global variables are inherited by all sub-shells.
  506.  
The Bourne-shell and the C-shell use these conventions differently and not always consistently. You will see how to define these below. For now you just have to know that, in the C-shell, the command set with no arguments lists the shell variables and the command setenv with no arguments lists the environment variables. The command env can be used in either C-shell or Bourne shell to see all of the defined environment variables.
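
The difference between local and global variables is easy to check by starting a child shell. The sketch below uses the Bourne shell, where a variable becomes global only when it is exported; the names colour and COLOUR are arbitrary examples.

```shell
colour=red                # local: known only to this shell
COLOUR=blue
export COLOUR             # global: copied into the environment

# Start a child shell and ask it what it can see.
child_local=$(sh -c 'echo "${colour:-unset}"')
child_global=$(sh -c 'echo "${COLOUR:-unset}"')

echo "child sees colour=$child_local"     # prints: child sees colour=unset
echo "child sees COLOUR=$child_global"    # prints: child sees COLOUR=blue
```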
  508.  
  509. Wildcards
  510.  
  511. Sometimes you want to be able to refer to several files in one go. For instance, you might want to copy all files ending in `.c' to a new directory. To do this one uses wildcards. Wildcards are characters like * ? which stand for any character or group of characters. In card games the joker is a `wild card' which can be substituted for any other card. Use of wildcards is also called filename substitution in the unix manuals, in the sections on sh and csh.
  512.  
  513. The wildcard symbols are,
  514.  
  515. `?'
  516.     Match single character. e.g. ls /etc/rc.????
  517. `*'
  518.    Match any number of characters. e.g. ls /etc/rc.*
  519. `[...]'
  520.     Match any character in a list enclosed by these brackets. e.g. ls [abc].C
  521.  
  522. Here are some examples and explanations.
  523.  
`/etc/rc.????'
    Match all files in /etc whose names begin with the three characters `rc.' and are exactly 7 characters long.
`*.c'
    Match all files ending in `.c', i.e. all C programs.
`*.[Cc]'
    Match all files ending in `.c' or `.C', i.e. all C and C++ programs.
`*.[a-z]'
    Match any file ending in .a, .b, .c, ... up to .z.
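
These patterns can be tried out without risk by creating a few empty files in a scratch directory and letting echo print whatever each pattern stands for. The file names below are invented for the illustration, and mktemp is assumed to be available.

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch a.c b.c main.C notes.txt rc.boot rc.ip

echo *.c        # prints: a.c b.c
echo *.[Cc]     # prints: a.c b.c main.C
echo rc.??      # prints: rc.ip
echo [ab].c     # prints: a.c b.c
```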
  532.  
  533. It is important to understand that the shell expands wildcards. When you type a command, the program is not invoked with an argument that contains * or ?. The shell expands the special characters first and invokes commands with the entire list of files which match the patterns. The programs never see the wildcard characters, only the list of files they stand for. To see this in action, you can type
  534.  
echo /etc/rc*
  536.  
  537. which gives
  538.  
  539. /etc/rc /etc/rc.boot /etc/rc.ip /etc/rc.local /etc/rc.local%
  540. /etc/rc.local~ /etc/rc.single /etc/rc~
  541.  
  542. All shell commands are invoked with a command line of this form. This has an important corollary. It means that multiple renaming cannot work!
  543.  
  544. Unix files are renamed using the mv command. In many microcomputer operating systems one can write
  545.  
  546. rename *.x *.y
  547.  
which changes the file extension of all files ending in `.x' to the same name with a `.y' extension. This cannot work in unix, because the shell expands the wildcards before passing the arguments to the command. The mv command never sees `*.x' or `*.y', only the list of existing filenames which happen to match them.
  549.  
  550. The local shell variable noglob switches off wildcard expansion in the C shell, but you still cannot rename multiple files using mv. Some free-software programs make this possible.
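
A common way round this in the Bourne shell is to let the shell expand the wildcard in a loop and to rename the files one at a time. The following is a sketch with invented file names, run in a scratch directory made with mktemp (assumed to be available).

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch one.x two.x three.x

for file in *.x; do
  [ -e "$file" ] || continue      # no *.x files: the pattern is left unexpanded
  mv "$file" "${file%.x}.y"       # strip the .x suffix and append .y
done

result=$(echo *)
echo "$result"                    # prints: one.y three.y two.y
```

Here `${file%.x}' stands for the value of file with a trailing `.x' removed, so each mv command receives exactly two ordinary filenames.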
  551.  
  552. Regular expressions
  553.  
The wildcards belong to the shell. They are used for matching filenames. UNIX has a more general and widely used mechanism for matching strings: regular expressions.
  555.  
Regular expressions are used by the egrep utility, by text editors like ed, vi and emacs, and by sed and awk. They are also used in the C programming language for matching input, as well as in the Perl programming language and the lex tokenizer. Here are some examples using the egrep command which print lines from the file /etc/rc which match certain conditions. The construction '(...)' is part of egrep. Everything in between these symbols is a regular expression. The single quotes protect the expression from the shell. In the C-shell the symbol ! must also be preceded by a backslash \, because it would otherwise trigger history substitution even inside single quotes; the examples put a backslash in front of * and & as well.
  557.  
  558.  
  559. # Print all lines beginning with a comment #
  560.  
  561. egrep '(^#)'           /etc/rc
  562.  
  563. # Print all lines which DON'T begin with #
  564.  
  565. egrep '(^[^#])'        /etc/rc
  566.  
  567. # Print all lines beginning with e, f or g.
  568.  
  569. egrep '(^[efg])'       /etc/rc
  570.  
  571. # Print all lines beginning with uppercase
  572.  
  573. egrep '(^[A-Z])'       /etc/rc
  574.  
  575. # Print all lines NOT beginning with uppercase
  576.  
  577. egrep '(^[^A-Z])'      /etc/rc
  578.  
  579. # Print all lines containing ! * &
  580.  
  581. egrep '([\!\*\&])'     /etc/rc
  582.  
# All lines in which ! * or & follows a character other than #
  584.  
  585. egrep '([^#][\!\*\&])' /etc/rc
  586.  
These examples assume that the file `/etc/rc' exists. If it doesn't exist on the machine you are using, try to find the equivalent by, for instance, replacing /etc/rc with /etc/rc* which will try to find a match beginning with rc.

Regular expressions are made up of the following `atoms'.
  590.  
  591. `.'
  592.     Match any single character except the end of line.
  593. `^'
  594.    Match the beginning of a line as the first character.
  595. `$'
  596.     Match end of line as last character.
`[..]'
    Match any character in the list between the square brackets (see below).
`*'
    Match zero or more occurrences of the preceding expression.
`+'
    Match one or more occurrences of the preceding expression.
`?'
    Match zero or one occurrence of the preceding expression.
  605.  
  606. You can find a complete list in the unix manual pages. The square brackets above are used to define a class of characters to be matched. Here are some examples,
  607.  
    If the square brackets contain a list of characters, e.g. `[a-z156]', then a single occurrence of any character in the list will match the regular expression: in this case any lowercase letter or one of the digits 1, 5 and 6.
  609.     If the first character in the brackets is the caret symbol `^' then any character except those in the list will be matched.
  610.    Normally a dash or minus sign `-' means a range of characters. If it is the first character after the `[' or after `[^' then it is treated literally.
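
The atoms and bracket rules above can be tried on a small file of your own instead of /etc/rc. The sketch below is written for the Bourne shell and uses `grep -E', which is the POSIX spelling of egrep. The sample lines are invented, mktemp is assumed to be available, and LC_ALL=C is set so that ranges such as [A-Z] mean plain ASCII letters.

```shell
LC_ALL=C; export LC_ALL         # make ranges like [A-Z] mean plain ASCII

sample=$(mktemp)                # a scratch file standing in for /etc/rc
cat > "$sample" <<'EOF'
# a comment line
PATH=/bin:/usr/bin
echo starting
  indented line
x-ray 156
EOF

grep -E '^#' "$sample"          # ^ anchors at the start: prints "# a comment line"
grep -E '^[A-Z]' "$sample"      # a class after ^: prints "PATH=/bin:/usr/bin"
grep -E 'ing$' "$sample"        # $ anchors at the end: prints "echo starting"
grep -E '[156]+$' "$sample"     # one or more of 1, 5, 6 at the end: prints "x-ray 156"
grep -E 'x-?ray' "$sample"      # optional dash: prints "x-ray 156"

# A dash placed first in the brackets is literal, so this counts
# the lines which contain either - or =.
n_dash=$(grep -c -E '[-=]' "$sample")
echo "$n_dash lines contain - or ="     # prints: 2 lines contain - or =
```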
  611.  
Nested shell commands and `...`
  613.  
  614. The backwards apostrophes `...` can be used in all shells and also in the programming language Perl. When these are encountered in a string the shell tries to execute the command inside the quotes and replace the quoted expression by the result of that command. For example:
  615.  
  616. unix% echo "This system's kernel type is `/bin/file /vmunix`"
  617. This system's kernel type is /vmunix: sparc executable not stripped
  618.  
  619. unix% foreach file ( `ls /etc/rc*` )
  620. ? echo I found a config file $file
  621. ? echo Its type is `/bin/file $file`
  622. ? end
  623.  
  624. I found a config file /etc/rc
  625. Its type is /etc/rc: executable shell script
  626. I found a config file /etc/rc.boot
  627. Its type is /etc/rc.boot: executable shell script
  628. I found a config file /etc/rc.ip
  629. Its type is /etc/rc.ip: executable shell script
  630. I found a config file /etc/rc.local
  631. Its type is /etc/rc.local: ascii text
  632. I found a config file /etc/rc.local~
  633. Its type is /etc/rc.local~: ascii text
  634. I found a config file /etc/rc.single
  635. Its type is /etc/rc.single: executable shell script
  636. I found a config file /etc/rc~
  637. Its type is /etc/rc~: executable shell script
  638.  
  639. This is how we insert the result of a shell command into a text string or variable.
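
The same idea looks like this in the Bourne shell. This is a sketch only: it uses uname and dirname, which are assumed to be installed as they are on most systems, and the variable names are made up. POSIX shells also accept the form $(...), which means the same as the backwards apostrophes but is easier to nest; the C-shell does not understand it.

```shell
# Backwards apostrophes: the original form, understood by every shell.
system=`uname -s`
echo "This system's kernel is called $system"

# $(...) means the same in POSIX shells and can be nested directly.
ls_dir=$(dirname "$(command -v ls)")
echo "The ls program lives in $ls_dir"
```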
  640. UNIX command overview
  641. Important keys
  642.  
  643. CTRL-A
  644.     Jump to start of line. If `screen' is active, this prefixes all control key commands for `screen' and then the normal CTRL-A is replaced by CTRL-a a.
  645. CTRL-C
    Interrupt or break key. Sends signal 2 (SIGINT) to a process.
  647. CTRL-D
    Signifies `EOF' (end of file) or shows expansion matches in command/filename completion. See section Command/filename completion.
  649. CTRL-E
  650.    Jump to end of line.
  651. CTRL-L
  652.    Clear screen in newer shells and in emacs. Same as `clear' in the shell.
  653. CTRL-Z
    Suspend the present process, but do not destroy it. This sends the SIGTSTP signal to the process (signal 18 on BSD-derived systems; the number differs between versions of unix).
  655.  
  656. Alternative shells
  657.  
  658. bash
  659.     The Bourne Again shell, an improved sh.
  660. csh
  661.     The standard C-shell.
  662. jsh
  663.     The same as sh, with C-shell style job control.
  664. ksh
  665.     The Korn shell, an improved sh.
  666. sh
  667.     The original Bourne shell.
  668. sh5
  669.     On ULTRIX systems the standard Bourne shell is quite stupid. sh5 corresponds to the normal Bourne shell on these systems.
  670. tcsh
  671.     An improved C-shell.
  672. zsh
  673.     An improved sh.
  674.  
  675. Window based terminal emulators
  676.  
  677. xterm
  678.     The standard X11 terminal window.
  679. shelltool, cmdtool
  680.     Openwindows terminals from Sun Microsystems. These are not completely X11 compatible during copy/paste operations.
  681. screen
    This is not a window in itself, but allows you to emulate having several windows inside a single (say) xterm window. The user can switch between different windows and open new ones, but can only see one window at a time. See section Multiple screens.
  683.  
  684. Remote shells and logins
  685.  
  686. rlogin
  687.     Login onto a remote unix system.
  688. rsh
    Open a shell on a remote system (requires access rights).
telnet
    Open a connection to a remote system using the telnet protocol.
  692.  
  693. Text editors
  694.  
  695. ed
  696.     An ancient line-editor.
  697. vi
  698.     Visual interface to ed. This is the only "standard" unix text editor supplied by vendors.
  699. emacs
  700.     The most powerful UNIX editor. A fully configurable, user programmable editor which works under X11 and on tty-terminals.
  701. xemacs
  702.     A pretty version of emacs for X11 windows.
  703. pico
  704.     A tty-terminal only editor, comes as part of the PINE mail package.
  705. xedit
  706.     A test X11-only editor supplied with X-windows.
  707. textedit
  708.     A simple X11-only editor supplied by Sun Microsystems.
  709.  
  710. File handling commands
  711.  
  712. ls
  713.     List files in specified directory (like dir on other systems).
  714. cp
  715.     Copy files.
  716. mv
  717.     Move or rename files.
  718. touch
  719.     Creates an empty new file if none exists, or updates date and time stamps on existing files.
  720. rm, unlink
  721.     Remove a file or link (delete).
  722. mkdir, rmdir
  723.     Make or remove a directory. A directory must be empty in order to be able to remove it.
  724. cat
  725.     Concatenate or join together a number of files. The output is written to the standard output by default. Can also be used to simply print a file on screen.
  726. lp, lpr
    Line printer. Send a file to the default printer, or the printer defined in the `PRINTER' environment variable.
  728. lpq, lpstat
  729.    Show the status of the print queue.
  730.  
  731. File browsing
  732.  
  733. more
    Shows one screen full at a time. You can search for a string and start an editor on the file. This is like `type file | more' in DOS.
  735. less
  736.     An enhanced version of more.
  737. mc
  738.     Midnight commander, a free version of the `Norton Commander' PC utility for unix. (Only for non-serious UNIX users...)
  739. fmgr
  740.    A window based file manager with icons and all that nonsense.
  741.  
  742. Ownership and granting access permission
  743.  
  744. chmod
  745.    Change file access mode.
  746. chown, chgrp
  747.    Change owner and group of a file. The GNU version of chown allows both these operations to be performed together using the syntax chown owner.group file.
  748. acl
  749.    On newer Unices, Access control lists allow access to be granted on a per-user basis rather than by groups.
  750.  
  751. Extracting from and rebuilding files
  752.  
  753. cut
  754.    Extract a column in a table
  755. paste
  756.    Merge several files so that each file becomes a column in a table.
  757. sed
  758.    A batch text-editor for searching, replacing and selecting text without human intervention.
  759. awk
    A forerunner of the Perl language, for extracting and modifying text files.
  761. rmcr
  762.    Strip carriage return (ASCII 13) characters from a file. Useful for converting DOS files to unix.
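
Here is a small sketch of how these tools fit together, using an invented two-column table in a scratch directory (mktemp is assumed to be available). Since rmcr is not installed everywhere, the last line uses the standard tr program to strip carriage returns instead.

```shell
scratch=$(mktemp -d)
cd "$scratch"
printf 'mark:100\nanne:200\n' > users.txt     # a small table, columns separated by :

cut -d: -f1 users.txt > names.txt     # extract column 1
cut -d: -f2 users.txt > numbers.txt   # extract column 2
paste numbers.txt names.txt           # rebuild the table with the columns swapped

sed 's/anne/anna/' users.txt          # prints mark:100 and anna:200; the file is unchanged

total=$(awk -F: '{ sum += $2 } END { print sum }' users.txt)
echo "Total: $total"                  # prints: Total: 300

printf 'a DOS line\r\n' | tr -d '\r'  # strip carriage returns (ASCII 13)
```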
  763.  
  764. Locating files
  765.  
  766. find
  767.    Search for files from a specified directory using various criteria.
  768. locate
    Fast search in a global file database for files whose names contain a search-string.
  770. whereis
  771.    Look for a command and its documentation on the system.
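
As an illustration of find, the sketch below builds a tiny directory tree with invented names and searches it for C source files. Note that the pattern is quoted, so that find rather than the shell interprets the wildcard. mktemp is assumed to be available.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/src/lib"
touch "$scratch/src/main.c" "$scratch/src/lib/util.c" "$scratch/README"

# Search from $scratch downwards for ordinary files whose names end in .c
find "$scratch" -type f -name '*.c' -print

count=$(find "$scratch" -type f -name '*.c' -print | wc -l)
echo "C files found:" $count          # prints: C files found: 2
```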
  772.  
  773. Disk usage.
  774.  
  775. du
  776.    Show number of blocks used by a file or files.
  777. df
  778.    Show the state of usage for one or more disk partitions.
  779.  
  780. Show other users logged on
  781.  
  782. users
  783.    Simple list of other users.
  784. finger
  785.    Show who is logged onto this and other systems.
  786. who
  787.    List of users logged into this system.
  788. w
  789.    Long list of who is logged onto this system and what they are doing.
  790.  
  791. Contacting other users
  792.  
  793. write
    Send a simple message to the named user, end with CTRL-D. The command `mesg n' switches off message receipt.
  795. talk
  796.     Interactive two-way conversation with named user.
  797. irc
  798.     Internet relay chat. A conferencing system for realtime multi-user conversations, for addicts and losers.
  799.  
  800. Mail senders/readers
  801.  
  802. mail
  803.     The standard (old) mail interface.
  804. Mail
  805.     Another mail interface.
  806. elm
  807.     Electronic Mail program. Lots of functionality but poor support for multimedia.
  808. pine
  809.     Pine Is No-longer Elm. Improved support for multimedia but very slow and rather stupid at times. Some of the best features of elm have been removed!
  810. mailtool
  811.     Sun's openwindows client program.
  812. rmail
  813.    A mail interface built into the emacs editor.
  814. netscape mail
  815.    A mail interface built into the netscape navigator.
  816. zmail
    A commercial mail package.
  818. tkrat
  819.    A graphical mail reader which supports most MIME types, written in tcl/tk. This program has a nice feel and allows you to create a searchable database of old mail messages, but has a hopeless locking mechanism.
  820.  
  821. File transfer
  822.  
  823. ftp
  824.    The File Transfer program - copies files to/from a remote host.
  825. ncftp
  826.    An enhanced ftp for anonymous login.
  827.  
  828. Compilers
  829.  
  830. cc
  831.    The C compiler.
  832. CC
  833.    The C++ compiler.
  834. gcc
  835.    The GNU C compiler.
  836. g++
  837.    The GNU C++ compiler.
  838. ld
  839.    The system linker/loader.
  840. ar
  841.    Archive library builder.
  842. dbx
  843.    A symbolic debugger.
  844. gdb
  845.    The GNU symbolic debugger.
  846. xxgdb
    The GNU debugger with a window driven front-end.
  848. ddd
  849.    A motif based front-end to the gdb debugger.
  850.  
  851. Other interpreted languages
  852.  
  853. perl
    Practical extraction and report language.
  855. tcl
  856.    A perl-like language with special support for building user interfaces and command shells.
  857. scheme
  858.    A lisp-like extensible scripting language from GNU.
  859. mercury
  860.    A prolog-like language for artificial intelligence.
  861.  
  862. Processes and system statistics
  863.  
  864. ps
  865.    List system process table.
  866. vmstat
  867.    List kernel virtual-memory statistics.
  868. netstat
  869.    List network connections and statistics.
  870. rpcinfo
  871.    Show rpc information.
  872. showmount
  873.    Show clients mounting local filesystems.
  874.  
  875. System identity
  876.  
  877. uname
  878.    Display system name and operating system release.
  879. hostname
  880.    Show the name of this host.
  881. domainname
  882.    Show the name of the local NIS domain. Normally this is chosen to be the same as the BIND/DNS domain, but it need not be.
  883. nslookup
  884.    Interrogate the DNS/BIND name service (hostname to IP address conversion).
  885.  
  886. Internet resources
  887.  
  888. archie, xarchie
  889.    Search the internet ftp database for files.
  890. xrn, fnews
  891.    Read news (browser).
  892. netscape, xmosaic
  893.    Read world wide web (WWW) (browser).
  894.  
  895. Text formatting and postscript
  896.  
  897. tex, latex
  898.    Donald Knuth's text formatting language, pronounced "tek" (the x is really a greek "chi"). Used widely for technical publications. Compiles to dvi (device independent) file format.
  899. texinfo
  900.     A hypertext documentation system using tex and "info" format. This is the GNU documentation system. This UNIX guide is written in texinfo!!!
  901. xdvi
  902.     View a tex dvi file on screen.
  903. dvips
  904.     Convert dvi format into postscript.
  905. ghostview, ghostscript
  906.     View a postscript file on screen.
  907.  
  908. Picture editors and processors
  909.  
  910. xv
    Handles, edits and processes pictures in a variety of standard graphics formats (gif, jpg, tiff etc). Use xv -root -quit to place a picture on your root window.
  912. xpaint
  913.     A simple paint program.
  914. xfig
  915.     A line drawing figure editor. Produces postscript, tex, and a variety of other output formats.
  916. xsetroot
  917.     Load an X-bitmap image into the screen (root window) background. Small images are tiled.
  918.  
  919. Miscellaneous
  920.  
  921. date
  922.     Print the date and time.
  923. ispell
  924.     Spelling checker.
  925. xcalc
  926.     A graphical calculator.
  927. dc,bc
  928.     Text-based calculators.
  929. xclock
  930.     A clock!
  931. ping
  932.     Send a "sonar" ping to see if another unix host is alive.
  933.  
  934. Terminals
  935.  
In order to communicate with a user, a shell needs to have access to a terminal. Unix was designed to work with many different kinds of terminals. Input/output commands in Unix read and write to a virtual terminal. In reality a terminal might be a text-based Teletype terminal (called a tty for short) or a graphics based terminal; it might be 80 characters wide or it might be wider or narrower. Unix takes these possibilities into account by defining a number of instances of terminals in a more or less object oriented way.
  937.  
Each user's terminal has to be configured before cursor based input/output will work correctly. Normally this is done by choosing one of a number of standard terminal types from a list which is supplied by the system. In practice the user sets the value of the environment variable `TERM' to an appropriate name. Typical examples are `vt100' and `xterm'. If no standard setup is found, the terminal can always be configured manually using UNIX's most cryptic and opaque of commands: `stty'.
  939.  
  940. The job of configuring terminals is much easier now that hardware is more standard. Users' terminals are usually configured centrally by the system administrator and it is seldom indeed that one ever has to choose anything other than `vt100' or `xterm'.
  941. The X window system
  942.  
  943. Because UNIX originated before windowing technology was available, the user-interface was not designed with windowing in mind. The X window system attempts to be like a virtual machine park, running a different program in each window. Although the programs appear on one screen, they may in fact be running on unix systems anywhere in the world, with only the output being local to the user's display. The standard shell interface is available by running an X client application called `xterm' which is a graphical front-end to the standard UNIX textual interface.
  944.  
  945. The `xterm' program provides a virtual terminal using the X windows graphical user interface. It works in exactly the same way as a tty terminal, except that standard graphical facilities like copy and paste are available. Moreover, the user has the convenience of being able to run a different shell in every window. For example, using the `rlogin' command, it is possible to work on the local system in one window, and on another remote system in another window. The X-window environment allows one to cut and paste between windows, regardless of which host the shell runs on.
  946.  
  947. The components of the X-window system
  948.  
  949. The X11 system is based on the client-server model. You might wonder why a window system would be based on a model which was introduced for interprocess communication, or network communication. The answer is straightforward.
  950.  
The designers of the X window system realized that network communication was to be the paradigm of the next generation of computer systems. They wanted to design a system of windows which would enable a user to sit at a terminal in Massachusetts and work on a machine in Tokyo -- and still be able to get high quality windows displayed on their terminal. The aim of X windows from the beginning was to create a distributed window environment.
  952.  
When I log onto my friend's Hewlett Packard workstation to use the text editor (because I don't like the one on my EUNUCHS workstation) I want it to work correctly on my screen, with my keyboard -- even though my workstation is manufactured by a different company. I also want the colours to be right despite the fact that the HP machine uses completely different video hardware from my machine. When I press the curly brace key {, I want to see a curly brace, and not some hieroglyphic because the HP station uses a different keyboard.
  954.  
These are the problems which X tries to address. In a network environment we need a common window system which will work on any kind of hardware, and hide the differences between different machines as far as possible. But it has to be flexible enough to allow us to change all of the things we don't like -- to choose our own colours, the kind of window borders we want, etc. Other windowing systems (like Microsoft Windows) ignore these problems and thereby lock the user to a single vendor's products and a single operating system. (That, of course, is no accident.)
  956.  
The way X solves this problem is to use the client-server model. Each program which wants to open a window on somebody's computer screen is a client of the X window service. To get something drawn on a user's screen, the client asks a server on the host of interest to draw windows for it. No client ever draws anything itself -- it asks the server to do it on its behalf. There are several reasons for this:
  958.  
  959.     The clients can all talk a common `window language' or protocol. We can hide the difference between different kinds of hardware by making the machine-specific part of drawing graphics entirely a problem of implementing the server on the particular hardware. When a new type of hardware comes along, we just need to modify the server -- none of the clients need to be modified.
  960.    We can contact different servers and send our output to different hardware -- thus even though a program is running on a CPU in Tokyo, it can ask the server in Massachusetts to display its window for it.
  961.    When more than one window is on a user's display, it eventually becomes necessary to move the windows around and then figure out which windows are on top of which other windows etc. If all of the drawing information is kept in a server, it is straightforward to work out this information. If every client drew where it wanted to, it would be impossible to know which window was supposed to be on top of another.
  962.  
In X, the window manager is a different program to the server which does the drawing of graphics -- but the client-server idea still applies; it just has one more piece to its puzzle.
  964.  
  965. How to set up X windows
  966.  
The X windows system is large and complex and not particularly user friendly. When you log in to the system, X reads two files in your home directory which decide which applications will be started and what they will look like. The files are called
  968.  
.xsession
  970.     This file is a shell script which starts up a number of applications as background processes and exits by calling a window manager. Here is a simple example file
  971.  
  972.     #!/bin/csh
  973.     #
  974.     # .xsession file
  975.     #
  976.     #
  977.  
  978.     setenv PATH /usr/bin:/bin:/local/gnu/bin:/usr/X11R6/bin
  979.  
  980.     #
  981.     # List applications here, with & at the end
  982.     # so they run in the background
  983.     #
  984.  
  985.       xterm -T NewTitle -sl 1000 -geometry 90x45+16+150 -sb &
  986.       xclock &
  987.       xbiff -geometry 80x80+510+0 &
  988.  
  989.     # Start a window manager. Exec replaces this script with
  990.     # the fvwm process, so that it doesn't exist as a separate
  991.     # (useless) process.
  992.  
  993.       exec /local/bin/fvwm
  994.  
  995. .Xdefaults
    This file specifies all of the resources which X programs use. It can be used to change the colours used by applications, or font types etc. The subject of X resources is a large one and we don't have time for it here. Here is a simple example, which shows how you can give your over-bright xterm and emacs windows a less bright shade of grey.
  997.  
  998.    xterm*background: LightGrey
  999.    Emacs*background: grey92
  1000.    Xemacs*background: grey92
  1001.  
  1002. X displays and authority
  1003.  
  1004. In the terminology used by X11, every client program has to contact a display in order to open a window. A display is a virtual screen which is created by the X server on a particular host. X can create several separate displays on a given host, though most machines only have one.
  1005.  
When an X client program wants to open a window, it looks in the UNIX environment variable `DISPLAY' for the name or IP address of a host which has an X server it can contact. For example, if we wrote
  1007.  
  1008. setenv DISPLAY myhost:0
  1009.  
  1010. the client would try to contact the X server on `myhost' and ask for a window on display number zero (the usual display). If we wrote
  1011.  
  1012. setenv DISPLAY 198.112.208.35:0
  1013.  
  1014. the client would try to open display zero on the X server at the host with the IP address `198.112.208.35'.
  1015.  
  1016. Clearly there must be some kind of security mechanism to prevent just anybody from opening windows on someone's display. X has two such mechanisms:
  1017.  
  1018. xhost
   This mechanism is now obsolete. The `xhost' command is used to define a list of hosts which are allowed to open windows on the user's display. It cannot distinguish between individual users, i.e. the command xhost yourhost would allow anyone using yourhost to access the local display. This mechanism is only present for backward compatibility with early versions of X windows. Normally one should use the command xhost - to exclude all others from accessing the display.
  1020. Xauthority
  1021.    The Xauthority mechanism has replaced the xhost scheme. It provides a security mechanism which can distinguish individual users, not just hosts. In order for a user to open a window on a display, he/she must have a ticket--called a "magic cookie". This is a binary file called `.Xauthority' which is created in the user's home directory when he/she first starts the X-windows system. Anyone who does not have a recent copy of this file cannot open windows or read the display of the user's terminal. This mechanism is based on the idea that the user's home directory is available via NFS on all hosts he/she will log onto, and thus the owner of the display will always have access to the magic cookie, and will therefore always be able to open windows on the display. Other users must obtain a copy of the file in order to open windows there. The command xauth is an interactive utility used for controlling the contents of the `.Xauthority' file. See the `xauth' manual page for more information.
  1022.  
  1023. Multiple screens
  1024.  
  1025. The window paradigm has been very successful in many ways, but anyone who has used a window system knows that the screen is simply not big enough for all the windows one would like! Unix has several solutions to this problem.
  1026.  
  1027. One solution is to attach several physical screens to a terminal. The X window system can support any number of physical screens of different types. A graphical designer might want a high resolution colour screen for drawing and a black and white screen for writing text, for instance. The disadvantage with this method is the cost of the hardware.
  1028.  
A cheaper solution is to use a window manager such as `fvwm' which creates a virtual screen of unlimited size on a single monitor. As the mouse pointer reaches the edge of the true screen, the window manager replaces the display with a new "blank screen" in which to place windows. A miniaturized image of the windows on a control panel acts as a map which makes it possible to find the applications on the virtual screen.
  1030.  
  1031. Yet another possibility is to create virtual displays inside a single window. In other words, one can collapse several shell windows into a single `xterm' window by running the program `screen'. The screen command allows you to start several shells in a single window (using CTRL-a CTRL-c) and to switch between them (by typing CTRL-a CTRL-n). It is only possible to see one shell window at a time, but it is still possible to cut and paste between windows and one has a considerable saving of space. The `screen' command also allows you to suspend a shell session, log out, log in again later and resume the session precisely where you left off.
  1032.  
  1033. Here is a summary of some useful screen commands:
  1034.  
  1035. screen
  1036.    Start the screen server.
  1037. screen -r
  1038.    Resume a previously suspended screen session if possible.
  1039. CTRL-a CTRL-c
  1040.    Start a new shell on top of the others (a fresh `screen') in the current window.
  1041. CTRL-a CTRL-n
  1042.     Switch to the next `screen'.
  1043. CTRL-a CTRL-a
  1044.    Switch to the last screen used.
  1045. CTRL-a a
  1046.    When screen is running, CTRL-a is used for screen commands and cannot therefore be used in its usual shell meaning of `jump to start of line'. CTRL-a a replaces this.
  1047. CTRL-a CTRL-d
  1048.     Detach the screen session from the current window so that it can be resumed later. It can be resumed with the `screen -r' command.
  1049. CTRL-a ?
  1050.    Help screen.
  1051.  
  1052. Files and access
  1053.  
  1054. To prevent all users from being able to access all files on the system, unix records information about who creates files and also who is allowed to access them later.
  1055.  
  1056. Each user has a unique username or loginname together with a unique user id or uid. The user id is a number, whereas the login name is a text string -- otherwise the two express the same information. A file belongs to user A if it is owned by user A. User A then decides whether or not other users can read, write or execute the file by setting the protection bits or the permission of the file using the command chmod.
  1057.  
In addition to user identities, there are groups of users. The idea of a group is that several named users might want to be able to read and work on a file, without other users being able to access it. Every user is a member of at least one group, called the login group, and each group has both a textual name and a number (group id). The uid and gid of each user are recorded in the file /etc/passwd (See chapter 6). Membership of other groups is recorded in the file /etc/group or on some systems /etc/logingroup.
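
As a small illustration, the following sketch prints the identities described above. It is written for the Bourne shell `sh' so that it can be run as a script, and it assumes that the `id' utility is installed and that /etc/passwd uses the usual colon-separated layout. On systems which take their users from a network service, ordinary users may be missing from the local file, which is why the example looks up root.

```shell
#!/bin/sh
# Numeric user id and login group id of whoever runs this.
id -u
id -g

# The same information in textual form.
id -un
id -gn

# /etc/passwd holds one line per user, with colon-separated fields:
#   name:password:uid:gid:comment:home:shell
# Print the uid and gid fields (3 and 4) of the entry for root.
awk -F: '$1 == "root" { print $3, $4 }' /etc/passwd
```

The last command normally prints `0 0', since root conventionally has uid 0 and gid 0.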
  1059. Protection bits
  1060.  
  1061. The following output is from the command ls -lag executed on a SunOS type machine.
  1062.  
  1063.  
lrwxrwxrwx  1 root     wheel           7 Jun  1  1993 bin -> usr/bin
-r--r--r--  1 root     bin        103512 Jun  1  1993 boot
drwxr-sr-x  2 bin      staff       11264 May 11 17:00 dev
drwxr-sr-x 10 bin      staff        2560 Jul  8 02:06 etc
drwxr-sr-x  8 root     wheel         512 Jun  1  1993 export
drwx------  2 root     daemon        512 Sep 26  1993 home
-rwxr-xr-x  1 root     wheel      249079 Jun  1  1993 kadb
lrwxrwxrwx  1 root     wheel           7 Jun  1  1993 lib -> usr/lib
drwxr-xr-x  2 root     wheel        8192 Jun  1  1993 lost+found
drwxr-sr-x  2 bin      staff         512 Jul 23  1992 mnt
dr-xr-xr-x  1 root     wheel         512 May 11 17:00 net
drwxr-sr-x  2 root     wheel         512 Jun  1  1993 pcfs
drwxr-sr-x  2 bin      staff         512 Jun  1  1993 sbin
lrwxrwxrwx  1 root     wheel          13 Jun  1  1993 sys -> kvm/sys
drwxrwxrwx  6 root     wheel         732 Jul  8 19:23 tmp
drwxr-xr-x 27 root     wheel        1024 Jun 14  1993 usr
drwxr-sr-x 10 bin      staff         512 Jul 23  1992 var
-rwxr-xr-x  1 root     daemon    2182656 Jun  4  1993 vmunix
  1082.  
The first column is a textual representation of the protection bits for each file. Column two is the number of hard links to the file (See exercises below). The third and fourth columns are the user name and group name and the remainder show the file size in bytes and the time at which the file was last modified. Notice that the directories /bin, /lib and /sys are symbolic links to other directories.
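
The link count in column two can be watched as it changes. The following is a sketch for `sh'; the scratch directory `links_demo' and the file names in it are invented for the example.

```shell
#!/bin/sh
# Start from an empty scratch directory.
rm -rf links_demo
mkdir links_demo

echo "some text" > links_demo/original
ls -l links_demo/original      # column two shows 1

# A hard link is a second name for the same file.
ln links_demo/original links_demo/alias
ls -l links_demo/original links_demo/alias   # column two shows 2 for both

# Removing one name leaves the contents reachable by the other.
rm links_demo/original
ls -l links_demo/alias         # column two is back to 1
cat links_demo/alias
```

The directory `links_demo' can be removed again with rm -r when you have finished with it.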
  1084.  
There are sixteen protection bits for a UNIX file, but only twelve of them can be changed by users. These twelve are split into four groups of three. Each three-bit group corresponds to one octal digit.
  1086.  
The leading four invisible bits give information about the type of file: is the file a plain file, a directory or a link. In the output from ls this is represented by a single character: -, d or l.
  1088.  
  1089. The next three bits set the so-called s-bits and t-bit which are explained below.
  1090.  
The remaining three groups of three bits set flags which indicate whether a file can be read `r', written to `w' or executed `x' by (i) the user who owns the file, (ii) the other users who are in the group the file is marked with, and (iii) any user at all.
  1092.  
  1093. For example, the permission
  1094.  
  1095. Type Owner Group Anyone
  1096.   d  rwx   r-x   ---
  1097.  
tells us that the file is a directory, which can be read, written to and searched by its owner, can be read and searched by others in its group, but cannot be accessed at all by anyone else.
  1099.  
  1100. Note about directories. It is impossible to cd to a directory unless the x bit is set. That is, directories must be `executable' in order to be accessible.
  1101.  
  1102. Here are some examples of the relationship between binary, octal and the textual representation of file modes.
  1103.  
Binary      Octal   Text

001         1       --x
010         2       -w-
100         4       r--
110         6       rw-
101         5       r-x
110100100   644     rw-r--r--
  1112.  
  1113. It is well worth becoming familiar with the octal number representation of these permissions.
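
To see the correspondence in practice, one can set a mode in octal and read it back in textual form. The sketch below is written for `sh'; the scratch directory `mode_demo' and the file names are invented for the example.

```shell
#!/bin/sh
rm -rf mode_demo
mkdir mode_demo
touch mode_demo/notes mode_demo/tool

# 6 = rw-, 4 = r--, 4 = r--
chmod 644 mode_demo/notes

# 7 = rwx, 5 = r-x, 0 = ---
chmod 750 mode_demo/tool

# The first ten characters of each line are the file type followed
# by the nine permission flags:
#   -rw-r--r--   for notes
#   -rwxr-x---   for tool
ls -l mode_demo/notes mode_demo/tool | cut -c1-10
```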
  1114. chmod
  1115.  
  1116. The chmod command changes the permission or mode of a file. Only the owner of the file or the superuser can change the permission. Here are some examples of its use. Try them.
  1117.  
# add write permission for everyone (owner, group and others)
chmod a+w myfile

# add the 'execute' (search) flag on a directory for its owner
chmod u+x mydir/

# make all files here readable and executable by everyone,
# but writable only by their owner
chmod 755 *

# set the s-bit (setgid) on the directory mydir
chmod g+s mydir/

# descend recursively into dir, making everything readable by everyone
chmod -R a+r dir
  1132.  
  1133. Umask
  1134.  
  1135. When a new file gets created, the operating system must decide what default protection bits to set on that file. The variable umask decides this. umask is normally set by each user in his or her .cshrc file (see next chapter). For example
  1136.  
  1137. umask 077    # safe
  1138. umask 022    # liberal
  1139.  
According to the UNIX documentation, the value of umask is `XOR'ed (exclusive `OR') with 666 for plain files, or with 777 for directories, in order to find the default protection. Actually this is not quite true: `umask' only removes bits, it never sets bits which were not already set in 666 or 777. For instance
  1141.  
  1142. umask               Permission
  1143.  
  1144. 077                 600 (plain)
  1145. 077                 700 (dir)  
  1146. 022                 644 (plain)
  1147. 022                 755 (dir)  
  1148.  
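The correct rule for computing permissions is not XOR but `NOT AND': the default permission is the base value (666 for plain files, 777 for directories) ANDed with the bitwise complement of umask. For example, 666 AND (NOT 022) = 644.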
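
The rule can be checked directly. In the sketch below (written for `sh'; the directory `umask_demo' and the names inside it are invented for the example) the mask is changed inside a subshell, written with parentheses, so that the new setting does not leak into the rest of the session.

```shell
#!/bin/sh
rm -rf umask_demo
mkdir umask_demo

(
    cd umask_demo || exit 1

    # 027 removes write for the group and everything for others.
    umask 027
    touch plainfile        # 666 AND (NOT 027) = 640
    mkdir directory        # 777 AND (NOT 027) = 750

    # Shows drwxr-x--- for directory and -rw-r----- for plainfile.
    ls -ld plainfile directory | cut -c1-10
)
```

Programs may ask for fewer bits than 666 when they create a file, so the result is an upper limit rather than a guarantee; `touch' and `mkdir' ask for the full 666 and 777.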
  1150.  
  1151. Making programs executable
  1152.  
  1153. A unix program is normally executed by typing its pathname. If the x execute bit is not set on the file, this will generate a `Permission denied' error. This protects the system from interpreting nonsense files as programs. To make a program executable for someone, you must therefore ensure that they can execute the file, using a command like
  1154.  
  1155. chmod u+x filename
  1156.  
  1157. This command would set execute permissions for the owner of the file;
  1158.  
  1159. chmod ug+x filename
  1160.  
would set execute permissions for the owner and for any users in the same group as the file. Note that script programs must also be readable in order to be executable, since the shell has to interpret them by reading them.
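
The whole sequence can be tried out on a throw-away script. This is a sketch for `sh'; the directory `exec_demo' and the script name `hello' are invented for the example, and the exact wording of the error message depends on the shell.

```shell
#!/bin/sh
rm -rf exec_demo
mkdir exec_demo

# Write a two-line script and make sure it has no x bit.
printf '#!/bin/sh\necho hello from the script\n' > exec_demo/hello
chmod 644 exec_demo/hello

# Running it by pathname is refused with a 'Permission denied' error ...
./exec_demo/hello || echo "could not run it yet"

# ... until the owner is given execute permission.
chmod u+x exec_demo/hello
./exec_demo/hello
```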
  1162. chown and chgrp
  1163.  
  1164. These two commands change the ownership and the group ownership of a file. Only the superuser can change the ownership of a file on most systems. This is to prevent users from being able to defeat quota mechanisms. (On some systems, which do not implement quotas, ordinary users can give a file away to another user but not get it back again.) The same applies to group ownership.
  1165. Making a group
  1166.  
  1167. Normally users other than root cannot define their own groups. This is a weakness in Unix from older times which no one seems to be in a hurry to change. At Oslo College, Computer Science, we use a local solution whereby users can edit a file to create their own groups. This file is called `/iu/nexus/local/iu/etc/iu-group'. The format of the group file is:
  1168.  
  1169. group-name::group-number:comma-separated-list-of-users
  1170.  
  1171. s-bit and t-bit (sticky bit)
  1172.  
  1173. The s and t bits have special uses. They are described as follows.
  1174.  
  1175. Octal     Text       Name
  1176.  
  1177. 4000      chmod u+s  Setuid bit
  1178. 2000      chmod g+s  Setgid bit
  1179. 1000      chmod +t   Sticky bit
  1180.  
The effect of these bits differs for plain files and directories, and also differs between versions of UNIX. You should check the manual page man sticky to find out about your system! The following is common behaviour.
  1182.  
For executable files, the setuid bit tells UNIX that regardless of who runs the program it should be executed with the permissions and rights of the owner of the file. This is often used to allow normal users limited access to root privileges. A setuid-root program is executed as root for any user. The setgid bit sets the group execution rights of the program in a similar way.
  1184.  
  1185. In BSD unix, if the setgid bit is set on a directory then any new files created in that directory assume the group ownership of the parent directory and not the logingroup of the user who created the file. This is standard policy under system 5.
  1186.  
A directory for which the sticky bit is set restricts the deletion of files within it. A file or directory inside a directory with the t-bit set can only be deleted or renamed by its owner or the superuser. This is useful for directories like the mail spool area and /tmp which must be writable to everyone, but should not allow a user to delete another user's files.
  1188.  
  1189. (Ultrix) If an executable file is marked with a sticky bit, it is held in the memory or system swap area. It does not have to be fetched from disk each time it is executed. This saves time for frequently used programs like ls.
  1190.  
  1191. (Solaris 1) If a non-executable file is marked with the sticky bit, it will not be held in the disk page cache -- that is, it is never copied from the disk and held in RAM but is written to directly. This is used to prevent certain files from using up valuable memory.
  1192.  
  1193. On some systems (e.g. ULTRIX), only the superuser can set the sticky bit. On others (e.g. SunOS) any user can create a sticky directory.
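
In a listing from ls the special bits appear in place of the x flags: `s' in the owner or group field, `t' in the last field. The following sketch for `sh' shows this on a system where ordinary users may set these bits on their own files; the directory `sbit_demo' and the names inside it are invented for the example.

```shell
#!/bin/sh
rm -rf sbit_demo
mkdir sbit_demo

# A world-writable directory with the sticky bit, in the style of /tmp.
mkdir sbit_demo/shared
chmod 1777 sbit_demo/shared
ls -ld sbit_demo/shared | cut -c1-10     # drwxrwxrwt

# A file with the setuid bit on top of mode 755.
touch sbit_demo/program
chmod 4755 sbit_demo/program
ls -l sbit_demo/program | cut -c1-10     # -rwsr-xr-x
```

The file `program' is empty and only serves to show the notation; a real setuid program deserves considerable care, since it runs with its owner's rights for anyone who starts it.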
  1194. C shell
  1195.  
  1196. The C shell is the command interpreter which you use to run programs and utilities. It contains a simple programming language for writing tailor-made commands, and allows you to join together unix commands with pipes. It is a configurable environment, and once you know it well, it is the most efficient way of working with unix.
  1197. .cshrc and .login files
  1198.  
  1199. Most users run the C-shell `/bin/csh' as their login environment, or these days, preferably the `tcsh' which is an improved version of csh. When a user logs in to a UNIX system the C-shell starts by reading some files which configure the environment by defining variables like path.
  1200.  
  1201.    The file `.cshrc' is searched for in your home directory. i.e. `~/.cshrc'. If it is found, its contents are interpreted by the C-shell as C-shell instructions, before giving you the command prompt(3).
  1202.    If and only if this is the login shell (not a sub-shell that you have started after login) then the file `~/.login' is searched for and executed.
  1203.  
  1204. With the advent of the X11 windowing system, this has changed slightly. Since the window system takes over the entire login procedure, users never get to run `login shells', since the login shell is used up by the X11 system. On an X-terminal or host running X the `.login' file normally has no effect.
  1205.  
  1206. With some thought, the `.login' file can be eliminated entirely, and we can put everything into the .cshrc file. Here is a very simple example `.cshrc' file.
  1207.  
  1208. #
  1209. # .cshrc - read in by every csh that starts.
  1210. #
  1211.  
  1212. # Set the default file creation mask
  1213. umask 077
  1214.  
  1215. # Set the path
  1216. set path=( /usr/local/bin /usr/bin/X11 /usr/ucb /bin /usr/bin . )
  1217.  
  1218. # Exit here if the shell is not interactive
  1219. if ( $?prompt == 0 ) exit
  1220.  
  1221. # Set some variables
  1222.  
  1223. set noclobber notify filec nobeep
  1224. set history=100
  1225. set prompt="`hostname`%"
  1226. set prompt2 = "%m %h>"    # tcsh, prompt for foreach and while
  1227.  
  1228. setenv PRINTER myprinter
  1229. setenv LD_LIBRARY_PATH /usr/lib:/usr/local/lib:/usr/openwin/lib
  1230.  
  1231. # Aliases are shortcuts to unix commands
  1232.  
  1233. alias passwd  yppasswd
  1234. alias dir     'ls -lg \!* | more'
  1235. alias sys     'ps aux | more'
  1236. alias h       history
  1237.  
  1238. It is possible to make a much more complicated .cshrc file than this. The advent of distributed computing and NFS (Network file system) means that you might log into many different machines running different versions of unix. The command path would have to be set differently for each type of machine.
  1239. Defining variables with set, setenv
  1240.  
  1241. We have already seen in the examples above how to define variables in C-shell. Let's formalize this. To define a local variable -- that is, one which will not get passed on to programs and sub-shells running under the current shell, we write
  1242.  
  1243. set local = "some string"
  1244. set myname = "`whoami`"
  1245.  
These variables are then referred to by using the dollar `$' symbol, i.e. the value of the variable `local' is `$local'.
  1247.  
  1248. echo $local $myname
  1249.  
Global variables, that is, variables which all sub-shells inherit from the current shell, are defined using `setenv'
  1251.  
  1252. setenv GLOBAL "Some other string"
  1253. setenv MYNAME "`who am i`"
  1254.  
  1255. Their values are also referred to using the `$' symbol. Notice that set uses an `=' sign while `setenv' does not.
  1256.  
Variables can also be created without a value. The shell uses this method to switch on and off certain features, using variables like `noclobber' and `noglob'. For instance
  1258.  
  1259. nexus% set flag
  1260. nexus% if ($?flag) echo 'Flag is set!'
  1261. Flag is set!
  1262. nexus% unset flag
  1263. nexus% if ( $?flag ) echo 'Flag is set!'
  1264. nexus%
  1265.  
  1266. The operator `$?variable' is `true' if variable exists and `false' if it does not. It does not matter whether the variable holds any information.
  1267.  
  1268. The commands `unset' and `unsetenv' can be used to undefine or delete variables when you don't want them anymore.
  1269. Arrays
  1270.  
  1271. A useful facility in the C-shell is the ability to make arrays out of strings and other variables. The round parentheses `(..)' do this. For example, look at the following commands.
  1272.  
  1273. nexus% set array = ( a b c d )
  1274. nexus% echo $array[1]
  1275. a
  1276. nexus% echo $array[2]
  1277. b
  1278. nexus% echo $array[$#array]
  1279. d
  1280.  
  1281. nexus% set noarray = ( "a b c d" )
  1282. nexus% echo $noarray[1]
  1283. a b c d
  1284. nexus% echo $noarray[$#noarray]
  1285. a b c d
  1286.  
  1287. The first command defines an array containing the elements `a b c d'. The elements of the array are referred to using square brackets `[..]' and the first element is `$array[1]'. The last element is `$array[4]'. NOTE: this is not the same as in C or C++ where the first element of the array is the zeroth element!
  1288.  
  1289. The special operator `$#' returns the number of elements in an array. This gives us a simple way of finding the end of the array. For example
  1290.  
  1291. nexus% echo $#path
  1292. 23
  1293.  
  1294. nexus% echo "The last element in path is $path[$#path]"
  1295. The last element in path is .
  1296.  
To find the next-to-last element we need to be able to do arithmetic. We'll come back to this later.
  1298. Pipes and redirection in csh
  1299.  
  1300. The symbols
  1301.  
  1302.              <  >  >>  <<  |  &
  1303.  
  1304. have a special meaning in the shell. By default, most commands take their input from the file `stdin' (the keyboard) and write their output to the file `stdout' and their error messages to the file `stderr' (normally, both of these output files are defined to be the current terminal device `/dev/tty', or `/dev/console').
  1305.  
  1306. `stdin', `stdout' and `stderr', known collectively as `stdio', can be redefined or redirected so that information is taken from or sent to a different file. The output direction can be changed with the symbol `>'. For example,
  1307.  
  1308. echo testing > myfile
  1309.  
  1310. produces a file called `myfile' which contains the string `testing'. The single `>' (greater than) sign always creates a new file, whereas the double `>>' appends to the end of a file, if it already exists. So the first of the commands
  1311.  
  1312. echo blah blah >> myfile
  1313. echo Newfile > myfile
  1314.  
  1315. adds a second line to `myfile' after `testing', whereas the second command writes over `myfile' and ends up with just one line `Newfile'.
  1316.  
  1317. Now suppose we mistype a command
  1318.  
  1319. ehco test > myfile
  1320.  
  1321. The command `ehco' does not exist and so the error message `ehco: Command not found' appears on the terminal. This error message was sent to stderr -- so even though we redirected output to a file, the error message appeared on the screen to tell us that an error occurred. Even this can be changed. `stderr' can also be redirected by adding an ampersand `&' character to the `>' symbol. The command
  1322.  
  1323. ehco test >& myfile
  1324.  
  1325. results in the file `myfile' being created, containing the error message `ehco: Command not found'.
  1326.  
The input direction can be changed using the `<' symbol. For example,
  1328.  
  1329. /bin/mail mark < message
  1330.  
  1331. would send the file `message' to the user `mark' by electronic mail. The mail program takes its input from the file instead of waiting for keyboard input.
  1332.  
  1333. There are some refinements to the redirection symbols. First of all, let us introduce the C-shell variable `noclobber'. If this variable is set with a command like
  1334.  
  1335. set noclobber
  1336.  
  1337. then files will not be overwritten by the `>' command. If one tries to redirect output to an existing file, the following happens.
  1338.  
  1339. unix% set noclobber
  1340. unix% touch blah        # create an empty file blah
  1341. unix% echo test > blah
  1342. blah: File exists.
  1343.  
  1344. If you are nervous about overwriting files, then you can set `noclobber' in your `.cshrc' file. `noclobber' can be overridden using the pling `!' symbol. So
  1345.  
  1346. unix% set noclobber
  1347. unix% touch blah        # create an empty file blah
  1348. unix% echo test >! blah
  1349.  
  1350. writes over the file `blah' even though `noclobber' is set.
  1351.  
  1352. Here are some other combinations of redirection symbols
  1353.  
`>>&'
    Append, including `stderr'
`>>!'
    Append, ignoring `noclobber'
`>>&!'
    Append `stdout', `stderr', ignore `noclobber'
`<<'
    See below.
  1362.  
The last of these reads from the standard input until it finds a line which consists of a given marker word. It then feeds all of this input into the program concerned. For example,
  1364.  
  1365. nexus% mail mark <<quit
  1366. nexus 1> Hello mark
  1367. nexus 2> Nothing much to say...
  1368. nexus 2> so bye
  1369. nexus 2>
  1370. nexus 2> quit
  1371. Sending mail...
  1372. Mail sent!
  1373.  
The mail message contains all the lines up to, but not including, the marker word (here `quit'). This method can also be used to print text verbatim from a file without using multiple echo commands. Inside a script one may write:
  1375.  
  1376. cat << "marker";
  1377.  
  1378.           MENU
  1379.  
  1380.    1) choice 1
  1381.    2) choice 2
  1382.    ...
  1383.  
  1384. marker
  1385.  
  1386. The cat command writes directly to stdout and the input is redirected and taken directly from the script file.
  1387.  
  1388. A very useful construction is the `pipe' facility. Using the `|' symbol one can feed the `stdout' of one program straight into the `stdin' of another program. Similarly with `|&' both `stdout' and `stderr' can be piped into the input of another program. This is very convenient. For instance, look up the following commands in the manual and try them.
  1389.  
  1390. ps aux | more
  1391. echo 'Keep on sharpenin them there knives!' | mail henry
  1392. vmstat 1 | head
  1393. ls -l /etc | tail
  1394.  
Note that when piping both standard output and standard error to another program, the two files do not mix synchronously. Often `stderr' appears first.
  1396. `tee' and `script'
  1397.  
  1398. Occasionally you might want to have a copy of what you see on your terminal sent to a file. `tee' and `script' do this. For instance,
  1399.  
  1400. find / -type l  -print | tee myfile
  1401.  
  1402. sends a copy of the output of `find' to the file `myfile'. `tee' can split the output into as many files as you want:
  1403.  
  1404. command | tee file1 file2 ....
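
A short sketch for `sh' shows that `tee' both writes its copies and passes the data on down the pipe. The directory `tee_demo' and the file names are invented for the example.

```shell
#!/bin/sh
rm -rf tee_demo
mkdir tee_demo

# The listing is copied into both files and is still passed
# down the pipe, where wc counts its lines.
ls /etc | tee tee_demo/copy1 tee_demo/copy2 | wc -l

# The two copies are identical, so cmp finds no difference.
cmp tee_demo/copy1 tee_demo/copy2 && echo "copies match"
```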
  1405.  
You can also choose to record the output of an entire shell session using the `script' command.
  1407.  
  1408. nexus% script mysession
  1409. Script started, file is mysession
  1410.  
  1411. nexus% echo Big brother is scripting you
  1412. Big brother is scripting you
  1413.  
  1414. nexus% exit
  1415. exit
  1416. Script done, file is mysession
  1417.  
  1418. The file `mysession' is a text file which contains a transcript of the session.
  1419. Command history
  1420.  
  1421. The history feature in C-shell means that you do not have to type commands over and over again. In the `tcsh' version of the C shell, and the `bash' version of the Bourne shell, you can use the UP ARROW key to browse back through the list of commands you have typed previously.
  1422.  
  1423. In the normal C-shell (`csh') there are three main commands.
  1424.  
  1425. `!!'
  1426.     Execute the last command again.
  1427. `!-3'
  1428.    Execute the third last command again.
  1429. `!4'
  1430.     Execute command number 4.
  1431.  
  1432. The first of these simply repeats the last command. The second counts backwards from the last command to three commands-ago. The final command gives an absolute number. The absolute command number can be seen by typing `history'.
  1433. Command/filename completion
  1434.  
In the `tcsh' extension of the C-shell, you can save hours' worth of typing errors by using the completion mechanism. This feature is based on the TAB key.
  1436.  
  1437. The idea is that if you type half a filename and press TAB, the shell will try to guess the remainder of the filename. It does this by looking at the files which match what you have already typed and trying to fill in the rest. If there are several files which match, the shell sounds the "bell" or beeps. You can then type CTRL-D to obtain a list of the possible alternatives. Here is an example: suppose you have just a single file in the current directory called `very_long_filename', typing
  1438.  
  1439. more TAB
  1440.  
  1441. results in the following appearing on the command line
  1442.  
  1443. more very_long_filename
  1444.  
  1445. The shell was able to identify a unique file. Now suppose that you have two files called `very_long_filename' and `very_big_filename', typing
  1446.  
  1447. more TAB
  1448.  
  1449. results in the following appearing on the command line
  1450.  
  1451. more very_
  1452.  
and the shell beeps, indicating that the choice was not unique and a decision is required. Next, you type CTRL-D to see which files you have to choose from and the shell lists them and returns you to the command line, exactly where you were. You now choose `very_long_filename' by typing `l'. This is enough to uniquely identify the file. Pressing the TAB key again results in
  1454.  
  1455. more very_long_filename
  1456.  
  1457. on the screen. As long as you have written enough to select a file uniquely, the shell will be able to complete the name for you.
  1458.  
Completion also works on shell commands, but it is a little slower since the shell must search through all the directories in the command path to complete commands.
  1460. Single and double quotes
  1461.  
  1462. Two kinds of quotes can be used in shell apart from the backward quotes we mentioned above. The essential difference between them is that certain shell commands work inside double quotes but not inside single quotes. For example
  1463.  
  1464.  
  1465. nexus% echo /etc/rc.*
  1466. /etc/rc.boot /etc/rc.ip /etc/rc.local
  1467.  
  1468. nexus% echo "/etc/rc.*"
  1469. /etc/rc.*
  1470.  
  1471. nexus% echo "`who am i`  -- my name is $user ???"
  1472. nexus!mark     ttyp7   Jul 13 10:16  -- my name is mark ???
  1473.  
  1474. nexus% echo '`who am i`  -- my name is $user ???'
  1475. `who am i`  -- my name is $user ???
  1476.  
  1477. We see that the single quotes prevent variable substitution and sub-shells. Wildcards do not work inside either single or double quotes.
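
The same quoting rules hold in the Bourne shell. The sketch below is written in sh syntax so that it runs under any POSIX shell. The variable `name' is a stand-in for the C-shell variable `user' and is an assumption of the example.

```shell
name=mark                     # stand-in for the C-shell variable $user

double="my name is $name"     # variable is substituted
single='my name is $name'     # taken literally
sub="today: `echo hello`"     # back-quotes still run inside double quotes
star="/etc/rc.*"              # the wildcard is not expanded inside quotes

echo "$double"
echo "$single"
echo "$sub"
echo "$star"
```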
  1478. Job control, break key, `fg', `bg'
  1479.  
  1480. So far we haven't mentioned UNIX's ability to multitask. In the Bourne shell (`sh') there are no facilities for controlling several user processes (4). C-shell provides some commands for starting and stopping processes. These originate from the days before windows and X11, so some of them may seem a little old-fashioned. They are still very useful nonetheless.
  1481.  
Let's begin by looking at the features which are common to all shells. Most programs are run in the foreground or interactively. That means that they are connected to the standard input and send their output to the standard output. A program can be made to run in the background, if it does not need to use the standard I/O. For example, a program which generates output and sends it to a file could run in the background. In a window environment, programs which create their own windows can also be started as background processes, leaving standard I/O in the shell free.
  1483.  
  1484. Background processes run independently of what you are doing in the foreground.
  1485. Unix Processes and BSD signals
  1486.  
A background process is started using the special character `&' at the end of the command line.
  1488.  
  1489. find / -name '*lib*' -print >& output  &
  1490.  
The final `&' on the end of this line means that the job will be run in the background. Note that this is not confused with the redirection operator `>&' since it must be the last character on the line. The command above looks for any files in the system containing the string `lib' and writes the list of files to a file called `output'. This might be a useful way of searching for missing libraries which you want to include in your environment variable `LD_LIBRARY_PATH'. Searching the entire disk from the root directory `/' could take a long time, so it pays to run this in the background.
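
The `&' works the same way in the Bourne shell. As a hedged sketch in sh syntax, the lines below start a slow job in the background, show that the shell carries on at once, and then use `wait' to block until the job is finished. The one second `sleep' stands in for a long search, and `mktemp -d' is assumed to be available.

```shell
scratch=$(mktemp -d)

# Start a slow command in the background; $! holds its PID.
( sleep 1; echo finished > "$scratch/output" ) &
bgpid=$!

# The shell is free immediately, before the job has written anything.
if [ -f "$scratch/output" ]; then early=yes; else early=no; fi

wait "$bgpid"                  # block until the background job exits
result=$(cat "$scratch/output")
echo "job $bgpid: $result (output present before wait: $early)"

rm -r "$scratch"
```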
  1492.  
  1493. If we want to see what processes are running, we can use the `ps' command. `ps' without any arguments lists all of your processes, i.e. all processes owned by the user name you logged in with. `ps' takes many options, for instance `ps auxg' will list all processes in gruesome detail. (The "g" is for group, not gruesome!) `ps' reads the kernel's process tables directly.
  1494.  
Processes can be stopped and started, or killed once and for all. The `kill' command does this. There are, in fact, two versions of the `kill' command. One of them is built into the C-shell and the other is not. If you use the C-shell then you will never care about the difference. We shall nonetheless mention the special features of the C-shell built-ins below. The kill command takes a number called a signal as an argument and another number called the process identifier or PID for short. `kill' sends signals to processes. Some of these are fatal and some are for information only. The two commands
  1496.  
  1497. kill -15 127
  1498. kill 127
  1499.  
  1500. are identical. They both send signal 15 to PID 127. This is the normal termination signal and it is often enough to stop any process from running.
  1501.  
  1502. Programs can choose to ignore certain signals by trapping signals with a special handler. One signal they cannot ignore is signal 9.
  1503.  
  1504. kill -9  127
  1505.  
  1506. is a sure way of killing PID 127. Even though the process dies, it may not be removed from the kernel's process table if it has a parent (see next section).
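
The difference between signals 15 and 9 can be tried out safely on a process of your own. The following is a sketch in Bourne shell (sh) syntax, not C-shell: a child shell ignores TERM by means of `trap', survives `kill -TERM', and is then removed with `kill -KILL'. Symbolic names are used instead of numbers.

```shell
# A child shell that ignores signal 15 (TERM) and then idles.
sh -c 'trap "" TERM; while :; do sleep 1; done' &
pid=$!
sleep 1                         # give the child time to install its trap

kill -TERM "$pid"               # ignored by the child
sleep 1
# kill -0 sends no signal; it only checks that the process still exists.
if kill -0 "$pid" 2>/dev/null; then after_term=alive; else after_term=dead; fi

kill -KILL "$pid"               # signal 9 cannot be caught or ignored
status=0
wait "$pid" 2>/dev/null || status=$?   # above 128 when killed by a signal
echo "after TERM: $after_term, exit status after KILL: $status"
```

In most Bourne-type shells the status reported for a process killed by signal 9 is 137, i.e. 128 + 9.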
  1507.  
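Here is the complete list of unix signals which the kernel sends to processes in different circumstances. The numbers shown follow the BSD-style numbering; some other systems number several of these signals differently, so the symbolic names are the more portable way to refer to them.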
  1509.  
  1510. 1   "SIGHUP",         /* hangup */
  1511. 2   "SIGINT",         /* interrupt */
  1512. 3   "SIGQUIT",        /* quit */
  1513. 4   "SIGILL",         /* illegal instruction (not reset when caught) */
  1514. 5   "SIGTRAP",        /* trace trap (not reset when caught) */
  1515. 6   "SIGIOT/SIGABRT", /* IOT instruction */
  1516. 7   "SIGEMT",         /* EMT instruction */
  1517. 8   "SIGFPE",         /* floating point exception */
  1518. 9   "SIGKILL",        /* kill (cannot be caught or ignored) */
  1519. 10  "SIGBUS",         /* bus error */
  1520. 11  "SIGSEGV",        /* segmentation violation */
  1521. 12  "SIGSYS",         /* bad argument to system call */
  1522. 13  "SIGPIPE",        /* write on a pipe with no one to read it */
  1523. 14  "SIGALRM",        /* alarm clock */
  1524. 15  "SIGTERM",        /* software termination signal from kill */
  1525. 16  "SIGURG",         /* urgent condition on IO channel */
  1526. 17  "SIGSTOP",        /* sendable stop signal not from tty */
  1527. 18  "SIGTSTP",        /* stop signal from tty */
  1528. 19  "SIGCONT",        /* continue a stopped process */
  1529. 20  "SIGCHLD/SIGCLD", /* to parent on child stop or exit */
  1530. 21  "SIGTTIN",        /* to readers pgrp upon background tty read */
  1531. 22  "SIGTTOU",        /* like TTIN for output if (tp->t_local&LTOSTOP) */
  1532. 23  "SIGIO/SIGPOLL",  /* input/output possible signal */
  1533. 24  "SIGXCPU",        /* exceeded CPU time limit */
  1534. 25  "SIGXFSZ",        /* exceeded file size limit */
  1535. 26  "SIGVTALRM",      /* virtual time alarm */
  1536. 27  "SIGPROF",        /* profiling time alarm */
  1537. 28  "SIGWINCH",       /* window changed */
  1538. 29  "SIGLOST",        /* resource lost (eg, record-lock lost) */
  1539. 30  "SIGUSR1",        /* user defined signal 1 */
  1540. 31  "SIGUSR2"
  1541.  
  1542. We have already mentioned 15 and 9 which are the main signals for users. Signal 1, or `HUP' can be sent to certain programs by the superuser. For instance
  1543.  
  1544. kill -1   <inetd>
  1545. kill -HUP <inetd>
  1546.  
Either command forces `inetd' to reread its configuration file; `<inetd>' stands for the process ID of `inetd'. Sometimes it is useful to suspend a process temporarily and then restart it later.
  1548.  
  1549. kill -18 <PID>       # suspend process <PID>
  1550. kill -19 <PID>       # resume process <PID>
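
Because signal numbers are not the same on every system, it is safer to suspend and resume by name. The sketch below is in Bourne shell (sh) syntax and uses `STOP', the stop signal which cannot be caught (17 in the list above), together with `CONT'. The ticking loop and the scratch file are inventions of the example, and `mktemp -d' is assumed to exist.

```shell
scratch=$(mktemp -d)
: > "$scratch/log"

# A background loop that appends one line per second.
( while :; do echo tick >> "$scratch/log"; sleep 1; done ) &
pid=$!

kill -STOP "$pid"                               # suspend the loop
sleep 1
n1=$(wc -l < "$scratch/log" | tr -d ' ')
sleep 1
n2=$(wc -l < "$scratch/log" | tr -d ' ')        # unchanged while suspended

kill -CONT "$pid"                               # resume the loop
sleep 2
n3=$(wc -l < "$scratch/log" | tr -d ' ')        # the log has grown again

kill "$pid"                                     # tidy up
wait "$pid" 2>/dev/null || true
rm -r "$scratch"
echo "suspended: $n1 -> $n2 lines, after resume: $n3 lines"
```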
  1551.  
  1552. Child Processes and zombies
  1553.  
When you start a process from a shell, regardless of whether it is a background process or a foreground process, the new process becomes a child of the original shell. Remember that the shell is just a unix process itself. Moreover, if one of the children starts a new process then it will be a child of the child (a grandchild?)! Processes therefore form hierarchies. Several children can have a common parent.
  1555.  
  1556. If we kill a parent, then (unless the child has detached itself from the parent) all of its children die too. If a child dies, the parent is not affected. Sometimes when a child is killed, it does not die but becomes "defunct" or a zombie process. This means that the child has a parent which is waiting for it to finish. If the parent has not yet been informed that the child has died, for example because it has been suspended itself, then the dead child is not removed from the kernel's process table. When the parent wakes up and receives the message that the child has terminated, the process entry for the dead child can be removed.
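
In a Bourne shell script, the step in which the parent is informed of a child's death is the `wait' command. The following is only a sketch in sh syntax: the child exits at once, and its exit status stays available until the parent asks for it.

```shell
# The child exits immediately with status 3.
sh -c 'exit 3' &
child=$!

sleep 1                         # the child has been dead for a while by now

# wait hands over the exit status which was kept for the parent.
status=0
wait "$child" || status=$?
echo "child $child exited with status $status"
```

Some shells collect the status from the kernel behind the scenes as soon as the child dies, so the child need not show up as a zombie in `ps' during the `sleep'.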
  1557. C-shell builtins: `jobs', `kill', `fg',`bg', break key
  1558.  
Now let's look at some commands which are built into the C-shell for starting and stopping processes. C-shell refers to user programs as `jobs' rather than processes -- but there is no real difference. The added bonus of the C-shell is that each job has a job number in addition to its PID. The job numbers are simpler and are private for the shell, whereas the PIDs are assigned by the kernel and are often very large numbers which are difficult to remember. When a command is executed in the shell, it is assigned a job number. If you never run any background jobs then there is only ever one job number: 1, since every job exits before the next one starts. However, if you run background tasks, then you can have several jobs "active" at any time. Moreover, by suspending jobs, C-shell allows you to have several interactive programs running on the same terminal -- the `fg' and `bg' commands allow you to move commands from the background to the foreground and vice-versa.
  1560.  
  1561. Take a look at the following shell session.
  1562.  
  1563. nexus% emacs myfile &
  1564. [1] 4990
  1565. nexus%
  1566.  
  1567.   ( other commands ... , edit myfile and close emacs )
  1568.  
  1569. [1]    Exit 70                emacs myfile
  1570.  
  1571. When a background job is done, the shell prints a message at a suitable moment between prompts.
  1572.  
  1573. [1]  Done    emacs myfile
  1574.  
  1575. This tells you that job number 1 finished normally. If the job exits abnormally then the word `Done' may be replaced by some other message. For instance, if you kill the job, it will say
  1576.  
  1577. unix% kill %12
  1578. [12] Terminated     textedit file
  1579.  
  1580. You can list the jobs you have running using the `jobs' command. The output looks something like
  1581.  
  1582. [1]  + Running                textedit c.tex
  1583. [3]    Running                textedit glossary.tex
  1584. [4]    Running                textedit net.tex
  1585. [5]    Running                textedit overview.tex
  1586. [6]    Running                textedit perl.tex
  1587. [7]    Running                textedit shell.tex
  1588. [8]    Running                textedit sysadm.tex
  1589. [9]    Running                textedit unix.tex
  1590. [10]   Running                textedit x11.tex
  1591. [11] - Running                shelltool
  1592. [15]   Suspended              emacs myfile
  1593.  
  1594. To suspend a program which you are running in the foreground you can type CTRL-z (this is like sending a `kill -18' signal from the keyboard). (5) You can suspend any number of programs and then restart them one at a time using `fg' and `bg'. If you want job 5 to be restarted in the foreground, you would type
  1595.  
  1596. fg %5
  1597.  
  1598. When you have had enough of job 5, you can type CTRL-z to suspend it and then type
  1599.  
  1600. fg %6
  1601.  
to activate job 6. Provided a job does not want to send output to `stdout', you can restart any job in the background, using a command like
  1603.  
  1604. bg %4
  1605.  
This method of working was useful before windows were available. Using `fg' and `bg', you can edit several files or work on several programs without having to quit to move from one to another.
  1607.  
See also the related commands for batch processing: `at', `batch', `atq' and `cron'.
  1609.  
NOTE: CTRL-c sends a `kill -2' signal, which sends a standard interrupt message to a program. This is always a safe way to interrupt a shell command.
  1611. Scripts with arguments
  1612.  
  1613. One of the useful features of the shell is that you can use the normal unix commands to make programs called scripts. To make a script, you just create a file containing shell commands you want to execute and make sure that the first line of the file looks like the following example.
  1614.  
  1615. #!/bin/csh -f
  1616. #
  1617. # A simple script: check for user's mail
  1618. #
  1619. #
  1620.  
  1621. set path = ( /bin /usr/ucb )               # Set the local path
  1622.  
  1623. cd /var/spool/mail                         # Change dir
  1624.  
  1625. foreach uid ( * )
  1626.  
  1627.    echo "$uid has mail in the intray! "    # space prevents an error!
  1628.  
  1629. end
  1630.  
  1631. The sequence `#!/bin/csh' means that the following commands are to be fed into `/bin/csh'. The two symbols `#!' must be the very first two characters in the file. The `-f' option means that your `.cshrc' file is not read by the shell when it starts up. The file containing this script must be executable (see `chmod') and must be in the current path, like all other programs.
  1632.  
  1633. Like C programs, C-shell scripts can accept command line arguments. Suppose you want to make a program to say hello to some other users who are logged onto the system.
  1634.  
  1635. say-hello mark sarah mel
  1636.  
  1637. To do this you need to know the names that were typed on the command line. These names are copied into an array in the C-shell called the argument vector, or `argv'. To read these arguments, you just treat `argv' as an array.
  1638.  
  1639. #!/bin/csh -f
  1640. #
  1641. # Say hello
  1642. #
  1643.  
  1644. foreach name ( $argv )
  1645.  
  1646.    echo Saying hello to $name
  1647.    echo "Hello from $user! " | write $name
  1648.  
  1649. end
  1650.  
The elements of the array can be referred to as `argv[1]'..`argv[$#argv]' as usual. They can also be referred to as `$1'..`$3' up to the last acceptable number. This makes C-shell compatible with the Bourne shell as far as arguments are concerned. One extra flourish in this method is that you can also refer to the name of the program itself as `$0'. For example,
  1652.  
  1653. #!/bin/csh -f
  1654.  
  1655. echo This is program $0 running for $user
  1656.  
  1657. `$argv' represents all the arguments. You can also use `$*' from the Bourne shell.
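
For comparison, here is a sketch of the same ideas in Bourne shell (sh) syntax. The `set --' line merely simulates the command line `say-hello mark sarah mel', so that the fragment can be tried without creating a script file. Instead of calling `write', the loop only collects a string.

```shell
# Simulate the command line: say-hello mark sarah mel
set -- mark sarah mel

count=$#                        # number of arguments, like $#argv
first=$1                        # first argument, like $argv[1]
greeted=""
for name in "$@"; do            # like: foreach name ( $argv )
    greeted="$greeted hello-$name"
done
echo "$count arguments, first is $first:$greeted"
```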
  1658. Sub-shells ()
  1659.  
  1660. The C-shell does not allow you to define subroutines or functions, but you can create a local shell, with its own private variables by enclosing commands in parentheses.
  1661.  
  1662. #!/bin/csh
  1663.  
  1664. cd /etc
  1665.  
  1666. ( cd /usr/bin; ls * ) > myfile
  1667.  
  1668. pwd
  1669.  
This program changes the working directory to /etc and then executes a subshell which changes directory to /usr/bin and lists the files there. The output of this private shell is sent to a file `myfile'. At the end we print out the current working directory just to show that the `cd' command in brackets had no effect on the main program.
  1671.  
  1672. Normally both parentheses must be on the same line. If a subshell command line gets too long, so that the brackets are not on the same line, you have to use backslash characters to continue the lines,
  1673.  
  1674.  
  1675. (  command \
  1676.   command \
  1677.   command \
  1678. )
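
Parentheses create a private shell in the Bourne shell too. The sketch below is in sh syntax so that it can be run under any POSIX shell. It checks that neither the `cd' nor a variable assignment made inside the brackets leaks out. The names `var' and `scratch' are made up, and `mktemp -d' is assumed to be available.

```shell
scratch=$(mktemp -d)
start=$(pwd)
var=outer

# Both the cd and the assignment are confined to the parentheses.
( cd /; var=inner; pwd ) > "$scratch/myfile"

inside=$(cat "$scratch/myfile")
after=$(pwd)
rm -r "$scratch"
echo "inside: $inside, afterwards: $after, var is still $var"
```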
  1679.  
  1680. Tests and conditions
  1681.  
  1682. No programming language would be complete without tests and loops. C-shell has two kinds of decision structure: the `if..then..else' and the `switch' structure. These are closely related to their C counterparts. The syntax of these is
  1683.  
  1684. if (condition) command
  1685.  
  1686. if (condition) then
  1687.   command
  1688.   command..
  1689. else
  1690.   command
  1691.   command..
  1692. endif
  1693.  
  1694.  
  1695. switch (string)
  1696.  
  1697.  case one:
  1698.              commands
  1699.              breaksw
  1700.  
  1701.  case two:
  1702.              commands
  1703.              breaksw
  1704.  
  1705.  ...
  1706.  
  1707. endsw
  1708.  
  1709. In the latter case, no commands should appear on the same line as a `case' statement, or they will be ignored. Also, if the `breaksw' commands are omitted, then control flows through all the commands for case 2, case 3 etc, exactly as it does in the C programming language.
  1710.  
  1711. We shall consider some examples of these statements in a moment, but first it is worth listing some important tests which can be used in `if' questions to find out information about files.
  1712.  
  1713. `-r file'
  1714.    True if the file exists and is readable
  1715. `-w file'
  1716.     True if the file exists and is writable
  1717. `-x file'
  1718.    True if the file exists and is executable
  1719. `-e file'
  1720.     True if the file simply exists
  1721. `-z file'
  1722.    True if the file exists and is empty
  1723. `-f file'
  1724.     True if the file is a plain file
  1725. `-d file'
  1726.    True if the file is a directory
  1727.  
We shall also have need of the following comparison operators.
  1729.  
  1730. `=='
  1731.     is equal to (string comparison)
  1732. `!='
  1733.    is not equal to
  1734. `>'
  1735.     is greater than
  1736. `<'
  1737.    is less than
  1738. `>='
  1739.     is greater than or equal to
  1740. `<='
  1741.    is less than or equal to
  1742. `=~'
  1743.     matches a wildcard
  1744. `!~'
  1745.    does not match a wildcard
  1746.  
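Most of the file tests have the same letters in the Bourne shell, where they are written inside `[ ]'. One difference worth knowing is that the C-shell `-z file' (file is empty) has no direct counterpart in sh; there `-s file' is true when the file is not empty. The sketch below uses sh syntax and made-up file names in a scratch directory created with `mktemp -d', which is assumed to exist.

```shell
scratch=$(mktemp -d)
: > "$scratch/empty"                  # zero-length plain file
echo data > "$scratch/full"

report=""
[ -d "$scratch" ]         && report="$report dir"
[ -f "$scratch/full" ]    && report="$report plain"
[ -r "$scratch/full" ]    && report="$report readable"
[ -s "$scratch/full" ]    && report="$report non-empty"
[ ! -s "$scratch/empty" ] && report="$report empty"
[ ! -e "$scratch/none" ]  && report="$report missing"

rm -r "$scratch"
echo "tests that were true:$report"
```
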
  1747. The simplest way to learn about these statements is to use them, so we shall now look at some examples.
  1748.  
  1749. #!/bin/csh -f
  1750. #
  1751. #  Safe copy from <arg[1]> to <arg[2]>
  1752. #
  1753. #
  1754.  
  1755. if ($#argv != 2) then
  1756.  
  1757.  echo "Syntax: copy <from-file> <to-file>"
  1758.  exit 0
  1759.  
  1760. endif
  1761.  
  1762. if ( -f $argv[2] ) then
  1763.  
  1764.   echo "File exists. Copy anyway?"
  1765.  
  1766.   switch ( $< )                     # Get a line from user
  1767.  
  1768.      case y:
  1769.               breaksw
  1770.  
  1771.      default:
  1772.               echo "Doing nothing!"
  1773.               exit 0
  1774.  
  1775.   endsw
  1776.  
  1777. endif
  1778.  
  1779. echo -n "Copying $argv[1] to $argv[2]..."
  1780. cp $argv[1] $argv[2]
echo done
  1784.  
This script tries to copy a file from one location to another. If the user does not type exactly two arguments, the script quits with a message about the correct syntax. Otherwise it tests to see whether a plain file has the same name as the file the user wanted to copy to. If such a file exists, it asks the user if he/she wants to continue before proceeding to copy.
  1786. Switch example: configure script
  1787.  
Here is another example which compiles a software package. This is a problem we shall return to later (see section Make). The problem this script tries to address is the following. There are many different versions of UNIX and they are not exactly compatible with one another. The program this file compiles has to work on any kind of UNIX, so it tries first to determine what kind of UNIX system the script is being run on by calling `uname'. Then it defines a variable `MAKE' which contains the path to the `make' program which will build software. The make program reads a file called `Makefile' which contains instructions for compiling the program, but this file needs to know the type of UNIX, so the script first copies a file `Makefile.src', using `sed' to replace a dummy string with the real name of the UNIX. Then it calls make and sets the correct permission on the file using `chmod'.
  1789.  
  1790. #!/bin/csh -f
  1791. #################################################
  1792. #
  1793. #
  1794. # CONFIGURE Makefile AND BUILD software
  1795. #
  1796. #
  1797. #################################################
  1798.  
  1799. set NAME = ( `uname -r -s` )
  1800.  
  1801. switch ($NAME[1])
  1802.  
  1803.    case SunOS*:
  1804.                   switch ($NAME[2])
  1805.  
  1806.                      case 4*:
  1807.                               setenv TYPE SUN4
  1808.                               setenv MAKE /bin/make
  1809.                               breaksw
  1810.                      case 5*:
  1811.                               setenv TYPE SOLARIS
  1812.                               setenv MAKE /usr/ccs/bin/make
  1813.                               breaksw
  1814.  
  1815.                   endsw
  1816.                   breaksw
  1817.  
  1818.    case ULTRIX*:
  1819.                   setenv TYPE ULTRIX
  1820.                   setenv MAKE /bin/make
  1821.                   breaksw
  1822.    case HP-UX*:
  1823.                   setenv TYPE HPuUX
  1824.                   setenv MAKE /bin/make
  1825.                   breaksw
  1826.    case AIX*:
  1827.                   setenv TYPE AIX
  1828.                   setenv MAKE /bin/make
  1829.                   breaksw
  1830.  
  1831.    case OSF*:
  1832.                   setenv TYPE OSF
  1833.                   setenv MAKE /bin/make
  1834.                   breaksw
  1835.    case IRIX*:
  1836.                   setenv TYPE IRIX
  1837.                   setenv MAKE /bin/make
  1838.                   breaksw
  1839.  
  1840.    default:
  1841.                   echo Unknown architecture $NAME[1]
  1842.  
  1843. endsw
  1844.  
  1845.  # Generate Makefile from source file
  1846.  
  1847. sed s/HOSTTYPE/$TYPE/ Makefile.src > Makefile
  1848.  
  1849. echo "Making software. Type CTRL-C to abort and edit Makefile"
  1850.  
  1851. $MAKE software         # call make to build program
  1852. chmod 755 software     # set correct protection
  1853.  
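The same decision can be sketched in Bourne shell (sh) syntax, where `switch' becomes `case ... esac'. The helper `classify' is hypothetical and covers only a few of the systems above; the `CFLAGS' line is a made-up stand-in for a line of `Makefile.src'.

```shell
# classify is a hypothetical helper: map the output of `uname -s' and
# `uname -r' onto a TYPE string.
classify() {
    case "$1" in
        SunOS*)
            case "$2" in
                4*) echo SUN4 ;;
                5*) echo SOLARIS ;;
                *)  echo unknown ;;
            esac ;;
        ULTRIX*) echo ULTRIX ;;
        AIX*)    echo AIX ;;
        *)       echo unknown ;;
    esac
}

TYPE=$(classify "$(uname -s)" "$(uname -r)")
echo "This system: $TYPE"

# The sed step: replace the dummy string HOSTTYPE by a real type name.
makefile_line=$(echo 'CFLAGS = -DHOSTTYPE' |
                sed "s/HOSTTYPE/$(classify SunOS 5.6)/")
echo "$makefile_line"
```
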
  1854. Loops in csh
  1855.  
  1856. The C-shell has three loop structures: `repeat', `while' and `foreach'. We have already seen some examples of the `foreach' loop.
  1857.  
  1858. The structure of these loops is as follows
  1859.  
  1860. repeat number-of-times command
  1861.  
  1862. while ( test expression )
  1863.  
  1864.    commands
  1865.  
  1866. end
  1867.  
  1868. foreach  control-variable  ( list-or-array )
  1869.  
  1870.    commands
  1871.  
  1872. end
  1873.  
The command `break' can be used to leave a loop at any time, and `continue' skips the rest of the current pass and goes on to the next one. Here are some examples.
  1875.  
  1876. repeat 2 echo "Yo!" | write mark
  1877.  
  1878. This sends the message "Yo!" to mark's terminal twice.
  1879.  
  1880. repeat 5 echo `echo "Shutdown time! Log out now" | wall ; sleep 30` ; halt
  1881.  
  1882. This example repeats the command `echo Shutdown time...' five times at 30 second intervals, before shutting down the system. Only the superuser can run this command! Note the strange construction with `echo echo'. This is to force the repeat command to take two shell commands as an argument. (Try to explain why this works for yourself.)
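
The Bourne shell has no `repeat' command, so a counting loop is written out by hand. The following sketch in sh syntax shows such a loop, and also what `continue' and `break' do inside a loop over a list. It only builds strings, so it is safe to run.

```shell
# Counting loop: the sh counterpart of `repeat 3 command'.
i=0
out=""
while [ "$i" -lt 3 ]; do
    out="${out}Yo! "
    i=$((i + 1))
done

# continue skips one item, break leaves the loop altogether.
seen=""
for word in a b c d e; do
    [ "$word" = b ] && continue
    [ "$word" = d ] && break
    seen="$seen$word"
done
echo "$out/ $seen"
```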
  1883.  
  1884. Input from the user
  1885.  
  1886.  
  1887. # Test a user response
  1888.  
  1889. echo "Answer y/n (yes or no)"
  1890.  
  1891. set valid = false
  1892.  
  1893. while ( $valid == false )
  1894.  
  1895.   switch ( $< )
  1896.  
  1897.      case y:
  1898.              echo "You answered yes"
  1899.              set valid = true
  1900.              breaksw
  1901.  
  1902.      case n:
  1903.              echo "You answered no"
  1904.              set valid = true
  1905.              breaksw
  1906.  
  1907.      default:
             echo "Invalid response, try again"
  1909.              breaksw
  1910.  
  1911.   endsw
  1912.  
  1913. end
  1914.  
  1915. Notice that it would have been simpler to replace the two lines
  1916.  
  1917.  set valid = true
  1918.  breaksw
  1919.  
  1920. by a single line `break'. `breaksw' jumps out of the switch construction, after which the `while' test fails. `break' jumps out of the entire while loop.
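
Here is a sketch of the `break' variant in Bourne shell (sh) syntax, where `read' plays the part of `$<'. The helper `ask' is hypothetical. So that the fragment can be tried without typing, the answers are fed in through a pipe: the first is rejected and the second is accepted.

```shell
# ask is a hypothetical helper: read lines from stdin until y or n is seen.
ask() {
    while read -r reply; do
        case "$reply" in
            y) echo "You answered yes"; break ;;
            n) echo "You answered no";  break ;;
            *) echo "Invalid response, try again" ;;
        esac
    done
}

# Canned answers: "maybe" is rejected, "y" ends the loop, "n" is never read.
transcript=$(printf 'maybe\ny\nn\n' | ask)
echo "$transcript"
```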
  1921.  
  1922. Extracting parts of a pathname
  1923.  
  1924. A path name consists of a number of different parts:
  1925.  
  1926.    The path to the directory where a file is held.
  1927.    The name of the file itself.
  1928.    The file extension (after a dot).
  1929.  
  1930. By using one of the following modifiers, we can extract these different elements.
  1931.  
  1932. `:h'
  1933.     The path to the file
  1934. `:t'
  1935.    The filename itself
  1936. `:e'
  1937.     The file extension
  1938. `:r'
  1939.    The complete file-path minus the file extension
  1940.  
  1941. Here are some examples and the results:
  1942.  
set f = ~/progs/c++/test.C

echo $f:h
/home/mark/progs/c++

echo $f:t
test.C

echo $f:e
C

echo $f:r
/home/mark/progs/c++/test
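
The Bourne shell has no `:h' style modifiers. As a sketch in sh syntax, the same four pieces can be obtained with `dirname', `basename' and the POSIX parameter expansions `${f##*.}' and `${f%.*}'. The path is written out in full here, since it serves only as an example.

```shell
f=/home/mark/progs/c++/test.C

dir=$(dirname "$f")       # like $f:h
base=$(basename "$f")     # like $f:t
ext=${f##*.}              # like $f:e  (strip everything up to the last dot)
root=${f%.*}              # like $f:r  (strip the last dot and what follows)

echo "$dir"
echo "$base"
echo "$ext"
echo "$root"
```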
  1959.  
  1960. Arithmetic
  1961.  
  1962. Before using these features in a real script, we need one more possibility: numerical addition, subtraction and multiplication etc.
  1963.  
  1964. To tell the C-shell that you want to perform an operation on numbers rather than strings, you use the `@' symbol followed by a space. Then the following operations are possible.
  1965.  
  1966. @ var = 45                      # Assign a numerical value to var
  1967. echo $var                       # Print the value
  1968.  
  1969. @ var = $var + 34               # Add 34 to var
  1970. @ var += 34                     # Add 34 to var
  1971.  
  1972. @ var -= 1                      # subtract 1 from var
  1973. @ var *= 5                      # Multiply var by 5
  1974.  
  1975. @ var /= 3                      # Divide var by 3 (integer division)
  1976. @ var %= 3                      # Remainder after dividing var by 3
  1977.  
  1978. @ var++                         # Increment var by 1
  1979. @ var--                         # Decrement var by 1
  1980.  
  1981. @ array[1] = 5                  # Numerical array
  1982.  
  1983. @ logic = ( $x > 6 && $x < 10)  # AND
  1984. @ logic = ( $x > 6 || $x < 10)  # OR
  1985. @ false = ! $var                # Logical NOT
  1986.  
  1987. @ bits = ( $x | $y )            # Bitwise OR
  1988. @ bits = ( $x ^ $y )            # Bitwise XOR
  1989. @ bits = ( $x & $y )            # Bitwise AND
  1990.  
  1991. @ shifted = ( $var >> 2 )       # Bitwise shift right
  1992. @ back    = ( $var << 2 )       # Bitwise shift left
  1993.  
  1994. These operators are precisely those found in the C programming language.
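
For comparison, POSIX Bourne-type shells write arithmetic as `$(( ... ))' instead of using `@'. The sketch below, in sh syntax, follows a few of the operations above with concrete numbers so that the results can be checked by hand.

```shell
var=45
var=$((var + 34))              # 79
var=$((var - 1))               # 78
var=$((var * 5))               # 390
quot=$((var / 3))              # 130, integer division
rem=$((var % 7))               # 5, since 390 = 55*7 + 5

x=8
logic=$(( x > 6 && x < 10 ))   # 1 means true
bits=$(( 6 | 3 ))              # bitwise OR: 7
shifted=$(( 8 >> 2 ))          # shift right by two bits: 2

echo "$var $quot $rem $logic $bits $shifted"
```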
  1995.  
  1996. Examples
  1997.  
  1998. The following script uses the operators in the last two sections to take a list of files with a given file extension (say `.doc') and change it for another (say `.tex'). This is a partial solution to the limitation of not being able to do multiple renames in shell.
  1999.  
  2000. #!/bin/csh -f
  2001. #############################################################
  2002. #
  2003. # Change file extension for multiple files
  2004. #
  2005. #############################################################
  2006.  
  2007. if ($#argv < 2) then
  2008.   echo Syntax: chext oldpattern newextension
  2009.   echo "e.g: chext *.doc tex "
  2010.   exit 0
  2011. endif
  2012.  
  2013. mkdir /tmp/chext.$user                 # Make a scratch area
  2014.  
  2015. set newext="$argv[$#argv]"             # Last arg is new ext
  2016. set oldext="$argv[1]:e"
  2017.  
echo "Old extension was ($oldext)"
  2019. echo "New extension ($newext) -- okay? (y/n)"
  2020.  
  2021. switch( $< )
  2022.  
  2023.   case y:
  2024.           breaksw
  2025.   default:
  2026.           echo "Nothing done."
  2027.           exit 0
  2028. endsw
  2029.  
  2030. ##############################################################
  2031. # Remove the last file extension from files
  2032. ##############################################################
  2033.  
@ i = 0

foreach file ($argv)

 @ i++
  2039.  if ( $i == $#argv ) break
  2040.  cp $file /tmp/chext.$user/$file:r        # temporary store
  2041.  
  2042. end
  2043.  
  2044. ###############################################################
  2045. # Add .newext file extension to files
  2046. ###############################################################
  2047.  
  2048. set array = (`ls /tmp/chext.$user`)
  2049.  
  2050. foreach file ($array)
  2051.  
  2052.  if ( -f $file.$newext ) then
  2053.    echo  destination file $file.$newext exists. No action taken.
  2054.    continue
  2055.  endif
  2056.  
  2057.  cp /tmp/chext.$user/$file $file.$newext
  2058.  rm $file.$oldext
  2059.  
  2060. end
  2061.  
  2062. rm -r /tmp/chext.$user
  2063.  
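The core of the renaming step can be sketched more compactly in Bourne shell (sh) syntax by using the `${file%.ext}' expansion and `mv', with no scratch copies. This is an illustration under stated assumptions rather than a replacement for the script above: the file names are invented, and they live in a temporary directory made with `mktemp -d', which is assumed to be available.

```shell
scratch=$(mktemp -d)
: > "$scratch/a.doc"
: > "$scratch/b.doc"
: > "$scratch/b.tex"            # a destination that already exists

oldext=doc
newext=tex
for file in "$scratch"/*."$oldext"; do
    target=${file%.$oldext}.$newext
    if [ -e "$target" ]; then
        echo "destination file $target exists. No action taken."
        continue
    fi
    mv "$file" "$target"
done

result=$(ls "$scratch" | tr '\n' ' ')
echo "$result"
rm -r "$scratch"
```
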
  2064. Here is another example to try to decipher. Use the manual pages to find out about `awk'. This script can be written much more easily in Perl or C, as we shall see in the next chapters. It is also trivially implemented as a script in the system administration language cfengine.
  2065.  
  2066. #!/bin/csh -f
  2067. ###########################################################
  2068. #
  2069. # KILL all processes owned by $argv[1] with PID > $argv[2]
  2070. #
  2071. ###########################################################
  2072.  
  2073.  
  2074. if ("`whoami`" != "root") then
  2075.  echo Permission denied
  2076.  exit 0
  2077. endif
  2078.  
  2079. if ( $#argv < 1 || $#argv > 2 ) then
  2080.  echo Usage: KILL username lowest-pid
  2081.  exit 0
  2082. endif
  2083.  
  2084. if ( $argv[1] == "root") then
  2085.  echo No! Too dangerous -- system will crash
  2086.  exit 0
  2087. endif
  2088.  
  2089. ############################################################
  2090. # Kill everything
  2091. ############################################################
  2092.  
  2093. if ( $#argv == 1 ) then
  2094.  
  2095.  set killarray = ( `ps aux |  awk '{ if ($1 == user) \
  2096. {printf "%s ",$2}}' user=$argv[1]` )
  2097.  
  2098.  foreach process ($killarray)
  2099.  
  2100.     kill -1 $process
  2101.     kill -15 $process > /dev/null
  2102.     kill -9 $process > /dev/null
  2103.  
    # kill reports errors on stderr, so pipe stderr as well with |&
    if ("`kill -9 $process |& egrep -e 'No such process'`" == "") then
  2105.        echo "Warning - $process would not die - try again"
  2106.     endif
  2107.  end
  2108.  
  2109. #############################################################
  2110. # Start from a certain PID
  2111. #############################################################
  2112.  
  2113. else if ( $#argv == 2 ) then
  2114.  
 set killarray = ( `ps aux |  awk '{ if ($1 == user && $2 > pid) \
{printf "%s ",$2}}' user=$argv[1] pid=$argv[2]` )
  2117.  
  2118.  foreach process ($killarray)
  2119.  
  2120.     kill -1 $process > /dev/null
  2121.     kill -15 $process
  2122.     sleep 2
  2123.     kill -9 $process > /dev/null
  2124.  
    # kill reports errors on stderr, so pipe stderr as well with |&
    if ("`kill -9 $process |& egrep -e 'No such process'`" == "") then
  2126.        echo "Warning - $process would not die - try again"
  2127.     endif
  2128.  end
  2129.  
  2130. endif
  2131.  
This program would be better written in C or Perl.

Bourne shell
  2134.  
Programmers who are used to C or C++ often find it easier to program in C-shell because there are strong similarities between the two. The Bourne shell is somewhat different in style, but is structured in a way which makes it better suited to more complicated script writing, especially for system administrators. Also it is closer to the kernel's own exec mechanism. The Bourne shell allows subroutines and default values for parameters. Most of the system scripts in UNIX are written in the Bourne shell.
  2136.  
  2137. The principles of the Bourne shell are largely the same as those for the C-shell, so we shall skip fairly quickly through the details. Historically, the Bourne shell came before the C shell.
  2138.  
  2139. .profile
  2140.  
The `.profile' file is the Bourne shell's answer to `.cshrc'. This file is read by `/bin/sh' login shells on starting up. On Sun systems the file `/etc/profile' is also read. On `HPUX' machines, the file `/etc/src.sh' is read.
  2142.  
  2143. Variables and export
  2144.  
  2145. Local and global variables are both defined using the syntax
  2146.  
  2147. VARIABLE="Some string"
  2148. VAR=13
  2149.  
  2150. It is important that there be no space between the variable and the equals sign. By default these variables are local. To make them global (so that child processes will inherit them) we use the command
  2151.  
  2152. export VARIABLE
  2153.  
  2154. This adds the variable to the process environment. It is the analogue of making `environment variables' with setenv in C shell. The command
  2155.  
  2156. set -a
  2157.  
changes the default so that all variables created after the command are global.
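
The difference between a local and a global variable can be seen by asking a child shell to print both. The following is only a minimal sketch: the names LOCALVAR and GLOBALVAR are invented for the illustration and are assumed not to be in the environment already.

```shell
#!/bin/sh
# LOCALVAR and GLOBALVAR are hypothetical names chosen for this sketch.

LOCALVAR="only here"
GLOBALVAR="everywhere"
export GLOBALVAR

# A child shell inherits only the exported variable.
# The single quotes stop the parent from expanding the names itself.
child_local=`sh -c 'echo "$LOCALVAR"'`
child_global=`sh -c 'echo "$GLOBALVAR"'`

echo "Child sees LOCALVAR=[$child_local] GLOBALVAR=[$child_global]"
```

The child prints an empty string for the local variable, because that variable was never placed in the process environment.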
  2159.  
  2160. Arrays or lists are often simulated in shell by sandwiching the colon `:' symbol between items
  2161.  
  2162. PATH=/bin:/usr/bin:/etc:/local/bin:.
  2163.  
LD_LIBRARY_PATH=/usr/lib:/usr/openwin/lib:/local/lib

but there is no real facility for arrays in the Bourne shell. Note that the UNIX `cut' command can be used to extract the elements of the list. Loops can also read such lists directly (see section Loops in sh). A Perl script can also be used.
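
As a small sketch of the `cut' approach mentioned above, the following picks single elements out of a colon-separated list. The variable LIST and its contents are made up for the example.

```shell
#!/bin/sh
# LIST is a hypothetical variable holding a colon-separated list.
LIST=/bin:/usr/bin:/etc

# cut -d sets the separator and -f picks the field, counting from 1.
first=`echo "$LIST" | cut -d: -f1`
third=`echo "$LIST" | cut -d: -f3`

echo "First element: $first"
echo "Third element: $third"
```

This is adequate for picking out one or two items. For stepping through every item, a loop is usually more convenient.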
  2167.  
  2168. The value of a variable is given by the dollar symbol as in C-shell. It is also possible to use curly braces around the variable name to `protect' the variable from interfering text. For example:
  2169.  
  2170. $ animal=worm  
  2171. $ echo book$animal
  2172. bookworm
  2173. $ thing=book
  2174. $ echo $thingworm
  2175.                       (nothing..)
  2176. $ echo ${thing}worm
  2177. bookworm
  2178.  
  2179. Default values can be given to variables in the Bourne shell. The following commands illustrate this.
  2180.  
  2181. echo ${var-"No value set"}
  2182. echo ${var="Octopus"}
  2183. echo ${var+"Forced value"}
  2184. echo ${var?"No such variable"}
  2185.  
The first of these prints out the contents of `$var', if it is defined. If it is not defined, the string "No value set" is substituted instead. The value of `var' is not changed by this operation. It is only for convenience.

The second command has the same effect as the first, but here the value of `$var' is actually changed to "Octopus" if `$var' is not set.

The third version is slightly peculiar. If `$var' is already set, the string "Forced value" is substituted in place of its value; otherwise nothing is substituted. In neither case is the value of `var' itself changed.

Finally the last instance issues the error message "No such variable" if `$var' is not defined, and a non-interactive shell then exits.
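
The four forms can be compared side by side by storing each result in a variable. This is a sketch only: `var' is deliberately unset at the start, and the names a to e are arbitrary.

```shell
#!/bin/sh
# var starts out unset for the purposes of this sketch.
unset var

a=${var-"No value set"}      # substitutes the default, var stays unset
b=${var+"Forced value"}      # var is unset, so this expands to nothing
c=${var="Octopus"}           # substitutes and assigns: var is now Octopus
d=${var+"Forced value"}      # var is set now, so the alternative is used
e=${var?"No such variable"}  # var is set, so this is simply its value

echo "a=[$a] b=[$b] c=[$c] d=[$d] e=[$e] var=[$var]"
```

Note that the order matters here: once the `=' form has run, `var' is set and the later forms behave accordingly.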
  2193.  
  2194. Stdin, stdout and stderr
  2195.  
  2196. In the Bourne shell, the standard input/output files are referred to by numbers rather than by names.
  2197.  
  2198. stdin
  2199.    File number 0
  2200. stdout
  2201.    File number 1
  2202. stderr
  2203.    File number 2
  2204.  
  2205. The default routes for these files can be changed by redirection. The redirection commands are more complicated than in C-shell, but they are also more flexible. Here is a comparison.
  2206.  
  2207. sh                           csh               Description
  2208.  
  2209. command > file            command > file      Stdout to file
  2210. command 1> file           command > file      Stdout to file
  2211. command 2> errs            (No analogue)      Stderr only to file errs
  2212. command 1> file 2>&1      command >& file     stdout and stderr to file
  2213. command 1> file 2> errs    (No analogue)      stdout to file, stderr to errs
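
The last two rows of the table can be tried out with a command which writes to both streams. In this sketch the function `both' and the file names are invented, and the `mktemp' command is assumed to be available for making a scratch directory.

```shell
#!/bin/sh
# The file names are made up; mktemp -d is assumed to be available.
tmpdir=`mktemp -d` || exit 1

# A hypothetical command which writes one line to each stream.
both()
{
  echo "to stdout"
  echo "to stderr" 1>&2
}

both 1> "$tmpdir/out" 2> "$tmpdir/errs"   # stdout and stderr in separate files
both 1> "$tmpdir/all" 2>&1                # both streams in the same file

out=`cat "$tmpdir/out"`
errs=`cat "$tmpdir/errs"`
lines=`wc -l < "$tmpdir/all" | tr -d ' '`

echo "out=[$out] errs=[$errs] lines in all: $lines"
rm -r "$tmpdir"
```

The order of the redirections in the second call is significant: stdout is pointed at the file first, and stderr is then made a copy of it.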
  2214.  
  2215. Arithmetic in sh
  2216.  
  2217. Arithmetic is performed entirely `by proxy'. There are no internal arithmetic operators as in the C-shell. To evaluate an expression we call the `expr' command or the `bc' precision calculator. Here are some examples of `expr'
  2218.  
a=`expr $a + 1`               # increment a
a=`expr 4 + 10 \* 5`          # 4+10*5
check=`expr $a \> $b`         # true=1, false=0. True if $a > $b
  2222.  
  2223. `expr' is very sensitive to spaces and backslash characters.
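
A short sketch shows why the spaces and backslashes matter. Every operator and operand must reach `expr' as a separate argument, and characters which the shell would otherwise interpret (`*' and `>') must be escaped. The variable names are arbitrary.

```shell
#!/bin/sh
a=4
b=10

a=`expr $a + 1`              # spaces around + are required: a becomes 5
product=`expr $a \* $b`      # \* stops the shell expanding * as a wildcard
check=`expr $b \> $a`        # \> stops the shell treating > as a redirection

echo "a=$a product=$product check=$check"
```

Without the spaces, `expr' receives a single argument such as `4+1' and simply prints it back unchanged.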
  2224.  
  2225. Scripts and arguments
  2226.  
  2227. Scripts are created by making an executable file which begins with the sequence of characters
  2228.  
  2229. #!/bin/sh
  2230.  
  2231. Although we didn't discuss it before, this construction is quite general: any executable file which begins with a sequence
  2232.  
  2233. #!myprogram -option
  2234.  
  2235. will cause the shell to attempt to execute
  2236.  
myprogram -option filename
  2238.  
  2239. where filename is the name of the file.
  2240.  
If a script is to accept arguments then these can be referred to as `$1 $2 $3..$9'. There is a logical limit of nine arguments to a script, but in practice it is possible to get around this limitation. `$0' is the name of the script itself.
  2242.  
  2243. Here is a simple script in the Bourne shell which prints out all its arguments.
  2244.  
  2245.  
  2246. #!/bin/sh
  2247. #
  2248. # Print all arguments (version 1)
  2249. #
  2250.  
  2251. for arg in $*
  2252. do
  2253.  echo Argument $arg
  2254. done
  2255.  
  2256. echo Total number of arguments was $#
  2257.  
  2258. The `$*' symbol stands for the entire list of arguments (like `$argv' in C-shell) and `$#' is the total number of arguments (like `$#argv' in C-shell).
  2259.  
  2260. Another way of achieving the same is to use the `shift' command. We shall meet this again in the Perl programming language. `shift' takes the first argument from the argument list and deletes it, moving all of the other arguments down one number -- this is how we can handle long lists of arguments in `sh'.
  2261.  
  2262. #!/bin/sh
  2263. #
  2264. #  Print all arguments (version 2)
  2265. #
  2266.  
while [ $# -gt 0 ]           # stop when no arguments are left
do
 arg=$1
 shift
 echo "$arg was an argument"
done
  2276.  
  2277. Return codes
  2278.  
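All programs which execute in UNIX return a value to the process which started them, through the C `return' statement in `main' or a call to `exit'. There is a convention that a return value of zero (0) means that everything went well, whereas any other value implies that some error occurred. Some programs return the value of `errno', the external error variable in C, but this is only a convention: the meaning of each non-zero value is defined by the program and is normally documented in its manual page.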
  2280.  
  2281. Shell scripts can test for these values either by placing the command directly inside an `if' test, or by testing the variable `$?' which is always set to the return code of the last command. Some examples are given following the next two sections.
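
As a first taste, here is a minimal sketch of both methods. The path /nonexistent/file is assumed not to exist, and `sh -c "exit 3"' merely stands in for any command which fails with return code 3.

```shell
#!/bin/sh
# The path /nonexistent/file is assumed not to exist on the system.

# Method 1: place the command directly inside the if test.
if ls /nonexistent/file > /dev/null 2>&1
then
  listing="found"
else
  listing="missing"
fi

# Method 2: look at $? straight after the command has run.
true
after_true=$?

# In a || b, the variable $? inside b still holds the return code of a.
sh -c 'exit 3' || after_fail=$?

echo "listing=$listing after_true=$after_true after_fail=$after_fail"
```

Remember that `$?' is overwritten by every command, so it has to be saved at once if it is needed later.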
  2282.  
  2283. Tests and conditionals
  2284.  
  2285. The Bourne shell has the usual array of tests. They are written as follows. Notice that `test' is itself not a part of the shell, but is a program which works out conditions and provides a return code. See the manual page on `test' for more details.
  2286.  
  2287. test -f file
  2288.    True if the file is a plain file
  2289. test -d file
  2290.    True if the file is a directory
  2291. test -r file
  2292.    True if the file is readable
  2293. test -w file
  2294.    True if the file is writable
  2295. test -x file
  2296.    True if the file is executable
  2297. test -s file
  2298.    True if the file contains something
  2299. test -g file
  2300.    True if setgid bit is set
  2301. test -u file
  2302.    True if setuid bit is set
  2303. test s1 = s2
  2304.    True if strings s1 and s2 are equal
  2305. test s1 != s2
  2306.    True if strings s1 and s2 are unequal
  2307. test x -eq y
  2308.    True if the integers x and y are numerically equal
  2309. test x -ne y
  2310.    True if integers are not equal
  2311. test x -gt y
  2312.    True if x is greater than y
  2313. test x -lt y
  2314.    True if x is less than y
  2315. test x -ge y
  2316.    True if x>=y
  2317. test x -le y
  2318.    True if x <= y
  2319. !
  2320.    Logical NOT operator
  2321. -a
  2322.    Logical AND
  2323. -o
  2324.    Logical OR
  2325.  
Note that an alternative syntax for writing these commands is to use square brackets instead of writing the word test.

[ $x -lt $y ]   is equivalent to   test $x -lt $y
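
The following sketch exercises both spellings, together with the `!' and `-a' operators from the list above. The variable names are arbitrary; the only assumption is that `/' is a directory, which holds on any UNIX system.

```shell
#!/bin/sh
x=3
y=7

# The two spellings carry out the same test.
if test $x -lt $y
then
  word_form="x is smaller"
fi

if [ $x -lt $y ]
then
  bracket_form="x is smaller"
fi

# ! negates a test; -a joins two tests with a logical AND.
if [ -d / -a ! -f / ]
then
  root_kind="directory"
else
  root_kind="something else"
fi

echo "$word_form / $bracket_form / root is a $root_kind"
```

The spaces inside the square brackets are not optional, since `[' is a command name and `]' is its last argument.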
  2329.  
  2330. The conditional structures have the following syntax.
  2331.  
  2332. if unix-command
  2333. then
  commands
  2335. else
  2336.   commands
  2337. fi
  2338.  
  2339. The `else' clause is, of course, optional. As noted before, the first unix command could be anything, since every command has a return code. The result is TRUE if it evaluates to zero and false otherwise (in contrast to the conventions in most languages). Multiple tests can be made using
  2340.  
  2341. if unix-command
  2342. then
  2343.   commands
  2344. elif unix-command
  2345. then
  2346.   commands
  2347. elif unix-command
  2348. then
  2349.   commands
  2350. else
  2351.   commands
  2352. fi
  2353.  
  2354. where `elif' means `else-if'.
  2355.  
The equivalent of the C-shell's `switch' statement is a more Pascal-like `case' structure.
  2357.  
  2358. case unix-command-or-variable in
  2359.  
  2360.   wildcard1) commands ;;
  2361.   wildcard2) commands ;;
  2362.   wildcard3) commands ;;
  2363.  
  2364. esac
  2365.  
  2366. This structure uses the wildcards to match the output of the command or variable in the first line. The first pattern which matches gets executed.
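
A small sketch makes the `first match wins' rule concrete. The function `classify' and the file names passed to it are invented for the illustration.

```shell
#!/bin/sh
# classify is a hypothetical helper written for this sketch.
classify()
{
  case $1 in
    *.c | *.h ) echo "C source" ;;
    *.pl )      echo "Perl script" ;;
    [0-9]* )    echo "starts with a digit" ;;
    * )         echo "unknown" ;;
  esac
}

classify main.c
classify 42things.pl
classify README
```

The second call matches both `*.pl' and `[0-9]*', but only the commands belonging to the earlier pattern are run, so it is reported as a Perl script.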
  2367.  
  2368. Input from the user in sh
  2369.  
  2370. In shell you can read the value of a variable using the `read' command, with syntax
  2371.  
  2372. read variable
  2373.  
This reads in a string from the keyboard and terminates on a newline character. Another way to do this is to use the `line' command to access a particular logical device. The keyboard device in the current terminal is `/dev/tty', so that one writes

variable=`line < /dev/tty`
  2377.  
  2378. which fetches a single line from the user.
  2379.  
  2380. Here are some examples of these commands. First a program which asks yes or no...
  2381.  
  2382. #!/bin/sh
  2383. #
  2384. # Yes or no
  2385. #
  2386.  
  2387. echo "Please answer yes or no: "
  2388.  
  2389. answer=`line < /dev/tty`
  2390.  
  2391. case $answer in
  2392.  
  2393.  y* | Y* | j* | J* )  echo YES!! ;;
  2394.  
  2395.  n* | N* )            echo NO!! ;;
  2396.  
  2397.  *)                   echo "Can't you answer a simple question?"
  2398.  
  2399. esac
  2400.  
  2401. echo The end
  2402.  
  2403. Notice the use of pattern matching and the `|' `OR' symbol.
  2404.  
  2405. #!/bin/sh
  2406. #
  2407. # Kernel check
  2408. #
  2409.  
  2410. if test ! -f /vmunix          # Check that the kernel is there!
  2411. then
  2412.   echo "This is not BSD unix...hmmm"
  2413.   if [ -f /hp-ux ]
  2414.   then
  2415.      echo "It's a Hewlett Packard machine!"
  2416.   fi
  2417. elif [ -w /vmunix ]
  2418. then
  echo "HEY!! The kernel is writable by me!";
  2420. else
  2421.   echo "The kernel is write protected."
  2422.   echo "The system is safe from me today."
  2423. fi
  2424.  
  2425. Loops in sh
  2426.  
  2427. The loop structures in the Bourne shell have the following syntax.
  2428.  
  2429. while unix-command
  2430. do
  2431.  commands
  2432. done
  2433.  
  2434. The first command will most likely be a test but, as before, it could in principle be any UNIX command. The `until' loop, reminiscent of BCPL, carries out a task until its argument evaluates to TRUE.
  2435.  
  2436. until unix-command
  2437. do
  2438.    commands
  2439. done
  2440.  
  2441. Finally the `for' structure has already been used above.
  2442.  
  2443. for variable in list
  2444. do
  2445.   commands
  2446. done
  2447.  
  2448. Often we want to be able to use an array of values as the list which for parses, but Bourne shell has no array variables. This problem is usually solved by making a long string separated by, for example, colons. For example, the $PATH variable has the form
  2449.  
PATH=/usr/bin:/bin:/local/gnu/bin
  2451.  
  2452. Bourne shell allows us to split such a string on whatever character we wish. Normally the split is made on spaces, but the variable `IFS' can be defined with a replacement. To make a loop over all directories in the command path we would therefore write
  2453.  
  2454. IFS=:
  2455.  
  2456. for name in $PATH; do
  2457.  
  2458.   commands
  2459.  
  2460. done
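
Here is a complete sketch of the same idea. Changing `IFS' affects everything the shell does afterwards, so one reasonable approach is to save the old value and restore it after the loop. The list DIRS is a made-up stand-in for `$PATH'.

```shell
#!/bin/sh
# DIRS is a hypothetical colon-separated list standing in for $PATH.
DIRS=/usr/bin:/bin:/local/gnu/bin

oldifs=$IFS          # remember the separator so that it can be restored
IFS=:

count=0
names=""
for name in $DIRS
do
  count=`expr $count + 1`
  names="$names<$name>"
done

IFS=$oldifs

echo "$count directories: $names"
```

Note that `$DIRS' is left unquoted in the `for' line on purpose: it is the splitting of the unquoted value which produces the separate items.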
  2461.  
  2462. The best way to gain experience with these commands is through some examples.
  2463.  
  2464. #!/bin/sh
  2465. #
  2466. # Get text from user repeatedly
  2467. #
  2468.  
  2469. echo "Type away..."
  2470.  
  2471. while read TEXT
  2472. do
  2473.  
  2474.   echo You typed $TEXT
  2475.  
  2476.   if [ "$TEXT" = "quit" ]; then
  2477.      echo "(So I quit!)"
  2478.      exit 0
  2479.   fi
  2480.  
  2481. done
  2482.  
  2483. echo "HELP!"
  2484.  
This very simple script is a typical use for a while-loop. It gets text repeatedly until the user types `quit'. Since read never returns `false' unless an error occurs or it detects an end of file (EOF), which is what typing CTRL-D at the start of a line produces, the loop will never exit without some help from an `if' test. If it does receive an end of file, the loop ends and the script prints `HELP!'.
  2486.  
  2487. #!/bin/sh
  2488. #
  2489. # Watch in the background for a particular user
  2490. # and give alarm if he/she logs in
  2491. #
  2492. # To be run in the background, using &
  2493. #
  2494.  
  2495. if [ $# -ne 1 ]; then
  2496.  echo "Give the name of the user as an argument" > /dev/tty
  2497.  exit 1
  2498. fi
  2499.  
  2500. echo "Looking for $1"
  2501.  
  2502. until users | grep -s $1
  2503. do
  2504.   sleep 60
  2505. done
  2506.  
  2507. echo "!!! WAKE UP !!!" > /dev/tty
  2508. echo "User $1 just logged in" > /dev/tty
  2509.  
This script uses `grep' in `silent mode' (the -s option), i.e. grep never writes anything to the terminal. That is how older System V versions of grep behave; in POSIX versions -s only suppresses error messages, and the option which silences normal output is -q. The only thing we are interested in is the return code the piped command produces. If `grep' detects a line containing the username we are interested in, then the result evaluates to TRUE and the sleep-loop exits.
  2511.  
  2512. Our final example is the kind of script which is useful for a system administrator. It transfers over the Network Information Service database files so that a slave server is up to date. All we have to do is make a list of the files and place it in a `for' loop. The names used below are the actual names of the NIS maps, well known to system administrators.
  2513.  
  2514. #!/bin/sh
  2515. #
  2516. # Update the NIS database maps on a client server. This program
  2517. # shouldn't have to be run, but sometimes things go wrong and we
# have to force a download from the main server.
  2519. #
  2520. PATH=/etc/yp:/usr/etc/yp:$PATH
  2521.  
  2522. MASTER=myNISserver
  2523.  
  2524. for map in auto.direct auto.master ethers.byaddr ethers.byname\
  2525.           group.bygid group.byname hosts.byaddr hosts.byname\
  2526.           mail.aliases netgroup.byhost netgroup.byuser netgroup\
  2527.           netid.byname networks.byaddr networks.byname passwd.byname\
  2528.           passwd.byuid priss.byname protocols.byname protocols.bynumber\
  2529.           rpc.bynumber services.byname services usenetgroups.byname;
  2530. do
  2531.  ypxfr $1 -h $MASTER $map
  2532. done
  2533.  
  2534. Procedures and traps
  2535.  
One of the worthy features of the Bourne shell is that it allows you to define subroutines or procedures. Subroutines work just like subroutines in any other programming language. They are executed in the same shell (not as a sub-process).
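
A minimal sketch of two subroutines, one calling the other, is given below. The names IsNumber, Describe and calls are invented for the illustration. Arguments arrive as `$1', `$2' etc. inside the subroutine, and `return' sets its return code.

```shell
#!/bin/sh
# IsNumber, Describe and calls are hypothetical names for this sketch.

calls=0

# Return 0 (true) if the first argument consists only of digits.
IsNumber()
{
  case $1 in
    "" | *[!0-9]* ) return 1 ;;
    * )             return 0 ;;
  esac
}

# A second routine which calls the first and counts how often it has run.
Describe()
{
  calls=`expr $calls + 1`

  if IsNumber "$1"
  then
    echo "$1 is a number"
  else
    echo "$1 is not a number"
  fi
}

Describe 42
Describe worm
echo "Describe was called $calls times"
```

Because the subroutines run in the same shell, the change which `Describe' makes to `calls' is still visible in the main program afterwards.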
  2537.  
Here is an interesting program which demonstrates two useful things at the same time. First of all, it shows how to make a hierarchical subroutine structure using the Bourne shell. Secondly, it shows how the `trap' directive can be used to trap signals, so that Bourne shell programs can exit safely when they are killed or when CTRL-C is typed.
  2539.  
  2540. #!/bin/sh
  2541. #
  2542. #  How to make a signal handler in Bourne Shell
  2543. #  using subroutines
  2544. #
  2545.  
  2546. #####################################################
  2547. # Level 2
  2548. #####################################################
  2549.  
  2550. ReallyQuit()
  2551. {
  2552. while true
  2553. do
  2554.  echo "Do you really want to quit?"
  2555.  read answer
  2556.  
  2557.  case $answer in
  2558.  
  2559.     y* | Y* ) return 0;;
  2560.     *)        echo "Resuming..."
  2561.               return 1;;
  2562.  
  2563.  esac
  2564.  
  2565. done
  2566. }
  2567.  
  2568. #####################################################
  2569. # Level 1
  2570. #####################################################
  2571.  
  2572. SignalHandler()
  2573.  
  2574. {
  2575. if ReallyQuit           # Call a function
  2576. then
  2577.   exit 0
  2578. else
  2579.   return 0
  2580. fi
  2581. }
  2582.  
  2583. #####################################################
  2584. # Level 0 : main program
  2585. #####################################################
  2586.  
  2587. trap SignalHandler 2 15  # Trap kill signals 2 and 15
  2588.  
  2589. echo "Type some lines of text..."
  2590.  
  2591. while read text
  2592. do
  2593.  
  2594.   echo "$text - CTRL-C to exit"
  2595.  
  2596. done
  2597.  
  2598. Note that the logical tree structure of this program is upside down (the highest level comes at the bottom). This is because all subroutines must be defined before they are used.
  2599.  
  2600. This example concludes our brief survey of the Bourne shell.
  2601.  
  2602. setuid and setgid scripts
  2603.  
  2604. The superuser `root' is the only privileged user in UNIX. All other users have only restricted access to the system. Usually this is desirable, but sometimes it is a nuisance.
  2605.  
  2606. A setuid script is a script which has its setuid-bit set. When such a script is executed by a user, it is run with all the rights and privileges of the owner of the script. All of the commands in the script are executed as the owner of the file and not with the user-id of the person who ran the script. If the owner of the setuid script is `root' then the commands in the script are run with root privileges!
  2607.  
  2608. Setuid scripts are clearly a touchy security issue. When giving away one's rights to another user (especially those of `root') one is tempting hackers. Setuid scripts should be avoided.
  2609.  
A setgid program is almost the same, but only the group id is set to that of the group which owns the file. Often the effect is the same.
  2611.  
An example of a setuid program is the `ps' program. `ps' lists all of the processes running in the kernel. In order to do this it needs permission to access the private data structures in the kernel. By making `ps' setuid root, it allows ordinary users to be able to read as much as the writers of `ps' thought fit, but no more.
  2613.  
  2614. Naturally, only the superuser can make a file setuid or setgid root.
  2615.  
  2616. Summary: Limitations of shell programming
  2617.  
  2618. To summarize the last two long and oppressive chapters we shall take a step back from the details and look at what we have achieved.
  2619.  
The idea behind the shell is to provide a user interface, with access to the system's facilities at a simple level. In the 70's user interfaces were not designed to be user-friendly. The UNIX shell is not particularly user-friendly, but it is very powerful. Perhaps it would have been enough to provide only commands to allow users to write C programs. Since all of the system functions are available from C, that would certainly allow everyone to do anything that UNIX can do. But shell programming is much more immediate than C. It is an environment of frequently used tools. It is also suited to quick programming solutions: C is a compiled language, whereas the shell is an interpreter. A quick shell program can solve many problems in no time at all, without having to compile anything.
  2621.  
  2622. Shell programming is only useful for `quick and easy' programs. To use it for anything serious is an abuse. Programming difficult things in shell is clumsy, and it is difficult to get returned-information (like error messages) back in a useful form. Besides, shell scripts are slow compared to real programs since they involve starting a new program for each new command.
  2623.  
  2624. These difficulties are solved partly by Perl, which we shall consider next -- but in the final analysis, real programs of substance need to be written in C. Contrary to popular belief, this is not more difficult than programming in the shell -- in fact, many things are much simpler, because all of the shell commands originated as C functions. The shell is an extra layer of the UNIX onion which we have to battle our way through to get where we're going.
  2625.  
  2626. Sometimes it is helpful to be shielded from low level details -- sometimes it is a hindrance. In the remaining chapters we shall consider more involved programming needs.
  2627.  
  2628. Exercises
  2629.  
  2630.    Write an improved `which' command in C-shell.
  2631.    Make a counter program which records in a file how many times you log in to your account. You can call this in your .cshrc file.
  2632.    Make a Bourne shell script to kill all the processes owned by a particular user. (Note, that if you are not the superuser, you cannot kill processes owned by other users.)
  2633.    Write a script to replace the `rm' command with something safer. Think about a way of implementing `rm' so that it is possible to get deleted files back again in case of emergencies. This is not possible using the normal `rm' command. Hint: save files in a hidden directory `.deleted'. Make your script delete files in the `.deleted' directory if they are older than a week, so that you don't fill up the disk with rubbish.
  2634.    Suppose you have a bunch of files with a particular file-extension: write a script in csh to change the extension to something else. e.g. to change *.C into *.c. Give the old and new extensions as arguments to the script.
  2635.    Write a program in sh to search for files in the current directory which contain a certain string. e.g. search for all files which contain the word "if". Hint: use the "find" command.
  2636.    Use the manual pages to find out about the commands `at', `batch' and `atq'. Test these commands by executing the shell command `date' at some time of your choice. Use the `-m' option so that the result of the job is mailed to you.
  2637.    Write a script in sh or csh to list all of the files bigger than a certain size starting from the current directory, and including all subdirectories. This kind of program is useful for system administrators when a disk becomes full.
  2638.  
  2639. Perl
  2640.  
  2641. So far, we have been looking at shell programming for performing fairly simple tasks. Now let's extend the idea of shell programming to cover more complex tasks like systems programming and network communications. Perl is a language which was designed to retain the immediateness of shell languages, but at the same time capture some of the flexibility of C. Perl is an acronym for Practical extraction and report language. In this chapter, we shall not aim to teach Perl from scratch -- the best way to learn it is to use it! Rather we shall concentrate on demonstrating some principles.
  2642.  
  2643. Sed and awk, cut and paste
  2644.  
One of the reasons for using Perl is that it is extremely good at textfile handling--one of the most important things for UNIX users, and particularly useful in connection with CGI script processing on the World Wide Web. It has simple built-in constructs for searching and replacing text, storing information in arrays and retrieving them in sorted form. All of these things have previously been possible using the UNIX shell commands
  2646.  
  2647. sed
  2648. awk
  2649. cut
  2650. paste
  2651.  
  2652. but these commands were designed to work primarily in the Bourne shell and are a bit `awk'ward to use for all but the simplest applications.
  2653.  
  2654. `sed'
  2655.    is a stream editor. It takes command line instructions, reads input from the stream stdin and produces output on stdout according to those instructions. `sed' works line by line from the start of a textfile.
  2656. `awk'
  2657.    is a pattern matching and processing language. It takes a textfile and reads it line by line, matching regular expressions and acting on them. `awk' is powerful enough to have conditional instructions like `if..then..else' and uses C's `printf' construction for output.
  2658. `cut'
  2659.    Takes a line of input and cuts it into fields, separated by some character. For instance, a normal line of text is a string of words separated by spaces. Each word is a different field. `cut' can be used, for instance, to pick out the third column in a table. Any character can be specified as the separator.
  2660. `paste'
   is the logical opposite of cut. It concatenates n files, and makes each line in the file into a column of a table. For instance, `paste one two three' would make a table in which the first column consisted of all lines in `one', the second of all lines in `two' and the third of all lines in `three'. If one file is longer than the others, then some columns have blank spaces.
  2662.  
  2663. Perl unifies all of these operations and more. It also makes them much simpler.
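
For comparison with what follows, here is a brief sketch of the four commands working on a pair of tiny files from the Bourne shell. The file names and contents are invented, and `mktemp' is assumed to be available for making a scratch directory.

```shell
#!/bin/sh
# File names and contents are invented; mktemp -d is assumed available.
tmpdir=`mktemp -d` || exit 1

printf 'alpha\nbeta\n' > "$tmpdir/one"
printf '1\n2\n'        > "$tmpdir/two"

# paste: glue the two files together as tab-separated columns
paste "$tmpdir/one" "$tmpdir/two" > "$tmpdir/table"

# cut: pick out the second column again (tab is the default separator)
column=`cut -f2 "$tmpdir/table" | tr '\n' ' '`

# sed: replace text, working line by line
edited=`sed 's/alpha/ALPHA/' "$tmpdir/one" | tr '\n' ' '`

# awk: act only on lines whose second field is greater than 1
chosen=`awk '$2 > 1 { printf "%s", $1 }' "$tmpdir/table"`

rm -r "$tmpdir"
echo "column=[$column] edited=[$edited] chosen=[$chosen]"
```

Each step needs a separate program and a pipe or a temporary file to pass the data along, which is the kind of plumbing that a single Perl script avoids.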
  2664.  
  2665. Program structure
  2666.  
  2667. To summarize Perl, we need to know about the structure of a Perl program, the conditional constructs it has, its loops and its variables. In the latest versions of Perl (Perl 5), you can write object oriented programs of great complexity. We shall not go into this depth, for the simple reason that Perl's strength is not as a general programming language but as a specialized language for textfile handling. The syntax of Perl is in many ways like the C programming language, but there are important differences.
  2668.  
   Variables do not have types. They are interpreted in a context sensitive way. The operators which act upon variables determine whether a variable is to be considered a string or as an integer etc.
   Although there are no types, Perl defines variables of different kinds. There are three different kinds, labelled by the symbols `$' (scalars), `@' (arrays) and `%' (associative arrays).
   Perl keeps a number of standard variables with special names e.g. `$_', `@ARGV' and `%ENV'. Special attention should be paid to these. They are very important!
  2672.    The shell reverse apostrophe notation `command` can be used to execute UNIX programs and get the result into a Perl variable.
  2673.  
  2674. Here is a simple `structured hello world' program in Perl. Notice that subroutines are called using the `&' symbol. There is no special way of marking the main program -- it is simply that part of the program which starts at line 1.
  2675.  
  2676. #!/local/bin/perl
  2677. #
  2678. # Comments
  2679. #
  2680.  
  2681. &Hello();
  2682. &World;
  2683.  
  2684. # end of main
  2685.  
  2686. sub Hello
  2687.   {
  2688.   print "Hello";
  2689.   }
  2690.  
  2691. sub World
  2692.   {
  2693.   print "World\n";
  2694.   }
  2695.  
The parentheses on subroutines are optional, if there are no parameters passed. Notice that each statement must end in a semi-colon.
  2697.  
  2698. Perl variables
  2699.  
  2700. Scalar variables
  2701.  
  2702. In Perl, variables do not have to be declared before they are used. Whenever you use a new symbol, Perl automatically adds the symbol to its symbol table and initializes the variable to the empty string.
  2703.  
  2704. It is important to understand that there is no practical difference between zero and the empty string in perl -- except in the way that you, the user, choose to use it. Perl makes no distinction between strings and integers or any other types of data -- except when it wants to interpret them. For instance, to compare two variables as strings is not the same as comparing them as integers, even if the string contains a textual representation of an integer. Take a look at the following program.
  2705.  
  2706. #!/local/bin/perl
  2707. #
  2708. # Nothing!
  2709. #
  2710.  
  2711. print "Nothing == $nothing\n";
  2712.  
  2713. print "Nothing is zero!\n" if ($nothing == 0);
  2714.  
  2715. if ($nothing eq "")
  2716.   {
  2717.   print STDERR "Nothing is really nothing!\n";
  2718.   }
  2719.  
  2720. $nothing = 0;
  2721.  
  2722. print "Nothing is now $nothing\n";
  2723.  
  2724. The output from this program is
  2725.  
  2726.  
  2727. Nothing ==
  2728. Nothing is zero!
  2729. Nothing is really nothing!
  2730. Nothing is now 0
  2731.  
  2732. There are several important things to note here. First of all, we never declare the variable `nothing'. When we try to write its value, perl creates the name and associates a NULL value to it i.e. the empty string. There is no error. Perl knows it is a variable because of the `$' symbol in front of it. All scalar variables are identified by using the dollar symbol.
  2733.  
Next, we compare the value of `$nothing' to the integer `0' using the numerical comparison symbol `==', and then we compare it to the empty string using the string comparison symbol `eq'. Both tests are true! That means that the empty string is interpreted as having a numerical value of zero. In fact any string which does not begin with a valid number has a numerical value of zero.
  2735.  
  2736. Finally we can set `$nothing' explicitly to a valid integer string zero, which would now pass the first test, but fail the second.
  2737.  
  2738. As extra spice, this program also demonstrates two different ways of writing the `if' command in perl.
  2739.  
  2740. The default scalar variable.
  2741.  
  2742. The special variable `$_' is used for many purposes in Perl. It is used as a buffer to contain the result of the last operation, the last line read in from a file etc. It is so general that many functions which act on scalar variables work by default on `$_' if no other argument is specified. For example,
  2743.  
  2744. print;
  2745.  
  2746. is the same as
  2747.  
  2748. print $_;
  2749.  
  2750. Array (vector) variables
  2751.  
The complement of scalar variables is arrays. An array in Perl is identified by the `@' symbol and, like scalar variables, is allocated and initialized dynamically.
  2753.  
  2754.  
$array[0] = "This little piggy went to market";
$array[2] = "This little piggy stayed at home";

print "$array[0] $array[1] $array[2]";
  2759.  
  2760. The index of an array is always understood to be a number, not a string, so if you use a non-numerical string to refer to an array element, you will always get the zeroth element, since a non-numerical string has an integer value of zero.
  2761.  
  2762. An important array which every program defines is
  2763.  
  2764. @ARGV
  2765.  
This is the argument vector array, and contains the command line arguments by analogy with the C-shell variable `$argv[]'.
  2767.  
  2768. Given an array, we can find the last element by using the `$#' operator. For example,
  2769.  
  2770. $last_element = $ARGV[$#ARGV];
  2771.  
Notice that each element in an array is a scalar variable. The `$#' cannot be interpreted directly as the number of elements in the array, as it can in the C-shell: it is the index of the last element, and since indices start at zero the number of elements is `$#array + 1'. You should experiment with the value of this quantity to get the behaviour one is used to in the C-shell.
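
The relationship between `$#' and the number of elements can be checked with a short sketch. The array `@words' and its contents are invented for illustration, and the `my' declarations and `use strict' assume Perl 5.

```perl
use strict;
use warnings;

# Hypothetical sample data; any list would do.
my @words = ("alpha", "beta", "gamma");

my $last_index = $#words;           # index of the last element
my $count      = scalar(@words);    # number of elements
my $last_word  = $words[$#words];   # the last element itself

print "last index = $last_index, count = $count, last word = $last_word\n";
```

With three elements the last index is 2, one less than the count.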
  2773.  
  2774. Perl does not support multiple-dimension arrays directly, but it is possible to simulate them yourself. (See the Perl book.)
  2775.  
  2776. Special array commands
  2777.  
  2778. The `shift' command acts on arrays and returns and removes the first element of the array. Afterwards, all of the elements are shifted down one place. So one way to read the elements of an array in order is to repeatedly call `shift'.
  2779.  
  2780. $next_element=shift(@myarray);
  2781.  
Note that, if the array argument is omitted, then `shift' works on `@ARGV' by default in the main program, and on the argument array `@_' inside a subroutine.
  2783.  
  2784. Another useful function is `split', which takes a string and turns it into an array of strings. `split' works by choosing a character (usually a space) to delimit the array elements, so a string containing a sentence separated by spaces would be turned into an array of words. The syntax is
  2785.  
  2786. @array = split;                       # works with spaces on $_
  2787. @array = split(pattern,string);       # Breaks on pattern
  2788. ($v1,$v2...) = split(pattern,string); # Name array elements with scalars
  2789.  
In the first of these cases, it is assumed that the variable `$_' is to be split on whitespace characters. In the second case, we decide on what character the split is to take place and on what string the function is to act. For instance
  2791.  
  2792. @new_array = split(":","name:passwd:uid:gid:gcos:home:shell");
  2793.  
  2794. The result is a seven element array called `@new_array', where `$new_array[0]' is `name' etc.
  2795.  
In the final example, the left hand side shows that we wish to capture elements of the array in a named set of scalar variables. If the number of variables on the left hand side is fewer than the number of strings which are generated on the right hand side, the surplus strings are discarded. If the number on the left hand side is greater, then the remaining variables are left undefined.
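
The three forms of `split' can be tried out with a minimal sketch like the following. The passwd-style entry is made up for illustration, and the `my' declarations assume Perl 5.

```perl
use strict;
use warnings;

# Hypothetical passwd-style entry with seven colon-separated fields.
my $entry = "mark:x:1001:100:Mark B:/home/mark:/bin/tcsh";

my @fields = split(/:/, $entry);                 # all seven fields
my ($name, $passwd, $uid) = split(/:/, $entry);  # surplus fields are dropped
my ($one, $two, $three) = split(/:/, "a:b");     # $three is left undefined

print "fields = ", scalar(@fields), ", name = $name, uid = $uid\n";
print "third variable is ", (defined($three) ? "set" : "undefined"), "\n";
```

Testing with `defined' is a simple way to find out whether a variable on the left hand side actually received a value.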
  2797.  
  2798. Associated arrays
  2799.  
  2800. One of the very nice features of Perl is the ability to use one string as an index to another string in an array. For example, we can make a short encyclopaedia of zoo animals by constructing an associative array in which the keys (or indices) of the array are the names of animals, and the contents of the array are the information about them.
  2801.  
  2802.  
  2803. $animals{"Penguin"} = "A suspicious animal, good with cheese crackers...";
  2804. $animals{"dog"} = "Plays stupid, but could be a cover...";
  2805.  
  2806. if ($index eq "fish")
  2807.   {
  2808.   $animals{$index} = "Often comes in square boxes. Very cold.";
  2809.   }
  2810.  
  2811. An entire associated array is written `%array', while the elements are `$array{$key}'.
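
As a rough sketch of the `%array' and `$array{$key}' notation, the following builds a small associative array in one statement and then queries it. The entries are shortened, invented versions of the zoo example; the list initializer with `=>' and the `exists' test are Perl 5 features which the text above does not cover.

```perl
use strict;
use warnings;

# Hypothetical entries, shortened from the zoo example.
my %animals = (
    "Penguin" => "A suspicious animal",
    "dog"     => "Plays stupid",
);

$animals{"fish"} = "Very cold";          # add a new key

my $count   = scalar(keys %animals);     # number of keys
my $has_cat = exists($animals{"cat"}) ? "yes" : "no";
my @names   = sort keys %animals;        # plain string order

print "$count animals: @names\n";
print "cat listed: $has_cat\n";
```

Note that the default sort order places uppercase letters before lowercase ones, so `Penguin' comes before `dog'.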
  2812.  
  2813. Perl provides a special associative array for every program called `%ENV'. This contains the environment variables defined in the parent shell which is running the Perl program. For example
  2814.  
print "Username = $ENV{'USER'}\n";
  2816.  
  2817. $ld = "LD_LIBRARY_PATH";
  2818. print "The link editor path is $ENV{$ld}\n";
  2819.  
  2820. To get the current path into an ordinary array, one could write,
  2821.  
  2822. @path_array= split(":",$ENV{"PATH"});
  2823.  
  2824. Array example program
  2825.  
Here is an example which prints out a list of files in a specified directory, in order of their UNIX protection bits. The least protected files come first.
  2827.  
  2828. #!/local/bin/perl
  2829. #
  2830. # Demonstration of arrays and associated arrays.
  2831. # Print out a list of files, sorted by protection,
  2832. # so that the least secure files come first.
  2833. #
  2834. # e.g.     arrays <list of words>
  2835. #          arrays *.C
  2836. #
  2837. ############################################################
  2838.  
  2839. print "You typed in ",$#ARGV+1," arguments to command\n";
  2840.  
  2841. if ($#ARGV < 1)  
  2842.   {
  2843.   print "That's not enough to do anything with!\n";
  2844.   }
  2845.  
  2846. while ($next_arg = shift(@ARGV))  
  2847.   {
  2848.   if ( ! ( -f $next_arg || -d $next_arg))
  2849.      {
  2850.      print "No such file: $next_arg\n";
  2851.      next;
  2852.      }
  2853.  
  2854.   ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size) = stat($next_arg);
  $octalmode = sprintf("%03o",$mode & 0777);   # zero-padded so the string sort orders modes correctly
  2856.  
  2857.   $assoc_array{$octalmode} .= $next_arg.
  2858.            " : size (".$size."), mode (".$octalmode.")\n";
  2859.   }
  2860.  
  2861. print "In order: LEAST secure first!\n\n";
  2862.  
  2863. foreach $i (reverse sort keys(%assoc_array))
  2864.   {
  2865.   print $assoc_array{$i};
  2866.   }
  2867.  
  2868. Loops and conditionals
  2869.  
  2870. Here are some of the most commonly used decision-making constructions and loops in Perl. The following is not a comprehensive list -- for that, you will have to look in the Perl bible: Programming Perl, by Larry Wall and Randal Schwartz. The basic pattern follows the C programming language quite closely. In the case of the `for' loop, Perl has both the C-like version, called `for' and a `foreach' command which is like the C-shell implementation.
  2871.  
  2872. if (expression)
  2873.    {
  2874.    block;
  2875.    }
  2876. else
  2877.    {
  2878.    block;
  2879.    }
  2880.  
  2881. command if (expression);
  2882.  
  2883. unless (expression)
  2884.    {
  2885.    block;
  2886.    }
  2887. else
  2888.    {
  2889.    block;
  2890.    }
  2891.  
  2892. while (expression)
  2893.    {
  2894.    block;
  2895.    }
  2896.  
  2897. do
  2898.    {
  2899.    block;
  2900.    }
  2901. while (expression);
  2902.  
  2903. for (initializer; expression; statement)
  2904.    {
  2905.    block;
  2906.    }
  2907.  
  2908. foreach variable(array)
  2909.    {
  2910.    block;
  2911.    }
  2912.  
  2913. In all cases, the `else' clauses may be omitted.
  2914.  
  2915. Strangely, perl does not have a `switch' statement, but the Perl book describes how to make one using the features provided.
  2916.  
  2917. The for loop
  2918.  
  2919. The for loop is exactly like that in C or C++ and is used to iterate over a numerical index, like this:
  2920.  
  2921.  
  2922. for ($i = 0; $i < 10; $i++)
  2923.    {
  2924.    print $i, "\n";
  2925.    }
  2926.  
  2927. The foreach loop
  2928.  
  2929. The foreach loop is like its counterpart in the C shell. It is used for reading elements one by one from a regular array. For example,
  2930.  
  2931.  
  2932. foreach $i ( @array )
  2933.    {
  2934.    print $i, "\n";
  2935.    }
  2936.  
  2937. Iterating over elements in arrays
  2938.  
  2939. One of the main uses for `for' type loops is to iterate over successive values in an array. This can be done in two ways which show the essential difference between for and foreach.
  2940.  
If we want to fetch each value in an array in turn, without caring about numerical indices, then it is simplest to use the foreach loop.
  2942.  
  2943.  
  2944. @array = split(" ","a b c d e f g");
  2945.  
  2946. foreach $var ( @array )
  2947.    {
  2948.    print $var, "\n";
  2949.    }
  2950.  
  2951. This example prints each letter on a separate line. If, on the other hand, we are interested in the index, for the purposes of some calculation, then the for loop is preferable.
  2952.  
  2953.  
  2954. @array = split(" ","a b c d e f g");
  2955.  
  2956. for ($i = 0; $i <= $#array; $i++)
  2957.    {
  2958.    print $array[$i], "\n";
  2959.    }
  2960.  
Notice that, unlike the for-loop idiom in C/C++, the limit is `$i <= $#array', i.e. `less than or equal to' rather than `less than'. This is because the `$#' operator does not return the number of elements in the array but rather the index of the last element.
  2962.  
  2963. Associated arrays are slightly different, since they do not use numerical keys. Instead they use a set of strings, like in a database, so that you can use one string to look up another. In order to iterate over the values in the array we need to get a list of these strings. The keys command is used for this.
  2964.  
  2965. $assoc{"mark"} = "cool";
  2966. $assoc{"GNU"} = "brave";
  2967. $assoc{"zebra"} = "stripy";
  2968.  
  2969. foreach $var ( keys %assoc )
  2970.    {
  2971.    print "$var , $assoc{$var} \n";
  2972.    }
  2973.  
  2974. The order of the keys is not defined in the above example, but you can choose to sort them alphabetically by writing
  2975.  
  2976. foreach $var ( sort keys %assoc )
  2977.  
  2978. instead.
  2979.  
  2980. Iterating over lines in a file
  2981.  
Since Perl is about file handling we are very interested in reading files. Unlike C and C++, perl likes to read files line by line. The angle brackets are used for this (see section Files in perl). Assuming that we have some file handle `<file>', for instance `<STDIN>', we can always read the file line by line with a while-loop like this.
  2983.  
  2984.  
  2985. while ($line = <file>)
  2986.    {
  2987.    print $line;
  2988.    }
  2989.  
  2990. Note that $line includes the end of line character on the end of each line. If you want to remove it, you should add a `chop' command:
  2991.  
  2992.  
  2993.  while ($line = <file>)
  2994.     {
  2995.     chop $line;
  2996.     print "line = ($line)\n";
  2997.     }
  2998.  
  2999. Files in perl
  3000.  
  3001. Opening files is straightforward in Perl. Files must be opened and closed using -- wait for it -- the commands `open' and `close'. You should be careful to close files after you have finished with them -- especially if you are writing to a file. Files are buffered and often large parts of a file are not actually written until the `close' command is received.
  3002.  
Three files are, of course, always open for every program, namely `STDIN', `STDOUT' and `STDERR'.
  3004.  
Formally, to open a file, we must obtain a file descriptor or file handle. This is done using `open':
  3006.  
  3007. open (file_descrip,"Filename");
  3008.  
  3009. The angular brackets `<..>' are used to read from the file. For example,
  3010.  
  3011. $line = <file_descrip>;
  3012.  
  3013. reads one line from the file associated with `file_descrip'.
  3014.  
Let's look at some examples of file opening. Here is how we can implement UNIX's `cut' and `paste' commands in perl:
  3016.  
  3017. #!/local/bin/perl
  3018. #
  3019. # Cut in perl
  3020. #
  3021.  
  3022. # Cut second column
  3023.  
  3024. while (<>)
  3025.   {
  3026.   @cut_array = split;
  3027.  
  print "$cut_array[1]\n";
  3029.   }
  3030.  
  3031. This is the simplest way to open a file. The empty file descriptor `<>' tells perl to take the argument of the command as a filename and open that file for reading. This is really short for `while($_=<STDIN>)' with the standard input redirected to the named file.
  3032.  
The `paste' program can be written as follows:
  3034.  
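#!/local/bin/perl
#
# Paste in perl
#
# Two files only, syntax : paste file1 file2
#

open (file1,$ARGV[0]) || die "Can't open $ARGV[0]\n";
open (file2,$ARGV[1]) || die "Can't open $ARGV[1]\n";

# Read one line from each file on every pass. Both reads must
# happen each time, so they cannot be joined with `||'.

while (1)
   {
   $line1 = <file1>;
   $line2 = <file2>;

   last unless (defined($line1) || defined($line2));

   $line1 = "" unless defined($line1);   # file1 ran out first
   $line2 = "" unless defined($line2);   # file2 ran out first

   chop $line1;
   chop $line2;

   print "$line1\t$line2\n";    # tab character between
   }

Here we see more formally how to read from two separate files at the same time. The read command `<..>' returns an undefined (false) value once we have reached the end of the file, so the loop stops when both files are exhausted. Notice that the two reads cannot simply be combined as `($line1 = <file1>) || ($line2 = <file2>)' in the test-expression of the `while' loop: `||' stops evaluating as soon as its left hand side is true, so the second file would not be read until the first had run out.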
  3054.  
  3055. To write and append to files, we use the shell redirection symbols inside the `open' command.
  3056.  
  3057. open(fd,"> filename");    # open file for writing
  3058. open(fd,">> filename");   # open file for appending
  3059.  
  3060. We can also open a pipe from an arbitrary UNIX command and receive the output of that command as our input:
  3061.  
  3062. open (fd,"/bin/ps aux | ");
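
The write, append and read modes can be exercised together in a short sketch. The scratch file name below is hypothetical, and the sketch uses the three-argument form of `open' with lexical file handles, which is a later Perl 5 convention, not the two-argument form shown above.

```perl
use strict;
use warnings;

my $file = "/tmp/perl_open_demo.$$";   # hypothetical scratch file name

# Create the file and write one line to it.
open(my $out, ">", $file) || die "Can't open $file for writing: $!\n";
print $out "first line\n";
close($out);

# Reopen the same file and append a second line.
open($out, ">>", $file) || die "Can't open $file for appending: $!\n";
print $out "second line\n";
close($out);

# Read everything back into an array, one line per element.
open(my $in, "<", $file) || die "Can't open $file for reading: $!\n";
my @lines = <$in>;
close($in);

unlink $file;                          # tidy up the scratch file

print "Read ", scalar(@lines), " lines\n";
```

Closing the handle before reopening the file ensures that the buffered output has really been written.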
  3063.  
  3064. A simple perl program
  3065.  
Let us now write the simplest perl program which illustrates the way in which perl can save time. We shall write it in three different ways to show what the short cuts mean. Let us implement the `cat' command, which copies files to the standard output. The simplest way to write this in perl is the following:
  3067.  
  3068.  
  3069. #!/local/bin/perl
  3070.  
  3071. while (<>)
  3072.    {
  3073.     print;
  3074.    }
  3075.  
  3076. Here we have made heavy use of the many default assumptions which perl makes. The program is simple, but difficult to understand for novices. First of all we use the default file handle <> which means, take one line of input from a default file. This object returns true as long as it has not reached the end of the file, so this loop continues to read lines until it reaches the end of file. The default file is standard input, unless this script is invoked with a command line argument, in which case the argument is treated as a filename and perl attempts to open the argument-filename for reading. The print statement has no argument telling it what to print, but perl takes this to mean: print the default variable `$_'.
  3077.  
  3078. We can therefore write this more explicitly as follows:
  3079.  
  3080.  
  3081. #!/local/bin/perl
  3082.  
open (HANDLE,"$ARGV[0]");    # $ARGV[0] is the first argument
  3084.  
  3085. while (<HANDLE>)
  3086.   {
  3087.    print $_;
  3088.   }
  3089.  
Here we have simply filled in the assumptions explicitly. The command `<HANDLE>' now reads a single line from the named file-handle into the default variable `$_'. To make this program more general, we can eliminate the defaults entirely.
  3091.  
  3092.  
  3093. #!/local/bin/perl
  3094.  
open (HANDLE,"$ARGV[0]");    # $ARGV[0] is the first argument
  3096.  
  3097. while ($line=<HANDLE>)
  3098.   {
  3099.    print $line;
  3100.   }
  3101.  
  3102. == and `eq'
  3103.  
Be careful to distinguish between the comparison operator for numbers `==' and the corresponding operator for strings `eq'. These do not work in each other's places: `==' converts both sides to numbers, so any two non-numerical strings compare as equal, while `eq' treats `10' and `10.0' as different. If you use the wrong comparison operator your program might not work and it is quite difficult to find the error.
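
The difference is easy to demonstrate. The following is only a sketch with invented values; the `no warnings' line is there because Perl 5 would otherwise complain, quite rightly, about comparing words numerically.

```perl
use strict;
use warnings;
no warnings 'numeric';   # silence "isn't numeric" warnings for this demo

my ($x, $y) = ("10", "10.0");
my ($p, $q) = ("apple", "pear");

my $num_equal  = ($x == $y) ? "true" : "false";   # compared as numbers
my $str_equal  = ($x eq $y) ? "true" : "false";   # compared as strings
my $word_equal = ($p == $q) ? "true" : "false";   # both count as zero

print "10 == 10.0    is $num_equal\n";
print "10 eq 10.0    is $str_equal\n";
print "apple == pear is $word_equal\n";
```

The last result is the dangerous one: two quite different strings pass the `==' test because both have a numerical value of zero.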
  3105.  
  3106. chop
  3107.  
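The command `chop' cuts off the last character of a string, whatever that character is. This is useful for removing newline characters when reading files etc. (In Perl 5 the related command `chomp' removes the last character only if it is a newline.) The syntax is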
  3109.  
  3110. chop;         # chop $_;
  3111.  
  3112. chop $scalar; # remove last character in $scalar
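
One thing to watch is that `chop' removes the last character whether or not it is a newline. The sketch below compares it with `chomp', a Perl 5 command which only removes a trailing newline. The variable names are invented for illustration.

```perl
use strict;
use warnings;

my $ends_in_newline = "hello\n";
my $no_newline      = "hello";
my $chomped         = "hello";

chop  $ends_in_newline;   # removes the newline
chop  $no_newline;        # removes the final "o"
chomp $chomped;           # nothing to remove, string unchanged

print "[$ends_in_newline] [$no_newline] [$chomped]\n";
```

This matters for the last line of a file, which does not always end in a newline.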
  3113.  
  3114. Perl subroutines
  3115.  
Subroutines are indicated, as in the example above, by the ampersand `&' symbol. When parameters are passed to a Perl subroutine, they are handed over as an array called `@_', which is analogous to the `$_' variable. Here is a simple example:
  3117.  
  3118. #!/local/bin/perl
  3119.  
  3120. $a="silver";
  3121. $b="gold";
  3122.  
  3123. &PrintArgs($a,$b);
  3124.  
  3125. # end of main
  3126.  
  3127. sub PrintArgs
  3128.  
  3129.   {
  3130.   ($local_a,$local_b) = @_;
  3131.  
  3132.   print "$local_a, $local_b\n";
  3133.   }
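
A subroutine can also hand a value back to its caller. The following sketch, with the invented name `Largest', copies `@_' into private variables and returns the largest argument. The `my' declarations and the `return' statement assume Perl 5.

```perl
use strict;
use warnings;

my $result = &Largest(3, 17, 5);
print "Largest is $result\n";

sub Largest
  {
  my @values = @_;            # copy the arguments out of @_
  my $max = shift(@values);   # start with the first one

  foreach my $value (@values)
     {
     $max = $value if ($value > $max);
     }

  return $max;
  }
```

Variables declared with `my' exist only inside the subroutine, so they cannot clash with variables of the same name in the main program.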
  3134.  
  3135. die - exit on error
  3136.  
  3137. When a program has to quit and give a message, the `die' command is normally used. If called without an argument, Perl generates its own message including a line number at which the error occurred. To include your own message, you write
  3138.  
  3139. die "My message....";
  3140.  
  3141. If the string is terminated with a `\n' newline character, the line number of the error is not printed, otherwise Perl appends the line number to your string.
  3142.  
  3143. When opening files, it is common to see the syntax:
  3144.  
  3145. open (filehandle,"Filename") || die "Can't open...";
  3146.  
  3147. The logical `OR' symbol is used, because `open' returns true if all goes well, in which case the right hand side is never evaluated. If `open' is false, then die is executed. You can decide for yourself whether or not you think this is good programming style -- we mention it here because it is common practice.
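
The effect of the trailing newline on the message can be seen without actually quitting the program, by trapping `die' inside an `eval' block. This is a Perl 5 idiom which the text above does not cover; it is used here only to capture the message in the special variable `$@'.

```perl
use strict;
use warnings;

# Without a trailing newline, perl appends the file name and line number.
eval { die "No newline here" };
my $with_line = $@;

# With a trailing newline, the message is left exactly as written.
eval { die "Newline here\n" };
my $without_line = $@;

print "1: $with_line";
print "2: $without_line";
```

The first message comes back with the script name and line number appended; the second comes back unchanged.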
  3148.  
  3149. The stat() idiom
  3150.  
  3151. The unix library function stat() is used to find out information about a given file. This function is available both in C and in Perl. In perl, it returns an array of values. Usually we are interested in knowing the access permissions of a file. stat() is called using the syntax
  3152.  
  3153.  
  3154. @array = stat ("filename");
  3155.  
  3156. or alternatively, using a named array
  3157.  
  3158.  
  3159. ($device,$inode,$mode) = stat("filename");
  3160.  
  3161. The value returned in the mode variable is a bit-pattern, See section Protection bits. The most useful way of treating these bit patterns is to use octal numbers to interpret their meaning.
  3162.  
To find out whether a file is readable or writable to a group of users, we use a programming idiom which is very common for dealing with bit patterns: first we define a mask which zeroes out all of the bits in the mode except those which we are specifically interested in. This is done by defining a mask value in which the bits we want are set to 1 and all others are set to zero. Then we AND the mask with the mode. If the result is different from zero then we know that at least one of the bits in the mask was also set in the mode; to check that all of them were set, we compare the result with the mask itself. As in C, the bitwise AND operator in perl is called `&'.
  3164.  
  3165. For example, to test whether a file is writable to other users in the same group as the file, we would write the following.
  3166.  
  3167.  
  3168. $mask = 020;   # Leading 0 means octal number
  3169.  
  3170. ($device,$inode,$mode) = stat("file");
  3171.  
  3172. if ($mode & $mask)
  3173.   {
  3174.   print "File is writable by the group\n";
  3175.   }
  3176.  
Here the 2 in the second octal digit means "write"; the fact that it is the second octal digit from the right means that it refers to "group". Thus the result of the if-test is only true if that particular bit is set. We shall see this idiom in action below.
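
The masking idiom can be tried without a real file by using a made-up mode value. In the sketch below, `0100664' is a hypothetical mode for a regular file with permissions rw-rw-r--, and the mask names are invented for illustration.

```perl
use strict;
use warnings;

my $mode = 0100664;      # hypothetical mode: regular file, rw-rw-r--

my $group_write = 020;   # write bit for group
my $other_write = 002;   # write bit for others
my $group_rw    = 060;   # read and write bits for group

my $perms = sprintf("%03o", $mode & 0777);   # permission bits only

my $gw  = ($mode & $group_write) ? "yes" : "no";
my $ow  = ($mode & $other_write) ? "yes" : "no";
my $grw = (($mode & $group_rw) == $group_rw) ? "yes" : "no";   # all bits in mask set?

print "permissions = $perms\n";
print "group write = $gw, other write = $ow, group read+write = $grw\n";
```

Comparing the result with the mask itself, as in the last test, is the way to check that every bit in a multi-bit mask is set.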
  3178.  
  3179. Perl example programs
  3180.  
  3181. The passwd program and `crypt()' function
  3182.  
  3183. Here is a simple implementation of the UNIX `passwd' program in Perl.
  3184.  
  3185. #!/local/bin/perl
  3186. #
  3187. # A perl version of the passwd program.
  3188. #
  3189. # Note - the real passwd program needs to be much more
  3190. # secure than this one. This is just to demonstrate the
  3191. # use of the crypt() function.
  3192. #
  3193. #############################################################
  3194.  
  3195. print "Changing passwd for $ENV{'USER'} on $ENV{'HOST'}\n";
  3196.  
  3197. system 'stty','-echo';
  3198. print "Old passwd: ";
  3199.  
  3200. $oldpwd = <STDIN>;
  3201. chop $oldpwd;
  3202.  
($name,$coded_pwd,$uid,$gid,$quota,$comment,$gcos,$home,$shell)
                                = getpwnam($ENV{"USER"});
  3205.  
  3206. if (crypt($oldpwd,$coded_pwd) ne $coded_pwd)
  3207.   {
  3208.   print "\nPasswd incorrect\n";
  3209.   exit (1);
  3210.   }
  3211.  
  3212. $oldpwd = "";                         # Destroy the evidence!
  3213.  
  3214. print "\nNew passwd: ";
  3215.  
  3216. $newpwd = <STDIN>;
  3217.  
  3218. print "\nRepeat new passwd: ";
  3219.  
  3220. $rnewpwd = <STDIN>;
  3221.  
  3222. chop $newpwd;
  3223. chop $rnewpwd;
  3224.  
  3225. if ($newpwd ne $rnewpwd)
  3226.   {
  3227.   print "\n Incorrectly typed. Password unchanged.\n";
  3228.   exit (1);
  3229.   }
  3230.  
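@saltchars = ('.','/',0..9,'A'..'Z','a'..'z');   # the 64 valid salt characters
$salt = $saltchars[int(rand(64))].$saltchars[int(rand(64))];   # crypt() takes a two character salt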
  3232. $new_coded_pwd = crypt($newpwd,$salt);
  3233.  
  3234. print "\n\n$name:$new_coded_pwd:$uid:$gid:$gcos:$home:$shell\n";
  3235.  
  3236. Example with `fork()'
  3237.  
The following example uses the `fork' function to start a daemon which goes into the background and watches the system to see which process is using the greatest amount of CPU time each minute. A pipe is opened from the BSD `ps' command.
  3239.  
  3240. #!/local/bin/perl
  3241. #
  3242. # A fork() demo. This program will sit in the background and
  3243. # make a list of the process which uses the maximum CPU average
  3244. # at 1 minute intervals. On a quiet BSD like system this will
  3245. # normally be the swapper (long term scheduler).
  3246. #
  3247.  
  3248. $true = 1;
  3249. $logfile="perl.cpu.logfile";
  3250.  
  3251. print "Max CPU logfile, forking daemon...\n";
  3252.  
  3253. if (fork())
  3254.   {
  3255.   exit(0);
  3256.   }
  3257.  
  3258. while ($true)
  3259.   {
  3260.   open (logfile,">> $logfile") || die "Can't open $logfile\n";
  3261.   open (ps,"/bin/ps aux |") || die "Couldn't open a pipe from ps !!\n";
  3262.  
  3263.   $skip_first_line = <ps>;
  3264.   $max_process = <ps>;
  3265.   close(ps);
  3266.  
  3267.   print logfile $max_process;
  3268.   close(logfile);
  3269.   sleep 60;
  3270.  
  3271.   ($a,$b,$c,$d,$e,$f,$g,$size) = stat($logfile);
  3272.  
  3273.   if ($size > 500)
  3274.      {
  3275.      print STDERR "Log file getting big, better quit!\n";
  3276.      exit(0);
  3277.      }
  3278.   }
  3279.  
  3280. Example reading databases
  3281.  
Here is an example program with several of the above features demonstrated simultaneously. The following program lists all users who have home directories on the current host. If the home area has sub-directories, corresponding to groups, then this is specified on the command line. The word `home' causes the program to print out the home directories of the users.
  3283.  
  3284. #!/local/bin/perl
  3285. ##################################################################
  3286. #
  3287. # allusers - list all users on named host, i.e. all
  3288. #            users who can log into this machine.
  3289. #
  3290. # Syntax: allusers group
  3291. #         allusers mygroup home
  3292. #         allusers myhost group home
  3293. #
  3294. # NOTE : This command returns only users who are registered on
  3295. #        the current host. It will not find users which cannot
  3296. #        be validated in the passwd file, or in the named groups
  3297. #        in NIS. It assumes that the users belonging to
  3298. #        different groups are saved in subdirectories of
  3299. #        /home/hostname.
  3300. #
  3301. ##################################################################
  3302.  
  3303. &arguments();
  3304.  
  3305. die "\n" if ( ! -d "/home/$server" );
  3306.  
  3307. $disks = `/bin/ls -d /home/$server/$group`;
  3308.  
  3309. foreach $home (split(/\s/,$disks))
  3310.   {
  3311.   open (LS,"cd $home; /bin/ls $home |") || die "allusers: Pipe didn't open";
  3312.  
  3313.   while (<LS>)
  3314.      {
  3315.      $exists = "";
  3316.      ($user) = split;
  3317.      ($exists,$pw,$uid,$gid,$qu,$cm,$gcos,$dir)=getpwnam($user);
  3318.  
  3319.      if ($exists)
  3320.         {
  3321.         if ($printhomes)
  3322.            {
  3323.            print "$dir\n";
  3324.            }
  3325.         else
  3326.            {
  3327.            print "$user\n";
  3328.            }
  3329.         }
  3330.      }
  3331.   close(LS);
  3332.   }
  3333.  
  3334. ########################################################
  3335.  
  3336. sub arguments
  3337.   {
  3338.   $printhomes = 0;
  3339.   $group = "*";
  3340.   $server = `/bin/hostname`;
  3341.   chop $server;
  3342.  
  3343.   foreach $arg (@ARGV)
  3344.      {
  3345.      if (substr($arg,0,1) eq "u")
  3346.         {
  3347.         $group = $arg;
  3348.         next;
  3349.         }
  3350.  
  3351.      if ($arg eq "home")
  3352.         {
  3353.         $printhomes = 1;
  3354.         next;
  3355.         }
  3356.  
  3357.      $server= $arg;     #default is to interpret as a server.
  3358.      }
  3359.   }
  3360.  
  3361. Pattern matching and extraction
  3362.  
  3363. Perl has regular expression operators for identifying patterns. The operator
  3364.  
  3365.  
  3366.     /regular expression/
  3367.  
  3368.  
returns true or false depending on whether the regular expression matches the contents of $_. For example
  3370.  
  3371.  
  3372.  if (/perl/)
  3373.     {
  3374.     print "String contains perl as a substring";
  3375.     }
  3376.  
  3377.  if (/(Sat|Sun)day/)
  3378.     {
  3379.     print "Weekend day....";
  3380.     }
  3381.  
  3382. The effect is rather like the grep command. To use this operator on other variables you would write:
  3383.  
  3384.  
  3385.  $variable =~ /regexp/
  3386.  
Regular expressions can contain parenthetic sub-expressions, e.g.
  3388.  
  3389.  
  3390.  if (/(Sat|Sun)day (..)th (.*)/)
  3391.     {
  3392.     $first = $1;
  3393.     $second = $2;
  3394.     $third = $3;
  3395.     }
  3396.  
  3397. in which case perl places the objects matched by such sub-expressions in the variables $1, $2 etc.
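
Here is a small sketch of the same pattern applied to a named variable with `=~'. The test string is invented for illustration, and the `my' declarations assume Perl 5.

```perl
use strict;
use warnings;

my $text = "Saturday 14th March";   # hypothetical input string
my ($day, $date, $rest) = ("", "", "");

if ($text =~ /(Sat|Sun)day (..)th (.*)/)
   {
   # Each pair of parentheses fills one numbered variable.
   ($day, $date, $rest) = ($1, $2, $3);
   }

print "day = $day, date = $date, rest = $rest\n";
```

It is wise to copy `$1', `$2' etc. into named variables straight away, since the next successful match overwrites them.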
  3398.  
  3399. Searching and replacing text
  3400.  
The `sed'-like function for replacing all occurrences of a string is easily implemented in Perl using
  3402.  
  3403. while (<input>)
  3404.    {
  3405.    s/$search/$replace/g;
  3406.    print output;
  3407.    }
  3408.  
  3409. This example replaces the string inside the default variable. To replace in a general variable we use the operator `=~', with syntax:
  3410.  
  3411. $variable =~ s/search/replace/
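
The effect of the `g' flag can be seen in the following sketch, which applies the same substitution to two copies of an invented string. It also uses the fact that the substitution operator returns the number of replacements made.

```perl
use strict;
use warnings;

my $line = "the cat sat on the mat";   # hypothetical input line

my $first_only = $line;
my $all        = $line;

$first_only =~ s/at/og/;               # replaces the first match only
my $count = ($all =~ s/at/og/g);       # replaces every match, returns how many

print "$first_only\n";
print "$all\n";
print "$count replacements\n";
```

Without the `g' flag only `cat' is changed; with it all three words are changed.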
  3412.  
Here is an example of this operator in use. The following is a program which searches and replaces a string in several files. This is a useful program indeed for making a change globally in a group of files! The program is called `file-replace'.
  3414.  
  3415. #!/local/bin/perl
  3416. ##############################################################
  3417. #
  3418. # Look through files for findstring and change to newstring
  3419. # in all files.
  3420. #
  3421. ##############################################################
  3422.  
  3423. #
  3424. # Define a temporary file and check it doesn't exist
  3425. #
  3426.  
  3427. $outputfile = "tmpmarkfind";
  3428. unlink $outputfile;
  3429.  
  3430. #
  3431. # Check command line for list of files
  3432. #
  3433.  
  3434. if ($#ARGV < 0)
  3435.    {
  3436.    die "Syntax: file-replace [file list]\n";
  3437.    }
  3438.  
  3439. print "Enter the string you want to find (Don't use quotes):\n\n:";
  3440. $findstring=<STDIN>;
  3441. chop $findstring;
  3442.  
  3443. print "Enter the string you want to replace with (Don't use quotes):\n\n:";
  3444. $replacestring=<STDIN>;
  3445. chop $replacestring;
  3446.  
  3447. #
  3448.  
  3449. print "\nFind: $findstring\n";
  3450. print "Replace: $replacestring\n";
  3451. print "\nConfirm (y/n)  ";
  3452. $y = <STDIN>;
  3453. chop $y;
  3454.  
  3455. if ( $y ne "y")
  3456.    {
  3457.    die "Aborted -- nothing done.\n";
  3458.    }
  3459. else
  3460.    {
  3461.    print "Use CTRL-C to interrupt...\n";
  3462.    }
  3463.  
  3464. #
  3465. # Now shift default array @ARGV to get arguments 1 by 1
  3466. #
  3467.  
  3468. while ($file = shift)      
  3469.    {
  3470.    if ($file eq "file-replace")
  3471.       {
      print "file-replace will not operate on itself!\n";
  3473.       next;
  3474.       }
  3475.  
  3476.    #
  3477.    # Save existing mode of file for later
  3478.    #
  3479.  
  3480.    ($dev,$ino,$mode)=stat($file);
  3481.  
  3482.    open (INPUT,$file) || warn "Couldn't open $file\n";
  3483.    open (OUTPUT,"> $outputfile") || warn "Can't open tmp";
  3484.  
  3485.    $notify = 1;
  3486.  
  3487.    while (<INPUT>)
  3488.       {
  3489.       if (/$findstring/ && $notify)
  3490.          {
  3491.          print "Fixing $file...\n";
  3492.          $notify = 0;
  3493.          }
  3494.       s/$findstring/$replacestring/g;
  3495.       print OUTPUT;
  3496.       }
  3497.  
  3498.    close (OUTPUT);
  3499.  
  3500.    #
  3501.    # If nothing went wrong (if outfile not empty)
  3502.    # move temp file to original and reset the
  3503.    # file mode saved above
  3504.    #
  3505.  
  3506.    if (! -z $outputfile)
  3507.       {
  3508.       rename ($outputfile,$file);
  3509.       chmod ($mode,$file);
  3510.       }
  3511.    else
  3512.       {
      print "Warning: file empty!\n";
  3514.       }
  3515.    }
  3516.  
Similarly we can search for lines containing a string. Here is the grep program written in perl:
  3518.  
  3519. #!/local/bin/perl
  3520. #
  3521. # grep as a perl program
  3522. #
  3523.  
# The first argument is the pattern. Remove it from @ARGV so
# that <> treats only the remaining arguments as file names.

$pattern = shift;

while (<>)
   {
   print if (/$pattern/);
   }
  3530.  
  3531. The operator `/search-string/' returns true if the search string is a substring of the default variable $_. To search an arbitrary string, we write
  3532.  
.... if ($teststring =~ /search-string/);

Here $teststring is searched for occurrences of search-string and the result is true if one is found.
  3536.  
In perl you can use regular expressions to search for text patterns. Note however that, like all regular expression dialects, perl has its own conventions. For example, the dollar sign still means "match the end of line" in perl, but since it also introduces variable names it must be used with care inside a pattern; the end of line can also be matched explicitly with the `\n' symbol. Here is an example program which illustrates the use of regular expressions in perl:
  3538.  
  3539. #!/local/bin/perl
  3540. #
  3541. # Test regular expressions in perl
  3542. #
  3543. # NB - careful with \ $ * symbols etc. Use " quotes since
  3544. #      the shell interprets these!
  3545. #
  3546.  
  3547. open (FILE,"regex_test");
  3548.  
  3549. $regex = $ARGV[$#ARGV];
  3550.  
  3551. print "Looking for $ARGV[$#ARGV] in file...\n";
  3552.  
  3553. while (<FILE>)
  3554.    {
  3555.    if (/$regex/)
  3556.       {
  3557.       print;
  3558.       }
  3559.    }
  3560.  
  3561. #
  3562. # Test like this:
  3563. #
  3564. #  regex '.*'       - prints every line (matches everything)
#  regex '.'        - all lines except empty ones
#                     (. matches any character except newline)
#  regex '[a-z]'    - matches any line containing lowercase
#  regex '[^a-z]'   - matches any line containing something which is
#                     not lowercase a-z
  3570. #  regex '[A-Za-z]' - matches any line containing letters of any kind
  3571. #  regex '[0-9]'    - match any line containing numbers
  3572. #  regex '#.*'      - line containing a hash symbol followed by anything
  3573. #  regex '^#.*'     - line starting with hash symbol (first char)
  3574. #  regex ';\n'      - match line ending in a semi-colon
  3575. #
  3576.  
Try running this program on the following test data, which is the file called `regex_test' in the example program.
  3578.  
  3579.  
  3580. # A line beginning with a hash symbol
  3581.  
  3582. JUST UPPERCASE LETTERS
  3583.  
  3584. just lowercase letters
  3585.  
  3586. Letters and numbers 123456
  3587.  
  3588. 123456
  3589.  
  3590. A line ending with a semi-colon;
  3591.  
  3592. Line with a comment # COMMENT...
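To see what to expect, here is a small sketch which applies a few patterns to some of the test lines. The lines are held in an array rather than read from `regex_test' so that the fragment is self-contained; the sub name matching_lines is made up for the illustration.

```perl
# Sketch: count how many of the sample lines each pattern matches.
# The array stands in for the file regex_test; matching_lines is a
# hypothetical helper, not part of the program above.

@lines = ("# A line beginning with a hash symbol\n",
          "\n",
          "JUST UPPERCASE LETTERS\n",
          "123456\n",
          "A line ending with a semi-colon;\n");

sub matching_lines
   {
   my ($regex, @data) = @_;
   return grep { /$regex/ } @data;   # keep only the lines which match
   }

foreach $regex ('.', '^#', ';$', ';\n', '[a-z]')
   {
   @found = matching_lines($regex, @lines);
   print "$regex matches ", scalar(@found), " line(s)\n";
   }
```

In this data the patterns `;$' and `;\n' both select only the line ending in a semi-colon, and `.' selects every line except the empty one.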
  3593.  
  3594. Example: convert mail to WWW pages
  3595.  
  3596. Here is an example program which you could use to automatically turn a mail message of the form
  3597.  
  3598. From: Newswire
  3599. To: Mail2html
  3600. Subject: Nothing happened
  3601.  
  3602. On the 13th February at kl. 09:30 nothing happened. No footprints
  3603. were found leading to the scene of a terrible murder, no evidence
  3604. of a struggle .... etc etc
  3605.  
  3606. into an html-file for the world wide web. The program works by extracting the message body and subject from the mail and writing html-commands around these to make a web page. The subject field of the mail becomes the title. The other headers get skipped, since the script searches for lines containing the sequence "colon-space" or `: '. A regular expression is used for this.
  3607.  
  3608. #!/local/bin/perl
  3609. #
  3610. # Make HTML from mail
  3611. #
  3612.  
  3613. &BeginWebPage();
  3614. &ReadNewMail();
  3615. &EndWebPage();
  3616.  
  3617. ##########################################################
  3618.  
  3619. sub BeginWebPage
  3620.  
  3621. {
  3622.     print "<HTML>\n";
  3623.     print "<BODY>\n";
  3624. }
  3625.  
  3626. ##########################################################
  3627.  
  3628. sub EndWebPage
  3629.  
  3630. {
  3631.     print "</BODY>\n";
  3632.     print "</HTML>\n";
  3633. }
  3634.  
  3635. ##########################################################
  3636.  
  3637. sub ReadNewMail
  3638.  
  3639. {
  3640. while (<>)
  3641.    {
  3642.    if (/Subject:/)   # Search for subject line
  3643.       {
  3644.       # Extract subject text...
  3645.  
  3646.       chop;
  3647.       ($left,$right) = split(":",$_);
  3648.       print "<H1> $right </H1>\n";
  3649.       next;
  3650.       }
  3651.    elsif (/.*: .*/)   # Search for - anything: anything
  3652.       {
  3653.       next;           # skip other headers
  3654.       }
  3655.  
  3656.    print;
  3657.    }
  3658. }
  3659.  
  3660. Generate WWW pages automagically
  3661.  
The following program scans through the password database and builds a standardized html-page for each user it finds there. It fills in the name of the user in each case. Note the use of the `<<' operator for extended input, already used in the context of the shell (see section Pipes and redirection in csh). This allows us to format a whole passage of text, inserting variables at strategic places, and avoid having to print over many lines.
  3663.  
  3664.  
  3665. #!/local/bin/perl
  3666. #
  3667. # Build a default home page for each user in /etc/passwd
  3668. #
  3669. #
  3670.  
  3671. ####################################################################
  3672. # Level 0 (main)
  3673. ####################################################################
  3674.  
  3675. $true = 1;
  3676. $false = 0;
  3677.  
  3678. # First build an associated array of users and full names
  3679.  
  3680. setpwent();
  3681.  
  3682. while ($true)
  3683.   {
  3684.   ($name,$passwd,$uid,$gid,$quota,$comment,$fullname) = getpwent;
  3685.   $FullName{$name} = $fullname;
  3686.   print "$name - $FullName{$name}\n";
  3687.   last if ($name eq "");
  3688.   }
  3689.  
  3690. print "\n";
  3691.  
  3692. # Now make a unique filename for each page and open a file
  3693.  
  3694. foreach $user (sort keys(%FullName))
  3695.   {
  3696.   next if ($user eq "");
  3697.  
  3698.   print "Making page for $user\n";
  3699.   $outputfile = "$user.html";
  3700.  
  3701.   open (OUT,"> $outputfile") || die "Can't open $outputfile\n";
  3702.  
  3703.   &MakePage;
  3704.  
  3705.   close (OUT);
  3706.   }
  3707.  
  3708. ####################################################################
  3709. # Level 1
  3710. ####################################################################
  3711.  
  3712. sub MakePage
  3713.  
  3714. {
  3715. print OUT <<ENDMARKER;
  3716.  
<HTML>
<HEAD><TITLE>$FullName{$user}'s Home Page</TITLE></HEAD>
<BODY>
<H1>$FullName{$user}'s Home Page</H1>
  3721.  
  3722. Hi welcome to my home page. In case you hadn't
  3723. got it yet my name is: $FullName{$user}...
  3724.  
  3725. I study at <a href=http://www.iu.hioslo.no>H&oslash;gskolen i Oslo</a>.
  3726.  
  3727. </BODY>
  3728. </HTML>
  3729.  
  3730. ENDMARKER
  3731. }
  3732.  
  3733.  
  3734. Other supported functions
  3735.  
  3736. Perl has very many functions which come directly from the C library. To give a taster, a few are listed here. The Perl book contains a comprehensive description of these.
  3737.  
  3738. Fork
  3739.    The standard UNIX fork command for spawning new processes.
  3740. Sockets
  3741.    Support for network socket communication.
  3742. Directories
  3743.    Directory opening and handling routines.
  3744. Databases
   Reading from the password files and the host databases is supported through the standard C functions `getpwnam' etc. dressed up to look like Perl.
  3746. Crypt
  3747.    The password encryption function.
  3748. Regexp
  3749.    Regular expressions and pattern matching, search and replace functions as in `sed'.
  3750. Operators
  3751.    Perl has the full set of C's logical operators.
  3752. File testing
  3753.    Tests from the shell like `if (-f file)'.
  3754.  
  3755. Here are some of the most frequently used functions
  3756.  
chmod
   Change the file mode of a file. The mode should be given in octal, e.g. chmod 0755,"filename";
chdir
   Change the current working directory. e.g. chdir "/etc";
stat
   Get info about permissions, ownership and type of a file.
open
   Open a file: a plain name opens it for reading, a leading `>' opens it for writing and a `|' opens a pipe to or from a command.
close
   Close an open file handle.
system
   Execute a shell command as a child process. e.g. system "ls";
split
   Split a string variable into an array of elements, by searching for a special character (space or `:' etc.) e.g. @array=split(":",$string);
rename
   Rename a file. e.g. rename "old-name","new-name";
mkdir
   Make a new directory. e.g. mkdir "newdir",0755;
shift
   Read the first element of an array and delete it, shifting all the array elements down by one. (e.g. $first=shift(@array);).
chop
   Chops off the last character of a string. Often used for deleting the end-of-line character when reading from a file.
oct
   Interprets a number as octal (converts to decimal). e.g. $decimal = oct(755);
kill
   Send a signal to a list of processes. e.g. kill 9,$pid1,$pid2;
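A short sketch shows how a few of these functions fit together. The password-file style line is invented for the illustration and no files are touched.

```perl
# Sketch: chop, split, shift and oct applied to a made-up line in the
# style of /etc/passwd. All of the values are assumptions.

$string = "mark:x:1001:100:Mark Burgess:/home/mark:/bin/sh\n";

chop($string);                    # remove the end-of-line character
@fields = split(":", $string);    # break the line into its fields
$login  = shift(@fields);         # take (and remove) the first field
$mode   = oct("755");             # octal 755 as an ordinary number

print "login: $login\n";
print "fields left: ", scalar(@fields), "\n";
print "shell: $fields[$#fields]\n";
printf "mode: %d decimal, %o octal\n", $mode, $mode;
```

Note that shift shortens the array, so after it the full name is found in $fields[3] rather than $fields[4].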
  3783.  
  3784. You should explore Perl's possibilities yourself. Perl is a good alternative to the shell which has much of the power of C and is therefore ideal for simple and more complex system programming tasks. If you intend to be a system administrator for UNIX systems, you could do much worse than to read the Perl book and learn Perl inside out.
  3785.  
  3786. Summary
  3787.  
  3788. The Practical Extraction and Report Language is a powerful tool which goes beyond shell programming, but which retains much of the immediateness of shell programming in a more formal programming environment.
  3789.  
  3790. The success of Perl has led many programmers to use it exclusively. In the next section, I would like to argue that programming directly in C is not much harder. In fact it has advantages in the long run. The power of Perl is that it is as immediate as shell programming. If you are inexperienced, Perl is a little easier than C because many features are ready programmed into the language, but with time one also builds up a repertoire of C functions which can do the same tricks.
  3791.  
  3792. Exercises
  3793.  
   Write a program which prints out all of its arguments alphabetically together with the first and the last, and the number of arguments.
   Write a program which prints out the pathname of the home directory for a given user. The user's login name should be given as an argument.
   Write a program called `search-replace' which looks for a given string in a list of files and replaces it with a new string. You should be able to specify a list of files using ordinary unix wildcards. e.g. `search-replace search-string replace-string *.text'. This is a dangerous operation! What if the user types the strings incorrectly? How can you make the program safer?
  3797.    Write a program which opens a pipe from `ps' and computes the total cpu-time used by each user. Print the results in order of maximum to minimum. Hint: use an associated array to store the information.
  3798.    Write a program which forks and goes into the background. Make the program send you mail when some other user of your choice logs in. Use sleep to check only every few minutes.
  3799.    Open a pipe from `find' and collect statistics over how many files there are in all of your sub-directories.
  3800.  
  3801. Project
  3802.  
  3803. Write a program which checks the `sanity' of your UNIX system.
  3804.  
  3805.    Check that the password file /etc/passwd is not writable by general users.
  3806.    Check that the processes `cron' and `sendmail' are running.
  3807.    Check that, if the file `/etc/exports' or `/etc/dfs/dfstab' exists, the nfsd daemon is running.
  3808.    Check that if the filesystem table `/etc/fstab' (or its equivalent on non-BSD systems) contains NFS mounted filesystems, the `biod' or `nfsiod' daemon is running.
  3809.    Check that the file `/etc/resolv.conf' contains the correct domain name. It may or may not be the same as that returned by the shell command `domainname'. If it is not the same, you should print the message `NIS domain has different name to DNS domain'.
  3810.  
  3811. WWW and CGI programming
  3812.  
  3813. CGI stands for the Common Gateway Interface. It is the name given to scripts which can be executed from within pages of the world wide web. Although it is possible to use any language in CGI programs (hence the word `common'), the usual choice is Perl, because of the ease with which Perl can handle text.
  3814.  
  3815. The CGI interface is pretty unintelligent, in order to be as general as possible, so we need to do a bit of work in order to make scripts work.
  3816.  
  3817. Permissions
  3818.  
  3819. The key thing about the WWW which often causes a lot of confusion is that the W3 service runs with a user ID of `nobody'. The purpose of this is to ensure that nobody has the right to read or write files unless they are opened very explicitly by the user who owns them.
  3820.  
  3821. In order for files to be readable on the WWW, they must have file mode `644' and they must lie in a directory which has mode `755'. In order for a CGI program to be executable, it must have permission `755' and in order for such a program to write to a file in a user's directory, it must be possible for the file to be created (if necessary) and everyone must be able to write to it. That means that files which are written to by the WWW must have mode `666' and must either exist already or lie in a directory with permission `777'(6).
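As a rough sketch, the permission bits mentioned above can be tested in perl once the mode is known. The sub names are made up for the illustration; in a real script the mode would normally come from stat, as in ($dev,$ino,$mode) = stat($file).

```perl
# Sketch: decide from a file mode whether 'everyone' may read or write.
# world_readable and world_writable are hypothetical helper names.

sub world_readable { my ($mode) = @_; return ($mode & 0004) ? 1 : 0; }
sub world_writable { my ($mode) = @_; return ($mode & 0002) ? 1 : 0; }

foreach $perm ("644", "755", "666", "600")
   {
   $mode = oct($perm);            # the modes are written in octal
   printf "%s: readable by all: %d, writable by all: %d\n",
          $perm, world_readable($mode), world_writable($mode);
   }
```

Of the four modes tried here, only `666' passes both tests, and `600' passes neither.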
  3822.  
  3823. Protocols
  3824.  
  3825. CGI script programs communicate with W3 browsers using a very simple protocol. It goes like this:
  3826.  
   A web page sends data to a script using the `forms' interface. Those data are concatenated into a single line. The data in separate fields of a form are separated by `&' signs. New lines are replaced by the text `%0D%0A', which is the hexadecimal coding of the carriage-return and line-feed characters (the DOS representation of a newline), and spaces are replaced by `+' symbols.
   A CGI script reads this single line of text on the standard input.
   The CGI script replies to the web browser. The first line of the reply must be a line which tells the browser what mime-type the data are sent in. Usually, a CGI script replies in HTML code, in which case the first line in the reply must be:
  3830.  
  3831.      Content-type: text/html
  3832.  
  3833.    This must be followed by a blank line.
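The encoding described in the first step can be sketched in perl as follows. This is only an illustration of what the browser does for us: form_encode is a made-up helper and the field names and values are invented. The set of characters left unencoded is an assumption based on common browser behaviour.

```perl
# Sketch of the encoding applied to form data before it is sent.
# form_encode is a hypothetical helper; real browsers do this themselves.

sub form_encode
   {
   my ($text) = @_;
   $text =~ s/\r?\n/\r\n/g;       # a newline is sent as CR LF
   # everything except letters, digits, space and _ . * - becomes %XX
   $text =~ s/([^A-Za-z0-9 _.*-])/sprintf("%%%02X",ord($1))/ge;
   $text =~ s/ /+/g;              # spaces become + symbols
   return $text;
   }

%form = ("variable1" => "Mark Burgess",
         "variable2" => "line one\nline two");

foreach $name (sort keys(%form))
   {
   push(@pairs, "$name=" . form_encode($form{$name}));
   }

$line = join("&", @pairs);        # fields are separated by & signs

print "$line\n";
```

The single line printed by this fragment has the same shape as the line a CGI script reads from its standard input.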
  3834.  
  3835. HTML coding of forms
  3836.  
To start a CGI program from a web page we use a form, which is a part of the HTML code enclosed within the tags
  3838.  
  3839.  
  3840. <FORM method="POST" ACTION="/cgi-script-alias/program.pl">
  3841.     ...
  3842. </FORM>
  3843.  
The method `post' means that the data which get typed into this form will be piped into the CGI program via its standard input. The `action' specifies which program you want to start. Note that you cannot simply use the absolute path of the file, for security reasons. You must use something called a `script alias' to tell the web server where to find the program. If you do not have a script alias defined for you personally, then you need to get one from your system administrator. By using a script alias, no one from outside your site can see where your files are located, only that you have a `cgi-bin' area somewhere on your system.

Within these tags, you can arrange to collect different kinds of input. The simplest kind of input is just a button which starts the CGI program. This has the form
  3847.  
  3848.  
  3849. <INPUT TYPE="submit" VALUE="Start my program">
  3850.  
  3851. This code creates a button. When you click on it the program in your `action' string gets started. More generally, you will want to create input boxes where you can type in data. To create a single line input field, you use the following syntax:
  3852.  
  3853.  
  3854. <INPUT NAME="variable-name" SIZE=40>
  3855.  
  3856. This creates a single line text field of width 40 characters. This is not the limit on the length of the string which can be typed into the field, only a limit on the amount which is visible at any time. It is for visual formatting only. The NAME field is used to identify the data in the CGI script. The string you enter here will be sent to the CGI script in the form `variable-name=value of input...'. Another type of input is a text area. This is a larger box where one can type in text on several lines. The syntax is:
  3857.  
  3858.  
<TEXTAREA NAME="variable-name" ROWS=50 COLS=50></TEXTAREA>

which means: create a text area of fifty rows by fifty columns. Again, the size has only to do with the visual formatting, not to do with limits on the amount of text which can be entered.
  3862.  
  3863. As an example, let's create a WWW page with a complete form which can be used to make a guest book, or order form.
  3864.  
  3865. <HTML>
  3866. <HEAD>
  3867. <TITLE>Example form</TITLE>
  3868. <!-- Comment: Mark Burgess, 27-Jan-1997 -->
  3869. <LINK REV="made" HREF="mailto:mark@iu.hioslo.no">
  3870. </HEAD>
  3871. <BODY>
  3872. <CENTER><H1>Write in my guest book...</H1></CENTER>
  3873. <HR>
  3874.  
<CENTER><H2>Please leave a comment using the form below.</H2></CENTER><P>
  3876. <FORM method="POST" ACTION="/cgi-bin-mark/comment.pl">
  3877.  
  3878. Your Name/e-mail: <INPUT NAME="variable1" SIZE=40> <BR><BR>
  3879.  
  3880. <P>
  3881. <TEXTAREA NAME="variable2" cols=50 rows=8></TEXTAREA>
  3882. <P>
  3883.  
  3884. <INPUT TYPE=submit VALUE="Add message to book">
  3885. <INPUT TYPE=reset VALUE="Clear message">
  3886. </FORM>
  3887.  
  3888. <P>
  3889.  
  3890. </BODY>
  3891. </HTML>
  3892.  
  3893. The reset button clears the form. When the submit button is pressed, the CGI program is activated.
  3894.  
  3895. Perl and the web
  3896.  
  3897. Interpreting data from forms
  3898.  
To interpret and respond to the data in a form, we must write a program which satisfies the protocol above (see section Protocols). We use perl as a script language. The simplest valid CGI script is the following:
  3900.  
  3901.  
  3902. #!/local/bin/perl
  3903.  
  3904. #
  3905. # Reply with proper protocol
  3906. #
  3907.  
  3908. print "Content-type: text/html\n\n";
  3909.  
  3910. #
  3911. # Get the data from the form ...
  3912. #
  3913.  
  3914. $input = <STDIN>;
  3915.  
  3916. #
  3917. # ... and echo them back
  3918. #
  3919.  
  3920. print $input, "\n Done! \n";
  3921.  
  3922. Although rather banal, this script is a useful starting point for CGI programming, because it shows you just how the input arrives at the script from the HTML form. The data arrive all in a single, enormously long line, full of funny characters. The first job of any script is to decode this line.
  3923.  
Before looking at how to decode the data, we should make an important point about the protocol line. If the web server does not get this `Content-type' line from the CGI script, it returns an error to the browser:
  3925.  
  3926.  
  3927. 500 Server Error
  3928.  
  3929. The server encountered an internal error or misconfiguration and was
  3930. unable to complete your request.
  3931.  
  3932. Please contact the server administrator, and inform them of the time
  3933. the error occurred, and anything you might have done that may have
  3934. caused the error.
  3935.  
  3936. Error: HTTPd: malformed header from script www/cgi-bin/comment.pl
  3937.  
Before finishing your CGI script, you will probably encounter this error several times. A common reason for getting the error is a syntax error in your script. If your program contains an error, the first thing the server gets in return is not the `Content-type' line, but an error message. The server does not pass on this error message, it just sends the uninformative message above.
  3939.  
  3940. If you can get the above script to work, then you are ready to decode the data which are sent to the script. The first thing is to use perl to split the long line into an array of lines, by splitting on `&'. We can also convert all of the `+' symbols back into spaces. The script now looks like this:
  3941.  
  3942. #!/local/bin/perl
  3943.  
  3944. #
  3945. # Reply with proper protocol
  3946. #
  3947.  
  3948. print "Content-type: text/html\n\n";
  3949.  
  3950. #
  3951. # Get the data from the form ...
  3952. #
  3953.  
  3954. $input = <STDIN>;
  3955.  
  3956. #
  3957. # ... and echo them back
  3958. #
  3959.  
  3960. print "$input\n\n\n";
  3961.  
  3962. $input =~ s/\+/ /g;
  3963.  
  3964. #
  3965. # Now split the lines and convert
  3966. #
  3967.  
  3968. @array = split('&',$input);
  3969.  
  3970. foreach $var ( @array )
  3971.   {
  3972.   print "$var\n";
  3973.   }
  3974.  
  3975. print "Done! \n";
  3976.  
  3977. We now have a series of elements in our array. The output from this script is something like this:
  3978.  
  3979.  
  3980. variable1=Mark+Burgess&variable2=%0D%0AI+just+called+to+say+ (wrap)
  3981. ....%0D%0A...hey+pig%2C+nothing%27s+working+out+the+way+I+planned
  3982. variable1=Mark Burgess variable2=%0D%0AI just called to say  (wrap)
  3983. ....%0D%0A...hey pig%2Cnothing%27s working out the way I planned Done!
  3984.  
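As you can see, control characters and punctuation such as commas and apostrophes are converted into the form `%XX', where XX is the hexadecimal code of the character. We should now try to do something with these. Since we are usually not interested in keeping new lines, or any other control codes, we can simply null-out these (at the cost of losing the encoded punctuation as well) with a line of the form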
  3986.  
  3987.  
  3988. $input =~ s/%..//g;
  3989.  
  3990. The regular expression `%..' matches anything beginning with a percent symbol followed by two characters. The resulting output is then free of these symbols. We can then separate the variable contents from their names by splitting the input. Here is the complete code:
  3991.  
  3992. #!/local/bin/perl
  3993.  
  3994. #
  3995. # Reply with proper protocol
  3996. #
  3997.  
  3998. print "Content-type: text/html\n\n";
  3999.  
  4000. #
  4001. # Get the data from the form ...
  4002. #
  4003.  
  4004. $input = <STDIN>;
  4005.  
  4006. #
  4007. # ... and echo them back
  4008. #
  4009.  
  4010. print "$input\n\n\n";
  4011.  
  4012. $input =~ s/%..//g;
  4013.  
  4014. $input =~ s/\+/ /g;
  4015.  
  4016. @array = split('&',$input);
  4017.  
  4018. foreach $var ( @array )
  4019.   {
  4020.   print "$var<br>";
  4021.   }
  4022.  
  4023. print "<hr>\n";
  4024.  
  4025. ($name,$variable1) = split("variable1=",$array[0]);
  4026. ($name,$variable2) = split("variable2=",$array[1]);
  4027.  
  4028. print "<br>var1 = $variable1<br>";
  4029. print "<br>var2 = $variable2<br>";
  4030.  
  4031. print "<br>Done! \n";
  4032.  
  4033. and the output
  4034.  
  4035. variable1=Mark+Burgess&variable2=%0D%0AI+just+called+to+say (wrap)
  4036. +....%0D%0A...hey+pig%2C+nothing%27s+working+out+the+way+I+planned
  4037. variable1=Mark Burgess
  4038. variable2=I just called to say .......hey pig nothings working (wrap)
  4039. out the way I planned
  4040.  
  4041. var1 = Mark Burgess
  4042.  
  4043. var2 = I just called to say .......hey pig nothings working out  (wrap)
  4044. the way I planned
  4045.  
  4046. Done!
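If the punctuation should be kept rather than discarded, the `%XX' codes can be converted back into characters instead. The following is a minimal sketch of that alternative, which also collects the fields in an associative array. The names form_decode and parse_form are made up, and the input line is a shortened version of the example above.

```perl
# Sketch: turn the single line of form data into an associative array,
# decoding %XX codes instead of deleting them.
# form_decode and parse_form are hypothetical names.

sub form_decode
   {
   my ($text) = @_;
   $text =~ s/\+/ /g;                              # + back to space
   $text =~ s/%0D%0A/\n/gi;                        # CR LF back to newline
   $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # %XX back to character
   return $text;
   }

sub parse_form
   {
   my ($input) = @_;
   my (%form, $pair, $name, $value);

   $input =~ s/\s+$//;                             # drop any trailing newline
   foreach $pair (split('&', $input))
      {
      ($name, $value) = split('=', $pair, 2);      # split at the first = only
      $value = "" unless defined($value);
      $form{form_decode($name)} = form_decode($value);
      }
   return %form;
   }

%form = parse_form("variable1=Mark+Burgess&variable2=hey+pig%2C+nothing%27s+working");

foreach $name (sort keys(%form))
   {
   print "$name = $form{$name}\n";
   }
```

The line is split on `&' and `=' before anything is decoded, so that a value which itself contains one of these characters does not confuse the splitting. Bear in mind that decoded text may contain characters such as `<' and `&', which matters if it is later written into an HTML page.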
  4047.  
  4048. A complete guestbook example in perl
  4049.  
Let us now use this technique to develop a guest book application. Based on the code above, analyze the following code.
  4051.  
  4052. #!/local/bin/perl
  4053. ####################################################################
  4054. #
  4055. # Guest book
  4056. #
  4057. ####################################################################
  4058.  
  4059. $guestbook_page = "/iu/nexus/ud/mark/www/tmp/cfguest.html";
  4060.  
  4061. $tmp_page = "/iu/nexus/ud/mark/www/tmp/guests.tmp";
  4062.  
  4063. $remote_host = $ENV{"REMOTE_HOST"};
  4064.  
  4065. print "Content-type: text/html\n\n";
  4066. print "<br><hr><br>\n";
  4067. print "Thank you for submitting your comment!<br><br>\n";
  4068. print "best wishes,<br><br>";
  4069. print "-Mark<br><br><br>";
  4070. print "Return to <a href=http://www.iu.hioslo.no/~mark/menu.html>menu</a>\n";
  4071.  
  4072. $input = <STDIN>;
  4073.  
  4074. $input =~ s/%..//g;
  4075.  
  4076. $input =~ s/\+/ /g;
  4077.  
  4078. @array = split('&',$input);
  4079.  
($skip,$name) = split("variable1=",$array[0]);
($skip,$message) = split("variable2=",$array[1]);
  4082.  
if (! open (PAGE, $guestbook_page))
  {
  print "couldn't open guestbook page file!";
  exit;
  }

if (! open (TMP, "+>$tmp_page"))
  {
  print "couldn't open temporary output file!";
  exit;
  }
  4094.  
  4095. while ($line = <PAGE>)
  4096.   {
  4097.   if ($line =~ /<h3>Number of entries: (..)/)
  4098.      {
  4099.      $entry_no = $1;
  4100.      $entry_no++;
  4101.      $line = "<h3>Number of entries: $entry_no </h3>\n";
  4102.      }
  4103.  
  4104.    if ($line =~ /<!-- LAST ENTRY -->/)
  4105.       {
      $date = `date +"%A, %b %d %Y"`;
      chop($date);                     # remove the newline from date's output
  4107.       print TMP "<b>Entry $date from host: $remote_host</b>\n<p>\n";
  4108.       print TMP "From: $name\n<p>\n";
  4109.       print TMP $message;
  4110.       print TMP "\n<hr>\n";
  4111.       }
  4112.  
  4113.    print TMP "$line";
  4114.    }
  4115.    
  4116. close PAGE;
  4117. close TMP;
  4118.  
  4119. if (! rename ($tmp_page, $guestbook_page))
  4120.   {
  4121.   print "Oops! Rename operation failed!\n";
  4122.   }
  4123.  
  4124. chmod (0600, $guestbook_page);
  4125.  
This script works by reading through the old guest book file, writing a new copy of the guest book file and inserting the new message at the end of the message section. The end of the message section (not counting the `</HTML>' tags) is marked by a comment line.
  4127.  
  4128. <!-- LAST ENTRY -->
  4129.  
  4130. Note that a provisional guest book file has to exist in the first place. The script writes to a new file and then swaps the new file for the old one. The guest book file looks something like this:
  4131.  
  4132. <html><head>
  4133. <title>Comments</title>
  4134. </head>
  4135. <body>
  4136. <h1>My guest book</h1>
  4137.  
  4138. <b>Entry no.  Wednesday, Feb 28 1996
  4139. from host: dax</b>
  4140. <p>
  4141. From: Mark.Burgess@iu.hioslo.no
  4142. <p>
  4143. Just to start the ball rolling....
  4144.  
  4145. <hr>
  4146.  
  4147. <b>Entry no.  Tuesday, Mar 26 1996
  4148. from host: enterprise.subspace.net</b>
  4149. <p>
  4150. From: spock@enterprise
  4151. <p>
  4152. Registering a form of energy never before encountered.
  4153.  
  4154. <!-- LAST ENTRY -->
  4155.  
  4156. </body> <address><a href="http://www.iu.hioslo.no/~mark">Mark
Burgess</a> - Mark.Burgess@iu.hioslo.no</address> </html>
  4158.  
  4159. The directory in which this file lies needs to be writable to the user nobody (the WWW user) and the files within need to be deletable by nobody but no one else. Some users try to make guest book scripts setuid-themselves in order to overcome the problem that httpd runs with uid nobody, but this opens many security issues. In short it is asking for trouble. Unfortunately an ordinary user cannot use chown in order to give access only to the WWW user nobody, so this approach needs the cooperation of the system administrator. Nevertheless this is the most secure approach. Try to work through this example step for step.
  4160.  
  4161. PHP and the web
  4162.  
The PHP 3 language makes the whole business of web programming rather simpler than perl. It hides the business of translating variables from forms into new variables in a CGI program and it even allows you to embed active code into your HTML pages. PHP has special support for querying data in an SQL database like MySQL or Oracle. PHP documentation lives at http://www.php.net.

Embedded PHP

PHP code can be embedded inside HTML pages provided your WWW server is configured with PHP support. PHP code lives inside a tag with the general form
  4167.  
  4168.  
  4169.  <?php code...  ?>
  4170.  
  4171. For example, we could use this to import one file into another and print out a table of numbers:
  4172.  
  4173. <html>
  4174. <body>
  4175.  
  4176. <?php
  4177.  
include "file.html";
  4179.  
  4180. for ($i = 0; $i < 10; $i++)
  4181.   {
  4182.   print "Counting $i<br>";
  4183.   }
  4184.  
  4185. ?>
  4186.  
  4187. </body>
  4188. </html>
  4189.  
  4190. This makes it easy to generate WWW pages with a fixed visual layout:
  4191.  
  4192. <?php
  4193. #
  4194. # Standard layout
  4195. #
  4196.  
  4197. # Set $title, $comment and $contents
  4198.  
  4199. ##########################################################################
  4200.  
  4201. print "<body>\n";
  4202. print "<img src=img/header.gif>";
  4203.  
print "<h1>$title</h1>";
  4205. print "<em>$comment</em>";
  4206. print "<blockquote>\n";
  4207.  
  4208. include $contents;
  4209.  
  4210. print ("</blockquote>\n");
  4211. print ("</body>\n");
print ("</html>\n");

?>
  4213.  
Variables are easily set by calling PHP code in the form of a CGI program from a form.

PHP and forms

PHP is particularly good at dealing with forms, as a CGI scripting language. Consider the following form:
  4218.  
  4219. <html>
  4220. <body>
  4221. <form action="/cgi-bin-scriptalias/spititout.php" method="post">
  4222.  
  4223.    Name:  <input type="text" name="personal[name]"><br>
  4224.    Email: <input type="text" name="personal[email]"><br>
  4225.    Preferred language:
  4226.    <select multiple name="language[]">
  4227.        <option value="English">English
  4228.        <option value="Norwegian">Norwegian
  4229.        <option value="Gobbledigook">Gobbledigook
  4230.    </select>
  4231.    
  4232.    <input type=image src="image.gif" name="sub">
  4233.  
  4234. </form>
  4235. </body>
  4236. </html>
  4237.  
  4238. This produces a page into which one types a name and email address and chooses a language from a list of three possible choices. When the user clicks on a button marked by the file `image.gif' the form is posted. Here is a program which unravels the data sent to the CGI program:
  4239.  
  4240.  
  4241. #!/local/bin/php
  4242.  
  4243. <?php
  4244. #
  4245. # A CGI program which handles a form
# Variables are translated automatically
#

$title = "This page title";
$comment = "This page talks about the following.....";
  4251.  
  4252. ##########################################################################
  4253.  
  4254. echo "<body>";
  4255. echo "<h1>$title</h1>";
  4256. echo "<em>$comment</em>";
  4257. echo "<blockquote>\n";
  4258.  
  4259. ###
  4260.  
  4261. echo "Your name is $personal[name]<br><br>";
  4262. echo "Your email is $personal[email]<br><br>";
  4263.  
  4264. echo "Language options: ";
  4265. echo "<table> ";
  4266.  
  4267. for ($i = 0; strlen($language[$i]) > 0; $i++)
  4268.    {
  4269.    echo "<tr><td bgcolor=#ff0000>Variable language[$i] = $language[$i]</td></tr>";
  4270.     }
  4271.  
  4272.  if ($language[0] == "Norwegian")
  4273.     {
  4274.     echo "Hei alle sammen<p>";
  4275.     }
  4276.  else
  4277.     {
  4278.     echo "Greetings everyone, this page will be in English<p>";
  4279.     }
  4280.  
  4281.  echo "</table> ";
  4282.  
  4283. ###
  4284.  
  4285. echo ("</blockquote>\n");
  4286. echo ("</body>\n");
  4287. echo ("</html>\n");
  4288. ?>
  4289.  
  4290. C programming
  4291.  
  4292. This section is not meant to teach you C. It is a guide to using C in UNIX and it is assumed that you have a working knowledge of the language. See the GNU C-Tutorial for an introduction to basics.
  4293.  
  4294. Shell or C?
  4295.  
In the preceding chapters we have been looking at ways to get simple programming tasks done. The immediacy of the script languages is a great advantage when we just want to get a job done as quickly as possible. Scripts lend themselves to simple system administration tasks like file processing, but they do not easily lend themselves to more serious programs.

Although some system administrators have grown used to the idea that shell programming is easier, I would argue that this is not really true. First of all, most of the UNIX shell commands are just wrapper programs for C function calls. Why use the wrapper when you can use the real thing? Secondly, the C function calls return data in pointers and structures which are very easy to manipulate, whereas piping the output of shell programs into others can be a very messy and awkward way of working. Here are some of the reasons why we also need a more traditional programming language like C.
  4299.  
    The shell languages do not allow us to create an acceptable user interface of the kind provided by X-windows or the curses (cursor manipulation) library. They are mainly intended for file-processing. (Though recently the Tk library has provided a way of creating user interfaces in Tcl and Perl.)
    Shell commands read their input line-by-line. Not all input is organized in this simple way -- we also need to be able to read across line boundaries, i.e. we need the concept of a data stream.
    More advanced data structures are needed for most applications, such as linked lists, binary trees and acyclic graphs.
    Compilers help to sort out simple typographical and logical errors by checking source code at compile time.
    Compiled code is usually faster than interpreted code.
    Many tools have been written to help in the programming of C code (dbx, lex, yacc etc.).
  4306.  
  4307. C program structure
  4308.  
  4309. The form of a C program
  4310.  
A C program consists of a set of functions, beginning with the main program:
  4312.  
  4313.  
  4314. main ()   /* This is a comment */
  4315.  
  4316. {
  4317.  
  4318. Commands ...
  4319.  
  4320. }
  4321.  
The source code of a C program can be divided into several text files. The compiler translates each file separately; the linker ld joins them all up at the end. This means that we can plan out a strategy for writing large programs in a clear and efficient manner.

NOTE: C++ style comments `//...' are not part of ANSI C and are rejected by many C compilers; they were only added to C by the later C99 standard.
  4325.  
  4326. Macros and declarations
  4327.  
Most Unix systems now have ANSI C compatible compilers, but this has not always been the case. Most UNIX programs are written in a version of C which is older than the ANSI standard, so you will need an appreciation of old Kernighan and Ritchie (K&R) C conventions for C programming. See for example my C book.

An obvious difference between ANSI C and K&R C is that K&R C lacks the features which ANSI C adopted from C++. Here are some useful points to remember.

   K&R C does not allow `const' data; it uses the C preprocessor with `#define' instead, i.e. instead of
  4333.  
  4334.  
  4335.     const int blah = 1;
  4336.  
  4337.     use
  4338.  
  4339.     #define blah 1
  4340.  
    Remember that, with older (pre-ANSI) preprocessors, the hash symbol `#' must be the first character on a line.
  4342.     K&R C doesn't use function prototypes or declarations of the form:
  4343.  
  4344.  
  4345.    void function (char *string, int a, int b)
  4346.  
  4347.    {
  4348.    }
  4349.  
  4350.    Instead one writes:
  4351.  
  4352.  
  4353.    void function (string, a, b)
  4354.  
  4355.    char *string;
  4356.    int a,b;
  4357.  
  4358.    {
  4359.    }
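
For comparison, here is a minimal sketch of the ANSI style described above: a typed `const' value in place of a `#define', and a function with a prototype. The names `max_len' and `count_char' are invented for this illustration only.

```c
#include <string.h>

/* ANSI C: a typed constant instead of  #define max_len 80  */
const int max_len = 80;

/* ANSI prototype: the compiler can now check the arguments of every call */
int count_char (const char *string, int c);

/* ANSI definition: the parameter types are written inside the brackets.
   Counts how often c occurs in the first max_len characters of string. */
int count_char (const char *string, int c)

{ int i, n = 0;
  int len = (int) strlen(string);

for (i = 0; i < len && i < max_len; i++)
   {
   if (string[i] == c)
      {
      n++;
      }
   }

return n;
}
```

With the prototype in scope, a call with the wrong number or type of arguments is reported at compile time; a K&R compiler would accept it silently.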
  4360.  
  4361. Several files
  4362.  
Many unix programs are very large and are split up into many files. Remember, when you split up programs into several files, you must declare a variable as `extern' in every file in which you want to use it, if it is really defined in another file. This tells the compiler that it should not try to create storage for the variable, because this was already done in another file.
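
The following sketch illustrates the idea. Both halves are shown in one block so that it compiles as a single file, but in practice the two parts would live in separate source files. The names `verbose' and `chatty' are assumptions made up for the example.

```c
/* ---- what would go in file A, which only uses the variable ---- */

extern int verbose;      /* declaration only: no storage is created here */

int chatty (void)

{
return verbose > 0;
}

/* ---- what would go in file B, which owns the variable ---- */

int verbose = 1;         /* definition: storage is created exactly once */
```

If the `extern' line were missing from file A, the compiler would complain that `verbose' is undeclared; if file B were missing, the linker would complain that the symbol is undefined.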
  4364.  
  4365. A note about UNIX system calls and standards
  4366.  
  4367. Most of the system calls in UNIX return data in the form of `struct' variables. Sometimes these are structures used by the operating system itself -- in other cases they are just put together so that programmers can handle a packet of data in a convenient way.
  4368.  
  4369. If in doubt, you can find the definitions of these structures in the relevant include files under `/usr/include'.
  4370.  
Since UNIX comes in many flavours the system calls are not always compatible and may have different options and arguments. Because of this there are a number of standardizing efforts for UNIX. One of them is POSIX, a family of standards published by the IEEE and supported by the major UNIX vendors. Programs written for UNIX are now expected to be POSIX compliant. This is not something you need to think about at the level of this course, but you should certainly remember that there exist programming standards and that these should be adhered to. The aim is to work towards a single standard UNIX.
  4372.  
  4373. Compiling: `cc', `ld' and `a.out'
  4374.  
The C compiler on the unix system is traditionally called `cc' and has always been a part of every Unix environment. Recently several Unix vendors have stopped including the C compiler as a part of their operating systems and instead sell a compiler separately. Fortunately there is a Free Software compiler called `gcc' (the GNU C compiler). We shall use this in all the examples.
  4376.  
  4377. To compile a program consisting of several files of code, we first compile all of the separate pieces without trying to link them. There are therefore two stages: first we turn `.c' files into `.o' files. This compiles code but does not fix any address references. Then we link all `.o' files into the final executable, including any libraries which are used.
  4378.  
  4379. Let's suppose we have files `a.c', `b.c' and `c.c'. We write:
  4380.  
  4381.  
  4382.   gcc -c a.c b.c c.c
  4383.  
  4384. This creates files `a.o', `b.o' and `c.o'. Next we link them into one file called `myprog'.
  4385.  
  4386.  
  4387.   gcc -o myprog  a.o b.o c.o
  4388.  
If the naming option `-o myprog' is not used, the linker `ld' uses the default name a.out for the executable file.
  4390.  
  4391. Libraries and `LD_LIBRARY_PATH'
  4392.  
The resulting file is called `myprog' and includes references only to the standard library `libc'. If we wish to link in the math library `libm' or the cursor movement library `libcurses' -- or in general, a library called `libBLAH' -- we need to use the `-l' directive.
  4394.  
  4395.  
  4396. gcc -o myprog files.o  -lm -lcurses -lBLAH
  4397.  
On some systems the linker also looks for a suitable library in all of the directories listed in the environment variable `LD_LIBRARY_PATH', which the run-time loader uses to find shared libraries. Alternatively we can add a directory to the search path by using the `-L' option:
  4399.  
  4400.  
  4401. gcc -o myprog files.o -L/usr/local/lib -lm -lcurses -lBLAH
  4402.  
  4403. Include files
  4404.  
  4405. Normally the compiler looks for include files only in the directory `/usr/include'. We can add further paths to search using the `-I' option.
  4406.  
  4407.  
  4408. gcc -o myprog file.c  -I/usr/local/include -I/usr/local/X11/include
  4409.  
In the past, Unix libraries were in `a.out' code format, but recent releases of unix have gone over to a more efficient and flexible format called ELF (executable and linking format).
  4411.  
  4412. Shared and static libraries
  4413.  
  4414. Libraries are collections of C functions which the operating system creators have written for our convenience. The source code for such a library is just the source for a collection of functions -- there is no main program.
  4415.  
There are two kinds of library used by modern operating systems: archive libraries (also called static libraries) and shared libraries (also called dynamic libraries). An archive library has a name of the form
  4417.  
  4418.  
  4419.  libname.a
  4420.  
When an archive library is linked to a program, the parts of the library which the program uses are copied into the program code. This uses a lot of disk space and makes the size of the compiled program very large. Shared libraries (shared objects `so' or shared archives `sa') generally have names of the form
  4422.  
  4423.  
  4424.   libname.so
  4425.   libname.sa
  4426.  
  4427. often with version numbers appended. When a program is linked with a shared library the code is not appended to the program. Instead pointers to the shared objects are created and the library is loaded at runtime, thus avoiding the problem of having to store the library effectively multiple times on the disk.
  4428.  
  4429. To make an archive library we compile all of the functions we wish to include in the library
  4430.  
  4431.  
  4432. gcc -c function1.c function2.c ...
  4433.  
  4434. and then join the files using the `ar' command.
  4435.  
  4436.  
  4437. ar rcv libMYLIB.a function1.o
  4438. ar rcv libMYLIB.a function2.o
  4439.  
  4440. To make a shared library one provides an option to the linker program. The exact method is different in different operating systems, so you should look at the manual page for ld on your system. Under SunOS 4 we take the object files `*.o' and run
  4441.  
  4442.  
  4443.  ld -o libMYLIB.so.1.1 -assert pure-text *.o
  4444.  
  4445. Under HPUX, we write
  4446.  
  4447.  
  4448.  ld -b -o libMYLIB.so.1.1 *.o
  4449.  
  4450. With the GNU linker, you write
  4451.  
  4452.  
  4453.  ld -shared -o libMYLIB.so.1.1 *.o
  4454.  
  4455. NOTE: when you add a shared library to the system under SunOS or GNU/Linux you must run the command `ldconfig', making sure that the path to the library is included in `LD_LIBRARY_PATH'. SunOS and GNU/Linux use a cache file `/etc/ld.so.cache' to keep current versions of libraries. GNU/Linux also uses a configuration file called `/etc/ld.so.conf'.
  4456.  
  4457. Knowing about important paths: directory structure
  4458.  
  4459. It is important to understand how the C compiler finds the files it needs. We have already mentioned the `-I' and `-L' options to the compilation command line. In general, all system include files can be found in the directory `/usr/include' and subdirectories of this directory. All system libraries can be found in `/usr/lib'.
  4460.  
  4461. Many packages build their own libraries and keep the relevant files in separate directories so that if the system gets reinstalled, they do not get deleted. This is true for example of the X-windows system. The include and library files for this are typically kept in directories which look something like `/usr/local/X11R5/include' and `/usr/X11R6/lib'. That means that we need to give all of this information to the compiler. Compiling a program becomes a complicated task in many cases so we need some kind of script to help us perform the task. The Unix tool make was designed for this purpose.
  4462.  
  4463. Make
  4464.  
  4465. Nowadays compilers are often sold with fancy user environments driven by menus which make it easier to compile programs. Unix has similar environments but all of them use shell-based command line compilation beneath the surface. That is because UNIX programmers are used to writing large and complex programs which occupy many directories and subdirectories. Each directory has to be adapted or configured to fit the particular flavour of Unix system it is being compiled upon. Interactive user environments are very poor at performing this kind of service. UNIX solves the problem of compiling enormous trees of software (such as the unix system itself!) by using a compilation language called `make'. Such language files can be generated automatically by scripts, allowing very complex programs to configure and compile themselves from a single control script.
  4466.  
  4467. Compiling large projects
  4468.  
  4469. Typing lines like
  4470.  
  4471.  
  4472.  cc -c file1.c file2.c ...
  4473.  cc -o target file1.o ....
  4474.  
  4475. repeatedly to compile a complicated program can be a real nuisance. One possibility would therefore be to keep all the commands in a script. This could waste a lot of time though. Suppose you are working on a big project which consists of many lines of source code -- but are editing only one file. You really only want to recompile the file you are working on and then relink the resulting object file with all of the other object files. Recompiling the other files which hadn't changed would be a waste of time. But that would mean that you would have to change the script each time you change what you need to compile.
  4476.  
A better solution is to use the `make' command. `make' was designed for precisely this purpose. To use `make', we create a file called `Makefile' in the same directory as our program. `make' is a quite general program for building software. It is not specifically tied to the C programming language; it can be used with any programming language.
  4478.  
  4479. A `make' configuration file, called a `Makefile', contains rules which describe how to compile or build all of the pieces of a program. For example, even without telling it specifically, make knows that in order to go from `prog.c' to `prog.o' the command `cc -c prog.c' must be executed. A Makefile works by making such associations. The Makefile contains a list of all of the files which compose the program and rules as to how to get to the finished product from the source.
  4480.  
  4481. The idea is that, to compile a program, we just have to type make. `make' then reads the Makefile and compiles all of the parts which need compiling. It does not recompile files which have not changed since the last compilation! How does it do this? `make' works by comparing the time-stamp on the file it needs to create with the time-stamp on the file which is to be compiled. If the compiled version exists and is newer than its source then the source does not need to be recompiled.
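
The time-stamp test can be sketched in C. This is only an illustration of the comparison described above, not the way `make' is actually implemented; the function name `needs_rebuild' is an assumption, and the `stat()' call it relies on is described later in this chapter.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Return 1 if `target' must be rebuilt from `source', 0 if it is up
   to date, and -1 if the source itself cannot be examined.          */

int needs_rebuild (const char *target, const char *source)

{ struct stat tbuf, sbuf;

if (stat(source,&sbuf) == -1)
   {
   return -1;                      /* no source: nothing to build from */
   }

if (stat(target,&tbuf) == -1)
   {
   return 1;                       /* target does not exist yet */
   }

return sbuf.st_mtime > tbuf.st_mtime;    /* is the source newer? */
}
```

A call such as `needs_rebuild("a.o","a.c")' would then answer the question which `make' asks for every target and each of its dependencies.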
  4482.  
  4483. To make this idea work in practice, `make' has to know how to go through the steps of compiling a program. Some default rules are defined in a global configuration file, e.g.
  4484.  
  4485.  
  4486. /usr/include/make/default.mk
  4487.  
Let's consider an example of what happens for the three files `a.c', `b.c' and `c.c' in the example above -- and let's not worry about what the Makefile looks like yet.
  4489.  
  4490. The first time we compile, only the `.c' files exist. When we type `make', the program looks at its rules and finds that it has to make a file called `myprog'. To make this it needs to execute the command
  4491.  
  4492.  
  4493.  gcc -o myprog  a.o b.o c.o
  4494.  
  4495. So it looks for `a.o' etc and doesn't find them. It now goes to a kind of subroutine and looks to see if it has any rules for making files called `.o' and it discovers that these are made by compiling with the `gcc -c' option. Since the files do not exist, it does this. Now the files `a.o b.o c.o' exist and it jumps back to the original problem of trying to make `myprog'. All the files it needs now exist and so it executes the command and builds `myprog'.
  4496.  
  4497. If we now edit `a.c', and type `make' once again -- it goes through the same procedure as before but now it finds all of the files. So it compares the dates on the files -- if the source is newer than the result, it recompiles.
  4498.  
  4499. By using this recursive method, `make' only compiles those parts of a program which need compiling.
  4500.  
  4501. Makefiles
  4502.  
  4503. To write a Makefile, we have to tell `make' about dependencies. The dependencies of a file are all of those files which are required to build it. Thus, the dependencies of `myprog' are `a.o', `b.o' and `c.o'. The dependencies of `a.o' are simply `a.c', the dependencies of `b.o' are `b.c' and so on.
  4504.  
  4505. A Makefile consists of rules of the form:
  4506.  
  4507. target : dependencies
  4508. TAB                   rule;
  4509.  
The target is the thing we want to build, the dependencies are like subroutines to be executed first if they do not exist. Finally the rule is to be executed if all of the dependencies exist; it takes the dependencies and turns them into the target. There are two important things to remember:
  4511.  
    The target names must start at the first character of a line.
  4513.     There must be a TAB character at the beginning of every rule or action. If there are spaces instead of tabs, or no tab at all, `make' will signal an error. This bizarre feature can cause a lot of confusion.
  4514.  
Let's look at an example Makefile for a program which consists of two source files `main.c' and `other.c' and which makes use of a library called `libdb' which lies in the directory `/usr/local/lib'. Our aim is to build a program called database:
  4516.  
  4517.  
  4518. #
  4519. # Simple Makefile for `database'
  4520. #
  4521.  
  4522. # First define a macro
  4523.  
  4524. OBJ = main.o other.o
  4525.  
  4526. CC = gcc
  4527. CFLAGS = -I/usr/local/include
  4528. LDFLAGS = -L/usr/local/lib -ldb
  4529. INSTALLDIR = /usr/local/bin
  4530.  
  4531. #
# Rules start here. Note that the $@ variable becomes the name of the
# current target. In this case it is the executable file `database'
  4534. #
  4535.  
  4536. database: ${OBJ}
	${CC} -o $@ ${OBJ} ${LDFLAGS}
  4538.  
  4539. #
# If a header file changes, normally we need to recompile everything.
# There is no way that make can know this unless we write a rule which
# forces it to rebuild all .o files if the header file changes...
# (HEADERS must list the program's header files. The name database.h
# is only an assumed example.)
#

HEADERS = database.h

${OBJ}: ${HEADERS}
  4546.  
  4547. #
  4548. # As well as special rules for special files we can also define a
  4549. # "suffix rule". This is a rule which tells us how to build all files
  4550. # of a certain type. Here is a rule to get .o files from .c files.
  4551. # The $< variable is like $? but is only used in suffix rules.
  4552. #
  4553.  
  4554. .c.o:
	${CC} -c ${CFLAGS} $<
  4556.  
  4557. #######################################################################
  4558. # Clean up
  4559. #######################################################################
  4560.  
#
# Make can also perform ordinary shell command jobs
# "make clean" here performs a cleanup operation
#

clean:
	rm -f ${OBJ}
	rm -f y.tab.c lex.yy.c y.tab.h
	rm -f y.tab lex.yy
	rm -f *% *~ *.o
	rm -f a.out
	rm -f man.dvi man.aux man.log man.toc
	rm -f database

install: database
	cp database ${INSTALLDIR}/database
  4579.  
  4580. The Makefile above can be invoked in several ways.
  4581.  
  4582.  
  4583. make
  4584. make database
  4585. make clean
  4586. make install
  4587.  
If we simply type `make' i.e. the first of these choices, `make' takes the first of the rules it finds as the object to build. In this case the rule is `database', so the first two forms above are equivalent.
  4589.  
  4590. On the other hand, if we type
  4591.  
  4592. make clean
  4593.  
then execution starts at the rule for `clean', which is normally used to remove all files except the original source code. `make install' causes the compiled program to be installed at its intended destination.
  4595.  
  4596. `make' uses some special variables (which resemble the special variables used in Perl -- but don't confuse them). The most useful one is `$@' which represents the current target -- or the object which `make' would like to compile. i.e. as `make' checks each file it would like to compile, `$@' is set to the current filename.
  4597.  
  4598. $@
  4599.    This evaluates to the current target i.e. the name of the object you are currently trying to build. It is normal to use this as the final name of the program when compiling
  4600. $?
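   This is used only outside of suffix rules and expands to the names of those dependencies which are newer than the current target, i.e. the files which must be reprocessed in order to bring the target up to date. If the target does not exist yet, this is the whole list of dependencies.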
  4602.  
  4603.    target: file1.o file2.o
  4604.    TAB cc -o $@ $?
  4605.  
  4606. $<
   This is only used in suffix rules. It stands for the prerequisite, i.e. the single source file which must be compiled in order to make a given object.
  4608.  
Note that, because `make' has some default rules defined in its configuration file, a single-file C program `filename.c' can be compiled very easily by typing

make filename

This is roughly equivalent to
  4614.  
  4615. cc -c filename.c
  4616. cc -o filename filename.o
  4617.  
  4618. New suffix rules for C++
  4619.  
  4620. Standard rules for C++ are not often built into UNIX systems at the time of writing, but we can create them in our own Makefiles very easily. Here we shall use the GNU compiler g++'s conventions for C++ files. Here is a sample Makefile for using C++. Note that the `.SUFFIXES' command must be used to declare new endings or file extensions.
  4621.  
  4622. ##################################################################
  4623. #
  4624. # This is the Makefile for g++
  4625. #
  4626. ##################################################################
  4627.  
  4628. OBJ = cpp-prog.o X.o Y.o Z.o
  4629.  
  4630. CCPLUS = g++
  4631.  
  4632. .SUFFIXES: .C .o .h
  4633.  
  4634. #
  4635. # Program Rules
  4636. #
  4637.  
  4638. filesys: ${OBJ}
	$(CCPLUS) -o filesys $(OBJ)
  4640.  
  4641. #
  4642. #  Extra dependencies on the header file
  4643. # (if the header file changes, we need to rebuild *.o)
  4644. #
  4645.  
  4646. cpp-prog.o: filesys.h
  4647. X.o: filesys.h
  4648. Y.o: filesys.h
  4649. Z.o: filesys.h
  4650.  
  4651. #
  4652. # Suffix rules
  4653. #
  4654.  
  4655. .C.o:
	$(CCPLUS) -c $<
  4657.  
The general rule here tells make that a `.o' file can be created from a `.C' file by executing the command `$(CCPLUS) -c'. (This is identical to the C case, except for the name of the compiler.) The extra dependencies tell make that, if we change the header file `filesys.h', then we must recompile all the files which read in `filesys.h', since this could affect all of these. Finally, the highest level rule says that to make `filesys' from the `.o' files, we have to run `$(CCPLUS) -o filesys $(OBJ)'.
  4659.  
The argv, argc and envp parameters
  4661.  
When we write C programs which read command line arguments, they are fed to us as an array of strings called the argument vector. The mechanisms for the C-shell and Perl are derived from the C argument vector. To read in the command line, we write
  4663.  
  4664.  
#include <stdio.h>

main (argc,argv,envp)

int argc;
char *argv[], *envp[];

{
if (argc > 1)       /* argv[1] exists only if an argument was given */
   {
   printf ("The first argument was %s\n",argv[1]);
   }
return 0;
}
  4673.  
Argument zero is the name of the program itself and `argv[argc-1]' is the last argument. The above definitions are in Kernighan and Ritchie C style. In ANSI C, the arguments can be declared using a prototype:
  4675.  
  4676. main (int argc, char **argv)
  4677.  
  4678. {
  4679.  
  4680. }
  4681.  
The array of strings `envp[]' is a list of the values of the environment variables of the system, each of the form
  4683.  
  4684. NAME=value
  4685.  
  4686. This gives C programmers access to the shell's global environment.
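
As a minimal sketch of how the `envp' vector can be used, the function below searches it for a given variable. The vector is terminated by a NULL pointer. The function name `find_env' is invented for this example; it is not a library call.

```c
#include <string.h>

/* Search an envp-style vector for `name' and return a pointer to the
   value part of the matching "NAME=value" string, or NULL if the
   variable is not present.                                           */

char *find_env (char **envp, const char *name)

{ int i;
  size_t len = strlen(name);

for (i = 0; envp[i] != NULL; i++)
   {
   /* The name must match and be followed directly by `=' */

   if (strncmp(envp[i],name,len) == 0 && envp[i][len] == '=')
      {
      return envp[i] + len + 1;
      }
   }

return NULL;
}
```

Inside `main', a call like `find_env(envp,"HOME")' would return the user's home directory, or NULL if the variable is not set.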
  4687.  
  4688. Environment variables in C
  4689.  
  4690. In addition to the `envp' vector, it is possible to access the environment variables through the call `getenv()'. This is used as follows; suppose we want to access the shell environment variable `$HOME'.
  4691.  
  4692. char *string;
  4693.  
  4694. string = getenv("HOME");
  4695.  
`string' is now a pointer to static but public data. You should not use `string' as if it were your own property because it will be used again by the system. Copy its contents to another string before using the data.
  4697.  
char buffer[500];

buffer[0] = '\0';
if (string != NULL)    /* getenv returns NULL if the variable is not set */
   strncat (buffer,string,sizeof(buffer)-1);
  4701.  
  4702. Files and directories
  4703.  
All of the regular C functions from the standard library are available to Unix programmers. The standard functions only address the issue of reading and writing to files, however; they do not deal with operating system specific attributes such as file permissions and file types. Nor is there a mechanism for obtaining lists of files within a directory. The reason for these omissions is that they are operating system dependent. To find out about these other attributes POSIX describes some standard Unix system calls.
  4705.  
  4706. opendir, readdir
  4707.  
  4708. Files and directories are handled by functions defined in the header file `dirent.h'. In earlier UNIX systems the file `dir.h' was used -- and the definitions were slightly different, but not much. To get a list of files in a directory we must open the directory and read from it -- just like a file. (A directory is just a file which contains data on its entries). The commands are
  4709.  
  4710. opendir
  4711. closedir
  4712. readdir
  4713.  
See the manual pages for dirent. `opendir' returns a pointer to a directory handle of type `DIR', and `readdir' returns pointers to a dirent structure which is defined in the file `/usr/include/dirent.h'. This header defines a structure of the following form (the exact members vary from system to system; `d_name' is the member which portable programs rely on)


struct dirent
  {
   off_t                d_off;      /* offset of next disk dir entry */
   unsigned long        d_fileno;   /* file number of entry */
   unsigned short       d_reclen;   /* length of this record */
   unsigned short       d_namlen;   /* length of string in d_name */
   char                 d_name[255+1];  /* name (up to MAXNAMLEN + 1) */
  };

which can be used to obtain information from the directory nodes. Here is an example ls command which lists the contents of the directory `/etc'.
  4727.  
  4728.  
  4729. #include <stdio.h>
  4730. #include <dirent.h>
  4731.  
  4732. main ()
  4733.  
  4734. { DIR *dirh;
  4735.  struct dirent *dirp;
  4736.  static char mydir[20] = "/etc";
  4737.  
  4738. if ((dirh = opendir(mydir)) == NULL)
  4739.   {
  4740.   perror("opendir");
  return 1;
  4742.   }
  4743.  
  4744. for (dirp = readdir(dirh); dirp != NULL; dirp = readdir(dirh))
  4745.   {
  4746.   printf("Got dir entry: %s\n",dirp->d_name);
  4747.   }
  4748.  
  4749. closedir(dirh);
  4750. }
  4751.  
  4752. Notice that reading from a directory is like reading from a file with fgets(), but the entries are filenames rather than lines of text.
  4753.  
  4754. stat()
  4755.  
To determine the file properties or statistics we use the function call `stat()' or its counterpart `lstat()'. Both these functions find out information about files (permissions, owner, filetype etc). The only difference between them is the way in which they treat symbolic links. If `stat' is used on a symbolic link, it stats the file the link points to rather than the link itself. If `lstat' is used, the data refer to the link. Thus, to detect a link, we must use `lstat'. See section lstat and readlink.
  4757.  
The data in the `stat' structure are defined in the file `/usr/include/sys/stat.h'. Here are the most important members of the structure.
  4759.  
  4760.  
  4761. struct  stat
  4762.    {
  4763.    dev_t        st_dev;             /* device number*/
  4764.    ino_t        st_ino;             /* file inode */
   mode_t       st_mode;            /* file type and permissions */
  4766.    short        st_nlink;           /* Number of hardlinks to file */
  4767.    uid_t        st_uid;             /* user id */
  4768.    gid_t        st_gid;             /* group id */
  4769.    dev_t        st_rdev;
  4770.    off_t        st_size;            /* size in bytes */
  4771.    time_t       st_atime;           /* time file last accessed */
  4772.    time_t       st_mtime;           /* time file contents last modified */
  4773.    time_t       st_ctime;           /* time last attribute change */
  4774.    long         st_blksize;
  4775.    long         st_blocks;
  4776.    };
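
As a short sketch of how the structure is used, the function below asks `stat()' for the size of a file and checks the return value for errors. The function name `file_size' is an assumption made for this example.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Return the size of a file in bytes, or -1 if it cannot be examined */

long file_size (const char *path)

{ struct stat statbuf;

if (stat(path,&statbuf) == -1)       /* e.g. the file does not exist */
   {
   perror(path);
   return -1;
   }

return (long) statbuf.st_size;
}
```

The other members, such as `st_uid' or `st_mtime', are read from `statbuf' in exactly the same way once the call has succeeded.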
  4777.  
  4778. lstat and readlink
  4779.  
The function `stat()' treats symbolic links as though they were the files they point to. In other words, if we use `stat()' to read a symbolic link, we end up reading the file the link points to and not the link itself; we never see symbolic links. To avoid this problem, there is a different version of the stat function called `lstat()' which is identical to `stat()' except that it treats links as links and not as the files they point to. This means that we can test whether a file is a symbolic link only if we use `lstat()'. (See the next paragraph.)
  4781.  
  4782. Once we have identified a file to be a symbolic link, we use the `readlink()' function to obtain the name of the file the link points to.
  4783.  
  4784.  
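#define bufsize 512
char buffer[bufsize];
int len;

if ((len = readlink("/path/to/file",buffer,bufsize-1)) >= 0)
   {
   buffer[len] = '\0';   /* readlink does not terminate the string */
   }

The result is returned in the string buffer. Note that `readlink()' returns the number of characters it has placed in the buffer and does not append a terminating null character, so we must add one ourselves.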
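
The two steps, detecting a link with `lstat()' and then reading it with `readlink()', can be combined as in the following sketch. The function name `link_target' is an assumption for this example, and the `S_ISLNK' macro it uses is one of the test macros listed in the next section.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* If `path' is a symbolic link, copy the name it points to into `buf'
   and return 1. Return 0 if it is not a link and -1 on error.        */

int link_target (const char *path, char *buf, size_t size)

{ struct stat statbuf;
  ssize_t len;

if (size == 0 || lstat(path,&statbuf) == -1)
   {
   return -1;
   }

if (!S_ISLNK(statbuf.st_mode))
   {
   return 0;                    /* an ordinary file, directory etc. */
   }

if ((len = readlink(path,buf,size-1)) == -1)
   {
   return -1;
   }

buf[len] = '\0';                /* readlink does not terminate the string */
return 1;
}
```

Note that `lstat()' is essential here: with `stat()' the test for a link would never succeed.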
  4791.  
  4792. stat() test macros
  4793.  
As we have already mentioned, the Unix mode bits contain not only information about what permissions a file has, but also bits describing the type of file -- whether it is a directory or a link etc. There are macros defined in UNIX to extract this information from the `st_mode' member of the `stat' structure. They are defined in the `stat.h' header file. Here are some examples.
  4795.  
  4796. #define S_ISBLK(m)    /* is block device */
  4797. #define S_ISCHR(m)    /* is character device */
  4798. #define S_ISDIR(m)    /* is directory */
#define S_ISFIFO(m)   /* is fifo (named pipe) */
  4800. #define S_ISREG(m)    /* is regular (normal) file */
  4801.  
  4802. #define S_ISLNK(m)    /* is symbolic link */  /* Not POSIX */
#define S_ISSOCK(m)   /* is a socket */
  4804.  
  4805. #define S_IRWXU     /* rwx, owner */
  4806. #define     S_IRUSR /* read permission, owner */
  4807. #define     S_IWUSR /* write permission, owner */
  4808. #define     S_IXUSR /* execute/search permission, owner */
  4809. #define S_IRWXG     /* rwx, group */
  4810. #define     S_IRGRP /* read permission, group */
#define     S_IWGRP /* write permission, group */
  4812. #define     S_IXGRP /* execute/search permission, group */
  4813. #define S_IRWXO     /* rwx, other */
  4814. #define     S_IROTH /* read permission, other */
  4815. #define     S_IWOTH /* write permission, other */
  4816. #define     S_IXOTH /* execute/search permission, other */
  4817.  
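The `S_IS...' macros return true or false when acting on the mode member; the permission macros are bit masks which can be combined with the mode member using the `&' operator. Here is an example. See also section Example filing program.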
  4819.  
  4820.  
  4821. struct stat statvar;
  4822.  
  4823. stat("file",&statvar);
  4824.  
  4825. /* test return values */
  4826.  
  4827. if (S_ISDIR(statvar.st_mode))
  4828.   {
  4829.   printf("Is a directory!");
  4830.   }
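
The permission masks are used in a similar way. As a minimal sketch, the function below turns the permission bits of `st_mode' into a string of the kind printed by `ls -l'. The function name `mode_to_string' is invented for this example.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Fill `out' (at least 10 bytes) with an ls-style string such as
   "rwxr-xr-x" describing the permission bits in `mode'.          */

void mode_to_string (mode_t mode, char *out)

{
out[0] = (mode & S_IRUSR) ? 'r' : '-';      /* owner */
out[1] = (mode & S_IWUSR) ? 'w' : '-';
out[2] = (mode & S_IXUSR) ? 'x' : '-';
out[3] = (mode & S_IRGRP) ? 'r' : '-';      /* group */
out[4] = (mode & S_IWGRP) ? 'w' : '-';
out[5] = (mode & S_IXGRP) ? 'x' : '-';
out[6] = (mode & S_IROTH) ? 'r' : '-';      /* other */
out[7] = (mode & S_IWOTH) ? 'w' : '-';
out[8] = (mode & S_IXOTH) ? 'x' : '-';
out[9] = '\0';
}
```

After a successful `stat("file",&statvar)', we could call `mode_to_string(statvar.st_mode,buffer)' with a character array `buffer' of at least ten bytes.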
  4831.  
  4832. Example filing program
  4833.  
  4834. The following example program demonstrates the use of the directory functions in dirent and the stat function call.
  4835.  
  4836. /********************************************************************/
  4837. /*                                                                  */
  4838. /* Reading directories and `statting' files                         */
  4839. /*                                                                  */
  4840. /********************************************************************/
  4841.  
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <dirent.h>
#include <sys/types.h>
#include <sys/stat.h>

#define DIRNAME "/."
#define bufsize 255

/********************************************************************/

int main ()

{ DIR *dirh;
  struct dirent *dirp;
  struct stat statbuf;
  char pathname[bufsize];             /* arrays of characters, not of pointers */
  char linkname[bufsize];

if ((dirh = opendir(DIRNAME)) == NULL)
   {
   perror("opendir");
   exit(1);
   }

for (dirp = readdir(dirh); dirp != NULL; dirp = readdir(dirh))
   {
   if (strcmp(".",dirp->d_name) == 0 || strcmp("..",dirp->d_name) == 0)
      {
      continue;
      }

   if (strcmp("lost+found",dirp->d_name) == 0)
      {
      continue;
      }

   snprintf(pathname,bufsize,"%s/%s",DIRNAME,dirp->d_name);

   if (lstat(pathname,&statbuf) == -1)                /* see man stat */
     {
     perror("lstat");
     continue;
     }

   if (S_ISREG(statbuf.st_mode))
      {
      printf("%s is a regular file\n",pathname);
      }

   if (S_ISDIR(statbuf.st_mode))
      {
      printf("%s is a directory\n",pathname);
      }

   if (S_ISLNK(statbuf.st_mode))
      {
      memset(linkname,0,bufsize);                      /* clear string */
      readlink(pathname,linkname,bufsize-1);           /* keep the final null byte */
      printf("%s is a link to %s\n",pathname,linkname);
      }

   printf("The mode of %s is %o\n\n",pathname,statbuf.st_mode & 07777);
   }

closedir(dirh);
return 0;
}
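
One detail of the program above deserves a closer look: readlink() does not add a terminating null byte to the name it returns. The following sketch isolates that step. The function name `read_link_target' is an assumption made for this illustration only.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: copy the target of the symbolic link `path' into
   `buf' as a null terminated string.  Returns the length of the target,
   or -1 if `path' is not a symbolic link or cannot be read. */
long read_link_target(const char *path, char *buf, size_t size)
{
    struct stat sb;
    ssize_t n;

    if (size == 0 || lstat(path, &sb) == -1 || !S_ISLNK(sb.st_mode))
        return -1;

    n = readlink(path, buf, size - 1);   /* leave room for the terminator */
    if (n == -1)
        return -1;

    buf[n] = '\0';                       /* readlink() does not add one */
    return (long) n;
}
```

Note that lstat() is used rather than stat(), since stat() would follow the link and report on the file it points to.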
  4908.  
Process control, fork(), exec(), popen() and system()
  4910.  
There are a number of ways in which processes can interact with one another and in which we can control their behaviour. We shall not go into great detail in this course, only provide examples for reference.
  4912.  
The UNIX `fork()' function is used to create child processes. This is the basis of all `heavyweight' multitasking under unix. Here is a simple example of fork in which we start a child process from within a program and wait for it to finish. Note that the code for the parent and the child is in the same file. The only thing that distinguishes parent from child is the value returned by the fork function.

When `fork()' is called, it duplicates the entire current process so that two parallel processes are then running. The only difference between these is that the child process (the copy) gets a return value of zero from `fork()', whereas the parent gets a return value equal to the process identifier (pid) of the child. If no child could be created, `fork()' returns -1 to the parent. The pid can be used by the parent to send signals to the child or to wait for it. Here we show a simple example in which the `wait(NULL)' call is used to wait until a child of the parent has finished. Since only one child is spawned here, that is the child we have just created.
  4916.  
  4917. /**************************************************************/
  4918. /*                                                            */
  4919. /*  A brief demo of the UNIX process duplicator fork().       */
  4920. /*                                                            */
  4921. /**************************************************************/
  4922.  
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

void ChildProcess();

/***************************************************************/

int main ()

{ pid_t pid, cid, child;

pid = getpid();

printf ("Fork demo! I am the parent (pid = %d)\n",(int)pid);
fflush(stdout);          /* so that the child does not inherit unwritten output */

if ((child = fork()) == -1)
  {
  perror("fork");
  exit(1);
  }

if (child == 0)
  {
  cid = getpid();
  printf ("I am the child (cid = %d) of (pid=%d)\n",(int)cid,(int)pid);
  ChildProcess();
  exit(0);
  }

printf("Parent waiting here for the child...\n");

wait(NULL);

printf("Child finished, parent quitting too!\n");
return 0;
}

/**************************************************************/

void ChildProcess()

{ int i;

for (i = 0; i < 10; i++)
  {
  printf ("%d...\n",i);
  sleep(1);
  }
}
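
The three possible return values of fork() can be seen side by side in the following sketch, which also waits for one particular child using waitpid() rather than wait(). The function `run_child' is a name invented for this illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child which exits at once with `code',
   wait for that particular child and return its exit status.
   Returns -1 if fork() or waitpid() fails, or if the child did not
   exit normally. */
int run_child(int code)
{
    pid_t pid = fork();
    int status;

    if (pid == -1)              /* fork failed: no child was created */
        return -1;

    if (pid == 0)               /* child: fork() returned zero */
        _exit(code);

    /* parent: fork() returned the pid of the child */
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child uses _exit() so that it does not flush output buffers which it inherited from the parent.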
  4962.  
  4963. Another possibility is that we might want to execute a program and wait to find out what the result of the program is before continuing. There are two ways to do this. The first is a variation on the theme above and uses fork().
  4964.  
Let's create a function which runs a shell command from within a C program, and determines its return value. We make the result a boolean (integer) value, so that the function returns `true' if the shell command exits with status zero, See section Return codes.

if (ShellCommandReturnsZero(shell_command))
  {
  printf ("Command %s went ok\n",shell_command);
  }
  4971.  
To do this we first have to fork a new process and then use one of the exec functions to load a new code image on top of the new process. This sounds complicated, but it is necessary because of the way unix handles processes. If we had no use for the return value, we could simply execute a shell command using the system("shell command") function (which does all this for us), but when system() returns, we can only tell if the command was executed successfully or unsuccessfully -- we learn nothing about what actually failed (the shell, or the command which was executed under the shell?). If we require detailed information about what happened to the child process then we need to do the following.
  4973.  
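#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

 /* yyerror(), FatalError(), true, false, bufsize and maxshellargs */
 /* are assumed to be defined elsewhere in the program             */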
  4976.  
  4977.  /* Send complete command as a string */
  4978.  /* including all arguments           */
  4979.  
  4980. ShellCommandReturnsZero(comm)
  4981.  
  4982. char *comm;
  4983.  
  4984. { int status, i, argc;
  4985.  pid_t pid;
  4986.  char arg[maxshellargs][bufsize];
  4987.  char **argv;
  4988.  
  4989. /* Build argument array for execv call*/
  4990.  
  4991. for (i = 0; i < maxshellargs; i++)
  4992.   {
  4993.   bzero (arg[i],bufsize);
  4994.   }
  4995.  
  4996. argc = SplitCommand(comm,arg);
  4997.  
  4998. if ((pid = fork()) < 0)
  4999.   {
  5000.   FatalError("Failed to fork new process");
  5001.   }
  5002. else if (pid == 0)                     /* child */
  5003.   {
  5004.   argv = malloc((argc+1)*sizeof(char *));
  5005.  
  5006.   for (i = 0; i < argc; i++)
  5007.      {
  5008.      argv[i] = arg[i];
  5009.      }
  5010.  
  5011.   argv[i] = (char *) NULL;
  5012.  
  5013.   if (execv(arg[0],argv) == -1)
  5014.      {
  5015.      yyerror("script failed");
     perror("execv");
  5017.      exit(1);
  5018.      }
  5019.   }
  5020. else                                    /* parent */
  5021.   {
  5022.   if (wait(&status) != pid)
  5023.      {
  5024.      printf("Wait for child failed\n");
  5025.      perror("wait");
  5026.      return false;
  5027.      }
  5028.   else
  5029.      {
  5030.      if (WIFSIGNALED(status))
  5031.         {
        printf("Script %s was terminated by signal %d\n",comm,WTERMSIG(status));
  5033.         return false;
  5034.         }
  5035.  
  5036.      if (! WIFEXITED(status))
  5037.         {
  5038.         return false;
  5039.         }
  5040.  
  5041.      if (WEXITSTATUS(status) == 0)
  5042.         {
  5043.         return true;
  5044.         }
  5045.      else
  5046.         {
  5047.         return false;
  5048.         }
  5049.      }
  5050.   }
  5051.  
  5052. }
  5053.  
  5054. /*******************************************************************/
  5055.  
  5056. SplitCommand(comm,arg)
  5057.  
  5058. char *comm, arg[maxshellargs][bufsize];
  5059.  
  5060. { char *sp;
  5061.  int i = 0, j;
  5062.  char buff[bufsize];
  5063.  
for (sp = comm; sp < comm+strlen(comm); sp++)   /* never step past the final null byte */
  5065.   {
  5066.   bzero(buff,bufsize);
  5067.  
  5068.   if (i >= maxshellargs-1)
  5069.      {
  5070.      yyerror("Too many arguments in embedded script");
  5071.      FatalError("Use a wrapper");
  5072.      }
  5073.  
  5074.   while (*sp == ' ' || *sp == '\t')
  5075.      {
  5076.      sp++;
  5077.      }
  5078.  
  switch (*sp)
     {
     case '\0': return(i);                 /* only trailing space was left */

     case '\"': sscanf (++sp,"%[^\"]",buff);
                break;
     case '\'': sscanf (++sp,"%[^\']",buff);
                break;
     default:   sscanf (sp,"%s",buff);
                break;
     }
  5088.  
  5089.    for (j = 0; j < bufsize; j++)
  5090.       {
  5091.       arg[i][j] = buff[j];
  5092.       }
  5093.  
  5094.    sp += strlen(arg[i]);
  5095.    i++;
  5096.    }
  5097. return (i);
  5098. }
  5099.  
In this example, the parent waits for the child process to terminate before continuing. The status of the child is returned through the argument of the wait function and is decoded with the help of a set of macros defined in `/usr/include/sys/wait.h'. If the child exited normally, its return value is given by WEXITSTATUS(status). If it was killed by a signal, the number of the signal is given by WTERMSIG(status).
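
The way the status macros fit together can be sketched as follows. The names `decode_status' and `child_fate', and the convention of returning a negative signal number, are assumptions made for this illustration only.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: reduce a wait() status to a single number.
   The exit code if the child exited normally, minus the signal number
   if it was killed by a signal, and -1000 otherwise. */
int decode_status(int status)
{
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return -WTERMSIG(status);
    return -1000;
}

/* Hypothetical demo: the child kills itself with signal `sig', or
   exits with code 3 when `sig' is zero.  Returns the decoded status. */
int child_fate(int sig)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1000;

    if (pid == 0)
    {
        if (sig != 0)
            kill(getpid(), sig);
        _exit(3);
    }

    if (waitpid(pid, &status, 0) != pid)
        return -1000;

    return decode_status(status);
}
```

WEXITSTATUS() is only meaningful when WIFEXITED() is true, and WTERMSIG() only when WIFSIGNALED() is true, which is why the tests come first.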
  5101.  
In the final example, we open a pipe to a process directly in a C program as though it were a file, by using the function popen(). Pipes may be opened for reading or for writing, in exactly the same way as a file is opened. The child process is automatically synchronized with the parent using this method. Here is a program fragment which opens a unix command for reading: both stdout and stderr from the child process are piped into the program. Notice that the syntax used in this call is that used by the Bourne shell, since this is built deeply into the unix execution design.
  5103.  
  5104.  
#define bufsize 1024

FILE *pp;
char VBUFF[bufsize];

...

if ((pp = popen( "/sbin/mount -va 2>&1","r")) == NULL)
  {
  printf("Failed to open pipe\n");
  return errorcode;
  }

while (fgets(VBUFF,bufsize,pp) != NULL)     /* NULL means end of output */
  {
  /* Just write the output to stdout */

  printf ("Pipe read: %s",VBUFF);           /* VBUFF already ends in a newline */
  }

pclose(pp);
  5128.  
  5129. A more secure popen()
  5130.  
One problem with the popen() library function is that it uses a shell to execute the command it obtains a pipe to. In the past this has been used to allow Unix security breaches, using a so-called IFS attack, in which the shell's field separator variable `IFS' is set to `/'. This can trick the shell into executing a program with the name of the first node in the directory of the executable. For instance, if the pipe was to open the program `/bin/ps', this could be tricked into executing a program in the current working directory of the process called `bin' with argument `ps'.
  5132.  
  5133. The solution is not to use a shell at all, but to replace popen() with a version which calls exec() directly. Here is a safe version from the source code of cfengine:
  5134.  
  5135.  
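#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* OUTPUT, CfLog(), cferror, FatalError() and Debug() are assumed */
/* to be defined elsewhere in the cfengine source                */

#define bufsize      4096
#define maxshellargs 20

pid_t *CHILD;
int    MAXFD = 20; /* Max number of simultaneous pipes */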
  5141.  
  5142. /***************************************************************/
  5143.  
  5144. FILE *cfpopen(command, type)
  5145.    
  5146. char *command, *type;
  5147.  
  5148.  { char arg[maxshellargs][bufsize];
  5149.    int i, argc, pd[2];
  5150.    char **argv;
  5151.    pid_t pid;
  5152.    FILE *pp = NULL;
  5153.  
  5154.  if ((*type != 'r' && *type != 'w') || (type[1] != '\0'))
  5155.     {
  5156.     errno = EINVAL;
  5157.     return NULL;
  5158.     }
  5159.  
  5160.  if (CHILD == NULL)   /* first time */
  5161.     {
  5162.     if ((CHILD = calloc(MAXFD,sizeof(pid_t))) == NULL)
  5163.        {
  5164.        return NULL;
  5165.        }
  5166.     }
  5167.  
  5168.  if (pipe(pd) < 0)  /* Create a pair of descriptors to this process */
  5169.     {
  5170.     return NULL;
  5171.     }
  5172.  
  5173.  if ((pid = fork()) == -1)
  5174.     {
  5175.     return NULL;
  5176.     }
  5177.  
  5178.  if (pid == 0)
  5179.     {
  5180.     switch (*type)
  5181.        {
  5182.        case 'r':
  5183.  
  5184.                  close(pd[0]);        /* Don't need output from parent */
  5185.  
  5186.                 if (pd[1] != 1)
  5187.                    {
  5188.                    dup2(pd[1],1);    /* Attach pp=pd[1] to our stdout */
  5189.                    dup2(pd[1],2);    /* Merge stdout/stderr */
  5190.                    close(pd[1]);
  5191.                    }
  5192.  
  5193.                 break;
  5194.  
  5195.       case 'w':
  5196.  
  5197.                 close(pd[1]);
  5198.  
  5199.                 if (pd[0] != 0)
  5200.                    {
  5201.                    dup2(pd[0],0);
  5202.                    close(pd[0]);
  5203.                   }
  5204.       }
  5205.  
   for (i = 0; i < MAXFD; i++)       /* close pipes to other children */
      {
      if (CHILD[i] > 0)
         {
         close(i);                   /* CHILD is indexed by descriptor */
         }
      }

   argc = SplitCommand(command,arg);
   argv = (char **) malloc((argc+1)*sizeof(char *));

   if (argv == NULL)
      {
      FatalError("Out of memory");
      }

   for (i = 0; i < argc; i++)
      {
      argv[i] = arg[i];
      }

   argv[i] = (char *) NULL;

   if (execv(arg[0],argv) == -1)
      {
      sprintf(OUTPUT,"Couldn't run %s",arg[0]);
      CfLog(cferror,OUTPUT,"execv");
      }

   _exit(1);
   }
  5237. else
  5238.    {
  5239.    switch (*type)
  5240.       {
  5241.       case 'r':
  5242.  
  5243.                 close(pd[1]);
  5244.      
  5245.                 if ((pp = fdopen(pd[0],type)) == NULL)
  5246.                    {
  5247.                    return NULL;
  5248.                    }
  5249.                 break;
  5250.      
  5251.       case 'w':
  5252.  
  5253.                 close(pd[0]);
  5254.      
  5255.                 if ((pp = fdopen(pd[1],type)) == NULL)
  5256.                    {
  5257.                    return NULL;
  5258.                    }
  5259.       }
  5260.  
  5261.    CHILD[fileno(pp)] = pid;
  5262.    return pp;
  5263.    }
  5264. }
  5265.  
  5266. /***************************************************************/
  5267.  
  5268. cfpclose(pp)
  5269.  
  5270. FILE *pp;
  5271.  
  5272. { int fd, status;
  5273.  pid_t pid;
  5274.  
  5275. Debug("cfpclose(pp)\n");
  5276.  
  5277. if (CHILD == NULL)  /* popen hasn't been called */
  5278.   {
  5279.   return -1;
  5280.   }
  5281.  
  5282. fd = fileno(pp);
  5283.  
  5284. if ((pid = CHILD[fd]) == 0)
  5285.   {
  5286.   return -1;
  5287.   }
  5288.  
  5289. CHILD[fd] = 0;
  5290.  
  5291. if (fclose(pp) == EOF)
  5292.   {
  5293.   return -1;
  5294.   }
  5295.  
  5296. Debug("cfpopen - Waiting for process %d\n",pid);
  5297.  
  5298. #ifdef HAVE_WAITPID
  5299.  
  5300. while(waitpid(pid,&status,0) < 0)
  5301.   {
  5302.   if (errno != EINTR)
  5303.      {
  5304.      return -1;
  5305.      }
  5306.   }
  5307.  
  5308. return status;
  5309.  
  5310. #else
  5311.      
  5312. if (wait(&status) != pid)
  5313.    {
  5314.    return -1;
  5315.    }
  5316. else
  5317.    {
  5318.    if (WIFSIGNALED(status))
  5319.       {
  5320.       return -1;
  5321.       }
  5322.    
  5323.    if (! WIFEXITED(status))
  5324.       {
  5325.       return -1;
  5326.       }
  5327.    
  5328.    return (WEXITSTATUS(status));
  5329.    }
  5330. #endif
  5331. }
  5332.  
  5333. /*******************************************************************/
  5334. /* Command exec aids                                               */
  5335. /*******************************************************************/
  5336.  
  5337. SplitCommand(comm,arg)
  5338.  
  5339. char *comm, arg[maxshellargs][bufsize];
  5340.  
  5341. { char *sp;
  5342.  int i = 0, j;
  5343.  char buff[bufsize];
  5344.  
  5345. for (sp = comm; sp < comm+strlen(comm); sp++)
  5346.   {
  5347.   bzero(buff,bufsize);
  5348.  
  5349.   if (i >= maxshellargs-1)
  5350.      {
  5351.      CfLog(cferror,"Too many arguments in embedded script","");
  5352.      FatalError("Use a wrapper");
  5353.      }
  5354.  
  5355.   while (*sp == ' ' || *sp == '\t')
  5356.      {
  5357.      sp++;
  5358.      }
  5359.  
  switch (*sp)
     {
     case '\0': return(i);                 /* only trailing space was left */

     case '\"': sscanf (++sp,"%[^\"]",arg[i]);
                break;
     case '\'': sscanf (++sp,"%[^\']",arg[i]);
                break;
     case '`':  sscanf (++sp,"%[^`]",arg[i]);
                break;
     default:   sscanf (sp,"%s",arg[i]);
                break;
     }
  5373.  
  5374.    sp += strlen(arg[i]);
  5375.    i++;
  5376.    }
  5377.  
  5378. return (i);
  5379. }
  5380.  
  5381. Traps and signals
  5382.  
  5383. Processes can receive signals from the UNIX kernel at any time. Some of these signals terminate the execution of the program. This can cause problems if the program is in the middle of critical activity such as writing to a file. For that reason we can trap signals and provide our own routine for handling them in a special way.
  5384.  
  5385. A signal handler is made by calling the function `signal()' for each signal and by specifying a pointer to a function which will be called in the event of a signal. For example:
  5386.  
  5387.  
#include <signal.h>
#include <unistd.h>

void HandleSignal(int signum);

int main ()

{
signal(SIGTERM,HandleSignal);

pause();                 /* sleep until a signal arrives */
return 0;
}

void HandleSignal(int signum)

{
/* Tidy up and exit cleanly */

_exit(0);                /* exit() is not safe inside a signal handler */
}
  5402.  
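`SIGTERM' is the usual signal sent by the command `kill'. There are many other signals which can be sent to programs. They are listed in the header file `signal.h' and by the command `kill -l'. You have to decide for yourself whether or not you want to provide your own signal handling function. Note that the signals `SIGKILL' and `SIGSTOP' can be neither trapped nor ignored. To ignore a signal, you write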
  5404.  
  5405.  
  5406. signal(SIGtype,SIG_IGN);
  5407.  
  5408. To remove a signal handler and re-activate a signal, you write
  5409.  
  5410.  
  5411. signal(SIGtype,SIG_DFL);
  5412.  
  5413. Regular expressions
  5414.  
  5415. A regular expression is a pattern for matching strings of text. We have met regular expressions earlier in connection with the shell and Perl. Naturally these earlier encounters have their roots in C functions for handling expressions. A regular expression is used by first `compiling' it into a convenient data structure. Then a matching function is used to compare the expression with a test string. In this example program we show how a regular expression typed in as an argument to the program is found within strings of input entered on the keyboard.
  5416.  
  5417.  
#include <stdio.h>
#include <regex.h>

int main (argc,argv)

int argc;
char **argv;

{
 char buffer[1024];
 char errbuf[256];
 regex_t rx;
 regmatch_t match;
 int err;

 if (argc != 2)
    {
    fprintf(stderr,"usage: %s regex\n",argv[0]);
    return 1;
    }

 if ((err = regcomp(&rx, argv[1], REG_EXTENDED)) != 0)
    {
    regerror(err,&rx,errbuf,sizeof(errbuf));    /* regcomp does not set errno */
    fprintf(stderr,"regcomp: %s\n",errbuf);
    return 1;
    }

 while (fgets(buffer,1024,stdin) != NULL)
    {
    if (regexec(&rx,buffer,1,&match,0) == 0)
       {
       printf("Matched: (%s) at %d to %d\n",buffer,(int)match.rm_so,(int)match.rm_eo);
       }
    }

regfree(&rx);
return 0;
}
  5451.  
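Here is an example of its use. The lines beginning with `Matched:' are the output of the program. Since each input line still contains its newline character, the closing bracket appears on a line of its own.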
  5453.  
  5454.  
  5455. % a.out xyz
  5456. this is a string
  5457. another string
  5458. an xyz string
  5459. Matched: (an xyz string
  5460. ) at 3 to 6
  5461. another xyz zyxxyz string
Matched: (another xyz zyxxyz string
  5463. ) at 8 to 11
  5464.  
  5465. % a.out 'xyz|abc'
  5466. This is a string
  5467. An abc string
  5468. Matched: (An abc string
  5469. ) at 3 to 6
  5470. Or an xyz string
  5471. Matched: (Or an xyz string
  5472. ) at 6 to 9
  5473.  
If you don't want the match data, set the third argument of regexec() to 0 and the fourth argument to NULL. To get an exact match rather than a substring, check that the bounds `rm_so' and `rm_eo' are 0 and the length of the string being tested.
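
The exact-match test can be sketched as a small function. The name `matches_whole' is invented for this illustration. Anchoring the pattern with `^' and `$' would be an alternative way to obtain the same effect.

```c
#include <regex.h>
#include <string.h>

/* Hypothetical helper: 1 if `pattern' matches the whole of `text',
   0 if it matches only a part of it or not at all, and -1 if the
   pattern does not compile. */
int matches_whole(const char *pattern, const char *text)
{
    regex_t rx;
    regmatch_t match;
    int whole = 0;

    if (regcomp(&rx, pattern, REG_EXTENDED) != 0)
        return -1;

    if (regexec(&rx, text, 1, &match, 0) == 0)
    {
        /* rm_eo is the offset of the first character after the match */
        whole = (match.rm_so == 0 && (size_t) match.rm_eo == strlen(text));
    }

    regfree(&rx);
    return whole;
}
```

If the string was read with fgets(), the trailing newline must be removed first, otherwise it counts as part of the string.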
  5475.  
  5476. DES encryption
  5477.  
The following examples perform encryption with the SSLeay library (the forerunner of OpenSSL). Compile them with the command
  5479.  
  5480.  
  5481.  gcc crypto.c -I/usr/local/ssl/include -L/usr/local/ssl/lib -lcrypto
  5482.  
  5483. Example of normal triple DES encryption which works only on an 8-byte buffer:
  5484.  
  5485.  
  5486. /*****************************************************************************/
  5487. /*                                                                           */
  5488. /* File: crypto.c                                                            */
  5489. /*                                                                           */
  5490. /* Compile with:  gcc program.c  -lcrypto   (SSLeay)                         */
  5491. /*                                                                           */
  5492. /*****************************************************************************/
  5493.  
#include <stdio.h>
#include <string.h>
#include <des.h>

#define bufsize 1024

   /* Note how this truncates to 8 characters */

main ()

{ char in[bufsize],out[bufsize],back[bufsize];
 des_cblock key1,key2,key3,seed = {0xFE,0xDC,0xBA,0x98,0x76,0x54,0x32,0x10};
 des_key_schedule ks1,ks2,ks3;

memset(out,0,bufsize);      /* so that the results are null terminated */
memset(back,0,bufsize);

strcpy(in,"1 2 3 4 5 6 7 8 9 a b c d e f g h i j k");
  5508.  
  5509. des_random_seed(seed);
  5510.  
  5511. des_random_key(key1);
  5512. des_random_key(key2);
  5513. des_random_key(key3);
  5514.  
  5515. des_set_key((C_Block *)key1,ks1);
  5516. des_set_key((C_Block *)key2,ks2);
  5517. des_set_key((C_Block *)key3,ks3);
  5518. des_ecb3_encrypt((C_Block *)in,(C_Block *)out,ks1,ks2,ks3,DES_ENCRYPT);
  5519.  
  5520. printf("Encrypted [%s] into [%s]\n",in,out);
  5521.  
  5522. des_ecb3_encrypt((C_Block *)out,(C_Block *)back,ks1,ks2,ks3,DES_DECRYPT);
  5523.  
  5524. printf("and back to.. [%s]\n",back);
  5525. }
  5526.  
  5527. Triple DES, chaining mode, for longer strings (which must be a multiple of 8 bytes):
  5528.  
  5529.  
  5530. /*****************************************************************************/
  5531. /*                                                                           */
  5532. /* File: crypto.c                                                            */
  5533. /*                                                                           */
  5534. /* Compile with:  gcc program.c  -lcrypto   (SSLeay)                         */
  5535. /*                                                                           */
  5536. /*****************************************************************************/
  5537.  
#include <stdio.h>
#include <string.h>
#include <des.h>

#define bufsize 1024

   /* This can be used on arbitrary length buffers */

main ()

{ char in[bufsize],out[bufsize],back[bufsize],workvec[bufsize];
 des_cblock key1,key2,key3,seed = {0xFE,0xDC,0xBA,0x98,0x76,0x54,0x32,0x10};
 des_key_schedule ks1,ks2,ks3;

memset(out,0,bufsize);      /* so that the results are null terminated */
memset(back,0,bufsize);

strcpy(in,"1 2 3 4 5 6 7 8 9 a b c d e f g h i j k l m n o p q r s t u v w x y z");
  5552.  
  5553. des_random_seed(seed);
  5554.  
  5555. des_random_key(key1);
  5556. des_random_key(key2);
  5557. des_random_key(key3);
  5558.  
  5559. des_set_key((C_Block *)key1,ks1);
  5560. des_set_key((C_Block *)key2,ks2);
  5561. des_set_key((C_Block *)key3,ks3);
  5562.  
/* This work vector can be initialized to anything ...*/

memset(workvec,0,bufsize);

des_ede3_cbc_encrypt((C_Block *)in,(C_Block *)out,(long)strlen(in),
          ks1,ks2,ks3,(C_Block *)workvec,DES_ENCRYPT);

printf("Encrypted [%s] into [something]\n",in);
  5571.  
  5572. /* .. but this must be initialized the same as above */
  5573.  
  5574. memset(workvec,0,bufsize);
  5575.  
  5576. /* Note that the length is the original length, not strlen(out) */
  5577.  
  5578. des_ede3_cbc_encrypt((C_Block *)out,(C_Block *)back,(long)strlen(in),
  5579.           ks1,ks2,ks3,(C_Block *)workvec,DES_DECRYPT);
  5580.  
  5581. printf("and back to.. [%s]\n",back);
  5582. }
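
Since DES works on blocks of 8 bytes, a string whose length is not a multiple of 8 can be padded out before it is encrypted. The following sketch pads with zero bytes, which is an assumption suitable only for text that contains no zero bytes itself. The names `padded_length' and `pad_to_block' are invented for this illustration and are not part of SSLeay.

```c
#include <stddef.h>
#include <string.h>

#define DES_BLOCK 8   /* DES works on 8-byte blocks */

/* Hypothetical helper: smallest multiple of 8 which is >= n. */
size_t padded_length(size_t n)
{
    return (n + DES_BLOCK - 1) / DES_BLOCK * DES_BLOCK;
}

/* Hypothetical helper: copy `text' into `buf' and fill the rest of the
   last block with zero bytes.  Returns the padded length, or 0 if
   `text' is empty or `buf' (of `size' bytes) is too small. */
size_t pad_to_block(const char *text, char *buf, size_t size)
{
    size_t len = strlen(text);
    size_t padded = padded_length(len);

    if (padded > size)
        return 0;

    memset(buf, 0, padded);
    memcpy(buf, text, len);
    return padded;
}
```

The padded length, rather than strlen(in), would then be passed as the length argument when encrypting and decrypting.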
  5583.  
  5584. Device control: ioctl
  5585.  
  5586. The C function `ioctl' (I/O control) is used to send special control commands to devices like the disk and the network interface. The syntax of the function is
  5587.  
  5588. int ioctl(fd, request, arg)
  5589. int fd, request;
  5590. long arg;
  5591.  
The first parameter is normally a device handle or socket descriptor. The second is a control parameter. Lists of valid control parameters are normally defined in the system `include' files for a particular device. They are device and system dependent so you need a local manual and some detective work to find out what they are. The final parameter is normally a pointer to a variable which receives return data from the device.
  5593.  
`ioctl' commands are device specific, by their nature. The commands for the ethernet interface device are only partially standardized, for example. We could read the flags of the ethernet interface (which is called `le0' on a Sun workstation), using the following code:
  5595.  
  5596.  
  5597. # include <sys/socket.h>      /* Typical includes for internet */
  5598. # include <sys/ioctl.h>
  5599. # include <net/if.h>
  5600. # include <netinet/in.h>
  5601. # include <arpa/inet.h>
  5602. # include <netdb.h>
  5603. # include <sys/protosw.h>
  5604. # include <net/route.h>
  5605.  
  5606. struct ifreq IFR;
  5607. int sk;
  5608. struct sockaddr_in sin;
  5609.  
  5610. strcpy(IFR.ifr_name,"le0");
  5611. IFR.ifr_addr.sa_family = AF_INET;
  5612.  
  5613. if ((sk = socket(AF_INET,SOCK_DGRAM,IPPROTO_IP)) == -1)
  5614.   {
  5615.   perror("socket");
  5616.   exit(1);
  5617.   }
  5618.  
  5619. if (ioctl(sk,SIOCGIFFLAGS, (caddr_t) &IFR) == -1)
  5620.   {
  5621.   perror ("ioctl");
  5622.   exit(1);
  5623.   }
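
A request which is easier to experiment with is `FIONREAD', which asks how many bytes are waiting to be read on a descriptor. It is not part of POSIX, so the following is a sketch which assumes a system (such as Linux or a BSD derivative) where it is defined in `sys/ioctl.h'. The name `bytes_waiting' is invented for this illustration.

```c
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: number of bytes waiting to be read on `fd',
   or -1 if the request fails for this descriptor. */
int bytes_waiting(int fd)
{
    int n = 0;

    /* the third argument points to the variable which receives the answer */
    if (ioctl(fd, FIONREAD, &n) == -1)
        return -1;

    return n;
}
```

As in the ethernet example, the answer is delivered through the pointer passed as the final parameter, while the return value only signals success or failure.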
  5624.  
  5625. We shall not go into the further details of `ioctl', but simply note its role in system programming.
  5626.  
  5627. Database example (Berkeley db)
  5628.  
  5629.  
  5630. DBT key,value;
  5631. DB *dbp;
  5632. DBC *dbcp;
  5633. db_recno_t recno;
  5634.  
  5635. if ((errno = db_open(CHECKSUMDB,DB_BTREE, DB_CREATE, 0664, NULL, NULL, &dbp)) != 0)
  5636.    {
  5637.    sprintf(OUTPUT,"cfd: couldn't open checksum database %s\n",CHECKSUMDB);
  5638.    CfLog(cferror,OUTPUT,"db_open");
  5639.    return false;
  5640.    }
  5641.  
  5642. bzero(&value,sizeof(value));
  5643. bzero(&key,sizeof(key));      
  5644.      
  5645. key.data = filename;
  5646. key.size = strlen(filename)+1;
  5647. value.data = dbvalue;
  5648. value.size = sizeof(dbvalue);
  5649.    
  5650. if ((errno = dbp->del(dbp,NULL,&key,0)) != 0)
  5651.    {
  5652.    CfLog(cferror,"","db_store");
  5653.    }
  5654.  
  5655. key.data = filename;
  5656. key.size = strlen(filename)+1;
  5657.    
  5658. if ((errno = dbp->put(dbp,NULL,&key,&value,0)) != 0)
  5659.    {
  5660.    CfLog(cferror,"put failed","db->put");
  5661.    }      
  5662.  
if ((errno = dbp->get(dbp,NULL,&key,&value,0)) != 0)
  5664.    {
  5665.    /* Not found ... */
  5666.    return;
  5667.    }
  5668.  
  5669. dbp->close(dbp,0);
  5670.  
  5671. Text parsing tools: `lex' and `yacc'
  5672.  
  5673. This section is a taster only. You only need to know what lex and yacc are, not how they work.
  5674.  
  5675. `lex' and `yacc' are two tools for the C programmer who wishes to make a text parser. A text parser is a program which reads a text file and interprets the symbols in it. Every programming language must include a text parser, for instance.
  5676.  
  5677. The `yacc' (yet another compiler compiler) program generates C code which parses a textfile, given a description of the syntax rules for the file. In other words, we define the logical structure of the text file, according to the way we wish to interpret it and give the rules to `yacc'. `yacc' produces C code from this which does the job.
  5678.  
  5679. `lex' is a `lexer'. It is normally used together with `yacc'. `lex' tokenizes or identifies symbols in a file. What that means is that it reads in a file and matches types of string in the file which are defined in terms of regular expressions by the programmer, and returns symbolic values for those strings.
  5680.  
Although `lex' can be used independently of `yacc', it is normally used to identify the different types of string which define the syntax of a file. For example, suppose `yacc' was parsing a C program. At the beginning of a line, it might expect to find either a variable name or a preprocessor symbol. A variable name is just a string consisting of characters from the set `0-9a-zA-Z_', whereas a preprocessor command always starts with the character `#'. `yacc' passes control to `lex' which reads the file and matches the first object on the line. If it finds a variable, it returns to `yacc' a token which is a number or value corresponding to `variable'. Similarly, if it finds a preprocessor command, it returns a token for that. If it doesn't match either type it returns something else and `yacc' signals a syntax error.
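
To make the division of labour concrete, here is a hand-written sketch in C of what the two generated programs do between them, for input lines of the form a+b. It is an illustration only: the token values, the functions `next_token' and `is_sum', and the decision to skip spaces are all assumptions, and the code which `lex' and `yacc' generate is organized quite differently.

```c
#include <ctype.h>

enum { END = 0, NUMBER = 258, PLUS = 259 };  /* token values are arbitrary here */

/* Hypothetical lexer: return the next token in *sp and advance *sp. */
int next_token(const char **sp)
{
    while (**sp == ' ' || **sp == '\t')
        (*sp)++;

    if (**sp == '\0' || **sp == '\n')
        return END;

    if (isdigit((unsigned char) **sp))
    {
        while (isdigit((unsigned char) **sp))
            (*sp)++;
        return NUMBER;
    }

    if (**sp == '+')
    {
        (*sp)++;
        return PLUS;
    }

    return (unsigned char) *(*sp)++;   /* anything else: the character itself */
}

/* Hypothetical parser: 1 if the line has the form NUMBER PLUS NUMBER. */
int is_sum(const char *line)
{
    return next_token(&line) == NUMBER
        && next_token(&line) == PLUS
        && next_token(&line) == NUMBER
        && next_token(&line) == END;
}
```

The lexer knows nothing about the order in which tokens may appear, and the parser never looks at individual characters.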
  5682.  
Here is a `yacc' file which parses a file consisting of lines of the form a+b, where a and b are numbers -- any other syntax is incorrect. We could have used this in the client-server example program later, See section Socket streams.
  5684.  
  5685. You can learn more about lex and yacc in "Lex and Yacc", J. Levine, T. Mason and D. Brown, O'Reilly and Assoc.
  5686.  
  5687. %{
  5688. /*******************************************************************/
  5689. /*                                                                 */
  5690. /*  PARSER for a + b protocol                                      */
  5691. /*                                                                 */
/* The section between %{ and %} gets copied verbatim into         */
  5693. /* the resulting C code yacc generates -- including this comment!  */
  5694. /*                                                                 */
  5695. /*******************************************************************/
  5696.  
  5697. #include <stdio.h>
  5698.  
  5699. extern char *yytext;
  5700.  
  5701. %}
  5702.  
  5703. %token NUMBER PLUS
  5704.  
  5705. %%
  5706.  
  5707. specification:       { yyerror("Warning: invalid statement");}
  5708.                     | statement;
  5709.  
  5710. statement:            NUMBER PLUS NUMBER;
  5711.  
  5712. The lexer to go with this parser generates the tokens NUMBER and PLUS used by `yacc':
  5713.  
  5714. %{
  5715. /*******************************************************************/
  5716. /*                                                                 */
  5717. /*  LEXER for a + b protocol                                       */
  5718. /*                                                                 */
  5719. /* Returns token types NUMBER and PLUS to yacc, one at a time      */
  5720. /*                                                                 */
  5721. /*******************************************************************/
  5722.  
  5723. #include "y.tab.h"       /* yacc produces this -- need this line! */
  5724.  
  5725. %}
  5726.  
  5727. number    [0-9]+
  5728. plus      [+]
  5729.  
  5730. %%
  5731.  
{number}               {          /* Braces expand the definition above */
                       return NUMBER;
                       }

{plus}                 {
                       return PLUS;
                       }

.                      {
                       return yytext[0];
                       }
  5743.  
  5744. %%
  5745.  
  5746. /* EOF */
  5747.  
  5748. The main program which uses `yacc' and `lex' looks like this:
  5749.  
  5750.  
#include <stdio.h>
#include <stdlib.h>

extern FILE *yyin;                /* lex reads its input from this stream */
extern int yyparse();             /* The parser generated by yacc */

int main ()

{

if ((yyin = fopen("My_Input_File","r")) == NULL)      /* Open file */
   {
   printf("Can't open file\n");
   exit (1);
   }

while (!feof(yyin))
   {
   yyparse();
   }

fclose (yyin);
return 0;
}
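
To make the division of labour between the two tools concrete, here is a hand-written sketch of roughly what the generated lexer and parser do for this grammar. It is not the code which `lex' and `yacc' produce. The names `next_token', `parse_sum' and the `TOK_' constants are invented for this illustration, and unlike the `lex' file above this sketch skips blanks between tokens.

```c
#include <ctype.h>
#include <stdlib.h>

/* Token types, playing the role of NUMBER and PLUS (names are hypothetical) */
enum token { TOK_NUMBER, TOK_PLUS, TOK_END, TOK_ERROR };

/* The "lexer": classify the next object on the line and step past it */
static enum token next_token(const char **p, long *value)
{
    while (**p == ' ' || **p == '\t')      /* assumption: blanks are ignored */
        (*p)++;

    if (**p == '\0' || **p == '\n')
        return TOK_END;

    if (isdigit((unsigned char)**p)) {     /* same set as [0-9]+ */
        char *end;
        *value = strtol(*p, &end, 10);
        *p = end;
        return TOK_NUMBER;
    }

    if (**p == '+') {
        (*p)++;
        return TOK_PLUS;
    }

    return TOK_ERROR;                      /* anything else */
}

/* The "parser": accept exactly NUMBER PLUS NUMBER.
   Returns 1 and stores the sum on success, 0 on a syntax error. */
int parse_sum(const char *line, long *sum)
{
    long a, b, unused;

    if (next_token(&line, &a) != TOK_NUMBER)    return 0;
    if (next_token(&line, &unused) != TOK_PLUS) return 0;
    if (next_token(&line, &b) != TOK_NUMBER)    return 0;
    if (next_token(&line, &unused) != TOK_END)  return 0;

    *sum = a + b;
    return 1;
}
```

Calling `parse_sum("3 + 5", &s)' stores 8 in `s', whereas a line such as `a + b' is rejected because the first token is not a number. The point of `lex' and `yacc' is that we state the token patterns and the grammar, and the tools write this kind of code for us.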
  5770.  
  5771. Exercises
  5772.  
  5773.     Write a daemon program with a signal handler which makes a log of the heaviest (maximum cpu) process running, every five minutes. The program should exit if the log file becomes greater than 5-kbytes.
  5774.     Rewrite in C the perl program which lists all the files in the current directory containing a certain string.
  5775.     Write a version of `more' which prints control characters safely. See the `cat -e' command.
  5776.     Write a Makefile to create a shared library from a number of object files.
  5777.  
  5778. Network Programming
  5779.  
  5780. Client-server communication is the basis of modern operating system technology. The Unix socket mechanism makes stream-based communication virtually transparent.
  5781.  
  5782. Socket streams
  5783.  
  5784. Analogous to filestreams are sockets or TCP/IP network connections. A socket is a two-way (read/write) pseudo-file node. An open socket stream is like an open file-descriptor. Berkeley sockets are part of the standard C library.
  5785.  
  5786. There are two main kinds of socket: TCP/IP sockets and Unix domain sockets. Unix sockets can be used to provide local interprocess communication using a filestream communication protocol. TCP/IP sockets open file descriptors across the network. A TCP/IP socket is a file stream associated with an IP address and a port number. We write to a socket descriptor just as with a file descriptor, either with write() or using send().
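
As a small illustration of the claim that a socket behaves like a file descriptor, the following sketch creates a connected pair of Unix domain sockets with `socketpair()', writes a message into one end with `write()' and reads it from the other with `read()'. The function name `echo_through_socketpair' is invented for this example, and the sketch assumes that a short message arrives in a single read, which is usual for a local socket pair but is not guaranteed for stream sockets in general.

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Send msg through a pair of connected Unix domain sockets and read it
   back into reply. Returns the number of bytes received, or -1 on error. */
int echo_through_socketpair(const char *msg, char *reply, size_t replysize)
{
    int sv[2];
    ssize_t n;

    if (replysize == 0 || strlen(msg) == 0)   /* nothing to send: read() would block */
        return -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    if (write(sv[0], msg, strlen(msg)) == -1) {   /* just like writing to a file */
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    n = read(sv[1], reply, replysize - 1);        /* just like reading from a file */

    close(sv[0]);
    close(sv[1]);

    if (n < 0)
        return -1;

    reply[n] = '\0';                              /* the data are not null-terminated */
    return (int)n;
}
```

The same `read()' and `write()' calls work on a TCP/IP socket once it is connected. Only the way in which the descriptor is obtained differs.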
  5787.  
  5788. When sending binary data over a network we have to be careful about machine level representations of data. Operating systems (actually the hardware they run on) fall into two categories known as big endian and little endian. The names refer to the byte-order of numerical representations. The names indicate how large integers (which require say 32 bits or more) are stored in memory. Little endian systems store the least significant byte first, while big endian systems store the most significant byte first. For example, the representation of the number 34,677,374 has either of these forms.
  5789.  
  5790.  
  5791.           -----------------------------------
  5792.   Big    |   2    |   17   |  34    |   126  |
  5793.           -----------------------------------
  5794.  
  5795.           -----------------------------------
  5796.   Little |   126  |   34   |  17    |   2    |
  5797.           -----------------------------------
  5798.  
Obviously, if we are transferring data from one host to another, both hosts have to agree on the data representation, otherwise there would be disastrous consequences. This means that there has to be a common standard of network byte ordering, and that standard is big endian. For example, Solaris (SPARC hardware) uses network byte ordering (big endian), while GNU/Linux (Intel hardware) uses the opposite (little endian). This means that Intel systems have to convert the format every time something is transmitted over the network. Unix systems provide generic functions for converting between host byte order and network byte order for short (16 bit) and long (32 bit) integer data:
  5800.  
  5801.  
  5802.    htonl,  htons,  ntohl, ntohs
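
The following sketch shows the conversion at work on the number 34,677,374 used in the figure above. The helper names `to_network_bytes' and `from_network_bytes' are invented for this example. On any host, whatever its own byte order, the four bytes come out in big endian order.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Store value in out[0..3] in network byte order (most significant first) */
void to_network_bytes(uint32_t value, unsigned char out[4])
{
    uint32_t net = htonl(value);        /* host order -> network order */

    memcpy(out, &net, sizeof(net));     /* copy the bytes as they lie in memory */
}

/* The reverse: rebuild a host-order integer from four network-order bytes */
uint32_t from_network_bytes(const unsigned char in[4])
{
    uint32_t net;

    memcpy(&net, in, sizeof(net));
    return ntohl(net);                  /* network order -> host order */
}
```

After `to_network_bytes(34677374, b)' the array `b' holds 2, 17, 34, 126, which is the big endian row of the figure. On a big endian host `htonl' and `ntohl' simply do nothing, so the same source code is correct on both kinds of machine.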
  5803.  
Here we list two example programs which show how to make a client-server pair. The server enters a loop, and listens for connections from any clients (the generic address `INADDR_ANY' is a wildcard which stands for any of the server's own network addresses, so connections are accepted on every interface). The client program sends requests to the server as a protocol in the form of a string of the type `a + b'. Normally `a' and `b' are numbers, in which case the server returns their sum to the client. If the message has the special form `halt + *', where the star is arbitrary, then the server shuts down. Any other form of message results in an error, which the server signals to the client.
  5805.  
  5806. The basic structure of the client-server components in terms of system calls is this:
  5807.  
  5808.  
  5809. Client:
  5810.  
  5811.  socket()             Create a socket
  5812.  connect()            Contact a server socket (IP + port)
  5813.  
  5814.    while (?)
  5815.       {
  5816.       send()          Send to server
  5817.       recv()          Receive from server
  5818.       }
  5819.  
  5820. Server:
  5821.  
  5822.   socket()            Create a socket
  5823.   bind()              Associates the socket with a fixed address
  5824.   listen()            Create a listen queue
  5825.  
  5826.   while()
  5827.      {
  5828.      reply=accept()   Accept a connection request
  5829.      recv()           Receive from client
  5830.      send()           Send to client
  5831.      }
  5832.  
  5833. /**********************************************************************/
  5834. /*                                                                    */
/* The client part of a client-server pair. This sends a request of   */
/* the form `a + b' to the server and prints the server's reply       */
/*                                                                    */
/* Compiled with:                                                     */
/*                   cc client.c                                      */
  5840. /*                                                                    */
  5841. /* User types:                                                        */
  5842. /*                   3 + 5                                            */
  5843. /*                   a + b                                            */
  5844. /*                   halt + server                                    */
  5845. /**********************************************************************/
  5846.  
#include <stdio.h>
#include <stdlib.h>                 /* exit() */
#include <string.h>                 /* memset(), strlen() */
#include <unistd.h>                 /* close() */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>              /* inet_ntoa() */
#include <netdb.h>

#define PORT 9000                   /* Arbitrary non-reserved port */
#define HOST "nexus.iu.hioslo.no"
#define bufsize 20
  5856.  
  5857. /**********************************************************************/
  5858. /* Main                                                               */
  5859. /**********************************************************************/
  5860.  
  5861. main (argc,argv)
  5862.  
  5863. int argc;
  5864. char *argv[];
  5865.  
  5866. { struct sockaddr_in cin;
  5867.  struct hostent *hp;
  5868.  char buffer[bufsize];
  5869.  int sd;
  5870.  
  5871. if (argc != 4)
  5872.   {
  5873.   printf("syntax: client a + b\n");
  5874.   exit(1);
  5875.   }
  5876.  
  5877. if ((hp = gethostbyname(HOST)) == NULL)
  5878.   {
  5879.   perror("gethostbyname: ");
  5880.   exit(1);
  5881.   }
  5882.  
  5883. memset(&cin,0,sizeof(cin));             /* Another way to zero memory */
  5884.  
  5885. cin.sin_family = AF_INET;
  5886. cin.sin_addr.s_addr = ((struct in_addr *)(hp->h_addr))->s_addr;
  5887. cin.sin_port = htons(PORT);
  5888.  
  5889. printf("Trying to connect to %s = %s\n",HOST,inet_ntoa(cin.sin_addr));
  5890.  
  5891. if ((sd = socket(AF_INET,SOCK_STREAM,0)) == -1)
  5892.   {
  5893.   perror("socket");
  5894.   exit(1);
  5895.   }
  5896.  
if (connect(sd,(struct sockaddr *)&cin,sizeof(cin)) == -1)
  5898.   {
  5899.   perror("connect");
  5900.   exit(1);
  5901.   }
  5902.  
snprintf(buffer,bufsize,"%s + %s",argv[1],argv[3]);   /* Never overrun the buffer */
  5904.  
  5905. if (send(sd,buffer,strlen(buffer),0) == -1)
  5906.   {
  5907.   perror ("send");
  5908.   exit(1);
  5909.   }
  5910.  
memset(buffer,0,bufsize);               /* recv() does not null-terminate */

if (recv(sd,buffer,bufsize-1,0) == -1)
  {
  perror("recv");
  exit (1);
  }

printf ("Server responded with %s\n",buffer);

close (sd);
  5921. }
  5922.  
  5923. /**********************************************************************/
  5924. /*                                                                    */
  5925. /* The server part of a client-server pair. This simply takes two     */
  5926. /* numbers and adds them together, returning the result to the client */
  5927. /*                                                                    */
  5928. /* Compiled with:                                                     */
  5929. /*                   cc server.c                                      */
  5930. /*                                                                    */
  5931. /**********************************************************************/
  5932.  
#include <stdio.h>
#include <stdlib.h>                 /* exit() */
#include <string.h>                 /* memset(), strlen(), strncmp() */
#include <unistd.h>                 /* close() */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

#define PORT 9000
#define bufsize 20
#define queuesize 5
#define true 1
#define false 0

int DoService();                    /* Defined below */
  5944.  
  5945. /**********************************************************************/
  5946. /* Main                                                               */
  5947. /**********************************************************************/
  5948.  
  5949. main ()
  5950.  
  5951. { struct sockaddr_in cin;
  5952.  struct sockaddr_in sin;
  5953.  struct hostent *hp;
  5954.  char buffer[bufsize];
 int sd, sd_client;
 socklen_t addrlen = sizeof(cin);   /* accept() reads and updates this */
  5956.  
  5957. memset(&sin,0,sizeof(sin));       /* Another way to zero memory */
  5958. sin.sin_family = AF_INET;
sin.sin_addr.s_addr = INADDR_ANY;          /* Wildcard: any local interface */
  5960. sin.sin_port = htons(PORT);
  5961.  
  5962. if ((sd = socket(AF_INET,SOCK_STREAM,0)) == -1)
  5963.   {
  5964.   perror("socket");
  5965.   exit(1);
  5966.   }
  5967.  
if (bind(sd,(struct sockaddr *)&sin,sizeof(sin)) == -1)  /* Must have this on server */
  5969.   {
  5970.   perror("bind");
  5971.   exit(1);
  5972.   }
  5973.  
  5974. if (listen(sd,queuesize) == -1)
  5975.   {
  5976.   perror("listen");
  5977.   exit(1);
  5978.   }
  5979.  
while (true)
 {
 addrlen = sizeof(cin);                /* accept() needs the size on entry */

 if ((sd_client = accept(sd,(struct sockaddr *)&cin,&addrlen)) == -1)
     {
     perror("accept");
     exit(1);
     }

  memset(buffer,0,sizeof(buffer));     /* recv() does not null-terminate */

  if (recv(sd_client,buffer,sizeof(buffer)-1,0) == -1)
     {
     perror("recv");
     exit(1);
     }

  if (!DoService(buffer))
     {
     send(sd_client,buffer,strlen(buffer),0);   /* Tell the client we are halting */
     close (sd_client);
     break;
     }

  if (send(sd_client,buffer,strlen(buffer),0) == -1)
     {
     perror("send");
     exit(1);
     }

  close (sd_client);
  }
  6007.  
  6008. close (sd);
  6009. printf("Server closing down...\n");
  6010. }
  6011.  
  6012. /**************************************************************/
  6013.  
DoService(buffer)

char *buffer;

 /* This is the protocol section. Here we must */
 /* check that the incoming data are sensible  */

{ int a=0,b=0;

printf("Received: %s\n",buffer);

if (sscanf(buffer,"%d + %d",&a,&b) == 2)      /* Both numbers were found */
  {
  snprintf(buffer,bufsize,"%d + %d = %d",a,b,a+b);
  return true;
  }
else
  {
  if (strncmp("halt",buffer,4) == 0)
    {
    snprintf(buffer,bufsize,"Server closing down");   /* Must fit in bufsize */
    return false;
    }
  else
    {
    snprintf(buffer,bufsize,"Invalid protocol");
    return true;
    }
  }
}
  6045.  
In the example we use `streams' to implement a typical input/output behaviour for C. A stream interface is a so-called reliable protocol. There are other kinds of sockets too, called unreliable, or UDP, sockets. A feature to notice on the server is that we must bind the socket to a specific address and port. The client is always implicitly bound to an address since a socket connection always originates from the machine on which the client is running. On the server, however, we must say on which of the machine's own addresses we are willing to receive requests. In the above example we use the generic wildcard address `INADDR_ANY', which means that connections are accepted on any of the server's network interfaces, so that any host can connect to the server. Had we been more specific and bound the socket to a single address, we could have limited communication to hosts which reach the server through that interface.
  6047.  
By calling `listen()' we set up a queue for incoming connections. Rather than forking a separate process to handle each request we set up a queue of a certain depth. If we exceed this depth then new clients trying to connect will be refused connection.
  6049.  
  6050. The `accept' call is the mechanism which extracts a `reply handle' from the socket. Using the handle obtained from this call we can reply to the client without having to open a special socket explicitly.
  6051.  
An improved server side connection can be set up, reading the service name from `/etc/services' and setting the reusable address socket option to avoid `address already in use' errors when the server is restarted, like this:
  6053.  
  6054.  
  6055.  struct sockaddr_in cin, sin;
  6056.  struct servent *server;
  6057.  int sd, addrlen = sizeof(cin);
  6058.  int portnumber, yes=1;
  6059.  
  6060.  if ((server = getservbyname(service-name,"tcp")) == NULL)
  6061.      {
  6062.      CfLog(cferror,"Couldn't get cfengine service","getservbyname");
  6063.      exit (1);
  6064.      }
  6065.  
  bzero(&sin,sizeof(sin));         /* Zero the address we are about to fill in */
  6067.  
  6068. /*  Service returns network byte order */
  6069.  
  6070.   sin.sin_port = (unsigned short)(server->s_port);
  6071.   sin.sin_addr.s_addr = INADDR_ANY;
  6072.   sin.sin_family = AF_INET;
  6073.  
  6074.   if ((sd = socket(AF_INET,SOCK_STREAM,0)) == -1)
  6075.      {
  6076.      CfLog(cferror,"Couldn't open socket","socket");
  6077.      exit (1);
  6078.      }
  6079.  
  6080.   if (setsockopt (sd, SOL_SOCKET, SO_REUSEADDR,
  6081.                      (char *) &yes, sizeof (int)) == -1)
  6082.      {
  6083.      CfLog(cferror,"Couldn't set socket options","sockopt");
  6084.      exit (1);
  6085.      }
  6086.  
  6087.  
  if (bind(sd,(struct sockaddr *)&sin,sizeof(sin)) == -1)
     {
     CfLog(cferror,"Couldn't bind to socket","bind");
     exit (1);
     }
  6092.  
  6093. /* etc */
  6094.  
  6095. Multithreading a server
  6096.  
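A server can serve several clients at once by handing each request to a separate thread. All the arguments must be collected into a struct, since only one argument pointer can be passed to the function which a thread runs. The fragment below relies on helper routines such as `CfLog' and on globals such as `PTHREADDEFAULTS' and `ACTIVE_THREADS' which are defined elsewhere.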
  6098.  
  6099.  
  6100. #include <pthread.h>
  6101.  
  6102. SpawnCfGetFile(args)
  6103.  
  6104. struct cfd_thread_arg *args;
  6105.  
  6106. { pthread_t tid;
  6107.  void *CfGetFile();
  6108.  
  6109. pthread_attr_init(&PTHREADDEFAULTS);
  6110. pthread_attr_setdetachstate(&PTHREADDEFAULTS,PTHREAD_CREATE_DETACHED);
  6111.  
  6112. if (pthread_create(&tid,&PTHREADDEFAULTS,CfGetFile,args) != 0)
  6113.   {
  6114.   CfLog(cferror,"pthread_create failed","create");
  6115.   CfGetFile(args);
  6116.   }
  6117.  
  6118. pthread_attr_destroy(&PTHREADDEFAULTS);
  6119. }
  6120.  
  6121. /***************************************************************/
  6122.  
  6123. void *CfGetFile(args)
  6124.  
  6125. struct cfd_thread_arg *args;
  6126.  
{ static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;  /* One lock shared by all threads */
  6128.  
  6129. if (pthread_mutex_lock(&mutex) != 0)
  6130.   {
  6131.   CfLog(cferror,"pthread_mutex_lock failed","pthread_mutex_lock");
  6132.   free(args->replyfile);  /* from strdup in each thread */
  6133.   DeleteConn(args->connect);
  6134.   free((char *)args);
  6135.   return NULL;
  6136.   }
  6137.  
  6138. ACTIVE_THREADS++;   /* Global variable */
  6139.  
  6140. if (pthread_mutex_unlock(&mutex) != 0)
  6141.   {
  6142.   CfLog(cferror,"pthread_mutex_unlock failed","unlock");
  6143.   }
  6144.  
  6145.  
  6146. /* send data */
  6147.  
if (pthread_mutex_lock(&mutex) != 0)
  {
  CfLog(cferror,"pthread_mutex_lock failed","pthread_mutex_lock");
  return NULL;              /* The function is declared void *, so return a value */
  }

ACTIVE_THREADS--;

if (pthread_mutex_unlock(&mutex) != 0)
  {
  CfLog(cferror,"pthread_mutex_unlock failed","unlock");
  }

return NULL;
  6164. }
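
Since the fragment above depends on code which is not shown, here is a self-contained sketch of the same two ideas: passing several arguments to a thread through one struct, and protecting a global counter with a mutex. All the names (`worker_arg', `worker', `run_workers') are invented for this example. Unlike the fragment above, the threads here are joined rather than detached, so that the caller can wait for the result.

```c
#include <pthread.h>
#include <stdlib.h>

/* Everything a thread needs, bundled so that it fits in one pointer */
struct worker_arg {
    int increments;      /* how many times to update the counter */
    int step;            /* how much to add each time */
};

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;                 /* shared by all threads */

static void *worker(void *p)
{
    struct worker_arg *arg = p;
    int i;

    for (i = 0; i < arg->increments; i++) {
        pthread_mutex_lock(&counter_lock);     /* one thread at a time */
        counter += arg->step;
        pthread_mutex_unlock(&counter_lock);
    }

    free(arg);                           /* each thread owns its own struct */
    return NULL;
}

/* Start nthreads workers, wait for them all, and return the final
   counter value, or -1 if some thread could not be started. */
long run_workers(int nthreads, int increments, int step)
{
    pthread_t *tids;
    int i, started = 0;

    if (nthreads <= 0 || (tids = malloc(nthreads * sizeof(*tids))) == NULL)
        return -1;

    counter = 0;

    for (i = 0; i < nthreads; i++) {
        struct worker_arg *arg = malloc(sizeof(*arg));

        if (arg == NULL)
            break;

        arg->increments = increments;
        arg->step = step;

        if (pthread_create(&tids[started], NULL, worker, arg) != 0) {
            free(arg);
            break;
        }
        started++;
    }

    for (i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    free(tids);
    return (started == nthreads) ? counter : -1;
}
```

Each thread receives its own heap-allocated struct. Passing the address of a single local variable to every thread would be a mistake, since the creating function may change or leave the variable before the thread has read it. On many systems the program must be linked with the thread library, for example with `cc -pthread'.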
  6165.  
  6166. System databases
  6167.  
  6168. The C library calls which query the databases are, amongst others,
  6169.  
  6170. getpwnam            get password data by name
  6171. getpwuid            get password data by uid
  6172. getgrnam            get group data by name
  6173. gethostent          get entry in hosts database
  6174. getnetgrent         get entry in netgroups database
getservbyname       get service by name
getservbyport       get service by port
getprotobyname      get protocol by name
  6178.  
  6179. For a complete list and how to use these, see the UNIX manual.
  6180.  
  6181. The following example shows how to read the password file of the system. The functions used here can be used regardless of whether the network information service (NIS) is in use. The data are returned in a structure which is defined in `/usr/include/pwd.h'.
  6182.  
  6183. /******************************************************************/
  6184. /*                                                                */
  6185. /* Read the passwd file by name and sequentially                  */
  6186. /*                                                                */
  6187. /******************************************************************/
  6188.  
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <pwd.h>

int main ()

{ uid_t uid;
  struct passwd *pw;

uid = getuid();

if ((pw = getpwuid(uid)) == NULL)       /* No entry for this uid */
  {
  printf ("Can't find your password entry\n");
  return 1;
  }

printf ("Your login name is %s\n",pw->pw_name);

printf ("Now here comes the whole file!\n\n");

setpwent();

while ((pw = getpwent()) != NULL)       /* Fetch the next entry each time */
  {
  printf ("%s:%s:%s\n",pw->pw_name,pw->pw_gecos,pw->pw_dir);
  }

endpwent();
return 0;
}
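
The program above looks up an entry by uid. A lookup by name with `getpwnam' works in the same way. The following sketch wraps it in a helper, here called `home_directory' (a name invented for this example), which copies the result out of the static structure before anything else can overwrite it.

```c
#include <string.h>
#include <sys/types.h>
#include <pwd.h>

/* Copy the home directory of the user called name into dir.
   Returns 0 on success, -1 if the user is unknown or dir is too small. */
int home_directory(const char *name, char *dir, size_t dirsize)
{
    struct passwd *pw = getpwnam(name);    /* points to static data */

    if (pw == NULL)
        return -1;                         /* no such user */

    if (strlen(pw->pw_dir) >= dirsize)
        return -1;                         /* would not fit */

    strcpy(dir, pw->pw_dir);               /* safe: length checked above */
    return 0;
}
```

The copy matters because the pointer returned by `getpwnam' refers to storage which the next call to any of the `getpw' functions is allowed to reuse.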
  6214.  
  6215. DNS - The Domain Name Service
  6216.  
  6217. The second network database service is that which converts host and domain names into IP numbers and vice versa. This is the domain name service, usually implemented by the BIND (Berkeley Internet Name Domain) software. The information here concerns version 4.9 of this software.
  6218.  
  6219. gethostbyname()
  6220.  
This is perhaps the most important function for hostname lookup. `gethostbyname()' gets its information either from files, NIS or DNS. Its behaviour is configured by the files mentioned above, see section DNS - The Domain Name Service. It is used to look up the IP address of a named host (including domain name if DNS is used). On the configurable systems described above, the full list of servers is queried until a reply is obtained. The order in which the different services are queried is important here since DNS returns a fully qualified name (host name plus domain name) whereas NIS and the `/etc/hosts' file database return only a hostname.
  6222.  
  6223. gethostbyname returns data in the form of a pointer to a static data structure. The syntax is
  6224.  
  6225. #include <netdb.h>
  6226.  
  6227. struct hostent *hp;
  6228.  
hp = gethostbyname("myhost.domain.country");
  6230.  
  6231. The resulting structure varies on different implementations of UNIX, but the `old BSD standard' is of the form:
  6232.  
  6233.  
  6234. struct  hostent
  6235.   {
  6236.   char    *h_name;        /* official name of host */
  6237.   char    **h_aliases;    /* alias list */
  6238.   int     h_addrtype;     /* host address type */
  6239.   int     h_length;       /* length of address */
  6240.   char    **h_addr_list;  /* list of addresses from name server */
  6241.   };
  6242.  
#define h_addr  h_addr_list[0]  /* address, for backward compatibility */

The structure contains a list of addresses and/or aliases from the nameserver. The interesting quantity is usually extracted by means of the macro `h_addr' which gives the first value in the address list, though officially one should examine the whole list now.
  6246.  
  6247. This value is a pointer which can be converted into a text form by the following hideous type transformation:
  6248.  
  6249.  
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>           /* inet_ntoa() */

struct sockaddr_in cin;

cin.sin_addr.s_addr = ((struct in_addr *)(hp->h_addr))->s_addr;

printf("IP address = %s\n",inet_ntoa(cin.sin_addr));
  6259.  
  6260. See the client program in the first section of this chapter for an example of its use.
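
As noted above, one should officially examine the whole address list rather than only `h_addr'. The following sketch walks through `h_addr_list' and prints every entry. The function name `print_addresses' is invented for this example, and the sketch assumes IPv4 addresses (`AF_INET') only.

```c
#include <stdio.h>
#include <string.h>
#include <netdb.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Print every IPv4 address in a hostent structure to the stream out.
   Returns the number of addresses printed. */
int print_addresses(const struct hostent *hp, FILE *out)
{
    char **ap;
    int count = 0;

    if (hp == NULL || hp->h_addrtype != AF_INET)
        return 0;

    /* The list is terminated by a NULL pointer */
    for (ap = hp->h_addr_list; *ap != NULL; ap++) {
        struct in_addr addr;

        memcpy(&addr, *ap, sizeof(addr));   /* copy instead of casting the pointer */
        fprintf(out, "IP address = %s\n", inet_ntoa(addr));
        count++;
    }

    return count;
}
```

A typical call would be `print_addresses(gethostbyname("myhost.domain.country"), stdout)'. Since the function checks for a NULL pointer, a failed lookup simply prints nothing. Copying each address with `memcpy' avoids the type cast used above, at the cost of one extra line.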
  6261.  
  6262. C support for NFS
  6263.  
The standard C library supports NFS in two ways. First, NFS is based on Sun's RPC system, so the basic calls are only instances of standard RPC protocols.

Second, the C functions in the standard input/output library can be used to access NFS filesystems. Since NFS imitates the UNIX filesystem as closely as possible, NFS filesystems can be mounted in exactly the same way as ordinary filesystems. Unfortunately, the C functions which perform the mount operation in UNIX are depressingly non-standard. They differ on almost every implementation of UNIX.

The basic function which mounts a filesystem is `mount' (see man (2) mount). The mount table is stored in a file /etc/mtab on BSD systems (again the name varies wildly from UNIX to UNIX, mnttab on HPUX for instance). The file /etc/rmtab on an NFS server contains a list of filesystems which are mounted by remote clients. C functions exist which can read the filesystem tables and place the resulting data in C struct types. Alas, these struct definitions are also quite different on different systems (see `/usr/include/sys/mount.h'), so the user wishing to write system-independent code is confounded at the lowest level.
  6269.  
  6270. Exercises
  6271.  
  6272.    Use `gethostbyname()' to make a simple program like `nslookup' which gives the internet address of a named host.
  6273.    Modify the client server example above to make a `remote ls' command called `rls'. You should be able to use the syntax
  6274.  
  6275.    rls (options) hostname:/path/to/file
  6276.  
  6277. Summary of programming idioms.
  6278.  
  6279. True and false
  6280.  
  6281.  
  6282.    # C shell
  6283.  
  6284.      True   - non-zero/non-empty value
  6285.      False  - zero or null string
  6286.  
  6287.    # Bourne shell
  6288.  
  6289.      True   - 0 returned by shell command
  6290.      False  - non-zero returned by shell command
  6291.  
  6292.      ( Note that "test" converts from C shell style to Bourne shell)
  6293.  
  6294.    # Perl
  6295.  
  6296.      True   - non-zero/non-empty value
  6297.      False  - zero or null string
  6298.  
  6299.    /* C */
  6300.  
  6301.      True   - non zero integer
  6302.      False  - zero integer
  6303.  
  6304. Input from tty
  6305.  
  6306.    # C shell
  6307.  
  6308.       $<
  6309.  
  6310.    # Bourne shell
  6311.  
  6312.       line
  6313.       read
  6314.  
  6315.    # Perl
  6316.  
  6317.       <STDIN>
  6318.  
  6319.    /* C */
  6320.  
  6321.       scanf
  6322.  
  6323. Redirection of I/O
  6324.  
  6325.  
  6326.    # C  shell
  6327.  
  6328.      command  >  file
  6329.      command  >& file
  6330.      command  >> file
  6331.      command1 |  command2
  6332.  
  6333.    # Bourne shell
  6334.  
  6335.      command  > file
  6336.      command  > file 2>&1
  6337.      command  >> file
  6338.      command1 | command2
  6339.  
  6340.    # Perl
  6341.  
  6342.      open (HANDLE,">file")
     open (STDOUT,">file"); open (STDERR,">&STDOUT")
  6344.      open (HANDLE,">>file")
  6345.      open (HANDLE,"command1 |")
  6346.      open (HANDLE,"| command2")
  6347.  
  6348.    /* C */
  6349.  
     freopen ("file","w",stdout); printf(..)
     freopen ("file","w",stdout); dup2(1,2); printf(..); fprintf(stderr,..)
     freopen ("file","a",stdout); printf(..)
  6353.      popen ("command1","r")
  6354.      popen ("command2","w")
  6355.  
  6356. Loops and tests
  6357.  
  6358.  
   # C shell
  6360.  
  6361.      foreach end         if then else endif
  6362.      while end           switch case breaksw endsw
  6363.      repeat
  6364.  
  6365.    # Bourne shell
  6366.  
  6367.      while do done       if then else fi
  6368.      until do done       case in esac
  6369.      for in do done
  6370.  
  6371.    # Perl
  6372.  
     while               if elsif else
  6374.      for                 unless else
  6375.      foreach
  6376.      until
  6377.      do while
  6378.      do until
  6379.  
  6380.    /* C */
  6381.  
     while               if else
  6383.      do while            switch case
  6384.      for
  6385.  
  6386. Arguments from command line
  6387.  
  6388.    # C shell
  6389.  
  6390.      $argv[]
  6391.      $#argv
  6392.  
  6393.    # Bourne Shell
  6394.  
  6395.      $1, $2, $3...  $*
  6396.      $#
  6397.  
  6398.    # Perl
  6399.  
  6400.      $ARGV[]
  6401.      $#ARGV
  6402.  
  6403.    /* C */
  6404.  
     char *argv[]
  6406.      int  argc
  6407.  
  6408. Arithmetic
  6409.  
  6410.    # C shell
  6411.  
       @ a = $b + $c
  6413.  
  6414.    # Bourne shell
  6415.  
      a=`expr $b + $c`
  6417.  
  6418.    # Perl
  6419.  
  6420.       $a = $b + $c;
  6421.  
  6422.    /* C */
  6423.  
  6424.       a = b + c;
  6425.  
  6426. Numerical comparison
  6427.  
  6428.    # C shell
  6429.  
  6430.       if ( $x == $y ) then
  6431.       endif
  6432.  
  6433.    # Bourne shell
  6434.  
  6435.       if [ $x -eq $y ]; then
  6436.       fi
  6437.  
  6438.    # Perl
  6439.  
  6440.       if ( $x == $y )
  6441.          {
  6442.          }
  6443.  
  6444.    /* C */
  6445.  
  6446.       if ( x == y )
  6447.          {
  6448.          }
  6449.  
  6450. String comparison
  6451.  
  6452.  
  6453.    # C shell
  6454.  
  6455.       if ( $x == $y ) then
  6456.       endif
  6457.  
  6458.    # Bourne shell
  6459.  
  6460.       if [ $x = $y ]; then
  6461.       fi
  6462.  
  6463.    # Perl
  6464.  
      if ( $x eq $y )
  6466.          {
  6467.          }
  6468.  
  6469.    /* C */
  6470.  
  6471.       if (strcmp(x,y) == 0)
  6472.          {
  6473.          }
  6474.  
  6475. Opening a file
  6476.  
  6477.  
  6478.    # C shell, Bourne shell - cannot be done (pipes only)
  6479.  
  6480.    # Perl
  6481.  
  6482.        open (READ_HANDLE,"filename");
  6483.        open (WRITE_HANDLE,"> filename");
  6484.        open (APPEND_HANDLE,">> filename");
  6485.  
  6486.    /* C */
  6487.  
  6488.        FILE *fp;
  6489.  
  6490.        fp = fopen ("file","r");
  6491.        fp = fopen ("file","w");
  6492.        fp = fopen ("file","a");
  6493.  
  6494. Opening a directory
  6495.  
  6496.  
  6497.    # C shell
  6498.  
  6499.       foreach dir ( directory/* )
  6500.          ...
  6501.       end
  6502.  
  6503.    # Bourne shell
  6504.  
  6505.       for dir in directory/* ;
  6506.       do
  6507.         ...
  6508.       done
  6509.  
  6510.    # Perl
  6511.  
  6512.       opendir (HANDLE,"directory") || die;
  6513.  
  6514.       while ($entry = readdir(HANDLE))
  6515.          {
  6516.          }
  6517.  
  6518.       closedir(HANDLE);
  6519.  
   /* C */
  6521.  
  6522.       #include <dirent.h>
  6523.       DIR *dirh;
  6524.       struct dirent *dirp;
  6525.  
  6526.       if ((dirh = opendir(name)) == NULL)
  6527.          {
         perror("opendir");
  6529.          exit(1);
  6530.          }
  6531.  
  6532.       for (dirp = readdir(dirh); dirp != NULL; dirp = readdir(dirh))
  6533.          {
  6534.          ...  /* dirp->d_name points to child */
  6535.          }
  6536.  
  6537.       closedir(dirh);
  6538.      
  6539.  
  6540. Testing file types
  6541.  
  6542.  
  6543.    # C shell
  6544.  
  6545.        if ( -f file )  # plain file
  6546.        if ( -d file )  # directory
  6547.  
  6548.    # Bourne shell
  6549.  
  6550.        if [ -f file ]  # plain file
  6551.        if [ -d file ]  # directory
  6552.  
  6553.    # Perl
  6554.  
  6555.        if ( -f file )  # plain file
  6556.        if ( -d file )  # directory
  6557.        if ( -l file )  # symbolic link
  6558.  
  6559.    /* C */
  6560.  
  6561.         #include <sys/stat.h>
  6562.  
  6563.         struct stat statvar;
  6564.  
  6565.         stat("file", &statvar);
  6566.  
        if (S_ISREG(statvar.st_mode))  /* plain file */
        if (S_ISDIR(statvar.st_mode))  /* directory  */

        lstat("file", &statvar);

        if (S_ISLNK(statvar.st_mode))  /* symbolic link */
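
The C tests above can be collected into one small function. This is only a sketch: the name `file_kind' and the `KIND_' constants are invented for this example. It uses `lstat' throughout, so that a symbolic link is reported as a link rather than as the file it points to.

```c
#include <sys/types.h>
#include <sys/stat.h>

enum kind { KIND_UNKNOWN, KIND_PLAIN, KIND_DIR, KIND_LINK, KIND_OTHER };

/* Classify a path name using the st_mode field filled in by lstat() */
enum kind file_kind(const char *path)
{
    struct stat statvar;

    if (lstat(path, &statvar) == -1)
        return KIND_UNKNOWN;            /* no such file, or no permission */

    if (S_ISLNK(statvar.st_mode))
        return KIND_LINK;               /* symbolic link */

    if (S_ISDIR(statvar.st_mode))
        return KIND_DIR;                /* directory */

    if (S_ISREG(statvar.st_mode))
        return KIND_PLAIN;              /* plain file */

    return KIND_OTHER;                  /* device, socket, fifo ... */
}
```

For example `file_kind("/")' gives `KIND_DIR'. Always check the return value of `stat' or `lstat' before looking at the structure, since its contents are undefined when the call fails.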
  6573.  
  6574. Command and Variable Index
  6575.  
  6576. !
  6577. `!' in sh
  6578. `!' not
  6579. `!='
  6580. `!=' in sh
  6581. `!~'
  6582. "
  6583. "
  6584. " shell construction
  6585. $
  6586. $ in regular expressions
  6587. $< in make
  6588. $? in make
  6589. $@ in make
  6590. &
  6591. &
  6592. `&' AND
  6593. '
  6594. '
  6595. (
  6596. () in csh
  6597. *
  6598. *
  6599. * in regular expressions
  6600. +
  6601. `+'
  6602. + in regular expressions
  6603. `++'
  6604. `+='
  6605. -
  6606. `-'
  6607. `--'
  6608. --help
  6609. `-='
  6610. `-a' in sh
  6611. -d file
  6612. `-d' in sh
  6613. -e file
  6614. `-eq' in sh
  6615. -f file
  6616. `-f' in sh
  6617. `-g' in sh
  6618. `-ge' in sh
  6619. `-gt' in sh
  6620. -h
  6621. `-le' in sh
  6622. `-lt' in sh
  6623. `-ne' in sh
  6624. `-o' in sh
  6625. -r file
  6626. `-r' in sh
  6627. `-s' in sh
  6628. `-u' in sh
  6629. -w file
  6630. `-w' in sh
  6631. `-x' in sh
  6632. -z file
  6633. -z in perl
  6634. .
  6635. .
  6636. . in regular expressions
  6637. ..
  6638. .cshrc
  6639. .profile
  6640. .xsession
  6641. /
  6642. `/bin'
  6643. `/bin/csh'
  6644. `/bin/sh'
  6645. `/dev'
  6646. `/devices'
  6647. `/etc'
  6648. `/export'
  6649. `/home'
  6650. `/sbin'
  6651. `/sys'
  6652. `/users'
  6653. `/usr'
  6654. `/usr/bin'
  6655. `/usr/local'
  6656. `/var'
  6657. `/var/adm'
  6658. `/var/spool'
  6659. :
  6660. :e
  6661. :h
  6662. :r
  6663. :t
  6664. <
  6665. <
  6666. `<'
  6667. `<' less than
  6668. <<, <<
  6669. `<<' shift
  6670. `<='
  6671. =
  6672. `=' assignment
  6673. `=' in sh
  6674. `==', `=='
  6675. `==' equal to (compare)
  6676. `=~'
  6677. >
  6678. >
  6679. `>'
  6680. `>' greater than
  6681. `>='
  6682. >>
  6683. `>>' shift
  6684. ?
  6685. ?
  6686. ? in regular expressions
  6687. [
  6688. []
  6689. [] in regular expressions
  6690. ^
  6691. ^ in regular expressions
  6692. `^' XOR
  6693. `
  6694. `
  6695. `...`
  6696. a
  6697. apropos
  6698. `ar'
  6699. `archie'
  6700. argc in C
  6701. argv in C
  6702. `awk'
  6703. b
  6704. breaksw
  6705. c
  6706. `cat'
  6707. `cc'
  6708. `CC'
  6709. `chgrp'
  6710. `chmod'
  6711. `chown'
  6712. `cmdtool'
  6713. continue
  6714. `cp'
  6715. crypt()
  6716. CTRL-A
  6717. CTRL-C
  6718. CTRL-D
  6719. CTRL-E
  6720. CTRL-L
  6721. CTRL-Z
  6722. `cut'
  6723. d
  6724. `date'
  6725. `dbx'
  6726. `dc'
  6727. ddd
  6728. `df'
  6729. DISPLAY
  6730. `domainname'
  6731. `du'
  6732. `dvips'
  6733. e
  6734. `ed'
  6735. `elm'
  6736. `emacs'
  6737. env
  6738. `eq'
  6739. f
  6740. `find'
  6741. `finger'
  6742. `fmgr'
  6743. `fnews'
  6744. foreach
  6745. fork()
  6746. `ftp'
  6747. g
  6748. `g++'
  6749. `gcc'
  6750. `gdb'
  6751. getenv()
  6752. `ghostscript'
  6753. `ghostview'
  6754. h
  6755. HOME
  6756. HOST
  6757. `hostname'
  6758. i
  6759. ioctl()
  6760. `irc'
  6761. `ispell'
  6762. k
  6763. keys
  6764. l
  6765. `latex'
  6766. `ld'
  6767. `LD_LIBRARY_PATH'
  6768. LD_LIBRARY_PATH
  6769. `less'
  6770. ln
  6771. ln -s
  6772. `locate'
  6773. `lp', `lp'
  6774. `lpq'
  6775. `lpstat'
  6776. `ls'
  6777. m
  6778. man -k
  6779. `mesg'
  6780. `mkdir'
  6781. mkdir
  6782. `more'
  6783. `mv'
  6784. n
  6785. `ncftp'
  6786. `netstat'
  6787. `nslookup'
  6788. p
  6789. `paste'
  6790. PATH
  6791. `pico'
  6792. `pine'
  6793. `ping'
  6794. `PRINTER'
  6795. PRINTER
  6796. `ps'
  6797. r
  6798. rand()
  6799. `rpcinfo'
  6800. `rename' in perl
  6801. repeat
  6802. `rlogin'
  6803. `rmail'
  6804. `rmdir'
  6805. `rsh'
  6806. s
  6807. `screen'
  6808. `sed'
  6809. set
  6810. `setroot'
  6811. `shelltool'
  6812. `showmount'
  6813. stderr
  6814. stdin
  6815. stdout
  6816. t
  6817. `talk'
  6818. `tcl'
  6819. `telnet'
  6820. TERM
  6821. `tex'
  6822. `texinfo'
  6823. `textedit'
  6824. `touch'
  6825. u
  6826. `uname'
  6827. `unlink', `unlink'
  6828. unset
  6829. `users'
  6830. v
  6831. `vi'
  6832. `vmstat'
  6833. `vmunix'
  6834. w
  6835. `w'
  6836. `whereis'
  6837. which
  6838. while
  6839. `who'
  6840. `write'
  6841. x
  6842. `xarchie'
  6843. `xcalc'
  6844. `xdvi'
  6845. `xedit'
  6846. `xemacs'
  6847. `xfig'
  6848. `xmosaic'
  6849. `xpaint'
  6850. `xrn'
  6851. `xterm'
  6852. `xv'
  6853. `xxgdb'
  6854. z
  6855. `zmail'
  6856. |
  6857. |
  6858. `|' OR
  6859. `||' logical OR
  6860.  
  6861. Concept Index
  6862.  
  6863. #
  6864. `#!program' sequence, `#!program' sequence
  6865. $
  6866. $ in regular expressions
  6867. `$<' operator
  6868. '
  6869. ' and "
  6870. (
  6871. () and subshells
  6872. () operators to make array in csh
  6873. *
  6874. * in regular expressions
  6875. +
  6876. + in regular expressions
  6877. -
  6878. `-I' option to cc
  6879. `-L' option to cc
  6880. .
  6881. . directory, . directory
  6882. . in regular expressions
  6883. .. directory, .. directory
  6884. `.cshrc' file
  6885. `.login' file
  6886. `.profile' set up in sh
  6887. .xsession file
  6888. /
  6889. `/etc/group'
  6890. 1
  6891. `1>' in sh
  6892. 2
  6893. `2>' in sh
  6894. `2>&1' in sh
  6895. <
  6896. `<>' filehandle in perl
  6897. ?
  6898. ? in regular expressions
  6899. [
  6900. `[]' for test in sh
  6901. [] in regular expressions
  6902. ^
  6903. ^ in regular expressions
  6904. `
  6905. ` symbol and embedded shells
  6906. ``..`' in perl
  6907. a
  6908. `a.out'
  6909. accept()
  6910. Access bits
  6911. Access bits, octal form
  6912. Access bits, text form
  6913. Access control
  6914. Access control lists
  6915. Access rights
  6916. Access to files
  6917. ACLs
  6918. ANSI C
  6919. Appending to a file with `>>'
  6920. apropos
  6921. `ar' archiver
  6922. `archie' program
  6923. Argument vector in csh
  6924. Argument vector in perl, Argument vector in perl
  6925. Arguments, command line
  6926. `argv'
  6927. Arithmetic in sh
  6928. Arithmetic operations in csh
  6929. Arrays (associated) in perl
  6930. Arrays (normal) in perl
  6931. Arrays and `split'
  6932. Arrays in csh
  6933. Arrays in perl
  6934. Associated arrays, iteration
  6935. `at' command
  6936. AT&T
  6937. `awk'
  6938. `awk' pattern extractor
  6939. b
  6940. `Background picture'
  6941. Background process, Background process
  6942. Backwards quotes
  6943. bash, bash
  6944. `batch' command
  6945. Berkeley Internet Name Domain (BIND)
  6946. `bg' command
  6947. Big endian
  6948. BIND
  6949. bind()
  6950. Bourne shell, Bourne shell
  6951. Break key
  6952. `breaksw'
  6953. Browsing through a file
  6954. BSD
  6955. Build software script
  6956. Built in commands
  6957. Byte order
  6958. c
  6959. C
  6960. C library calls and shell commands
  6961. C programming
  6962. C shell
  6963. C shell setup files
  6964. C++ suffix rules
  6965. C, role in unix
  6966. Calculator, shell
  6967. Calculator, X windows
  6968. `cat' command
  6969. `CC'
  6970. `cc'
  6971. CGI protocol
  6972. Changing file mode
  6973. chgrp command
  6974. `chgrp' command
  6975. `chmod' command
  6976. chmod command
  6977. `chop' command in perl
  6978. chown command
  6979. `chown' command
  6980. `close' command in perl
  6981. closedir command
  6982. `cmdtool'
  6983. Command completion
  6984. Command history
  6985. Command interpreter
  6986. Command line arguments
  6987. Command line arguments in C
  6988. Command line arguments in perl, Command line arguments in perl
  6989. Command line arguments in sh
  6990. Command path
  6991. Command window
  6992. Commands as files
  6993. Commands path
  6994. Comparison operators in csh
  6995. Compiler script
  6996. Compilers
  6997. Compiling huge programs
  6998. Compiling programs
  6999. connect()
  7000. `continue' in csh
  7001. Continuing long lines
  7002. Copy of output to file
  7003. core
  7004. `cp' command
  7005. Creating directories
  7006. Creating files
  7007. csh
  7008. `csh'
  7009. CTRL-A
  7010. CTRL-C
  7011. CTRL-D
  7012. CTRL-D and EOF
  7013. CTRL-E
  7014. CTRL-L
  7015. CTRL-Z
  7016. Curses
  7017. `cut'
  7018. Cut as a perl script
  7019. `cut' command
  7020. d
  7021. Database maps
  7022. Database support
  7023. `date' command
  7024. Date stamp, updating
  7025. `dbx' debugger
  7026. Debugger
  7027. Debugger for C
  7028. Debugger GUI
  7029. Decisions and return codes in sh
  7030. delete
  7031. Dependencies in Makefiles
  7032. `df' command
  7033. `die'
  7034. Directories, creating
  7035. Directories, deleting
  7036. dirent directory interface
  7037. Disk usage.
  7038. DISPLAY variable
  7039. Display, X
  7040. DNS
  7041. `do..while' in perl
  7042. Domainname
  7043. `domainname' command
  7044. DOS
  7045. Drawing program
  7046. `du' command
  7047. dvi to postscript
  7048. e
  7049. `ed'
  7050. egrep command
  7051. `elm' mailer
  7052. `emacs'
  7053. Embedded shell
  7054. Encryption
  7055. End of file CTRL-D
  7056. env command
  7057. Environment variables, Environment variables
  7058. Environment variables in C, Environment variables in C
  7059. Environment variables in perl, Environment variables in perl
  7060. Environment, unix user
  7061. envp in C
  7062. `eq' and `==' in perl
  7063. Error messages
  7064. Errors in perl
  7065. Executable, making programs
  7066. Exiting on errors in perl
  7067. `EXPORT' command in sh
  7068. Expressions, regular
  7069. extern variables
  7070. Extracting filename components
  7071. f
  7072. `fg' command
  7073. File access permission
  7074. File handles in perl
  7075. File hierarchy
  7076. File mode, changing
  7077. File protection bits
  7078. File transfer
  7079. File type, determining in C
  7080. Filename completion
  7081. Files in perl
  7082. Files, iterating over lines
  7083. `find' command, `find' command
  7084. Finding commands
  7085. Finding FTP files
  7086. `finger' service
  7087. `fmgr' file manager
  7088. `fnews' news reader
  7089. For loop
  7090. for loop in perl
  7091. for loop in sh
  7092. For loops in perl
  7093. foreach
  7094. foreach example
  7095. Foreach loop
  7096. foreach loop in perl
  7097. Foreground process
  7098. Forking new processes
  7099. Formatting text in a file
  7100. Forms in HTML
  7101. `ftp' program
  7102. FTP resources, finding
  7103. Fully qualified name
  7104. g
  7105. `g++'
  7106. `gcc'
  7107. `gdb' debugger
  7108. getenv() function
  7109. getgrnam()
  7110. gethostbyname(), gethostbyname()
  7111. gethostent()
  7112. getnetgrent()
  7113. getpwnam()
  7114. getpwuid()
  7115. getservbyname()
  7116. getservbyport()
  7117. Getting command output into a string
  7118. `ghostscript' "GNU postscript" interpreter
  7119. `ghostview' postscript previewer
  7120. gif
  7121. Global variables
  7122. Global variables in csh
  7123. Global variables in sh
  7124. Granting permission
  7125. groups
  7126. h
  7127. Hard links, Hard links
  7128. Help function for commands
  7129. Hierarchy, file
  7130. `hostname' command
  7131. Hypertext
  7132. i
  7133. I/O streams
  7134. `if' in perl
  7135. if..then..else in csh
  7136. if..then..else..fi in sh
  7137. `IFS' variable in sh
  7138. INADDR_ANY
  7139. Include file search path
  7140. Include files
  7141. Index nodes
  7142. Information about file properties
  7143. init
  7144. inodes
  7145. Input in csh
  7146. Input in sh
  7147. Input over many lines
  7148. Inserting a command into a string
  7149. Internet relay chat
  7150. Internet resources
  7151. Interpretation of values in perl
  7152. Interrupt handler in sh
  7153. ioctl()
  7154. `IRC'
  7155. Iterating over files
  7156. Iteration over arrays
  7157. j
  7158. Job control
  7159. Job numbers in csh
  7160. Job, moving to background
  7161. Joker notation
  7162. jpg
  7163. jsh
  7164. k
  7165. Kernel
  7166. kernel
  7167. Kernighan and Ritchie C
  7168. `kill' command
  7169. ksh, ksh
  7170. l
  7171. `latex'
  7172. `ld' loader/linker
  7173. `ld.so.cache'
  7174. `ldconfig'
  7175. `less' command
  7176. `lex'
  7177. lex
  7178. Lexer
  7179. libc
  7180. libcurses
  7181. libm
  7182. Library path for C loader
  7183. Limitations of shell programs
  7184. Links in C
  7185. Links, where do they point?
  7186. listen()
  7187. Little endian
  7188. ln -s
  7189. Local variables
  7190. Local variables in csh
  7191. Local variables in perl
  7192. Local variables in sh
  7193. `locate' command
  7194. Logging on
  7195. Login environment
  7196. Login environment
  7197. Long file listing
  7198. Long lines, continuing
  7199. Loops and list separators
  7200. Loops in csh
  7201. Loops in sh
  7202. `lp' command
  7203. `lpq'
  7204. `lpr' command
  7205. `lpstat'
  7206. ls -l
  7207. `ls' command
  7208. lstat()
  7209. m
  7210. Macintosh
  7211. Macros for stat
  7212. Mail clients
  7213. make
  7214. Make rules for C++
  7215. Make software script
  7216. Making a script
  7217. Making directories
  7218. Making scripts in sh
  7219. Making programs executable
  7220. Matching filenames, Matching filenames
  7221. Matching strings
  7222. mc
  7223. Mercury
  7224. `mesg'
  7225. Messages
  7226. Mime types in W3
  7227. mkdir
  7228. `mkdir' command
  7229. `more' command
  7230. `mosaic'
  7231. Mounted file systems
  7232. Moving a job to the background
  7233. Moving files
  7234. Multiple C files, compiling
  7235. Multiple screens
  7236. `mv' command
  7237. n
  7238. nc
  7239. `ncftp' program
  7240. `netstat' network statistics
  7241. Network byte order
  7242. Network databases
  7243. Network information service
  7244. Never do in unix
  7245. NFS and C support
  7246. NIS
  7247. nobody
  7248. noclobber overwrite protection
  7249. noclobber variable
  7250. `nslookup' command
  7251. o
  7252. `open' command in perl
  7253. opendir command
  7254. Opening a pipe in C
  7255. Operating system name
  7256. Operators in csh
  7257. Output to file
  7258. Output, sending to a file
  7259. p
  7260. Painting program
  7261. Panic button
  7262. Parameters in perl functions
  7263. Parser
  7264. Parts of a filename
  7265. `passwd' file
  7266. `paste'
  7267. Paste as a perl script
  7268. `paste' command
  7269. path
  7270. PATH
  7271. path
  7272. Pattern matching in perl, Pattern matching in perl
  7273. Pattern replacement in perl
  7274. PC windows
  7275. Perl
  7276. perl
  7277. Perl variables and types
  7278. Perl, strings and scalar
  7279. Perl, truncating strings
  7280. Permissions on files
  7281. Permissions, determining in C
  7282. `pico'
  7283. Picture processing
  7284. `pine' mailer
  7285. Pipe
  7286. Pipes
  7287. Pipes in C
  7288. Piping to more to prevent scrolling
  7289. popen()
  7290. POSIX standard
  7291. Postscript viewers
  7292. Printer queue
  7293. Printer status
  7294. `PRINTER' variable
  7295. Printing a file
  7296. Printing multiple lines
  7297. Procedures and subroutines in sh
  7298. Process, moving to background
  7299. Processes
  7300. Prompt, redefining
  7301. Protecting files from overwrite with `>'
  7302. Protection bits
  7303. `ps' command
  7304. r
  7305. readdir command
  7306. readlink()
  7307. recv()
  7308. Redefining list separator in sh
  7309. Redirecting stdio in sh
  7310. Redirection of stdio
  7311. Regular expressions
  7312. Reliable socket protocol
  7313. Renaming files
  7314. repeat
  7315. Result of a command into a string
  7316. Return codes
  7317. `rlogin'
  7318. rlogin program
  7319. `rm' command
  7320. `rmail' in emacs
  7321. `rmdir' command
  7322. Role of C in unix
  7323. Root privileges
  7324. root user
  7325. rpcgen
  7326. `rpcinfo'
  7327. `rsh'
  7328. s
  7329. s-bit, s-bit
  7330. Scalar variables in perl
  7331. scheme
  7332. `screen'
  7333. Screens
  7334. Script aliases in W3
  7335. Script, making
  7336. Scripts in sh, making
  7337. Searching and replacing in perl (example)
  7338. `sed' as a perl script
  7339. `sed' batch editor
  7340. `sed' editor
  7341. `sed', search and replace
  7342. send()
  7343. Sending messages
  7344. set command
  7345. setenv command
  7346. setgid bit
  7347. Setting the prompt
  7348. Setting up the C shell
  7349. Setting up the x environment
  7350. setuid bit
  7351. SetUID scripts
  7352. `sh'
  7353. sh
  7354. sh5
  7355. Shared libraries
  7356. shell, shell, shell
  7357. Shell commands and C library calls
  7358. Shells, various
  7359. `shelltool'
  7360. `shift' and arrays
  7361. `shift' and arrays in perl
  7362. shift operator on strings
  7363. `showmount'
  7364. Signal handler in sh
  7365. Single and double quotes
  7366. `sleep' command
  7367. socket()
  7368. Sockets
  7369. Soft links
  7370. `Sonar' ping
  7371. Spelling checker
  7372. `split' and arrays
  7373. `split' command
  7374. Splitting C into many files.
  7375. Splitting output to several files
  7376. Standard error
  7377. Standard I/O in perl
  7378. Standard I/O in sh
  7379. Standard I/O, redirection
  7380. Standard input
  7381. Standard output
  7382. Starting
  7383. Starting shell jobs
  7384. stat()
  7385. Static linking
  7386. Statistics about a file
  7387. Sticky bit
  7388. Strings in perl
  7389. `stty' and switching off term echo
  7390. Subroutines in perl
  7391. Subshells and ()
  7392. Suffix rules in Makefiles
  7393. superuser
  7394. Suspending a job
  7395. Swapping text strings
  7396. switch..case in csh
  7397. Symbolic links
  7398. System 5
  7399. `System details'
  7400. System identity and `uname'
  7401. System name
  7402. System V
  7403. t
  7404. t-bit
  7405. TAB completion key
  7406. `talk' service
  7407. `TCL'
  7408. TCP/IP
  7409. tcsh, tcsh
  7410. `tee' command
  7411. Teletype terminal
  7412. `telnet'
  7413. telnet
  7414. Terminal echo and `stty'
  7415. Terminals
  7416. `test' in sh
  7417. test programs
  7418. test, don't call your program this
  7419. Testing files
  7420. Testing response from other hosts.
  7421. Tests and conditions in csh
  7422. Tests in sh
  7423. `tex'
  7424. `texinfo' system
  7425. Text form of access bits
  7426. Text formatting
  7427. `textedit'
  7428. The argument vector in C
  7429. The domain name service
  7430. tiff
  7431. Time and date
  7432. Time stamp, updating
  7433. Tk library
  7434. `touch' command
  7435. Traps in sh
  7436. Truncating strings in perl
  7437. tty
  7438. `type' in DOS
  7439. Types in perl
  7440. u
  7441. umask variable, umask variable
  7442. `uname' command
  7443. Undefining variables
  7444. undelete
  7445. UNIX
  7446. UNIX history
  7447. `unless' in perl
  7448. `unlink' command
  7449. unset command
  7450. `until'
  7451. Up arrow
  7452. Updating file time stamp
  7453. User database support
  7454. User environment
  7455. `users' command
  7456. v
  7457. Variables, global
  7458. Variables, local
  7459. `vi'
  7460. Viewing a file
  7461. `vmstat' virtual memory stats
  7462. w
  7463. `w' command
  7464. `wait.h'
  7465. Waiting for child processes
  7466. `whereis' command
  7467. which command
  7468. while
  7469. `while' in perl
  7470. `while' in sh
  7471. while loop in sh
  7472. `who' command
  7473. `whoami' command
  7474. Wildcards, Wildcards
  7475. Windows on PC
  7476. Wrapper functions
  7477. Wrappers
  7478. `write' command
  7479. write example
  7480. Writing a script
  7481. WTERMSIG(status)
  7482. x
  7483. X access control
  7484. X display, X display
  7485. X protocol
  7486. X window system
  7487. X windows
  7488. X windows access
  7489. X windows authentication
  7490. X-windows
  7491. `xarchie' client
  7492. Xauthority mechanism
  7493. `xedit'
  7494. `xemacs'
  7495. `xfig' drawing program
  7496. xhost mechanism
  7497. `xpaint' program
  7498. `xrn' news reader
  7499. `xterm'
  7500. xterm program
  7501. `xv' picture processor
  7502. `xxgdb'
  7503. y
  7504. `yacc', `yacc'
  7505. yacc
  7506. z
  7507. `zmail' client
  7508. zsh
  7509. |
  7510. `|' symbol, `|' symbol
  7511.  
  7512. This document was generated on 27 September 1999 using the texi2html translator version 1.51.