No operating system comes with every application you need right out of the box. Being able to install, upgrade, and remove software is a necessary skill for any server administrator. One of the advantages of Linux is that there is no difference between installing operating system updates and components and installing third-party applications. That is because, with the exception of the kernel, a Linux distribution is nothing more than a collection of third-party programs. Software is distributed in one of two forms: as source code or as a package file.
Obtaining Source Code Software
In the Windows world, a programmer writes a program in a language such as C++ or Visual Basic and then compiles it for a specific hardware architecture (most commonly a 32-bit Intel platform). Compiling is the process of transforming a human-readable program into machine code that a processor can execute. The compiled program (also called a binary, because machine code consists of nothing but 0s and 1s) is then either sold or distributed as shareware or freeware. Because developers keep their source code private, programs cannot easily be reverse engineered, nor can the technology in them be copied by a competitor. However, if an end user finds a bug or requests a new feature, they must wait for the developer to code and distribute the update.
The majority of software written for Linux is licensed under the GNU General Public License (GPL). Among other things, anyone who distributes a program licensed under the GPL must also make its source code available. Unlike closed-source commercial programs, open source programs give businesses and individuals the ability to make custom programming changes to meet their specific needs. With access to the source code, software can be compiled for whatever hardware you are using and can be optimized by compiling in only the features you will use. For example, if your server is run strictly from the command prompt, you could compile a program without any support for the X Window System. By not linking shared libraries that will never be used, you can potentially improve the performance of the application.
When developers want to distribute the source code of an application, they will usually create a tar archive of all the necessary files. The tar utility collects several files and directories into a single file while retaining permission and ownership attributes. This tar file is then compressed to make it easier to download. In most cases the program used to do the compression is gzip. The resulting compressed archive has the extension .tar.gz (or .tgz for short) and is commonly referred to as a tarball.
Installing software from a tarball involves two main steps. First you must unpack the install files, and then you need to compile them. A gzipped file can be uncompressed either by using gunzip or by using gzip with the -d option. There are other types of file compression in use. The following table provides some of the key details:
| Type | Compress Command | Uncompress Command | Description |
| Standard Unix Compression | compress | uncompress | Mainly used on older Unix systems. Lower compression ratio than other types. |
| Gzip | gzip | gunzip (or gzip -d) | Most common form of compression in Linux. Provides a good balance of file compression and processing time. |
| Bzip2 | bzip2 | bunzip2 (or bzip2 -d) | Highest compression ratio of the three. Also requires the most processor time. Bzip2 is a newer algorithm and is not as widespread, particularly in older distributions. |
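As a minimal sketch of the gzip row above, the following round trip compresses a scratch file and then restores it. The temporary directory and the file name notes.txt are placeholders chosen for illustration and do not come from this TechNote.

```shell
# Scratch directory and file; both names are hypothetical.
gz_work=$(mktemp -d)
printf 'hello tarball\n' > "$gz_work/notes.txt"
cp "$gz_work/notes.txt" "$gz_work/original.txt"

# gzip replaces notes.txt with notes.txt.gz.
gzip "$gz_work/notes.txt"
ls "$gz_work"

# gzip -d (equivalent to gunzip) restores notes.txt and removes the .gz file.
gzip -d "$gz_work/notes.txt.gz"

# Confirm the restored file matches the copy saved earlier.
if [ "$(cat "$gz_work/notes.txt")" = "$(cat "$gz_work/original.txt")" ]; then
    echo "round trip OK"
fi
```

The bzip2 and bunzip2 pair should follow the same pattern on systems where those tools are installed.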
After unzipping a .tgz file you can use the tar command to extract files from the archive. Tar has three main operating modes: create, extract, and list. One of these modes must be specified when using tar. There are a number of options that can be specified, the most common of which are listed:
| -c | Used to create an archive |
| -x | Used to extract files from an existing archive |
| -t | Used to list files in an existing archive |
| -f | Used to specify a file or device to use when creating, extracting, or listing |
| -v | Used to display verbose output |
| -z | Used to automatically compress or uncompress an archive with gzip |
| -j | Used to automatically compress or uncompress an archive with bzip2 |
To create a file called config.tar that contains all the files in the /etc directory, you would type:
tar -cvf config.tar /etc
If the -f switch is not specified, tar will try to read from or write to the default tape device. To extract files from a tar archive that you have downloaded, type:
tar -xvf download.tar
Tar can also extract files directly from a .tgz file without your having to uncompress it first. To do this, add the -z switch:
tar -xvzf download.tgz
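The create and extract commands can be tried safely on scratch data. The sketch below assumes a throwaway directory named project (hypothetical) and uses the -C option, which is not covered in this TechNote, to make tar change directory before archiving or extracting.

```shell
# Scratch source and destination directories; all names are hypothetical.
tar_src=$(mktemp -d)
tar_dest=$(mktemp -d)
mkdir "$tar_src/project"
printf 'int main(void) { return 0; }\n' > "$tar_src/project/main.c"

# Create a gzipped tarball in one step (-c create, -z gzip, -f file).
# -C switches to $tar_src first, so member names are stored as "project/...".
tar -czf "$tar_src/project.tgz" -C "$tar_src" project

# Extract it in one step (-x extract, -z gunzip, -f file) into $tar_dest.
tar -xzf "$tar_src/project.tgz" -C "$tar_dest"

ls "$tar_dest/project"
```

Because the archive was created from inside the scratch directory, the files land under whichever directory you extract into.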
A tar archive can be packed using relative or absolute pathnames. Relative pathnames mean that if the person who created the tar file did so from the /usr directory on their machine, your current directory will also need to be /usr when you extract the archive in order to restore files where they were intended to be. If the tar file was created using absolute pathnames, the files will be installed in the same place no matter what your current directory is. (GNU tar strips the leading / from pathnames by default unless the -P option is given, so archives made with it normally hold relative pathnames.) Since the filename alone does not tell you which kind you have, it is a good idea to list the contents of a tar file before extracting it. This is done with the -t option:
tar -tvf download.tar
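One way to act on that listing is to check whether any member name begins with a slash before extracting. This is a rough sketch on a scratch archive; the directory pkg and the variable path_style are hypothetical names, and -v is left out so that each output line is just the pathname.

```shell
# Build a small scratch archive to inspect; names are hypothetical.
list_demo=$(mktemp -d)
mkdir "$list_demo/pkg"
echo data > "$list_demo/pkg/file.txt"
tar -cf "$list_demo/download.tar" -C "$list_demo" pkg

# List member names only (-t list, -f file) and look for a leading slash.
if tar -tf "$list_demo/download.tar" | grep -q '^/'; then
    path_style=absolute
else
    path_style=relative
fi
echo "archive uses $path_style pathnames"
```

If the result is relative, change to the intended target directory before extracting.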
Installing Software from Source Code
Once you have the source code of a program on your hard disk you must compile it. In order to compile a program you must have a Makefile. This is a list of rules that tells the make utility which files to compile and link and where files are located. Most source code applications come with a generic Makefile and a configure script. The configure script probes your individual system for the information needed to create a usable Makefile. The configure script typically determines system-specific things like:
- The version and location of shared libraries (e.g., libc)
- Whether other required software is installed (e.g., PHP, MySQL)
- Processor type
To run the configure script, change your current directory to the directory where you unpacked the source and type:
./configure
Depending on the size of the application, this can take a minute or two. Be sure to check the output for any errors or warning messages. If this step completes successfully you can begin compiling the source code by typing:
make
This step generates binary files from the source code using the instructions provided in the Makefile and can be a lengthy process. Again, make sure there are no errors before progressing. Once the source code has been compiled, type:
make install
This installs the compiled programs to their proper locations in the filesystem hierarchy, which typically requires root privileges. The final step is to type:
make clean
This cleans up any temporary files that were created during the compilation process.
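The whole sequence can be chained so that each step runs only if the previous one succeeded. The sketch below is an illustration under stated assumptions: it builds a throwaway project whose configure script is a stub written on the spot (a real one is far more involved), and it passes --prefix, a standard option of autoconf-generated configure scripts that this TechNote does not cover, so that nothing is installed outside a scratch directory. The variable build_status is a hypothetical name.

```shell
# Throwaway source tree and install prefix; all names are hypothetical.
build_src=$(mktemp -d)
build_prefix=$(mktemp -d)

# Stub standing in for a real configure script: it only writes a Makefile.
cat > "$build_src/configure" <<'EOF'
#!/bin/sh
prefix=${1#--prefix=}
printf 'all:\n\techo built > hello.out\n' > Makefile
printf 'install: all\n\tcp hello.out %s/hello.out\n' "$prefix" >> Makefile
printf 'clean:\n\trm -f hello.out\n' >> Makefile
EOF
chmod +x "$build_src/configure"

# Run the four steps in order; && stops the chain at the first failure.
build_status=skipped
if command -v make >/dev/null 2>&1; then
    ( cd "$build_src" &&
      ./configure --prefix="$build_prefix" &&
      make &&
      make install &&
      make clean ) && build_status=ok || build_status=failed
fi
echo "build: $build_status"
```

Installing under a prefix in your home directory is also a common way to avoid needing root privileges for make install.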
If you prefer not to go through the trouble of compiling software, you can install it from a package. The two most prevalent types of packages are RPM and Debian packages. The remainder of this TechNote focuses on the Red Hat Package Manager (RPM).
RPM files are self-installing packages of precompiled software. You need to make sure to obtain an RPM that has been compiled for your particular hardware architecture. Since RPM packages follow a standard naming convention, this information can easily be determined. Each RPM package is named in the following format:
<name>-<version>-<release>.<architecture>.rpm
Optionally, an RPM will list a specific distribution in the name, indicating that it has been compiled for a specific flavor of Linux.
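Because the naming convention is regular, the architecture can be pulled out of a filename with ordinary shell parameter expansion. This sketch assumes a hypothetical file named foo-1.0-1.i686.rpm whose last two hyphen-separated fields sit just before the architecture, as in the format above; the variable names are labels chosen here.

```shell
rpm_file="foo-1.0-1.i686.rpm"   # hypothetical filename

base=${rpm_file%.rpm}    # strip the extension      -> foo-1.0-1.i686
arch=${base##*.}         # text after the last dot  -> i686
nvr=${base%.*}           # everything before it     -> foo-1.0-1
release=${nvr##*-}       # last hyphenated field    -> 1
nv=${nvr%-*}             #                          -> foo-1.0
version=${nv##*-}        # next-to-last field       -> 1.0
name=${nv%-*}            # whatever remains         -> foo

echo "name=$name version=$version release=$release arch=$arch"

# Naive comparison with what this machine reports; uname -m prints the hardware name.
machine=$(uname -m)
if [ "$arch" = "$machine" ] || [ "$arch" = noarch ]; then
    echo "package architecture matches this machine"
else
    echo "package was built for $arch; this machine reports $machine"
fi
```

The comparison is deliberately simple and does not account for compatible architectures, such as 32-bit packages that can run on a 64-bit system.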
The command used to manipulate RPM packages is, appropriately, named rpm. It can be used to install, upgrade, and remove packages. It can also be used to query a package file for information or to determine which packages are installed on a system. Rpm is a powerful and extensive utility. Examples of its most common uses are listed below:
| rpm -i foo-1.0.1.i686.rpm | Installs the package foo. Will produce an error if foo is already installed. |
| rpm -U foo-1.0.1.i686.rpm | Installs the package foo or upgrades it if already installed. |
| rpm -e foo | Uninstalls the package foo. Notice that the complete filename is not used. |
| rpm -F foo-1.0.1.i686.rpm | Upgrades foo only if an earlier version of foo is already installed. |
The -v and -h flags are often used in conjunction with one of the above commands to display verbose output and a progress bar of hash marks (for example, rpm -Uvh foo-1.0.1.i686.rpm).
In reality, installing RPM packages is not always this simple. Almost every package requires that other software (such as a shared library) be installed for it to run. Frequently you will try to install a package only to be told that another package it depends on is missing. This is commonly known as dependency hell, and resolving it can sometimes be just as time consuming as compiling from source code.
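One way to see what a package needs before installing it is to query the package file for its requirements. The sketch below assumes the example file foo-1.0.1.i686.rpm is in the current directory and does nothing if rpm or the file is missing; the variable dep_check is a hypothetical name.

```shell
pkg_file="foo-1.0.1.i686.rpm"   # example filename from the table above

if command -v rpm >/dev/null 2>&1 && [ -f "$pkg_file" ]; then
    # -q query, -p read an uninstalled package file, -R list what it requires.
    rpm -qpR "$pkg_file"
    dep_check=done
else
    echo "rpm or $pkg_file not available; skipping dependency query"
    dep_check=skipped
fi
```

Each capability in the output must be provided by a package that is already installed or that you install first.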
Current related exam topics for the Linux+ exam:
DOMAIN 1.0 Installation
1.9 Manage packages after installing the operating systems (for example: install, uninstall, update) (for example: RPM, tar, gzip)
DOMAIN 2.0 Management
2.8 Perform and verify backups and restores (tar, cpio)
2.13 Repair packages and scripts (for example: resolving dependencies, repairing, installing, updating applications)
DOMAIN 3.0 Configuration
3.4 Configure the system and perform basic makefile changes to support compiling applications and drivers