Linux system administration part 1
What does it take to be a system administrator? Find out how to equip yourself with the skills you need to start a new career in Linux…
Being a system administrator is a sometimes strange and difficult role: nobody knows you exist unless there is a problem, and then it’s instantly considered your fault. Your number one priority is to give your users or developers unimpeded access to the systems you are managing. If there’s a problem it needs to be solved as swiftly and efficiently as possible. You need to see and understand the problems before they come up. Do you still want to be a system administrator? If so, what are the core tasks? How can you become a better system administrator? We’ll be answering all these questions and more over the next few pages, so get your notepads at the ready…
What is a system admin?
If you ask five system administrators (SAs) to define their jobs, you will almost certainly get five different answers. This happens because system administrators do a wide variety of different tasks from day to day.
Strictly speaking, a system administrator is responsible for the smooth operation of one or more machines. Each machine usually runs a number of UNIX services, some of them being more critical than others, and may have a number of users or developers utilising it.
SAs are often also called network administrators, system operators, system engineers, UNIX gurus, and, lately, DevOps. One thing is for sure: system administration is by no means a boring job and rarely are two days alike.
Becoming an SA takes some time, but you don’t have to be a natural technology whizz. You would normally spend six months in on-the-job training before being able to work on a busy production server without being watched like a hawk by a supervisor, but after that things will get easier, provided that you follow the classic ‘measure twice and cut once’ rule.
Being able to productively administer a Linux machine on your own will take around two years of full-time administration experience. The single most critical task of an administrator is defining and implementing good backup practices and it’s something you need to learn quickly.
It is also usual and extremely critical for the system administrator to have the intranet or web services of the company up and running at all times. A website is usually considered a reflection of the reliability of the company it represents, so it’s a responsibility that’s naturally high priority.
Working with log files
Log files are an essential part of every Linux system. As an SA it’s considered very good practice to watch log files on a daily basis, especially when you are administering a new machine, in order to better understand its status and detect potential problems or bottlenecks. Which logs should you watch? Here are three of the most important…
Step 01 /var/log/syslog
This is the main system log file. It is the first log file that you should look at when searching for error messages or other important information. It usually contains three types of messages: errors, warnings and informative text. Apart from the active syslog file, the default log rotation setup also keeps the last seven syslog files, compressed. Just use your favourite editor to view the active file. Assuming you’re root (and you want to use nano), just type: nano /var/log/syslog
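As a rough sketch of that daily check, the shell function below pulls the most recent error and warning lines out of a log file. The function name log_problems, the keyword list and the default of 20 lines are assumptions made for illustration; adjust them to the messages your own services write.

```shell
#!/bin/sh
# log_problems [FILE] [COUNT]  (hypothetical helper)
# Print the last COUNT lines (default 20) of FILE (default /var/log/syslog)
# that mention an error, a failure or a warning, ignoring case.
# The keyword list is only a starting point.
log_problems() {
    file=${1:-/var/log/syslog}
    count=${2:-20}
    grep -iE 'error|fail|warn' "$file" | tail -n "$count"
}

# Example (reading syslog usually requires root privileges):
#   log_problems /var/log/syslog 50
```

The compressed copies can be searched in the same way with zgrep, which accepts the same pattern options as grep.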
Step 02 /var/log/apache2 directory
The /var/log/apache2 directory contains two log files that are related to the Apache web server. The access.log file records all requests processed by the Apache web server. It is commonly used for generating web server statistics. The error.log file records diagnostic messages and errors that occurred while Apache was processing an HTTP request. If you have a problem with a webpage, this should be the first place to look for information. Virtual domains usually save their own access.log and error.log files separately, as described in their definition.
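Simple statistics can be pulled from access.log with standard tools. The sketch below assumes the log uses Apache's common or combined log format, where the client address is the first field and the HTTP status code is the ninth; the function names top_clients and status_counts are made up for this example.

```shell
#!/bin/sh
# top_clients FILE [N]  (hypothetical helper)
# Print the N client addresses (default 10) that made the most requests,
# each preceded by its request count.
top_clients() {
    awk '{ print $1 }' "$1" | sort | uniq -c | sort -rn | head -n "${2:-10}"
}

# status_counts FILE  (hypothetical helper)
# Print how many requests ended with each HTTP status code.
# Assumes the status code is the ninth whitespace-separated field.
status_counts() {
    awk '{ count[$9]++ } END { for (code in count) print code, count[code] }' "$1" | sort
}

# Example:
#   top_clients /var/log/apache2/access.log 5
#   status_counts /var/log/apache2/access.log
```

A sudden rise in 404 or 500 responses in the second report is usually a good reason to open error.log next.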
Step 03 /var/log/debug
This file contains debugging and warning log messages. Keep in mind that the kind of data written to each system log file is defined by rsyslog inside /etc/rsyslog.conf. The rsyslog service replaces and improves the original UNIX syslog service. You need root privileges to make changes to its configuration file.
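To see which messages end up in which file, it helps to read the configuration without its comments. The helper below is a minimal sketch (the name active_rules is an assumption) that prints only the lines that are neither blank nor commented out.

```shell
#!/bin/sh
# active_rules [FILE]  (hypothetical helper)
# Print the lines of an rsyslog configuration file that are neither blank
# nor comments, i.e. the directives and rules that are actually in effect.
# FILE defaults to /etc/rsyslog.conf.
active_rules() {
    grep -Ev '^[[:space:]]*(#|$)' "${1:-/etc/rsyslog.conf}"
}

# Example:
#   active_rules /etc/rsyslog.conf
```

On some distributions the main file includes further rule files from a directory such as /etc/rsyslog.d/, in which case the same function can be run on each of those files.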
The core tasks of a system administrator
An SA keeps one or more machines secure, helps developers when needed, checks the security of the whole network, and automates key tasks with scripts and other tools. They’ll also be responsible for setting up new Linux machines, adding and deleting users and creating new domains (DNS) where needed.
They’re also expected to keep their users happy and try – usually in vain – to make their boss happy. That’s before wrangling log files to extract pertinent information, making and testing backup and restore procedures, updating systems, documenting their practices and administering databases. We haven’t included things like solving network problems, optimising hardware, installing new software and having the right answers to questions about virtualisation and the cloud. The list is never-ending, because new technologies appear every day and they’re expected to keep up with the hot new trends in the field. Don’t let it put you off though – an SA doesn’t deal with all of these tasks on a daily basis, but is responsible for all of them. Being able to prioritise tasks and manage your time effectively are skills you need to have in spades.
Essential DNS tools
One of the most important parts of the internet is the Domain Name System (DNS), which connects one or more IP addresses with a (domain) name and has the structure of a big tree. DNS is so important because without it the internet as we know it cannot operate.
The main DNS-related command line tools are host, nslookup and dig, and all three have broadly similar capabilities. You should consult their man pages to learn about the different command line parameters.
If you want to find more information about the linuxuser.co.uk domain, you can try one of the following three commands:
$ host linuxuser.co.uk
$ dig linuxuser.co.uk
$ nslookup linuxuser.co.uk
The root domain of the internet is specified by a dot, so if you want to find out the DNS servers of the root domain, you just have to execute the following command:
$ host -t ns .
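The same kind of query can be scripted. The following sketch loops over a few record types with dig's +short option, which prints only the answers; the function name dns_summary and the choice of record types are assumptions made for illustration.

```shell
#!/bin/sh
# dns_summary DOMAIN  (hypothetical helper)
# Print the A, AAAA and NS records of DOMAIN, one block per record type.
# "dig +short" prints only the answer data, one record per line.
dns_summary() {
    domain=$1
    for type in A AAAA NS; do
        echo "== $type records for $domain"
        dig +short "$domain" "$type"
    done
}

# Example:
#   dns_summary linuxuser.co.uk
```

An empty block simply means the domain has no records of that type.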
Network data and tcpdump
Network data is formatted in network packets. Tcpdump is a command line utility based on libpcap that captures network data.
The header size of a packet may vary depending on the protocol and the header options. Older versions of tcpdump captured only the first 68 bytes of each packet by default (96 bytes on builds with IPv6 support), whereas releases from the 4.x series onwards default to a snapshot length large enough to hold the whole packet. You can set the length yourself with the -s parameter: -s 1514 captures full Ethernet frames, as 1514 bytes is the maximum length of a standard Ethernet frame, but most of the time you do not need the full packet.
The reason for limiting the snapshot length is that the interest usually lies in the header parts of the packet, which usually fit within the first 68 to 96 bytes.
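A header-only capture with an explicit snapshot length might look like the sketch below. The function names, the 96-byte length and the interface name eth0 in the example are assumptions; substitute the values that suit your network.

```shell
#!/bin/sh
# capture_headers IFACE COUNT FILE  (hypothetical helper)
# Save the first 96 bytes of COUNT packets seen on IFACE to FILE.
#   -n  do not resolve addresses to names
#   -i  interface to listen on
#   -s  snapshot length in bytes
#   -c  stop after this many packets
#   -w  write the raw packets to a file instead of printing them
capture_headers() {
    tcpdump -n -i "$1" -s 96 -c "$2" -w "$3"
}

# read_capture FILE  (hypothetical helper)
# Print the packets stored in a capture file.
read_capture() {
    tcpdump -n -r "$1"
}

# Example (capturing normally requires root privileges):
#   capture_headers eth0 100 /tmp/headers.pcap
#   read_capture /tmp/headers.pcap
```

Writing to a file first and reading it back later keeps the capture itself fast and lets you examine the same packets as often as you need.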
How to be a better system administrator
You can’t beat a bit of common sense, regardless of your line of work, and these guidelines should help keep you on track.
Be on top of your game
Although it may sound difficult, you should always try to learn new things because it will keep you fresh and give you better insight into system administration as a whole. Do not ignore new technologies, new Linux distributions or new operating systems.
Keep calm and administrate
Do not attempt to solve critical problems when you’re feeling exhausted, sleepless or don’t have a clear mind. If you’re not sure what to do, try sleeping on it, talking to a colleague or seeing what Google has to say on the matter.
Keep systems slim and simple
Do not run unnecessary services on your Linux machines. The more processes you have running, the more likely you are to run into trouble. Keep your systems at bare-bones level and functional as opposed to slow and dangerously difficult to manage.
Test before you deploy
Always have one machine to one side for testing. Check your updates before applying them to a production machine. Never update a production server without having a proper backup, even after you’ve tested it.
Two heads are better than one
You can’t be in the office 24/7/365. Ask a colleague to test your backup and restore scripts and automated tasks for you long before you decide to take a holiday in Timbuktu. It’s also a good time to get feedback on all of your documentation.
The importance of Wireshark and tshark
The best way to analyse network data is using Wireshark or its command line version, tshark. The main advantage of the command line version is that it can be included in a script.
Nowadays there are many things going on inside a computer network. Before you start capturing, it is better to have a given issue in mind that you want to solve or examine.
Wireshark allows you to filter network data at capture time with capture filters, which record only specific types of traffic and so avoid creating huge capture files. There are also display filters, which tell Wireshark to show only the network packets that really matter, so that you look at fewer packets and find what you are looking for more easily.
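The difference between the two kinds of filter is easiest to see with tshark. In this sketch the function names, the interface eth0 and the choice of web traffic are assumptions; the -f option takes a capture filter and the -Y option takes a display filter.

```shell
#!/bin/sh
# capture_web IFACE FILE  (hypothetical helper)
# Capture filter (-f, libpcap syntax): only TCP port 80 traffic is written
# to FILE, so the capture file stays small.
capture_web() {
    tshark -i "$1" -f 'tcp port 80' -w "$2"
}

# show_http_requests FILE  (hypothetical helper)
# Display filter (-Y, Wireshark syntax): FILE still holds every captured
# packet, but only the HTTP requests are printed.
show_http_requests() {
    tshark -r "$1" -Y 'http.request'
}

# Example (capturing normally requires root privileges; stop with Ctrl+C):
#   capture_web eth0 /tmp/web.pcap
#   show_http_requests /tmp/web.pcap
```

Packets excluded by a capture filter are gone for good, whereas a display filter can be changed as often as you like on the same file.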
Email is a very popular and important internet service, and the SA is responsible for its smooth operation. The mail service is based on SMTP (Simple Mail Transfer Protocol).
You should use an MX record of the DNS service in order to define the mail server for a domain or a subdomain, but you also need a server process to receive incoming mail and send outgoing mail. Postfix is currently one of the most popular mail servers, but there are alternatives, including Sendmail, Exim and qmail.
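To check which mail servers a domain advertises, you can query its MX records. The sketch below assumes dig is installed; the function name mx_hosts is made up for this example.

```shell
#!/bin/sh
# mx_hosts DOMAIN  (hypothetical helper)
# List the mail servers of DOMAIN, most preferred first.
# "dig +short DOMAIN MX" prints one "preference hostname." pair per record;
# a lower preference number means the server is tried earlier.
# The trailing dot of each hostname is removed for readability.
mx_hosts() {
    dig +short "$1" MX | sort -n | awk '{ sub(/\.$/, "", $2); print $1, $2 }'
}

# Example:
#   mx_hosts linuxuser.co.uk
```

If the command prints nothing, the domain has no MX records.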
10 essential command line tools
The power of UNIX comes from combining its tools. The more you know, the easier your job will be.
Grep is a useful tool for searching text files, including log files, using regular expressions.
The netstat command shows network status. It is very useful for showing the active network connections on a machine.
The traceroute command shows the route of a network packet to a given host. It is used for troubleshooting network problems.
The find command is useful for searching for files and directories in a directory hierarchy. You must learn this command.
Tripwire is a security tool that detects changes to files and is very useful for knowing if your website has been hacked.
An open source file synchronisation tool for text and binary files, this tool is great when you are working with more than two computers and want your files synchronised.
Drush is the CLI of Drupal, a popular CMS. If you are a Drupal user, you should learn drush as it is a great time-saver.
The Graphviz command line tools help you present graph structures and can be used in networking, databases, web structures, embedded in scripts and much more.
The top command displays and updates sorted information about system processes. It’s a great window into your system.
vi or Emacs
Both vi and Emacs are very popular text editors among system administrators. Learn one of them well and never look back!
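As a small illustration of combining tools, the sketch below joins find and grep to answer a typical question: which recently changed files mention a given message? The function name files_mentioning and its defaults are assumptions made for this example.

```shell
#!/bin/sh
# files_mentioning PATTERN DIR [DAYS]  (hypothetical helper)
# List the regular files under DIR that were modified within the last DAYS
# days (default 1) and whose contents match the extended regular
# expression PATTERN.
#   find -mtime -N   selects files modified less than N days ago
#   grep -l          prints only the names of the matching files
files_mentioning() {
    find "$2" -type f -mtime "-${3:-1}" -exec grep -lE -e "$1" {} +
}

# Example: which logs touched in the last two days mention a failed login?
#   files_mentioning 'authentication failure' /var/log 2
```

Each tool does one job here: find narrows the set of files by type and age, and grep looks inside only those files.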