Official website for Linux User & Developer
Mar 1

Monitor network traffic tutorial

by Mihalis Tsoukalos

Use tshark to examine network traffic, solve network difficulties and add network data to a MongoDB database

Gerald Combs created Ethereal, the ancestor of Wireshark, back in 1998. When he started a new job in 2006, he could no longer use the name Ethereal, so he renamed his tool Wireshark. The rest is history!

This tutorial will introduce you to tshark, the command-line version of Wireshark, which is a very popular and capable network protocol analyser. The main advantage of tshark is that it can be used in scripts. Its main disadvantage is that it does not have a GUI.

You can get tshark either from its website – by compiling its source code – or directly from your Linux distribution. The second way is quicker and simpler.

If you try to run tshark as a normal user, you may not be able to use any network interfaces for capturing network traffic due to UNIX permissions. The author finds it more convenient to run tshark as root (sudo tshark) when capturing data and as a normal user when analysing network data.

Before you start capturing, it is better to have a given issue that you want to solve or examine in mind. This is the first step for a successful network traffic analysis.

If you are already familiar with Wireshark, learning how to use tshark will be easy for you. Having a good knowledge of TCP/IP comes in handy too.

Capture network data and display it on tshark

Resources

tshark
Wireshark
DHCP, RFC 2131
Display Filter Reference

Step-by-step

Step 01 Installing and running tshark

In order to install tshark on a Debian 7 system, you just have to run the following command as root:

# apt-get install tshark

To find out if tshark is properly installed, as well as its version, you can execute this command:

$ tshark -v

Step 02 Capturing network data using tshark

Tshark can be used as a replacement for tcpdump, which is the industry standard for network data capturing. Apart from the capturing part where both tools are equivalent, tshark is more powerful than tcpdump and therefore if you want to learn just one tool, tshark should be your choice.

The first command you should run is tshark -D to list the available network interfaces.

The simplest way of capturing data is by running the tshark command without any parameters. You will get the output on screen – which, as you can easily understand, is not helpful at all!

Step 03 Two command-line parameters

The single most useful command-line parameter is -w, followed by a filename. This parameter allows you to save network data to a file for later processing.

The following tshark command captures 500 network packets (-c 500) and saves them into a file called test.pcap (-w test.pcap):

$ tshark -c 500 -w test.pcap

Another useful parameter is -r, followed by a filename, which allows you to read and analyse a previously captured file.

Step 04 Applying filters during capturing

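Tshark allows you to filter network data while capturing, so that only specific types of traffic are saved and you avoid creating huge capture files. This can be done using the -f command-line parameter followed by a capture filter in double quotes. Capture filters use the libpcap syntax that tcpdump also uses, for example "tcp port 80".

Do not confuse this syntax with TCP-related field names such as tcp.port, for filtering on the source or the destination TCP port; tcp.srcport, for checking the TCP source port; and tcp.dstport, for checking the destination port. These field names belong to the Display Filters described in the next step; the corresponding capture filter expressions are "tcp port", "tcp src port" and "tcp dst port", each followed by a port number.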

Generally speaking, applying a filter after data capturing is considered more practical and versatile than filtering during the capture stage because most of the time you do not know in advance what you want to inspect. Nevertheless, using filters during network capturing can save you time and disk space and that is the main reason for using them.

Remember that the filter strings should always be written in lower case.
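
As a rough illustration of how the capturing parameters fit together in a script, the following Python sketch assembles a tshark command line from -c, -w and -f. The function name, the eth0 interface and the "tcp port 80" filter are assumptions made for this example only; the -i parameter selects one of the interfaces listed by tshark -D. Nothing is executed here, the sketch only builds and prints the command.

```python
import shlex


def build_capture_command(count, outfile, capture_filter=None, interface=None):
    """Return the argument list for a tshark capture (nothing is run here)."""
    argv = ["tshark"]
    if interface is not None:
        argv += ["-i", interface]       # interface name, as listed by tshark -D
    argv += ["-c", str(count)]          # stop after this many packets
    if capture_filter is not None:
        argv += ["-f", capture_filter]  # libpcap-style capture filter
    argv += ["-w", outfile]             # save the raw packets for later analysis
    return argv


# Hypothetical example: capture 500 packets of web traffic on eth0.
command = build_capture_command(500, "test.pcap", "tcp port 80", interface="eth0")

# Show the command the way it would be typed in a shell.
print(" ".join(shlex.quote(part) for part in command))
```

Keeping the command as a list means it can later be handed to subprocess.run() without worrying about how the shell treats the spaces inside the filter.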

Step 05 Applying filters after network capturing

Filters that are applied after data capturing are called Display Filters by tshark and Wireshark. You should use the -R command-line parameter followed by the Display Filter in double quotes. Note that tshark 1.10 and later versions use the -Y parameter for this purpose instead.

The http.response.code != 404 display filter searches for HTTP traffic with a response code not equal to 404. The tcp.port == 80 && ip.src == 192.168.2.2 display filter searches for TCP traffic that both uses port number 80 and comes from the 192.168.2.2 IP address. If there is an error in your Display Filter, tshark will let you know by displaying an error message.

As you can easily understand, the possibilities are endless and only depend on your imagination and the problem you are trying to solve.

If you deeply understand Display Filters and have a good knowledge of TCP/IP and networks then network problems will not be a problem!
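
Display Filters are plain strings, so a script can compose them from smaller conditions. The Python sketch below is one possible way to do that; the helper names are invented for this example. It assumes that tshark 1.10 or later expects -Y for a Display Filter, while older releases take -R as described above, which is what the legacy switch selects.

```python
def all_of(*conditions):
    """Join display-filter conditions with && (logical AND).

    Each condition is wrapped in parentheses so that an || inside one
    condition cannot change the meaning of the combined filter.
    """
    return " && ".join("(" + condition + ")" for condition in conditions)


def build_read_command(infile, display_filter, legacy=False):
    """Return the argument list for reading a capture file with a filter."""
    flag = "-R" if legacy else "-Y"  # -R for older tshark releases
    return ["tshark", "-r", infile, flag, display_filter]


# The second filter from the text, built from its two conditions.
web_from_host = all_of("tcp.port == 80", "ip.src == 192.168.2.2")
print(web_from_host)
print(build_read_command("test.pcap", web_from_host, legacy=True))
```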

Step 06 Exporting captured data into a readable format

Imagine that you want to extract the frame number, the relative time of the frame, the source IP address, the destination IP address, the protocol of the packet and the length of the network packet from previously captured network traffic. The following tshark command will do the trick for you:

$ tshark -r login.tcpdump -T fields -e frame.number -e frame.time_relative -e ip.src -e ip.dst -e frame.protocols -e frame.len -E header=y -E quote=n -E occurrence=f

The -E header=y option tells tshark to first print a header line, the -E quote=n option tells tshark not to enclose the data in quotes and the -E occurrence=f option tells tshark to use only the first occurrence of fields that occur more than once.
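
Output in this format is easy to process further. The Python sketch below shows one way of turning it into a list of dictionaries, relying on the fact that tshark separates fields with a tab character by default. The two sample rows are made up for illustration and do not come from login.tcpdump.

```python
import csv
import io


def parse_fields(text):
    """Turn tab-separated tshark field output (with a header line) into dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    return list(reader)


# Hand-made sample in the layout produced by -T fields -E header=y.
sample = (
    "frame.number\tframe.time_relative\tip.src\tip.dst\tframe.protocols\tframe.len\n"
    "1\t0.000000000\t192.168.2.2\t192.168.2.1\teth:ip:tcp\t74\n"
    "2\t0.000213000\t192.168.2.1\t192.168.2.2\teth:ip:tcp\t74\n"
)

rows = parse_fields(sample)
total_bytes = sum(int(row["frame.len"]) for row in rows)
print(len(rows), "frames,", total_bytes, "bytes in total")
```

Frames without an IP layer, such as ARP, leave ip.src and ip.dst empty, so a real script should be prepared for empty strings in those columns.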

Step 07 Solving a DHCP problem

The problem: some computers on a network could not connect to the network although other computers were okay. All computers were using the DHCP protocol to get their network settings. The IP of the official DHCP server was 10.0.10.10.

DHCP is short for Dynamic Host Configuration Protocol and is a protocol that provides configuration information to hosts on TCP/IP networks. DHCP is based on BOOTP (the Bootstrap Protocol) and extends it by adding more capabilities. Both DHCP and BOOTP use the UDP protocol with UDP ports 67 and 68.

Step 08 More about the DHCP protocol

The first packet of a usual DHCP transaction between a DHCP client and a DHCP server (IP 192.168.1.1) has a DHCPDISCOVER message from the machine searching for a DHCP server. Since the machine does not have an IP address yet, the source IP of the packet is 0.0.0.0 and the destination IP is the broadcast IP address (255.255.255.255).

What distinguishes the network card of a machine from another network device found in the same LAN is its MAC address, which is unique. Therefore the DHCPDISCOVER message should include the MAC address of the device requesting a DHCP server.

The next message is the DHCPOFFER from the DHCP server (IP 192.168.1.1) and is a broadcast message since the client machine still has no IP address.

Then the client machine requests from the DHCP server configuration parameters with the DHCPREQUEST message. Next, the DHCP server sends a DHCPACK message back to the client machine that includes all the configuration parameters. From now on, the DHCP client can use the offered configuration information and any parameter that is unique to that particular machine, like the IP address, is reserved by the DHCP server and is not offered to any other networked device.
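
The four-message exchange described above can be written down as a simple check. The following Python sketch uses the DHCP message type codes defined for option 53 in RFC 2132; the function name is an assumption of this example, and the sketch ignores retransmissions and lease renewals.

```python
# DHCP message type codes (option 53), as listed in RFC 2132.
DHCP_MESSAGE_TYPES = {
    1: "DHCPDISCOVER",
    2: "DHCPOFFER",
    3: "DHCPREQUEST",
    4: "DHCPDECLINE",
    5: "DHCPACK",
    6: "DHCPNAK",
    7: "DHCPRELEASE",
    8: "DHCPINFORM",
}

# The usual order of a successful transaction.
EXPECTED_SEQUENCE = ["DHCPDISCOVER", "DHCPOFFER", "DHCPREQUEST", "DHCPACK"]


def is_normal_handshake(type_codes):
    """Return True if the codes form exactly the usual four-message exchange."""
    names = [DHCP_MESSAGE_TYPES.get(code, "UNKNOWN") for code in type_codes]
    return names == EXPECTED_SEQUENCE


print(is_normal_handshake([1, 2, 3, 5]))     # one offer, then request and ack
print(is_normal_handshake([1, 2, 2, 3, 5]))  # a second offer appears
```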

Step 09 Solving the problem

Tshark output shows that there were two DHCPOFFER messages on the network from two different IP addresses (192.168.1.254 and 10.0.10.10) instead of only one DHCPOFFER message from the legitimate 10.0.10.10 DHCP server! This was the first truly useful hint for solving the actual problem.

As the DHCP server did not get any answer from the client, it re-sent the DHCPOFFER message (packet number 6), but as you can see, it was already too late (packet number 4)!

The IP address of the ‘extra’ DHCP server was 192.168.1.254. The 192.168.1.254 DHCP server offered the 192.168.1.60 IP address to the machine. As you can guess, all computers that could not properly connect to the network were getting IPs in the 192.168.1.1-253 range.

The client machine preferred the wrong DHCP server to get its information. The reason for choosing the 192.168.1.254 DHCP server was that it responded first! Pretty naive reason, yet it caused many problems.

After finding out that a second DHCP server had triggered the problem, it was easy to track down the computer responsible. This particular computer was running a Linux virtual machine (VM). The OS on the VM had its DHCP server running and that was the cause of the problem! Pretty tricky, don’t you think?
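
The same check can be automated once the DHCP packets have been exported from tshark. The Python sketch below is only an illustration of the idea: it assumes that each packet has already been reduced to a frame number, a source IP and a message name, and the frame numbers in the sample are invented. Only the two server addresses come from the case described above.

```python
def offering_servers(packets):
    """Return the source IPs of DHCPOFFER packets, in order of first appearance."""
    seen = []
    for frame_number, source_ip, message in packets:
        if message == "DHCPOFFER" and source_ip not in seen:
            seen.append(source_ip)
    return seen


def rogue_servers(packets, legitimate):
    """Return the offering servers that are not in the legitimate set."""
    return [ip for ip in offering_servers(packets) if ip not in legitimate]


# Illustrative data: frame numbers are made up, server IPs are from the text.
packets = [
    (1, "0.0.0.0", "DHCPDISCOVER"),
    (2, "192.168.1.254", "DHCPOFFER"),
    (3, "10.0.10.10", "DHCPOFFER"),
]

print(rogue_servers(packets, legitimate={"10.0.10.10"}))
```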

Step 10 Creating a Perl script that uses tshark

The purpose of the checkIP.pl Perl script is to find invalid IP addresses. The checkIP.pl script assumes that the network data is already captured with tshark.

Several steps are needed in order to solve the problem. The first step is reading the file with the network data. Next, the script runs the tshark binary with the appropriate command-line arguments, using the following Perl commands:

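# Build the tshark command that prints the frame number and both IP addresses
my $command = "$TSHARK_BINARY -r $filename -T fields -e frame.number -e ip.src -e ip.dst -E header=y -E quote=n -E occurrence=f";
# Run it and keep every output line (the first one is the header) in @netDATA
my @netDATA = `$command`;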

Step 11 More about the Perl script

The next step is reading the output of the tshark command that was saved in the @netDATA variable line by line. After cleaning up the input lines from unnecessary space characters and parsing them, the script uses the Data::Validate::IP Perl module for catching erroneous IP addresses and then prints a warning with the packet number on screen:

if ( ! is_ipv4($sourceIP) )
{
   print "Packet number $frameNumber contains a bogus source IP!\n";
}

You can alter the script in order to catch the type of errors you want, such as traffic from unwanted hosts or traffic to specific TCP or UDP ports.
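
For readers who prefer Python, the same idea can be sketched with the standard ipaddress module. This is not a port of checkIP.pl, whose full source is not shown here; the function names and the sample lines are assumptions of this example. Note that a frame without an IP layer produces an empty field, which this sketch also reports as invalid.

```python
import ipaddress


def is_ipv4(text):
    """Return True if text is a valid dotted-quad IPv4 address."""
    try:
        ipaddress.IPv4Address(text)
    except ValueError:
        return False
    return True


def bogus_source_frames(lines):
    """Return the frame numbers whose source IP field is not a valid IPv4 address."""
    bad = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3:
            continue                      # not a frame/source/destination line
        frame_number, source_ip, _destination_ip = (f.strip() for f in fields)
        if not frame_number.isdigit():
            continue                      # skips the header printed by header=y
        if not is_ipv4(source_ip):
            bad.append(frame_number)
    return bad


# Hand-made sample: frame 2 has no IP layer, frame 3 has a damaged address.
lines = [
    "frame.number\tip.src\tip.dst\n",
    "1\t192.168.2.2\t192.168.2.1\n",
    "2\t\t\n",
    "3\t300.1.2.3\t192.168.2.1\n",
]

for number in bogus_source_frames(lines):
    print("Packet number", number, "contains a bogus source IP!")
```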

Step 12 Inserting network data into a MongoDB database

The Python script supplied (on the cover disc) inserts network data into a MongoDB database for further processing and querying. The name of the script is insertMongo.py and it assumes that the network data is already captured with tshark or tcpdump. The next shell command runs the Python script with input from tshark:

$ tshark -r ~/1000.pcap -T fields -e frame.number -e ip.src -e ip.dst -e frame.len -E header=n -E quote=n -E occurrence=f | python insertMongo.py
Total number of documents written: 1000

The 1000.pcap file contains 1,000 network packets and the script informs you that there were 1,000 documents written in MongoDB, so you know that everything is okay. You can now start querying the MongoDB database!
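
The insertMongo.py script itself is not reproduced here, so the following Python sketch only illustrates how such a script might work; it should not be read as the supplied script. It assumes the third-party pymongo driver, a MongoDB server listening on localhost with default settings, and hypothetical database and collection names (network and packets). The field order matches the tshark command above.

```python
import sys

# Field names for the four columns produced by the tshark command above.
FIELDS = ("frame_number", "ip_src", "ip_dst", "frame_len")


def line_to_document(line):
    """Convert one tab-separated line into a dict, or None if it is malformed."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(FIELDS):
        return None
    document = dict(zip(FIELDS, values))
    try:
        document["frame_number"] = int(document["frame_number"])
        document["frame_len"] = int(document["frame_len"])
    except ValueError:
        return None
    return document


def insert_lines(lines, collection):
    """Insert every well-formed line into collection; return how many were written."""
    documents = [d for d in map(line_to_document, lines) if d is not None]
    if documents:                       # insert_many() rejects an empty list
        collection.insert_many(documents)
    return len(documents)


def main():
    """Read tshark output from standard input and store it in MongoDB."""
    from pymongo import MongoClient     # third-party driver, imported lazily

    client = MongoClient()              # assumes a server on localhost
    collection = client["network"]["packets"]  # hypothetical names
    written = insert_lines(sys.stdin, collection)
    print("Total number of documents written:", written)
```

To use the sketch in a pipeline like the one shown above, add a call to main() at the end of the file. Separating the parsing from the database call also makes it possible to try insert_lines() with any object that offers an insert_many() method.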
