
3. Concepts of Operation

IDABench isn't an intrusion detection system. It's not an analysis tool. In all honesty, it doesn't really DO anything. Instead, IDABench provides a convenient workbench for human analysts to explore network events using a myriad of tools and techniques. If a certain analysis utility isn't available in IDABench, a plugin API simplifies its integration.

3.1 Raw data capture

IDABench sensors are installed at network ingress/egress points where malicious activity is likely to traverse. The DMZ (Demilitarized Zone) is the area physically between the router into your network and the filtering systems emplaced to defend it. This is the most beneficial location to deploy a sensor, as the majority of the malicious activity that network packet analysis is suited to detecting will traverse this segment. Any other untrusted/trusted network border is a candidate for sensor deployment.

  BIG BAD                     (DMZ)                 SOFT SQUISHY
 UNTRUSTED -------- ROUTER ---------- FIREWALL ---TRUSTED INTERNAL
  NETWORK                       v                     NETWORK
                                v
                                v (sniffing only)
                                v
                                v
                             IDABench ------------ IDABench
                              Sensor               Analyzer
The IDABench sensor is simply any Unix-like system that runs a libpcap-based sniffer program (tcpdump) to record all of the traffic that traverses the network segment it is monitoring. The appropriate network interface is placed in promiscuous mode by the sniffer so that all traffic, regardless of source or destination address, is available for capture. That captured data is compressed on the fly for short-term storage and named according to the date-time group of the current hour. Each hour, using crond(8), the sensor is re-initialized, so that the previous hour's file is closed out and a new file is begun. In this way, the otherwise unwieldy volume of packet data is made available in somewhat more bite-sized chunks.

To lessen the risk of sensor compromise, a special account is created on the sensor and ownership of these dumpfiles is changed to that user. When the dumpfiles are retrieved by the analyzer, this account is used.

This capture process is controlled by the sensor_driver.pl script. There are two required parameters:

[root@spleen sensor]# ./sensor_driver.pl 
        Usage: ./sensor_driver.pl <start|stop|restart> <ALL|site1 . . .>
The first speaks for itself. The second parameter tells sensor_driver.pl which site's sniffer should be started or stopped; ALL selects every configured site.

3.2 Partial captures

If a dump session is aborted and then resumed within the same hour, IDABench will rename the previous partial logfile by appending MMSS to the filename root. Here's an example:

The current dumpfile is tcp.2003032111.gz. The ppp interface being monitored goes down, killing the tcpdump session. When the interface resumes operation at 11:23:01, the interface control script includes a line to restart sensor_driver.pl. IDABench will rename the original file to tcp.2003032111.2301.gz, and a new tcp.2003032111.gz is initialized.
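The naming scheme in this example can be sketched in a few lines of Python. This is only an illustration and not IDABench's own code; the function names are hypothetical, and the "tcp" prefix and ".gz" suffix are taken from the filenames in the example.

```python
from datetime import datetime

def hourly_name(when, prefix="tcp"):
    """Dumpfile name for the hour containing `when`, e.g. tcp.2003032111.gz."""
    return f"{prefix}.{when:%Y%m%d%H}.gz"

def logbit_name(current_name, resumed_at):
    """Name for a partial dumpfile: MMSS is appended to the filename root."""
    root = current_name[:-len(".gz")]          # strip the compression suffix
    return f"{root}.{resumed_at:%M%S}.gz"

# The interface comes back up at 11:23:01 on 21 March 2003.
resumed = datetime(2003, 3, 21, 11, 23, 1)
current = hourly_name(resumed)
print(current)                        # tcp.2003032111.gz
print(logbit_name(current, resumed))  # tcp.2003032111.2301.gz
```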

Although not required, if you have either tcpslice or mergecap installed on your sensor, IDABench will enable merging of those partial hourly dumpfiles, or "logbits". Of the two, mergecap is preferred, as it natively deals with compressed data, and is more fault tolerant. Without them, the logbits will remain on the sensor until removed manually, or by the analyzer's cleanup.pl.

There are two times that the logbits will be considered for merging:

  1. When sensor_driver.pl is executed with the "stop" parameter
  2. When sensor_driver.pl is executed with the "start" or "restart" parameter (they are synonymous) AND the current hour is DIFFERENT from the hour when the logbits were created. This condition exists at the top of every hour when cron restarts the sensor.
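The two conditions above amount to a small decision rule. The Python sketch below restates it; the function name and the idea of passing hours as date-time group strings are assumptions made for illustration, not part of sensor_driver.pl.

```python
def should_consider_merge(action, logbit_hour, current_hour):
    """Return True when logbits would be considered for merging.

    `action` is the first sensor_driver.pl parameter. The two hours are
    date-time groups such as "2003032111" (hypothetical representation).
    """
    if action == "stop":
        # Condition 1: a stop always triggers consideration.
        return True
    if action in ("start", "restart"):
        # Condition 2: start and restart are synonymous, and merging is
        # only considered once the hour has rolled over.
        return logbit_hour != current_hour
    return False

print(should_consider_merge("restart", "2003032111", "2003032111"))  # False
print(should_consider_merge("restart", "2003032111", "2003032112"))  # True
```

A restart within the same hour therefore leaves the logbits alone, which is what allows sites to be added mid-hour without losing partial dumpfiles.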

An added benefit of this behavior is the ability to add "sites" on the fly. By adding a new "SITE_x" section to the sensor.conf and running sensor_driver.pl restart ALL, partial dumpfiles are retained, the new site is added to the logging directory and the sniffers are (re)started.

3.3 Raw data retrieval

The sensor's hourly dumpfiles don't do us much good unless we can open them up and start scrutinizing their contents. Instead of placing that load on the sensor itself (possibly leading to packet loss, if analysis loads are high), the IDABench analyzer reaches out to each sensor and retrieves the dumpfiles.

Secure Shell (SSH or OpenSSH) is used to authenticate the analyzer as well as to encrypt the packet data in transit. The analyzer asks the sensor for the date/time group of the last dumpfile, then uses scp(1) to retrieve it. No passwords are used in that exchange, as the analyzer is configured with a special user account whose public key is placed on each sensor.
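As a rough illustration of the retrieval step, the Python sketch below builds an scp(1) command line for one hourly dumpfile. The account name, host name and directories are hypothetical placeholders, and this is not how fetchem.pl is actually written; only the scp options -B (batch mode, never prompt for a password) and -q (quiet) are real.

```python
def scp_command(user, host, remote_dir, stamp, local_dir):
    """Build an scp argument list that fetches one hourly dumpfile.

    `stamp` is the date/time group reported by the sensor, e.g. "2003032111".
    All other values are site-specific and hypothetical here.
    """
    remote = f"{user}@{host}:{remote_dir}/tcp.{stamp}.gz"
    # -B: batch mode, so key-based authentication either works or fails
    #     without prompting; -q: suppress the progress meter.
    return ["scp", "-B", "-q", remote, local_dir]

cmd = scp_command("idabench", "sensor1", "/var/log/idabench/site1",
                  "2003032111", "/data/idabench/site1")
print(" ".join(cmd))
```

The list can be handed to subprocess.run(cmd, check=True) on a system where the key exchange described above is already in place.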

3.4 Analysis

The IDABench analyzer is a framework for libpcap-based analysis tools to be accessed via an easy-to-use web interface. The two main components are fetchem.pl and search.cgi. These two are run by crond(8) and httpd(8), respectively, and use plugins to interface with analysis tools such as tcpdump. The results are formatted by the plugins and presented to the analyst in html pages containing text, graphics, or links to resultant binary content.

fetchem.pl is responsible for retrieving the hourly dumpfiles from the sensor(s) and making pretty things happen on an hourly basis. It is run as the IDABENCH_USER according to that user's own crontab. Once the file has been secure copied to the analyzer, fetchem.pl runs the necessary plugin binaries. These are determined based on individual site configuration. The dumpfile is decompressed into RAM in fixed-size blocks and fed to the plugin-driven analysis programs, whose results are arranged in the hourly output html file, sorted by plugin name.
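The idea of decompressing a dumpfile in fixed-size blocks and feeding it to an analysis program can be sketched in Python as follows. This is an illustration only, not fetchem.pl's implementation; the function names and the 64 KiB block size are assumptions.

```python
import gzip

def gzip_blocks(path, block_size=64 * 1024):
    """Yield the decompressed contents of `path` in blocks of at most
    `block_size` bytes, so the whole dumpfile never sits in memory at once."""
    with gzip.open(path, "rb") as dump:
        while True:
            block = dump.read(block_size)
            if not block:          # empty read means end of file
                break
            yield block

def feed(path, sink, block_size=64 * 1024):
    """Write every decompressed block to `sink`, a writable binary stream,
    and return the number of bytes written."""
    total = 0
    for block in gzip_blocks(path, block_size):
        sink.write(block)
        total += len(block)
    return total
```

Here `sink` could be the standard input of an analysis process, for example one started with subprocess.Popen(["tcpdump", "-n", "-r", "-"], stdin=subprocess.PIPE), since tcpdump reads a capture from standard input when given "-r -".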

Hint: if you create an array called "pluglist" in a site.ph file, it will override the sorting behavior, giving you more control over the appearance of your webpages.

Hourly plugins are described later in this article.

search.cgi is the primary ad-hoc query interface for IDABench, and builds web forms based on search plugins present. The more plugins you have installed, the more tabs will appear across the top of the search webpage.

When the search form is submitted, the appropriate plugin is called to produce a commandline that will execute its associated utility. That commandline is passed as a parameter to the script pat_search.pl, which is responsible for accessing the archived packet logs and feeding them to the analysis program.
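To make the command-building step more concrete, here is a Python sketch of what a search plugin for tcpdump might do with submitted form fields. The field names ("host", "port") and the function are hypothetical, and the real plugin API and the way pat_search.pl receives the commandline are not shown; only the tcpdump options -n and -r are real.

```python
import shlex

def tcpdump_command_line(form):
    """Turn submitted form fields into a tcpdump commandline string.

    `form` is a plain dict of hypothetical field names to values.
    """
    filter_terms = []
    if form.get("host"):
        filter_terms.append(f"host {form['host']}")
    if form.get("port"):
        # int() rejects anything that is not a plain number.
        filter_terms.append(f"port {int(form['port'])}")
    # -n: no name resolution; -r -: read packets from standard input.
    args = ["tcpdump", "-n", "-r", "-"]
    if filter_terms:
        args.append(" and ".join(filter_terms))
    # shlex.join quotes each argument for safe use in a shell commandline.
    return shlex.join(args)

print(tcpdump_command_line({"host": "192.0.2.10", "port": "80"}))
# tcpdump -n -r - 'host 192.0.2.10 and port 80'
```

Quoting protects the shell, but a real plugin would still need to validate the host field so that a user cannot alter the filter expression itself.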

Output from the selected analysis utility is then prepared for either html display or further post processing (e.g. graphic generation), by an output subroutine in the plugin.

3.5 File maintenance and cleanup

I'm as much of a pack rat as the next geek, but storage resources are finite, and there's a time to clean house. IDABench will take care of some of this housekeeping for you, but other tasks will need manual attention.

Sensor files

The sensors have no mechanism for deletion of files, regardless of age. The task of sensor cleanup is left to the analyzer, via ssh. This protects against accidental data loss in case of analyzer network failure. The analyzer script responsible for deleting old files on the sensor(s) is aptly named cleanup.pl. If your sensors are rather littered with old files for one reason or another, running cleanup.pl -h as the IDABENCH_USER will provide you with the syntax necessary to manually clean up individual files or all files created prior to a certain date.

Analyzer files

packet logs

Analyzer storage has always been a challenge with raw packet logger systems like IDABench. As storage resources are limited, a decision must be made as to when data can be deleted, reduced, or relocated to ensure the continued health and reliability of the system.

editcap, part of the Ethereal distribution, is a utility that makes changes to existing libpcap dumpfiles. One of the edit options available is snaplen. By specifying a new snaplen of 38 and iterating through all files that have surpassed a certain age, all IP header information and 8 bytes of the next-layer header are preserved, while the size of archived files is reduced significantly. editcap can operate directly on compressed files but must write to a new filename or stdout, so a bit of tempfile tapdancing is needed to automate this process.
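One possible shape for that automation is sketched below in Python. It is not part of IDABench: the function names, the "tcp.*.gz" filename pattern and the temporary-file names are assumptions, and the sketch assumes editcap writes an uncompressed file that must be gzipped again before it replaces the original. The -s option is editcap's snapshot-length setting.

```python
import gzip
import os
import shutil
import subprocess
import time
from pathlib import Path

SNAPLEN = 38  # the snaplen suggested in the text above

def old_dumpfiles(log_dir, max_age_days, now=None):
    """List dumpfiles in `log_dir` last modified more than `max_age_days` ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    return sorted(p for p in Path(log_dir).glob("tcp.*.gz")
                  if p.stat().st_mtime < cutoff)

def editcap_command(src, dst, snaplen=SNAPLEN):
    """Argument list that truncates every packet in `src` to `snaplen` bytes."""
    return ["editcap", "-s", str(snaplen), str(src), str(dst)]

def truncate_in_place(src):
    """The tempfile tapdance: truncate to a temp file, recompress, swap."""
    src = Path(src)
    tmp_pcap = src.with_name(src.name + ".tmp.pcap")
    tmp_gz = src.with_name(src.name + ".tmp.gz")
    subprocess.run(editcap_command(src, tmp_pcap), check=True)
    with open(tmp_pcap, "rb") as fin, gzip.open(tmp_gz, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    os.remove(tmp_pcap)
    os.replace(tmp_gz, src)   # only replace once the new file is complete
```

A nightly job could then call truncate_in_place() on each path returned by old_dumpfiles(). Because the rewritten file gets a fresh modification time, it is not selected again until it ages past the threshold once more.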

temporary files

IDABench generally cleans up after itself during hourly and search analyses, but certain things do remain for a period of time. Image files and binary results of ad-hoc searches are kept in the IDABENCH_WEB_SPOOL_LOCAL directory until they age past the CLEAN_TIME as set in site.ph. The search scripts are responsible for cleaning up after themselves, including the spool directory. If there are old files in the spool directory, it is most likely because no searches producing graphical or binary output have been run recently.

