Privateye Thoughts

I recently realized a few things regarding Privateye. First, I've been putting off work on it to focus on my other project, SpongeMonkee (spongemonkee.sourceforge.net). Second, it's a pretty complex project without much explanation to go with it. Sure, there's the documentation that comes with the file downloads (some of which is in our "Documentation" section here on SourceForge, if you're interested in browsing it without downloading the file), but that mostly deals with the installation and use of the project. What hasn't been very well communicated is the 'why'. Why work on this project? What does it have to offer? What are the guiding principles of its development? And how will they shape its eventual progression? So, I've decided to use up some of my not-so-free time and jot down some things. They may not be in any sensible order (heck, they may not even agree with each other) but at the very least they should give a broad idea of what Privateye really is now and what it aims to be.

Why

What is Privateye? Where does it fit? What does it do well? Well, first things first. Why Privateye in the first place?

Privateye was initially designed to solve a single, specific problem. On our network, users were forced to register their machines through a registration web page (Tech Jargon: SNMP traps for link up were sent to a central server, which checked the MAC against known entities and switched unknown assets to a registration VLAN, which used black-hole DNS to bring users to a registration page). The user registration system also allowed for assets to be switched into quarantine and 'penalty box' VLANs where they were cut off from the rest of the network. Data on user registration was stored in a database, which basically allowed us to map a network asset to a specific user (Active Directory user, to be specific) with a single SQL query. This registration system, by the way, was "Campus Manager" by Bradford Networks. It will be referred to as "Campus Manager" from now on.

We also had in place a commercial intrusion detection system, and our normal work day started out the same every day:

Assets were placed in the penalty box if they were infected with viruses, attempting to communicate with bot networks, etc. And this worked pretty well, except for a few things. First, it was time consuming and not very fun (although there was a certain satisfaction in manually dealing with infected user machines). Second, there was a time lag. If a machine was infected while we weren't in or watching, it had until we saw its activity in the logs to merrily infect other users. And third, it offered relatively little feedback to the user. In the penalty box, they were directed to a curt web page that basically said "you've done something bad, so we've disconnected you". Many a frantic (and usually incoherent) call to the help desk resulted.

Our initial implementation was a case-specific glue between two of these systems: the IDS and Campus Manager. We wanted to remove our two most important problems, time lag (important for the network) and monotony (important for us). The IDS allowed Syslog output, so our first idea was to use Swatch to watch the log and run scripts for certain alerts. But the necessity for SQL queries in every such script made this annoying. And our alerting specifications could be complex. Rather than stretch Swatch up to and possibly past its limit, a simple script should suffice, we thought.

Privateye 1.0 was just that. It took in an alert, split it up into a set of fields (source and dest IP, ports, alert type, etc.), queried the database to find the user and MAC address, then was able to run regular expressions on the alert fields and, based on the output, run a shell script. Thresholding was built in to deal with SSH v1 scanning. A basic rule said "if this regular expression for this field matches this many times in this many seconds, do this".
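To make that flow concrete, here is a minimal sketch in Python (Privateye 1 itself was a PHP program, and this is not its code). The alert layout, the ALERT_RE pattern, and the names parse_alert and ThresholdRule are all invented for illustration, and the database lookup for the user and MAC address is left out.

```python
import re
import time
from collections import deque

# Hypothetical alert layout; the real IDS syslog format is not given in the text.
ALERT_RE = re.compile(
    r"(?P<type>[^|]+)\|(?P<src_ip>[\d.]+):(?P<src_port>\d+)"
    r"\|(?P<dst_ip>[\d.]+):(?P<dst_port>\d+)"
)


def parse_alert(line):
    """Split a raw alert line into named fields, or return None."""
    match = ALERT_RE.match(line)
    return match.groupdict() if match else None


class ThresholdRule:
    """Run `action` when `pattern` matches `field` `count` times in `seconds`."""

    def __init__(self, field, pattern, count, seconds, action):
        self.field = field
        self.pattern = re.compile(pattern)
        self.count = count
        self.seconds = seconds
        self.action = action
        self.hits = deque()  # timestamps of recent matches

    def check(self, alert, now=None):
        now = time.time() if now is None else now
        if not self.pattern.search(alert.get(self.field, "")):
            return False
        self.hits.append(now)
        # Drop matches that have fallen out of the time window.
        while self.hits and now - self.hits[0] > self.seconds:
            self.hits.popleft()
        if len(self.hits) >= self.count:
            self.hits.clear()
            self.action(alert)  # stand-in for running a shell script
            return True
        return False
```

In this sketch, a rule such as ThresholdRule("type", r"SSH", 3, 60, some_action) would call some_action on the third matching alert seen within a minute.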

And it worked. Very well. We saw virus outbreaks quashed quickly and efficiently, before more than a single machine was infected. Because we were calling outside scripts, we were able to shun outside IPs at the firewall and shut switch ports as well. And we were also able to create our own database, detailing what actions had been done to what users when.

With this last component, we came up with another idea. Specifically, we came up with a way to solve our third problem, user confusion. As I mentioned earlier, users in the penalty VLAN were directed by black-hole DNS to a single web server. This server also had separate interfaces connected into normal VLANs for maintenance, etc. So the server was able to access our database of "who's been bad and why". With this database and the web server's ARP cache, we were able to connect every incoming connection to a misbehaving user and customize the web page they saw to the specific reason they were in penalty. Now, we didn't have confused users calling the helpdesk. We had empowered users. We had users who called the helpdesk and said "I'm in Penalty because I have the new IRC Bot, and I hear you can help me use antivirus to deal with it".

This database also proved useful on the helpdesk side. A simple web interface allowed helpdesk personnel to view currently penaltied users, see why they were in there, and with the push of a button release them once remediation was complete. So what started as simple automation quickly grew to empower both the users and our first tier incident response.

We liked what we'd done, but we wanted to do more. And so came version 2. For a description of that, I present my next section:

What

What is Privateye now? Well, look at its use in version 1. The web interfaces, Campus Manager, a MySQL database server, and an IDS - they all already existed on the network, but they were disparate, unconnected entities. Privateye was the glue that allowed them to be connected, to work together and accomplish wonderful things. Privateye 2 was an expansion of this, an attempt to abstract as much as possible this idea of "glue" and create a framework for disparate information systems and remediation vectors to work together. Privateye 2 abstracted the problem into the following areas:

The Input

With just two initial information vectors (Campus Manager's database and the IDS logs), we were able to accomplish wonderful things. Privateye 2 took that idea up a level, allowing for an unlimited number of input vectors, both passive (tailing log files, reading data from a TCP connection, etc.) and active (querying MySQL databases and LDAP directories). Most important to this information gathering was its inherent normalization. Since input is split into relevant fields, where a source IP exists within a log line ceases to matter. If a regular expression can grab that IP, it will.

The Alert

What is an alert? Privateye 2 defines it as simply a collection of data. In practice, it's an associative array. Any field can be added in, any type of data can be inserted. It's all up to the user. Each input vector could be split in arbitrary ways (through regular expressions) into any number of fields. An alert could be created by parsing a log file line, then added to with data from a MySQL query and user information pulled from Active Directory through LDAP.
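As a rough illustration of an alert as an associative array, the Python sketch below parses a line into fields and then lets further lookups add to it. Everything named here is hypothetical: build_alert, lookup_mac, and lookup_user are made-up helpers, and the two dictionaries merely stand in for a MySQL query and an LDAP lookup.

```python
import re


def build_alert(line, parser, enrichers):
    """Parse a raw line into fields, then let each enricher add more."""
    match = parser.search(line)
    if match is None:
        return None
    alert = dict(match.groupdict())
    alert["raw"] = line
    for enrich in enrichers:
        alert.update(enrich(alert))
    return alert


# Invented tables standing in for a MySQL query and an LDAP lookup.
MAC_BY_IP = {"10.0.0.5": "00:11:22:33:44:55"}
USER_BY_MAC = {"00:11:22:33:44:55": "jdoe"}


def lookup_mac(alert):
    mac = MAC_BY_IP.get(alert.get("src_ip"))
    return {"mac": mac} if mac else {}


def lookup_user(alert):
    user = USER_BY_MAC.get(alert.get("mac"))
    return {"user": user} if user else {}
```

Because the parser uses named groups and search(), the position of a field within the line does not matter, which is the normalization described under "The Input".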

The "User"

For each alert, we then became interested in the agent or agent-group responsible for that alert creation. It's actually more intuitive to think of this as a statistics set. It's not really a single user (although it can be). It could also be the subnet of the source IP, the destination port number - actually, it could be any combination of any number of alert fields. Each "user" had a permanent (in memory) collection of statistics associated with it. This allowed for thresholding over multiple equivalent alerts (wait until a user has had 10 alerts in 2 minutes) and correlation of separate alerts (don't act unless the user has triggered alerts A, B, and C, in that order). Multiple user objects can be investigated for each alert, allowing tracking of alert statistics by group (how many alerts from internal users with names that start with 'A'?).
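One way to picture the "user" as a statistics set is the Python sketch below. It is only an assumption about how such a structure could look, not the Privateye 2 implementation; the class name UserHash is borrowed from the object name used later in this post, but its methods are invented here.

```python
from collections import defaultdict, deque


class UserHash:
    """Map each alert to a statistics set keyed by chosen alert fields."""

    def __init__(self, key_fields):
        self.key_fields = key_fields
        # One in-memory history of (timestamp, alert type) per "user".
        # This sketch never prunes old entries.
        self.stats = defaultdict(deque)

    def record(self, alert, now):
        """Store the alert under its key and return that key."""
        key = tuple(alert.get(f) for f in self.key_fields)
        self.stats[key].append((now, alert.get("type")))
        return key

    def count_within(self, key, seconds, now):
        """Number of alerts this "user" produced in the last `seconds`."""
        return sum(1 for t, _ in self.stats[key] if now - t <= seconds)

    def saw_in_order(self, key, types):
        """True if the given alert types occurred in that order."""
        remaining = iter(kind for _, kind in self.stats[key])
        # Each `in` consumes the iterator up to the match, preserving order.
        return all(wanted in remaining for wanted in types)
```

Keying on ["src_ip"] tracks individual machines, while keying on ["dst_port"] or on several fields at once gives the group-style statistics described above.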

The Triggers

We now have an alert and a statistics set (the 'user'). Basically, these together translate to a single set of information on which various tests can be run. We'll call these tests 'triggers'. A trigger could check a certain field with a regular expression (alert type contains the string 'IRC') or against a threshold (all destination ports above 8000). And triggers could be combined. AND, OR, and NOT triggers allowed for the creation of trigger trees.
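A trigger tree can be sketched in Python as plain functions that return booleans and combine into larger ones. The helper names below (field_matches, field_above, all_of, any_of, negate) are invented for this example, and for brevity the triggers look only at the alert, not at the statistics set.

```python
import re


def field_matches(field, pattern):
    """Trigger: the named field matches a regular expression."""
    compiled = re.compile(pattern)
    return lambda alert: bool(compiled.search(str(alert.get(field, ""))))


def field_above(field, limit):
    """Trigger: the named numeric field exceeds a threshold."""
    return lambda alert: int(alert.get(field, 0)) > limit


def all_of(*triggers):
    """AND: every child trigger must return true."""
    return lambda alert: all(t(alert) for t in triggers)


def any_of(*triggers):
    """OR: at least one child trigger must return true."""
    return lambda alert: any(t(alert) for t in triggers)


def negate(trigger):
    """NOT: invert the child trigger."""
    return lambda alert: not trigger(alert)
```

For instance, all_of(field_matches("type", "IRC"), negate(field_above("dst_port", 8000))) builds a small tree that is true for IRC alerts aimed at ports 8000 and below.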

The Rule

We defined a 'rule' as a trigger, a threshold, and an action. If an alert caused a rule's trigger to return true X times in Y seconds, we would run an action (more on that later). Rules were divided into rule lists, allowing a specific rule set to be checked based on which input vector generated it, which user it was associated with, values in its fields, etc.

The Action

We really only ever used two actions: run a shell script or execute another rule list. Running a shell script allowed Privateye, based on alert data, to accomplish anything a shell script could do. We could (and do) telnet into a PIX firewall and shun an external IP, send an SNMP trap to Campus Manager to move a MAC into the penalty VLAN, or walk the CDP neighbors of switches to find the MAC of an unmanaged asset and shut its network port. Basically, anything was possible.

Each of these portions was actually a specific user-creatable and user-configurable object. Inputs were INPUT objects. These passed data to ALERTPARSER objects which split their data with regular expressions. The alert was passed to one or more USERHASH objects which chose the user(s) specific to the alert. USERHASH objects then passed both the user and the alert to RULELIST objects, each containing one or more RULE objects. RULE objects contained, as said earlier, a TRIGGER and an ACTION object. TRIGGER objects (specifically the boolean combinations) could contain other TRIGGERs. The whole thing became a very extensible tree structure, where each set of input data passed through various subtrees in its analysis.

How

How well did it work? Remarkably well. With this new framework, we were able to first emulate, then expand upon, the original functionality of Privateye 1. And admittedly, we've only scratched the surface of what can be done. For more specifics on functionality and various possibilities, I'd recommend the documentation mentioned earlier.

What Now?

Privateye has evolved from a single-use utility to a broad framework for data analysis, event correlation, and automated response. Privateye 3, currently in production, hopes to expand upon this even further while at the same time simplifying the basic configuration and flattening the learning curve as much as possible.

One quick glance at the configuration documentation (Documentation.config) is enough to scare even the bravest of script gurus. Heck, it scares me, and I wrote it. The basic problem lies in the sequential order of Privateye 2's data handling. Alert data MUST be parsed first. Only then can a user be referenced. Only after that can rules be checked. And each step along the way requires objects to be created. And all of these objects must be correctly linked to create a processing path for the data.

Privateye 3 abstracts almost everything into input, output, and triggers. Take a look at 'actions'. They're really just void functions. And 'triggers' are really just boolean functions. And you can easily emulate a void function with a boolean function whose output is ignored. So assimilating actions into triggers is a no-brainer. With a little tweaking of definition, other objects can be similarly assimilated.
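The idea of folding actions into triggers can be shown in a few lines. Privateye 3 is written in C++; this Python sketch only illustrates the concept, and as_trigger and sequence are hypothetical names, not part of the project.

```python
def as_trigger(action):
    """Wrap a side-effecting action so it can sit anywhere a trigger can."""
    def trigger(alert):
        action(alert)
        return True  # fixed result; callers are free to ignore it
    return trigger


def sequence(*triggers):
    """Run triggers left to right, stopping at the first that returns False."""
    def trigger(alert):
        for t in triggers:
            if not t(alert):
                return False
        return True
    return trigger
```

With this shape, a "rule" is just a sequence whose last element happens to be a wrapped action, so no separate action type is needed.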

Take alert parsing. What if we created all alerts with a single field, "DATA", that contained all input data? Then, we could extract other field data from that single field whenever we want. And we can do it based on the other processing. If the "DATA" field starts with the "#" character, maybe we stop there. Otherwise, we continue to parse out and test other alert fields. Poof, we've just added comments to our log files and drastically reduced the processing requirements of parsing them.
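A small Python sketch of that on-demand parsing follows. The log layout behind SRC_IP and the helper names is_comment, extract, and process are assumptions made for the example; only the "DATA" field and the "#" convention come from the text.

```python
import re

SRC_IP = re.compile(r"src=(?P<src_ip>[\d.]+)")  # hypothetical log layout


def is_comment(alert):
    return alert["DATA"].startswith("#")


def extract(pattern):
    """Trigger that parses fields out of DATA only when it is reached."""
    def trigger(alert):
        match = pattern.search(alert["DATA"])
        if match is None:
            return False
        alert.update(match.groupdict())
        return True
    return trigger


def process(line, steps):
    """Start with a single DATA field and parse further only as needed."""
    alert = {"DATA": line}
    if is_comment(alert):
        return None  # stop before any further parsing
    for step in steps:
        if not step(alert):
            break
    return alert
```

Comment lines cost a single startswith() check here, and every other field is extracted only if processing actually reaches the step that needs it.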

Take user hashing. Think of it as attaching a statistics set to an alert. Does that require that all field data is correctly parsed first? No. Does it need to be done at all? Only if we care about correlation over multiple alerts. If a single alert always necessitates an action, why not just do that action, regardless of any statistical history? Now, certain alerts won't require the retrieval of statistics. More processing reduction. And more simplicity.

With this, we're now approaching a real extensible framework. Take out stats retrieval, and we've really got a Swatch equivalent, but still with the ability to create and chain multiple data retrieval mechanisms and alert tests. Or we could use the stats retrieval, and we've got at least the functionality of Privateye 2, while allowing for optimizations to drastically reduce the per-alert processing requirements with intelligent trigger placement to branch or stop processing on various alert types. And speaking of processing, how's this for a speed increase: Privateye 1 and 2 were PHP programs. Privateye 3 is in C++. Eventually, this will also allow for threading to handle multiple alerts simultaneously and stop our current problem of serial processing and remediation.

Fin

So that's the current direction of Privateye. I've got high hopes. Privateye 2.2 is the latest PHP version, available in the downloads section, if you want to check out that code. It's been running at Middlebury for about half a year with very good results, so it can be considered pretty stable. And if you're interested in Privateye 3, it's available through Subversion. Run:

The code is documented with Doxygen-compliant comments, if you're a coder looking to contribute. If you've got this far, you're probably interested. I'd love to hear your comments and suggestions. Email me at gsconnell@gmail.com.