3.0

What's the matter with 2.2?

Speed

Privateye was originally written in PHP for one important reason: ADODB was the only API I knew for accessing MySQL databases, and we needed to access MySQL databases. In other words, ADODB made me choose PHP. There were some benefits to working with PHP: 5.0 has some good object-oriented features, sockets and streams were pretty easy to deal with, and the "everything is a string is a number is a whatever" idea (dynamic typing, I believe it's called) ended up helping out quite a bit. But there are some problems with PHP. It was designed with a purpose. And that purpose (I'm sure the PHP developers will agree with me on this) was not to create command-line applications for high-speed data parsing. It's an interpreted language, and that slows it down. It's not multithreaded, and that hurts it quite a bit when dealing with SQL queries and calling external scripts. As Privateye grew, it became apparent that a better foundation was required to push it forward.

Complexity

Privateye 1.0 was simple. Privateye 2.0 and above were not. Privateye 1.0 was designed for a specific problem on our specific network. Privateye 2.0 and up were designed to handle any problem we could throw at them, and a few we couldn't think of as well. Some of the added complexity, of course, was required to give 2.0-2.2 the breadth that they have. But when writing config files for version 2 variants, I always thought that there had to be an easier way. Take a look at 2.2's Documentation.config if you want a real show. The version 2 syntax was also needlessly constricting. INPUT passed to ALERTPARSER passed to USERHASH passed to RULELIST, which had RULES where TRIGGERS passed to ALERTS based on thresholds. Wow. It all makes sense if you read the config and realize that ALERTPARSER is really just an input normalization object, and that normalization most likely comes before correlation (USERHASH), etc, etc, etc. But this meant that in order to do anything, you had to know everything.

The New Deal: 3.0

Take a look at the new 3.0 codebase, and the first thing you should notice (unless you're really, really thick) is the change to C++. There were really two choices in the matter: C++ or Perl. Some things Perl could have done better. Some things C++. The C++ decision was a tough one, but in the end, it was based on:

C++ does suffer some shortfalls, most regrettably the conversion from strings to numbers and back, but through the version 3.0 release it's been clear that it was a good, solid choice that should serve Privateye well in the future. Also, for my own benefit, it lets me code in C++ (which I quite enjoy) and gives me a first chance to really play around with the GNU build system (configure, etc.).
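To show what the string/number shortfall looks like in practice, here is a minimal sketch of the usual standard-library workaround built on string streams. The helper names fromString and toString are made up for this example and are not taken from the Privateye source.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not from Privateye): parse all of `text` as a T.
// Returns false if nothing could be read or if non-whitespace is left over.
// Note that `out` may still be modified when the function returns false.
template <typename T>
bool fromString(const std::string& text, T& out) {
    std::istringstream in(text);
    in >> out;
    // The extraction must succeed, and only trailing whitespace may remain.
    return !in.fail() && (in >> std::ws).eof();
}

// Hypothetical helper (not from Privateye): format any streamable value.
template <typename T>
std::string toString(const T& value) {
    std::ostringstream out;
    out << value;
    return out.str();
}
```

With these two helpers, reading a numeric threshold out of a config string and writing it back out is one call each way, and input such as "42abc" is rejected instead of being silently read as 42.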

So, how is version 3.0 different from the version 2s?

Codebase

I think we've beaten this one to death. It's now in C++. One thing I haven't mentioned, though, is the ability this gives to link with other open-source libraries. Anything in C or C++ can now easily be ported into a peTrigger, peInput, or peOutput object. This should greatly increase the capabilities of Privateye with limited coding, while leveraging the full power of the open-source community. And since Privateye now uses GNU's build system, those without the fancy libraries will still be able to compile with what they have and end up with a perfectly good working Privateye.

Further Abstraction

In an attempt to deal with the complexity issue, the plethora of object types in the v2s has been abstracted down to three: Input, Trigger, Output. Basically, everything that could be made a Trigger was. This had a number of interesting unforeseen effects which actually help Privateye more than I could have hoped.
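As a rough illustration of the three-type design, the sketch below wires one Input, one Trigger, and one Output together. Only the class names peInput, peTrigger, and peOutput appear elsewhere in this document. Every method, every concrete subclass, and the runPipeline function are assumptions invented for the example, and the real 3.0 interfaces may look quite different.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical interfaces: the method names are assumptions, not Privateye's API.
class peInput {
public:
    virtual ~peInput() {}
    // Fetch the next raw record; return false once the input is exhausted.
    virtual bool next(std::string& record) = 0;
};

class peTrigger {
public:
    virtual ~peTrigger() {}
    // Decide whether this record should be passed on.
    virtual bool fires(const std::string& record) = 0;
};

class peOutput {
public:
    virtual ~peOutput() {}
    virtual void write(const std::string& record) = 0;
};

// Hypothetical input that replays lines held in memory.
class VectorInput : public peInput {
public:
    explicit VectorInput(const std::vector<std::string>& lines)
        : lines_(lines), pos_(0) {}
    bool next(std::string& record) {
        if (pos_ >= lines_.size()) return false;
        record = lines_[pos_++];
        return true;
    }
private:
    std::vector<std::string> lines_;
    std::size_t pos_;
};

// Hypothetical trigger that fires when a record contains a fixed substring.
class SubstringTrigger : public peTrigger {
public:
    explicit SubstringTrigger(const std::string& needle) : needle_(needle) {}
    bool fires(const std::string& record) {
        return record.find(needle_) != std::string::npos;
    }
private:
    std::string needle_;
};

// Hypothetical output that simply remembers what it was given.
class CollectingOutput : public peOutput {
public:
    void write(const std::string& record) { seen.push_back(record); }
    std::vector<std::string> seen;
};

// Push every record from the input through the trigger to the output.
// Returns how many records made the trigger fire.
std::size_t runPipeline(peInput& in, peTrigger& trigger, peOutput& out) {
    std::size_t fired = 0;
    std::string record;
    while (in.next(record)) {
        if (trigger.fires(record)) {
            out.write(record);
            ++fired;
        }
    }
    return fired;
}
```

The point of the sketch is that the driver loop only ever sees the three base types, so a new kind of trigger is just one more subclass rather than a new object type with its own rules.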

Configuration Standardization

In another effort to reduce complexity, all object creation and configuration is now done exactly the same way. It doesn't matter if you're creating an SQL trigger or a Regular Expression trigger. First you name the object. Then you give its type (Trigger) and subtype (Regular Expression). Then you give it the arguments it requires, in any order. In Privateye 2.x, the Parser object took up almost a quarter of the entire code base, line for line. And this was mostly because every object had its own unique syntax, with its own required arguments that had to be given in a specific order. Now, arguments can be given in any order, or can even be set and reset again after the object is created. So long as an object has everything it needs by the time it goes into use, it's good to go. This, too, has had some interesting and useful side effects.
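The uniform scheme described above (name, type, subtype, then arguments in any order, checked only when the object goes into use) could be modeled roughly like this. This is a sketch under my own assumptions and not the actual 3.0 parser: the ObjectConfig class, its methods, and the argument names used in the usage note are all hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of one configured object (not Privateye's real class).
class ObjectConfig {
public:
    ObjectConfig(const std::string& name,
                 const std::string& type,
                 const std::string& subtype,
                 const std::vector<std::string>& required)
        : name_(name), type_(type), subtype_(subtype), required_(required) {}

    // Arguments may arrive in any order; setting a key again overwrites it.
    void set(const std::string& key, const std::string& value) {
        args_[key] = value;
    }

    // Returns the stored value, or an empty string if the key was never set.
    std::string get(const std::string& key) const {
        std::map<std::string, std::string>::const_iterator it = args_.find(key);
        return it == args_.end() ? std::string() : it->second;
    }

    // The only validation step: is every required argument present by now?
    bool ready() const {
        for (std::size_t i = 0; i < required_.size(); ++i) {
            if (args_.find(required_[i]) == args_.end()) return false;
        }
        return true;
    }

    const std::string& name() const { return name_; }
    const std::string& type() const { return type_; }
    const std::string& subtype() const { return subtype_; }

private:
    std::string name_;
    std::string type_;
    std::string subtype_;
    std::vector<std::string> required_;
    std::map<std::string, std::string> args_;
};
```

For example, a Regular Expression trigger that requires the (made-up) arguments "pattern" and "field" reports ready() as false until both have been set, in whichever order, and calling set("pattern", ...) a second time simply replaces the earlier value. Because the readiness check is the same for every object, a parser built this way needs no per-object syntax.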

In Conclusion...

This has been a showcase of some of the major improvements that reengineering Privateye from the ground up has allowed. Currently, the 3.0_alpha release provides limited functionality, but hopefully a beta will be available in the next month or two, with a full feature set ready to be debugged and tested by the ever-curious and ever-growing Privateye community. As always, if you have any questions or comments, I'd love to hear from you at gsconnell@gmail.com. Thanks again for the time you took reading this. If you got this far, hopefully you're interested enough in Privateye to download and test it. I can always use other ideas, viewpoints, and criticism. And maybe you're even interested enough to grab the latest SVN: I can't guarantee it will work, but at the very least it should prove pretty amusing.
sourceforge.net