Cloud-based Log Analysis and Visualization

I gave a talk at RMLL 2010, a French free software conference. The title, Cloud-based Log Analysis and Visualization, already gives the content away. But just in case, here is the abstract for the talk:

Cloud computing has changed the way businesses operate, the way businesses make money, and the way businesses have to protect their assets and information. More and more software applications are moving into the cloud. People are running their proxies in the cloud and soon you will be collecting your logs in the cloud. You shouldn't have to deal with log collection and log management. You should be able to focus your time on getting value out of the logs: on log analysis and visualization.

In this presentation we will explore how we can leverage the cloud to build security visualization tools. We will discuss some common visualization libraries and have a look at how they can be deployed to solve security problems. We will see how easy it is to quickly stand up such an application. To close the presentation, we will look at a number of security visualization examples that show how security data benefits from visual representations. For example, how can network traffic, firewall data, or IDS data be visualized effectively?

"Trojan Pong" and other malware data visualization ideas

This small experimental project was done for the Shadowserver Foundation, a volunteer, not-for-profit organization that deals in the capture, analysis and dissemination of data and intelligence relating to nefarious activity on the internet. Shadowserver provided us with one day's worth of data (several gigabytes) so that we could apply some known techniques and experiment with some new ones.

The idea of this project was simply to suggest ways to represent their massive datasets visually. There's a lot of work still to do, but here are a few early ideas. My favourite is a light-hearted time series visualization themed on "Pong", the old favourite arcade game originally released in 1972.

See all of the samples at

SSHD brute force attempts - userids and IPs

One of many tests with Afterglow, visualizing SSHD brute force logins (yellow) vs source IP addresses (green).

This one quickly shows the IPs with the most activity (one IP stands out: the yellow explosion in the middle), along with the most commonly attempted userids and the IPs that have been attempting the same userids.
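
For anyone wanting to try something similar, here is a minimal Python sketch of the preprocessing step: turning sshd log lines into two-column CSV (source IP, userid) that can be fed to Afterglow. It assumes the usual OpenSSH "Failed password" message format; the sample log lines and function names are made up for illustration and are not taken from the original data.

```python
import csv
import io
import re

# Matches OpenSSH "Failed password" lines, with or without "invalid user".
# Assumption: the log uses the standard OpenSSH wording.
FAILED_LOGIN = re.compile(
    r"Failed password for (?:invalid user )?(?P<user>\S+) "
    r"from (?P<ip>\S+) port \d+"
)


def failed_logins(lines):
    """Yield (source_ip, userid) for each failed SSH password attempt."""
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match:
            yield match.group("ip"), match.group("user")


def to_csv(pairs):
    """Render the pairs as two-column CSV, one graph edge per line."""
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(pairs)
    return out.getvalue()


# Made-up sample lines using documentation-range IP addresses.
SAMPLE_LOG = [
    "Jun  1 10:00:01 host sshd[101]: Failed password for invalid user admin from 203.0.113.5 port 4242 ssh2",
    "Jun  1 10:00:03 host sshd[102]: Failed password for root from 203.0.113.5 port 4243 ssh2",
    "Jun  1 10:00:07 host sshd[103]: Failed password for root from 198.51.100.7 port 5151 ssh2",
    "Jun  1 10:00:09 host sshd[104]: Accepted password for alice from 192.0.2.10 port 6000 ssh2",
]

if __name__ == "__main__":
    print(to_csv(failed_logins(SAMPLE_LOG)), end="")
```

Each CSV line becomes one edge in the graph, so an IP that tries many userids fans out into the kind of "explosion" described above. The colouring (yellow vs green) would be set separately in Afterglow's property file.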

Monitoring / Visualisation Stations, & relevance of layer 4 traffic

Opinions sought from those working in the relevant areas. I handed this document in as part of a degree project in security visualisation & monitoring, and the feedback was that the network and monitoring station/s are not realistic, and that I should have focused on port 80 and layer 7 traffic only, as layer 4 is no longer relevant. The link provided below is only part of the document; I presume it's the part they had issues with. I wasn't actually intending to focus on web traffic, which was made clear in the document anyway (though I did indicate to them that, with the likes of Rumint's packet contents visualiser, it is certainly viable to match traffic against malware signature databases, but that aspect wasn't the focus of the project).
I don't expect it says anything that people working in those areas will be unaware of; the general intention was to address what would be required for a monitoring station / network, including visualisation software, that would work in real time as well as for offline analysis and traffic capture.
The grouping into 'objectives' is just part of how the work has to be presented to comply with guidelines. Cheers for input, I know you're probably busy.

nb - the last part is probably wrong about ad-hoc IPs; I can't remember exactly right now how they are handed out; they probably aren't always dynamic esp. now it's more common to get fixed-IP SIMs.

Spam - A 2 day comparison with afterglow.

I finally got my spam stats up and running. The results are amazing.
Lightyellow = Subject || Red = Sender || Black = Recipient

It is pretty easy to find the one user that appears to get a significant amount of spam :). If I had to guess, I would say the clusters with a single subject, many sources and many destinations likely originate from botnets?

The results are from Wed and Thurs of last week.

Libemu sctest output, created from PDF shellcode

I extracted this image from PDF malware that I obtained for analysis. Using a Perl script, I filtered out the unneeded content and then fed the result to sctest (a libemu tool). The graph was created with the dot command from the Graphviz package.

EDV - Event Data Visualization

Afterglow has been on my list of 'neat tools' for quite some time. Thankfully, last month I finally had a bit of spare time to really play with it.

The result was EDV:

See the page for more info. Keep in mind, this is BETA!

It currently supports Snort (Sguil DB format). However, even the untrained eye can easily modify it for straight Snort or anything else you can query with MySQL. Once you have your sources defined, it will take care of the rest.
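
To give a feel for the "anything you can query" idea, here is a rough Python sketch of the query-to-CSV step. This is not EDV's actual code. It uses an in-memory SQLite database as a stand-in for MySQL, and the `event` table and its columns are a hypothetical, simplified schema, so a real Sguil database will look different.

```python
import csv
import io
import sqlite3

# Hypothetical query: one row per distinct (source, event, target) triple.
QUERY = (
    "SELECT DISTINCT src_ip, signature, dst_ip "
    "FROM event ORDER BY src_ip, signature, dst_ip"
)


def build_demo_db():
    """Create an in-memory stand-in database with made-up alert rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE event (src_ip TEXT, signature TEXT, dst_ip TEXT)")
    conn.executemany(
        "INSERT INTO event VALUES (?, ?, ?)",
        [
            ("10.0.0.5", "port scan", "192.0.2.1"),
            ("10.0.0.5", "port scan", "192.0.2.1"),  # duplicate alert
            ("10.0.0.9", "sql injection attempt", "192.0.2.2"),
        ],
    )
    return conn


def edges_as_csv(conn, query):
    """Run the query and render each result row as one CSV line."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in conn.execute(query):
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    print(edges_as_csv(build_demo_db(), QUERY), end="")
```

The three-column output (source, event, target) is the sort of CSV that Afterglow can turn into a graph. Pointing the same function at a different query is all it would take to add another source.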

The tool is static (controlled by configs and cron) for now, but I do plan on adding a query tab to the web page so that you can do on-the-fly queries. Low priority for now. I have been focusing on 2 parsers that log directly to MySQL. One parses Syslog output from a Barracuda spam firewall and the other parses URL info captured by URLSnarf. These will be my next additions.

Comments and suggestions welcome.


Zombie network activity representation by Dorothy

This graph is automatically generated by the Dorothy framework whenever a new malware sample is analyzed.
It aggregates three different kinds of information: 1) the network activity, 2) the DNS host resolutions, 3) the GET / POST requests.
This lets us easily identify activity related to botnet communications.
A quick legend:
Colors:
Green = Services / hostnames
Red = General target
Purple Red = Known C&C (there isn't any in this example)
Purple = C&C Web target
Light blue = Private network host

Circle = Target
Triangle = Source

The size of each shape represents the network activity related to that node.
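
As an illustration only (this is not Dorothy's code), the legend above could be expressed as Graphviz node attributes along these lines. The category keys, the Graphviz colour names picked for "Purple Red" and "Light blue", and the size formula are all assumptions.

```python
# Colours follow the legend; the exact Graphviz colour names are assumptions.
NODE_COLORS = {
    "service": "green",          # Services / hostnames
    "target": "red",             # General target
    "known_cc": "violetred",     # Known C&C ("Purple Red")
    "cc_web": "purple",          # C&C Web target
    "private_host": "lightblue", # Private network host
}

# Shapes follow the legend: circle = target, triangle = source.
ROLE_SHAPES = {"target": "circle", "source": "triangle"}


def node_line(name, kind, role, activity):
    """Return one DOT node statement; node width grows with activity."""
    # Hypothetical scaling: base width 0.5 plus 0.1 per unit of activity.
    width = round(0.5 + 0.1 * activity, 2)
    return (
        f'  "{name}" [style=filled, fillcolor={NODE_COLORS[kind]}, '
        f"shape={ROLE_SHAPES[role]}, width={width}];"
    )


def to_dot(nodes, edges):
    """Build a DOT digraph from (name, kind, role, activity) nodes and edges."""
    lines = ["digraph zombie {"]
    lines += [node_line(*node) for node in nodes]
    lines += [f'  "{src}" -> "{dst}";' for src, dst in edges]
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up example: one infected private host contacting one web C&C.
    nodes = [
        ("10.0.0.2", "private_host", "source", 3),
        ("cc.example.com", "cc_web", "target", 10),
    ]
    print(to_dot(nodes, [("10.0.0.2", "cc.example.com")]))
```

The resulting text can be rendered with any of the Graphviz layout tools.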

FDP visualization for Nepenthes using Afterglow and python-geoip

I created the image from the Nepenthes log, then added country information using python-geoip. Finally, I used Afterglow and Graphviz to draw the graph.
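
A minimal sketch of the enrichment step, in Python, might look like the following. It assumes the attacker IPs and targets have already been pulled out of the Nepenthes log (the parsing is not shown, and the event tuples are made up). The country lookup is passed in as a function, so a GeoIP library can be plugged in; the dictionary used here is only a stand-in with placeholder country codes.

```python
import csv
import io


def annotate_with_country(events, lookup_country):
    """Prefix each (src_ip, target) event with the source's country."""
    rows = []
    for src_ip, target in events:
        # Fall back to "unknown" when the lookup has no answer.
        country = lookup_country(src_ip) or "unknown"
        rows.append((country, src_ip, target))
    return rows


def to_csv(rows):
    """Render rows as three-column CSV: country, source IP, target."""
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Stand-in lookup table: documentation-range IPs, placeholder country codes.
# A real run would call a GeoIP library here instead.
DEMO_COUNTRIES = {"203.0.113.5": "AA", "198.51.100.7": "BB"}

# Made-up events, as if already extracted from the honeypot log.
DEMO_EVENTS = [
    ("203.0.113.5", "port 445"),
    ("198.51.100.7", "port 135"),
    ("192.0.2.99", "port 445"),
]

if __name__ == "__main__":
    rows = annotate_with_country(DEMO_EVENTS, DEMO_COUNTRIES.get)
    print(to_csv(rows), end="")
```

With the country in the first column, Afterglow groups attackers under their country node, and a force-directed Graphviz layout such as fdp spreads those clusters apart.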

Explorative Visualization of Log Data to support Signature Development

The effectiveness of intrusion detection systems, which apply misuse detection, strongly depends on the conciseness and topicality of the applied signatures. Imprecise signatures heavily limit the detection capabilities of the intrusion detection systems and lead to false positives. The reasons for this detection inaccuracy can only to a lesser extent be imputed to qualitative restrictions of the audit functions. Instead, these restrictions must be identified primarily in the signature derivation process itself.

In particular, the derivation of signatures starting from given exploits appears to be a very complex task, which comprises identifying the traces in the audit data that are left behind by an attack and determining characteristic relations of the attack. This procedure also requires manual analysis of the audit data. Admittedly, this basic activity is time-consuming, demanding, and cumbersome. The main reasons for these difficulties are the flood of very fine-grained information distributed across different sources as well as the non-ergonomic inspection of audit data.

Consequently, abstraction capabilities to extract the relevant parts of this wealth of data are crucial, but common tools for audit data analysis do not tackle this issue. Abstractions, i.e. the goal-oriented accentuation of relevant relations between audit events while concurrently hiding irrelevant data, are a key aspect in supporting the security officer during audit data analysis. Another key aspect affecting the time required for the analysis is the representation of the data to be analyzed. Typically, a textual representation of audit data is used, which illustrates relations between audit events only inadequately and is thus suboptimal for providing a holistic view of system behavior. Unclearly arranged representations are confusing and lead to wrong assessments and conclusions. These drawbacks can be remedied by using a graphical multi-dimensional representation of audit events.

We developed the tool ADO for three-dimensional representation of audit data that can be explored interactively. The user can create arbitrary views on the data and can study and visualize relations or dependencies in the data. Furthermore, ADO is part of the signature development tool, which supports the transfer of knowledge from the identified attack-relevant relations between audit data to the actual signature modeling.

The current version of ADO supports BSM (Solaris Basic Security Module) audit logs as input data. ADO consists of three components: the sensor, the analysis and transformation component, and the presentation component. The sensor transforms BSM audit events into a common data structure and provides the data to the analysis component. The analysis component allows the user to define metrics and to adjust particular abstraction parameters. These settings control the quantitative analysis, which is followed by a space-specific transformation. The resulting three-dimensional virtual audit data world is handed over to the presentation component, which offers the user visualization and interactive exploration capabilities.

The picture shows the individual stages of an exploration of an attack on a Solaris system using ADO. Starting from the picture in the upper left, the signature engineer explores a set of audit events and identifies and visualizes attack-relevant relations in these events. The picture in the lower right shows our SEG-Tool with the audit data visualization tool ADO and the other signature modeling components.