The market is not ready for security data visualization!

Maybe that's a bit provocative, and maybe I am wrong, but let me tell you why I think the market is not yet ready for security data visualization. If you look at the visualization space, where business intelligence (BI) and similar technologies reside, you will find that visualization is used in areas where the underlying data is very well understood, for example sales and marketing data. It is very simple to explain to someone what sales data is all about. People can relate to those pieces of information. They understand them.
Computer security logs are not well understood at all! How do you expect people to understand visualization of security data if nobody really understands the underlying data? What are the best ways to visualize all this data if you cannot even understand the individual textual entries?
What we have to do (and when I say 'we', I mean you guys reading this blog, you guys interested in this topic) is to go about the problem of log analysis and visualization on a use-case by use-case basis. We cannot solve all the problems at once. Let's be very specific and show, for one type of log file and one type of log entry, how it can be visualized and how that helps the user.
I would claim that the companies which have tried to play in the security visualization space have not had much success because they tried (and probably still try) to address the entire problem at once: Visualizing log files. Again, let's go use-case by use-case. Submit them here so people can learn from you and you can learn from others!

[Comment by MikeAndre]

I do not think that a lack of understanding of the underlying data is the reason behind this. Look at how the "modern" tools work with the data today:

First of all, the data are mainly presented in a tabular format (as alerts). Because of the amount of data, most tools "aggregate" the data and store the results in new tables. Then you have systems that monitor the amounts over time and trigger new alerts when the result deviates too much from "the normal" values -- and report the results into new tables.
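
To make that pipeline concrete, here is a minimal Python sketch of the "aggregate, then alert on deviation" pattern. It is an illustration under my own assumptions, not how any particular product works: the function names, the one-hour buckets and the three-standard-deviation threshold are all hypothetical choices.

```python
from collections import Counter
from statistics import mean, pstdev


def hourly_counts(timestamps, bucket_seconds=3600):
    """Aggregate raw alert timestamps (epoch seconds) into per-bucket counts."""
    return Counter(int(ts) // bucket_seconds for ts in timestamps)


def deviating_buckets(counts, baseline_buckets, threshold=3.0):
    """Return (bucket, count) pairs that deviate too much from "the normal".

    "Normal" is assumed here to be the mean count over baseline_buckets;
    a bucket is flagged when it lies more than `threshold` population
    standard deviations away from that mean.
    """
    baseline = [counts.get(b, 0) for b in baseline_buckets]
    mu = mean(baseline)
    sigma = pstdev(baseline)
    flagged = []
    for bucket, n in sorted(counts.items()):
        if bucket in baseline_buckets:
            continue  # only judge buckets outside the baseline period
        if sigma == 0:
            deviates = n != mu  # flat baseline: any change is a deviation
        else:
            deviates = abs(n - mu) / sigma > threshold
        if deviates:
            flagged.append((bucket, n))
    return flagged
```

Note what is lost along the way: once the alerts are reduced to one count per hour, the individual events are gone from the view, which is exactly why the big picture becomes hard to see.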

The only visualizations done by most of them are "amount over time" presented in bar/line/area charts, and "distribution" in pie charts.

This means that we who analyze the data actually do understand them, but the amounts are often so overwhelming that we miss the important big picture. Here we need more visualization tools, and there are people who have done a lot of thinking about this. Take a look at the papers written by Gregory Conti (http://www.rumint.org/gregconti/index.html); there you'll find a lot of good stuff.

To those who are interested in this but are not sure what to do: implement a scatter chart that plots all the ports that have been scanned in the last 24 hours. Let the X-axis be time and the Y-axis the port number (could be logarithmic for easier reading). What you do here is visualize the noise, and if there is one thing our brain is good at, it is recognizing patterns in noise.

I have done that where I work, and it is a surprisingly effective way of discovering different kinds of scanning activity... given that you don't filter out or aggregate the scanning/probing data.
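
A rough Python sketch of such a chart could look like the following. It assumes the probe events have already been parsed into (timestamp, destination port) pairs; the function names and the output file name are made up for illustration, and the plotting part assumes matplotlib is installed.

```python
from datetime import datetime, timedelta


def scan_points(events, now, window=timedelta(hours=24)):
    """Turn (timestamp, dst_port) probe events into scatter points.

    Returns (hours_since_window_start, port) pairs for every event inside
    the window. Nothing is aggregated; each probe stays one dot. Port 0 is
    skipped only because it cannot be drawn on a logarithmic axis.
    """
    start = now - window
    points = []
    for ts, port in events:
        if start <= ts <= now and 0 < port <= 65535:
            hours = (ts - start).total_seconds() / 3600.0
            points.append((hours, port))
    return points


def plot_scan_points(points, path="portscan_scatter.png"):
    """Draw time on the X-axis and port number (log scale) on the Y-axis."""
    # Imported here so the data preparation above works without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(10, 6))
    ax.scatter([x for x, _ in points], [y for _, y in points], s=2)
    ax.set_yscale("log")
    ax.set_xlabel("hours since start of window")
    ax.set_ylabel("destination port (log scale)")
    fig.savefig(path, dpi=150)
    plt.close(fig)
```

Calling `plot_scan_points(scan_points(events, datetime.now()))` writes the chart to a PNG file. The small marker size is deliberate: with a full day of unfiltered probes the dots overlap heavily, and the patterns in how they cluster are what you are looking for.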

So in my opinion we actually have the understanding of the underlying data, but we lack good tools for doing visualization. And why the big companies don't put any resources into creating those tools is something I don't understand. Because I do believe the market is ready for it.

Security data visualizations are not ready for the market! ;-)

[Comment by herzog]

I'd rather say: today's security data visualizations are not ready for the market. Why? Let me tell you some of my thoughts:

1) As you said, today's visualizations rely on a huge number of very low-level parameters that are too far from the decision: human operators reason more in terms of users and applications than in terms of, for instance, bits in the payload of a network packet. I think that, to be successful, a visualization has to use fewer, higher-level parameters such as, for each connection, the user, the application, the port and the hosts. With just those parameters, it's possible to present relevant information that is easily exploitable by the human operator.

2) It's good to build a nice visualization, but it's only half of the job (and unfortunately sometimes the easiest half). One needs to encapsulate it in an efficient interface that enables the user to easily interact with the visualization and accomplish his tasks. In most current visualization-based tools, this part seems to be neglected.

3) The tool has to address a real problem for the user. For example, I've seen many visualization-based tools that make it possible to detect or visualize port scans. Those visualizations are very nice but... so what? Besides the fact that port scans are usually easy to detect with traditional systems, I'm not sure that such scans are a real issue for administrators who sometimes don't know who is using which application on their network.
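
To illustrate points 1 and 3 together, here is a small Python sketch of what a higher-level connection record might look like, and how it could answer the "who is using which application" question before anything is drawn. The field names and the helper function are hypothetical; how the user and the application get attached to a connection is left open here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """One connection described with a few high-level parameters."""
    user: str
    application: str
    port: int
    src_host: str
    dst_host: str


def users_by_application(connections):
    """Map each application to the sorted list of users seen using it."""
    usage = defaultdict(set)
    for conn in connections:
        usage[conn.application].add(conn.user)
    return {app: sorted(users) for app, users in sorted(usage.items())}
```

A result such as `{"ssh": ["alice", "bob"], "web": ["alice"]}` is already in the terms an operator reasons in, and it is a much smaller thing to visualize than raw packets.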