Security Information Management (SIM) is often referred to as the dumb portion of SIEM, and is typically a Log Management solution. Log management solutions will collect logs from different log sources at high volumes and store them for future reference.
A Security Information Management (SIM) solution should store logs in a scalable format, such as ASCII flat files. However, some solutions store their logs at the log management layer in a relational database such as MySQL, MSSQL or Oracle. This does not scale well in most organisations, although it might be appropriate for very small ones.
Tip #1 – Ask your potential vendor if the log management layer stores logs in an ASCII flat file structure for scalable storage.
The SIM solution should provide reporting and basic alerting functionality; improved reporting and alerting would be expected at the Security Event Management (SEM) layer.
SIM/Log Management solution
As a minimum, the SIM/Log Management solution should allow log collection and indexing of all collected logs. The most common indexing engine is Apache Lucene. When logs are collected, the indexing engine indexes every word down to a minimum length, typically 2-3 characters, as set by the vendor.
Tip #2 – Ask your potential vendor how many characters their indexing engine indexes down to. For example, if the engine only indexes down to 3 characters, you would NOT be able to find commands such as "SU root", as the two-character "SU" portion of the command would not be indexed.
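The effect of the minimum indexed length can be illustrated with a toy index. This is a minimal sketch only, not how Lucene or any vendor's engine is implemented; the `build_index` function, the `min_token_len` parameter and the sample log lines are all made up for illustration.

```python
import re
from collections import defaultdict


def build_index(lines, min_token_len):
    """Map each lowercased token of at least min_token_len characters
    to the set of line numbers that contain it (hypothetical toy index)."""
    index = defaultdict(set)
    for line_no, line in enumerate(lines):
        for token in re.findall(r"[A-Za-z0-9_]+", line.lower()):
            if len(token) >= min_token_len:
                index[token].add(line_no)
    return index


# Invented sample log lines, for illustration only.
logs = [
    "Jan 10 03:12:44 host1 su: session opened for user root by alice",
    "Jan 10 03:13:02 host1 sshd: accepted password for bob",
]

index_3 = build_index(logs, min_token_len=3)  # indexes down to 3 characters
index_2 = build_index(logs, min_token_len=2)  # indexes down to 2 characters

print("su" in index_3)          # False: the two-character token was skipped
print(sorted(index_2["su"]))    # [0]
print(sorted(index_3["root"]))  # [0]
```

With a three-character minimum, a search for "su" finds nothing even though the log line was collected, which is the situation Tip #2 is meant to uncover.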
The number of characters that the indexing engine is set to index down to will affect the storage required for the index itself. While most vendors can compress collected logs at up to 20:1, they will average approximately 10:1 compression. That is to say, if they collect 10 GB of log data, they would typically require 1 GB of disk space to store it. However, this does NOT include the storage required for the index.
Tip #3 – Ask your potential vendor how much disk space their index will consume. Typically you will be looking for a 5:1 ratio; that is to say, for every 1 GB of compressed stored log data, the index should take up a maximum of 5 GB of disk space.
You would not typically store your index on external storage, but for scalability you should be able to store your collected, compressed log data on external storage such as a SAN or fibre-connected storage.
Tip #4 – Ask your potential vendor if you can store the log data on a SAN or fibre-connected disk storage.
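The compression and index ratios above can be combined into a rough sizing calculation. The sketch below simply applies the figures quoted in the text (10:1 average compression, index up to 5 times the compressed size); the function name and its parameters are hypothetical, and real figures should come from the vendor.

```python
def estimate_storage_gb(raw_gb, compression_ratio=10.0, index_ratio=5.0):
    """Rough disk estimate from the ratios quoted in the text.

    compression_ratio: raw size divided by compressed size (assumed 10:1 average).
    index_ratio: maximum index size per unit of compressed data (assumed 5:1).
    """
    compressed_gb = raw_gb / compression_ratio
    max_index_gb = compressed_gb * index_ratio
    return {
        "compressed_gb": compressed_gb,
        "max_index_gb": max_index_gb,
        "total_gb": compressed_gb + max_index_gb,
    }


estimate = estimate_storage_gb(10)
print(estimate)
# {'compressed_gb': 1.0, 'max_index_gb': 5.0, 'total_gb': 6.0}
```

Under these assumptions, 10 GB of raw logs needs about 1 GB for the compressed logs plus up to 5 GB for the index, so the index, not the log data, dominates the disk budget.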
Reporting
An important function of the log management layer is the ability to report on the collected logs. You should always be able to collect logs without dropping them, even if you exceed the licensing maximums, and, more importantly, you should always be able to report on the collected logs.
This would not be the case if you are using a vendor that stores logs in a relational database at the log management layer. The reason is that relational databases require structured data, which means the collected log data must be understood before it can be inserted into the relational database.
Vendors typically achieve this by writing Regular-Expressions (RegEx) that "understand" the log; that is to say, the vendor writes RegEx to break the log into parts, such as the source IP, the username, the group name, the domain name and so on. Having RegEx to understand your logs is not a bad thing; in fact, it will help with your analysis. However, you must ensure that if the vendor comes across a new log source they have not seen previously, and therefore has no RegEx rule to process the log, they will still collect the log and index it for reporting purposes.
Tip #5 – Ask your potential vendor if the log is still collected and indexed for reporting purposes even if they do not have a Regular-Expression processing rule for the log.
It is important that logs are collected for forensic purposes and are not dropped if the log format changes slightly. If your vendor relies on Regular-Expressions to understand the logs at the log management layer BEFORE it can store them, you are likely to drop logs in the long run, as log formats will change and you will acquire new equipment that generates logs in a format the vendor does not understand.
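The behaviour to look for, parse when a rule exists but never drop a log when it does not, can be sketched as follows. The rule, the field names and the sample log lines are hypothetical and only illustrate the idea; no particular vendor's pipeline is implied.

```python
import re

# Hypothetical parsing rule for an sshd-style authentication line.
SSHD_RULE = re.compile(
    r"(?P<action>Accepted|Failed) password for (?P<username>\S+) "
    r"from (?P<source_ip>\d{1,3}(?:\.\d{1,3}){3})"
)


def ingest(raw_line, store):
    """Always store the raw line; attach parsed fields only if a rule matches."""
    match = SSHD_RULE.search(raw_line)
    record = {
        "raw": raw_line,                 # kept regardless of parsing outcome
        "parsed": match is not None,
        "fields": match.groupdict() if match else {},
    }
    store.append(record)
    return record


store = []
# Invented sample lines: one the rule understands, one from an unknown source.
known = ingest("sshd[411]: Failed password for alice from 10.0.0.5 port 2222 ssh2", store)
unknown = ingest("newfw01 action=deny src=10.0.0.9 dst=10.0.0.1", store)

print(known["fields"])
# {'action': 'Failed', 'username': 'alice', 'source_ip': '10.0.0.5'}
print(unknown["parsed"], len(store))
# False 2
```

The unknown line has no structured fields, but it is still stored in raw form, so it remains available for indexing, reporting and forensic review until a rule is written for it.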
Reports should be available in HTML, PDF, RTF, CSV and XLS formats, and it should be possible to email them, upload them to a server and schedule them to run overnight. Most reports will be large and take some time to execute, so it is important that they can be scheduled.
Alerting at the log management layer is not critical, as clever alerting would typically occur at the Security Event Management (SEM) layer. Basic alerting at this layer would still be nice to have if the log manager does not have a built-in SEM solution. Most vendors now sell SIM and SEM together as a single solution, and alerting is more critical at the SEM layer.
Chain of Custody
It is important that any logs collected can be relied on for forensic purposes. This means that once a log has been collected, preferably in real time, it should be stored and digitally signed. The digital signature should use a minimum of SHA-256, with signatures stored independently of the log data, or at least protected via strong permissions.
While most regulations and standards, such as PCI DSS, GPG13 or SOX, will require digital signing of logs to prove chain of custody, encryption of logs at rest is optional. Having said that, in highly protected environments the option to encrypt is nice to have.
Tip #6 – Ask your potential vendor if the solution digitally signs collected logs at a minimum of SHA-256 and can optionally encrypt the logs at rest.
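The signing requirement described above can be illustrated with a small tamper-detection sketch. It uses an HMAC with SHA-256 from the Python standard library as a stand-in for a digital signature; a real product would more likely use asymmetric signatures and proper key management. The key, the function names and the sample log line are assumptions made for illustration.

```python
import hashlib
import hmac


def sign_log(log_bytes, key):
    """Return a hex HMAC-SHA256 tag over the log data.

    HMAC is a keyed stand-in here; it is not an asymmetric digital signature.
    """
    return hmac.new(key, log_bytes, hashlib.sha256).hexdigest()


def verify_log(log_bytes, key, expected_hex):
    """Check the log data against a stored tag using a constant-time comparison."""
    return hmac.compare_digest(sign_log(log_bytes, key), expected_hex)


# Demo key and log line are invented; a real key must be protected and
# the resulting tag stored separately from the log data.
key = b"demo-key-not-for-production"
log_bytes = b"Jan 10 03:12:44 host1 su: session opened for user root by alice\n"

signature = sign_log(log_bytes, key)

print(len(signature))                                       # 64 hex characters
print(verify_log(log_bytes, key, signature))                # True
print(verify_log(log_bytes + b"tampered", key, signature))  # False
```

Keeping the tag apart from the log file, as the text recommends, means that someone who can alter the stored logs cannot also silently replace the value used to detect the change.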