Harness machine data, that’s how.
by Joe Goldberg, Chief Security Evangelist at Splunk
As organisations across the world embrace online channels to market, the bad guys of the internet are being rewarded with more avenues through which to breach businesses. Open networks, online payment portals and even Point-of-Sale devices are leaving organisations open to more advanced and persistent threats.
Such threats are coming from hacktivists, cybercriminals, malicious insiders, and nation states. Aided by speed, persistence and smarts, they adeptly penetrate an organisation and exfiltrate confidential data without alerting traditional security software tools.
Companies’ security software tools are failing to detect threats because of an increase in spear phishing and social engineering, which take advantage of human weaknesses to break through even hardened perimeters. The perpetrators also rely on custom, constantly changing malware to avoid detection by the anti-malware solutions currently on the market. Once inside an organisation, the hackers can use tools such as key loggers and password hash crackers to obtain legitimate, privileged credentials and move with impunity. The hackers typically infect dozens of machines with a variety of backdoors, so eradicating them is difficult.
Cybercrime is booming and businesses, rightly, feel on the back foot.
It’s bad news, but there is a solution.
Instead of waving the white flag and admitting defeat, businesses can take steps to create more sophisticated security strategies to defeat these advanced threats. One way is to spot abnormal activity on a network – anything that deviates from the baseline of what would be expected for an average user or IP address. If you can home in on these deviations and outliers, then you can better detect and defeat the threats.
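To make the idea of a baseline concrete, here is a minimal Python sketch of one possible approach. It assumes activity has already been reduced to a daily event count per user, and it flags anyone whose count today sits several standard deviations away from their own history. The function name, the threshold of 3 and the sample figures are illustrative assumptions, not something taken from a particular product.

```python
from statistics import mean, stdev

def find_outliers(daily_counts, today, threshold=3.0):
    """Return {user: z_score} for users whose count today deviates from their baseline.

    daily_counts: {user: [count_day1, count_day2, ...]}  (historical baseline)
    today:        {user: count_today}
    threshold:    number of standard deviations treated as abnormal (assumed value)
    """
    outliers = {}
    for user, history in daily_counts.items():
        if len(history) < 2:
            continue  # not enough history to form a baseline
        mu = mean(history)
        sigma = stdev(history)
        observed = today.get(user, 0)
        if sigma == 0:
            # Perfectly flat baseline: any change at all counts as a deviation.
            if observed != mu:
                outliers[user] = float("inf")
            continue
        z = (observed - mu) / sigma
        if abs(z) >= threshold:
            outliers[user] = z
    return outliers

# Hypothetical sample data: events per day for two users.
baseline = {
    "alice": [10, 12, 11, 9, 10],
    "bob": [5, 6, 5, 4, 5],
}
today = {"alice": 11, "bob": 60}

flagged = find_outliers(baseline, today)
print(sorted(flagged))
```

In this sample only "bob" is flagged, because 60 events is far outside his usual four to six a day, while "alice" stays within her normal range. A real deployment would choose the statistic and threshold to suit the data being monitored.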
To spot the outliers, first you need a way of aggregating the machine data or logs generated by your IT infrastructure, at both the network and the endpoint. It is vital that this unstructured machine data is collected because it provides all the event detail from security sources like firewalls, anti-malware and IDS, as well as non-security sources like Windows event logs, DNS, web logs, and email logs. This data can amount to terabytes per day and fits the definition of “big data” – data of such high variety, velocity, and volume that it overwhelms traditional data stores. A big data platform is needed to index this massive amount of unstructured machine data.
Secondly, you need a way to do advanced correlations and statistical analysis on this data in real time. This will help to connect the dots and expose the minute fingerprints of an advanced threat hiding in a sea of seemingly harmless event data. Real-time alerts and reports should also be set up to highlight potential threats as they occur. Advanced threats can come in all different shapes and sizes, depending on their source and creator, and it’s difficult to know what anomalies to look for. In order to set up effective searches to detect threats, you need to understand which of your assets and employees are most valuable, and how they might be targeted. Big data platforms possess the flexibility needed to perform these advanced correlations and analysis, as well as integration with asset and employee information.
What does an outlier representing an advanced threat look like in machine data? There is no magical, short list of the events that represent an advanced threat. Security practitioners need to “think like a criminal” and be creative to build real-time correlations that identify the behaviour.
Businesses can be forgiven for not knowing where to start though. Much security guidance is hypothetical – or based on ‘what not to do’. So, let’s bring this advice to life with real-world examples of how companies are using machine data to help spot unknown security threats.
Scenario one
One of our biggest financial customers recently detected suspicious behaviour by identifying that an internal user was logged into a Windows endpoint with a default name of ‘administrator.’ All users should have a unique user name, rather than ‘root’ or ‘administrator’. This raised a red flag with IT Security.
At the same time, the customer’s endpoint-based anti-malware software detected malware running on the same endpoint. Malware means “malicious software” and is a red flag because it is often used by hackers as part of the data theft process.
Finally, the customer used a data loss prevention tool (in this case Snort, an intrusion detection and prevention system) to identify unencrypted credit card numbers leaving the organisation from the above ‘administrator’ machine to a suspected command and control server. This data loss of credit cards was a major red flag to the business that something had gone wrong.
The fact that these three events happened on the same machine in a short time period indicates a hacker inappropriately logged into the machine, probably using improperly created or escalated credentials. They probably then put malware on the machine, perhaps a backdoor to remotely connect back to the machine later, and then exfiltrated credit card data from the machine. The credit cards may have then been used for illegal or fraudulent purposes.
So, what did this company do? They took this pattern of data loss across three different data sources and turned it into a correlation search to automatically detect and alert on this pattern in real-time should it re-appear. They also attached a simple script to this alert to automatically block all external connections from the infected machine. This way future data theft could immediately be detected and blocked. The customer also ran a historical search to see if other internal machines had connected to the external IP of the command and control server, and saw several had. These machines were assumed to be compromised and were remediated appropriately.
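The correlation described in this scenario can be sketched in a few lines of Python. This is not the customer’s actual search, only an illustration of the logic under stated assumptions: events from the three sources have already been normalised into dictionaries with `time`, `host` and `type` fields, and the event type names, the 30-minute window, the host names and the IP address are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical event type names, one per data source in the scenario.
REQUIRED = {"default_admin_login", "malware_detected", "card_data_egress"}

def correlate(events, window=timedelta(minutes=30)):
    """Return hosts on which all REQUIRED event types occur within `window`."""
    by_host = defaultdict(list)
    for ev in events:
        if ev["type"] in REQUIRED:
            by_host[ev["host"]].append((ev["time"], ev["type"]))

    hits = set()
    for host, items in by_host.items():
        items.sort()  # chronological order
        for i, (start, _) in enumerate(items):
            seen = set()
            for t, kind in items[i:]:
                if t - start > window:
                    break  # outside the window that opens at `start`
                seen.add(kind)
            if seen == REQUIRED:
                hits.add(host)
                break
    return hits

def hosts_contacting(connections, external_ip):
    """Historical search: internal hosts that ever connected to `external_ip`."""
    return {c["src_host"] for c in connections if c["dest_ip"] == external_ip}

# Hypothetical sample data.
t0 = datetime(2024, 1, 1, 9, 0)
events = [
    {"time": t0, "host": "pc-17", "type": "default_admin_login"},
    {"time": t0 + timedelta(minutes=5), "host": "pc-17", "type": "malware_detected"},
    {"time": t0 + timedelta(minutes=12), "host": "pc-17", "type": "card_data_egress"},
    {"time": t0, "host": "pc-22", "type": "malware_detected"},
    {"time": t0 + timedelta(hours=3), "host": "pc-22", "type": "card_data_egress"},
]
connections = [
    {"src_host": "pc-17", "dest_ip": "203.0.113.9"},
    {"src_host": "pc-40", "dest_ip": "203.0.113.9"},
    {"src_host": "pc-22", "dest_ip": "198.51.100.4"},
]

suspects = correlate(events)
also_exposed = hosts_contacting(connections, "203.0.113.9")
print(suspects, also_exposed)
```

Here only "pc-17" matches the full pattern, while the historical lookup also surfaces "pc-40" as a machine that contacted the same external address. In practice the alert would trigger a response action, such as the blocking script mentioned above, rather than a print statement.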
This threat detection and remediation helped eliminate the real costs related to the loss of customer credit card data, including re-issuing credit cards, bad publicity, unhappy customers taking their business elsewhere, customer lawsuits, fines for PCI non-compliance – the list goes on.
Scenario two
Another of our customers has used Splunk to prevent spear-phishing attacks on internal employees.
The company in question was using Splunk to analyse its machine data and help keep track of all the external email domains that were sending emails into the company, all external websites being visited by internal employees and all the services and executables running on internal machines. By doing this, the business achieved intelligence about the number of emails received from each external domain, and the number of times employees visited external web domains.
This information helped the company identify the sequence of events that led to the theft of confidential information. On the day of the attack, an email reached an internal employee from an external email domain that had never or rarely been seen before. That same employee then visited a website that had never or rarely been visited by internal employees. At the same time, a service started up on the employee’s machine that was never or rarely seen in the organisation.
These three events happening on the same machine in a short time period indicated a hacker had performed a spear-phishing attack. This then led to the employee visiting a site with malware, likely a remote access toolkit, which was installed on the employee’s machine. After experiencing the event and identifying the cause, the customer was able to set up a custom search on our platform which meant they would be automatically alerted should this combination of events happen again.
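The “never or rarely seen” test at the heart of this scenario can be illustrated with a short Python sketch. It assumes the organisation already keeps a history of sender domains, visited web domains and running services, and it treats a value as rare if it has been seen at most once before. The field names, the cut-off and the sample values are hypothetical and chosen only for illustration.

```python
from collections import Counter

def rare_values(history, observed, max_seen=1):
    """Return the observed values that appear at most `max_seen` times in history."""
    counts = Counter(history)
    return {v for v in observed if counts[v] <= max_seen}

def is_suspicious(host_events, baselines, max_seen=1):
    """True when the host shows at least one rare value in every tracked field."""
    return bool(baselines) and all(
        rare_values(baselines[field], host_events.get(field, []), max_seen)
        for field in baselines
    )

# Hypothetical organisation-wide history for each tracked field.
baselines = {
    "email_domain": ["partner.example"] * 40 + ["supplier.example"] * 25,
    "web_domain": ["news.example"] * 100 + ["intranet.example"] * 300,
    "service": ["spooler"] * 500 + ["updater"] * 480,
}

# Activity seen on two employee machines in the same short time period.
victim = {
    "email_domain": ["invoice-portal.example"],
    "web_domain": ["cdn-update.example"],
    "service": ["svch0st"],
}
normal = {
    "email_domain": ["partner.example"],
    "web_domain": ["cdn-update.example"],
    "service": ["spooler"],
}

print(is_suspicious(victim, baselines), is_suspicious(normal, baselines))
```

The first machine is flagged because all three signals are rare at once. The second has visited an unfamiliar site but shows nothing else unusual, so it is not flagged. Requiring the combination is a design choice that keeps noise down, since any one rare event on its own is common in a busy network.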
So how can machine data help detect unknown attacks?
In both of the scenarios we’ve looked at here, cross-product machine data was correlated to connect the dots and find the proverbial needle in the haystack. One example used “security” data and the other used “non-security” data.
So what about traditional Security Information and Event Management (SIEM) products? Unfortunately the issue with them is that they cannot aggregate and correlate the massive amounts of data we are discussing. Their fixed schema, SQL databases and physical appliances severely limit their scale, what they can ingest, and how fast they can search it. Also, their inflexible search and reporting capabilities limit advanced correlations or analytics to detect advanced threats.
In response, over the last few years, new big data security platforms have emerged as a new weapon for forward-thinking organisations. These platforms have levelled the playing field and made it possible to detect advanced threats early. These systems are able to scale up to 100 terabytes or more per day and ingest all types of machine data, without a SQL datastore or fixed schema. They also leverage distributed search for fast real-time searches and alerts, use statistics, math and baselining to spot anomalies and deviations, scale horizontally by adding more indexers or nodes and install on commodity hardware. Splunk and Hadoop are two technologies leading the charge in this space.
Leading organisations adopting big data include CedarCrestone, which hosts ERP environments. CedarCrestone discovered that it was almost impossible to get its log data into a traditional SIEM and then parse and correlate that data. So it turned to a big data security platform so that it could easily ingest this machine data to monitor for both known and unknown threats, and also perform comprehensive security investigations. The identification of unknown threats includes monitoring for ports and services that have changed or appear to be unauthorised or misconfigured – these are the possible fingerprints of an advanced threat.
Not only will these big data platforms help spot advanced threats, but they can be used for forensics, incident investigations and fraud detection. With all the historical data indexed in these platforms, extending them to other complementary use cases is logical and straightforward. They can even extend into non-security use cases such as IT operations or application management.
While there is no “silver bullet” for advanced threat detection, intelligence that comes from ‘big data’, and specifically machine data, will become integral to shifting the balance of power between advanced threats and business – introducing a more level playing field through the ability to speed up response times and identify security risks before they have a major impact. This new weapon of intelligence is vital to detect and defeat hacktivists, cybercriminals, malicious insiders, and nation states – rapidly evolving threats that are harder to detect and are growing in number.
For more information about the author and Splunk, visit www.splunk.com