Splunk Inc.

31/07/2024 | News release | Distributed by Public on 31/07/2024 15:20

Driving vSOC Detection with Machine Learning

In the blog titled "Driving the vSOC with Splunk," we discussed how automotive APIs are one of the leading attack vectors in the smart mobility ecosystem. The risk is clear: a threat actor who can remotely exploit an API at scale is in a position to damage a brand or demand a ransom. Of course, the power of the Splunk platform is the ability to create any use case, from any data, at massive scale. In this blog, we will take a deep dive into an API security use case, using machine learning to detect API anomalies and correlating those detections in Splunk Enterprise Security to detect a threat actor in motion.

Develop the Base Search

The first step is to develop the base search that will be used to train the ML model. Here you can see a sample event showing an API request to remotely unlock a vehicle.

Under normal circumstances, vehicle owners send requests like this multiple times a day using a smartphone app. With this in mind, our goal is to train our ML model on what normal behavior looks like. To do this, we must consider a number of factors, such as:

  • Over what span of time do we measure behavior?
  • How much history should we train against?
  • What entity (user, vehicle, IP address) should we measure against?
  • Should we consider entity-based or peer group analysis?

In the following example we are measuring user behavior over each day (span=1d), training with 90 days of history (earliest=-91d@d latest=-1d@d), and splitting the results by an entity (src_ip) and peer group (fleet).
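The aggregation that this base search performs can be sketched outside of Splunk. The following Python is only an illustration of the logic, not the SPL from the original post: the event layout (`time`, `src_ip`, `fleet`), the sample values, and the function name `daily_counts` are all assumptions made for the example. It counts requests per day, per source IP, and per fleet over a 90-day window that ends at midnight yesterday.

```python
from collections import Counter
from datetime import datetime, timedelta


def daily_counts(events, now, history_days=90):
    """Count requests per (day, src_ip, fleet) over the training window.

    Mirrors span=1d with earliest=-91d@d latest=-1d@d: the window starts at
    midnight 91 days ago and ends at midnight yesterday (exclusive).
    """
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    earliest = midnight - timedelta(days=history_days + 1)
    latest = midnight - timedelta(days=1)
    counts = Counter()
    for event in events:
        ts = event["time"]
        if earliest <= ts < latest:
            counts[(ts.date(), event["src_ip"], event["fleet"])] += 1
    return counts


# Hypothetical sample events; the field names are assumptions.
now = datetime(2024, 7, 31, 15, 20)
events = [
    {"time": datetime(2024, 7, 29, 8, 5), "src_ip": "203.0.113.7", "fleet": "sedan"},
    {"time": datetime(2024, 7, 29, 18, 40), "src_ip": "203.0.113.7", "fleet": "sedan"},
    # Falls after the end of the window, so it is excluded.
    {"time": datetime(2024, 7, 30, 9, 0), "src_ip": "203.0.113.7", "fleet": "sedan"},
    # Falls before the start of the window, so it is excluded.
    {"time": datetime(2024, 4, 1, 9, 0), "src_ip": "198.51.100.2", "fleet": "truck"},
]
counts = daily_counts(events, now)
```

Ending the window at yesterday's midnight keeps partially collected days out of the training data, which would otherwise pull the learned baseline downward.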

Train the Machine Learning Model

With the base search complete we are going to leverage the Splunk Machine Learning Toolkit (MLTK) and its DensityFunction algorithm to dramatically simplify the development process. We will build a peer group model as it allows us to measure behavior by clustering data based on the class of vehicle. This could be valuable as the volume of API activity may differ across various fleets of vehicles.

To get started, open the MLTK, create a new smart outlier detection experiment, then paste in our base search. In the detect outliers window we'll analyze the count field, split by the fleet field (peer group), then choose the distribution type that fits the data curve (or auto if you don't yet know). Finally, choose the threshold tolerance and press detect outliers. Now we can inspect the trained DensityFunction model. The MLTK provides numerous visualizations to help understand the model and its properties. For example, this histogram helps us visualize the data distribution and the curve that DensityFunction approximated.
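To build intuition for what a peer group density model learns, here is a minimal Python sketch. It is not the MLTK implementation: it assumes every fleet's daily counts are roughly normal (DensityFunction can fit other distribution types), and the function name `fit_peer_groups`, the fleet names, and the sample counts are hypothetical.

```python
from collections import defaultdict
from statistics import NormalDist


def fit_peer_groups(rows, upper_threshold=0.01):
    """Fit one normal distribution per fleet and derive an upper boundary.

    rows: iterable of (fleet, daily_count) pairs.
    Returns {fleet: boundary}; counts above the boundary fall in the top
    `upper_threshold` fraction of that fleet's fitted distribution.
    """
    by_fleet = defaultdict(list)
    for fleet, count in rows:
        by_fleet[fleet].append(count)
    model = {}
    for fleet, counts in by_fleet.items():
        # Needs at least two samples with non-zero variance per fleet.
        dist = NormalDist.from_samples(counts)
        model[fleet] = dist.inv_cdf(1 - upper_threshold)
    return model


# Hypothetical training data: trucks are normally far busier than sedans.
rows = [("sedan", c) for c in (4, 5, 6, 5, 4, 6)]
rows += [("truck", c) for c in (20, 22, 18, 21, 19, 20)]
model = fit_peer_groups(rows)
```

Because each fleet gets its own boundary, a daily count that is routine for a busy fleet can still be flagged when it comes from a fleet that is normally quiet.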

To get this into production you'll need to use the fit command to train the model. To start, press the "SPL" button and copy the syntax that the MLTK generated. Ultimately there are several options for the algorithm that we'll want to customize. Take a read through the DensityFunction docs, tune as needed, and test the results. For example, here I changed the auto-generated "threshold=" to "upper_threshold=" to ensure I only received anomalies for high values.
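The effect of switching from a two-sided threshold to an upper-only threshold can be illustrated with a small Python sketch. It assumes a normal distribution and assumes the two-sided threshold is split evenly between both tails; check the DensityFunction docs for how your chosen distribution type is handled. The function name `boundaries` and the distribution parameters are hypothetical.

```python
from statistics import NormalDist


def boundaries(dist, threshold=None, upper_threshold=None):
    """Return (lower, upper) outlier boundaries; None means no boundary on that side."""
    if upper_threshold is not None:
        # All of the outlier area sits in the high tail.
        return (None, dist.inv_cdf(1 - upper_threshold))
    # Assumption: the outlier area is split evenly between both tails.
    half = threshold / 2
    return (dist.inv_cdf(half), dist.inv_cdf(1 - half))


# Hypothetical fleet with a mean of 20 requests per day.
dist = NormalDist(mu=20, sigma=5)
two_sided = boundaries(dist, threshold=0.01)
one_sided = boundaries(dist, upper_threshold=0.01)
```

With the upper-only setting, unusually low counts are never reported, and the whole outlier budget is spent on the high-volume behavior this use case cares about.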

Finally, schedule this command to run as a report on a periodic basis. This ensures the model is continuously updated with the latest data.

Create an ML Detection Using Splunk Enterprise Security

After the model has been published it is ready to use for active detection within Enterprise Security (ES). By modifying the search command, we can apply the ML model and search for outliers on a scheduled basis.

In this example, we see a source IP that issued a strangely high volume of API requests against a number of different vehicle IDs. Notice that the "count" of 88 is greater than the trained "BoundaryRange" of 43.7. Looks suspicious!
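The detection step reduces to comparing each new daily count against the boundary trained for its peer group. The Python sketch below illustrates that comparison with the numbers from the example above (a count of 88 against a boundary of 43.7). The fleet name, the IP addresses, and the function name `detect_outliers` are assumptions, and the real detection would apply the saved model in SPL.

```python
def detect_outliers(rows, boundaries):
    """Return rows whose count exceeds the trained upper boundary for their fleet."""
    outliers = []
    for row in rows:
        boundary = boundaries.get(row["fleet"])
        if boundary is not None and row["count"] > boundary:
            outliers.append({**row, "boundary": boundary})
    return outliers


# 43.7 and 88 come from the example; the fleet name and IPs are hypothetical.
boundaries = {"sedan": 43.7}
rows = [
    {"src_ip": "203.0.113.7", "fleet": "sedan", "count": 88},
    {"src_ip": "198.51.100.2", "fleet": "sedan", "count": 6},
]
outliers = detect_outliers(rows, boundaries)
```

Rows from a fleet that has no trained boundary are skipped rather than flagged, which avoids alerting on peer groups the model has never seen.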

Apply the Power of Risk Based Alerting

To be effective, security teams must have a system to surface threats when there are more signals than the human eye can track. This is especially true with machine learning detections. When there are trillions of events per day, even the rarest of anomalies are likely to occur. This is where the power of risk-based alerting (RBA) in Enterprise Security comes in handy. ES can analyze risk data, correlate related events, and present high-fidelity stories to security analysts. Further, it allows vSOC teams to tag risk data with the associated MITRE ATT&CK and Auto-ISAC ATM technique IDs. In this example, we see a correlation between an unexpected spawning of a root shell and anomalous API activity, both associated with the same vehicle.
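The correlation idea behind RBA can be sketched in a few lines of Python. This is a simplified illustration, not how ES is implemented: the risk scores, the threshold values, the vehicle identifiers, and the function name `correlate_risk` are all hypothetical. It groups risk events by vehicle and surfaces only the vehicles that accumulate enough risk from more than one distinct detection.

```python
from collections import defaultdict


def correlate_risk(risk_events, score_threshold=100, min_detections=2):
    """Group risk events by vehicle and return the vehicles worth an analyst's time.

    A vehicle is surfaced only if its total risk score reaches
    `score_threshold` AND at least `min_detections` distinct detections fired.
    """
    scores = defaultdict(int)
    detections = defaultdict(set)
    for event in risk_events:
        scores[event["vehicle"]] += event["score"]
        detections[event["vehicle"]].add(event["detection"])
    return {
        vehicle: {"score": scores[vehicle], "detections": sorted(detections[vehicle])}
        for vehicle in scores
        if scores[vehicle] >= score_threshold
        and len(detections[vehicle]) >= min_detections
    }


# Hypothetical risk events; scores and identifiers are made up for illustration.
risk_events = [
    {"vehicle": "vehicle-001", "detection": "Unexpected root shell", "score": 60},
    {"vehicle": "vehicle-001", "detection": "Anomalous API activity", "score": 50},
    {"vehicle": "vehicle-002", "detection": "Anomalous API activity", "score": 50},
]
notables = correlate_risk(risk_events)
```

A single ML anomaly on its own stays below the bar, so analysts only see the vehicle where independent signals line up.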

Real-life machine learning detections take a thoughtful approach, time, and testing to implement effectively. Splunk can help by providing a scalable data platform, simplifying machine learning, adding security domain-specific tools, and providing experts to help guide you on the journey. Together, let's drive your vSOC with Splunk.