Source crafting: Setting standards in cyber threat intelligence
By Jörg Abraham, Senior Threat Analyst
So your SOC operation uses threat intelligence on a daily basis? That’s good. You’ve also established a cyber threat intelligence (CTI) practice in your organization? And have a Threat Intelligence Platform to support your analysts’ work through the entire CTI lifecycle? That’s awesome.
On the downside, you may be processing multiple feeds and realize that structured intelligence from vendor A differs significantly from that of vendor B? Or perhaps your analysts are having a hard time aligning the content from each feed with your own STIX data model? Welcome to the world of source crafting…
What is source crafting?
In a nutshell: Source crafting is the practice of reviewing intelligence from many open, commercial, or closed sources in its raw format, and modelling the content into a consistent data model using Structured Threat Information Expression (STIX).
Source crafting is one of the many value-adding practices that EclecticIQ Fusion Center offers its customers.
In this blog post we will describe what source crafting is and why we believe it is so important. We will also share some examples to help illustrate our approach.
Setting the scene: What is Fusion Center?
EclecticIQ Fusion Center fuses open, community and commercial sources into a unified delivery model that includes qualification, unified tagging, relevancy determination and multi-format delivery.
One of the core concepts of EclecticIQ Fusion Center is to constantly evaluate intelligence sources using objective measures such as unique insights, corroboration with other sources, structuring and data correlation.
To meet this objective, Fusion Center analysts agreed on a common data model that every analyst in our team adheres to. The data model ensures that intelligence is represented in a structured fashion in our Knowledge Base (KB) and is consistently applied in our day-to-day work.
Structured intelligence is important because it lets an analyst working with our platform query intelligence from the KB. Moreover, it ensures that anyone can easily grasp the context of the provided information without reading through a long narrative report first.
However, the more feeds we onboarded into Fusion Center, the more we realized that data from each vendor differs in its structure (if structure exists at all) and that we need a process to translate the content so that it fits into our data model.
Quick wins: Entity Rules
The very first step in source crafting is to identify patterns, common strings or values that are provided in the intelligence feeds.
A very simple example is the two actor objects below that have been ingested from the same vendor feed. One of the actors is of type ‘Cyber Espionage Operation’, the second actor is of type ‘Hacktivist’.
EclecticIQ Fusion Center’s purpose is to deliver relevant intelligence to our customers (based on their intelligence requirements) and to reduce unnecessary noise.
Assume a customer would only be concerned about actor groups that fall under an ‘APT — Theme’ but is less worried about Hacktivists.
To deliver on our obligation, it is important for us to understand how a vendor classifies its data. This detailed understanding of feeds lets our analysts write a set of rules or queries (image 3) in EclecticIQ Platform.
Every STIX entity in our platform can be queried this way (because it is structured) and can be shared automatically with our customers. This form of automation is one of the cornerstones of the EclecticIQ Fusion Center.
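To make the idea of an entity rule more concrete, here is a minimal sketch in Python. It is not the EclecticIQ Platform rule or query syntax. The Actor class, the VENDOR_TYPE_TO_THEME mapping, the theme names and the actor names are all hypothetical and exist only to illustrate how a vendor's classification string can drive automatic selection for a customer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Actor:
    name: str
    vendor_type: str  # classification string exactly as the vendor supplies it


# Hypothetical mapping from vendor classification strings to our own themes.
VENDOR_TYPE_TO_THEME = {
    "Cyber Espionage Operation": "APT",
    "Hacktivist": "Hacktivism",
}


def matches_rule(actor, wanted_themes):
    """Return True if the actor's vendor type maps to a theme the customer wants."""
    theme = VENDOR_TYPE_TO_THEME.get(actor.vendor_type)
    return theme in wanted_themes


def select_for_customer(actors, wanted_themes):
    """Keep only the actors that match the customer's intelligence requirements."""
    return [a for a in actors if matches_rule(a, wanted_themes)]


# Placeholder actors, not real feed content.
actors = [
    Actor("Example Actor A", "Cyber Espionage Operation"),
    Actor("Example Actor B", "Hacktivist"),
    Actor("Example Actor C", "Unclassified"),
]

selected = select_for_customer(actors, {"APT"})
print([a.name for a in selected])
```

In this sketch, an actor whose vendor type is unknown to the mapping is never selected, which mirrors the goal of reducing noise: anything we do not yet understand stays out of the customer delivery until an analyst has reviewed it.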
If an intelligence feed lacks context, e.g. a plain list of IPs or domains, we would, first of all, not call it ‘intelligence’.
Secondly, the data might still be ingested into EclecticIQ Platform since the information could be relevant for future investigations. But unless it has been triaged, structured and contextualized by an analyst first, this data will not be sent to our customers.
Improve the Integration
In cases where the source data is more complex and cannot be evaluated with a single rule, Fusion Center analysts work closely with our integrations team. Many vendors make their intelligence data available via APIs and the results are offered as JSON. Take for example the following results from an API call.
The JSON contains a lot of information about the malware (njRAT), its related Indicators of Compromise (IOCs), the vendor’s confidence, timestamps and much more. However, it is not modelled in line with STIX.
During source crafting our analyst reviews the JSON to:
1. Identify fields in the JSON that should be mapped to an appropriate STIX characteristic
2. Create new STIX objects wherever needed
After source crafting, the result (structured intelligence) for the above JSON would look like the graph on the left.
The indicator “….ore2xd.ddns.net” links to several hashes, which we represent as separate Malware TTPs (Tactics, Techniques and Procedures) in our data model (Malware Variant: njRAT). The two variants belong to the family object Malware: njRAT.
Each variant has distinct observables (hashes, port used) but also shares indicators like the C2 Domain (“….ore2xd.ddns.net”) and C2 IP 168.228.xxx.xxx.
Zooming into one of the variants, you can see that other details from the JSON have been mapped to the appropriate STIX characteristic, e.g. the malware name, malware type or confidence value.
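The mapping described above can be sketched roughly as follows. This is an illustration under stated assumptions, not our actual integration code. The JSON field names, the placeholder hashes, ports and domain, the entity ids and the relation labels ("indicates", "variant-of") are all hypothetical. The entities are plain dictionaries loosely modelled on the structure described in this post, not output from a STIX library.

```python
import json

# Hypothetical vendor response; field names and values are placeholders,
# not the actual feed discussed above.
RAW = """
{
  "malware_family": "njRAT",
  "malware_type": "Remote Access Trojan",
  "confidence": "High",
  "c2_domain": "c2.example.net",
  "samples": [
    {"sha256": "placeholder-hash-1", "port": 1177},
    {"sha256": "placeholder-hash-2", "port": 5552}
  ]
}
"""


def craft(record):
    """Return (entities, relations) built from one vendor record."""
    name = record["malware_family"]

    # Step 2 of the list above: create new objects where the JSON has none.
    family = {"id": "ttp-family", "kind": "ttp", "title": f"Malware: {name}"}
    indicator = {
        "id": "indicator-c2",
        "kind": "indicator",
        "title": f"C2 Domain: {record['c2_domain']}",
        "confidence": record["confidence"],
    }
    entities = [family, indicator]
    relations = []

    # One variant per sample: each keeps its own observables, while all of
    # them share the C2 indicator and belong to the same family.
    for n, sample in enumerate(record["samples"], start=1):
        variant = {
            "id": f"ttp-variant-{n}",
            "kind": "ttp",
            "title": f"Malware Variant: {name}",
            # Step 1 of the list above: map JSON fields to characteristics.
            "malware_type": record["malware_type"],
            "confidence": record["confidence"],
            "observables": {"sha256": sample["sha256"], "port": sample["port"]},
        }
        entities.append(variant)
        relations.append((indicator["id"], "indicates", variant["id"]))
        relations.append((variant["id"], "variant-of", family["id"]))

    return entities, relations


entities, relations = craft(json.loads(RAW))
for source, label, target in relations:
    print(source, label, target)
```

Keeping the relations as a separate list of (source, label, target) tuples makes the resulting graph easy to inspect, and it reflects the point made earlier: once the content is structured, every entity and link can be queried on its own.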
Clearly, the integration efforts can only be as good as the raw data shared by the intelligence sources. While improving the integration is feasible for many vendors, the silver bullet is to reach out to the intelligence providers and work with them on proper STIX-qualified feeds.
EclecticIQ is currently running a pilot with Fox-IT to achieve exactly that. Fox-IT provides contextual intelligence to their customers but also generates a large amount of valuable threat data as a side product. Leveraging EclecticIQ’s expertise in structured intelligence, the goal is to process this data into a world-class STIX feed for customers.
At the moment, we cannot disclose more details on this engagement. Please stay tuned for updates and lessons learned from the pilot in future blog posts.
We hope you enjoyed this post. Follow us here on Medium for more interesting reads on Cyber Threat Intelligence or check out our resource section for whitepapers, threat analysis reports and more.