Evaluating Detection and Response: how MITRE ATT&CK helps

How to make sense of results

The problem

Some cynics claim that cyber security product evaluations are rigged. Like a food photographer using shoe polish to paint grill marks on a steak, some vendors misrepresent the quality of their products. Fortunately, there are ways of assessing the truth of vendors’ claims.

This article explains how you can test advanced cyber threat detection and response (D&R) solutions by using the MITRE ATT&CK framework.

Who tests cyber security products?

Cyber threats evolve quickly, so security testing is complicated and expensive. The skills and other resources needed to thoroughly test the security of a technology stack are often the preserve of national security organizations or very big companies. The rest have to rely on more cost-effective methods:

  • Laboratories that are paid to conduct tests on behalf of vendors and buyers, such as MRG Effitas, AV-Test, AV-Comparatives, SE Labs and the now-defunct NSS Labs. Although antivirus is a relatively straightforward tool to bench test, testing more complex detection capabilities can be painfully expensive.
  • Industry analysts who provide strategic analysis and market opinion on products, often without ever using them. User feedback is usually their most significant source of information, but that feedback is skewed by installed base, marketing spend and other factors, and it tends to provide poor visibility of new or currently deployed services.
  • Standards organizations that certify security products, such as AMTSO and MITRE.

Cyber security testing is problematic

Security testing is problematic because:

  1. IT security testing methodology is flawed:
    • Laboratory tests often focus on the sensitivity of products to threats, rather than how accurate they are. They may not measure the ratio of false positives to true results.
    • Tests often focus on differentiating products, rather than testing their ability to detect specific threats. Results from these tests do not support risk-based decision-making.
  2. Testing independence is an issue: there is an obvious conflict of interest when laboratories are paid to test their customers' products.
  3. Opacity:
    • Vendors and independent testers usually only cooperate for as long as the test results align with the vendors’ interests.
    • Many test results sit behind a paywall. A lack of publicly available security testing means nobody knows how well the products work.
  4. Testing only happens when there is a market for the results: there is little incentive for vendors to be transparent about, or provide evidence of, the security of their product. Some vendors maintain secrecy around their product’s capabilities in an alleged effort to protect security. Some market analysts employ a 'pay-to-play' model and use vendor selection criteria to favor those willing to pay to be evaluated, which rewards the better-funded or more mainstream vendors.
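
The difference between sensitivity and accuracy raised in the first point of the list above can be made concrete with a short Python sketch. The counts are hypothetical and do not come from any real test, and the function names are illustrative only.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of real threats that raised an alert (also called recall)."""
    return true_positives / (true_positives + false_negatives)


def precision(true_positives: int, false_positives: int) -> float:
    """Share of alerts that corresponded to a real threat."""
    return true_positives / (true_positives + false_positives)


# Hypothetical lab results for two products tested against 100 threats.
noisy = {"tp": 98, "fn": 2, "fp": 900}
quiet = {"tp": 90, "fn": 10, "fp": 10}

for name, r in (("noisy", noisy), ("quiet", quiet)):
    print(
        name,
        f"sensitivity={sensitivity(r['tp'], r['fn']):.2f}",
        f"precision={precision(r['tp'], r['fp']):.2f}",
    )
```

Under these assumed numbers, the "noisy" product looks better on a sensitivity-only test, yet roughly nine out of ten of its alerts are false positives. A test that reports only sensitivity would hide that.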


MITRE is a US-based not-for-profit organization that aims to ‘solve problems for a safer world’. MITRE Engenuity™, a subsidiary of MITRE Corporation, owns the ATT&CK® knowledge base and runs the ATT&CK Evaluations service.

MITRE ATT&CK is a globally accessible, continually updated knowledge base of known state-sponsored and criminal groups, including the tactics, techniques and procedures that they use. This knowledge base enables organizations to prioritize detection around the most persistent threats and threat groups. 

MITRE works with D&R vendors to quantitatively assess the security of their products. Each year, it evaluates the detection capability of participating vendors’ products. The products are tested against the tactics, techniques and procedures (TTPs) used by two major threat groups to determine whether the products can detect them. Products are scored based on the proportion of tests they detect.
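
The scoring described above amounts to a simple proportion. The following is a minimal Python sketch, assuming each test step is recorded as either detected or not. The step names and results are hypothetical; they do not reflect a real evaluation or MITRE's own data format.

```python
def detection_coverage(results: dict[str, bool]) -> float:
    """Return the fraction of test steps that produced a detection."""
    if not results:
        raise ValueError("no test steps recorded")
    return sum(results.values()) / len(results)


# Hypothetical outcome of five emulated steps for one product.
steps = {
    "initial-access": True,
    "execution": True,
    "persistence": False,
    "lateral-movement": True,
    "exfiltration": False,
}

print(f"coverage: {detection_coverage(steps):.0%}")  # coverage: 60%
```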

At WithSecure, we use MITRE ATT&CK to provide standard vocabulary and descriptions. We’ve also taken part in the MITRE ATT&CK evaluations since they started; they illuminate what participants can do and how they do it. We’ve taken part in the second and third rounds, and the fourth evaluation will be released soon.

Interpreting MITRE ATT&CK results

You should remember four things when reviewing the results of a MITRE ATT&CK evaluation:

  1. Less can be more: Vendors do not need to score 100% to be assured of detecting a particular threat. Attacks span multiple phases and activities, but the attack only needs to be detected in one place. A product that focuses on several high-fidelity early-stage detection techniques will probably provide better protection than one that generates more low-fidelity alerts across all attack stages.
  2. Coverage: The results of a single evaluation that tests detection capability against the TTPs used by one or two threat groups provide an imperfect measure of product quality. In the same way, it’s hard to predict who will win the football league by analyzing a single match.
  3. It’s the taking part that counts: Participating in a public test of your product’s detection capability costs a six-figure sum (in euros). Vendors participate because they want to be transparent about the quality of their product.
  4. Trends are important: Review vendor performance across current and previous evaluations to get the best measure of quality (and remember to account for the likely efficacy of the product in your own environment).
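
The first point in the list above can be illustrated with a Python sketch that separates step-level coverage from attack-level detection. The attack steps, the two example products and the rule that only a high-fidelity alert counts as catching the attack are all assumptions made for illustration, and the rule is a deliberate simplification.

```python
from dataclasses import dataclass


@dataclass
class StepResult:
    phase: str
    alerted: bool
    high_fidelity: bool = False  # assumption: alert is reliable enough to act on


def attack_caught(chain: list[StepResult]) -> bool:
    """Treat an attack as caught if any step raised a high-fidelity alert."""
    return any(s.alerted and s.high_fidelity for s in chain)


def step_coverage(chain: list[StepResult]) -> float:
    """Fraction of steps that raised any alert, regardless of fidelity."""
    return sum(s.alerted for s in chain) / len(chain)


# Hypothetical results for the same four-step attack against two products.
focused = [
    StepResult("initial access", True, high_fidelity=True),
    StepResult("execution", True, high_fidelity=True),
    StepResult("persistence", False),
    StepResult("exfiltration", False),
]
phases = ("initial access", "execution", "persistence", "exfiltration")
broad = [StepResult(p, True) for p in phases]  # low-fidelity alerts everywhere

for name, chain in (("focused", focused), ("broad", broad)):
    print(name, f"coverage={step_coverage(chain):.0%}", f"caught={attack_caught(chain)}")
```

In this invented scenario the "focused" product covers only half of the steps but still catches the attack early, while the "broad" product scores full coverage with alerts that an analyst may never act on.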

Other factors to consider when evaluating D&R solutions

Many factors go into choosing an effective, appropriate D&R solution to fit your specific needs.


The importance of useful datasets

Each detection technique relies on capturing and analyzing different datasets. Some organizations cannot collect the data needed for certain detection techniques, usually because of technical or performance limitations. The storage and analysis costs associated with telemetry may also be prohibitive.

  • Detection quality: High-fidelity detection is critical. Sometimes it is better to do without a dataset, detection rule or analytic that you know will be prone to false positives.
  • Investigation: Access to a broad set of data (such as process and memory execution, persistence mechanisms and user behavior) with consistent links within the data allows a hunter to jump quickly from one telemetry source to another to build an investigative picture.
  • Response: A detailed dataset enables you to identify all of the attacker's mechanisms for persisting in the victim’s systems and devise a containment plan to eliminate them all.
  • Threat hunting: Threat hunters identify and plug gaps in detection coverage. The more visibility and data available, the quicker the gap can be plugged with a new detection rule.
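
The dependency between detection techniques and the data they need can be sketched as a simple lookup in Python. The rule names and telemetry source names below are hypothetical placeholders, not a real product's schema.

```python
# Hypothetical mapping of detection rules to the telemetry each one needs.
RULE_REQUIREMENTS = {
    "suspicious-parent-child-process": {"process_events"},
    "in-memory-code-injection": {"process_events", "memory_events"},
    "new-autostart-entry": {"registry_events"},
    "unusual-logon-pattern": {"authentication_logs"},
}


def missing_telemetry(collected: set[str]) -> dict[str, set[str]]:
    """Map each rule to the telemetry it still lacks (empty set = usable)."""
    return {rule: needs - collected for rule, needs in RULE_REQUIREMENTS.items()}


# Assumption: this organization collects only process and authentication data.
collected = {"process_events", "authentication_logs"}

for rule, missing in missing_telemetry(collected).items():
    status = "usable" if not missing else f"missing {sorted(missing)}"
    print(f"{rule}: {status}")
```

A check like this makes the trade-off explicit: each telemetry source that is too costly to collect removes a known set of detection rules from consideration.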

Using threat intelligence to focus detection capability development

Attackers can theoretically use any TTP in the MITRE ATT&CK framework, but threat intelligence will tell you which ones are most likely.

Attackers avoid unnecessary work. In most attacks we see, attackers use only a subset of the ATT&CK techniques; the framework contains 59 persistence techniques, but most attacks we have seen involve only seven. Focusing resources on the most commonly used techniques will increase your detection rates and overall effectiveness.

Analyzing public breach reports can be a great way to learn more about which techniques attackers commonly use.
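
One way to turn breach reports into priorities is to count how often each technique appears. The Python sketch below assumes the technique IDs have already been extracted from each report. The IDs follow the ATT&CK naming scheme for persistence techniques, but the report contents are invented for illustration.

```python
from collections import Counter

# Hypothetical persistence techniques extracted from four breach reports.
reports = [
    ["T1547", "T1053"],
    ["T1053", "T1078"],
    ["T1547", "T1543", "T1053"],
    ["T1078"],
]


def rank_techniques(reports: list[list[str]]) -> list[tuple[str, int]]:
    """Count how many reports mention each technique, most common first."""
    # set() ensures a technique is counted once per report.
    counts = Counter(t for report in reports for t in set(report))
    return counts.most_common()


for technique, n in rank_techniques(reports):
    print(f"{technique}: seen in {n} of {len(reports)} reports")
```

The techniques at the top of such a ranking are candidates for the first detection rules to build and tune.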

Responding intelligently to threats

MITRE ATT&CK evaluations provide threat detection transparency, but they do not test the response capability of a product or service. Effective response is more about the people than the product. When detecting and responding to cyber threats, an ordinary tool in the hands of an experienced threat hunter will yield better results than an industry-leading tool in the hands of a novice SOC analyst.

Managed security services 

If you are considering a managed security service to monitor your IT networks and respond to threats, look for:

  • enthusiasm around understanding your company, its business goals and the market in which it operates
  • a commitment to your long-term interests as well as the partner's own
  • open, transparent communication
  • responsiveness and flexibility, underpinned by meaningful SLAs
  • a culture of continual improvement.

Other D&R capability components

Other D&R capability components you should consider include:

  • the concepts, principles, and operational policies and processes that govern how D&R will be managed
  • recruiting, training and support of relevant personnel
  • the support needed to deliver operational success, including organization, command and control, infrastructure, resilience and threat intelligence.

At the end of every serious cyber threat is a human adversary, which means that effective cyber security requires, for at least a few decades more, a human defender. But skilled personnel are hard to attract and retain.

Getting the desired outcomes

Many independent testing organizations provide visibility into D&R capabilities, but you should always prioritize the security outcomes for your business when choosing a D&R solution (either an MDR service or EDR/XDR technology). We at WithSecure are happy to help you get the best security outcomes with your chosen approach.


Paul Brucciani, Head of Product Marketing
