Frequently Asked Questions

General

The evaluations use adversary emulation, which is a way of testing "in the style of" a specific adversary. This allows us to select a relevant subset of ATT&CK techniques to test. To generate our emulation plans, we use public threat intel reporting, map it to ATT&CK, and then determine a way to replicate the behaviors.
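As a rough illustration of that mapping (not an actual MITRE Engenuity emulation plan format), a single emulation step might pair a behavior pulled from public reporting with the ATT&CK technique it maps to and a procedure for replicating it:

```python
# Hypothetical sketch of a single emulation-plan step; field names and values
# are illustrative only and do not reflect the format MITRE Engenuity uses.
emulation_step = {
    "source": "public threat intel report on the emulated adversary",
    "observed_behavior": "adversary enumerated running processes on compromised hosts",
    "attack_technique": {"id": "T1057", "name": "Process Discovery"},
    "replication": "run 'tasklist' from a command-line interface",
}

def techniques_in_scope(plan):
    """Collect the ATT&CK technique IDs an emulation plan will exercise."""
    return sorted({step["attack_technique"]["id"] for step in plan})

print(techniques_in_scope([emulation_step]))  # ['T1057']
```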
ATT&CK evaluations are built on the publicly available information captured by ATT&CK, but they are separate from the ongoing work to maintain the ATT&CK knowledge base. The team who maintains ATT&CK will continue to accept contributions from anyone in the community. The ATT&CK knowledge base will remain free and open to everyone, and vendor participation in the evaluations has no influence on that process.
  • Feedback. We are always looking for feedback on what works and what doesn’t in our results and methodology. Learning how you use the results and what you want to get out of them helps us shape our work to help you and your peers. Provide feedback.
  • Intel. We frame our evaluations in the context of the known threat to ensure our results are relevant and useful. These emulations are driven by available intel. By sharing your insights, you can help improve our plans. We welcome ideas for adversaries to emulate, as well as intel to help inform their potential use in our evaluation rounds.

Participation

Each evaluation type (e.g., Enterprise or ICS) and round (e.g., APT3 or TRITON) has information specific to those evaluations on this site. An overview page for each round provides basic high-level information, including the participants. For example, APT29's overview page is available here.
Each round's participation is independent.
Participation is focused on capability type (e.g., detection of ATT&CK behaviors) rather than market segment (e.g., EDR, EPP, etc.). If a technology addresses the capability being evaluated, it may be eligible to participate, though there may be additional technical requirements. For example, Enterprise evaluations have the following requirements:
  • Technology must address post-compromise behaviors as described by ATT&CK
  • Technology must deploy into the Microsoft Azure environment
  • Sensors/data beyond those provided by default in Azure must be provided by the vendor
  • For detection evaluations, capabilities that could prevent the successful execution of our emulation (e.g., protections, preventions, responses) must be disabled. The sensors that drive these actions can still be used as data sources to identify behavior (a hypothetical detection-only configuration is sketched after this list).
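As a rough illustration of that last requirement, a detection-only deployment might look something like the hypothetical configuration below; the setting names are invented for illustration and do not correspond to any particular product:

```python
# Hypothetical "detection-only" settings for an evaluation deployment.
# All keys are illustrative; real products expose their own configuration.
eval_config = {
    "block_on_detection": False,          # preventions/responses off so the emulation can complete
    "quarantine_suspicious_files": False,
    "collect_process_telemetry": True,    # the underlying sensors stay enabled as data sources
    "collect_network_telemetry": True,
}
```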
Vendor participation is subject to applicable legal restrictions, available resources, and other factors.
Visit the Get Evaluated section for more details on how to participate.
Vendors get a third-party evaluation of their capabilities' ability to address adversary behaviors, as described by ATT&CK. These evaluations are not ATT&CK certifications, nor are they a guarantee that you are protected against the adversary we are emulating. Adversary behavior changes over time. The evaluations provide vendors with insight into, and confidence in, how their capabilities map to ATT&CK techniques. Equally important, because we publicly release the results, we enable their customers, and potential customers, to understand how to use their tools to defend against ATT&CK-categorized behaviors.
Yes. There was significant demand for unbiased ATT&CK evaluations and we needed to create a mechanism to open up evaluations to the security vendor market. Participating companies understand that all results will be publicly released, which is true to our mission of providing objective insight.
All vendors receive the same information prior to evaluations. The majority of this information also gets posted on this site to ensure transparency. Vendors receive the scope of the evaluation (e.g., techniques that could be used), a named threat for whom the behaviors are being replicated, environment information, and a general idea of what data will be collected. Details on how the techniques will be implemented are not provided.
Rolling admissions participants had access to additional information during the evaluation process due to the launch of the website, which contained the initial cohort’s results and the methodology. Cybereason's and FireEye's feedback periods occurred after the launch of the ATT&CK Evaluations website. F-Secure's, McAfee's, and Palo Alto Networks' entire evaluation processes took place after the ATT&CK Evaluations website was launched.
Let us know your needs and any limitations you see in our current methodology. This will help us shape our evaluation road map.
No, all vendors signing up for the evaluation agree to have their results publicly released upon conclusion of their test.
Publicly released evaluations are the only vendor-paid evaluations offered at this time.

Process

The ATT&CK evaluations are based on a four-phased approach:
  1. Setup: The vendor installs their tool in a MITRE Engenuity provided cyber range.
  2. Evaluation: During a joint evaluation session, MITRE Engenuity adversary emulators ("red team") execute an emulation in the style of an adversary group, technique-by-technique. The vendor being tested will provide the personnel who review tool output ("blue team"). MITRE Engenuity provides the personnel to oversee the evaluation and facilitate communication between red and blue, as well as capture results ("white team").
  3. Feedback: Vendors are provided an opportunity to offer feedback on the preliminary results, but the feedback does not obligate MITRE Engenuity to make any modification to the results.
  4. Release: MITRE Engenuity publicly releases the evaluation methodology and results of the tool evaluations.
For additional details refer to our methodology.
MITRE Engenuity does not assign scores, rankings, or ratings. The evaluation results are available to the public, so other organizations may provide their own analysis and interpretation; these are not endorsed or validated by MITRE Engenuity.
The stoplight chart (which uses red, yellow, and green to indicate level of confidence for detection of techniques) has been used since ATT&CK's creation because it is a simple yet powerful way to understand ATT&CK coverage. While a stoplight chart may be useful to show coverage and gaps, we do not use this visualization because it is not granular enough to convey our results.
While we understand the importance of minimizing false positives, they are often tied to environment noise that is specific to an organization. To enable you to evaluate the results' applicability in your context, we address false positives indirectly in the following ways:
  1. Vendors are required to define how they configured their capabilities. With that provided configuration and the evaluation's results as a baseline, users can then customize detections to reduce false positives in their unique environment (a minimal tuning sketch follows this list).
  2. We articulate how the tool can perform detection. By releasing how to detect, as well as our methodology, organizations can implement their own tests to determine how the tools operate in their specific environment.
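As a minimal sketch of the tuning described in item 1, assuming hypothetical alert fields, an organization might suppress alerts that match an allowlist of activity known to be benign in its own environment:

```python
# Hypothetical post-processing step; the alert fields and allowlist entries are
# invented for illustration and are not part of any evaluation results format.
KNOWN_BENIGN = {
    ("powershell.exe", r"\\fileserver\scripts\nightly-backup.ps1"),  # sanctioned admin script
}

def suppress_known_benign(alerts):
    """Drop alerts whose (process, command line) pair is expected in this environment."""
    return [
        alert for alert in alerts
        if (alert["process"], alert["command_line"]) not in KNOWN_BENIGN
    ]
```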
There is no one way to look at detection data. Some key concepts to consider (illustrated in the sketch after this list) include:
  1. Availability — Is the detection capability gathering the necessary data?
  2. Efficacy — Can the gathered data be processed into meaningful information?
  3. Actionability — Is the provided information sufficient to act on?
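As a rough illustration of applying these three lenses (with invented field names, not an actual evaluation schema), a simple triage helper might look like this:

```python
# Hypothetical triage helper; the record fields are illustrative only.
def assess(detection):
    """Apply the availability / efficacy / actionability lenses to one detection record."""
    return {
        "availability": bool(detection.get("telemetry")),           # was the necessary data gathered?
        "efficacy": detection.get("technique_id") is not None,      # was it processed into meaningful information?
        "actionability": bool(detection.get("alert_description")),  # is there enough context to act on?
    }

print(assess({"telemetry": ["process creation event"], "technique_id": "T1057"}))
# {'availability': True, 'efficacy': True, 'actionability': False}
```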