How to Measure the Trustworthiness of an AI System


As potential applications of artificial intelligence (AI) continue to expand, the question remains: will users want the technology and trust it? How can innovators design AI-enabled products, services, and capabilities that are successfully adopted, rather than discarded because the system fails to meet operational requirements, such as end-user confidence? AI’s promise is bound to perceptions of its trustworthiness.

To spotlight a few real-world scenarios, consider:

  • How does a software engineer gauge the trustworthiness of automated code generation tools to co-write functional, quality code?
  • How does a doctor gauge the trustworthiness of predictive healthcare applications to co-diagnose patient conditions?
  • How does a warfighter gauge the trustworthiness of computer-vision enabled threat intelligence to co-detect adversaries?

What happens when users don’t trust these systems? AI’s ability to successfully partner with the software engineer, doctor, or warfighter in these circumstances depends on whether those end users trust the AI system to work effectively with them and deliver the promised outcome. To build appropriate levels of trust, expectations must be managed for what AI can realistically deliver.

This blog post explores leading research and lessons learned to advance discussion of how to measure the trustworthiness of AI so warfighters and end users in general can realize the promised outcomes. Before we begin, let’s review some key definitions as they relate to an AI system:

  • trust—a psychological state based on expectations of the system’s behavior—the confidence that the system will fulfill its promise.
  • calibrated trust—a psychological state of adjusted confidence that is aligned to end users’ real-time perceptions of trustworthiness.
  • trustworthiness—a property of the system, demonstrated through evidence that it is dependable in its context of use and that end users are aware of its capabilities during use.

Trust is complex, transient, and personal, and these qualities make the human experience of trust hard to measure. The individual’s experience of psychological safety (e.g., feeling safe within their personal situation, their team, their organization, and their government) and their perception of the AI system’s connection to them can also affect their trust in the system.

As people interact and work with AI systems, they develop an understanding (or misunderstanding) of the system’s capabilities and limits within the context of use. Awareness may be developed through training, experience, and information colleagues share about their experiences. That understanding can develop into a level of confidence in the system that is justified by their experiences using it. Another way to think about this is that end users develop a calibrated level of trust in the system based on what they know about its capabilities in the current context. Building a system to be trustworthy engenders calibrated trust in it among its users.

Designing for Trustworthy AI

We can’t force people to trust systems, but we can design systems with a focus on measurable aspects of trustworthiness. While we cannot mathematically quantify overall system trustworthiness in context of use, certain aspects of trustworthiness can be measured quantitatively—for example, when user trust is revealed through user behaviors, such as system usage.
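One such behavioral signal can be tracked quantitatively. As a minimal sketch (the session logs and the idea of using acceptance rate as a trust proxy are illustrative assumptions, not a standard metric), consider measuring how often end users accept the system’s recommendations:

```python
# Hypothetical sketch: user trust revealed through behavior.
# Each session log records whether the user accepted (True) or
# overrode (False) each AI recommendation. A falling acceptance
# rate over time can signal eroding trust.

def acceptance_rate(decisions):
    """Fraction of AI recommendations the user accepted."""
    if not decisions:
        return 0.0
    return sum(decisions) / len(decisions)

# Illustrative data, not real usage logs.
sessions = {
    "week_1": [True, False, True, True],
    "week_2": [True, True, True, False, True],
}

for period, decisions in sessions.items():
    print(f"{period}: acceptance rate = {acceptance_rate(decisions):.0%}")
```

A metric like this only reveals behavior, not the psychological state behind it, so it complements rather than replaces qualitative UX research.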

The National Institute of Standards and Technology (NIST) describes the essential components of AI trustworthiness as:

  • validity and reliability
  • safety
  • security and resiliency
  • accountability and transparency
  • explainability and interpretability
  • privacy
  • fairness with mitigation of harmful bias

These components can be assessed through qualitative and quantitative instruments, such as functional performance evaluations to gauge validity and reliability, and user experience (UX) studies to gauge usability, explainability, and interpretability. Some aspects, however, may resist measurement altogether because trust is ultimately personal. A system may perform well across each of these components, and yet users may still be wary or distrustful of its outputs because of the interactions they have with it.

Measuring AI trustworthiness should occur across the lifecycle of an AI system. At the outset, during the design phase of an AI system, program managers, human-centered researchers, and AI risk specialists should conduct activities to understand the end users’ needs and anticipate requirements for AI trustworthiness. The initial design of the system must take user needs and trustworthiness into account. Moreover, as developers begin the implementation, team members should continue conducting user-experience sessions with end users to validate the design and collect feedback on the components of trustworthiness as the system is developed.

As the system is prepared for initial deployment, the development team should continue to validate the system with end users against pre-specified criteria for each trustworthiness component. These activities serve a different purpose from acceptance-testing procedures for quality assurance. During deployment, each release must be continuously monitored, both for its performance against expectations and to assess user perceptions of the system. System maintainers must establish criteria for pulling back a deployed system, along with guidance that helps end users set appropriate expectations for interacting with it.
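As a rough illustration of checking a release against pre-specified criteria, the sketch below flags a release whose measured metrics fall below agreed thresholds. The metric names and threshold values are hypothetical, stand-ins for whatever criteria the team specifies:

```python
# Illustrative sketch, not a production monitor: compare a release's
# measured metrics against pre-specified trustworthiness criteria
# and report which criteria it fails.

def check_release(metrics, thresholds):
    """Return the names of pre-specified criteria this release fails."""
    return [name for name, minimum in thresholds.items()
            if metrics.get(name, 0.0) < minimum]

# Hypothetical criteria agreed on before deployment.
thresholds = {"accuracy": 0.90, "uptime": 0.99}

# Hypothetical measurements from the current release.
metrics = {"accuracy": 0.87, "uptime": 0.995}

failed = check_release(metrics, thresholds)
if failed:
    print("candidate for rollback:", ", ".join(failed))
```

In practice these checks would run continuously against live telemetry, and a failed criterion would trigger the pull-back process the maintainers defined.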

System builders should also intentionally partner with end users so that the technology is created to meet user needs. Such collaborations help the people who use the system regularly calibrate their trust in it. Again, trust is an internal phenomenon, and system builders must create trustworthy experiences through touchpoints such as product documentation, digital interfaces, and validation tests to enable users to make real-time judgments about the trustworthiness of the system.

Contextualizing Indicators of Trustworthiness for End Users

The ability of users to accurately evaluate the trustworthiness of a system helps them gain calibrated trust in it. User reliance on AI systems implies that those systems are deemed trustworthy to some degree. Indicators of a trustworthy AI system include the ability of end users to answer the following baseline questions. Can they:

  • Understand what the system is doing and why?
  • Evaluate why the system is making recommendations or generating a given output?
  • Understand how confident the system is in its recommendations?
  • Evaluate how confident they should be in any given output?

If the answer to any of these questions is no, then more work is necessary to ensure the system is designed to be trustworthy. Clarity of system capabilities is needed so that end users can be well-informed and confident in doing their work and will use the system as intended.
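The questions about system confidence can be probed quantitatively. One common measure is expected calibration error (ECE), which checks whether a model’s stated confidence matches its observed accuracy, i.e., whether users can take its confidence scores at face value. This is a minimal sketch with illustrative predictions, not data from any real system:

```python
# Sketch of expected calibration error (ECE): group predictions into
# confidence bins, then compare each bin's average confidence with its
# actual accuracy. A low ECE means stated confidence is trustworthy.

def expected_calibration_error(confidences, correct, n_bins=5):
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(ok for _, ok in b) / len(b)
        ece += (len(b) / total) * abs(avg_conf - accuracy)
    return ece

# Illustrative model outputs: stated confidence and whether it was right.
confidences = [0.95, 0.90, 0.80, 0.70, 0.60, 0.55]
correct = [True, True, False, True, False, True]
print(f"ECE = {expected_calibration_error(confidences, correct):.3f}")
```

A well-calibrated model (for example, 80 percent accuracy among predictions made with 80 percent confidence) gives end users a sound basis for deciding how confident they should be in any given output.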

Criticisms of Trustworthy AI

As we emphasize in this post, there are many factors and viewpoints to consider when assessing an AI system’s trustworthiness. Criticisms of trustworthy AI include that it is confusing and sometimes overwhelming, seemingly impractical, or simply unnecessary. A search of the literature regarding trustworthy AI reveals that authors often use the terms “trust” and “trustworthiness” interchangeably. Moreover, among literature that does define trust and trustworthiness as separate considerations, the ways in which trustworthiness is defined can vary from paper to paper. While it is encouraging that trustworthy AI is a multi-disciplinary space, multiple definitions of trustworthiness can confuse those who are new to designing a trustworthy AI system. Different definitions of trustworthiness for AI systems also make it possible for designers to arbitrarily choose, or cherry-pick, elements of trustworthiness to fit their needs.

Similarly, the definition of trustworthy AI varies depending on the system’s context of use. For example, the characteristics that make up a trustworthy AI system in a healthcare setting may not be the same as those in a financial setting. These contextual differences, and their influence on the system’s characteristics, are important to consider when designing a trustworthy AI system that fits the context and meets the needs of the desired end users, thereby encouraging acceptance and adoption. For people unfamiliar with such considerations, however, designing trustworthy systems may be frustrating and even overwhelming.

Even some of the commonly accepted elements that make up trustworthiness often appear in tension or conflict with each other. For example, transparency and privacy are often in tension. To ensure transparency, appropriate information describing how the system was developed should be revealed to end users, but the characteristic of privacy means that end users should not have access to all the details of the system. A negotiation is necessary to determine how to balance the aspects that are in tension and what tradeoffs may need to be made. The team should prioritize the system’s trustworthiness, the end users’ needs, and the context of use in these situations, which may result in tradeoffs for other aspects of the system.

Interestingly, while tradeoffs are a necessary consideration when designing and developing trustworthy AI systems, the topic is noticeably absent from many technical papers that discuss AI trust and trustworthiness. Often the ramifications of tradeoffs are left to the ethical and legal experts. Instead, this work should be conducted by the multi-disciplinary team making the system—and it should be given as much consideration as the work to define the mathematical aspects of these systems.

Exploring Trustworthiness of Emerging AI Technologies

As innovative and disruptive AI technologies, such as Microsoft 365 Copilot and ChatGPT, enter the market, there are many different experiences to consider. Before an organization determines if it wants to employ a new AI technology, it should ask:

  • What is the intended use of the AI product?
    • How representative is the training dataset of the operational context?
    • How was the model trained?
    • Is the AI product suitable for the use case?
    • How do the AI product’s characteristics align to the responsible AI dimensions of my use case and context?
    • What are limitations of its functionality?
  • What is the process to audit and verify the AI product performance?
    • What are the product performance metrics?
    • How can end users interpret the output of the AI product?
    • How is the product continuously monitored for failure and other risk conditions?
    • What implicit biases are embedded in the technology?
    • How are aspects of trustworthiness assessed? How frequently?
    • Is there a way that I can have an expert retrain this tool to implement fairness policies?
    • Will I be able to understand and audit the output of the tool?
    • What are the safety controls to prevent this system from causing damage? How can these controls be tested?

End users are typically the frontline observers of AI technology failures, and their negative experiences are risk indicators of deteriorating trustworthiness. Organizations employing these systems must therefore support end users with the following:

  • indicators within the system when it is not functioning as expected
  • performance assessments of the system in the current and new contexts
  • ability to report when the system is no longer operating at the acceptable trustworthiness level
  • information to align their expectations and needs with the potential risk the system introduces
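The first item, in-system indicators, can be as simple as flagging outputs produced outside the conditions the system was validated for. The sketch below is hypothetical; the confidence floor, field names, and status strings are assumptions for illustration:

```python
# Hypothetical sketch of an in-system trustworthiness indicator:
# annotate each model output with a user-facing status so end users
# know when the system is not functioning within validated bounds.

def annotate_output(prediction, confidence, validated_floor=0.7):
    """Attach a status flag to a model output based on its confidence."""
    if confidence >= validated_floor:
        status = "ok"
    else:
        status = "low-confidence: verify manually"
    return {"prediction": prediction, "confidence": confidence, "status": status}

result = annotate_output("possible match", confidence=0.55)
print(result["status"])
```

The same pattern extends to other risk conditions, such as inputs that differ statistically from the training data, giving frontline users a concrete way to report deteriorating trustworthiness.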

Answers to the questions introduced at the beginning of this section aim to surface whether the technology is fit for the intended purpose and how the user can validate trustworthiness on an ongoing basis. Organizations can also deploy technology capabilities and governance structures to incentivize the ongoing maintenance of AI trustworthiness and provide platforms to test, evaluate, and manage AI products.

At the SEI

We conduct research and engineering activities to investigate methods, practices, and engineering guidance for building trustworthy AI. We seek to provide our government sponsors and the broad AI engineering community usable, practical tools for developing AI systems that are human-centered, robust, secure, and scalable. Here are a few highlights of how researchers in the SEI’s AI Division are advancing the measurement of AI trustworthiness:

  • On fairness: Identifying and mitigating bias in machine learning (ML) models will enable the creation of fairer AI systems, and fairness contributes to system trustworthiness. Anusha Sinha is leading work that leverages our experience in adversarial machine learning to develop new methods for identifying and mitigating bias. We are working to establish and explore symmetries in adversarial threat models and fairness criteria. We will then transition our methods to stakeholders interested in applying ML tools in their hiring pipelines, where equitable treatment of applicants is often a legal requirement.
  • On robustness: AI systems will fail, and Eric Heim is leading work to examine how those failures arise and to quantify their likelihood. End users can use this information—along with an understanding of how AI systems might fail—as evidence of an AI system’s capability within the current context, making the system more trustworthy. The clear communication of that information supports stakeholders of all types in maintaining appropriate trust in the system.
  • On explainability: Explainability is a significant attribute of a trustworthy system for all stakeholders: engineers and developers, end users, and the decision-makers who are involved in the acquisition of these systems. Violet Turri is leading work to support these decision-makers in meeting purchasing needs by developing a process around requirements for explainability.
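To make the fairness discussion concrete, one commonly used quantitative criterion is demographic parity: the gap in positive-outcome rates across groups, such as selection rates in a hiring pipeline. The sketch below is illustrative only; it is not the SEI’s method, and the group names and data are hypothetical. Demographic parity is also just one of several fairness criteria in the literature, and the criteria can conflict:

```python
# Sketch of one fairness metric: demographic parity difference,
# the gap between the highest and lowest positive-outcome rates
# across groups. A gap of 0.0 means equal rates.

def positive_rate(outcomes):
    """Fraction of positive (1) outcomes in a group."""
    return sum(outcomes) / len(outcomes)

def demographic_parity_difference(outcomes_by_group):
    rates = [positive_rate(o) for o in outcomes_by_group.values()]
    return max(rates) - min(rates)

# Hypothetical selection outcomes (1 = selected) for two groups.
selections = {
    "group_a": [1, 0, 1, 1, 0],  # 3 of 5 selected
    "group_b": [1, 0, 0, 0, 1],  # 2 of 5 selected
}

gap = demographic_parity_difference(selections)
print(f"demographic parity difference = {gap:.2f}")
```

Which metric is appropriate, and what gap is acceptable, depends on the context of use and often on legal requirements, which is why such choices belong to the multi-disciplinary team rather than to the model builders alone.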

Ensuring the Adoption of Trustworthy AI Systems

Building trustworthy AI systems will increase the impact of these systems to augment work and support missions. Making successful AI-enabled systems is a big investment; trustworthy design considerations should be embedded from the initial planning stage through release and maintenance. With intentional work to create trustworthiness by design, organizations can realize the full promise of AI.