Measuring what truly matters

2025-06-14
Eight key preconditions for a sustainable reputational market that results in trustworthy AIs being developed and widely adopted.
Credit: Tesfu Assefa

Decentralised reputation systems and a four-stage solution to the AI alignment problem

Imagine a society that regularly applauds displays of wealth, aggression, or manipulation. Imagine if people with such traits are featured in glowing terms in magazine profiles. Imagine if children are told – implicitly or explicitly – that being successful in life involves acquiring large bank balances, dominating potential rivals, and expropriating any available resources to further their own interests.

What could go wrong?

The risks in that scenario are obvious. But in some ways, that’s what humanity is doing with AI systems. We’re applauding the wrong behaviour in AIs. We’re encouraging the emergence of AIs which prioritise raw intelligence, which are skilled in manipulating human users, which can make lots of money for their corporate owners, and which can deliver devastating military advantage when coupled with various weapons systems.

We might be aligning AIs with a set of human values, but those human values are ones that will lead to lots of social harm. In other words, we’re on track to a bad solution to the AI alignment problem.

Without particularly intending this outcome, we’re prioritising AIs that are good at titillating, feigning friendship, spying, flattering, selling, and deceiving. Like the greatest of human manipulators, these AIs often convince us that they have our best interests at heart. And as with the best of advertisements, they leave us convinced that our choices were made by ourselves, freely, without any undue influence. (We tell ourselves, naively, that other people may be misled by AIs – but no, not us.)

Credit: David Wood via ChatGPT

With our words, we may say that we hope AIs will be characterised by integrity, honesty, transparency, security, justice, and collaboration. But all too often, our collective actions tell a different story.

It’s like a parent who admonishes a child, “do as I say, not as I do” – but the child learns from what they observe, rather than from the pious sentiments they hear.

It’s like a business manager who tells his employees that he values teamwork, but who gives the biggest salary bonuses to individuals who grab the lowest-hanging fruit for themselves, while leaving their colleagues floundering.

It’s like a political leader who waxes lyrical about there being more to national success than rising metrics of GDP, but who rewards the companies that cut ethical corners in pursuit of international market share.

It’s when we all declare that we wish the developers of AI platforms would pay more attention to issues of security and trustworthiness, but then we rush to use new platforms that have the cutest bells and whistles – the platforms which score highest on metrics of IQ, which capture and hold our attention, which tell the funniest jokes, and which reassure us that we truly are special individuals.

If we don’t like this state of affairs – if we fear what will happen as AIs become even smarter – then we have to change what we value.

To be clear, we need to do more than change what we say we value. We have to change what values we follow in our behaviour. And we need to do this collectively, to avoid the choices and actions of just a few of us being swamped by the choices and actions of the overwhelming majority.

That may sound like a tough ask. But we can find inspiration from some prior examples.

Credit: David Wood via ChatGPT

Reputation matters

In April 2017, a video surfaced showing airport security personnel at Chicago O’Hare violently dragging a 69-year-old passenger, Dr David Dao, out of an overbooked United Airlines flight. Dr Dao was visibly disoriented and bloodied. Other passengers on the airplane were horrified.

The airline had asked for volunteers to give up their seats for crew members, and when no one complied, passengers were randomly selected to disembark. Dr Dao refused, prompting his physical removal – as caught on camera by fellow passengers. The footage went viral globally, triggering widespread public outrage.

United Airlines CEO Oscar Munoz initially described the incident as the airline having to “re-accommodate” customers and implied the passenger was at fault. This only intensified the backlash. The airline’s market capitalisation dropped by nearly $1 billion in the days following the event.

This incident has become a classic case study in how defensive, tone-deaf responses during crises can exacerbate reputational and financial damage. It’s far from the only example.

A second example also comes from the airline industry. Between October 2018 and March 2019, two Boeing 737 MAX aircraft – operated by Lion Air and Ethiopian Airlines – crashed, killing a total of 346 people. Investigations revealed that a faulty automated system, MCAS (the Maneuvering Characteristics Augmentation System), had repeatedly forced the planes’ noses down based on erroneous sensor data. For various financial reasons, Boeing had failed to adequately inform pilots about this system and resisted grounding the fleet even after the second crash.

As news of the crashes spread, Boeing’s initial responses were defensive, suggesting pilot error rather than acknowledging the company’s own responsibility. Eventually, global regulators grounded all 737 MAX planes. The crisis severely tarnished Boeing’s reputation for safety, led to congressional investigations, and resulted in a loss of over $40 billion in market value.

More recently, a remarkable drop in sales of Tesla cars has reflected wide public disquiet with the actions taken by its famous CEO, Elon Musk. There is no suggestion that the cars are particularly faulty from a technical standpoint. Instead, the public revulsion is caused by a clash in perceived values: Musk has been taking actions, in social media and as part of the US government administration, which are sharply misaligned with the preferences of many potential purchasers of Tesla vehicles.

For one final example, consider the blow to Facebook’s reputation in 2018 in the wake of what became known as the Cambridge Analytica scandal – a scandal that exposed deep vulnerabilities in the platform’s handling of user data, and which raised global concerns about privacy, political manipulation, and corporate responsibility.

What happened in this case is that the public became aware that data from up to 87 million users had been harvested without consent and exploited for targeted political advertising during key elections, including the 2016 US presidential race. Facebook’s slow and evasive response – initially downplaying the incident and failing to promptly inform affected users – was widely condemned. Public trust plummeted, regulatory scrutiny intensified, and CEO Mark Zuckerberg was summoned before the US Congress and European lawmakers. Facebook lost tens of billions of dollars in market value in a matter of days. The phrase “#DeleteFacebook” trended worldwide, suggesting a broader cultural shift in attitudes toward Big Tech.

Credit: David Wood via ChatGPT

That final example illustrates two important points:

  • Good news: The public concern prompted some soul-searching within the company, and some changes in policy
  • Bad news: Nevertheless, evident flaws remain in the AI systems used by the company, and it continues to prosper financially, despite these shortcomings.

In other words, public expression of displeasure with a corporation is often only temporary. We consumers soon revert to our previous habits.

To have a longer-lasting impact, we’ll need to move from occasional societal expressions of preferences for higher integrity and greater trustworthiness, to actions that are more comprehensive. Instead of these expressions being peripheral to the economy, we’ll need to move them centre-stage. Here’s how this could happen.

The vision: a reputational market with lasting impact

Imagine a society that regularly applauds integrity, honesty, transparency, security, justice, and collaboration. (As you’ll notice, that’s a different set of values from the ones in the scenario right at the start of this article.)

In that society, people adjust their purchasing and other decisions in the light of what is known about the trustworthiness of the providers of various products and services.

That is, people don’t just say they support these values. They actually embody these values. Not just occasionally, but systematically and sustainably.

Credit: David Wood via ChatGPT

This may seem like impractical wishful thinking. Indeed, there are big challenges facing this vision. But here are eight preconditions that will help make it a reality.

First, support from the ground up. A desire for this vision needs to be shared by the community as a whole, rather than being imposed from outside. The details need to be widely discussed within the community, rather than being dictated by perceived “elites” or “technocrats”. (What will prompt such discussions is a growing public awareness of the catastrophic threats posed by misconfigured AI.)

Second, reliable metrics. Rather than companies being able to pretend that they are operating with integrity, there needs to be good evidence in support of any numerical evaluations. That is, any attempts at reputation-washing should be detected and blocked.
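To make that second precondition a little more concrete, here is a minimal sketch in Python of one way a rating could be restricted to claims that carry independent corroboration, so that bare self-assertion does not move the score. All the names, fields, and thresholds below are invented for illustration; they are not a proposal for any particular platform.

```python
# Illustrative sketch only: a rating that counts a claim only when it is
# attested by enough independent parties. All names and thresholds are
# hypothetical.

from dataclasses import dataclass

@dataclass
class Claim:
    description: str   # e.g. "publishes independent safety audits"
    score: float       # contribution to the rating, in [0, 1]
    attesters: list    # independent parties who can verify the claim

def evidence_based_rating(claims, min_attesters=2):
    """Average only the claims attested by enough independent parties.

    Claims with too little independent evidence are ignored, so a company
    cannot raise its own rating simply by asserting good behaviour.
    """
    usable = [c for c in claims if len(c.attesters) >= min_attesters]
    if not usable:
        return None  # no corroborated evidence yet: report nothing
    return sum(c.score for c in usable) / len(usable)

if __name__ == "__main__":
    claims = [
        Claim("publishes independent safety audits", 0.9, ["AuditOrg", "UniLab"]),
        Claim("respects user privacy", 0.8, []),  # self-asserted, no attesters
    ]
    print(evidence_based_rating(claims))  # 0.9: the unattested claim is ignored
```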

Third, protection for whistleblowers. On that last point: people who are aware of a disconnect between claimed values and actual behaviour should be encouraged and enabled to share these findings, without threat of adverse personal repercussions.

Fourth, sufficient funding. The collection, analysis, and communication of reliable metrics will require significant resourcing. But if there truly is support for this initiative from the ground up, there is a greater chance that these resources will be provided.

Fifth, multi-level reputations. Providers of products and services should be evaluated, not just on how they operate themselves, but on whether they restrict themselves to suppliers and partners who also have good reputational ratings.
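As a minimal sketch of what “multi-level” might mean in practice, the following Python fragment blends a provider’s own rating with the ratings of its upstream suppliers, treating a supply chain as no more trustworthy than its weakest link. All the names, scores, and weights are invented for illustration.

```python
# Illustrative sketch only: combine a provider's own reputation score with
# those of its suppliers, so that a good headline rating cannot hide a poor
# supply chain. All names, weights, and scores are hypothetical.

def effective_reputation(provider, own_scores, suppliers, weight_own=0.6, depth=2):
    """Blend a provider's own score with the effective scores of its suppliers.

    own_scores : dict mapping provider -> score in [0, 1]
    suppliers  : dict mapping provider -> list of upstream providers
    weight_own : how much the provider's own behaviour counts versus its chain
    depth      : how many supplier tiers to take into account
    """
    own = own_scores.get(provider, 0.0)
    upstream = suppliers.get(provider, [])
    if depth == 0 or not upstream:
        return own
    upstream_scores = [
        effective_reputation(s, own_scores, suppliers, weight_own, depth - 1)
        for s in upstream
    ]
    chain = min(upstream_scores)  # a chain is only as strong as its weakest link
    return weight_own * own + (1 - weight_own) * chain

if __name__ == "__main__":
    own_scores = {"AcmeAI": 0.9, "DataBroker": 0.4, "ChipCo": 0.8}
    suppliers = {"AcmeAI": ["DataBroker", "ChipCo"]}
    print(round(effective_reputation("AcmeAI", own_scores, suppliers), 2))  # 0.7
```

The particular blend here – a weighted average with a weakest-link rule – is just one possible design choice; the point is simply that a provider should not be able to launder a poor supply chain behind a good headline score.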

Sixth – and of crucial importance – no significant economic sacrifice. If there is a choice between two suppliers, one with a good overall reputation and the other with a poor overall reputation, then any extra financial cost for the first should be at most a modest increment over the cost of the second. Otherwise the community will be forced to choose between integrity and affordability – or between transparency and utility, and so on. To avoid hard decisions along these lines, there will need to be powerful market forces that encourage suppliers to perform well on both scores. In turn, these forces will arise if there is a large enough community of people to whom the characteristics of integrity, transparency, and the rest truly matter.

Seventh, mutual support. Whenever individuals experience pressures to set aside their declared values, and instead to act irresponsibly, they should be able to draw on support from the community as a whole – to stiffen their backbones, as it were.

Eighth, AI support. As well as support from other members of society to help us live up to our best ambitions, we should also benefit from real-time support from trustworthy AI – which could, for example, alert us when we are about to purchase or use products or services that we are likely to regret later. Evidently, this presupposes that the AI support in question can be trusted – which may not be the case if that AI is provided by a commercial supplier.

In practical terms, the overall outcome will be a set of powerful pressures on the corporations and organisations that are creating AI systems. The ones who cut corners will find their products shunned in the marketplace. There will no longer be a race for AI systems to be first, or fastest, or flashiest; instead, what matters will be that AI systems are assuredly trustworthy.

Four stages of alignment

To summarise, what I’m anticipating is a four-stage solution to the AI alignment problem:

  1. Society collaboratively decides which values matter most, and starts measuring conformance to these values
  2. Society takes increasing account of these reputational measurements in all of its decisions about which AI solutions to purchase or use; in other words, people throughout society will align their own actions with the values that they have decided to champion
  3. The providers of AI solutions will need to align their own processes and methods with the evident requirements of the purchasers and users of their products and services
  4. The alignment will then progress to the further state in which the AI platforms themselves manifest these values.

Before any of this gets off the ground, I anticipate a wide-ranging public discussion, drawing on insights from multiple areas of society, multiple disciplines, and multiple sets of experience. I anticipate, moreover, that quite a lot of what I suggest in this article will be modified and improved as such a discussion progresses. For example:

  • Different groups of people have divergent instincts as to which values matter most – but I anticipate that agreement can be reached on an important common core of principles
  • Different groups of people have divergent views as to who can be trusted to measure practical conformance to these values – but an open, decentralised assessment process is more likely to gain wide acceptance than something evidently proprietary (see the sketch after this list)
  • Providers of AI solutions will likely exercise great creativity in appearing to be on the side of users, even while seeking a dominant position for their own platforms; to guard against being misled, proactive vigilance will be required from all of us – but how can this be done without becoming paranoid or obsessive?
  • At least some AIs will likewise be inclined to fake the degree of their conformance to the values expressed by society; this is a reminder of the importance of insisting that AIs be explainable and auditable, rather than mysterious and opaque – and that will require technical innovations as well as changes in our culture.
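On the second bullet above: as one illustration of how an open, decentralised assessment process could resist manipulation, here is a hypothetical aggregation rule (the function name and thresholds are invented) that reports a rating only when enough independent assessors have contributed, and uses the median so that a minority of extreme or colluding scores cannot dominate the result.

```python
# Illustrative sketch only: a robust way to combine scores from many
# independent assessors. All names and thresholds are hypothetical.

from statistics import median

def aggregate_assessments(scores, min_assessors=5):
    """Combine independent assessor scores (each in [0, 1]) into one rating.

    Returns None if there are too few assessments to be meaningful. The median
    is used because it is robust to a minority of extreme or manipulated scores.
    """
    valid = [s for s in scores if 0.0 <= s <= 1.0]
    if len(valid) < min_assessors:
        return None  # not enough independent evidence yet; report nothing
    return median(valid)

if __name__ == "__main__":
    print(aggregate_assessments([0.8, 0.75, 0.9, 0.85, 0.1]))  # 0.8 despite one outlier
    print(aggregate_assessments([0.9, 0.95]))                  # None: too few assessors
```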

I’ll be ready to update my views as matters become clearer.

#AIEthics #ArtificialStupidity #Power-SeekingBehavior #Reputation #TheAlignmentProblem #TheValueSpecificationProblem


