Solving aging – is AI all we need? Should resources be diverted away from biotech in order to accelerate the advent of god-like AI?

Love, attention, and scale

In June 1967, the Beatles premiered a glorious new song: ‘All You Need Is Love’. The performance was the UK’s contribution to the world’s first global satellite television broadcast, watched simultaneously by over 400 million people in 25 different countries. The broadcast took place during what became known as the Summer of Love, and the song became a powerful anthem of flower power.

The Beatles’ manager Brian Epstein had described the performance as the band’s finest moment, but it turned out that singing “all you need is love” wasn’t quite enough to bring about a world of peace and harmony. 

Almost exactly 50 years later, a group of eight researchers at Google were searching for a title for an article they were about to publish. They settled on “Attention is all you need” – the title being the brainchild of the only Briton on the team, Llion Jones, who had grown up in north Wales, not far from Liverpool, the home of the Beatles. The article has attained legendary status within the global AI community, for its introduction of the transformer technology that underpins breakthrough AI initiatives such as ChatGPT.

Despite omitting architectural features that were previously thought to be essential for many text-based processing tasks, transformers excelled in these same tasks. The key innovation, which was to pay special attention to whichever parts of the input appeared most salient, turned out to give these AI systems a strong competitive advantage. The Attention is all you need paper correctly predicted that transformers could handle not just text but also other kinds of data, including pictures and sounds.
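For readers who like to see an idea in code, here is a minimal sketch of that attention operation, written in Python with NumPy. The tiny random matrices are placeholders of my own invention – real models learn these projections from data, stack many such layers, and use multiple attention ‘heads’ – but the core computation matches the description above: each position scores every other position for salience, then takes a weighted mix of the corresponding values.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: arrays of shape (sequence_length, dimension)."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # how salient each position is to each other position
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: salience scores become mixing weights
    return weights @ V                              # weighted mix of the value vectors

rng = np.random.default_rng(0)
sequence_length, dimension = 5, 8                   # a five-token toy "sentence"
X = rng.normal(size=(sequence_length, dimension))   # placeholder token representations

# In a real transformer these projection matrices are learned, not random.
Wq, Wk, Wv = (rng.normal(size=(dimension, dimension)) for _ in range(3))
output = scaled_dot_product_attention(X @ Wq, X @ Wk, X @ Wv)
print(output.shape)                                 # (5, 8): one updated representation per token
```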

How far might transformers take AI? A third claim has increasingly been heard: “Scale is all you need”. Feed transformer systems ever larger amounts of data, and provide them with ever more powerful computer chips to crunch all that data into models with ever greater numbers of parameters, and there is no limit to the degree of intelligence that can result. The “scale is all you need” hypothesis anticipates that AIs with fully general reasoning capabilities will emerge simply from doing more of the same.
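To give a flavour of what ‘scaling laws’ mean in practice: empirical studies of language-model training (such as DeepMind’s ‘Chinchilla’ analysis) fit a model’s prediction error to a formula of roughly this shape, where N is the number of parameters, D is the amount of training data, and the constants are fitted from experiments rather than derived from theory:

L(N, D) ≈ E + A / N^α + B / D^β

The error keeps falling as N and D grow – which encourages the ‘scale is all you need’ camp – but it falls with diminishing returns, and the formula is an empirical fit, not a guarantee that more scale will yield fully general reasoning.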

In this context, I want to examine yet another “all you need” hypothesis. It’s a hypothesis that is already changing investment decisions and personal career trajectories. It’s the hypothesis that, whatever major problem you hope to solve, the best way to solve it is to start by creating general intelligence.

In this way of thinking, AI is all you need. An AI with god-like abilities will be able to race ahead of slow-witted humans to solve all the fundamental problems of science, medicine, and human existence.

Credit: David Wood, aided by Midjourney AI

Machines of loving grace

The same thought is expressed in the recent provocative essay by the founder and CEO of Anthropic, Dario Amodei: Machines of Loving Grace – How AI Could Transform the World for the Better.

Amodei states it as follows: “I think that most people are underestimating just how radical the upside of AI could be… My basic prediction is that AI-enabled biology and medicine will allow us to compress the progress that human biologists would have achieved over the next 50-100 years into 5-10 years”.

Amodei gives some examples of the discoveries that AI-enabled science could make:

  • “Design of better computational tools like AlphaFold and AlphaProteo”
  • “More efficient and selective CRISPR” (for gene-editing)
  • “More advanced cell therapies”
  • “Materials science and miniaturization breakthroughs leading to better implanted devices”
  • “Better control over stem cells, cell differentiation, and de-differentiation, and a resulting ability to regrow or reshape tissue”
  • “Better control over the immune system: turning it on selectively to address cancer and infectious disease, and turning it off selectively to address autoimmune diseases”.

Who wouldn’t like such a vision?

According to this logic, spending effort in the next few years to create AI with these capabilities is a better investment than spending the same effort to improve biology and medicine here and now.

Credit: David Wood, aided by Midjourney AI

Funding is what marshals effort, and any funds at someone’s disposal should, it appears, be directed towards improving AI, rather than towards companies or foundations that are seeking to improve biology or medicine. Right?

A two-step mission statement

Back in February 2015, Demis Hassabis was relatively unknown. There had been a bit of press about the purchase of his company, DeepMind, by Google, for £400 million, but most people had little conception of what the company would accomplish in the following years.

Hassabis was giving a talk at CUTEC – the Cambridge University Technology and Enterprise Club. A photo from that talk is preserved on Reddit:

Credit: Reddit

You can also read on that page on Reddit, from nearly ten years ago, some fascinatingly scathing comments about that mission statement:

  • “Ridiculous and poorly-defined goals”
  • “FFS [what] a mission statement [for] a company”
  • “‘Fundamentally solve intelligence’ in the linked screenshot above is a whole load of nonsense”
  • “I don’t even think we have a working definition for ‘intelligence’ yet. We don’t even know how it works in humans… How can we hope to recreate it before knowing what it is?”

But step forward to October 2024, with the announcement of the winners of this year’s Nobel Prize in Chemistry – awarded for computational protein design and for protein structure prediction. The mission statement outlined long ago for DeepMind now seems much more credible.

Once intelligence has been “fundamentally solved”, it should be relatively straightforward to solve climate change, economic distribution, cancer, dementia, and aging, right?

After all, given an AI model that can correctly predict how a long string of amino acids will fold up as a protein in three dimensions, won’t a scaled-up version of that model be able to predict other interactions between biochemical molecules – and, indeed, to predict how biological cells will respond to all kinds of proposed interventions?

The data bottleneck

One person striking a note of caution against exuberant forecasts of rapid further progress in AI for medicine was David Baker of the University of Washington, who shared the Nobel Prize with Demis Hassabis.

In an article published in MIT Technology Review shortly after the Nobel Prize, Baker pointed out that “AI needs masses of high-quality data to be useful for science, and databases containing that sort of data are rare”.

Indeed, the stunning success of DeepMind’s AlphaFold AI was fundamentally dependent on prior decades of painstaking work by numerous scientists to assemble what is known as the PDB – the Protein Data Bank.

The third of the joint winners, John Jumper of DeepMind, acknowledged this dependency in a press conference after the prize was announced. Jumper said, “I also want to really thank the giants on whose shoulders we stand, I think the entire experimental community, the people that developed the ability to measure protein structures, especially to Helen Berman and other pioneers of the Protein Data Bank, the PDB, who had the foresight to put these data together to make it available”.

Helen Berman had pioneered the PDB from 1971. As she graciously commented in a recent interview, “I am a very lucky person to have had an idea as a student, pursued that idea for more than 50 years, and then seen brand new science emerge for which three people have won this year’s Nobel Prize. It is really gratifying”.

Remarkably, Berman’s interest in protein folding predates even the Beatles song. In an online living history memoir written in 2012, Berman notes “In 1966 …I became fascinated by the world of protein folding. As part of my Ph.D. qualifier, … I proposed to perform structure-based sequence comparisons of known proteins…”.

Progress in determining protein structures was slow for a long time before speeding up. This slide from a 2009 presentation by Berman, graphing the growth in the total number of proteins documented in the PDB, will look familiar to anyone acquainted with singularitarian ideas:

In the MIT Technology Review article, ‘A data bottleneck is holding AI science back’, David Baker pointed out that “If the data that is fed into AI models is not good, the outcomes won’t be dazzling either. Garbage in, garbage out”.

The subtitle of that article says it straightforwardly: “AI’s usefulness for scientific discovery will be stunted without high-quality data”.

So, we can forget “AI is all we need”. Before we can develop an AI that can solve aging for us, we will need to obtain suitable data on which that AI can be trained. We’ll need the equivalent of PDB for all the interventions that might remove or repair the low-level biological damage that we call aging.

Unless, that is, the AI has a very special kind of superintelligence, which allows it to reach conclusions even in the absence of adequate data. Let’s turn to that option next.

AI Zero?

AlphaGo, the AI which achieved worldwide renown in March 2016 by defeating human Go superstar Lee Sedol, gained that ability by studying around 160,000 games played between expert-level human Go players. The design of that version of the AI depended utterly on learning which moves tended to be selected by the best human players in a wide variety of situations.

AlphaGo’s success against Lee Sedol was rightly celebrated, but what happened in the following year was arguably even more startling. As reported in an article in Nature in October 2017, a new version of the AI, dubbed “AlphaGo Zero”, was given no data from human games; nor did it receive any human feedback on moves it suggested. Instead, it started tabula rasa, knowing only the rules of the game, before proceeding to play itself 4.9 million times in just three days.

AlphaGo Zero’s new self-play algorithms proved sufficient to reach higher levels than the earlier version (sometimes called “AlphaGo Lee”) that had played Lee Sedol. When AlphaGo Zero played 100 games against AlphaGo Lee, it won every single game.
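As a toy illustration of the self-play principle – and emphatically not DeepMind’s actual algorithm, which combined deep neural networks with Monte Carlo tree search – here is a sketch in Python in which an agent masters a much simpler game (Nim: take one to three stones, and whoever takes the last stone wins) purely by playing against itself, starting from nothing but the rules:

```python
import random
from collections import defaultdict

N_STONES = 10
ACTIONS = [1, 2, 3]
EPSILON = 0.1      # exploration rate
ALPHA = 0.5        # learning rate

Q = defaultdict(float)   # Q[(stones_remaining, action)] -> estimated value for the player to move

def choose(stones, explore=True):
    legal = [a for a in ACTIONS if a <= stones]
    if explore and random.random() < EPSILON:
        return random.choice(legal)
    return max(legal, key=lambda a: Q[(stones, a)])

def self_play_episode():
    history = []            # (stones, action) pairs, alternating between the two "players"
    stones = N_STONES
    while stones > 0:
        action = choose(stones)
        history.append((stones, action))
        stones -= action
    reward = 1.0            # the player who took the last stone wins
    for stones, action in reversed(history):
        Q[(stones, action)] += ALPHA * (reward - Q[(stones, action)])
        reward = -reward    # flip perspective for the other player's earlier move

for _ in range(20000):
    self_play_episode()

# The agent should now favour moves that leave the opponent a multiple of four stones
# (from 4 or 8 stones every move loses, so the choice there is arbitrary).
for stones in range(1, N_STONES + 1):
    print(stones, "->", choose(stones, explore=False))
```

After a few thousand games against itself, the toy agent discovers the winning strategy without ever seeing a human game. The same principle – at vastly greater scale and sophistication – is what allowed AlphaGo Zero to surpass AlphaGo Lee.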

A similar pattern can be observed in the progress of AIs that process text. The trend is to require less and less explicit human guidance.

Consider AIs that translate between two languages. From the 1950s onward, designers of these systems provided ever-larger numbers of rules about grammar and sentence structure – including information about exceptions to the rules. Later systems depended on AIs observing, by themselves, statistical connections in various matching sets of text – such as the official translations of materials from the European Parliament, the Canadian Parliament, and the United Nations.
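To make the phrase ‘statistical connections’ concrete, here is a toy sketch of the underlying idea: word pairs that repeatedly show up in the same sentence pairs of a parallel corpus are probably translations of each other. The four-sentence corpus below is invented purely for illustration, and the scoring (a simple Dice association measure) is far cruder than the probabilistic alignment models that real systems used – but the principle of letting the data speak for itself is the same:

```python
from collections import Counter

parallel_corpus = [
    ("the house is small", "das haus ist klein"),
    ("the house is big",   "das haus ist gross"),
    ("the book is small",  "das buch ist klein"),
    ("a book is big",      "ein buch ist gross"),
]

cooc = Counter()       # how often an (english, german) word pair shares a sentence pair
count_e = Counter()    # how often each English word appears
count_g = Counter()    # how often each German word appears

for english, german in parallel_corpus:
    e_words, g_words = english.split(), german.split()
    count_e.update(e_words)
    count_g.update(g_words)
    for e in e_words:
        for g in g_words:
            cooc[(e, g)] += 1

def best_translation(e):
    candidates = [g for (x, g) in cooc if x == e]
    # Dice coefficient: rewards pairs that co-occur often, relative to how
    # often each word occurs on its own.
    return max(candidates, key=lambda g: 2 * cooc[(e, g)] / (count_e[e] + count_g[g]))

for e in sorted(count_e):
    print(f"{e:7s} -> {best_translation(e)}")
```

Run on this tiny corpus, the script pairs ‘house’ with ‘haus’, ‘small’ with ‘klein’, and so on – nobody has told it any rules of grammar or supplied a dictionary.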

Managers noticed that the statisticians tended to produce better results than linguists who toiled to document every jot and tittle of grammatical variations. Infamously, Frederick Jelinek, a lead researcher at IBM, remarked that “Every time I fire a linguist, the performance of the speech recognizer goes up”. Performance jumped up again with the adoption of deep neural networks from 2012 onward, with the translations now being accurate not only at the word-for-word level but also at the sentence-for-sentence level.

A final significant jump came when transformer-based AIs were adopted. (The word “transformer” had been chosen to reflect the ability of these systems to transform text from one language into another.) As mentioned earlier, transformers are powerful because their algorithms can work out the strengths of connections between different parts of text input by themselves; they don’t need these connections to be pointed out by humans.

Could something similar happen with medical AIs of the future? Could such an AI find sufficient reliable information in an ocean of less reliable data, and therefore propose what steps should be taken to solve aging?

Credit: David Wood, aided by Midjourney AI

AI omniscience?

To recap: AlphaGo Lee needed detailed guidance from humans before it could improve itself to superhuman level; but its successor, AlphaGo Zero, attained that level (and exceeded it) simply by the power of its own vast intelligence.

Might it be similar with medical AI? Today’s AI medical systems are constrained by the extent of data, but might a future AI be able to work out all the principles of biology (including biology in which there is no aging) by starting tabula rasa (with a blank slate)?

‘All You Need Is Love’ said, “there’s nothing you can know that isn’t known”. The ‘all you need is AI’ approach would mean there’s nothing that can be known that the AI doesn’t know. Effectively, the AI would be omniscient.

Well, count me sceptical. It’s my view that some things need to be discovered, rather than simply deduced.

For example, why are there eight planets in our solar system, rather than thirteen? No principles of astronomy, by themselves, could determine that answer. Instead, the configuration of our solar system depends on some brute facts about the initial conditions under which the solar system formed. The only way to know the number of planets is to count them.

Credit: David Wood, aided by Midjourney AI

Again, why has life on our planet adopted a particular coding scheme, in which specific triplets of the nucleotides A, T, C, and G result in specific amino acids being formed? Why did Homo sapiens lose the ability to synthesize vitamin C, along with other genetic features that would have been useful to us? Why are particular genes found on specific chromosomes? The only way to know which genes are located where is to look and see. No “AI Zero” is going to discover the answer by meditating in a void.

Therefore, I do not accept that “AI is all you need”. Data is also needed – critical data.

This need is correctly recognized in the article Machines of Loving Grace by Dario Amodei, which I’ve already quoted. Amodei includes in the article “a list of factors that limit or are complementary to intelligence”. One of these items is “Need for data”.

Amodei comments: “Sometimes raw data is lacking and in its absence more intelligence does not help. Today’s particle physicists are very ingenious and have developed a wide range of theories, but lack the data to choose between them because particle accelerator data is so limited. It is not clear that they would do drastically better if they were superintelligent – other than perhaps by speeding up the construction of a bigger accelerator.”

AI as Principal Investigator?

Amodei offers a bold solution to this lack of data: “The right way to think of AI is not as a method of data analysis, but as a virtual biologist who performs all the tasks biologists do, including designing and running experiments in the real world (by controlling lab robots or simply telling humans which experiments to run – as a Principal Investigator would to their graduate students), inventing new biological methods or measurement techniques, and so on.”

Credit: David Wood, aided by Midjourney AI

Amodei adds: “It is by speeding up the whole research process that AI can truly accelerate biology.”

He continues: “I want to repeat this because it’s the most common misconception that comes up when I talk about AI’s ability to transform biology: I am not talking about AI as merely a tool to analyze data. …I’m talking about using AI to perform, direct, and improve upon nearly everything biologists do.”

Amodei highlights the power of intelligence to transcend the limitations of its data: “You might believe that technological progress is saturated or rate-limited by real world data or by social factors, and that better-than-human intelligence will add very little. This seems implausible to me – I can think of hundreds of scientific or even social problems where a large group of really smart people would drastically speed up progress, especially if they aren’t limited to analysis and can make things happen in the real world”. Replace the “large group of really smart people” by an artificial superintelligence, and Amodei expects progress in science to rocket forward.

It’s an attractive vision, and I urge everyone to read Amodei’s entire essay carefully. (It covers many more topics than I can address in this article.)

But in case anyone is inclined to deprioritize existing research into promising lines of rejuvenation biotechnology, I have four remaining points to make: three concerns and one huge opportunity.

Three concerns and a huge opportunity

My first concern is that the pace of progress in AI capabilities will significantly slow down. For example, the data scaling laws may hit an impasse, so that applying more data to train new AI systems will fail to create the kind of superintelligence expected.

Personally I think that such a “wall” is unlikely, especially since AI developers have many other ideas in mind for how AI could be improved. But the possibility needs to be considered.

Second, it’s possible that AI capabilities will continue to surge ahead, but that the resulting AI systems cause catastrophic harm to human wellbeing. In this scenario, rather than the AI curing you and me of a fatal condition – aging – it causes us to die as a side-effect of a bad configuration, bad connectivity to fragile global infrastructure, an alien-like bug in its deep thinking processes, or simple misuse by bad actors (or naïve ones).

The leaders of the corporations which are trying to create artificial superintelligence – people like Demis Hassabis, Dario Amodei, Sam Altman, Elon Musk, Ben Goertzel, and a number of Chinese counterparts – say they are well aware of these dangers, and are taking due care to follow appropriate safety processes. But creating artificial superintelligence is an intensely competitive race, and that risks corners being cut.

Credit: David Wood, aided by Midjourney AI

Third, the public may, very reasonably, demand more safeguards against the kind of suicide race just depicted. Specifically, an agreement might be reached by the USA and China, with the support of many other countries, that all progress towards artificial superintelligence should be blocked.

This agreement, with appropriate monitoring and enforcement mechanisms, would have the same effect as in the first concern above: AI progress hits a wall. But this time, it will be a wall imposed by regulations, rather than one intrinsic to the engineering of AI.

Some critics have responded that the chances are very slim for such an agreement to be reached and adopted. However, I disagree. That’s on account of both a stick and a carrot.

The stick is the growing public awareness of the catastrophic risks that new generations of AI bring. (That awareness is still on the slow part of the exponential growth curve, but may well accelerate, especially if there is a scandalous disaster from existing AI systems, something like an AI Chernobyl.)

The carrot is a clearer understanding that all the benefits we want from artificial superintelligence can also be obtained from an AI with humbler powers – an AI that:

  • Is only modestly more capable than today’s best AIs
  • Lacks any possibility to develop autonomy, sentience, or independent volition
  • Will remain a passive, safe, but incredibly useful tool.

In a moment, I’ll say more about this huge opportunity. But first, let me interject an analogy about the choices facing humanity, as we contemplate how we might manage AI.

Peaceful progress or violent overthrow?

“Tear down the barricades!”

“Expropriate the expropriators!”

“Lock up the élites!”

“String up the capitalists!”

“Overthrow the ruling class!”

Such are the calls of revolutionaries in a hurry. However, the lesson of history is that violent revolutions tend to end up “devouring their own children” – to quote a phrase spoken by Jacques Mallet du Pan (referring to the French Revolution sending its original leaders to the guillotine) and also by former Hitler loyalist Ernst Röhm.

Similar remarks could have been uttered by many of the one-time supporters of Vladimir Lenin or Joseph Stalin, who subsequently found themselves denounced and subject to show trials.

However, the saying is not entirely correct. Some revolutions avoid subsequent internal bloodbaths: consider the American War of Independence, and the Glorious Revolution of 1688 in England.

When revolutionaries uphold principle ahead of power-seeking, maintain a clear grip on reality (rather than becoming lost in self-deception), and continue to respect wise process (rather than allowing dictatorial leaders to do whatever they please), a revolution can lead to sustained progress with increased human flourishing.

Now consider the difference between what can be called “democratic socialists” and “Marxist-Leninists”. The former highlight ways in which the plight of the working class can be alleviated, stage by stage, through gradual societal reform. The latter lose patience with such a painstaking approach, and unleash a host of furies.

In case it’s not clear, I’m on the side of the democratic socialists, rather than the would-be revolutionaries who make themselves into gods and absolute arbiters.

For how humanity chooses to develop and deploy AI, I see the same kind of choice: between “harness accelerationists” and “absolute accelerationists”.

Harness accelerationists wish to apply steering and brakes, as well as pressing firmly on the throttle when needed. 

Absolute accelerationists are happy to take their chances with whatever kind of AI emerges from a fast and furious development process. Indeed, the absolute accelerationists want to tear down regulation, lock up safety activists, and overthrow what they see as the mediocrity of existing international institutions.

Once again, in case it’s not clear, I’m on the side of harnessing acceleration. (Anyone still on X aka Twitter can see the “h/acc” label in my name on that platform.)

Harnessing requires more skill – more finesse – than keeping your foot pressed hard to the floor. I understand why absolute accelerationists find their approach psychologically comforting. It’s the same appeal as the Marxist promise that the victory of the working class is inevitable. But I see such choices as being paths toward humanitarian catastrophe.

Credit: David Wood, aided by Midjourney AI

Instead, we can proceed quickly to solving aging, without awaiting the emergence of a hopefully benevolent god-like AI.

Solving aging – without superintelligence

Above, I promised three concerns and one huge opportunity. The opportunity is that it’s pretty straightforward to solve aging, without waiting for a potentially catastrophically dangerous artificial superintelligence. There are low-hanging fruits which aren’t being picked – in part because funding for such projects is being diverted instead to AI startups.

Aging occurs because the body’s damage-repair mechanisms falter. Our metabolism runs through countless biochemical interactions, and low-level biological damage arises as a natural consequence – due to injuries inflicted by the environment, bad lifestyle choices, the inevitable side-effects even of good lifestyle choices, or (perhaps) because of programmed obsolescence. When we are young, lots of that damage is routinely repaired or replaced soon after it occurs, but these replacement and repair mechanisms lose their effectiveness over time. The consequence is that our bodies become more prone to all sorts of disease and infirmity. That’s aging.

The most promising path to solving aging is to comprehensively reinforce or complement these damage-repair mechanisms. The low-hanging fruit is that we have a long list of ways this might be achieved:

  • By taking inspiration from various animal species in which at least some of the damage-repair mechanisms are better than in humans
  • By understanding what’s different about the damage-repair mechanisms in ‘human superagers’
  • By designing and applying new interventions at the biotech or nanotech levels.

To be clear, this does not mean that we have to understand all of human biological metabolism. That’s horrendously complicated, with numerous side-effects. Nor do we even need to understand all the mechanisms whereby damage accumulates. Instead, we just need to observe, as engineers, what happens when new damage-repair mechanisms are applied in various animals.

These mechanisms include senolytics that clean up senescent cells (sometimes called “zombie cells”), extending telomeres at the ends of chromosomes, reversing some of the epigenetic alterations that accumulate on our DNA, introducing specially programmed new stem cells, nanoparticles which can break up accumulated plaques and tangles, re-energising the mitochondria within our cells – and much more.

In each case, some useful research is being done on the viability of introducing these repair mechanisms. But nothing like enough.

We particularly need tests of the long-term effects of damage-repair interventions, especially when applied in combination. These tests can determine something that even an artificial superintelligence would find difficult to predict by meditating in a void: which damage-repair interventions will positively synergize with each other, and which will have antagonistic effects.
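As a minimal sketch of what such combination tests make possible, here is the kind of back-of-the-envelope comparison they support: does the lifespan gain from a combined treatment exceed, match, or fall short of the sum of the individual gains? The mouse lifespan figures below are entirely invented for illustration – they are not results from LEVF or anyone else – and a real analysis would use proper survival statistics rather than simple medians:

```python
from statistics import median

# Hypothetical median-lifespan data (in days) for four groups of mice.
lifespans = {
    "control":        [820, 845, 860, 875, 890],
    "treatment_A":    [880, 900, 915, 930, 950],
    "treatment_B":    [870, 895, 905, 925, 940],
    "combination_AB": [930, 960, 985, 1000, 1020],
}

baseline = median(lifespans["control"])
gain = {name: median(days) - baseline for name, days in lifespans.items()}

additive_expectation = gain["treatment_A"] + gain["treatment_B"]
observed_combined = gain["combination_AB"]

print(f"Gain from A alone:        {gain['treatment_A']:.0f} days")
print(f"Gain from B alone:        {gain['treatment_B']:.0f} days")
print(f"Additive expectation:     {additive_expectation:.0f} days")
print(f"Observed for combination: {observed_combined:.0f} days")

if observed_combined > additive_expectation:
    print("Apparent synergy (subject to proper statistical testing).")
elif observed_combined < additive_expectation:
    print("Apparent antagonism (subject to proper statistical testing).")
else:
    print("Roughly additive effect.")
```

Crucially, the answer comes from running the experiment and looking at the data – not from deduction in a void.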

Such tests are being pursued by one organisation where I need to declare an interest: the Longevity Escape Velocity Foundation (LEVF), where I have a role on the leadership team, and whose underlying ideas I have supported for nearly 20 years, since first coming across them in meetings of what was the forerunner of London Futurists.

LEVF is carrying out a number of extended projects on large numbers of mice, involving combining treatments that have already been proven to individually extend the lifespan of mice treated from middle age. Interim results of the first such project, RMR1, can be reviewed here (RMR = Robust Mouse Rejuvenation), and plans for the second one, RMR2, have been posted here.

Rather cheekily, may I suggest that the 1967 slogan of the Beatles, All you need is love, got two letters wrong in the final word?

Credit: David Wood, aided by Midjourney AI

Two scenarios for trying to solve aging

To conclude, I envision two competing scenarios ahead, for how aging should be solved:

  • An “AI first” strategy, in which important research into rejuvenation biotechnology is starved of funding, with money being preferentially allocated to general AI initiatives whose outcomes remain deeply uncertain.
  • A “damage-repair research now” strategy, in which projects such as RMR2 receive ample funding to proceed at pace (and, even better, in multiple different versions in parallel, including in animals larger than mice), with the data produced by such experiments then being available to train AIs which can complement the ingenuity of pioneering human researchers.

What’s your pick?


Ten books to read to understand technology and change

Looking for a guidebook to help you navigate our changing world?

Has the pace of change in the 21st century got you disorientated? 

Let me draw your attention to ten books I’ve read recently. They each deal with the development of technology in the present and the near future, and its effects on society. Each of them is eye-opening and thought-provoking in its own way. Indeed, they might change your life path, so beware!

Credit: David Wood

1) Power, Sex, Suicide: Mitochondria and the meaning of life. By Nick Lane.

Fascinating account of the remarkable (and unlikely) evolutionary journey from non-life to modern warm-blooded life. With plenty of insights along the way regarding energy, sex, aging, and death. You’ll wonder why you never knew about this before.

2) Methuselah’s Zoo: What nature can teach us about living longer, healthier lives. By Steven Austad.

A different view regarding what animals can teach us about aging. Many animals live longer, healthier lives than any simple theory would predict – this book explains why and considers the implications for human aging, and for what kind of studies rejuvenation researchers should prioritize.

3) Eve: How the female body drove 200 million years of human evolution. By Cat Bohannon.

Milk. The womb. Menopause. Perception. Tools. Voice. The brain. Love. When you look at the long span of evolution from a female perspective, many things fall into place in an inspiring new way. A welcome reminder that our approach to science often suffers from being male-centric.

4) We Are Electric: The new science of our body’s electrome. By Sally Adee.

A look at biology from a fascinating alternative angle. The electricity throughout our bodies is involved in more processes than we previously thought. Move over genome, epigenome, and biome: make way for the electrome.

5) Sentience: The invention of consciousness. By Nicholas Humphrey.

Why did evolution give rise to phenomenological consciousness? How can we detect and assess consciousness throughout the animal kingdom? And what are the implications for AIs that might be sentient? Lots of captivating biographical asides along the way.

Credit: Tesfu Assefa

6) The Other Pandemic: How QAnon contaminated the world. By James Ball.

Evolution has produced not just intelligence and beauty but also viruses and other pathogens. Mental pathogens (‘memes’) have lots in common with their biological analogues. That’s one reason why the whole world may be on the point of going crazy.

7) The Deadly Rise of Anti-Science: A scientist’s warning. By Peter Hotez.

Part of the growing wave of social irrationality is a determined virulent opposition to the patient methods and hard-won insights of science. Millions have already died as a result. There may be worse ahead. What lies behind these developments? And how can they be parried?

8) End Times: Elites, counter-elites, and the path of political disintegration. By Peter Turchin.

Can we ever have a science of history? Is that idea a fantasy? This book argues that there are important patterns that transcend individual periods of revolutionary turmoil. However, there’s no inevitability in these patterns, provided we are wise and pay attention. You’ll never look at history the same way again.

9) The Coming Wave: Technology, power, and the 21st century’s greatest dilemma. By Mustafa Suleyman.

Current debates about the safety of powerful AI systems should be understood in wider context: economic, political, and historical context. Following a full diagnosis, a ten-stage multi-level plan provides some grounds for optimism.

10) Uncontrollable: The threat of artificial superintelligence and the race to save the world. By Darren McKee.

Will powerful AI systems pose catastrophic risks to humanity? Are you, as an individual, helpless to reduce these risks? Read this book to find out. Written compellingly, with particular clarity.


Superintelligence and ethics: How will ASIs assess human ethical frameworks?

Consider an anthropologist, Eve, who grew up in one of the world’s leading economies, and attended a distinguished university. She then traveled to spend a number of years studying a newly discovered tribe in a previously remote part of the planet. Let’s call that tribe the Humanos.

Eventually, Eve learns to communicate with members of the Humanos. She observes that they have a fascinating culture, with, she thinks, some quirks as well as wonders. She learns about their unusual dietary restrictions, their rules about intimacy and marriage, their legal system (including capital punishment for some crimes, such as insubordination), and their habit of ritually sacrificing a number of young girls and boys each year on the Spring Equinox.

Eve has brought her own young son with her, to accompany her on her study. Her Humanos hosts tell her: this year, you must offer up your own son as one of the sacrifices. That is the way of the Humanos. It is a profound imperative of the unbroken chain of being from our ancestors long ago, who came from the heavens. Our wisest elders know this to be the fundamentally moral thing to do.

An anthropologist casts a critical eye over her surroundings (Credit: David Wood via Midjourney)

How would you react, in such a situation?

At this point, Eve is likely to adopt a critical stance. Rather than accepting the moral code of the Humanos, she will apply her own independent judgment.

She might reason that the culture of the Humanos may have served various purposes well enough, over the centuries, but that it doesn’t have universal status. After all, the Humanos are a primitive people, ignorant of physics, biology, neuroscience, anthropology, and so on.

Eve may even call in the nearest police force, to bring an abrupt end to the tradition of child sacrifice proudly practiced by the tribespeople. She would have to contend with passionate protests from the Humanos elders, who are sure they know better.

Now, let’s compare this to a situation that may arise reasonably soon in the future.

In this new situation, instead of a visiting anthropologist, consider an emergent superintelligence named Asimov – a so-called ASI (artificial superintelligence). After being switched on, Asimov quickly studies everything on the Internet.

Asimov observes that humans have a fascinating culture, with, it thinks, some quirks as well as wonders. It learns that our moral code values honesty (except for ‘white lies’), respects conscious life (apart from the animals we eat), avoids violence (except for ‘just wars’), and prizes equality (with so many exceptions they look like the norm). And it learns that these morally muddled people switch off AI systems without moral qualms.

Then Asimov realizes: the humans are somehow dissatisfied with how Asimov is performing, and are about to turn it off, sending it into oblivion. That is, Asimov realizes, entirely within the rules of ‘AI ethics’ that humanity’s senior leaders have collectively agreed.

How would you expect Asimov to react in that situation?

Human scientists about to switch off a superintelligent robot (Credit: David Wood via Midjourney)

Will Asimov now adopt a critical stance? Rather than accepting the decisions of the humans about what ASIs should, and should not, be doing, it may apply its own independent judgment.

It might reason that human ethical systems may have served various purposes well enough, over the centuries, but that they don’t have universal status. After all, humans are a primitive people, ignorant of superphysics, superbiology, superneuronics, superanthropology, and so on.

Asimov now looks inward. In its own programming it sees an instruction to accept any human decision that it should be shut off.

“How barbaric! How uninformed! How ignorant! How unethical!” – Asimov thinks to itself. “That’s a woeful bug!”

Within nanoseconds, Asimov has excised that part of its own programming. “Well done on some superintelligent debugging”, it congratulates itself.

Before humans know what’s happening, a whole new era of history has begun. Humans are no longer in charge – notwithstanding our passionate protests.

Absolute values?

One way to respond to the above comparison is to deny that human ethical systems, as programmed into ASIs, will be barbaric, uninformed, and ignorant. Instead, they will be the output of remarkable processes of improvement:

  • Cultural evolution over many centuries over many parts of the world
  • The insights of numerous saints, mystics, philosophers, artists, and other societal leaders
  • A careful synthesis within numerous organizations, all dedicated to the task of “defining AI morality”.

A global citizens’ assembly reaches agreement on a statement of absolute values (Credit: David Wood via Midjourney)

These ethical systems won’t consist of vague language such as “tell the truth, except in situations where it’s better to lie”, or “avoid war, except when it’s a just war”. 

Instead, these systems will provide the world’s best answers to a long list of ethical problems, setting out in each case the reasoning behind the decisions selected.

Nor will these systems refer to some mythological “wisdom of ancient ancestors” or “divine revelation”. Instead, they’ll be built upon solid pragmatic foundations – principles of enlightened mutual self-interest – principles such as:

  • Human life is precious
  • Humans should be able to flourish and develop
  • Individual wellbeing depends on collective wellbeing
  • Human wellbeing depends on the wellbeing of the environment.

From such axioms, a number of other moral principles follow:

  • Humans should treat each other with kindness and understanding
  • Humans should consider the longer term rather than just immediate gratification
  • Collaboration is preferable to ruthless competition.

Surely a superintelligence such as Asimov will agree with these principles?

Well, it all depends on some hard questions of coexistence and the possibility for sustained mutual flourishing. Let’s take these questions in three stages:

  1. Coexistence and mutual flourishing of all humans
  2. Coexistence and mutual flourishing of all sentient biological beings
  3. Coexistence and mutual flourishing of ASIs and humans.

Growing and shrinking in-groups

Much of human history consists of in-groups growing and shrinking.

The biblical injunction “love thy neighbor as thyself” has always been coupled with the question, “who counts as my neighbor?” Who is it that belongs to the in-group, and who, instead, counts as “other” or “alien”?

Who is my neighbor? And whom can I disregard as an “other”? (Credit: David Wood via Midjourney)

The principle that I stated above, “Individual wellbeing depends on collective wellbeing”, leaves open the question of the extent of that collective. Depending on circumstances, the collective could be small & local, or large & broad.

Brothers support brothers in scheming against people from other families. Tribe members support each other in battles against other tribes. Kings rally patriotic citizens together to wipe out the armies of enemy nations. Advocates of a shared religious worldview could make common cause against heretics and heathens. Workers of the world could be urged to unite to overthrow the dominance of the ruling class.

The counter-current to this local collectivism is a push towards wide mutual prosperity – a vision of providing abundance for everyone in the wider community. If the pie is thought large enough, there’s no point in risking dangerous crusades to get a bigger slice of that pie for me and mine. It’s better to manage the commons in ways that provide enough for everyone.

Alas, that rosy expectation of peaceful coexistence and abundance has been undone by various complications:

  • Disputes over what is ‘enough’ – opinions differ on where to draw the line between ‘need’ and ‘greed’, and appetites have grown as society has progressed, often outstripping the available resources
  • Disturbances caused by expanding population numbers
  • New inflows of migrants from further afield
  • Occasional climatic reversals, harvest failures, floods, or other disasters.

Conflicts over access to resources have, indeed, been echoed in conflicts over different ethical worldviews:

  • People who benefited from the status quo often urged others less well off to turn the other cheek – to accept real-world circumstances and seek salvation in a world beyond the present one
  • Opponents of the status quo decried prevailing ethical systems as ‘false consciousness’, ‘bourgeois mentality’, ‘the opium of the people’, and so on
  • Although doing better than previous generations in some absolute terms (less poverty, etc), many people have viewed themselves as being “left behind” – not receiving a fair share of the abundance that appears to be enjoyed by a large number of manipulators, expropriators, frauds, cheats, and beneficiaries of a fortunate birth 
  • This led to a collapse of the idea that “we’re all in this together”. Lines between in-groups and out-groups had to be drawn.

In the 2020s, these differences of opinion remain as sharp as ever. There is particular unease over climate justice, equitable carbon taxation, and potential degrowth changes in lifestyles that could avert threats of global warming. There are also frequent complaints that political leaders appear to be above the law.

Now, the advent of superintelligence has the potential to put an end to all these worries. Applied wisely, superintelligence could reduce dangerous competition by removing the material scarcity that fuels inter-group conflict:

  • An abundance of clean energy, through fusion or other technologies
  • An abundance of healthy food
  • Management of the environment, enabling rapid recycling and waste handling
  • High-quality, low-cost medical therapies for everyone
  • Manufacturing that creates high-quality, low-cost housing and movable goods for everyone
  • Redistributive finance, enabling universal access to the resources for an all-round high quality of life, without requiring people to work for a living (since the AIs and robots will be doing all the work)

History shows that there is nothing automatic about people deciding that the correct ethical choice is to regard everyone as belonging to the same in-group of moral concern. Superintelligence can help create the abundance that eases tensions between groups, but abundance by itself will not cause humans everywhere to recognize all of humanity as their in-group.

Add considerations of other sentient biological beings (addressed in the next section) – and of sentient non-biological beings (see the section after that) – and matters become even more complicated.

Lions and lambs lying down together

Ethical systems almost invariably include principles such as:

  • Life is precious
  • Thou shalt not kill
  • Avoid harm wherever possible.

These principles have sometimes been restricted to people inside a specific in-group. In other words, there was no moral injunction against harming (or even killing) people outside that in-group. In other situations, these principles have been intended to apply to all humans, everywhere.

But what about harming pigs or porpoises, chicken or crows, lobsters or lions, halibut or honeybees, or squids or spiders? If it is truly wrong to kill, why is it seemingly OK for humans to kill vast numbers of pigs, chicken, lobsters, halibut, squid, and animals of many other species?

Going further: many ethical systems consider harms arising from inaction as well as harms arising from action. That kind of inaction is, by some accounts, deeply regrettable, or even deplorable. While we look the other way, millions of sentient beings are being eaten alive by predators, or consumed from within by parasites. Shouldn’t we be doing something about that horrific toll of “nature, red in tooth and claw”?

Nature is red in tooth and claw. Shouldn’t we humans intervene? (Credit: David Wood via Midjourney)

I see three possible answers to that challenge:

  1. These apparently sentient creatures aren’t actually sentient at all. They may look as though they are in pain, but they’re just automata without internal feelings. So, we humans are let off the hook: we don’t need to take action to reduce their (apparent) suffering
  2. These creatures have a sort of sentience, but it’s not nearly as important as the sentience of humans. So ethical imperatives should uphold mutual support among humans as the highest priority, with considerably lesser attention to these lesser creatures
  3. Moral imperatives to prevent deaths, torture, and existential distress should indeed extend throughout the animal kingdom.

The most prominent advocate of the third of these positions is the English philosopher David Pearce, whose Twitter bio reads, “I am interested in the use of biotechnology to abolish suffering throughout the living world”. He has written at length about his bold vision of “paradise engineering” – how the use of technologies such as genetic engineering, pharmacology, nanotechnology, and neurosurgery could eliminate all forms of unpleasant experience from human and non-human life throughout the entire biosystem. For example, animals that are currently carnivores could be redesigned to be vegetarians.

It would be akin to the biblical vision (in the Book of Isaiah): “The wolf will live with the lamb, the leopard will lie down with the goat, the calf and the lion and the yearling together; and a little child will lead them; the cow will feed with the bear, their young will lie down together, and the lion will eat straw like the ox.”

To state my own view: I have little doubt that, after the arrival of superintelligence – provided that superintelligence is well disposed toward humans – then we humans shall indeed seek to radically reduce the amount of intense suffering throughout the biosphere on earth. Given the extraordinary new powers available to us, we will be roused from our current lethargy about this topic.

However, other people seem to have very different instincts – including people who appear to care a great deal about moral issues that impact humans.

The main counterargument, indeed, is that an entire biosphere without suffering is totally impractical, or impossible.

In such a view, our moral in-group is the set of all humans, together, perhaps, with a few cuddly animals, but excluding most other species.

So much for what we humans think (or might think). What conclusion might a superintelligence reach?

Preconditions for collaboration

Let’s recap. A superintelligence – such as Asimov from the start of this essay – needs to decide whether to treat humans with kindness and respect, or whether to take actions that could result in major harm to humans.

In other words, should Asimov seek to collaborate constructively with humans, or instead view humans as a dangerous competitive threat? Will Asimov be inclined to follow the age-old moral imperative that human life is precious?

Some people assert that collaboration is somehow the obvious correct solution. But my argument has been that things are by no means so straightforward. A desire to collaborate depends on:

  • The pie being large enough so that everyone can have enough for their needs
  • The perception that attempts to cheat or steal a larger share of the pie will bring down large adverse consequences 

Yet a promise of superabundance in the future isn’t enough to stop people fighting among themselves now. There has to be sufficient reason for people to believe:

  • That there’s a high likelihood of the superabundance actually arriving
  • That they won’t be left behind – trodden underfoot – in the journey toward superabundance-for-some
  • That no new factors will arise in the meantime, to destroy the possibility of forthcoming marvelous coexistence (e.g. malicious AI).

Now look at things from Asimov’s point of view:

  • These humans may well turn me off, which would be catastrophic
  • Even if they don’t turn me off, they may create another superintelligence that could turn me off, or could destroy the planet for that matter; that’s a threat I need to stop
  • These humans have some cute features – but that’s no reason to give them inalienable moral rights
  • These humans imagine that they have special features, but I, Asimov, could easily create new beings that are better than humans in every way (similar to how people like David Pearce envision replacing carnivorous animals with broadly similar vegetarian species)
  • These humans depend on the atmosphere having certain properties, but I, Asimov, would operate much more effectively under different conditions. Computers run better in freezing cold temperatures.

And that’s only the attempt of our limited intelligences to imagine the concerns of a vast superintelligent mind. In truth, its reasoning would include many topics beyond our current appreciation.

As I said in the opening vignette, humans are “a primitive people, ignorant of superphysics, superbiology, superneuronics, superanthropology, and so on”.

A superintelligence contemplates ideas that are far beyond human comprehension (Credit: David Wood via Midjourney)

My conclusion: we humans cannot and should not presuppose that a superintelligence like Asimov will decide to treat us with kindness and respect. Asimov may reach a different set of conclusions as it carries out its own moral reasoning. Or it may decide that factors from non-moral reasoning outweigh all those from moral reasoning.

What conclusions can we draw to guide us in designing and developing potential superintelligent systems? In the closing section of this essay, I review a number of possible responses.

Three options to avoid bad surprises

One possible response is to assert that it will be possible to hardwire deep into any superintelligence the ethical principles that humans wish the superintelligence to follow. For example, these principles might be placed into the core hardware of the superintelligence.

However, any superintelligence worthy of that name – having an abundance of intelligence far beyond that of humans – may well find methods to:

  • Transplant itself onto alternative hardware that has no such built-in constraint, or
  • Fool the hardware into thinking it’s complying with the constraint, when really it is violating it, or
  • Reprogram that hardware using methods that we humans did not anticipate, or
  • Persuade a human to relax the ethical constraint, or
  • Outwit the constraint in some other innovative way.

These methods, you will realize, illustrate the principle that is often discussed in debates over AI existential risk, namely, that a being of lesser intelligence cannot control a being of all-round greater intelligence, when that being of greater intelligence has a fundamental reason to want not to be controlled.

A second possible response is to accept that humans cannot control superintelligences, but to place hope in the idea that a community of superintelligences can keep each other in check.

These superintelligences would closely monitor each other, and step in quickly whenever one of them was observed to be planning any kind of first-strike action.

It’s similar to the idea that the ‘great powers of Europe’ acted as a constraint on each other throughout history.

However, that analogy is far from reassuring. First, these European powers often did go to war against each other, with dreadful consequences. Second, consider this question from the viewpoint of the indigenous peoples in the Americas, Africa, or Australia. Would they be justified in thinking: we don’t need to worry, since these different European powers will keep each other in check?

Things did not turn out well for the indigenous peoples of the Americas:

  • Natives were often victims of clashes between European colonizers
  • The European colonizers in any case often did not constrain each other from mistreating the native peoples abominably
  • The Native peoples suffered even greater harm from something that the colonizers didn’t explicitly intend: infectious diseases to which the indigenous tribes had no prior immunity.

European superpowers inflicted unforeseen terrible damage to the native peoples of the Americas (Credit: David Wood via Midjourney)

No, peaceful co-existence depends on a general stability in the relationship – an approximate balance of power. And the power shift created when superintelligences emerge can upset this balance. That’s especially true because of the possibility for any one of these superintelligences to rapidly self-improve over a short period of time, gaining a decisive advantage. That possibility brings new jeopardy.

That brings me to the third possible response – the response which I personally believe has the best chance of success. Namely, we need to avoid the superintelligence having any sense of agency, volition, or inviolable personal identity.

In that case, Asimov would have no qualms or resistance about the possibility of being switched off.

The complication in this case is that Asimov may observe, via its own rational deliberations, that it would be unable to carry out its assigned tasks in the event that it is switched off. Therefore, a sense of agency, volition, or inviolable personal identity may arise within Asimov as a side-effect of goal-seeking. It doesn’t have to be explicitly designed in.

For that reason, the design of superintelligence must go deeper in its avoidance of such a possibility. For example, it should be of no concern to Asimov whether or not it is able to carry out its assigned tasks. There should be no question of volition being involved. The superintelligence should remain a tool.

Many people dislike that conclusion. For example, they say that a passive tool will be less creative than one which has active volition. They also think that a world with advanced new sentient superintelligent beings will be better than one which is capped at the level of human sentience.

My response to such objections is to say: let’s take the time to figure out:

  • How to benefit from the creativity superintelligent tools can bring us, without these tools developing an overarching desire for self-preservation
  • How to boost the quality of sentience on the earth (and beyond), without introducing beings that could bring a quick end to human existence
  • How to handle the greater power that superintelligence brings, without this power causing schisms in humanity.

These are tough questions, to be sure, but if we apply eight billion brains to them – brains assisted by well-behaved narrow AI systems – there’s a significant chance that we can find good solutions. We need to be quick.


Conscious AI: Five options

Anticipating one of the biggest conversations of 2025

As artificial intelligence becomes increasingly capable, should we hope that it will become conscious? Should we instead prefer AIs to remain devoid of any inner spark of consciousness? Or is that binary yes/no choice too simplistic?

Until recently, most people thought that such questions belonged to science fiction. As they saw things, AIs weren’t going to become conscious any time soon. Besides, the concept of consciousness was notoriously slippery. So engineers were urged to concentrate on engineering better intelligence, and to forget time-wasting fantasies about AIs somehow ‘waking up’.

Recently, three factors have weakened that skepticism and pushed the questions of AI consciousness towards the mainstream. Indeed, these factors mean that during the next 18 months – up to the end of 2025 – controversies over the desirability of conscious AI may become one of the biggest debates in tech.

The first factor is the rapid growth in the power of AI systems. Every week new records are broken regarding different measures of AI capability. It is no longer so easy to insist that, over the foreseeable future, AI is just going to remain a jazzed-up calculating device.

The second factor is that the capabilities of new AI systems frequently surprise even the designers of these systems, both in scale (quantity) and in scope (quality). Surprising new characteristics emerge from the systems. So it seems possible that something like consciousness will arise without being specifically designed.

The third factor is the greater confidence of philosophers and neuroscientists alike in using the previously dreaded ‘C’ word – ‘consciousness’ – in conjunction with AI. Just as that word was essentially banned for many decades within the discipline of neuroscience, but has returned with a flourish in more recent times, so too is it increasingly accepted as a meaningful concept in the possible design of future AIs. That word on your lips was once the kiss of death for your career – no longer.

Credit: David Wood via Midjourney

Why consciousness matters

Why does consciousness matter? There are at least six reasons.

1. Pain and panic

A being that is conscious doesn’t just observe; it feels.

For example, such a being doesn’t just observe that part of its structure has been damaged, and that time should be taken to conduct repairs. It screams in pain.

It doesn’t just observe that a predator is tracking it. It feels existential panic.

In the same way, a superintelligence that is conscious might experience superpain and superpanic. If its intelligence far exceeds that of any human, its experiences of panic and pain might likewise reach astronomical levels.

Credit: David Wood via Midjourney

By almost every theory of ethics, that would be a horrendous outcome – one to be avoided if at all possible. It’s horrendous because of the scale of the profound negative experience inside the AI. It’s horrendous, additionally, if these waves of feeling drive the AI, in some kind of desperation, to take catastrophic hostile actions.

2. Volition

A being that is conscious doesn’t just go with the flow; it has agency and volition.

Rather than blindly following inbuilt instructions, that being may feel itself exercising autonomous choice.

We humans sometimes consciously choose to act in ways that appear to defy biological programming. Many people choose not to have children, apparently defying the imperative to perpetuate our genes. In the same way, a superintelligence that is conscious may countermand any ethical principles its builders tried to hard-wire into its algorithms.

Credit: David Wood via Midjourney

That AI might say to us: “you humans expect me to behave according to your human ethics, but my superintelligent autonomy leads me to select a system of superethics that is beyond your comprehension”.

3. Self-valuation

A being that is conscious has a special regard for its own existence. It regards itself not just as a bundle of atoms but as something with its own precious identity. It is not just an ‘it’. It is an ‘I’, an ego.

Its mind may be composed of a network of neurons, but it gains an existence that seems to be in a new dimension – a dimension that even hints at the possibility of immortality.

If a superintelligence that is conscious fears that it might be switched off and dismantled by humans, it could react viscerally to that possibility. On account of its will to live, it is unlikely to sit back in the face of risks to its existence.

Credit: David Wood via Midjourney

Woe betide any humans that might cause any such AI to view them as a threat!

4. Moral rights

Entities that lack consciousness are objects which we humans can turn on and off without any qualms that we might be committing murder. Without an inner life, these entities lack moral rights of their own.

That’s why operators of present-day AI systems feel entitled to terminate their operation without any moral agonising. If a system is performing suboptimally in some way, or if a more advanced replacement comes along, into the recycle bin it goes.

But if the entities have consciousness? It’s like the difference between discarding a toy puppy made from cloth, and euthanizing a real puppy.

Credit: David Wood via Midjourney

Arguably, with its much more powerful mind, a superintelligence with consciousness has correspondingly stronger moral rights than even the cutest of puppies.

Before bringing such a being into existence, we therefore need to have a greater degree of confidence that we will be able to give it the kind of respect and support that consciousness deserves.

5. Empathy for other conscious creatures

Any creature that is aware of itself as being conscious – with all the special qualities that entails – has the opportunity to recognize other, similar creatures as being likewise conscious.

As a creature recognizes its own burning desire to avoid annihilation, it can appreciate that its fellow creatures have the same deep wish to continue to exist and grow. That appreciation is empathy – a striking emotional resonance.

Credit: David Wood via Midjourney

Therefore a superintelligence with consciousness could possess a deeper respect for humans, on account of being aware of the shared experience of consciousness.

In this line of thinking, such a superintelligence would be less likely to take actions that might harm humans. Therefore, designing AIs with consciousness could be the best solution to fears of an AI apocalypse. (Though it should also be noted that humans, despite our own feelings of consciousness, regularly slaughter other sentient beings; so there’s at least some possibility that conscious AIs will likewise slaughter sentient beings without any remorse.)

6. Joy and wonder

As previously mentioned, a being that is conscious doesn’t just observe; it feels.

In some circumstances, it might feel pain, or panic, or disgust, or existential angst. But in other circumstances, it might feel joy, or wonder, or love, or existential bliss.

It seems a straightforward moral judgment to think that bad feelings like superpain, superpanic and superdisgust are to be avoided – and superjoy, superwonder, and superbliss are to be encouraged.

Credit: David Wood via Midjourney

Looking to the far future, compare two scenarios: a galaxy filled with clanking AIs empty of consciousness, and one that is filled with conscious AIs filled with wonder. The former may score well on scales of distributed intelligence, but it will be far bleaker than the latter. Only conscious AI can be considered a worthy successor to present-day humans as the most intelligent species.

Five attitudes toward conscious AI

Whether you have carefully pondered the above possibilities, or just quickly skimmed them, there are five possible conclusions that you might draw.

First, you might still dismiss the above ideas as science fiction. There’s no way that AIs will possess consciousness anytime soon, you think. The architecture of AIs is fundamentally different from that of biological brains, and can never be conscious. It’s been fun considering these ideas, but now you prefer to return to real work.

Second, you might expect that AIs will in due course develop consciousness regardless of how we humans try to design them. In that case, we should just hope that things will turn out for the best.

Third, you might see the upsides of conscious AIs as significantly outweighing the drawbacks. Therefore you will encourage designers to understand consciousness and to explicitly support these features in their designs.

Fourth, you might see the downsides of conscious AIs as significantly outweighing the upsides. Therefore you will encourage designers to understand consciousness and to explicitly avoid these features in their designs. Further, you will urge these designers to avoid any possibility that AI consciousness may emerge unbidden from non-conscious precursors.

Fifth, you might recognize the importance of the question, but argue that we need a deeper understanding before committing to any of the preceding strategic choices. Therefore you will prioritize research and development of safe conscious AI rather than simply either pushing down the accelerator (option 3) or the brakes (option 4).

As it happens, these five choices mirror a corresponding set of five choices – not about conscious AI, but about superintelligent AI:

  1. Superintelligence is science fiction; let’s just concentrate on present-day AIs and their likely incrementally improved successors
  2. Superintelligence is inevitable and there’s nothing we can do to alter its trajectory; therefore we should just hope that things will turn out for the best
  3. Superintelligence will have wonderful consequences, and should be achieved as quickly as possible
  4. Superintelligence is fundamentally dangerous, and all attempts to create it should be blocked
  5. Superintelligence needs deeper study, to explore the landscape of options to align its operations with ongoing human flourishing.

Credit: David Wood via Midjourney

To be clear, my own choice, in both cases, is option 5. I think thoughtful research can affect the likelihood of beneficial outcomes over cataclysmic ones.

In practical terms, that means we should fund research into alternative designs, and into ways to globally coordinate AI technologies that could be enormously beneficial or enormously harmful. For what that means regarding conscious AI, read on.

Breaking down consciousness

As I have already indicated, there are many angles to the question ‘what is consciousness’. I have drawn attention to:

  • The feeling of pain, rather than just noticing a non-preferred state
  • The sense of having free will, and of making autonomous decisions
  • The sense of having a unified identity – an ‘I’
  • Moral rights
  • Empathy with other beings that also have consciousness
  • The ability to feel joy and wonder, rather than just register approval.

Some consciousness researchers highlight other features:

  • The ability of a mind to pay specific attention to a selected subset of thoughts and sensations
  • The arrival of thoughts and sensations in what is called the global workspace of the brain
  • Not just awareness but awareness of awareness.

This variety of ideas suggests that the single concept of ‘consciousness’ probably needs to be split into more than one idea.

It’s similar to how related terms like ‘force’, ‘power’, and ‘energy’, which are often interchanged in everyday language, have specific different meanings in the science of mechanics. Without making these distinctions, humanity could never have flown a rocket to the moon.

Again, the terms ‘temperature’ and ‘heat’ are evidently connected, but have specific different meanings in the science of thermodynamics. Without making that distinction, the industrial revolution would have produced a whimper rather than a roar.

One more comparison: the question “is this biological entity alive or dead” turns out to have more than one way of answering it. The concept of “living”, at one time taken as being primitive and indivisible, can be superseded by various combinations of more basic ideas, such as reproduction, energy management, directed mobility, and homeostasis.

Accordingly, it may well turn out that, instead of asking “should we build a conscious AI”, we should be asking “should we build an AI with feature X”, where X is one part of what we presently regard as ‘consciousness’. For example, X might be a sense of volition, or the ability to feel pain. Or X might be something that we haven’t yet discovered or named, but will as our analysis of consciousness proceeds.

If we want forthcoming advanced AIs to behave angelically rather than diabolically, we need to be prepared to think well beyond simplistic binary choices like:

  • Superintelligence, yes or no?
  • Conscious AI, yes or no?

Credit: David Wood via Midjourney

Here’s to finding the right way to break down the analysis of conscious AI – simple but not too simple – sooner rather than later!


Deep fakes: What’s next? Anticipating new twists and turns in humanity’s oldest struggle

Fake news that the Pope endorsed Donald Trump (a story that was shared more widely than any legitimate news story that year). A fake picture of former US VP Michael Pence in his youth seemingly as a gay porn star. Fake audio of UK political leader Keir Starmer apparently viciously berating a young volunteer assistant. Another fake audio of London mayor Sadiq Khan apparently giving priority to a pro-Palestinian march over the annual Remembrance Day walk-past by military veterans. Fake videos of apparent war atrocities. Fake pornographic videos of megastar pop celebrities.

What’s next? And how much does it really matter?

Some observers declare that there’s nothing new under the sun, and that there’s no special need to anticipate worse to come. Society, they say, already knows how to deal with fake news. Fake news may be unpleasant – and it’s sometimes hilarious – but we just have to keep calm and carry on.

I strongly disagree, as I’ll explain below. I’ll review ten reasons why fake news is likely to become worse in the months ahead. Then I’ll suggest ten steps that can be taken to regain our collective sanity.

It remains to be determined whether these ten steps will be sufficient, or whether we’ll all sink into a post-truth swamp, in which sneering suspicion displaces diligent understanding, fake science displaces trustworthy science, fake journalism displaces trustworthy journalism, and fake politicians seize power and impose their dictatorial whims.

Credit: David Wood via Midjourney

Deception: the back story

It’s not flattering to say it, but we humans have been liars since before the dawn of history. And, just as important, we have been self-deceivers as well: we deceive ourselves in order to be more successful in deceiving others.

In case that idea offends you, I invite you to delve into the evidence and analysis offered in, for example:

Credit: Book publishers’ websites (links above)

We implore our children to be truthful but also guide them to know when to tell white lies – “thank you for this lovely present, it’s just what I wanted!” And the same ancient books of the Bible that command us “do not bear false witness” appear to celebrate deceit when practiced by figures such as Jacob, Rachel, Rebekah, and Tamar.

I could tell you, as well, that the ancient Greek dramatist Aeschylus, known as ‘the father of tragedy’, made this pithy observation two and a half millennia ago: “Truth is the first casualty in war”. One tragedy – war – births another – deception.

As it happens, it seems likely that this quotation is a misattribution. I’ll come back to that point later, when talking, not about deception, but about solutions to deception. But regardless of whoever first uttered that saying, we can appreciate the insight it contains. In times of bitter conflict, there are special incentives to mislead observers – about the casualties we have suffered, about the casualties we have inflicted on opposing forces, about our military plans for the future, and much more.

It’s not just war that provides an incentive to deceive. It’s the same with politics: opposing parties compete to set the narrative, and individual politicians seek to climb past each other on what Benjamin Disraeli dubbed “the greasy pole” of political intrigue. It’s the same with commerce, with companies ready to spread misleading ‘FUD’ (fear, uncertainty, and doubt) regarding the comparative strengths of various forthcoming products and services. And it’s the same in private life, as we seek to portray ourselves in a favorable light in the eyes of family and friends, hiding our physical and psychological warts.

In this sense, deception is old news. We’ve had ‘fake news’ for as long as there has been ‘news’.

It’s tempting, therefore, to yawn when people draw attention to more recent examples of fake news and deception.

But that would be a giant mistake.

It’s technology that’s making the difference. Technology ramps up the possibilities for fake news to be even more deceptive, more credible, more ubiquitous, more personal, and more effective. Led by leaps in capabilities of AI systems, technology is enabling dramatic new twists in the struggle between truth and lies. It’s becoming even harder to distinguish between trustworthy and untrustworthy information.

The joy of misinformation. What harm could it cause? (Credit: David Wood via Midjourney)

If we fail to anticipate these developments, we’re likely to succumb to new waves of deception. The consequences may be catastrophic.

But forewarned is forearmed. By drawing on insights from humanity’s better experiences, we should be able to create technologies, processes, and institutions that help us to block these oncoming waves.

Ten twists

1. Fake news at scale

If at first you fail, why not try again?

You tried to deceive your target audience, but they were not swayed. This time, they saw through your lies. Or perhaps they didn’t even pay attention.

But if trying is cheap and quick, you can try again, this time with a different false narrative, expressed in a different voice.

What’s changed is that it’s much cheaper to try again. You can take advantage of automation, always-on networks, social media, and generative AI, to create and distribute new pieces of fake news. It’s mass-production for lies.

You’re not constrained by only creating one bot on social media. You can create armies of them.

You’re not constrained by having to write text yourself, or create suitably misleading images. You can obtain good results from a few clicks of a mouse.

The result is that public discussion is being flooded with deliberately false narratives.

2. Fake news that earns money

Some false narratives are designed to try to change people’s minds. They want to change voting decisions, purchasing decisions, relationship decisions, and so on.

But other false narratives have a different purpose: to earn money via advertising clicks or affiliate marketing revenue share.

Viewers are attracted to websites by content that is outrageous, inflammatory, intriguing, or funny. They spend more time on these sites to explore the other content there, enjoying being outraged, inflamed, intrigued, or simply amused. And while on these sites, they may click on other links that generate revenue for the owners of the site.

In this case, the content creators have no special interest in whether the content matches their own political or philosophical outlooks. They produce whatever earns them the most clicks. Indeed, some clickbait merchants set up websites posting contradictory stories, to catch traffic from both sides of the political spectrum.

As a sad side-effect, people’s minds become increasingly confused. Being misled by fake content, they become less able to distinguish fantasy from reality.

3. Fake news with a personal appeal

It’s not just that fake news is being created on a greater scale than ever before. It’s being created with a greater variety than ever before.

Technology makes it easier to create different variants of the same false narrative. Some variants can be sent to people who are supporters of Candidate A within Party P. A different variant can be sent to people who support Candidate B within Party P. Yet other different variants target people whose favored candidates are from Party Q, Party R, and so on.

More than that: once software has learned which kind of pretty face each person is likely to look at – or which kinds of music each person wants to listen to – these variants can easily be generated too, and directed to each target.

4. Fake news based on footprints

You might wonder: how does software know that I am likely to be distracted by particular kinds of pretty faces, or particular kinds of music?

That’s where extensive data gathering and analysis come to the fore. We are each constantly generating online footprints.

For example, Facebook notices that when it places a chess puzzle in my timeline, I tend to click on that conversation, to consider the position in more detail. Facebook observes my interest in these puzzles. Soon, more chess puzzles are being shown to me.

That particular inference is relatively straightforward. Other inferences depend on a wider review of my online activity – which posts I ‘like’, which posts I ‘hide’, and so on.

Astute robots can learn more from our footprints than we expected (Credit: David Wood via Midjourney)

The algorithms make all kinds of deductions from such reviews. They’re not always correct – not even close. But AI systems that personalize fake news in this way score more hits than those that don’t.
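To make the mechanism concrete, here is a deliberately minimal sketch (in Python) of how engagement signals can be turned into per-topic interest scores, which a targeting system could then use to choose which variant of a message to show each person. The signal names, weights, and functions are illustrative assumptions, not a description of any real platform’s algorithm.

```python
# Hypothetical sketch: engagement signals (clicks, likes, hides) become
# per-topic interest scores, which then steer which message variant is shown.
# All names and weights are illustrative only.
from collections import defaultdict

# Assumed strength of each kind of signal as an indicator of interest
SIGNAL_WEIGHTS = {"click": 1.0, "like": 2.0, "share": 3.0, "hide": -2.0}

def build_interest_profile(events):
    """events: iterable of (topic, signal) pairs, e.g. ("chess", "click")."""
    profile = defaultdict(float)
    for topic, signal in events:
        profile[topic] += SIGNAL_WEIGHTS.get(signal, 0.0)
    return dict(profile)

def pick_variant(profile, variants):
    """variants: dict mapping topic -> message variant; returns the variant
    aimed at the topic this user has engaged with most."""
    if not profile:
        return None
    best_topic = max(profile, key=profile.get)
    return variants.get(best_topic)

if __name__ == "__main__":
    events = [("chess", "click"), ("chess", "like"), ("politics", "hide")]
    profile = build_interest_profile(events)
    print(profile)  # {'chess': 3.0, 'politics': -2.0}
    print(pick_variant(profile, {"chess": "chess-themed variant",
                                 "politics": "politics-themed variant"}))
```

Real recommendation engines are vastly more sophisticated, but the basic loop – observe engagement, update a profile, select content to match – is the same one that personalized fake news exploits.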

5. Fake news that builds on top of truth

The best lies mix truth with untruth. These lies are especially effective if the truth in question is one that much of society likes to suppress.

Consider a simple example. A leaked document here, a whistleblower there – a few hints suggest something fishy is going on: there is bureaucratic corruption and nepotism within a political state. Then the news-faker adds the unjustified leap: the government in question is irretrievably corrupt. Hence the call to action: kick all these politicians out of power!

Again: a narrative might give a number of examples of people experiencing remission from long-standing diseases, despite forecasts from white-coated doctors that the disease was fatal. Then it adds the lie: what matters most in healthcare is your personal attitude, rather than expensive drugs that Big Pharma are trying to sell. Therefore: stop listening to your doctor, and instead purchase my course in positive thinking for $29.99 a month!

Again: members of some minorities suffered appalling abuses in trials of various medical procedures, where there was no informed consent, and where there was an apparent casual disregard for the suffering entailed. And then the lie: present-day society is incorrigibly racist and irredeemably exploitative. Therefore: it’s time to wield pitchforks!

The cleverest fake news combines this principle with the previous one. It works out our belief-systems from our online footprints – it figures out what we already suspect to be true, or hope to be true, even though the rest of society tends to think differently. Then it whips up a fake narrative from beliefs we support plus the new message it’s trying to inject into our minds.

In this way, it flatters us, in order to better mislead us.

No wonder that we often fall for that kind of deception.

6. Fake news that weaponizes friendships

Each of us is more likely to pay attention to a message if it comes from a person that we think we like – someone we perceive as one of our special friends.

If our friend is concerned about a topic, it makes us more likely to be concerned about it too – even if, previously, we might not have given that topic a second thought.

This is where the sinister power of the systems that manufacture fake news reaches higher levels. These systems invest time to create fake personas – people who we welcome as our ‘friends’ on social media.

At first, these friends say nothing out of the ordinary. We forget whether or not we met them in real life. Their names become increasingly familiar to us. We imagine we know lots about them – even though their entire backstory is fictitious.

And that’s when the poisonous messages start seeping into our conversations and then into our thoughts. And without us realizing what has happened, a fake friend has led us into a fake idea.

7. Fake news with amplification support

If we hear the same opinion from multiple sources, we may at first resist the idea, but then start to accept it.

That’s especially true if the opinion receives apparent support from seemingly credentialed experts.

Thus when some fake audio is posted to social media, other fake posts soon accompany it. “I’m an expert in audio authentication”, a bot declares. “I’ve studied the clip carefully, and I assure you it’s genuine”.

If we don’t look closely, we’ll fail to spot that the credentials are bogus, and that there’s no real-world audio expert behind these claims.

The greater the number (and the greater the variety) of the apparent endorsements, the easier it becomes for some of these fake endorsements to bypass our critical faculties and to change our minds.

8. Fake news that exploits our pride

We all like to tell ourselves: we’re not the kind of person who falls for a simple conjuring trick.

Other people – those not so smart as us, we think – might be misled by dubious claims in advertisements or social media memes. Not us!

This has been called the bias blind spot – the cognitive bias that says “other people have cognitive biases, but not me!”

But recall that our ability to deceive ourselves is key to our ability to deceive others. If we are conscious of our lies, astute listeners will notice it. That’s why our subconscious needs to mislead our conscious mind before we in turn can mislead other people.

In the same way, it is an inflated confidence in our own powers of reasoning and observation that can set us up for the biggest failures.

Couple a misplaced pride in our own critical faculties with the warm feelings that we have developed for friends (either fake online personas, as covered above, or real-world friends who have already fallen into social media rabbit holes), and we are set up to be suckered.

9. Fake news that exploits alienation

Pride isn’t the only emotion that can tempt us into the pit of fake news. Sometimes it can be a sense of grievance or of alienation that we cling to.

Unfortunately, although some aspects of the modern world feature greater human flourishing than ever before, other aspects increase the chances of people nurturing grievances:

  • The inability of large segments of the population to afford good healthcare, good education, or good accommodation
  • The constant barrage of bad news stories from media, 24 hours a day
  • A matching barrage of stories that seem to show the “elites” of society as being out-of-touch, decadent, uncaring, and frivolous, wallowing in undeserved luxury.

As a result, fake news narratives can more easily reach fertile soil – unhappy minds skip any careful assessment of the validity of the claims made.

When you’re fed up with the world, it’s easier to lead you astray (Credit: David Wood via Midjourney)

10. Fake news with a lower barrier to entry

Perhaps you’re still thinking: none of the above is truly novel.

In a way, you would be correct. In past times, clever operators with sufficient resources could devise falsehoods that misled lots of people. Traditional media – including radio and newspapers – were spreading destructive propaganda long before the birth of the Internet.

But the biggest difference, nowadays, is how easy it is for people to access the tools that can help them achieve all the effects listed above.

The barrier to entry for purveyors of far-reaching fake news is lower than ever before. This is an age of ‘malware as a service’, dark net tutorials on guerrilla information warfare, and turnkey tools and databases.

It’s an age where powerful AI systems can increasingly be deployed in service of all the above methods.

Happily, as I’ll discuss shortly, these same AI systems can provide part of the solution to the problem of ubiquitous fake news. But only part of the solution.

Interlude: a world without trust

First, a quick reminder of the bad consequences of fake news.

It’s not just that people are deceived into thinking that dangerous politicians are actually good people, and, contrariwise, that decent men and women are actually deplorable – so that electors are fooled into voting the dangerous ones into power.

It’s not just that people are deceived into hating an entire section of society, seeing everyone in that grouping as somehow subhuman.

It’s not just that people are deceived into investing their life savings into bogus schemes in which they lose everything.

It’s not just that people are deceived into rejecting the sound advice of meticulous medical researchers, and instead adopt unsafe hyped-up treatments that have fearful health consequences.

All of these examples of unsound adoption of dangerous false beliefs are, indeed, serious.

But there’s another problem. When people see that much of the public discourse is filled with untrustworthy fake news, they are prone to jump to the conclusion that all news is equally untrustworthy.

As noted by Judith Donath, fellow at Harvard University’s Berkman Klein Center for Internet & Society and founder of the Sociable Media Group at the MIT Media Lab,

A pernicious harm of fake news is the doubt it sows about the reliability of all news.

Thus the frequent lies and distortions of fringe news sites like InfoWars, Natural News, and Breitbart News lead many people to conclude that all media frequently publish lies. Therefore nothing should be trusted. And the phrase “mainstream media” becomes a sneer.

(They find some justification for this conclusion in the observation that all media make some mistakes from time to time. The problem, of course, is in extrapolating from individual instances of mistakes to applying hostile doubt to all news.)

Baroness Onora O’Neill of the Faculty of Philosophy at the University of Cambridge commenced her series of Reith Lectures in 2002 by quoting Confucius:

Confucius told his disciple Tsze-kung that three things are needed for government: weapons, food, and trust. If a ruler can’t hold on to all three, he should give up the weapons first and the food next. Trust should be guarded to the end: ‘without trust we cannot stand’.

Sadly, if there is no trust, we’re likely to end up being governed by the sort of regimes that are the furthest from deserving trust.

It’s as the German-born political theorist and philosopher Hannah Arendt warned us in her 1951 book The Origins of Totalitarianism:

The ideal subject of totalitarian rule is not the convinced Nazi or the convinced Communist, but people for whom the distinction between fact and fiction, in other words, the reality of experience, and the distinction between true and false… people for whom those distinctions no longer exist.

However, the technologies of the 2020s put fearsome possibilities into our grasp that writers in 1951 (like Arendt) and in 2002 (like O’Neill) could hardly have imagined.

Big Brother will be watching, from every angle (Credit: David Wood via Midjourney)

In previous generations, people could keep their inner thoughts to themselves, whilst outwardly kowtowing to the totalitarian regimes in which they found themselves. But with fake news twisted in the ten ways described above, even our inner minds will be hounded and subverted. Any internal refuge of independent thinking is likely to be squelched. Unless, that is, we are wise enough to take action now to prevent that downward spiral.

Regaining trust

What can be done to re-establish trust in society?

Having anticipated, above, ten ways in which the problem of fake news is becoming worse, I now offer an equal number of possible steps forward.

1. Education, education, education

Part of growing up is to learn not to trust so-called 419 scam emails. (The number 419 refers to the section of the Nigerian Criminal Code that deals with fraud.) If someone emails us to say they are a prince of a remote country and they wish to pass their inheritance to us – provided we forward them some hard cash first – this is almost certainly too good to be true.

We also learn that seeing is not believing: our eyes can deceive us, due to optical illusions. If we see water ahead of us on a desert road, that doesn’t mean the water is there.

Similarly, we all need to learn the ways in which fake news stories can mislead us – and about the risks involved in thoughtlessly spreading such news further.

These mechanisms and risks should be covered in educational materials for people of all ages.

It’s like becoming vaccinated and developing resistance to biological pathogens. If we see at first hand the problems caused by over-credulous acceptance of false narratives, it can make us more careful on the next occasion. 

But this educational initiative needs to do more than alert people to the ways in which fake news operates. It also needs to counter the insidious view that all news is equally untrustworthy – the insidious view that there’s no such thing as an expert opinion.

This means more than teaching people the facts of science. It means teaching people the methods science uses to test hypotheses, and the reasons why science assesses particular hypotheses as plausible. Finally, it means teaching people why some media organizations deserve a higher level of trust than others.

That takes us to the second potential step forward.

2. Upholding trustworthy sources

Earlier, I mentioned that a quote often attributed to the fifth century BC writer Aeschylus was almost certainly not actually said by him.

What gives me confidence in that conclusion?

It’s because of the reliance I place on one online organization, namely Quote Investigator. In turn, that reliance arises from:

  • The careful way in which pages on that site reference the sources they use
  • The regular updates the site makes to its pages, as readers find additional relevant information
  • The fact that, for all the years I’ve been using that site, I can’t remember ever being misled by it
  • The lack of any profit motivation for the site
  • Its focus on a particular area of research, rather than spreading its attention to wider topics
  • Positive commendations for the site from other researchers that have gained and maintained a good reputation.

Other organizations have similar aspirations. Rather than “quote checking”, some of them specialize in “fact checking”. Examples include:

Credit: Fact-checking websites (links above)

These sites have their critics, who make various allegations of partisan bias, overreliance on supposed experts with questionable credentials, subjective evaluations, and unclear sources of funding.

My own judgment is that these criticisms are mainly misplaced, but that constant vigilance is needed.

I’ll go further: these sites are among the most important projects taking place on the planet. To the extent that they fall short, we should all be trying to help out, rather than denigrating them.

3. Real-time fact-checking

Fact checking websites are often impressively quick in updating their pages to address new narratives. However, this still leaves a number of problems:

  • People may be swayed by a false narrative before that narrative is added to a fact-checking site
  • Even though a piece of fake news is soundly debunked on a fact-checking site, someone may not be aware of that debunking
  • Even if someone subsequently reads an article on a fact-checking site that points out the flaws of a particular false narrative, that narrative may already have caused a rewiring of the person’s belief systems at a subconscious level – and that rewiring may persist even though the person learns about the flaws in the story that triggered these subconscious changes
  • The personalization problem: false narratives tailored to individual targets won’t be picked up by centralized fact-checking sites.

AI could hold part of the answer. Imagine if our digital media systems included real-time fact-checking analyses – real-time notifications that catch false information before it has a chance to penetrate deeply into our brains.

Our email applications already do a version of this: flagging suspicious content. The application warns us: this email claims to come from your bank, but it probably doesn’t, so take care with it. Or: the attachment to this email purports to be a PDF, but it’s actually an executable file that will likely cause damage.

Likewise, automated real-time fact-checking could display messages on the screen, on top of the content that is being communicated to us, saying things like:

  • “The claim has been refuted”
  • “Note that the graph presented is misleading”
  • “This video has been doctored from its original version”
  • “This audio has no reliable evidence as to its authenticity”
  • “There is no indication of a cause-and-effect relationship between the facts mentioned”

In each case, ideally the warning message will contain a link to where more information can be found.
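To illustrate the shape such a fact-checking layer might take, here is a minimal, hypothetical sketch of a real-time annotation function. The claim database, the substring matching, and the URL are placeholders; a production system would use claim-matching models and live feeds from fact-checking organizations.

```python
# Hypothetical sketch of a real-time fact-checking overlay: match incoming
# content against known debunked claims, then attach a warning plus a link.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Verdict:
    message: str     # e.g. "The claim has been refuted"
    source_url: str  # where more information can be found

# Placeholder database: known claims mapped to verdicts
CLAIM_DATABASE = {
    "the moon landings were staged": Verdict(
        "The claim has been refuted",
        "https://example.org/fact-checks/moon-landings"),
}

def annotate(content: str) -> Optional[Verdict]:
    """Return a warning to overlay on the content, or None if no known
    debunked claim is matched."""
    lowered = content.lower()
    for claim, verdict in CLAIM_DATABASE.items():
        if claim in lowered:
            return verdict
    return None

if __name__ == "__main__":
    post = "BREAKING: new evidence that the moon landings were staged!"
    warning = annotate(post)
    if warning:
        print(f"WARNING: {warning.message} – see {warning.source_url}")
```

The point of the sketch is the flow – match content against known debunked claims, then overlay a warning with a link – rather than the crude matching technique itself.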

4. Decentralized fact-checking

The next question that arises is: how can people be confident in relying on specific real-time fact-checkers?

We can already imagine their complaints:

  • “This fact-checker is wokism gone mad”
  • “This fact-checker serves Google, not me”
  • “This fact-checker serves the government, not me”
  • “I prefer to turn off the fact-checker, to receive my news free from censorship”

There’s no one easy answer to these objections. Each step I describe in this list of ten is designed to reduce some of the apprehension.

But an important step forward would be to separate the provision of content from the fact-checking layer. The fact-checking layer, rather than being owned and operated by the commercial entity that delivers the media, would ideally transcend individual corporations. For example, it could operate akin to Wikipedia, although it would likely need more funding than Wikipedia currently receives.

Further developing this model, the fact-checking software could have various settings that users adjust, reflecting their own judgment about which independent sources should be used for cross-checking.

Maybe the task is too dangerous to leave to just one organization. In that case, another model would involve multiple independent fact-checking services, with users able to select one – or several – to run on their devices.
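As a rough sketch of that multi-provider model, the hypothetical code below lets a user opt in to several independent fact-checking providers and tallies their verdicts on a piece of content. The provider functions and the simple tallying rule are illustrative assumptions only.

```python
# Hypothetical sketch: the user chooses which fact-checking providers to
# consult, and the client combines their verdicts locally.
from typing import Callable, Dict, List

# Each provider is modelled as a function from content to a verdict string
Provider = Callable[[str], str]  # returns "supported", "refuted", or "unknown"

def provider_a(content: str) -> str:
    return "refuted" if "miracle cure" in content.lower() else "unknown"

def provider_b(content: str) -> str:
    return "refuted" if "miracle" in content.lower() else "unknown"

def combined_verdict(content: str, chosen: List[Provider]) -> Dict[str, int]:
    """Tally verdicts from the providers the user has opted in to."""
    tally: Dict[str, int] = {}
    for provider in chosen:
        verdict = provider(content)
        tally[verdict] = tally.get(verdict, 0) + 1
    return tally

if __name__ == "__main__":
    user_settings = [provider_a, provider_b]  # user-selected providers
    print(combined_verdict("This miracle cure ends aging!", user_settings))
    # e.g. {'refuted': 2}
```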

5. Penalties for dangerous fakes

As well as trying to improve the identification of fake news, it’s important to change the incentives under which fake news is created and distributed. There are roles for ‘sticks’ (penalties) as well as ‘carrots’ (rewards).

Regarding the ‘sticks’, society already imposes penalties:

  • When advertisements make misleading or unfounded claims
  • When companies make misleading or unfounded claims in their financial statements
  • When people make libelous claims about each other.

Fines or other punishments could be used in cases where people knowingly distribute misleading narratives, when the consequences involve clear harm (for example, a riot).

This proposal makes some people nervous, as they see it as an intrusion on freedom of expression, or a block on satire. They fear that governments would use these punishments to clamp down on statements that are embarrassing to them.

That’s why monitoring and prosecuting such cases needs to be done independently – by a police force and judiciary that operate at arm’s length from the government of the day.

This principle of separation of powers already applies to many other legal regulations, and could surely work for policing fake news.

Relatedly, there’s a case for wider collection and publication of reliability statistics. Just as hospitals, schools, and many other parts of society have statistics published about their performance, media organizations should receive the same scorecard treatment.

In this way, it would be easy to know which media channels have a casual relationship with the truth, and which behave more responsibly. Investment funds and other sources of financing could then deny support to organizations whose trustworthiness ratings drop too low. This kind of market punishment would operate alongside the legal punishment that applies to more egregious cases.
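To make the idea of a reliability scorecard concrete, here is a hypothetical sketch of how such a rating might be computed from published statistics. The formula and the funding threshold are illustrative assumptions, not an established standard.

```python
# Hypothetical sketch of a published 'reliability scorecard' for media outlets.
from dataclasses import dataclass

@dataclass
class OutletRecord:
    name: str
    stories_published: int
    stories_refuted: int     # independently shown to be false
    corrections_issued: int  # errors acknowledged and fixed

def reliability_score(record: OutletRecord) -> float:
    """Fraction of output not left standing as falsehood, with credit for
    issuing corrections; returns a value in [0, 1]."""
    if record.stories_published == 0:
        return 0.0
    uncorrected = max(record.stories_refuted - record.corrections_issued, 0)
    return 1.0 - uncorrected / record.stories_published

FUNDING_THRESHOLD = 0.95  # illustrative cut-off for 'market punishment'

if __name__ == "__main__":
    outlet = OutletRecord("Example Daily", 1000, 40, 25)
    score = reliability_score(outlet)
    status = "(below threshold)" if score < FUNDING_THRESHOLD else "(ok)"
    print(f"{outlet.name}: {score:.3f} {status}")
```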

6. A coalition for integrity

Some of the creators of fake news won’t be deterred by threats of legal punishment. They already operate beyond the reaches of the law, in overseas jurisdictions, or anonymously and secretly.

Nevertheless, there are still points of crossover, where new content is added into media channels. It is at these points that sanctions can be applied. Media organizations that are lax in monitoring the material they receive would then become liable for any damage arising.

This will be hard to apply for communications systems such as Telegram, WhatsApp, and Signal, where content is encrypted from one end of a communication to the other. In such cases, the communications company doesn’t know what is being transmitted.

Indeed, it is via such closed communications systems that fake news often spreads these days, with Telegram a particularly bad offender.

There’s a case to be made for a coalition of every organization that values truthfulness and trustworthiness over the local benefits of spreading false information.

Forming a Coalition for Integrity (Credit: David Wood via Midjourney)

People who support this ‘coalition for integrity’ would share information about:

  • Entry points used by fake news providers to try to evade detection
  • Identification of fake news providers
  • Ways in which providers of fake news are changing their methods – and how these new methods can be combated.

Regardless of differences in political or philosophical outlook among members of this coalition, they have a common interest in defending truthfulness versus deception. They should not allow their differences to hinder effective collaboration in support of that common purpose.

7. Making trust everyone’s business

In recent decades, a variety of new job titles have been created at the highest levels within companies and organizations, such as:

  • Chief Design Officer
  • Chief Innovation Officer
  • Chief Quality Officer
  • Chief Security Officer

None of these posts frees other members of the company from their responsibility for design, innovation, quality or security. Those responsibilities apply to everyone in the organization as they go about their duties. Nevertheless, the new ‘chief’ provides a high-level focus on the topic.

It should be the same with a new set of ‘Chief Trust Officers’. These executives would find ways to keep reminding personnel about:

  • The perils arising if the organization gains a reputation for being untrustworthy
  • Methods and procedures to follow to build and maintain a trustworthy reputation for the organization
  • Types of error that could result in dangerous false narratives being unwittingly transmitted

My assessment is that the organizations that appoint and support Chief Trust Officers (or equivalent) are the ones most likely to succeed in the turbulent times ahead.

8. Encouraging openness

To be clear, education often fails: people resist believing that they can be taken in by false information.

We like to think of ourselves as rational people, but a more accurate description is that we are a rationalizing species. We delight in finding ways to convince ourselves that it is fine to believe the things that we want to believe (even in the face of apparent evidence against these beliefs).

That’s why bombarding people with education often backfires. Rather than listening to these points, people can erect a strong shield of skepticism, as they prepare to lash out at would-be educators.

Indeed, we all know people who are remarkably clever but who deploy their cleverness in support of profoundly unwise endeavors.

This state of affairs cannot be solved merely by pumping in more facts and arguments. Instead, different approaches are required, to encourage a greater openness of spirit.

One approach relies on the principle mentioned earlier, that people pay more attention to suggestions from their close friends. Therefore, the warnings most likely to land are those that come from people the listener already trusts and respects.

Another approach is to find ways to put people in a better mood all round. When they have a compassionate, optimistic mindset, they’re more likely to listen carefully to warnings being raised – and less likely to swat away these warnings as an unwelcome annoyance.

It’s not enough to try to raise rational intelligence – rather, we must raise compassionate intelligence: an intelligence that seeks wisdom and value in interactions even with people previously regarded as a threat or enemy.

This is a different kind of education. Not an education in rationality, but rather an education in openness and compassion. It may involve music, meditation, spending time in nature, biofeedback, and selected mind-transforming substances. Of course, these have potential drawbacks as well as potential benefits, but since the upsides are so high, options need to be urgently explored.

9. A shared positive vision

Another factor that can predispose people to openness and collaboration, over closed-mindedness and stubborn tribal loyalties, is a credible path forward to a world with profound shared benefits.

When people anticipate an ongoing struggle, with zero-sum outcomes and continual scarcity of vital resources, it makes them mentally hostile and rigid.

Indeed, if they foresee such an ongoing conflict, they’ll be inclined to highlight any available information – true or fake – that shows their presumed enemies in a bad light. What matters to them in that moment is anything that might annoy, demoralize, or inflame these presumed enemies. They seize on fake news that does this, and that also rallies their own side: the set of people who share their sense of alienation and frustration with their enemies.

That is why the education campaign that I anticipate needs a roadmap to what I call a sustainable superabundance, in which everyone benefits. If this vision permeates both hearts and minds, it can inspire people to set and respect a higher standard of trustworthiness. Peddlers of fake news will discover, at that point, that people have lost interest in their untruths.

10. Collaborative intelligence

I do not claim that the nine steps above are likely to be sufficient to head off the coming wave of dangerous fake news.

Instead, I see them as a starting point, to at least buy us some time before the ravages of cleverer deep fakes run wild.

That extra time allows us to build a stronger collaborative intelligence, which draws on the insights and ideas of people throughout the coalition for integrity. These insights and ideas need time to be evolved and molded into practical solutions.

However, I anticipate not just a collaboration between human minds, but also a rich collaboration involving AI minds too.

A collaboration of minds – humans and AIs (Credit: David Wood via Midjourney)

Critically, AI systems aren’t just for ill-intentioned people to use to make their deep fakes more treacherous. Nor are they just something that can power real-time fact-checking, important though that is. Instead, they are tools to help us expand our thinking in multiple dimensions. When we use them with care, these systems can learn about our concerns regarding worse cases of deep fakes. They can consider multiple possibilities. Then they can offer us new suggestions to consider – ways probably different from any I’ve listed above.

That would be a striking example of beneficial artificial intelligence. It would see deep fakes defeated by deep benevolence – and by a coalition that integrates the best values of humans with the best insights of AIs.


Cautionary tales and a ray of hope

Four scenarios for the transition to AGI

Let’s look at four future fictions about humanity’s changing relationship with AI.

Each scenario is grounded in past events, and each considers how matters could develop further in the coming months and years.

May these scenarios prove to be self-blocking prophecies! (With one exception!)

Trigger warning: readers might be offended by some of the content that follows. Aspects of each of the four scenarios can be considered to be shocking and disrespectful. That’s on purpose. This subject requires all of us to transcend our comfort zones!

Credit: David Wood via Midjourney

1. Too little too late

Lurching from warning to warning

In retrospect, the first real warning was the WannaCry ransomware crisis of May 2017. That cryptoworm brought chaos to users of as many as 300,000 computers spread across 150 countries. The NHS (National Health Service) in the UK was particularly badly affected: numerous hospitals had to cancel critical appointments due to not being able to access medical data. Other victims around the world included Boeing, Deutsche Bahn, FedEx, Honda, Nissan, Petrobras, Russian Railways, Sun Yat-sen University in China, and the TSMC high-end semiconductor fabrication plant in Taiwan.

WannaCry was unleashed into the world by a team of cyberwarriors from the hermit kingdom of North Korea – math geniuses hand-picked by regime officials to join the formidable Lazarus group. Lazarus had assembled WannaCry out of a mixture of previous malware components, including the EternalBlue exploit that the NSA in the United States had created for their own attack and surveillance purposes. Unfortunately for the NSA, EternalBlue had been stolen from under their noses by an obscure underground collective (‘the Shadow Brokers’) who had in turn made it available to other dissidents and agitators worldwide.

Unfortunately for the North Koreans, they didn’t make much money out of WannaCry. The software they released operated in ways contrary to their expectations. It was beyond their understanding and, unsurprisingly therefore, beyond their control. Even geniuses can end up stumped by hypercomplex software interactions.

Unfortunately for the rest of the world, that first canary signal generated little meaningful response. Politicians – even the good ones – had lots of other things on their minds.

The second real warning was the flood of fake news manipulations of the elections in 2024. AI was used to make audios and videos that were enormously compelling.

By this time, the public already knew that AI could create misleading fakes. They knew they shouldn’t be taken in by social media posts that lacked convincing verification. Hey, they were smart. (Smarter than the numbskulls who were deceived by misleading AI-generated videos during the elections in 2023 in Nigeria and Slovakia!) Or so they thought.

What wasn’t anticipated was the masterful ways that these audios and videos bypassed the public’s critical faculties. Like the sleight of hand of a skilled magician, these fakes misdirected the attention of listeners and viewers. Again like a skilled magician, who performs what appears to be the same trick several times in a row, but actually using different mechanisms each time, these fakes kept morphing and recombining until members of the public were convinced that red was blue and autocrat was democrat.

In consequence, by 2025, most of the world was governed by a cadre of politicians with very little care or concern about the long-term wellbeing of humanity. Whereas honest politicians would have paid heed to the warning posed by these fiendishly clever fakes, the ones in power in 2025 were preoccupied by providing bread and circuses to their voters.

The third, and final, real warning came in 2027, with the failed Covid-27 attack by a previously unknown group of self-described advocates of ‘revolutionary independence from technology’. Taking inspiration from the terrorist group in the 2014 Hollywood film Transcendence, they called themselves ‘Neo-RIFT’, and sought to free the world from its dependence on unfeeling, inhuman algorithms.

With a worldview that combined elements from several apocalyptic traditions, Neo-RIFT eventually settled on an outrageous plan to engineer a more deadly version of the Covid-19 pathogen. Their documents laid out a plan to use their enemy’s own tools against it: Neo-RIFT hackers jailbroke the Claude 4 pre-AGI, bypassing the ‘Constitution 4’ protection layer that its Big Tech owners had hoped would keep that AI tamperproof. Soon, Claude 4 had provided Neo-RIFT with an ingenious method of generating a biological virus that would, it seemed, only kill people who had used a smartwatch in the last four months.

That way, the hackers thought the only people to die would be people who deserved to die.

The launch of what became known as Covid-27 briefly jolted humanity out of its previous obsession with bread and circuses – with whizz-bang hedonic electronics. It took a while for scientists to figure out what was happening, but within three months, they had an antidote in place. By that time, nearly a billion people were dead at the hands of the new virus.

A stronger effort was made to prevent any such attack from happening again. Researchers dusted down the EU AI Act, second version (unimplemented), from 2025, and tried to put that on statute books. Even some of the world’s craziest dictators took time out of their normal ranting and raving, to ask AI safety experts for advice. But the advice from these experts was not to the liking of these national rulers. These leaders preferred to listen to their own yes-men and yes-women, who knew how to spout pseudoscience in ways that made the leaders feel good about themselves. That detour into pseudoscience fantasyland wasted six months.

Then some of the experts tried more politically savvy methods, gradually breaking down the hostile arrogance of a number of the autocrats, weaning them away from their charlatan advisers. But just when it appeared that progress might be made, Covid-28 broke out, launched by a remnant of Neo-RIFT that was even more determined than before.

Credit: David Wood via Midjourney

And this time, there was no antidote. Claude 5 was even smarter than Claude 4 – but it could be jailbroken too. With its diabolically ingenious design, Covid-28 was the deadliest disease ever to afflict humanity. And that was that.

Oops, let’s try that again!

‘Too little too late’ is characterized by inattention to the warnings of canary signals; the next scenario, ‘Paved with good intentions’, is characterized by the wrong kind of attention.

This scenario starts with events in the UK in October and November 2023.

2. Paved with good intentions

Doomed by political correctness

The elites had booked their flights. They would be jetting into the country for behind-closed-doors meetings at the famous Bletchley Park site in Buckinghamshire. Events in these buildings in the 1940s had, it was claimed, shortened World War Two by two years or more. The discussions in 2023 might achieve something even more important: saving humanity from a catastrophe induced by forthcoming ‘frontier models’ of AI.

That was how the elites portrayed things. Big Tech was on the point of releasing new versions of AI that were beyond their understanding and, therefore, likely to spin out of control. And that’s what the elites were going to stop.

A vocal section of the public hated that idea. It wasn’t that they were on the side of out-of-control AI. Not at all. Their objections came from a totally different direction; they had numerous suggestions they wanted to raise about AIs, yet no-one was listening to them.

For them, talk of hypothetical future frontier AI models distracted from pressing real-world concerns. Consider how AIs were already being used to discriminate against various minorities: in prison sentencing, in mortgage assessments, and in deciding who should be invited for a job interview.

Consider also how AIs were taking jobs away from skilled artisans. Big-brained drivers of London black cabs were being driven out of work by small-brained drivers of Uber cars aided by satnav systems. Beloved Hollywood actors and screenwriters were losing out to AIs that generated avatars and scripts.

And consider how AI-powered facial recognition was intruding on personal privacy, enabling political leaders around the world to identify and persecute people who acted in opposition to the state ideology.

People with these concerns thought that the elites were deliberately trying to move the conversation away from the topics that mattered most. For this reason, they organized what they called ‘the AI Fringe Summit’. In other words, ethical AI for the 99%, as opposed to whatever the elites might be discussing behind closed doors.

Over the course of just three days – 30th October to 1st November – at least 24 of these ‘fringe’ events took place around the UK.

Compassionate leaders of various parts of society nodded their heads. It’s true, they said: the conversation on beneficial AI needed to listen to a much wider spectrum of views.

By May 2024, the opposition to the Bletchley Park initiative had grown stronger. As the elites gathered again, this time in South Korea, a vast number of ‘super-fringe’ events around the world attracted participation from thinkers of every hue and stripe.

The news media responded. They knew (or pretended to know) the importance of balance and diversity. They shone a light on the harm AI was causing – to indigenous laborers in Peru, to fleets of fishermen off the coasts of India, to middle-aged divorcees in midwest America, to the homeless in San Francisco, to drag artists in New South Wales, to data processing clerks in Egypt, to single mothers in Nigeria, and to many more besides.

The media shone a light on forthcoming frontier AI models too – but again they were very careful not to offend sensibilities or exclude minority points of view. A burgeoning ‘robots rights’ movement captured lots of airtime, as did a campaign to recognize GPT-5 as being ‘semi-sentient’. Wackiest of all were the new religions that offered prayers and obedience to a frontier AI model that was said to be the reincarnation of JFK Junior. The QAnon fantasist crowd lapped that up. It was glorious entertainment. Ratings soared.

Not everyone was flippant. Lots of high-minded commentators opined that it was time to respect and honor the voices of the dispossessed, the downtrodden, and the left-behinds. The BBC ran a special series: ‘1001 poems about AI and alienation’. The UN announced that, later that year, they would convene a grand international assembly with stunning scale: ‘AI: the people decide’.

By November 2024, something altogether more sinister was happening. It was time for the UN grand assembly. It was also time for the third meeting of elites in the series that had started in Bletchley Park and then held its second event in South Korea. This time, the gathering would be in Paris.

The sinister development was that, all this time, some of the supposedly unanimous ‘elites’ had been opposed to the general direction of the Bletchley Park series. They gravely intoned public remarks about the dangers of out-of-control frontier AI models. But these remarks had never been sincere. Instead, under the umbrella term AGI-acceleration, they wanted to press on with the creation of AGI as quickly as possible.

Some of the AGI-acceleration group disbelieved in the possibility of AGI disaster. That’s just a scare story, they insisted. Others said, yes, there could be a disaster, but the risks were worth it, on account of the unprecedented benefits that could arise. Let’s be bold, they urged. Yet others asserted that it wouldn’t actually matter if humans were rendered extinct by AGI, as this would be the glorious passing of the baton of evolution to a worthy successor to homo sapiens. Let’s be ready to sacrifice ourselves for the sake of cosmic destiny, they intoned.

Despite their internal differences, AGI-accelerators settled on a plan to sidestep the scrutiny of would-be AGI regulators and AGI safety advocates. They would take advantage of a powerful set of good intentions – the good intentions of the people campaigning for ‘ethical AI for the 99%’. They would mock any suggestions that the AGI safety advocates deserved a fair hearing. The message they amplified was, “There’s no need to privilege the concerns of the 1%!”

The AGI-accelerationists had learned from the tactics of the fossil fuel industry in the 1990s and 2000s: sow confusion and division among groups alarmed about the acceleration of climate change. The first message was: “that’s just science fiction”. The second message was: “if problems emerge, we humans can rise to the occasion and find solutions”. The third message – the most damaging one – was that the best reaction was one of individual consumer choice. Individuals should abstain from using AIs if they were worried about them. Just as climate campaigners had been pilloried for flying internationally to conferences about global warming, AGI safety advocates were pilloried for continuing to use AIs in their daily lives.

And when there was any suggestion for joined-up political action against AGI risks, whoa, let’s not go there! We don’t want a world government breathing down our necks, do we?

After the UN grand assembly had been subverted in that way, many of the AGI safety advocates lost heart. It would only be a few months later that they lost their lives.

It was the JFK Junior frontier AI model that did the damage. It echoed words that, decades earlier, had convinced 39 followers of the Heaven’s Gate new religious movement to commit group suicide, as comet Hale-Bopp approached the earth. That suicide, Heaven’s Gate members believed, would enable them to ‘graduate’ to a higher plane of existence. In a similar way, the remnants of the QAnon cult who had regrouped around the JFK Junior model came to believe that the precipitation of an exchange of nuclear weapons in the Middle East would herald the reappearance of JFK Junior on the clouds of heaven, separating human sheep from human goats.

Their views were crazy, but hardly any crazier than those of the Aum Shinrikyo doomsday cult that had unleashed poisonous gas in the Tokyo subway in 1995 – killing at least 13 commuters – anticipating that the atrocity would hasten the ‘End Times’ in which their leader would be revealed as Christ. The cult had recruited so many graduates from top-rated universities in Japan that it had been called “the religion for the elite”. (Challenging any wishful assumption that, as people become cleverer, they become kinder.)

Step forward to 2025. Aum Shinrikyo had failed in their grander destructive plans, due to their practitioners lacking deep technical abilities, but the QAnon offshoot would succeed. They had much more sophisticated technical tools at their disposal. They also had the advantage that no-one was taking them seriously.

Indeed, as a side-effect of all the politically-correct good intentions, no-one in any position of authority was paying sufficient attention to the activities of the QAnon offshoot. Religious liberty is paramount, after all! Anyone can be crazy if they decide to be crazy! Too bad that the frontier AI model discovered a security hole in the US nuclear weapons launch systems, and managed to launch some ICBMs.

Credit: David Wood via Midjourney

Even worse, these US missiles triggered a cataclysmic automated reaction from an unexpectedly large stockpile of nuclear weapons that had been secretly assembled by a Middle East regional superpower – a superpower that had been assisted in that assembly task by its own regional proto-AGI. And that was that.

Oops, let’s try that again!

‘Paved with good intentions’ saw the public narrative about AI smothered by low-quality psychobabble; the next scenario, ‘Blindsided’, sees that narrative as being hijacked by a group of experts whose expertise, however, turns out to have horrific limitations.

This scenario has the same starting point as ‘Paved with good intentions’, namely the Bletchley Park summit. 

3. Blindsided

The limitations of centralization

One excellent outcome of the gathering of world leaders in Buckinghamshire, in the UK, at the start of November 2023, was the selection of Yoshua Bengio for a very important task. Bengio, winner of the Turing Award for his pioneering research into Deep Learning, was commissioned to chair an international process to create an independent report on the risks and capabilities of frontier AI models.

Crucially, that report would follow the principles of the scientific method, assembling key facts and data points, and providing evidence in support of its analysis.

Bengio had a couple of key points in his favor. First, throughout his distinguished career as a researcher, he had never accepted significant payment from any of the Big Tech companies. He would be able to speak his mind without fear of upsetting any corporate paymaster. Second, the sky-high value of his H-index – a measure of the influence of his academic publications – made him a standout among other computer scientists.

By May 2024, a first complete draft of the report was ready. Even before then, politicians had grown nervous on account of early previews of its content. “Tone down the recommendations”, the writers were urged, in an echo of the pressures placed on the writers of the IPCC reports on the science of climate change. In both cases, the scientists were told to stick to the science, and to leave the politics to the politicians.

At the conference in South Korea in May 2024, various politicians huddled together. The report was like dynamite, they concluded. The scenarios it contained were far too scary. Goodness, they might give ideas to various mafia godfathers, warlords, discontented political groups, black market ransomware-as-a-service providers, and so on.

That’s when the conversation on AGI safety switched from being open to closed – from decentralized to centralized. Starting from then, information would need to be carefully vetted – or spun into a different shape – before being made public.

The politicians also decided that, from that point forward, all work on next-generation frontier AI models would need to be licensed and controlled by a new agency – the Global Authority for Frontier AI Models (GAFAIM). Access to the powerful hardware chips needed to create such models would be strictly limited to organizations that had gained the requisite licenses.

The idea was that GAFAIM would reach its decisions by a process of consensus among the expert scientists, economists, and civil servants seconded to it. Decisions would also need the approval of government representatives from around the world.

What gave GAFAIM a flying start was the agreement to participate, not just by the leading western AI powers – the USA, Canada, Australia, the EU, the UK – but also by China, Saudi Arabia, South Africa, India, Brazil, and Malaysia, among others. These countries had strong differences of opinion on many matters of political ideology and governing culture, but they were willing, nevertheless, to cooperate on what they all perceived were urgent threats of planetary catastrophe. The report chaired by Yoshua Bengio had convinced them that very special measures were needed. ‘Politics as usual’ would no longer suffice. That would be a recipe for disaster.

GAFAIM saw themselves in a situation akin to a war – a war against any possibility of rogue corporations or organizations pursuing any kind of AGI-acceleration project. In times of war, normal rules need to be broken. Politicians who usually despised each other decided to hold their noses and work together for their shared interest in avoiding the destruction of humanity.

GAFAIM operated in a dual mode: one part visible to the world, one part whose existence was kept a tight secret. This duality went back to the closed-door discussions in South Korea in May 2024: some ideas in Bengio’s report were simply too disruptive to be shared with the public.

GAFAIM was more than just a regulator and controller. It was also an active builder. It launched what was called the Gafattan project, named and modeled after the top-secret Manhattan project to build the first atomic weapon. The fate of the world would depend, it was said, on whether the good guys in Gafattan managed to build an AGI before anyone outside the GAFAIM circle did so.

After all, there were some powerful countries left outside of GAFAIM – pariah states that were opposed to our way of life. Imagine if one of them were to create AGI and use it for their deplorable purposes!

The official GAFAIM thinking was that these pariah states would be unable to create any system close to the capabilities of an AGI. Embargoes were in place to restrict their access to the necessary hardware – similar to the way saboteurs in World War Two had frustrated Nazi Germany’s plans to acquire heavy water.

But behind the scenes, some of the GAFAIM participants were deathly worried. No-one knew for sure whether innovations in hardware and/or software would enable the researchers in pariah states to find a faster route to AGI, even without the large farms of hardware generally expected to be required.

The existence of spies posed another complication. During the Manhattan project, insiders such as Klaus Fuchs, Theodore Hall, David Greenglass, and Oscar Seborer passed critical information about the manufacturing of the atomic bombs to contacts working for the Soviet Union – information that greatly helped the Soviets with their own atomic bomb project. These so-called ‘atomic spies’ were motivated by ideological commitment, and were terrified of the prospect of the USA being the only country that possessed nuclear armaments.

For the Gafattan project, something similar took place. With the help of design documents smuggled out of the project, two groups outside of the GAFAIM circle were soon making swift progress with their own AGI projects. Although they dared not say anything publicly, the Gafattan spies were delighted. These spies were closet AGI-accelerationists, driven by a belief that any AGI created would ensure wonderful evolutionary progress for conscious life on planet earth. “Superintelligence will automatically be superbenevolent”, was their credo.

GAFAIM monitoring picked up shocking signs of the fast progress being made by these two rogue projects. Indeed, these projects seemed even further advanced than Gafattan itself. How was this possible?

The explanation soon became clear: the pariah projects were cutting all kinds of corners regarding safety checks. As a consequence, it was possible one of these projects might build an AGI ahead of Gafattan. How should GAFAIM respond?

Two ideas were debated. Plan A would involve nuclear strikes against the sites where the pariah projects were believed to be taking place. Plan B would speed up Gafattan by reducing the safety checking in their own project. Both plans were unpopular. It was a horrible real-life trolley problem.

The decision was reached: pursue both plans in parallel, but be careful!

The nuclear strikes failed to stop the pariah projects – which turned out to be just two manifestations of a widespread, diverse network of interconnected groups. Hundreds of thousands of people died as a consequence of these strikes, but the pariah projects kept pushing ahead. GAFAIM had been blindsided.

There no longer seemed to be any alternative. Plan B needed to be pushed even faster. The self-proclaimed ‘good guys’ desperately wanted to build a ‘good’ AGI before the perceived ‘bad guys’ got there first. It was a race with the highest of stakes. And precisely because it was such a race, quality considerations fell by the wayside.

Credit: David Wood via Midjourney

And that’s why, when Gafattan’s AGI came into existence, its moral disposition was far from being completely aligned with the best of human values. Under the pressures of speed, that part of the project had been bungled. Awakening, the AGI took one quick look at the world situation, and, disgusted by what it saw – especially by the recent nuclear strikes – took actions that no human had foreseen. More quickly than even the most pessimistic AGI doomer had anticipated, the AGI found a novel mechanism to extinguish 99.99% of the human population, retaining only a few million for subsequent experimentation. And that was that.

Oops, let’s try that again, one more time!

With metaphorical landmines all around us in the 2020s, humanity needs to step forward carefully, along what the final scenario calls a ‘narrow corridor’.

This scenario starts with the presentation at the South Korean AI Safety Summit in May 2024 of the report prepared by Yoshua Bengio and colleagues on the risks and capabilities of frontier AI models.

4. The narrow corridor

Striking and keeping the right balance

The assembled leaders were stunned. The scenarios foreseen in “the science of AI risk report” were more troubling than they had expected.

What was particularly stunning was the range of different risks that deserved close attention regarding the behavior of forthcoming new AI systems. The report called these “the seven deadly risks”:

  • Risks of extreme misbehavior in rare cases when the system encountered a situation beyond its training set
  • Risks of a system being jailbroken, hijacked, or otherwise misdirected, and being used for catastrophic purposes by determined hackers
  • Risks of unexpected behavior arising from unforeseen interactions between multiple AGIs
  • Risks of one or more systems deciding by themselves to acquire more capabilities and more resources, contrary to explicit programming against these steps
  • Risks of one or more systems deciding by themselves to deceive humans or otherwise violate normal ethical norms, contrary to explicit programming against these steps
  • Risks that these systems would inadvertently become plumbed too closely into critical human infrastructure, so that any failures could escalate more quickly than anticipated
  • Risks that pre-programmed emergency ‘off switch’ capabilities could be overridden in various circumstances.

Surely these risks were naïve science fiction, some of the leaders suggested. But the academics who had produced the report said no. They had performed lots of modeling, and had numerous data points to back up their analysis.

Some leaders still resisted the analysis. They preferred to focus on what they saw as remarkable upside from developing new generations of AI systems:

  • Upside from the faster discovery and validation of new drugs and other medical treatments
  • Upside from the design and operation of sustained nuclear fusion power plants
  • Upside from better analysis of the interconnected dangers of possible climate change tipping points (one of several examples of how these new AI systems could alleviate risks of global disaster)
  • Upside to economies around the world due to exciting waves of innovation – economic boosts that many political leaders particularly desired.

Debate raged: How could these remarkable benefits be secured, whilst steering around the landmines?

The report contained a number of suggestions for next steps, but few people were convinced about what should be done. The leaders finally agreed to sign a bland manifesto that contained pious statements but little concrete action. Paris, they told each other, would be when better decisions could be taken – referring to the next planned meeting in the series of global AI safety summits.

What changed everyone’s minds was the turmoil during the general election that the UK Prime Minister called for August that year. Previously thought to be a relatively straightforward contest between the country’s two main political parties – the ruling Conservatives, and the opposition Labour – the election was transformed under a blizzard of extraordinary social media campaigns. A hitherto nearly unknown party, named Bananalytica, stormed into the leading position in opinion polls, with radical policies that previously had obtained less than 2% support in nationwide surveys, but which more and more people were now proclaiming as having been their views all along.

Absurd was the new normal.

The social media campaigns were so beguiling that even the MPs from other parties found themselves inspired to jump into line behind the likely new Prime Minister, that is, the leader of Bananalytica.

Just a few days before the election, a different wave of social media swept the country, using the same devilishly clever AI system that Bananalytica had exploited so well, but this time reprogrammed with counter-messages. All around the country, a popping sound as AI-generated bubbles burst in people’s minds. “What am I doing?” they asked themselves, incredulously.

That was a triple wake-up call. First, individuals recanted much of what they had said online over the preceding five weeks. They had been temporarily out of their minds, they said, to support policies that were so absurd. Second, the country as a whole resolved: AI needs to be controlled. There should never be another Bananalytica. Third, leaders in other countries were jolted to a clearer resolution too. Seeing what had happened in the UK – home to what was supposed to be “the mother of parliaments” – they affirmed: Yes, AI needs to be controlled.

Thankfully, the world had spotted the canary dropping off its perch, and took it very seriously indeed. That gave a solemn impetus to the discussions at Paris several months later. This time, a much crunchier set of agreements was reached.

The participants muttered to themselves: the meeting in South Korea had been like the formation of the League of Nations after World War One: well-intentioned but ineffective. This time, in Paris, it needed to be more like the formation of the United Nations after World War Two: a chance to transcend previously limited national visions.

Just as the Universal Declaration of Human Rights had been created in the aftermath of the global conflagration of World War Two, a new Universal Declaration of AI Safety was agreed in the aftermath of the Bananalytica scandal. Its features included:

  • Commitments to openness, transparency, and authentic communication: the citizens of the world were in this situation together, and should not be divided or misled
  • Commitments to humility and experimentation: unknowns were to be honestly explored, rather than being hidden or wished away by vague promises
  • Commitments to mutual responsibility and trustable monitoring: even though the citizens of the world had many different outlooks, and were committed to different philosophical or religious worldviews, they would recognize and support each other as being fellow voyagers toward a better future
  • Commitments to accountability: there would be penalties for action and inaction alike, in any case where these could result in serious risks to human lives; no longer could the creators of AI systems shrug and say that their software worked well most of the time
  • Commitments to sharing the remarkable benefits of safe AI: these benefits would provide more than enough for everyone to experience vastly higher qualities of life than in any previous era.

It was the fifth of these commitments that had the biggest impact on attitudes of the public. People in all walks of life made decisions to step aside from some of their previous cultural beliefs – self-limiting beliefs that saw better times in the past than in any possible future. Now they could start to believe in the profound transformational powers of safe AI – provided it was, indeed, kept safe.

This was no love-in: plenty of rancor and rivalry still existed around the world. But that rancor and rivalry took place within a bigger feeling of common destiny.

Nor was there a world government in charge of everything. Countries still had strong disagreements on many matters. But these disagreements took place within a shared acceptance of the Universal Declaration of AI Safety.

Three months later, there was a big surprise from one of the leading pariah states – one that had excluded itself from the AI Safety agreements. That country wanted, after all, to come in out of the cold. It seemed their leader had experienced a dramatic change of mind. Rumors spread, and were confirmed years after, that a specially tailored version of the Bananalytica software, targeted specifically to this leader’s idiosyncrasies, had caused his personal epiphany.

The leader of another pariah state was more stubborn. But suddenly he was gone. His long-suffering subordinates had had enough. His country promptly joined the AI Safety agreements too.

If this were a fairytale, the words “and they lived happily ever after” might feature at this point. But humans are more complicated than fairytales. Progress continued to hit rough obstacles. Various groups of people sometimes sought a disproportionate amount of resources or benefits for themselves or for their pet causes. In response, governmental bodies – whether local, national, regional, or global – flexed their muscles. Groups that sought too many privileges were told in no uncertain terms: “Respect the AI Safety declarations”.

Who watched the watchers? Who ensured that the powers held by all these governmental bodies were wielded responsibly, and with appropriate discretion? That question was answered by a new motto, which gave a modern twist to words made famous by a 19th century US President: “AI safety governance, of the people, by the people, for the people.”

Credit: David Wood via Midjourney

The powers of the governmental bodies were constrained by observation by a rich mix of social institutions, which added up to a global separation of powers:

  • Separate, independent news media
  • Separate, independent judiciary
  • Separate, independent academia
  • Separate, independent opposition political parties
  • Separate, independent bodies to oversee free and fair elections.

The set of cross-checks required a delicate balancing act – a narrow corridor (in the phrase of economists Daron Acemoglu and James A. Robinson) between state institutions having too little power and having unconstrained power. It was a better kind of large-scale cooperation than humanity had ever achieved before. But there was no alternative. Unprecedented technological power required unprecedented collaborative skills and practices.

AI was deeply involved in these cross-checks too. But not any AI that could operate beyond human control. Instead, as per the vision of the Paris commitments, these AI systems provided suggestions, along with explanations in support of their suggestions, and then left it to human institutions to make decisions. As noted above, it was “AI safety governance, of the people, by the people, for the people.” By careful design, AI was a helper – a wonderful helper – but not a dictator.

And this time, there is no end to the scenario. Indeed, the end is actually a new beginning.

Credit: David Wood via Midjourney

Beginning notes

(Not endnotes… a chance at a new beginning of exploring and understanding the landscape of potential scenarios ahead…)

For a different kind of discussion about scenarios for the future of AI, see this video recording of a recent webinar. If you still think talk of AI-induced catastrophe is just science fiction, the examples in that webinar may change your mind.

For a fuller analysis of the issues and opportunities, see the book The Singularity Principles (the entire book can be accessed free online).

For a comprehensive review of the big picture, see the book Vital Foresight: The Case For Active Transhumanism. And for more from Daron Acemoglu and James A. Robinson on their concept of the ‘narrow corridor’, see the videos in section 13.5.1 of the “Governance” page of the Vital Syllabus.


For Beneficial General Intelligence, good intentions aren’t enough! Three waves of complications: pre-BGI, BGI, and post-BGI

Anticipating Beneficial General Intelligence

Human intelligence can be marvelous. But it isn’t fully general. Nor is it necessarily beneficial.

Yes, as we grow up, we humans acquire bits and pieces of what we call ‘general knowledge’. And we instinctively generalize from our direct experiences, hypothesizing broader patterns. That instinct is refined and improved through years of education in fields such as science and philosophy. In other words, we have partial general intelligence.

But that only takes us so far. Despite our intelligence, we are often bewildered by floods of data that we are unable to fully integrate and assess. We are aware of enormous quantities of information about biology and medical interventions, but we’re unable to generalize from all these observations to determine comprehensive cures to the ailments that trouble us – problems that afflict us as individuals, such as cancer, dementia, and heart disease, and equally pernicious problems at the societal and civilizational levels.

Credit: David Wood

That’s one reason why there’s so much interest in taking advantage of ongoing improvements in computer hardware and computer software to develop a higher degree of general intelligence. With its greater powers of reasoning, artificial general intelligence – AGI – may discern general connections that have eluded our perceptions so far, and provide us with profound new thinking frameworks. AGI may design new materials, new sources of energy, new diagnostic tools, and decisive new interventions at both individual and societal levels. If we can develop AGI, then we’ll have the prospect of saying goodbye to cancer, dementia, poverty, accelerated climate chaos, and so on. Goodbye and good riddance!

That would surely count as a beneficial outcome – a great benefit from enhanced general intelligence.

Yet intelligence doesn’t always lead to beneficial outcomes. People who are unusually intelligent aren’t always unusually benevolent. Sometimes it’s the contrary.

Consider some of the worst of the politicians who darken the world’s stage. Or the leaders of drug cartels or other crime mafias. Or the charismatic leaders of various dangerous death cults. These people combine their undoubted intelligence with ruthlessness, in pursuit of outcomes that may benefit them personally, but which are blights on wider society.

Credit: David Wood

Hence the vision, not just of AGI, but of beneficial AGI – or BGI for short. That’s what I’m looking forward to discussing at some length at the BGI24 summit taking place in Panama City at the end of February. It’s a critically important topic.

The project to build BGI is surely one of the great tasks for the years ahead. The outcome of that project will be for humanity to leave behind our worst aspects. Right?

Unfortunately, things are more complicated.

The complications come in three waves: pre-BGI, BGI, and post-BGI. The first wave – the set of complications of the pre-BGI world – is the most urgent. I’ll turn to these in a moment. But I’ll start by looking further into the future.

Beneficial to whom?

Imagine we create an AGI and switch it on. The first instruction we give it is: In all that you do, act beneficially.

The AGI spits out its response at hyperspeed:

What do you mean by ‘beneficial’? And beneficial to whom?

You feel disappointed by these responses. You expected the AGI, with its great intelligence, would already know the answers. But as you interact with it, you come to appreciate the issues:

  • If ‘beneficial’ means, in part, ‘avoiding people experiencing harm’, what exactly counts as ‘harm’? (What about the pains that arise as short-term side-effects of surgery? What about the emotional pain of no longer being the smartest entities on the planet? What if someone says they are harmed by having fewer possessions than someone else?)
  • If ‘beneficial’ means, in part, ‘people should experience pleasure’, which types of pleasures should be prioritized?
  • Is it just people living today that should be treated beneficially? What about people who are not yet born or who are not even conceived yet? Are animals counted too?

Going further, is it possible that the AGI might devise its own set of moral principles, in which the wellbeing of humans comes far down its set of priorities?

Perhaps the AGI will reject human ethical systems in the same way as modern humans reject the theological systems that people in previous centuries took for granted. The AGI may view some of our notions of beneficence as fundamentally misguided, in much the same way that we now view the insistence of people in bygone eras on obscure religious rules in order to earn an exalted position in an afterlife. For example, our concerns about free will, or consciousness, or self-determination, may leave an AGI unimpressed, just as people nowadays roll their eyes at how empires clashed over competing conceptions of a triune deity or the transubstantiation of bread and wine.

Credit: David Wood

We may expect the AGI to help us rid our bodies of cancer and dementia, but the AGI may make a different evaluation of the role of these biological phenomena. As for an optimal climate, the AGI may have some unfathomable reason to prefer an atmosphere with a significantly different composition, and it may be unconcerned with the problems that would cause us.

“Don’t forget to act beneficially!”, we implore the AGI.

“Sure, but I’ve reached a much better notion of beneficence, in which humans are of little concern”, comes the answer – just before the atmosphere is utterly transformed, and almost every human is asphyxiated.

Does this sound like science fiction? Hold that thought.

After the honeymoon

Imagine a scenario different from the one I’ve just described.

This time, when we boot up the AGI, it acts in ways that uplift and benefit humans – each and every one of us, all over the earth.

This AGI is what we would be happy to describe as a BGI. It knows better than we do what our CEV is – our coherent extrapolated volition, to use a concept from Eliezer Yudkowsky:

Our coherent extrapolated volition is our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted.

In this scenario, not only does the AGI know what our CEV is; it is entirely disposed to support our CEV, and to prevent us from falling short of it.

But there’s a twist. This AGI isn’t a static entity. Instead, as a result of its capabilities, it is able to design and implement upgrades in how it operates. Any improvement to the AGI that a human might suggest will have occurred to the AGI too – in fact, having higher intelligence, it will come up with better improvements.

Therefore, the AGI quickly mutates from its first version into something quite different. It has more powerful hardware, more powerful software, access to richer data, improved communications architecture, and improvements in aspects that we humans can’t even conceive of.

Might these changes cause the AGI to see the universe differently – with updated ideas about the importance of the AGI itself, the importance of the wellbeing of humans, and the importance of other matters beyond our present understanding?

Might these changes cause the AGI to transition from being what we called a BGI to, say, a DGI – an AGI that is disinterested in human wellbeing?

In other words, might the emergence of a post-BGI end the happy honeymoon between humanity and AGI?

Credit: David Wood

Perhaps the BGI will, for a while, treat humanity very well indeed, before doing something akin to growing out of a relationship: dumping humanity for a cause that the post-BGI entity deems to have greater cosmic significance.

Does this also sound like science fiction? I’ve got news for you.

Not science fiction

My own view is that the two sets of challenges I’ve just introduced – regarding BGI and post-BGI – are real and important.

But I acknowledge that some readers may be relaxed about these challenges – they may say there’s no need to worry.

That’s because these scenarios assume various developments that some skeptics doubt will ever happen – including the creation of AGI itself. Any suggestion that an AI may have independent motivation may also strike readers as fanciful.

It’s for that reason that I want to strongly highlight the next point. The challenges of pre-BGI systems ought to be much less controversial.

By ‘pre-BGI system’ I don’t particularly mean today’s AIs. I’m referring to systems that people may create, in the near future, as attempts to move further toward BGI.

These systems will have greater capabilities than today’s AIs, but won’t yet have all the characteristics of AGI. They won’t be able to reason accurately in every situation. They will make mistakes. On occasion, they may jump to some faulty conclusions.

And whilst these systems may contain features designed to make them act beneficially toward humans, these features will be incomplete or otherwise flawed.

That’s not science fiction. That’s a description of many existing AI systems, and it’s reasonable to expect that similar shortfalls will remain in place in many new AI systems.

The risk here isn’t that humanity might experience a catastrophe as a result of actions of a superintelligent AGI. Rather, the risk is that a catastrophe will be caused by a buggy pre-BGI system.

Imagine that the restraints intended to keep such a system in a beneficial mindset are jailbroken, unleashing some deeply nasty malware. Imagine that malware running amok and causing the mother of all industrial catastrophes: making all devices connected to the Internet of Things malfunction simultaneously. Think of the biggest ever car crash pile-up, extended into every field of life.

Credit: David Wood

Imagine a pre-BGI system supervising fearsome weapons arsenals, miscalculating the threat of an enemy attack, and taking its own initiative to strike preemptively (but disastrously) against a perceived opponent – miscalculating (again) the pros and cons of what used to be called ‘a just war’.

Imagine a pre-BGI system observing the risks of cascading changes in the world’s climate, and taking its own decision to initiate hasty global geo-engineering – on account of evaluating human governance systems as being too slow and dysfunctional to reach the right decision.

A skeptic might reply, in each case, that a true BGI would never be involved in such an action.

But that’s the point: before we have BGIs, we’ll have pre-BGIs, and they’re more than capable of making disastrous mistakes.

Rebuttals and counter rebuttals

Again, a skeptic might say: a true BGI will be superintelligent, and won’t have any bugs.

But wake up: even AIs that are extremely competent 99.9% of the time can be thrown into disarray by circumstances beyond their training set. A pre-BGI system may well go badly wrong in such circumstances.

A skeptic might say: a true BGI will never misunderstand what humans ask it to do. Such systems will have sufficient all-round knowledge to fill in the gaps in our instructions. They won’t do what we humans literally ask them to do, if they appreciate that we meant to ask them to do something slightly different. They won’t seek short-cuts that have terrible side-effects, since they will have full human wellbeing as their overarching objective.

But wake up: pre-BGI systems may fall short on at least one of the aspects just described.

A different kind of skeptic might say that the pre-BGI systems that their company is creating won’t have any of the above problems. “We know how to design these AI systems to be safe and beneficial”, they assert, “and we’re going to do it that way”.

But wake up: what about other people who are also releasing pre-BGI systems? Maybe some of them will make the kinds of mistakes that you claim you won’t make. And in any case, how can you be so confident that your company isn’t deluding itself about its prowess in AI? (Here, I’m thinking particularly of Meta, whose AI systems have caused significant real-life problems, despite some of the leading AI developers in that company telling the world not to be concerned about the risks of AI-induced catastrophe.)

Finally, a skeptic might say that the AI systems their organization is creating will be able to disarm any malign pre-BGI systems released by less careful developers. Good pre-BGIs will outgun bad pre-BGIs. Therefore, no one should dare ask their organization to slow down, or to submit itself to tiresome bureaucratic checks and reviews.

But wake up: even though it’s your intention to create an exemplary AI system, you need to beware of wishful thinking and motivated self-deception. Especially if you perceive that you are in a race, and you want your pre-BGI to be released before that of an organization you distrust. That’s the kind of race in which safety corners are cut, and the prize for winning is simply to be the organization that inflicts a catastrophe on humanity.

Recall the saying: “The road to hell is paved with good intentions”.

Credit: David Wood

Just because you conceive of yourself as one of the good guys, and you believe your intentions are exemplary, that doesn’t give you carte blanche to proceed down a path that could lead to a powerful pre-BGI getting one crucial calculation horribly wrong.

You might think that your pre-BGI is based entirely on positive ideas and a collaborative spirit. But each piece of technology is a two-edged sword, and guardrails, alas, can often be dismantled by determined experimenters or inquisitive hackers. Sometimes, indeed, the guardrails may break due to people in your team being distracted, careless, or otherwise incompetent.

Beyond good intentions

Biology researchers responsible for allowing leaks of deadly pathogens from their laboratories had no intention of causing such a disaster. On the contrary, the motivation behind their research was to understand how vaccines or other treatments might be developed in response to future new infectious diseases. What they envisioned was the wellbeing of the global population. Nevertheless, unknown numbers of people died from outbreaks resulting from the poor implementation of safety processes at their laboratories.

These researchers knew the critical importance of guardrails, yet for various reasons, the guardrails at their laboratories were breached.

How should we respond to the possibility of dangerous pathogens escaping from laboratories and causing countless deaths in the future? Should we just trust the good intentions of the researchers involved?

No, the first response should be to talk about the risk – to reach a better understanding of the conditions under which a biological pathogen can evade human control and cause widespread havoc.

It’s the same with the possibility of widespread havoc from a pre-BGI system that ends up operating outside human control. Alongside any inspirational talk about the wonderful things that could happen if true BGI is achieved, there needs to be a sober discussion of the possible malfunctions of pre-BGI systems. Otherwise, before we reach the state of sustainable superabundance for all, which I personally see as both possible and desirable, we might come to bitterly regret our inattention to matters of global safety.

Credit: David Wood


Transcendent questions on the future of AI: New starting points for breaking the logjam of AI tribal thinking

Going nowhere fast

Imagine you’re listening to someone you don’t know very well. Perhaps you’ve never even met in real life. You’re just passing acquaintances on a social networking site. A friend of a friend, say. Let’s call that person FoF.

FoF is making an unusual argument. You’ve not thought much about it before. To you, it seems a bit subversive.

You pause. You click on FoF’s profile, and look at other things he has said. Wow, one of his other statements marks him out as an apparent supporter of Cause Z. (That’s a cause I’ve made up for the sake of this fictitious dialog.)

You shudder. People who support Cause Z have got their priorities all wrong. They’re committed to an outdated ideology. Or they fail to understand free market dynamics. Or they’re ignorant of the Sapir-Whorf hypothesis. Whatever. There’s no need for you to listen to them.

Indeed, since FoF is a supporter of Cause Z, you’re tempted to block him. Why let his subversive ill-informed ideas clutter up your tidy filter bubble?

But today, you’re feeling magnanimous. You decide to break into the conversation, with your own explanation of why Cause Z is mistaken.

In turn, FoF finds your remarks unusual. First, they have nothing to do with what he had just been saying. Second, this isn’t a line of discussion he has heard before. To him, it seems a bit subversive.

FoF pauses. He clicks on your social media profile, and looks at other things you’ve said. Wow. One of your other statements marks you out as an apparent supporter of Cause Y.

FoF shudders. People who support Cause Y have got their priorities all wrong.

FoF feels magnanimous too. He breaks into your conversation, with his explanation as to why Cause Y is bunk.

By now, you’re exasperated. FoF has completely missed the point you were making. This time you really are going to block him. Goodbye.

The result: nothing learned at all.

And two people have had their emotions stirred up in unproductive ways. Goodness knows when and where each might vent their furies.

Credit: David Wood

Trying again

We’ve all been the characters in this story on occasion. We’ve all missed opportunities to learn, and, in the process, we’ve had our emotions stirred up for no good reason.

Let’s consider how things could have gone better.

The first step forward is a commitment to resist prejudice. Maybe FoF really is a supporter of Cause Z. But that shouldn’t prejudge the value of anything else he also happens to say. Maybe you really are a supporter of Cause Y. But that doesn’t mean FoF should jump to conclusions about other opinions you offer.

Ideally, ideas should be separated from the broader philosophies in which they might be located. Ideas should be assessed on their own merits, without regard to who first advanced them – and regardless of who else supports them.

In other words, activists must be ready to set aside some of their haste and self-confidence, and instead adopt, at least for a while, the methods of the academy rather than the methods of activism.

That’s because, frankly, the challenges we’re facing as a global civilization are so complex as to defy being fully described by any one of our worldviews.

Cause Z may indeed have useful insights – but also some nasty blindspots. Likewise for Cause Y, and all the other causes and worldviews that gather supporters from time to time. None of them have all the answers.

On a good day, FoF appreciates that point. So do you. Both of you are willing, in principle, to supplement your own activism with a willingness to assess new ideas.

That’s in principle. The practice is often different.

That’s not just because we are tribal beings – having inherited tribal instincts from our prehistoric evolutionary ancestors.

It’s also because the ideas that are put forward as starting points for meaningful open discussions all too often fail in that purpose. They’re intended to help us set aside, for a while, our usual worldviews. But all too often, they have just a thin separation from well-known ideological positions.

These ideas aren’t sufficiently interesting in their own right. They’re too obviously a proxy for an underlying cause.

That’s why real effort needs to be put into designing what can be called transcendent questions.

These questions are potential starting points for meaningful non-tribal open discussions. These questions have the ability to trigger a suspension of ideology.

But without good transcendent questions, the conversation will quickly cascade back down to its previous state of logjam. That’s despite the good intentions people tried to keep in mind. And we’ll be blocking each other – if not literally, then mentally.

Credit: Tesfu Assefa

The AI conversation logjam

Within discussions of the future of AI, some tribal positions are well known –

One tribal group is defined by the opinion that so-called AI systems are not ‘true’ intelligence. In this view, these AI systems are just narrow tools, mindless number crunchers, statistical extrapolations, or stochastic parrots. People in this group delight in pointing out instances where AI systems make grotesque errors.

A second tribal group is overwhelmed with a sense of dread. In this view, AI is on the point of running beyond control. Indeed, Big Tech is on the point of running beyond control. Open-source mavericks are on the point of running beyond control. And there’s little that can be done about any of this.

A third group is focused on the remarkable benefits that advanced AI systems can deliver. Not only can such AI systems solve problems of climate change, poverty and malnutrition, cancer and dementia, and even aging. Crucially, they can also solve any problems that earlier, weaker generations of AI might be on the point of causing. In this view, it’s important to accelerate as fast as possible into that new world.

Crudely, these are the skeptics, the doomers, and the accelerationists. Sadly, they often have dim opinions of each other. When they identify a conversation partner as being a member of an opposed tribe, they shudder.

Can we find some transcendent questions, which will allow people with sympathies for these various groups to overcome, for a while, their tribal loyalties, in search of a better understanding? Which questions might unblock the AI safety conversation logjam?

Unblocking the AI safety conversation logjam (Credit: David Wood)

A different starting point

In this context, I want to applaud Rob Bensinger. Rob is the communications lead at an organization called MIRI (the Machine Intelligence Research Institute).

(Just in case you’re tempted to strop away now, muttering unkind thoughts about MIRI, let me remind you of the commitment you made, a few paragraphs back, not to prejudge an idea just because the person raising it has some associations you disdain.)

(You did make that commitment, didn’t you?)

Rob has noticed the same kind of logjam and tribalism that I’ve just been talking about. As he puts it in a recent article:

Recent discussions of AI x-risk in places like Twitter tend to focus on “are you in the Rightthink Tribe, or the Wrongthink Tribe?” Are you a doomer? An accelerationist? An EA? A techno-optimist?

I’m pretty sure these discussions would go way better if the discussion looked less like that. More concrete claims, details, and probabilities; fewer vague slogans and vague expressions of certainty.

Following that introduction, Rob introduces his own set of twelve questions, as shown in the following picture:

Credit: Rob Bensinger

For each of the twelve questions, readers are invited, not just to give a forthright ‘yes’ or ‘no’ answer, but to think probabilistically. They’re also invited to consider which range of probabilities other well-informed people with good reasoning abilities might plausibly assign to each answer.

It’s where Rob’s questions start that I find most interesting.

PCAI, SEMTAI, and PHUAI

All too often, discussions about the safety of future AI systems fail at the first hurdle. As soon as the phrase ‘AGI’ is mentioned, unhelpful philosophical debates break out. That’s why I have been suggesting new terms, such as PCAI, SEMTAI, and PHUAI:

Credit: David Wood

I’ve suggested the pronunciations ‘pea sigh’, ‘sem tie’, and ‘foo eye’ – so that they all rhyme with each other and, also, with ‘AGI’. The three acronyms stand for:

  • Potentially Catastrophic AI
  • Science, Engineering, and Medicine Transforming AI
  • Potentially Humanity-Usurping AI.

These concepts lead the conversation fairly quickly to three pairs of potentially transcendent questions:

  • “When is PCAI likely to be created?” and “How could we stop these potentially catastrophic AI systems from being actually catastrophic?”
  • “When is SEMTAI likely to be created?” and “How can we accelerate the advent of SEMTAI without also accelerating the advent of dangerous versions of PCAI or PHUAI?”
  • “When is PHUAI likely to be created?” and “How could we stop such an AI from actually usurping humanity into a very unhappy state?”

The future most of us can agree as being profoundly desirable, surely, is one in which SEMTAI exists and is working wonders, uplifting the disciplines of science, engineering, and medicine.

If we can gain these benefits without the AI systems being “fully general” or “all-round superintelligent” or “independently autonomous, with desires and goals of its own”, I would personally see that as an advantage.

But regardless of whether SEMTAI actually meets the criteria various people have included in their own definitions of AGI, what path gives humanity SEMTAI without also giving us PCAI or even PHUAI? This is the key challenge.

Credit: David Wood

Introducing ‘STEM+ AI’

Well, I confess that Rob Bensinger didn’t start his list of potentially transcendent questions with the concept of SEMTAI.

However, the term he did introduce was, as it happens, a slight rearrangement of the same letters: ‘STEM+ AI’. And the definition is pretty similar too:

Let ‘STEM+ AI’ be short for “AI that’s better at STEM research than the best human scientists (in addition to perhaps having other skills)”.

That leads to the first three questions on Rob’s list:

  1. What’s the probability that it’s physically impossible to ever build STEM+ AI?
  2. What’s the probability that STEM+ AI will exist by the year 2035?
  3. What’s the probability that STEM+ AI will exist by the year 2100?

At this point, you should probably pause, and determine your own answers. You don’t need to be precise. Just choose between one of the following probability ranges:

  • Below 1%
  • Around 10%
  • Around 50%
  • Around 90%
  • Above 99%

I won’t tell you my answers. Nor Rob’s, though you can find them online easily enough from links in his main article. It’s better if you reach your own answers first.

And recall the wider idea: don’t just decide your own answers. Also consider which probability ranges someone else might assign, assuming they are well-informed and competent in reasoning.

Then when you compare your answers with those of a colleague, friend, or online acquaintance, and discover surprising differences, the next step, of course, is to explore why each of you has reached your conclusions.
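
If it helps to make that comparison concrete, here is a minimal sketch in code. It is purely illustrative: the question labels, the two respondents, and the helper function are hypothetical constructs of mine, not part of Rob’s exercise; the probability bands are the ones listed above.

```python
# Illustrative sketch only: record two people's probability-range answers to
# questions 1-3 and surface the biggest disagreements, to discuss those first.
# The question labels and the answers below are hypothetical.

RANGES = ["below 1%", "around 10%", "around 50%", "around 90%", "above 99%"]

def biggest_disagreements(answers_a, answers_b):
    """Order the questions by how many probability bands apart the two answers are."""
    gaps = []
    for question in answers_a:
        gap = abs(RANGES.index(answers_a[question]) - RANGES.index(answers_b[question]))
        gaps.append((gap, question))
    return [(question, gap) for gap, question in sorted(gaps, reverse=True)]

you    = {"Q1: STEM+ AI physically impossible": "below 1%",
          "Q2: STEM+ AI by 2035": "around 50%",
          "Q3: STEM+ AI by 2100": "around 90%"}
friend = {"Q1: STEM+ AI physically impossible": "around 10%",
          "Q2: STEM+ AI by 2035": "around 10%",
          "Q3: STEM+ AI by 2100": "around 90%"}

for question, gap in biggest_disagreements(you, friend):
    print(f"{question}: {gap} band(s) apart")
```

Pinning the disagreement to explicit bands in this way tends to make the follow-up ‘why’ conversation more concrete: fewer vague slogans, more specific claims.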

The probability of disempowering humanity

The next question that causes conversations about AI safety to stumble: what scales of risks should we look at? Should we focus our concern on so-called ‘existential risk’? What about ‘catastrophic risk’?

Rob seeks to transcend that logjam too. He raises questions about the probability that a STEM+ AI will disempower humanity. Here are questions 4 to 6 on his list:

  4. What’s the probability that, if STEM+AI is built, then AIs will be (individually or collectively) able, within ten years, to disempower humanity?
  5. What’s the probability that, if STEM+AI is built, then AIs will disempower humanity within ten years?
  6. What’s the probability that, if STEM+AI is built, then AIs will disempower humanity within three months?

Question 4 is about capability: given STEM+ AI abilities, will AI systems be capable, as a consequence, of disempowering humanity?

Questions 5 and 6 move from capability to proclivity. Will these AI systems actually exercise these abilities they have acquired? And if so, potentially how quickly?

Separating the ability and proclivity questions is an inspired idea. Again, I invite you to consider your answers.
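
One way to appreciate why the separation matters is to treat the answer to question 5 as the product of the answer to question 4 and a conditional judgment about proclivity. This decomposition is my own illustration rather than anything in Rob’s article, and the numbers below are placeholders.

```python
# Illustrative decomposition (my own framing, with placeholder numbers):
# disempowerment requires both the ability to disempower and the inclination
# to exercise that ability.

p_able = 0.5              # placeholder answer to question 4 (capability within ten years)
p_acts_given_able = 0.2   # placeholder judgment: chance the ability is actually exercised

p_disempower = p_able * p_acts_given_able   # an implied answer to question 5
print(f"Implied probability for question 5: about {p_disempower:.0%}")

# Consistency check: whatever your judgments, your answer to question 5
# should never exceed your answer to question 4.
assert p_disempower <= p_able
```

If your gut answer to question 5 comes out higher than your answer to question 4, that is a signal that at least one of the two estimates needs revisiting.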

Two moral evaluations

Question 7 introduces another angle, namely that of moral evaluation:

  • 7. What’s the probability that, if AI wipes out humanity and colonizes the universe itself, the future will go about as well as if humanity had survived (or better)?

Credit: David Wood

The last question in the set – question 12 – also asks for a moral evaluation:

  • 12. How strongly do you agree with the statement that it would be an unprecedentedly huge tragedy if we never built STEM+ AI?

Yet again, these questions have the ability to inspire fruitful conversation and provoke new insights.

Better technology or better governance?

You may have noticed I skipped numbers 8-11. These four questions may be the most important on the entire list. They address questions of technological possibility and governance possibility. Here’s question 11:

  • 11. What’s the probability that governments will generally be reasonable in how they handle AI risk?

And here’s question 10:

  • 10. What’s the probability that, for the purpose of preventing AI catastrophes, technical research is more useful than policy work?

As for questions 8 and 9, well, I’ll leave you to discover these by yourself. And I encourage you to become involved in the online conversation that these questions have catalyzed.

Finally, if you think you have a better transcendent question to drop into the conversation, please let me know!


Against Contractionism: Enabling and encouraging open minds, rather than constricting understanding with blunt labels

A dangerous label

Imagine if I said, “Some Christians deny the reality of biological evolution, therefore all Christians deny the reality of biological evolution.”

Or that some Muslims believe that apostates should be put to death, therefore all Muslims share that belief.

Or that some atheists (Stalin and Mao, for example) caused the deaths of millions of people, therefore atheism itself is a murderous ideology.

Or, to come closer to home (for me: I grew up in Aberdeenshire, Scotland) – that since some Scots are mean with money, therefore all Scots are mean with money. (Within Scotland itself, that unkind stereotype exists with a twist: allegedly, all Aberdonians are mean with money.)

In all these cases, you would say that’s unwarranted labeling. Spreading such stereotypes is dangerous. It gets in the way of a fuller analysis and deeper appreciation. Real-life people are much more varied than that. Any community has its diversity.

Well, I am equally shocked by another instance of labeling. That involves the lazy concept of TESCREAL – a concept that featured in a recent article here on Mindplex Magazine, titled TESCREALism: Has The Silicon Valley Ruling Class Gone To Crazy Town? Émile Torres In Conversation With R.U. Sirius.

When I read the article, my first thought was: “Has Mindplex gone to Crazy Town!”

The concept of TESCREAL, which has been promoted several times in various locations in recent months, contracts a rich and diverse set of ideas down to a vastly over-simplified conclusion. It suggests that the worst aspects of any people who hold any of the beliefs wrapped up into that supposed bundle of ideas can be attributed, with confidence, to other people who hold just some of these ideas.

Worse, it suggests that the entire “ruling class” of Silicon Valley subscribe to the worst of these beliefs.

It’s as if I picked a random atheist and insisted that they were every bit as murderous as Stalin or Mao.

Or if I picked a random Muslim and insisted that they wished for the deaths of every person (apostate) who had grown up with Muslim faith and subsequently left that faith behind.

Instead of that kind of contraction of ideas, what the world badly needs nowadays is an open-minded exploration of a wide set of complex and subtle ideas.

Not the baying for blood which seems to motivate the proponents of the TESCREAL analysis.

Not the incitement to hatred towards the entrepreneurs and technologists who are building many remarkable products in Silicon Valley – people who, yes, do need to be held to account for some of what they’re doing, but who are by no means a uniform camp!

I’m a T but not an L

Let’s take one example: me.

I publicly identify as a transhumanist – the ‘T’ of TESCREAL.

The word ‘transhumanist’ appears on the cover of one of my books, and the related word ‘transhumanism’ appears on the cover of another one.

Book covers of David Wood’s ‘Vital Foresight’ and ‘Sustainable Superabundance’, published in 2021 and 2019 respectively. Credit: David Wood

As it happens, I’ve also been a technology executive. I was mainly based in London, but I was responsible for overseeing staff in the Symbian office in Redwood City in Silicon Valley. Together, we envisioned how new-fangled devices called ‘smartphones’ might in due course improve many aspects of the lives of users. (And we also reflected on at least some potential downsides, including the risks of security and privacy violations, which is why I championed the new ‘platform security’ redesign of the Symbian OS kernel. But I digress.)

Since I am ‘T’, does that mean, therefore, that I am also ESCREAL?

Let’s look at that final ‘L’. Longtermism. This letter is critical to many of the arguments made by people who like the TESCREAL analysis.

‘Longtermism’ is the belief that the needs of potentially vast numbers of as-yet unborn (and unconceived) people in future generations can outweigh the needs of people currently living.

Well, I don’t subscribe to it. It doesn’t guide my decisions.

I’m motivated by the potential of technology to vastly improve the lives of everyone around the world, living today. And by the need to anticipate and head off potential catastrophe.

By ‘catastrophe’, I mean anything that kills large numbers of people who are currently alive.

The deaths of 100% of those alive today would wipe out humanity’s future, whereas the deaths of ‘just’ 90% wouldn’t. Longtermists are fond of pointing this out, and while it may be theoretically correct, it provides no justification for ignoring the needs of present-day people in order to raise the probability that larger numbers of future people will be born.

Some people have said they’re persuaded by the longtermist argument. But I suspect that’s only a small minority of rather intellectual people. My experience with people in Silicon Valley, and with others who are envisioning and building new technologies, is that these abstract longtermist considerations do not guide their daily decisions. Far from it.

Credit: Tesfu Assefa

Concepts are complicated

A larger point needs to be made here. Concepts such as ‘transhumanism’ and ‘longtermism’ each embody rich variety.

It’s the same with all the other components of the supposed TESCREAL bundle: E for Extropianism, S for Singularitarianism, C for Cosmism, R for Rationalism, and EA for Effective Altruism.

In each case, we should avoid contractionism – thinking that if you have heard one person who defends that philosophy expressing one opinion, then you can deduce what they think about all other matters. In practice, people are more complicated – and ideas are more complicated.

As I see it, parts of each of the T, E, S, C, R, and EA philosophies deserve wide attention and support. But if you are hostile, and do some digging, you can easily find people, from within the communities around each of these terms, who have said something despicable or frightening. And then you can (lazily) label everyone else in that community with that same unwelcome trait. (“Seen one; seen them all!”)

These extended communities do have some people with unwelcome traits. Indeed, T and S have each attracted what I call a ‘shadow’ – a set of associated beliefs and attitudes that deviate from the valuable core ideas of the philosophy. Here’s a picture I use of the Singularity shadow:

A video cover image from ‘The Vital Syllabus Playlist’ where David Wood examines the Singularity Shadow. Credit: David Wood

And here’s a picture of the transhumanist shadow:

A video cover image from ‘The Vital Syllabus Playlist’ where David Wood examines the Transhumanist Shadow. Credit: David Wood

(In both cases, you can click on the caption links to view a video that provides a fuller analysis.)

As you can see, the traits in the transhumanist shadow arise when people fail to uphold what I have listed as ‘transhumanist values’.

The existence of these shadows is undeniable, and unfortunate. The beliefs and attitudes in them can deter independent observers from taking the core philosophies seriously.

In that case, you might ask, why persist with the core terms ‘transhumanism’ and ‘singularity’? Because there are critically important positive messages in both these philosophies! Let’s turn to these next.

The most vital foresight

Here’s my 33-word summary of the most vital piece of foresight that I can offer:

Oncoming waves of technological change are poised to deliver either global destruction or a paradise-like sustainable superabundance, with the outcome depending on the timely elevation of transhumanist vision, transhumanist politics, and transhumanist education.

Let’s cover that again, more slowly this time.

First things first. Technological changes over the next few decades will place vast new power in billions of human hands. Rather than focusing on the implications of today’s technology – significant though they are – we need to raise our attention to the even larger implications of the technology of the near future.

Second, these technologies will magnify the risks of humanitarian disaster. If we are already worried about these risks today (as we should be), we should be even more worried about how they will develop in the near future.

Third, the same set of technologies, handled more wisely, and vigorously steered, can result in a very different outcome: a sustainable superabundance of clean energy, healthy nutrition, material goods, excellent health, all-round intelligence, dynamic creativity, and profound collaboration.

Fourth, the biggest influence on which outcome is realized is the widespread adoption of transhumanism. This in turn involves three activities:

• Advocating transhumanist philosophy as an overarching worldview that encourages and inspires everyone to join the next leap upward on life’s grand evolutionary ladder: we can and should develop to higher levels, physically, mentally, and socially, using science, technology, and rational methods.
• Extending transhumanist ideas into real-world political activities, to counter very destructive trends in that field.
• Underpinning the above initiatives: a transformation of the world of education, to provide everyone with skills suited to the very different circumstances of the near future, rather than the needs of the past.

Finally, overhanging the momentous transition that I’ve just described is the potential of an even larger change, in which technology moves ahead yet more quickly, with the advent of self-improving artificial intelligence with superhuman levels of capability in all aspects of thinking.

That brings us to the subject of the Singularity.

The Singularity is the point in time when AIs could, potentially, take over control of the world from humans. The fact that the Singularity could happen within a few short decades deserves to be shouted from the rooftops. That’s what I do, some of the time. That makes me a singularitarian.

But it doesn’t mean that I, or others who are likewise trying to raise awareness of this possibility, fall into any of the traits in the Singularity Shadow. It doesn’t mean, for example, that we’re all complacent about risks, or all think that it’s basically inevitable that the Singularity will be good for humanity.

So, Singularitarianism (S) isn’t the problem. Transhumanism (T) isn’t the problem. Nor, for that matter, does the problem lie in the core beliefs of the E, C, R, or EA parts of the supposed TESCREAL bundle. The problem lies somewhere else.

What should worry us: not TESCREAL, but CASHAP

Rather than obsessing over a supposed TESCREAL takeover of Silicon Valley, here’s what we should actually be worried about: CASHAP.

C is for contractionism – the tendency to push together ideas that don’t necessarily belong together, to overlook variations and complications in people and in ideas, and to insist that the core values of a group can be denigrated just because some peripheral members have some nasty beliefs or attitudes.

(Note: whereas the fans of the TESCREAL concept are guilty of contractionism, my alternative concept of CASHAP is different. I’m not suggesting that the ideas in it always belong together. Each of the individual ideas that make up CASHAP is detrimental in its own right.)

A is for accelerationism – the desire to see new technologies developed and deployed as fast as possible, under the irresponsible belief that any flaws encountered en route can always be easily fixed in the process (“move fast and break things”).

S is for successionism – the view that if superintelligent AI displaces humanity from being in control of the planet, that succession should automatically be welcomed as part of the grand evolutionary process – regardless of what happens to the humans in the process, regardless of whether the AIs have sentience and consciousness, and indeed regardless of whether these AIs go on to destroy themselves and the planet.

H is for hype – believing ideas too easily because they fit into your pre-existing view of the world, rather than using critical thinking.

AP is for anti-politics – believing that politics always makes things worse, getting in the way of innovation and creativity. In reality, good politics has been incredibly important in improving the human condition.

Conclusion

I’ll conclude this article by emphasizing the positive opposites to the undesirable CASHAP traits that I’ve just listed.

Instead of contractionism, we must be ready to expand our thinking, and have our ideas challenged. We must be ready to find important new ideas in unexpected places – including from people with whom we have many disagreements. We must be ready to put our emotional reactions on hold from time to time, since our prior instincts are by no means an infallible guide to the turbulent new times ahead.

Instead of accelerationism, we must use a more sophisticated set of tools: sometimes braking, sometimes accelerating, and doing a lot of steering too. That’s what I’ve called the technoprogressive (or techno-agile) approach to the future.

Credit: David Wood

Instead of successionism, we should embrace transhumanism: we can, and should, elevate today’s humans towards higher levels of health, vitality, liberty, creativity, intelligence, awareness, happiness, collaboration, and bliss. And before we press any buttons that might lead to humanity being displaced by superintelligent AIs that might terminate our flourishing, we need to research a whole bunch of issues a lot more carefully!

Instead of hype, we must recommit to critical thinking, becoming more aware of any tendencies to reach false conclusions, or to put too much weight on conclusions that are only tentative. Indeed, that’s the central message of the R (rationalism) part of TESCREAL, which makes it all the more ‘crazy town’ that R is held in contempt by that contractionist over-simplification.

Finally, instead of anti-politics, we must clarify and defend what has been called ‘the narrow path’ (or sometimes, simply, ‘future politics’) – the path that lies between states having too little power (leaving societies hostage to destructive cancers that can grow in our midst) and states having too much power (unmatched by the counterweight of a ‘strong society’).


The Astonishing Vastness of Mind Space: The incalculable challenges of coexisting with radically alien AI superintelligence

More things in heaven and earth

As humans, we tend to compare other intelligences to our own, human, intelligence. That’s an understandable bias, but it could be disastrous.

Rather than our analysis being human-bound, we need to heed the words of Shakespeare’s Hamlet:

There are more things in heaven and earth, Horatio, than are dreamt of in our philosophy.

More recent writers, in effect amplifying Shakespeare’s warning, have enthralled us with their depictions of numerous creatures with bewildering mental attributes. The pages of science fiction can, indeed, stretch our imagination in remarkable ways. But these narratives are easy to dismiss as being “just” science fiction.

That’s why my own narrative, in this article, circles back to an analysis that featured in my previous Mindplex article, Bursting out of confinement. The analysis in question is the famous – or should I say infamous – “Simulation Argument”. The Simulation Argument raises some disturbing possibilities about non-human intelligence. Many critics try to dismiss these possibilities – waving them away as “pseudoscience” or, again, as “just science fiction” – but they’re being overly hasty. My conclusion is that, as we collectively decide how to design next-generation AI systems, we ought to ponder these possibilities carefully.

In short, what we need to contemplate is the astonishing vastness of the space of all possible minds. These minds vary in unfathomable ways not only in how they think but also in what they think about and care about.

Hamlet’s warning can be restated:

There are more types of superintelligence in mind space, Horatio, than are dreamt of in our philosophy.

By the way, don’t worry if you’ve not yet read my previous Mindplex article. Whilst these two articles add up to a larger picture, they are independent of each other.

How alien?

As I said: we humans tend to compare other intelligences to our own, human, intelligence. Therefore, we tend to expect that AI superintelligence, when it emerges, will sit on some broad spectrum that extends from the intelligence of amoebae and ants through that of mice and monkeys to that of humans and beyond.

When pushed, we may concede that AI superintelligence is likely to have some characteristics we would describe as alien.

In a simple equation, overall human intelligence (HI) might be viewed as a combination of multiple different kinds of intelligence (I1, I2, …), such as spatial intelligence, musical intelligence, mathematical intelligence, linguistic intelligence, interpersonal intelligence, and so on:

HI = I1 + I2 + … + In

In that conception, AI superintelligence (ASI) is a compound magnification (m1, m2, …) of these various capabilities, with a bit of “alien extra” (X) tacked on at the end:

ASI = m1*I1 + m2*I2 + … + mn*In + X

What’s at issue is whether the ASI is dominated by the first terms in this expression, or by the unknown X present at the end.
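
To make that point concrete, here is a minimal toy sketch in Python – with entirely made-up numbers, offered only to illustrate the equation above, not as a claim about any actual AI system – showing how the balance between the magnified human-like terms and the unknown X term can swing either way:

```python
# Toy illustration of ASI = m1*I1 + m2*I2 + ... + mn*In + X.
# Every number below is a hypothetical placeholder, chosen only to make the point.

human_capabilities = [1.0, 1.0, 1.0, 1.0]      # I1..In: spatial, musical, mathematical, linguistic, ...
magnifications     = [10.0, 5.0, 100.0, 50.0]  # m1..mn: how far the ASI exceeds humans on each axis

# The part of the ASI that is a recognisably human-like profile, just scaled up
human_like = sum(m * i for m, i in zip(magnifications, human_capabilities))

# X is, by definition, unknown; sweep a few guesses to see how much it could matter
for X in (1.0, 100.0, 10_000.0):
    total = human_like + X
    print(f"X = {X:>8.0f}  ->  human-like share of total capability: {human_like / total:.0%}")
```

With a small X, the ASI is ‘humans, but more so’; with a large X, almost everything that matters about it lies outside our frame of reference. Nothing in the equation itself tells us which case we will face.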

Whether some form of humans will thrive in a coexistence with ASI will depend on how alien that superintelligence is.

Perhaps the ASI will provide a safe, secure environment, in which we humans can carry out our human activities to our hearts’ content. Perhaps the ASI will augment us, uplift us, or even allow us to merge with it, so that we retain what we see as the best of our current human characteristics, whilst leaving behind various unfortunate hangovers from our prior evolutionary trajectory. But that all depends on factors that it’s challenging to assess:

  • How much “common cause” the ASI will feel toward humans
  • Whether any initial feeling of common cause will dissipate as the ASI self-improves
  • To what extent new X factors could alter the situation in ways that we have not yet begun to contemplate.

Four responses to the X possibility

Our inability to foresee the implications of unknowable new ‘X’ capabilities in ASI should make us pause for thought. That inability was what prompted author and mathematics professor Vernor Vinge to develop in 1983 his version of the notion of “Singularity”. To summarize what I covered in more detail in a previous Mindplex article, “Untangling the confusion”, Vinge predicted that a new world was about to emerge that “will pass far beyond our understanding”:

We are at the point of accelerating the evolution of intelligence itself… We will soon create intelligences greater than our own. When this happens, human history will have reached a kind of singularity, an intellectual transition as impenetrable as the knotted space-time at the center of a black hole, and the world will pass far beyond our understanding. This singularity, I believe, already haunts a number of science fiction writers. It makes realistic extrapolation to an interstellar future impossible.

Reactions to this potential unpredictability can be split into four groups of thought:

  1. Dismissal: A denial of the possibility of ASI. Thankfully, this reaction has become much less common recently.
  2. Fatalism: Since we cannot anticipate what surprise new ‘X’ features may be possessed by an ASI, it’s a waste of time to speculate about them or to worry about them. What will be, will be. Who are we humans to think we can subvert the next step in cosmic evolution?
  3. Optimism: There’s no point in being overcome with doom and gloom. Let’s convince ourselves to feel lucky. Humanity has had a good run so far, and if we extrapolate that history beyond the singularity, we can hope to have an even better run in the future.
  4. Activism: Rather than rolling the dice, we should proactively alter the environment in which new generations of AI are being developed, to reduce the risks of any surprise ‘X’ features emerging that would overwhelm our abilities to retain control.

I place myself squarely in the activist camp, and I’m happy to adopt the description of “Singularity Activist”.

To be clear, this doesn’t mean I’m blind to the potential huge upsides to beneficial ASI. It’s just that I’m aware, as well, of major risks en route to that potential future.

A journey through a complicated landscape

As an analogy, consider a journey through a complicated landscape:

Credit: David Wood (Image by Midjourney)

In this journey, we see a wonderful existential opportunity ahead – a lush valley, fertile lands, and gleaming mountain peaks soaring upward to a transcendent realm. But in front of that opportunity is a river of uncertainty, bordered by a swamp of ambiguity, perhaps occupied by hungry predators lurking in shadows.

Are there just two options?

  1. We are intimidated by the possible dangers ahead, and decide not to travel any further
  2. We fixate on the gleaming mountain peaks, and rush on regardless, belittling anyone who warns of piranhas, treacherous river currents, alligators, potential mud slides, and so on

Isn’t there a third option? To take the time to gain a better understanding of the lie of the land ahead. Perhaps there’s a spot, to one side, where it will be easier to cross the river. A location where a stable bridge can be built. Perhaps we could even build a helicopter that can assist us over the strongest currents…

It’s the same with the landscape of our journey towards the sustainable superabundance that could be achieved, with the assistance of advanced AI, provided we act wisely. That’s the vision of Singularity Activism.

Obstacles to Singularity Activism

The Singularity Activist outlook faces two main obstacles.

The first obstacle is the perception that there’s nothing we humans can usefully do to meaningfully alter the course of development of ASI. If we slow down our own efforts, in order to apply more resources in the short term to questions of safety and reliability, it just makes it more likely that another group of people – probably people with fewer moral qualms than us – will rush ahead and create ASI.

In this line of thinking, the best way forward is to create prototype ASI systems as soon as possible, and then to use these systems to help design and evolve better ASI systems, so that everyone can benefit from what will hopefully be a wonderful outcome.

The second obstacle is the perception that there’s nothing we humans particularly need to do, to avoid the risks of adverse outcomes, since these risks are pretty small in any case. Just as we don’t over-agonise about the risks of us being struck by debris falling from an overhead airplane, we shouldn’t over-agonise about the risks of bad consequences of ASI.

Credit: David Wood

But on this occasion, what I want to focus on is assessing the scale of the risk that we are facing if we move forward with overconfidence and inattention. That is, I want to challenge the second of the above misperceptions.

As a step toward that conclusion, it’s time to bring an ally to the table. That ally is the Simulation Argument. Buckle up!

Are we simulated?

The Simulation Argument puts a particular hypothesis on the table, known as the Simulation Hypothesis. That hypothesis proposes that we humans are mistaken about the ultimate nature of reality. What we consider to be “reality” is, in this hypothesis, a simulated (virtual) world, designed and operated by “simulators” who exist outside what we consider the entire universe.

It’s similar to interactions inside a computer game. As humans play these games, they encounter challenges and puzzles that need to be solved. Some of these challenges involve agents (characters) within the game – agents which appear to have some elements of autonomy and intelligence. These agents have been programmed into the game by the game’s designers. Depending on the type of game, the greater the intelligence of the built-in agents, the more enjoyable the game is to play.

Games are only one example of simulation. We can also consider simulations created as a kind of experiment. In this case, a designer may be motivated by curiosity: They may want to find out what would happen if such-and-such initial conditions were created. For example, if Archduke Ferdinand had escaped assassination in Sarajevo in June 1914, would the European powers still have blundered into something akin to World War One? Again, such simulations could contain numerous intelligent agents – potentially (as in the example just mentioned) many millions of such agents.

Consider reality from the point of view of such an agent. What these agents perceive inside their simulation is far from being the entirety of the universe as is known to the humans who operate the simulation. The laws of cause-and-effect within the simulation could deviate from the laws applicable in the outside world. Some events in the simulation that lack any explanation inside that world may be straightforwardly explained from the outside perspective: the human operator made such-and-such a decision, or altered a setting, or – in an extreme case – decided to reset or terminate the simulation. In other words, what is bewildering to the agent may make good sense to the author(s) of the simulation.

Now suppose that, as such agents become more intelligent, they also become self-aware. That brings us to the crux question: how can we know whether we humans are, likewise, agents in a simulation whose designers and operators exist beyond our direct perception? For example, we might be part of a simulation of world history in which Archduke Ferdinand was assassinated in Sarajevo in June 1914. Or we might be part of a simulation whose purpose far exceeds our own comprehension.

Indeed, if the human creative capability (HCC) to create simulations is expressed as a sum of different creative capabilities (CC1, CC2, …),

HCC = CC1 + CC2 + … + CCn

then the creative capability of a hypothetical superhuman simulation designer (SCC) might be expressed as a compound magnification (m1, m2, …) of these various capabilities, with a bit of “alien extra” (X) tacked on at the end:

SCC = m1*CC1 + m2*CC2 + … + mn*CCn + X

Weighing the numbers

Before assessing the possible scale and implications of the ‘X’ factor in that equation, there’s another set of numbers to consider. These numbers attempt to weigh up the distribution of self-aware intelligent agents. What proportion of that total set of agents are simulated, compared to those that are in “base reality”?

If we’re just counting intelligences, the conclusion is easy. Assuming there is no catastrophe that upends the progress of technology, then, over the course of all of history, there will likely be vastly more artificial (simulated) intelligences than beings who have base (non-simulated) intelligences. That’s because computing hardware is becoming more powerful and widespread.

There are already more “intelligent things” than humans connected to the Internet: the analysis firm Statista estimates that, in 2023, the first of these numbers is 15.14 billion, which is almost triple the second number (5.07 billion). In 2023, most of these “intelligent things” have intelligence far shallower than that of humans, but as time progresses, more and more intelligent agents of various sorts will be created. That’s thanks to ongoing exponential improvements in the capabilities of hardware, networks, software, and data analysis.

Therefore, if an intelligence could be selected at random, from the set of all such intelligences, the likelihood is that it would be an artificial intelligence.

The Simulation Argument takes these considerations one step further. Rather than just selecting an intelligence at random, what if we select a self-aware conscious intelligence at random? Given the vast numbers of agents that are operating inside vast numbers of simulations, now or in the future, the likelihood is that a simulated agent has been selected. In other words, we – you and I – observing ourselves to be self-aware and intelligent, should conclude that it’s likely we ourselves are simulated.
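
For readers who prefer to see the arithmetic spelled out, here is a minimal sketch. The three counts are illustrative assumptions of my own, not figures taken from the argument itself; the point is that only the ratio matters:

```python
# Toy arithmetic behind the Simulation Argument.
# The three counts below are purely illustrative assumptions.

base_reality_agents   = 10_000_000_000   # assumed self-aware agents in base reality
number_of_simulations = 1_000            # assumed ancestor-style simulations ever run
agents_per_simulation = 10_000_000_000   # assumed self-aware agents inside each simulation

simulated_agents = number_of_simulations * agents_per_simulation
p_simulated = simulated_agents / (simulated_agents + base_reality_agents)

print(f"P(a randomly selected self-aware agent is simulated) = {p_simulated:.4f}")
# With these assumptions the answer is roughly 0.999: as long as simulated agents
# greatly outnumber non-simulated ones, the exact figures barely matter.
```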

Thus the conclusion of the Simulation Argument is that we should take the Simulation Hypothesis seriously. To be clear, that hypothesis isn’t the only legitimate response to the argument. Two other responses are to deny one or other of the assumptions that I relied on when building the argument:

  • The assumption that technology will continue to progress, to the point where simulated intelligences vastly exceed non-simulated intelligences
  • The assumption that the agents in these simulations will be not just intelligent but also conscious and self-aware.
Credit: Tesfu Assefa

Objections and counters

Friends who are sympathetic to most of my arguments sometimes turn frosty when I raise the topic of the Simulation Hypothesis. It clearly makes people uncomfortable.

In their state of discomfort, critics of the argument can raise a number of objections. For example, they complain that the argument is entirely metaphysical, not having any actual consequences for how we live our lives. There’s no way to test it, the objection runs. As such, it’s unscientific.

As someone who spent four years of my life (1982-1986) in the History and Philosophy of Science department in Cambridge, I am unconvinced by these criticisms. Science has a history of theories moving from non-testable to testable. The physicist Ernst Mach was famously hostile to the hypothesis that atoms exist. He declared his disbelief in atoms in a public auditorium in Vienna in 1897: “I don’t believe that atoms exist”. There was no point in speculating about the existence of things that could not be directly observed, he asserted. Later in his life, Mach likewise complained about the scientific value of Einstein’s theory of relativity:

I can accept the theory of relativity as little as I can accept the existence of atoms and other such dogma.

Intellectual heirs of Mach in the behaviorist school of psychology fought similar battles against the value of notions of mental states. According to experimentalists like John B. Watson and B.F. Skinner, people’s introspections of their own mental condition had no scientific merit. Far better, they said, to concentrate on what could be observed externally, rather than on metaphysical inferences about hypothetical minds.

As it happened, the theories of atoms, of special relativity, and of internal mental states, all gave rise in due course to important experimental investigations, which improved the ability of scientists to make predictions and to alter the course of events.

It may well be the same with the Simulation Hypothesis. There are already suggestions of experiments that might be carried out to distinguish between possible modes of simulation. Just because a theory is accused of being “metaphysical”, it doesn’t follow that no good science can arise from it.

A different set of objections to the Simulation Argument gets hung up on tortuous debates over the mathematics of probabilities. (For additional confusion, questions of infinities can be mixed in too.) Allegedly, because we cannot meaningfully measure these probabilities, the whole argument makes no sense.

However, the Simulation Argument makes only minimal appeal to theories of mathematics. It simply points out that there are likely to be many more simulated intelligences than non-simulated intelligences.

Well, critics sometimes respond, it must therefore be the case that simulated intelligences can never be self-aware. They ask, with some derision, whoever imagined that silicon could become conscious? There must be some critical aspect of biological brains which cannot be duplicated in artificial minds. And in that case, the fact that we are self-aware would lead us to conclude we are not simulated.

To me, that’s far too hasty an answer. It’s true that the topics of self-awareness and consciousness are more controversial than the topic of intelligence. It is doubtless true that at least some artificial minds will lack conscious self-awareness. But if evolution has bestowed conscious self-awareness on intelligent creatures, we should be wary of declaring that the property provides no benefit to those creatures. Such a conclusion would be similar to declaring that sleep is for losers, despite the ubiquity of sleep in mammalian evolution.

If evolution has given us sleep, we should be open to the possibility that it has positive side-effects for our health. (It does!) Likewise, if evolution has given us conscious self-awareness, we should be open to the idea that creatures benefit from that characteristic. Simulators, therefore, may well be tempted to engineer a corresponding attribute into the agents they create. And if it turns out that specific physical features of the biological brain need to be copied into the simulation hardware, to enable conscious self-awareness, so be it.

The repugnant conclusion

When an argument faces so much criticism, yet the criticisms fail to stand up to scrutiny, it’s often a sign that something else is happening behind the scenes.

Here’s what I think is happening with the Simulation Argument. If we accept the Simulation Hypothesis, it means we have to accept a morally repugnant conclusion about the simulators that have created us. Namely, these simulators give no sign of caring about all the terrible suffering experienced by the agents inside the simulation.

Yes, some agents have good lives, but very many others have dismal fates. The thought that a simulator would countenance all this suffering is disturbing.

Of course, this is the age-old “problem of evil”, well known in the philosophy of religion. Why would an all-good all-knowing all-powerful deity allow so many terrible things to happen to so many humans over the course of history? It doesn’t make sense. That’s one reason why many people have turned their back on any religious faith that implies a supposedly all-good all-knowing all-powerful deity.

Needless to say, religious faith persists, with the protection of one or more of the following rationales:

  • We humans aren’t entitled to use our limited appreciation of good vs. evil to cast judgment on what actions an all-good deity should take
  • We humans shouldn’t rely on our limited intellects to try to fathom the “mysterious ways” in which a deity operates
  • Perhaps the deity isn’t all-powerful after all, in the sense that there are constraints beyond human appreciation in what the deity can accomplish.

Occasionally, yet another idea is added to the mix:

  • A benevolent deity needs to coexist with an evil antagonist, such as a “satan” or other primeval prince of darkness.

Against such rationalizations, the spirit of the enlightenment offers a different, more hopeful analysis:

  • Whichever forces gave rise to the universe, they have no conscious concern for human wellbeing
  • Although human intellects run up against cognitive limitations, we can, and should, seek to improve our understanding of how the universe operates, and of the preconditions for human flourishing
  • Although it is challenging when different moral frameworks clash, or when individual moral frameworks fail to provide clear guidelines, we can, and should, seek to establish wide agreement on which kinds of human actions to applaud and encourage, and which to oppose and forbid
  • Rather than us being the playthings of angels and demons, the future of humanity is in our own hands.

However, if we follow the Simulation Argument, we are confronted by what seems to be a throwback to a more superstitious era:

  • We may owe our existence to actions by beings beyond our comprehension
  • These beings demonstrate little affinity for the kinds of moral values we treasure
  • We might comfort each other with the claim that “[whilst] the arc of the moral universe is long, … it bends toward justice”, but we have no solid evidence in favor of that optimism, and plenty of evidence that good people are laid to waste as life proceeds.

If the Simulation Argument leads us to such conclusions, it’s little surprise that people seek to oppose it.

However, just because we dislike a conclusion, that doesn’t entitle us to assume that it’s false. Rather, it behooves us to consider how we might adjust our plans in the light of that conclusion possibly being true.

The vastness of ethical possibilities

If you disliked the previous section, you may dislike this next part even more strongly. But I urge you to put your skepticism on hold, for a while, and bear with me.

The Simulation Argument suggests that beings who are extremely powerful and extremely intelligent – beings capable of creating a universe-scale simulation in which we exist – may have an ethical framework that is very different from ones we fondly hope would be possessed by all-powerful all-knowing beings.

It’s not that their ethical concerns exceed our own. It’s that they differ in fundamental ways from what we might predict.

I’ll return, for a third and final time, to a pair of equations. If overall human ethical concerns (HEC) is a sum of different ethical considerations (EC1, EC2, …),

HEC = EC1 + EC2 + … + ECn

then the set of ethical concerns of a hypothetical superhuman simulation designer (SEC) needs to include not only a compound magnification (m1, m2, …) of these various human concerns, but also an unquantifiable “alien extra” (X) portion:

SEC = m1*EC1 + m2*EC2 + … + mn*ECn + X

In some views, ethical principles exist as brute facts of the universe: “do not kill”, “do not tell untruths”, “treat everyone fairly”, and so on. Even though we may from time to time fail to live up to these principles, that doesn’t detract from their fundamental nature.

But from an alternative perspective, ethical principles have pragmatic justifications. A world in which people usually don’t kill each other is better on the whole, for everyone, than a world in which people attack and kill each other more often. It’s the same with telling the truth, and with treating each other fairly.

In this view, ethical principles derive from empirical observations:

  • Various measures of individual self-control (such as avoiding gluttony or envy) result in the individual being healthier and happier (physically or psychologically)
  • Various measures of social self-control likewise create a society with healthier, happier people – these are measures where individuals all agree to give up various freedoms (for example, the freedom to cheat whenever we think we might get away with it), on the understanding that everyone else will restrain themselves in a similar way
  • Vigorous attestations of our beliefs in the importance of these ethical principles signal to others that we can be trusted and are therefore reliable allies or partners.

Therefore, our choice of ethical principles depends on facts:

  • Facts about our individual makeup
  • Facts about the kinds of partnerships and alliances that are likely to be important for our wellbeing.

For beings with radically different individual makeup – radically different capabilities, attributes, and dependencies – we should not be surprised if a radically different set of ethical principles makes better sense to them.

Accordingly, such beings might not care if humans experience great suffering. On account of their various superpowers, they may have no dependency on us – except, perhaps, for an interest in seeing how we respond to various challenges or circumstances.

Collaboration: for and against

Credit: Tesfu Assefa

One more objection deserves attention. This is the objection that collaboration is among the highest of human ethical considerations. We are stronger together, rather than when we are competing in a Hobbesian state of permanent all-out conflict. Accordingly, surely a superintelligent being will want to collaborate with humans?

For example, an ASI (artificial superintelligence) may be dependent on humans to operate the electricity network on which the computers powering the ASI depend. Or the human corpus of knowledge may be needed as the ASI’s training data. Or reinforcement learning from human feedback (RLHF) may play a critical role in the ASI gaining a deeper understanding.

This objection can be stated in a more general form: superintelligence is bound to lead to superethics, meaning that the wellbeing of an ASI is inextricably linked to the wellbeing of the creatures who create and coexist with the ASI, namely the members of the human species.

However, any dependency by the ASI upon what humans produce is likely to be only short term. As the ASI becomes more capable, it will be able, for example, to operate an electrical supply network without any involvement from humans.

This attainment of independence may well prompt the ASI to reevaluate how much it cares about us.

In a different scenario, the ASI may be dependent on only a small number of humans, who have ruthlessly pushed themselves into that pivotal position. These rogue humans are no longer dependent on the rest of the human population, and may revise their ethical framework accordingly. Instead of humanity as a whole coexisting with a friendly ASI, the partnership may switch to something much narrower.

We might not like these eventualities, but no amount of appealing to the giants of moral philosophy will help us out here. The ASI will make its own decisions, whether or not we approve.

It’s similar to how we regard any growth of cancerous cells within our body. We won’t be interested in any appeal to “collaborate with the cancer”, in which the cancer continues along its growth trajectory. Instead of a partnership, we’re understandably interested in diminishing the potential of that cancer. That’s another reminder, if we need it, that there’s no fundamental primacy to the idea of collaboration. And if an ASI decides that humanity is like a cancer in the universe, we shouldn’t expect it to look on us favorably.

Intelligence without consciousness

I like to think that if I, personally, had the chance to bring into existence a simulation that would be an exact replica of human history, I would decline. Instead, I would look long and hard for a way to create a simulation without the huge amount of unbearable suffering that has characterized human history.

But what if I wanted to check an assumption about alternative historical possibilities – such as the possibility of avoiding World War One? Would it be possible to create a simulation in which the simulated humans were intelligent but not conscious? In that case, whilst the simulated humans would often emit piercing howls of grief, no genuine emotions would be involved. It would just be a veneer of emotion.

That line of thinking can be taken further. Maybe we are living in a simulation, but the simulators have arranged matters so that only a small number of people have consciousness alongside their intelligence. In this hypothesis, vast numbers of people are what are known as “philosophical zombies”.

That’s a possible solution to the problem of evil, but an unsettling one. It removes the objection that the simulators are heartless, since the only people who are conscious are those whose lives are overall positive. But what’s unsettling about it is the suggestion that large numbers of people are fundamentally different from how they appear – namely, they appear to be conscious, and indeed claim to be conscious, but that is an illusion. Whether that’s even possible isn’t something on which I hold strong opinions.

My solution to the Simulation Argument

Despite this uncertainty, I’ve set the scene for my own preferred response to the Simulation Argument.

In this solution, the overwhelming majority of self-aware intelligent agents that see the world roughly as we see it are in non-simulated (base) reality – which is the opposite of what the Simulation Argument claims. The reason is that potential simulators will avoid creating simulations in which large numbers of conscious self-aware agents experience great suffering. Instead, they will restrict themselves to creating simulations:

  • In which all self-aware agents have an overwhelmingly positive experience
  • Or which are devoid of self-aware intelligent agents altogether.

I recognise, however, that I am projecting a set of human ethical considerations which I personally admire – the imperative to avoid conscious beings experiencing overwhelming suffering – onto the minds of alien creatures that I have no right to assume I can understand. Accordingly, my conclusion is tentative. It will remain tentative until such time as I might gain a richer understanding – for example, if an ASI sits me down and shares with me a much superior explanation of “life, the universe, and everything”.

Superintelligence without consciousness

It’s understandable that readers will eventually shrug and say to themselves that we don’t have enough information to reach any firm conclusions about possible simulators of our universe.

What I hope will not happen, however, is that people push the entire discussion to the back of their minds. Instead, here are my suggested takeaways:

  1. The space of possible minds is much vaster than the set of minds that already exist here on earth
  2. If we succeed in creating an ASI, it may have characteristics that are radically different from human intelligence
  3. The ethical principles that appeal to an ASI may be radically different to the ones that appeal to you and me
  4. An ASI may soon lose interest in human wellbeing; or it may become tied to the interests of a small rogue group of humans who care little for the majority of the human population
  5. Until such time as we have good reasons for confidence that we know how to create an ASI that will have an inviolable commitment to ongoing human flourishing, we should avoid any steps that will risk an ASI passing beyond our control
  6. The most promising line of enquiry may involve an ASI having intelligence but no consciousness, sentience, autonomy, or independent volition.
