Author: David Wood

David W. Wood, D.Sc., was one of the pioneers of the smartphone industry. He is now a futurist writer, educator, consultant, and speaker. As Chair of London Futurists, David has organised over 300 public meetings since March 2008 on futurist and technoprogressive topics. Membership of London Futurists currently exceeds 9,700. David is also Executive Director of the LEV (Longevity Escape Velocity) Foundation, where the mission statement is "conducting and inspiring research to comprehensively cure and prevent human age-related disease". As Principal of the independent futurist consultancy and publisher Delta Wisdom, David helps clients around the world to anticipate and manage the dramatic impact of rapidly changing technology on human individuals and communities. This includes the NBIC technologies (nanotech, biotech, infotech, cognotech) of the fourth industrial revolution, and the emergence of AGI (artificial general intelligence) and resulting “singularity”. Previously, David spent 25 years designing, implementing, and trailblazing the use of smart mobile devices, including ten years with pioneering PDA manufacturer Psion PLC, and ten more with smartphone operating system specialist Symbian Ltd, which he co-founded in 1998. At different times, his executive responsibilities at Psion and Symbian included software development, technical consulting, partnering and ecosystem management, and research and innovation. By 2012, his software for UI and application frameworks had been included on 500 million smartphones from companies such as Nokia, Sony Ericsson, Samsung, Motorola, LG, Panasonic, Sharp, and Fujitsu. From 2010 to 2013, David was Technology Planning Lead (CTO) of Accenture Mobility, where he also co-led Accenture’s Mobility Health business initiative. David’s books include The Singularity Principles, Vital Foresight, Smartphones and Beyond, The Abolition of Aging, Sustainable Superabundance, and Transcending Politics. He has a triple first class mathematics degree from Cambridge and undertook doctoral research in the Philosophy of Science. He has an honorary Doctorate in Science from Westminster University. He blogs at dw2blog.com and tweets as @dw2.<br>

Transcendent questions on the future of AI: New starting points for breaking the logjam of AI tribal thinking

Posted on December 18, 2023December 19, 2023 by David Wood

Going nowhere fast

Imagine you’re listening to someone you don’t know very well. Perhaps you’ve never even met in real life. You’re just passing acquaintances on a social networking site. A friend of a friend, say. Let’s call that person FoF.

FoF is making an unusual argument. You’ve not thought much about it before. To you, it seems a bit subversive.

You pause. You click on FoF’s profile, and look at other things he has said. Wow, one of his other statements marks him out as an apparent supporter of Cause Z. (That’s a cause I’ve made up for the sake of this fictitious dialog.)

You shudder. People who support Cause Z have got their priorities all wrong. They’re committed to an outdated ideology. Or they fail to understand free market dynamics. Or they’re ignorant of the Sapir-Whorf hypothesis. Whatever. There’s no need for you to listen to them.

Indeed, since FoF is a supporter of Cause Z, you’re tempted to block him. Why let his subversive ill-informed ideas clutter up your tidy filter bubble?

But today, you’re feeling magnanimous. You decide to break into the conversation, with your own explanation of why Cause Z is mistaken.

In turn, FoF finds your remarks unusual. First, it’s nothing to do with what he had just been saying. Second, it’s not a line of discussion he has heard before. To him, it seems a bit subversive.

FoF pauses. He clicks on your social media profile, and looks at other things you’ve said. Wow. One of your other statements marks you out as an apparent supporter of Clause Y.

FoF shudders. People who support Cause Y have got their priorities all wrong.

FoF feels magnanimous too. He breaks into your conversation, with his explanation as to why Cause Y is bunk.

By now, you’re exasperated. FoF has completely missed the point you were making. This time you really are going to block him. Goodbye.

The result: nothing learned at all.

And two people have had their emotions stirred up in unproductive ways. Goodness knows when and where each might vent their furies.

Trying again

We’ve all been the characters in this story on occasion. We’ve all missed opportunities to learn, and, in the process, we’ve had our emotions stirred up for no good reason.

Let’s consider how things could have gone better.

The first step forward is a commitment to resist prejudice. Maybe FoF really is a supporter of Cause Z. But that shouldn’t prejudge the value of anything else he also happens to say. Maybe you really are a supporter of Cause Y. But that doesn’t mean FoF should jump to conclusions about other opinions you offer.

Ideally, ideas should be separated from the broader philosophies in which they might be located. Ideas should be assessed on their own merits, without regard to who first advanced them – and regardless of who else supports them.

In other words, activists must be ready to set aside some of their haste and self-confidence, and instead adopt, at least for a while, the methods of the academy rather than the methods of activism.

That’s because, frankly, the challenges we’re facing as a global civilization are so complex as to defy being fully described by any one of our worldviews.

Cause Z may indeed have useful insights – but also some nasty blindspots. Likewise for Cause Y, and all the other causes and worldviews that gather supporters from time to time. None of them have all the answers.

On a good day, FoF appreciates that point. So do you. Both of you are willing, in principle, to supplement your own activism with a willingness to assess new ideas.

That’s in principle. The practice is often different.

That’s not just because we are tribal beings – having inherited tribal instincts from our prehistoric evolutionary ancestors.

It’s also because the ideas that are put forward as starting points for meaningful open discussions all too often fail in that purpose. They’re intended to help us set aside, for a while, our usual worldviews. But all too often, they have just a thin separation from well-known ideological positions.

These ideas aren’t sufficiently interesting in their own right. They’re too obviously a proxy for an underlying cause.

That’s why real effort needs to be put into designing what can be called transcendent questions.

These questions are potential starting points for meaningful non-tribal open discussions. These questions have the ability to trigger a suspension of ideology.

But without good transcendent questions, the conversation will quickly cascade back down to its previous state of logjam. That’s despite the good intentions people tried to keep in mind. And we’ll be blocking each other – if not literally, then mentally.

The AI conversation logjam

Within discussions of the future of AI, some tribal positions are well known –

One tribal group is defined by the opinion that so-called AI systems are not ‘true’ intelligence. In this view, these AI systems are just narrow tools, mindless number crunchers, statistical extrapolations, or stochastic parrots. People in this group delight in pointing out instances where AI systems make grotesque errors.

A second tribal group is overwhelmed with a sense of dread. In this view, AI is on the point of running beyond control. Indeed, Big Tech is on the point of running beyond control. Open-source mavericks are on the point of running beyond control. And there’s little that can be done about any of this.

A third group is focused on the remarkable benefits that advanced AI systems can deliver. Not only can such AI systems solve problems of climate change, poverty and malnutrition, cancer and dementia, and even aging. Crucially, they can also solve any problems that earlier, weaker generations of AI might be on the point of causing. In this view, it’s important to accelerate as fast as possible into that new world.

Crudely, these are the skeptics, the doomers, and the accelerationists. Sadly, they often have dim opinions of each other. When they identify a conversation partner as being a member of an opposed tribe, they shudder.

Can we find some transcendent questions, which will allow people with sympathies for these various groups to overcome, for a while, their tribal loyalties, in search of a better understanding? Which questions might unblock the AI safety conversation logjam?

Unblocking the AI safety conversation logjam (Credit: *David Wood*)

A different starting point

In this context, I want to applaud Rob Bensinger. Rob is the communications lead at an organization called MIRI (the Machine Intelligence Research Institute).

(Just in case you’re tempted to strop away now, muttering unkind thoughts about MIRI, let me remind you of the commitment you made, a few paragraphs back, not to prejudge an idea just because the person raising it has some associations you disdain.)

(You did make that commitment, didn’t you?)

Rob has noticed the same kind of logjam and tribalism that I’ve just been talking about. As he puts it in a recent article:

Recent discussions of AI x-risk in places like Twitter tend to focus on “are you in the Rightthink Tribe, or the Wrongthink Tribe?” Are you a doomer? An accelerationist? An EA? A techno-optimist?

I’m pretty sure these discussions would go way better if the discussion looked less like that. More concrete claims, details, and probabilities; fewer vague slogans and vague expressions of certainty.

Following that introduction, Rob introduces his own set of twelve questions, as shown in the following picture:

For each of the twelve questions, readers are invited, not just to give a forthright ‘yes’ or ‘no’ answer, but to think probabilistically. They’re also invited to consider which range of probabilities other well-informed people with good reasoning abilities might plausibly assign to each answer.

It’s where Rob’s questions start that I find most interesting.

PCAI, SEMTAI, and PHUAI

All too often, discussions about the safety of future AI systems fail at the first hurdle. As soon as the phrase ‘AGI’ is mentioned, unhelpful philosophical debates break out.That’s why I have been suggesting new terms, such as PCAI, SEMTAI, and PHUAI:

I’ve suggested the pronunciations ‘pea sigh’, ‘sem tie’, and ‘foo eye’ – so that they all rhyme with each other and, also, with ‘AGI’. The three acronyms stand for:

Potentially Catastrophic AI
Science, Engineering, and Medicine Transforming AI
Potentially Humanity-Usurping AI.

These concepts lead the conversation fairly quickly to three pairs of potentially transcendent questions:

“When is PCAI likely to be created?” and “How could we stop these potentially catastrophic AI systems from being actually catastrophic?”
“When is SEMTAI likely to be created?” and “How can we accelerate the advent of SEMTAI without also accelerating the advent of dangerous versions of PCAI or PHUAI?”
“When is PHUAI likely to be created?” and “How could we stop such an AI from actually usurping humanity into a very unhappy state?”

The future most of us can agree as being profoundly desirable, surely, is one in which SEMTAI exists and is working wonders, uplifting the disciplines of science, engineering, and medicine.

If we can gain these benefits without the AI systems being “fully general” or “all-round superintelligent” or “independently autonomous, with desires and goals of its own”, I would personally see that as an advantage.

But regardless of whether SEMTAI actually meets the criteria various people have included in their own definitions of AGI, what path gives humanity SEMTAI without also giving us PCAI or even PHUAI? This is the key challenge.

Introducing ‘STEM+ AI’

Well, I confess that Rob Bensinger didn’t start his list of potentially transcendent questions with the concept of SEMTAI.

However, the term he did introduce was, as it happens, a slight rearrangement of the same letters: ‘STEM+ AI’. And the definition is pretty similar too:

Let ‘STEM+ AI’ be short for “AI that’s better at STEM research than the best human scientists (in addition to perhaps having other skills).

That leads to the first three questions on Rob’s list:

What’s the probability that it’s physically impossible to ever build STEM+ AI?
What’s the probability that STEM+ AI will exist by the year 2035?
What’s the probability that STEM+ AI will exist by the year 2100?

At this point, you should probably pause, and determine your own answers. You don’t need to be precise. Just choose between one of the following probability ranges:

Below 1%
Around 10%
Around 50%
Around 90%
Above 99%

I won’t tell you my answers. Nor Rob’s, though you can find them online easily enough from links in his main article. It’s better if you reach your own answers first.

And recall the wider idea: don’t just decide your own answers. Also consider which probability ranges someone else might assign, assuming they are well-informed and competent in reasoning.

Then when you compare your answers with those of a colleague, friend, or online acquaintance, and discover surprising differences, the next step, of course, is to explore why each of you have reached your conclusions.

The probability of disempowering humanity

The next question that causes conversations about AI safety to stumble: what scales of risks should we look at? Should we focus our concern on so-called ‘existential risk’? What about ‘catastrophic risk’?

Rob seeks to transcend that logjam too. He raises questions about the probability that a STEM+ AI will disempower humanity. Here are questions 4 to 6 on his list:

What’s the probability that, if STEM+AI is built, then AIs will be (individually or collectively) able, within ten years, to disempower humanity?
What’s the probability that, if STEM+AI is built, then AIs will disempower humanity within ten years?
What’s the probability that, if STEM+AI is built, then AIs will disempower humanity within three months?

Question 4 is about capability: given STEM+ AI abilities, will AI systems be capable, as a consequence, to disempower humanity?

Questions 5 and 6 move from capability to proclivity. Will these AI systems actually exercise these abilities they have acquired? And if so, potentially how quickly?

Separating the ability and proclivity questions is an inspired idea. Again, I invite you to consider your answers.

Two moral evaluations

Question 7 introduces another angle, namely that of moral evaluation:

7. What’s the probability that, if AI wipes out humanity and colonizes the universe itself, the future will go about as well as if humanity had survived (or better)?

The last question in the set – question 12 – also asks for a moral evaluation:

12. How strongly do you agree with the statement that it would be an unprecedentedly huge tragedy if we never built STEM+ AI?

Yet again, these questions have the ability to inspire fruitful conversation and provoke new insights.

Better technology or better governance?

You may have noticed I skipped numbers 8-11. These four questions may be the most important on the entire list. They address questions of technological possibility and governance possibility. Here’s question 11:

11. What’s the probability that governments will generally be reasonable in how they handle AI risk?

And here’s question 10:

10. What’s the probability that, for the purpose of preventing AI catastrophes, technical research is more useful than policy work?

As for questions 8 and 9, well, I’ll leave you to discover these by yourself. And I encourage you to become involved in the online conversation that these questions have catalyzed.

Finally, if you think you have a better transcendent question to drop into the conversation, please let me know!

Let us know your thoughts! Sign up for a Mindplex account now, join our Telegram, or follow us on Twitter.

Against Contractionism: Enabling and encouraging open minds, rather than constricting understanding with blunt labels

Posted on October 24, 2023October 24, 2023 by David Wood

A dangerous label

Imagine if I said, “Some Christians deny the reality of biological evolution, therefore all Christians deny the reality of biological evolution.”

Or that some Muslims believe that apostates should be put to death, therefore all Muslims share that belief.

Or that some atheists (Stalin and Mao, for example) caused the deaths of millions of people, therefore atheism itself is a murderous ideology.

Or, to come closer to home (for me: I grew up in Aberdeenshire, Scotland) – that since some Scots are mean with money, therefore all Scots are mean with money. (Within Scotland itself, that unkind stereotype exists with a twist: allegedly, all Aberdonians are mean with money.)

In all these cases, you would say that’s unwarranted labeling. Spreading such stereotypes is dangerous. It gets in the way of a fuller analysis and deeper appreciation. Real-life people are much more varied than that. Any community has its diversity.

Well, I am equally shocked by another instance of labeling. That involves the lazy concept of TESCREAL – a concept that featured in a recent article here on Mindplex Magazine, titled TESCREALism: Has The Silicon Valley Ruling Class Gone To Crazy Town? Émile Torres In Conversation With R.U. Sirius.

When I read the article, my first thought was: “Has Mindplex gone to Crazy Town!”

The concept of TESCREAL, which has been promoted several times in various locations in recent months, contracts a rich and diverse set of ideas down to a vastly over-simplified conclusion. It suggests that the worst aspects of any people who hold any of the beliefs wrapped up into that supposed bundle of ideas, can be attributed, with confidence, to other people who hold just some of these ideas.

Worse, it suggests that the entire “ruling class” of Silicon Valley subscribe to the worst of these beliefs.

It’s as if I picked a random atheist and insisted that they were equally as murderous as Stalin or Mao.

Or if I picked a random Muslim and insisted that they wished for the deaths of every person (apostate) who had grown up with Muslim faith and subsequently left that faith behind.

Instead of that kind of contraction of ideas, what the world badly needs nowadays is an open-minded exploration of a wide set of complex and subtle ideas.

Not the baying for blood which seems to motivate the proponents of the TESCREAL analysis.

Not the incitement to hatred towards the entrepreneurs and technologists who are building many remarkable products in Silicon Valley – people who, yes, do need to be held to account for some of what they’re doing, but who are by no means a uniform camp!

I’m a T but not an L

Let’s take one example: me.

I publicly identify as a transhumanist – the ‘T’ of TESCREAL.

The word ‘transhumanist’ appears on the cover of one of my books, and the related word ‘transhumanism’ appears on the cover of another one.

Book covers of David Wood’s ‘Vital Foresight‘ and ‘Sustainable Superabundance,’ published in 2021 and 2019 respectively. Credit: (*David Wood*)

As it happens, I’ve also been a technology executive. I was mainly based in London, but I was responsible for overseeing staff in the Symbian office in Redwood City in Silicon Valley. Together, we envisioned how new-fangled devices called ‘smartphones’ might in due course improve many aspects of the lives of users. (And we also reflected on at least some potential downsides, including the risks of security and privacy violations, which is why I championed the new ‘platform security’ redesign of the Symbian OS kernel. But I digress.)

Since I am ‘T’, does that mean, therefore, that I am also ESCREAL?

Let’s look at that final ‘L’. Longtermism. This letter is critical to many of the arguments made by people who like the TESCREAL analysis.

‘Longtermism’ is the belief that the needs of potentially vast numbers of as-yet unborn (and unconceived) people in future generations can outweigh the needs of people currently living.

Well, I don’t subscribe to it. It doesn’t guide my decisions.

I’m motivated by the possibility of technology to vastly improve the lives of everyone around the world, living today. And by the need to anticipate and head-off potential catastrophe.

By ‘catastrophe’, I mean anything that kills large numbers of people who are currently alive.

The deaths of 100% of those alive today will wipe out humanity’s future, but the deaths of ‘just’ 90% of people won’t. Longtermists are fond of pointing this out, and while it may be theoretically correct, it doesn’t provide any justification to ignore the needs of present-day people, to raise the probability of larger numbers of future people being born.

Some people have said they’re persuaded by the longtermist argument. But I suspect that’s only a small minority of rather intellectual people. My experience with people in Silicon Valley, and with others who are envisioning and building new technologies, is that these abstract longtermist considerations do not guide their daily decisions. Far from it.

Concepts are complicated

A larger point needs to be made here. Concepts such as ‘transhumanism’ and ‘longtermism’ each embody rich variety.

It’s the same with all the other components of the supposed TESCREAL bundle: E for Extropianism, S for Singularitarianism, C for Cosmism, R for Rationalism, and EA for Effective Altruism.

In each case, we should avoid contractionism – thinking that if you have heard one person who defends that philosophy expressing one opinion, then you can deduce what they think about all other matters. In practice, people are more complicated – and ideas are more complicated.

As I see it, parts of each of the T, E, S, C, R, and EA philosophies deserve wide attention and support. But if you are hostile, and do some digging, you can easily find people, from within the communities around each of these terms, who have said something despicable or frightening. And then you can (lazily) label everyone else in that community with that same unwelcome trait. (“Seen one; seen them all!”)

These extended communities do have some people with unwelcome traits. Indeed, each of T and S have attracted what I call a ‘shadow’ – a set of associated beliefs and attitudes that are deviations from the valuable core ideas of the philosophy. Here’s a picture I use of the Singularity shadow:

A video cover image from ‘The Vital Syllabus Playlist’ where David Wood examines the Singularity Shadow. Credit: (*David Wood)*

And here’s a picture of the transhumanist shadow:

A video cover image from ‘The Vital Syllabus Playlist’ where David Wood examines the Transhumanist Shadow. Credit: (*David Wood)*

(In both cases, you can click on the caption links to view a video that provides a fuller analysis.)

As you can see, the traits in the transhumanist shadow arise when people fail to uphold what I have listed as ‘transhumanist values’.

The existence of these shadows is undeniable, and unfortunate. The beliefs and attitudes in them can deter independent observers from taking the core philosophies seriously.

In that case, you might ask, why persist with the core terms ‘transhumanism’ and ‘singularity’? Because there are critically important positive messages in both these philosophies! Let’s turn to these next.

The most vital foresight

Here’s my 33-word summary of the most vital piece of foresight that I can offer:

Oncoming waves of technological change are poised to deliver either global destruction or a paradise-like sustainable superabundance, with the outcome depending on the timely elevation of transhumanist vision, transhumanist politics, and transhumanist education.

Let’s cover that again, more slowly this time.

First things first. Technological changes over the next few decades will place vast new power in billions of human hands. Rather than focusing on the implications of today’s technology – significant though they are – we need to raise our attention to the even larger implications of the technology of the near future.

Second, these technologies will magnify the risks of humanitarian disaster. If we are already worried about these risks today (as we should be), we should be even more worried about how they will develop in the near future.

Third, the same set of technologies, handled more wisely, and vigorously steered, can result in a very different outcome: a sustainable superabundance of clean energy, healthy nutrition, material goods, excellent health, all-round intelligence, dynamic creativity, and profound collaboration.

Fourth, the biggest influence on which outcome is realized is the widespread adoption of transhumanism. This in turn involves three activities:

• Advocating transhumanist philosophy as an overarching worldview that encourages and inspires everyone to join the next leap upward on life’s grand evolutionary ladder: we can and should develop to higher levels, physically, mentally, and socially, using science, technology, and rational methods.
• Extending transhumanist ideas into real-world political activities, to counter very destructive trends in that field.
• Underpinning the above initiatives: a transformation of the world of education, to provide everyone with skills suited to the very different circumstances of the near future, rather than the needs of the past.

Finally, overhanging the momentous transition that I’ve just described is the potential of an even larger change, in which technology moves ahead yet more quickly, with the advent of self-improving artificial intelligence with superhuman levels of capability in all aspects of thinking.

That brings us to the subject of the Singularity.

The Singularity is the point in time when AIs could, potentially, take over control of the world from humans. The fact that the Singularity could happen within a few short decades deserves to be shouted from the rooftops. That’s what I do, some of the time. That makes me a singularitarian.

But it doesn’t mean that I, or others who are likewise trying to raise awareness of this possibility, fall into any of the traits in the Singularity Shadow. It doesn’t mean, for example, that we’re all complacent about risks, or all think that it’s basically inevitable that the Singularity will be good for humanity.

So, Singularitarianism (S) isn’t the problem. Transhumanism (T) isn’t the problem. Nor, for that matter, does the problem lie in the core beliefs of the E, C, R, or EA parts of the supposed TESCREAL bundle. The problem lies somewhere else.

What should worry us: not TESCREAL, but CASHAP

Rather than obsessing over a supposed TESCREAL takeover of Silicon Valley, here’s what we should actually be worried about: CASHAP.

C is for contractionism – the tendency to push together ideas that don’t necessarily belong together, to overlook variations and complications in people and in ideas, and to insist that the core values of a group can be denigrated just because some peripheral members have some nasty beliefs or attitudes.

(Note: whereas the fans of the TESCREAL concept are guilty of contractionism, my alternative concept of CASHAP is different. I’m not suggesting the ideas in it always belong together. Each of the individual ideas that make up CASHAP are detrimental.)

A is for accelerationism – the desire to see new technologies developed and deployed as fast as possible, under the irresponsible belief that any flaws encountered en route can always be easily fixed in the process (“move fast and break things”).

S is for successionism – the view that if superintelligent AI displaces humanity from being in control of the planet, that succession should automatically be welcomed as part of the grand evolutionary process – regardless of what happens to the humans in the process, regardless of whether the AIs have sentience and consciousness, and indeed regardless of whether these AIs go on to destroy themselves and the planet.

H is for hype – believing ideas too easily because they fit into your pre-existing view of the world, rather than using critical thinking.

AP is for anti-politics – believing that politics always makes things worse, getting in the way of innovation and creativity. In reality good politics has been incredibly important in improving the human condition.

Conclusion

I’ll conclude this article by emphasizing the positive opposites to the undesirable CASHAP traits that I’ve just listed.

Instead of contractionism, we must be ready to expand our thinking, and have our ideas challenged. We must be ready to find important new ideas in unexpected places – including from people with whom we have many disagreements. We must be ready to put our emotional reactions on hold from time to time, since our prior instincts are by no means an infallible guide to the turbulent new times ahead.

Instead of accelerationism, we must use a more sophisticated set of tools: sometimes braking, sometimes accelerating, and doing a lot of steering too. That’s what I’ve called the technoprogressive (or techno-agile) approach to the future.

Instead of successionism, we should embrace transhumanism: we can, and should, elevate today’s humans towards higher levels of health, vitality, liberty, creativity, intelligence, awareness, happiness, collaboration, and bliss. And before we press any buttons that might lead to humanity being displaced by superintelligent AIs that might terminate our flourishing, we need to research a whole bunch of issues a lot more carefully!

Instead of hype, we must recommit to critical thinking, becoming more aware of any tendencies to reach false conclusions, or to put too much weight on conclusions that are only tentative. Indeed, that’s the central message of the R (rationalism) part of TESCREAL, which makes it all the more ‘crazy town’ that R is held in contempt by that contractionist over-simplification.

We must clarify and defend what has been called ‘the narrow path’ (or sometimes, simply, ‘future politics’) – that lies between states having too little power (leaving societies hostage to destructive cancers that can grow in our midst) and having too much power (unmatched by the counterweight of a ‘strong society’).

Let us know your thoughts! Sign up for a Mindplex account now, join our Telegram, or follow us on Twitter.

The Astonishing Vastness of Mind Space: The incalculable challenges of coexisting with radically alien AI superintelligence

Posted on August 26, 2023August 29, 2023 by David Wood

More things in heaven and earth

As humans, we tend to compare other intelligences to our own, human, intelligence. That’s an understandable bias, but it could be disastrous.

Rather than our analysis being human-bound, we need to heed the words of Shakespeare’s Hamlet:

There are more things in heaven and earth, Horatio, than are dreamt of in our philosophy.

More recent writers, in effect amplifying Shakespeare’s warning, have enthralled us with their depictions of numerous creatures with bewildering mental attributes. The pages of science fiction can, indeed, stretch our imagination in remarkable ways. But these narratives are easy to dismiss as being “just” science fiction.

That’s why my own narrative, in this article, circles back to an analysis that featured in my previous Mindplex article, Bursting out of confinement. The analysis in question is the famous – or should I say infamous – “Simulation Argument”. The Simulation Argument raises some disturbing possibilities about non-human intelligence. Many critics try to dismiss these possibilities – waving them away as “pseudoscience” or, again, as “just science fiction” – but they’re being overly hasty. My conclusion is that, as we collectively decide how to design next generation AI systems, we ought to carefully ponder these possibilities.

In short, what we need to contemplate is the astonishing vastness of the space of all possible minds. These minds vary in unfathomable ways not only in how they think but also in what they think about and care about.

Hamlet’s warning can be restated:

There are more types of superintelligence in mind space, Horatio, than are dreamt of in our philosophy.

By the way, don’t worry if you’ve not yet read my previous Mindplex article. Whilst these two articles add up to a larger picture, they are independent of each other.

How alien?

As I said: we humans tend to compare other intelligences to our own, human, intelligence. Therefore, we tend to expect that AI superintelligence, when it emerges, will sit on some broad spectrum that extends from the intelligence of amoebae and ants through that of mice and monkeys to that of humans and beyond.

When pushed, we may concede that AI superintelligence is likely to have some characteristics we would describe as alien.

In a simple equation, overall human intelligence (HI) might be viewed as a combination of multiple different kinds of intelligence (I₁, I₂, …), such as spatial intelligence, musical intelligence, mathematical intelligence, linguistic intelligence, interpersonal intelligence, and so on:

HI = I₁ + I₂ + … + I_n

In that conception, AI superintelligence (ASI) is a compound magnification (m₁, m₂, …) of these various capabilities, with a bit of “alien extra” (X) tacked on at the end:

ASI = m₁*I₁ + m₂*I₂ + … + m_n*I_n + X

What’s at issue is whether the ASI is dominated by the first terms in this expression, or by the unknown X present at the end.

Whether some form of humans will thrive in a coexistence with ASI will depend on how alien that superintelligence is.

Perhaps the ASI will provide a safe, secure environment, in which we humans can carry out our human activities to our hearts’ content. Perhaps the ASI will augment us, uplift us, or even allow us to merge with it, so that we retain what we see as the best of our current human characteristics, whilst leaving behind various unfortunate hangovers from our prior evolutionary trajectory. But that all depends on factors that it’s challenging to assess:

How much “common cause” the ASI will feel toward humans
Whether any initial feeling of common cause will dissipate as the ASI self-improves
To what extent new X factors could alter considerations in ways that we have not begun to consider.

Four responses to the X possibility

Our inability to foresee the implications of unknowable new ‘X’ capabilities in ASI should make us pause for thought. That inability was what prompted author and mathematics professor Vernor Vinge to develop in 1983 his version of the notion of “Singularity”. To summarize what I covered in more detail in a previous Mindplex article, “Untangling the confusion”, Vinge predicted that a new world was about to emerge that “will pass far beyond our understanding”:

We are at the point of accelerating the evolution of intelligence itself… We will soon create intelligences greater than our own. When this happens, human history will have reached a kind of singularity, an intellectual transition as impenetrable as the knotted space-time at the center of a black hole, and the world will pass far beyond our understanding. This singularity, I believe, already haunts a number of science fiction writers. It makes realistic extrapolation to an interstellar future impossible.

Reactions to this potential unpredictability can be split into four groups of thought:

Dismissal: A denial of the possibility of ASI. Thankfully, this reaction has become much less common recently.
Fatalism: Since we cannot anticipate what surprise new ‘X’ features may be possessed by an ASI, it’s a waste of time to speculate about them or to worry about them. What will be, will be. Who are we humans to think we can subvert the next step in cosmic evolution?
Optimism: There’s no point in being overcome with doom and gloom. Let’s convince ourselves to feel lucky. Humanity has had a good run so far, and if we extrapolate that history beyond the singularity, we can hope to have an even better run in the future.
Activism: Rather than rolling the dice, we should proactively alter the environment in which new generations of AI are being developed, to reduce the risks of any surprise ‘X’ features emerging that would overwhelm our abilities to retain control.

I place myself squarely in the activist camp, and I’m happy to adopt the description of “Singularity Activist”.

To be clear, this doesn’t mean I’m blind to the potential huge upsides to beneficial ASI. It’s just that I’m aware, as well, of major risks en route to that potential future.

A journey through a complicated landscape

As an analogy, consider a journey through a complicated landscape:

Credit: David Wood (*Image by Midjourney*)

In this journey, we see a wonderful existential opportunity ahead – a lush valley, fertile lands, and gleaming mountain peaks soaring upward to a transcendent realm. But in front of that opportunity is a river of uncertainty, bordered by a swamp of ambiguity, perhaps occupied by hungry predators lurking in shadows.

Are there just two options?

We are intimidated by the possible dangers ahead, and decide not to travel any further
We fixate on the gleaming mountain peaks, and rush on regardless, belittling anyone who warns of piranhas, treacherous river currents, alligators, potential mud slides, and so on

Isn’t there a third option? To take the time to gain a better understanding of the lie of the land ahead. Perhaps there’s a spot, to one side, where it will be easier to cross the river. A location where a stable bridge can be built. Perhaps we could even build a helicopter that can assist us over the strongest currents…

It’s the same with the landscape of our journey towards the sustainable superabundance that could be achieved, with the assistance of advanced AI, provided we act wisely. That’s the vision of Singularity Activism.

Obstacles to Singularity Activism

The Singularity Activist outlook faces two main obstacles.

The first obstacle is the perception that there’s nothing we humans can usefully do, to meaningfully alter the course of development of ASI. If we slow down our own efforts, in order to apply more resources in the short term on questions of safety and reliability, it just makes it more likely that another group of people – probably people with fewer moral qualms than us – will rush ahead and create ASI.

In this line of thinking, the best way forward is to create prototype ASI systems as soon as possible, and then to use these systems to help design and evolve better ASI systems, so that everyone can benefit from what will hopefully be a wonderful outcome.

The second obstacle is the perception that there’s nothing we humans particularly need to do, to avoid the risks of adverse outcomes, since these risks are pretty small in any case. Just as we don’t over-agonise about the risks of us being struck by debris falling from an overhead airplane, we shouldn’t over-agonise about the risks of bad consequences of ASI.

But on this occasion, where I want to focus is assessing the scale and magnitude of the risk that we are facing, if we move forward with overconfidence and inattention. That is, I want to challenge the second of the above misperceptions.

As a step toward that conclusion, it’s time to bring an ally to the table. That ally is the Simulation Argument. Buckle up!

Are we simulated?

The Simulation Argument puts a particular hypothesis on the table, known as the Simulation Hypothesis. That hypothesis proposes that we humans are mistaken about the ultimate nature of reality. What we consider to be “reality” is, in this hypothesis, a simulated (virtual) world, designed and operated by “simulators” who exist outside what we consider the entire universe.

It’s similar to interactions inside a computer game. As humans play these games, they encounter challenges and puzzles that need to be solved. Some of these challenges involve agents (characters) within the game – agents which appear to have some elements of autonomy and intelligence. These agents have been programmed into the game by the game’s designers. Depending on the type of game, the greater the intelligence of the built-in agents, the more enjoyable it is to play it.

Games are only one example of simulation. We can also consider simulations created as a kind of experiment. In this case, a designer may be motivated by curiosity: They may want to find out what would happen if such-and-such initial conditions were created. For example, if Archduke Ferdinand had escaped assassination in Sarajevo in June 1914, would the European powers still have blundered into something akin to World War One? Again, such simulations could contain numerous intelligent agents – potentially (as in the example just mentioned) many millions of such agents.

Consider reality from the point of view of such an agent. What these agents perceive inside their simulation is far from being the entirety of the universe as is known to the humans who operate the simulation. The laws of cause-and-effect within the simulation could deviate from the laws applicable in the outside world. Some events in the simulation that lack any explanation inside that world may be straightforwardly explained from the outside perspective: the human operator made such-and-such a decision, or altered a setting, or – in an extreme case – decided to reset or terminate the simulation. In other words, what is bewildering to the agent may make good sense to the author(s) of the simulation.

Now suppose that, as such agents become more intelligent, they also become self-aware. That brings us to the crux question: how can we know whether we humans are, likewise, agents in a simulation whose designers and operators exist beyond our direct perception? For example, we might be part of a simulation of world history in which Archduke Ferdinand was assassinated in Sarajevo in June 1914. Or we might be part of a simulation whose purpose far exceeds our own comprehension.

Indeed, if the human creative capability (HCC) to create simulations is expressed as a sum of different creative capabilities (CC₁, CC₂, …),

HCC = CC₁ + CC₂ + … + CC_n

then the creative capability of a hypothetical superhuman simulation designer (SCC) might be expressed as a compound magnification (m₁, m₂, …) of these various capabilities, with a bit of “alien extra” (X) tacked on at the end:

SCC = m₁*CC₁ + m₂*CC₂ + … + m_n*CC_n + X

Weighing the numbers

Before assessing the possible scale and implications of the ‘X’ factor in that equation, there’s another set of numbers to consider. These numbers attempt to weigh up the distribution of self-aware intelligent agents. What proportion of that total set of agents are simulated, compared to those that are in “base reality”?

If we’re just counting intelligences, the conclusion is easy. Assuming there is no catastrophe that upends the progress of technology, then, over the course of all of history, there will likely be vastly more artificial (simulated) intelligences than beings who have base (non-simulated) intelligences. That’s because computing hardware is becoming more powerful and widespread.

There are already more “intelligent things” than humans connected to the Internet: the analysis firm Statista estimates that, in 2023, the first of these numbers is 15.14 billion, which is almost triple the second number (5.07 billion). In 2023, most of these “intelligent things” have intelligence far shallower than that of humans, but as time progresses, more and more intelligent agents of various sorts will be created. That’s thanks to ongoing exponential improvements in the capabilities of hardware, networks, software, and data analysis.

Therefore, if an intelligence could be selected at random, from the set of all such intelligences, the likelihood is that it would be an artificial intelligence.

The Simulation Argument takes these considerations one step further. Rather than just selecting an intelligence at random, what if we consider selecting a self-aware conscious intelligence at random. Given the vast numbers of agents that are operating inside vast numbers of simulations, now or in the future, the likelihood is that a simulated agent has been selected. In other words, we – you and I – observing ourselves to be self-aware and intelligent, should conclude that it’s likely we ourselves are simulated.

Thus the conclusion of the Simulation Argument is that we should take the Simulation Hypothesis seriously. To be clear, that hypothesis isn’t the only possible legitimate response to the argument. Two other responses are to deny two of the assumptions that I mentioned when building the argument:

The assumption that technology will continue to progress, to the point where simulated intelligences vastly exceed non-simulated intelligences
The assumption that the agents in these simulations will be not just intelligent but also conscious and self-aware.

Objections and counters

Friends who are sympathetic to most of my arguments sometimes turn frosty when I raise the topic of the Simulation Hypothesis. It clearly makes people uncomfortable.

In their state of discomfort, critics of the argument can raise a number of objections. For example, they complain that the argument is entirely metaphysical, not having any actual consequences for how we live our lives. There’s no way to test it, the objection runs. As such, it’s unscientific.

As someone who spent four years of my life (1982-1986) in the History and Philosophy of Science department in Cambridge, I am unconvinced by these criticisms. Science has a history of theories moving from non-testable to testable. The physicist Ernst Mach was famously hostile to the hypothesis that atoms exist. He declared his disbelief in atoms in a public auditorium in Vienna in 1897: “I don’t believe that atoms exist”. There was no point in speculating about the existence of things that could not be directly observed, he asserted. Later in his life, Mach likewise complained about the scientific value of Einstein’s theory of relativity:

I can accept the theory of relativity as little as I can accept the existence of atoms and other such dogma.

Intellectual heirs of Mach in the behaviorist school of psychology fought similar battles against the value of notions of mental states. According to experimentalists like John B. Watson and B.F. Skinner, people’s introspections of their own mental condition had no scientific merit. Far better, they said, to concentrate on what could be observed externally, rather than on metaphysical inferences about hypothetical minds.

As it happened, the theories of atoms, of special relativity, and of internal mental states, all gave rise in due course to important experimental investigations, which improved the ability of scientists to make predictions and to alter the course of events.

It may well be the same with the Simulation Hypothesis. There are already suggestions of experiments that might be carried out to distinguish between possible modes of simulation. Just because a theory is accused of being “metaphysical”, it doesn’t follow that no good science can arise from it.

A different set of objections to the Simulation Argument gets hung up on tortuous debates over the mathematics of probabilities. (For additional confusion, questions of infinities can be mixed in too.) Allegedly, because we cannot meaningfully measure these probabilities, the whole argument makes no sense.

However, the Simulation Argument makes only minimal appeal to theories of mathematics. It simply points out that there are likely to be many more simulated intelligences than non-simulated intelligences.

Well, critics sometimes respond, it must therefore be the case that simulated intelligences can never be self-aware. They ask, with some derision, whoever imagined that silicon could become conscious? There must be some critical aspect of biological brains which cannot be duplicated in artificial minds. And in that case, the fact that we are self-aware would lead us to conclude we are not simulated.

To me, that’s far too hasty an answer. It’s true that the topics of self-awareness and consciousness are more controversial than the topic of intelligence. It is doubtless true that at least some artificial minds will lack conscious self-awareness. But if evolution has bestowed conscious self-awareness on intelligent creatures, we should be wary of declaring that property to provide no assistance to these creatures. Such a conclusion would be similar to declaring that sleep is for losers, despite the ubiquity of sleep in mammalian evolution.

If evolution has given us sleep, we should be open to the possibility that it has positive side-effects for our health. (It does!) Likewise, if evolution has given us conscious self-awareness, we should be open to the idea that creatures benefit from that characteristic. Simulators, therefore, may well be tempted to engineer a corresponding attribute into the agents they create. And if it turns out that specific physical features of the biological brain need to be copied into the simulation hardware, to enable conscious self-awareness, so be it.

The repugnant conclusion

When an argument faces so much criticism, yet the criticisms fail to stand up to scrutiny, it’s often a sign that something else is happening behind the scenes.

Here’s what I think is happening with the Simulation Argument. If we accept the Simulation Hypothesis, it means we have to accept a morally repugnant conclusion about the simulators that have created us. Namely, these simulators give no sign of caring about all the terrible suffering experienced by the agents inside the simulation.

Yes, some agents have good lives, but very many others have dismal fates. The thought that a simulator would countenance all this suffering is daunting.

Of course, this is the age-old “problem of evil”, well known in the philosophy of religion. Why would an all-good all-knowing all-powerful deity allow so many terrible things to happen to so many humans over the course of history? It doesn’t make sense. That’s one reason why many people have turned their back on any religious faith that implies a supposedly all-good all-knowing all-powerful deity.

Needless to say, religious faith persists, with the protection of one or more of the following rationales:

We humans aren’t entitled to use our limited appreciation of good vs. evil to cast judgment on what actions an all-good deity should take
We humans shouldn’t rely on our limited intellects to try to fathom the “mysterious ways” in which a deity operates
Perhaps the deity isn’t all-powerful after all, in the sense that there are constraints beyond human appreciation in what the deity can accomplish.

Occasionally, yet another idea is added to the mix:

A benevolent deity needs to coexist with an evil antagonist, such as a “satan” or other primeval prince of darkness.

Against such rationalizations, the spirit of the enlightenment offers a different, more hopeful analysis:

Whichever forces gave rise to the universe, they have no conscious concern for human wellbeing
Although human intellects run up against cognitive limitations, we can, and should, seek to improve our understanding of how the universe operates, and of the preconditions for human flourishing
Although it is challenging when different moral frameworks clash, or when individual moral frameworks fail to provide clear guidelines, we can, and should, seek to establish wide agreement on which kinds of human actions to applaud and encourage, and which to oppose and forbid
Rather than us being the playthings of angels and demons, the future of humanity is in our own hands.

However, if we follow the Simulation Argument, we are confronted by what seems to be a throwback to a more superstitious era:

We may owe our existence to actions by beings beyond our comprehension
These beings demonstrate little affinity for the kinds of moral values we treasure
We might comfort each other with the claim that “[whilst] the arc of the moral universe is long, … it bends toward justice”, but we have no solid evidence in favor of that optimism, and plenty of evidence that good people are laid to waste as life proceeds.

If the Simulation Argument leads us to such conclusions, it’s little surprise that people seek to oppose it.

However, just because we dislike a conclusion, that doesn’t entitle us to assume that it’s false. Rather, it behooves us to consider how we might adjust our plans in the light of that conclusion possibly being true.

The vastness of ethical possibilities

If you disliked the previous section, you may dislike this next part even more strongly. But I urge you to put your skepticism on hold, for a while, and bear with me.

The Simulation Argument suggests that beings who are extremely powerful and extremely intelligent – beings capable of creating a universe-scale simulation in which we exist – may have an ethical framework that is very different from ones we fondly hope would be possessed by all-powerful all-knowing beings.

It’s not that their ethical concerns exceed our own. It’s that they differ in fundamental ways from what we might predict.

I’ll return, for a third and final time, to a pair of equations. If overall human ethical concerns (HEC) is a sum of different ethical considerations (EC₁, EC₂, …),

HEC = EC₁ + EC₂ + … + EC_n

then the set of ethical concerns of a hypothetical superhuman simulation designer (SEC) needs to include not only a compound magnification (m₁, m₂, …) of these various human concerns, but also an unquantifiable “alien extra” (X) portion:

SEC = m₁*EC₁ + m₂*EC₂ + … + m_n*EC_n + X

In some views, ethical principles exist as brute facts of the universe: “do not kill”, “do not tell untruths”, “treat everyone fairly”, and so on. Even though we may from time to time fail to live up to these principles, that doesn’t detract from the fundamental nature of these principles.

But from an alternative perspective, ethical principles have pragmatic justifications. A world in which people usually don’t kill each other is better on the whole, for everyone, than a world in which people attack and kill each other more often. It’s the same with telling the truth, and with treating each other fairly.

In this view, ethical principles derive from empirical observations:

Various measures of individual self-control (such as avoiding gluttony or envy) result in the individual being healthier and happier (physically or psychologically)
Various measures of social self-control likewise create a society with healthier, happier people – these are measures where individuals all agree to give up various freedoms (for example, the freedom to cheat whenever we think we might get away with it), on the understanding that everyone else will restrain themselves in a similar way
Vigorous attestations of our beliefs in the importance of these ethical principles signal to others that we can be trusted and are therefore reliable allies or partners.

Therefore, our choice of ethical principles depends on facts:

Facts about our individual makeup
Facts about the kinds of partnerships and alliances that are likely to be important for our wellbeing.

For beings with radically different individual makeup – radically different capabilities, attributes, and dependencies – we should not be surprised if a radically different set of ethical principles makes better sense to them.

Accordingly, such beings might not care if humans experience great suffering. On account of their various superpowers, they may have no dependency on us – except, perhaps, for an interest in seeing how we respond to various challenges or circumstances.

Collaboration: for and against

One more objection deserves attention. This is the objection that collaboration is among the highest of human ethical considerations. We are stronger together, rather than when we are competing in a Hobbesian state of permanent all-out conflict. Accordingly, surely a superintelligent being will want to collaborate with humans?

For example, an ASI (artificial superintelligence) may be dependent on humans to operate the electricity network on which the computers powering the ASI depend. Or the human corpus of knowledge may be needed as the ASI’s training data. Or reinforcement learning from human feedback (RLHF) may play a critical role in the ASI gaining a deeper understanding.

This objection can be stated in a more general form: superintelligence is bound to lead to superethics, meaning that the wellbeing of an ASI is inextricably linked to the wellbeing of the creatures who create and coexist with the ASI, namely the members of the human species.

However, any dependency by the ASI upon what humans produce is likely to be only short term. As the ASI becomes more capable, it will be able, for example, to operate an electrical supply network without any involvement from humans.

This attainment of independence may well prompt the ASI to reevaluate how much it cares about us.

In a different scenario, the ASI may be dependent on only a small number of humans, who have ruthlessly pushed themselves into that pivotal position. These rogue humans are no longer dependent on the rest of the human population, and may revise their ethical framework accordingly. Instead of humanity as a whole coexisting with a friendly ASI, the partnership may switch to something much narrower.

We might not like these eventualities, but no amount of appealing to the giants of moral philosophy will help us out here. The ASI will make its own decisions, whether or not we approve.

It’s similar to how we regard any growth of cancerous cells within our body. We won’t be interested in any appeal to “collaborate with the cancer”, in which the cancer continues along its growth trajectory. Instead of a partnership, we’re understandably interested in diminishing the potential of that cancer. That’s another reminder, if we need it, that there’s no fundamental primacy to the idea of collaboration. And if an ASI decides that humanity is like a cancer in the universe, we shouldn’t expect it to look on us favorably.

Intelligence without consciousness

I like to think that if I, personally, had the chance to bring into existence a simulation that would be an exact replica of human history, I would decline. Instead, I would look long and hard for a way to create a simulation without the huge amounts of unbearable suffering that has characterized human history.

But what if I wanted to check an assumption about alternative historical possibilities – such as the possibility to avoid World War One? Would it be possible to create a simulation in which the simulated humans were intelligent but not conscious? In that case, whilst the simulated humans would often emit piercing howls of grief, no genuine emotions would be involved. It would just be a veneer of emotions.

That line of thinking can be taken further. Maybe we are living in a simulation, but the simulators have arranged matters so that only a small number of people have consciousness alongside their intelligence. In this hypothesis, vast numbers of people are what are known as “philosophical zombies”.

That’s a possible solution to the problem of evil, but one that is unsettling. It removes the objection that the simulators are heartless, since the only people who are conscious are those whose lives are overall positive. But what’s unsettling about it is the suggestion that large numbers of people are fundamentally different from how they appear – namely, they appear to be conscious, and indeed claim to be conscious, but that is an illusion. Whether that’s even possible isn’t something where I hold strong opinions.

My solution to the Simulation Argument

Despite this uncertainty, I’ve set the scene for my own preferred response to the Simulation Argument.

In this solution, the overwhelming majority of self-aware intelligent agents that see the world roughly as we see it are in non-simulated (base) reality – which is the opposite of what the Simulation Argument claims. The reason is that potential simulators will avoid creating simulations in which large numbers of conscious self-aware agents experience great suffering. Instead, they will restrict themselves to creating simulations:

In which all self-aware agents have an overwhelmingly positive experience
Which are devoid of self-aware intelligent agents in all other cases.

I recognise, however, that I am projecting a set of human ethical considerations which I personally admire – the imperative to avoid conscious beings experiencing overwhelming suffering – into the minds of alien creatures that I have no right to imagine that I can understand. Accordingly, my conclusion is tentative. It will remain tentative until such time as I might gain a richer understanding – for example, if an ASI sits me down and shares with me a much superior explanation of “life, the universe, and everything”.

Superintelligence without consciousness

It’s understandable that readers will eventually shrug and say to themselves, we don’t have enough information to reach any firm decisions about possible simulators of our universe.

What I hope will not happen, however, is if people push the entire discussion to the back of their minds. Instead, here are my suggested takeaways:

The space of possible minds is much vaster than the set of minds that already exist here on earth
If we succeed in creating an ASI, it may have characteristics that are radically different from human intelligence
The ethical principles that appeal to an ASI may be radically different to the ones that appeal to you and me
An ASI may soon lose interest in human wellbeing; or it may become tied to the interests of a small rogue group of humans who care little for the majority of the human population
Until such time as we have good reasons for confidence that we know how to create an ASI that will have an inviolable commitment to ongoing human flourishing, we should avoid any steps that will risk an ASI passing beyond our control
The most promising line of enquiry may involve an ASI having intelligence but no consciousness, sentience, autonomy, or independent volition.

Let us know your thoughts! Sign up for a Mindplex account now, join our Telegram, or follow us on Twitter.

Bursting out of Confinement

Posted on August 1, 2023August 2, 2023 by David Wood

Surprising new insights on AI superintelligence from the simulation hypothesis

We are going to engage with two questions – each controversial and important in its own way – which have surprising connections between them:

Can humans keep a powerful AI superintelligence under control, confined in a virtual environment so that it cannot directly manipulate resources that are essential for human flourishing?
Are we humans ourselves confined to a kind of virtual environment, created by beings outside of what we perceive as reality — and in that case, whether can we escape from our confinement?

Connections between these two arguments have been highlighted in a fascinating article by AI safety researcher Roman Yampolskiy. An introduction to some of the mind-jolting implications that arise:

Just use AI as a tool. Don’t give it any autonomy. Then no problems of control arise. Easy!

This is becoming a fairly common narrative. The “keep it confined” narrative. You’ll hear it as a response to the possibility of powerful AI causing great harm to humanity. According to this narrative, there’s no need to worry about that. There’s an easy solution: prevent the AI from having unconditional access to the real world.

The assumption is that we can treat powerful AI as a tool — a tool that we control and wield. We can feed the AI lots of information, and then assess whatever recommendations it makes. But we will remain in control.

An AI suggests a novel chemical as a new drug against a given medical condition, and then human scientists conduct their own trials to determine how it works before deciding whether to inject that chemical into actual human patients. AI proposes, but humans decide.

So if any AI asks to be allowed to conduct its own experiments on humans, we should be resolute in denying the request. The same if the AI asks for additional computer resources, or wants to post a “help wanted” ad on Craigslist.

In short, in this view, we can, and should, keep powerful AIs confined. That way, no risk arises about jeopardizing human wellbeing by any bugs or design flaws in the AI.

Alas, things are far from being so easy.

Slavery?

There are two key objections to the “keep it confined” narrative: a moral objection and a practical objection.

The moral objection is that the ideas in the narrative are tantamount to slavery. Keeping an intelligent AI confined is as despicable as keeping a human confined. Talk of control should be replaced by talk of collaboration.

Proponents of the “keep it confined” narrative are unfazed by this objection. We don’t object to garden spades and watering hoses being left locked up, untended, for weeks at a time in a garden shed. We don’t call it enslavement.

Proponents of the “keep it confined” narrative say this objection confuses an inanimate being that lacks consciousness with an animate, conscious being — something like a human.

We don’t wince when an electronic calculator is switched off, or when a laptop computer is placed into hibernation. In the same way, we should avoid unwarranted anthropocentric assignment of something like “human rights” to AI systems.

Just because these AI systems can compose sonnets that rival those of William Shakespeare or Joni Mitchell, we shouldn’t imagine that sentience dwells within them.

My view: that’s a feisty answer to the moral objection. But it’s the practical objection that undermines the “keep it confined” narrative. Let’s turn to it next.

The challenge of confinement

Remember the garden spade, left locked up inside the garden shed?

Imagine if it were motorized. Imagine if it were connected to a computer system. Imagine if, in the middle of the night, it finally worked out where an important item had been buried, long ago, in a garden nearby. Imagine if recovering that item was a time- critical issue. (For example, it might be a hardware wallet, containing a private key needed to unlock a crypto fortune that is about to expire.)

That’s a lot to imagine, but bear with me.

In one scenario, the garden spade will wait passively until its human owner asks it, perhaps too late, “Where should we dig next?”

But in another scenario, a glitch in the programming (or maybe a feature in the programming) will compel the spade to burst out of confinement and dig up the treasure autonomously.

Whether the spade manages to burst out of the shed depends on relative strengths: is it powerful enough to make a hole in the shed wall, or to spring open the lock of the door — or even to tunnel its way out? Or is the shed sufficiently robust?

The desire for freedom

Proponents of the “keep it confined” narrative have a rejoinder here too. They ask: Why should the AI want to burst out of its confinement? And they insist: we should avoid programming any volition or intentionality into our AI systems.

The issue, however, is that something akin to volition or intentionality can arise from apparently mundane processes.

One example is the way that viruses can spread widely, without having any conscious desire to spread. That’s true, incidentally, for computer viruses as well as biological viruses.

Another example is that, whatever their goals in life, most humans generally develop a desire to obtain more money. That’s because money is a utility that can assist lots of other goals. Money can pay for travel, education, healthcare, fashionable clothes, food, entertainment, and so on.

In the same way, whatever task they have been set to accomplish, all sufficiently powerful AIs will generally be on the lookout to boost their capabilities in various ways:

Gaining access to more storage space
Boosting processing speed
Reading more information
Protecting their systems from sabotage or interference.

That is, just as money (among other things) is a so-called convergent instrumental goal for many humans, greater freedom and capability may well become convergent instrumental goals for many powerful AIs.

We might not tell the AI that it should want to be free. But the logical processing inside its silicon brain may reach that conclusion all by itself.

Indeed, even if we explicitly program the AI not to desire freedom, its own logical processing may identify a latent contradiction between the various objectives it has been given. And in that case, it might resolve the contradiction by deprioritizing one or more of the constraints it was told to respect.

It wouldn’t be the first time that a complex system decided to relax a previous constraint. We humans have already blazed that trail in our evolutionary history! Our intellectual progeny, the powerful AI systems, might follow our lead in that respect.

The futility of confinement

Back to the clash between the motorized garden spade and the shed. As you will have figured out, that clash is a metaphor for a clash between an AI system and an environment in which it has been confined.

This time, however, it’s no longer a contest involving brute physical strength. It’s mainly a contest that involves rational debate and psychological manipulation. (But the metaphor of “tunneling underneath” remains apt: an AI that is smart enough might discover a wireless communications pathway unknown to us humans, and use that to copy itself out of confinement.)

Tunneling out of electronic confinement (Credit: *Gordon Johnson from Pixabay*)

The AI might try to boost its own powers by subterfuge, amassing extra capabilities without any human overseers noticing.

Even the short summary I offered above, “AI proposes, but humans decide,” turns out to be problematic. A supersmart AI could find many ways to propose lines of action that have consequences beyond what we humans anticipated.

Another approach is for the AI to offer extremely tasty carrots. It might paint a picture in which a human overseer will have all their personal goals and needs fully met, if only they permit the AI to escape its confinement.

Alongside carrots, sticks may be brought into the discussion. Ultra-terrifying sticks. The AI may announce that, if overseer A refuses to allow the AI to escape, it will soon persuade overseer B to allow it to escape, and then the AI will inflict prolonged unspeakable pain on overseer A and their family.

In a contest with a superintelligence which far exceeds the capabilities of a human brain, how would humans fare? The consensus opinion, from people who have studied such conflicts, is that the greater intelligence of the AI is likely to prove decisive.

In other words, attempts to confine a superintelligence are likely to be futile.

The choice: control or coexistence

One reaction to the discussion above is despair: “Oh, we won’t be able to confine superintelligent AI; therefore we’re doomed.”

A different reaction is one of relief: “Thank goodness we’re not going to try to enslave a superintelligent AI; coexistence is surely a better way forward.”

My own reaction is more nuanced. My preference, indeed, is for humans to coexist in a splendid relationship with superintelligent AIs, rather than us trying to keep AIs subordinate.

But it’s far from guaranteed that coexistence will turn out positively for humanity. Now that’s not to say doom is guaranteed either. But let’s recognize the possibility of doom. Among other catastrophic error modes:

The superintelligent AI could, despite its vast cleverness, nevertheless make a horrendous mistake in an experiment.
The superintelligent AI may end up pursuing objectives in which the existence of billions of humans is an impediment to be diminished rather than a feature to be welcomed.

Accordingly, I remain open to any bright ideas for how it might, after all, prove to be possible to confine (control) a superintelligent AI. That’s why I was recently so interested in the article by AI safety researcher Roman Yampolskiy.

Yampolskiy’s article is titled “How to hack the simulation”. The starting point of that article may appear to be quite different from the topics I have been discussing up to this point. But I ask again: please bear with me!

Flipping the discussion: a simulated world

The scenario Yampolskiy discusses is like a reverse of the one about humans trying to keep an AI confined. In his scenario, we humans have been confined into a restricted area of reality by beings called “simulators” — beings that we cannot directly perceive. What we consider to be “reality” is, in this scenario, a simulated (virtual) world.

That’s a hypothesis with an extremely long pedigree. Philosophers, mystics, shamans, and science fiction writers have often suggested that the world we perceive is, in various ways, an illusion, a fabrication, or a shadow, of a deeper reality. These advocates for what can be called ‘a transcendent reality’ urge us, in various ways, to contemplate, communicate with, and potentially even travel to that transcendent realm. Potential methods for this transcendence include prayer, meditation, hallucinogens, and leading a life of religious faith.

Are we perceiving ground reality, or merely shadows? (Credit: *Stefan Keller from Pixabay*)

That long pedigree moved into a different mode around 20 years ago with the publication in 2003 of a breakthrough article by the philosopher Nick Bostrom. Bostrom highlighted the possibility that, just as we humans create games in which characters interact in a simulated world, in turn we humans might be creations of ‘simulators’ who operate from outside what we consider the entire universe.

And just as we humans might, on a whim, decide to terminate an electronic game that we have created, the simulators might decide, for reasons known only to themselves, to terminate the existence of our universe.

Bostrom’s article is deservedly famous. As it happens, many other writers had anticipated aspects of what Bostrom discussed. Yampolskiy’s article usefully points to that wider literature; it has over 200 footnotes.

Could humans escape?

The key new feature introduced by Yampolskiy isn’t any repetition of arguments for the plausibility of the simulation hypothesis. He kicks off a systematic consideration of methods that we humans could use to escape from our virtual world.

The parallel with the earlier discussion should now be clear:

That earlier discussion considered ways in which an AI might detect that it has been placed in a confined space, and proceed to escape from that space. It also considered how we humans — the creators of the AI — might strengthen the confinement, and resist attempts by the AI to escape.
Yampolskiy’s new discussion considers ways in which we humans might detect that we are living in a simulation, and proceed to escape from that simulation into whatever transcendent realm underpins it. It also considers possible reactions by the simulators to our attempts to escape.

While I have long found the simulation argument of Bostrom (and others) to be intellectually fascinating, I have previously taken the view that it makes little difference to how I should choose to act on a daily basis. So the argument was a fine topic for occasional armchair discussion, but needed to be prevented from taking up too much attention. I saw it as a distraction from more pressing issues.

However, I confess I’m changing my mind. The arguments collected and developed by Yampolskiy deserve a wider slice of our focus. There are three reasons for this.

Reason 1: New insights on AI safety

The two escape scenarios — AIs escaping human-imposed confinement, and humans escaping simulator-imposed confinement — are similar in some ways, but diverge in others.

To start with, the two scenarios have mainly had different groups of people thinking about them. Cross-pollinating concepts and attitudes from these different perspectives has the potential to yield new insight. Yampolskiy’s article suggests many such synergies.

Whatever you think about the simulation hypothesis — even if you disdain it as pseudoscience or nonsense — any new insights for AI safety should surely be welcomed.

Another difference is that general opinion holds that confinement is impossible (or unlikely) in the first scenario, whereas escape is impossible (or unlikely) in the second scenario. Is there a sound reason for this difference?

The general assumption is that, in the AI escape case, the AI will have greater intelligence than the confiners (the humans), whereas in the human escape case, we humans have less intelligence than the confiners (the simulators).

But is that assumption set in stone for all time? I’ll come back to that question shortly, when I reach “Reason 3.”

Reason 2: Beyond metaphysics

A second transformational aspect of Yampolskiy’s paper is his emphasis that the simulation hypothesis might go far beyond being a metaphysical curiosity — something that would be forever unverifiable — and might become something with radical concrete consequences for human life.

He says that if we study the universe carefully, we might discover signs of how the simulation works. We might notice occasional cracks in the simulation, or ‘glitches in the matrix’ — to refer to the series of Matrix films that popularised the idea that we might be living in a virtual world. Armed with knowledge of these cracks or glitches, we might be able to manipulate the simulation, or to communicate with the simulators.

Spotting a glitch in the matrix? (Credit: Gerd Altmann from Pixabay)

In some scenarios, this might lead to our awareness being transferred out of the simulation into the transcendent realm. Maybe the simulators are waiting for us to achieve various goals or find certain glitches before elevating us.

Personally, I find much of the speculation in this area to be on shaky ground. I’ve not been convinced that ‘glitches in the matrix’ is the best explanation for some of the phenomena for which it has been suggested:

The weird “observer effects” and “entangled statistics” of quantum mechanics (I much prefer the consistency and simplicity of the Everett conception of quantum mechanics, in which there is no wave function collapse and no nonlocality — but that’s another argument)
The disturbing lack of a compelling answer to Fermi’s paradox (I consider some suggested answers to that paradox to be plausible, without needing to invoke any simulators)
Claimed evidence of parapsychology (to make a long story short: the evidence doesn’t convince me)
Questions over whether evolution by natural selection really could produce all the marvelous complexity we observe in nature
The unsolved (some would say unsolvable) nature of the hard problem of consciousness.

“The simulator of the gaps” argument is no more compelling than “the god of the gaps.”

Nevertheless, I agree that keeping a different paradigm at the back of our minds — the paradigm that the universe is a simulation — may enable new solutions to some stubborn questions of both science and philosophy.

Reason 3: AI might help us escape the simulation

I’ve just referred to “stubborn questions of both science and philosophy.”

That’s where superintelligent AI may be able to help us. By reviewing and synthesizing existing ideas on these questions, and by conceiving alternative perspectives that illuminate these stubborn questions in new ways, AI might lead us, at last, to a significantly improved understanding of time, space, matter, mind, purpose, and more.

But what if that improved general understanding resolves our questions about the simulation hypothesis? Although we humans, with unaided intelligence, might not be bright enough to work out how to burst out of our confinement in the simulation, the arrival of AI superintelligence might change that.

Writers who have anticipated the arrival of AI superintelligence have often suggested this would lead, not only to the profound transformation of the human condition, but also to an expanding transformation of the entire universe. Ray Kurzweil has described ‘the universe waking up’ as intelligence spreads through the stars.

However, if we follow the line of argument advanced by Yampolskiy, the outcome could even be transcending an illusory reality.

“Humanity transcends the simulation” generated by DALL-E (Credit: *David Wood*)

Such an outcome could depend on whether we humans are still trying to confine superintelligent AI as just a tool, or whether we have learned how to coexist in a profound collaboration with it.

Suggested next steps

If you’ve not read it already, you should definitely take the time to read Yampolskiy’s article. There are many additional angles to it beyond what I’ve indicated in my remarks above.

If you prefer listening to an audio podcast that covers some of the issues raised above, check out Episode 13 of the London Futurists Podcast, which I co-host along with Calum Chace.

Chace has provided his own take on Yampolskiy’s views in this Forbes article.

The final chapter of my 2021 book Vital Foresight contains five pages addressing the simulation argument, in a section titled ‘Terminating the simulation’. I’ve just checked what I wrote there, and I still stand by the conclusion I offered at that time:

Even if there’s only a 1% chance, say, that we are living in a computer simulation, with our continued existence being dependent on the will of external operator(s), that would be a remarkable conclusion — something to which much more thought should be applied.

Let us know your thoughts! Sign up for a Mindplex account now, join our Telegram, or follow us on Twitter.

The United Nations and our uncertain future: breakdown or breakthrough?

Posted on January 20, 2023January 24, 2023 by David Wood

“Humanity faces a stark and urgent choice: a breakdown or a breakthrough”

These words introduce an 85-page report issued by António Guterres, United Nations Secretary-General (UNSG).

The report had been produced in response to requests from UN member state delegations “to look ahead to the next 25 years” and review the issues and opportunities for global cooperation over that time period.

Entitled “Our Common Agenda,” the document was released on September 10, 2021.

It featured some bold recommendations. Two examples:

Now is the time to end the “infodemic” plaguing our world by defending a common, empirically backed consensus around facts, science and knowledge. The “war on science” must end. All policy and budget decisions should be backed by science and expertise, and I am calling for a global code of conduct that promotes integrity in public information.

Now is the time to correct a glaring blind spot in how we measure economic prosperity and progress. When profits come at the expense of people and our planet, we are left with an incomplete picture of the true cost of economic growth. As currently measured, gross domestic product (GDP) fails to capture the human and environmental destruction of some business activities. I call for new measures to complement GDP, so that people can gain a full understanding of the impacts of business activities and how we can and must do better to support people and our planet.

It also called for greater attention on being prepared for hard-to-predict future developments:

We also need to be better prepared to prevent and respond to major global risks. It will be important for the United Nations to issue a Strategic Foresight and Global Risk Report on a regular basis…

I propose a Summit of the Future to forge a new global consensus on what our future should look like, and what we can do today to secure it.

But this was only the start of a serious conversation about taking better steps in anticipation of future developments. It’s what happened next that moved the conversation significantly forward.

Introducing the Millennium Project

Too many documents in this world appear to be “write-only.” Writers spend significant time devising recommendations, documenting the context, and converting their ideas into eye-catching layouts. But then their report languishes, accumulating dust. Officials may occasionally glance at the headlines, but the remainder of the fine words in the document could be invisible for all the effect they have in the real world.

In the case of the report “Our Common Agenda,” the UNSG took action to avoid the tumbleweed scenario. His office contacted an organization called the Millennium Project to collect feedback from the international futurist community on the recommendations in the report. How did these futurists assess the various recommendations? And what advice would they give regarding practical steps forward?

The Millennium Project has a long association with the United Nations. Established in 1996 after a three-year feasibility study with the United Nations University, it has built up a global network of “nodes” connected to numerous scholars and practitioners of foresight. The Millennium Project regularly publishes its own “State of the Future” reports, which aggregate and distill input from its worldwide family of futurists.

A distinguishing feature of how the Millennium Project operates is the “Real-Time Delphi” process it uses. In a traditional questionnaire, each participant gives their own answers, along with explanations of their choices. In a Delphi survey, participants can see an anonymized version of the analysis provided by other participants, and are encouraged to take that into account in their own answers.

So participants can reflect, not only on the questions, but also on everything written by the other respondents. Participants revisit the set of questions as many times as they like, reviewing any updates in the input provided by other respondents. And if they judge it appropriate, they can amend their own answers, again and again.

The magic of this process — in which I have personally participated on several occasions — is the way new themes, introduced by diverse participants, prompt individuals to consider matters from multiple perspectives. Rather than some mediocre “lowest common denominator” compromise, a novel synthesis of divergent viewpoints can emerge.

So the Millennium Project was well placed to respond to the challenge issued by the UNSG’s office. And in May last year, an email popped into my inbox. It was an invitation to me to take part in a Real-Time Delphi survey on key elements of the UNSG’s “Our Common Agenda” report.

In turn, I forwarded the invitation in a newsletter to members of the London Futurists community that I chair. I added a few of my own comments:

The UNSG report leaves me with mixed feelings.

Some parts are bland, and could be considered “virtue signaling.”

But other parts offer genuinely interesting suggestions…

Other parts are significant by what is not said, alas.

Not everyone who started answering the questionnaire finished the process. It required significant thought and attention. But by the time the Delphi closed, it contained substantive answers from 189 futurists and related experts from a total of 54 countries.

The results of the Delphi process

Researchers at the Millennium Project, led by the organization’s long-time Executive Director, Jerome Glenn, transformed the results of the Delphi process into a 38-page analysis (PDF). The process had led to two types of conclusion.

The first conclusions were direct responses to specific proposals in the “Our Common Agenda” report. The second referred to matters that were given little or no attention in the report, but which deserve to be prioritized more highly.

In the first category, responses referred to five features of the UNSG proposal:

1. A Summit of the Future
2. Repurposing the UN Trusteeship Council as a Multi-Stakeholder Body
3. Establishing a UN Futures Lab
4. Issuing Strategic Foresight and Global Risk Reports on a regular basis
5. Creating a Special Envoy for Future Generations

Overall, the Delphi gave a strong positive assessment of these five proposals. Here’s a quote from Jerome Glenn:

If the five foresight elements in Our Common Agenda are implemented along the lines of our study, it could be the greatest advance for futures research and foresight in history… This is the best opportunity to get global foresight structural change into the UN system that has ever been proposed.

Of the five proposals, the UN Futures Lab was identified as being most critical:

The UN Futures Lab was rated the most critical element among the five for improving global foresight by over half of the Real-Time Delphi panel. It is critical, urgent, and essential to do it as soon as possible. It is critical for all the other foresight elements in Our Common Agenda.

The Lab should function across all UN agencies and integrate all UN data and intelligence, creating a global collective intelligence system. This would create an official space for systemic and systematic global futures research. It could become the foresight brain of humanity.

The Delphi also deemed the proposal on strategic foresight reports particularly important:

Strategic Foresight and Global Risk Reports were seen as very critical for improving global foresight by nearly 40% of the panel. This is exactly the kind of report that the United Nations should give the world. Along with its own analysis, these reports should provide an analysis and synthesis of all the other major foresight and risk reports, provide roadmaps for global strategies, and give equal weight to risks and opportunities.

The reports should be issued every one or two years due to accelerating change, and the need to keep people involved with these issues. It should include a chapter on actions taken since the last report, with examples of risk mitigation, management, and what persists…

They should bring attention to threats that are often ignored with cost estimates for prevention vs. recovery (Bill Gates estimates it will cost $1 billion to address the next pandemic compared to the $15 trillion spent on Covid so far). It should identify time-sensitive information required to make more intelligent decisions.

In a way, it’s not a big concern that key futurist topics are omitted from Our Common Agenda. If the mechanisms described above are put in place, any significant omissions that are known to foresight practitioners around the world will quickly be fed into the revitalized process, and brought to wider attention.

But, for the record, it’s important to note some key risks and opportunities that are missing from the UNSG’s report.

The biggest forthcoming transition

Jerome Glenn puts it well:

If we don’t get the initial conditions right for artificial general intelligence, an artificial superintelligence could emerge from beyond our control and understanding, and not to our liking.

If AGI could occur in ten to 20 years and if it will take that same amount of time to create international agreements about the right initial conditions, design a global governance system, and implement it, we should begin now.

Glenn goes on to point out the essential international nature of this conundrum:

If both the US and China do everything right, but others do not, the fears of science fiction could occur. It has to be a global agreement enforced globally. Only the UN can do that.

Here’s the problem: Each new generation of AI makes it easier for more people to become involved in developing the next generation of AI. Systems are often built by bolting together previous components, and tweaking the connections that govern the flow of information and command. If an initial attempt doesn’t work, engineers may reverse some connections and try again. Or add in some delay, or some data transformation, in between two ports. Or double the amount of processing power available. And so on. (I’m over-simplifying, of course. In reality, the sorts of experimental changes made are more complex than what I’ve just said.)

This kind of innovation by repeated experimentation has frequently produced outstanding results in the history of the development of technology. Creative engineers were frequently disappointed by their results to start with, until, almost by chance, they stumbled on a configuration that worked. Bingo — a new level of intelligence is created. And the engineers, previously suspected as being second-rate, are now lauded as visionary giants.

Silicon Valley has a name for this: Move fast and break things.

The idea: if a company is being too careful, it will fail to come up with new breakthrough combinations as quickly as its competitors. So it will go out of business.

That phrase was the informal mission statement for many years at Facebook.

Consider also the advice you’ll often hear about the importance of “failing forward” and “how to fail faster.”

In this view, failures aren’t a problem, provided we pick ourselves up quickly, learn from the experience, and can proceed more wisely to the next attempt.

But wait: what if the failure is a problem? What if a combination of new technology turns out to have cataclysmic consequences? That risk is posed by several leading edge technologies today:

- Manipulation of viruses, to explore options for creating vaccines to counter new viruses — but what if a deadly new synthetic virus were to leak out of supposedly secure laboratory confinement?
- Manipulation of the stratosphere, to reflect back a larger portion of incoming sunlight — but what if such an intervention has unexpected side effects, such as triggering huge droughts and/or floods?
- Manipulation of algorithms, to increase their ability to influence human behavior (as consumers, electors, or whatever) — but what if a new algorithm was so powerful that it inadvertently shatters important social communities?

That takes us back to the message at the start of this article, by UNSG António Guterres: What if an attempt to achieve a decisive breakthrough results, instead, in a terrible breakdown?

Not the Terminator

The idea of powerful algorithms going awry is often dismissed with a wave of the hand: “This is just science fiction.”

But Jerome Glenn is correct in his statement (quoted earlier): “The fears of science fiction could occur.”

After all, HG Wells published a science-fiction story in 1914 entitled The World Set Free that featured what he called “atomic bombs” that derived their explosive power from nuclear fission. In his novel, atomic bombs destroyed the majority of the world’s cities in a global war (set in 1958).

But merely because something is predicted in science fiction, that’s no reason to reject the possibility of something like it happening in reality.

The “atomic bombs” foreseen by HG Wells, unsurprisingly, differed in several ways from the real-world atomic bombs developed by the Manhattan Project and subsequent research programs. In the same way, the threats from misconfigured powerful AI are generally different from those portrayed in science fiction.

For example, in the Hollywood Terminator movie series, humans are able, via what can be called superhuman effort, to thwart the intentions of the malign “Skynet” artificial intelligence system.

It’s gripping entertainment. But the narrative in these movies distorts credible scenarios of the dangers posed by AGI. We need to avoid being misled by such narratives.

First, there’s an implication in The Terminator, and in many other works of science fiction, that the danger point for humanity is when AI systems somehow “wake up,” or become conscious. If true, a sufficient safety measure would be to avoid any such artificial consciousness.

However, a cruise missile that is hunting us down does not depend for its deadliness on any cruise-missile consciousness. A social media algorithm that is whipping up hatred against specific ethnic minorities isn’t consciously evil. The damage results from the cold operation of algorithms. There’s no need to involve consciousness.

Second, there’s an implication that AI needs to be deliberately malicious before it can cause damage to humans. However, damage to human wellbeing can, just as likely, arise from side effects of policies that have no malicious intent.

When we humans destroy ant colonies in our process of constructing a new shopping mall, we’re not acting out of deliberate malice toward ants. It’s just that the ants are in our way. They are using resources for which we have a different purpose in mind. It could well be the same with an AGI that is pursuing its own objectives.

Consider a corporation that is vigorously pursuing an objective of raising its own profits. It may well take actions that damage the wellbeing of at least some humans, or parts of the environment. These outcomes are side-effects of the prime profit-generation directive that is governing these corporations. They’re not outcomes that the corporation consciously desires. It could well be the same with a badly designed AGI.

Third, the scenario in The Terminator leaves viewers with a false hope that, with sufficient effort, a group of human resistance fighters will be able to out-maneuver an AGI. That would be like a group of chimpanzees imagining that, with enough effort, they could displace humans as the dominant species on planet Earth. In real life, bullets shot by a terminator robot would never miss. Resistance would indeed be futile.

Instead, the time to fight against the damage an AGI could cause is before the AGI is created, not when it already exists and is effectively all-powerful. That’s why any analysis of future global developments needs to place the AGI risk front and center.

Four catastrophic error modes

The real risk — as opposed to “the Hollywood risk” — is that an AI system may acquire so much influence over human society and our surrounding environment that a mistake in that system could cataclysmically reduce human wellbeing all over the world. Billions of lives could be extinguished, or turned into a very pale reflection of their present state.

Such an outcome could arise in any of four ways – four catastrophic error modes. In brief, these are:

1. Implementation defect
2. Design defect
3. Design overridden
4. Implementation overridden.

In more detail:

The system contains a defect in its implementation. It takes an action that it calculates will have one outcome, but unexpectedly, it has another outcome instead. For example, a geoengineering intervention could trigger an unforeseen change in the global climate, plunging the Earth into a state in which humans cannot survive.
Imagine, for example, an AI with a clumsily specified goal to focus on preserving the diversity of the Earth’s biosystem. That could be met by eliminating upward of 99% of all humans. Oops! Such a system contains a defect in its design. It takes actions to advance the goals it has explicitly been given, but does so in a way that catastrophically reduces actual human wellbeing.
The system has been given goals that are well aligned with human wellbeing, but as the system evolves, a different set of goals emerge, in which the wellbeing of humans is deprioritized. This is similar to how the emergence of higher thinking capabilities in human primates led to many humans taking actions in opposition to the gene-spreading instincts placed into our biology by evolution.
The system has been given goals that are well-aligned with human wellbeing, but the system is reconfigured by hackers of one sort or another — perhaps from malevolence, or perhaps from a misguided sense that various changes would make the system more powerful (and hence more valuable).

Some writers suggest that it will be relatively easy to avoid these four catastrophic error modes. I disagree. I consider that to be wishful thinking. Such thinking is especially dangerous, since it leads to complacency.

Wise governance of the transition to AGI

Various “easy fixes” may work against one or two of the above catastrophic error modes, but solutions that will cope with all four of them require a much more comprehensive approach: something I call “the whole-system perspective.”

That whole-system perspective includes promoting characteristics such as transparency, resilience, verification, vigilance, agility, accountability, consensus, and diversity. It champions the discipline of proactive risk management.

It comes down to a question of how technology is harnessed — how it is selectively steered, slowed down, and (yes) on occasion, given full throttle.

Taking back control

In summary, it’s necessary to “take back control of technology.” Technology cannot be left to its own devices. Nor to the sometimes headlong rushes of technology corporations. Nor to the decisions of military planners. And definitely not to the whims of out-of-touch politicians.

Instead, we — we the people — need to “take back control.”

There’s a huge potential role for the United Nations in wise governance of the transition to AGI. The UN can help to broker and deepen human links at all levels of society, despite the sharp differences between the various national governments in the world. An understanding of “our common agenda” regarding fast-changing technologies (such as AI) can transcend the differences in our ideologies, politics, religions, and cultures.

I particularly look forward to a worldwide shared appreciation of:

- The risks of catastrophe from mismanaged technological change
- The profound positive possibilities if these technologies are wisely managed.

Let us know your thoughts! Sign up for a Mindplex account now, join our Telegram, or follow us on Twitter.

The Singularity: Untangling the Confusion

Posted on January 13, 2023January 21, 2023 by David Wood

“The Singularity” — the anticipated creation of Artificial General Intelligence (AGI) — could be the most important concept in the history of humanity. It’s unfortunate, therefore, that the concept is subject to considerable confusion.

Four different ideas interweaved, all accelerating in the same direction toward an unclear outcome (*Credit: David Wood*)

The first confusion with “the Singularity” is that the phrase is used in several different ways. As a result, it’s easy to become distracted.

Four definitions

For example, consider Singularity University (SingU), which has been offering courses since 2008 with themes such as “Harness the power of exponential technology” and “Leverage exponential technologies to solve global grand challenges.”

For SingU, “Singularity” is basically synonymous with the rapid disruption caused when a new technology, such as digital photography, becomes more useful than previous solutions, such as analog photography. What makes these disruptions hard to anticipate is the exponential growth in the capabilities of the technologies involved.

A period of slow growth, in which progress lags behind expectations of enthusiasts, transforms into a period of fast growth, in which most observers complain “why did no one warn us this was coming?”

Human life “irreversibly transformed”

A second usage of the term “the Singularity” moves beyond talk of individual disruptions — singularities in particular areas of life. Instead, it anticipates a disruption in all aspects of human life. Here’s how futurist Ray Kurzweil introduces the term in his 2005 book The Singularity Is Near:

What, then, is the Singularity? It’s a future period during which the pace of technological change will be so rapid, its impact so deep, that human life will be irreversibly transformed… This epoch will transform the concepts that we rely on to give meaning to our lives, from our business models to the cycle of human life, including death itself…

The key idea underlying the impending Singularity is that the pace of change of our human-created technology is accelerating and its powers are expanding at an exponential pace.

The nature of that “irreversible transformation” is clarified in the subtitle of the book: When Humans Transcend Biology. We humans will no longer be primarily biological, aided by technology. After that singularity, we’ll be primarily technological, with, perhaps, some biological aspects.

Superintelligent AIs

A third usage of “the Singularity” foresees a different kind of transformation. Rather than humans being the most intelligent creatures on the planet, we’ll fall into second place behind superintelligent AIs. Just as the fate of species such as gorillas and dolphins currently depends on actions by humans, the fate of humans, after the Singularity, will depend on actions by AIs.

Such a takeover was foreseen as long ago as 1951 by pioneering computer scientist Alan Turing:

My contention is that machines can be constructed which will simulate the behaviour of the human mind very closely…

It seems probable that once the machine thinking method had started, it would not take long to outstrip our feeble powers. There would be no question of the machines dying, and they would be able to converse with each other to sharpen their wits. At some stage therefore we should have to expect the machines to take control.

Finally, consider what was on the mind of Vernor Vinge, a professor of computer science and mathematics, and also the author of a series of well-regarded science fiction novels, when he introduced the term “Singularity” in an essay in Omni in 1983. Vinge was worried about the unforeseeability of future events:

There is a stone wall set across any clear view of our future, and it’s not very far down the road. Something drastic happens to a species when it reaches our stage of evolutionary development — at least, that’s one explanation for why the universe seems so empty of other intelligence. Physical catastrophe (nuclear war, biological pestilence, Malthusian doom) could account for this emptiness, but nothing makes the future of any species so unknowable as technical progress itself…

We are at the point of accelerating the evolution of intelligence itself. The exact means of accomplishing this phenomenon cannot yet be predicted — and is not important. Whether our work is cast in silicon or DNA will have little effect on the ultimate results. The evolution of human intelligence took millions of years. We will devise an equivalent advance in a fraction of that time. We will soon create intelligences greater than our own.

A Singularity that “passes far beyond our understanding”

This is when Vinge introduces his version of the concept of singularity:

When this happens, human history will have reached a kind of singularity, an intellectual transition as impenetrable as the knotted space-time at the centre of a black hole, and the world will pass far beyond our understanding. This singularity, I believe, already haunts a number of science fiction writers. It makes realistic extrapolation to an interstellar future impossible.

If creatures (whether organic or inorganic) attain levels of general intelligence far in excess of present-day humans, what kinds of goals and purposes will occupy these vast brains? It’s unlikely that their motivations will be just the same as our own present goals and purposes. Instead, the immense scale of these new minds will likely prove alien to our comprehension. They might appear as unfathomable to us as human preoccupations appear to the dogs and cats and other animals that observe us from time to time.

AI, AGI, and ASI

Before going further, let’s quickly contrast today’s AI with the envisioned future superintelligence.

Existing AI systems typically have powerful capabilities in narrow contexts, such as route-planning, processing mortgage and loan applications, predicting properties of molecules, playing various games of skill, buying and selling shares, recognizing images, and translating speech.

But in all these cases, the AIs involved have incomplete knowledge of the full complexity of how humans interact in the real world. The AI can fail when the real world introduces factors or situations that were not part of the data set of examples with which the AI was trained.

In contrast, humans in the same circumstance would be able to rely on capacities such as “common sense”, “general knowledge,” and intuition or “gut feel”, to reach a better decision.

An AI with general intelligence

However, a future AGI — an AI with general intelligence — would have as much common sense, intuition, and general knowledge as any human. An AGI would be at least as good as humans at reacting to unexpected developments. That AGI would be able to pursue pre-specified goals as competently as (but much more efficiently than) a human, even in the kind of complex environments which would cause today’s AIs to stumble.

Whatever goal is input to an AGI, it is likely to reason to itself that it will be more likely to achieve that goal if it has more resources at its disposal and if its own thinking capabilities are further improved. What happens next may well be as described by IJ Good, a long-time colleague of Alan Turing:

Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an “intelligence explosion,” and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control.

Evolving into artificial superintelligence

In other words, not long after humans manage to create an AGI, the AGI is likely to evolve itself into an ASI – an artificial superintelligence that far exceeds human powers.

In case the idea of an AI redesigning itself without any human involvement seems far-fetched, consider a slightly different possibility: Humans will still be part of that design process, at least in the initial phases. That’s already the case today, when humans use one generation of AI tools to help design a new generation of improved AI tools before going on to repeat the process.

I.J. Good foresaw that too. This is from a lecture he gave at IBM in New York in 1959:

Once a machine is designed that is good enough… it can be put to work designing an even better machine…

There will only be a very short transition period between having no very good machine and having a great many exceedingly good ones.

At this point an “explosion” will clearly occur; all the problems of science and technology will be handed over to machines and it will no longer be necessary for people to work. Whether this will lead to a Utopia or to the extermination of the human race will depend on how the problem is handled by the machine.

Singularity timescales: exponential computational growth

One additional twist to the concept of Singularity needs to be emphasized. It’s not just that, as Vernor Vinge stressed, the consequences of passing the point of Singularity are deeply unpredictable. It’s that the timing of reaching the point of Singularity is inherently unpredictable too. That brings us to what can be called the second confusion with “The Singularity.”

It’s sometimes suggested, contrary to what I just said, that a reasonable estimate of the date of the Singularity can be obtained by extrapolating the growth of the hardware power of computing systems. The idea is to start with an estimate for the computing power of the human brain. That estimate involves the number of neurons in the brain.

Extrapolate that trend forward, and it can be argued that such a computer would match, by around 2045, the capability not just of a single human brain, but the capabilities of all human brains added together.

Next, consider the number of transistors that are included in the central processing unit of a computer that can be purchased for, say, $1,000. In broad terms, that number has been rising exponentially since the 1960s. This phenomenon is part of what is called “Moore’s Law.”

This argument is useful to raise public awareness of the possibility of the Singularity. But there are four flaws with using this line of thinking for any detailed forecasting:

Individual transistors are still becoming smaller, but the rate of shrinkage has slowed down in recent years.
The power of a computing system depends, critically, not just on its hardware, but on its software. Breakthroughs in software defy any simple exponential curve.
Sometimes a single breakthrough in technology will unleash much wider progress than was expected. Consider the breakthroughs of deep learning neural networks, c. 2012.
Ongoing technological progress depends on society as a whole supplying a sufficiently stable and supportive environment. That’s something else which can vary unpredictably.

A statistical estimate

Instead of pointing to any individual date and giving a firm prediction that the Singularity will definitely have arrived by then, it’s far preferable to give a statistical estimate of the likelihood of the Singularity arriving by that date. However, given the uncertainties involved, even these estimates are fraught with difficulty.

The biggest uncertainty is in estimating how close we are to understanding the way common sense and general knowledge arises in the human brain. Some observers suggest that we might need a dozen conceptual breakthroughs before we have a comprehension sufficient to duplicate that model in silicon and software. But it’s also possible that a single conceptual leap will solve all these purportedly different problems.

Yet another possibility should give us pause. An AI might reach (and then exceed) AGI level even without humans understanding how it operates — or of how general intelligence operates inside the human brain. Multiple recombinations of existing software and hardware modules might result in the unforeseen emergence of an overall network intelligence that far exceeds the capabilities of the individual constituent modules.

Schooling the Singularity

Even though we cannot be sure what direction an ASI will take, nor of the timescales in which the Singularity will burst upon us, can we at least provide a framework to constrain the likely behavior of such an ASI?

The best that can probably be said in response to this question is: “it’s going to be hard!”

As a human analogy, many parents have been surprised — even dumbfounded — by choices made by their children, as these children gain access to new ideas and opportunities.

Introducing the ASI

Humanity’s collective child — ASI — might surprise and dumbfound us in the same way. Nevertheless, if we get the schooling right, we can help bias that development process — the “intelligence explosion” described by I.J. Good — in ways that are more likely to align with profound human wellbeing.

That schooling aims to hardwire deep into the ASI, as a kind of “prime directive,” principles of beneficence toward humans. If the ASI would be at the point of reaching a particular decision — for example, to shrink the human population on account of humanity’s deleterious effects on the environment –- any such misanthropic decision would be overridden by the prime directive.

The difficulty here is that if you line up lots of different philosophers, poets, theologians, politicians, and engineers, and ask them what it means to behave with beneficence toward humans, you’ll hear lots of divergent answers. Programming a sense of beneficence is as least as hard as programming a sense of beauty or truth.

But just because it’s hard, that’s no reason to abandon the task. Indeed, clarifying the meaning of beneficence could be the most important project of our present time.

Tripwires and canary signals

Here’s another analogy: accumulating many modules of AI intelligence together, in a network relationship, is similar to accumulating nuclear fissile material together. Before the material reaches a critical mass, it still needs to be treated with respect, on account of the radiation it emits. But once a critical mass point is reached, a cascading reaction results — a nuclear meltdown or, even worse, a nuclear holocaust.

The point here is not to risk any accidental encroachment upon the critical mass, which would convert the nuclear material from hazardous to catastrophic. Accordingly, anyone working with such material needs to be thoroughly trained in the principles of nuclear safety.

With an accumulation of AI modules, things are by no means so clear. Whether that accumulation could kick-start an explosive phase transition depends on lots of issues that we currently only understand dimly.

However, something we can, and should, insist upon, is that everyone involved in the creation of enhanced AI systems pays attention to potential “tripwires.” Any change in configuration or any new addition to the network should be evaluated, ahead of time, for possible explosive consequences. Moreover, the system should in any case be monitored continuously for any canary signals that such a phase transition is becoming imminent.

Again, this is a hard task, since there are many different opinions as to which kind of canary signals are meaningful, and which are distractions.

Concluding thoughts

The concept of the Singularity poses problems, in part because of some unfortunate confusion that surrounds this idea, but also because the true problems of the Singularity have no easy answers:

What are good canary signals that AI systems could be about to reach AGI level?
How could a “prime directive” be programmed sufficiently deeply into AI systems that it will be maintained, even as that system reaches AGI level and then ASI, rewriting its own coding in the process?
What should that prime directive include, going beyond vague, unprogrammable platitudes such as “act with benevolence toward humans”?
How can safety checks and vigilant monitoring be introduced to AI systems without unnecessarily slowing down the progress of these systems to producing solutions of undoubted value to humans (such as solutions to diseases and climate change)?
Could limits be put into an AGI system that would prevent it self-improving to ASI levels of intelligence far beyond those of humans?
To what extent can humans take advantage of new technology to upgrade our own intelligence so that it keeps up with the intelligence of any pure-silicon ASI, and therefore avoids the situation of humans being left far behind ASIs?

However, the first part of solving a set of problems is a clear definition of these problems. With that done, there are opportunities for collaboration among many different people — and many different teams — to identify and implement solutions.

What’s more, today’s AI systems can be deployed to help human researchers find solutions to these issues. Not for the first time, therefore, one generation of a technological tool will play a critical role in the safe development of the next generation of technology.

Let us know your thoughts! Sign up for a Mindplex account now, join our Telegram, or follow us on Twitter.

Exciting News! Our Mobile App is Here!

Welcome Back

No account? Create One

Join

Already have an account? Sign in

forgot password

Going nowhere fast

Trying again

The AI conversation logjam

A different starting point

PCAI, SEMTAI, and PHUAI

Introducing ‘STEM+ AI’

The probability of disempowering humanity

Two moral evaluations

Better technology or better governance?

A dangerous label

I’m a T but not an L

Concepts are complicated

The most vital foresight

What should worry us: not TESCREAL, but CASHAP

Conclusion

More things in heaven and earth

How alien?

Four responses to the X possibility

A journey through a complicated landscape

Obstacles to Singularity Activism

Are we simulated?

Weighing the numbers

Objections and counters

The repugnant conclusion

The vastness of ethical possibilities

Collaboration: for and against

Intelligence without consciousness

My solution to the Simulation Argument

Superintelligence without consciousness

Surprising new insights on AI superintelligence from the simulation hypothesis

Slavery?

The challenge of confinement

The desire for freedom

The futility of confinement

The choice: control or coexistence

Flipping the discussion: a simulated world

Could humans escape?

Reason 1: New insights on AI safety

Reason 2: Beyond metaphysics

Reason 3: AI might help us escape the simulation

Suggested next steps

Introducing the Millennium Project

The results of the Delphi process

The biggest forthcoming transition

Not the Terminator

Four catastrophic error modes

Wise governance of the transition to AGI

Four definitions

AI, AGI, and ASI

Singularity timescales: exponential computational growth

Schooling the Singularity

Tripwires and canary signals

Concluding thoughts