The United Nations and our uncertain future: breakdown or breakthrough?

“Humanity faces a stark and urgent choice: a breakdown or a breakthrough”

These words introduce an 85-page report issued by António Guterres, United Nations Secretary-General (UNSG).

The report had been produced in response to requests from UN member state delegations “to look ahead to the next 25 years” and review the issues and opportunities for global cooperation over that time period.

Entitled “Our Common Agenda,” the document was released on September 10, 2021.

Credit: United Nations

It featured some bold recommendations. Two examples:

Now is the time to end the “infodemic” plaguing our world by defending a common, empirically backed consensus around facts, science and knowledge. The “war on science” must end. All policy and budget decisions should be backed by science and expertise, and I am calling for a global code of conduct that promotes integrity in public information.

Now is the time to correct a glaring blind spot in how we measure economic prosperity and progress. When profits come at the expense of people and our planet, we are left with an incomplete picture of the true cost of economic growth. As currently measured, gross domestic product (GDP) fails to capture the human and environmental destruction of some business activities. I call for new measures to complement GDP, so that people can gain a full understanding of the impacts of business activities and how we can and must do better to support people and our planet.

It also called for greater attention to being prepared for hard-to-predict future developments:

We also need to be better prepared to prevent and respond to major global risks. It will be important for the United Nations to issue a Strategic Foresight and Global Risk Report on a regular basis… 

I propose a Summit of the Future to forge a new global consensus on what our future should look like, and what we can do today to secure it.

But this was only the start of a serious conversation about taking better steps in anticipation of future developments. It’s what happened next that moved the conversation significantly forward.

Introducing the Millennium Project

Too many documents in this world appear to be “write-only.” Writers spend significant time devising recommendations, documenting the context, and converting their ideas into eye-catching layouts. But then their report languishes, accumulating dust. Officials may occasionally glance at the headlines, but the remainder of the fine words in the document could be invisible for all the effect they have in the real world.

In the case of the report “Our Common Agenda,” the UNSG took action to avoid the tumbleweed scenario. His office contacted an organization called the Millennium Project to collect feedback from the international futurist community on the recommendations in the report. How did these futurists assess the various recommendations? And what advice would they give regarding practical steps forward?

The Millennium Project has a long association with the United Nations. Established in 1996 after a three-year feasibility study with the United Nations University, it has built up a global network of “nodes” connected to numerous scholars and practitioners of foresight. The Millennium Project regularly publishes its own “State of the Future” reports, which aggregate and distill input from its worldwide family of futurists.

A distinguishing feature of how the Millennium Project operates is the “Real-Time Delphi” process it uses. In a traditional questionnaire, each participant gives their own answers, along with explanations of their choices. In a Delphi survey, participants can see an anonymized version of the analysis provided by other participants, and are encouraged to take that into account in their own answers. 

So participants can reflect, not only on the questions, but also on everything written by the other respondents. Participants revisit the set of questions as many times as they like, reviewing any updates in the input provided by other respondents. And if they judge it appropriate, they can amend their own answers, again and again.
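The mechanics described above (a shared anonymized view, and freedom to revisit and amend answers) can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the Millennium Project's actual software: the `Participant` class, the numeric ratings, and the `openness` parameter are all hypothetical, introduced purely for illustration.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Participant:
    """A hypothetical panelist holding a numeric rating for one question."""
    name: str
    rating: float
    openness: float  # 0.0 = never revises, 1.0 = adopts the group median outright


def anonymized_summary(participants):
    """What every panelist can see: the group's ratings with names stripped."""
    ratings = sorted(p.rating for p in participants)
    return {"count": len(ratings), "median": median(ratings),
            "low": ratings[0], "high": ratings[-1]}


def revisit(participant, summary):
    """One visit: the panelist reviews the summary and may shift their rating."""
    gap = summary["median"] - participant.rating
    participant.rating += participant.openness * gap


def run_delphi(participants, visits=3):
    """Let each panelist revisit the question several times, in turn.

    The summary is recomputed before every single visit, so each panelist
    always sees the latest state of the panel (the "real-time" aspect).
    """
    for _ in range(visits):
        for p in participants:
            revisit(p, anonymized_summary(participants))
    return anonymized_summary(participants)


# Illustrative panel: made-up names, ratings, and openness values.
panel = [
    Participant("A", 2.0, openness=0.5),
    Participant("B", 9.0, openness=0.2),
    Participant("C", 6.0, openness=0.0),
]
before = anonymized_summary(panel)
after = run_delphi(panel)
print("spread before:", before["high"] - before["low"])
print("spread after: ", after["high"] - after["low"])
```

A numeric model like this captures only the narrowing of ratings. It does not capture the written explanations that participants exchange, which are where new themes and perspectives enter the process.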

The magic of this process — in which I have personally participated on several occasions — is the way new themes, introduced by diverse participants, prompt individuals to consider matters from multiple perspectives. Rather than some mediocre “lowest common denominator” compromise, a novel synthesis of divergent viewpoints can emerge.

So the Millennium Project was well placed to respond to the challenge issued by the UNSG’s office. And in May last year, an email popped into my inbox. It was an invitation to me to take part in a Real-Time Delphi survey on key elements of the UNSG’s “Our Common Agenda” report.

In turn, I forwarded the invitation in a newsletter to members of the London Futurists community that I chair. I added a few of my own comments:

The UNSG report leaves me with mixed feelings.

Some parts are bland, and could be considered “virtue signaling.”

But other parts offer genuinely interesting suggestions…

Other parts are significant by what is not said, alas.

Not everyone who started answering the questionnaire finished the process. It required significant thought and attention. But by the time the Delphi closed, it contained substantive answers from 189 futurists and related experts from a total of 54 countries.

The results of the Delphi process

Researchers at the Millennium Project, led by the organization’s long-time Executive Director, Jerome Glenn, transformed the results of the Delphi process into a 38-page analysis (PDF). The process had led to two types of conclusion.

The first conclusions were direct responses to specific proposals in the “Our Common Agenda” report. The second referred to matters that were given little or no attention in the report, but which deserve to be prioritized more highly.

In the first category, responses referred to five features of the UNSG proposal:

    1. A Summit of the Future
    2. Repurposing the UN Trusteeship Council as a Multi-Stakeholder Body
    3. Establishing a UN Futures Lab
    4. Issuing Strategic Foresight and Global Risk Reports on a regular basis
    5. Creating a Special Envoy for Future Generations

Overall, the Delphi gave a strong positive assessment of these five proposals. Here’s a quote from Jerome Glenn:

If the five foresight elements in Our Common Agenda are implemented along the lines of our study, it could be the greatest advance for futures research and foresight in history… This is the best opportunity to get global foresight structural change into the UN system that has ever been proposed.

Of the five proposals, the UN Futures Lab was identified as being most critical:

The UN Futures Lab was rated the most critical element among the five for improving global foresight by over half of the Real-Time Delphi panel. It is critical, urgent, and essential to do it as soon as possible. It is critical for all the other foresight elements in Our Common Agenda.

The Lab should function across all UN agencies and integrate all UN data and intelligence, creating a global collective intelligence system. This would create an official space for systemic and systematic global futures research. It could become the foresight brain of humanity.

The Delphi also deemed the proposal on strategic foresight reports particularly important:

Strategic Foresight and Global Risk Reports were seen as very critical for improving global foresight by nearly 40% of the panel. This is exactly the kind of report that the United Nations should give the world. Along with its own analysis, these reports should provide an analysis and synthesis of all the other major foresight and risk reports, provide roadmaps for global strategies, and give equal weight to risks and opportunities.

The reports should be issued every one or two years due to accelerating change, and the need to keep people involved with these issues. It should include a chapter on actions taken since the last report, with examples of risk mitigation, management, and what persists…

They should bring attention to threats that are often ignored with cost estimates for prevention vs. recovery (Bill Gates estimates it will cost $1 billion to address the next pandemic compared to the $15 trillion spent on Covid so far). It should identify time-sensitive information required to make more intelligent decisions.

In a way, it’s not a big concern that key futurist topics are omitted from Our Common Agenda. If the mechanisms described above are put in place, any significant omissions that are known to foresight practitioners around the world will quickly be fed into the revitalized process, and brought to wider attention.

But, for the record, it’s important to note some key risks and opportunities that are missing from the UNSG’s report.

The biggest forthcoming transition

Jerome Glenn puts it well:

If we don’t get the initial conditions right for artificial general intelligence, an artificial superintelligence could emerge from beyond our control and understanding, and not to our liking.

If AGI could occur in ten to 20 years and if it will take that same amount of time to create international agreements about the right initial conditions, design a global governance system, and implement it, we should begin now.

Glenn goes on to point out the essential international nature of this conundrum:

If both the US and China do everything right, but others do not, the fears of science fiction could occur. It has to be a global agreement enforced globally. Only the UN can do that.

Here’s the problem: Each new generation of AI makes it easier for more people to become involved in developing the next generation of AI. Systems are often built by bolting together previous components, and tweaking the connections that govern the flow of information and command. If an initial attempt doesn’t work, engineers may reverse some connections and try again. Or add in some delay, or some data transformation, in between two ports. Or double the amount of processing power available. And so on. (I’m over-simplifying, of course. In reality, the sorts of experimental changes made are more complex than what I’ve just said.)

This kind of innovation by repeated experimentation has frequently produced outstanding results in the history of technology. Creative engineers were often disappointed by their initial results until, almost by chance, they stumbled on a configuration that worked. Bingo: a new level of intelligence is created. And the engineers, previously suspected of being second-rate, are now lauded as visionary giants.

Silicon Valley has a name for this: Move fast and break things.

The idea: if a company is being too careful, it will fail to come up with new breakthrough combinations as quickly as its competitors. So it will go out of business.

That phrase was the informal mission statement for many years at Facebook.

Consider also the advice you’ll often hear about the importance of “failing forward” and “how to fail faster.”

In this view, failures aren’t a problem, provided we pick ourselves up quickly, learn from the experience, and can proceed more wisely to the next attempt.

But wait: what if the failure is a problem? What if a combination of new technology turns out to have cataclysmic consequences? That risk is posed by several leading edge technologies today:

    • Manipulation of viruses, to explore options for creating vaccines to counter new viruses — but what if a deadly new synthetic virus were to leak out of supposedly secure laboratory confinement?
    • Manipulation of the stratosphere, to reflect back a larger portion of incoming sunlight — but what if such an intervention has unexpected side effects, such as triggering huge droughts and/or floods?
    • Manipulation of algorithms, to increase their ability to influence human behavior (as consumers, electors, or whatever), but what if a new algorithm were so powerful that it inadvertently shattered important social communities?

That takes us back to the message at the start of this article, by UNSG António Guterres: What if an attempt to achieve a decisive breakthrough results, instead, in a terrible breakdown?

Not the Terminator

The idea of powerful algorithms going awry is often dismissed with a wave of the hand: “This is just science fiction.”

But Jerome Glenn is correct in his statement (quoted earlier): “The fears of science fiction could occur.”

After all, HG Wells published a science-fiction story in 1914 entitled The World Set Free that featured what he called “atomic bombs” that derived their explosive power from nuclear fission. In his novel, atomic bombs destroyed the majority of the world’s cities in a global war (set in 1958).

Merely because something is predicted in science fiction, that’s no reason to reject the possibility of something like it happening in reality.

The “atomic bombs” foreseen by HG Wells, unsurprisingly, differed in several ways from the real-world atomic bombs developed by the Manhattan Project and subsequent research programs. In the same way, the threats from misconfigured powerful AI are generally different from those portrayed in science fiction.

For example, in the Hollywood Terminator movie series, humans are able, via what can be called superhuman effort, to thwart the intentions of the malign “Skynet” artificial intelligence system.

It’s gripping entertainment. But the narrative in these movies distorts credible scenarios of the dangers posed by AGI. We need to avoid being misled by such narratives.

First, there’s an implication in The Terminator, and in many other works of science fiction, that the danger point for humanity is when AI systems somehow “wake up,” or become conscious. If true, a sufficient safety measure would be to avoid any such artificial consciousness.

However, a cruise missile that is hunting us down does not depend for its deadliness on any cruise-missile consciousness. A social media algorithm that is whipping up hatred against specific ethnic minorities isn’t consciously evil. The damage results from the cold operation of algorithms. There’s no need to involve consciousness.

Second, there’s an implication that AI needs to be deliberately malicious before it can cause damage to humans. However, damage to human wellbeing can, just as likely, arise from side effects of policies that have no malicious intent.

When we humans destroy ant colonies in our process of constructing a new shopping mall, we’re not acting out of deliberate malice toward ants. It’s just that the ants are in our way. They are using resources for which we have a different purpose in mind. It could well be the same with an AGI that is pursuing its own objectives.

Consider a corporation that is vigorously pursuing an objective of raising its own profits. It may well take actions that damage the wellbeing of at least some humans, or parts of the environment. These outcomes are side effects of the prime profit-generation directive that is governing the corporation. They’re not outcomes that the corporation consciously desires. It could well be the same with a badly designed AGI.

Third, the scenario in The Terminator leaves viewers with a false hope that, with sufficient effort, a group of human resistance fighters will be able to out-maneuver an AGI. That would be like a group of chimpanzees imagining that, with enough effort, they could displace humans as the dominant species on planet Earth. In real life, bullets shot by a terminator robot would never miss. Resistance would indeed be futile.

Instead, the time to fight against the damage an AGI could cause is before the AGI is created, not when it already exists and is effectively all-powerful. That’s why any analysis of future global developments needs to place the AGI risk front and center.

Four catastrophic error modes

The real risk — as opposed to “the Hollywood risk” — is that an AI system may acquire so much influence over human society and our surrounding environment that a mistake in that system could cataclysmically reduce human wellbeing all over the world. Billions of lives could be extinguished, or turned into a very pale reflection of their present state.

Such an outcome could arise in any of four ways – four catastrophic error modes. In brief, these are:

    1. Implementation defect
    2. Design defect
    3. Design overridden
    4. Implementation overridden

Credit: David Wood and Pixabay

In more detail:

  1. The system contains a defect in its implementation. It takes an action that it calculates will have one outcome, but unexpectedly, it has another outcome instead. For example, a geoengineering intervention could trigger an unforeseen change in the global climate, plunging the Earth into a state in which humans cannot survive.

  2. The system contains a defect in its design. It takes actions to advance the goals it has explicitly been given, but does so in a way that catastrophically reduces actual human wellbeing. Imagine, for example, an AI with a clumsily specified goal to focus on preserving the diversity of the Earth’s biosystem. That goal could be met by eliminating upward of 99% of all humans. Oops!

  3. The system has been given goals that are well aligned with human wellbeing, but as the system evolves, a different set of goals emerges, in which the wellbeing of humans is deprioritized. This is similar to how the emergence of higher thinking capabilities in human primates led to many humans taking actions in opposition to the gene-spreading instincts placed into our biology by evolution.

  4. The system has been given goals that are well aligned with human wellbeing, but the system is reconfigured by hackers of one sort or another, perhaps from malevolence, or perhaps from a misguided sense that various changes would make the system more powerful (and hence more valuable).

Some writers suggest that it will be relatively easy to avoid these four catastrophic error modes. I disagree. I consider that to be wishful thinking. Such thinking is especially dangerous, since it leads to complacency.

Wise governance of the transition to AGI

Various “easy fixes” may work against one or two of the above catastrophic error modes, but solutions that will cope with all four of them require a much more comprehensive approach: something I call “the whole-system perspective.”

That whole-system perspective includes promoting characteristics such as transparency, resilience, verification, vigilance, agility, accountability, consensus, and diversity. It champions the discipline of proactive risk management.

It comes down to a question of how technology is harnessed — how it is selectively steered, slowed down, and (yes) on occasion, given full throttle.

Taking back control

In summary, it’s necessary to “take back control of technology.” Technology cannot be left to its own devices. Nor to the sometimes headlong rushes of technology corporations. Nor to the decisions of military planners. And definitely not to the whims of out-of-touch politicians.

Instead, we — we the people — need to “take back control.”

Credit: Tesfu Assefa

There’s a huge potential role for the United Nations in wise governance of the transition to AGI. The UN can help to broker and deepen human links at all levels of society, despite the sharp differences between the various national governments in the world. An understanding of “our common agenda” regarding fast-changing technologies (such as AI) can transcend the differences in our ideologies, politics, religions, and cultures. 

I particularly look forward to a worldwide shared appreciation of:

    • The risks of catastrophe from mismanaged technological change
    • The profound positive possibilities if these technologies are wisely managed.

The Singularity: Untangling the Confusion

“The Singularity” — the anticipated creation of Artificial General Intelligence (AGI) — could be the most important concept in the history of humanity. It’s unfortunate, therefore, that the concept is subject to considerable confusion.

Four different ideas interwoven, all accelerating in the same direction toward an unclear outcome (Credit: David Wood)

The first confusion with “the Singularity” is that the phrase is used in several different ways. As a result, it’s easy to become distracted.

Four definitions

For example, consider Singularity University (SingU), which has been offering courses since 2008 with themes such as “Harness the power of exponential technology” and “Leverage exponential technologies to solve global grand challenges.”

For SingU, “Singularity” is basically synonymous with the rapid disruption caused when a new technology, such as digital photography, becomes more useful than previous solutions, such as analog photography. What makes these disruptions hard to anticipate is the exponential growth in the capabilities of the technologies involved.

A period of slow growth, in which progress lags behind the expectations of enthusiasts, transforms into a period of fast growth, in which most observers complain, “Why did no one warn us this was coming?”

Human life “irreversibly transformed”

A second usage of the term “the Singularity” moves beyond talk of individual disruptions — singularities in particular areas of life. Instead, it anticipates a disruption in all aspects of human life. Here’s how futurist Ray Kurzweil introduces the term in his 2005 book The Singularity Is Near:

What, then, is the Singularity? It’s a future period during which the pace of technological change will be so rapid, its impact so deep, that human life will be irreversibly transformed… This epoch will transform the concepts that we rely on to give meaning to our lives, from our business models to the cycle of human life, including death itself…

The key idea underlying the impending Singularity is that the pace of change of our human-created technology is accelerating and its powers are expanding at an exponential pace.

The nature of that “irreversible transformation” is clarified in the subtitle of the book: When Humans Transcend Biology. We humans will no longer be primarily biological, aided by technology. After that singularity, we’ll be primarily technological, with, perhaps, some biological aspects.

Superintelligent AIs

A third usage of “the Singularity” foresees a different kind of transformation. Rather than humans being the most intelligent creatures on the planet, we’ll fall into second place behind superintelligent AIs. Just as the fate of species such as gorillas and dolphins currently depends on actions by humans, the fate of humans, after the Singularity, will depend on actions by AIs.

Such a takeover was foreseen as long ago as 1951 by pioneering computer scientist Alan Turing:

My contention is that machines can be constructed which will simulate the behaviour of the human mind very closely…

It seems probable that once the machine thinking method had started, it would not take long to outstrip our feeble powers. There would be no question of the machines dying, and they would be able to converse with each other to sharpen their wits. At some stage therefore we should have to expect the machines to take control.

Finally, consider what was on the mind of Vernor Vinge, a professor of computer science and mathematics, and also the author of a series of well-regarded science fiction novels, when he introduced the term “Singularity” in an essay in Omni in 1983. Vinge was worried about the unforeseeability of future events:

There is a stone wall set across any clear view of our future, and it’s not very far down the road. Something drastic happens to a species when it reaches our stage of evolutionary development — at least, that’s one explanation for why the universe seems so empty of other intelligence. Physical catastrophe (nuclear war, biological pestilence, Malthusian doom) could account for this emptiness, but nothing makes the future of any species so unknowable as technical progress itself…

We are at the point of accelerating the evolution of intelligence itself. The exact means of accomplishing this phenomenon cannot yet be predicted — and is not important. Whether our work is cast in silicon or DNA will have little effect on the ultimate results. The evolution of human intelligence took millions of years. We will devise an equivalent advance in a fraction of that time. We will soon create intelligences greater than our own.

A Singularity that “passes far beyond our understanding”

This is when Vinge introduces his version of the concept of singularity:

When this happens, human history will have reached a kind of singularity, an intellectual transition as impenetrable as the knotted space-time at the centre of a black hole, and the world will pass far beyond our understanding. This singularity, I believe, already haunts a number of science fiction writers. It makes realistic extrapolation to an interstellar future impossible.

If creatures (whether organic or inorganic) attain levels of general intelligence far in excess of present-day humans, what kinds of goals and purposes will occupy these vast brains? It’s unlikely that their motivations will be just the same as our own present goals and purposes. Instead, the immense scale of these new minds will likely prove alien to our comprehension. They might appear as unfathomable to us as human preoccupations appear to the dogs and cats and other animals that observe us from time to time.

Credit: David Wood

AI, AGI, and ASI

Before going further, let’s quickly contrast today’s AI with the envisioned future superintelligence.

Existing AI systems typically have powerful capabilities in narrow contexts, such as route-planning, processing mortgage and loan applications, predicting properties of molecules, playing various games of skill, buying and selling shares, recognizing images, and translating speech.

But in all these cases, the AIs involved have incomplete knowledge of the full complexity of how humans interact in the real world. The AI can fail when the real world introduces factors or situations that were not part of the data set of examples with which the AI was trained.

In contrast, humans in the same circumstance would be able to rely on capacities such as “common sense,” “general knowledge,” and intuition or “gut feel” to reach a better decision.

An AI with general intelligence

However, a future AGI — an AI with general intelligence — would have as much common sense, intuition, and general knowledge as any human. An AGI would be at least as good as humans at reacting to unexpected developments. That AGI would be able to pursue pre-specified goals as competently as (but much more efficiently than) a human, even in the kind of complex environments which would cause today’s AIs to stumble.

Whatever goal is input to an AGI, it is likely to reason to itself that it will be more likely to achieve that goal if it has more resources at its disposal and if its own thinking capabilities are further improved. What happens next may well be as described by I.J. Good, a long-time colleague of Alan Turing:

Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an “intelligence explosion,” and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control.

Evolving into artificial superintelligence

In other words, not long after humans manage to create an AGI, the AGI is likely to evolve itself into an ASI – an artificial superintelligence that far exceeds human powers.

In case the idea of an AI redesigning itself without any human involvement seems far-fetched, consider a slightly different possibility: Humans will still be part of that design process, at least in the initial phases. That’s already the case today, when humans use one generation of AI tools to help design a new generation of improved AI tools before going on to repeat the process.

I.J. Good foresaw that too. This is from a lecture he gave at IBM in New York in 1959:

Once a machine is designed that is good enough… it can be put to work designing an even better machine…

There will only be a very short transition period between having no very good machine and having a great many exceedingly good ones.

At this point an “explosion” will clearly occur; all the problems of science and technology will be handed over to machines and it will no longer be necessary for people to work. Whether this will lead to a Utopia or to the extermination of the human race will depend on how the problem is handled by the machine.

Singularity timescales: exponential computational growth

One additional twist to the concept of Singularity needs to be emphasized. It’s not just that, as Vernor Vinge stressed, the consequences of passing the point of Singularity are deeply unpredictable. It’s that the timing of reaching the point of Singularity is inherently unpredictable too. That brings us to what can be called the second confusion with “The Singularity.”

It’s sometimes suggested, contrary to what I just said, that a reasonable estimate of the date of the Singularity can be obtained by extrapolating the growth of the hardware power of computing systems. The idea is to start with an estimate for the computing power of the human brain. That estimate involves the number of neurons in the brain. 

Next, consider the number of transistors that are included in the central processing unit of a computer that can be purchased for, say, $1,000. In broad terms, that number has been rising exponentially since the 1960s. This phenomenon is part of what is called “Moore’s Law.”

Extrapolate that trend forward, and it can be argued that such a computer would match, by around 2045, the capability not just of a single human brain, but of all human brains added together.

This argument is useful to raise public awareness of the possibility of the Singularity. But there are four flaws with using this line of thinking for any detailed forecasting:

  1. Individual transistors are still becoming smaller, but the rate of shrinkage has slowed down in recent years.
  2. The power of a computing system depends, critically, not just on its hardware, but on its software. Breakthroughs in software defy any simple exponential curve.
  3. Sometimes a single breakthrough in technology will unleash much wider progress than was expected. Consider the breakthroughs of deep learning neural networks, c. 2012.
  4. Ongoing technological progress depends on society as a whole supplying a sufficiently stable and supportive environment. That’s something else which can vary unpredictably.
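The fragility of the extrapolation argument can be seen in a minimal sketch. It assumes a purely exponential trend and asks when a target capacity would be reached. Every number below is a placeholder chosen for illustration (the start year, the arbitrary capacity units, the million-fold target, and the candidate doubling times); none comes from measurement, and the function name `crossover_year` is hypothetical.

```python
import math


def crossover_year(start_year, start_capacity, target_capacity, doubling_years):
    """Year in which capacity, doubling every `doubling_years`, reaches the target."""
    doublings_needed = math.log2(target_capacity / start_capacity)
    return start_year + doublings_needed * doubling_years


# All inputs are placeholders for illustration, not measurements.
START_YEAR = 2020
START_CAPACITY = 1.0          # capacity of a $1,000 computer, arbitrary units
TARGET_CAPACITY = 2.0 ** 20   # hypothetical target: about a million times more

for doubling_years in (1.5, 2.0, 3.0):
    year = crossover_year(START_YEAR, START_CAPACITY, TARGET_CAPACITY, doubling_years)
    print(f"doubling every {doubling_years} years -> target reached around {year:.0f}")
```

With these placeholder inputs, stretching the doubling time from 1.5 to 3 years moves the projected date from around 2050 to around 2080. A slowdown of the kind noted in the first flaw therefore shifts the forecast by decades, and the sketch says nothing at all about software, unexpected breakthroughs, or social stability.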

A statistical estimate

Instead of pointing to any individual date and giving a firm prediction that the Singularity will definitely have arrived by then, it’s far preferable to give a statistical estimate of the likelihood of the Singularity arriving by that date. However, given the uncertainties involved, even these estimates are fraught with difficulty.

The biggest uncertainty is in estimating how close we are to understanding the way common sense and general knowledge arise in the human brain. Some observers suggest that we might need a dozen conceptual breakthroughs before we have a comprehension sufficient to duplicate that model in silicon and software. But it’s also possible that a single conceptual leap will solve all these purportedly different problems.

Yet another possibility should give us pause. An AI might reach (and then exceed) AGI level even without humans understanding how it operates, or how general intelligence operates inside the human brain. Multiple recombinations of existing software and hardware modules might result in the unforeseen emergence of an overall network intelligence that far exceeds the capabilities of the individual constituent modules.
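To make the idea of a statistical estimate concrete, here is a toy simulation of how a likelihood by a given date could be computed, instead of a single firm prediction. Every number in it is an assumption made up for illustration: the number of breakthroughs needed is drawn uniformly between one and twelve (echoing the "single leap" and "a dozen" cases above), and the waiting time for each breakthrough is drawn from an exponential distribution with a hypothetical mean of five years. The output is not a forecast.

```python
import random


def probability_by(years_ahead: float,
                   trials: int = 20_000,
                   seed: int = 0,
                   max_breakthroughs: int = 12,
                   mean_years_per_breakthrough: float = 5.0) -> float:
    """Fraction of simulated futures in which all needed breakthroughs
    have occurred within `years_ahead` years (toy model, assumed inputs)."""
    rng = random.Random(seed)  # fixed seed so results are repeatable
    hits = 0
    for _ in range(trials):
        # Assumption: anywhere from 1 to `max_breakthroughs` leaps are needed.
        needed = rng.randint(1, max_breakthroughs)
        # Assumption: each leap takes an exponentially distributed time.
        total_years = sum(
            rng.expovariate(1.0 / mean_years_per_breakthrough)
            for _ in range(needed)
        )
        if total_years <= years_ahead:
            hits += 1
    return hits / trials


for horizon in (10, 25, 50):
    print(f"within {horizon} years: {probability_by(horizon):.0%}")
```

The useful feature of this form of answer is that it reports a range of likelihoods across dates. Changing any of the assumed inputs changes the numbers, which is exactly the difficulty described above.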

Schooling the Singularity

Even though we cannot be sure what direction an ASI will take, nor of the timescales in which the Singularity will burst upon us, can we at least provide a framework to constrain the likely behavior of such an ASI?

The best that can probably be said in response to this question is: “it’s going to be hard!”

As a human analogy, many parents have been surprised — even dumbfounded — by choices made by their children, as these children gain access to new ideas and opportunities.

Humanity’s collective child, ASI, might surprise and dumbfound us in the same way. Nevertheless, if we get the schooling right, we can help bias that development process (the “intelligence explosion” described by I.J. Good) in ways that are more likely to align with profound human wellbeing.

That schooling aims to hardwire deep into the ASI, as a kind of “prime directive,” principles of beneficence toward humans. If the ASI were on the point of reaching a misanthropic decision (for example, to shrink the human population on account of humanity’s deleterious effects on the environment), that decision would be overridden by the prime directive.

The difficulty here is that if you line up lots of different philosophers, poets, theologians, politicians, and engineers, and ask them what it means to behave with beneficence toward humans, you’ll hear lots of divergent answers. Programming a sense of beneficence is at least as hard as programming a sense of beauty or truth.

But just because it’s hard, that’s no reason to abandon the task. Indeed, clarifying the meaning of beneficence could be the most important project of our present time.

Tripwires and canary signals

Here’s another analogy: accumulating many AI modules together, in a network relationship, is similar to accumulating nuclear fissile material together. Before the material reaches a critical mass, it still needs to be treated with respect, on account of the radiation it emits. But once a critical mass point is reached, a cascading reaction results: a nuclear meltdown or, even worse, a nuclear holocaust.

The point here is not to risk any accidental encroachment upon the critical mass, which would convert the nuclear material from hazardous to catastrophic. Accordingly, anyone working with such material needs to be thoroughly trained in the principles of nuclear safety.

With an accumulation of AI modules, things are by no means so clear. Whether that accumulation could kick-start an explosive phase transition depends on lots of issues that we currently only understand dimly.

However, something we can, and should, insist upon is that everyone involved in the creation of enhanced AI systems pays attention to potential “tripwires.” Any change in configuration or any new addition to the network should be evaluated, ahead of time, for possible explosive consequences. Moreover, the system should in any case be monitored continuously for any canary signals that such a phase transition is becoming imminent.

Again, this is a hard task, since there are many different opinions as to which kinds of canary signals are meaningful, and which are distractions.
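The two practices described above, evaluating each change ahead of time and monitoring continuously for canary signals, can be sketched in a few lines. This is only an illustration under stated assumptions: the "capability score" is a hypothetical metric, the window and threshold values are arbitrary, and the names `passes_tripwires` and `CanaryMonitor` are invented for this sketch. Deciding what to measure is the hard part, and the code does not address it.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Iterable


def passes_tripwires(change: dict,
                     checks: Iterable[Callable[[dict], bool]]) -> bool:
    """Ahead-of-time review: a proposed change goes ahead only if
    every (hypothetical) safety check approves it."""
    return all(check(change) for check in checks)


@dataclass
class CanaryMonitor:
    """Continuous monitoring: flag a sudden jump in a hypothetical
    capability score relative to its recent average."""
    window: int = 5          # assumed number of recent readings to keep
    max_ratio: float = 1.5   # assumed jump that counts as a warning
    history: deque = field(default_factory=deque)

    def observe(self, score: float) -> bool:
        """Record a reading; return True if the canary should sound."""
        self.history.append(score)
        if len(self.history) > self.window:
            self.history.popleft()
        if len(self.history) < self.window:
            return False  # not enough data yet to judge
        earlier = list(self.history)[:-1]
        baseline = sum(earlier) / len(earlier)
        return baseline > 0 and score / baseline > self.max_ratio


monitor = CanaryMonitor(window=3, max_ratio=1.5)
for reading in (1.0, 1.1, 1.2, 1.3, 4.0):
    print(reading, monitor.observe(reading))
```

In this toy run, only the final reading, a sharp jump over the recent average, triggers the warning. A real system would need agreed signals and thresholds, which is the open question noted above.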

Credit: Tesfu Assefa

Concluding thoughts

The concept of the Singularity poses problems, in part because of some unfortunate confusion that surrounds this idea, but also because the true problems of the Singularity have no easy answers:

  1. What are good canary signals that AI systems could be about to reach AGI level?
  2. How could a “prime directive” be programmed sufficiently deeply into an AI system that it will be maintained, even as that system reaches AGI level and then ASI level, rewriting its own code in the process?
  3. What should that prime directive include, going beyond vague, unprogrammable platitudes such as “act with benevolence toward humans”?
  4. How can safety checks and vigilant monitoring be introduced to AI systems without unnecessarily slowing down the progress of these systems toward producing solutions of undoubted value to humans (such as solutions to diseases and climate change)?
  5. Could limits be put into an AGI system that would prevent it from self-improving to ASI levels of intelligence far beyond those of humans?
  6. To what extent can humans take advantage of new technology to upgrade our own intelligence so that it keeps up with the intelligence of any pure-silicon ASI, and therefore avoids the situation of humans being left far behind ASIs?

Credit: David Wood, Pixabay

However, the first part of solving a set of problems is a clear definition of these problems. With that done, there are opportunities for collaboration among many different people — and many different teams — to identify and implement solutions.

What’s more, today’s AI systems can be deployed to help human researchers find solutions to these issues. Not for the first time, therefore, one generation of a technological tool will play a critical role in the safe development of the next generation of technology.