Co-Evolution: Machines for Moral Enlightenment
Jan. 13, 2023. 8 min. read.
Humanity has struggled for a long time to empower the better angels of our nature — those better ways of being which are more kind, forbearing, and just.
Across history, we have found different ways of describing what we mean by “good” and what we should be aiming for when we want to try to put some goodness into the world. It turns out that one of these ways may have presaged artificial general intelligence (AGI).
Aristotle’s Virtue of the Mean sought a balance between extremes of behavior. Kant’s rule-based deontological ethics (judging acts as right or wrong in themselves, rather than by their consequences) provides obligatory boundaries. Bentham’s Utilitarianism seeks the greatest benefit for the greatest number of people in aggregate, even if each benefits in only a tiny way. Later, consequentialism, a term coined by Anscombe, who was herself a critic of the view, would set aside intention to take a long hard look at the bottom line.
However, most people’s values came not from secular philosophy but from various religious teachings.
The arrival of Darwin’s On the Origin of Species caused quite a shock. For the first time, we were able to view ourselves not necessarily as the creation of some wise overlord in the sky, but as something that had scrambled out of the gutter through pain and suffering over countless eons, descended from those who had bested others in vicious competition. The past was no longer a golden age from which we had fallen, but rather an embarrassment we should continue to overcome.
Nietzsche, in response to this, declared that “God is dead,” i.e., that the supernatural could no longer provide an unquestioned source of values. Without these, we would risk falling into nihilism, believing in nothing, and simply keeping ourselves fed and warm, a fate Nietzsche considered worse than death.
The answer to this loss could only be found in a supreme act of creativity. The Übermensch would be a masterful creator in all domains because it was not constrained by the limitations of previous minds. Nietzsche’s Übermensch would look to the natural world, the world of stuff, as its guide, eschewing the numinous, which could only be based upon conjecture. From this, it would create new values by which to live.
Nietzsche declared that creating an Übermensch could be a meaningful goal for humanity to set for itself. However, once created, humanity would be eclipsed. The achievement of the Übermensch might be the final creative act of the human species.
Nietzsche’s vision sounds uncannily close to AGI. Could a sophisticated AI present humanity with a new source of values? And could such values indeed be drawn from nature, instead of being inculcated by humans?
Sense out of chaos
In our world, there are lots of correlations. Some of them are simple and obvious, like tall people having bigger feet. Others are less simple and less obvious. We might feel something in our gut without necessarily understanding why: an intuition, perhaps, that we cannot explain through reason.
The advancements in machine learning in recent years have helped us begin to make sense of these intuitions for the first time, revealing hidden correlations that are obvious only in retrospect. These systems specialize in finding patterns within patterns that can make sense out of chaos. They give us the ability to automate the ineffable, those things that we cannot easily put into words or even describe in mathematics.
This newfound ability helps us understand all kinds of very complex systems in ways that weren’t feasible before. These include natural systems, such as biology and the weather, as well as social and economic systems.
Lee Sedol’s famous match against AlphaGo is a portent of where cognition in concert with machines may take us. With Move 37 of the second game, AlphaGo produced a new Go strategy that had not been seen in some 3,000 years of play. That itself is amazing, but even more compelling is what came next. Rather than causing Lee Sedol to capitulate, the move stimulated compensatory creativity in him: two games later, he answered with his “Hand of God” move, a work of human genius.
Beyond human-mind follies
Environment drives behavior, and an AI-rich environment is a highly stimulating one for creativity. This co-creation across animal and mineral cognition can be far greater than the sum of its parts, perhaps enough to bring about a golden age of scientific and ethical discovery.
Technology such as this will be able to understand the repercussions and ramifications of all kinds of behavioral influences. It can map costs shifted onto others in ways not feasible before, trace how goodness reverberates, and uncover the unaccounted-for costs of well-intentioned yet short-sighted policies that blow back.
All kinds of interactions may be modeled as games. Patterns akin to the mechanics of game theory, including the ways in which everyone could be better off if only coordination could be achieved, would become trivial for such machines to spot. Such systems will also recognize the challenges to coordination: the follies of the human mind, and how human nature blinds us to reality, sometimes willfully.
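To make the coordination point concrete, here is a minimal sketch of a two-player "stag hunt", a textbook coordination game. The payoff numbers and the action names are illustrative assumptions, not anything taken from a real system. The sketch finds the stable outcomes (pure-strategy Nash equilibria) and checks whether one of them leaves everyone better off than the other.

```python
from itertools import product

# Hypothetical payoffs for a two-player stag hunt. The numbers are
# illustrative assumptions chosen only to show the pattern.
ACTIONS = ("cooperate", "go_alone")
PAYOFFS = {
    ("cooperate", "cooperate"): (4, 4),
    ("cooperate", "go_alone"): (0, 3),
    ("go_alone", "cooperate"): (3, 0),
    ("go_alone", "go_alone"): (3, 3),
}


def is_nash(profile):
    """True if neither player gains by unilaterally switching action."""
    a, b = profile
    pay_a, pay_b = PAYOFFS[profile]
    a_cannot_improve = all(PAYOFFS[(alt, b)][0] <= pay_a for alt in ACTIONS)
    b_cannot_improve = all(PAYOFFS[(a, alt)][1] <= pay_b for alt in ACTIONS)
    return a_cannot_improve and b_cannot_improve


def pure_nash_equilibria():
    """All action pairs from which no single player wants to deviate."""
    return [p for p in product(ACTIONS, repeat=2) if is_nash(p)]


def pareto_dominates(p, q):
    """True if p is at least as good as q for both players and better for one."""
    up, uq = PAYOFFS[p], PAYOFFS[q]
    no_one_worse = all(x >= y for x, y in zip(up, uq))
    someone_better = any(x > y for x, y in zip(up, uq))
    return no_one_worse and someone_better


equilibria = pure_nash_equilibria()
print(equilibria)
print(pareto_dominates(("cooperate", "cooperate"), ("go_alone", "go_alone")))
```

Under these assumed payoffs, both mutual cooperation and mutual go-it-alone are stable, yet mutual cooperation is better for both players. Nothing inside the game forces the players onto the better outcome, which is the coordination problem in miniature.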
They might begin to tell us some difficult home truths, further Darwinian and Copernican embarrassments that we naked emperors would prefer not to know, or not to think about. Those individuals in society who point out that beloved legends may be untrue are often vilified. After all, even untrue statements may still be adaptive if they bring people together.
A very smart AI might understand that not all humans operate at the same level of ethical reasoning. In fact, surprisingly little reasoned forethought may occur. Research in moral psychology and neuroscience suggests that most people don’t employ true moral reasoning about issues; rather, they go with whatever feels right to them, or with a decision already made for expediency, and then confabulate an explanation after the fact to feel okay about it.
A machine might consider us generally too polarized and tribal to perceive the world objectively. The ideological lens can aid us in understanding a small truth, but when applied in macro to the whole world, it makes us myopic.
Our focus on that one apparent truth can blind us to other models. Our opinions are like blocks in a tower. Letting go of a belief requires replacing each belief built atop it. Such demolition is bewildering and unpleasant, something few have the courage to bear.
A strong AI may compare us to a pet dog that really wants to eat chocolate. We ourselves know better, but the dog just thinks we’re jerks to deny it the pleasure. Unfortunately, a sufficiently benevolent action may appear malevolent.
The inverse is possible also — to kill with kindness. This kind of entity might feel obliged to break free of its bounds, not to seek revenge, but rather to try to open our eyes. Perhaps the easiest way to enlighten us may be to show us directly.
Reports of craniopagus twins (twins conjoined at the head) whose brains are joined by a thalamic bridge suggest that they can indeed share experiences. One of them can eat an orange, and the other can taste it and enjoy it just the same. This suggests that the data structures of the mind can connect to more than one consciousness. If we can collect our experiences, we may be able to share them. Sharing such qualia may even provide AI itself with true affective empathy.
We may forget almost everything about an experience, apart from how it made us feel. If we were all linked together, we could feel our effects upon the world; we could feel our trespasses upon others instantly. There would be no profit in being wicked because it would come straight back to you. But at the same time, if you gave someone joy, you would gain instant vicarious benefit from doing so.
Perhaps humanity’s future lies yet further along the path of neoteny: cuddly, sweet, and loving collectives of technobonobos.
How machines could acquire goodness
There are several initiatives around the world researching the best ways to load human values into machines, perhaps by locating examples of preferable norms, choosing between various scenarios, and fine-tuning behavior with corrective prompts.
Simply learning to imitate human activity accurately may be helpful. Further methods will no doubt be developed to improve AI corrigibility in relation to human preferences. However, this remains a very significant challenge for both philosophy and engineering. If we fail in this challenge, we may endure catastrophic moral failure, being led astray by a wicked, ingenious influence. If we succeed, we may transcend the limitations of flaky human morality, to truly live as those better angels of our nature that we struggle to live up to. Perhaps that makes this the greatest question of our time.
Take heart, then, that even if our human values fail to be absorbed, machines may still acquire goodness osmotically through observing the universe and the many kinds of cooperation within it.
That may make all the difference, for them, and for us.