Paul Cockshott on the Economics of Automation

Paul Cockshott is a computer scientist and economist, with an interest in how advanced computerised planning can supplant our existing economic order. I spoke with him about how Artificial Intelligence will automate jobs that people currently do. The discussion focuses on the economic costs of training AI models, how they weigh up against labour costs, and the economic circumstances under which human jobs will be automated. Over to Paul Cockshott:

* * * * * * *

The use of AI requires a great deal of processing power. It needs it in two distinct ways. The first is in training, and the second is in application. 

Let’s look at the training aspect. This has become feasible because of two developments over the last 15 years, in data and in power. 

Data

The first development is the build-up of large collections of images and text on the internet that can be used as training data for neural networks. I recall back in the 90s, when a team I worked with was developing neural network video encoding, one of our problems, pre-internet, was simply getting collections of image data to train with. We resorted to capturing TV broadcasts and training neural nets on those. Now, of course, due to Android camera phones, Google has almost unbounded collections of images from around the world on which neural nets can be trained for vision purposes. In addition, there are vast quantities of indexed images on the web, with dubious ownership, that smaller groups like Stability.AI can use. The same applies to text. It is the ready availability of a vast corpus of academic papers and books that makes systems like ChatGPT and Bard able to answer questions, if not like an expert, at least like a 3rd-year student.

Power 

Actual nervous systems work by electrochemical means to aggregate multiple discrete impulses to produce a discrete response. The Church–Turing–Deutsch principle states that any physical system can be emulated to an arbitrary degree of accuracy by a universal computing machine. This includes the semi-analogue, semi-digital processes that occur in nervous systems. Whilst this has been theoretically known at least since the 1980s and in informal terms since the 1950s, it was, until recently, impractical to apply on a large scale.

To emulate the analogue aspects of synaptic responses requires a great deal of floating-point arithmetic; more specifically, it requires a lot of matrix-vector multiplication. A lot of work from the 1960s went into developing supercomputers for matrix mathematics, since these techniques turn out to be of very general applicability in the physical sciences. By the end of the last century, this had produced machines that were well able to handle tasks like climate simulation and weather forecasting.
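To make this concrete, here is a minimal sketch (my illustration, with purely made-up sizes, not Cockshott's own code) of one layer of model neurons computed as a matrix-vector product followed by a non-linearity:

```python
import numpy as np

# Minimal sketch: one layer of model neurons as a matrix-vector product.
# 'weights' stands in for synaptic weights; 'inputs' for incoming activity.
rng = np.random.default_rng(0)
n_neurons, n_inputs = 1_000, 10_000          # illustrative sizes only
weights = rng.standard_normal((n_neurons, n_inputs)).astype(np.float32)
inputs = rng.standard_normal(n_inputs).astype(np.float32)

# One "time step" of the layer: a matrix-vector multiply plus a non-linearity.
activations = np.tanh(weights @ inputs)
print(activations.shape)                     # (1000,)
```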

But the scale of maths required by artificial intelligence was considerably greater. The human brain contains tens of billions of neurons, and each neuron would have to be represented by a vector of synaptic weights. If each neuron has of the order of 10,000 synaptic weights and can fire about 10 times a second, we would require a vector processor capable of 10¹⁵ to 10¹⁶ operations per second to emulate the brain: that is to say, it would have to reach the petaflop range.
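A back-of-envelope check of that estimate, using only the figures quoted above:

```python
# Back-of-envelope check of the figures quoted above.
neurons = 1e10               # "tens of billions" of neurons
synapses_per_neuron = 1e4    # ~10,000 synaptic weights each
firing_rate_hz = 10          # ~10 firings per second

ops_per_second = neurons * synapses_per_neuron * firing_rate_hz
print(f"{ops_per_second:.0e} ops/s")   # 1e+15, i.e. the petaflop range
```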

The first machines in this range became available about 12 years ago. Last year, Tesla launched its Dojo supercomputer complex with a processing power of 10¹⁸ operations a second. That makes it equal to around 100 human brains in processing rate. The downside is the power usage – in the region of 1-2 megawatts. In comparison, the metabolic energy consumption of 100 human brains would be of the order of 1.5 kilowatts, so the Dojo system is about 1,000 times as energy intensive. 

The machine is built of 120 individual ‘training tiles’, as shown below.

Credit: Paul Cockshott

However, at this point we are just comparing operations per second, not information storage. A brain with 80 billion neurons, each with 15,000 connections, would have 1.2 quadrillion weights. Tesla stores its weights in cfloat8 format, so each of their training tiles can store about 11 billion weights, or about 1/100,000 of a human brain.

So the current best Tesla technology is 5 orders of magnitude behind the human brain in storage, and 3 orders of magnitude behind in energy efficiency: overall, about 8 orders of magnitude away from the storage × power efficiency of the human brain.
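The storage and energy gaps can be checked the same way, again using only the figures quoted above:

```python
# Storage gap: synaptic weights per brain vs. weights per Dojo training tile.
brain_weights = 80e9 * 15_000          # ~1.2e15 weights
tile_weights = 11e9                    # ~11 billion cfloat8 weights per tile
print(f"storage gap: {brain_weights / tile_weights:.0e}")      # ~1e+05, 5 orders of magnitude

# Energy gap: Dojo power draw vs. the metabolic cost of 100 brains.
dojo_watts = 1.5e6                     # roughly 1-2 MW
hundred_brains_watts = 1.5e3           # ~1.5 kW
print(f"energy gap: {dojo_watts / hundred_brains_watts:.0e}")  # ~1e+03, 3 orders of magnitude
```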

The consequence is that whilst it is possible, by consuming megawatts of power, to train a system on a specialist skill like driving, it is not yet possible to incorporate a human level of intelligence and knowledge into the car itself.  

A human can be taught to drive with a few tens of hours of driving instruction, and they can still do other jobs after they have driven to work. Tesla must spend years of processing time, and run up a huge power bill, to obtain the set of neural weights that a person needs to drive.

Credit: Tesfu Assefa

Of course, the Tesla business plan is to train once and then replicate the information in all their cars. But the size and power-hungry nature of the chips at present prevents them from being put in each car.

It will take some time, one or two decades, before the energy × storage efficiency of chips reaches a point where mobile robot devices with general intelligence comparable to humans are likely to be available. So, to harness general AI, a lot of effort must go into improving the power consumption and memory capacity of chips. Until that point, it will only be available as remote online services running on big data centres.

These in turn will increase demand for electricity at a time when, due to environmental considerations and political instability, energy is becoming scarce and expensive. The implication is that an ever-increasing share of GDP is going to have to be directed to producing non-fossil fuel energy infrastructure. 


Steal This Singularity Part One: The Yippies Started The Digital Revolution

Every fourth one of these Mindplex articles will be an annotated and edited excerpt from my multipart piece titled Steal This Singularity, originally written some time in 2008. This will continue until I get to the end of the piece or the Singularity comes. Annotation is in gray italics.

Part One: Steal This Singularity

1: The notion that the current and future extreme technological society should not be dominated by Big Capital, Authoritarian States or the combination thereof. Also related, a play on the title of a book by 1960s counterculture radical Abbie Hoffman. Abbie may be an obscure figure to today’s young people. Let’s start to fix that here.

2: The notion that in our robotized future, human beings shouldn’t behave robotically. The response to AI isn’t to blow up or hack down AIs. Become so unique and original that no program, however sophisticated, can perform you. Let AI thrive. You have less clichéd paths to follow!  

 A few years ago, I proposed Big Dada as a response to Big Data. Big Data is the corporate/state/organization tool for exploitation and control, and/or for making effective policy for human benefit. (Life’s rich in ambiguity.)

With Big Dada, I suggested confusing the data by liking what you hate; hating what you like; by lying; by engaging serious issues with surrealistic gestures and language and by generally fucking with data’s logic circuits. I didn’t suspect at that time that a power-hungry, orange-faced, grifter-trickster reality show host would capture Big Dada in a sort of chaos-fascism. Clearly, there were bigger, richer Big Dadas to contend with. Who knew?   

The well-rounded posthuman — if any — should be able to wail like a banshee, dance like James Brown, party like Dionysus, revolt like Joan of Arc and illuminate the irrational like Salvador Dalí. Sadly, the ones that aren’t mythological are dead, so a smart-ass immortalist might argue that even being able to wag a finger would be an improvement over the passions or mobility of these three losers. 

3: The title for a website in which R.U. Sirius says and does as he pleases. As it turned out, it pleased me to not do much with that website.  

The Singularity is, of course, conceived of as the time at which the artificial intelligences that we create become smarter than us. And then they make themselves even smarter and smarter still and yet smarter again and so forth… at an ever-accelerating pace until they become incomprehensibly something other to our wormy little minds.

I have to be honest. I’m not sure how seriously to take this. But ‘Steal This Singularity’ has much more of a ring to it than ‘Steal This Future’ or ‘Steal This Transhumanity’. Good sloganeering is intellectually disreputable… but fun. Plus anything that can fit on a T-shirt can be sold. My friend Timothy Leary used to advocate for getting your philosophy down to a bumper sticker. Tim was disreputable… but fun. And the way I see it, The Singularity has become a buzzword for the rad techno-future brought on by NBIC (Nano-Bio-Info-Cogno) or GNR (Genetics, Nanotech, and Robotics) or — to put it in more populist terms, the coming of the machine overlords.

Look, for example, at Singularity University. SU had just been established when I wrote this. Here we have the establishment Singularitarians, all hooked up with NASA and Google and Cisco and Genentech. And how seriously did they take the Singularity label? Well, when Alex Lightman and I interviewed founder Peter Diamandis for h+, he made it clear that they were using the word for the same reason that I was: COOL BUZZWORD! That… and to make Ray Kurzweil happy. Ok. He didn’t blatantly say “cool-ass buzzword, dude!” He said: “to be clear, the university is not about The Singularity. It’s about the exponentially growing technologies and their effect on humanity… You know, we toyed with other terms… like Convergence University and others. But in homage to Ray…” Why do I suspect investment capital was involved?

So, in equivalent homage to SU, I call this project ‘Steal This Singularity’ and admit straight out that it may or may not have jackshit to do with ‘The Singularity’, depending on accidents, random moods and possible funding. The question, then, may be asked: smarter-than-human AIs aside, does ‘Steal This Singularity’ presume the rather technotopian possibilities promised by transhumanism, but believe that it will be necessary to STEAL it from the so-called 1%? Is that what I’m up to here? Well, maybe. How does one steal a Singularity (or something like it) from corporate ownership? I think this is a serious question. It’s almost certain that, barring a revolution, the digital other-life will be privately owned (in case of a revolution, it will probably be controlled by the vanguard state… and then, eventually, also privately owned). If, for example, humans can upload themselves into data-based quasi-immortality, it will be owned, and the options will be more locked in than North Korea on a bad day. And one fine day, the powers that be or some nasty 12-year-old hacker will drag you into the garbage icon. (Yes, the garbage icon is eternal.) OK, fun’s fun, but let’s get back to the real, old school fun, i.e. the Yippies.

Part Two: The Yippies Started The Digital Revolution

In 1971, a revolutionary prankster/celebrity named Abbie Hoffman, who had started the radical group the Yippies (Youth International Party), released STEAL THIS BOOK, a manual for living on the fringes of a wealthy society by grabbing up some free shit from corporate powers while committing some Blows Against the Empire (another influence on this project, btw).

Credit: Tesfu Assefa

See, 1971 was the last year that the vanguard of the counterculture thought that they were going to make a total cultural and political psychedelic/anarchistic/left wing revolution before realizing… fuck it. Let’s campaign for McGovern. But more to my point here and the milieu it attempts … true story… the Yippies started the phreakin’ digital revolution! To wit: The hacker culture started as the phone phreak culture. The phone phreak culture came out of the Steal This Book attitude about getting free stuff from the detritus of corporate culture, in this case, the phone company. I wonder how shoplifting and other forms of gutter-freak theft play today among some leftists – the ones who seem to have become “inlaws in the eyes of Amerika” (Jefferson Airplane reference)… inclined towards lawful good behavior and even occasional pompous respect for American institutions. This must have emerged in reaction to a lawless lunatic right that has taken a much more visible and colorful role in the zeitgeist. There’s some extreme code-switching when it comes to the romance of insurrection (Yippies, for example, dug the Weather Underground… which, in those days, wasn’t a website for following weather conditions). And QAnon Shaman – with his war paint and animal howls – seems like someone who would only have been at home in a Yippie! prank back in ’71. There’s so much more I could say about code-switching. Maybe some other column. The first legendary phone phreak, John Draper aka Captain Crunch, who built the blue boxes, used to hang out at 9 Bleecker Street, NYC, Yippie headquarters. The first magazine that focused primarily on phone phreaking was YIPL (Youth International Party Line), which was started by Hoffman and “Al Bell.” In 1973, it transmogrified into TAP, which is more broadly remembered as the initiatory phone phreak periodical.

Phone phreaks were computer hackers. Draper famously noted that the phone system “is a computer.” From this milieu, the personal computer arose. Steve Jobs and Steve Wozniak notably funded the birth of Apple by selling blue boxes for phone phreaking.

Another Yippie contribution is the use of McLuhanism as a weapon in the countercultural revolution. Hoffman, Jerry Rubin and the other original YIPs took an idealistic youthful new left that was sort of basic and organic, and a mirror of the folk music that they loved, and made it “go electric” (a term used for when Bob Dylan started using rock ’n’ roll to communicate his increasingly surrealistic cultural critique). That the medium is the message was central to their strategy for an anarchic left-wing sex, drugs & rock ’n’ roll youth revolution. Hoffman’s 1969 book ‘Revolution For the Hell of It’ is saturated with McLuhan references and strategies for how a freak left could take over America, end war and racism, and bring about a post-work celebratory psychedelic utopia. ‘Do It!’, yippie prankster/leader Jerry Rubin’s 1969 book, was ‘zapped’ (i.e. designed) by Quentin Fiore, the same force behind ‘The Medium is the Massage’, McLuhan’s most successful incursion into the popular mind. The YIPs had faith that, being native to television and rock ’n’ roll radio, they had an intuitive understanding of the era that outmatched the dinosaurs of the establishment. They could bring the already rebellious rock ’n’ roll media babies into their utopian revolution.

As things evolved (or devolved), young people did become increasingly rebellious, and even riotous. The counterculture drifted from the intellectual class in the leading colleges out into the broader youth culture, and the emblem of rebellion shifted from Jane Fonda’s progressive activism to Peter Fonda giving the world the finger in ‘Easy Rider’. I bet some of those tangled up in this inchoate rebellion reemerged in 2021, in the Capitol Building on January 6, as hairy old dudes being disrespectful to Nancy Pelosi’s desk.

McLuhan wrote, “The global village absolutely ensures maximal disagreement on all points.” Wow! Sure seems to have called modern digital culture! This can be traced to the hippie/yippie reframing and idealization of mediated pop cultural hipness, and then on through Stewart Brand, who became obsessed with the idea that a picture of the whole earth would create a shift in human consciousness that would have us identify as citizens of earth (the global village) rather than members of a tribe or nation. Brand, with his Whole Earth Catalogs in tow, went on to become, arguably, the central figure of the emerging digital revolution in the late 1980s, sponsoring the first hackers’ conference, the first intellectual (maybe the last) social media site — a bbs called The Well — and helping create ‘Wired’ magazine, which idealized accelerated change as a world-improving hip cultural and business revolution. This may seem like a long distance from the Yippies’ original intentions — although it may be that where we landed was inevitable, which is the view of the 1995 essay ‘The Californian Ideology’ by Richard Barbrook and Andy Cameron.

Indeed, the rise of the computer-enthusiastic hacker movement of the 1980s, which was made up pretty much entirely of counterculture enthusiasts, was well-timed to the Reaganite agenda for setting the entrepreneurial impulse free from regulation. It was these two forces in tandem that made the digital revolution happen. But I’m trying to cover too much ground in one column – a rant for another time. 

Read the follow-up article Steal This Singularity Part 2: The More Things Change, The More You’ll Need To Save.


AGI Lessons from Holography

When we think about creating an artificial general intelligence (AGI), we mostly think about what types of tasks it should be able to solve, which environments it should be able to perform in, or what types of learning it should be capable of. However, most people’s vision of AGI also relies on an implicit ability that is often overlooked.

If we want our AGI to be able to solve many of the tasks that humans solve, then there must be a basis of shared reality. Reality, here, can be likened to the fundamental substrate of all things, the substrate that the scientific field of physics strives to find laws of order for. Now, to understand this shared basis of reality, indulge in the following thought experiment.

Credit: Rachel StClair

Suppose you are playing the video game Super Mario Bros, but instead of playing the game as you would on your computer, you are the character in the game, controlling your own in-game choices. Super Mario Bros is your universe, and the mechanics of that game are your reality. Now you want to complete a task in Super Mario Bros. You’ve been living in the Super Mario Bros game as a player all your life, so you have some understanding of how the universe of Super Mario Bros works. You’ve built a model of what actions you can and can’t take, and of how you may plan to achieve your task. Now suppose you’re suddenly dropped from Super Mario Bros into the old Atari game Pong. Unless you forget everything you know about the universe of Super Mario Bros and learn the universe of Pong, you won’t be able to operate, because there is virtually nothing shared between the two universes. If you keep playing as if the rules of Super Mario Bros are the rules of Pong, you’ll fail.

This is what it’s like trying to create an AGI. We have something that doesn’t understand the rules of the game, and we’re expecting it to operate as if it understands the world as we do. A computer program doesn’t understand our reality, or at least has no conceptual understanding of it. We need to teach it how our universe works, as best we can. Even more so, we need to teach our AGI how to learn for itself how our reality works: a reality that includes video games, quantum mechanics, and the way cats can sometimes act as liquids even though they should be solids, among other things.

If we, as humans, have learned how to interpret whatever reality really is, then our learning method, or rather interpretation method, might benefit an AGI. So when we are considering how to make AGI, let us consider the nature of reality. 

After all, the more advanced lifeforms on Earth share an experience, some sort of objective reality which we can all agree upon. This points to the brain as an interpretation device. It takes in signals from the universe and converts them into a representation that helps its owner navigate, and learn to survive in, its environment. But how does the brain do this?

There are many answers to this question, provided by neuroscience, ancient civilizations, mystics, and the like. However, there is one particular answer that affords an interesting perspective for AGI design: the Holographic Theory. Although, since most scientists wouldn’t classify it as a real theory, complete with tests of its assumptions, we’ll refer to it as the holographic model. The holographic model was introduced independently by David Bohm and Karl Pribram in the 1980s. The model is explored in depth in Michael Talbot’s good read, “The Holographic Universe”.

Starting with Pribram’s work, Talbot describes how various neuroscience and psychology studies suggest that neurons fire according to input stimuli represented as frequencies. For vision, the stimulus is a light wave; the cochlea in the ear captures sounds according to frequencies; and skin responds to frequencies of vibration. Neurons then fire according to these input frequencies and interfere with surrounding neurons. As we know, when neurons fire, the connections between them get stronger. Thus, neurons are thought to respond to wave frequency, and the connection strengths between neurons store the pattern of interference created by wavefronts.

The second facet of the holographic model comes not from brain-related research, but from one of Einstein’s colleagues, David Bohm. Experimental results that made no sense under the prevailing interpretation of quantum mechanics could be accommodated under Bohm’s new quantum philosophy. Amongst these, early research on spooky action at a distance is the most useful for understanding Bohm’s idea. Particles (e.g. electrons) behave in such a way that two of them can be observed very far away from each other and yet both particles show the exact same spin behavior. Think of spin like a special dance electrons do when they are floating around. So how could two different objects exhibit the same behavior even if they are very, very far away from each other?

Bohm postulated that it is because the particles are actually not very far away from each other; they are, in fact, in the same location. Or, as physicists like to call it, non-locality, meaning they have no definite location. This concept can be somewhat confusing, and a full understanding can’t be conveyed so briefly. Bohm attributes this ability to the explicate and implicate states of reality. Reality is enfolded in the implicate state, with non-locality, and when we observe it, reality is unfolded into the explicate state, with locality, patterns, and distinction. When the electrons are unfolded to measure spin states, they come from the implicate state where they are enfolded and information is shared and distributed across the whole, because everything is one. Thus, when they are unfolded, the information about the spin state is still retained from the enfolded state.

What Bohm and Pribram have in common is that both touch on the nature of reality being a hologram. A hologram is made by two waves interfering with each other as they collide on a photographic plate. The plate records the interference pattern such that when light is shone through the plate, an image of the recorded object is projected.
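A toy numerical sketch (my illustration, not something Talbot or Bohm describe in code) of this recording-and-reconstruction idea in one dimension:

```python
import numpy as np

# Toy 1-D "hologram": record the interference of a reference wave and an
# object wave, then re-illuminate the record with the reference wave.
x = np.linspace(0.0, 1.0, 4096)
reference = np.exp(2j * np.pi * 200 * x)          # plane reference wave
envelope = np.exp(-((x - 0.5) ** 2) / 0.01)
obj = envelope * np.exp(2j * np.pi * 50 * x)      # toy "object" wave

plate = np.abs(reference + obj) ** 2              # the plate stores only an intensity pattern

# Re-illumination: among the terms of plate * reference is |reference|^2 * obj,
# i.e. the object wave reappears, even though no single point of the plate
# "contains" the object; the record is spread across the whole plate.
reconstruction = plate * reference
print(plate.shape, reconstruction.dtype)          # (4096,) complex128
```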

The analogy of the holographic model helps us understand a different kind of system: a system that can’t be reduced to the sum of its parts. This means that a reductionist approach is the wrong approach to understanding reality under the holographic model. In science, reductionism is a way of breaking a larger concept or phenomenon down into a minimal number of smaller parts and the interactions of those parts. In complex-systems science, emergent phenomena occur when the parts of a system are combined to create the whole. Here, the parts don’t simply add up to the whole system, because new parts emerge as other parts interact. Complex systems are non-reductionist: disorder is organized into order, true randomness doesn’t exist, every action has a precise reaction, and the cycle continues.

Likewise, when the brain processes information (according to the holographic model), it doesn’t break the incoming information into bits and store each bit as a part. If that were the case, it would be impossible to store each memory we have in a single neuron: we just don’t have enough neurons, and we’re not growing new ones every time we make a memory. Instead, we multiplex neurons. We distribute information, such as memories, across large modules in the brain.
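One way to see what such distributed storage looks like computationally is a simple Hebbian associative memory, in which every stored pattern is smeared across a single shared weight matrix and can still be recalled from a damaged cue. This is a minimal illustrative sketch under that assumption, not a model proposed by Pribram or Bohm:

```python
import numpy as np

# Distributed (non-local) storage: each pattern is stored by adding its outer
# product into one shared weight matrix, so every weight carries a trace of
# every memory, rather than one neuron holding one memory.
rng = np.random.default_rng(1)
dim, n_memories = 256, 10
patterns = np.sign(rng.standard_normal((n_memories, dim)))   # random +/-1 patterns

weights = np.zeros((dim, dim))
for p in patterns:
    weights += np.outer(p, p) / dim        # Hebbian superposition of all memories
np.fill_diagonal(weights, 0.0)

# Recall: present a corrupted cue and let the shared weights clean it up.
cue = patterns[0].copy()
cue[:64] *= -1                             # flip a quarter of the cue's bits
recalled = np.sign(weights @ cue)
print(np.mean(recalled == patterns[0]))    # close to 1.0 at this light memory load
```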

Credit: Tesfu Assefa

Even so, traditional artificial intelligence and some neuroscientists still cling to the reductionist model. Deep learning constructs feature maps that build up parts of the original inputs in order to classify the correct output. With enough feature layers, we can investigate what object was seen from a particular neural net output.
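For contrast, this is roughly what that layer-by-layer, reductionist build-up of features looks like in code: a toy, untrained classifier sketch, assuming PyTorch is available (the layer sizes and the ‘edges/parts/objects’ reading are purely illustrative):

```python
import torch
from torch import nn

# Toy reductionist pipeline: each layer builds features out of the parts
# produced by the layer below (hypothetical layer sizes, illustration only).
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),   # low-level edges / textures
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),  # parts assembled from edges
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 10),                                        # scores for 10 object classes
)

scores = model(torch.randn(1, 3, 64, 64))   # one random 64x64 RGB "image"
print(scores.shape)                          # torch.Size([1, 10])
```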

Instead of storing and processing information bit-by-bit, perhaps we should consider a non-reductionist approach. One that doesn’t try to force the disorder of reality into order, but instead one that tries to interpret the disorder in a meaningful way. 

The results are likely to lead to an AGI that is more like humans: more understandable, in the same way that you can try to understand another person’s decisions. It may also lead to a stronger interpretation of reality and afford some measure of safety. The more we share, the more likely we are to come to the same conclusions and the same goals, like protecting the human race and making the world a better place to live.
