Let us know your thoughts! Sign up for a Mindplex account now, join our Telegram, or follow us on Twitter.
The History of BGAN Punks and a discussion with their Creator
“REKT DEKT DEKT, GELD I’VE BEEN WALKING HOME IN SONA, I’M NOT GOING HOME I’VE BEEN WALKING HOME, THERE’S NO MONEY.”
Such are the indelible lyrics of Mark Cuban’s Bastard. More specifically, one of his 12 BGANs (short for Bastard GAN Punks). Created by Berk Özdemir, the V1 and V2 collections are renowned for being fashioned by a specialized branch of AI known as Generative Adversarial Networks (hence the acronym “BGAN”).
With notable holders like Cobie and Punk 6529 (not to mention recently gaining a place in FloorDAO’s Treasury), BGANs are a homage to – and evolution of – the original CryptoPunks by Larva Labs. Overflowing with a layered creative ecosystem that predates Art Blocks, some consider them to be true digital art.
The V1 collection of 76 NFTs was released in 2020, followed by the much larger V2 BGANs (11,305 in total), minted on March 7, 2021. This was one month earlier than the much talked-about Bored Ape Yacht Club art NFTs. Both series include a range of genderless, carefully curated traits (unlike CryptoPunks). And, as mentioned, the core stylistic influence and aesthetics are determined entirely by AI.
What are Generative Adversarial Networks (GAN)?
A generative adversarial network (GAN) is a machine-learning model introduced by Ian Goodfellow and colleagues in 2014, in which two neural networks compete: a generator produces candidate data, and a discriminator tries to tell that output apart from real examples. GANs are used to generate new, synthetic data across many sectors. Ian Goodfellow’s appearance on the Lex Fridman Podcast explains more.
Use cases are also found in contemporary art: content such as images, audio, or video can be given to the networks to learn from, with the generator improving as it tries to fool the discriminator. After training, new files are generated that contain the AI’s interpretation of the original dataset and what it learned.
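To make the idea of two competing networks more concrete, here is a minimal sketch in plain Python of the two loss terms used in the standard GAN setup. Nothing is trained here: the probability values are made-up discriminator outputs, and the function names are illustrative rather than taken from any library or from Berk's code.

```python
import math

def discriminator_loss(real_probs, fake_probs):
    """Binary cross-entropy: D is rewarded for scoring real samples near 1 and fakes near 0."""
    real_term = sum(-math.log(p) for p in real_probs) / len(real_probs)
    fake_term = sum(-math.log(1.0 - p) for p in fake_probs) / len(fake_probs)
    return real_term + fake_term

def generator_loss(fake_probs):
    """Non-saturating generator loss: G is rewarded when D scores its fakes near 1."""
    return sum(-math.log(p) for p in fake_probs) / len(fake_probs)

# Hypothetical discriminator outputs (probability that a sample is real).
early = {"real": [0.9, 0.8], "fake": [0.1, 0.2]}   # D easily spots the fakes
later = {"real": [0.6, 0.5], "fake": [0.5, 0.4]}   # the fakes have become convincing

for name, scores in (("early", early), ("later", later)):
    d = discriminator_loss(scores["real"], scores["fake"])
    g = generator_loss(scores["fake"])
    print(f"{name}: discriminator loss={d:.3f}, generator loss={g:.3f}")
```

As the generator's output gets harder to tell apart from real data, its loss falls while the discriminator's rises, which is the tug-of-war the two networks are locked into during training.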
Berk’s integration of GAN technology was a case of creative experimentation while undergoing a Master’s programme at ArtScience Interfaculty.
Unable to afford an original CryptoPunk, Berk decided to make “interesting enough” remixes of them. BGANs remix “genes” representing a variety of traits from several original CryptoPunks, so there is no direct equivalent or correlation between them and the original NFTs by Larva Labs. Nobody knows which specific Punks were used to create the new interpretations by Berk, thus the origin of the name and concept “bastards.”
Mason Marcobello talks to the creator of BGAN Punks, a collection of algorithmically-generated NFT artworks, about art, technology, and creation.
When detailing his creative process, Berk states:
“The first iteration of Bastard Gan Punks (v1) were trained on 10,000 original Cryptopunks and gave birth to many weirdo adorable motherfuckin’ bastards. The process worked very well and I was so happy with getting weird interesting results. However, since some traits of Cryptopunks exist on just a few bastards (like Alien, Ape, Zombie, or Beanie etc), most of the rare traits were lost during the model training.
For BGANPUNKS V2, I chopped all Cryptopunks’ attributes, made cross-gender copies of ones made for only one gender (BGANPUNKs are queer!) categorized them in different folders (Hairs/Heads, Eyes, Glasses, Smokes etc.), and wrote a script to make all possible combinations of those traits. I also added skin and accessory color randomization + a couple of custom traits I’ve drawn myself.
To be honest, it was a painful and shitty process because I had to do all the cuts by hand and had to optimize the script via trial and error, maybe hundreds of times and not make a million unnecessary images at the end, which would take a lifetime to train with my shitty GPU.”
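The "all possible combinations" step Berk describes can be sketched in a few lines of Python. This is not his script: the trait categories and values below are hypothetical placeholders, and the random subsampling is just one assumed way of keeping the training set from ballooning into "a million unnecessary images".

```python
import itertools
import random

# Hypothetical trait folders; the real categories and file names are not given in the article.
traits = {
    "head": ["mohawk", "beanie", "bald"],
    "eyes": ["plain", "laser"],
    "glasses": ["none", "shades"],
    "smoke": ["none", "pipe"],
}

def all_combinations(trait_table):
    """Yield one dict per possible combination of trait values."""
    names = list(trait_table)
    for values in itertools.product(*(trait_table[n] for n in names)):
        yield dict(zip(names, values))

def sample_dataset(trait_table, limit, seed=0):
    """Keep a reproducible random subset so the training set stays small."""
    combos = list(all_combinations(trait_table))
    rng = random.Random(seed)
    return rng.sample(combos, min(limit, len(combos)))

combos = list(all_combinations(traits))
print(len(combos))            # 3 * 2 * 2 * 2 = 24
subset = sample_dataset(traits, limit=10)
print(subset[0])
```

The number of combinations is the product of the category sizes, so every added trait multiplies the dataset, which is why some form of pruning becomes necessary on a modest GPU.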
Separate smart contracts for IP and licenses (commercial rights):
Alongside AI, BGANs have also addressed the murky and at times contentious nature of intellectual property by granting full commercial rights to holders. Through a separate smart contract, holders have the option to tailor specific conditions of the IP for each NFT directly on the Ethereum blockchain.
Traits and the BGAN Ecosystem:
Unlike the static metrics that help collectors filter through an ever-expanding constellation of NFTs, BGANs were released mostly without any pre-determined trait categories except for CALM AF, HYPE AF, and ‘Bastardness’ (a broader category with varying rarity metrics).
Instead, over the ensuing months and through a meritocratic process aligned with the ethos of Web3, the community decided on the specific traits they wanted BGANs to embody. This initiative was later streamlined through a living document (authored by Snooplion) known as the Trait Constitution. In essence, if any holder believes that BGANs should change or modify their traits, all they need to do is follow the process of the Trait Constitution and submit a Snapshot proposal for community voting.
The current definitive traits of BGANs include:
- HYPE TYPE (animated/still). There are 847 animated or “HYPED AF” vs. 10,458 still or “CALM AF” BGANS.
- SONG WORD COUNT
- BASTARDNESS
- BACKGROUND (still BGANS)
- GLITCH PATTERN (still BGANS)
- BAD HABITS (still BGANS)
- FACING DIRECTION (still BGANS)
- TYPE (still BGANS)
- BACKGROUND GLITCH LEVEL (animated BGANS)
- BACKGROUND MOOD (animated BGANS)
- FLOATY HEAD (animated BGANS)
- SPEEDOMETER (animated BGANS)
- NUM OF FRAMES (animated BGANS)
- HEAD TURNS (animated BGANS)
A more detailed explainer on how to sort through and understand these traits has been written by 888mooncat.
Adding to the world of BGANs, Berk also created the Glicpixxx (V1 & V2) collections, Artglixxx collection, and an associated $Glix Utility token.
Glicpixxx V1
Glicpixxx V1 is the first of the Glicpixxx experiments by Berk. As with the lighthearted origins of BGANs, the idea of Glicpixxx started in 2020 after a streaming session on Twitch when Berk opened a favored encoder software (a processing program) called “GLIC” and started playing around with images. After taking one of the encoded images, he divided it into 36 individual squares in Photoshop and minted them (as a test) using OpenSea’s storefront contracts.
However, before releasing the GLICPIXXXVER002 GRAND COLLECTION, Berk updated the parameters for the original smaller series (V1) so that holders of the originals (which were created using the OpenSea storefront) could transfer them to a new contract and mint an equivalent copy. This was largely because (at the time) OpenSea used unverified contracts and stored all data on their API. In addition, the tokens functioned as singular ERC-1155 tokens rather than following the conventional ERC-721 standard for NFTs.
A unique aspect of the new contract for GLICPIXXXVER001 that Berk created with his friend (CANER) is the added NFT (TOKEN ID #00). It functions as a programmable canvas that reflects all 36 GLICPIXXXVER001 in 6×6 tiles.
Similar to async.art (or related programmable blockchain projects), the separate 36 GLICPIXs function as layers for the master #00 token. Any holder of the original series can change the render of the master token by flipping their NFT, rotating it, or swapping its position in the canvas with someone else.
Each time a token holder changes the parameter of the layer they own, they also change the state of the (master) contract, and as Berk stated, “a P5 sketch rendering the tile canvas will start reading from these new parameters.” The code can be found inside the smart contract.
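A rough way to picture the master token is as a small piece of shared state that each holder may edit only for their own tile. The Python sketch below is an off-chain toy model, not the actual contract: the class and method names, the 1 to 36 token numbering, and the rule that only the caller's ownership is checked on a swap are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Tile:
    token_id: int          # which GLICPIXXXVER001 token is shown in this slot
    rotation: int = 0      # degrees: 0, 90, 180 or 270
    flipped: bool = False

class Canvas:
    """Toy stand-in for the master #00 token: a 6x6 grid of tile states."""
    SIZE = 6

    def __init__(self):
        self.owners = {}   # token_id -> holder name (hypothetical registry)
        self.slots = [Tile(token_id=i + 1) for i in range(self.SIZE ** 2)]

    def _slot_of(self, token_id):
        return next(i for i, t in enumerate(self.slots) if t.token_id == token_id)

    def _check_owner(self, caller, token_id):
        if self.owners.get(token_id) != caller:
            raise PermissionError("only the holder may change this tile")

    def rotate(self, caller, token_id):
        self._check_owner(caller, token_id)
        tile = self.slots[self._slot_of(token_id)]
        tile.rotation = (tile.rotation + 90) % 360

    def flip(self, caller, token_id):
        self._check_owner(caller, token_id)
        tile = self.slots[self._slot_of(token_id)]
        tile.flipped = not tile.flipped

    def swap(self, caller, token_a, token_b):
        # Simplification: only the holder of token_a is checked here.
        self._check_owner(caller, token_a)
        a, b = self._slot_of(token_a), self._slot_of(token_b)
        self.slots[a], self.slots[b] = self.slots[b], self.slots[a]

    def render_state(self):
        """Rows of (token_id, rotation, flipped) that a renderer would read."""
        s = self.SIZE
        return [[(t.token_id, t.rotation, t.flipped) for t in self.slots[r * s:(r + 1) * s]]
                for r in range(s)]

canvas = Canvas()
canvas.owners[1] = "alice"
canvas.rotate("alice", 1)
canvas.swap("alice", 1, 36)
print(canvas.render_state()[5][5])   # (1, 90, False): tile 1 now sits in the last slot
```

In the real project the equivalent state lives in the smart contract, and the P5 sketch Berk mentions plays the role of `render_state`, redrawing the canvas from whatever parameters the holders last set.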
Glicpixxx V2 – Grand Collection
On the same day he experimented with GLICPIXXXVER001, Berk generated an additional 48 images (1024 x 1024) with GLIC – all with a different range of presets, codecs, and parameters. A Python script was then used to explore the visuals in more detail, resulting in an additional 40k (32 x 32) source images, each with a unique and complex variety of patterns and colors.
While preparing for the launch of BGANs V2 in 2021, Berk used a random selection of the GLICPIXXX as backgrounds for transparent Bastards and found an organic synergy with the aesthetics of the two collections. It wasn’t long before positive acclaim spread throughout the BGAN community, with people hunting for the most striking GLICPIXXX combinations.
Although the 40k source images were originally going to be combined into a series of artistic GIFs for VIV3, after a lull in communication, Berk decided to release them as a grand collection and gift for Bastard Gan Punk holders.
As such, GLICPIXXXVER002 was born as an ERC-721 collection consisting of 11,500 images of 32 x 32 pixels. As with V1, V2s are programmatically encoded with open source software called GLIC. The collection consists of several subcategories and unique traits that reflect the native lore of BGANs. More details can be found here.
Mirroring the creative process for BGAN Punks, GLICPIXXXVER002 images were created without the predefined traits or patterns usually associated with the “layer merging, generative method.” Instead, Berk added traits according to his analysis of each separate image, and the patterns were reverse-engineered, categorized, and assigned to BOOMER CALM AF GLICPIXXX by the BGAN Punk community. https://glicpixxx.love/readme
As an ERC-721 token with an open license, GLICPIXXX can be used by anyone as a raw asset for creative initiatives or projects like virtual worlds, games, smart watches, fashion, texture packs in Minecraft, weapon skins in an FPS game, or in combination with other unrelated NFTs. Further ideas and example use cases can be found on the main site.
Artglixxx
ARTGLIXXX is powered by GLICPIXXX. It extends the concept and ecosystem of generative, algorithmic, programmable art by remixing the original designs of the GLICPIXXX collection. The first ARTGLIXXX series is ARTGLIXXX GENESIS, which can be minted by holding an NFT from the GLICPIXXXVER002 grand collection.
$Glix Utility token:
$GLIX is an ERC-20 utility token for the GLICPIXXX ecosystem.
Token IDs between 0–14999 of the ARTGLIXXX GENESIS series (aka ELDEST ARTGLIXXX) automatically yield 10 $GLIX tokens per day for holders until 2032 (10 years from the launch date). $GLIX tokens can also be burned (in increasing quantities) to mint or collect upcoming generations (or versions) of the ARTGLIXXX series.
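As a back-of-the-envelope illustration of the yield described above, the following Python sketch totals what a wallet would accrue. It only does the arithmetic: the flat 3,650-day window (ignoring leap days) is an assumption, and the real on-chain claiming logic is not modelled.

```python
DAILY_YIELD = 10                  # $GLIX per eligible token per day (from the article)
ELIGIBLE_IDS = range(0, 15000)    # ARTGLIXXX GENESIS token IDs 0-14999
YIELD_DAYS = 10 * 365             # assumption: a flat 10-year window, ignoring leap days

def accrued_glix(token_ids, days_held):
    """Total $GLIX accrued by the eligible tokens in a wallet after days_held days."""
    eligible = sum(1 for t in token_ids if t in ELIGIBLE_IDS)
    return eligible * DAILY_YIELD * min(days_held, YIELD_DAYS)

print(accrued_glix([7, 14999, 15000], days_held=30))   # 2 eligible tokens -> 600
print(accrued_glix([7], days_held=10_000))             # capped at 36500
```

Under these assumptions a single eligible token yields at most 36,500 $GLIX over the full period.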
Further BGAN tokens were created in the ecosystem, known as $BGAN and $BASTARD. $BGAN is the token used in the NFTX protocol and on SushiSwap.
With a more mysterious origin, the $BASTARD token was created and airdropped anonymously to all BGAN holders. After thorough due diligence by Qexd and Rosco, the community voted to add liquidity to it on SushiSwap. It can also be used for internal projects like minting NFTs from Berk’s copymint collection.
The official $BASTARD address is: 0x78d9a9355a7823887868492c47368956ea473618
BGAN DAO:
As aligned with any trustless cohort of degens, the BGAN DAO (established 15 Nov. 2021) is governed through a meritocratic system of proposals that align with the constitution.
At the time of writing, there are currently nine ‘Signers’ (elected by the community) who help execute valid proposals submitted on snapshot.org (here).
The current signers include:
- Berk Özdemir – creator of the Bgan ecosystem. Berk’s collections include “Bgans V1”, “Bgans V2”, “Glicpixxx V1”, “Glicpixxx V2 – Grand Collection”, “Artglixxx” and several others.
- Rosco Kalis – a web developer and artist (recent project “Imperfections” on Artblocks.io here), co-founder of @chaingrep, creator of @revokecash, creator of http://allbastards.com and creator of truffle-plugin-verify.
- 888Mooncat – NFT and Art collector, philanthropist, failed philosopher, and head curator of 888mooncatDAO.
- Snooplion – a technophilosopher, artist, author of Babel Blog, creator of Dimenschen, creator of The Lyceum, and creator of the Bgan Trait Constitution (here).
- Stimpy – an NFT and fine art collector, and cryptocurrency investor.
- Dirtbaglu – an NFT collector and cryptocurrency expert with a focus on DeFi, liquidity provisioning systems such as nftx.org, and modern contemporary digital art.
- Crypto Texan – works in Decentralized Finance Community & Marketing at @0xPolygon and is an experienced podcast host with a specialization in crypto markets.
- Qexd – an NFT and cryptocurrency investor with a focus on credit-based stablecoins, and a host of several innovative BGANs Twitter spaces.
The DAO also includes a respected group of community moderators such as MoonOverlord and ShroomDaddy. Further moderators include JU7URNA, Masal, Rooster, and Lyssandros. The Bastard DAO is also currently the largest liquidity provider of the xBGANWETH SushiSwap pool.
Interview with Berk
Given the unique ecosystem and growing developments of BGANs, it was only fitting to interview the creator, Berk, and dive further into the genesis, the creative process, and his extended thoughts on AI, NFTs, and Web3 in general.
Mason: Why did you choose CryptoPunks for the training data? Would you still use them as a focal point today?
Berk: When I started with GAN, I had just discovered NFTs (around late 2019), and two significant projects caught my interest: CryptoKitties and CryptoPunks. I wanted to train my dataset, and such large-scale (10000+) collections seemed pretty straightforward to work on.
I had an NVIDIA GTX 1660 Super as the graphics card, which would be like a snail for training GAN on big resolutions. So I had to be pragmatic and create a dataset with the smallest source images possible: pixel art. So choosing CryptoPunks was an arbitrary decision, but when I started getting the first results, ideas for the BASTARDS concept came, and I started minting them to create new personas on interwebz.
Mason: What made you settle on ‘11,305’ V2s instead of the conventional 10,000 collection size?
Berk: When preparing for V2, I wasn’t sure how big the collection could be. I had a model for infinite numbers, where people can concatenate a background-less bastard and a glicpixxx and mint it. But then I got feedback from Discord users that a fixed number like 10k would be better.
Then 11,305 came kind of arbitrarily. I was curating the collection by generating outputs, putting them in the collection, and vice versa. In the end, when I said I would stop, there were 11,305 pieces.
Mason: What makes BGANs more than another CryptoPunk derivative like V3 Punks or Phunks?
Berk: Firstly, it introduces a new paradigm to PFPs and a narrative/concept aligned with the creative methodology. And in a market where a digital “Punk” became an elite flexing object, ugly productions from GAN (aka bastards without parents) have a new saying, introducing the new generation of people decentralizing this corrupted world. I think the most undervalued part of BGANs is that they are more real and humane than any of the PFPs, which claim to have a worldly character.
Mason: How did you get the ‘HYPED AF’ GIFs to loop so well?
Berk: It’s not that complicated. Anyone who uses libraries like STYLEGAN can create those by outputting latent space interpolation. However, on V2 specifically, the gifs weren’t looping perfectly, so I had to open the source code and tweak some parameters to get almost perfect loops.
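One common way to get a seamless loop from latent space interpolation is to walk a closed path, so the final frame leads straight back into the first. The Python sketch below shows that idea only; it is not the tweak Berk made to the source code, no StyleGAN call is involved, and the generator that would turn each vector into an image is left as a hypothetical step.

```python
import math
import random

def latent_at(a, b, t):
    """Point on a closed path through latent space; t=0 and t=1 give the same vector."""
    theta = 2 * math.pi * t
    return [math.cos(theta) * x + math.sin(theta) * y for x, y in zip(a, b)]

def looping_latents(dim, n_frames, seed=0):
    """Pick two random latent vectors and return n_frames points on the loop they span."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(dim)]
    b = [rng.gauss(0, 1) for _ in range(dim)]
    frames = [latent_at(a, b, k / n_frames) for k in range(n_frames)]
    return a, b, frames

a, b, frames = looping_latents(dim=8, n_frames=24)

# Each frame would be fed to a trained generator (hypothetical: image = generator(frame)).
# The frame that would come after the last one is identical to the first, so the GIF loops.
wrap = latent_at(a, b, 1.0)
print(max(abs(u - v) for u, v in zip(wrap, frames[0])) < 1e-9)
```

A plain linear interpolation from one vector to another, by contrast, ends somewhere different from where it started, which produces a visible jump when the animation restarts.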
Mason: The term ‘NFT’ has been corrupted by more than a few influencers, celebrities and brands. With that in mind, other than CryptoArt, how would you define BGANs?
Berk: Bastards constantly want to make the world beautiful.
Mason: The visuals of the V1 and V2 collections are quite different. Stylistically, you could say that V1 resembles Francis Bacon’s paintings, whereas V2 leans more towards the conventional pixelated design of NFTs. Why is this?
Berk: I made a mistake with V1s. Normally, CryptoPunks assets are 24×24, and for GAN training, you have to set a database consisting of images of the resolution, multiples of 32. Like 32×32, 64×64, 256×256, 1024×1024, etc. So, I upscaled the images from 24x to 32x, and it caused some distortions on some pixels. So I learned from those mistakes for V2, and they looked more like original CP and pixel art avatars (I think).
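The distortion Berk describes follows from the scale factor: 32/24 is not a whole number, so some source pixels have to be stretched more than others. The short Python sketch below shows this for nearest-neighbour resampling along one axis. Which resampling method was actually used for V1 is not stated, so nearest-neighbour is an assumption here, as is the padding alternative at the end.

```python
from collections import Counter

def nearest_neighbor_sources(src_size, dst_size):
    """For each output pixel index, return the source pixel index it copies."""
    return [i * src_size // dst_size for i in range(dst_size)]

mapping = nearest_neighbor_sources(24, 32)
copies = Counter(mapping)                 # how many times each source pixel is used
print(mapping[:8])                        # [0, 0, 1, 2, 3, 3, 4, 5]
print(sorted(set(copies.values())))       # [1, 2]: some pixels doubled, others not

# A whole-number factor treats every pixel the same, so pixel art stays crisp.
print(set(Counter(nearest_neighbor_sources(24, 48)).values()))   # {2}

# Padding is another distortion-free option: centre the 24px sprite on a 32px canvas.
pad = (32 - 24) // 2
print(pad)                                # 4 pixels of border on each side
```

One in every three source pixels ends up twice as wide as its neighbours, which is enough to visibly warp a 24×24 avatar.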
Mason: What are some of your biggest creative influences? Is there anyone in AI or blockchain that you recommend learning more about?
Berk: John Cage, John Oswald of Plunderphonics, Nick Briz, Cory Arcangel (maker of Super Mario Clouds), and Chris Novello who created the ‘Illucia’ project using code-bending.
For any artist new to the Crypto and NFT space, I highly recommend Rhea Myers – a True OG CryptoArtist.
And the Trash Art movement has been pretty influential.
Mason: How did Mark Cuban end up with several V2 Bastard Gan Punks?
Berk: As far as I know, Haralabos Voulgaris, the former Director of Quantitative Research and Development for the Mavericks, minted a shitton of BGANs (and dumped most of them afterwards, lol) and told Mark Cuban about it.
Mason: What inspired you to start programming and move into AI, and do you have any advice for readers who want to learn more about it?
Berk: I have a piano and classical composition background, so when I saw the possibilities of programming in electronic music, I dived into the subject. Then I took a Python course and followed some other courses online. I realized that I have a super-powerful tool with programming that can do enormous tasks and experiments I can’t do myself. When I was doing my Master’s degree at ArtScience Interfaculty, I was genuinely interested in machine learning methods, so I started playing with object detection, natural language processing, GAN, audio datasets etc. I didn’t follow a specific source or book. I just jumped into everything I found interesting and was capable of fiddling with.
Mason: Are you working on any other projects?
Berk: I am currently trying to utilize Glicpixxx for fashion textile experiments. So let’s see what kind of stuff this will create!
And there is a multiplayer art experience I am working on with a small group of people, which will utilize Glicpixxx and possibly BGANs in a weirdo way. Other than those, I don’t have specific projects I am working on. Ideas come to me spontaneously, depending on the moment.
Mason: Are you proud of any particular moment in the BGAN timeline?
Berk: When I reversed the bonding curve and sent all the minting price to donations, we managed to contribute to saving a baby with spinal muscular atrophy (alongside other charitable causes).
Mason: Lastly, what would be your ideal future for generative art, NFTs, and artificial intelligence?
Berk: I think the landscape needs more fun, goofiness and weirdness. Of course, I am always excited to see new technology utilizations. Still, sometimes NFTs are utilized to satisfy the collectors, to make their digital objects ‘worth holding something’. Now NFTs are a part of this world, and we will see so many new uses of digitally created objects, so I am excited to see what kind of craziness we will witness.
Conclusion:
With a pool on NFTX funded with over 10 ETH, a coordinated DAO and Treasury, more than 2,800 unique holders, and over 8,200 ETH in trading volume on OpenSea, BGANs have gained a reputation as something of a stablecoin equivalent in the ever-transient landscape of NFTs.
But with an eclectic community of technologists, artists, and musicians that form the collective mosaic of bastards, perhaps no better title could be had. Especially when, regardless of ETH prices, there will always be a band playing for the fam in BGAN land.
Links:
- Bganpunks.eth.link (V1)
- bastardganpunks.club (V2)
- Bgan V2 Trait Constitution
- Bgan V2 FAQ
- Twitter: https://twitter.com/bganpunks
- Discord: discord.com/invite/bganpunks
- Glicpixxx website: glicpixxx.love
- Artglixxx website: artglixxx.io
- A history of the DAO votes can be found here.
- During the discounted contract mints of BGAN V2, a total of 234.556 ETH was sent directly to charities – a spread of the transactions can be found here: https://docs.google.com/spreadsheets/d/1_yIe-cu-2S4cRGuS2Xa2tJKKQfye4zpnueDd-pMxgmw/edit#gid=0
Large Language Models – From Banality to Originality
Looking at the insanely popular Large Language Models (LLMs) like ChatGPT with an AGI researcher’s eye, what I see are some potentially interesting components for future AGI systems.
Economically, I see the seed of a family of technologies that, over the next few years, is very likely to obsolete a majority of white-collar jobs, wreaking incredible disruption and ultimately moving us further toward a world in which people are liberated from the necessity to work for a living.
On the other hand, looking at them from an end user’s view, what I see are tools that are already pretty useful for some things – but are marred by a few really major and frustrating flaws.
The first major flaw is a bizarre lack of reality discrimination – LLMs are so full of shit it’s almost unbelievable, given how intelligent they are in some ways.
The second is a mind-numbing boringness and banality – such powerful cliché machines have never previously even been imagined.
I do believe, though, that both of these issues can be overcome with a moderate amount of R&D effort, which is probably going to be put in by various parties during the next couple of years.
To fully overcome these weaknesses will require a true breakthrough to Artificial General Intelligence. But I suspect they can be remedied to a significant degree even without AGI, as a parallel track of development.
Generative AI vs. AGI
As an AGI guy, the first thing I need to clarify when talking about LLMs is that they are certainly not AGIs. They achieve a fair degree of generality in their intelligence, but it’s not because they generalize beyond their training and programming – it’s because their training data is so huge it covers almost everything. A tremendous variety of queries can be answered via minor variations and combinations on things in the training data.
A generally intelligent artificial mind will have lots of uses for statistically recognizing patterns in huge datasets, and synthesizing new forms via merging these patterns together in a manner appropriate to some narrow context – the things that LLMs are good at. But I don’t think these capabilities can form the core of an AGI mind. That needs much more abstract thinking, bound up with agency, will, self-reflection, modeling of self and other. If AGI systems have these core functions fulfilled (by the OpenCog Hyperon AGI architecture my colleagues and I are working on, or something else), I think LLMs could add a lot of wonderful peripheral functions.
Toward LLMs with Reality Discrimination
LLMs trained to produce text currently have serious problems with truth-telling – they basically can’t distinguish truth from falsehood. I’ve gathered a few funny or instructive examples in a blog post on ChatGPT and AGI. There are many others easy to dig up online. I have a particular liking for the conversation where ChatGPT calls me the founder of OpenAI, the company that created it (in which I have no direct role).
A variety of approaches are being pursued to overcome this problem and make LLMs less full of shit, and I think a bunch of them will probably succeed to some degree. As an example, we at SingularityNET and TrueAGI are trying to address this with a neural-symbolic architecture: LLMs are hybridized with an OpenCog-based symbolic reasoning engine.
This symbolic reasoning engine can extract logical relationships from an LLM, and put the LLM’s knowledge in a more structured form. It then compares the relationships between concepts with each other to identify inconsistencies. Where possible, it also compares these relationships with existing background knowledge – common-sense truths about the world like ‘water is wet’ or ‘Ben Goertzel is not the founder of OpenAI’. Inconsistencies can then be used as negative evidence to fine-tune the model – the model can be explicitly trained to unlearn the lies it contains.
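As a very rough illustration of the consistency-checking step, here is a toy Python sketch that compares extracted claims against background knowledge and against each other. It is not how OpenCog represents or reasons about knowledge: the tuple format, the example claims, and the function name are all hypothetical, and real logical inference is far richer than exact-match lookups.

```python
# Hypothetical (subject, relation, object) -> truth value; not OpenCog's actual representation.
background = {
    ("water", "is", "wet"): True,
    ("Ben Goertzel", "founded", "OpenAI"): False,
}

# Made-up claims standing in for relationships extracted from an LLM.
extracted_from_llm = [
    ("water", "is", "wet", True),
    ("Ben Goertzel", "founded", "OpenAI", True),
    ("Jupiter", "is", "a planet", True),
    ("Jupiter", "is", "a planet", False),
]

def find_inconsistencies(claims, known):
    """Return claims that contradict background knowledge or an earlier claim."""
    conflicts = []
    seen = {}
    for subj, rel, obj, truth in claims:
        key = (subj, rel, obj)
        if key in known and known[key] != truth:
            conflicts.append((key, "contradicts background knowledge"))
        elif key in seen and seen[key] != truth:
            conflicts.append((key, "contradicts another extracted claim"))
        seen.setdefault(key, truth)
    return conflicts

conflicts = find_inconsistencies(extracted_from_llm, background)
for key, reason in conflicts:
    print(key, "->", reason)
```

The flagged items are the kind of negative evidence that could then be fed back into fine-tuning.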
I suspect these hybrid symbolic-LLM structures can greatly increase the generality and truthfulness of things like ChatGPT, even without taking us all the way to AGI.
The Banality of LLMs
But LLMs also have other problems besides their rampant BS generation – like the fact that so much of what they produce is boring clichés. Erik Hoel has called this “the banality of ChatGPT”.
As a so-bad-it’s-good example of this banality, consider the following excerpt from a recent dialogue I had with the system:
BEN: How would you advise to manage and coordinate the project of creating a replica of the planet Jupiter with 1000x human intelligence?
ChatGPT: For managing and coordinating a project of creating a replica of Jupiter with 1000x human intelligence, I would suggest the following steps:
- Define the project scope: Clearly define the goals, objectives, and deliverables of the project.
- Assemble the team: Recruit individuals with relevant skills and experience to form a diverse and effective team.
- Develop a project plan: Determine the tasks, resources, timelines, and budgets required to complete the project.
- Set up project management systems: Establish tools and processes to track project progress, communicate with team members, and manage risks.
- Monitor progress and adjust as needed: Continuously assess the project’s status and make changes as needed to ensure it stays on track.
- Ensure communication and collaboration: Encourage regular communication among team members and stakeholders to ensure everyone is on the same page.
- Anticipate and manage risks: Identify and plan for potential risks and obstacles to minimize their impact on the project.
- Celebrate success: Recognize and celebrate milestones and achievements along the way to keep team morale high.
Remember, creating a replica of Jupiter with 1000x human intelligence is a complex and challenging project, so it’s important to be organized, communicate effectively, and continuously adjust as needed to ensure success.
Not exactly wrong. But, shall we say, not really grappling with the particulars of the question…
Sometimes boring clichés are exactly what you want in life. If you’re writing a cover letter for a job application, or an essay for university English class… originality and quirkiness can sometimes be dangerous.
But one worries about a near-future in which boring clichés become even more common than they are now, because they’re so quick and cheap to generate using AI models. A boring dystopia.
The Glory and Horror of Music LLMs
Google’s recently announced MusicLM neural model provides an intriguing case study of the intersection between “fascinatingly impressive” and “horrifyingly boring.”
You can give it a query like –
We can hear a choir, singing a Gregorian chant, and a drum machine, creating a rhythmic beat. The slow, stately sounds of strings provide a calming backdrop for the fast, complex sounds of futuristic electronic music.
– and it will generate music that fits the bill. Amazing stuff.
Except the effect is a bit like having a workaday lounge band improvise a musical passage for you. It’s very rarely artistically thrilling.
Given how impressive the functionality is, you might say this is a pretty lame complaint.
However, if such technology were used to generate music for people to listen to, the result would be an even more cliché-ridden and repetitious music sphere than record execs have already inflicted on us! Dentist’s office muzak++ forever!
The problem here is that averaging together everybody’s art produces art that is itself average. For some commercial purposes – e.g. background music for ads or video games – average, passable, competent music may be fine.
As a lifelong composer and improviser, I’ve generally been more interested in creating sounds that diverge from the average and the expectation – even if they get a little jarring or discordant in the process.
Of course, current neural models can be jarring and discordant too – but they will tend to do it in a way quite similar to something from their training dataset, or combining surface-level features of a few things in their training datasets.
Music is the domain in which I’ve thought most about how to overcome the banality of LLM output – because as a musician, I would really love to have an AI musician to improvise with. We already have a robot singer in our Jam Galaxy Band, and some AI-composed riffs, but a real-time AI improviser jamming alongside me is what I dream of. I don’t want boring lowest-common-denominator MusicLM-style in my band, not at all…
One approach that can be taken here is to formally introduce a theory of ‘interestingness’ – make a mathematical model of what constitutes interesting music, and then condition a MusicLM-type model to bias it to produce outputs meeting this interestingness criterion. This is not that far off from work I did in the 1990s using genetic algorithms to evolve music maximizing a ‘fitness function’ encoding a theory of musical interestingness. But LLMs allow the evolved music to incorporate patterns of various sorts from human music in a much more refined way than was possible back then.
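The genetic-algorithm side of this idea can be sketched in a few lines of Python. To be clear about the assumptions: the 'interestingness' score below is a made-up toy (reward varied intervals, penalise repeated notes and huge leaps), not the fitness function from the 1990s work, and nothing here conditions a MusicLM-type model. It only shows the shape of the evolve-against-a-fitness-function loop.

```python
import random

SCALE = [0, 2, 4, 5, 7, 9, 11, 12]   # one octave of a major scale, in semitones

def interestingness(melody):
    """Toy score: reward varied intervals, penalise repeated notes and huge leaps."""
    intervals = [abs(b - a) for a, b in zip(melody, melody[1:])]
    variety = len(set(intervals))
    repeats = intervals.count(0)
    big_leaps = sum(1 for i in intervals if i > 7)
    return variety - repeats - 2 * big_leaps

def evolve(pop_size=30, length=8, generations=40, seed=1):
    rng = random.Random(seed)
    pop = [[rng.choice(SCALE) for _ in range(length)] for _ in range(pop_size)]
    history = []                                   # best score per generation
    for _ in range(generations):
        pop.sort(key=interestingness, reverse=True)
        history.append(interestingness(pop[0]))
        parents = pop[: pop_size // 2]             # elitism: the best half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, length)
            child = mum[:cut] + dad[cut:]          # one-point crossover
            child[rng.randrange(length)] = rng.choice(SCALE)   # point mutation
            children.append(child)
        pop = parents + children
    pop.sort(key=interestingness, reverse=True)
    history.append(interestingness(pop[0]))
    return pop[0], history

best, history = evolve()
print(best, history[0], history[-1])
```

Because the best melodies always survive into the next generation, the top score never decreases. The hard part, of course, is writing a fitness function that captures what actually makes music interesting.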
LLMs vs. Hmmmmm
Of course, this would still be a very different thing from how an AGI system would approach music.
AGI and music could intersect in a variety of ways, but one way or another, it would involve an AGI system creating and understanding music in the context of its experience of being an agent in the world, like when the AGI in the 2013 film Her says, “I’m trying to write a piece of music that’s about what it feels like to be on the beach with you right now.”
Steven Mithen’s book The Singing Neanderthals presents a hypothesis about the origin of language and music. He posits that human communication began with a communication system he refers to as “Hmmmmm” because it had the following characteristics: it was Holistic, manipulative, multi-modal, musical and mimetic. Basically Hmmmmm combined sound and gesture and action and imitation – somewhat like the pre-verbal/semi-verbal communication one sees in one-year-old children, but with more adult-level cognitive sophistication underlying. His proposal is that Hmmmmm came first and then spawned both language and music, which evolved from Hmmmmm in their own different directions.
Cambridge Archaeological Journal did a fascinating feature presenting various criticisms of the hypothesis along with Mithen’s responses.
An interesting and fairly difficult challenge would be to coax AI agents living in a virtual world – let’s say Minecraft enhanced with plugins, or the in-process Sophiaverse virtual world – to invent language and music along the lines of the Hmmmmm theory. This could be an interesting and valuable thing for AGI researchers to do regardless of how fully accurate Mithen’s theory of evolution is.
We could stock the virtual world with a few easy-to-use musical instruments, let’s say –
- drums that make rhythms when hit
- flutes that they breathe into (modulating volume and timbre with breath) while pushing buttons to make notes
- piano-type instruments that make notes when their keys are hit
One would then ultimately want these virtual-world proto-AGI agents – I like to think of them as “Neoterics” (new people) – to do things like:
- Discover that dancing to music is pleasurable and creates a feeling of togetherness which fosters collective action and communication
- Discover that drumming enthuses a group to carry out physical tasks together
- Discover that listening to melodic music puts the mind in a state conducive to creativity
Given that the Neoterics’ emotion models will be similar to ours, yet different in the particulars, it may be that the music they create to express their own emotions and influence each other’s emotions will be significantly different from human music. Perhaps one could then train music LLMs on music made by Neoterics and get a fascinating sort of hybrid – a truly new genre of music!
Whether or not this Neoterics experiment ever gets done in precisely this form, it does highlight the big difference between an LLM approach and an AGI approach – to music or anything else. LLMs are munging and merging data patterns, and with cleverness one can work around the most immediate issues that emerge from this approach, issues such as tendencies to hallucinate or converge on clichés. AGI, however, requires a totally different approach.
Narrow AI systems like LLMs may be useful for feeding patterns into the core cognition of AGI systems, or for helping them speak fluently in the lingo of a given domain. But at core, AGI systems will necessarily be very different from LLMs – they will be what Weaver has called Open-Ended Intelligences – complex self-organizing systems that engage richly with their environments, driven by complementary-and-contradictory drives to individuation and self-transcendence. AGI systems will achieve generalization via abstraction, and the need to balance individuation and self-transcendence, while working with limited resources, will drive them to intelligent abstract understanding. When they generate language or make music, they will build it on this abstract understanding, formed from their experience, and as a result will be imaginative and truth-telling naturally – organically – rather than with an additional trick.
LLMs and other deep neural nets are going to have a big impact on society, disrupting the business models of today’s tech giants and potentially eliminating a significant percentage of human jobs. But their biggest contribution ultimately may be waking up the world to the potential of AI and thus attracting more cognitive, financial and cultural resources toward the development of AGI, which will have a far larger direct impact than even the coolest narrow AI systems.
Technoshaman: from worldbuilding to mindbuilding: Part 2
In Part 1, I introduced the archetype of the technoshaman, one who uses XR (extended reality — VR, AR, and MR) technologies to craft multisensory digital worlds and experiences that elevate and harmonize human consciousness on a mass scale. In Part 2, we deep-dive into “digital pharmacology” and use XR to evoke alternate states of consciousness.
The social impacts of contemporary media are well known. Film and television can engage, inform and motivate on a mass scale, and they are no longer dominated solely by major studios that must make significant investments in productions. The increasing democratization of media has placed great power in the hands of nearly every individual.
More than 6.5 billion people — 80% of the world’s population — now own smartphones. Social media has enabled nearly anyone to influence the minds of hundreds, thousands, or millions of people with videos, news, opinions and niche programming.
Storytellers intentionally use the power of media and entertainment to achieve social or global change through social impact entertainment. As described by Ella Saltmarsh, stories can be tools for “building community through empathy and coherence,” connecting diverse people and generating “narratives that hold together groups, organizations, and movements.”
Stories can re-author our personal and cultural narratives, as well as our mythic narratives or worldviews, including values, mindsets, rules and goals. “Stories deeply affect our psyche and collective direction of travel,” Saltmarsh says.
Unlike film, television, or streaming media, where passive audiences watch visuals framed on a wall or a handheld device, XR worldbuilders use presence, embodiment, and agency for spatial or environmental storytelling. Drawing on XR’s ability to simulate real-world experiences, these techniques place us within the “storyworld,” giving this new medium a deeper, more profound effect on consciousness.
The power of XR
Everyone in the metaverse can become a worldbuilder, giving them the power to evoke memes, moods, and states of consciousness in other participants. The result is akin to a virtual brain corpus callosum between groups and individuals, allowing for direct transmission of consciousness from worldbuilder to participant.
Of course, music, arts, entertainment, architecture, literature, and other real-world modes of expression have long been used to evoke profound brain states in participants. But XR supercharges this process, due to various factors:
- XR devices provide a wideband interface to the participant’s brain and nervous system. Immersive displays can place a pixel on nearly every nerve ending in the eye. 3D audio simulates spatial and spectral properties of sound. And XR controllers capture natural movement and gestures.
- XR production tools and robust user interfaces, enhanced by AI, accelerate worldbuilding without the need to write code. Virtual worldbuilding is not burdened by the need for physical materials, construction, labor, and logistics, so it can be democratized far faster than physical worldbuilding.
- XR performers can easily inhabit and command virtual worlds, controlling complex interactions and effects through natural gestures and simple controllers. Imagine a live theater that suddenly morphs into a cave, a castle, or another world altogether.
- Virtual worlds combine music, arts, entertainment, storytelling, social media, architecture, and multisensory modalities into a seamless experience for participants, compounding their psychological power.
- Unlike physical spaces, virtual environments are inexpensive, easily replicated, and massively scalable with global reach.
The metaverse
The XR technology market will expand by a factor of 10 in the next five years, according to a forecast by Statista. Over the long term, wearable XR interfaces are expected to replace handheld smartphones. And some expect the metaverse to replace the internet, despite Meta’s problems.
To articulate and guide the development of the metaverse “for the greatest good for the greatest number” of participants, virtual reality pioneer Tony Parisi recently proposed these Seven Rules of the Metaverse:
- There is only one Metaverse
- The Metaverse is for everyone
- Nobody controls the Metaverse
- The Metaverse is open
- The Metaverse is hardware-independent
- The Metaverse is a Network
- The Metaverse is the Internet
Widespread adoption of the metaverse using XR interfaces will place unprecedented power and influence into the hands of worldbuilders. But will worldbuilders use this power wisely?
Digital pharmacology
XR technologies can extend beyond simple storytelling. Immersive multisensory media can evoke specific states of consciousness, an approach that can be called “digital pharmacology.” Applications could include enhanced entertainment, education, leisure and lifestyle, improved well-being, spiritual ceremonies, and a wide variety of clinical uses.
There’s a rich history of sensory experiences being used to alter consciousness. Music is a powerful mood-altering agent that can induce or modify cognitive states and perceptions. Sound healing and vibroacoustic healing have been reported to be effective. Simple nature sounds can improve health, increase positive affect, and lower stress.
Visual stimuli such as natural environments — both real and virtual — can similarly affect physiological states (as measured by blood pressure and muscle tension). They can even alter our perception of music. Other powerful sensory modalities include haptics (touch) and aroma.
Transformational and mood-elevating experiences can take on many forms. Here’s a partial list:
- Awaken the Mind
  - Inspire imagination, creativity, and open-mindedness
  - Educate, train, inform
  - Epiphanies and new understandings
- Open the Heart
  - Empathy, compassion, understanding
  - The emotional shift towards love, respect, joy, happiness
- Activate the Spirit
  - Awe, wonder, ecstasy, bliss
  - Transcendent, numinous or “unity” experiences
  - Cosmic consciousness
- Going Deep
  - Mindfulness and contemplative states
  - Trance, dream, and mystical states
  - Psychedelic states
- Soothe the Beast
  - Relaxation, sense of calmness
  - Feeling of safety, restoration
- Awaken the Senses
  - Arouse passion, romance, sensuality
- Energize and Engage
  - Awaken the body and mind
  - Stimulate, motivate, alert, play
- Entrain the Brain
  - New kinesthetic skills, break/make habits, reprogram
Mindfulness states have a broad range of health benefits. Studies have shown decreased anxiety and depression, improved immune function, mitigation of cognitive decline due to aging, increased mental clarity and focus, improved heart health and mental health, increased longevity and self-confidence, improved moods, improved sleep, pain mitigation and more.
Likewise, studies have shown that psychedelic states can treat obsessive-compulsive disorder, end-of-life psychological distress, alcohol and tobacco addiction, and major depressive disorder. More research is needed to validate the degree to which XR experiences can evoke such states. However, XR applications are emerging to guide patients receiving psychedelic therapies.
Awe is one of the most powerful emotions one can experience. In a partnership with Cirque du Soleil, Dr. Beau Lotto’s Lab of Misfits conducted an ambitious study on the emotion of awe. The findings suggest that awe motivates us to step forward into a world of uncertainty in the search for answers, raises our risk tolerance, increases social behavior, and can even reframe who we believe we were in the past. Multiple studies have also found evidence that experiencing awe makes us kinder and more generous. These are all good things for this day and age!