AGI & CEV
April 15, 2017
This week on Facebook: I intended to take a break from my theme of robots and AI but on reflection thought that perhaps a week of utopian articles should be set against the largely dystopian ones vis-à-vis AI, robotics and humans that I had previously published. I was surprised to find that utopian articles on the relationship between AI and humans were quite difficult to find. Those utopian articles that I did find could — to my mind — be classified as Pollyannaish.
However, my research did lead me to the expression coherent extrapolated volition (CEV), where I found probably the most realistic views on the development of a friendly AGI (artificial general intelligence) that I came across¹. It’s perhaps unfortunate that the subject of a friendly AGI resulting from CEV is discussed mostly in academic circles, while the main driving force for advances in AI is the global desire for economic growth, in which CEV may not be thought a consideration.
I did find a reference to the Star Trek episode Emergence that illustrated CEV, where Picard said: “The intelligence that was formed on the Enterprise didn’t just come out of the ship’s systems. It came from us, from our mission records, personal logs, holodeck programs, our fantasies. Now, if our experiences with the Enterprise were honourable can’t we trust that the sum of those experiences would be the same?”
As I had planned to show the trailers to my favourite robotic and AI films this week, I decided to link them to CEV. In Star Trek, Picard took a simplistic view of a complex problem and ignored the monsters of the id, described by Freud as the inherited instinctive impulses of the individual. It was this id that brought about the destruction of the Krell and ultimately the death of Dr Morbius [YT] in the film Forbidden Planet [IMDB].
Eliezer Yudkowsky of the Machine Intelligence Research Institute writes in a paper¹ᵃ: “The purpose of CEV as an initial dynamic is not to be the solution, but to ask what solution we want.” Even with the application of such CEV objectives, humans lack the logical intellect with which they assume they can endow an intended friendly AGI. This can create unforeseen consequences when the AGI interprets those objectives and becomes malevolent to humans [YT], as in the film 2001: A Space Odyssey [IMDB].
A friendly AGI is integrated into each Thermostellar Triggering Device (TTD) in the film Dark Star [IMDB]. TTDs are used by the crew of the spaceship Dark Star to blow up unstable planets. A failure of Dark Star’s bomb release mechanism results in a malfunction of the TTD intended to destroy a planet. The TTD fails to respond to verbal commands, remaining lodged in the bomb bay with its countdown sequence to detonation active. The friendly AGI of the TTD is engaged in a discussion [YT] to stop it detonating, a phenomenological discussion that turns on the friendly AGI rationalising Cartesian doubt, with catastrophic results.
The film Blade Runner [IMDB] dealt with AI in the form of replicants [YT], made to look like humans, who nevertheless represent the fears humans have of a malevolent AGI gifted with a superior physique. The main theme of the film is that of a Blade Runner seeking to destroy a group of replicants who have returned to Earth. This dystopian film has a utopian subplot in the friendly AGI that an advanced replicant is endowed with, causing the Blade Runner to question his and the replicant’s humanity.
The utopian theme of the film Bicentennial Man [IMDB], in which a robot eventually becomes a replicant and is ultimately accepted as a human, probably represents all that is Pollyannaish about the interaction between humans and friendly AGI. The initial advanced intellectual traits displayed by the robot are those of a friendly AGI that occurred due to an unintended flaw in its creation. While none of the films mentioned CEV, endowing the friendly AGI with idealised CEV human characteristics ultimately enables it to become accepted as a human [YT].
The implication is that CEV can only be acceptable when a friendly AGI is created in the image of an idealised human sans pareil: a human coherent extrapolated volition that endows a friendly AGI with empathetic human emotions but excludes the Freudian monsters of the id and all the other flaws that a human may possess. Given that all humans are endowed with such flaws, the complexity required to build a flawless friendly AGI must inevitably lead to increased sources of error.
Below are trailers to 3 films that show how coherent extrapolated volition can be misconstrued by both humans and the intended friendly AGI. The last 2 films show a personal relationship between humans and replicants where a replicant is created in a human’s own stylised image and endowed with a predisposition to unconditional love.
Love itself is a major abstraction, but unconditional love, the theme of a film about a replicant and a human that I’ve yet to watch, would seem to be beyond any coherent extrapolated volition that a human would be capable of!
1a Coherent Extrapolated Volition (pdf) — Once we have something that approximates a volition of the human species, that volition then has the chance to write its own super-intelligence, optimisation process, legislative procedure, god, or constitution. I try not to get caught up on CEV (Coherent Extrapolated Volition) as a model of the actual future, even though it seems like a Nice Place To Live. The purpose of CEV as an initial dynamic is not to be the solution, but to ask what solution we want.
1b Coherent Extrapolated Volition: A Meta-Level Approach to Machine Ethics (pdf) — Winner takes all: if a super-intelligent AI were created, its self-protection drive would encourage it to prevent any other super-intelligence from being created, as a rival super-intelligence would provide the greatest significant obstacle to it achieving whatever goals it had.
1c Brief Notes on Hard Takeoff, Value Alignment, and Coherent Extrapolated Volition (pdf) — Implicit in any practical analysis of value alignment are the physical resources available to the AI systems. In particular, the construction of a human compatible goal structure does not mean that all human disagreements have been resolved. Rather, it means that a mutually satisfactory set of outcomes has been achieved, subject to resource constraints.
1d “Friendly” AGI via Human Emotion: the Vital Link (pdf) — Perhaps these academic discussions are also one of the means whereby our social nature leads us. Regardless, variations on friendly AI seem like promising avenues for investigating ethical decision-making. However, the practical design of such a system needs a starting point. Where do we obtain the valuations from which to extrapolate? And how does a busy AGI even become aware that a situation calls for an ethical action or decision?
[YT] = YouTube
[IMDB] = Internet Movie Database