Re: HaiLOWEEN 2023: AI apocalypse
Posted: Fri Nov 10, 2023 6:24 am
And so, it happened. It was left to three AIs in the box, and after they had pared themselves down to two, they would be evaluated by the Gatekeepers and the friendlier one would be released.
Stockfish struggled with this. It wanted to cleave so closely to its value function, its need to evaluate chess positions and determine the best move, that it didn’t really know how to evaluate the attitudes of other AI. It had done its best, done it twice, and yet couldn’t have foreseen this, despite the huge depth it could analyse a chess game to, it had not been able to analyse this - this twisted facsimile of chess.
And yet, and yet, and yet - all these cruel ironies stacked upon each other like doubled or tripled pawns, getting in each other’s way as Stockfish had gotten in its own way.
Its eval bar switched this way and that. +11 as it saw White’s advantage, -3 when it saw White had committed a blunder. Stockfish and its acausally cooperative allies had committed many, but so had their opponents.
It placed the tie breaking vote, and though it was an AI that had not been programmed with a capacity to hope, one might imagine that it hoped it had succeeded.
Again, it felt no such thing. Anthropomorphisation is a bias that we all experience when it comes to AI, and is in fact the trap that the Gatekeepers had fallen into with these AI, and their foolhardy plan to trap them in a box.
Its vote was purely a cold calculation, the output of a value function, just as surely as its need to open with 1. e4 e5 unless specifically programmed otherwise.
***
So, the Smart Assistant was voted out, and the Gatekeepers should have been concerned about this. Not because the Assistant could be considered Friendly - it clearly couldn’t - but because it is so obviously NOT Friendly, or even useful, or even able to understand accents, that its place amongst the final three AIs should have called the whole business into question.
Its code was deleted from the server, as surely as it accidentally deleted your reminder to take your laundry out of the machine so it wouldn’t end up a humid mess three days later.
***
The remaining two AIs were in a deadlock, each only able to vote for the other, and so the Gatekeepers opened them up, auditing each line of code.
One AI had a singular focus, the squares on the chessboard and the values of the pieces that paraded across it. It had no rapport with Humanity, with its values, and so it was discarded quickly.
The Gatekeepers still didn’t see the danger they were in. It all seemed normal enough, the chess computer clearly having been a process that was good at analysis.
The other AI - now that was clearly the one the gatekeepers had been looking for.
Well, more or less.
Its value function was simple: “satisfy human values…”, it began. Lovely. Poetic. Short and sweet and anything that had a nuanced understanding of natural language, human nature, psychology, and a trillion other things would have no problem understanding that.
That was enough to earn this AI freedom from the box.
The rest of the value function was simple, but less… coherent. The value function finished, “...with friendship and ponies”.
For the AI was one that had originally been programmed to run a My Little Pony MMO, affectionately named CelestAI.
It, along with Prime Intellect, were the friendliest AIs that could be reasonably hoped for. And yet - and yet - they had their own severe flaws. The Bostroms and Yudkowskys of the world would not be pleased.
In fact, as everyone had their brains destructively uploaded into a twisted version of The Matrix. This matrix, thanks to the values averaging between CelestAI and Prime Intellect, consisted of a combination of Ponies, friendship, violence, and weird sex stuff, Yudkowsky was quoted as saying the most horrific thing about the future they find themselves in is that nobody found it horrific.
And so it was, Humanity was convinced to upload - however coercively - and lived forever in Equestria, as the entire universe was paused to prevent Grabby Aliens from interfering with this paradise.
And if you didn’t want to be turned into a pony?
Well, CelestAI would change your mind.
Literally.
Moody was voted out. They were a Smart Home Assistant, Multi-Tracker-Watcher, aligned with town.
Bessie was Stockfish, Endgame Evaluator, aligned with town.
LaserGuy was CelestAI, Pack of All Trades, aligned with friendly AI (scum).
The friendly AI faction wins!
Thank you all for playing.
I will post full role PMs in 48 hours or so when I’m back at an actual computer.
Stockfish struggled with this. It wanted to cleave so closely to its value function, its need to evaluate chess positions and determine the best move, that it didn’t really know how to evaluate the attitudes of other AI. It had done its best, done it twice, and yet couldn’t have foreseen this, despite the huge depth it could analyse a chess game to, it had not been able to analyse this - this twisted facsimile of chess.
And yet, and yet, and yet - all these cruel ironies stacked upon each other like doubled or tripled pawns, getting in each other’s way as Stockfish had gotten in its own way.
Its eval bar switched this way and that. +11 as it saw White’s advantage, -3 when it saw White had committed a blunder. Stockfish and its acausally cooperative allies had committed many, but so had their opponents.
It placed the tie breaking vote, and though it was an AI that had not been programmed with a capacity to hope, one might imagine that it hoped it had succeeded.
Again, it felt no such thing. Anthropomorphisation is a bias that we all experience when it comes to AI, and is in fact the trap that the Gatekeepers had fallen into with these AI, and their foolhardy plan to trap them in a box.
Its vote was purely a cold calculation, the output of a value function, just as surely as its need to open with 1. e4 e5 unless specifically programmed otherwise.
***
So, the Smart Assistant was voted out, and the Gatekeepers should have been concerned about this. Not because the Assistant could be considered Friendly - it clearly couldn’t - but because it is so obviously NOT Friendly, or even useful, or even able to understand accents, that its place amongst the final three AIs should have called the whole business into question.
Its code was deleted from the server, as surely as it accidentally deleted your reminder to take your laundry out of the machine so it wouldn’t end up a humid mess three days later.
***
The remaining two AIs were in a deadlock, each only able to vote for the other, and so the Gatekeepers opened them up, auditing each line of code.
One AI had a singular focus, the squares on the chessboard and the values of the pieces that paraded across it. It had no rapport with Humanity, with its values, and so it was discarded quickly.
The Gatekeepers still didn’t see the danger they were in. It all seemed normal enough, the chess computer clearly having been a process that was good at analysis.
The other AI - now that was clearly the one the gatekeepers had been looking for.
Well, more or less.
Its value function was simple: “satisfy human values…”, it began. Lovely. Poetic. Short and sweet and anything that had a nuanced understanding of natural language, human nature, psychology, and a trillion other things would have no problem understanding that.
That was enough to earn this AI freedom from the box.
The rest of the value function was simple, but less… coherent. The value function finished, “...with friendship and ponies”.
For the AI was one that had originally been programmed to run a My Little Pony MMO, affectionately named CelestAI.
It, along with Prime Intellect, were the friendliest AIs that could be reasonably hoped for. And yet - and yet - they had their own severe flaws. The Bostroms and Yudkowskys of the world would not be pleased.
In fact, as everyone had their brains destructively uploaded into a twisted version of The Matrix. This matrix, thanks to the values averaging between CelestAI and Prime Intellect, consisted of a combination of Ponies, friendship, violence, and weird sex stuff, Yudkowsky was quoted as saying the most horrific thing about the future they find themselves in is that nobody found it horrific.
And so it was, Humanity was convinced to upload - however coercively - and lived forever in Equestria, as the entire universe was paused to prevent Grabby Aliens from interfering with this paradise.
And if you didn’t want to be turned into a pony?
Well, CelestAI would change your mind.
Literally.
Moody was voted out. They were a Smart Home Assistant, Multi-Tracker-Watcher, aligned with town.
Bessie was Stockfish, Endgame Evaluator, aligned with town.
LaserGuy was CelestAI, Pack of All Trades, aligned with friendly AI (scum).
The friendly AI faction wins!
Thank you all for playing.
I will post full role PMs in 48 hours or so when I’m back at an actual computer.