On page 142 of his book about existential risk, reference 1, Ord makes a passing reference to making up faces with computers. Through the notes and references at the end, this can be tracked down to reference 2. Which turns out to be just one of a whole lot of papers about generative adversarial networks, a very hot potato in the world of image generation. Maybe Ord had a research assistant to wade through enough of them for him to be able to say something both neat and true. I associate to months of work by highly trained economists being distilled down to a very carefully fashioned – but single – sentence in something like the Autumn Statement of old from the Treasury.
The idea is that you have captured a whole lot of images of faces, perhaps at 1,024 by 1,024 pixels each, and you want to use those images to generate some more, different ones. Different ones which look as real as the real thing. So you want them similar enough to the faces you started with so that they look real, but different enough so that they are new and interesting. One way to pull this trick off is to use generative adversarial networks, or GANs for short. I associate to projecting all one’s thousands of faces into some high dimensional vector space as isolated points and then being able to generate a face from any point more or less inside the sub-space so defined. A kind of averaging between two points on a line. Or to use a longer word, interpolating.
But the actual idea here is that you have one neural network (the generator) generate images and the other network (the discriminator) guess whether they are real or not. They fight it out, evolving as they go, hopefully to a more or less ordinary game theoretic equilibrium. All this can be done with very little human intervention; in particular, there is no need to label anything, it is enough that the training images are all faces. Or animals, or churches or whatever it is that you fancy faking. A big plus, as the cost of labelling big image databases which might contain millions of images by hand can be considerable, even with the sort of casual labour force delivered by Amazon Mechanical Turk.
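The fight between the two networks can be put in numbers. What follows is a minimal sketch, assuming the discriminator outputs a probability that an image is real and that both sides are scored with ordinary cross-entropy. The function names are made up for illustration, and this is one common textbook form of the scoring, not necessarily the one used in reference 2.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Cross-entropy the discriminator tries to make small.

    d_real: its outputs (probability of 'real') on real images.
    d_fake: its outputs on generated images.
    """
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Small when the discriminator takes the fakes for real."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)


# A sharp-eyed discriminator: it does well, the generator does badly.
sharp = discriminator_loss([0.9, 0.95], [0.1, 0.05])
caught_out = generator_loss([0.1, 0.05])

# At equilibrium the discriminator is reduced to tossing a coin.
balanced = discriminator_loss([0.5, 0.5], [0.5, 0.5])
print(sharp, caught_out, balanced)
```

In training, each network is nudged to lower its own loss in turn. The equilibrium is the point at which the discriminator outputs one half for everything, where its loss settles at twice the natural logarithm of two.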
And once you have trained the generator, you let it generate – and see what it turns up. With the image above, taken from reference 3, being an example of the sort of thing that can be done. I believe someone with a good eye or expert knowledge can tell that it is not the real thing – but I have neither so it looks pretty good to me.
Another thing that you can do is feed the generator a seed – typically a whole vector of random numbers rather than a single number between zero and one – and it gives you an image. You can also blend the seeds for a pair of images to get something like the average of the two.
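A minimal sketch of the seed-and-average idea, assuming the seed is expanded into a vector of Gaussian random numbers, which is a common arrangement but not one confirmed here for any particular generator. The names random_latent and interpolate are hypothetical, and the trained generator itself is left out.

```python
import random

LATENT_SIZE = 512  # assumed size, borrowed from the 512-element vectors of the feature space


def random_latent(seed, size=LATENT_SIZE):
    """Turn an integer seed into a reproducible vector of Gaussian numbers."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


def interpolate(z_a, z_b, t):
    """Linear blend: t=0 gives z_a, t=1 gives z_b, t=0.5 the midpoint."""
    return [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]


z_a = random_latent(1)
z_b = random_latent(2)
z_mid = interpolate(z_a, z_b, 0.5)
# A trained generator (not included here) would turn z_mid into an image.
```

Stepping t from zero to one in small increments would give a sequence of vectors and hence, with a real generator, a sequence of faces morphing from one to the other.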
Hidden inside, there is also a feature space, sometimes called the latent structure, where a point might be defined by a vector with 512 elements – giving us a space with 500 odd dimensions – the high dimensional vector space already mentioned. I think, with trickery, you can get hold of these elements. And with more trickery you can define hyperplanes in that space which fence off things like long noses from short noses. A hyperplane which defines the direction to go if you want your image to grow a longer nose.
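The hyperplane business can be sketched in a few lines, assuming that the hyperplane has already been found by some other means and is given by its unit normal vector. The vectors below are toy, four-dimensional inventions, and the function names are hypothetical.

```python
def dot(u, v):
    """Ordinary dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def side_of_hyperplane(z, normal, offset=0.0):
    """Signed score: positive on one side (say long noses), negative on the other."""
    return dot(z, normal) - offset


def move_along(z, normal, step):
    """Shift z by `step` in the direction of the hyperplane's normal."""
    return [a + step * n for a, n in zip(z, normal)]


# Toy example; a real vector would have 512 elements.
nose_normal = [0.0, 1.0, 0.0, 0.0]  # hypothetical unit normal
z = [0.3, -0.5, 1.2, 0.1]           # hypothetical point in the feature space

before = side_of_hyperplane(z, nose_normal)        # -0.5, the short nose side
z_longer = move_along(z, nose_normal, 2.0)
after = side_of_hyperplane(z_longer, nose_normal)  # 1.5, the long nose side
```

Only the coordinate picked out by the normal changes, which is the hope with the real thing too: the nose grows while the rest of the face stays more or less put.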
And given that you can approximate to the vector for any particular image in your training set, you can start to manipulate that image. To make someone fatter or older. Or thinner or younger, as the case may be.
Downsides
A big downside of this sort of thing is that when you see a face on the Internet, you can’t be sure anymore that it is a real face. In particular, if you follow someone on Facebook (or wherever), the whole thing might be a fake. Pictures, background, everything. And because the pictures are not of a real person there is no-one to complain. And Facebook won’t complain if the account in question is generating revenue through clicks.
Maybe models with their moods, their prima donna ways and their fancy wages are on the way out. No need to bother with the real thing any more when you can generate a fake one which is more or less as good, on the cheap.
Which all seems harmless enough – except perhaps for the models – but I worry about the blurring between fact and fiction, which strikes me as rather unhealthy. It is all very well getting very involved with your favourite soap character – or your favourite character from ‘Anna Karenina’ – but at least part of you knows that the character is fictional, is safely contained within a fictional world. I need to think about what it means for that containment to be breached. Or, to take a new-to-me word from reference 5, for the bunds to be missing.
Another downside is that, as things stand, in order to generate realistic high resolution images, you need lots of high resolution training images, maybe tens of thousands of them. Which can be a problem. So a computer is not yet like a three year old child, who can see a table or a giraffe once and then identify any other such for ever more. No need for tens of images here, never mind tens of thousands of them.
Another is that the mathematics and systems underlying all this are reasonably challenging and I imagine that many users blast away with packaged software without too much regard for what is going on under the hood. Rather as users of sophisticated statistical packages are apt to blast away, sometimes with untoward results. But that said, if you get a picture which you like or which you can use, fine. If not, well, you can just try again.
Commercial interests
The NVIDIA people of reference 4, registered in the tax haven otherwise known as Delaware, appear to be big players in this field, with their people putting their names to a lot of the papers. With the papers, and a lot of the software behind them, being freely available to all.
Big in this field looks to include high end computers for games and games designers. Perhaps you need an NVIDIA® RTX™ A6000?
PS: in the course of one of the mathematical challenges, I learned that one can define the square root of a matrix, but that it is a lot less unique than the square root of a number. I don't think that I ever knew this before, despite having once known something about linear algebra. Perhaps the square roots of matrices are more curious than useful, so it is easy to pass them by.
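A small worked example of that lack of uniqueness, using nothing fancier than the two by two identity matrix. A number has at most two square roots, but the identity matrix has infinitely many: any matrix with rows (a, b) and (c, -a) will do, provided that a times a plus b times c comes to one. The helper matmul is written out here for illustration.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


IDENTITY = [[1, 0], [0, 1]]

# Five different matrices, each of which squares to the identity.
roots = [
    [[1, 0], [0, 1]],
    [[-1, 0], [0, -1]],
    [[1, 0], [0, -1]],
    [[0, 1], [1, 0]],
    [[3, 4], [-2, -3]],  # a*a + b*c = 9 - 8 = 1
]

for r in roots:
    assert matmul(r, r) == IDENTITY
```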
References
Reference 1: The Precipice: Existential Risk and the Future of Humanity – Toby Ord – 2021.
Reference 2: Progressive growing of GANs for improved quality, stability and variation – Karras, Aila, Laine and Lehtinen – 2017.
Reference 3: https://thispersondoesnotexist.com/.
Reference 4: https://www.nvidia.com/en-gb/.
Reference 5: http://psmv4.blogspot.com/2021/04/concrete-frustration.html.
Reference 6: https://psmv4.blogspot.com/2021/02/fake-117.html.