What’s going to happen when computers start to imagine?
Would you believe me if I said that all these portraits are computer-generated? In other words, these people do not exist in real life! This is the ‘imagination’ of a computer.
I know, it is cool and haunting at the same time when you think about how much computers, and machine learning in particular, have evolved over the years. This ‘miracle’ is possible thanks to a relatively new technology called the GAN architecture.
What is machine learning?
Machine learning is the use of algorithms and statistical models, such as neural networks, that let a computer system improve its performance on a task by learning from data. It is not a new technology: its roots date back to 1949, when Donald Hebb described a model of brain cell interaction in his book “The Organization of Behavior”.
But recently machine learning has become more useful and relevant than ever, for three main reasons: data is more accessible than ever, computers have become more powerful, and machine learning algorithms have improved.
To talk about StyleGAN, we first have to know two types of learning used in machine learning.
- Supervised learning
When you watch a video on YouTube, it suggests related videos, and when you watch several movies on Netflix, it suggests other movies watched by people whose tastes overlap most with yours. Making predictions like these from categorized or labeled data is known as supervised learning.
- Unsupervised learning
Here the computer looks for similarities in unlabeled data and arranges it into groups. In 2012, Google researchers built a neural network that learned the concept of a cat just by ‘looking’ through millions of images, without being given any labels or specific instructions.
A GAN is a combination of these two. The incoming data for a GAN is unlabeled, but the GAN sets up a supervised learning problem on top of it: it produces fake data and then tries to determine whether a given sample is fake or real.
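To make the "supervised problem on unlabeled data" idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the "dataset" is just a handful of numbers, the generator is a stand-in that outputs noise, and the function name `build_discriminator_batch` is made up. The point is only that the labels (1 = real, 0 = fake) come for free, because the GAN itself knows which samples it generated.

```python
import random

def build_discriminator_batch(real_samples, generate_fake, rng):
    """Turn unlabeled real samples into a labeled real-vs-fake batch."""
    # Label 1 = real: these come straight from the unlabeled dataset.
    batch = [(x, 1) for x in real_samples]
    # Label 0 = fake: one generated sample for every real one.
    batch += [(generate_fake(rng), 0) for _ in real_samples]
    rng.shuffle(batch)
    return batch

rng = random.Random(0)
# Hypothetical unlabeled "dataset": numbers clustered around 5.
real = [rng.gauss(5.0, 1.0) for _ in range(4)]
# Hypothetical untrained generator: outputs pure noise around 0.
batch = build_discriminator_batch(real, lambda r: r.gauss(0.0, 1.0), rng)
for value, label in batch:
    print(f"{value:6.2f} -> {'real' if label == 1 else 'fake'}")
```

Nobody had to label anything by hand here, which is why a GAN can train on raw, unlabeled data.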
Generative Adversarial Network (GAN)
Put simply, a GAN works like a game. There are two neural networks and an unlabeled set of data. The two networks are the generator and the discriminator. The generator tries to produce data that looks like the real thing, and the discriminator’s job is to determine whether incoming data is real or fake. In the beginning, the generator produces poor fakes, and the discriminator can quickly dismiss them. But as training continues, the generator learns from the discriminator’s feedback and produces data that is more and more similar to the real objects, until the discriminator can no longer tell fake data from real. In other words, the computer starts generating realistic human images.
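The game described above can be sketched with a toy example. This is not how StyleGAN is implemented; it is a deliberately tiny, assumed setup in plain Python where the "real data" are numbers drawn around 0, the generator has a single parameter (`shift`) and the discriminator is a one-neuron logistic classifier. The learning rate, batch size and step count are arbitrary choices, and the generator uses the common "non-saturating" loss rather than anything specific to the source.

```python
import math
import random

def sigmoid(t):
    # Numerically safe logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def train_toy_gan(steps=2000, batch=64, lr=0.05, seed=0):
    rng = random.Random(seed)
    shift = -3.0       # generator: G(z) = z + shift, starts far from the real data
    w, c = 0.0, 0.0    # discriminator: D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        real = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [rng.gauss(0.0, 1.0) + shift for _ in range(batch)]

        # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
        grad_w = grad_c = 0.0
        for x in real:
            err = sigmoid(w * x + c) - 1.0   # gradient of -log D(x) w.r.t. the logit
            grad_w += err * x
            grad_c += err
        for x in fake:
            err = sigmoid(w * x + c)         # gradient of -log(1 - D(x)) w.r.t. the logit
            grad_w += err * x
            grad_c += err
        w -= lr * grad_w / batch
        c -= lr * grad_c / batch

        # Generator step: move the fakes so that D(fake) goes toward 1.
        grad_shift = 0.0
        for x in fake:
            grad_shift += (sigmoid(w * x + c) - 1.0) * w
        shift -= lr * grad_shift / batch
    return shift, w, c

shift, w, c = train_toy_gan()
print(f"generator shift after training: {shift:.2f} (real data is centered at 0)")
print(f"discriminator weights: w={w:.2f}, c={c:.2f}")
```

In this sketch the generator's output drifts from around -3 toward the real data as the two players take turns. Once the two distributions overlap, the discriminator has little left to separate them with, which is the "fails to distinguish" stage described above.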
StyleGAN and Nvidia
If you are a gamer, you already know what Nvidia is. It’s a company that designs graphics processing units (GPUs) for the gaming and professional markets, as well as chips for smartphones and automobiles. In other words, they ‘own’ much of the global GPU market. Nvidia has a whole unit devoted to research. They do research on algorithms and numerical methods, applied research, circuits and VLSI design, computational photography and imaging, computer architecture, computer graphics, computer vision, display technology, high-performance computing, and many more areas. One of these areas is, obviously, machine learning and artificial intelligence.
As I said before, a conventional GAN tries to replicate existing, unlabeled data without any instructions. But Nvidia researchers Tero Karras, Samuli Laine, and Timo Aila altered this mechanism and developed a new GAN, StyleGAN, that can extract specific characteristics from different photos and produce a brand new photo that blends all of them.
The generator made by Nvidia treats an image as a collection of styles: coarse styles, middle styles, and fine styles. Each of them shapes the output image at a different level of detail.
- Coarse styles – Pose, Hair, Face shape
- Middle styles – Facial features, Eyes
- Fine styles – Color scheme
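Here is a rough sketch of how blending styles from two photos works at the level of generator layers. The resolution grouping (4–8 coarse, 16–32 middle, 64–1024 fine, with two layers per resolution) follows the StyleGAN paper for 1024×1024 images. Everything else is a stand-in: `"A"` and `"B"` represent the style vectors of two source images, and `mix_styles` is a made-up helper, not a function from Nvidia’s code.

```python
RESOLUTIONS = [4, 8, 16, 32, 64, 128, 256, 512, 1024]

def style_level(resolution):
    """Group a synthesis resolution into coarse, middle, or fine."""
    if resolution <= 8:
        return "coarse"   # pose, hair, face shape
    if resolution <= 32:
        return "middle"   # facial features, eyes
    return "fine"         # color scheme

def mix_styles(style_a, style_b, levels_from_b):
    """Per-layer style list: B at the chosen levels, A everywhere else."""
    per_layer = []
    for res in RESOLUTIONS:
        chosen = style_b if style_level(res) in levels_from_b else style_a
        per_layer += [(res, chosen), (res, chosen)]   # two layers per resolution
    return per_layer

# Take pose, hair and face shape from photo B, everything else from photo A.
layers = mix_styles("A", "B", levels_from_b={"coarse"})
for res, source in layers:
    print(f"{res:>4}x{res:<4} {style_level(res):<6} style from {source}")
```

Passing `{"fine"}` instead would keep A’s face and pose and borrow only B’s color scheme.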
StyleGAN proved to be an excellent way of producing high-resolution images, but it had some defects: some of the images it produced showed artifacts. Tero Karras, Samuli Laine, Miika Aittala, Janne Hellsten, Jaakko Lehtinen, and Timo Aila did further research on the subject and developed StyleGAN2. Some of the problems with the original StyleGAN were:
- Droplet artifacts
Some of the images showed a blob-shaped artifact resembling a water droplet. The researchers redesigned the normalization used in the generator, which removed this artifact.
- Phase artifacts
Details like teeth and eyes showed a strong location preference: they stayed stuck in place instead of moving smoothly with the rest of the face. The researchers traced this to progressive growing, and StyleGAN2 proposed an alternative design that retains the benefits of progressive growing without its drawbacks.
In addition to these major changes, StyleGAN2 was faster, and the quality of its images was significantly better.
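For the curious, the redesigned normalization mentioned under droplet artifacts can be sketched in a few lines. As I read the StyleGAN2 paper, instead of normalizing the feature maps themselves, the style first scales the convolution weights ("modulation"), and each output filter is then rescaled to unit length ("demodulation"). The code below is a plain-Python illustration of that formula on a made-up, tiny set of weights. It is not Nvidia’s implementation, and the function name and the numbers are assumptions.

```python
import math

def modulate_demodulate(weights, styles, eps=1e-8):
    """weights[out][in][k] are conv weights; styles[in] scales each input map."""
    result = []
    for out_filter in weights:
        # Modulate: scale the weights of each input feature map by its style.
        scaled = [[styles[i] * wk for wk in taps] for i, taps in enumerate(out_filter)]
        # Demodulate: rescale so this output filter has (almost exactly) unit L2 norm.
        norm = math.sqrt(sum(wk * wk for taps in scaled for wk in taps) + eps)
        result.append([[wk / norm for wk in taps] for taps in scaled])
    return result

# Hypothetical tiny layer: 2 output filters, 2 input feature maps, 3 taps each.
weights = [
    [[0.5, -1.0, 0.25], [1.0, 0.5, -0.5]],
    [[-0.75, 0.5, 1.0], [0.25, -0.25, 0.5]],
]
styles = [1.0, 10.0]   # the second input map is amplified 10x by its style
demodulated = modulate_demodulate(weights, styles)
for out_filter in demodulated:
    norm = math.sqrt(sum(wk * wk for taps in out_filter for wk in taps))
    print(f"output filter norm after demodulation: {norm:.4f}")
```

Even though one style is ten times larger than the other, every output filter ends up with the same overall scale, so a single strong style cannot blow up the signal.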
Now you might be wondering about the applications. You might think that producing human-like images is just for fun and has no specific use. That is partly true, but there is a point I have not mentioned yet: a GAN is not limited to photos and can be used to generate other types of data in the same way. It just needs good training. The ability to create imagined data that resembles the real thing has a lot of potential.
GANs’ ability to create photos can be used by police to create portraits of missing people. Scientists are trying to develop GANs to the point where they can produce photos from text or voice descriptions alone. This could be used to design buildings, create new designs in the clothing industry, and even draw up the wiring or plumbing plans of a house (3D modeling). Some researchers are trying to use this technology to create more detailed and realistic computer games.
Regarding healthcare, GANs can be used to identify anomalies in lab results, which could lead to better and quicker diagnoses. GANs are also being explored for analyzing how medications are altered and mixed, including the order of mixing, for previously incurable conditions.
There are several ongoing research efforts on using GANs to carry out complex organic chemistry conversions, which could lead to novel drug discoveries and help identify compounds that merit further research.
Autonomous driving is another interesting application of GANs. You can get a sense of the potential of self-driving cars developed using GANs from the paper at the following link.
The researchers test their algorithms in the GTA V computer game, using it as the simulation environment.
Obstacles for developing GANs
This complex technology takes a lot of work, time, and brainpower to develop, and it also requires a large budget. So the introduction of business applications is crucial for the future of GANs.
Ian Goodfellow, the inventor of GAN.
Also known as the man who’s given machines the gift of imagination.
- Unsupervised and Semi-supervised Learning with Categorical Generative Adversarial Networks: https://arxiv.org/abs/1511.06390
- StyleGAN paper: https://arxiv.org/pdf/1812.04948.pdf
- StyleGAN2 paper: https://arxiv.org/pdf/1912.04958.pdf