- Duration: 1 hr 26 mins
- Publication date: 17 Sep 2018
- Part of the series: IET Prestige Lecture Series, EngTalks (formerly the Kelvin Lecture Series)
Abstract
Human digitisation – The how, the benefits and the pitfalls
Is the next generation inspired to push the boundaries in the future?
Are they given a chance to show their creativity and bring engineering and the arts together?
From the latest smartphones to advances in supercomputing, the visual effects technology behind today’s digital age is rapidly changing.
Ubiquitous information and communication are transforming our lives and revolutionising not only how we work, but how we see and understand the universe.
AI, voice recognition, deep learning and deepfakes – Dr Li will cover it all and more…
Taking human digitisation to the next level
When the first photorealistic computer-animated feature film, 'Final Fantasy: The Spirits Within', was released in 2001, it was heralded as the film that could sound the death knell for the actor.
The lead character, Aki Ross, was designed to be as realistic as possible; the now-defunct studio Square Pictures intended the CGI character to be the world's first artificial actress, appearing in multiple films in multiple roles.
This great promise was not realised, and audiences were unconvinced. Despite significant advances in the 17 years since, there is still a feeling that true human digitisation is some way off; surely we will always know whether an image of a person is genuine? How can better quality be achieved? What are the potential pitfalls of this technology, and what might the benefits and applications be?
In his EngTalk, Dr Hao Li will discuss his work on photorealistic human digitisation and rendering using deep learning. He will present an original method for animating a digital avatar in real time based on the facial expressions of a head-mounted display user.
His approach achieves higher-fidelity animations than existing methods and requires no user-specific calibration. It regresses images of the user directly to the animation controls for a digital avatar, avoiding the explicit 3D tracking of the subject's face performed by many existing methods for realistic facial performance capture.
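To make the direct-regression idea concrete, the sketch below is a minimal, hypothetical illustration, not Dr Li's actual network: a small convolutional net that maps a single camera frame of the user's face straight to avatar animation controls, assumed here to be blendshape weights in [0, 1]. All layer sizes, names and dimensions are illustrative.

```python
import torch
import torch.nn as nn

class FaceParamRegressor(nn.Module):
    """Toy CNN that maps a camera image of the user's face directly to a
    vector of animation controls (e.g. blendshape weights), with no
    explicit 3D face-tracking step in between."""

    def __init__(self, num_controls: int = 51):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # pool to one 128-dim descriptor
        )
        # A single linear layer regresses the descriptor to control values.
        self.head = nn.Linear(128, num_controls)

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        x = self.features(image).flatten(1)
        # Sigmoid keeps assumed blendshape weights in [0, 1].
        return torch.sigmoid(self.head(x))

# One hypothetical 96x96 grayscale HMD-camera frame -> 51 control values.
frame = torch.rand(1, 1, 96, 96)
controls = FaceParamRegressor()(frame)
print(controls.shape)  # torch.Size([1, 51])
```

Because the network outputs animation controls directly, each frame costs only one forward pass, which is what makes real-time operation plausible.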
His system demonstrates that plausible real-time speech animation is possible with a deep neural network regressor, trained with animation parameters that not only capture the appropriate emotional expressions of the training subjects but also draw on an appropriate psychoacoustic data set.
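A speech-driven regressor can be sketched in miniature in the same spirit. The toy model below, again hypothetical rather than the lecturer's system, maps a short window of audio features (assumed here to be mel-spectrogram frames) to per-frame animation parameters and is trained with a simple L2 loss against reference parameters captured from speaking subjects.

```python
import torch
import torch.nn as nn

class SpeechAnimRegressor(nn.Module):
    """Toy regressor mapping a short window of audio features
    (e.g. mel-spectrogram frames) to per-frame animation parameters."""

    def __init__(self, n_mels: int = 40, window: int = 16, num_params: int = 51):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),                            # (B, n_mels * window)
            nn.Linear(n_mels * window, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, num_params),              # animation parameters
        )

    def forward(self, mel_window: torch.Tensor) -> torch.Tensor:
        return self.net(mel_window)

# One training step with an L2 loss against ground-truth parameters.
model = SpeechAnimRegressor()
mel = torch.rand(8, 40, 16)     # batch of audio-feature windows
target = torch.rand(8, 51)      # reference animation parameters
loss = nn.functional.mse_loss(model(mel), target)
loss.backward()
```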
Hao Li will also address the ethical dilemmas linked to his research. Artificial intelligence video tools make it relatively easy to put one person's face on another person's body with few traces of manipulation. So-called deepfakes are one of the newest forms of digital media manipulation, and one of the most obviously mischief-prone.
It is not hard to imagine this technology being used to smear politicians, create counterfeit revenge porn or frame people for crimes. “I see this as the next form of communication,” he said in an interview with The New York Times. “I worry that people will use it to blackmail others, or do bad things. You have to educate people that this is possible.”