Once again, artificial intelligence teams tease the realm of the impossible and deliver surprising results. This time, a team made news by figuring out what a person's face may look like based on voice alone. Welcome to Speech2Face. The research team found a way to reconstruct a very rough likeness of some speakers from short audio clips.
The paper describing their work is up on arXiv, and is titled "Speech2Face: Learning the Face Behind a Voice." The authors are Tae-Hyun Oh, Tali Dekel, Changil Kim, Inbar Mosseri, William Freeman, Michael Rubinstein and Wojciech Matusik. As they put it: "Our goal in this work is to study to what extent we can infer how a person looks from the way they talk."
They evaluate and numerically quantify how, and in what way, their Speech2Face reconstructions from audio resemble the true face images of the speakers.
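To make "numerically quantify" concrete: one standard way to score such a reconstruction is to embed both the reconstructed face and the speaker's real photo with a pretrained face-recognition network, then measure similarity between the two feature vectors. The sketch below is a minimal illustration of that general idea, not the paper's exact evaluation protocol; the embedding dimension and the random vectors standing in for network outputs are placeholders.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two feature vectors (1.0 = identical direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Placeholder embeddings: in practice these would come from a pretrained
# face-recognition encoder applied to (a) the speaker's real photo and
# (b) the face reconstructed from the audio clip. The 4096-dim size is
# an assumption for illustration only.
rng = np.random.default_rng(0)
true_face_embedding = rng.random(4096)        # from the real photo
reconstructed_embedding = rng.random(4096)    # from the Speech2Face output

score = cosine_similarity(true_face_embedding, reconstructed_embedding)
print(f"feature-space similarity: {score:.3f}")
```

Comparing in feature space rather than pixel space is the key design choice here: it rewards a reconstruction for matching broad facial attributes rather than demanding a pixel-perfect copy of the photo.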
The authors were careful to make their intent clear: the work is not an attempt to link voices to images of the specific people who actually spoke. As they write, "our goal is not to predict a recognizable image of the exact face, but rather to capture dominant facial traits of the person that are correlated with the input speech."