Twenty Years Beyond the Turing Test: Moving Beyond the Human Judges Too
José Hernández‑Orallo 1,2

Received: 25 March 2020 / Accepted: 29 October 2020
© Springer Nature B.V. 2020
Abstract

In the last 20 years the Turing test has been left further behind by new developments in artificial intelligence. At the same time, however, these developments have revived some key elements of the Turing test: imitation and adversarialness. On the one hand, many generative models, such as generative adversarial networks (GANs), build imitators in an adversarial setting that strongly resembles the Turing test, with the judge being a learnt discriminative model; the term "Turing learning" has been used for this kind of setting. On the other hand, AI benchmarks are suffering an adversarial situation too, with 'challenge-solve-and-replace' evaluation dynamics whenever human performance is 'imitated': the relevant AI community rushes to replace the old benchmark with a more challenging one, for which human performance would still be beyond AI. These two phenomena related to the Turing test are sufficiently distinctive, important and general to warrant a detailed analysis, which is the main goal of this paper. After recognising the abyss that appears beyond superhuman performance, we build on Turing learning to identify two different evaluation schemas: Turing testing and adversarial testing. We revisit some of the key questions surrounding the Turing test, such as 'understanding', commonsense reasoning and extracting meaning from the world, and explore how the new testing paradigms should work to unmask the limitations of current and future AI. Finally, we discuss how behavioural similarity metrics could be used to create taxonomies for artificial and natural intelligence. Both testing schemas should complete a transition in which humans give way to machines, not only as references to be imitated but also as judges, when pursuing and measuring machine intelligence.

Keywords Turing test · Turing learning · Imitation · Adversarial models · Intelligence evaluation
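For readers unfamiliar with the mechanics the abstract alludes to, the following is a minimal sketch, in PyTorch, of the GAN-style adversarial setting: the discriminator plays the role of the judge and the generator that of the imitator, here on a toy task of imitating samples from a one-dimensional Gaussian. The architectures, hyperparameters and task are illustrative assumptions, not anything specified in the paper.

```python
# A minimal sketch (not from the paper) of the adversarial imitation setting
# the abstract compares to the Turing test: the discriminator is the judge,
# the generator is the imitator. Toy task: imitate a 1-D Gaussian.
# All design choices below are illustrative assumptions.
import torch
import torch.nn as nn

REAL_MEAN, REAL_STD = 4.0, 1.25  # the "reference" behaviour to be imitated

generator = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
judge = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))

loss_fn = nn.BCEWithLogitsLoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
j_opt = torch.optim.Adam(judge.parameters(), lr=1e-3)

for step in range(2000):
    real = torch.randn(64, 1) * REAL_STD + REAL_MEAN
    fake = generator(torch.randn(64, 8))

    # Judge's turn: learn to tell real samples from the imitator's output.
    j_opt.zero_grad()
    j_loss = (loss_fn(judge(real), torch.ones(64, 1)) +
              loss_fn(judge(fake.detach()), torch.zeros(64, 1)))
    j_loss.backward()
    j_opt.step()

    # Imitator's turn: produce samples the judge labels as real.
    g_opt.zero_grad()
    g_loss = loss_fn(judge(fake), torch.ones(64, 1))
    g_loss.backward()
    g_opt.step()

with torch.no_grad():
    est = generator(torch.randn(1000, 8)).mean().item()
print(f"imitator mean ~ {est:.2f} (target {REAL_MEAN})")
```

The point of the sketch is only that the judge is itself a learnt model trained in opposition to the imitator, which is what makes the analogy with (and the departure from) the human-judged Turing test precise.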
* José Hernández‑Orallo
  [email protected]

1 Universitat Politècnica de València, Valencia, Spain
2 Leverhulme Centre for the Future of Intelligence, Cambridge, UK
1 Introduction

Twenty years ago, on the fiftieth anniversary of the introduction of the imitation game (Saygin et al. 2000), there seemed to be momentum and consensus to move beyond the Turing test (Hernández-Orallo 2000). It was high time, I argued, to look for intelligence tests that should be "non-Boolean, factorial, nonanthropomorphic, computational and meaningful". In these two decades, AI has changed significantly, and the Turing test is no longer part of the everyday vocabulary of AI researchers, not even as a future landmark (Marcus 2020). Rather, the notions of artificial general intelligence (AGI) and superintelligence have replaced the old wild dreams of AI, and are used as arguments exposing the limitations of a great majority of AI applications.