In defense of the Turing test
ORIGINAL ARTICLE
Eric Neufeld · Sonje Finnestad

Received: 11 January 2020 / Accepted: 20 January 2020

© Springer-Verlag London Ltd., part of Springer Nature 2020
Abstract

In 2014, widespread reports in the popular media that a chatbot named Eugene Goostman had passed the Turing test became further grist for those who argue that the open-ended dialog of the Turing test enables the diversionary tactics of chatbots like Goostman and others, such as the entrants in the Loebner competition. Some claim that a new kind of test of machine intelligence is needed, and one community has advanced the Winograd schema competition to address this gap. We argue, to the contrary, that implicit in the Turing test is the cooperative challenge of using language to build a practical working understanding, necessitating a human interrogator to monitor and direct the conversation. We give examples showing that, because ambiguity in language is ubiquitous, open-ended conversation is not a flaw but rather the core challenge of the Turing test. We outline a statistical notion of practical working understanding that permits a reasonable amount of ambiguity, but nevertheless requires that ambiguity be resolved sufficiently for the agents to make progress.

Keywords: Turing test · Winograd schema · Practical certainty · Collaborative conversation
1 Preamble

In 2013, Gary Marcus published an article in the New Yorker (Marcus 2014) presenting, for non-specialists, “a terrific paper” by Hector Levesque. The paper, On our best behaviour (Levesque 2014), posed some tough questions about the Turing test and proposed an alternative, the Winograd schema. Marcus summarizes the argument against Turing’s test as follows: “…the Turing test is almost meaningless, because it is far too easy to game.” Consider, he says, following Levesque, the chatbots that compete every year for the Loebner Prize: “the winners tend to use bluster and misdirection far more than anything approximating true intelligence”. Levesque’s alternative test is a set of binary-choice anaphor resolution questions called Winograd schema challenges. The questions are designed “to be easy for an intelligent person but hard for a machine merely running Google searches”. They require common sense (in one example, “a fairly deep understanding of the subtleties of human language and the nature of social interaction”) and “get at things people don’t bother to mention on Web pages, and that don’t end up on giant data sets”. This test, as compared to the Turing test, “is much harder to game”.

Approximately a year later, a chatbot using the name Eugene Goostman won a Turing contest organized by the University of Reading (2014). There followed a flurry of articles reporting that a machine had passed the Turing test, followed, in turn, by articles pointing out that Goostman had not passed the Turing test. Nevertheless, for some, Goostman was further evidence of the deficiencies of Turing’s test. In fact, Goost