Recently, it was claimed that a program had beaten the Turing test by convincing 33% of the judges that it was human. Currently (at least at the competition) the bar is set at 30%, so a program only needs to convince just under a third of the judges that it is human.

Was the result impressive? Not entirely. Rather than demonstrating true general artificial intelligence, chatbots designed for the Turing test are closer to masters of social engineering.

Take MGonz, for example. It is merely a general set of rules, with no intelligence at all, but it can still convince others of its humanity through its confrontational style and sheer vulgarity. Sure, it may not answer general knowledge questions (such as "How many legs does a horse have?", a standard question that any human should be able to answer), but a stream of vulgarities and condescension works just as well.

Eugene, rather than playing the jerk, plays the callow child. With both language and age as obstacles, the program lowers the judge's expectations of its knowledge, which lets it pass the test more easily. It isn't about making a better AI; it's about lowering the standard for what is considered human.

Although some may consider the 30% limit to be low, it is always more difficult to prove a positive than a negative. It is easy to state unequivocally that someone is a bot (one can easily distinguish spambots from actual people in online discussions), but positively proving that someone online is human is more difficult. And with skeptics, nothing is possible to prove.

Skeptics are necessary in all fields, as it is always important to question what is commonly accepted. A devil's advocate is important in shedding light on alternate explanations. But, especially with judgements dependent on the proportion of people accepting the proposition, skeptics skew the percentages to an unfavourably difficult standard.
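
To put a rough number on that skew, here is a minimal sketch. It assumes a simplified model that is mine, not the competition's: some share of the judges are hard skeptics who will always vote "bot", and the 30% bar is counted over all judges. The function name and the example shares are made up for illustration.

```python
def required_rate_among_open_judges(threshold: float, skeptic_share: float) -> float:
    """Share of the non-skeptic judges a program must convince to reach
    `threshold` overall, assuming every skeptic always votes "bot".

    This is a toy model for illustration, not the actual competition rules.
    A result above 1.0 means the bar cannot be reached at all.
    """
    if not 0.0 <= skeptic_share < 1.0:
        raise ValueError("skeptic_share must be in the range [0, 1)")
    # Only (1 - skeptic_share) of the judges can be convinced, so the
    # overall threshold has to be met from that smaller pool.
    return threshold / (1.0 - skeptic_share)


if __name__ == "__main__":
    for share in (0.0, 0.25, 0.5):
        rate = required_rate_among_open_judges(0.30, share)
        print(f"{share:.0%} skeptics: must convince {rate:.0%} of the rest")
```

Under these assumptions, if a quarter of the judges are unmovable skeptics, the program has to convince 40% of everyone else just to reach the 30% bar. If half of them are, it needs 60%.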

So what do we do with the Turing test, now that it has been beaten? Do we merely raise the bar? Do we change the rules to disqualify programs that rely on social engineering?

Personally, I don't think it matters. The Turing test has been, and still is, a good metric for conversational chatbots. However, even without the Turing test, we will continue to see advances in general artificial intelligence. And we will continue to see the results. We don't need no test to tell us if the program is smart or not. We can be the judge of it ourselves.

And when we can't, well, then welcome to the future.

Tagged with programming
Posted on 2014-07-26 03:50
Last modified on 2014-11-02 22:57

Comments (1)

2015-07-17 01:20:56 by Mauve:

Personally, I was really interested in the Lovelace Test when I read about it recently: http://motherboard.vice.com/read/forget-turing-the-lovelace-test-has-a-better-shot-at-spotting-ai