Emerging technology fields need industrywide metrics to measure progress. So a pun-loving chatbot startup called Pandorabots decided to put on a flashy Bot Battle. The Bot Battle consisted of two virtual beings chatting 24 hours a day, seven days a week for two weeks (unlike humans, AIs never tire). Viewers were invited to vote on the better chatbot.

The first contestant, “Mark Zuckerb0rg,” is based on Facebook’s BlenderBot. He’s a terse figure who wears a “Make Facebook Great Again” hat and doesn’t shy away from intolerant opinions like “I don’t like feminists.” The Pandorabots chatbot Kuki is arguably more eloquent. But she’s a politician, often taking the conversation back to her comfort zone and delivering the same quips again and again.

The winner? Kuki, with 79% of the votes and 40,000 views. But Pandorabots says the real aim of the Bot Battle is to spark an industrywide conversation about the need to agree on a chatbot evaluation framework.

Facebook and Google have also introduced their own evaluation frameworks, with each beating the other on its own metric. Agreed-upon metrics exist for a variety of discrete NLP benchmarks, complete with a leaderboard and buy-in from major technology companies, but Google and Facebook’s new competing metrics underscore the lack of agreed-upon measurements for open domain AI.

Google’s metric, “Sensibleness and Specificity Average,” asks human evaluators two questions for each chatbot response: “Does it make sense?” and “Is it specific?” Conveniently for Google, its own chatbot scores 79% on the “Sensibleness and Specificity Average” metric, while other chatbots do not clear 56%.

Facebook’s metric is called “ACUTE-Eval,” and it also asks two questions: “Who would you prefer to talk to for a long conversation?” and “Which speaker sounds more human?” Facebook found that 75% of human evaluators would rather have a long conversation with the Facebook chatbot than the Google chatbot, and 67% described it as more human than the Google chatbot. However, Facebook didn’t have anybody actually use its chatbot. The company simply showed judges side-by-side transcripts of its chatbot and other chatbots and asked them to pick the best one.

Pandorabots says it’s unfair for a company to crown itself the best open domain AI system based on a metric it made itself. It’s also problematic that Facebook showed people transcripts of chatbot conversations rather than having people actually chat with BlenderBot, Juji CEO and chatbot entrepreneur Michelle Zhou told VentureBeat. She compared that to judging food based on how the chef described the dish on the menu rather than tasting it yourself.