By now thousands of people around the world have taken a spin with the new Hound (beta) app from Google Play. Meanwhile, iOS-based folks like me have had an opportunity to watch the in-house video demo where the app performs solo, or the one comparing the app's performance when pitted against Siri, Google Now and Cortana.
The speed at which Hound hears and reacts to spoken input is quite impressive. So is its ability to maintain context in the course of a “conversation,” as well as its demonstrated ability to deal with complex, layered queries that span more than one knowledge domain. The bar has been raised for rapid, responsive, multi-topic, natural language, spoken input to mobile devices for search, communications, digital commerce and Q&A. In addition, Katie McMahon, general manager of SoundHound, points out that Hound can be used for device or application control as well as spoken input for texts or other written communications. All this, as she puts it, “using the air from your own lungs.”
Until this week, SoundHound (which was founded in 2005) offered mobile apps that are able to identify a song and its lyrics even if it is being hummed or sung by the device’s owner. There have been more than 215 million downloads of the app across multiple platforms, based largely on word-of-mouth, organic publicity. The Hound app, in McMahon’s words, is not the product of a “pivot.” Rather, it is the “fruition of the company’s technology arc,” which has been directed toward enabling people to speak naturally in order to control their digital devices.
There are three aspects of Hound that distinguish it from competing mobile personal assistants. The first involves how it understands spoken input. The Hound platform performs automated speech recognition (ASR) and natural language processing (NLP) simultaneously in a single proprietary engine. The conjoined ASR/NLP engine provides a rapid path from “speech to meaning,” which in turn yields speedier responses to spoken queries when compared to the alternatives.
Hound has a second distinguishing feature that is readily apparent in both video demos. It enables speakers to “be themselves” as they talk through very specific, yet slightly scattershot, descriptions of their search terms, travel plans or general questions. They can even revisit an old query, add more conditions or constraints, and still receive an answer in short order.
A third characteristic of Hound, one that is not readily apparent from the videos, revolves around its ability to scale and to add new knowledge domains. Those of you who followed Opus Research’s coverage of Siri prior to its acquisition by Apple will remember that it was a “do engine,” with tight relationships to Web sites like OpenTable for restaurant reservations, Fandango for buying movie tickets or TaxiMagic for calling a cab. There were fewer than 10 of these initial knowledge domains and there are probably around two dozen today.
Hound (beta) was released with about fifty knowledge domains, according to McMahon. The videos reflect tight coupling with weather info, a mortgage calculator, calendar information and travel planning (based on a partnership with Expedia). Yet the real accelerant for adding knowledge domains is the Houndify™ Developer Platform, which was unveiled in early June when the Hound app debuted on Google Play. The platform provides mobile app developers with a RESTful API that enables them to add natural language input, both spoken and text. Its aim is to bring all the attributes of the Hound app (rapid speech-to-meaning, support for complex queries and an understanding of context) to apps running on “iOS, Android, Windows, Unix, Raspberry Pi, and others.”
Houndification represents the most open approach yet to promoting the proliferation of Intelligent Assistance into multiple domains of knowledge. In many ways it resembles the IBM Watson Developer Cloud, but there is a salient difference. Watson aims to put the power of Cognitive Computing into the hands of hundreds or thousands of developers. Its core capabilities are deep understanding, analytics, insights, classifications and the like. “Houndification,” by contrast, brings a small set of non-trivial capabilities (i.e., support of natural language input and rapid response to complex questions) into the hands of DevNation.