Geekwire.com

Amazon launches new artificial intelligence services for developers: Image recognition, text-to-speech, Alexa NLP

2016-11-30

Amazon today announced three new artificial intelligence-related toolkits for developers building apps on Amazon Web Services.

At the company’s AWS re:Invent conference in Las Vegas, Amazon showed how developers can use three new services (Amazon Lex, Amazon Polly, and Amazon Rekognition) to build artificial intelligence features into apps for platforms like Slack, Facebook Messenger, Zendesk, and others.

The idea is to let developers utilize the machine learning algorithms and technology that Amazon has already created for its own processes and services like Alexa. Instead of developing their own AI software, AWS customers can simply use an API call or the AWS Management Console to incorporate AI features into their own apps.

AWS CEO Andy Jassy noted that Amazon has been building AI and machine learning technology for 20 years and said that there are now thousands of people “dedicated to AI in our business.” Now the company is opening up the back-end infrastructure to third-party developers. Three services were announced today and more are coming next year, Jassy said.

“Amazon AI services are fully managed services so there are no deep learning algorithms to build, no machine learning models to train, and no up-front commitments or infrastructure investments required,” Amazon said in its press release. “This frees developers to focus on defining and building an entirely new generation of apps that can see, hear, speak, understand, and interact with the world around them.”

Amazon Polly is available today in the US East (N. Virginia), US East (Ohio), US West (Oregon), and EU (Ireland) Regions, and will expand to other regions in the coming months. Amazon Rekognition is available in the US East (N. Virginia), US West (Oregon), and EU (Ireland) Regions, and will also expand. Developers can sign up for a preview of Amazon Lex today.

Amazon CTO Werner Vogels detailed the new services in a blog post published Wednesday. Here’s a quick rundown of Amazon Lex, Amazon Rekognition, and Amazon Polly.

Amazon Lex

Amazon Lex, whose name comes from the letters in the middle of “Alexa,” lets developers build conversational interfaces into apps using voice and text.

“The same conversational engine that powers Alexa is now available to any developer, making it easy to bring sophisticated, natural language ‘chatbots’ to new and existing applications,” Vogels wrote. “The power of Alexa in the hands of every developer, without having to know deep learning technologies like speech recognition, has the potential of sparking innovation in entirely new categories of products and services.”

Jassy, speaking at the keynote, said that Lex will allow developers to build conversational applications. He used an example of an app that lets people order pizza, with AWS’ artificial intelligence technology, natural language processing, knowledge graphs, and more helping to understand context and user intent.

“You can build multi-step conversations,” Jassy noted.

Dr. Matt Wood, general manager of product strategy for AWS, also spoke during the keynote and showed how Lex can be used to build conversational travel planning apps. The process went like this:

Wood: Book a flight to London.

AI: When do you want to travel?

Wood: Friday afternoon.

AI: There’s a flight leaving at 5 p.m. for $500. Book it?

Wood: Book it.

AI: OK, it’s booked.

“This is a very simple example of the type of fluid conversations you can have with services running on Amazon Lex,” Wood said.
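To make the idea of a multi-step conversation concrete, here is a minimal local sketch in Python of the slot-filling pattern the demo follows: the bot keeps asking until it has every detail it needs, then asks for confirmation. It does not call Amazon Lex, and the class name, slot names, prompts, and string matching are all hypothetical stand-ins for the language understanding the service would provide.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical slots and prompts, loosely modeled on the keynote demo.
PROMPTS = {
    "destination": "Where do you want to go?",
    "travel_time": "When do you want to travel?",
}


@dataclass
class BookFlightDialog:
    """Tracks one flight-booking conversation across several turns."""

    slots: dict = field(default_factory=lambda: {name: None for name in PROMPTS})
    offered: bool = False
    booked: bool = False

    def next_missing(self) -> Optional[str]:
        # Return the first slot that still has no value, in prompt order.
        for name in PROMPTS:
            if self.slots[name] is None:
                return name
        return None

    def handle(self, utterance: str) -> str:
        text = utterance.strip().rstrip(".")
        if self.booked:
            return "Your flight is already booked."
        if self.offered:
            # Final turn: the user confirms or declines the offer.
            if text.lower() in {"book it", "yes"}:
                self.booked = True
                return "OK, it's booked."
            self.offered = False
            return "OK, I won't book it."
        prefix = "book a flight to "
        if text.lower().startswith(prefix):
            # Crude intent match; a real service would use NLP here.
            self.slots["destination"] = text[len(prefix):]
        else:
            missing = self.next_missing()
            if missing is not None:
                self.slots[missing] = text
        missing = self.next_missing()
        if missing is not None:
            return PROMPTS[missing]
        self.offered = True
        return (
            f"I found a flight to {self.slots['destination']} "
            f"for {self.slots['travel_time']}. Book it?"
        )


if __name__ == "__main__":
    bot = BookFlightDialog()
    for line in ["Book a flight to London.", "Friday afternoon.", "Book it."]:
        print("User:", line)
        print("AI:  ", bot.handle(line))
```

The point of the sketch is the state carried between turns; with a managed service, the developer would define the slots and prompts and leave the parsing to the service.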

Lex can be integrated into Salesforce, Microsoft Dynamics, HubSpot, Twilio, Facebook Messenger, and other platforms.

You can see pricing info for Lex here.

Amazon Rekognition

Amazon Rekognition is an image recognition service that can quickly detect what’s in an image, such as the number of people, their gender and emotions, and the objects around them. It uses the same technology that Amazon has built to analyze images on Amazon Prime Photos.

“Amazon Rekognition democratizes the application of deep learning technique for detecting objects, scenes, concepts, and faces in your images, comparing faces between two images, and performing search functionality across millions of facial feature vectors that your business can store with Amazon Rekognition,” Vogels wrote. “Amazon Rekognition’s easy-to-use API, which is integrated with Amazon S3 and AWS Lambda, brings deep learning to your object store.”
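For a sense of what such an API call might look like, here is a minimal Python sketch that asks Rekognition to label an image stored in S3. It assumes the AWS SDK for Python (boto3), which the article does not mention; the function name, bucket, key, and thresholds are illustrative, and FakeRekognition is a stand-in with canned results so the example runs without an AWS account.

```python
def detect_scene_labels(rekognition_client, bucket, key,
                        min_confidence=80.0, max_labels=10):
    """Return (label, confidence) pairs for an S3 image, best match first."""
    response = rekognition_client.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=max_labels,
        MinConfidence=min_confidence,
    )
    pairs = [(item["Name"], item["Confidence"]) for item in response["Labels"]]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


class FakeRekognition:
    """Hypothetical stand-in that mimics the shape of a detect_labels reply."""

    def detect_labels(self, Image, MaxLabels, MinConfidence):
        canned = [
            {"Name": "Forest", "Confidence": 91.5},
            {"Name": "Lake", "Confidence": 98.2},
            {"Name": "Boat", "Confidence": 55.0},
        ]
        kept = [item for item in canned if item["Confidence"] >= MinConfidence]
        return {"Labels": kept[:MaxLabels]}


if __name__ == "__main__":
    # With real credentials, the client would come from
    # boto3.client("rekognition") instead of the fake below.
    for name, confidence in detect_scene_labels(
        FakeRekognition(), "example-bucket", "photos/trip.jpg"
    ):
        print(f"{name}: {confidence:.1f}%")
```

Passing the client in as a parameter keeps the sketch testable offline; swapping in a real client should not require changes to the function itself.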

During his travel planning demo, Wood showed how Rekognition can be used in conjunction with Lex. For example, using the Rekognition image technology, the AI can show a user photos of a lake or forest that help guide them to where they want to travel.

“Although each of these individual services — Polly, Lex, Rekognition — can be used independently, pulling them together allows you to build some really novel, sophisticated, category-defining applications,” Wood said.

You can see pricing info for Rekognition here.

Amazon Polly

Polly is a text-to-speech deep learning service.

“With Amazon Polly, we are making the same text-to-speech technology used to build Alexa’s voice [available] to AWS customers,” Vogels wrote. “It is now available to any developer aiming to power their apps with high-quality spoken output.”

Jassy explained how developers can use Polly to input text like “the temperature in WA is 75° F” and get back an MP3 stream. There is intelligence built into Polly so that the audio comes out as “the temperature in Washington is 75 degrees Fahrenheit.”
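The text-in, MP3-out flow Jassy described could be sketched in Python roughly as follows. This assumes the AWS SDK for Python (boto3), which the article does not mention; the function name, output path, and choice of the "Joanna" voice are illustrative, and FakePolly is a stand-in that returns placeholder bytes so the example runs without an AWS account.

```python
import io
import os
import tempfile


def synthesize_to_file(polly_client, text, path, voice_id="Joanna"):
    """Send text to Polly and save the returned MP3 stream to a file."""
    response = polly_client.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    audio = response["AudioStream"].read()
    with open(path, "wb") as handle:
        handle.write(audio)
    return len(audio)  # number of bytes written


class FakePolly:
    """Hypothetical stand-in that mimics the shape of a synthesize_speech reply."""

    def synthesize_speech(self, Text, OutputFormat, VoiceId):
        # Placeholder bytes, not real audio.
        return {"AudioStream": io.BytesIO(b"fake-mp3:" + Text.encode("utf-8"))}


if __name__ == "__main__":
    # With real credentials, the client would come from
    # boto3.client("polly") instead of the fake below.
    out_path = os.path.join(tempfile.mkdtemp(), "weather.mp3")
    size = synthesize_to_file(
        FakePolly(), "The temperature in WA is 75\u00b0 F", out_path
    )
    print(f"Wrote {size} bytes to {out_path}")
```

Note that the sketch passes the abbreviated text through unchanged; expanding "WA" and "75° F" into spoken words is the part the article says Polly handles on its own.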

Polly comes with 47 different voices and 27 languages. You can see pricing info for Polly here.