
What I learned from using Amazon Alexa for a month

opinion
Jul 19, 2016 (5 mins)

Ask Alexa if she can pass a Turing test and she will answer: “I don’t need to pass that, I am not pretending to be human.”

Amazon Echo
Credit: Michael Brown

When Amazon Echo with the Alexa service came out in November 2014, I was skeptical. A speaker with voice recognition seemed like an unnecessary oddity. When a friend of mine purchased one in 2015, I had a chance to play with it but was still unimpressed.

Alexa’s SDK has been open to third-party developers for a year now. As a software engineer, I find it important to keep up with emerging technologies and learn about them. I purchased an Amazon Echo about a month ago and had an opportunity to interact with the technology and try out the SDK.

More useful than Siri

Comparing Alexa to Siri is like comparing apples to oranges. Yes, both are speech bots. That’s probably the extent of what they have in common.

The primary Alexa service revolves around information lookup, home automation, and shopping on Amazon. Users can enable “skills,” which are essentially speech-based apps, and expand Alexa’s functionality.

From the speech recognition standpoint, Alexa is definitely more responsive than Siri. This is a family-friendly product and as such it needs to handle different speech patterns — children, adults, the elderly. In my experiments, I found Alexa to be more accurate than both Siri and Google, but of course your mileage may vary.

Don’t expect it to pass a Turing test

In a Turing test, a human operator uses a text-only terminal to interact with two test subjects separated from one another. The operator is aware that one subject is a machine and the other is a human, but they do not know which one. The machine subject is considered to have passed the test if the operator cannot tell which one is which.

Ask Alexa if she can pass a Turing test and she will answer: “I don’t need to pass that, I am not pretending to be human.” Expecting Alexa to pass this test is a sure recipe for disappointment. It is more advanced than interactive voice response systems and sure as hell more powerful than Siri, but it is not human.

The first analogy that occurred to me was that of Palm OS and Graffiti. Palm couldn’t pack in the computing power needed to process handwriting while also keeping the cost of the device low. Instead, Palm’s designers asked users to learn a dumbed-down, script-like alphabet to input data into their PDAs.

Likewise, Alexa’s users are expected to adapt a bit to Alexa’s capabilities. It doesn’t respond to an infinite variety of sentence structures, nor does it maintain a conversation like a human would. In short, it is a “chat bot.”

The good news is that Alexa is continuously improving. All the software needed to handle voice recognition and A.I. lives in the cloud, so Amazon can keep updating the platform.

Amazon made it easy to contribute skills

The Alexa Skills Kit, the “collection of self-service APIs, tools, documentation and code samples that make it fast and easy for you to add skills to Alexa,” is well documented and easy to learn, especially if you use AWS Lambda. The developer provides sample phrases, or utterances. Each utterance maps to an intent and can contain slots for custom words. Alexa’s machine learning backend does all of the analysis, so by the time your code runs, the request has already been broken down into an intent and its slot values.
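To make that concrete, here is a minimal sketch of what a skill’s AWS Lambda handler might look like in Python. The intent name GetCityFactIntent and the slot name City are hypothetical, made up for this illustration, and the request and response layout reflects my understanding of the Alexa Skills Kit JSON interface rather than anything official. A sample utterance for such a skill might read “tell me about {City}”. Treat it as a starting point, not a reference implementation.

```python
# Sketch of an Alexa skill handler for AWS Lambda.
# Assumptions: "GetCityFactIntent" and the "City" slot are hypothetical names;
# the JSON layout follows the Alexa Skills Kit request/response format as I
# understand it.


def build_speech_response(text, end_session=True):
    """Wrap plain text in the response envelope returned to Alexa."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


def get_slot_value(intent, slot_name):
    """Return the recognized value for a slot, or None if it was not filled."""
    slots = intent.get("slots") or {}
    return slots.get(slot_name, {}).get("value")


def lambda_handler(event, context):
    """Entry point: Alexa has already resolved speech into an intent and slots."""
    request = event["request"]

    if request["type"] == "LaunchRequest":
        # The user opened the skill without asking for anything specific.
        return build_speech_response("Ask me about a city.", end_session=False)

    if request["type"] == "IntentRequest":
        intent = request["intent"]
        if intent["name"] == "GetCityFactIntent":
            city = get_slot_value(intent, "City")
            if city is None:
                return build_speech_response("Which city?", end_session=False)
            return build_speech_response("You asked about {}.".format(city))
        return build_speech_response("Sorry, I don't know that one.")

    # Other request types (such as the end of a session): say nothing.
    return {"version": "1.0", "response": {"shouldEndSession": True}}
```

Notice that the handler never touches audio or raw text. It only branches on the intent name and reads slot values, which is what makes a simple skill quick to write.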

To get a sense of what’s involved in building speech bots I built a few simple skills and submitted them to Amazon for certification. Amazon provides a checklist to set expectations for developers. My experience working through the process is that it is very subjective — much like the experience of using Alexa itself.

The Alexa Skills Kit is still in its early stages. I wish Amazon would put a little more effort into making it work smoothly with build tools such as Jenkins. I would also like to see a monetization scheme similar to Amazon Underground.

Some final thoughts

After using Alexa for a few weeks, I’ve become acutely aware of the contrast between dealing with a call center and dealing with A.I. Dealing with the latter is far more pleasant.

Shortly after getting the Echo, we needed to resolve an issue with our airline for an upcoming family trip. Unable to solve the problem on their website, we had to call their customer service. As expected, I had to navigate a frustrating tree of menus. When I finally got to speak to someone, they could barely speak English. They could only follow a script, and any deviation from it resulted in a transfer to someone in another department, in what seemed like an endless vortex of incompetence.

Patrick Thibodeau has written a lot about tech outsourcing and the flow of U.S. white-collar jobs to low-cost countries. However, there is a bigger, more secular change happening, and it will happen faster than anything we’ve experienced before.

Any job that involves information lookup, scheduling, or following a script is bound to be replaced by A.I.

Oleg Dulin is a Big Data software engineer and consultant in the New York City area.

In 1997 Oleg co-founded the Clarkson University Linux Users Group. This group was influential in bringing awareness of open source to Clarkson, and later morphed into what is now a dedicated lab and curriculum called the Clarkson Open Source Institute. While at Clarkson, Oleg advocated on behalf of open source, Linux, and the community, and helped with construction of Clarkson’s first open-source high-performance computing cluster, called “The North Country.”

While at IBM T. J. Watson Research Center in 1999-2000, Oleg co-authored a paper on federated information systems that was presented at the Engineering of Federated Information Systems (EFIS) conference in 2000. This R&D project involved building a proof-of-concept federated IS that integrated structured (SQL) and unstructured (multimedia) data under a single set of APIs and user interfaces.

From 2001 to 2003 Oleg worked as a data integration consultant at a major investment bank in NYC on a web portal for private banking. This project involved aggregation of secure financial data from multiple legacy databases and presenting it in a customizable web portal.

In 2004, while working at a startup called ConfigureCode, Oleg contributed to two patent applications involving construction and semantic validation of mixed-schema XML documents. This technology was utilized in a Data Capture and Tracking System for Human Resources data integration.

From 2005 to 2011 Oleg worked at a Wall St. company (see Oleg’s LinkedIn Profile for more details) where he was instrumental in improving data quality, reducing trading errors, and implementing analytics and reporting within the context of an equities order management system. The system was a 24/7 high-performance computing platform that processed billions of dollars’ worth of trade executions daily.

From fall of 2011 to end of 2016, Oleg worked at Liquid Analytics as Cloud Platform Architect, where he was a thought leader in the implementation of a cloud-based PaaS for mobile Business Intelligence.

Presently, Oleg works at ADP Innovation Lab as Chief Architect.

The opinions expressed in this blog are those of Oleg Dulin and do not necessarily represent those of IDG Communications, Inc., its parent, subsidiary or affiliated companies.
