Finally, audio-based commands and tagging that don't suck

QR Codes

I have kept an eye on QR codes for a few years now. It is a simple technology, and simple technologies win because, well, they just plain work. So many new technologies do a better job of adding complexity to a problem than of removing it. As an example, the majority of the world still uses headphone jacks and earbuds to listen to their iPods, even though Bluetooth is a great technology that removes the need for those easily tangled cords. BUT those annoying tangled cords are still far more reliable and simpler to use than Bluetooth. So we wait for the “simpler” technology to become, well, simpler. QR codes have the same M.O. They are simple and they work, but they are also annoyingly primitive. For example, the QR code image you see on the left of this article means absolutely nothing to you visually. Yet I use it to take up space on the page, because it can provide value if you are willing to pull out your phone and take a snapshot of it.

The QR code works because it is a unique image that carries data in all its black and white specks, like a data fingerprint or a bar code. A device with an app that can read those specks decodes the “fingerprint” into an equally unique URL that the app can then direct you to.
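To make the "data in the specks" idea concrete, here is a toy sketch in Python. It is not the real QR format, which also has finder patterns, masking, and error correction; it only shows a URL going into a grid of black and white cells and coming back out. The function names and the example URL are made up for illustration.

```python
# Toy illustration only: NOT the real QR code format.
# Each byte of the text becomes one row of 8 specks (1 = black, 0 = white).

def text_to_grid(text):
    """Encode text as rows of 8 bits, most significant bit first."""
    grid = []
    for byte in text.encode("utf-8"):
        grid.append([(byte >> (7 - i)) & 1 for i in range(8)])
    return grid

def grid_to_text(grid):
    """Read the specks back into bytes, then into text."""
    data = bytearray()
    for row in grid:
        byte = 0
        for bit in row:
            byte = (byte << 1) | bit
        data.append(byte)
    return data.decode("utf-8")

def render(grid):
    """Draw the grid with '#' for black and '.' for white."""
    return "\n".join("".join("#" if bit else "." for bit in row) for row in grid)

url = "http://example.com"   # placeholder URL
grid = text_to_grid(url)
print(render(grid))          # meaningless to a human eye...
print(grid_to_text(grid))    # ...but trivially readable by software
```

The printed grid is exactly as unreadable to a person as a real QR code, which is the whole trade-off: easy for the phone, useless for the eye.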

A more complex, yet more direct, solution is to have an image that is human readable, like an ad, act as the data-rich fingerprint. That way a person can take advantage of the precious real estate of the ad either by simply reading it or by taking a picture of it and getting directed to the related URL. It looks like technologies such as Google Goggles are on their way to cracking that nut. For now, however, Google Goggles is neither more reliable nor simpler to use than our ugly and cryptic, yet simple and reliable, QR code.

Audio Commands and Tagging


Audio commands have had a hard time finding their place as the “simpler” solution to the everyday problems they always claim to solve, and just as often they fall short of expectations. I don’t know how many friends of mine have had voice-command car systems that in the end just don’t work as reliably and effectively as turning a dial or pressing a button on the dash.

John’s car: “bee-eep. Can I help you?”

John in his car: “Call Sean”

John’s car: “Looking for Jons and bathrooms in the area.”

John: “Ugghhh! No, call Sean!”

Car: “Bee-eep. Thank you. Calling Don now…”

John: :-[

Siri seems to be making voice commands better, or at least it is marketed that way, but the dream of talking to our computers as the easier way to interact with them still seems as far away as it did here (see min 3:00 in 1984).



Okay, Siri and Google voice commands are doing better, and getting more use than I have ever seen with similar technologies in the past, so that is promising… but yelling into your phone to “search for nearby bars” in a crowded room is, well, shitty.

Shazam made some great leaps forward in the audio tagging and command space by finding the unique characteristics in songs and turning them into pertinent data. That uniqueness lets the app determine a song’s name and its singer when you simply hold your phone up to a song you hear on the radio.
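I don't know what Shazam actually runs under the hood, so treat the following as a toy sketch of the general idea of an audio "fingerprint", not their method. It assumes clean synthetic tones, finds the loudest frequency in each short slice of audio, and uses pairs of neighboring peaks as lookup keys. The song names, sample rate, and window size are all made up.

```python
import math
from collections import Counter

RATE = 8000    # samples per second (assumed for this sketch)
WINDOW = 128   # samples per analysis slice (assumed)

def tone_sequence(freqs):
    """Synthesize one WINDOW-long sine burst per frequency, back to back."""
    samples = []
    for f in freqs:
        samples.extend(math.sin(2 * math.pi * f * n / RATE) for n in range(WINDOW))
    return samples

def peak_bin(window):
    """Index of the strongest frequency bin in one slice (naive DFT)."""
    best_bin, best_power = 0, -1.0
    for k in range(1, WINDOW // 2):
        re = sum(x * math.cos(2 * math.pi * k * n / WINDOW) for n, x in enumerate(window))
        im = sum(x * math.sin(2 * math.pi * k * n / WINDOW) for n, x in enumerate(window))
        power = re * re + im * im
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin

def fingerprint(samples):
    """Pairs of consecutive peak bins act as the 'unique characteristics'."""
    peaks = [peak_bin(samples[i:i + WINDOW])
             for i in range(0, len(samples) - WINDOW + 1, WINDOW)]
    return list(zip(peaks, peaks[1:]))

def build_index(library):
    """Map each fingerprint pair to the songs that contain it."""
    index = {}
    for name, samples in library.items():
        for pair in fingerprint(samples):
            index.setdefault(pair, set()).add(name)
    return index

def identify(clip, index):
    """Vote for the song sharing the most fingerprint pairs with the clip."""
    votes = Counter()
    for pair in fingerprint(clip):
        for name in index.get(pair, ()):
            votes[name] += 1
    return votes.most_common(1)[0][0] if votes else None

# Hypothetical two-song library built from pure tones.
library = {
    "Song A": tone_sequence([500, 1000, 750, 1250]),
    "Song B": tone_sequence([1250, 500, 1500, 750]),
}
index = build_index(library)

# A snippet "heard" from the middle of Song A.
print(identify(tone_sequence([1000, 750, 1250]), index))
```

A real system has to survive background noise, bad microphones, and millions of songs, which is where the hard engineering lives. The lookup idea is the same, though: turn sound into keys, then match the keys.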

This year, as some of you may have already seen, Shazam has gotten into the QR-related space by bringing its technology to TV commercials. When you see the Shazam logo on a commercial, open your Shazam app and let it listen to the commercial’s unique audio. Its ability to take the unique “fingerprint” of the sound coming from the commercial and turn it into useful data lets a commercial open a website on your device, much like a QR code. It is neat because the audio is as easily interpreted by human ears as it is by the Shazam app, maximizing the use of the allotted ad space. Unfortunately, it falls short because it is impractical to expect a viewer to chase down their phone, open the Shazam app, and tag the commercial’s audio before the commercial is over.

Audio Sync


I think I just saw a technology that actually makes sense. It is practical in its use, efficient in its implementation, and it solves a problem by removing more complexity than it adds.

In this case the audio is used to sync your tablet to a show you are watching. With this strategy you’re truly decreasing the steps needed to get what the show and the viewer want. No extra steps, no rushing for an unrelated app to open a web page, no ugly QR code images taking up space, just a simple way to help the user link the app they are using to the show they are watching.

It works by listening to the show you are watching and applying Shazam-like technology to its audio to recognize what part of what show you are on. The data is processed and, instead of just opening a web page, it syncs your application’s experience to the metadata surrounding the show on your TV. You can then interact with others watching the same moment at the same time, or listen to back stories related to the segment. Cool beans.
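Here is a rough sketch of how the "what part of what show" step could work, assuming the audio has already been boiled down to one fingerprint number per second. The numbers, segment names, and function names below are invented for illustration; I have no inside knowledge of the actual app. Each fingerprint the tablet hears votes for a time offset into the show, and the offset with the most votes wins.

```python
from bisect import bisect_right
from collections import Counter

def build_index(show_hashes):
    """Map each fingerprint to every second of the show where it occurs."""
    index = {}
    for t, h in enumerate(show_hashes):
        index.setdefault(h, []).append(t)
    return index

def locate(clip_hashes, index):
    """Return the show second where the clip most likely starts, or None."""
    votes = Counter()
    for clip_t, h in enumerate(clip_hashes):
        for show_t in index.get(h, ()):
            votes[show_t - clip_t] += 1   # matches that agree share one offset
    if not votes:
        return None
    return votes.most_common(1)[0][0]

def segment_at(second, segments):
    """Look up the metadata for whichever segment covers this second."""
    starts = [start for start, _ in segments]
    return segments[bisect_right(starts, second) - 1][1]

# Hypothetical per-second fingerprints for a ten-second "show".
show = [11, 42, 17, 42, 93, 58, 17, 64, 29, 70]
# Hypothetical metadata: (start second, what the tablet should display).
segments = [(0, "Cold open"), (4, "Interview: back story"), (8, "Credits")]

index = build_index(show)
heard = [93, 58, 17]                 # what the tablet's microphone picked up
start = locate(heard, index)
now = start + len(heard)             # where the show is right now
print(start, now, segment_at(now, segments))
```

The voting is what makes this forgiving: the fingerprint 17 shows up twice in the show, and one bad reading from a noisy living room would not matter either, because only the correct offset collects votes from several matches at once.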