I got a long email last week from one of my PR contacts at Google, and it was entirely in Spanish. My Spanish is not very strong, so I did what I do on most occasions like this: I read the email once, then ran it through Google Translate.
It’s always fun – and very instructive – to do this. Google Translate enables me to check how well I am relearning Spanish while at the same time checking how well Google is doing. For the past couple of years, I’ve been following Google’s progress on this front, and I have been impressed – not just with how well it is learning Spanish, but also with how it’s actually doing the learning. Among other approaches, Google Translate has been using crowdsourcing to scale its ability to translate across a broadening range of languages and dialects. You can say that crowdsourcing is the Google way. The more that people use it, the smarter it gets.
Which brings me to the subject of this week’s column, which happens to be the subject of the email I got from my PR contact. Google had just announced that its Voice Search product – which you can get via an app on Android phones, iPhones, and other devices – has now been localized for Indonesian, Malaysian, and Latin American speakers. A service that began in 2008 for only English speakers, Google Voice Search is now available in 40 languages and accents, according to Product Manager Fernando Delgado, who walked me through the story, service, and roadmap in an interview last Friday.
This is important. Let me say why.
Adapting to the World of Mobile
For many people today, search is mostly a desktop experience. Yes, people do use search on their smartphones, but it’s a frustrating experience – for me, it’s even more frustrating than writing email. You want to get the search term right; if you don’t, you have to go through the process of typing the term again, which on an iPhone and other touch-based devices makes for a miserable user experience. Google Voice Search attempts to improve on that experience by adapting to the primary way many people still use the device – by speaking into it (though there’s growing evidence that voice as a modality is waning). I remember making a mental note of this when I first downloaded the Google search app. But I also remember asking, “Will people use it?” Delgado says that the momentum is gathering, especially in countries where Android is growing. Google users are making “millions of voice searches” each day – not anywhere near the level of text searches, but it’s a sizeable number and it’s growing.
The Crowdsourcing Project
That said, to continue growing, Google needs to look for ways to make the experience available in other markets. With Spanish now the third largest language on the web (after English and Chinese), the opportunity to localize in Latin America was irresistible. That, plus all the studies that show Latinos are among the most digitally savvy of all ethnic groups (which Google, of course, has been closely following). But as I noted earlier, Google has a method for building competencies across all its services, and the method is crowdsourcing. What began as an experiment for its Dutch-speaking users, the “word of mouth” project has been extended to localize Google Voice Search around the world.
It works like this: in each country or region, Google identifies small groups of people who are “avid fans” of Google services and who are well-networked in their communities. The fans then go out to the communities and record data samples that Google can study to refine the searches. The method helps Google meet a huge challenge.
“We require thousands of hours of raw data to capture regional accents and idiomatic speech in all sorts of recording environments to mimic daily life use cases,” wrote Linne Ha, Google’s international program manager, on the Google Mobile Blog. And Latin American Spanish is an exceptional challenge that crowdsourcing helps to overcome. With so many dialects and regional accents, the Google team needed a special strategy: start by crowdsourcing samples in Mexican and Argentinean Spanish – two of the most widely divergent flavors of Spanish – and fill in the gaps with samples from Peru, Chile, Costa Rica, Panama, and Colombia. The result: a crowdsourced approximation of “Latin American” Spanish.
The Role You Play
But the crowdsourcing doesn’t end there. Each time you, the end consumer, use the service, Google “anonymizes” your search – protecting your privacy – and adds your voice file to the data set, which it continually analyses. In other words, you too can help Google learn your kind of Spanish.
Hearing about all this crowdsourced data gathering, I was inspired to run my own little experiment during my Friday lunch break, in my office. I decided to run the same set of queries against two services: 1) the generic Google Voice Search and 2) Google Voice Search optimized for Spanish. If you are on an iPhone in the U.S., open the Google search app, go to settings (the little starburst on the upper right-hand corner of the screen), tap on “Voice Search,” and select one of four versions of “Español” – I chose “America Latina.” (Note: if you’re on an Android phone, you’ll have a wider range of selections.)
Here are my results, starting from the easiest to the hardest voice searches. Note: wherever I could, I tried out my best Nuyorican.
Round 1: Google Voice Search (U.S. English Edition)
“Jennifer Lopez”: OK, perhaps too easy. But I thought it made sense to start with a little Spanglish and warm up for the experience. And she’s Nuyorican (“Jenny From the Block”).
Result: Yes, on the first try (came back with photos, too).
“Giovanni Rodriguez”: Still too easy, maybe. And after all, I am one of the world’s leading authorities on the pronunciation of Giovanni Rodriguez.
Result: Yes, on the first try.
Result: Google – yes, but only when I said it like a non-Latino (no rolling “r”).
“Yo quiero Taco Bell”
Result: Yes, on the first try (came back with pictures, this time with the chihuahua).
“Besame mucho”: (A beautiful old, but corny song)
Result: Yes, on the first try (but I didn’t try singing it).
“Bodega”: (Grocery store)
Result: Yes, on the first try.
“Mofongo”: (Legendary Puerto Rican plantain dish)
Result: Yes, on the first try.
“Chancleta”: (Flip-flop sandal)
Result: No. First three results: “trunk lip,” “chunklet,” “chocolate.”
“Chuleta”: (Pork chop)
Result: No, not even close. First three results: “AAA,” “2 letter,” “Chewbacca.”
“¡Wepa!”: (Latino exclamation of celebration; equivalent of “woo-hoo,” but way more cool)
Result: No; in fact, embarrassing. First three results: “what,” “weppa” (The Wake Emergency Physicians of Pennsylvania); “wet backs” (really).
Round 2: Google Voice Search (America Latina Edition)
Recap: Using the U.S. English edition, Google scored seven hits out of 10 searches. Not bad. But I ran the three missed searches in the America Latina version, and I got the following:
“Chancleta”
Result: Yes, on the first try.
“Chuleta”
Result: Yes…but not on the first try. I stopped to reflect on what Google’s Fernando Delgado told me about accents. Fact: I lost most of my Spanish before kindergarten, when my parents moved from one part of the Bronx to another. To compensate for this deficit, I put on my best impersonation of my mother – who sometimes, on the phone, sounds exactly like Rosie Perez (also Nuyorican) – and said (with more confidence) “chuleta.” Cha-ching – it came back perfectly.
“¡Wepa!”
Result: Nope…not even after a dozen tries. I tried saying it like my mother says it. I tried saying it like my grandfather used to say it. I even tried saying it like I really, really meant it, jumping up and down, yelling at my iPhone (it was a slow day at the office – not sure anyone noticed). Most common result? UAPA – Universidad Abierta Para Adultos, a school in the Dominican Republic.
Which reminded me…I have several Dominican friends, and one of them, you might say, is the goddess of wepa. LATISM’s Elianne Ramos holds Twitter parties every Thursday night where hundreds if not thousands of wepas of mass distraction have been detonated. I asked her this weekend to try out the America Latina version of Google Voice Search. On the first try, she got chancleta and mangú (the “quintessential Dominican delicacy”). But what about wepa? Says Elianne:
“On the first try, it came back with APA Style at the site for the American Psychological Association. Either Google thinks I’m losing my marbles or I need to let out my inner Dominican. Time for take two. This time, I shouted wepa at the top of my lungs (hip swivel and everything) and bingo…well, almost. It found Uepa.com – a Dominican news and entertainment site. We’re getting closer. I tried a couple of more times but it kept giving me the same results, so I gave up. On to the next word.”
Granted, the best use cases for Google Voice Search are for when you are on the go and for any search term “that’s too long to type,” as Google’s Delgado put it. In fact, the top voice searches today are in sports and entertainment (like movie listings), categories that generally require long strings of words and phrases. Still, knowing that Google Voice Search gets smarter the more we use it, I encourage you all to take the time and search for “wepa.” When Google gets that right, it will be cause for celebration indeed.
Giovanni is off today. This column was originally published on April 5, 2011 on ClickZ.