Originally published on Convince & Convert
How would you hashtag the Mona Lisa?
You can probably come up with a lot of ideas. Perhaps your list would include: #art, #DaVinci, #smile, #masterpiece, and #Louvre, among others. Conversely, reading a few of those hashtags would make it easy for most literate, English-speaking adults to guess the subject. Those hashtags would never do justice to the picture though, and even poorly painted or photographed images couldn’t be replaced by descriptors.
A picture is worth far more than a thousand hashtags. Far more information will always be conveyed visually than via text, and research cited by an executive at 3M noted that people process visual information 60,000 times faster than text.
This visual data gap is difficult to quantify. Do you measure bytes? Pixels? Neurons fired when processing the information? Yet look at any image shared online, and there will be far more information conveyed by the image itself than in any text and hashtags describing it.
The History of Image Analysis and Visual Search
Altimeter analyst Susan Etlinger noted in her 2016 report “Image Intelligence” that, after reviewing interviews with a number of technology providers, “Approximately 80 percent of images that they see that include brand logos do not explicitly mention the brand with any accompanying text.”
Analyzing images is a difficult challenge. The Institute of Electrical and Electronics Engineers (IEEE) reported that there are 15,000 object categories, so merely classifying objects within images has proven to be technically daunting. The book Computer Vision by Richard Szeliski describes how computer scientists started tackling the problems of image recognition and visual search in the early 1970s. More than 40 years later, those technologies are just starting to be refined into applications that are useful for marketers and consumers.
Google, a pioneer and leader in visual search technology, started focusing on the challenge in early 2001, thanks to inspiration from Jennifer Lopez. Eric Schmidt, Google’s Executive Chairman and then its CEO, wrote in Project Syndicate, “People wanted more than just text. This first became apparent after the 2000 Grammy Awards, where Jennifer Lopez wore a green dress that, well, caught the world’s attention. At the time, it was the most popular search query we had ever seen. But we had no surefire way of getting users exactly what they wanted: J-Lo wearing that dress. Google Image Search was born.”
In recent years, some marketing executives have been overly optimistic about how soon consumer applications would be ready. In eMarketer’s report “Visual Search and Recognition,” Razorfish’s Jason Goldberg said, “I’m strongly bullish on visual search. It solves a real problem consumers have . . . In the not-too-distant future, it’ll become a heavily used mainstream feature. I think the inflection point is at least a year away, but not two years.” That report was published in November 2014, and depending on how Mr. Goldberg defines “mainstream,” that inflection point may still be in the future. If so, it is very close at hand.
More recently, eMarketer cited a study by RichRelevance, “UK 2016: In-Store Personalization: Creepy or Cool?” published in July 2016. The survey of UK internet users reported that 62 percent would find it cool to “scan a product on your mobile device to see product reviews/recommendations for other items you might like,” a task that could presumably be accomplished best by image recognition, as opposed to other means such as scanning a barcode on a tag. That option drew the highest percentage of internet users finding it “cool.”
Meanwhile, the option netting the most “creepy” votes, chosen by 75 percent of internet users, was “facial recognition technology identifies you as a high-value shopper and relays this information to a salesperson.” The warning for marketers: identifying products is “cool,” but identifying people is “creepy,” though such privacy standards are likely to change over time.
A New Framework for Visual Search
A 2015 report from Slyce, “Visual Search: The Technology & The Market,” found that 74 percent of consumers say text-based keywords don’t efficiently help them find products online, while 67 percent of consumers claim that the quality of product images is “very important” when purchasing products. Its research showed that back in 2014, retailers that implemented visual search reaped multiple benefits. Citing the impact in one retail category, Slyce wrote, “Of 4.4 million visits to casual apparel sites, visitors who used visual search viewed 37 percent more products, initiated 68 percent more return visits, spent 36 percent more time on the website, and had an average order value of 11 percent greater than those visitors who did not use visual search.”
Visual search isn’t just about finding images that match a query. There are many aspects of images that are typically cataloged and analyzed. Below is a framework for visual search that addresses many of the components that will matter most to marketers tapping into it.
One critical element of this framework is that it distinguishes between identification and intelligence. Each step starts with identifying what something is, but most of the value to marketers will come from any added intelligence about it, whether that intelligence is provided by software or by human analysis (much will probably stem from software and analysts working together).
For this framework, consider an image of a woman drinking a can of Acme Cola on the beach. Using the framework, marketers can analyze the image in five steps.
- Identification: The brand shown here is a beverage.
- Intelligence: Beverages are visible in 42 percent of photos showing some kind of food or drink.
- Identification: This is a photo posted by user SallyBeachLover1981 to Instagram. The same photo, which has a seagull in the background looking like it is posing for the photo, was spotted in a Buzzfeed roundup, “Top 20 Most Epic Beach Photobombs.”
- Intelligence: This image had a total estimated reach of 850,000 unique people due to its exposure on Buzzfeed, and the page where it appeared was shared 1,380 times. The photo’s engagement rate on Instagram was 852 percent higher than Sally’s average.
- Identification: The can has an Acme Cola logo on it.
- Intelligence: Among consumer packaged goods brands, Acme Cola products have the largest share of voice, appearing in 5.4 percent of images.
- Identification: The image includes multiple objects that can be identified: a soda can (which looks like any beverage can, but is clearly soda due to the branding), a bird, a cloud, and the sunglasses that Sally is wearing.
- Intelligence: Of images with the Acme Cola logo, 43 percent feature cans and 29 percent feature bottles. Additionally, eight percent of images that include both Acme Cola cans and people include someone wearing sunglasses.
- Identification: This photo is taken at the beach during the day.
- Intelligence: Photos that include a beach are 2.4 times as likely to display the Acme Cola brand compared with photos showing a city. Additionally, 96 percent of beach photos that are shared publicly are shot in the daytime.
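The split between identification and intelligence can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the TaggedImage record, the function names, and the eight sample images are all hypothetical, and the labels are assumed to have already been produced by some image recognition step that is not shown. The identification layer is the per-image record; the intelligence layer is the arithmetic over many records, such as a logo's share of images or how much more likely a logo is to appear in one scene than in another.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedImage:
    """Identification output for one image (hypothetical record layout)."""
    source: str                       # where the image was posted
    scene: str                        # e.g. "beach" or "city"
    logos: frozenset = frozenset()    # brand logos detected in the image
    objects: frozenset = frozenset()  # other objects detected in the image


def share(images, predicate):
    """Fraction of images for which predicate is true (0.0 if there are none)."""
    images = list(images)
    if not images:
        return 0.0
    return sum(1 for img in images if predicate(img)) / len(images)


def logo_share(images, logo):
    """Share of all images in which the given logo appears."""
    return share(images, lambda img: logo in img.logos)


def object_share_given_logo(images, logo, obj):
    """Among images showing the logo, the share that also show the object."""
    with_logo = [img for img in images if logo in img.logos]
    return share(with_logo, lambda img: obj in img.objects)


def scene_lift(images, logo, scene_a, scene_b):
    """How many times as likely the logo is to appear in scene_a as in scene_b.

    Returns None when the ratio is undefined (no scene_a images, or the
    logo never appears in scene_b).
    """
    in_a = [img for img in images if img.scene == scene_a]
    in_b = [img for img in images if img.scene == scene_b]
    rate_b = logo_share(in_b, logo)
    if not in_a or rate_b == 0:
        return None
    return logo_share(in_a, logo) / rate_b


# Made-up sample data, purely for illustration.
ACME = frozenset({"Acme Cola"})
SAMPLE = [
    TaggedImage("instagram", "beach", ACME, frozenset({"can", "sunglasses", "bird"})),
    TaggedImage("instagram", "beach", ACME, frozenset({"bottle"})),
    TaggedImage("instagram", "beach", frozenset(), frozenset({"umbrella"})),
    TaggedImage("blog", "beach", frozenset(), frozenset({"bird"})),
    TaggedImage("instagram", "city", ACME, frozenset({"can"})),
    TaggedImage("blog", "city", frozenset(), frozenset({"car"})),
    TaggedImage("blog", "city", frozenset(), frozenset({"car"})),
    TaggedImage("instagram", "city", frozenset(), frozenset({"bus"})),
]

if __name__ == "__main__":
    print("Logo share:", logo_share(SAMPLE, "Acme Cola"))
    print("Cans among logo images:", object_share_given_logo(SAMPLE, "Acme Cola", "can"))
    print("Beach vs. city lift:", scene_lift(SAMPLE, "Acme Cola", "beach", "city"))
```

Run over a real image feed rather than eight invented records, the same three calculations would yield the share-of-voice, object, and scene figures quoted in the Acme Cola example.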
This kind of report is a taste of what marketers will expect from monitoring and analytics technologies. Some of it is available today. Offerings for businesses and consumers are on the verge of hitting that tipping point into mainstream usage. Early adopters won’t feel like they’re early for much longer.