I Never Metadata I Didn’t Like

How I wish I had a dollar for every broadcaster or publisher who complains about not having more online traffic but who hasn’t placed metadata on his sites. I’d be rich!

The way the vast majority of consumers find information online is via search engines, and metadata is a major factor in what content search engines display to consumers.

Including appropriate metadata in a Web page can markedly increase that page’s rankings in search engines. The concept of Web page metadata, which is structured data about the contents of the page, isn’t new.

There was a time when Internet search engines ranked a Web page only by indexing all the words on that page. That ended around 1998.

Internet search engines now rank a page according to complex and proprietary algorithms, but almost all those algorithms include metadata as one of the top three factors. (The other two top factors are the indexing of words on that page and the number of that page’s incoming and outgoing hyperlinks.)

Nevertheless, few publishers and broadcasters use metadata well, or even use it at all. They and their editors, producers, and reporters misunderstand or ignore it. Too often, their eyes glaze over when it’s mentioned. Most seem to regard metadata as some esoteric, techno-geek topic like server-load balancing or secure socket layering. As a result, their content misses the boat on wider distribution.

Here’s a 21st century formula they need to memorize:

    Specific metadata = wider distribution

And here’s another one:

    Specific metadata = serendipity/2

No heavy mathematics or calculus required. Good metadata equals wider distribution. And good metadata provides half of serendipity online.

Let’s examine a few bad examples by major broadcasting and publishing sites. While teaching a Syracuse University graduate school class about search engine optimization this week, I asked students to name some publishing or broadcasting sites. Not surprisingly for their location, the students first named Syracuse.com, the Web site operated by local newspaper “The Post-Standard” of Syracuse, NY. Take a look at its home page and use your browser’s “view source” function to look at that page’s underlying HTML. There’s no metadata anywhere in the page. How well do you think this newspaper’s home page will rank in the search engine results, even in searches about Syracuse?

Click the site’s “News + Biz” button in the top navigation, and perform the same examination. There are metadata keywords this time, but only the most generic ones possible for a news site in a central New York city:

    <meta name="keywords" content="breaking news, syracuse, CNY, city, police, fire, crime, safety">

Why aren’t the keywords more specific? Is it because this page is dynamically generated and its publisher doesn’t believe the search engines will index it? The fact is many search engines do.

Moreover, click on any hyperlink to a specific story and examine the metadata on a specific story page. Respective pages containing stories about a murder, a poet, and a lacrosse game had no metadata about murder, poet, poetry, or lacrosse. How well do you think those long-lasting pages may rank whenever someone searches for stories about those topics on the site?
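The “view source” check described above can also be done with a short script. The following is only a rough sketch in Python, using the standard library’s `html.parser`. The function names `extract_keywords` and `missing_topics` are made up for illustration, and the sample page is a stand-in, not a copy of any real site’s HTML.

```python
from html.parser import HTMLParser


class MetaKeywordParser(HTMLParser):
    """Collects the comma-separated terms from <meta name="keywords"> tags."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # attrs is a list of (name, value) pairs; value may be None.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "keywords":
            for term in attr.get("content", "").split(","):
                term = term.strip()
                if term:
                    self.keywords.append(term)


def extract_keywords(html):
    """Return the keyword terms found in a page's HTML (hypothetical helper)."""
    parser = MetaKeywordParser()
    parser.feed(html)
    return parser.keywords


def missing_topics(html, topics):
    """Return the story topics that appear nowhere in the page's keywords."""
    found = " ".join(extract_keywords(html)).lower()
    return [topic for topic in topics if topic.lower() not in found]


# Stand-in for a story page that carries only generic, site-wide keywords.
sample_page = (
    '<html><head><meta name="keywords" '
    'content="breaking news, syracuse, CNY"></head><body></body></html>'
)
print(missing_topics(sample_page, ["lacrosse", "Syracuse"]))
```

A story page about a lacrosse game that passes this kind of check would list “lacrosse” among its keywords; the stand-in page above does not.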

The next site my students named was “The New York Times.” Unlike Syracuse.com, the NYTimes.com home page had metadata. These were its keywords:

    <meta name="keywords" content="New York Times, international news, daily newspaper, national, politics, science, business, your money, AP breaking news, business technology, technology, Cybertimes, circuits, new york times, navigator, sports, weather, editorial, Op-Ed, arts and leisure, film, movie reviews, theater, stock quotes, arts, classified ads, automobiles, books, crossword puzzle, job market, help wanted, careers, real estate listings, travel, web glossary, new york region, Navigator, cybertimes, op-ed, job listings, forums, business connections, theatre reviews, auto classifieds, newspaper archives, travel forecasts, NY Yankees, Mets, Giants, Jets, boxing, pro football scores, major league baseball, college basketball, Knicks, Rangers, Islanders, college football, sports commentary, fashion and style, Hockey, tennis, major league soccer, global issues, associated press, regional news coverage, quick news, women’s health, obituaries, stock quotes, charts, market indexes, sports update, politics, science, political news, science times">

Using “New York Times” and “Cybertimes” as keywords is certainly a good idea. I can also understand this newspaper including the keywords “international news,” “new york region,” and perhaps “Yankees,” “Mets,” “Giants,” “Jets,” “Knicks,” “Rangers,” and “Islanders,” even in those teams’ off-season. But keywords such as “daily newspaper,” “national,” “politics,” “arts,” and “careers” are so generic as to be useless.

At that moment, the home page featured stories about lethal injection, Pope Benedict, and Virginia Tech. Each of those specific and long-lasting story pages featured blank keyword metadata:

    <meta name="keywords" content="">

BBC News’s home page didn’t do much better:

    <meta name="keywords" content="BBC, News, BBC News, news online, world, uk, international, foreign, british, online, service">

Those generic metadata keywords were also in every one of BBC News’s specific story pages. Or try Folio, the trade journal of the magazine industry. “CurtCo Relocates Art Magazine to New York” was the page’s featured story, but these were its keywords:

    <meta name="keywords" content="magazine industry, magazine publishing, publishing, magazines, circulation, emedia, magazine careers, magazine jobs, publishing news, b2b media, consumer media, audience development, consumer publishing, b2b magazine">

I realize many publishers and broadcasters who read this column will protest that their sites’ CMSes generate Web pages dynamically and don’t include appropriate metadata keywords when they do. Whether they blame the problem on their CMSes or not, publishers and broadcasters must include metadata that specifically matches each page’s contents. Overly generic metadata is almost as useless as no metadata at all.
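To illustrate how little a dynamically generated page needs in order to carry specific keywords, here is a rough sketch in Python. It assumes the CMS already stores a few topic terms with each story. The function `build_keywords_tag`, its parameters, and the sample terms are all hypothetical and aren’t taken from any particular CMS.

```python
from html import escape


def build_keywords_tag(story_terms, site_terms=(), limit=15):
    """Build a keywords meta tag that lists story-specific terms first.

    story_terms: terms an editor or the CMS attached to this one story.
    site_terms:  a short list of site-wide terms, appended afterward.
    limit:       an arbitrary cap on the number of terms (an assumption).
    """
    seen = set()
    ordered = []
    for term in list(story_terms) + list(site_terms):
        cleaned = " ".join(term.split())  # collapse stray whitespace
        key = cleaned.lower()
        if cleaned and key not in seen:   # skip blanks and duplicates
            seen.add(key)
            ordered.append(cleaned)
    content = ", ".join(ordered[:limit])
    # Escape quotes and angle brackets so the attribute stays valid HTML.
    return '<meta name="keywords" content="{}">'.format(escape(content, quote=True))


# Hypothetical terms for a story about a lacrosse game.
print(build_keywords_tag(
    ["lacrosse", "Syracuse University", "lacrosse"],
    ["Syracuse", "CNY"],
))
# prints: <meta name="keywords" content="lacrosse, Syracuse University, Syracuse, CNY">
```

Putting the story’s own terms ahead of the site-wide ones means that any cap on length trims the generic keywords first.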

The more specific the metadata, the more likely that page will be found and the more distribution the page will have online.

Moreover, aggregation sites and search engine sites are increasingly using specific metadata when they suggest other stories, images, or video clips that are similar to searches. That’s why specific metadata is half of serendipity online.

Online retailers long ago discovered that including specific metadata greatly increases their reach, traffic, and results. It’s time that online information companies discovered that, too.
