The Importance of Data Lifecycles for Chinese Websites

China’s fast-growing Internet has created opportunities not only in search engine marketing and search engine optimization for websites, but also in data lifecycles. Every decision for your website can be data-driven, if you know where and how to collect the data your online business needs.

Google’s Data Lifecycle

Everyone knows Google is a search engine. But what Google has really established is a data lifecycle between webmasters/advertisers and Google itself. Consider two processes:

  • Google collects website data from all over the world
  • Google feeds bulk data back to all webmasters/advertisers

How Does Google Collect Data?

Process 1: Google collects massive amounts of website data through two major methods: Google Analytics and Googlebot.

Google Analytics:

  • First of all, Google Analytics can easily be set up on your website by inserting a piece of JavaScript code, provided by Google, on every page of your site.
  • Once the code is on your site, Google Analytics starts collecting traffic data for your site. Every piece of data collected is stored in Google’s data centers.


Googlebot:

  • Google has several crawlers working tirelessly and simultaneously, and Googlebot is one of the major ones.
  • Googlebot accesses a website, moves through the website’s internal links, and records all the web pages that it has visited.
  • Googlebot then takes all the page information it has collected back to Google’s many data centers.
  • Finally, Google decides which web pages should be placed in its index and how they should be ranked in organic search results, according to Google’s indexing and ranking factors.
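
The link-following step described above can be pictured with a small sketch. It is not Googlebot’s actual implementation: the site map `SITE_LINKS` and the function `crawl` are hypothetical names, and the "website" is just an in-memory dictionary, so nothing is fetched over the network.

```python
from collections import deque

# Hypothetical site: each page maps to the internal links found on it.
SITE_LINKS = {
    "/": ["/products", "/about"],
    "/products": ["/products/widget", "/"],
    "/about": ["/"],
    "/products/widget": [],
    "/orphan": ["/"],  # no other page links here
}


def crawl(start, links):
    """Walk internal links breadth-first and return pages in visit order."""
    visited = []
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        visited.append(page)  # "record" the page that was visited
        for target in links.get(page, []):
            if target not in seen:  # never queue the same page twice
                seen.add(target)
                queue.append(target)
    return visited


if __name__ == "__main__":
    print(crawl("/", SITE_LINKS))
```

In this toy site, `/orphan` is never reached because no page links to it, which is one reason internal linking matters to anyone working on search engine optimization.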

After data collection, Google reorganizes the data into more meaningful and useful forms, i.e., reports, before showing them to webmasters and online advertisers.

How Does Google Share Website Data?

Process 2: Google shares website data back to webmasters and advertisers through many tools/systems, but most notably through:

  • Google Analytics – Google gives you a close-to-full view of your website’s traffic sources, which may include search engines (e.g., Google, Baidu, Bing, etc.), referral sites (e.g., BBC News, Clickz.Asia, etc.), or social network/microblogging sites (e.g., Facebook, Twitter, etc.). Through Google Analytics, you can also see visitors’ locations (e.g., countries and cities), browser types (e.g., Firefox, Chrome, Internet Explorer), and high-level visitor behavior (e.g., new visitors vs. repeat visitors, loyalty of visitors, etc.).
  • Google AdWords – If you buy search ads/display ads through Google AdWords, you can always log in to your AdWords account, get the reports, review the performance of your ads, and adjust your spending accordingly.
  • Google Webmaster Tools – After you verify to Google that you are in fact the owner of your site, Google gives you access to your site’s technical issues as seen from Googlebot’s perspective (this is mostly for those who work on the search engine optimization of your website).
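
To make the traffic-source breakdown above concrete, here is a minimal sketch of how a visit’s referrer could be sorted into search, social, or referral traffic. It is an illustration only, not how Google Analytics works internally: the host lists, the function names `classify_source` and `summarize`, and the extra "direct" bucket for visits without a referrer are all assumptions.

```python
from collections import Counter
from urllib.parse import urlparse

# Illustrative host lists only; a real analytics tool keeps far larger ones.
SEARCH_ENGINES = {"google.com", "baidu.com", "bing.com"}
SOCIAL_SITES = {"facebook.com", "twitter.com"}


def classify_source(referrer):
    """Return 'direct', 'search', 'social', or 'referral' for a referrer URL."""
    if not referrer:
        return "direct"  # assumption: no referrer means a direct visit
    host = (urlparse(referrer).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host in SEARCH_ENGINES:
        return "search"
    if host in SOCIAL_SITES:
        return "social"
    return "referral"


def summarize(referrers):
    """Count visits per traffic source."""
    return Counter(classify_source(r) for r in referrers)


if __name__ == "__main__":
    visits = [
        "https://www.baidu.com/s?wd=seo",
        "https://twitter.com/some_user",
        "http://www.bbc.co.uk/news",
        "",
    ]
    print(summarize(visits))
```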

Google? No, It Should Be Baidu in China!

The above shows how Google collects and shares website data. But in China, Google’s search share is way behind that of China’s leading search engine, Baidu. So the bulk data that matters most sits with Baidu and should be shared by Baidu with all Chinese webmasters and advertisers. However, is this happening, and if not, why not?

Baidu Analytics: Baidu released the beta version of Baidu Analytics (China’s equivalent of Google Analytics) only in late 2009, and only to selected Baidu SEM advertisers. In 2010, Baidu Analytics was opened to the public.

Baidu’s Webmaster Tools: Baidu released its own webmaster tools in 2010. They should be the equivalent of Google’s Webmaster Tools, but in reality they have never been up to par because of many shortcomings in functionality.

  • It’s not yet Baidu’s intention to share much data with webmasters; the webmaster tools are little more than a place for website URL submission (in XML format). Baidu has never been ready to share much, or any, data with anyone. Look at how Baidu is building a Chinese Internet empire. In Baidu’s philosophy (similar to that of many large Chinese Internet companies), competing is preferred over collaborating.
  • Baidu’s intention may be to use the collected data for its own benefit: it found out through its search data that the online travel industry presented an opportunity, and acquired Qunar, an online travel search engine, to head right into the travel industry. In the short term, this strategy proves to be a better option than sharing all this data with Chinese webmasters and advertisers. Baidu could easily do this again and again in many other industries.
  • By sharing less with Baidu SEM advertisers, Baidu keeps most of the less data-driven advertisers unaware that much of their spending on Baidu SEM may actually have been wasted.
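
For readers who have not seen a URL submission file, the sketch below builds a small XML list of URLs. It assumes the widely used sitemaps.org format; the article does not say which XML schema Baidu expects, so treat the namespace and the hypothetical `build_sitemap` helper as an illustration rather than Baidu’s specification.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (an assumption for Baidu).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return an XML string listing each URL in a <url><loc> entry."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # special characters are escaped
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap([
        "http://www.example.com/",
        "http://www.example.com/products",
    ]))
```

A file like this only tells the search engine which pages exist. It returns no data to the webmaster, which is the limitation described above.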

Baidu Phoenix Nest: Though Baidu’s SEM platform has made more and more reports available to Baidu SEM advertisers over the years, it is Baidu’s keyword bidding algorithm and offline negotiation that hinder most advertisers.

  • Baidu’s SEM bidding algorithm is heavily tilted toward the bid prices that advertisers set for their keywords, and places much less weight on other ad ranking factors such as keyword quality score.
  • Negotiating with advertisers is part of Baidu’s culture. The more negotiation Baidu gets into, the better its chance of getting advertisers to pay more. Every spot on any Baidu network site has a price, which is acceptable, and Baidu makes the most of it by negotiating ad spot prices with advertisers (it has a large sales team). The problem is the quality of the traffic advertisers get from these so-called Baidu networks.
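
The effect of tilting an ad ranking toward bid price can be shown with a toy calculation. The scoring formula below is purely hypothetical; neither Baidu’s nor Google’s real ranking formula is public, and `rank_ads`, `bid_weight`, and the sample ads are invented for illustration.

```python
def rank_ads(ads, bid_weight):
    """Rank ads by a hypothetical score: bid**w * quality**(1 - w).

    ads is a list of (name, bid, quality_score) tuples.
    bid_weight (w) near 1.0 means bid price dominates the ranking;
    0.5 weighs bid and quality equally.
    """
    def score(ad):
        _, bid, quality = ad
        return (bid ** bid_weight) * (quality ** (1 - bid_weight))

    return [name for name, _, _ in sorted(ads, key=score, reverse=True)]


# Two invented advertisers competing for the same keyword.
ADS = [
    ("high_bid_low_quality", 10.0, 2.0),
    ("low_bid_high_quality", 4.0, 9.0),
]

if __name__ == "__main__":
    print(rank_ads(ADS, bid_weight=0.5))  # balanced ranking
    print(rank_ads(ADS, bid_weight=0.9))  # bid-heavy ranking
```

With equal weighting the higher-quality ad ranks first; once the weight shifts to 0.9, the higher bid wins despite its low quality score.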

Without adequate data being collected from all Chinese websites in the first place and shared back with all webmasters and advertisers, this data lifecycle still has a long way to go before it is complete in China’s Internet world.
