The technical complexity of Web sites has increased significantly since the early days of brochureware sites. Those sites were built mainly in static HTML, and collecting traffic data from log files was an easy task.
These days, however, many Web sites provide dynamic content, which is stored in databases; moreover, sites are often integrated with other corporate applications, and those of external service providers, for transaction processing and content publishing.
This complexity exposes serious limitations in the traffic-analysis solutions available in the marketplace. Two areas that commercially available solutions do not handle well are:
- Capturing traffic data for dynamic content
- Clickstream analysis
Capturing Traffic Data for Dynamic Content
Log-file-based analysis fails to capture the exact content the user is looking at if the content is stored in a database or sourced externally. Log files record only the page the user is looking at; they do not record the unique identifier of the content displayed.
A similar situation arises when your Web site publishes dynamic content directly from an external source.
With log-file-based analysis, you would know that a user is looking at external content, but you wouldn’t be able to identify what exactly they are looking at.
This situation worsens if you break the page into multiple dynamic-content display areas.
If your Web site offers dynamic content and you want to know in detail what interests your online audience, traffic analysis based solely on log files is insufficient. You can overcome this limitation with a simple script included in your Web pages.
This script should identify the user, capture the page and content identifiers, and timestamp them as the user navigates through your site. For internal content, you can use the unique record identifiers from the database. Combining this information with the results of your log-file-based analysis shows in detail what interests your online audience.
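The capture step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `ContentView`, `ViewLog`, and `record`, the in-memory list used for storage, and the sample identifiers are all hypothetical, and a real site would write these events to a database from whatever scripting environment its pages use.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ContentView:
    """One content item shown to one user on one page (hypothetical record layout)."""
    user_id: str
    page: str
    content_id: str   # e.g. the unique record identifier from the content database
    source: str       # "internal" or "external"
    viewed_at: datetime


class ViewLog:
    """In-memory stand-in for the table where view events would be stored."""

    def __init__(self):
        self.events = []

    def record(self, user_id, page, content_ids, source="internal", now=None):
        # A page split into several dynamic display areas passes one
        # identifier per area, so every displayed item gets its own event.
        timestamp = now or datetime.now(timezone.utc)
        for content_id in content_ids:
            self.events.append(
                ContentView(user_id, page, content_id, source, timestamp)
            )

    def views_by_content(self):
        # Count how often each content item was displayed.
        counts = {}
        for event in self.events:
            counts[event.content_id] = counts.get(event.content_id, 0) + 1
        return counts


log = ViewLog()
log.record("user-1", "/article", ["article-17", "promo-3"])
log.record("user-2", "/article", ["article-17"])
print(log.views_by_content())
```

Because each event carries the content identifier and not just the page, two users who request the same page address can still be distinguished by what they were actually shown.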
Personalization is gaining importance. Collecting this information would also help you to offer personalized content and customized navigation to your online audience as their navigation patterns and interests become known to the site.
Analyzing Clickstream Data
Clickstream data can be defined as the pattern of a visitor’s navigation through a site.
Log files capture this information, keyed by the visitor's Internet Protocol (IP) address. However, most of the solutions that I have come across do not offer any clickstream analysis.
This makes your analysis one-dimensional. In other words, you know someone has looked at a particular page, but you are not able to see what else that same person has looked at.
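To make the idea concrete, here is a minimal Python sketch of turning flat log entries into per-visitor clickstreams. It assumes the log has already been parsed into `(ip, timestamp, path)` tuples; the function name `build_clickstreams` and the sample values are hypothetical, and treating one IP address as one visitor is a simplification.

```python
from collections import defaultdict


def build_clickstreams(entries):
    """Group parsed log entries by IP address, in time order.

    Each entry is an (ip, timestamp, path) tuple. Timestamps here are
    ISO 8601 strings in one time zone, so sorting them as text also
    sorts them chronologically.
    """
    streams = defaultdict(list)
    for ip, _timestamp, path in sorted(entries, key=lambda entry: entry[1]):
        streams[ip].append(path)
    return dict(streams)


entries = [
    ("10.0.0.1", "2001-05-01T10:00:05", "/products"),
    ("10.0.0.2", "2001-05-01T10:00:01", "/home"),
    ("10.0.0.1", "2001-05-01T10:00:00", "/home"),
    ("10.0.0.1", "2001-05-01T10:01:30", "/checkout"),
]
for ip, pages in build_clickstreams(entries).items():
    print(ip, " -> ".join(pages))
```

A page-by-page report would only show that `/home` was requested twice; the grouped view shows which visitor went on to `/checkout`.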
What does this mean in real life?
A good example is an affiliate program that you use to generate traffic to your site. Without clickstream data, you can tell how much traffic each affiliate brings to your site, but you cannot delve into details such as the number of visits that end successfully (for example, in a purchase) or the navigation patterns and interests of the visitors sent by a particular affiliate.
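The affiliate example can be sketched as follows. The sketch assumes each visit has already been reduced to the referring affiliate plus the list of pages viewed; the function `affiliate_summary`, the dictionary keys, and the choice of `/order-complete` as the page that marks a successful visit are all hypothetical.

```python
def affiliate_summary(visits, goal_page):
    """Count total visits and visits that reached goal_page, per affiliate."""
    summary = {}
    for visit in visits:
        stats = summary.setdefault(
            visit["affiliate"], {"visits": 0, "goal_visits": 0}
        )
        stats["visits"] += 1
        # A visit counts as successful if the goal page appears
        # anywhere in its clickstream.
        if goal_page in visit["pages"]:
            stats["goal_visits"] += 1
    return summary


visits = [
    {"affiliate": "partner-a", "pages": ["/home", "/products", "/order-complete"]},
    {"affiliate": "partner-a", "pages": ["/home"]},
    {"affiliate": "partner-b", "pages": ["/home", "/products"]},
]
for name, stats in affiliate_summary(visits, "/order-complete").items():
    print(name, stats)
```

Traffic counts alone would rank the two affiliates by volume; the second figure shows which one sends visitors who complete an order.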
As noted above, clickstream data is captured in the log file, and the script recommended for dynamic content can also be extended to collect it. To analyze this data, you need a Web site analysis tool that offers multidimensional analysis.
We are starting to see new solutions emerging in this field. However, if you choose to collect the data yourself as part of your Web application, any OLAP (online analytical processing) tool supporting multidimensional analysis would do the job.
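As a rough illustration of what multidimensional analysis means here, the Python sketch below counts the same set of view events along whichever dimensions are requested. It is a toy stand-in for what an OLAP tool does at scale, not a substitute for one; the function `rollup`, the field names, and the sample events are hypothetical.

```python
from collections import Counter


def rollup(events, dimensions):
    """Count events grouped by the given dimension names.

    Each event is a dict; the result maps a tuple of dimension values
    to the number of matching events.
    """
    return Counter(tuple(event[d] for d in dimensions) for event in events)


events = [
    {"affiliate": "partner-a", "content_id": "article-17", "day": "Mon"},
    {"affiliate": "partner-a", "content_id": "article-42", "day": "Mon"},
    {"affiliate": "partner-b", "content_id": "article-17", "day": "Tue"},
]

# One dimension: how much traffic each affiliate brings.
print(rollup(events, ["affiliate"]))
# Two dimensions: which content each affiliate's visitors look at.
print(rollup(events, ["affiliate", "content_id"]))
```

The same events answer different questions depending on the dimensions chosen, which is the capability a one-dimensional page-hit report lacks.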