Five ways to maintain data quality in your analytics
A data-driven strategy is an essential part of any marketing role, making data quality a top priority for senior marketers. But how can you ensure your data is clean and accurate?
A recent report by AT Internet explored the 5 key dimensions for data quality in digital analytics. Here are some key takeaways from the report, as well as some things marketers can do to keep their data quality high.
This content was produced in association with AT Internet
According to Incapsula’s 2016 Bot Traffic Report, more than 50% of the traffic on the web can be attributed to bots.
[Chart: share of web traffic attributed to bots. Image courtesy of Incapsula]
This traffic can be broken down into ‘good’ and ‘bad’ bots.
‘Bad’ bots are most likely to be ‘impersonators’ that assume a fake identity in order to bypass website security. The more nefarious can execute Distributed Denial of Service (DDoS) attacks against the sites they hit. Impersonator bots accounted for 24% of total internet traffic in 2016, with another 1.7% contributed by web scrapers.
Bot traffic of this proportion has two effects that marketers should be aware of. One, it artificially inflates traffic volumes (so your site looks like it’s getting more traffic than it is), and two, it brings conversion rate metrics down (so your campaigns look less effective than they are).
Stripping out this traffic is essential for accurate benchmarking. Without ‘clean’ data, it’s significantly harder to make informed decisions about strategy.
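To make the two effects concrete, here is a minimal worked example in Python. All of the figures are hypothetical and chosen only for illustration, and the sketch assumes that bot visits never convert.

```python
def conversion_rate(conversions: int, visits: int) -> float:
    """Return conversions as a fraction of visits (0.0 if there are no visits)."""
    return conversions / visits if visits else 0.0


# Hypothetical figures, not taken from any report.
total_visits = 10_000  # every visit recorded by the analytics tool
bot_visits = 5_000     # visits identified as bot traffic
conversions = 200      # assumption: all conversions come from human visitors

reported_rate = conversion_rate(conversions, total_visits)
clean_rate = conversion_rate(conversions, total_visits - bot_visits)

print(f"Reported conversion rate: {reported_rate:.1%}")
print(f"Conversion rate with bots removed: {clean_rate:.1%}")
```

With these numbers, the reported rate is 2.0% while the rate among human visitors is 4.0%: the same campaign looks half as effective when bot visits are left in the denominator.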
During site updates and changes to mobile apps, maintaining the integrity of analytics tags is essential to collecting good data – particularly on sites with a high number of pages, such as publishers or online retailers who frequently add and modify pages.
Although errors can be difficult to detect, they’re critical to identify and correct in order to ensure the accuracy of reports.
Missing, duplicated or incorrect tags can impact campaign measurement – leading to erroneous conclusions about how effective certain campaigns are. Event-specific sites are often prone to missing tags, as teams are frequently under intense time pressure before launch, which can lead to technical oversights.
Unfortunately, these can also be the costliest mistakes to make, as the event – such as a TV ad or conference – often represents a significant investment by the company.
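A simple pre-launch check can catch the most common problems. The sketch below is one possible approach, not a feature of any particular analytics tool: it assumes you already have a crawl of your pages listing the tag IDs found on each one. The `crawl` data, the `audit_tags` function and the tag name `analytics` are all hypothetical.

```python
from collections import Counter


def audit_tags(pages: dict[str, list[str]], expected_tag: str) -> dict[str, list[str]]:
    """Group pages by whether the expected tag is missing or appears more than once."""
    report = {"missing": [], "duplicated": []}
    for url, tags in pages.items():
        occurrences = Counter(tags)[expected_tag]
        if occurrences == 0:
            report["missing"].append(url)
        elif occurrences > 1:
            report["duplicated"].append(url)
    return report


# Hypothetical crawl output: page URL -> tag IDs found on that page.
crawl = {
    "/home": ["analytics"],
    "/event-landing": [],
    "/checkout": ["analytics", "analytics"],
}

report = audit_tags(crawl, "analytics")
print(report)
```

Here the event landing page would be flagged as missing its tag and the checkout page as firing it twice. Note that this check only covers missing and duplicated tags; a tag that is present but misconfigured would need a separate check.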
Using numeric strings (category IDs, SKUs) in URLs can seem like a win over long, unwieldy strings of plain text. But while this may be practical when capturing data, it can cause issues when analysing it. Intelligible text values are a big help in understanding where data has come from and which strings can be consolidated.
Keeping text values consistent is also important. A common inconsistency is in language parameters, where the same values are often written in different ways – such as using ‘EN’ and ‘English’ both to represent text in English.
In this example, each value would appear in a separate row in a report, and the rows would need to be consolidated manually by an analyst.
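Where inconsistent values have already been collected, the consolidation can be scripted rather than done by hand. The following is a minimal sketch, assuming a hand-maintained lookup of known variants; the `LANGUAGE_ALIASES` mapping and the sample rows are hypothetical.

```python
from collections import Counter

# Hypothetical lookup: each known variant (lower-cased) -> one canonical code.
LANGUAGE_ALIASES = {
    "en": "EN",
    "english": "EN",
    "fr": "FR",
    "french": "FR",
}


def normalise_language(value: str) -> str:
    """Map a raw language value to its canonical code; unknown values pass through."""
    cleaned = value.strip()
    return LANGUAGE_ALIASES.get(cleaned.lower(), cleaned)


# Hypothetical raw values as they might appear in a report export.
rows = ["EN", "English", "en ", "FR", "french"]

counts = Counter(normalise_language(row) for row in rows)
print(counts)
```

The five raw values collapse into two rows, one for EN and one for FR. Unknown values are left unchanged rather than guessed at, so an analyst can still spot new variants and add them to the lookup.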
Using a host of tools can be problematic for data collection and analysis. Different systems can use unique definitions and calculations for the same dimensions and metrics. For example, different analytics tools may attribute traffic sources differently depending on whether a campaign is running or not.
One common issue is cross-device measurement. A user who visits a site on their phone on the way into work and then again on desktop when they get to work might be counted as two different users.
Using a single tool that has the capacity to measure logged-in behavior across devices and platforms is an effective solution – saving you the hassle of manual reconciliations and deduplications.
Top-end digital intelligence providers can give users insight into visitor behavior in real time. This enables teams to get instant feedback on time-specific campaigns and respond to issues, such as 404 errors and mobile app crashes, as they happen.
Another use-case is during a breaking news event, where a media site might track the performance of individual articles in real time, providing a data-driven insight into what kind of content users are most interested in.
To find out more about preserving your data quality, download AT Internet’s full report: Data Quality in Digital Analytics: The 5 Key Dimensions.
This article was produced in collaboration with AT Internet. Click here to read ClickZ’s collaborative content guidelines.