Extract, Transform, Load—or ETL—is a relatively new concept for marketers. It’s the way data becomes information, through a process of:
1. extracting data from outside sources,
2. transforming it to fit operational needs, and
3. loading transformed data into a data warehouse or other system.
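The three steps above can be sketched in a few lines of Python. This is only an illustration: the sample export, the field names, and the `ad_performance` table are made up for the example, and an in-memory SQLite database stands in for a real data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical export from an ad tool; real exports differ by tool.
RAW_EXPORT = """Campaign,Clicks,Cost
Spring Sale,120,$45.50
Brand Search,300,$110.00
"""

def extract(text):
    """Extract: read rows from an outside source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: normalize names and convert strings to numbers."""
    return [
        (
            row["Campaign"].strip().lower(),
            int(row["Clicks"]),
            float(row["Cost"].lstrip("$")),
        )
        for row in rows
    ]

def load(records, conn):
    """Load: write the cleaned records into a warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ad_performance "
        "(campaign TEXT, clicks INTEGER, cost REAL)"
    )
    conn.executemany("INSERT INTO ad_performance VALUES (?, ?, ?)", records)
    conn.commit()

conn = sqlite3.connect(":memory:")  # stand-in for a real data warehouse
load(transform(extract(RAW_EXPORT)), conn)
total_clicks = conn.execute("SELECT SUM(clicks) FROM ad_performance").fetchone()[0]
print(total_clicks)
```

Keeping each step in its own function means a new source only needs a new `extract` and `transform`; the load step stays the same.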
ETL often involves gathering disparate data sets from multiple applications so they can relate to each other in meaningful reporting contexts. IT departments have been doing it for as long as sophisticated data warehouses have been around. To create a cost accounting system for the CFO, for instance, they might integrate data from payroll, sales and purchasing, and they routinely apply an ETL process to get it done.
But now, the explosion of digital media has forced marketing departments to understand the ETL process as well. We pull activity and performance data from tools like Google Analytics, DART, AdWords, and Twitter and slap it together with spend data from invoices or budget spreadsheets. Then we toss in some outcome data like leads, conversions, sales or revenue and hope we get some insight into what’s working. When we finally see the results from all this, it becomes clear that our manual ETL process is much too time-consuming at best, exceedingly painful at worst, and far too costly.
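To make the "slap it together" step concrete, here is a rough sketch of combining activity, spend, and outcome data by channel. All figures, channel keys, and the `combine` function are invented for illustration; in practice each input would come from a different tool or spreadsheet.

```python
# Hypothetical figures keyed by channel; each would come from a different source.
activity = {"adwords": {"clicks": 1200}, "twitter": {"clicks": 400}}
spend = {"adwords": 600.0, "twitter": 100.0}
leads = {"adwords": 30, "twitter": 4}

def combine(activity, spend, leads):
    """Join the three data sets on channel and derive cost per lead."""
    report = {}
    # Use the union of channels so a source missing from one data set
    # still shows up in the report instead of silently disappearing.
    for channel in sorted(set(activity) | set(spend) | set(leads)):
        clicks = activity.get(channel, {}).get("clicks", 0)
        cost = spend.get(channel, 0.0)
        lead_count = leads.get(channel, 0)
        report[channel] = {
            "clicks": clicks,
            "spend": cost,
            "leads": lead_count,
            # None marks "no leads yet" rather than dividing by zero.
            "cost_per_lead": cost / lead_count if lead_count else None,
        }
    return report

report = combine(activity, spend, leads)
for channel, row in report.items():
    print(channel, row)
```

The join only works because all three data sets happen to share the same channel names, which is exactly what manual processes struggle to guarantee.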
For marketers, ETL remains a murky (perhaps even scary) topic, but it’s something we have to understand—and fast. Data management is now mission critical to the marketing function and we can only expect it to become even more so. Think about what social media channels looked like just five years ago compared to today. Mobile, as a viable marketing channel, barely existed until just recently. The ever-changing digital landscape is a veritable guarantee that marketers will always face more and more data to extract, transform, and load. Which means marketers must automate the ETL process as quickly as possible.
Because here’s what’s especially challenging about ETL for marketers:
1. Extraction is hard.
Each digital marketing tool we use provides us with data in its own unique format. Offline data, especially, can be difficult to gather as it’s rarely provided in any standardized way. More often, it comes to us in ad-hoc slides or spreadsheets from one or more agencies. Even if we currently have some sort of semi-comprehensive homegrown data warehouse or customer database, getting data out of it can still be a challenge. Rarely are these custom systems designed with data extraction in mind.
2. Extraction is expensive.
Because of the huge variety of data sources marketers deal with, the cost of extraction can be prohibitive. Without proper time and resources allocated to the task, it either doesn’t get done at all, or it gets done piecemeal.
3. Extraction is often incomplete.
Too often, marketing data that’s too difficult to pull simply gets left out. Our BI or Customer Intelligence unit might readily pull marketing data from an easy source—say, Salesforce—but will leave out the hard stuff like those ad-hoc agency-provided offline reports in PowerPoint. In the end, our picture of what’s working and what’s not is too often full of holes.
4. Transformation is cumbersome.
Transforming marketing data means enhancing, evolving or otherwise changing our data sets in some way so they’ll relate to one another and our reporting will make sense. Concepts that are important to the business for reporting purposes are often missing from the data entirely. One way to resolve this is to add common tags, or fields, into each of our data sets. For instance, if we want to track conversion data by region, we must add a field to our AdWords and DART data and populate it with the appropriate value (e.g. [UK], [US], [APAC]). But many systems are inflexible, and so we’re forced to resort to arcane naming conventions and other manual processes that are tedious and error-prone.
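As a sketch of the tagging idea, the snippet below adds a region field to rows from two sources and then totals conversions by region. The lookup table, campaign names, and row layout are hypothetical; real AdWords or DART exports will look different.

```python
# Hypothetical lookup maintained by the marketing team: campaign name -> region.
CAMPAIGN_REGION = {
    "uk_spring_sale": "UK",
    "us_brand_search": "US",
    "apac_launch": "APAC",
}

def tag_region(rows, lookup, default="UNKNOWN"):
    """Return copies of the rows with a 'region' field added."""
    return [{**row, "region": lookup.get(row["campaign"], default)} for row in rows]

# Made-up rows standing in for exports from two different tools.
adwords_rows = [
    {"campaign": "uk_spring_sale", "conversions": 12},
    {"campaign": "us_brand_search", "conversions": 30},
]
dart_rows = [
    {"campaign": "apac_launch", "conversions": 7},
    {"campaign": "untitled_test", "conversions": 1},
]

# Once both data sets carry the same tag, they can be reported together.
tagged = tag_region(adwords_rows, CAMPAIGN_REGION) + tag_region(dart_rows, CAMPAIGN_REGION)

conversions_by_region = {}
for row in tagged:
    region = row["region"]
    conversions_by_region[region] = conversions_by_region.get(region, 0) + row["conversions"]

print(conversions_by_region)
```

Campaigns missing from the lookup land in an explicit `UNKNOWN` bucket, so gaps in tagging are visible in the report rather than hidden.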
The challenge with transforming today’s marketing data is that the process must be done time and time again over many channels. Further, if the business needs to report results by product, line of business, or customer segment, each of those concepts must exist as a tag, or field, in each data set and be populated accordingly. Here, the hours add up very quickly as one, two, or three people are tasked with an enormous amount of cut, copy and paste. These activities keep marketers from doing the more interesting work we love, like gathering insights, testing our ideas and making decisions about what should change.
5. Transformation isn’t possible in native tools.
The majority of digital tools we use every day (things like Marketo, Google AdWords or TweetDeck) were designed for execution. If they offer any kind of reporting or analytics at all, it’s usually very limited. Because they were not designed from the ground up to interact or integrate with other reporting tools, these applications rarely make it easy to add the tags that make data transformation possible.
6. Loading is irregular.
Because we can’t easily extract data from its native source or readily transform it, the frequency with which we load our data into a central repository or data warehouse is often inconsistent. If some data sets are refreshed weekly while others are refreshed only quarterly, our data warehouse has one more variable to deal with, one more way the data can lack integrity.
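One simple guard against irregular loading is a freshness check that flags data sets not refreshed within their expected window. The sketch below assumes a hypothetical load log (data set names, dates, and intervals are invented) and is not tied to any particular warehouse.

```python
from datetime import date

# Hypothetical load log: data set -> (date of last load, expected refresh interval in days)
LOAD_LOG = {
    "web_analytics": (date(2024, 3, 25), 7),
    "paid_search": (date(2024, 3, 27), 7),
    "offline_agency_report": (date(2023, 12, 31), 7),
}

def stale_sources(load_log, today):
    """Return the names of data sets whose last load is older than allowed."""
    stale = []
    for name, (last_loaded, max_age_days) in load_log.items():
        if (today - last_loaded).days > max_age_days:
            stale.append(name)
    return sorted(stale)

stale = stale_sources(LOAD_LOG, today=date(2024, 3, 29))
print(stale)
```

Running a check like this before building reports makes it obvious when a weekly number is being compared against data that is months old.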
Without going through the ETL process, there is no way to look at all our marketing data together or compare anything side-by-side. We must associate our disparate data sets or we’re flying blind. But performing the ETL process manually is just not scalable (not to mention, very un-fun).
For marketing, a manual ETL process comes at a high opportunity cost. If our talented team members are busy cutting, copying, and pasting, they’re prevented from performing higher-value functions, like discovering what’s working and what’s not, and making the course corrections that move the business forward. Automating our ETL process makes marketing data more reliable and accurate, gives it more integrity, and leaves a better audit trail. If we’re really in the business of “marketing”, it’s time we got out of the cut-copy-paste wasteland and into an automated form of data transformation.