Big Data in Advertising: Cause and Effect

I have to admit that I’m excited that “big data” is all the rage in advertising. We’ve been analyzing big data for years on behalf of clients in an effort to improve the target segmentation of our marketing and communication efforts. We’ve also been using big data analytics to measure results as part of a cycle of continuous improvement. As anyone who works with data knows, one of the challenges in measuring results is determining cause and effect. It’s one of the fundamental problems in science: when two events take place, can we tell whether one directly causes the other, or whether the two follow a similar pattern for a different reason?

Separating correlation from causation is notoriously difficult. When scientists look for cause-and-effect relationships and can’t do controlled experiments, they often look for correlations. If lung cancer is more common in smokers than non-smokers, it suggests that smoking causes lung cancer. Once a correlation is found, the Granger causality test can sometimes be used to strengthen the case that there is a causal link at work. This can help differentiate it from cases where the apparent correlation is misleading. The test simply asks if one variable is useful for predicting another. It shows, for instance, that people who smoke are more likely to develop lung cancer, which suggests the two are causally linked.
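To make "useful for predicting" concrete, here is a minimal sketch of the idea behind a Granger-style test, written with only the Python standard library. It fits two small regressions on synthetic data: one predicts this period's value of a series from its own previous value, the other also adds the previous value of the candidate cause. If the second fit is much better, the F-statistic is large. The series names (spend, sales), the coefficients and the single one-period lag are all assumptions made up for illustration, not data or settings from any real campaign, and a real analysis would use a tested statistics package and choose the lag length carefully.

```python
import random


def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit (normal equations)."""
    k = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    c = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        c[col], c[pivot] = c[pivot], c[col]
        for row in range(col + 1, k):
            factor = a[row][col] / a[col][col]
            for j in range(col, k):
                a[row][j] -= factor * a[col][j]
            c[row] -= factor * c[col]
    # Back substitution.
    beta = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(a[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (c[i] - tail) / a[i][i]
    return sum(
        (t - sum(b * v for b, v in zip(beta, r))) ** 2
        for r, t in zip(rows, targets)
    )


def granger_f(cause, effect):
    """F-statistic: does lag-1 `cause` improve a lag-1 forecast of `effect`?"""
    y = effect[1:]
    restricted = [[1.0, effect[t - 1]] for t in range(1, len(effect))]
    full = [[1.0, effect[t - 1], cause[t - 1]] for t in range(1, len(effect))]
    rss_restricted = ols_rss(restricted, y)
    rss_full = ols_rss(full, y)
    dof = len(y) - 3  # three fitted parameters in the full model
    return (rss_restricted - rss_full) / (rss_full / dof)


# Synthetic, hypothetical data: last period's spend lifts this period's sales.
rng = random.Random(0)
spend = [rng.gauss(0, 1) for _ in range(300)]
sales = [0.0]
for t in range(1, 300):
    sales.append(0.5 * sales[t - 1] + 0.8 * spend[t - 1] + rng.gauss(0, 0.5))

f_forward = granger_f(spend, sales)  # does spend help predict sales?
f_reverse = granger_f(sales, spend)  # does sales help predict spend?
print(f"spend -> sales F = {f_forward:.1f}")
print(f"sales -> spend F = {f_reverse:.1f}")
```

Running the test in both directions is the point of the exercise: a large statistic one way and a small one the other way is what an analyst would read as evidence of direction, always with the caveat that "helps predict" is weaker than "causes".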

A new technique called convergent cross mapping tests for causation in time series data.

George Sugihara of Scripps Institution of Oceanography at UC San Diego and colleagues from around the world have developed a new approach to help ecologists distinguish true causal interactions from misleading correlations. Described in the paper “Detecting Causality in Complex Ecosystems,” published in the most recent online issue of the journal Science, the method extracts the “signature” left by causes embedded in ecological observations (historical records known as time series). The new mathematical approach deduces causes from their effects.

For example, their technique, called “convergent cross mapping,” was applied to time series records of anchovy and sardine populations to determine whether the species are interacting and responsible for each other’s ups and downs. Using a combination of data from the California Department of Fish and Game and CalCOFI (the Scripps-based California Cooperative Oceanic Fisheries Investigations program), one of the world’s longest marine monitoring efforts, the researchers confirmed that even though the booms and busts of sardines and anchovies move in lockstep in opposite directions, they were not affecting each other (a spurious correlation much like the one between the number of Republicans in the Senate and sunspots). However, the tool was able to uncover that sea surface temperatures, though not correlated with either fish population, are a hidden force driving changes for both stocks.

“The major novelty of this method is that it is based on a dynamic and interconnected view of nature,” said Sugihara, the McQuown Chair Distinguished Professor of Natural Science at Scripps. “Ice cream sales and rates of violent crime might rise and fall at the same time, but our method is able to determine whether this is due to cause-and-effect, or whether both are simply more common during hot summer months.”
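The ice cream example is easy to reproduce with made-up numbers. The sketch below is not convergent cross mapping. It uses a much simpler, swapped-in technique (correlating what is left over after a straight-line fit on the shared driver) that only works when the hidden driver is actually measured and the relationships are roughly linear. All three series and their coefficients are invented for illustration.

```python
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def residuals(y, x):
    """What is left of y after removing a straight-line fit on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    return [b - (mean_y + slope * (a - mean_x)) for a, b in zip(x, y)]


# Hypothetical data: temperature drives both series; neither drives the other.
rng = random.Random(1)
temperature = [rng.uniform(0, 35) for _ in range(500)]
ice_cream = [10 + 2.0 * t + rng.gauss(0, 8) for t in temperature]
crime = [50 + 1.5 * t + rng.gauss(0, 8) for t in temperature]

raw = pearson(ice_cream, crime)
adjusted = pearson(
    residuals(ice_cream, temperature), residuals(crime, temperature)
)
print(f"raw correlation:               {raw:.2f}")
print(f"after removing temperature:    {adjusted:.2f}")
```

The raw correlation is strong even though the simulation contains no link between the two series, and it largely disappears once temperature is accounted for. The harder problem, and the one the researchers say their method addresses, is spotting this when the driver is not so obvious.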

The new tool is distinct from methods developed by UC San Diego economists Clive Granger and Robert Engle for financial and economic data, which earned them the Nobel Prize in Economic Sciences. Granger’s technique is aimed at purely random systems rather than those having rules governing how the parts move. Sugihara and his colleagues developed their tool specifically for complex ecosystem analysis, yet its applications could have far-ranging implications across multiple areas of science. For example, “one could imagine using it with epidemiological data to see if different diseases interact with each other or have environmental causes,” said Sugihara.

Sugihara and other Scripps scientists are now applying this tool to study specific ocean phenomena, such as the seemingly unpredictable harmful algal blooms that occur in various coastal regions from time to time. In addition to Sugihara, coauthors of the paper include Lord Robert May of the University of Oxford, Hao Ye and Ethan Deyle of Scripps Oceanography, Chih-hao Hsieh of National Taiwan University, Michael Fogarty of NOAA’s Northeast Fisheries Science Center and Stephan Munch of NOAA’s Southwest Fisheries Science Center. The research received financial support from the National Science Foundation (NSF), an NSF/NOAA CAMEO (Comparative Analysis of Marine Ecosystem Organization) Award, the McQuown Chair in Natural Science, the Sugihara Family Trust, Quantitative Advisors, National Taiwan University, National Science Council of Taiwan (CH), NSF Graduate Research Fellowships (HY and ED) and an EPA STAR (Science to Achieve Results) Graduate Fellowship.

Sugihara says his new test can deal with two-way causality. What’s more, the test can ferret out various causal linkages in systems with several variables. CCM asks whether one variable predicts another, much like Granger causality. To deal with the two-way causality problem, each data set is put through mathematical transformations, creating a three-dimensional shape called a manifold. Points on one manifold may be used to predict points on the other, but not necessarily the other way round. That means causal relationships, of one direction or another, can be measured separately.
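For readers who want to see the mechanics, here is a rough, simplified sketch of the cross-mapping idea in plain Python. It is our own illustration under stated assumptions, not the authors' implementation. The toy system is a pair of coupled logistic maps in which X drives Y and Y never affects X. The coupling strength, the two-dimensional embedding, the library sizes and the exponential neighbor weighting are all illustrative choices. The counterintuitive part is the direction: if X drives Y, then Y's history carries an imprint of X, so the manifold built from Y should be able to recover X, and that recovery should improve as more data is used (the "convergent" part).

```python
import heapq
import math


def simulate(n, coupling=0.32, burn_in=200):
    """Coupled logistic maps: X drives Y, Y has no effect on X."""
    x, y = 0.4, 0.2
    xs, ys = [], []
    for step in range(n + burn_in):
        # Both updates use the previous step's values.
        x, y = x * (3.7 - 3.7 * x), y * (3.7 - 3.7 * y - coupling * x)
        if step >= burn_in:
            xs.append(x)
            ys.append(y)
    return xs, ys


def pearson(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


def cross_map_skill(source, target, lib_size, dim=2):
    """Estimate `target` from the delay-embedded manifold of `source`.

    Returns the correlation between the estimates and the true values.
    """
    times = range(dim - 1, lib_size)
    # Each manifold point is (s[t], s[t-1], ..., s[t-dim+1]).
    points = {t: tuple(source[t - k] for k in range(dim)) for t in times}
    estimates, actual = [], []
    for t in times:
        # dim + 1 nearest neighbours on the manifold, excluding the point itself.
        nearest = heapq.nsmallest(
            dim + 1,
            ((math.dist(points[t], points[s]), s) for s in times if s != t),
        )
        d_min = max(nearest[0][0], 1e-12)
        weights = [math.exp(-d / d_min) for d, _ in nearest]
        total = sum(weights)
        estimates.append(
            sum(w * target[s] for w, (_, s) in zip(weights, nearest)) / total
        )
        actual.append(target[t])
    return pearson(estimates, actual)


xs, ys = simulate(1000)
forward_short = cross_map_skill(ys, xs, lib_size=50)    # recover X from Y's manifold
forward_long = cross_map_skill(ys, xs, lib_size=1000)
reverse_long = cross_map_skill(xs, ys, lib_size=1000)   # recover Y from X's manifold
print(f"X from Y's manifold, 50 points:   {forward_short:.2f}")
print(f"X from Y's manifold, 1000 points: {forward_long:.2f}")
print(f"Y from X's manifold, 1000 points: {reverse_long:.2f}")
```

In this toy setup the skill at recovering X from Y's manifold should be clearly higher than the reverse, and should improve with the longer library, which is the pattern one would read as "X influences Y". Real data is noisier, shorter and less cooperative, so treat this as an illustration of the logic rather than a tool to run on campaign data.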

Sugihara tested his method on real-world cases where the causal relationships have already been established, but the jury remains out on CCM. According to New Scientist magazine, several statisticians said CCM looks promising. “It will be worth studying in more detail,” says Steffen Lauritzen of the University of Oxford. Others are less keen. According to Judea Pearl, who studies causality at UCLA, the paper may be theoretically flawed. The starting point of any study is a hypothesis about why a causal relationship exists, he says. Sugihara’s test instead involves scouring data for links and then retrofitting a hypothesis to suit.

If CCM proves out, then it could be of tremendous benefit to advertising agencies. Cause-and-effect is vital to almost every type of campaign – especially integrated campaigns. At bloomfield knoble, we use Granger causality to help segment a target audience and to measure success. A new technique that offers greater certainty can only help improve future efforts.
