What Market Researchers Could Learn from Eavesdropping on R2D2

04 Nov 2011|Added Value

Scott Porter, our VP of Methods in the US, recently attended the first annual Southern California Machine Learning Workshop (SoCaML), hosted by UC Irvine. Scott asks: in the context of research and insight, why should we care about what the Machine Learning community is doing?

For those not familiar with Machine Learning, it is a scientific discipline related to artificial intelligence, but one more concerned with teaching machines to solve useful problems than with getting machines to replicate human behavior.

If you were to put it in Star Wars terms, a Machine Learning expert would be more focused on building the short, bleeping, useful R2D2 than the shiny, linguistically gifted but clumsy C3P0: a machine that is useful and efficient rather than one that replicates the behaviors and mannerisms of humans.

There are many techniques and approaches that marketing insights consultants could borrow from the Machine Learning community. The community reaches well beyond those concerned with market research to a much larger group of researchers and scientists, and its focus is improving algorithms that can be applied across a wide variety of scientific, technology, business, and engineering problems. That makes it a wonderful source of inspiration for approaches that can be adapted to our own industry.

What kinds of problems/applications is the Machine Learning community interested in?
The community is trying to teach machines to solve the kinds of problems that are deeply familiar to market researchers and marketing scientists, as we wrestle with these same problems in common research tasks. Here are a few examples and why they feel relevant.

1 – Prediction:

In market research, one of the most common statistics applications is drivers analysis – identifying the variable(s) that can predict specific attitudes or behaviors. A wide variety of business applications were discussed at the workshop, including many online recommendation systems, tools that allow social networking companies to predict behavior, and techniques for refining web search.
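To make the idea of drivers analysis concrete, here is a minimal sketch that ranks candidate drivers by how strongly each one correlates with an outcome. Everything in it is an assumption made for illustration: the data is synthetic, the variable names (`quality`, `price`, `packaging`, `intent`) and the helper `rank_drivers` are hypothetical, and simple correlation stands in for the regression or relative-importance methods a real study would more likely use.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_drivers(ratings, outcome):
    """Sort candidate drivers by the absolute size of their correlation
    with the outcome, strongest first."""
    scores = {name: pearson(values, outcome) for name, values in ratings.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Synthetic toy data (not from any real study): purchase intent is built to
# depend strongly on quality, weakly on price, and not at all on packaging.
rng = random.Random(0)
n_respondents = 200
quality = [rng.gauss(0, 1) for _ in range(n_respondents)]
price = [rng.gauss(0, 1) for _ in range(n_respondents)]
packaging = [rng.gauss(0, 1) for _ in range(n_respondents)]
intent = [0.8 * q + 0.3 * p + rng.gauss(0, 0.5) for q, p in zip(quality, price)]

ranking = rank_drivers(
    {"quality": quality, "price": price, "packaging": packaging}, intent
)
for name, r in ranking:
    print(f"{name}: r = {r:+.2f}")
```

Because the toy outcome was constructed to lean mostly on `quality`, that variable comes out on top. With real survey data, the drivers are often correlated with each other, which is one reason more careful methods are usually needed.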

Improvements in predictive models will be critical to keep up with the needs of our clients going forward. Predictive models are usually of great interest to our clients, because they want to be able to know what would likely happen if they were to take a particular action. When building predictive models, we are now being asked to consider much more complex data.

For example, when trying to model online social network behavior, we may need to take into account the behaviors of various connections in the online social network, as well as the timing of these behaviors. We are also being asked to take data collected from one situation and create our best prediction for another situation. Two reasons are common: the client would like to make decisions about markets or products that don’t yet exist and so needs to borrow information from similar situations on which they do have data, or the world has changed somewhat between the time we measured and the time to act.

2 – Classification:

Researchers are always interested in classification to help solve a variety of problems. Some applications discussed at the workshop included segmenting study participants into different risk tolerance groups, automatically classifying open-ended text into topics, and classifying instructions sent over a brain-computer interface, where study participants wear a cap with special sensors and control the interface by thinking of movement (such as “left” or “right”).

One application of classification techniques we use frequently is segmentation. There are some exciting possibilities for segmentation that are opened up by considering new classification techniques. Some newer approaches to classification that I am particularly excited about would allow us to combine classification with models of decision process (in order to find segments that make decisions in similar ways) or incorporate models of transitions between various profiles when segmenting (in order to have a more dynamic segmentation that understands that consumers may move between various segments over time). We’re also finding more uses for classification in modeling human decision-making, because people, unlike robots, might not measure things in terms of “a utility of 33.7,” but in terms of “good enough” or “not good enough.”

3 – Dimensionality Reduction:

Market researchers use dimensionality reduction for creating perceptual maps, or when trying to make sensible summaries of groups of questions (we often use something like factor analysis). Several researchers at the workshop presented improvements to dimensionality reduction that make the results more sensible or interpretable. Some demonstrations were performed on photos of human faces making various expressions, scans of handwritten numbers, and audio files of music. In addition to discussing dimensionality reduction used on its own, there were also examples of using it as an initial step before attempting classification or prediction.

This is an exciting area of development. The Machine Learning community has taken these techniques well beyond the traditional methods for dimensionality reduction. New techniques are able to organize much more complex data in an interpretable way. One fairly recent development that leads to this type of improvement is inferring additional information about the relationships between the observed cases. This allows you to differentiate between cases that have similar profiles, but arrive at those profiles through very different processes.

As an oversimplified example, let’s take these four groups from a joke about the American educational system (which is in turn paraphrased from much older analogies about enlightenment*): Freshmen (1st year students): one who doesn’t know and doesn’t know that he/she doesn’t know; Sophomores (2nd year students): one who doesn’t know, but knows that he/she doesn’t know; Juniors (3rd year students): one who knows, but doesn’t know that he/she knows; and Seniors (4th year students): one who knows and knows that he/she knows.

Using a traditional technique, we could divide the world into a two by two grid:

[Diagram 1: two-by-two grid of the four groups]

This is somewhat useful, and can tell us that Freshmen are the most different from Seniors. But we’re missing the sense of process. On our grid, moving directly from the Sophomore state (2nd year) to the Senior state (4th year) looks just as possible as the actual process (moving from 2nd year to 3rd year to 4th year).

Newer dimensionality reduction techniques actually attempt to infer some aspects of the underlying process that generates the data by looking at similarities between individuals. So, continuing with our example, let’s assume that we’ve sampled each of these four groups at random throughout the year, and that Sophomores assessed near the beginning of their Sophomore year look a lot like Freshmen, while those assessed near the end of their Sophomore year look a lot like Juniors. Using an algorithm that pieces this type of information together, we can infer a single dimension representing the underlying process, even without the labels for the year in school.

[Diagram 2: the four groups arranged along a single inferred dimension]
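One way to picture how a single dimension can be pieced together from similarities alone is the rough sketch below. It is not the method presented at the workshop; it borrows the neighborhood-graph idea behind techniques such as Isomap: link each student only to students with very similar profiles, then measure distance by walking along those links. The curved two-number "profile", the `radius` of 0.5, and the function names (`neighbor_graph`, `graph_distances`, `infer_order`) are all assumptions made for illustration, and the synthetic data is noise-free.

```python
import heapq
import math
import random


def neighbor_graph(points, radius):
    """Link every pair of profiles that are closer together than `radius`."""
    graph = {i: [] for i in range(len(points))}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            if d <= radius:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph


def graph_distances(graph, start):
    """Shortest walking distance from `start` to every reachable node (Dijkstra)."""
    dist = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, math.inf):
            continue  # stale heap entry
        for nxt, weight in graph[node]:
            new_d = d + weight
            if new_d < dist.get(nxt, math.inf):
                dist[nxt] = new_d
                heapq.heappush(heap, (new_d, nxt))
    return dist


def infer_order(points, radius):
    """Order the points along the inferred single dimension.
    The direction (which end comes first) cannot be known without labels."""
    graph = neighbor_graph(points, radius)
    from_any = graph_distances(graph, 0)
    one_end = max(from_any, key=from_any.get)  # farthest point is an endpoint
    from_end = graph_distances(graph, one_end)
    return sorted(from_end, key=from_end.get)


# Synthetic students: `years` is the hidden time in school (0 to 4), which
# the algorithm never sees. Each profile drifts along a three-quarter circle,
# so the two ends of the process look misleadingly close in a flat map.
rng = random.Random(1)
years = [rng.uniform(0.0, 4.0) for _ in range(200)]
profiles = [
    (math.cos(t / 4 * 1.5 * math.pi), math.sin(t / 4 * 1.5 * math.pi))
    for t in years
]

order = infer_order(profiles, radius=0.5)
recovered_years = [years[i] for i in order]
```

In this toy setup the recovered ordering matches the hidden time in school, either forwards or backwards, even though the labels were never used. Real survey profiles are noisy, so in practice the ordering would be approximate rather than exact.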

In the short term, advances in dimensionality reduction are very useful for making improvements such as more interpretable perceptual maps. In the long term, they may lead to much more fundamental improvements in terms of understanding and modeling both prediction and classification problems.

How can I best integrate Machine Learning with my project?
Donald Metzler from USC shared some interesting thinking about the best way to work with Machine Learning experts within the context of a larger project. He specifically discussed projects in Web Search, but his points are applicable to working with Machine Learning experts to solve market research problems.

Don posed the question: If you were to take a more applied expert and a Machine Learning expert, who would do a better job improving an algorithm given a week, a month, or a year? He guessed that over the short run, it would probably be the more applied expert, since the most important improvements come from feeding the algorithm more relevant information. Over the long term, the Machine Learning expert might have the advantage.

However, the real long-term solution is to have both working on the problem. The more applied expert helps identify the most relevant information for solving the problem. The Machine Learning expert has a number of analytical tools, strategies for applying those tools, and methods of measuring improvement that can greatly improve the end analysis, as long as focus is kept on solutions that are relevant to the particular domain (e.g. in our case brand and marketing).

*Although this is not the most thorough scholarship, the earliest reference I found to this in my two-minute search of Wikipedia was the poet Ibn Yamin, born 1286.

Footnote: The thumbnail image at the start of this post was collaboratively created by the author and his daughter.
