Don't let the math scare you – the social media code has been cracked!

bkadmin
May 15, 2012

Understanding and analyzing topics in social media has become immensely important for advertising agencies. We’re all searching for two things that, combined, amount to the holy grail: ways to create social media posts that expand influence (however influence is defined under the model in use) and ways to measure success (again, not necessarily the same as ROI). A recently posted paper may have answers for both.

Himabindu Lakkaraju of IBM, along with Indrajit Bhattacharya and Chiranjib Bhattacharyya of the Indian Institute of Science, have posted a paper to the Cornell arXiv server for review entitled Dynamic Multi-Relational Chinese Restaurant Process for Analyzing Influences on Users in Social Media. The abstract outlines not only the paper’s focus but also (more importantly to agencies) its promise of valuable insights on topic trends and user personality trends, beyond the capability of existing approaches.

“We study the problem of analyzing influence of various factors affecting individual messages posted in social media. The problem is challenging because of various types of influences propagating through the social media network that act simultaneously on any user. Additionally, the topic composition of the influencing factors and the susceptibility of users to these influences evolve over time. This problem has not studied before, and off-the-shelf models are unsuitable for this purpose. To capture the complex interplay of these various factors, we propose a new non-parametric model called the Dynamic Multi-Relational Chinese Restaurant Process.

This accounts for the user network for data generation and also allows the parameters to evolve over time. Designing inference algorithms for this model suited for large scale social-media data is another challenge. To this end, we propose a scalable and multi-threaded inference algorithm based on online Gibbs Sampling.

Extensive evaluations on large-scale Twitter and Facebook data show that the extracted topics when applied to authorship and commenting prediction outperform state-of-the-art baselines. More importantly, our model produces valuable insights on topic trends and user personality trends, beyond the capability of existing approaches.”

The authors outline two major distinguishing features of social media. First, users are influenced by a variety of factors when posting messages. Second, social media data is inherently dynamic. Owing to this multitude of factors, and the intrinsic interplay between them, analysis of social media data has been a major challenge. In their paper, the authors propose a non-parametric probabilistic approach for analyzing social media data. From an agency perspective, the major takeaway is that their model outperforms state-of-the-art baselines both qualitatively and quantitatively.

“For example, our analysis shows that users posts are mostly influenced by personal preferences, rather than global, regional or social-network factors, except in times of major world events, when users become swayed by global influences at the cost of personal preferences.”

A large portion of the paper is dedicated to discussing their proposed model. I’ll spare you the math, but if you’ve had statistics and probability in college, it won’t be too far out of reach. What you’ll notice is how carefully they build the model in steps: first describing the static Relational Chinese Restaurant Process, then incorporating multiple relations, and finally adding temporal evolution.
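If the name "Chinese Restaurant Process" is new to you, here is a minimal Python sketch of the plain, textbook version that the authors' model builds on. It is not their model: there are no relations and no time evolution here. Think of each message as a customer and each topic as a table. A new customer joins an existing table with probability proportional to how many people already sit there, or opens a new table with probability proportional to a concentration parameter. The function name `crp_seating` and the parameter names are my own, chosen for illustration.

```python
import random


def crp_seating(num_customers, alpha, seed=0):
    """Simulate table assignments under a plain Chinese Restaurant Process.

    alpha (> 0) is the concentration parameter: larger values open
    new tables (topics) more often.
    """
    rng = random.Random(seed)
    table_counts = []  # table_counts[k] = number of customers at table k
    assignments = []   # assignments[i] = table index chosen by customer i

    for _ in range(num_customers):
        # Existing table k is picked with weight table_counts[k];
        # the extra last slot (weight alpha) means "open a new table".
        weights = table_counts + [alpha]
        choice = rng.choices(range(len(weights)), weights=weights)[0]

        if choice == len(table_counts):
            table_counts.append(1)      # first customer at a new table
        else:
            table_counts[choice] += 1   # join an existing table
        assignments.append(choice)

    return assignments, table_counts


assignments, table_counts = crp_seating(num_customers=100, alpha=1.0, seed=1)
print(len(table_counts), "tables for", len(assignments), "customers")
```

The "rich get richer" behavior is the point: popular topics attract more messages, yet there is always some chance of a brand new topic, so the number of topics does not have to be fixed in advance.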

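The authors present the following:

[equation image not available]

and then introduce elements to measure the dynamic nature of the topics and user influence patterns or personalities. For topic trends:

[equation image not available]

For user personality trends:

[equation image not available]

and for evolving topic distributions:

[equation image not available]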

I am not doing justice to their formulas, so I would suggest reviewing their work for their excellent step-by-step analysis. In their paper, the authors have made a first attempt at studying the important problem of analyzing user influences in the generation of social media data. They have proposed a new non-parametric model called Dynamic Multi-Relational CRP that incorporates the aggregated influence of multiple relationships into the data generation process, as well as dynamic evolution of model parameters, to capture the essence of social network data.
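To give a feel for what "aggregated influence of multiple relationships" and "dynamic evolution" could look like, here is a toy Python sketch. To be clear, this is my own simplification and not the authors' formulas: the function names, the relation labels ("personal", "global"), the weights, and the decay rate are all made-up assumptions. Each relation keeps its own topic counts, a user's weights say how much each relation sways them, and old counts are shrunk each time period so recent activity matters more.

```python
def decay_counts(counts, rate):
    """Shrink last period's topic counts so older activity matters less."""
    return {topic: count * rate for topic, count in counts.items()}


def blended_topic_probs(relation_counts, relation_weights, alpha):
    """Blend per-relation topic counts into one distribution over topics.

    relation_counts:  {relation: {topic: count}}
    relation_weights: {relation: weight}, how strongly each relation
                      influences this (hypothetical) user
    alpha:            weight reserved for starting a brand new topic
    """
    scores = {}
    for relation, counts in relation_counts.items():
        weight = relation_weights.get(relation, 0.0)
        for topic, count in counts.items():
            scores[topic] = scores.get(topic, 0.0) + weight * count
    scores["<new topic>"] = alpha

    total = sum(scores.values())
    return {topic: score / total for topic, score in scores.items()}


# Made-up counts: what this user posts about vs. what the world posts about.
counts = {
    "personal": {"sports": 8, "politics": 2},
    "global": {"politics": 10},
}

# A quiet week: the user mostly follows personal preferences.
quiet_week = blended_topic_probs(counts, {"personal": 0.8, "global": 0.2}, alpha=1.0)

# A major world event: global influence takes over.
major_event = blended_topic_probs(counts, {"personal": 0.3, "global": 0.7}, alpha=1.0)

print(quiet_week)
print(major_event)

# Moving to the next time period: halve the old global counts.
print(decay_counts(counts["global"], rate=0.5))
```

With these invented numbers, sports is the most likely topic in the quiet week, while politics becomes the most likely topic once the global weight goes up, which is the kind of shift the authors describe around major world events.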

Through extensive evaluations, they demonstrate that the topic trends discovered by their model are superior to those from state-of-the-art baselines. More importantly, they found insightful patterns of influence on social network users, beyond the capability of existing models.

My good friend Jeff, who probably quit reading when he saw the formulas, would ask “so what does this all mean?” It means that there is a new model for measuring influence and identifying patterns within social media. The impact is, quite simply, better wins. bloomfield knoble is going to adopt the model proposed by the authors. We intend to deploy this model across social media on behalf of clients. By doing so, we believe we will be able to better influence social media and provide better analytics to our clients. Better wins.
