3 Things I’ve Done in R in 2017 So Far

Last Updated on December 17, 2019 by Alex Birkett

As I mentioned in my little post on New Year goals, I’m learning R. It’s part of a broader goal to become more technical, and I’ve certainly gravitated towards the data arts in that regard.

I’ve explored front-end tools like jQuery, and while that added a lot of value in very predictable ways, R has made more sense to me so far.

I feel more comfortable learning it.

Even better, I continue to come up with awesome ways to use it (the limiting factor in most of my ideas is, of course, my skill level).

R is cool and I have been geeking out on it for a few months. In the past 3 months, I’ve had a few wins. Here are some pretty basic things I’ve sort of accomplished this year that have helped me with marketing.

1. Getting Started with Google Analytics in R

I won’t say I’ve used R to explore Google Analytics in any more detail, or in any way that benefits me much yet, but I’ve got it up and running, and eventually it’s going to help me build a dashboard in Shiny to share across my team.

I started with this excellent post from Tim Wilson: https://www.linkedin.com/pulse/tutorial-from-0-r-google-analytics-tim-wilson

I first learned about R in the context of digital analytics, so this was really the natural way to get started in the language. After taking a very quick run through Swirl’s interactive education and a Lynda course on R and Statistics, I wanted to pull data from our analytics account and see what I could do.

What I could do wasn’t impressive, but it’s a start. I’m pretty much just playing around with different ways to chop data now, all things I can already do in GA. Though I did build a cool heatmap based on traffic by hour of day and day of week.
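For anyone who wants to try that kind of heatmap, here is a minimal sketch in base R. It is not the code behind the original chart. The sessions are simulated, and the names `traffic` and `by_slot` are made up for illustration. With real data you would swap the simulated data frame for an hourly export from your analytics account.

```r
set.seed(42)

# Simulated hourly sessions for four weeks (an assumption for illustration;
# real numbers would come from a Google Analytics export).
days <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
traffic <- expand.grid(
  hour = 0:23,
  day  = factor(days, levels = days),
  week = 1:4
)

# Pretend traffic is busier during office hours.
office_hours <- traffic$hour >= 9 & traffic$hour <= 17
traffic$sessions <- rpois(nrow(traffic), lambda = 20 + 15 * office_hours)

# Sum sessions into a 7 x 24 matrix: one row per day, one column per hour.
by_slot <- tapply(traffic$sessions, list(traffic$day, traffic$hour), sum)

# Rowv = NA and Colv = NA keep the calendar order instead of clustering
# rows and columns; scale = "none" plots the raw session counts.
heatmap(
  by_slot,
  Rowv = NA, Colv = NA, scale = "none",
  xlab = "Hour of day", ylab = "Day of week",
  main = "Sessions by hour and weekday (simulated)"
)
```

The only real work is reshaping the data into a day-by-hour matrix. Once that exists, any heatmap function will do.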

I suspect this is where the majority of my R education and application will come into play in the future.

2. Building Robust User Personas Using Advanced Statistical Techniques

This was definitely the most useful project I completed. Only a subsection of it was in R, but still, that part was really important.

Basically, we wanted to create data-driven personas for CXL Institute, but we wanted them to be fast (when you’re agile, you don’t want to be bogged down by analysis paralysis).

I wrote about the whole process in this post, but it essentially boils down to 8 steps:

  • Scope out our goals and plan the approach
  • Write the survey and send it to our target audience/customers
  • Exploratory data analysis in R
  • More data analysis in Excel (lots of pivot tables)
  • Qualitative data analysis
  • Outline rough but distinct personas
  • Find prototypical customers from each persona and conduct 1-on-1 interviews
  • Add all the data together and create beautiful and useful personas

The exploratory data analysis primarily consisted of PCA, factor analysis, and clustering (hierarchical and k-means), plus a whole lot of visualizations. I also drew histograms for various variables and analyzed the data as a function of specific variables, though that was much easier to do in Excel.
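To give a feel for that exploratory step, here is a rough sketch in base R. It is not the analysis we actually ran. The survey is simulated, the two respondent groups and the column names `q1` to `q6` are invented, and factor analysis is left out to keep the example short. It covers only PCA, k-means, and hierarchical clustering.

```r
set.seed(1)

# Simulated survey (an assumption for illustration): 60 respondents rate six
# statements from 1 to 7. Group A scores q1-q3 high and q4-q6 low; group B
# does the opposite, so there is some structure for the methods to find.
n <- 30
high <- function() matrix(sample(5:7, n * 3, replace = TRUE), nrow = n)
low  <- function() matrix(sample(1:3, n * 3, replace = TRUE), nrow = n)
survey <- rbind(cbind(high(), low()), cbind(low(), high()))
colnames(survey) <- paste0("q", 1:6)

# Standardize so every question contributes equally.
scaled <- scale(survey)

# PCA: how much variance do the first two components capture?
pca <- prcomp(scaled)
print(summary(pca)$importance["Proportion of Variance", 1:2])

# k-means with several random starts to avoid a poor local optimum.
km <- kmeans(scaled, centers = 2, nstart = 25)

# Hierarchical clustering with Ward linkage, cut into two groups.
hc <- hclust(dist(scaled), method = "ward.D2")
hc_groups <- cutree(hc, k = 2)

# Do the two clustering methods agree on who belongs together?
print(table(kmeans = km$cluster, hclust = hc_groups))

# Plot respondents on the first two components, colored by k-means cluster.
plot(pca$x[, 1:2], col = km$cluster, pch = 19,
     main = "Respondents on PC1 and PC2 (simulated)")
```

On real survey data the clusters are rarely this clean, which is why cross-checking two clustering methods against each other is a useful sanity check before naming any personas.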

The end result: success! We built useful personas.

3. Creating a Markov Chain-Based Attribution Model with GA Data

This was a really quick one because the code was basically copied and pasted. But with my knowledge of pulling GA data with R and a general ability to read, write, and edit R code, I was able to build an attribution model for our digital analytics data using a Markov model.

Here’s where I got the idea and the majority of the code from: https://stuifbergen.com/2016/11/conversion-attribution-markov-model-r/
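If you want to see the idea without any packages, here is a small base R sketch of a first-order Markov attribution model using the "removal effect". It is not the code from the post linked above and not what I ran on our data. The journeys in `paths` are made up, and `conversion_prob` is a helper written just for this example.

```r
# Made-up journeys (an assumption for illustration): ordered channel touches
# and whether the journey ended in a conversion.
paths <- data.frame(
  path = c("email > organic", "organic", "referral > email",
           "referral", "organic > referral > email", "email"),
  converted = c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Turn each journey into states: (start) -> channels -> (conversion) or (null).
steps <- lapply(seq_len(nrow(paths)), function(i) {
  touches <- strsplit(paths$path[i], " > ", fixed = TRUE)[[1]]
  c("(start)", touches, if (paths$converted[i]) "(conversion)" else "(null)")
})

# Count transitions between consecutive states.
states <- unique(unlist(steps))
counts <- matrix(0, length(states), length(states),
                 dimnames = list(states, states))
for (s in steps) {
  for (k in seq_len(length(s) - 1)) {
    counts[s[k], s[k + 1]] <- counts[s[k], s[k + 1]] + 1
  }
}

# Absorbing states loop back to themselves, then rows become probabilities.
absorbing <- c("(conversion)", "(null)")
counts[cbind(absorbing, absorbing)] <- 1
trans <- counts / rowSums(counts)

# Probability of reaching (conversion) from (start), optionally with one
# channel removed (every visit to it is treated as a lost journey).
conversion_prob <- function(trans, removed = NULL) {
  if (!is.null(removed)) {
    trans[removed, ] <- 0
    trans[removed, "(null)"] <- 1
  }
  transient <- setdiff(rownames(trans), c("(conversion)", "(null)"))
  Q <- trans[transient, transient, drop = FALSE]
  R <- trans[transient, "(conversion)"]
  x <- solve(diag(length(transient)) - Q, R)
  unname(x[match("(start)", transient)])
}

p_full <- conversion_prob(trans)
channels <- setdiff(states, c("(start)", absorbing))

# Removal effect: share of conversions lost when a channel disappears.
removal <- sapply(channels, function(ch) 1 - conversion_prob(trans, ch) / p_full)

# Split the observed conversions in proportion to the removal effects.
attribution <- data.frame(
  channel = channels,
  removal_effect = removal,
  conversions = sum(paths$converted) * removal / sum(removal),
  row.names = NULL
)
print(attribution)
```

The sketch treats each channel as a single state and ignores anything earlier than the previous touch. Those are exactly the kind of assumptions worth questioning before trusting the output.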

This alone didn’t give us much value beyond our GA dashboard, but I did learn that we weren’t tagging particular email campaigns properly, so we were able to spot that and change it. It was also easier to see the marginal value of referral traffic from specific sources that we didn’t think were valuable based on last-click attribution, but that turned out to look valuable algorithmically.

So, I guess this was valuable after all, and I’m sure in the future when the newly tagged campaigns come in it will be incredibly useful to see what’s valuable and what’s not.

I also want to explore a bit more of the attribution modeling. I took it at face value and didn’t question the underlying assumptions, but I think it’s worth exploring in the future, especially as we ramp up ad spend and add new marketing channels.

This post is what I’ll work through next: https://www.lunametrics.com/blog/2016/06/30/marketing-channel-attribution-markov-models-r/


R is dope. Can’t wait to do more with it.

Next on the list: mapping out funnel-based keyword research in R using clustering and classification, building a Shiny dashboard with digital analytics and financial data, and using regression analysis to predict behavioral patterns that correlate with greater retention.

In addition to learning R, I’ve also been going back and learning a lot of the nooks and crannies of Google Analytics that I overlooked in my haste to learn it. It’s going to be a fun and nerdy year.