How to Learn R (for Marketers and Business Folks)

I’m a marketer, but I spend a lot of time in R. I use it to analyze A/B tests and explore data sets. I’ve also built fully functional web applications with R and Shiny to enable new processes for my team at HubSpot.

There are so many free resources for learning technical skills nowadays. It’s a darn good time to be a technical marketer (and also, the “non-technical marketer” is a myth and everyone can learn this stuff).

My first project was a few years ago. While working at CXL, I undertook a project inspired by Stefania Mereu, using R to analyze survey data and create data-driven user personas.

Basically, learning R (and subsequently Python) has been a superpower in many ways.

I paid my sister to learn R and take notes

I’ve wanted to write down some notes on how to get started, in case other marketers want to give it a crack. However, it’s tough to write about what it was like to learn something when you’re looking back from your current perspective.

So I paid my sister a few hundred dollars to go through a couple R courses (specifically, DataCamp’s Intro to R) and take notes on what she learned (side note: she’s a college student and would be a great intern if you want to hire her). Her background is having used a bit of SPSS and SQL for data analytics classes in school, so this was her first exposure to R.

My sister, the future data scientist. Hire her for an internship.

I took her notes and combined them with my memory and experience of learning the language, as well as some examples of projects you can do as a marketer (click here to skip to that section). If I can learn R and use it daily/weekly, and if my sister can learn the basics of R in a few weeks as a college student, you can too! So here’s my beginner’s guide to learning R. First, why…

Why use R? What can marketers/growth/product managers do with R?

If you work with data you probably know how to use tools like Google Analytics and Microsoft Excel. You might even know some SQL or perhaps how to set up reports in a BI tool like Looker.

These are great starting points, but if you want to unlock interesting new analysis tools, production capabilities, or things like working with APIs and scraping/cleaning data, then you should learn a scripting language like R or Python. You can also automate a bunch of stuff if you learn R or Python.

Python has its benefits (particularly for machine learning), but I still prefer R, and I still think R is easier for new users (especially marketers) to get up and running with. Because so many digital analysts work with R, there are tons of ready-made packages for things like accessing Google Analytics, visualizing data, and writing to Google Sheets. Additionally, RStudio is an epically easy place to learn and work in R. I’m biased because I learned R first, but it felt so actionable right away. For example, in about an hour, you can make a heat map of your website traffic like this:

Some benefits of R:

  • Easy to set up, write, and QA
  • Centralized language with only one version being actively supported
  • A big community with many packages to help cut down on your learning time and coding time (it’s particularly popular among analysts and marketers)

The biggest reason I can think of is that, given your current use of tools like Google Analytics, Adwords, or even a tool like SEMRush, you can get wayyy more out of your data by learning a little R (I’ll show some examples later in the article – click here to skip the setup and basic syntax lessons and go right to the code/examples).

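Sections of this article will be:

  • Installing R on Your Computer and Getting Setup Properly
  • The Super Easy Basic Stuff
  • Functions and Loops in R
  • The different types of data in R
  • Four Use Cases for Marketers Learning R (with Code)
  • Further Resources for Learning R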

Installing R on Your Computer and Getting Setup Properly

This will be a relatively quick section, Texas-style:

  1. Install R
  2. Install RStudio

RStudio is an integrated development environment (IDE). It makes programming much easier, as you can edit and test code and see the output and your variables/data in real time.

While you’re messing around with R, check out Swirl. It’s an R package that teaches you to code R, in R. It’s how I picked up the syntax initially. All you have to do is install it and call it, like so:

install.packages("swirl")
library(swirl)
swirl()

Now you’re up and running and already have a (free) R course! Maybe you don’t even need the rest of this article.

The Super Easy Basic Stuff

First off, R is case sensitive. Second, remember that you define variables in R by using an arrow, like this:

variable_name <- "Here's a string"

My sister was thrilled about the easiness of defining variables. Her notes:

“To assign variables is simple. Assigning a variable allows you to store a value into R – it is done by typing something such as “x <- 50”, which would assign “50” to “x”. To print out the value of a variable, all you have to do is write the name of the variable on a line – easy!”

I share her enthusiasm.

You can comment away code (write code that is not executed, mainly for documentation and communication) by using the # sign at the beginning of a line, e.g.:

Here’s what prints when I run that code:
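A minimal sketch of both the commented code and what it prints (the variable name and text are made up for illustration). Lines starting with # are skipped, and the last comment shows the console output:

```r
# This line is a comment, so R skips it entirely
# greeting <- "this assignment never runs"

greeting <- "Hello, marketers"  # a comment can also follow code on the same line
print(greeting)
# [1] "Hello, marketers"
```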

You can do basic arithmetic easily

At its most basic level, you can use R as a calculator. Give it a try. Type in something like this:

5 + 5

Highlight that bit of code and hit command + enter (Ctrl + Enter on Windows).

As my sister Reilly wrote, “you can do all basic arithmetic, as well as modulo, which looks like this: %%; for example, writing “28 %% 6” gives you 4 (it returns the remainder of the division number on the left by the number on the right).”

This can be helpful if, say, you’re trying to isolate only even numbers in a dataset:
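For instance, here is a small sketch with a made-up vector of numbers. A value is even when dividing it by 2 leaves a remainder of 0, so the modulo result can be used as a filter:

```r
# Hypothetical vector of numbers (e.g., order IDs)
values <- c(3, 8, 15, 22, 41, 60)

# A number is even when dividing by 2 leaves no remainder
even_values <- values[values %% 2 == 0]
print(even_values)
# [1]  8 22 60
```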

Of course, you can use variables for your mathematics as well. E.g. you can assign an integer (say 50) to a variable (say ‘x’) like such:

x <- 50

Then you can use your variable to do math! Check it out:
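A quick sketch of arithmetic on a variable (the second variable, y, is just an illustrative name):

```r
x <- 50

x + 10   # 60
x * 2    # 100
x / 4    # 12.5
x %% 6   # 2 (the remainder of 50 divided by 6)

# Store the result of a calculation in a new variable
y <- x * 2 + 5
y        # 105
```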

Functions and Loops in R

Let’s not dive too heavily into the stuff that is common to all programming languages, but I did already briefly show you how to build a function. Here’s the syntax:

function_name <- function(arg) {
  ## your function here
}

A For Loop is pretty easy, too:

for (val in sequence) {
  statement
}

I’ll use a for loop and a data set to show you what we can do. Here we just loop through a list of integers 1 through 10 and add 1 to each of them:
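One way that loop could look, as a minimal sketch (the variable names are my own). The second half shows the vectorized shortcut, which gives the same numbers without a loop:

```r
numbers <- 1:10

# Loop through each integer and print it with 1 added
for (n in numbers) {
  print(n + 1)
}

# R is vectorized, so the same result is available without a loop
plus_one <- numbers + 1
plus_one
# [1]  2  3  4  5  6  7  8  9 10 11
```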

And here’s a totally useless function, just to illustrate syntax:
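Something along these lines would do as a stand-in. The function name shout is made up, and all it does is upper-case some text and add an exclamation point:

```r
# A deliberately pointless function: it upper-cases text and adds an exclamation point
shout <- function(text) {
  paste0(toupper(text), "!")
}

shout("learning r")
# [1] "LEARNING R!"
```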

 

The different types of data in R

Okay, here’s the R specific stuff when it comes to how data is codified, stored, and utilized in R. This stuff matters when it comes to querying, cleaning, and analyzing data you’ll work with.

Basic values:

  • Decimal values and integers are both numeric
  • Boolean values are logical (TRUE or FALSE)
  • Text/string values are characters.

Those are the building blocks, the root values of data, in R. Then you have different data structures, such as:

  • Vectors
  • Matrices
  • Factors
  • Data Frames
  • Lists

You can check what data type you have by using the function ‘class()’ like the example here:
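A few illustrative calls, as a sketch (mtcars is a sample data set that ships with R). Note that a whole number written with an L suffix reports "integer" as its class, although R still treats it as numeric:

```r
class(42.5)          # "numeric"
class(7L)            # "integer" (the L suffix forces an integer)
is.numeric(7L)       # TRUE (integers still count as numeric)
class(TRUE)          # "logical"
class("hello")       # "character"
class(c(1, 2, 3))    # "numeric" (a vector reports the type of its values)
class(mtcars)        # "data.frame"
```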

The course Reilly took, as far as I can tell, was a compendium of data types in R and mainly covered this stuff. We’ll walk through her notes and each type to explain how it is used. After that, I’ll show you a few examples of how you can easily get started with some R projects, as well as some resources for further learning (click here to skip to examples).

Vectors in R

A vector is the simplest data structure in R. It’s a one-dimensional array that can hold just one data type at a time. Examples:
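A sketch with made-up values, one vector per basic type. The last lines show what happens when you try to mix types: R quietly converts everything to one type.

```r
numeric_vector   <- c(1, 5, 10)
character_vector <- c("email", "organic", "paid")
logical_vector   <- c(TRUE, FALSE, TRUE)

# Mixing types coerces everything to a single type (here, character)
mixed_vector <- c(1, "organic", TRUE)
class(mixed_vector)
# [1] "character"
```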

You can select a single unit of a vector by using this syntax: vector_name[vector_position]

Unlike many other languages, R starts counting at 1 (not 0), so numeric_vector[2] gives us the second element, “5”:

If you want to select more than one, wrap the positions in c(), e.g. numeric_vector[c(1, 3)] (a bare comma, as in numeric_vector[1, 3], throws an error on a vector), or you can select a series of units using a colon, e.g. numeric_vector[1:3].
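A small sketch of these selections, using a made-up numeric_vector whose second element is 5. Picking non-adjacent positions means wrapping them in c():

```r
numeric_vector <- c(1, 5, 10, 20)

numeric_vector[2]         # 5 (the second element; indexing starts at 1)
numeric_vector[c(1, 3)]   # 1 10 (the first and third elements)
numeric_vector[1:3]       # 1 5 10 (elements one through three)
```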

My sister took a lot of notes on vectors, and probably for good reason (they’re the core data structures in R, really). I, however, find it boring to write about all the different rules and stipulations to vectors (and data types generally), so I’ll do two things:

Matrices in R

Awesome, now we move to matrices: a two-dimensional collection of values of the same data type (numeric, character, or logical) arranged into a fixed number of rows and columns. It’s like a spreadsheet!

You can create matrices using the function matrix(). The three arguments you’ll use most are a collection of values, byrow = TRUE/FALSE, and nrow = ___:

  • Argument one: a collection of values that will be arranged into rows and columns – 1:5 is equivalent to c(1, 2, 3, 4, 5)
  • Argument two: byrow = TRUE/FALSE – if you want the matrix to be filled by rows use TRUE (like the example), if you want it to be filled by columns use FALSE
  • Argument three: nrow = ___ – fill in the blank with the number of rows you want the matrix to have. The example has 4.
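Putting the three arguments together, a sketch might look like this (the values 1 through 20 and the name my_matrix are just for illustration), filled by row across 4 rows:

```r
# Fill a matrix with the numbers 1 through 20, row by row, across 4 rows
my_matrix <- matrix(1:20, byrow = TRUE, nrow = 4)
my_matrix
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    1    2    3    4    5
# [2,]    6    7    8    9   10
# [3,]   11   12   13   14   15
# [4,]   16   17   18   19   20

dim(my_matrix)   # 4 5 (rows, then columns)
```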

You can, of course, combine multiple vectors together like this:
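As a sketch with invented traffic numbers, three vectors can be stacked into one matrix and then labeled so it reads like a small report:

```r
# Hypothetical weekly sessions for three channels
organic <- c(120, 150, 170)
paid    <- c(80, 60, 90)
email   <- c(40, 55, 35)

# Stack the vectors into a 3 x 3 matrix, one channel per row
traffic <- matrix(c(organic, paid, email), byrow = TRUE, nrow = 3)

# Label rows and columns so the matrix is easier to read
rownames(traffic) <- c("organic", "paid", "email")
colnames(traffic) <- c("week1", "week2", "week3")
traffic
```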

Woo! More points: select elements the same way as you did with vectors (brackets), this time giving a row and a column, e.g. matrix_name[1, 2]. You can use rowSums() and colSums() to calculate the totals for each row and column in the matrix (these functions return a new vector).
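A quick sketch with made-up numbers, using base R’s rowSums() and colSums() along with bracket selection:

```r
# Hypothetical sessions: rows are channels, columns are weeks
traffic <- matrix(c(120, 150, 80, 60), byrow = TRUE, nrow = 2,
                  dimnames = list(c("organic", "paid"), c("week1", "week2")))

traffic["paid", "week2"]   # 60 (select by row, then column)
traffic[1, ]               # the whole first row

rowSums(traffic)   # organic: 270, paid: 140
colSums(traffic)   # week1: 200, week2: 210
```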

Another cool function, cbind(), lets you add columns to a matrix without having to rebuild it. It looks like this: new_matrix_name <- cbind(old_matrix, new_matrix_or_vector, ...), where new_matrix_name is the new matrix with the added columns, old_matrix is the existing matrix, and new_matrix_or_vector is the part you want to add on.

  • rbind() is the same thing but for rows!
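A minimal sketch of both functions on a small made-up matrix. The added column needs one value per row, and the added row needs one value per column:

```r
old_matrix <- matrix(1:6, byrow = TRUE, nrow = 2)   # 2 rows, 3 columns

# Add a new column on the right
wider_matrix <- cbind(old_matrix, c(100, 200))
dim(wider_matrix)    # 2 4

# Add a new row at the bottom
taller_matrix <- rbind(old_matrix, c(7, 8, 9))
dim(taller_matrix)   # 3 3
```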

Alright, again, I’m getting bored just writing this stuff, so just try things out and maybe save this reference document for later.

Factors in R

Factors are a statistical data type used to store categorical variables (limited to a set of categories). Read more on categorical variables here.
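A short sketch with invented traffic sources shows what a factor adds on top of a plain character vector: a fixed set of categories (levels) and easy counts per category.

```r
# Hypothetical traffic sources for six visits
source_vector <- c("organic", "paid", "organic", "email", "paid", "organic")
source_factor <- factor(source_vector)

levels(source_factor)    # "email" "organic" "paid" (the distinct categories)
summary(source_factor)   # counts per category: email 1, organic 3, paid 2
```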

In the vast majority of cases, I find that when I’m loading data (from an API or a spreadsheet or something), R tries to coerce strings into factors, and it’s annoying. That’s why, when we read in a CSV or TXT file, there’s an argument called stringsAsFactors, and I usually set it to FALSE (since R 4.0.0, FALSE is already the default).

Enough about factors, here’s what my sister wrote verbatim about what she struggled with:

“I initially didn’t understand the reasoning behind levels and the summary function. I understood what they did, but I was hesitant with understanding what to do with the information given. After a lot of trial and error of applying levels() and summary() to the data sets in the course, I began to understand their purpose. Best advice: practice, research, and continue doing so until it clicks, and never hesitate to ask for help if needed.”

Bolding mine: don’t be afraid to just tinker and learn as you go.

Data Frames in R

We are so close to being finished with data types in R! Push through. Two more and they’re the things I use most in data analysis and tool building.

What’s a data frame in R?

A data frame is a two-dimensional structure in which columns contain the values of a variable and rows contain a set of values (observations) from each column – the data can be numeric, logical, character, etc. Now this is like a spreadsheet!

This is more practical because analyzing data sets usually involves data frames that contain more than one type of data – for example, when working with survey answers, as the course uses: yes/no questions = logical, “how old are you” = numeric, open ended questions = character.

Data sets can clearly become massive, but cutting them down into sections can be useful: use head() and tail() to see the first and final observations of a data set, respectively. The first function you’ll often use on a data frame is str(). Here’s what it shows you:
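To try these yourself, here is a sketch using mtcars, a sample data frame that ships with R (it stands in for whatever data set you are working with):

```r
# mtcars is a sample data frame that ships with R
head(mtcars)      # first six rows
tail(mtcars, 3)   # last three rows
str(mtcars)       # one line per column: its name, type, and first few values

nrow(mtcars)      # 32
ncol(mtcars)      # 11
```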

You can create your own data frame using the data.frame() function.
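As a sketch, here is a tiny survey-style data frame with one logical, one numeric, and one character column. The column names and answers are invented for illustration:

```r
# Hypothetical survey answers: one row per respondent
survey <- data.frame(
  uses_product = c(TRUE, FALSE, TRUE),
  age          = c(34, 27, 45),
  feedback     = c("Love it", "Too pricey", "Needs more integrations"),
  stringsAsFactors = FALSE
)

str(survey)
survey$age                  # pull one column as a vector
survey[survey$age > 30, ]   # keep only rows where age is over 30
mean(survey$age)            # 35.33333
```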

I’m going to burn through this section on data frames, despite them being my most used data type in R, simply because there’s a massive amount of information about them. You’re best off playing around and looking up some documentation (which you can do from R, by the way, by putting a question mark before a function name, like ?data.frame).

For what it’s worth, my sister said the data frames lesson in the DataCamp course was super intuitive. I’m pretty sure if you work in Excel a lot, you’ll understand data frames right away and get a lot of use from them.

Lists in R

Last one!

What’s a list in R?

Here’s how my sister put it: “A list is literally a list like you would use in your daily life – it has items that differ in characteristics, activity, time frame, etc.”

In R, a list lets you gather a variety of objects under one list name in an orderly fashion. The objects in a list can be matrices, vectors, data frames, or even other lists, and they can be as random as you like – they do not have to be related to one another.

You can make a list like this: “list_name <- list()”, where list_name is the name of the list you are making, and inside the parentheses are the contents of the list (remember, they can be vectors, matrices, etc.)

You can assign names to the components of your list with a second step: after list_name <- list(…), run names(list_name) <- c("____", …), where c() holds the names of the contents of the list, in order.

To add elements to an existing list, just use c(): new_list_name <- c(old_list, value = ____), where new_list_name is the new list you would like to make, old_list is the list you made that you want to add the information to, and value = ____ is the part you want to add to the list.
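Here is how those three steps could fit together, as a sketch with made-up contents (the names campaign, channels, traffic, and budget are purely illustrative):

```r
# Hypothetical contents of different types
channels <- c("organic", "paid")
traffic  <- matrix(1:4, nrow = 2)

# Create a list, then name its components
campaign <- list(channels, traffic)
names(campaign) <- c("channels", "traffic")

# Or name the components as the list is created
campaign <- list(channels = channels, traffic = traffic)

# Add an element with c(), then access components by name
campaign <- c(campaign, budget = 5000)
names(campaign)     # "channels" "traffic" "budget"
campaign$budget     # 5000
```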

Four Use Cases for Marketers Learning R (with Code)

I’ve written in the past about how I’ve learned R (in 2017 I wrote about the user personas projects as well as attribution modeling and Google Analytics heatmaps).

Here are four new sample projects I’ve worked on, and you can work on them, too.

1. Automate Data Collection and Analysis in Google Analytics

You can do a lot of things with Google Analytics and R (read this post here to get started, and check out this site for other great ideas). Here’s a super basic script to pull some quick GA data for blog landing pages and organic traffic:

Since GA data and what you want to learn from it is highly contextual, I recommend not even copying my script here, and instead, just playing around with queries/questions you want to answer. It’s all do-able from R (and like I note in my script, you can get away from sampled data!)

2. Pull the “Head Keyword” for a Given URL with SEMRush

Recently, I was working on a “Product Led Content” audit of the HubSpot blog – essentially, I was looking for previously published articles that have a strong product focus, where we could potentially inject more mentions of our freemium tools or more conversion points.

I searched things like site:blog.hubspot.com intitle:"how to" and then scraped all the URLs using SEOquake.

I then pulled the title in via =ImportXML(A2, "//title"), and I ran the list of URLs through Ahrefs’ bulk upload tool to get the estimated keyword and traffic volume. This let me quickly see which posts were the most impactful and popular.

I also wanted the head keyword, though, because we can then use that keyword to prioritize link building outreach. The logic is that, if we rank and bring in conversions for a keyword (say “email marketing”), then we should also seek to get links from other sites that rank for that keyword.

It would take forever to get each head keyword manually through Ahrefs, so I whipped up a script in R that does so using the SEMRush API. Here’s the code:

3. Analyze A/B Test Results in R

I always use R to analyze my A/B test results, and normally to slice and dice segments for post-hoc analysis.

It’s easy to run a simple t.test() in R, but there is a lot more you can do as well. Instead of listing it all here or trying to rewrite the code to share, just read this post. It’s awesome.
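To give a flavor of the basics, here is a minimal sketch on simulated data (not real test results). It runs base R’s t.test() on a continuous metric and prop.test() on conversion counts. The metric, sample sizes, and counts are all made up for illustration.

```r
set.seed(42)  # make the simulated data reproducible

# Simulated (not real) revenue per visitor for control and variant
control <- rnorm(500, mean = 20, sd = 5)
variant <- rnorm(500, mean = 21, sd = 5)

# Welch two-sample t-test on a continuous metric
result <- t.test(variant, control)
result$p.value    # chance of a gap at least this large if there were no true difference
result$conf.int   # 95% confidence interval for the difference in means

# For conversion rates, a test of proportions is usually a better fit.
# Made-up counts: 120 of 2400 visitors converted in control, 150 of 2400 in variant
prop.test(x = c(120, 150), n = c(2400, 2400))
```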

4. Check a list of URLs to see if they link to your website

People comparison shop. When someone searches “best live chat software,” they’re very likely about to buy some live chat software. The more times they see your product mentioned in a search like that, the more likely they are to consider it as something serious to check out.

That’s why I like “SERP Real Estate” (the percentage of search results for a given query that mention your brand) as an important measure of brand awareness. Unfortunately, no SEO or PR tool gives you this data. So let’s use R to pull the ranking URLs from SEMRush, scrape each page, and check each one for our link!

I’ve built a whole interactive application for this, but I’ll write the generic script that does the function here:
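As a simplified, offline sketch of just the checking step (not the full script, and with no SEMRush call or live scraping), the function below looks for a domain inside the href attributes of some HTML text. The function name, the example.com domain, and the page snippets are all hypothetical. A real version would fetch each page first and would likely use a proper HTML parser rather than a regular expression.

```r
# Returns TRUE if any double-quoted href in the HTML text points at the given domain
links_to_domain <- function(html, domain) {
  # Pull out every href="..." attribute value
  matches <- regmatches(html, gregexpr('href="[^"]*"', html))[[1]]
  any(grepl(domain, matches, fixed = TRUE))
}

# Made-up page snippets standing in for scraped search results
pages <- c(
  first  = '<p>Try <a href="https://www.example.com/chat">Example Chat</a></p>',
  second = '<p>We like <a href="https://other.test/tool">another tool</a></p>'
)

mentions <- vapply(pages, links_to_domain, logical(1), domain = "example.com")
mentions         # first: TRUE, second: FALSE
mean(mentions)   # 0.5, the share of pages that link to the domain
```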

Further Resources for Learning R

If you like courses, DataCamp is the best for pure, practical programming. Coursera has some more in-depth and formal courses (though it is my least favorite platform of those mentioned here). Udacity also has a great exploratory data analysis course using R.

I don’t find courses all that useful beyond the beginning of your learning journey. Instead, I like to work on projects and figure things out as I go. Some good inspiration for projects and blogs/people to follow in the R world:

Conclusion

Here’s my sister’s summary of her R learning journey:

“Learning R has been rather easy so far – in terms of learning the functions and how to use them. When actually applying what is learned to data sets and analyzing them independently, I can only guess will be much more difficult. From what I’ve gathered in the four hour course, plus some practice, is that learning R isn’t difficult in itself, but it’s more about the problems you’re working on.”

Couldn’t have said it better myself! Learning R isn’t that difficult; it’s only a tool to apply to the problems you’re trying to solve. So get learning!

Alex Birkett
Alex Birkett is a Growth Marketer and Content Strategist based in Austin, Texas. He's a proud UW-Madison graduate and enjoys craft beer, lifting weights, and sailing.
