When I was in charge of email at my last startup, the MailChimp blog was a must-read. Their approach to email marketing is brilliant, so when my colleague suggested I interview MailChimp’s chief data scientist, John Foreman, for a Data Story, I was definitely on board. In addition to being a data scientist at MailChimp, John is also the author of Data Smart: Using Data Science to Transform Information into Insight. You can follow him on Twitter at @john4man.
1. People have a love/hate relationship with email. How can data science help people love email more and get more out of it?
Think about a true double-opted email subscription versus, say, a Facebook “like” of a product. When I like a product page on Facebook, do I really want to hear from them in my feed? In part, isn’t that “like” just an expression that’s meant for public display and not for 1-to-1 ongoing communication from the business?
Contrast that with email. If I opt into a newsletter, I’m not doing that for anyone but myself. Email is a private communication channel (I like the term “safe place”). And I want your business to speak to me there. That’s powerful. Now think of a company like MailChimp. We have billions of these subscriptions from billions of people all across the world. MailChimp’s interest data is unparalleled online.
OK, so that means that as a data scientist, I have some pretty meaty subscription data to work with. But I’ve also got individual engagement data. Email is set up perfectly to track individual engagement, both in the email, and as people leave the email to interact with a sender’s website.
So I use this engagement and interest data to build products — both weapons to fight bad actors as well as power tools to help companies effectively segment and target individuals with content that’s more relevant to the recipient. My goal is to make the email ecosystem a strong one, where unwanted marketing email goes away and the content that hits your mailbox is ideal for you.
For instance, MailChimp recently released a product called Discovered Segments that uses unsupervised learning to help users find hidden segments in their list. Using these segments, the sender can craft better content for their different communities of recipients. MailChimp uses the product ourselves; for example, rather than tell all our customers about our new transactional API, Mandrill, we used data mining to only send an announcement to a discovered segment of software developers who were likely to use it, resulting in a doubling of engagement on that campaign.
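To make "unsupervised learning to find hidden segments" concrete, here is a minimal sketch of clustering subscribers by engagement. It is not MailChimp's Discovered Segments algorithm, which the interview does not describe; plain k-means is swapped in as a familiar stand-in, and the two features (click rates on developer and marketing content) and all the numbers are hypothetical.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means on tuples of floats; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels

# Hypothetical per-subscriber features:
# (click rate on developer content, click rate on marketing content)
subscribers = [
    (0.90, 0.10), (0.80, 0.20), (0.85, 0.15),  # behave like developers
    (0.10, 0.90), (0.20, 0.80), (0.15, 0.70),  # behave like marketers
]

centroids, labels = kmeans(subscribers, k=2)
for features, label in zip(subscribers, labels):
    print(label, features)
```

No one tells the algorithm which subscribers are developers; the grouping falls out of the engagement data alone, which is what lets a sender target an announcement at one discovered segment instead of the whole list.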
2. How is data science structured at MailChimp? How big is your team, and what departments do you work with?
MailChimp has three data scientists, and our job as a little cell is to deliver insights and products to our customers. That sounds like business-speak, so let me break it down.
By insights, I mean one-off research and analysis of large data sets that’s actionable for the customer. And by products, I mean tools that the customer can use to perform data analysis themselves. If the tool or product isn’t useful or required by the customer, we don’t build it. A data science team is not a research group at a university, nor is it a place just to show off technologies to investors. We’re not here to publish, and we’re not here to build “look at our data…ooooo” products for the media. Whenever a data science team is involved in those activities, I assume the business doesn’t actually know what to do with the technical resources they’ve hired.
Now, who is the “customer” in this mission? We serve other teams internally as well as MailChimp’s user base. So an example of a data product built for an internal customer would be Omnivore — our compliance AI model, while an example of a data product built for the general user population would be our Discovered Segments collaborative filtering tool.
We work very closely with the user experience team at MailChimp — the UX team is constantly interviewing and interacting with our users, so they generate a lot of hypotheses which we investigate using our data. The UX team, because their insight is built quickly from human interactions, can flit from thought to thought and project to project; when they think they’re onto something good, they kick the research idea to the lumbering beast that is the data science team. We can comb through our billions of records of sends, clicks, opens, http requests, user and campaign metadata, purchase data, etc. to quantitatively back or dismiss their new thinking.
3. Your book, Data Smart, is about helping to teach anyone to get value out of data. Why did you see a need for this book?
I used to work as a consultant for lots of large organizations, such as the IRS, DoD, Coca-Cola, and Intercontinental Hotels. And when I thought about the semi-quantitative folks in the middle and upper rungs of those organizations (people more likely to still be using the phrase “business intelligence” as opposed to “data science”), I realized there was no way for those folks to dip their toe into data science. Most of the intro books made a lot of assumptions about the reader’s math education background, and they depended on R and Python, so the reader needed to learn to code at the same time they learned data science. Furthermore, most data science books were “script kiddy” books: the reader just loaded stuff like the SVM package, built an AI model, and didn’t really know how the AI algorithms worked.
I wanted to teach the algorithms in a code-free environment using tools the average “left behind” BI professional would be familiar with. So I chose to write all my tutorials in Data Smart using spreadsheets. At the same time though, I pride myself on writing a more mathematically deep intro text than what you find in many of the other intro data science texts. The book is guided learning; it’s not just a book about data science.
Now, I don’t leave the reader in Excel. I guide them into using R at the end of the book, but I only take them there after they understand the algorithms. Anything else would be sloppy.
Another reason I wrote the book is that the market didn’t have a broad data science book. Most books focus on one topic, such as supervised AI. Data Smart covers data mining, supervised AI, basic probability and statistics, optimization modeling, time series forecasting, simulation, and outlier detection. So by the time the reader finishes the book, they’ve got a Swiss Army knife of techniques in their pocket and they’re able to distinguish when to use one technique and when to use another. I think we need more well-rounded data scientists, rather than the specialists that PhD programs are geared to produce.
4. You’ve written a book, maintain a personal blog and write for MailChimp. How important have communication and writing skills become to data scientists?
I believe that communication skills, both writing and speaking, are vital to being an effective data scientist. Data science is a business practice, not an academic pursuit, which means that collaboration with the other business units in a company is essential. And how is that collaboration possible if the data scientist cannot translate problems from the high-level vague definition a marketing team or an executive might provide into actual math?
Others in an organization don’t know what’s mathematically possible or impossible when they identify problems, so the data science team cannot rely on them to fully articulate problems and “throw them over the fence” to a data science team ready-to-go. No, an effective data science team works as an internal, technical consultancy. The data science team knows what’s possible and they must communicate with colleagues and customers to understand processes and problems deeply, translate what they learn into something data can address, and then craft solutions that assist the customer.
5. Time for the Miss America question. If you had access to any data in the world, what is the question or problem you’d most like to solve?
I am a huge fan of Taco Bell. And I recognize that the restaurant actually has very few ingredients to work with — their menu is essentially an exercise in combinatorial math where ingredients are recombined in new formats to produce new menu items which are then tested in the marketplace. I’d love to get data on the success of each Taco Bell menu item. Combined with possible delivery format information, nutrition information, flavor data, and price elasticity data, I’d love to take a swing at algorithmically generating new menu items for testing in the market. If sales and elasticity data were timestamped, perhaps we could even generate menu items optimized for and only available during the stoner-friendly “fourthmeal.”
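The "exercise in combinatorial math" can be illustrated with a few lines of code. This is only a sketch of the enumeration step: the ingredient and format lists are made up for illustration, and ranking the candidates would need the sales, nutrition, flavor, and elasticity data John mentions, which is not modeled here.

```python
from itertools import combinations, product

# Hypothetical inputs; not an actual menu or ingredient list.
ingredients = ["beef", "beans", "cheese", "lettuce", "rice"]
formats = ["taco", "burrito", "bowl"]

# Every choice of 3 ingredients, served in every delivery format.
candidates = [
    (fmt, combo)
    for fmt, combo in product(formats, combinations(ingredients, 3))
]

print(len(candidates))  # 3 formats * C(5, 3) = 3 * 10 = 30
print(candidates[0])
```

Even this toy setup yields 30 candidate items, which is why a scoring model would be needed to decide which combinations are worth testing in the market.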
Thanks to John for taking the time to speak with Gnip! If you’re interested in more Data Stories, please check out our collection of 25 Data Stories featuring interviews with data scientists from Kaggle, Foursquare, Pinterest, bitly and more!