I have long wanted to interview someone from YouTube as I think their social data is fascinating and incredibly vast. Every minute, 100 hours of video are uploaded to YouTube. Christyn Perras, a quantitative analyst at YouTube, is talking with Gnip about the career path to being a data scientist, the tools in her arsenal, YouTube’s data-driven culture and Coursera.
1. What was your path to becoming a quantitative analyst at YouTube? What would you recommend for others?
As an undergraduate, I studied psychology and was particularly drawn to the experimental side of the discipline. When I was considering an advanced degree, I concentrated on the aspects of psychology that I loved during my search for a graduate program. I eventually found a program that focused on applied statistics and experimental design at the University of Pennsylvania, where I received an MS and PhD. However, even after graduation, my career path remained unclear and the tech industry wasn’t even on the radar. It was when I started looking for jobs using search terms referring to my skill set rather than job titles that I saw a world of opportunity unfold in front of me.
My first job on the west coast was at Slide, a social gaming company. It was an amazing experience. At Slide, I used my psychology background to understand our users and the way they interacted with our products. In addition, my background in statistics and experimental design gave me the skills to study, test, quantify and interpret user behavior and to measure the impact of our influence. We sought answers to questions such as: Why were these people using our products? What made them come back? And what could we do to change their behavior and/or enhance their experience? I am now doing the same at YouTube, where I concentrate my efforts on understanding our creators and continuing to improve their YouTube experience via foundational research and experiment analysis.
2. I’ve noticed that Google doesn’t tend to use the title of data scientist. Is there a reason for this?
Not that I’m aware of. Data scientist, quantitative analyst, statistician and decision support analyst are all fairly interchangeable terms in the tech industry. As I mentioned before, my job search was most successful when I used keywords related to my skills and interests (statistics, psychology, experiments) rather than searching job titles (statistician). However, I imagine with the rising popularity and awareness of the field, naming conventions for job titles will likely become more standardized.
3. What is one of the most surprising aspects you’ve learned about YouTube data?
Honestly, I was surprised by the sheer amount of data! It is staggering. I had to learn a number of new programming languages and techniques just to be able to get the data I needed for an analysis into a manageable format. During my time at Penn, SAS, SPSS and SQL were the preferred tools and were incorporated into the curriculum. Without a more extensive computer science background, areas such as MapReduce and Python were quite new to me. I’m also continually expanding my knowledge and experience with techniques used to manipulate, reshape and connect data on this scale. When working with billions of data points, you often need to think creatively.
There is a strong data-driven culture at YouTube and, as a result, product managers and analysts work very closely. In the case of a product change or redesign, analysts are involved in the process from the start. Early involvement ensures, for example, that the data necessary for analysis are collected, that experimental arms are set up correctly, and that logging is accessible and bug-free. We discuss the goals and expectations of product changes in depth to make sure analyses are designed to answer the right question and will produce valid, actionable results. Analysts and product managers typically have a steady dialogue throughout the course of analysis. Once the analysis is complete, we discuss the results, interpret the meaning, consider the implications, and make decisions about the next steps.
I love Coursera! My favorite courses include Data Analysis with Jeff Leek and Computing for Data Analysis with Roger Peng. Coursera is doing something truly great and I look forward to seeing how they grow and progress. Data science is a bit nebulous in terms of education (at least it was when I was in school). There wasn’t a “data science” major or anything like that, so it was necessary to piece it together yourself. I have an amazing team with wildly different backgrounds from physics to psychology to economics. I love bouncing ideas off my colleagues and am guaranteed wonderfully unique and clever perspectives. Companies like Coursera make dynamic teams like this possible by giving people from a wide variety of disciplines access to the additional education they need to shape their career path and be successful in their job.
Another amazing resource for future data scientists is OpenIntro (openintro.org), co-founded by my colleague David Diez. With OpenIntro, you’ll find a top-notch, open-source statistics textbook and a wealth of supporting material.
Thanks to Christyn for the interview! If you’re interested in reading more interviews, check out Gnip’s collection of 25 Data Stories for interviews with data scientists from Foursquare, Pinterest, Kaggle and more.