Data scientists Mohammad Shahangian of Pinterest, Kostas Tsioutsiouliklis of Twitter, and Adam Laiacano of Tumblr discuss the challenges and opportunities in social data.
As Gnip’s own data scientist Dr. Skippy was joined on stage by three data scientists representing three prolific social networks, Big Boulder Master of Ceremonies Lindsay Campbell couldn’t help gushing to the crowd, “This is by far the sexiest panel this year.” (This was a reference to the Harvard Business Review naming data scientist the sexiest job of the 21st century.)
Physical appearance aside, there could hardly be a truer statement for Big Boulder attendees, a legion of self-proclaimed data nerds.
Scott Hendrickson, better known as Dr. Skippy, Data Scientist at Gnip, was joined on stage by Mohammad Shahangian of Pinterest, Kostas Tsioutsiouliklis of Twitter, and Adam Laiacano of Tumblr.
A Look at the Data Science Departments
The conversation began with each guest sharing the size of data science teams and roles at their respective organizations.
The data science team at Twitter currently comprises 7-8 people, and the company is looking to build it to a team of 20 in the near future (see open positions here). Data scientists at Twitter fall into two groups: a business intelligence and insights team, and individual data scientists who are embedded into product teams. The embedded data scientists become key stakeholders in improving and evolving the product.
The business intelligence team works collaboratively to explore ideas and create reports, even when the findings are not favorable to the company. As Kostas explains, data scientists are trusted at Twitter. It’s OK to report the truth.
At Pinterest, there are 8 full-time data scientists on the team. The primary goal for data scientists is to understand what users are doing and to put pinners first, a strong company value. Much like Twitter, Pinterest integrates its data scientists into other engineering teams. This blend of engineers and data scientists on the same team enables nimble product iterations. Since data scientists were added to the mix, Pinterest teams have been requesting deeper and deeper metrics to measure success and plan product.
Tumblr’s team of data scientists is also eight strong, split across two roles: a six-person search and discovery team and a two-person business intelligence team. The search and discovery team is tasked with maintaining the quality of the data, building products that make the data usable, and ensuring the end product is something users enjoy. The business intelligence team is highly self-reflective, investigating the actions users take to determine which are indicative of long-term success. The outcome is most frequently reporting.
Data Science Impact on Product
At Tumblr, there is a significant amount of testing around registration and onboarding, that is, what users see when they land at Tumblr.com. However, Adam is quick to add that Tumblr has a unique view on its research, stating, “You don’t have to do as much research on your product when you use it yourself.”
Data scientists at Twitter report metrics all the way to the top. The CEO and the executives ask questions about the data around the launch of a new product, and they value the input of data scientists.
Because the data science team shares data with product teams, Pinterest engineers are being driven by the data. Mohammad shares, “After exposing metrics to people, the first instinct is to want to make the metrics better. This brings a culture of people who come to the data science team and seek their input. They take the ideas of product and run some queries to see if the data validates it. We’ve made it very easy for product teams to set up experiments, we don’t even call them experiments anymore.” Expanding on this, he shares an anecdote from a recent rewrite of the entire website. When it launched, the data scientists noticed a dip in follows. The team’s investigation led to the understanding that the enhanced speed of the rewritten website had eliminated a small lag that followed a user’s like, a lag during which users had been following pinners on the site. Once the team corrected for this, follows went back up.
Who You Callin’ Sexy?
As Dr. Skippy joked about the popularity, ahem, sexiness, of the data science title, conversation turned to the lack of an industry-standard definition for the role. The panel noted there is often confusion with, and a lack of differentiation from, business analyst and business intelligence roles.
Kostas began by noting that data science is not about analyzing but about prediction. Twitter data scientists are also engineers. Their backgrounds include statistics, data mining, machine learning, and engineering.
Further delineating the role from that of data analysts, Mohammad pointed out that analysts aren’t pulling their own data. He added, “If you can’t pull your own data, how can you figure out what you want? A data scientist is skeptical. If results seem too good to be true, they will investigate. Question the data. Analysts will take the data as the data.”
Adam describes a good data scientist as an individual who can get data in any format and clean it up, who can take weird, fuzzy forms and see the layout of the information that is available, and who can connect the pieces of the puzzle to build a data set that is useful.
The Future For Analysis of Social Data
Much of data science to date has been ad hoc, but the panelists agree that when you look closely at what data scientists do, it comes down to templates and patterns. Over time this work will become progressively more standardized. With new, faster tools it will move away from ad hoc processes. Teams will build models and tools to solve recurring problems.
Adam of Tumblr added optimistically that the future lies in the work data scientists will do as they collect data across platforms and across multiple streams. It’s up to those developing third-party tools and resources to innovate using all the data.
Lastly, Mohammad chimed in that machine learning and predictive modeling are the sexy amongst the sexy, adding, “That’s what we’re all waiting for.”