Bad Data, the Right Data and Le Data

Gnip believes social data can change the world, and our leadership team has been writing about data for O’Reilly and speaking at the Sentiment Symposium and LeWeb. We wanted to share what they have been talking about.

Bad Data by O'Reilly

Gnip CEO Jud Valeski wrote a chapter in the recently released O’Reilly handbook “Bad Data” by Ethan McCallum. Jud’s chapter, “Social Data: Erasable Ink?”, covers how the evolving social media landscape is challenging expectations about how people interact with social data and who owns it. Gnip is committed to providing terms-of-service-compliant social data, and this chapter talks about the expectations around social data and how the various players are managing them.




Our COO Chris Moody speaking at the Sentiment Symposium on “Building Sentiment Analysis on the Right Social Data”

Building Sentiment Analysis on the Right Social Data (Chris Moody, Gnip) from Seth Grimes on Vimeo.

Jud being interviewed by Robert Scoble at LeWeb

Data Story: Interview with Seth Grimes of the Sentiment Symposium

Seth Grimes is an industry analyst and consultant who specializes in data analysis technologies. He also organizes the Sentiment Analysis Symposium, which Gnip is pleased to sponsor. You can meet Gnip, including President & COO Chris Moody, at the next symposium, October 30 in San Francisco. If you register, you can use the code GNIP for a $200 discount!

Seth Grimes of Sentiment Symposium

1. How do sentiment, and sentiment analysis, differ from industry to industry? Sentiment does differ from industry to industry! Thin sheets and walls in a hotel are bad, but a thin mobile phone or tablet is good. Humans understand the difference; after all, it’s people who generate the content we’re mining, the flood of reviews, tweets, status updates, and messages. Our task as business people and technologists is to automate understanding. We need to make social, online, and survey analyses systematic and reliable and to align analyses with business goals. You do need sentiment capabilities that account for differences from industry to industry. Solutions should understand differences between hospitality and consumer electronics, and also healthcare and politics. But we also need to get at cultural and demographic factors behind the data, in order to completely understand what people are saying, feeling, reposting, and planning.

2. You’ve talked about how “good, clean, comprehensive social data is central for any meaningful social-media analytics initiative.” What are other considerations people should have when employing social data for sentiment? “Good, clean, and comprehensive” is a start; continue the list with relevant and useful. It’s easy to get overwhelmed by the social-data flood, but much, even most, of what’s being said on-line and on-social has little business value. You have to filter out the noise, by which I mean, in this context, data you can’t use. For instance, one of my friends just tweeted, “Let’s Stop Demonizing [Do It Yourself]. If Home Depot didn’t put contractors out of business,… why are we so worried about DIY [market] research?” I’d bet that Home Depot picked up that tweet, but I sure hope they ignored it. But if you’re following market research thought leadership, you want to pay attention. Of course, good, relevant data isn’t enough. You need analytical tools that will help you make sense of it and communicate usable business insights. Sentiment analysis plays that role, as a key element in larger social and enterprise analytics programs, applied for customer service and support, market research, media analysis, clinical medicine, and a host of other applications.

3. What do you think the biggest changes in sentiment have been in the last few years, and how do you see it evolving in the future? There has been immense growth in awareness of the power and possibilities offered by automated sentiment technologies, accompanied by market emergence of a slew of new tools. Unfortunately, many of them are simplistic and over-promise, which has led some to question the accuracy and usefulness of automated methods. Fortunately the challenge — and the business opportunities — have spurred the development of new, better methods. One is “active learning,” which is essentially human-curated machine learning. Another is application of crowdsourcing to sentiment-rating tasks, both for business analyses and to create training data for machine learning processes. Again fortunately, the major part of the technical complexity is hidden behind the scenes. There are some great analysis tools out there, accessible for business users and designed to produce business-usable insights.

4. What are the most innovative uses of sentiment analysis you’ve seen? It’s really interesting seeing applications of sentiment analysis for politics and policy, for 2012 election analysis, by media organizations, the campaigns, and researchers. We’re devoting a segment of the upcoming Sentiment Analysis Symposium, October 30 in San Francisco, to this topic. I expect significant lessons from political analyses that are applicable to work in business market research, competitive intelligence, customer experience, marketing, and public relations. I’m also impressed by efforts that have linked social- and survey-mined sentiment to behavior models, psychological profiles, demographic data, and transactional records. Multi-source, multi-method “triangulation” is a real advance in creating business insight.

5. At the Sentiment Symposium, the American Cancer Society is talking about how to assess and react to market situations using social data. What else do you think social data can tell us? There are really very few limits at this point. Anything a human might grasp by reading social and online postings, a machine can also grasp, not perfectly but with increasing precision. Machines (computer software) have the speed and power to exceed human abilities. Automated analyses can find subtle patterns over time, geographically, related to certain language usage, that a human would never detect. They can link these patterns to real-life profiles and behaviors in order to predict people’s preferences and plans. We’re not quite there yet, but we’re closing in on the point where social data can tell us [...]. I’m looking forward to the symposium talk by Liz Keck of the ACS, and especially to social psychologist Kate Niederhoffer’s keynote, Sentiment Driven Behaviors, Sentiment Driven Decisions, and to the Dow Jones, Luminoso, and eBay talks. I shouldn’t play favorites, however; we have some great speakers, and I hope folks in the Gnip community will join us.
