Blog and Comment Data Answers the Why

Shoppers Tweet about what they bought, but they turn to blogs and comments to share why they bought.

This is only one example of what makes the long-form data from blog and commenting platforms valuable to any company looking to better understand why their customers and prospects make the decisions they do. Simply put, blogs and comments are opinion rich. And when it comes to product development, sales, brand management, and more, these opinions provide a unique and critical lens into the nuanced thinking behind customer decisions.

In the past, some social media monitoring providers have used scraping solutions to include blog and comment data in their offerings. While this can get you the data, scraping has several fundamental challenges. The data can be days or weeks old. Scraping solutions often ignore terms of service and user intent, meaning the data can disappear at a moment’s notice when the scraper gets blocked. The data can come in a range of formats that make it very difficult to parse and analyze. And with scraped data, you only get results from the blogs and comments that you know you should be looking at, missing important discussions that surface in new and unexpected places.

It’s because of these challenges that we’re introducing Gnip for Blogs, combining content from four of the most popular long-form blog and comment sources. This first-of-its-kind package of data from Disqus, Tumblr, WordPress and IntenseDebate gives realtime, normalized, terms of service-compliant access to the rich conversations happening across a huge swath of the Internet. With Gnip for Blogs, customers are able to easily and confidently build their business applications on multiple sources of long-form data knowing it won’t suddenly disappear tomorrow.

Each of these sources has a story to tell on its own, but by looking at them all together brands are able to draw insights from an enormous range of discussion. This includes the mass-market reach of WordPress, which powers 19% of the web; the high volume of brand mentions on Tumblr; the highly engaged audience on IntenseDebate; and the enormous reach and quality of the conversations on Disqus.

One of our customers, Networked Insights, recently used realtime WordPress data to identify early technology trends based on influencer blog conversations. They then used this content to refine and focus a targeted online promotion for one of their customers. The end result? A 30% lift in ROI for their online ad spend. And this is only the beginning.

For more information, check out the Gnip for Blogs on our website or contact us at

Mining Consumer Opinion in Comments

An interview with Daniel Ha and Steve Roy from Disqus on mining opinion in comments. 

Commonly known as a comment system, Disqus facilitates comments from over 2.5 million sites. The team at Disqus, Daniel Ha and Steve Roy, like to think of themselves as a community of other communities. But how do they distinguish themselves?

Communities and Identity

Any discussion that happens on Disqus is, by its nature, its own community. Disqus found that the majority of users’ time was spent below the fold, in the comments. Part of what fuels this is the ability to act under a pseudonym. Disqus maintains that by embracing a pseudonym, people can act as their “real” selves. They find that people who embrace a pseudonym reveal a more passionate interest than they normally would. It gives people a voice they wouldn’t typically be able to use, enabling a user to pursue things that mainstream media may not be covering, or to be part of a community they couldn’t otherwise join.


Brands can tap into Disqus in three ways:

  1. On their properties utilizing Disqus: Brands like HP have launched destination websites with Disqus to participate in the conversation naturally happening.
  2. Disqus’ ad product: Brands can pay to have a presence on other websites (like a Tumblr blog) and place their content above the comment feed. The response to content placed there is also higher because it sits where the audience is more engaged.
  3. Learned Insights: Brands can use pattern detection to learn stories about their brands. A great example of this is when there needs to be a product recall, because a lot of this type of discussion takes place in these stories.

Data Learnings

Disqus recently achieved a major milestone, reaching 1 billion monthly unique visitors. Although Disqus is often considered US-focused, the majority of its growth in recent months has been international. Disqus supports 40+ languages worldwide. Through its many users, Disqus has been able to understand the behavior patterns on its network and noted three things in particular:

  • Comment Length: The number of characters can tell a lot about users’ level of interest. Steve says 57% of all comments are essentially the length of a Tweet (under 140 characters) and contain no links.
  • Time of Day: The worldwide pattern for commenting shows a peak in volume at 10 am in every time zone. Not only does this mean more people comment at this time of day, they also engage with other comments and read comments then too.
  • Categories: Disqus buckets their sites into about 45 different types, and each category has its own statistics. For instance, gamer sites average about 10 characters per comment. Religious sites, on the other hand, average closer to 600 characters per comment. For a brand, this is valuable data that can help shape how it engages with users.
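
The length statistics above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample rows, the category labels, and the helper names (`is_tweet_like`, `tweet_like_share`, `average_length_by_category`) are hypothetical and are not part of any Disqus or Gnip API.

```python
import re
from statistics import mean

# Hypothetical sample data: (site category, comment text) pairs.
comments = [
    ("gaming", "gg"),
    ("gaming", "nice run, see http://example.com/clip"),
    ("religion", "x" * 600),
    ("news", "I agree with the article."),
]

# Assumption: a "link" is anything that looks like an http(s) URL.
URL_PATTERN = re.compile(r"https?://\S+")


def is_tweet_like(text, limit=140):
    """True when the comment is under `limit` characters and has no link."""
    return len(text) < limit and URL_PATTERN.search(text) is None


def tweet_like_share(rows):
    """Fraction of comments that are short and link-free."""
    return sum(is_tweet_like(text) for _, text in rows) / len(rows)


def average_length_by_category(rows):
    """Mean comment length in characters for each site category."""
    lengths = {}
    for category, text in rows:
        lengths.setdefault(category, []).append(len(text))
    return {category: mean(values) for category, values in lengths.items()}


share = tweet_like_share(comments)
averages = average_length_by_category(comments)
```

On real data, `share` would correspond to the kind of figure Steve quotes (57% of comments), and `averages` to the per-category comparison between gamer and religious sites.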

Disqus is proud of the use cases of their data too. Several examples were mentioned, like Gooqus, a search engine utilizing both Google custom search and Disqus. This allows a user to not only see the top Google results, but also add a layer of richness, allowing for more sentiment to be derived from the data.

Big Boulder is the world’s first social data conference. Follow along at #BigBoulder, on the blog under Big Boulder, Big Boulder on Storify, and on Gnip’s Facebook page.

Observations On Disqus: The Spread of Words

Marketers and communicators all share a similar goal: to become part of the conversation. Comments in reaction to blogs and news stories are a fantastic place to discover the topics that are driving conversation. To dig deeper, we recently looked at public comments from Disqus, the world’s largest discussion platform, to see what was getting online chatter at the end of 2012. With 70,000 comments published on Disqus every hour, you can find insights and conversations that can’t be found elsewhere.

What we found is that communicators often use a language set that the audience does not share. In discussion, the most common-denominator language dominates.

Let’s look at a couple of Disqus data science examples.

The Fiscal Cliff

Social Media Discussion of Fiscal Cliff

At the end of 2012, one topic that dominated mainstream publications and political blogs was the Fiscal Cliff, when a series of tax cuts in the United States was expected to expire at the end of the year. Since this was a topic of contention between the Democrats and Republicans, you would have expected it to be a passionate point of conversation during the Election. As it turns out, this wasn’t exactly the case. When did the Fiscal Cliff talk start? The day after the Election. And the discussion was couched in broader terms than just the acute “Fiscal Cliff” crisis. So while Washington operates and speaks in continual crisis mode, the public thinks of these challenges in broader, more systemic terms.

Disqus Conversations on Taxes and Medicare

While the Fiscal Cliff wasn’t a hot topic until after the election, taxes and Medicare saw consistent conversations before and after the election.

Timing is everything when it comes to starting conversations. While the Election focused on what happened in the past four years and what would happen in the next four years, the day after the Election homed in on what was immediately down the road: the Fiscal Cliff.

Skyfall vs Breaking Dawn vs Twilight

Skyfall vs Breaking Dawn vs Twilight on Disqus

Moving from politics to pop culture, we were curious what would generate more conversation — a bunch of sparkly vampires driving Volvos (the movie Breaking Dawn, the fourth installment in the Twilight series) or the eponymous spy from England (Skyfall). We were initially surprised to see that Skyfall generated more chatter around its premiere on Nov. 9 than Breaking Dawn saw for its premiere on Nov. 18. However, when we took a closer look by adding the term Twilight into the mix, we found that Twilight created more chatter than Skyfall.

Comments are an excellent barometer of buzz around upcoming events and launches. Even more than that, comments can help companies understand what terms people use about an event. In this example, if you were the studio marketer using content marketing to promote the release of Breaking Dawn, your odds would improve by using Twilight in your headline.
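
A term comparison like the one above comes down to counting, per day, how many comments mention each term. The following is a minimal sketch of that idea, not the analysis actually used for these charts: the sample comments and the `daily_mentions` helper are hypothetical, and matching is a simple case-insensitive substring test.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical sample: (publication date, comment text) pairs.
comments = [
    (date(2012, 11, 9), "Skyfall was great"),
    (date(2012, 11, 18), "Seeing Twilight tonight"),
    (date(2012, 11, 18), "Breaking Dawn, the last Twilight movie"),
    (date(2012, 11, 18), "twilight marathon!"),
]


def daily_mentions(rows, terms):
    """Count, per day, how many comments mention each term."""
    counts = defaultdict(Counter)
    for day, text in rows:
        lowered = text.lower()
        for term in terms:
            # Assumption: a plain substring match is good enough here.
            if term.lower() in lowered:
                counts[day][term] += 1
    return counts


mentions = daily_mentions(comments, ["Skyfall", "Breaking Dawn", "Twilight"])

# Sum the daily counts to compare overall chatter per term.
totals = Counter()
for per_day in mentions.values():
    totals.update(per_day)
```

Substring matching will also count unrelated uses of a word such as "twilight", so a real analysis would likely need tokenization or additional context terms.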

Movie vs. Libya vs. Benghazi

While searching for popular movies on Disqus, we found an interesting spike for the term “movie” in mid-September, but couldn’t attribute it to a popular movie. After some digging, we realized that this was related to “Innocence of Muslims,” the controversial spoof movie about Islam. While the movie was originally uploaded to YouTube in July, it aired on an Egyptian network on Sept. 9, which immediately sparked protests that quickly spread to Libya. On Sept. 11, four Americans, including the Ambassador, were killed in Benghazi, Libya. While the terms Libya and movie spiked immediately, Benghazi built up momentum more slowly over time, spiking right before the election as it became part of the political debate between the two parties.

Buzz around current events doesn’t always spike right after the event. As new facts and information are disseminated, the current of conversation can change. In this scenario, a new and more specific term, “Benghazi,” came to dominate the conversation as it slowly became shorthand for the overall issue. What carries conversation is language that accelerates understanding and lowers the barrier for participation.

Ultimately, comments are windows into not only what people are talking about but also when topics tip over into public consciousness and what the driving forces are behind when conversations peak. In the same way that communicators deploy search engine optimization to target searchers, they also need to incorporate conversation optimization strategies to become part of the conversation.

Get the Disqus Firehose With New Filtering Options

In February, we announced that the full Disqus firehose of public comments is available through Gnip. Our customers love the conversations in Disqus, but have asked for tools to filter the stream so they receive only the conversations they want. Today, we’re announcing our new Disqus PowerTrack offering. Similar to our Twitter PowerTrack product, Disqus PowerTrack offers powerful filtering so customers can filter the full Disqus firehose of public comments to extract the specific conversations they’re looking for. With over 500,000 comments created each day on Disqus, there is a huge range of conversations taking place and you don’t want to miss the ones about your brand or products.

With Disqus PowerTrack, you have a wide array of filtering options. You can filter for specific keywords. You can constrain that filter to specific websites. Or you can look for just the mentions that have links. So, if you’re looking for brand mentions of Apple, you can track conversations about the iPhone or brand mentions in general. You can also monitor for comments mentioning the iPhone that have links in them so you can understand what online stores are being promoted along with your products. See the full list of Disqus PowerTrack Operators in our documentation.
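
To make the three kinds of filters concrete, here is a rough Python sketch that applies them to comments already in memory. It does not use or imitate the PowerTrack rule syntax, which is covered in the operator documentation; the `Comment` fields, the `matches` helper, and the sample data are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass
class Comment:
    """Hypothetical, simplified comment record (not the real payload format)."""
    text: str
    page_url: str


# Assumption: a "link" is anything that looks like an http(s) URL.
LINK = re.compile(r"https?://\S+")


def matches(comment: Comment, keyword: str,
            site: Optional[str] = None, require_link: bool = False) -> bool:
    """Keyword filter, optionally limited to one site and to comments with links."""
    if keyword.lower() not in comment.text.lower():
        return False
    if site is not None and urlparse(comment.page_url).hostname != site:
        return False
    if require_link and LINK.search(comment.text) is None:
        return False
    return True


stream = [
    Comment("Love my new iPhone", "http://blog.example.com/review"),
    Comment("iPhone deals at http://store.example.com", "http://news.example.org/a"),
    Comment("Android all the way", "http://blog.example.com/review"),
]

iphone_all = [c for c in stream if matches(c, "iphone")]
iphone_on_blog = [c for c in stream if matches(c, "iphone", site="blog.example.com")]
iphone_with_links = [c for c in stream if matches(c, "iphone", require_link=True)]
```

The last list corresponds to the use case described above: comments that mention the iPhone and include a link, which shows which online stores are being promoted alongside the product.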

To see the power of the full Disqus firehose, check out this graph showing all mentions of Apple on Disqus. On a normal weekday, there are almost 10,000 comments about Apple. For big events, like WWDC, you see a spike to almost 40,000 comments per day. That’s a lot of conversations.

We’re big proponents of the conversations that happen in comments, and we’re committed to making it easier for companies to understand and be able to participate. Our new Disqus PowerTrack makes it easier than ever to understand the types of conversations happening in comments.

If you have any questions about the new Disqus capabilities, please contact your sales rep or our sales team at

Big Boulder: From Monologue to Dialogue with Disqus

An interview with Daniel Ha and Ro Gupta of Disqus about how to engage using comments.

Big Boulder Panel at Disqus

Today Disqus is one of the most widely used discussion platforms on the web. Everyone from small blogs to large media brands uses Disqus. Daniel Ha says Disqus likes to talk about how people don’t know their brand, but they are familiar with Disqus’s core discussion engine. When Disqus launched four years ago, they didn’t know anything about blogs, comments or publishers. Instead, Disqus wanted to tackle online communities to build more loyal audiences. Today audience development is just as important as content.

Launch of Disqus 2012

The Disqus team wanted to analyze how they would launch Disqus if it were a new product in 2012: How would they build it? Disqus knew their value was with their users; they knew 98% of people would never comment online, so they built a product for people who get value from lightweight engagement. “Comments” is very broadly defined. Over time, Disqus wants to move away from comments and move to how discussions power communities. Disqus knew the user experience and were able to produce Disqus 2012.

But they’re also providing hard metrics for publishers. With Disqus 2012, publishers saw a 41% increase in engagement across sites. They also have an incredible new feature in their real-time view of users on Disqus. You can view it at

Social and Disqus

Daniel says, “Disqus has been described as a social commenting system, don’t necessarily agree with it.” Social adds an extra dimension that wasn’t available 10 years ago. Disqus fosters relationships and more topic-centric conversation. It’s not necessarily between friends, but rather connecting people on a common topic.  So yes, it’s social commenting, but it’s much deeper than that.

“Discussions have always been part of the promise of the internet,” explained Daniel. He then gave the analogy of communities being like your favorite local bar. Sure, you can go anywhere to get cheap drinks and hang out, but you have your favorite bar because that’s where you’re comfortable and you know the people there. Disqus’s communities attract experts and novices who want to come together and connect on a common theme.


As with any social platform, there’s a concern with identity and how it intersects with level of engagement. Disqus has found there’s a middle ground of users who have an identity, though it’s not tied to their real identity. They provide many high-quality comments. Some level of identity choice is important in communities. It’s not about hiding something, but it allows a multi-faceted approach to expression. When there’s more freedom in the expression, Ro says, “Real insights can be drawn from the data.”

Fun fact about Ro Gupta: he coined the “Big Boulder” name. Cheers to that!

To end the session, Chris Moody also announced an easier way to filter comments from Disqus. More information will be available in the near future.

Big Boulder: Blogs, Comments, Forums and Rich Social Data Gestures

A panel discussion on forums, comments and blogs and other rich social data gestures with Ro Gupta from Disqus, Mark O’Sullivan from Vanilla, Mike Preuss from FormSpring and Martin Remy from Automattic, and moderated by Nicole Glaros from TechStars.

Social Gestures Panel at Big Boulder

The definition of community can vary widely across platforms, but is there a real definition of community? At Disqus, they like to think of communities as a continuum. First come comments, then conversations on Twitter, blogs, and other platforms. Once the conversation is developed, it gives way to a community. Communities are about recognition and repetition, and forums allow for these communities to develop. Commenting systems are a jumping-off point for communities. Mark O’Sullivan of Vanilla called them “community training wheels” because they are a good starting point for community forums. They reinforce familiarity and often lead to offline communities as well. Mike Preuss explained what draws people into communities: FOMO. As social beings, the “fear of missing out” or FOMO drives communities. When 45% of daily users on FormSpring are creating content and engaging others, users feel the need to contribute to the conversation. About 74% of visitors to Disqus will return every day or every other day. “When you think you’re missing out on something,” Mike says, that defines a community.

Developing a community is a huge task, but the bigger task is engaging users. At Vanilla, they have a full range of social gestures because not everyone will be able to contribute to every topic. But by using lightweight gestures such as “likes,” or “smiles” in the case of FormSpring, content creators can receive feedback and readers get some way to signal back. These gestures also help to identify good and bad content, influencers and contributors in the community, and to drive moderation. There’s a tremendous push toward allowing anyone the chance to become a content creator. A recent and fascinating case of this is Pinterest; “pinning” photos is creating content and allows users to express who they are.

At Disqus, they focus on reaction tools, according to Ro Gupta. Ro says they want to be able to reengage after the fact, and this includes cross-pollinating on other platforms like Facebook and Twitter. Engagement can be measured by “daily active users.”  The 90:9:1 rule is something that Disqus deems true for their platform. 90% of users are passive clickers, 9% help curate content, and 1% create the most content and drive discussions. However, there is a middle ground because of lightweight gestures that encourage users to engage on a smaller scale. According to Ro, about 35% of users contribute solid participation in the form of  lightweight gestures. In the case of Vanilla, Mark said some users were hesitant to allow lightweight contributions, but over time, users found it encouraged new content and lowered barriers to engagement.

Lightweight interactions are relatively new, but do they really affect product roadmaps? The answer is always yes. Martin of Automattic says WordPress isn’t adding social for the sake of “adding social,” but rather because the feedback from lightweight interactions is motivation for content creators. WordPress is adding more tools to enable this as well. As Mike of FormSpring explains, “we want to reward good user behavior,” by releasing new features for users. Lightweight actions help them sort what’s actually relevant to communities, so FormSpring came out with a feature to sort by most popular and by language.

When it comes to platforms, each company agreed that it is extremely important to carry content across platforms. As Martin said, it’s important for people to publicize their content outside of their blogs. Users want to share on Tumblr, Twitter, and elsewhere. Tumblr actually doubled engagement within WordPress. “Viralizing the content,” Ro of Disqus says, draws in more users. 50% of Disqus’ users connect with another social platform and 10-12% of comments are shared on Twitter. And while you’re always competing for eyeballs online, no single platform can own a conversation about something. When a user is particularly interested in a topic, the conversation will naturally cross platforms. Facebook has even helped discussions grow through “Facebook comments”. They tend to increase the pie for everyone and open the eyes of new users.

One of the biggest concerns of content creators is engagement versus reach: which is more important? Both matter to different creators, but it’s important to consider who is asking. For example, a blog like TechCrunch has more influence and reach than a personal blog, but both reach and engagement are valuable within different communities.

Rich Comment Data from Disqus Now Available Through Gnip

Imagine going to a dinner party and listening to the first thing each person said. You’d learn a few things, but you’d miss out on the meat of the conversation that happens in the give and take of the dialogue.

In the world of online public social conversation, blog posts are the monologue and comments provide the dialogue. Each is valuable on its own, but to see the complete picture, you need both. Conversations happen in comments, and it has been a huge struggle for brands to keep up with comments to fill in their understanding of this key piece of the conversation.

I’m excited to announce that we’re making it easier to access these public conversations with the addition of the full Disqus firehose to our publisher portfolio. As the largest third-party commenting platform in the world with 70 million commenter profiles, the Disqus firehose provides coverage of more than 500,000 comments every day, spanning almost every topic imaginable and reaching over 700 million readers each month.

Comments last forever. They appear in search results and remain part of the discussion long after the day they were written.  With their staying power and depth of discussion, the commenting ecosystem provides an important — and different — social signal. Disqus further embodies this by allowing users to react to others’ comments with up or down “votes” creating significantly more engagement. The 2 million “votes” on Disqus each day provide insight into what comments are generating the most reaction.

Our Disqus API partnership provides authorized access for the first time ever to full firehoses of discussion content and interaction across the Disqus network. To the extent that any of this data has been available before, it’s been provided by technologies like content scraping/crawling that pulled pieces of the discussion, but did not guarantee full coverage in real time on a publisher-safe, consistent and reliable basis. Because this new service is being provided via a direct partnership with Disqus, with Gnip’s full firehose, you get low-latency streams that provide full coverage with the support of the publisher to ensure the availability of the data over the long term.

We’re thrilled to have data from Disqus available on our platform and can’t wait to see the amazing ways that our customers are able to apply it to their businesses. Email us at to learn more about Disqus and set up a trial so you can see the data for yourself.