Blog and Comment Data Answers the Why

Shoppers Tweet about what they bought, but they turn to blogs and comments to share why they bought.

This is only one example of what makes the long-form data from blog and commenting platforms valuable to any company looking to better understand why their customers and prospects make the decisions they do. Simply put, blogs and comments are opinion rich. And when it comes to product development, sales, brand management, and more, these opinions provide a unique and critical lens into the nuanced thinking behind customer decisions.

In the past, some social media monitoring providers have used scraping solutions to include blog and comment data in their offerings. While this can get you the data, scraping has several fundamental challenges. The data can be days or weeks old. Scraping solutions often ignore terms of service and user intent, meaning the data can disappear at a moment’s notice when the scraper gets blocked. The data can come in a range of formats that make it very difficult to parse and analyze. And with scraped data, you only get results from the blogs and comments that you know you should be looking at, missing important discussions that surface in new and unexpected places.

It’s because of these challenges that we’re introducing Gnip for Blogs, combining content from four of the most popular long-form blog and comment sources. This first-of-its-kind package of data from Disqus, Tumblr, WordPress and IntenseDebate gives realtime, normalized, terms of service-compliant access to the rich conversations happening across a huge swath of the Internet. With Gnip for Blogs, customers are able to easily and confidently build their business applications on multiple sources of long-form data knowing it won’t suddenly disappear tomorrow.

Each of these sources has a story to tell on its own, but by looking at them all together, brands are able to draw insights from an enormous range of discussion. This includes the mass market reach provided by WordPress, which powers 19% of the web, the high volume of brand mentions on Tumblr, the highly engaged audience on IntenseDebate and the enormous reach and quality of the conversations on Disqus.

One of our customers, Networked Insights, recently used realtime WordPress data to identify early technology trends based on influencer blog conversations. They then used this content to refine and focus a targeted online promotion for one of their customers. The end result? A 30% lift in ROI for their online ad spend. And this is only the beginning.

For more information, check out Gnip for Blogs on our website or contact us at

Expion Joins Plugged In to Gnip and Adds Tumblr & GetGlue as Data Sources

This week I’m on a panel at Expion’s third annual “Mission Possible” Conference in Raleigh, which makes it even more fun to announce that Expion is now a Plugged In to Gnip partner and will be adding Tumblr, Disqus and GetGlue to their social data sources. It is especially significant to be making this announcement among all of Expion’s incredible customers because, ultimately, this partnership is all about providing them with the best social data out there.

Expion’s leadership position in the social media marketing and engagement industry makes them an ideal Plugged In partner, as they’re committed to bringing complete, reliable and sustainable social data into their analytics products. The world’s largest brands and agencies use Expion to effectively monitor and engage with their customers in real time across multiple geographic locations and myriad digital channels. The marketplace is changing rapidly and we are seeing industry leaders like Expion marshal the best mix of social data sources to serve their customers. By adding sources such as GetGlue, Tumblr and Disqus, Expion is creating a competitive advantage for their customers–giving them a much more complete picture of their brand.

I’ve had the pleasure of working with Expion for the past year and have been particularly impressed by their commitment to innovation. The social media landscape is constantly evolving and they have always been eager to dive into the latest products and data sources, such as Tumblr and GetGlue, to ensure that their products and their customers stay ahead of the curve.

Stay tuned for a case study demonstrating what Expion’s customers are doing with access to these data sources!

Gnip at Expion Conference


Simply Measured Introduces Tumblr to Its Offerings

It is always fun to see your customers innovate, which is why it’s so great to see Simply Measured add Tumblr to its many social media sources. Seeing Tumblr data alongside all of the other social media sources Simply Measured offers paints an incredible picture of the social media landscape for brands. Users can now measure Tumblr as well as other channels, including Facebook, Twitter, Instagram, YouTube, Vine, LinkedIn and Google+ to show the relationship between content marketing, traffic and audience interactions.

Simply Measured’s Tumblr analytics reporting lets you do everything from understanding the long tail of content to comparing yourself to competitors. The four reports include:

●      Tumblr Blog Report: Analyze the engagement of your posts, understand the amplification and longevity of reblogs, track your follower growth and engagement, and identify your brand’s influencers.

●      Tumblr Blog Report with Google Analytics: Measure all aspects available in the Tumblr Blog report, and include data around traffic sources and user behavior.

●      Tumblr Competitive Analysis: Benchmark your brand against any 10 Tumblr blogs. See how you compare in terms of posts, reblogs, amplification and even engagement outside of Tumblr.

●      Complete Social Media Snapshot: Measure your efforts on Tumblr in-context with Facebook, Twitter, Instagram, YouTube, Vine, LinkedIn and Google+. Compare audience size and growth, and post engagement across all major networks.

Simply Measured, a longtime Gnip customer, is also now a certified partner of Tumblr. Congratulations to the team at Simply Measured on your continued success!

Sample Simply Measured Tumblr Analytics Report

Data Science: The Sexiest Profession Going

Data scientists Mohammad Shahangian of Pinterest, Kostas Tsioutsiouliklis of Twitter, and Adam Laiacano of Tumblr discuss the challenges and opportunities in social data.

Data Scientists at Big Boulder

As Gnip’s own data scientist Dr. Skippy was joined on stage by three data scientists representing three prolific social networks, Big Boulder Master of Ceremonies Lindsay Campbell couldn’t help gushing to the crowd, “This is by far the sexiest panel this year.” (This was a reference to the Harvard Business Review naming data science the sexiest profession of the 21st century.)

Physical appearance aside, there could hardly be a truer statement to Big Boulder attendees: a legion of self-proclaimed data nerds.

Scott Hendrickson, better known as Dr. Skippy, Data Scientist at Gnip, was joined on stage by Mohammad Shahangian of Pinterest, Kostas Tsioutsiouliklis of Twitter, and Adam Laiacano of Tumblr.

A Look at the Data Science Departments

The conversation began with each guest sharing the size of data science teams and roles at their respective organizations.

The data science team at Twitter currently comprises 7-8 people and is looking to build to a team of 20 in the near future (see open positions here). Data scientists at Twitter fall into two groups: a business intelligence and insights team, and individual data scientists who are embedded into teams. The embedded data scientists become key stakeholders in improving and evolving the product.

The business intelligence team works collaboratively to explore ideas and create reports, even when the findings are not favorable to the company. As Kostas explains, data scientists are trusted at Twitter. It’s OK to report the truth.

At Pinterest, there are 8 full-time data scientists on the team. The primary goal for data scientists is to understand what users are doing in order to put pinners first, a strong company value. Much like Twitter, Pinterest data scientists are integrated into other engineering teams. This blend of engineers and data scientists on the same team enables nimble product iterations. Since adding data scientists to the mix, Pinterest teams are now requesting deeper and deeper metrics to measure success and plan product.

Tumblr’s team of data scientists is also eight strong, split into two roles: a six-person search and discovery team and a two-person business intelligence team. The search and discovery team is tasked with maintaining the quality of the data, building products that make the data usable, and ensuring the end product is something users enjoy. The business intelligence team is highly self-reflective, investigating the actions users take to determine which ones are indicative of long-term success. The outcome of this work is most frequently reporting.

Data Science Impact on Product

At Tumblr, there is a significant amount of testing around registration and onboarding, that is, what users see when they land. However, Adam is quick to add that Tumblr has a unique view on its research, stating, “You don’t have to do as much research on your product when you use it yourself.”

Data scientists at Twitter report metrics all the way to the top. The CEO and the executives ask questions about the data around the launch of a new product and value the input of data scientists.

By sharing data with product teams, Pinterest engineers are being driven by the data. Mohammad shares, “After exposing metrics to people, the first instinct is to want to make the metrics better. This brings a culture of people who come to the data science team and seek their input. They take the ideas of product and run some queries to see if the data validates it. We’ve made it very easy for product teams to set up experiments, we don’t even call them experiments anymore.” Expanding on this, he shares an anecdote from a recent rewrite of the entire website. When it launched, data scientists noticed a dip in follows. Investigation by the team led to the understanding that the enhanced speed of the rewritten website had eliminated a small lag that followed a user’s like, a lag during which users had been following pinners on the site. Once the team corrected for the lag, follows went back up.

Who You Callin’ Sexy?

As Dr. Skippy joked about the popularity, ahem sexiness, of the data science title, conversation turned to the lack of an industry standard definition for the role, noting there is often confusion and a lack of differentiation from business analysts and business intelligence roles.

Kostas began by noting that data science is not about analyzing but about prediction. Twitter data scientists are also engineers. Backgrounds of Twitter data scientists include statistics, data mining, machine learning, and engineering.

Further delineating the role from that of data analysts, Mohammad points out that analysts aren’t pulling their own data. He added, “If you can’t pull your own data, how can you figure out what you want? A data scientist is skeptical. If results seem too good to be true, they will investigate. Question the data. Analysts will take the data as the data.”

Adam describes a good data scientist as an individual who can get data in any format and clean it up, who can take weird, fuzzy forms and see the layout of the information that is available, and who can connect the puzzle pieces to build a data set that is useful.

The Future For Analysis of Social Data

Much of data science to date has been ad hoc, but the panelists agree that as you look closely at what data scientists do, it’s templates and patterns. Over time this work will become progressively more standardized. With new, faster tools it will move away from ad hoc processes. Teams will build models and tools to solve recurring problems.

Adam of Tumblr added optimistically that the future lies in the work data scientists will do as they collect data across platforms and across multiple streams. It’s up to those developing third-party tools and resources to innovate using all the data.

Lastly, Mohammad chimed in that machine learning and prediction modeling are the sexy amongst the sexy, adding, “That’s what we’re all waiting for.”

Big Boulder is the world’s first social data conference. Follow along at #BigBoulder, on the blog under Big Boulder, Big Boulder on Storify and on Gnip’s Facebook page.

Unleashing the Creative Expression of 100 Million People

An interview with Derek Gottfrid and Danielle Strle of Tumblr about the unique experience behind Tumblr and its 100 million users.

Derek Gottfrid and Danielle Strle at Big Boulder

What is Tumblr?

Tumblr is not a social network.

Derek Gottfrid explains that Tumblr is about the content, not relationships and relationship-building. Hence Tumblr is a media network — with a focus on content propelled by user passion for that media.

When broken down, the Tumblr platform fills two roles around media: letting users share it and letting them consume it. First, a Tumblr is a channel for a user to post, create and share to the world in an unlimited way. Second, the dashboard is an incredible media consumption tool.

What Makes Tumblr Special?

It is the diverse formats for sharing different types of content (photo, video, text, links, quotes, chats, audio), all available in one place. With 7 post types, it’s easier to take part in the community because sharing doesn’t mean having to fill a big white text box. Users can choose to share photos, videos, quotes or even reblog content. This removes the intimidation many users find in long-form text blogging.

When it comes to the incredibly viral nature of posts on Tumblr, there’s no question that its most distinctive and valuable asset is the reblog.

Derek explains reblogs as a unique, nuanced feature of the platform that encourages users to adopt content as their own. He adds quickly that a reblog is much like clothing. You could make your own clothes, or you could buy them and wear them. Either way, you make them your own when you put them on.

To fully understand the impact of the reblog feature, consider this — 85 to 90% of posts a day on Tumblr are reblogs. The Tumblr team has seen single posts be reblogged 10,000, even 100,000 times in one day.

How Are Brands Utilizing Tumblr?

As advertising and brands on Tumblr celebrate their one-year anniversary, Derek is quick to note that the development of Tumblr was not motivated by creating a space for brands. Instead, the Tumblr team has taken time and care to figure out the best way brands can contribute to users’ content streams. Successful brands are utilizing the tools available on Tumblr to tell robust stories. The result is brand content that is a thoughtful, mindful addition to users’ streams, which users can adopt as their own content.

What Are The Untapped Opportunities Available With Tumblr Data?

There is a huge volume of data provided by Tumblr, yet much of it remains unexplored and poorly understood. As Derek explains, the next-level deep dive (analytics) on Tumblr is a huge opportunity: to explore and understand the power of reblogs, specifically how they travel through the user base. Another untapped opportunity is tools to understand the massive volume of data going through the system.



Building the Location Layer of the Internet With Mike Harkey of Foursquare

Mike Harkey, the Head of Platform Business Development at Foursquare, talks about how Foursquare is building the location layer of the Internet. 

Mike Harkey of Foursquare

To kick things off at Big Boulder, Gnip’s VP of Product, Rob Johnson, interviewed Mike Harkey. As the Head of Platform Business Development at Foursquare, Mike talked about the evolution of Foursquare during the past four years. First introduced as the “check-in app,” Foursquare is now becoming known for its location recommendation services.

Merchant Applications

As Mike stated, “the company is growing dramatically.” Foursquare received $41 million in funding in April 2013, and that is certainly shaping their growth. On the consumer application side, check-ins and active uniques have grown 10% every month. However, Foursquare is really focused on providing real-world applications for merchants, whose use has quadrupled in the past 6 months.

Foursquare has always offered a free solution for merchants to claim their business and run offers and specials within the app. Users can also follow merchants to keep an eye on these offers. However, at the end of the day this won’t matter if a merchant can’t see what needle Foursquare is moving for them. Enter merchant dashboards: Through the merchant API, merchants can track the value and success of their media campaigns and how Foursquare is influencing them.

The Location Layer

Just as Facebook is the social layer of the internet, Foursquare has built the location layer. With 4 billion check-ins and 50 million places worldwide, it’s not hard to see why this data is so valuable and practical. And there’s something that’s fundamentally unique about Foursquare, in their ability to see real-time actions.

Foursquare is the first to find out when a venue opens or closes. This signal is beneficial not only for the application, but also for third-party platforms that rely on it. Maintaining the quality of data when it’s user-generated is challenging, but Foursquare has learned which levers to pull. A community of super users has the rights to edit and update data to help “vet and validate” its quality. This further fuels the consumer application of Foursquare.

Using the Data

Foursquare check-ins show the pulse of New York City and Tokyo from Foursquare on Vimeo.

Foursquare holds itself to a higher standard with its data. They believe this data is not just theoretical, but has practical, real-world applications. For merchants, this means validating their presence on the app – according to Mike, 20% of users check in to a place discovered by the recommendation service within 36 hours of discovery.

Since the founding of the company, people have wanted to access the data Foursquare provides. The API has always been open, but Foursquare has wanted to be careful about allowing access to the data. Gnip’s partnership with Foursquare to allow access to its firehose has tremendous possibilities for businesses. Examples include studying how individual users act during specific events. During Hurricane Sandy, Foursquare released visualizations of how people operated during and after a crisis.

Globally, using this data for good has been a priority for Foursquare. In Turkey, there was activity they didn’t expect during the recent riots. They had representatives on the ground during the riots and could see users posting photos and information, as this was the only viable mechanism to expose this information.

The Future of Foursquare

Foursquare believes the applications for this data are virtually limitless, whether it’s making the data available for research or business applications. Foursquare is excited to see what people will build with its anonymized data through its partnership with Gnip. Foursquare has a number of products that will be introduced this year. Soon, small businesses will be able to advertise through Foursquare and make the most out of this service. They will have the ability to turn offers on and off and reach long-term consumers.


Aspirational Brands on Tumblr: Lexus vs. Toyota

Gnip conducted a brief analysis of the Toyota family of brands (Toyota, 4Runner, Camry, Highlander, Lexus, Prius, Rav4, Scion, Sequoia, Tacoma, Tundra) on multiple social media platforms. We looked at brand mentions on Tumblr, Twitter, WordPress and WordPress comments during the period of Oct. 15 to Nov. 15, 2012.

As you would expect, Toyota was the most frequently mentioned brand on each social platform, with one enormous exception – Tumblr. Lexus had 5 times as many mentions on Tumblr as Toyota. This highlights how aspirational brands do exceptionally well on Tumblr, where niche communities of fans often form around brands. (Attention brand managers: this happens whether the company is involved or not.) A central component of Tumblr is visual content, which also plays well with aspirational brands. Furthermore, Tumblr content is both extremely viral and long-lived, meaning that content shared on Tumblr can be shared for longer periods of time and jump to more diverse sub-groups within the network than on other social networks. During the month Gnip tracked mentions, Lexus received more than 200,000 mentions while Toyota received 40,000.
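For readers who want to run this kind of comparison on their own data, here is a minimal sketch in Python. It assumes mentions are counted by a simple case-insensitive substring match on post text, which is our assumption and not necessarily how Gnip matched brands. The function names and sample posts are made up for illustration, and the final ratio uses the rounded totals quoted above.

```python
from collections import Counter


def count_brand_mentions(posts, brands):
    """Count posts that mention each brand (case-insensitive substring match)."""
    counts = Counter({brand: 0 for brand in brands})
    for text in posts:
        lowered = text.lower()
        for brand in brands:
            if brand.lower() in lowered:
                counts[brand] += 1
    return counts


def mention_ratio(counts, brand_a, brand_b):
    """How many times more often brand_a is mentioned than brand_b."""
    return counts[brand_a] / counts[brand_b]


# Tiny made-up sample, for illustration only.
sample_posts = [
    "Dream garage: a Lexus LFA",
    "that lexus interior though",
    "My old Toyota finally hit 200k miles",
]
sample_counts = count_brand_mentions(sample_posts, ["Lexus", "Toyota"])
print(sample_counts["Lexus"], sample_counts["Toyota"])  # 2 1

# Rounded totals quoted in the post (Lexus was "more than" 200,000).
reported = {"Lexus": 200_000, "Toyota": 40_000}
print(mention_ratio(reported, "Lexus", "Toyota"))  # 5.0
```

Substring matching is the bluntest possible approach. A production pipeline would likely need to handle ambiguous brand names and hashtags.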

In social media, it is easy to rely on Twitter as a kind of alert system for when content is being shared, but at Gnip we’ve seen time and time again that content that pops up elsewhere doesn’t always pop up on Twitter. Each social media network has its own attributes, audience and modes of interaction. Because of likes, reblogging, and the way timelines are read by Tumblr users, Tumblr has active communities that aren’t found elsewhere.

Lexus on Tumblr

Tumblr Analytics: It’s a Whole New World

Union Metrics has been with Gnip since the early days, using our social data in their flagship product, TweetReach. Earlier this year, when we announced the availability of social data from Tumblr, we were excited that Union Metrics moved quickly to start building a new product based on that data. Last week, Union Metrics launched Union Metrics for Tumblr and was named Tumblr’s preferred analytics provider.

We’re big believers in Tumblr and the value of the conversations taking place there. As we’ve talked about in the social cocktail, Tumblr content has unique properties. Our data science shows that Tumblr content is inherently viral – able to amplify conversations about any topic – and even more than that, the content on Tumblr has incredible staying power.

And we’re not the only believers in Tumblr. Brands like Adidas and Coca-Cola have been actively engaging and advertising on Tumblr since the launch of Tumblr’s advertising platform earlier this year.

Congrats to the team at Union Metrics! This is exciting news and we’re only at the beginning.

You can read more in AdWeek, The Next Web and GigaOm.

The Staying Power of Tumblr

It took two days for the poll to pop.

Three days after the pride cookie, Houston radio station KTRH dropped a question for its listeners.

“The cookie your grandfather loved has ‘gone gay!’” the station wrote on its website, “What Do You Think? Does This Rainbow Flag Cookie Bother You?”

It bothered becausegretchensaidso (now using Tumblr username Gretchenisincognito). Well, at least, the question did. That day, the user left a tumble for followers:

“This poll is from a conservative news radio station,” the user wrote, “Let’s surprise them with overwhelming results in favor of equality.”

The post trickled out, gathering almost a hundred reblogs in a 24-hour period. Then it flatlined, holding without major gains through the morning of the episode’s fifth day.

And that’s when it burst. On the evening of June 30th, with Tumblr Oreo chatter sloping back to normal, becausegretchensaidso’s message went vertical — a full 48 hours after publication. Close to 300 users shared the post in a matter of hours. A day later, that number had doubled. Over at KTRH, the poll was tilting for the pride cookie.

Content lingers on Tumblr. becausegretchensaidso and another user waited for days before posts went viral. Others watched as posts drifted forward, adding one or two reblogs each day.
Figure 1 presents the accumulation of reblogs by content posted by different Tumblr users. Excluded from the picture is palahniukandchocolate. During the Oreo episode, fewer than 10 Tumblr users originated content that drove the explosion of the story.

It’s different on Twitter. Not only were the total volumes slow (as we saw in earlier posts here and here), but share rates of the story’s top drivers fell precipitously and sequentially as each piece of content yielded to the freshest meme. Traffic mapped a Social Media Pulse, the picture of social decay for unanticipated events. Across the nine users who drove most conversation on Twitter, user retweets, an analog for reblogs on Tumblr, did not display the endurance of a Tumblr conversation.

Figure 2 presents the rate of retweets by hour for content posted by top drivers of the Oreo conversation on Twitter.

For brands, the implications are clear: Conversations — promoted or unprovoked — endure on Tumblr through reblogging. That can heighten the returns to network engagement — and the risk of allowing negative perceptions to form.

Tumblr also has a movement quality that can dominate a moment: During the height of the Oreo episode, reblogs made up more than 90 percent of tumbles related to the pride cookie. On Twitter, the number of retweets rarely rose above 50 percent of the tweet volume.

Figure 3 presents the shares of Tumblr and Twitter conversations related to Oreo, at the episode’s peak, driven by shared content.
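The share-of-conversation figure is simple to reproduce. The sketch below assumes each post record carries a boolean `is_share` flag marking a reblog or retweet. That field name is hypothetical and the sample data is made up to mirror the rough proportions described above.

```python
def reblog_share(posts):
    """Fraction of posts that are reshares (reblogs or retweets)."""
    if not posts:
        return 0.0
    shared = sum(1 for post in posts if post["is_share"])
    return shared / len(posts)


# Made-up samples: 9 of 10 Tumblr posts are reblogs, 5 of 10 tweets are retweets.
tumblr_sample = [{"is_share": True}] * 9 + [{"is_share": False}]
twitter_sample = [{"is_share": True}] * 5 + [{"is_share": False}] * 5

print(reblog_share(tumblr_sample))   # 0.9
print(reblog_share(twitter_sample))  # 0.5
```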

In a sense, then, on Tumblr, the creator is king: The network offers those who would speak an unprecedented platform, engineered for replication and amplification. It falls to brands to take advantage of the behavior on this platform by creating content users want to associate themselves with and pass along.


Oreo, Tumblr and a Network's Power to Amplify

Really, it was bigger than Oreo.

When Nabisco posted an image supporting gay pride, Tumblr blew it up. Users took the statement of a single snack manufacturer and made a cause that touched many companies.

In this, the second part of a trilogy, major brands find themselves roped to a conversation about love in America. Part one talked about how Oreo cannonballed into the social web by posting an image of a rainbow Oreo in support of gay pride. Part three will use the episode to highlight conversation dynamics unique to the Tumblr network.

It began with maskedman.

“Gay oreo? Oreo suppoert Gays/??” the user wrote, “Never evating cookie again. … Disgustedng. THis is AMERICA, not HOMERICA.”

The post, which would ultimately accumulate some 1,500 notes, landed a day after Oreo’s image and touched off a wave of support for the company.

One user, palahniukandchocolate, made a list.

“Dear people boycotting Oreos for supporting gay rights: The following companies also support gay rights,” she wrote, adding the names of 37 companies, among them Allstate, Gap, Nike and Starbucks.

A day later, monkaroo retooled the tactic:

“Yes, please boycott Oreo for their support of gay rights,” monkaroo wrote before invoking two dozen companies aligned with Oreo, “We’ll all appreciate you going on a diet … [D]o us all a favor, don’t take it all out on a festive cookie… Just stay home and boycott everything.”

The note from palahniukandchocolate ran close to 900 characters. monkaroo’s topped out over 1,800. Together, they used the freedom of Tumblr’s platform to find a community in an ideology. They grabbed allies — and by doing so, they blew up the question.

The notes caught.

By the evening of the 26th, palahniukandchocolate’s message was pulling down hundreds of reblogs per hour. Indeed, that night, the note would lay claim to 75 percent of Tumblr’s Oreo conversation.

Graph Showing Oreo Mentions Spike on Tumblr

Figure 1 presents hourly Tumblr activity about Oreos (blue) and hourly reblogs of user palahniukandchocolate (orange).

The action spread elsewhere. Starbucks had seen a median 11 tumbles per hour in the two weeks leading up to the 24th. Pepsi had seen 14. On the night of the 26th, palahniukandchocolate lifted both brands, driving each to a network peak of more than 400 posts per hour.

Microsoft also bounced, rising to the 400 peak from 15 posts per hour and holding triple digits as late as the afternoon of the 29th. Costco, with barely a pulse on the network the week before, found itself in 7,100 tumbles the day after the cookie.

Figure 2 presents hourly Tumblr activity around Costco, McDonald’s, Microsoft, Pepsi, Sears and Starbucks. Association with Oreo’s pride cookie drove heightened activity for each brand.
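To put a number on how far a brand “bounced,” one option is to divide its peak hourly volume by its median hourly volume over a baseline window. This is a sketch of that idea and not necessarily the method used for the figures here. The baseline series below is made up so that its median matches the 11 tumbles per hour cited for Starbucks, and 400 stands in for the “more than 400” peak.

```python
from statistics import median


def peak_lift(baseline_hourly, peak_hourly):
    """Ratio of a peak hourly volume to the median of a baseline window."""
    return peak_hourly / median(baseline_hourly)


# Made-up baseline counts (posts per hour); the median is 11.
starbucks_baseline = [9, 10, 11, 11, 12, 14, 30]

print(median(starbucks_baseline))                    # 11
print(round(peak_lift(starbucks_baseline, 400), 1))  # 36.4
```

The median is used rather than the mean so that a single busy hour in the baseline window (the 30 above) does not inflate the baseline.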

palahniukandchocolate named 37 brands in her defense of Oreo. For most, including Coca-Cola, Levi’s, Nike and Walgreens, that single association dominated the brand’s Tumblr presence in the second half of June.

Tumblr’s platform made that possible. Figure 3 shows four brands that bounced on Tumblr thanks to the Oreo affair. None saw pickup on Twitter in the wake of the image — the platform has no room for periphery.

Graph Showing Cookie Brand Mentions on Tumblr
Figure 3 presents hourly Twitter volumes for four brands that popped on Tumblr in the wake of Oreo’s image. Microsoft’s acquisition of Yammer drove the brand’s heightened activity pictured here.

In part, it’s not surprising that the Oreo story could cast so long a shadow over so many brands. Tumblr’s largely an extraprofessional platform; presence on the network requires personal connections between users and brands. Figure 4 presents average daily Tumblr volumes for corporate titans. The flows are thin, technology superbrands notwithstanding.
Graph of Brand Activity on Tumblr

Figure 4 presents average daily Tumblr activity around a subset of the 50 largest corporations by market capitalization (ranked Aug. 18, 2012).

Brands with little network presence risk leaving definition in the hands of others. And Tumblr encourages association: The platform provides flexibility in media and speeds the replication of conversation.

The series’ last installment dives into conversation dynamics on the network. If you like trace diagrams, this next one’s for you.