Full Coverage of Anonymized Foursquare Check-In Data Now Available Exclusively from Gnip

We’re thrilled to announce our exclusive partnership with Foursquare to provide their full firehose of anonymized check-in data. Location is one of the most interesting ways to view data and no one understands the power of location like Foursquare. With more than 35 million registered users, nearly 4 billion total check-ins, and over 75 million API calls a day, Foursquare is the location layer for the Internet, helping to connect people with places around the world.

Foursquare has always believed in having a robust set of APIs so people can build great solutions on their data.  With today’s announcement we are offering commercial-grade access solutions that will bring a level of reliability, sustainability and completeness that has never been available before.

You may be wondering what it means to provide a full firehose of anonymized check-in data.  It means that we’ll be able to provide access to the realtime stream of every check-in that is taking place on Foursquare.  By providing only anonymized data with no form of user identification, Foursquare preserves users’ privacy, while unleashing countless geo-based use cases for businesses and researchers.  For example, this means we’ll be able to deliver all of the realtime activity happening at Starbucks in Portland but not who is checking in to each location.

The possibilities for what can be created with complete access to the anonymized Foursquare firehose seem endless. For example, The Wall Street Journal was able to do side by side comparisons of New York and San Francisco to find what makes each city tick. Retailers will be able to study the results of local advertising campaigns. Financial analysts will have another valuable data point to forecast Black Friday sales.  Real estate development groups will be able to better understand where they should develop new locations. As Blake Shaw, Foursquare’s data scientist, told us when we interviewed him:

“We are capturing this amazing signal about what millions of people are doing in the real world at every moment of the day in cities all around the globe. We have seen that when we aggregate check-in patterns across many individuals, we can measure features of cities at a higher resolution than was ever possible before. I think this data can act almost like a ‘microscope for cities.’”

Foursquare will be joining our other premium publishers – Twitter, Tumblr, WordPress, Disqus, IntenseDebate, StockTwits and Estimize. We are offering both the full firehose and filtered access through our robust PowerTrack product.

We can’t wait to see what the world will build once they have access to the full Foursquare firehose!

To learn more, check out gnip.com/foursquare or email info@gnip.com. You can also head over to Foursquare to read their post, “Giving data nerds access to the realtime pulse of check-ins around the world.”

Gnip at Boulder Startup Week

Each year, Boulder Startup Week celebrates the startup community, and Gnip is proud to be a sponsor again. In addition to bringing new talent to Boulder, it’s a great way for our startup community to come together, collaborate and learn from each other.

If you’re visiting town, be sure to check out Gnip’s open job listings. Gnip will also be speaking at the following events, and we would love to say hi, so seek us out!

Boulder *Hearts* Women in Tech
Gnip’s Vice President of Engineering Greg Greenstreet is speaking
Wednesday, May 15, 2013 from 2:00 PM to 3:30 PM (MDT)
Boulder, CO

Research tells us that women can make startups more efficient, more resilient, and more successful; yet there’s a dearth of women at startups. So, what to do? Boulder has a welcoming culture and startups can build welcoming cultures, too. Learn about the importance of “cultural fit,” avoiding unconscious bias, recruiting with a wide net, and writing job descriptions that don’t include the word “ninja.” Hear from local startups about their challenges and perspectives on building diversity into their cultures.

Panelists:
Greg Greenstreet, Gnip
Leslie Osborne, Standing Cloud
Jim Franklin, SendGrid
Ingrid Alongi, QuickLeft
Jenny Slade, NCWIT

Give Back, Get Back: A Guide to Boulder and OSS
A talk by Gnip’s Engineer Eoin Coffey at the Gnip office
1050 Walnut Street, #115
Boulder, CO 80302
Thursday, May 16, 2013 from 5:30 PM to 7:00 PM

Boulder and Open Source Software (OSS) both thrive on having open communities that contribute. Participating in both helps you grow as a professional and as a programmer. What might surprise you is how your contributions help others in ways you did not expect. This talk by Eoin Coffey of Gnip will take a look at the unique aspects of contributing to OSS and the Boulder tech community.

Food and drink provided.

How to Hire an Intern (and Be Hired)
Gnip intern Brian Lehmann and Data Scientist Dr. Scott Hendrickson will be on the panel
Friday, May 17, 2013 from 10:00 AM to 11:30 AM (MDT)
RaffleCopter
1021 Pearl St.
Boulder, CO 80302

Want to hire a student developer? Or looking to get a technical internship at a startup? Come hear from a panel of startups that have successfully hired students, including Jim Franklin, CEO of SendGrid, and Dr. Scott Hendrickson, Head of Data Science at Gnip. Also hear the perspective of students who have worked at startups including Gnip, Rally Software and TeamSnap about how to do it right. The event will be followed by a last-minute networking session for students looking for positions at startups.

Data Story: Phil Harris of Geofeedia

Data Stories is Gnip’s ongoing series telling the stories of the people and companies that are doing groundbreaking work in social data. This week we’re interviewing Phil Harris, CEO of Geofeedia, a company that allows you to search and monitor social media by location. Geofeedia is a recent Gnip customer, and I love what they’re doing. The inherent value of Geofeedia was made clear to me when we received a media request looking for all social media that was geotagged close to the finish line of the Boston Marathon. Content + location creates powerful stories and Geofeedia is making it easier to find the right ones. 

1. What social data sources do you wish had geotagged data?
Our business is built on the fundamental premise of open source social data aggregation.  Or, I should say, every source. That said, there are currently major social data sources that provide public location data based on location identifier versus geotag. We will accommodate location id to integrate these data sources, but I strongly believe that over time, the benefits of more precise geo-location tagging on social media content will encourage these services to move towards geotagging. When they do, we’re exceptionally well positioned to translate that evolution into benefit for our clients.

2. If you’re a user, what do you think is the advantage of sharing your geodata?
We’ve barely scratched the surface of how geodata will deliver value to consumers. I believe the rapidly growing penetration of smartphones and adoption of geo-centric applications such as navigation will create a rich ecosystem of geo-data driven benefits. I am speaking with major consumer brands who believe that they will be able to create and maintain consumer relationships via location based social media in ways that will deliver significant value back to the individual user.

3. What can you find with Geofeedia that you can’t find on other platforms?
I know from analyzing our data with active customers that a significant amount of user generated content is missed by traditional keyword or hashtag centric monitoring tools. We complement these platforms to ensure relevant location based content is delivered to our customers in real-time.

4. Only a small portion of social media is geotagged. Do you think this will change in the future?
I do. We’re seeing an increase every quarter, but as brands start rolling out compelling reasons for consumers to geotag their content, I believe geotagged social media will become the default.

5. How do you think Geofeedia will be used for good?
The leading businesses I’m speaking with consider Geofeedia as a tool to improve their overall customer experience. Understanding an individual social media conversation at a moment in time at a given location drastically improves the ways brands can serve their customers. Also, numerous public safety agencies are using Geofeedia to improve their ability to respond to natural disasters and other scenarios where real-time, location based social media awareness delivers great value.

6. How will real-time geo monitoring affect a brand’s ability to connect with their customers?
Like I said, the major brands with whom I’m speaking are evaluating how to improve their overall customer experience across all touch points – sales, customer service, loyalty – through real-time location based monitoring, analysis and engagement. I do believe that real-time, location based social media engagement will drastically improve a brand’s ability to have a meaningful, new type of relationship with their customers and become a de facto element of their communication mix.

Big Boulder 2013

Big Boulder’s back for 2013 and better than ever.

The leaders in social data: Facebook, Twitter, Tumblr, Foursquare, Automattic, Disqus and many more are descending on Boulder again this summer to talk about the future of their platforms. Last year was a huge success and the expectations this year are even higher. We have a line-up that will deliver!

We’ll go deep into Asia and Latin America with speakers from China, Brazil and Japan, including the CEO of LINE, one of the fastest growing social networks on the planet. We’ll hear about non-traditional applications of Social Data with discussions on Finance, Government, Academic Research and Data Science. And to help us make sense of it all, we’ll have industry analysts discussing their views of the future. See the agenda and speakers pages for all the details.

In addition to all the great topics covered in the sessions, we’ve left plenty of time for networking with others in Social Data, including sunset cocktails with views of the Flatirons, a bicycle pub crawl, and since this is Boulder after all, morning yoga and hiking.

Big Boulder is an invite-only event for the leaders in the social data ecosystem. Space is filling up quickly so if you’re still thinking about it, sign up now before we hit capacity. Interested in coming but haven’t been invited? First check out our blog post about social data vs. social media. If you’re all about social data, email bre@gnip.com for information.

Social Data vs Social Media

One area where I see a lot of confusion is the difference between social media and social data. I come from a social media background and use social media in marketing, so I see where the confusion can come from.

The easiest way to think about it in plain English:

  • Social Media: User-generated content where one user communicates and expresses themselves and that content is delivered to other users. Examples of this are platforms such as Twitter, Facebook, YouTube, Tumblr and Disqus. Social media is delivered in a great user experience, and is focused on sharing and content discovery. Social media also offers both public and private experiences with the ability to share messages privately.

  • Social Data: Expresses social media in a computer-readable format (e.g. JSON) and shares metadata about the content to help provide not only content, but context. Metadata often includes information about location, engagement and links shared. Unlike social media, social data is focused strictly on publicly shared experiences.

Or otherwise boiled down, social media is readable by humans and made for human interaction while social data is social media that is readable by computers.

Let’s look at a Tweet in the form of social media and social data to show exactly what I’m talking about.

From this Tweet from Gnip, we can visually see that it uses the #BigBoulder hashtag, that it includes a Bit.ly link to our Storify page, that it has 73 retweets and 3 favorites, and the time and date it was posted.

 

Now let’s take a look at what the architecture of a Tweet looks like when received from an API.


{
   "body": "RT @gnip: Thrilled to welcome all #BigBoulder attendees! Watch the social story unfold on our Storify page. http://t.co/ZzqUMfJz",
   "retweetCount": 71, 
   "generator": {
      "link": "http://twitter.com", 
      "displayName": "web"
   }, 
   "gnip": {
      "klout_score": 53, 
      "matching_rules": [
         {
            "tag": "old krusty tweet", 
            "value": "thrilled to welcome all attendees"
         }
      ], 
      "language": {
         "value": "en"
      }, 
      "urls": [
         {
            "url": "http://t.co/ZzqUMfJz", 
            "expanded_url": "http://storify.com/Gnip/big-boulder"
         }
      ]
   }, 
   "object": {
      "body": "Thrilled to welcome all #BigBoulder attendees! Watch the social story unfold on our Storify page. http://t.co/ZzqUMfJz",
      "generator": {
         "link": "http://www.tweetdeck.com", 
         "displayName": "TweetDeck"
      }, 
      "object": {
         "postedTime": "2012-06-20T18:07:13.000Z", 
         "summary": "Thrilled to welcome all #BigBoulder attendees! Watch the social story unfold on our Storify page. http://t.co/ZzqUMfJz", 
         "link": "http://twitter.com/gnip/statuses/215506104082366465", 
         "id": "object:search.twitter.com,2005:215506104082366465", 
         "objectType": "note"
      }, 
      "actor": {
         "preferredUsername": "gnip", 
         "displayName": "Gnip, Inc.", 
         "links": [
            {
               "href": "http://gnip.com", 
               "rel": "me"
            }
         ], 
         "twitterTimeZone": "Mountain Time (US & Canada)", 
         "image": "http://a0.twimg.com/profile_images/1347133706/Gnip_logo-73x73_normal.png", 
         "verified": true, 
         "location": {
            "displayName": "Boulder, CO", 
            "objectType": "place"
         }, 
         "statusesCount": 971, 
         "summary": "Gnip is the leading provider of social media data for enterprise applications, facilitating access to dozens of social media sources through a single API",
         "languages": [
            "en"
         ], 
         "utcOffset": "-25200", 
         "link": "http://www.twitter.com/gnip", 
         "followersCount": 3335, 
         "favoritesCount": 108, 
         "friendsCount": 384, 
         "listedCount": 212, 
         "postedTime": "2008-10-24T23:22:09.000Z", 
         "id": "id:twitter.com:16958875", 
         "objectType": "person"
      }, 
      "twitter_entities": {
         "user_mentions": [], 
         "hashtags": [
            {
               "indices": [
                  24, 
                  35
               ], 
               "text": "BigBoulder"
            }
         ], 
         "urls": [
            {
               "indices": [
                  98, 
                  118
               ], 
               "url": "http://t.co/ZzqUMfJz", 
               "expanded_url": "http://bit.ly/MumrVJ", 
               "display_url": "bit.ly/MumrVJ"
            }
         ]
      }, 
      "verb": "post", 
      "link": "http://twitter.com/gnip/statuses/215506104082366465", 
      "provider": {
         "link": "http://www.twitter.com", 
         "displayName": "Twitter", 
         "objectType": "service"
      }, 
      "postedTime": "2012-06-20T18:07:13.000Z", 
      "id": "tag:search.twitter.com,2005:215506104082366465", 
      "objectType": "activity"
   }, 
   "actor": {
      "preferredUsername": "daveheal", 
      "displayName": "Dave Heal", 
      "links": [
         {
            "href": "http://daveheal.com", 
            "rel": "me"
         }
      ], 
      "twitterTimeZone": "Mountain Time (US & Canada)", 
      "image": "http://a0.twimg.com/profile_images/1755125722/photo_2_normal.JPG", 
      "verified": false, 
      "location": {
         "displayName": "Boulder, CO", 
         "objectType": "place"
      }, 
      "statusesCount": 5657, 
      "summary": "Boulder resident. Rochester NY native. Michigan Law graduate. Copyright enthusiast. Liker of sports. DFW fanboy. CrossFitter. Work @Gnip. ",
      "languages": [
         "en"
      ], 
      "utcOffset": "-25200", 
      "link": "http://www.twitter.com/daveheal", 
      "followersCount": 671, 
      "favoritesCount": 28, 
      "friendsCount": 292, 
      "listedCount": 26, 
      "postedTime": "2009-03-02T01:18:39.000Z", 
      "id": "id:twitter.com:22432819", 
      "objectType": "person"
   }, 
   "twitter_entities": {
      "user_mentions": [
         {
            "indices": [
               3, 
               8
            ], 
            "id": 16958875, 
            "screen_name": "gnip", 
            "id_str": "16958875", 
            "name": "Gnip, Inc."
         }
      ], 
      "hashtags": [
         {
            "indices": [
               34, 
               45
            ], 
            "text": "BigBoulder"
         }
      ], 
      "urls": [
         {
            "indices": [
               108, 
               128
            ], 
            "url": "http://t.co/ZzqUMfJz", 
            "expanded_url": "http://bit.ly/MumrVJ", 
            "display_url": "bit.ly/MumrVJ"
         }
      ]
   }, 
   "verb": "share", 
   "link": "http://twitter.com/daveheal/statuses/215509188481253376", 
   "provider": {
      "link": "http://www.twitter.com", 
      "displayName": "Twitter", 
      "objectType": "service"
   }, 
   "postedTime": "2012-06-20T18:19:29.000Z", 
   "id": "tag:search.twitter.com,2005:215509188481253376", 
   "objectType": "activity"
}

This is social data. Same content, very different format, very different context and very different end user.

So what exactly goes into the social data of a Tweet? To start, here is some of the metadata that you’re seeing.

  • Language identification — Gnip detects that this Tweet is written in English. Language identification is important for social media monitoring so companies can correctly monitor for the content they want.

  • URL expansion — Essentially this resolves or traces a shortened URL to the final URL that a consumer would see in their browser window. In this case, http://storify.com/Gnip/big-boulder is the link we shared using bitly.

  • Content — Gnip shows the full content of the Tweeted message, as well as metadata about the Tweet, such as hashtags and URLs used, users that were mentioned, and when it was posted.

  • User — Gnip provides the display name, username, user’s stated location and additional bio information of the Tweeter. This is the information that users decide to share when signing up for an account.

  • Klout scores — An additional piece of metadata Gnip can provide is Klout score, so if one of our clients only wanted to see tweets with a Klout score of 30 or higher, they could do that.
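The fields described above can be read straight out of the payload. The following is a minimal sketch in Python, assuming the activity has been trimmed to the handful of keys used here (the values are copied from the example payload above). The helper names `extract_metadata` and `passes_klout` are made up for illustration and are not part of any Gnip library; the Klout check is a client-side stand-in for the kind of filtering described in the last bullet.

```python
import json

# A trimmed copy of the example activity above; only the keys used below are kept.
RAW_ACTIVITY = """
{
  "body": "RT @gnip: Thrilled to welcome all #BigBoulder attendees! Watch the social story unfold on our Storify page. http://t.co/ZzqUMfJz",
  "postedTime": "2012-06-20T18:19:29.000Z",
  "actor": {
    "preferredUsername": "daveheal",
    "displayName": "Dave Heal",
    "location": {"displayName": "Boulder, CO", "objectType": "place"}
  },
  "gnip": {
    "klout_score": 53,
    "language": {"value": "en"},
    "urls": [
      {"url": "http://t.co/ZzqUMfJz",
       "expanded_url": "http://storify.com/Gnip/big-boulder"}
    ]
  },
  "twitter_entities": {
    "hashtags": [{"indices": [34, 45], "text": "BigBoulder"}]
  }
}
"""


def extract_metadata(activity):
    """Pull out the metadata discussed above, tolerating missing keys."""
    gnip = activity.get("gnip", {})
    actor = activity.get("actor", {})
    entities = activity.get("twitter_entities", {})
    return {
        "language": gnip.get("language", {}).get("value"),
        "expanded_urls": [u.get("expanded_url") for u in gnip.get("urls", [])],
        "hashtags": [h["text"] for h in entities.get("hashtags", [])],
        "username": actor.get("preferredUsername"),
        "location": actor.get("location", {}).get("displayName"),
        "klout_score": gnip.get("klout_score"),
    }


def passes_klout(activity, minimum=30):
    """Client-side check mirroring the 'Klout score of 30 or higher' example."""
    score = activity.get("gnip", {}).get("klout_score")
    return score is not None and score >= minimum


activity = json.loads(RAW_ACTIVITY)
metadata = extract_metadata(activity)
print(metadata)
print(passes_klout(activity))
```

Using `.get()` with defaults keeps the sketch from failing on activities that lack an enrichment, which seems a reasonable precaution since not every activity necessarily carries every field.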

Beyond Twitter data, Gnip offers social data from Tumblr, Disqus, Automattic (WordPress) and other publishers that all have their own unique metadata and enrichments. In addition to enrichments, Gnip offers format normalization. This means that whether you’re looking at a WordPress blog or a Tweet, the data is normalized no matter what the platform. For example, date and location are formatted the same way and located in the same place within the JSON payload, making it easy to consume and parse data from multiple sources.
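To illustrate why normalization helps, here is a small sketch in Python. The two activities are hand-written stand-ins: the first borrows values from the Tweet above, while the second is a hypothetical blog-post activity whose values are invented for illustration. The only assumption is that both carry `postedTime` and `actor.location.displayName` in the same place, as described above.

```python
from datetime import datetime, timezone

# Stand-in activities. The tweet values come from the example payload above;
# the blog post is hypothetical and exists only to show the shared layout.
tweet_activity = {
    "provider": {"displayName": "Twitter"},
    "postedTime": "2012-06-20T18:19:29.000Z",
    "actor": {"location": {"displayName": "Boulder, CO"}},
}
blog_activity = {
    "provider": {"displayName": "WordPress"},              # hypothetical
    "postedTime": "2012-06-21T09:00:00.000Z",              # hypothetical
    "actor": {"location": {"displayName": "Denver, CO"}},  # hypothetical
}


def posted_at(activity):
    """Parse the timestamp format used in the payload above as UTC."""
    parsed = datetime.strptime(activity["postedTime"], "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def summarize(activity):
    """One code path for every source, because the fields sit in the same place."""
    location = activity.get("actor", {}).get("location", {}).get("displayName")
    return (activity["provider"]["displayName"], posted_at(activity), location)


for item in (tweet_activity, blog_activity):
    print(summarize(item))
```

The point of the sketch is that `summarize` needs no per-source branching; without normalization, each publisher would likely need its own parsing function.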

Finally, a big difference is in how people use social data vs social media. Social data is what powers social media monitoring and analytics companies. It’s used in business intelligence to combine with other data sets, by hedge funds as part of their algorithms when looking at financial trades, and even to take a top-level look during a natural disaster.

Welcoming Estimize, Gnip’s Latest Premium Publisher

At Gnip, we’ve always had a theory that financial firms would be hungry for social data. What has happened has surpassed our expectations, though; we’ve seen an incredible hunger from firms wishing to use social data as a news source, a sentiment signal and a research set.

One of the ways we’ve measured the success of how this sector uses social data is by how often our customers ask for additional social data sources. One of the most consistent asks we’ve heard has been for Estimize, a crowdsourced earnings estimates platform that provides open sourced financial estimates with incredible transparency, making it a valuable and unique set of social data.

We’re excited to now be the exclusive provider of Estimize’s streaming data, delivering our trading customers yet another competitive edge driven by social interaction. Estimize has a community of 2,50 vetted analysts that create estimates that beat comparable Wall Street reports more than 67% of the time. In the few short years since Estimize was founded, it has become a force, and we believe this dataset, and its power, will continue to increase substantially over time.

Watching how the financial industry has incorporated social data from StockTwits, Twitter and now Estimize is proving the utility of social data and we’re excited to be on the vanguard of that.

Access to Public APIs from Instagram, bitly, Reddit, Stack Overflow, Panoramio and Plurk

Our customers care about every public conversation that happens online. Every month we deliver more than 100 billion social data activities to our clients. While much of our social data is from our premium publishers (Twitter, Tumblr, WordPress, Disqus and StockTwits), we also make a wide range of social data from public APIs readily available through our Enterprise Data Collector product. A significant part of what Gnip does is make social data easier to digest by optimizing the polling of these APIs and by enriching and normalizing the data. Because of that normalization, social data you digest from the public API of Instagram through Gnip will appear in the same format as social data from Twitter.

To that end, we’re announcing the addition of the public APIs for Instagram, bitly, Reddit, Stack Overflow, Panoramio and Plurk to the Gnip Enterprise Data Collector. While some of those might make perfect sense to you, others might make you turn your head and say, “huh.” Below we have more background on each publisher and why they’re important to the social data ecosystem.

Instagram on Enterprise Data Collector

This photo sharing app, recently acquired by Facebook, continues to be one of the fastest growing social networks out there with 90 million monthly active users. Every day there are 40 million photos uploaded, and every second users like 8,500 photos and make 1,000 comments about them. Our customers have traditionally been very interested in geotagged social data, and between 15 and 25 percent of Instagram users geotag their photographs.

Instagram has become a popular marketing tool for brands: Anthropologie, Intel, Virgin America, Taco Bell and American Express, to name a few, all have Instagram accounts. Furthermore, we’ve really started to see Instagram as a popular tool around current events and for citizen reporting. During Hurricane Sandy, many people used Instagram as a way to document what was happening around them and show destruction in real time. With the recent inauguration, CNN asked users to tag their Inaugural Instagram photos with #CNN and they saw users submitting an average of 25 photos every few seconds.

Customers accessing the Enterprise Data Collector will be able to access popular posts, conduct tag searches and geosearches.

Potential Uses for the Instagram API:

  • Tracking photos around natural disasters
  • Geo use cases for a given location
  • Brand monitoring

bitly on Enterprise Data Collector

bitly is the easiest and most fun way to save, share and discover links from around the web. While commonly associated as a link shortener for Twitter, bitly is used across the web and provides great information about what social sites are driving traffic. People use bitly to share 80 million new links a day.

Gnip customers will be able to search by keyword across the destination page’s title and URL, as well as some of its content and header tags.

Potential Uses for the bitly API:

  • Monitoring for brand mentions
  • Understanding trending content

Reddit on Enterprise Data Collector 

Reddit is a social news site with user-generated content covering nearly every topic in the world. One of the world’s fastest growing sites, Reddit has 50 million active users contributing links, stories, pictures and topics of discussion.

Customers will be able to search by keyword and hot topics. Brands are often unaware of stories percolating about them on the popular site. One recent interesting example is where a Redditor posted an Applebee’s receipt where a pastor refused to tip her waitress based on how much she was tithing, which ultimately ended up being a national news story.

Potential Uses for the Reddit API:

  • Monitoring for brand mentions
  • Crisis communications warning

Stack Overflow on Enterprise Data Collector

Stack Overflow is a community edited Q&A site about computer programming, making it easy for programmers to find answers to questions they have about code. The site has more than 1.5 million registered users and 4 million questions.

Customers will have access to the entire firehose of Stack Overflow Answers and be able to search tags, reputation and comments by keyword. Programmers tag their questions, making it easy to find the content you’re looking for. Currently, the six most popular tags are C#, Java, PHP, JavaScript, jQuery, and Android.

Potential Uses for the Stack Overflow API:

  • Monitoring questions and discussion about software and technical brands
  • Monitoring bugs and outages
  • Often requested in conjunction with review sites

Panoramio on Enterprise Data Collector

Panoramio is a photo-sharing website with geotagged content that is layered upon Google Earth and Google Maps. Panoramio allows viewers to see an enhanced view of Google Earth because they can see other photos taken in the area.

Customers will be able to use a bounding box to view photos within a certain location. We have consistently found that our customers are eager for more social data with geotagged content.
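As a rough illustration of what a bounding-box filter does, the sketch below checks whether a photo’s coordinates fall inside a box. It is plain Python and does not call the Panoramio API or the Enterprise Data Collector; the `BoundingBox` class, the sample photo records, and the coordinates (an approximate box around Boulder) are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """A latitude/longitude rectangle that does not cross the antimeridian."""
    min_lat: float
    min_lon: float
    max_lat: float
    max_lon: float

    def contains(self, lat, lon):
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


# Approximate, hand-picked box around Boulder, CO (illustrative only).
boulder = BoundingBox(min_lat=39.95, min_lon=-105.35,
                      max_lat=40.10, max_lon=-105.15)

# Hypothetical photo records; real payloads will have their own field names.
photos = [
    {"title": "Flatirons at sunset", "lat": 40.00, "lon": -105.29},
    {"title": "Golden Gate Bridge", "lat": 37.82, "lon": -122.48},
]

inside = [p["title"] for p in photos if boulder.contains(p["lat"], p["lon"])]
print(inside)
```

A box is defined by two corners, so the membership test is just two range comparisons, which is part of why bounding boxes are a common way to express geographic filters.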

Potential Uses for the Panoramio API:

  • Monitor social activity within a certain geographic area

Plurk on Enterprise Data Collector

Plurk is a microblogging site that allows users to communicate in posts of up to 210 characters and emoticons. Plurk has more than 1 million active users that post 3 million “Plurks” each day. Plurk is one of the more popular social networks in Taiwan and also has a strong presence in Hong Kong, Singapore, the Philippines and India. Gnip customers will be able to search for keywords within posts.

Potential Uses for the Plurk API:

  • Monitoring for brand mentions, with a particular focus on certain Asian countries
  • Understanding trending content

If you’re interested in learning more about these additional sources on Enterprise Data Collector, please contact info@gnip.com for more information.

SEC “Likes” Social Media

If there’s one person in the world who gets the intersection of the social web with investing, it’s Howard Lindzon. So when the SEC ruled Tuesday that company postings on sites like StockTwits/Facebook/Twitter were as good as news releases and company websites (as long as investors were aware of the use of those sites), I immediately turned to Howard for his thoughts. And sure enough, he had a great post today.

One of the most powerful points he made spoke to the fine line that StockTwits walks as a finance social site. They carefully split the difference between 1) allowing the managed/curated content, tools and control necessary for compliance and governance and 2) enabling the spontaneous, multi-source, lightning-quick conversation paradigm that makes social media so incredible. As he put it:

Rules matter and if they are clearly stated and thoughtfully enforced, communities can thrive (learn, mentor, make a little coin). We [ed. StockTwits] added some basic financial features like the ability to create disclosures/tracking and the removal of the delete function to ensure trust is at the forefront.  No matter what others call us or think, Stocktwits is a NEWSWIRE. Information is flowing from one to many, all day, every day and it is full of context.

The social web will continue to grow and the power of the content being created on that web will continue to impact even the most regulated industries. How other platforms can adapt and fuel that change, like StockTwits has done, will be fascinating to watch.

Check out the full blog here.

Big Boulder: Bourbon & Boots at SXSW

Derek Gottfrid of Tumblr

At SXSW 2013, Gnip brought out the big guns, so to speak. We held our first SXSW event, Big Boulder: Bourbon & Boots, on Monday at Malverde for more than 200 of our awesome customers and publishers. We were lucky enough to have Derek Gottfrid, Tumblr’s VP of Product, interviewed by Gnip’s COO, Chris Moody.

You can check out our photos from Big Boulder: Bourbon & Boots on Facebook or check out our Storify with Tweets, Instagrams and Vines of the event!

We were also able to attend several great sessions related to social media and social data, and our notes from the sessions are below!

Building Tools for Creativity 
David Karp of Tumblr

David talked about creating Tumblr as a “hacky tool” because he had tried all of the tools and wanted a better way to express himself on the web. After creating the tool, he was surprised that within a week there were a few thousand people using it.

One aspect of Tumblr that has always amazed David Karp is how its users have defined its use. The reblogging button created a whole community that wasn’t about creation but rather curation, which is a huge part of Tumblr’s identity. When they made panoramas from the iPhone easier to share and more presentable on Tumblr, they found a bunch of new immediate use cases. Jamie Beck created the cinemagraph, a gorgeous and more dramatic GIF. One person shared whole boards from a video game he was designing. You can’t predict how people will use new Tumblr features, but the ways they do will surprise you.

David also perfectly captured a trend on the web, “Images are first class citizens and everything else is a distant second on the web.”

Real-Time Marketing

David Teicher, Ad Age; Bonin Bough, Mondelez International (makers of Oreo); Steve Doan, Oreo; Gary Vaynerchuk, VaynerMedia; David Berkowitz, 360i; Albert Chou, Expion

Gnip was lucky enough to be invited to this private, invite-only panel held by Expion, and it was definitely a favorite by those who attended. This panel focused on real-time marketing and was from the group that brought us the much talked about Oreo Super Bowl real-time marketing.

One area that really stuck out to me was when Bonin talked about how social media made the ROI of each marketing mix more powerful. That social media made their TV spend twice as effective, and that any marketer interested in ROI would want to make their spend 2X as effective. Marketers shouldn’t look to understand how much value they received for each Tweet but rather take a look at the ROI of the overall ecosystem.

Another point that really resonated was how hard it is to measure traditional marketing and how much easier it is to measure social media. The panelists talked about how advertisers are faced with a “pass-around rate” for circulation in case someone leaves a copy of Vogue on the bus, and how unrealistic billboard advertising view counts are because, as Gary mentioned, everyone is likely texting while driving and not paying attention to advertising. Social media ROI is easier and more realistic to prove.

To make real-time marketing work, you need a willingness to prepare. As David Berkowitz pointed out, to do real-time marketing you should ask, “What are you doing every single day of the year?” Gary also talked about how frustrating it was for him, from an agency perspective, when his clients didn’t move fast enough, which is a familiar problem for agencies.

This panel also brought up the point that using content for marketing purposes is nothing new. Michelin created the Michelin guide for restaurants and hotels once they realized people liked taking road trips. When Guinness had problems selling pints, they created the Guinness World Records. Today it is companies such as Red Bull creating content as part of their brand.

The State of Blogging 
Matt Mullenweg of Automattic and Kara Swisher of All Things D

This was a hilarious session about blogging, and was helped tremendously by Matt and Kara’s back-and-forth banter.

Matt talked about how he created WordPress because he wanted better software for blogging and was frustrated by what was on the market at the time. Creating WordPress was a “happy accident.”

When asked whether social networks were hurting blogging, Matt told the audience that social networks had given blogging a second wind. Social networks drive significant traffic to WordPress sites. Matt also mentioned that different social networks create different ego boosts for different reasons, and Kara told him that he needed a dog.

Matt believes that WordPress might not have the most users, but that it has the best users. It offers a lot of flexibility and power for serious bloggers. WordPress continues to grow year after year, and much of that growth is organic. Matt thinks they’ve beaten out other competitors by understanding personal publishing better than anyone else.

One area where Matt sees room for WordPress to improve is its WYSIWYG editor; the experience there and on mobile could be so much better.

Matt is always trying to think about how people will be digesting content 18 months from now, so right now he is thinking about how Google Glass will change the content experience.

Matt also talked about what makes a good blog post and emphasized that pictures can really create a better reading experience.

 

Data Story: Mohammad Shahangian on Pinterest Data Science

At Gnip, we believe the value of social data is unlimited. Data Stories is how we bring this belief to life by showcasing how social data is used. This week we’re interviewing data scientist Mohammad Shahangian of Pinterest about how the data science team works at Pinterest, surprising uses of Pinterest and data science as a career path. You can follow him on Pinterest at pinterest.com/mshahang

Data Scientist at Pinterest

1. What do you see as your role as the data scientist for Pinterest?

The company’s focus is on helping millions of people discover things they love and get inspiration to go do those things in their life. For me, that means analyzing the rich data that is created by the millions of people interacting with billions of pins from across the web each day. I evaluate this data and provide insights that make data actionable. My team also prototypes and validates ideas, performs deep analysis and builds tools that allow us to answer our most frequent questions in seconds. We work with every team to answer Pinterest’s biggest questions and ensure that each decision positively impacts Pinners over the long term.

For example, we take a business question like “How should our web, tablet and phone experiences differ?” and present the results as insights like, “Many users use the mobile apps in the morning and again at night, but prefer the website during the day” and “Users prefer to use mobile apps to casually discover new content, whereas they use the web to curate and organize content.” We then work with the design and product teams to build features around these insights and measure their impact.

2. What are some of your favorite ways that people use Pinterest that people wouldn’t expect?

What makes Pinterest unique is that it’s a tool and the users really define its use cases. For me, Pinterest was really helpful when I was planning my wedding, and it made perfect sense to use as a collaborative office shopping list. I would have never thought to use it as a tool for:

A collection of Stop signs from around the world
Daily Grommet gets their community to collaborate on a board to see things they want to sell
Vintage Driving - a collaborative board where users pin their favorite vintage cars
GE Badass machines featuring GE tech
Madewell’s Rainbow board
Michelle Obama’s MyPlate Recipes encourages healthy eating
Stunning virtual collections of minerals and shipwrecks
The “365 Days of Pinterest” challenge. She made a Pinterest project every day for a year!
Sammy Sosa awesomeness
Sony shows off their technology with food pictures shot with a Sony Camera
Pantone announces the color of the year
The National Pork Board

3. What category do you see as the most viral on Pinterest?

DIY and recipe pins generally go viral year round. Around the holidays, holiday-themed content across all categories tends to get the most traction.

4. How has data science added value to Pinterest?

We have this internal value we refer to as “knit.” It means that we have an open, curious culture where everyone in different disciplines—from engineering and design to marketing to community—works together. Data science is at the core of that. The search, recommendations and spam teams apply data science to improve the quality of content we put in front of Pinners. This is only a subset of how we apply data though; most of the decisions we make at Pinterest are actually backed by data.

Data is a universal language that teams across the company use to collaborate and make decisions. Each team has a set of performance metrics, and we hold a weekly meeting to understand the impact that each area is having on company-wide metrics. As data scientists, we do more than just analyze data: we create rich data sources that we make available to other teams so they can do their own analysis. More than half of Pinterest employees run MapReduce jobs via Hive. Our metrics dashboards are accessible to everyone and our core metrics are emailed daily to the entire team. We also share our data studies and insights with the whole team.

We also use data just for fun. During our weekly happy hour, we share a Data Fun Fact with the team. We present the fact in the form of a multiple choice question and have the team vote on the answer. For example, we asked, “How many days before Valentine’s Day does the query ‘Valentine’s day ideas’ increase the most: 1, 3, 5 or 7 days?” (Hint for the curious reader: two*three/two).

5. What do you think someone should know before becoming a data scientist at a major web company like Pinterest?

I would say go for it! If you are hungry to extract value from real world data, you’re really going to enjoy it. I know that for a lot of really talented people in academia the only thing standing between them and the opportunity to solve a really interesting problem is the lack of rich data. My experience at Pinterest has been the exact opposite. Our team can’t grow fast enough to tap into a world of valuable insights that are sitting dormant within billions of records somewhere in the cloud.
