Author: Elaine Ellis, Marketing

Elaine Ellis is a Marketing Manager at Gnip where she runs social media and public relations. Previously, she was a marketing manager at Trada. Elaine received her BA in Marketing from Notre Dame (Go Irish!) and finds it incredibly awkward that her boss went to USC.

Gnip at Boulder Startup Week

Each year, Boulder Startup Week celebrates the startup community, and Gnip is proud to be a sponsor again. In addition to bringing new talent to Boulder, it’s a great way for our startup community to come together, collaborate and learn from each other.

If you’re visiting town, be sure to check out Gnip’s open job listings. Gnip will also be speaking at the following events, and we would love to say hi, so seek us out!

Boulder *Hearts* Women in Tech
Gnip’s Vice President of Engineering Greg Greenstreet is speaking
Wednesday, May 15, 2013 from 2:00 PM to 3:30 PM (MDT)
Boulder, CO

Research tells us that women can make startups more efficient, more resilient, and more successful; yet there’s a dearth of women at startups. So, what to do? Boulder has a welcoming culture and startups can build welcoming cultures, too. Learn about the importance of “cultural fit,” avoiding unconscious bias, recruiting with a wide net, and writing job descriptions that don’t include the word “ninja.” Hear from local startups about their challenges and perspectives on building diversity into their cultures.

Panelists:
Greg Greenstreet, Gnip
Leslie Osborne, Standing Cloud
Jim Franklin, SendGrid
Ingrid Alongi, QuickLeft
Jenny Slade, NCWIT

Give Back, Get Back: A Guide to Boulder and OSS
A talk by Gnip’s Engineer Eoin Coffey at the Gnip office
1050 Walnut Street, #115
Boulder, CO 80302
Thursday, May 16, 2013 from 5:30 PM to 7:00 PM

Boulder and Open Source Software (OSS) both thrive on open communities that contribute. Participating in both helps you grow as a professional and as a programmer. What might surprise you, however, is how your contributions help others in ways you might not expect. This talk by Eoin Coffey of Gnip will take a look at the unique aspects of contributing to OSS and the Boulder tech community.

Food and drink provided.

How to Hire an Intern (and Be Hired)
Gnip intern Brian Lehmann and Data Scientist Dr. Scott Hendrickson will be on the panel
Friday, May 17, 2013 from 10:00 AM to 11:30 AM (MDT)
RaffleCopter
1021 Pearl St.
Boulder, CO 80302

Want to hire a student developer? Or looking to get a technical internship at a startup? Come hear from a panel of startups, including Jim Franklin, CEO of SendGrid, and Dr. Scott Hendrickson, Head of Data Science at Gnip, who have successfully hired students. Also hear the perspective of students who have worked at startups including Gnip, Rally Software and TeamSnap about how to do it right. The event will be followed by a last-minute networking session for students looking for positions at startups.

Data Story: Phil Harris of Geofeedia

Data Stories is Gnip’s ongoing series telling the stories of the people and companies that are doing groundbreaking work in social data. This week we’re interviewing Phil Harris, CEO of Geofeedia, a company that allows you to search and monitor social media by location. Geofeedia is a recent Gnip customer, and I love what they’re doing. The inherent value of Geofeedia was made clear to me when we received a media request looking for all social media that was geotagged close to the finish line of the Boston Marathon. Content + location creates powerful stories and Geofeedia is making it easier to find the right ones. 

1. What social data sources do you wish had geotagged data?
Our business is built on the fundamental premise of open source social data aggregation.  Or, I should say, every source. That said, there are currently major social data sources that provide public location data based on location identifier versus geotag. We will accommodate location id to integrate these data sources, but I strongly believe that over time, the benefits of more precise geo-location tagging on social media content will encourage these services to move towards geotagging. When they do, we’re exceptionally well positioned to translate that evolution into benefit for our clients.

2. If you’re a user, what do you think is the advantage of sharing your geodata?
We’ve barely scratched the surface of how geodata will deliver value to consumers. I believe the rapidly growing penetration of smartphones and adoption of geo-centric applications such as navigation will create a rich ecosystem of geo-data driven benefits. I am speaking with major consumer brands who believe that they will be able to create and maintain consumer relationships via location based social media in ways that will deliver significant value back to the individual user.

3. What can you find with Geofeedia that you can’t find on other platforms?
I know from analyzing our data with active customers that a significant amount of user generated content is missed by traditional keyword or hashtag centric monitoring tools. We complement these platforms to ensure relevant location based content is delivered to our customers in real-time.

4. Only a small portion of social media is geotagged. Do you think this will change in the future?
I do. We’re seeing an increase every quarter, but as brands start rolling out compelling reasons for consumers to geotag their content, I believe geotagged social media will become the default.

5. How do you think Geofeedia will be used for good?
The leading businesses I’m speaking with consider Geofeedia as a tool to improve their overall customer experience. Understanding an individual social media conversation at a moment in time at a given location drastically improves the ways brands can serve their customers. Also, numerous public safety agencies are using Geofeedia to improve their ability to respond to natural disasters and other scenarios where real-time, location based social media awareness delivers great value.

6. How will real-time geo monitoring affect a brand’s ability to connect with their customers?
Like I said, the major brands with whom I’m speaking are evaluating how to improve their overall customer experience across all touch points – sales, customer service, loyalty – through real-time location based monitoring, analysis and engagement. I do believe that real-time, location based social media engagement will drastically improve a brand’s ability to have a meaningful, new type of relationship with their customers and become a de facto element of their communication mix.

Social Data vs Social Media

One area where I see a lot of confusion is the difference between social media and social data. I come from a social media background and use social media in marketing, so I see where the confusion can come from.

The easiest way to think about it in plain English:

  • Social Media: User-generated content where one user communicates and expresses themselves and that content is delivered to other users. Examples of this are platforms such as Twitter, Facebook, YouTube, Tumblr and Disqus. Social media is delivered in a great user experience, and is focused on sharing and content discovery. Social media also offers both public and private experiences with the ability to share messages privately.

  • Social Data: Expresses social media in a computer-readable format (e.g. JSON) and shares metadata about the content to help provide not only content, but context. Metadata often includes information about location, engagement and links shared. Unlike social media, social data is focused strictly on publicly shared experiences.

Or otherwise boiled down, social media is readable by humans and made for human interaction while social data is social media that is readable by computers.

Let’s look at a Tweet in the form of social media and social data to show exactly what I’m talking about.

From this Tweet from Gnip, we can see that it uses the #BigBoulder hashtag, includes a Bit.ly link to our Storify page, has 73 retweets and 3 favorites, and shows the time and date it was posted.

 

Now let’s take a look at what the architecture of a Tweet looks like when received from an API.


{
   "body": "RT @gnip: Thrilled to welcome all #BigBoulder attendees! Watch the social story unfold on our Storify page. http://t.co/ZzqUMfJz",
   "retweetCount": 71, 
   "generator": {
      "link": "http://twitter.com", 
      "displayName": "web"
   }, 
   "gnip": {
      "klout_score": 53, 
      "matching_rules": [
         {
            "tag": "old krusty tweet", 
            "value": "thrilled to welcome all attendees"
         }
      ], 
      "language": {
         "value": "en"
      }, 
      "urls": [
         {
            "url": "http://t.co/ZzqUMfJz", 
            "expanded_url": "http://storify.com/Gnip/big-boulder"
         }
      ]
   }, 
   "object": {
      "body": "Thrilled to welcome all #BigBoulder attendees! Watch the social story unfold on our Storify page. http://t.co/ZzqUMfJz",
      "generator": {
         "link": "http://www.tweetdeck.com", 
         "displayName": "TweetDeck"
      }, 
      "object": {
         "postedTime": "2012-06-20T18:07:13.000Z", 
         "summary": "Thrilled to welcome all #BigBoulder attendees! Watch the social story unfold on our Storify page. http://t.co/ZzqUMfJz", 
         "link": "http://twitter.com/gnip/statuses/215506104082366465", 
         "id": "object:search.twitter.com,2005:215506104082366465", 
         "objectType": "note"
      }, 
      "actor": {
         "preferredUsername": "gnip", 
         "displayName": "Gnip, Inc.", 
         "links": [
            {
               "href": "http://gnip.com", 
               "rel": "me"
            }
         ], 
         "twitterTimeZone": "Mountain Time (US & Canada)", 
         "image": "http://a0.twimg.com/profile_images/1347133706/Gnip_logo-73x73_normal.png", 
         "verified": true, 
         "location": {
            "displayName": "Boulder, CO", 
            "objectType": "place"
         }, 
         "statusesCount": 971, 
         "summary": "Gnip is the leading provider of social media data for enterprise applications, facilitating access to dozens of social media sources through a single API",
         "languages": [
            "en"
         ], 
         "utcOffset": "-25200", 
         "link": "http://www.twitter.com/gnip", 
         "followersCount": 3335, 
         "favoritesCount": 108, 
         "friendsCount": 384, 
         "listedCount": 212, 
         "postedTime": "2008-10-24T23:22:09.000Z", 
         "id": "id:twitter.com:16958875", 
         "objectType": "person"
      }, 
      "twitter_entities": {
         "user_mentions": [], 
         "hashtags": [
            {
               "indices": [
                  24, 
                  35
               ], 
               "text": "BigBoulder"
            }
         ], 
         "urls": [
            {
               "indices": [
                  98, 
                  118
               ], 
               "url": "http://t.co/ZzqUMfJz", 
               "expanded_url": "http://bit.ly/MumrVJ", 
               "display_url": "bit.ly/MumrVJ"
            }
         ]
      }, 
      "verb": "post", 
      "link": "http://twitter.com/gnip/statuses/215506104082366465", 
      "provider": {
         "link": "http://www.twitter.com", 
         "displayName": "Twitter", 
         "objectType": "service"
      }, 
      "postedTime": "2012-06-20T18:07:13.000Z", 
      "id": "tag:search.twitter.com,2005:215506104082366465", 
      "objectType": "activity"
   }, 
   "actor": {
      "preferredUsername": "daveheal", 
      "displayName": "Dave Heal", 
      "links": [
         {
            "href": "http://daveheal.com", 
            "rel": "me"
         }
      ], 
      "twitterTimeZone": "Mountain Time (US & Canada)", 
      "image": "http://a0.twimg.com/profile_images/1755125722/photo_2_normal.JPG", 
      "verified": false, 
      "location": {
         "displayName": "Boulder, CO", 
         "objectType": "place"
      }, 
      "statusesCount": 5657, 
      "summary": "Boulder resident. Rochester NY native. Michigan Law graduate. Copyright enthusiast. Liker of sports. DFW fanboy. CrossFitter. Work @Gnip. ",
      "languages": [
         "en"
      ], 
      "utcOffset": "-25200", 
      "link": "http://www.twitter.com/daveheal", 
      "followersCount": 671, 
      "favoritesCount": 28, 
      "friendsCount": 292, 
      "listedCount": 26, 
      "postedTime": "2009-03-02T01:18:39.000Z", 
      "id": "id:twitter.com:22432819", 
      "objectType": "person"
   }, 
   "twitter_entities": {
      "user_mentions": [
         {
            "indices": [
               3, 
               8
            ], 
            "id": 16958875, 
            "screen_name": "gnip", 
            "id_str": "16958875", 
            "name": "Gnip, Inc."
         }
      ], 
      "hashtags": [
         {
            "indices": [
               34, 
               45
            ], 
            "text": "BigBoulder"
         }
      ], 
      "urls": [
         {
            "indices": [
               108, 
               128
            ], 
            "url": "http://t.co/ZzqUMfJz", 
            "expanded_url": "http://bit.ly/MumrVJ", 
            "display_url": "bit.ly/MumrVJ"
         }
      ]
   }, 
   "verb": "share", 
   "link": "http://twitter.com/daveheal/statuses/215509188481253376", 
   "provider": {
      "link": "http://www.twitter.com", 
      "displayName": "Twitter", 
      "objectType": "service"
   }, 
   "postedTime": "2012-06-20T18:19:29.000Z", 
   "id": "tag:search.twitter.com,2005:215509188481253376", 
   "objectType": "activity"
}

This is social data. Same content, very different format, very different context and very different end user.

So what exactly goes into the social data of a Tweet? To start, here is some of the metadata that you’re seeing.

  • Language identification: The language of this Tweet is detected as English. Language identification is important for social media monitoring so companies can correctly monitor for the content they want.

  • URL expansion: This resolves, or traces, a shortened URL to the end URL that a consumer would see in their browser window. In this case, http://storify.com/Gnip/big-boulder is the link we shared using bitly.

  • Content: Gnip shows the full content of the Tweeted message, as well as metadata about the Tweet, such as the hashtags and URLs used, the users that were mentioned, and when it was posted.

  • User: Gnip provides the display name, username, user’s stated location and additional bio information of the Tweeter. This is the information that users decide to share when signing up for an account.

  • Klout scores: An additional piece of metadata Gnip can provide is the Klout score, so a client who only wants to see Tweets with a Klout score of 30 or higher can filter for that.
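As a rough illustration of how a consumer of this data might read the metadata described above, here is a minimal Python sketch. It assumes the field names that appear in the sample payload (`gnip.language.value`, `gnip.urls`, `gnip.klout_score`, `twitter_entities.hashtags`, `actor.preferredUsername`). The `SAMPLE` payload is a trimmed copy of the one shown earlier, and the helper names `summarize` and `meets_klout_threshold` are hypothetical, not part of any Gnip library.

```python
import json

# Trimmed copy of the retweet payload shown above; only the fields used below are kept.
SAMPLE = json.loads("""
{
  "body": "RT @gnip: Thrilled to welcome all #BigBoulder attendees! Watch the social story unfold on our Storify page. http://t.co/ZzqUMfJz",
  "postedTime": "2012-06-20T18:19:29.000Z",
  "actor": {
    "preferredUsername": "daveheal",
    "location": {"displayName": "Boulder, CO"}
  },
  "gnip": {
    "klout_score": 53,
    "language": {"value": "en"},
    "urls": [
      {"url": "http://t.co/ZzqUMfJz",
       "expanded_url": "http://storify.com/Gnip/big-boulder"}
    ]
  },
  "twitter_entities": {
    "hashtags": [{"indices": [34, 45], "text": "BigBoulder"}]
  }
}
""")


def summarize(activity):
    """Pull the enrichments discussed above out of one activity (hypothetical helper)."""
    gnip = activity.get("gnip", {})
    entities = activity.get("twitter_entities", {})
    actor = activity.get("actor", {})
    return {
        "username": actor.get("preferredUsername"),
        "location": actor.get("location", {}).get("displayName"),
        "language": gnip.get("language", {}).get("value"),
        "expanded_urls": [u["expanded_url"] for u in gnip.get("urls", [])
                          if "expanded_url" in u],
        "hashtags": [h["text"] for h in entities.get("hashtags", [])],
        "klout_score": gnip.get("klout_score"),
        "posted": activity.get("postedTime"),
    }


def meets_klout_threshold(activity, minimum=30):
    """Return True when the activity carries a Klout score of at least `minimum`."""
    score = activity.get("gnip", {}).get("klout_score")
    return score is not None and score >= minimum


if __name__ == "__main__":
    print(summarize(SAMPLE))
    print(meets_klout_threshold(SAMPLE))
```

Every lookup goes through `.get`, so an activity that lacks an enrichment (for example, one with no Klout score) yields `None` or an empty list instead of raising an error.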

Beyond Twitter data, Gnip offers social data from Tumblr, Disqus, Automattic (WordPress) and other publishers that all have their own unique metadata and enrichments. In addition to enrichments, Gnip offers format normalization. This means that whether you’re looking at a WordPress blog or a Tweet, the data is normalized no matter what the platform. For example, date and location are formatted and located in the same place within the JSON payload, making it easy to consume and parse data from multiple different sources.

Finally, a big difference is in how people use social data vs social media. Social data is what powers social media monitoring and analytics companies. It’s used in business intelligence to combine with other data sets, by hedge funds as part of their algorithms when looking at financial trades, and even to take a top-level look during a natural disaster.

Big Boulder: Bourbon & Boots at SXSW

Derek Gottfrid of Tumblr

At SXSW 2013, Gnip brought out the big guns, so to speak. We held our first SXSW event, Big Boulder: Bourbon & Boots, on Monday at Malverde for more than 200 of our awesome customers and publishers. We were lucky enough to have Derek Gottfrid, VP of Product at Tumblr, interviewed by Chris Moody, Gnip’s COO.

You can check out our photos from Big Boulder: Bourbon & Boots on Facebook or check out our Storify with Tweets, Instagrams and Vines of the event!

We were also able to attend several great sessions related to social media and social data, and our notes from the sessions are below!

Building Tools for Creativity 
David Karp of Tumblr

David talked about creating Tumblr as a “hacky tool” because he had tried all of the tools and wanted a better way to express himself on the web. After creating the tool, he was surprised that within a week there were a few thousand people using it.

One aspect of Tumblr that has always amazed David Karp is how its users have defined its use. The reblogging button created a whole community that wasn’t about creation but rather curation, which is a huge part of Tumblr’s identity. When they made panoramas from the iPhone easier to share and more presentable on Tumblr, they found a bunch of new immediate use cases. Jamie Beck created the cinemagraph, a gorgeous and more dramatic GIF. One person shared whole boards from a video game he was designing. You can’t predict how people will use new Tumblr features, but how they use them will surprise you.

David also perfectly captured a trend on the web: “Images are first class citizens and everything else is a distant second on the web.”

Real-Time Marketing

David Teicher, Ad Age; Bonin Bough, Mondelez International (makers of Oreo); Steve Doan, Oreo; Gary Vaynerchuk, VaynerMedia; David Berkowitz, 360i; Albert Chou, Expion

Gnip was lucky enough to be invited to this private, invite-only panel held by Expion, and it was definitely a favorite among those who attended. This panel focused on real-time marketing and came from the group that brought us the much talked about Oreo Super Bowl real-time marketing.

One area that really stuck out to me was when Bonin talked about how social media made the ROI of each part of the marketing mix more powerful. He said that social media made their TV spend twice as effective, and that any marketer interested in ROI would want to make their spend 2X as effective. Marketers shouldn’t look to understand how much value they received for each Tweet but rather take a look at the ROI of the overall ecosystem.

Another aspect that really resonated was how hard it is to measure traditional marketing and how much easier it is to measure social media. They talked about how advertisers are faced with a “pass-around rate” for circulation in case someone leaves a copy of Vogue on the bus, and how unrealistic billboard advertising views are because, as Gary mentioned, everyone is likely texting while driving and not paying attention to advertising. Social media ROI is easier and more realistic to prove.

To make real-time marketing work, you should have a willingness to prepare. As David Berkowitz pointed out, to do real-time marketing you should ask, “What are you doing every single day of the year?” Gary also talked about how frustrating it was, from an agency perspective, when his clients didn’t move fast enough, which is a familiar problem for agencies.

This panel also brought up the point that using content for marketing purposes is nothing new. Michelin created the Michelin guide for restaurants and hotels once they realized people liked taking road trips. When Guinness had problems selling pints, they created the Guinness World Records. Today, companies such as Red Bull create content as part of their brand.

The State of Blogging 
Matt Mullenweg of Automattic and Kara Swisher of All Things D

This was a hilarious session about blogging, and was helped tremendously by Matt and Kara’s back-and-forth banter.

Matt talked about how he created WordPress because he wanted better software for blogging and was frustrated by what was on the market at the time. Creating WordPress was a “happy accident.”

When asked about whether social networks were hurting blogging, Matt told the audience that social networks had breathed a second wind into blogging. Social networks drive significant traffic to WordPress sites. Matt also mentioned that different social networks create different ego boosts for different reasons and Kara told him that he needed a dog.

Matt believes that WordPress might not have the most users, but that they have the best users. It offers a lot of flexibility and power for serious bloggers. WordPress continues to grow year after year and much of that growth is organic. Matt thinks they’ve beat out other competitors by understanding personal publishing better than anyone else.

One area where Matt sees that WordPress needs to improve is its WYSIWYG editor; the experience there and on mobile could be so much better.

Matt is always trying to think about how people will be digesting content 18 months from now, so right now he is thinking about how Google Glass will change the content experience.

Matt also talked about what makes a good blog post and emphasized that pictures can really create a better reading experience.

 

Data Story: Mohammad Shahangian on Pinterest Data Science

At Gnip, we believe the value of social data is unlimited. Data Stories is how we bring this belief to life by showcasing how social data is used. This week we’re interviewing data scientist Mohammad Shahangian of Pinterest about how the data science team works at Pinterest, surprising uses of Pinterest and data science as a career path. You can follow him on Pinterest at pinterest.com/mshahang

Data Scientist at Pinterest

1. What do you see is your role as the data scientist for Pinterest?

The company’s focus is on helping millions of people discover things they love and get inspiration to go do those things in their life. For me, that means analyzing the rich data that is created by the millions of people interacting with billions of pins from across the web each day. I evaluate this data and provide insights that make data actionable. My team also prototypes and validates ideas, performs deep analysis and builds tools that allow us to answer our most frequent questions in seconds. We work with every team to answer Pinterest’s biggest questions and ensure that each decision positively impacts Pinners over the long term.

For example, we take a business question like “How should our web, tablet and phone experiences differ?” and present the results as insights like, “Many users use the mobile apps in the morning and again at night, but prefer the website during the day” and “Users prefer to use mobile apps to casually discover new content, whereas they use the web to curate and organize content.” We then work with the design and product teams to build features around these insights and measure their impact.

2. What are some of your favorite ways that people use Pinterest that people wouldn’t expect?

What makes Pinterest unique is that it’s a tool and the users really define its use cases. For me, Pinterest was really helpful when I was planning my wedding, and it made perfect sense to use it as a collaborative office shopping list. I would have never thought to use it as a tool for:

A collection of Stop signs from around the world
Daily Grommet gets their community to collaborate on a board to see things they want to sell
Vintage Driving - a collaborative board where users pin their favorite vintage cars
GE Badass machines featuring GE tech
Madewell’s Rainbow board
Michelle Obama’s MyPlate Recipes encourages healthy eating
Stunning virtual collections of minerals and shipwrecks
The “365 Days of Pinterest” challenge. She made a Pinterest project every day for a year!
Sammy Sosa awesomeness
Sony shows off their technology with food pictures shot with a Sony Camera
Pantone announces the color of the year
The National Pork Board

3. What category do you see as the most viral on Pinterest?

DIY and recipe pins generally go viral year round. Around the holidays, holiday-themed content across all categories tends to get the most traction.

4. How has data science added value to Pinterest?

We have this internal value we refer to as “knit.” It means that we have an open, curious culture where everyone in different disciplines—from engineering and design to marketing to community—works together. Data science is at the core of that. The search, recommendations and spam teams apply data science to improve the quality of content we put in front of Pinners. This is only a subset of how we apply data though; most of the decisions we make at Pinterest are actually backed by data.

Data is a universal language that teams across the company use to collaborate and make decisions. Each team has a set of performance metrics, and we hold a weekly meeting to understand the impact that each area is having on company-wide metrics. As data scientists we do more than just analyze data; we create rich data sources that we make available to other teams so they can do their own analysis. More than half of Pinterest employees run MapReduce jobs via Hive. Our metrics dashboards are accessible to everyone and our core metrics are emailed daily to the entire team. We also share our data studies and insights with the whole team.

We also use data just for fun. During our weekly happy hour, we share a weekly Data Fun Fact with the team. We present the fact in the form of a multiple choice question and have the team vote on the answer. For example, we asked, “How many days before Valentine’s day does the query ‘Valentine’s day ideas’ increase the most: 1, 3, 5 or 7 days?” (Hint for the curious reader: two*three/two).

5. What do you think someone should know before becoming a data scientist at a major web company like Pinterest?

I would say go for it! If you are hungry to extract value from real world data, you’re really going to enjoy it. I know that for a lot of really talented people in academia the only thing standing between them and the opportunity to solve a really interesting problem is the lack of rich data. My experience at Pinterest has been the exact opposite. Our team can’t grow fast enough to tap into a world of valuable insights that are sitting dormant within billions of records somewhere in the cloud.


Can’t Miss Social Data Sessions at SXSW

Below we’ve compiled our list of can’t miss sessions for those working in social data! Drop us a note in the comments if we’re missing a great panel that social data folks would be interested in.

Also, Gnip is hosting an invite-only event for those companies driving the social data ecosystem on Monday called Big Boulder: Bourbon & Boots, which will feature great networking, a quick interview with the amazing Derek Gottfrid of Tumblr, custom bourbon drinks developed by Gnip’s Rob Johnson, and the best patio in Austin. If you work in social data and want an invite, please email bre@gnip.com. 

Friday

#pathonly: Social’s Shift Towards Real Privacy
5:00PM – 6:00PM
Hilton Austin
Dave Morin of Path

Saturday

How Twitter Has Changed How We Watch TV
9:30AM – 10:30AM
Austin Convention Center
Jenn Deering Davis of Union Metrics

A Home on the Web: The State of Blogging in 2013
12:30PM – 1:30PM
Hyatt Regency
Kara Swisher of AllThingsD, Matthew Mullenweg of Automattic

The Rise of Contextual Social Networks
3:30PM – 4:30PM
Sheraton Austin
Colleen Taylor of TechCrunch, Francesa Levy of LinkedIn Today, Nate Johnson of Path, Sarah Leary of Nextdoor

Data Science Through the Lens of Journalism
3:45PM – 4:00PM
Hilton Austin
Zanab Hussain of SimpleReach

Big Data: Is It Killing Creativity?
4:00PM – 4:15PM
Hilton Austin
Matt Langie of Adobe

Sunday

Social Media Was Fun. Has Measurement Killed It?
5:00PM – 6:00PM
Sheraton Austin
Matt Thomson of Klout, Adam Schoenfeld of Simply Measured

Monday

The Future of Location: From Social to Utility
12:30PM – 1:30PM
Austin Convention Center
Dennis Crowley of Foursquare

The Rise of the Planet of the Creatives
5:00PM – 6:00PM
Omni Downtown
Danielle Strle of Tumblr, Claire Mazur of Of a Kind, Jamie Beck, Jen Beckman of 20X200

Tuesday

Location! The Importance of Geo-Data
11:00AM – 12:00PM
Sheraton Austin
Catherine D’Ignazio of MIT, Devin Gaffney and Mark Graham of Oxford, Monica Stephens of Humboldt

Wednesday

From 140 to 0: The Rise in Image-Based Marketing
11:00AM – 12:00PM
Austin Convention Center
Nate Auerbach of Tumblr, Scott Sperry of Sperry Media, Shannon Schlappi of Locker Partner, Vince Bannon of Getty Images

Data Story: Dan Lynn of FullContact

Data Stories is Gnip’s way to talk about the many amazing ways that data is used. Today on the blog we’re speaking with Dan Lynn, a cofounder and CTO of FullContact. FullContact is trying to solve the world’s contact information problem, which is no small feat. We thought the dilemmas this team faces in dealing with disparate and decaying data make for a great story. You can follow Dan on Twitter at @DanKLynn.

Dan Lynn of FullContact

1. What problem is FullContact trying to solve with data?

At FullContact, we’re solving the world’s contact information problem, which is that your contact information is a mess. In address books like Gmail, Outlook, Salesforce and customer lists, you’ve got missing details, duplicate entries, and the same person fractured across multiple cloud systems. We’re using data to help you clean all that up and keep those address books in sync, up to date, and duplicate-free.

2. What do you see as the advantages of combining social data with contact information? Do people make deeper connections if they have social data?

When I was growing up, I had 3 or fewer ways I could contact my friends: street address, phone (usually their parents’!) and, later, email. As the Internet took off, they added instant messenger accounts, eBay usernames, Twitter handles, Facebook accounts, LinkedIn profiles and dozens more. These are all valid means of contacting someone, but most people prefer some over others, and it’s great to have that choice.

While it’s awesome for me to find out who among my contacts has a Twitter account that I’m not yet following, social data is also very helpful for me (or a computer) to tell two similar-but-different contacts apart. Social profiles are starting to act more and more as a person’s public identifier, much like a Social Security number that you would actually *want* people to have. Filling out my contacts with social data makes it that much easier to merge duplicates, tell the difference between John Smith Jr. and John Smith Sr., and contact people in ways other than email, phone, or snail mail.
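To make the idea of a social profile acting as a public identifier concrete, here is a minimal Python sketch. It is an illustration only, not FullContact’s actual method: the record layout, the `twitter` field, the handles, and the function name `merge_by_social_handle` are all assumptions. Records that share a handle are merged, while records with the same name but different handles stay separate.

```python
def merge_by_social_handle(contacts):
    """Merge contact records that share a Twitter handle (hypothetical helper).

    Records without a handle are left untouched, since there is nothing
    reliable to match them on in this simplified sketch.
    """
    merged = {}      # normalized handle -> combined record
    unmatched = []   # records with no handle to match on
    for contact in contacts:
        handle = contact.get("twitter")
        if not handle:
            unmatched.append(dict(contact))
            continue
        key = handle.lstrip("@").lower()
        if key not in merged:
            merged[key] = dict(contact)
        else:
            # Keep what we already have and fill in only the missing fields.
            for field, value in contact.items():
                merged[key].setdefault(field, value)
    return list(merged.values()) + unmatched


if __name__ == "__main__":
    address_book = [
        {"name": "John Smith", "twitter": "@jsmith_sr", "email": "john@example.com"},
        {"name": "John Smith", "twitter": "@jsmith_jr"},
        {"name": "J. Smith", "twitter": "jsmith_sr", "phone": "555-0100"},
    ]
    for record in merge_by_social_handle(address_book):
        print(record)
```

Matching on the name alone would collapse all three records into one person; keying on the handle keeps the two John Smiths apart while still folding the duplicate entry into the first record.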

3. What do you wish you knew a year ago about how people archive and share contact information?

Honestly, a year ago, the problem was staring us straight in the face: people *don’t* really archive and share contact information. Sharing has been too error prone for people to trust an automated system not to screw up their contacts. I’ve lost count of the number of times people share contact information by reading phone numbers from each other’s phone, yelling email addresses across the room, or emailing contact info back and forth with subject lines like “Bart Lorang’s phone number”. The problem is hard, and everyone has different expectations around the idea of sharing contact information. Many people want their contacts automatically kept up to date with changes in their co-workers’ address books. Others only want updates if the contact publicly changes his/her information. What should an automated system do if two of your colleagues share conflicting changes to one of your contacts? Ultimately we all just want the best way to get in touch with someone at a given time.

4. Contact information is considered decaying data. What are the challenges of working with decaying information?

The idea of decaying data is that the data you have *right* now is only a snapshot of the world at a given time. You could say that your data “decayed” if the real world has moved on and your database hasn’t caught up. This is a real problem with contact information. It changes constantly. People change jobs, change names, move, change phone carriers, and more. The challenge is keeping your address book up to date with all these changes. Many companies that work with contact information in bulk simply “punt” and apply a simple rule to their data, reducing their confidence in it by some percentage every year. I think that’s too heavy-handed and doesn’t work for the end user. At FullContact, we fundamentally believe that a person’s contact information is current until we find some other, newer piece of contact information that suggests otherwise. That means that we’re constantly searching the Internet for up-to-date information about your contacts.
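
The difference between the two policies can be sketched in a few lines. This is a rough illustration under stated assumptions, not FullContact's actual implementation: the Observation type, both function names, and the 20% yearly decay rate are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Observation:
    """One sighting of a contact field (hypothetical type for this sketch)."""
    value: str
    seen_on: date


def decayed_confidence(initial: float, yearly_decay: float, age_years: float) -> float:
    """Blanket rule: confidence shrinks by a fixed fraction every year."""
    return initial * (1.0 - yearly_decay) ** age_years


def current_value(observations: list[Observation]) -> str:
    """Alternative rule: a value stays current until a newer observation replaces it."""
    return max(observations, key=lambda o: o.seen_on).value


history = [
    Observation("jane@oldjob.example", date(2010, 3, 1)),
    Observation("jane@newjob.example", date(2012, 6, 15)),
]

two_year_confidence = decayed_confidence(1.0, 0.2, 2)  # roughly 0.64
latest_email = current_value(history)
```

Under the blanket rule, a two-year-old email address loses about a third of its confidence whether or not anything actually changed. Under the second rule, the old address is simply replaced once the newer one is observed.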

5. How do you think FullContact fits into the world of social media and how people are already obtaining contact information?

For the last couple of years, we’ve been seeing the social networks clamp down on their users’ contact information (often for good reason). We remember the spat between Google and Facebook over the ability to export your friends’ information. It’s easy to agree philosophically with elements of both arguments. To Facebook’s point, a person should be in control of her own contact information. To Google’s point, a person should be in control of her contacts, and has a reasonable expectation to get the same data back from a service that she put in. We think FullContact helps bridge this gap. We believe that you own your address book, but we also believe that you have a right to control what information about you is floating around out there on the Internet. We want you to have the most up-to-date picture of your contacts, but we want to give your contacts control over their own information.

Data Stories: Brooke Fisher Liu on Using Social Media in Natural Disasters

Data Stories is Gnip’s project to tell the stories of how social data is being used. This week we’re interviewing Brooke Fisher Liu from the University of Maryland about her research on how people use social media in natural disasters (PDF). You can follow Brooke on Twitter at @Bfliu. (Also, you can see our data scientist’s post on Twitter’s reaction to an earthquake in Mexico.)

Brooke Fisher Liu (photo courtesy of Anne McDonough)

1. When the wildfires broke out in Boulder, I found Twitter to be the best source of information hands down. What kind of information do you see people communicating about natural disasters?

During natural disasters, people tend to use social media for four interrelated reasons: checking in with family and friends, obtaining emotional support and healing, determining disaster magnitude, and providing first-hand disaster accounts. A consistent research finding is that, during disasters, people are less likely to follow official government sources on social media than to follow their friends and family. I think that may change over time as government sources become more savvy about effectively using social media during disasters.

2. How is curated content such as Storify changing how people communicate during disasters?

This is one area where the research hasn’t caught up with practice yet. However, I think that social media sites that curate content such as Storify, Pinterest, or even Instagram are going to be major players in disaster communication in the future. One of the reasons people don’t turn to social media for disaster information is that the quantity of information is difficult to sift through and verify. Sites that curate content help cut through the sea of online information, and also provide a familiar, reliable source of information through online connections established before disasters.

3. You talked about people mobilizing on social media after natural disasters in your report. Do you ever see people respond in real time?

Absolutely. Real-time communication is one of the primary draws of social media during disasters. There are multiple examples of social media being the first source of disaster information, such as during the 2011 Tuscaloosa tornadoes and the 2008 Mumbai terrorist attacks.

4. What surprised you the most about how people were using social media during natural disasters?

By far the biggest surprise is that people still turn to traditional media sources, especially broadcast journalism, as the most accurate source of disaster information. So, while they may first turn to social media, they still prefer traditional media during disasters. I think this may change over time, but it certainly was a surprise for me. Of course, journalists often rely on social media for disaster information, and I think over time we’ll see the distinction between traditional media and so-called new media blur even more.

5. How do you think the use of social media in natural disasters will evolve?

I think over time people will view social media as more trustworthy and thus turn to it as their primary source of information. I also think social media will continue to play a large role in facilitating disaster recovery by helping people connect with each other and rebuild communities. “Official sources” such as governments and the media will increasingly enhance their social media presence before disasters, which likely will position them to be not only the first, but also the most trustworthy social media sources down the road. Perhaps most importantly, I think social media will continue to surprise us by providing new communication capabilities during disasters that we can’t currently predict.

Observations On Disqus: The Spread of Words

Marketers and communicators all share a similar goal: to become part of the conversation. Comments in reaction to blogs and news stories are a fantastic place to discover the topics that are driving conversation. To dig deeper, we recently looked at public comments from Disqus, the world’s largest discussion platform, to see what was getting online chatter at the end of 2012. With 70,000 comments published on Disqus every hour, you can find insights and conversations that can’t be found elsewhere.

What we found is that communicators often use a language set that the audience does not share. In discussion, lowest-common-denominator language dominates. Let’s look at a couple of examples.

The Fiscal Cliff

Social Media Discussion of Fiscal Cliff

At the end of 2012, one topic that dominated mainstream publications and political blogs was the Fiscal Cliff, when a series of United States tax cuts was expected to expire at the end of the year. Since this was a topic of contention between the Democrats and Republicans, you would have expected it to be a passionate point of conversation during the election. As it turns out, this wasn’t exactly the case. When did the Fiscal Cliff talk start? The day after the election. And the discussion was couched in broader terms than just the acute “Fiscal Cliff” crisis. So while Washington operates and speaks in continual crisis mode, the public thinks of these challenges in broader, more systemic terms.

Disqus Conversations on Taxes and Medicare

While the Fiscal Cliff wasn’t a hot topic until after the election, taxes and Medicare saw consistent conversations before and after the election.

Timing is everything when it comes to starting conversations. While the election focused on what happened in the past four years and what would happen in the next four years, the day after the election the conversation homed in on what was immediately down the road: the Fiscal Cliff.

Skyfall vs Breaking Dawn vs Twilight

Skyfall vs Breaking Dawn vs Twilight on Disqus

Moving from politics to pop culture, we were curious what would generate more conversation — a bunch of sparkly vampires driving Volvos (the movie Breaking Dawn, the fourth installment in the Twilight series) or the eponymous spy from England (Skyfall). We were initially surprised to see that Skyfall generated more chatter around its premiere on Nov. 9 than Breaking Dawn saw for its premiere on Nov. 18. However, when we took a closer look by adding the term Twilight into the mix, we found that Twilight created more chatter than Skyfall.

Comments are an excellent barometer of buzz around upcoming events and launches. Even more than that, comments can help companies understand what terms people use about an event. In this example, if you were the studio marketer using content marketing to promote the release of Breaking Dawn, your odds would improve by using Twilight in your headline.

Movie vs Libya vs Benghazi

While searching for popular movies on Disqus, we found an interesting spike for the term “movie” in mid-September, but couldn’t attribute it to a popular movie. After some digging, we realized that this was related to “Innocence of Muslims,” the controversial movie spoofing Islam. While the movie was originally uploaded to YouTube in July, it aired on an Egyptian network on Sept. 9, which immediately sparked protests that quickly spread to Libya. On Sept. 11, four Americans, including the Ambassador, were killed in Benghazi, Libya. While the terms Libya and movie spiked immediately, Benghazi built up momentum more slowly, spiking right before the election as it became part of the political debate between the two parties.

Buzz around current events doesn’t always spike immediately after the event. As new facts and information are disseminated, the current of conversation can change. In this scenario, a new and more specific term, “Benghazi,” came to dominate the conversation as it slowly became shorthand for the overall issue. What carries conversation is language that accelerates understanding and lowers the barrier for participation.

Ultimately, comments are windows into not only what people are talking about but also when topics tip over into public consciousness and what the driving forces are behind when conversations peak. In the same way that communicators deploy search engine optimization to target searchers, they also need to incorporate conversation optimization strategies to become part of the conversation.