Social Data vs Social Media

One area where I see a lot of confusion is the difference between social media and social data. I come from a social media background and use social media in marketing, so I see where the confusion can come from.

The easiest way to think about it in plain English:

  • Social Media: User-generated content where one user communicates and expresses themselves and that content is delivered to other users. Examples of this are platforms such as Twitter, Facebook, YouTube, Tumblr and Disqus. Social media is delivered in a great user experience, and is focused on sharing and content discovery. Social media also offers both public and private experiences with the ability to share messages privately.

  • Social Data: Expresses social media in a computer-readable format (e.g. JSON) and shares metadata about the content to help provide not only content, but context. Metadata often includes information about location, engagement and links shared. Unlike social media, social data is focused strictly on publicly shared experiences.

Boiled down, social media is readable by humans and made for human interaction, while social data is social media that is readable by computers.

Let’s look at a Tweet in the form of social media and of social data to show exactly what I’m talking about.

In this Tweet from Gnip, we can see at a glance that it uses the #BigBoulder hashtag and a Bit.ly link to our Storify page, that it has 73 retweets and 3 favorites, and the time and date it was posted.

 

Now let’s take a look at what the architecture of a Tweet looks like when received from an API.

 

{
   "body": "RT @gnip: Thrilled to welcome all #BigBoulder attendees! Watch the social story unfold on our Storify page. http://t.co/ZzqUMfJz",
   "retweetCount": 71,
   "generator": {
      "link": "http://twitter.com",
      "displayName": "web"
   },
   "gnip": {
      "klout_score": 53,
      "matching_rules": [
         {
            "tag": "old krusty tweet",
            "value": "thrilled to welcome all attendees"
         }
      ],
      "language": {
         "value": "en"
      },
      "urls": [
         {
            "url": "http://t.co/ZzqUMfJz",
            "expanded_url": "http://storify.com/Gnip/big-boulder"
         }
      ]
   },
   "object": {
      "body": "Thrilled to welcome all #BigBoulder attendees! Watch the social story unfold on our Storify page. http://t.co/ZzqUMfJz",
      "generator": {
         "link": "http://www.tweetdeck.com",
         "displayName": "TweetDeck"
      },
      "object": {
         "postedTime": "2012-06-20T18:07:13.000Z",
         "summary": "Thrilled to welcome all #BigBoulder attendees! Watch the social story unfold on our Storify page. http://t.co/ZzqUMfJz",
         "link": "http://twitter.com/gnip/statuses/215506104082366465",
         "id": "object:search.twitter.com,2005:215506104082366465",
         "objectType": "note"
      },
      "actor": {
         "preferredUsername": "gnip",
         "displayName": "Gnip, Inc.",
         "links": [
            {
               "href": "http://gnip.com",
               "rel": "me"
            }
         ],
         "twitterTimeZone": "Mountain Time (US & Canada)",
         "image": "http://a0.twimg.com/profile_images/1347133706/Gnip_logo-73x73_normal.png",
         "verified": true,
         "location": {
            "displayName": "Boulder, CO",
            "objectType": "place"
         },
         "statusesCount": 971,
         "summary": "Gnip is the leading provider of social media data for enterprise applications, facilitating access to dozens of social media sources through a single API",
         "languages": [
            "en"
         ],
         "utcOffset": "-25200",
         "link": "http://www.twitter.com/gnip",
         "followersCount": 3335,
         "favoritesCount": 108,
         "friendsCount": 384,
         "listedCount": 212,
         "postedTime": "2008-10-24T23:22:09.000Z",
         "id": "id:twitter.com:16958875",
         "objectType": "person"
      },
      "twitter_entities": {
         "user_mentions": [],
         "hashtags": [
            {
               "indices": [
                  24,
                  35
               ],
               "text": "BigBoulder"
            }
         ],
         "urls": [
            {
               "indices": [
                  98,
                  118
               ],
               "url": "http://t.co/ZzqUMfJz",
               "expanded_url": "http://bit.ly/MumrVJ",
               "display_url": "bit.ly/MumrVJ"
            }
         ]
      },
      "verb": "post",
      "link": "http://twitter.com/gnip/statuses/215506104082366465",
      "provider": {
         "link": "http://www.twitter.com",
         "displayName": "Twitter",
         "objectType": "service"
      },
      "postedTime": "2012-06-20T18:07:13.000Z",
      "id": "tag:search.twitter.com,2005:215506104082366465",
      "objectType": "activity"
   },
   "actor": {
      "preferredUsername": "daveheal",
      "displayName": "Dave Heal",
      "links": [
         {
            "href": "http://daveheal.com",
            "rel": "me"
         }
      ],
      "twitterTimeZone": "Mountain Time (US & Canada)",
      "image": "http://a0.twimg.com/profile_images/1755125722/photo_2_normal.JPG",
      "verified": false,
      "location": {
         "displayName": "Boulder, CO",
         "objectType": "place"
      },
      "statusesCount": 5657,
      "summary": "Boulder resident. Rochester NY native. Michigan Law graduate. Copyright enthusiast. Liker of sports. DFW fanboy. CrossFitter. Work @Gnip. ",
      "languages": [
         "en"
      ],
      "utcOffset": "-25200",
      "link": "http://www.twitter.com/daveheal",
      "followersCount": 671,
      "favoritesCount": 28,
      "friendsCount": 292,
      "listedCount": 26,
      "postedTime": "2009-03-02T01:18:39.000Z",
      "id": "id:twitter.com:22432819",
      "objectType": "person"
   },
   "twitter_entities": {
      "user_mentions": [
         {
            "indices": [
               3,
               8
            ],
            "id": 16958875,
            "screen_name": "gnip",
            "id_str": "16958875",
            "name": "Gnip, Inc."
         }
      ],
      "hashtags": [
         {
            "indices": [
               34,
               45
            ],
            "text": "BigBoulder"
         }
      ],
      "urls": [
         {
            "indices": [
               108,
               128
            ],
            "url": "http://t.co/ZzqUMfJz",
            "expanded_url": "http://bit.ly/MumrVJ",
            "display_url": "bit.ly/MumrVJ"
         }
      ]
   },
   "verb": "share",
   "link": "http://twitter.com/daveheal/statuses/215509188481253376",
   "provider": {
      "link": "http://www.twitter.com",
      "displayName": "Twitter",
      "objectType": "service"
   },
   "postedTime": "2012-06-20T18:19:29.000Z",
   "id": "tag:search.twitter.com,2005:215509188481253376",
   "objectType": "activity"
}

This is social data. Same content, very different format, very different context and very different end user.
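
To make "readable by computers" concrete, here is a minimal Python sketch that reads a few fields back out of the payload above. The `raw` string is a trimmed copy of that payload, and `summarize` is a hypothetical helper name of my own. Treating a `verb` of "share" as "this is a retweet" is an assumption based on this one example.

```python
import json

# Trimmed copy of the payload above; only a handful of its fields are kept.
raw = '''
{
  "body": "RT @gnip: Thrilled to welcome all #BigBoulder attendees! Watch the social story unfold on our Storify page. http://t.co/ZzqUMfJz",
  "retweetCount": 71,
  "verb": "share",
  "postedTime": "2012-06-20T18:19:29.000Z",
  "actor": {"preferredUsername": "daveheal", "followersCount": 671},
  "twitter_entities": {
    "hashtags": [{"indices": [34, 45], "text": "BigBoulder"}]
  }
}
'''

activity = json.loads(raw)


def summarize(activity):
    """Collect the facts a person would read off the rendered Tweet."""
    hashtags = [h["text"] for h in activity["twitter_entities"]["hashtags"]]
    return {
        "user": activity["actor"]["preferredUsername"],
        # Assumption: "share" marks a retweet, "post" an original Tweet.
        "is_retweet": activity["verb"] == "share",
        "retweets": activity["retweetCount"],
        "hashtags": hashtags,
        "posted": activity["postedTime"],
    }


summary = summarize(activity)
print(summary)

# The "indices" pair points at the hashtag's position inside "body".
start, end = activity["twitter_entities"]["hashtags"][0]["indices"]
print(activity["body"][start:end])
```

Nothing here requires looking at the Tweet. Every value a person would spot visually is available by key.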

So what exactly goes into the social data of a Tweet? To start, here is some of the metadata that you’re seeing.

  • Language identification: The language of this Tweet is detected as English. Language identification matters for social media monitoring because it lets companies track only the content they want.

  • URL expansion: This resolves a shortened URL to the final URL that a consumer would see in their browser window. In this case, http://storify.com/Gnip/big-boulder is the link we shared using bitly.

  • Content: Gnip shows the full content of the Tweeted message, as well as metadata about the Tweet, such as the hashtags and URLs used, the users who were mentioned, and when it was posted.

  • User: Gnip provides the display name, username, stated location and additional bio information of the Tweeter. This is the information that users decide to share when signing up for an account.

  • Klout scores: An additional piece of metadata Gnip can provide is the Klout score, so a client who only wanted to see Tweets from users with a Klout score of 30 or higher could do that.
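
As a rough illustration of how a consumer might use these enrichments, the Python sketch below filters activities by Klout score and language and pulls out the expanded URLs. The field names come from the `gnip` block in the payload above. The helper names (`passes_filter`, `expanded_urls`) and the second, low-score activity are made up for the example, and the filtering is done client-side here purely for illustration, not as a description of how Gnip applies it.

```python
def passes_filter(activity, min_klout=30, language="en"):
    """True when the activity meets the Klout threshold and language."""
    gnip = activity.get("gnip", {})
    klout = gnip.get("klout_score")
    lang = gnip.get("language", {}).get("value")
    return klout is not None and klout >= min_klout and lang == language


def expanded_urls(activity):
    """Return the fully resolved URLs from the gnip enrichment block."""
    urls = activity.get("gnip", {}).get("urls", [])
    return [u["expanded_url"] for u in urls if "expanded_url" in u]


activities = [
    {   # Values taken from the payload above.
        "gnip": {
            "klout_score": 53,
            "language": {"value": "en"},
            "urls": [{
                "url": "http://t.co/ZzqUMfJz",
                "expanded_url": "http://storify.com/Gnip/big-boulder",
            }],
        }
    },
    {   # Hypothetical activity that falls below the threshold.
        "gnip": {"klout_score": 12, "language": {"value": "en"}}
    },
]

kept = [a for a in activities if passes_filter(a)]
print(len(kept))
print(expanded_urls(kept[0]))
```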

Beyond Twitter data, Gnip offers social data from Tumblr, Disqus, Automattic (WordPress) and other publishers, all of which have their own unique metadata and enrichments. In addition to enrichments, Gnip offers format normalization. This means that whether you’re looking at a WordPress blog post or a Tweet, the data is normalized no matter what the platform. For example, date and location are formatted the same way and located in the same place within the JSON payload, making it easy to consume and parse data from multiple sources.
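
Here is a small Python sketch of why normalization helps: the same two functions handle activities from different publishers because date and location sit in the same place. The `postedTime` format and the `actor.location.displayName` path are taken from the Tweet payload above. The WordPress activity is entirely hypothetical, and I am assuming it would carry those same fields.

```python
from datetime import datetime, timezone


def posted_at(activity):
    """Parse the normalized postedTime field into a UTC datetime."""
    parsed = datetime.strptime(activity["postedTime"], "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def actor_location(activity):
    """Return the author's stated location, or None if it is missing."""
    return activity.get("actor", {}).get("location", {}).get("displayName")


tweet = {  # Values taken from the payload above.
    "postedTime": "2012-06-20T18:19:29.000Z",
    "actor": {"location": {"displayName": "Boulder, CO"}},
    "provider": {"displayName": "Twitter"},
}
blog_post = {  # Hypothetical WordPress activity, assumed to share the layout.
    "postedTime": "2012-06-21T09:30:00.000Z",
    "actor": {"location": {"displayName": "Denver, CO"}},
    "provider": {"displayName": "WordPress"},
}

# One sort key works for both sources; no per-platform parsing is needed.
timeline = sorted([blog_post, tweet], key=posted_at)
for item in timeline:
    print(item["provider"]["displayName"], posted_at(item), actor_location(item))
```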

Finally, a big difference is in how people use social data versus social media. Social data is what powers social media monitoring and analytics companies. It is combined with other data sets in business intelligence, used by hedge funds as part of their trading algorithms, and even used to take a top-level look at what is happening during a natural disaster.