SEC “Likes” Social Media

If there’s one person in the world who gets the intersection of the social web with investing, it’s Howard Lindzon. So when the SEC ruled Tuesday that company postings on sites like StockTwits/Facebook/Twitter were as good as news releases and company websites (as long as investors were aware of the use of those sites), I immediately turned to Howard for his thoughts. And sure enough, he had a great post today.

One of the most powerful points he made spoke to the fine line that StockTwits walks as a finance social site. They carefully split the difference between 1.) allowing managed/curated content, tools and control necessary for compliance and governance and 2.) enabling the spontaneous, multi-source, lightning-quick conversation paradigm that makes social media so incredible. As he put it:

Rules matter and if they are clearly stated and thoughtfully enforced, communities can thrive (learn, mentor, make a little coin). We [ed. StockTwits] added some basic financial features like the ability to create disclosures/tracking and the removal of the delete function to ensure trust is at the forefront.  No matter what others call us or think, Stocktwits is a NEWSWIRE. Information is flowing from one to many, all day, every day and it is full of context.

The social web will continue to grow and the power of the content being created on that web will continue to impact even the most regulated industries. How other platforms can adapt and fuel that change, like StockTwits has done, will be fascinating to watch.

Check out the full blog here.

Commercial Evolution of Social Networks

Over the past four years Gnip has seen many social services come and go. Not surprisingly, a pattern has emerged in how they evolve and in the degree to which our customers need their public data. A social service generally goes through three distinct phases, and how it does in each phase determines how it ultimately participates in the broader public social data ecosystem, which can complete a full commercial cycle. That cycle combines consumer use (often buying intent, or expression) with commercial engagement (identifying need in time of natural disaster, or ad buying).

Phase 1: Consumer Engagement
A social service must engage us, the end users/consumers. Whether via a homegrown social graph or by leveraging someone else’s (e.g. Facebook Connect), a social service needs users in order to become useful. From there, those users need to participate in self-expression (from posting a comment, to retweeting a tweet) and generate activity on the service. There are a variety of ways to compel us users to engage in a social service, but the social service itself is solely responsible for the first experience. The vision of the service’s founders yields a web-app or mobile interface that allows us to take action, leveraging the expressions laid out by the app itself (e.g. sharing a photo). If users like the expressions, discovery methods, and sense of “connectedness,” you’ve got a relevant social service on your hands.

Phase 2: APIs; Outsourcing Engagement
At some point a successful social service realizes the potential for outsourcing the expression metaphors that make the service successful & useful, and it constructs an API that allows others to RESTfully engage with the service. In some instances the API is read-only, in others write-only, and sometimes both. What is key is that nine times out of ten, the API is meant to drive core service engagement via other user-facing applications. A classic example of this would be the zillions of non-Twitter Inc clients that “Tweet” on our behalf every day. One look at the endless number of Tweet “sources” that flow through the Firehose and you’ll realize this engagement potential.

The exceptional API is one that has broader social data engagement ecosystem consumption in its DNA. Typical social services consider themselves the center of the universe, and assume that not only will they capture all consumer engagement, but they will be the root of all broader ecosystem engagement as well. However, success with Consumer Engagement does not guarantee commercial engagement, not by a long shot.

Some services execute phase 1 and 2 simultaneously these days.

Phase 3: Activity Transparency; Commercial Engagement
Allowing other applications & developers to inject activities into the core service is obviously valuable; however, it is only part of the picture. Social services with broad social and commercial impact have achieved this by addressing commercial needs for complete, raw, activity availability. For example, in order for someone to deploy resources in a disaster relief scenario effectively, they need to make their own determination as to what victims need, where they are located, and general conditions surrounding the event. A social service that limits access to the activities taking place on it, by definition, yields an incomplete picture to downstream commercial consumers of the content. The result is a fragmented & hobbled experience for commercial engagement.

Another key component to commercial engagement is realizing that the ecosystem of data analytics and insights is well established, complex, and interwoven. Massive investments have been made in the market over the years, and brands want to leverage that fact. It is illogical for a social service to address the endless needs of the enterprise by building its own tools. Attempts to supplement this market come at the potential expense of losing focus on building a great consumer experience.

The most impactful, useful, and valuable social services that Gnip customers leverage for their needs (ad buying, campaign running, stock trading, disaster relief), are those that acknowledge that they are not an island in the ecosystem. They complete the cycle by providing unfettered access to one of their most significant assets. In trade, the relevance of the social service itself is maximized because commerce can engage with it.

A good example of how impactful this transparency can be is Twitter. Consider how Twitter is used across new, as well as traditional, media. They’ve completed the cycle with a strong offering of Phase 3.

Not all three phases are required for success, but all three are indeed required for success in the broader public commercial social data ecosystem.

Power Up Those Analytics: Alteryx Gets Plugged In

Earlier this morning, George Mathew – President & COO of Alteryx – announced during his speech at the Inspire 2013 conference that Alteryx is now a Plugged In To Gnip partner.

The journey of this partnership is one that we’ve enjoyed, and it has been a natural fit from the beginning. Alteryx is a desktop-to-cloud analytics solution widely recognized for both ease of use (even across a wide range of user skills) and powerful capabilities that can handle complex problems. With those characteristics, an integration with social media data just made sense!

To top it off, Alteryx has an office and a vocal presence here in Boulder, so we’ve gotten to know many of the members of their team, and have been consistently impressed with their enthusiasm for big, social data and – most importantly – the creative and valuable insights that the right analytics tool can derive from that data.

The release of Alteryx Strategic Analytics 8.5 is a great example of what this partnership means for enterprises that need to incorporate social analysis quickly and simply. Within 8.5, users benefit from a pre-built integration with Gnip. As an Alteryx user, you enter your Gnip credentials within the Alteryx tool to access your Twitter stream. You can then plug that social analysis tool into the multi-step algorithm or joined data analytics you are creating. This type of integration enables business users from across an entire enterprise to leverage social data in their everyday decision making by simply adding this tool to their analytic workflow.

We’re very excited to have Alteryx in our Plugged In program, and look forward to working together.

Gnip Announces Partnership with Leader in Japanese Social Media Analytics

With more than 10% of the Twitter firehose in Japanese, the Japanese market for social data is a huge opportunity. This is why Gnip is excited to announce that we’re partnering with Hottolink as part of a strategic alliance to better serve Twitter data in Japan.

Through the alliance, Hottolink will have access to Gnip’s suite of products that serve data from Twitter’s full firehose. This data will power Hottolink’s social media listening platform with ongoing and historical access to Tweets in Japanese and every other language. By partnering with Hottolink, Gnip will have access to Hottolink’s technology and expertise, enabling Gnip to better meet the needs of the Japanese market.

Japan is the third-largest Tweeting population in the world with more than 30 million accounts and has some of the most active users in the world.  In fact, the world record for tweets-per-second was set in December 2011 during the television broadcast of the Japanese anime movie “Castle in the Sky,” with 25,088 tweets.

In Japan, they call a Tweet a “mumble” but the signal from Japanese language Tweets is loud and clear.  If you’re interested in learning more, please check out the press release (also in Japanese!) or email info@gnip.com.

Gnip and Automattic Make Whole New Universe of Data Available

“This new data from Automattic is a big addition and a testament to Gnip’s commitment to drive the social data economy forward. This is an important source to add to the social data mix, one that we know our customers will take full advantage of.”

- Rob Begg, VP Marketing of Radian6

As social media data becomes more and more important across a range of businesses, our customers are asking for access to more data sources to give them a more complete picture of the social media conversations that are relevant to their businesses.

Today, we’re excited to announce a major addition to our coverage of the conversations taking place on blogs around the world. We’re expanding our relationship with Automattic to make a whole new universe of blog and comment data available to the market for the first time anywhere.

For those who don’t know, Automattic is a network of web services including WordPress.com, VIP hosting and support, Polldaddy, IntenseDebate, and Jetpack. We’ve been delivering data from WordPress.com and IntenseDebate for about a year and a half and found that while our customers loved their data, they always wanted more.

As of today, we are now offering the full firehose of blog posts and comments from Jetpack-powered WordPress.org sites, as well as engagement streams of “likes” from WordPress.com and IntenseDebate. The new data from WordPress.org greatly increases the coverage available to those who are looking to do deep analysis of blog posts and comments. The new engagement streams enable companies to pull in reaction data to quickly understand sentiment, relevance and resonance. With this they can gauge the intensity of opinion around fast moving blog and comment conversations, helping prioritize critical response.

Being full firehoses, all of the streams from Automattic ensure 100% coverage in realtime, giving customers the peace of mind that they can keep up with the entire discussion on fast moving threads.

The scope of coverage offered by Automattic is pretty incredible.  Check out some of these stats:

We’re thrilled to be able to offer these new data streams to our customers and can’t wait to see the amazing things they’ll be able to do with them.

Updated: Coverage in GigaOM – Gnip and WordPress deepen ties, expand data partnership

Launching Gnip MarketStream & Partnership with StockTwits

While the market has been on its roller coaster ride across the past month, Gnip has kept its collective head down and stayed busy on behalf of our Investment Management clients (hedge funds, HFTs, asset managers, etc.). That hard work has paid off and we have two exciting announcements to make today.

  • Launch of Gnip MarketStream: Our hedge fund clients have been quite vocal in their desire for a package incorporating the most relevant social data streams into a single low-latency, high-volume solution. We’re proud to answer their needs with the launch of Gnip MarketStream, a realtime data solution that packages the incredibly rich and broad “voice of the market” Twitter stream with the uniquely deep and targeted “voice of the trader” StockTwits stream.
  • Premium Partnership with StockTwits: An integral component of the Gnip MarketStream is StockTwits social media data. We’re thrilled to announce this partnership with StockTwits, the leading realtime financial platform for the investment community and creator of the $(TICKER) tag. The StockTwits stream is a curated, defined-demographic, realtime social data stream focused on investment decisions and analysis. Gnip now provides streaming access to the full StockTwits firehose of social data, and offers access to historical content as far back as 2009.

While the use of social media data by the investment community has included use of this data in news analysis and equity research, the primary adoption of this data across the last six months has been as a trading indicator. By combining the strengths of both the Twitter stream and the StockTwits stream, Gnip MarketStream provides investment professionals unparalleled access to relevant social data at a time when social media has become an increasingly vital channel for news and market sentiment.

For more information about Gnip MarketStream or StockTwits data, contact trading@gnip.com.

Guide to the Twitter API – Part 3 of 3: An Overview of Twitter’s Streaming API

The Twitter Streaming API is designed to deliver limited volumes of data via two main types of realtime data streams: sampled streams and filtered streams. Many users like to use the Streaming API because the streaming nature of the data delivery means that the data is delivered closer to realtime than it is from the Search API (which I wrote about last week). But the Streaming API wasn’t designed to deliver full coverage results and so has some key limitations for enterprise customers. Let’s review the two types of data streams accessible from the Streaming API.

The first type of stream is “sampled streams.” Sampled streams deliver a random sampling of Tweets at a statistically valid percentage of the full 100% Firehose. The free access level to the sampled stream is called the “Spritzer” and Twitter has it currently set to approximately 1% of the full 100% Firehose. (You may have also heard of the “Gardenhose,” or a randomly sampled 10% stream. Twitter used to provide some increased access levels to businesses, but announced last November that they’re not granting increased access to any new companies and gradually transitioning their current Gardenhose-level customers to Spritzer or to commercial agreements with resyndication partners like Gnip.)

The second type of data stream is “filtered streams.” Filtered streams deliver all the Tweets that match a filter you select (e.g. keywords, usernames, or geographical boundaries). This can be very useful for developers or businesses that need limited access to specific Tweets.

Because the Streaming API is not designed for enterprise access, however, Twitter imposes some restrictions on its filtered streams that are important to understand. First, the volume of Tweets accessible through these streams is limited so that it will never exceed a certain percentage of the full Firehose. (This percentage is not publicly shared by Twitter.) As a result, only low-volume queries can reliably be accommodated. Second, Twitter imposes a query limit: currently, users can query for a maximum of 400 keywords and only a limited number of usernames. This is a significant challenge for many businesses. Third, Boolean operators are not supported by the Streaming API like they are by the Search API (and by Gnip’s API). And finally, there is no guarantee that Twitter’s access levels will remain unchanged in the future. Enterprises that need guaranteed access to data over time should understand that building a business on any free, public APIs can be risky.
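To make the query limit concrete, here is a minimal Python sketch of how a client might check a keyword list against the 400-keyword cap mentioned above before opening a filtered stream. The helper names (`check_track_list`, `matches_any`) are hypothetical and not part of any Twitter or Gnip library, and the plain substring OR-matching is only a local stand-in for filtering without Boolean operators; Twitter's actual matching rules may differ.

```python
# Limit quoted in this post; treat it as an assumption that may have changed.
MAX_TRACK_KEYWORDS = 400


def check_track_list(keywords, limit=MAX_TRACK_KEYWORDS):
    """Normalize and deduplicate keywords, then split them at the limit.

    Returns (kept, overflow): the keywords that fit under the limit and
    the ones that would have to be dropped or tracked some other way.
    """
    seen = set()
    unique = []
    for kw in keywords:
        norm = kw.strip().lower()
        if norm and norm not in seen:
            seen.add(norm)
            unique.append(norm)
    return unique[:limit], unique[limit:]


def matches_any(text, keywords):
    """Plain OR-matching: True if any keyword appears in the text."""
    lowered = text.lower()
    return any(kw in lowered for kw in keywords)


if __name__ == "__main__":
    wanted = [f"brand{i}" for i in range(450)]
    kept, overflow = check_track_list(wanted)
    print(len(kept), "keywords fit;", len(overflow), "do not")
```

A list of 450 brand names does not fit in a single filtered stream under this limit, which is the kind of challenge many businesses run into.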

The Search API and Streaming API are great ways to gather a sampling of social media data from Twitter. We’re clearly fans over here at Gnip; we actually offer Search API access through our Enterprise Data Collector. And here’s one more cool benefit of using Twitter’s free public APIs: those APIs don’t prohibit display of the Tweets you receive to the general public like premium Twitter feeds from Gnip and other resyndication partners do.

But whether you’re using the Search API or the Streaming API, keep in mind that those feeds simply aren’t designed for enterprise access. And as a result, you’re using the same data sets available to anyone with a computer, your coverage is unlikely to be complete, and Twitter reserves the right to change the data accessibility or Terms of Use for those APIs at any time.

If your business dictates a need for full coverage data, more complex queries, an agreement that ensures continued access to data over time, or enterprise-level customer support, then we recommend getting in touch with a premium social media data provider like Gnip. Our complementary premium Twitter products include Power Track for data filtered by keyword or other parameters, and Decahose and Halfhose for randomly sampled data streams (10% and 50%, respectively). If you’d like to learn more, we’d love to hear from you at sales@gnip.com or 888.777.7405.

Guide to the Twitter API – Part 2 of 3: An Overview of Twitter’s Search API

The Twitter Search API can theoretically provide full coverage of ongoing streams of Tweets. That means it can, in theory, deliver 100% of Tweets that match the search terms you specify almost in realtime. But in reality, the Search API is not intended for, and does not fully support, the repeated constant searches that would be required to deliver 100% coverage.

Twitter has indicated that the Search API is primarily intended to help end users surface interesting and relevant Tweets that are happening now. Since the Search API is a polling-based API, the rate limits that Twitter has in place impact the ability to get full coverage streams for monitoring and analytics use cases. To get data from the Search API, your system may repeatedly ask Twitter’s servers for the most recent results that match one of your search queries. On each request, Twitter returns a limited number of results to the request (for example “latest 100 Tweets”). If there have been more than 100 Tweets created about a search query since the last time you sent the request, some of the matching Tweets will be lost.
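The loss described above can be illustrated with a small Python simulation. This is a sketch under stated assumptions, not a model of Twitter's actual service: `poll_latest` is a hypothetical stand-in for one search request that returns only the newest 100 matching Tweets, Tweet ids are simply increasing integers, and the client polls once per interval.

```python
def poll_latest(all_tweet_ids, page_size=100):
    """Hypothetical stand-in for one search request: the newest ids only."""
    return all_tweet_ids[-page_size:]


def simulate(arrivals_per_interval, page_size=100):
    """Return (seen, missed) for a client that polls once per interval.

    arrivals_per_interval: how many matching Tweets are created between
    consecutive polls, e.g. [50, 80, 250, 60].
    """
    timeline = []   # every matching Tweet id created so far
    last_seen = 0   # highest id the client has already collected
    next_id = 1
    seen = 0
    for count in arrivals_per_interval:
        timeline.extend(range(next_id, next_id + count))
        next_id += count
        page = poll_latest(timeline, page_size)
        new = [tweet_id for tweet_id in page if tweet_id > last_seen]
        seen += len(new)
        if new:
            last_seen = new[-1]
    missed = (next_id - 1) - seen
    return seen, missed


if __name__ == "__main__":
    # A spike of 250 Tweets in the third interval overflows the 100-result page.
    print(simulate([50, 80, 250, 60]))
```

In this run, the quiet intervals are collected in full, but the spike interval only yields its newest 100 Tweets; the older 150 are never returned by a later poll.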

So . . . can you just make requests for results more frequently? Well, yes, you can, but the total number of requests you’re allowed to make per unit time is constrained by Twitter’s rate limits. Some queries are so popular (hello “Justin Bieber”) that it can be impossible to make enough requests to Twitter for that query alone to keep up with this stream. And this is only the beginning of the problem as no monitoring or analytics vendor is interested in just one term; many have hundreds to thousands of brands or products to monitor.

Let’s consider a couple of examples to clarify. First, say you want all Tweets mentioning “Coca Cola” and only that one term. There might usually be fewer than 100 matching Tweets per second, but if there’s a spike (say that term becomes a trending topic after a Super Bowl commercial), then there will likely be more than 100 per second. If, because of Twitter’s rate limits, you’re only allowed to send one request per second, you will have missed some of the Tweets generated at the most critical moment of all.

Now, let’s be realistic: you’re probably not tracking just one term. Most of our customers are interested in tracking somewhere between dozens and hundreds of thousands of terms. If you add 999 more terms to your list, then you’ll only be checking for Tweets matching “Coca Cola” once every 1,000 seconds. And in 1,000 seconds, there could easily be more than 100 Tweets mentioning your keyword, even on an average day. (Keep in mind that there are over a billion Tweets per week nowadays.) So, in this scenario, you could easily miss Tweets if you’re using the Twitter Search API. It’s also worth bearing in mind that the Tweets you do receive won’t arrive in realtime because you’re only querying for the Tweets every 1,000 seconds.
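The arithmetic in these two examples can be written down as a short Python sketch. It assumes, as the examples do, a budget of one request per second shared round-robin across all terms and a page of 100 results per request; the function names and the per-term Tweet rates are hypothetical, chosen only for illustration.

```python
def poll_interval_seconds(num_terms, requests_per_second=1.0):
    """Seconds between two polls of the same term under round-robin polling."""
    return num_terms / requests_per_second


def best_case_coverage(tweets_per_second, num_terms,
                       page_size=100, requests_per_second=1.0):
    """Upper bound on the fraction of a term's Tweets one poll can return."""
    interval = poll_interval_seconds(num_terms, requests_per_second)
    created = tweets_per_second * interval
    if created == 0:
        return 1.0
    return min(1.0, page_size / created)


if __name__ == "__main__":
    # 1,000 terms at one request per second: each term is checked every 1,000 s.
    print(poll_interval_seconds(1000))
    # A term averaging one Tweet every two seconds creates 500 Tweets per cycle.
    print(best_case_coverage(0.5, 1000))
```

With 1,000 terms the interval is 1,000 seconds, so a term that averages just one Tweet every two seconds already produces five times more Tweets per cycle than a single 100-result request can return.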

Because of these issues related to the monitoring use cases, data collection strategies relying exclusively on the Search API will frequently deliver poor coverage of Twitter data. Also, be forewarned, if you are working with a monitoring or analytics vendor who claims full Twitter coverage but is using the Search API exclusively, you’re being misled.

Although coverage is not complete, one great thing about the Twitter Search API is the complex operator capabilities it supports, such as Boolean queries and geo filtering. Some people opt to use the Search API to collect a sampling of Tweets that match their search terms precisely because of those capabilities. Because these filtering features have been so well liked, Gnip has replicated many of them in our own premium Twitter API (made even more powerful by the full coverage and unique data enrichments we offer).

So, to recap, the Twitter Search API offers great operator support but you should know that you’ll generally only see a portion of the total Tweets that match your keywords and your data might arrive with some delay. To simplify access to the Twitter Search API, consider trying out Gnip’s Enterprise Data Collector; our “Keyword Notices” feed retrieves, normalizes, and deduplicates data delivered through the Search API. We can also stream it to you so you don’t have to poll for your results. (“Gnip” reverses the “ping,” get it?)
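Deduplication matters here because consecutive polls of the same query usually overlap. The Python sketch below shows one simple way to do it, keyed on a Tweet id; it is an illustration only, not how the Enterprise Data Collector is implemented, and the `"id"` field name is an assumption about the shape of each result.

```python
def dedupe(pages):
    """Yield each result once across overlapping pages, in first-seen order.

    pages: an iterable of pages, where each page is a list of dicts that
    carry an "id" key (an assumed field name for this sketch).
    """
    seen_ids = set()
    for page in pages:
        for result in page:
            if result["id"] not in seen_ids:
                seen_ids.add(result["id"])
                yield result


if __name__ == "__main__":
    first_poll = [{"id": 1, "text": "a"}, {"id": 2, "text": "b"}]
    second_poll = [{"id": 2, "text": "b"}, {"id": 3, "text": "c"}]
    for tweet in dedupe([first_poll, second_poll]):
        print(tweet["id"], tweet["text"])
```

Because `dedupe` is a generator, results can be passed downstream as they arrive rather than after all polling has finished.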

But the only way to ensure you receive full coverage of Tweets that match your filtering criteria is to work with a premium data provider (like us! blush…) for full coverage Twitter firehose filtering. (See our Power Track feed if you’d like more info on that.)

Stay tuned for Part 3, our overview of Twitter’s Streaming API coming next week…

Guide to the Twitter API – Part 1 of 3: An Introduction to Twitter’s APIs

You may find yourself wondering . . . “What’s the best way to access the Twitter data I need?” Well, the answer depends on the type and amount of data you are trying to access. Given that there are multiple options, we have designed a three-part series of blog posts that explain the differences between the coverage the general public can access and the coverage available through Twitter’s resyndication agreement with Gnip. Let’s dive in . . .

Understanding Twitter’s Public APIs . . . You Mean There is More than One?

In fact, there are three Twitter APIs: the REST API, the Streaming API, and the Search API. Within the world of social media monitoring and social media analytics, we need to focus primarily on the latter two.

  1. Search API – The Twitter Search API is a dedicated API for running searches against the index of recent Tweets.
  2. Streaming API – The Twitter Streaming API allows high-throughput, near-realtime access to various subsets of Twitter data (e.g. 1% random sampling of Tweets, filtering for up to 400 keywords, etc.).

Whether you get your Twitter data from the Search API, the Streaming API, or through Gnip, only public statuses are available (and NOT protected Tweets). Additionally, before Tweets are made available to both of these APIs and Gnip, Twitter applies a quality filter to weed out spam.

So now that you have a general understanding of Twitter’s APIs . . . stay tuned for Part 2, where we will take a deeper dive into understanding Twitter’s Search API, coming next week…