Ruby on Rails BugMash

On Saturday, Gnip hosted a Rails BugMash. Ten people showed up. We mashed some bugs. We learned about Rails internals and about contributing to open source. It was organized by Prakash Murthy (@_prakash), Mike Gehard (@mikegehard) and me (@baroquebobcat).

What’s a BugMash, you might ask? Rails BugMashes were something that came out of RailsBridge’s efforts to make the Ruby on Rails community more open and inclusive. We wanted to use the format to help get more people locally involved in OSS culture and to show how contributing to a big project like Rails is approachable by mere mortals.

One of the themes of the event was to help with migrating tickets from the old ticket system. Rails moved to GitHub Issues as the official place to file bug reports in April, and there are still a lot of active tickets in the old ticket tracker, Lighthouse.

For greater impact, we focused on tickets with patches already attached. For these tickets, all that we needed to do was verify the patch and make a pull request on GitHub. Migrating these tickets was straightforward and we got a few of these merged into Rails that afternoon.

When the event started only two of us had contributed to Rails. By the end of the afternoon, everyone who had participated had either submitted a patch or helped to do so. Some of the patches we submitted were already merged in before the event was over.

Thanks to @benatkin, @anveo, @mikehoward, @ecoffey, @danielstutzman, @jasonnoble and @jsnrth for coming out on a Saturday to contribute to Rails. Thanks also to the Rails core team members who were watching pull requests on Saturday, helping us with our commits and offering suggestions and comments on our work. In particular, José Valim (@josevalim), Santiago Pastorino (@spastorino) and Aaron Patterson (@tenderlove) were a great help.

Bugs Smashed

Updated: added participants’ Twitter handles.

Guest Post, Rick Boykin: Gnip C# .NET Convenience Library

Now that the new Gnip convenience libraries have been published for a few weeks on GitHub, I’m going to tell you a bit about the libraries that I’m currently responsible for, the .NET libraries. So, let’s dive in, shall we… The latest versions of the .NET libraries are heavily based on the previous version of the Java libraries, with a bit of .NET style thrown in. What that means is that I used Microsoft’s Java Language Conversion Assistant as a starting point, then mixed in some scripting with Bash, sed and Perl to fix the comments and some of the messy parts that did not translate very well. I then made it more C#-like by removing Java annotations, adding .NET attributes, taking advantage of the native .NET XML serializer, utilizing System.Net.HttpWebRequest for communications, etc. It actually went fairly quickly. The next task was to start the unit testing deep dive.

I have to say, at first I really didn’t know anything about the Gnip model, how it worked, or what it really was. It just looked like an interesting project with some good folks. Unit testing, however, is one place where you learn the details of how each little piece of a system really works. And since hardly any of my tests passed out of the gate (and I was not convinced that I even had enough tests in place), I decided it was best to go at it until I was convinced. The library components are easy enough. The code is really separated into two parts. The first component is the Data Model, or Resources, which map directly to the Gnip XML model and live in the Gnip.Client.Resource namespace. The second component is the Data Access Layer, or GnipConnection. The GnipConnection, when configured, is responsible for passing data to, and receiving data from, the Gnip servers. So there are really only two main pieces to this code. Pretty simple: Resources and GnipConnection. The other code is just convenience and utility code to help make things a little more orderly and to reduce the amount of code.

So yeah, the testing… I used NUnit so folks could run the tests with the free version of Visual Studio, or even from the command line if you want. I included a NAnt file so that you can compile, run the tests, and create a zipped distribution of the code. I’ve also included an NUnit project file in the Gnip.ClientTest root (gnip.nunit) that you can open with the NUnit UI to get things going. To help configure the tests, there is an App.config file in the root of the test project that is used to set all the configuration parameters.

The tests, like the code, are divided into the Resource object tests and the GnipConnection tests (and a few utility tests). The premise of the Resource object tests is to first ensure that the Resource objects are cool. These are simple data objects with very little logic built in (which is not to say that testing them thoroughly is not of the utmost importance). There is a unit test for each one of the data objects, and they proceed by ensuring that the properties work properly, the DeepEquals methods work properly, and that the marshalling to and from XML works properly. The DeepEquals methods are used extensively by the tests, so it is essential that we can trust them. As such, their tests are fairly comprehensive. The marshalling and un-marshalling tests are less so. They do a decent job; they just do not exercise every permutation of the XML elements and attributes. I do feel that they are sufficient to convince me that things are okay.
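
For readers who want to picture the shape of those Resource tests, here is a rough, language-neutral sketch in Python rather than C#. It is not the library's code: the Publisher class, its fields, the XML element names and the round_trips helper are all hypothetical stand-ins. It only illustrates the pattern of a value-by-value deep comparison plus a marshal/un-marshal round trip.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Publisher:
    """Hypothetical resource object; names and XML layout are assumptions."""
    name: str
    rule_types: list = field(default_factory=list)

    def deep_equals(self, other):
        # Compare every field by value, including the nested list.
        return (
            isinstance(other, Publisher)
            and self.name == other.name
            and self.rule_types == other.rule_types
        )

    def to_xml(self):
        # Marshal: the name becomes an attribute, each rule type a child element.
        root = ET.Element("publisher", {"name": self.name})
        for rule_type in self.rule_types:
            ET.SubElement(root, "type").text = rule_type
        return ET.tostring(root, encoding="unicode")

    @classmethod
    def from_xml(cls, text):
        # Un-marshal: the inverse of to_xml.
        root = ET.fromstring(text)
        return cls(root.get("name"), [e.text for e in root.findall("type")])


def round_trips(resource):
    # Marshal to XML, read it back, and check that nothing was lost.
    copy = type(resource).from_xml(resource.to_xml())
    return resource.deep_equals(copy)
```

Because the round-trip check leans entirely on the deep comparison, a comparison that wrongly returns true would hide marshalling bugs, which is why the comparison itself deserves the more thorough tests.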

The GnipConnection is responsible for creating, retrieving, updating and deleting Publishers and Filters, and retrieving and publishing Activities and Notifications. There is also a mechanism built into the GnipConnection to get the Time from the Gnip server and to use that Time value to calculate the time offset between the calling client machine and the Gnip server. Since the Gnip server publishes activities and notifications in one-minute-wide addressable ‘buckets’, it is nice to know what the time is on the Gnip server with some degree of accuracy. No attempt is made to adjust for network latency, but we get pretty close to predicting the real Gnip time. That’s it. That little bit is realized in 25 or so methods on the GnipConnection class. Some of those methods are just different signatures of methods that do the same thing, only with a more convenient set of parameters. The GnipConnection tests try to exercise every API call with several permutations of data. They are not completely comprehensive. There are a lot of permutations. But I believe they hit every major corner case.
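
The time-offset idea can be sketched in a few lines. This is an illustrative Python sketch, not the GnipConnection implementation: the function names are made up, and how the server time is fetched and how buckets are actually addressed are left out. Like the library, it ignores network latency.

```python
from datetime import datetime, timezone


def clock_offset(server_time, local_time):
    """How far the server clock is ahead of the local clock (may be negative)."""
    return server_time - local_time


def estimated_server_time(local_time, offset):
    """Predict the server's current time from the local clock and a stored offset."""
    return local_time + offset


def bucket_start(moment):
    """Round a time down to the start of its one-minute bucket."""
    return moment.replace(second=0, microsecond=0)


# Example: the server reported 12:01:05 when the local clock read 12:00:50.
local_then = datetime(2008, 8, 1, 12, 0, 50, tzinfo=timezone.utc)
server_then = datetime(2008, 8, 1, 12, 1, 5, tzinfo=timezone.utc)
offset = clock_offset(server_then, local_then)

# Three minutes later the stored offset still lets us pick the right bucket.
local_now = datetime(2008, 8, 1, 12, 3, 50, tzinfo=timezone.utc)
current_bucket = bucket_start(estimated_server_time(local_now, offset))
```

In this example the local clock alone would have pointed at the 12:03 bucket, while the server is already in the 12:04 bucket, which is the kind of miss the offset is meant to avoid.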

In testing all this, one thing I wanted to do was to run my tests and have the de-serialization of the XML validate against the XML Schema file I got from the good folks at Gnip. If I could de-serialize and then serialize a sufficiently diverse set of XML streams, while validating that those streams adhere to the XML Schema, then that was another bit of ammo for trusting that this thing works in situations beyond the test harness. In the Gnip.Client.Uti namespace there is a helper class called XmlHelper that contains a singleton of itself. It has a property called ValidateXml that can be reached like this: XmlHelper.Instance.ValidateXml. Setting that to true will cause the XML to be validated any time it is de-serialized, either in the tests or from the server. It is set to true in the tests. But it doesn’t work with the stock Xsd distributed by Gnip. That Xsd does not include an element definition for each element at the top level, which is required when validating against a schema. I had to create one that did. It is semantically identical to the Gnip version; it just pulls things out to the top level. You can find the custom version in the Gnip.Client/Xsd folder. By default it is compiled into the Gnip.Client.dll.

One of the last things I did, which had nothing really to do with testing, was to create the IGnipConnection interface. Use it if you want. If you use some kind of Inversion of Control container like Unity, or like to code to interfaces, it should come in handy.
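
To show why an interface like that is handy, here is a small sketch, again in Python rather than C#. Everything in it is hypothetical: the single get_publisher_names method is invented for illustration and is not a member of IGnipConnection. The point is only that code written against the interface can be handed a fake in tests, or whatever implementation a container supplies.

```python
from abc import ABC, abstractmethod


class GnipConnectionInterface(ABC):
    """Hypothetical analogue of IGnipConnection with one made-up method."""

    @abstractmethod
    def get_publisher_names(self):
        ...


class FakeConnection(GnipConnectionInterface):
    """In-memory stand-in that never touches the network."""

    def __init__(self, names):
        self._names = list(names)

    def get_publisher_names(self):
        return list(self._names)


class PublisherReport:
    """Depends only on the interface, so any implementation can be injected."""

    def __init__(self, connection):
        self._connection = connection

    def summary(self):
        names = sorted(self._connection.get_publisher_names())
        return f"{len(names)} publishers: {', '.join(names)}"


report = PublisherReport(FakeConnection(["twitter", "digg"]))
```
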
That’s all for now. Enjoy!

Rick is a Software Engineer and Technical Director at Mondo Robot in Boulder, Colorado. He has been designing and writing software professionally since 1989, and working with .NET for the last 4 years. He is a regular fixture at the Boulder .NET user’s group meetings and is a member of Boulder Digital Arts.

Three (Six?) Week Software Retrospective

I had to go back into older blog posts to remind myself when we launched: July 1st. It feels like we’ve been live since June 1st.

Looking Back

Things have gone incredibly well from an infrastructure standpoint. We’ve had to add/adjust some system monitoring parameters to accommodate the variety of Data Producers publishing into the system; different frequencies/volumes call for specialized treatment. We weren’t expecting the rate, or volume, of Collection creation we wound up with. Within three hours of going live, we had enough Collections in the system to adversely impact node startup/sync times. We patiently tuned our data model, and tuned TerraCotta locks to get things back to normal. It’s looking like we’ll be in bed with TerraCotta for the long haul.


I’m not sure I could be any more pleased with AWS. Our core service is heavily dependent on EC2, and that’s been running sans issues. We’re working on non-Amazon failover solutions that assure uninterrupted service even if all of EC2 dies. Our backups are S3 dependent, so we had some behind-the-scenes issues last weekend when S3 was flaky; see my previous post on this issue. We haven’t had our day in the sun with outages, and I obviously hope we never do, but so far I’m walking around with a big “I <3 AWS” t-shirt on.


On the convenience library front, we (Gnip + community) have made all of our code available on GitHub. We’ve had tremendous community support and contribution on this front; so cool to see; thanks everyone!

Collections are by far the primary data access pattern (as opposed to raw public activity stream polling); not really a surprise.

Summize/Twitter has been a totally cool way to track ether discussion around Gnip. When we notice folks talking about Gnip, positive or negative, we can reach out in “real-time” and strike up a conversation.

That’s all for now.

Thanks to all the Data Producers and Consumers that have integrated with Gnip thus far!