Refactoring in the Cloud – Notifications

One of the things I have been thinking about a lot recently is the architecture of Glassboard. Now that I have everything mapped out for where I want to take Glassboard’s direction going forward both financially and functionally, it’s the time to look at how to actually build that functionality into the existing code base.

The Current Glassboard Architecture

Glassboard’s backend is powered by Microsoft Azure. In terms of computing, there are three main components: two worker roles and one web role.

Azure defines a Web Role as a server that has a running copy of the IIS web server on it, so that users can connect to it. A worker role doesn’t run IIS, but is still a Windows Server instance that you can use to do things like background processing on it and offload some of the heavy activities from your main server role.

This is a rudimentary diagram of the Glassboard architecture as of April 2014.

Glassboard's Architecture

Everything in the current Glassboard architecture is housed in a single Visual Studio solution with about 36 C# projects in it. It’s a lot of code to manage. Luckily it’s also well tested, which makes it easier to manage.

Current Notification Role Issues

There are a few technical and ideological issues I have with the current notification architecture.

To send messages between the web role and the attachment processor and notifications roles, I use Azure’s Storage Queue mechanism to pass messages. When you post a message to one of your boards, the web role generates a new XML payload in one of our storage queues that the Notification Role listens for to handle the processing.

Sometimes this queue and the processing gets extremely backed up. Usually the CPU usage spikes to around 90% or higher for long periods of times which can lead to message postings being delayed for several minutes (in some cases I’ve seen hours) at a time. This is no good, especially when building a messaging service.

I’m not sure what the cause is for the spiking CPU other than a lot of delicate C# threading issues which I’m guessing are running amok.

The other issue I have with the notification role is how much of the code for things push notification services is in-house built. At the time Glassboard was originally built a few years ago, there weren’t too many open source solutions for handling push in C# I’d imagine.

I’m fairly conservative when it comes to third-party dependencies, but for things like push notification services there’s not much sense in maintaining your own code library to connect to the service when there are a few good solutions out there used by hundreds (thousands?) of startups and businesses around the world. They’re well implemented, battle tested, and code that is maintained by the community.

Being on C#, that limits the amount of options I have for dealing with push providers. PushSharp seems to be the main provider out there.

The New Notifications Role

Given that I’m not a fan of the C# threading model currently implemented, the amount of code used to power something that’s relatively simple, and the homegrown nature of many of the push wrappers we’re using, I made the decision to rewrite this portion of Glassboard’s architecture using Node. Hang on. Put those Buzzword Bingo cards away. I’m not a fan of rewriting for the sake of rewriting either.

Ultimately, my decision to switch to Node is all about code management. I’m confident I can accomplish the goals of my notification roles with a lot less code using Node than C# thanks to it being a bit less verbose of a language and a variety of NPM packages I can take advantage of for handling APNS, SendGrid, and GCM. The less code I have to manage and maintain, the easier my job is as the sole developer of Glassboard.

I don’t believe I will have any performance degradation in switching from C# to Node for this portion of the code either, though it’s hard to tell before actually writing the code. If anything, I’m optimistic that performance will improve.

A Fork In The Road

There’s two different ways I can approach handling push notifications being distributed through my new worker role. The first is to use these npm packages like node-gcm and its similar ilk to distribute the messages myself like I have always done. The difference being mostly in the language being used (Javascript versus C#).

The other option is to take advantage of Azure’s Notification Hubs which will handle distributing the messages to devices for me. Notification Hub has some advantages that stand out to me:

This all sounds great. Why not just do this? Cost.

Notification Hubs are another cost layer on top of the existing charges I’m paying Microsoft for my cloud instances and storage. I’m still in ‘losing money’ mode with Glassboard, so I’m trying to be fairly conservative with my spending.

Glassboard currently sends around 150,000 emails a day according to Sendgrid, but I have no idea how many push messages it sends because it has never had a way to track that sort of information. If I opt into Notifications Hub, I’m likely going to be out at least $200 extra a month, but that could also sky rocket to an even higher amount.

Now you understand why I am giving so much important to data and analytics these days.

My gut says use Notification Hubs. My wallet says roll my own.

Service Bus vs Storage Queues

One last thing I’m still debating is whether to keep using Azure’s Storage Queues for managing the messaging between the different components in my cloud stack or to use the new-to-me Service Bus functionality, which looks to offer a few more features. The Notification Hub functionality Microsoft offers is powered by the Service Bus, which has me considering the switch.

There are plenty of decent articles out there comparing Service Bus vs Storage Queues, but from a more abstracted level that leaves me still questioning which way to go. The conservative developer in me says keep using what I’ve already got because no major Service Bus feature stands out to me as a must-have right now. The liberal developer in me sees Service Bus as the more modern architecture that will likely be able to grow with me as Glassboard (hopefully) expands in years to come.

Feedback certainly welcomed.

Writing The Code

Even though the core architecture of two worker roles and a single web role won’t change with this refactoring of Glassboard’s notification architecture, hopefully this has been at least a little eye opening in terms of how Glassboard is built, and how I envision it being built going forward.

The obvious next step is to write the code, which I started earlier this week by writing the unit and integration tests for the new Node code base.

Since I spend so much time in Visual Studio these days, I thought I’d give the Node tools that Microsoft released a try. Then I remembered that while I love what Microsoft is doing with the cloud and Azure, Windows is still Windows.