Build a Web Socket Server
It is a little-known feature of Adobe AIR that you can build and publish applications using a web standards-based workflow. That’s right: take an “index.html”, add some “mycode.js”, and turn it into a desktop application. This feature has been there since the beginning, and comes with all manner of fun and interesting APIs. One of those APIs is the ability to create a raw socket server. With this in mind, I couldn’t pass up the opportunity to implement a Web Socket server with JavaScript.
About Web Sockets
Web Sockets have been under development as a specification for a few years. The initial draft of the “Hixie” specification dates back to 2009, and made it through 76 iterations before recently giving way to the “Hybi” specification. That specification for Web Sockets dates back to early 2010, and is now RFC 6455, which is essentially the path forward for implementing Web Socket functionality on the server.
In the early days there was a fair amount of contention around how exactly to make the connection to the server securely. Both the Hixie and Hybi approaches use HTTP to make the opening request. The Hixie approach focused on using various hashes in the headers to confirm a secure connection. This created a lot of complexities in the client/server handshake. The Hybi approach simplifies the handshake considerably, and puts security emphasis on the actual messages themselves.
Creating a Socket Server
As you might expect, to create a socket server with AIR, you use the class ServerSocket. You can instantiate the class, set up your listeners for incoming connections, and bind to a specific IP address and port. When you are ready to start handling connections, you call ServerSocket.listen().
The main event listener to be aware of at this point is the connection from a client. This takes the form of ServerSocketConnectEvent.CONNECT. The event object that comes along with that has a “socket” property that is a reference to the client. Since you may want to send messages from the server to the client (the whole point of Web Sockets really), you probably want to keep track of those connections. I push the reference into an Array called “connections”.
Getting a connection is one thing, but then you will want to handle incoming data. In the case of Web Sockets, this incoming data might be messages from the client, or at the very beginning, the handshake that says the server accepts Web Socket connections in the first place. Incoming data happens on an event called ProgressEvent.SOCKET_DATA. The event that goes along with it carries the associated data being sent from the client.
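A minimal sketch of the setup described above might look like this. The `air` object is passed in as a parameter so the wiring is easy to follow; inside an AIR HTML application it would normally come from AIRAliases.js, and this sketch assumes the aliases `air.ServerSocket`, `air.ServerSocketConnectEvent` and `air.ProgressEvent` are available there. The names `startServer` and `onData` are hypothetical.

```javascript
// Sketch only: `air` is assumed to expose the AIR runtime classes
// (as AIRAliases.js does inside an AIR HTML application).
function startServer(air, port, address, onData) {
  var connections = [];                 // every connected client socket
  var server = new air.ServerSocket();

  server.addEventListener(air.ServerSocketConnectEvent.CONNECT, function (evt) {
    var client = evt.socket;            // reference to the connecting client
    connections.push(client);

    // Dispatched whenever the client sends bytes (handshake or messages).
    client.addEventListener(air.ProgressEvent.SOCKET_DATA, function () {
      onData(client);
    });
  });

  server.bind(port, address);           // for example 8080 and "127.0.0.1"
  server.listen();                      // start accepting connections
  return { server: server, connections: connections };
}
```

Returning the `connections` array keeps the list of clients available to whatever code later sends messages back out.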
Handshake
The Web Socket handshake starts off with an HTTP GET request. This is different from how the subsequent data messages are handled, so it provides a good point to determine what the client actually wants from the server. I decided to handle this point of differentiation by reading the first byte and checking whether it was the letter “G” (71 as an integer), the first character of “GET”. If it is, then the client is trying to establish the connection, and we can move the handshake process along.
While Hybi simplifies the headers needed to make the handshake, the headers themselves are still important. At this point you can read in all the incoming data, and parse out the headers. The specific header you are most interested in is the “Sec-WebSocket-Key” header. This header contains a randomly generated string. Per the specification, you are required to append a specific GUID (“258EAFA5-E914-47DA-95CA-C5AB0DC85B11”) to that string, compute the SHA-1 hash of the result, and then Base64-encode that hash.
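The key calculation can be sketched as follows. Node's built-in `crypto` module stands in here for whatever SHA-1 and Base64 helpers your runtime offers, which is an assumption of this sketch and not necessarily what an AIR application would use. The helper names `parseHeaders` and `computeAccept` are hypothetical.

```javascript
// Assumption: Node's crypto module is used as a stand-in for a SHA-1 helper.
var nodeCrypto = require("crypto");

// Fixed GUID defined by RFC 6455 for the opening handshake.
var WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Parse "Name: value" lines of the raw request into a lowercase-keyed map.
function parseHeaders(request) {
  var headers = {};
  request.split("\r\n").slice(1).forEach(function (line) {
    var colon = line.indexOf(":");
    if (colon > 0) {
      var name = line.slice(0, colon).trim().toLowerCase();
      headers[name] = line.slice(colon + 1).trim();
    }
  });
  return headers;
}

// Append the GUID, hash with SHA-1, then Base64-encode the digest.
function computeAccept(key) {
  return nodeCrypto.createHash("sha1")
    .update(key + WEBSOCKET_GUID)
    .digest("base64");
}
```

With the sample key from RFC 6455, `dGhlIHNhbXBsZSBub25jZQ==`, `computeAccept` returns `s3pPLMBiTxaQ9kYGzzhZRbK+xOo=`, which is a handy check for any implementation.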
Once you have this hashed value, you are required to return it as part of the HTTP response. The first line of the HTTP response (status 101) indicates that you want to switch protocols from standard HTTP. The next header (“Upgrade”) tells the client that you are switching to the Web Socket protocol. The header after that (“Connection”) states that this is an upgrade action. And finally, the “Sec-WebSocket-Accept” header sends the resulting hash from the previous operation.
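As a rough illustration, the response described above can be assembled as a single string. The function name `buildHandshakeResponse` is hypothetical, and the `accept` argument is assumed to be the already-computed hash.

```javascript
// Build the HTTP response that completes the handshake.
// `accept` is the Base64-encoded SHA-1 value computed from the client's key.
function buildHandshakeResponse(accept) {
  return [
    "HTTP/1.1 101 Switching Protocols",   // switch away from plain HTTP
    "Upgrade: websocket",                 // the protocol being switched to
    "Connection: Upgrade",                // this is an upgrade action
    "Sec-WebSocket-Accept: " + accept,    // proof the key was processed
    "",                                   // blank line ends the headers
    ""
  ].join("\r\n");
}
```

Each line ends with a carriage return and line feed, and the empty line at the end tells the client that the headers are complete.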
Once you’ve sent the HTTP response, assuming you have hashed the incoming value correctly, a connection is established and the handshake process ends. From here on out, incoming data takes the form of messages. And of course you can send outgoing messages from the server as well. We will dig into those details next.
Incoming Messages
Since we’ve established that only the handshake arrives as an HTTP request, and that we’d not have incoming messages if the handshake had failed, we can assume any new incoming data is a message. Data messages can be a variety of data types, to permit clients that aren’t browser-based. For example, there is an option to send binary messages. Early work on binary support in JavaScript is happening, but for now I’m going to focus on text messages.
Messages are passed using a concept called “framing”, which can commonly be found in other protocols. In this case, the first four bits (yes, bits) start with a flag that tells the server whether this frame completes a message or whether more fragments of a longer message will follow; the other three bits are reserved and normally zero. For this example, I’ll be sending single messages at a shot from the client to the server, so these bits will always be “1000” and I can move on to getting more information about the message itself.
The next bits of the incoming message data represent the data type for this message. Since I’m looking for text messages, I’m looking for the bits “0001”. You might be wondering how you get at bits if binary support isn’t part of JavaScript just yet. While there’s more to this story that I’ll cover in a moment, the quick way to get at the bits is through string manipulation. You can read the incoming data, treat it as a string with the conversion radix of 2 (binary) and then look at the second chunk of four characters.
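The string-based approach can be sketched like this, assuming the first byte of the frame has already been read as an integer. The function name `parseFirstByte` is hypothetical. The value is padded to eight characters because converting with a radix of 2 drops leading zeros.

```javascript
// Decode the first byte of a frame using string manipulation.
function parseFirstByte(byte) {
  // Pad to eight characters so small values keep their leading zeros.
  var bits = ("00000000" + byte.toString(2)).slice(-8);
  return {
    fin: bits.charAt(0) === "1",         // true for a single or final frame
    flags: bits.slice(0, 4),             // "1000" for a single-frame message
    opcode: parseInt(bits.slice(4), 2)   // 1 = text, 2 = binary, 8 = close
  };
}
```

The same values can be read with bitwise operators, for example `(byte & 0x80) !== 0` for the first bit and `byte & 0x0f` for the data type.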
Oddly enough the data type bits can also denote if the client wants to close the connection. If that is the case, the bits of this second chunk of data will be “1000”. In that case, we will use Array.splice() to pull that particular client from our array of monitored clients. From there we will close that connection as well, before letting it be garbage collected down the road.
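Handling a close request might look something like the sketch below. The function name `dropClient` is hypothetical, and the client object is assumed to offer a `close()` method, as AIR's `Socket` class does.

```javascript
// Remove a client from the tracked connections and close its socket.
// Returns true when the client was being tracked.
function dropClient(connections, client) {
  var index = connections.indexOf(client);
  if (index !== -1) {
    connections.splice(index, 1);   // stop tracking this client
  }
  client.close();                   // release the underlying connection
  return index !== -1;
}
```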
Data Masking
Continuing down the path of this being a text message, we will next want to know if the data was masked. That is indicated by the first bit of the second byte. For Web Socket messages coming from the client, that bit will always be “1” (true), because all messages from the client to the server get masked. We will come back to this in a moment.
The next seven bits indicate the size of the data being sent from the client. A message may be longer than what these seven bits are used for directly (up to 125 bytes), and the specification accounts for this by using the value “126” to indicate that the real payload size is in the next 16 bits. If you need longer still, you will get a value of “127”, which means the payload size is in the next 64 bits. For this example, all my messages will be 125 bytes or fewer, so we take those seven bits as the size of the payload.
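A sketch of reading the mask flag and all three length forms follows. It assumes the whole frame is available as an array of byte values, and the function name `readPayloadLength` is hypothetical. Note that JavaScript numbers cannot represent every 64-bit length exactly, which should not matter for realistic message sizes.

```javascript
// Read the mask flag and payload length, starting at the frame's second byte.
// `bytes` is an array (or Uint8Array) holding the frame.
function readPayloadLength(bytes) {
  var second = bytes[1];
  var masked = (second & 0x80) !== 0;   // first bit: mask flag
  var length = second & 0x7f;           // remaining seven bits
  var offset = 2;                       // index of the byte after the length

  if (length === 126) {
    // Real length is in the next 16 bits (big-endian).
    length = bytes[2] * 256 + bytes[3];
    offset = 4;
  } else if (length === 127) {
    // Real length is in the next 64 bits (big-endian).
    length = 0;
    for (var i = 2; i < 10; i++) {
      length = length * 256 + bytes[i];
    }
    offset = 10;
  }
  return { masked: masked, length: length, offset: offset };
}
```

The returned `offset` points at the masking key when the frame is masked, or directly at the payload when it is not.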
You might have noticed at this point that there’s some JavaScript syntax that you don’t normally see around handling bytes. It turns out that AIR provides binary manipulation APIs for you to use in JavaScript. This API may or may not reflect the ongoing JavaScript work around binary support, so take it with a grain of salt.
The next four bytes are the masking key for the payload data. After that come the payload bytes themselves, where the number of bytes to read comes from the previous operation. If you didn’t already feel a little confused, this is where things can get a bit sticky. Masking combines each payload byte with a byte of the masking key using XOR, so the bytes on the wire do not match the original message. Per the specification, this is meant to protect intermediaries such as proxies from being confused by client-controlled bytes, rather than to keep the message secret.
The next operation then is to unmask the payload data. To do this you walk through the bytes of the payload data and XOR each one with a byte of the masking key. For each byte of the payload data, you move on to the next masking key byte. When you’ve used the fourth masking key byte, you start back at the first, and so on until the payload data has been unmasked. In other words, payload byte i is combined with masking key byte i modulo 4. Now you have the actual message value and can figure out what the client wants from the server.
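Unmasking can be sketched in a few lines, assuming the payload and the four key bytes have already been read into arrays. The function names are hypothetical, and the string conversion only handles plain ASCII text; multi-byte UTF-8 characters would need a proper decoder.

```javascript
// XOR each payload byte with the masking key byte at (index modulo 4).
function unmask(payload, maskKey) {
  var out = [];
  for (var i = 0; i < payload.length; i++) {
    out.push(payload[i] ^ maskKey[i % 4]);
  }
  return out;
}

// ASCII-only conversion from byte values to a string (an assumption here).
function bytesToAscii(bytes) {
  return String.fromCharCode.apply(null, bytes);
}
```

Using the masked “Hello” example from RFC 6455, the key bytes `0x37 0xfa 0x21 0x3d` and payload bytes `0x7f 0x9f 0x4d 0x51 0x58` unmask to the text “Hello”.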
Outbound Messages
Sending messages also relies on data framing, but luckily messages from the server to the client do not have to be masked, which makes the process of sending a message much easier to handle. From our previous walkthrough, we know that the first four bits of the framing are “1000” for a single message. We also know that we will be sending a text message, so the next four bits are “0001”. A byte consisting of “1000 0001” is the equivalent of “129”, so I send that byte right across.
Again, there is no requirement to mask messages coming from the server, so the first bit of the next byte is “0” and no masking key follows. The remaining seven bits of that same byte hold the length of the payload, so for a message of up to 125 bytes we can simply write one byte indicating how long our response message will be. And then finally, we write the actual payload data bytes going back to the client.
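Putting the outbound framing together, a small sketch for short text messages could look like this. The function name `buildTextFrame` is hypothetical, and the code assumes ASCII-only messages of at most 125 characters so that one character equals one byte.

```javascript
// Build an unmasked, single-frame text message as an array of byte values.
// Assumes an ASCII message no longer than 125 characters.
function buildTextFrame(message) {
  if (message.length > 125) {
    throw new Error("This sketch only handles payloads up to 125 bytes");
  }
  var frame = [
    129,              // "1000 0001": single frame, text data type
    message.length    // mask bit 0, then the seven-bit payload length
  ];
  for (var i = 0; i < message.length; i++) {
    frame.push(message.charCodeAt(i));   // payload bytes
  }
  return frame;
}
```

For the message “Hello” this produces the bytes `0x81 0x05 0x48 0x65 0x6c 0x6c 0x6f`, matching the unmasked text example in RFC 6455.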
Of course being able to construct a message is one thing, but knowing where to send that data is another. That is why we kept track of the clients as they were connecting. For this example, I am sending random numbers back to the client at a high rate (every 50 milliseconds). Every time that interval hits, I iterate through the connected clients, and send along the randomly generated number.
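The broadcast loop might be sketched as follows. The function names and the `makeFrame` callback are hypothetical; each client is assumed to offer `writeByte()` and `flush()`, as AIR's `Socket` class does, and a frame is assumed to be an array of byte values.

```javascript
// Write one frame (an array of byte values) to every tracked client.
function broadcast(connections, frame) {
  connections.forEach(function (client) {
    for (var i = 0; i < frame.length; i++) {
      client.writeByte(frame[i]);
    }
    client.flush();   // push the buffered bytes out to the client
  });
}

// Send a freshly built frame to all clients every 50 milliseconds.
// `makeFrame` is any function returning the bytes to send.
function startBroadcast(connections, makeFrame) {
  return setInterval(function () {
    broadcast(connections, makeFrame());
  }, 50);
}
```

`startBroadcast` returns the timer identifier, so the caller can stop the updates later with `clearInterval()`.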
Extras
Because I want to ensure that I only send messages along to the clients that are currently interested in getting them, I also keep an array of the various states of the connected clients. Clients can stay connected and choose not to receive messages for the time being. This also sets up the foundation to potentially send entirely different messages to different clients.
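One way to picture this is a second array that runs parallel to the connections array. The sketch below is only an illustration: the state value `"receiving"` and the function name `receivingClients` are hypothetical, not taken from the actual code.

```javascript
// Pick out the clients whose state says they currently want messages.
// `states[i]` is assumed to describe `connections[i]`.
function receivingClients(connections, states) {
  return connections.filter(function (client, index) {
    return states[index] === "receiving";
  });
}
```

The filtered list can then be handed to whatever code sends the messages, leaving paused clients connected but quiet.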
Fellow Adobe Developer Evangelist, Andrew Trice, covers this situation very nicely in his write-up on the topic. If this is a scenario you are interested in for your application, I highly suggest that you take a look at his example code. Andrew also takes the next step on abstracting all the connection code and client management. Definitely a great read for those interested in the topic. Andrew has also posted his version on GitHub if you want to take a shot at the code yourself.
Conclusion
There you have it, a Web Socket server implemented with JavaScript on Adobe AIR. While it is a fun exercise to truly understand the Web Socket specification, and while it could serve as the basis for your own server, there are many other great open source options out there. I’ve posted this code on GitHub and will likely iterate over time, but I encourage you to check out the other options as well.
I am a father, husband, photography enthusiast and pilot most of the time. I work for Adobe managing the world's best evangelism team the rest of the time. I also enjoy hacking hardware, cigars, travel, and movies.