Websocket Fun
Taking a little break from LPC this week to play around with the web client. I actually already built a prototype with Iffy for a different incarnation of gabbo a few years ago that is mostly reusable. The UI was pretty basic: there was a big console window on the bottom of the screen for typing commands and looking at output. Then there were a few other panes for displaying the current room description, exits, what’s in your inventory…I think that’s it. I also had it working so the exits were links and you could click em to move around. Another cool piece was the rendering engine I wrote, where the server would send output with a buncha XML markup and the client would run it through an XSLT transform to render/style the XHTML. The XHTML was then nested inside a larger declarative language sorta thing for drawing everything out in the DOM, which the client code would execute. I didn’t get to take it very far so I’m not sure how well the model would hold up for more complicated tasks down the road, but for the few things I did have it doing I thought it worked pretty well.
The biggest change since the first prototype is that support for websockets is now common. Back then, Iffy got it working so you could telnet to the server over Flash, and we never got to handling things like telnet negotiations. There was also this token you had to host somewhere if I remember correctly…it was just weird. From what I’d read, websockets were great. I remember the first time I saw them in use was this pretty impressive multiuser game, so I know they can do what I want them to do.
You can’t just open up TCP sockets all willy nilly though; there’s a ‘websocket’ protocol the server you’re connecting to has to speak. Fortunately other people have already figured this stuff out and there’s a pretty handy proxy server out there called ‘websockify’ that translates the websocket traffic to a TCP socket on the other side. It even supports SSL websockets, so I can throw the proxy on the MUD server, connect over loopback, and get secure login for the web users for free. In order to talk to the proxy server you have to use a client library that is also provided. It actually comes with a whole vt100 telnet application, but I don’t need all that because the gabbo web client is its own “terminal type”. Still, it’s good to know there’s code to reference if I wanna support things down the line like moving the cursor around. I suppose I should, actually, since it would suck for content devs to be able to position the cursor for telnet users but not web users. That doesn’t mean I want to use escape sequences, though.
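For a taste of what “speaking the websocket protocol” involves, here is a minimal sketch of one step of the opening handshake from RFC 6455: the server has to answer the client’s Sec-WebSocket-Key header with a derived Sec-WebSocket-Accept value. This is just an illustration in Python, not how websockify is written, and the function name is made up for the example. The proxy takes care of this step for you.

```python
import base64
import hashlib

# Fixed GUID that RFC 6455 defines for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Return the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    # Concatenate the client's key with the GUID, SHA-1 hash it, then base64 the digest.
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")
```

A plain TCP service like the MUD driver knows nothing about this exchange (or the message framing that follows it), which is why something has to sit in between.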
In fact, I think the biggest challenge is going to be figuring out how the server side support for the web client should work. Eventually, I’d like the web client to support rich content like images, audio, or even just fancier styling options than are afforded by something like a vt100 term. That means ditching the whole escape sequence thing in favor of some custom markup. The xslt thing I mentioned above supported tags like
To start, I don’t know if I want to force people to use close tags when they’re composing their messages, at least for formatting stuff. The MUD is, after all, a stream. I mean that in the technical sense, data is being transmitted over TCP sockets which are streams – but it also describes how people consume the game’s content. It’s not something like email where you’re dealing with discretely composed messages that are meant to be consumed in isolation from each other (though more and more email seems to be (mis)used in this way). If you sit in on a conversation in a noisy room on MUD, you’ll often see one thought, or even one sentence, expressed over several different invocations of the ‘say’ command. It’s also not like IM though, where the messaging is broken up into different containers based on who you’re talking to. With the MUD, you’ve got one single console and every message is consumed in the same logical space.
That said, the ramifications of these streaming characteristics aren’t actually super crucial. In practice, the kinds of nebulous messages between users I described represent a sliver of what is actually output to the screen. Most messages, whether it’s feedback from playing the game or you’re just looking around, are broken up neatly. And in fact, this quality is crucial for doing more complicated things with the web UI; e.g. if you need to update the inventory list when someone enters the room, you need to be able to parse out some singular instruction indicating that movement has occurred. However, to me, the question of close tags is more philosophical than practical. If you want UI state to fall through from one message to the next, there are ways to do that even with close tags in place. What we’ll end up with is a stream-like experience made up of discrete blocks (basically what all the other social networks do), with more extreme solutions for the more extreme situations.
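To make the “singular instruction” idea concrete, here is a rough sketch, assuming each discrete message arrives as a small self-contained XML element. The `enter` and `leave` tags, the `who` attribute, and the `apply_event` function are all hypothetical, made up for this example; the real client would do this in JavaScript and Python is only used for illustration.

```python
import xml.etree.ElementTree as ET


def apply_event(occupants: list[str], message: str) -> list[str]:
    """Return the updated list of who is in the room after one discrete message."""
    event = ET.fromstring(message)  # only works because the message is a complete unit
    who = event.get("who")
    if who is None:
        return occupants
    if event.tag == "enter" and who not in occupants:
        return occupants + [who]
    if event.tag == "leave":
        return [name for name in occupants if name != who]
    # Anything else (say, emote, ...) does not change the list.
    return occupants
```

The point is that the parse step only works if the client is handed a whole instruction at once, which is exactly the property free-flowing stream text does not have.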
There’s also a technical challenge with close tags, and messaging in general. While the telnet connection to the MUD driver is a generic TCP stream, the websockets connection between the client and the proxy is not. Websockets messages are exchanged as separate events, and the proxy generates these events without any understanding of where breaks should occur in the stream of text it’s receiving from the MUD. I’m not exactly sure how it decides when to flush the incoming data, probably some combination of buffer size, scheduling, and events being received in the other direction. In any case, there has to be some sort of mechanism in place that ensures complete tags (closing tag or not) are preserved. Also, each incoming message ends up rendered as a separate <div> in the console pane of the UI, and that could lead to some formatting problems without something to tell the client ‘start here’ and ‘end there’.
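One possible mechanism for the split-tag problem, sketched under a couple of assumptions: the markup uses `<` and `>` as tag delimiters, and any literal `<` in message text is escaped before it hits the wire. The `TagSafeBuffer` class is hypothetical, and in the real client this logic would live in JavaScript; Python is just for illustration. It holds back a trailing, unfinished tag until the rest of it shows up in a later event.

```python
class TagSafeBuffer:
    """Reassemble text chunks so a tag is never handed on in two pieces."""

    def __init__(self) -> None:
        self._pending = ""  # start of a tag whose '>' has not arrived yet

    def feed(self, chunk: str) -> str:
        """Accept one websocket message and return the part that is safe to render."""
        data = self._pending + chunk
        last_open = data.rfind("<")
        if last_open != -1 and data.find(">", last_open) == -1:
            # The final '<' has no matching '>' yet: keep it for the next chunk.
            self._pending = data[last_open:]
            return data[:last_open]
        self._pending = ""
        return data
```

This only protects individual tags. It does not solve the ‘start here’ and ‘end there’ problem for whole messages, which would still need some kind of explicit delimiter in the stream.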
I’m not gonna worry too much about any of that right now, though. The real purpose of this experiment is to familiarize myself with some new technologies that I might want to employ. Specifically I’m looking at stuff like Google Closure and AngularJS, and how they might fit into what I’m trying to build. Google Closure has some cool stuff for maybe building the networking layer, backend services, what have you, but the web client is also going to have these rich editors for building new game content. Unlike the game stream UI, these editors are a much more “webby” sort of experience, where nothing is really happening outside of what you see on the page until you click a ‘commit’ button to send your changes down to the server. The data binding stuff in AngularJS seems like it could come in handy here. I’m not sure how well the two libraries play together, though.
I’d also like to find a UI toolkit, maybe, for drawing out the more structural stuff like tab boxes, list boxes, trees, menus, overlays, tooltips, etc. I’m used to using ZK, where that’s all part of the framework and I don’t need to worry about writing my own HTML content. Again, this is probably more for the content editors. I’m also looking into doing some fancy WebGL stuff for drawing maps of rooms and stuff. You won’t really be able to build anything without some sort of editor anymore, so the user experience there needs to be good. There will be CLI editors in addition to the web ones, but I suspect most people will prefer the web.
Speaking of editors, one challenge concerning messages I didn’t mention is how to deal with people writing message strings that work for people using ANSI-enabled telnet, vanilla telnet, and the web client, without a bunch of expensive parsing and replacement every time a message is output. The obvious solution is to just maintain three different versions of the string internally and use whichever one is needed based on user settings. Forcing everyone to write every message three times is terrible, though, so ideally you want the editors to handle all that for you. The web editor can be more-or-less WYSIWYG, so specifying something like “make this text red” is a lot more intuitive than in the CLI.
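Here is a minimal sketch of the “three versions” idea, with the parsing done once when the message is saved rather than every time it is output. Everything in it is an assumption for the sake of the example: the `<red>`-style source markup, the color table, and the `CompiledMessage` and `compile_message` names. On the MUD itself this would live in the mudlib, not in Python, and the web rendering would also need to escape the message text.

```python
import re
from dataclasses import dataclass

ANSI_COLORS = {"red": "\x1b[31m", "green": "\x1b[32m", "blue": "\x1b[34m"}
ANSI_RESET = "\x1b[0m"

# Matches an opening tag such as <red> or a closing tag such as </red>.
TAG_RE = re.compile(r"<(/?)(\w+)>")


@dataclass(frozen=True)
class CompiledMessage:
    plain: str  # vanilla telnet
    ansi: str   # ANSI-enabled telnet
    web: str    # web client


def _to_ansi(match: re.Match) -> str:
    closing, name = match.group(1), match.group(2)
    if name not in ANSI_COLORS:
        return ""  # unknown tags are dropped
    return ANSI_RESET if closing else ANSI_COLORS[name]


def _to_web(match: re.Match) -> str:
    closing, name = match.group(1), match.group(2)
    if name not in ANSI_COLORS:
        return ""  # unknown tags are dropped
    return "</span>" if closing else f'<span class="{name}">'


def compile_message(source: str) -> CompiledMessage:
    """Parse the source markup once and store all three renderings."""
    return CompiledMessage(
        plain=TAG_RE.sub("", source),
        ansi=TAG_RE.sub(_to_ansi, source),
        web=TAG_RE.sub(_to_web, source),
    )
```

At output time the driver would only need to pick the field matching the user’s settings, e.g. `compile_message("The <red>dragon</red> roars.").ansi`, with no parsing or replacement involved.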
Not sure if I’m going to tackle the Closure and AngularJS stuff now or if I’m gonna switch back to the mudlib for a while. Porting my old client over was a little more effort than I expected and I didn’t even bring over any of the point-and-click stuff. There are also some CLI concepts I need to finish fleshing out that may have ramifications for the GUI. I’ll write about it more in another post, but one thing I’ve built into the object parsing library is this idea of a player’s “current context” that changes as they interact with different objects inside the game. The GUI also has something like that, though it has more to do with how point-and-click navigation works than parsing object references out of strings. Much more to do.