TELNET had some good things going for it — a local-echo mode and a well-defined network virtual terminal. Then SSH came along and added minor enhancements like confidentiality and authentication, at the cost of losing the local-echo mode and the well-defined terminal semantics.
Terminal emulators are a hard sell. I mean, yeah, we want that functionality, but at the cost of selling our souls? OK, if not our souls, how about our sanity?
Ever had a misbehaving program freak out and corrupt your display? Or maybe you’d cat the wrong binary, then suddenly f?Kfjd!ܾ!j?sRc?CH?U^, it looks like your mom picked up the phone during your BBS session, and you have to type ‘reset’ to fix it. It’s not world-ending, but it is irritating.
As amusing as the telnet quote is, it’s right. Local echo is nothing if not immediate. We get feedback. Granted, in the case of telnet, that feedback doesn’t mean a whole lot, but still, it’s nice to be able to type and not see the letters show up several seconds later just because you’re typing into a terminal from 35,000ft using an internet connection that cost you the price of a month of dial-up back in the days when mom could interrupt your session of Legend of the Red Dragon.
Enter Mosh, a new take on terminal emulators.
Mosh uses a client/server model. You run it almost exactly like you would ssh, with mosh in place of ssh on the command line.
It actually connects over ssh, then runs mosh-server on the other end. That’s when things get a little strange.
Mosh actually performs its transmissions via encrypted UDP datagrams. In the spirit of the heathen connectionless-UDP-protocols, it also doesn’t try to deliver each and every individual packet that the server spits out – so when you screw up and run find /, you don’t have to wait 10 minutes for your futile ctrl-c keystrokes to get to the server. Plus, and maybe most damning, it allows you to close your laptop, get off the plane, go into the terminal, pay ANOTHER $20, and connect back in immediately, all without running a screen session. Clearly, this is the work of Satan.
In all seriousness, though, this is pretty handy. What they actually do is, instead of treating the terminal like a stream (which is what every command line terminal emulator ever has done), they treat it like a state machine, and the network transmissions are all about bringing the client state into sync with the server state. That’s a cool idea.
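To make the stream-versus-state idea a bit more concrete, here is a toy sketch in Python. It is not Mosh's actual protocol or wire format; the Server and Client classes, the datagram layout, and the version counter are all made up for illustration. The only point is that the server sends one diff to its latest screen state rather than replaying every intermediate write.

```python
# Toy model of state synchronization. Hypothetical names throughout;
# this is NOT how Mosh is implemented, just the general idea.

class Server:
    def __init__(self):
        self.version = 0
        self.screen = {}  # row number -> text currently on that row

    def write(self, row, text):
        """A program on the server changes one row of the screen."""
        self.screen[row] = text
        self.version += 1

    def diff_against(self, known_screen):
        """Build a single datagram that brings a client holding
        known_screen up to the server's current state."""
        changed = {row: text for row, text in self.screen.items()
                   if known_screen.get(row) != text}
        return {"version": self.version, "rows": changed}


class Client:
    def __init__(self):
        self.version = 0
        self.screen = {}

    def apply(self, datagram):
        """Apply a diff, ignoring datagrams older than our current state
        (UDP can deliver late or out of order)."""
        if datagram["version"] <= self.version:
            return
        self.screen.update(datagram["rows"])
        self.version = datagram["version"]


server, client = Server(), Client()

# Something like a runaway `find /` rewrites the same row 1000 times.
for i in range(1000):
    server.write(0, f"/some/path/file{i}")

# None of the 999 intermediate states are sent, only the newest one.
datagram = server.diff_against(client.screen)
client.apply(datagram)

# A stale datagram that shows up late is simply dropped.
client.apply({"version": 5, "rows": {0: "stale output"}})
```

With a stream, the client would have to receive all 1000 lines in order before it caught up. Here a lost or skipped datagram doesn't matter, because the next one still describes the current state.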
So, after I wrote this, I spun up an Ubuntu server in the HP Cloud, and started playing with it. Very interesting.
The first thing I noticed was how confident the client is that the server received what I typed. I noticed this because mosh shows its confidence by underlining the words as I type them. For instance, when I type
echo "Four score and seven years ago our fathers brought forth on this continent"
as I type that, the underline “chases” my cursor. When the underline disappears, the client is 100% confident that the server and the client are in the same state.
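Here is a rough Python sketch of what that "chasing" underline could look like from the client's side. This is my guess at the general shape, not Mosh's real prediction logic: the PredictiveEcho class and its method names are hypothetical, and I'm assuming the server's confirmation can be boiled down to a count of characters it has echoed back.

```python
# Toy model of local echo with an "unconfirmed" underline.
# Hypothetical names; NOT Mosh's actual prediction engine.

UNDERLINE_ON = "\x1b[4m"    # ANSI escape: start underline
UNDERLINE_OFF = "\x1b[24m"  # ANSI escape: stop underline


class PredictiveEcho:
    def __init__(self):
        self.typed = ""      # everything the user has typed locally
        self.confirmed = 0   # how many characters the server has echoed

    def key(self, ch):
        """Show the keystroke right away, before the server replies."""
        self.typed += ch

    def server_echoed(self, count):
        """The server confirms the first `count` characters.
        Confirmation never moves backwards or past what was typed."""
        self.confirmed = max(self.confirmed, min(count, len(self.typed)))

    def split(self):
        """Return (confirmed text, still-unconfirmed text)."""
        return self.typed[:self.confirmed], self.typed[self.confirmed:]

    def render(self):
        """Draw confirmed text plainly and the rest underlined."""
        sure, guess = self.split()
        if not guess:
            return sure
        return sure + UNDERLINE_ON + guess + UNDERLINE_OFF


echo = PredictiveEcho()
for ch in "echo hi":
    echo.key(ch)

echo.server_echoed(4)           # server has caught up to "echo"
halfway = echo.split()

echo.server_echoed(7)           # server has caught up to everything
done = echo.split()
```

The underlined part is whatever the client has drawn on its own without hearing back yet, so it shrinks as confirmations arrive and vanishes once the two sides agree.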
Interestingly, though, when I paste in that same string, the underline is nonexistent. At least, I don’t see it at all. I’m on a Mac, and I’m using Terminal.app (yeah yeah, I know). I’m not sure if there’s an intrinsic difference in how Terminal.app handles pastes vs keyboard inputs, but the server becomes immediately aware when I paste, but not when I type. Odd.
Also interesting is the output of ps:
10512 ? S 0:00 mosh-server new -s -c 256
10513 pts/1 Ss+ 0:00 \_ -/bin/bash
10614 pts/1 S+ 0:00 \_ screen -T screen-256color ...
10633 ? Ss 0:00 \_ SCREEN -T screen-256color ...
10708 pts/2 Ss 0:00 \_ /bin/bash
10870 pts/2 R+ 0:00 \_ ps axf
All of the screen stuff is normal Ubuntu 10.10 cruft, but you can see that even though Mosh does use ssh to start up the connection, it drops that ssh session immediately.
I’ll run packet captures if anyone is curious, but if you’re that curious, you should just play with it yourself! Just remember to open UDP ports 60000-61000 to the machine on your firewall (which I almost forgot, natch). Let me know what you think of it in the comments below!