Wednesday, September 12, 2007

Semi-modal offline support

My life on the go

When I'm not working from home I commute to the office on public transit. On the way, I read lots of books (ok, and other fun stuff too). I also use my laptop to read blogs (too many to list), check email and write code.

Aside: Installing Visual Studio on my laptop was probably the best thing I have ever done for my family life. If I can take my code with me then I can keep hammering away at bugs while I'm commuting instead of hanging around at the office endlessly. I'd rather be sitting on my patio with the WiFi anyway.

The problem is that despite my GPRS connection (yeah, I wish I'd waited and gotten the EVDO model) I do go underground four times during my commute. So connectivity is a bit spotty. Consequently I'm in the habit of loading up Firefox with tons of tabs so that I have plenty of reading material for the whole trip.

Google Reader

Recently I switched to using Google Reader for reading RSS feeds. (Previously I had been aggregating the feeds using LiveJournal.) Google Reader now has a nifty offline support feature. When you switch it to offline mode it will download "up to 2000 items" and save them to your local disk. But there are problems...

The items have to be downloaded all at once. If it takes too long and you lose your connection or cancel the download, tough bananas. (Ironically, I also found the installer for the offline support plugin to be very flaky. If you're not online when you run it, it hangs on startup and eventually errors out.)

In search of graceful degradation

So what I'm going to pick on today is the notion that offline support requires a discrete mode.

Internet connectivity is always at least a little unreliable. Sure, most of the time your web application can issue an XML/HTTP request to some service provider and get back a response in at most a few tenths of a second. But sometimes it will take much longer or it will fail. What should the application do?

So first off, forget about the statelessness of the web: that was only true back when we were serving up plain HTML documents. Today's web applications truly exhibit distributed state. The client has some of it and the server has the rest. Often client development is simplified by minimizing the amount of state it retains locally at the cost of responsiveness and robustness.

If you think this is contrived, think about all of the forms-based web applications you've used that are constantly doing round-trip calls to the server to submit changes made to the client's state to the server and figure out what to do next. Pretty horrible stuff! AJAX smooths the experience somewhat but the same kinds of things are going on underneath. It's all client-server.

So what happens when I pull the plug on my DSL connection while I'm writing a blog post? Well, Blogger periodically saves drafts to the server for backup purposes, and those saves stop. I can still make edits to the local state. Yet if my browser crashes for some reason, I'll lose them all because none of that state is persistent. (I'm speaking from experience here.)
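To make the alternative concrete, here is a minimal sketch of a "local first" draft save, written in Python for brevity rather than as browser code. It assumes a hypothetical `upload` callable that stands in for the server-side draft save and raises `ConnectionError` when the network is down. The function names and the JSON file layout are my own invention for illustration, not anything Blogger actually does.

```python
import json
import os
import tempfile
from typing import Callable


def write_draft(path: str, text: str, synced: bool) -> None:
    """Persist the draft atomically so a crash never leaves a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump({"text": text, "synced": synced}, f)
    os.replace(tmp_path, path)  # atomic rename over the old draft


def load_draft(path: str) -> dict:
    """Read back whatever draft survived on disk."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def save_draft(path: str, text: str, upload: Callable[[str], None]) -> bool:
    """Save locally first, then try the server. Returns True if the server has it."""
    write_draft(path, text, synced=False)  # the edit is safe before we touch the network
    try:
        upload(text)  # hypothetical server call; assumed to raise ConnectionError offline
    except ConnectionError:
        return False  # still on disk, flagged as unsynced for a later retry
    write_draft(path, text, synced=True)
    return True
```

The point of the ordering is that the network call can only ever upgrade the draft from "unsynced" to "synced". A failed upload, or a crash right after the first write, leaves the text recoverable from disk.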

Some years ago I read A Fire Upon the Deep. Vinge's universe places fundamental limits on the level of technological sophistication of different regions of the galaxy. At one point in the story there's a description of ships from the Beyond gracefully degrading as they enter the Slow Zone. Communication systems didn't fail outright but they compensated for narrowing bandwidth by reducing the quality. IIRC they adapted even to the point of transcoding conversations with talking heads and text-to-speech. That's the ideal...

The many shades of connectedness

Being offline means that there exists distributed state in a system that cannot be accessed or reconciled for an indefinite period. In between online and offline are many degrees of communication (un)reliability. Each degree of unreliability requires a corresponding measure of aggressiveness to tolerate. Here's a scale:

  • Perfect: The application's reliability, bandwidth and latency requirements are fully met. Packets are never dropped. The plug is never pulled. You could run a Telnet session over UDP with complete confidence. This world does not exist!
  • Typical: There are occasional problems but they are easily compensated for by routine error correction protocols with at most minor glitches.
  • Unreliable: It becomes harder to mask errors. An application can still generally assume the problems are transient but it may become practically unusable unless it does at least some local caching.
  • Sporadic: User state will be lost or left dangling in mid-air unless the application compensates with local state management.
  • Offline: I pulled the plug! The application is on its own with whatever client-side wits it has left. In many cases, this is the kiss of death for an application and the user has to resort to typing up notes in Notepad until connectivity is restored.
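One way an application might place itself on this scale is to watch how its recent requests have fared. The sketch below is only an illustration: the `Link` enum, the `classify` function and especially the 95% and 50% thresholds are arbitrary assumptions of mine, not measured values. "Perfect" is left out since that world does not exist.

```python
from enum import Enum
from typing import Sequence


class Link(Enum):
    TYPICAL = "typical"
    UNRELIABLE = "unreliable"
    SPORADIC = "sporadic"
    OFFLINE = "offline"


def classify(recent_ok: Sequence[bool]) -> Link:
    """Classify connectivity from recent request outcomes (True = succeeded).

    The thresholds are illustrative assumptions, not tuned numbers.
    """
    # No successes at all (or no evidence yet): pessimistically assume offline.
    if not any(recent_ok):
        return Link.OFFLINE
    success_rate = sum(recent_ok) / len(recent_ok)
    if success_rate >= 0.95:
        return Link.TYPICAL
    if success_rate >= 0.5:
        return Link.UNRELIABLE
    return Link.SPORADIC
```

An application could then scale its aggressiveness to match: ordinary retries when the link is typical, local caching when it is unreliable, and fully local state management when it is sporadic or offline.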

A web application is ultimately at the mercy of the network. It can be enhanced to support disconnected operation, as with Google Reader, but there are significant complications. Just the same, why treat disconnected access as a special mode? Why not think of it as a very aggressive caching strategy with robust client-side storage? And if that's the case, why not enable it all of the time?

In other words, why doesn't Google Reader always cache feed items to my local disk whenever it can? I could certainly tell it to prefetch more data because I'm going offline for an extended time. Moreover, if the synchronization process fails partway through, I should still be able to access whatever content I do have available. Why does offline support need to be modal? Why does Google Reader need to behave any differently from a desktop RSS client when disconnected from the network?
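Here is roughly what I mean, as a minimal Python sketch. It assumes a hypothetical `fetch` callable that yields feed items one at a time and raises `ConnectionError` when the connection drops. The table layout and function names are made up for illustration, and SQLite is just a convenient local store. This is not how Google Reader works internally.

```python
import sqlite3
from typing import Callable, Iterable, Iterator, List, Tuple

Item = Tuple[str, str]  # (item id, content)


def open_cache(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the local item cache."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS items (id TEXT PRIMARY KEY, body TEXT)")
    return db


def sync(db: sqlite3.Connection, fetch: Callable[[], Iterable[Item]]) -> int:
    """Cache items one by one; return how many were saved before any failure."""
    saved = 0
    try:
        for item_id, body in fetch():
            db.execute("INSERT OR REPLACE INTO items VALUES (?, ?)", (item_id, body))
            db.commit()  # commit per item so a dropped connection loses nothing
            saved += 1
    except ConnectionError:
        pass  # keep whatever we already have; the next sync can pick up the rest
    return saved


def read_all(db: sqlite3.Connection) -> List[Item]:
    """Reading never touches the network, so it works the same online or offline."""
    return db.execute("SELECT id, body FROM items ORDER BY id").fetchall()


def flaky_feed() -> Iterator[Item]:
    """Stand-in for a download that dies partway through."""
    yield ("a1", "first post")
    yield ("a2", "second post")
    raise ConnectionError("entered the tunnel")
```

Calling `sync(open_cache(), flaky_feed)` saves the two items that arrived before the failure, and `read_all` serves them regardless of connectivity. There is no separate offline mode here: reads always come from the cache, and the network only ever adds to it.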

Thinking about web-enabled applications

I used the word enabled in web-enabled to effect a fundamental change of perspective. The web today is an excellent content delivery platform. Just the same, it is poorly suited to application delivery. On the web today, application and content are tightly enmeshed with one another but in reality they're very different. When I go offline, I don't mind if some of my content becomes inaccessible but I would still like to be able to use my web-enabled application to manipulate local content. That can't work so long as the application actually lives in a faraway server farm and I'm just interacting with served-up views of its state when I'm online.

Ideally I want the web to deliver rich client applications whose code gets cached on my local machine. I want applications that blur the line between the desktop and the web (including conforming to some sensible UI conventions). These applications would be enabled by the web and tightly bound to network services but not shackled to them. They could stand on their own, albeit with reduced functionality. Ultra-portable applications: they run anywhere.

That's what I want!

Doom, death and destruction!

So who will be first to provide a compelling solution for lightweight rich-client application delivery over the web? Google Gears, Apollo, Silverlight or someone else?

1 comment:

Andy Stopford said...

Apollo and WPF (Silverlight is not really a good fit here) are adding concepts for occasionally connected devices. Both sit around a local micro database (SQLite in Apollo's case) that can background load data and allow you to switch between online and offline. The eBay app for Apollo showed this off, allowing a user to create items offline then go online to add them.

WPF would I expect likely use SQL Server CE and the new Sync framework -