Saturday, November 3, 2007

Cheap event handling

Websites, whether anyone realizes it or not, are basically event-driven. Request comes in, page comes out. Blogs go in, RSS goes out. Nothing changes until someone clicks the link, uploads the blog post, submits the form.

Because pages are the only thing a user consciously requests, websites were traditionally organized on a page-by-page basis. The site was nothing without the pages. But pages make all kinds of decisions about what is going to occur on them. What forms are available, what information is displayed, the groupings of necessary data.

In a large-scale system like Treemo, there is much more than pages to worry about. In fact, presenting the actual web pages represents maybe 30% of the logic in the system. Deciding what to do is the real problem. Is this a request for an HTML page? An XHTML page? An SMS message? An RSS feed? A picture? An account upgrade? Once we've decided on what's being asked for, objects get inflated from the database and something happens. Then what?

Well, all kinds of things. Things happening in the system have side-effects. A user buying a paid account has their quota lifted. A photo being uploaded causes notifications to be sent. A site message being sent causes another user's inbox to have a new message.

The list is very long. Any request- or event-driven system is basically built on these foundations. Something Has Happened, and Important Classes Need To Know. OO GUI frameworks took this to its logical conclusion. Dozens of event classes from AncestorEvent to UndoableEditEvent. Then classes can attach themselves to controls as listeners of these events, and be guaranteed that anytime the event occurs, the class will be notified.

On the web, for some reason, I don't see this paradigm much. One trouble spot for us as a PHP shop is that we're guaranteed that class state is NOT maintained across requests. This is different from a native GUI application, in which all the classes are sitting in memory, so when the event fires, the listening object is already there. Also, the list of listeners is easily maintained as an in-memory data structure. Not so much in PHP. In PHP you have to back your objects with databases, and listener structures are not so simple to generically store.

Of course, if you were really clever, you would use an ORM tool like DB_DataObject, but we aren't quite so fortunate (also DB_DataObject is really slow, and DBDO isn't ready yet). With or without ORM, you still have to have tables, how would that look?

The first time we encountered this problem we had, as usual, far too little time to think about it, and the primary concern we had was that it be asynchronous. At the time, this need was driven by the idea that we needed to perform regular billing events for paid users. Subscription billing is generated by time elapsing, so there needed to be a way to schedule future events.

Unfortunately, we designed a system that was Just Good Enough, and it has lasted all this time. It's fully asynchronous, which creates big problems in a dynamic system. State changes are very rapid, and being sure you placed EVERY piece of data EXACTLY in right the queue is too much detail to think about, and fails silently. Here's how it looks now:

$extra = array(
'message' => $_POST['message'],
'from' => $request->viewer->user_id
);
$evt = new event('share', $extra);
$evt->fire();

The fire() method inserts a row in the database containing the type, the extra array, and the timing data (when to process, etc.) A daemon then consumes the rows, and initiates handlers, which look something like this:

class shareHandler extends eventHandler
{
/**
* Handle a share event.
*
* extraData array:
* 'to' => Array of Email address or phone numbers. Required.
* 'from' => The user_id of the sender. Required.
* 'what' => String, url of the page that is being shared.
* 'message' => Text of the message. Defaults to _____.tpl
*/
public function handle() {
...
}

So this works fairly well, but the linchpin is that twiddly extraData bit. What happens when the caller fails to place the right data in there? Well, nobody will know it until the event daemon actually picks it off the queue, but that happens minutes or even hours after the request which generated the event occurred.

What we'd like to happen is for the caller to know a lot more about what's going on. If the user sharing content types in an invalid phone number, for example, it would be best if s/he could be notified of it immediately. So one simple way to do that for us is to move the event processing right into the request processing and do away with the daemon.

Much like MDB2, the event processor could be represented as a singleton object that all the classes could send events to - which it technically is now anyway. So, you want to share something? Try it this way:

$evt = new ShareEvent( $_POST['message'], $request->viewer->user_id );
$evtProc = EventProcessor::singleton();
$result = $evtProc->fire( $evt );

This fixes two of the major problems with the existing events. 1) Event parameters can be type-hinted and required in the constructor, because in this scheme events are typed, rather than handlers. 2) Notice how you get a $result back from the event firing? The request processor can actually deal with the results of the event's handling! The event processing itself can even throw exceptions if it needs to! This is much better.

Of course, you might not even need the singleton object, depending on your model. We're going to need it for tracking purposes, but at its core this pattern is no different than a good object model for system events. To do away with the singleton, just have your Event interface expose the fire() method. Or even if you need central points of logic for event processing, have the abstract event() class implement fire() and call parent::fire() in your extenders. This has lots of room for flexibility.

For more on events, and the observer pattern specifically, see the page on The Observer Pattern at ZDZ.

No comments: