Monday, November 30, 2009

Closure Templates and Node.JS: Server-Side Soy

Another NodeJS experiment: using Google's Closure Templates with NodeJS.

Closure Templates (aka Soy) can be used on either the server or the client. The recommended way to use them server-side is with SoyTofu, but there is a way to use them in pure JavaScript on the server with our new pal, NodeJS.

Suppose we have this blog.soy template to render a simple blog post with some comments:

{namespace blog}

/**
 * Renders a post with comments.
 * @param post
 * @param comments
 */
{template .postPage}
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">

<html lang="en">
<head>
 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
 <title>blog</title>
 <meta name="generator" content="TextMate http://macromates.com/">
 <meta name="author" content="Sean McCullough">
 <!-- Date: 2009-11-28 -->
</head>
<body>
{call .post}
  {param post: $post /}
{/call}
{call .comments}
  {param comments: $comments /}
{/call}
</body>
</html>
{/template}

/**
 * Renders a Post.
 * @param post
 */
{template .post}
<h2>{$post.title}</h2>
{$post.body}
{/template}

/**
 * Renders a list of comments.
 * @param comments
 */
{template .comments}
<h3>Comments:</h3>
<ul>
  {foreach $comment in $comments}
 <li>{$comment}</li>
  {/foreach}
</ul>
{/template}

The Closure template compiler will take this input .soy file and create an output .js file that contains functions corresponding to the {template .functionName} sections above.

To compile:
java -jar SoyToJsSrcCompiler.jar --outputPathFormat templates-compiled/blog.js templates/blog.soy


The generated functions in templates-compiled/blog.js look like this:

blog.post = function(opt_data, opt_sb) {
  var output = opt_sb || new soy.StringBuilder();
  output.append('<h2>', soy.$$escapeHtml(opt_data.post.title), '</h2>', soy.$$escapeHtml(opt_data.post.body));
  if (!opt_sb) return output.toString();
};
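To see the contract the generated code expects, here's a minimal, hypothetical stand-in for the two pieces of soyutils.js it touches. soy.StringBuilder and soy.$$escapeHtml are the real names from the library, but the implementations below are just sketches, not the actual library code:

```javascript
// Sketch of the StringBuilder contract: variadic append(), toString() joins.
var soy = {};
soy.StringBuilder = function () { this.parts = []; };
soy.StringBuilder.prototype.append = function () {
  for (var i = 0; i < arguments.length; i++) this.parts.push(arguments[i]);
  return this;
};
soy.StringBuilder.prototype.toString = function () { return this.parts.join(''); };
// Sketch of HTML escaping -- the real soyutils version is more thorough.
soy.$$escapeHtml = function (s) {
  return String(s).replace(/&/g, '&amp;').replace(/</g, '&lt;')
                  .replace(/>/g, '&gt;').replace(/"/g, '&quot;');
};

// The generated blog.post from above, unchanged:
var blog = {};
blog.post = function (opt_data, opt_sb) {
  var output = opt_sb || new soy.StringBuilder();
  output.append('<h2>', soy.$$escapeHtml(opt_data.post.title), '</h2>',
                soy.$$escapeHtml(opt_data.post.body));
  if (!opt_sb) return output.toString();
};

blog.post({ post: { title: 'Hello', body: 'World & more' } });
// → '<h2>Hello</h2>World &amp; more'
```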

Now, if you just try to require() this generated .js file, Node will complain because it doesn't know what soy.StringBuilder is. We can fix that by shoehorning soyutils.js into Node, of course.

First we need to make soyutils.js work with Node's require mechanism. require() exposes whatever gets attached to exports, and process.mixin() copies one object's properties onto another, so you make soy require()-able by adding this to the bottom of soyutils.js (copied into your application code directory from the Closure Templates distribution):

process.mixin(exports, soy); 

Then we need to require soyutils in the generated blog.js file. (You can just paste these lines at the bottom of the file, but it's probably better to implement this as a post-soy-compile step in a build script so you don't have to keep pasting every time you recompile the template.)

var soy = require('../soyutils');

process.mixin(exports, blog);

That last process.mixin call will make the blog template functions available to other source files via require.
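For the curious, process.mixin is essentially a shallow property copy. Here's a rough sketch of the idea (mixin below is a hypothetical stand-in, not Node's actual implementation, which also supports deep merging):

```javascript
// Hypothetical sketch: copy every own property of source onto target,
// which is roughly what process.mixin(exports, blog) does.
function mixin(target, source) {
  for (var key in source) {
    if (Object.prototype.hasOwnProperty.call(source, key)) {
      target[key] = source[key];
    }
  }
  return target;
}

var blog = { post: function () { return '<h2>hi</h2>'; } };
var exportsLike = {};
mixin(exportsLike, blog);
exportsLike.post();  // → '<h2>hi</h2>'
```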

Now we're ready to use the Soy template with our NodeJS server code. You'd just require templates-compiled/blog.js and call the functions it provides from within your event handlers (again, building on the blogging example from a previous post):

var sys = require("sys"), http = require("http"),
  blogTemplates = require("./templates-compiled/blog");

var handlers = {
  '/posts/{postId}' : {
      GET : function(request, response, args) {
        response.sendHeader(200, {"Content-Type": "text/html"});
        var commentsPromise = getCommentsPromise(args.postId);
        var postPromise = getPostPromise(args.postId);
        var templateVars = {};
        commentsPromise.addCallback(function(comments) {
          templateVars.comments = comments;
        });
        postPromise.addCallback(function(post) {
          templateVars.post = post;
        });

        var joinedPromise = join([commentsPromise, postPromise]);
        
        joinedPromise.addCallback(function() {
          var pageHtml = blogTemplates.postPage(templateVars);
          response.sendBody(pageHtml);
          response.finish();
        });
      }
    }
};

This is awfully clunky. I'd like to write a directory watcher that automatically compiles recently updated .soy files, appends the require and process.mixin calls, and reloads the result into Node.

Thoughts on Closure and NodeJS


I spent a little time trying to get the Closure Compiler to work with Node so that you could, for instance, statically verify that the template function invocation parameters match up with the declared parameter types in the .soy file. I haven't gotten enough working there to blog about yet, though.

I don't know if the closure compiler optimizations would help NodeJS much, but the static analysis would probably help catch a lot of easy-to-introduce but too-tedious-to-unit-test problems that crop up when you have lots of people working on the same code base.

Also, the Closure Library contains a lot of useful packages that could be applied server-side as well.

Wednesday, November 25, 2009

Joining Promises for Parallel RPCs in Node.JS

I've been playing around a bit more with Node.JS since my last post and I decided to experiment with the asynchronous process.Promise object this time around. In other languages I believe this concept is sometimes referred to as a future.

Diving in, suppose the following:

  • You're writing a blogging engine.
  • Blog Posts are kept in one data store, and Comments are in another.
  • A request for a Post object takes 1 second to return.
  • A request for a list of Comments on a Post takes 2 seconds to return (the comments data store is run by a bunch of slackers who don't care about latency).
  • You want to have /posts/{postId} return an html page that renders both a Post and all the Comment objects on it.

When you make an RPC (or any I/O call) in Node.JS you should wrap it in a Promise so the process doesn't block on your HTTP request.

So our getPost and getComments RPCs (faked out) look like this:

var getCommentsPromise = function(postId) {
  var promise = new process.Promise();
  var comments = ["Comment 1 on " + postId, "Comment 2 on " + postId];
  setTimeout(function() { promise.emitSuccess(comments); }, 2000);
  return promise;
};

var getPostPromise = function(postId) {
  var promise = new process.Promise();
  setTimeout(function() { promise.emitSuccess({title: "Post Title " + postId, body: "Post Body " + postId}); }, 1000);
  return promise;
};
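If you haven't used process.Promise before, the contract the fakes above rely on is small: addCallback registers a listener, and emitSuccess invokes every registered listener with the result. A hypothetical stand-in illustrating just that contract (FakePromise is not a real Node object):

```javascript
// Minimal sketch of the promise contract the fake RPCs rely on.
function FakePromise() { this.callbacks = []; }
FakePromise.prototype.addCallback = function (cb) {
  this.callbacks.push(cb);
  return this;
};
FakePromise.prototype.emitSuccess = function (value) {
  for (var i = 0; i < this.callbacks.length; i++) this.callbacks[i](value);
};

var p = new FakePromise();
var got;
p.addCallback(function (v) { got = v; });
p.emitSuccess('hello');  // got === 'hello'
```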

Now, if all you had to render on /posts/{postId} was the Post object and not the comments, you could just put the rendering code inside the handler for the Post RPC and be done with it, like so (building on the URI template router from my last post):
var handlers = {
  '/posts/{postId}' : {
      GET : function(request, response, args) {
        response.sendHeader(200, {"Content-Type": "text/html"});
        var postPromise = getPostPromise(args.postId);
        var templateVars = {};

        postPromise.addCallback(function(post) {
          templateVars.post = post;
          var pageHtml = postTemplate(templateVars);
          response.sendBody(pageHtml);
          response.finish();
        });
      }
  }
};


But life is never that simple, and /posts/{postId} has to make two RPCs to get the data required to render a page. This is complicated because you can't render the page until both RPCs are complete.

There are at least two ways to deal with this situation. One sucks and the other doesn't suck as much.

Teh Suck: Serialize the RPCs, then render.

You can serialize the RPCs by nesting the call to the second one inside the handler for the first:

'/slowposts/{postId}' : {
      GET : function(request, response, args) {
        response.sendHeader(200, {"Content-Type": "text/html"});
        var postPromise = getPostPromise(args.postId);
        postPromise.addCallback(function(post) {
          var commentsPromise = getCommentsPromise(args.postId);
          commentsPromise.addCallback(function(comments) {
            var postTemplate = tmpl['post-template.html'];            
            var pageHtml = postTemplate({'post': post, 'comments': comments});
            response.sendBody(pageHtml);
            response.finish();
          });
        });
      }
    }

This takes 3 seconds to complete: 1 for fetching the post, then 2 more for fetching the comments.

Teh Not So Suck: Parallelize the RPCs, join them and render when the join is complete.

'/fasterposts/{postId}' : {
      GET : function(request, response, args) {
        response.sendHeader(200, {"Content-Type": "text/html"});
        var commentsPromise = getCommentsPromise(args.postId);
        var postPromise = getPostPromise(args.postId);
        var templateVars = {};
        commentsPromise.addCallback(function(comments) {
          templateVars.comments = comments;
        });
        postPromise.addCallback(function(post) {
          templateVars.post = post;
        });

        var joinedPromise = join([commentsPromise, postPromise]);
        
        joinedPromise.addCallback(function() {
          var postTemplate = tmpl['post-template.html'];            
          var pageHtml = postTemplate(templateVars);
          response.sendBody(pageHtml);
          response.finish();
        });
      }
    }

This takes 2 seconds to complete since the RPCs are made in parallel, and the total time is just the slowest RPC (2 for fetching comments).

This method takes a special function, join(), to make it work. join() takes a list of promise objects and returns another promise that fires once all of them have completed:

function join(promises) {
  var count = promises.length;
  var p = new process.Promise();
  for (var i=0; i<promises.length; i++) {
    promises[i].addCallback(function() {
      if (--count == 0) { p.emitSuccess(); }
    });
  }
  
  return p;
}
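Here's the counting logic in action against a hypothetical stand-in for process.Promise (FakePromise below is not a real Node object, just enough to exercise join()):

```javascript
// Hypothetical stand-in for process.Promise: register callbacks, fire them all.
function FakePromise() { this.callbacks = []; }
FakePromise.prototype.addCallback = function (cb) {
  this.callbacks.push(cb);
  return this;
};
FakePromise.prototype.emitSuccess = function (value) {
  for (var i = 0; i < this.callbacks.length; i++) this.callbacks[i](value);
};

// join() as in the post, but built on the stand-in.
function join(promises) {
  var count = promises.length;
  var p = new FakePromise();
  for (var i = 0; i < promises.length; i++) {
    promises[i].addCallback(function () {
      if (--count === 0) { p.emitSuccess(); }
    });
  }
  return p;
}

var a = new FakePromise(), b = new FakePromise();
var done = false;
join([a, b]).addCallback(function () { done = true; });
a.emitSuccess('post');      // count drops to 1, done is still false
b.emitSuccess('comments');  // count hits 0, done is now true
```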

Note that this example ignores stuff like errors, which makes things even more complicated. What should join() do when one of the promise objects fires an error instead of a success? Probably a good topic for another post in the future.

Also, I've been using Jed Schmidt's tmpl-node engine to render html in this example. Templating in Node.JS appears to be an active area of debate, but this one works fine for my purposes.

Note that one could also parallelize the rendering of the template as well, so the postPromise handler renders the html for the Post while commentsPromise is fetching/rendering comments. Then the join handler would stitch together the final html.

Sunday, November 22, 2009

Request Routing With URI Templates in Node.JS

I've been playing around with node.js, an asynchronous JavaScript server built on V8.

Node.js itself is pretty bare bones.  It's not a framework like Rails, but rather plain request-response handling.  It's sort of like Python's Twisted framework, from what I gather.

There are more full-featured frameworks for node.js if you look around on github, but I'm bored and feel like committing the sin of writing yet more framework code.

This morning I started with a request router for Node.JS that leverages URI templates*.  You specify the application as a series of request templates, paired with the functions that handle them.

For instance, if you take the typical blogging application example, you might have a path like /posts/1234, and the URI template would look like /posts/{postId}.

The magic is in turning {param}s in the URI template into parameters in the handler call.
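That conversion can be sketched as a small standalone helper. compileTemplate is a hypothetical name, and like the router further down it handles only the simple {param} case, not the full URI Template spec (though unlike the router, this sketch anchors the regex so the whole path must match):

```javascript
// Sketch: compile a URI template into a function that extracts params.
function compileTemplate(uriTemplate) {
  var names = [];
  // Swap each {param} for a capture group, remembering the bare name.
  var pattern = uriTemplate.replace(/{([^}]+)}/g, function (match, name) {
    names.push(name);
    return '([^/?&]+)';
  });
  var matcher = new RegExp('^' + pattern + '$');  // anchored: whole path must match
  return function (path) {
    var m = matcher.exec(path);
    if (!m) return null;
    var params = {};
    for (var i = 0; i < names.length; i++) params[names[i]] = m[i + 1];
    return params;
  };
}

var parse = compileTemplate('/posts/{postId}');
parse('/posts/1234');  // → { postId: '1234' }
```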

Here's an example app that routes blog-like requests:

var handlers = {
  '/posts/{postId}' : {
      GET : function(postId) {
        this.response.sendHeader(200, {"Content-Type": "text/plain"});
        this.response.sendBody("GET Post ID: " + postId);
        this.response.finish();
      },
      POST : function(postId) {
        this.response.sendHeader(200, {"Content-Type": "text/plain"});
        this.response.sendBody("POST Post ID: " + postId);
        this.response.finish();        
      }
  },
  '/comments/{postId}/{commentId}' : {
      GET : function(postId, commentId) {
        this.response.sendHeader(200, {"Content-Type": "text/plain"});
        this.response.sendBody("GET Post ID: " + postId + " Comment ID: " + commentId);    
        this.response.finish();
      }
  }
};

As you can see, the individual handler functions are further distinguished by HTTP method.

I'm not sure how to pass POST bodies to the handlers. They could just be attached to the handler's this I suppose.

Here's the full source:

var sys = require("sys"), http = require("http");

var handlers = {
  '/posts/{postId}' : {
      GET : function(postId) {
        this.response.sendHeader(200, {"Content-Type": "text/plain"});
        this.response.sendBody("GET Post ID: " + postId);
        this.response.finish();
      },
      POST : function(postId) {
        this.response.sendHeader(200, {"Content-Type": "text/plain"});
        this.response.sendBody("POST Post ID: " + postId);
        this.response.finish();        
      }
  },
  '/comments/{postId}/{commentId}' : {
      GET : function(postId, commentId) {
        this.response.sendHeader(200, {"Content-Type": "text/plain"});
        this.response.sendBody("GET Post ID: " + postId + " Comment ID: " + commentId);    
        this.response.finish();
      }
  }
};

var Route = function(uriTemplate) {
 this.uriTemplate = uriTemplate;
 var nameMatcher = new RegExp('{([^}]+)}', 'g');
 
 this.paramNames = this.uriTemplate.match(nameMatcher);
 // match() with a global regex returns the full matched text (braces
 // included), not the capture groups, so strip the braces here.
 for (var i = 0; i < this.paramNames.length; i++) {
  this.paramNames[i] = this.paramNames[i].replace('{', '').replace('}', '');
 }

 this.matcherRegex = this.uriTemplate.replace(/\?/g, '\\?').replace(/{([^}]+)}/g, '([^/?&]+)');
 this.matcher = new RegExp(this.matcherRegex);
};

Route.prototype.parse = function(path) {
 if (this.matcher.test(path)) {
  var result = {};
  var paramValues = this.matcher.exec(path);
  // assert: paramValues.length == paramNames.length
  for (var i = 1; i < paramValues.length; i++) {
      result[this.paramNames[i-1]] = paramValues[i];
    }
  return result;
 }
 return null; //throw exception?
};

http.createServer(function (request, response) {
   var handled = false;

   for (var pathTemplate in handlers) {
     var route = new Route(pathTemplate);
     var params = route.parse(request.uri.full);
     if (params) {
       // Convert the results to an array so we can pass them in via apply().
       var values = [];
        for (var name in params) {
         values[values.length] = params[name];
       }

       var handler = handlers[pathTemplate][request.method];
       // So you can call this.request and this.response in the handlers.
       handler.apply({'request' : request, 'response' : response}, values);
       handled = true;
     }
   }

   if (!handled) {
     response.sendHeader(404, {"Content-Type": "text/plain"});
     var output = "Couldn't route: " + request.uri.full + "\n";
     for (var name in request) {
       output += name + ": " + request[name] + "\n";
     }
     response.sendBody(output);
     response.finish();
   }
}).listen(8000);

sys.puts("Server running at http://127.0.0.1:8000/");

The route lookup in http.createServer could be a lot more efficient, for instance by memoizing the Route objects instead of compiling them on every request.
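The memoization could be as simple as keying compiled routes by their template string. A sketch with a stubbed-out Route (compileCount is only there to show the cache working; the real Route class above would slot in unchanged):

```javascript
// Stub standing in for the Route class from the router above.
function Route(uriTemplate) { this.uriTemplate = uriTemplate; }

var routeCache = {};
var compileCount = 0;

// Compile each template once; subsequent lookups reuse the cached Route.
function getRoute(uriTemplate) {
  if (!routeCache[uriTemplate]) {
    routeCache[uriTemplate] = new Route(uriTemplate);
    compileCount++;
  }
  return routeCache[uriTemplate];
}

getRoute('/posts/{postId}');
getRoute('/posts/{postId}');  // cache hit: no recompilation
// compileCount === 1
```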

Anyways, NodeJS looks pretty exciting. Combined with CouchDB you could have a full JavaScript application stack: from storage to app server to client.

*Yes, I realize this is not a full implementation of the URI template spec.  It's just a proof of concept.