sambroblog. A Blog. With words. And maybe pictures.

21 Jan 2012

Inotify + Node + FTP = Easy-mode remote dev work

Pre-ramble

Haven't updated my blog in a while, but I had a fun little win tonight that I thought I'd share with the lovely readers of this blog (you know I love all 3 of you).

I am currently working full-time at Wotif.com in Brisbane, and I'm extremely fortunate to be immersed in the weird and wondrous world of Groovy/Grails for the majority of application development I'm involved in. To make things even more awesome-er, we're currently working on a project that uses CouchDB/ElasticSearch as our primary data provider. Life is bliss. Well, mostly.

The Problem

I still have work come in occasionally from a long-time client I respect enough to give some time to on the weekend. Unfortunately, this work is PHP4 code most of the time. And no, the pain doesn't end there: usually the work is on a fairly large production Joomla 1.0 site. Needless to say, the technical debt flows freely.

Anyway, this particular site is a real pain to work on, as it's not really in a state where it can be run up locally without a fair amount of misery. I haven't done much work for this particular client in the last few months, so I didn't really have a proper LAMP stack running on my machine at home (nor do I really want one). A more recent project I'd done for this client last year was a Joomla 1.5 site, where I had the luxury of "doing it right": I had set the project up to easily be run up locally using some bash scripts, UnionFS, and a little bit of voodoo. But no such win was to be had for this Joomla 1.0 site.

The work I needed to perform on the Joomla 1.0 site was considerable enough that the prospect of firing up FileZilla and manually FTPing changes was unbearable. In the past I've used Eclipse with an obscure plugin called ESFTP to push my changes to the server as I develop. However, this still has an obnoxious manual step: clicking a button every time I want to push a file to the server.

I figured there had to be an easier way. Then I remembered seeing some cool inotify stuff in Node.js a while ago.

The Solution

I thought to myself: "Wouldn't it be cool if I had a little Node app running that monitored my project on the filesystem and FTP'd changes to the codebase as I made them?".

So I decided to cook something up. A couple of hours later I came up with this:

https://gist.github.com/1652663

I didn't end up using the libinotify bindings for Node.js, as they were a little too low-level for a quick prototype. The main pain point is that inotify isn't recursive, so you have to put together your own code that recursively creates watch descriptors for the directory structure, glues in new watch descriptors as new directories get created, and deletes old descriptors as directories disappear. I instead opted to use the awesome inotifywait tool (which comes from the inotify-tools package in Ubuntu), which handles all the un-fun parts of inotify and sends nice little status updates on stdout.

Oh, and I was getting bizarre issues with the node-ftp library from NPM, so I just grabbed the latest from the git repo and threw it in with the script.

So now I just fire up this script with the FTP details and paths. It just sits there patiently and creates/deletes directories as needed, and pushes file changes/deletions as they occur.
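For the curious, here's roughly what the inotifywait half of that idea can look like. This is a minimal sketch in plain JavaScript rather than CoffeeScript, and it is not the code from the gist: the function names (parseEventLine, decideAction, watch), the "|" separator in the format string and the list of events are all assumptions made for illustration. The FTP side is left out entirely; the callback simply gets told what should happen to which path.

```javascript
const { spawn } = require("child_process");
const readline = require("readline");

// Events to ask inotifywait for (an assumed selection, not taken from the gist).
const EVENTS = "close_write,create,delete,moved_to,moved_from";

// Parse one line printed by inotifywait when run with --format "%e|%w%f",
// e.g. "CREATE,ISDIR|/home/me/site/images".
function parseEventLine(line) {
  const sep = line.indexOf("|");
  if (sep === -1) return null;
  const flags = line.slice(0, sep).split(",");
  return {
    path: line.slice(sep + 1),
    isDir: flags.includes("ISDIR"),
    events: flags.filter((flag) => flag !== "ISDIR"),
  };
}

// Map a parsed event to the remote operation it implies.
function decideAction(event) {
  const has = (name) => event.events.includes(name);
  if (event.isDir) {
    if (has("CREATE") || has("MOVED_TO")) return "mkdir";
    if (has("DELETE") || has("MOVED_FROM")) return "rmdir";
    return null;
  }
  if (has("CLOSE_WRITE") || has("MOVED_TO")) return "upload";
  if (has("DELETE") || has("MOVED_FROM")) return "delete";
  return null; // e.g. a bare CREATE on a file: wait for CLOSE_WRITE instead
}

// Start inotifywait in monitor (-m), recursive (-r), quiet (-q) mode and
// call onAction(action, path) for every change that needs pushing.
function watch(root, onAction) {
  const args = ["-m", "-r", "-q", "-e", EVENTS, "--format", "%e|%w%f", root];
  const child = spawn("inotifywait", args);
  readline.createInterface({ input: child.stdout }).on("line", (line) => {
    const event = parseEventLine(line);
    const action = event && decideAction(event);
    if (action) onAction(action, event.path);
  });
  return child;
}
```

Something like watch("/path/to/project", (action, path) => console.log(action, path)) is enough to see the events roll in; swapping the console.log for FTP calls is where the real script does its work.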

Now this could definitely have easily been done using pretty much any language, but I think it's a pretty neat and elegant little CoffeeScript/Node solution :)

Speaking of Node/CoffeeScript, I've been working on a little project in all the spare time I can get. I'm excited to blog about some of the cool stuff I've found/done in that regard in the coming weeks!

That's all for now, internets.

27 Jun 2011

Listening for end of response with Node/Express.JS

I'm currently working with CoffeeScript, Node, Express, and Redis to deliver on a quick'n'easy contract I've been put in charge of. This is the first time I've used any of these technologies in a proper commercial deliverables type project, and I have to say, it's been an absolute delight.

One issue I ran into: I wanted to reduce the boilerplate on handling requests, so I wrote a quick route middleware in Express to create a client connection to Redis, assigning the connection to the request object for easy use. I also wanted to be clever and have this same middleware clean up after itself when the request ended. That is, I wanted the middleware to QUIT the Redis connection when the response had been sent.

Consulting the Express/Connect/Node docs yielded no clues as to how to do this. The closest hint I found was from the Node docs, indicating that an http.ServerResponse is a WritableStream. I noticed WritableStreams had a "close" event that is supposed to be emitted when the Stream is no longer writable. I assumed that calling .end() on a response would trigger this event, so my initial middleware looked like this:

setupRedisClient = (req, res, next) =>
	req.redisClient = require("redis").createClient()

	cleanup = =>
		console.log "it worked!"
		req.redisClient.quit()

	res.on "close", cleanup
	res.on "error", cleanup

	next()

I tried using this middleware in a route, and was sad to see that the event was not being triggered.

As a last resort I started digging through the Node source, and lo and behold! I found what I was looking for in lib/http.js. Turns out when you .end() your http response, it will emit a "finish" event.

Now my Redis middleware looks like so:

setupRedisClient = (req, res, next) =>
	req.redisClient = require("redis").createClient()

	cleanup = =>
		req.redisClient.quit()

	res.on "finish", cleanup
	res.on "error", cleanup

	next()

Incoming routes that need a Redis connection simply add this middleware, and hey presto! They can use req.redisClient to their heart's content. Once a response is sent back to the client, or an error occurs with the request, the Redis connection will be cleaned up automagically! Hurrah!
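If you want to see the same idea without CoffeeScript or Redis in the way, here's a minimal sketch in plain JavaScript. The names cleanupWhenDone, withClient and createClient are made up for this example, and listening for "close" as well as "finish" is an extra assumption beyond what the middleware above does: it is meant to cover a client hanging up before the response is finished. The guard just makes sure the cleanup only runs once, no matter how many of the events fire.

```javascript
// Run `cleanup` exactly once when the response is done, whichever of
// "finish" (response handed off), "close" (connection went away)
// or "error" happens first.
function cleanupWhenDone(res, cleanup) {
  let done = false;
  const once = () => {
    if (done) return;
    done = true;
    cleanup();
  };
  res.on("finish", once);
  res.on("close", once);
  res.on("error", once);
}

// Express-style middleware built on the helper. `createClient` is a
// hypothetical factory returning any object with a quit() method.
function withClient(createClient) {
  return (req, res, next) => {
    req.client = createClient();
    cleanupWhenDone(res, () => req.client.quit());
    next();
  };
}
```

With the redis module this would be used along the lines of withClient(() => require("redis").createClient()), but nothing in the helper is Redis-specific.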

5 Mar 2011

No.de coupon

Huzzah!

Got my no.de coupon in the mail a couple of days ago.

Now I just gotta figure out what I wanna host at http://sammeh.no.de/....

3 Mar 2011

Creating a proper Buffer in a Node C++ Addon

Despite the wordy title, it's actually a fairly simple problem, with a fairly simple solution.

Let's say you have some binary data you want to provide to JavaScript code running in Node. No problem, Node has Buffers for that. Digging through the Node.js source code, you find node_buffer.h, which promises a utopia of ObjectWrap goodness; you can even memcpy your binary data directly to it using Buffer::Data(bufferObject).

"Fantastic! I'll rock one of those buffers and simply return `bufferObject->handle_`!", I hear you exclaim. Not so fast, stud.

If the client were to use this Buffer, they'd get a nasty surprise. It's not a Buffer. You see, Node.js has re-implemented Buffers since 0.2. The Buffer you're playing with from node_buffer.h is actually a SlowBuffer. As the name implies, it works directly on the heap-allocated memory chunk, so a lot of operations on it are quite inefficient. Worse still, the interface provided by SlowBuffer is actually different from the one in the Node.js documentation. Allow me to explain.

The Buffer you're used to dealing with from Node.js user code actually originates from buffer.js. These Buffers are actually just "views" on a proper SlowBuffer, so operations like slicing are literally as quick as allocating a new Buffer object that views the SlowBuffer at a different offset and max length.
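You can see the "view" behaviour from plain JavaScript. A small caveat before the sketch: current Node versions build Buffers on top of Uint8Array rather than SlowBuffer, so this illustrates the concept and not the 2011 internals, and it uses subarray (the modern spelling of a non-copying slice). The variable names are just for the example.

```javascript
// A parent Buffer and a view onto bytes 6..10 of the same memory.
const parent = Buffer.from("Hello world!");
const view = parent.subarray(6, 11);

// Writing through the view changes the parent too: nothing was copied.
view[0] = 0x57; // ASCII "W"

console.log(parent.toString()); // Hello World!
console.log(view.toString()); // World

// The view is only an offset and a length over the parent's memory.
console.log(view.byteOffset - parent.byteOffset); // 6
console.log(view.length); // 5
```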

So how do you create one of these badboys from C++ to pass directly back to JS calling code? Glad you asked. Like so:

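	// Assumes <cstring>, <node.h> and <node_buffer.h> are included, and that
	// this snippet lives inside a function returning v8::Handle<v8::Value>.
	v8::HandleScope scope;

	// Some data we want to provide to Node.js userland code.
	// This can be binary of course.
	const char *data = "Hello world!";
	int length = strlen(data);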

	// This is the Buffer that actually makes heap-allocated raw binary
	// available to userland code.
	node::Buffer *slowBuffer = node::Buffer::New(length);

	// Buffer::Data gives us a yummy raw pointer to play with to our heart's
	// content.
	memcpy(node::Buffer::Data(slowBuffer), data, length);

	// Now we need to create the JS version of the Buffer I was telling you about.
	// To do that we need to actually pull it from the execution context.
	// First step is to get a handle to the global object.
	v8::Local<v8::Object> globalObj = v8::Context::GetCurrent()->Global();

	// Now we need to grab the Buffer constructor function.
	v8::Local<v8::Function> bufferConstructor = v8::Local<v8::Function>::Cast(globalObj->Get(v8::String::New("Buffer")));

	// Great. We can use this constructor function to allocate new Buffers.
	// Let's do that now. First we need to provide the correct arguments.
	// First argument is the JS object Handle for the SlowBuffer.
	// Second arg is the length of the SlowBuffer.
	// Third arg is the offset in the SlowBuffer we want the .. "Fast"Buffer to start at.
	v8::Handle<v8::Value> constructorArgs[3] = { slowBuffer->handle_, v8::Integer::New(length), v8::Integer::New(0) };

	// Now we have our constructor, and our constructor args. Let's create the
	// damn Buffer already!
	v8::Local<v8::Object> actualBuffer = bufferConstructor->NewInstance(3, constructorArgs);

	// This Buffer can now be provided to the calling JS code as easy as this:
	return scope.Close(actualBuffer);

And that's all folks!

3 Mar 2011

Node.js

Since I haven't updated my blog for a few months I figure now would be a good time to do a bit of a brain dump on what is interesting to me nowadays.

Currently I'm completely immersed in the weird and wondrous world of Node.js. If you are even remotely interested in anything related to web development/engineering, you should already know about Node. Briefly, it's a server-side JavaScript (SSJS) implementation built on top of Google's V8 JavaScript engine. It's blisteringly fast, and has already been employed in some big projects to solve some pretty insane scaling problems engineers are facing on large websites.

I'm still just getting myself acquainted with Node.js, and I've written some random libs that are on my Github. Right now I'm writing a native Node extension called node-gitteh, which provides bindings to the excellent C library libgit2. I'll be using these bindings to manipulate Git repositories from Node as part of a little project I'm going to undertake (more on that later).

Writing these bindings has been interesting, given that I'm writing C++ code for the first time in years, and having more trouble remembering how to use an STL map<> than I am wrangling the bizarro V8 API. I think this definitely warrants a tip of the hat to Google: the internals of V8 are pretty accessible. My only gripe with V8 is a pretty painful lack of hand-holding documentation. However, there were plenty of examples on Github from other kindred Node spirits who've written bindings for things like GD, MySQL, libxml2 and the like.

The thing that has impressed me about Node the most is the amount of talent the community is comprised of. Node is only just over a year old and there is already a mass of quality libraries and frameworks. One I particularly love is Vows, which has made TDD an absolute breeze in Node. If you're just starting out with Node, I thoroughly recommend you get into the habit of using Vows to test your code, ideally writing the tests before you even launch into your next Javascripty wet dream. Seriously, it's worth it.

Of course, Node is not without its faults. The biggest one currently is lack of threaded-ness, or any kind of concurrency control from JavaScript. Many will argue this is a Good Thing, as it abstracts away the misery that is semaphores, locks, re-entrancy and other goodies that come with thread-safety. However, I still think there should be first-class support for concurrent operations in Node user land (read: JavaScript code).

There are some solutions out there that utilize Node's ability to spawn child processes and communicate with/control them effectively, to the effect of running a pool of Node processes. However, I view these as a kludge, as Node processes do have a pretty decent memory footprint on initialization. Given that Chrome has a way of running completely sandboxed JavaScript execution contexts in parallel (that is, a blocking script in one frame wouldn't block other frames), I'm sure there's an elegant solution to be found.
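To make the "pool of Node processes" idea concrete, here's a minimal sketch in plain JavaScript. It leans on child_process.fork as it exists in current Node releases, and it isn't modelled on any particular library: roundRobin, createPool and the workerScript argument are hypothetical names, and the worker script is assumed to listen with process.on("message", ...) and answer with process.send(...).

```javascript
const { fork } = require("child_process");

// Return a function that hands out the items one after another, forever.
function roundRobin(items) {
  let next = 0;
  return () => {
    const item = items[next];
    next = (next + 1) % items.length;
    return item;
  };
}

// Fork `size` copies of workerScript (a hypothetical path) and spread
// jobs across them. Each fork is a full Node process, which is exactly
// the memory cost grumbled about above.
function createPool(workerScript, size) {
  const workers = Array.from({ length: size }, () => fork(workerScript));
  const pick = roundRobin(workers);
  return {
    send(job) { pick().send(job); },
    onResult(handler) { workers.forEach((w) => w.on("message", handler)); },
    close() { workers.forEach((w) => w.kill()); },
  };
}
```

Round-robin is the simplest possible scheduling choice here; it ignores how busy each worker is, which is part of why this approach feels like a kludge.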

That's all for now!