Greater than the sum of its components

Lately I’ve been working on a cool project written exclusively in JavaScript, with a Node.js & MongoDB back end, and a CommonJS Backbone front end. What I have found most fun so far is the synergy I get between certain components.

Templates

First off, I admit I’m a reinvent-the-wheel kind of engineer. I readily find some minor fault in existing solutions and decide I have to write my own. EJS is really great, especially for someone coming from a PHP background, who doesn’t think logic-less templates are better than sliced bread. However, I really needed templates that can run asynchronously, doing file or network I/O for includes and other such magic.

So, I made Stencil. I was able to make templates that compile without mucking up the line numbers, so debugging is very straightforward. No exception rethrowing necessary. The all-important async use case was satisfied without forcing every template to use the async pattern.

sync_result = sync_tpl(data); // works if no async code in template
async_tpl(data, function(err, async_result) { }); // always works

Where the whole becomes more than the sum of parts: A small snippet makes it so I can directly `require` my templates, and get back the function instead of the string:

require.extensions['.html'] = function(module, filename) {
	var fs = require('fs'), stencil = require('stencil-js'),
		opts = { id:filename, src:fs.readFileSync(filename, 'utf8') };
	module._compile(
		'module.exports=' + stencil.compile(opts, true) + ';',
		filename
	);
};

Now the rest of my code that uses templates doesn’t have to care that I use Stencil. You just `tpl = require('path/to/template.html')`. This is possible because Node.js has an extensible require, and Stencil allows you to compile to a JavaScript string instead of just to a function. If I were to go back and change the templating system to EJS, Jade, or Mustache, I would only need to update this one little snippet.
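To make the "one little snippet" point concrete, here is a minimal sketch of the same hook wired to a different engine. The `.tpl` extension and the toy `compileToSource` compiler are made up for this example and are not part of Stencil or any other library. Node's documentation also marks `require.extensions` as deprecated, although it still works for CommonJS modules.

```javascript
var fs = require('fs');

// Hypothetical toy compiler: turns "Hello {{name}}!" into the source text
// of a function. A real engine would also handle escaping, includes, etc.
function compileToSource(src) {
	var body = JSON.stringify(src).replace(/\{\{(\w+)\}\}/g, function(match, key) {
		return '" + data[' + JSON.stringify(key) + '] + "';
	});
	return 'function(data) { return ' + body + '; }';
}

// Same shape as the Stencil hook: compile the file to JavaScript source,
// then let Node evaluate it as the module's body.
require.extensions['.tpl'] = function(module, filename) {
	var src = fs.readFileSync(filename, 'utf8');
	module._compile('module.exports=' + compileToSource(src) + ';', filename);
};
```

Code that calls `require('path/to/template.tpl')` gets a function back and never learns which engine produced it.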

Client CommonJS

I liked Node.js’s module system, and I didn’t want to have to replace it or use a separate system on the front end. Don’t get me started on the mess of UMD. So, I created my own Modules library. You’ve heard about this before.

I got CommonJS modules to load (asynchronously) and run in the browser, so it was trivial to share code used on both ends. Again, line numbers weren’t munged in the server-side translation, so debugging works just like you always expect it to.

The library runs as a middleware for Express, enabling the reload functionality AMD lovers rave about, as well as standalone for concatenating and minifying bundles in the production build process. All with a client-side weight one-third that of AlmondJS, although that or RequireJS would also work on the front-end, since Modules still uses AMD as its transport format.

The real magic, though, is that the Modules library has an option for translating certain types of files, giving us the same `require` functionality for our templates that we had on the server. Because the translation happens server side (or at build time), Stencil never has to be loaded in client code, so the client can keep a Content Security Policy that disallows eval and unsafe inline code. (Lighter & more secure. Woohoo!)

app.use(require('modules').middleware({
	translate:{
		html:function tpl(name, file, src) {
			var opts = { id:name, src:src };
			return 'module.exports=' + stencil.compile(opts, true) + ';';
		}
	},
	root: './components/', // file root path
	path: '/module/', // url root path
	// ... other options
}));

Backbone

One magic thing that I got for free is that Backbone and Underscore are already CommonJS compatible, so passing them through the same middleware just worked. Async and countless other Node.js modules also just work.

Adding it all together

While I chose to write my own templating and module components, many other libraries include the little hooks that make these synergies possible. Each component individually is really nothing spectacular, but when you put them all together you get a product that is cohesive from front to back, and really fun to work on.

A Case Against Vendor Prefixes In CSS

I am a web developer, and a rather impatient one too. When a new feature is available in a few browsers, I want to use it. Most of the time, these features are either experimental or not finished with the standardization process when they are generally available. So, they are prefixed by the vendor. This is how the process was designed, so that is what vendors do.

Why do we prefix?

Prefixing is a kind of disclaimer for the feature. “Hey this is likely going to change, so don’t rely on it.” This seems in theory like a good way to go about it. If I am a browser maker, and I think of a cool new feature, like say, a gradient defined in css instead of an image, I really won’t know how good my design is until lots of people have used it and given feedback. Of course, if I am conscientious of the community and betterment of the web for everybody, I share my idea with other browser vendors and get their take on it too. Often we have competing ideas of how to implement it, so we want a way to distinguish between them. This competition is wonderful, and will lead us to a better solution. So, I make my background-image:-andy-linear-gradient(…), and my competitor makes their background:-steve-gradient(linear, color-stop(), …). Some people can try it out and it will work through the standards process and eventually everyone will have a background:linear-gradient(…) feature.

In theory it works out great, but what about in practice?

Here’s what actually happens, from a dev’s perspective.

My favorite browser, Shiny, implements a cool new feature: -shiny-gradient(). I play with it and think it’s really cool, but to be safe, I don’t actually use it in any production site. A year later, the other browser I support, Ice Monster has long since added their own -ice-gradient(). Two years later, pretty much every browser has their own prefixed version, even the Laggard browser.

Nice. It only took two years for the feature to be generally available, so I start to use it, even though my code looks like this:

  background-image: -shiny-gradient(linear, left top, left bottom, from(hsl(0, 80%, 70%)), to(#BADA55));
  background-image: -shiny-linear-gradient(top, hsl(0, 80%, 70%), #BADA55);
  background-image:    -ice-linear-gradient(top, hsl(0, 80%, 70%), #BADA55);
  background-image:    -lag-linear-gradient(top, hsl(0, 80%, 70%), #BADA55);
  background-image:     -my-linear-gradient(top, hsl(0, 80%, 70%), #BADA55);
  background-image:         linear-gradient(to bottom, hsl(0, 80%, 70%), #BADA55);

I don’t mind too much, because I use SASS or some other css pre-processor that actually takes care of all the prefixes and nuances. But this still bothers me in two important ways.

  1. My stylesheets are getting much heavier than they used to be, which is a concern because I want people to be able to view my site quickly even on mobile devices. Most of the syntax is exactly the same, but I still have to write it over and over again for each vendor.
  2. I have to opt in for each browser I want to get the feature. If a new browser becomes popular, it won’t get the gradients unless I go back and add yet another version.

I complain, but I keep doing it anyway. Two years later (now four years since it was first introduced), the standard is still a draft, and because I want to support more than just the bleeding-edge browsers, I must leave all of the vendor prefixes in place indefinitely, even after the standard is finalized.

The problem gets more real with mobile.

Management asks for a ShinyPhone version of our application. They don’t care about Robot, even though it uses -shiny prefixes. I am given enough time to make the ShinyPhone version, but no time to even test in Robot. Eventually though, I manage to get it working because I own a Robot phone.

A few months later, Catchup Phone 7, Ice Monster Mobile, and Concert Mini are showing up on more phones. They have their prefixed version of all the great Shiny features I used, but because I didn’t know about them, the mobile application looks awful, and would take me several weeks to fix for each new phone. Management is not willing to spend that kind of time, so even though they have all the features my site is broken for them. Who will our customers blame? It works on ShinyPhone, so it must be that Ice Monster Mobile just isn’t as good. Ice Monster and other browsers get blamed for my site not working well there.

The Solution?

It is clear that if other browsers want to make themselves look good, they have to do more than just implement the feature. If they just use -shiny prefixes, that would make my application work far better, and therefore make their browser look good. But that completely undermines what we’ve learned in the browser wars, and goes against the reason for prefixing in the first place!

We don’t really have a good solution yet.

However, I have an idea I think worth talking about. What if the feature hadn’t been prefixed at all? I would have been less nervous to put it into production, because CSS simply doesn’t apply rules that aren’t implemented, and though it will likely change syntax, I can add new versions, and the one that is implemented will work. My stylesheet ends up more like this:

  background-image: gradient(linear, left top, left bottom, from(hsl(0, 80%, 70%)), to(#BADA55));
  background-image: linear-gradient(to bottom, hsl(0, 80%, 70%), #BADA55);

My application just works for every browser that supports the feature with little thought or effort from my part, and if the spec doesn’t change, which it actually doesn’t change very often, I am already done more than four years before it is standardized.

Benefits of prefixing:

  1. Sense of security for browser vendors, so they can change the implementation and make it better.
  2. Web developers should be aware that the feature isn’t really ready yet.
  3. Credit goes to the vendor who pioneered the feature.

Benefits of NOT prefixing:

  1. Less effort and maintenance for web developers trying to make their application (and browsers) look good. They don’t need to spend a lot of time researching which browsers support which features.
  2. Lighter weight stylesheets for everyone, especially mobile browsers.
  3. Browser vendors can focus on the features, not on evangelizing their prefix.
  4. No -webkit- prefixes being supported by Mozilla. Dang, I said it after trying so hard not to.

Honestly I do see the value in prefixing experimental and non-standardized features, but vendors have to break them often, and the standard needs to move faster if developers are realistically going to experiment with experimental features, and wait for the standard for production use.

Please feel free to disagree in the comments, check out the discussion going on in the w3c, or read up on other opinions. Better yet, get involved.

On Pattern Hating

I have long considered myself a Java hater. I now think it really has nothing to do with the language itself. Sure, it was easy to point at slow performance (which hasn’t been true for a long time now), or mourn the missing syntactic sugar (Pattern.compile("abc", Pattern.CASE_INSENSITIVE) vs /abc/i), but I think my problem with Java is really a problem with the mindset I have observed in novice programmers (with Java usually being their first language).

The problem is with patterns.

Patterns are great. They provide a toolbox that can lead developers on the road to “best practice”. But…

Patterns are a poor substitute for problem solving.

It doesn’t matter if you know how to make a Singleton, even if you know when a Singleton is useful, if the problem at hand is improving report speed. You need to know math, you need to know computation, and you need to find the unnecessary work being done. It’s possible we’ll use a Singleton, but it won’t be the solution to the problem.

In an interview, if I ask for code to find the most common words in a bunch of text files, “public class WordRanker {” is unimportant. I’ve seen a few programmers struggle for the first few minutes to figure out if it should be a class, a function, or what language to use. But once, I was impressed by someone who quickly figured out what they wanted to do, and then said, “I’d google how to do that.”
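For what it’s worth, the essential part of that answer fits in one small function. This is only a sketch: it assumes a "word" is a run of letters or apostrophes compared case-insensitively, the name `rankWords` is my own, and reading the files themselves (for example with `fs.readFileSync`) is left out.

```javascript
// Count words across several texts and return the `limit` most common ones.
// Assumption: a "word" is a run of letters or apostrophes, case-insensitive.
function rankWords(texts, limit) {
	var counts = new Map();
	texts.forEach(function(text) {
		(text.toLowerCase().match(/[a-z']+/g) || []).forEach(function(word) {
			counts.set(word, (counts.get(word) || 0) + 1);
		});
	});
	return Array.from(counts.entries())
		.sort(function(a, b) { return b[1] - a[1]; }) // most frequent first
		.slice(0, limit)
		.map(function(entry) { return entry[0]; });
}
```

Whether this lives in a class, a module, or a script is a decision you can make after the counting works.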

The pattern is accidental complexity. Problem solving is essential complexity.

CommonJS in the Browser

I’ve been thinking a lot lately about how to use CommonJS modules in my web applications. I even started a repository on github for my implementation. As is apparent from searching, the task is non-trivial, and there are lots of people trying to do the same thing, and every one of them has a different idea about how it should work.

But WHY would you want to use CommonJS (formerly known as ServerJS) modules in a client environment?

Ideally you can share modules between client and server, but that requires you to use a server environment like node.js, which might make management really nervous. Even without sharing, the CommonJS module system helps us avoid some annoyances in JavaScript development.

  • Each module has its own scope. I don’t have to manually wrap each file in a function to get a new variable scope. (Of course, to achieve this, the boilerplate is going to have to wrap each module’s code in a function anyway.)
  • Namespaces are only used in the require function, not everywhere in my code. Almost inevitably every web application I’ve worked in ends up using code like the following:
        var whatIWanted = new FormerCompanyName.Common.CoolLibrary.ConstructorName( More.namespace.chains.than.you.can.follow );
        // the rest of this file continues to use these ridiculously long namespaces
    

    Although I’m sure many will disagree with me, I much prefer the CommonJS way:

        var CoolModule = require('common/cool-library'),
            thingINeed = require('more/namespace/chains/than/you/can/follow'),
            whatIWanted = new CoolModule.ConstructorName(thingINeed);
        // the rest of the file is void of long namespaces
    

    And much more importantly, when I define a new module (or class as some insist on calling them):

        FormerCompanyName.Common.CoolLibrary.ConstructorName = function() {/* ... */};
        // versus
        exports.ConstructorName = function() {/* ... */};
        // or even
        module.exports = function() {/* ... */} // this case isn't in the spec, but I really like it, so I made sure my library can handle it.
    
  • Because you can also use relative module identifiers (“./sibling-module”, “../uncle-module”), when the company changes its name, it can be as simple as renaming a folder to update all the top-level module ids.
  • Additionally, modules can be included in the page in any order, and are only executed when first required, instead of all modules executing immediately upon inclusion, requiring the script order to be specific and fragile. If I add a new module using CommonJS, I can just append it to the end of the list, otherwise I have to make sure it is earlier in the page than whatever uses it, and after whatever it uses.

Okay, but how much work is it going to be?

Let’s walk through first what I wanted my server-side code to look like, then what it has to do to make it work on the other side.

As most of my server-side experience thus far has been in php, that’s the first language I’ve used in my implementation.

<!DOCTYPE html>
<html>
<head>
    <title>My Awesome Application</title>
    <link rel="stylesheet" href="awesome-styles.css" />
</head>
<body>
    <!-- blah blah blah -->
    <?= Modules::script() /* include all necessary script tags */ ?>
    <script>require('awesome').go()</script>
</body>
</html>

The Modules class will look for all js files in the folder you put it in, and any subfolders, and will id them by their path.

Yes, I am including every module, not actually checking dependencies. I refer you back to my previous post and say this is the simplest way, and if the caching headers are working, the experience won’t suffer. You are welcome to use one of the fantastic libraries that loads modules on-demand, if you disagree.

Hopefully that is all the server-side API you need to worry about, but there is more if you need it.

So what is that library doing to my poor scripts to make the CommonJS module environment?

I will explain in detail what goes into it in another post, but if you are daring, you can check out the source on github.

Some thoughts on Web 4.0

The web has undergone some significant changes since its inception. 1.0 consisted mostly of HTML documents, with simple CSS style, and little or no JavaScript interaction. 2.0 was the AJAX revolution, making dynamic sites with complex JavaScript. Some have suggested we are already in 3.0, with HTML5 and SVG well supported in the latest version of every major browser. What I’d like to talk about, is what I wish would come next.

As many who are immersed in front-end web development have noticed, HTML and SVG have different DOMs, different styles, and competing animation tools. They have been getting better, with HTML5’s inline SVG support, and browsers beginning to bring each markup’s features to the other, but the inconsistencies are still painful, and they make implementation sub-optimal for both web and browser developers.

What I would love to see is something akin to the following document:

<!DOCTYPE html>
<html lang="en">
<head>
  <title>Fancy HTML+SVG</title>
  <link rel="stylesheet" href="styles.css" />
  <defs>
    <path id="logo" desc="My Fancy SVG Logo" d="M59,0 l69,69 h-15 l-44,44 v15 l-69-69 h15 l45-45 5,5 -45,45 44,44 44-44 -49-49 z  M59,44 c0-8,10-8,10,0 v40 c0,8-10,8-10,0 z" />
    <filter id="soft_blur"><feGaussianBlur in="SourceGraphic" stdDeviation=".5"/></filter>
  </defs>
  <link rel="shortcut icon" sizes="16x16 24x24 32x32 48x48" href="#logo" />
</head>
<body>
  <header>
    <a id="home" href="."><use href="#logo" /></a>
    <h1>The TaleCrafter's Scribbles</h1>
    <h2>notes about science, fiction, and faith… but mostly web development</h2>
  </header>
  <article>My Article text and images and stuff go here</article>
  <footer>Boring Legal and maybe locale selection in here</footer>
  <script src="script.js" async defer></script>
</body>
</html>

styles.css

  #logo { background:#111; } /* applies to everywhere <use>d, including favicon */
  #home { width:64px; height:64px; float:left; }
  #home path { transform:scale(.5); transition:background .5s ease; }
  #home path:hover { background:#0d0dc5; }
  h1 { filter:url(#soft_blur); transition:filter .5s linear; }
  h1:hover { filter:none; }
  /* ... lots more styles ... */

script.js

  document.querySelector('#home path').addEventListener('click', function() { /* open menu or something useful */ });

Summary of things that would be cool:

  • no need for foreignObject or anything like that, simply mix and match tags
  • put all the useful attributes in the same namespace (make use useful without the xlink: namespace)
  • css transitions & animations on svg styles (properties would also be nice)
  • defs and use in html documents
  • filters on html elements (Firefox is already working on this)
  • unify styles like background and fill
  • JavaScript DOM API identical

In short SVG and HTML would be one and the same. You would style both with the same css.

Some nitpicks:

  • I’m not sold on defining filters in markup, then using in style. It feels… odd. Why not define in style? (Oh no, that might be too much like IE’s filters! Gasp!)
  • Animating is still a crapshoot. It feels like it should be in JavaScript, but declarative syntax is so much simpler, and easier for browsers to optimize. Some SMIL animations work in some browsers. CSS animations are still nascent but promising. (Even IE looks like it might implement it in ‘native HTML5’. Sorry, couldn’t help myself.) Still, JavaScript is the only reliable way right now.

Let me hear an Amen, or let me know what I’m missing. Leave a comment and let’s talk about it.

Load only when needed, or Preload everything?

As JavaScript and web application best practices have formed over the last several years, there have appeared two contesting patterns in loading the scripts needed for an application:

Don’t load any JavaScript until you know you need it.

I usually feel like this is the way to go, because a lot of my code is specific to a particular widget or workflow. Why make the page take longer to load initially for something the user won’t do every visit? Just put in minimal stubs to load the full functionality once the user begins down that workflow, or interacts with the widget.

Pros:

  • Lighter initial page weight
  • Encourages functionally modular code
  • Memory performance boost (important if you have to support old browsers)
  • Speed performance boost (if done right)

Cons:

  • Adds additional complexity to code
  • Laggy performance (if done wrong)
  • Lots of HTTP requests

Combine and minify all JavaScript into one file loaded at the end of the html file.

You know beforehand what is going to be needed on each page, and YSlow warned you about too many HTTP requests. Bundle up all the scripts into one download which will be cached after the first page view.

Pros:

  • Easy to implement (lots of code will do it for you)
  • Initial page load (once cached) is really fast

Cons:

  • Load a lot more than usually necessary
  • Initial load can be much slower

So how do you know which pattern to follow? It depends! If your application is very complex, and large portions of the functionality are used infrequently, it makes a lot of sense to use an on-demand pattern. If your application is fairly simple, or if all of the code is likely to be used every time, then combining all of the scripts and including it from the start will be much easier.

I recently worked on a smaller application where I divided all the script into two files. The first was loaded initially, and provided enough functionality for the login dialog only. Upon successful login, the second script was loaded, which combined all of the remaining pieces of application.
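A minimal sketch of that two-stage approach, assuming a promise-based loader. The names `lazyScript` and `injectScriptTag` and the URL in the usage note are hypothetical, not the code from that application.

```javascript
// Returns a function that starts loading `url` on first call and hands back
// the same promise on every later call, so the script is requested only once.
function lazyScript(url, loadScript) {
	var pending = null;
	return function() {
		if (!pending) pending = loadScript(url);
		return pending;
	};
}

// One possible browser loader: inject a script tag and wait for it to load.
function injectScriptTag(url) {
	return new Promise(function(resolve, reject) {
		var s = document.createElement('script');
		s.src = url;
		s.onload = function() { resolve(url); };
		s.onerror = function() { reject(new Error('Failed to load ' + url)); };
		document.body.appendChild(s);
	});
}
```

With something like `var loadApp = lazyScript('/js/app.js', injectScriptTag);`, the login handler can call `loadApp()` on success and start the application when the promise resolves. Passing the loader in keeps the caching logic separate from the DOM.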

The point I most want to make is this: Don’t just follow a pattern because it is a “best practice”. Take the time to figure out the best solution for your project.

I thought I new you(JavaScript);

This is the first of hopefully many posts aiming to demystify javascript.

The first thing to get over is the name. JavaScript is not Java. The name came from trying to ride on Java’s hype. JavaScript is to Java as Hamster is to Ham. Understand? Moving on…

Hopefully, most programmers now know that JavaScript is object-oriented. I’m afraid though that most believe object-oriented is synonymous with classical inheritance, which you will not find in JavaScript. JavaScript instead uses prototypal inheritance.

Classical Inheritance in Java:

class Fruit {
  private String name;
  public Fruit(String n) { name = n; }
  public String toString() { return name; }
}

class Banana extends Fruit {
  public Banana() { super("banana"); }
}

// (new Banana()) instanceof Banana and Fruit

With classical inheritance, as in Java, you define classes. Classes are templates or blueprints for what an object of that type will be like. Objects, which are instances of a class, get all the methods and fields associated with the class and the classes it inherits.

When you call a method, first the runtime looks in the class, then if it can’t find the definition, it traverses up the class hierarchy until it finds the method definition.

Prototypal Inheritance in JavaScript:

function Fruit(name) { this.name = name; }
Fruit.prototype = { name:null, toString:function() { return this.name; } };

function Banana() { Fruit.call(this, 'banana'); }
Banana.prototype = new Fruit(null);

// (new Banana()) instanceof Banana and Fruit

As you can see in JavaScript, with prototypal inheritance, there are no classes. The ‘class’ keyword is not used. Objects inherit from other objects. (The Banana prototype is an ‘instance’ of Fruit.) Constructors are just normal functions that you may call with the ‘new’ keyword.

When you access any property, the runtime checks the object, then if it cannot find the property, it traverses up the prototype object hierarchy until it finds the property. If it doesn’t find the property, it returns undefined.
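As a runnable illustration of that lookup, the sketch below walks the chain by hand. `findOwner` is a hypothetical helper, and the prototype is set up with `Object.create` rather than `new Fruit(null)`, which is a small departure from the example above.

```javascript
function Fruit(name) { this.name = name; }
Fruit.prototype.toString = function() { return this.name; };

function Banana() { Fruit.call(this, 'banana'); }
Banana.prototype = Object.create(Fruit.prototype);
Banana.prototype.constructor = Banana;

// Walk the prototype chain by hand and report which object owns `key`.
function findOwner(obj, key) {
	while (obj !== null && !Object.prototype.hasOwnProperty.call(obj, key)) {
		obj = Object.getPrototypeOf(obj);
	}
	return obj; // null when no object in the chain has the property
}

var banana = new Banana();
```

Here `findOwner(banana, 'name')` is the banana itself, `findOwner(banana, 'toString')` is `Fruit.prototype`, and a property nobody defines comes back as `null`, which is the case where a plain property access yields undefined.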

The new keyword is a little deceptive, because it looks the same as Java. This is closer to what really happens:

// var banana = new Banana(a, b);
var banana = {}; // new Object()
// assume __proto__ is a hidden field, used internally for the prototype hierarchy
banana.__proto__ = Banana.prototype;
var temp = Banana.call(banana, a, b); // call the Banana function with 'this' set to the banana object
banana = (temp && typeof temp === 'object') ? temp : banana;

// banana.name;
var temp = banana;
while (!temp.hasOwnProperty('name') && temp.__proto__) { temp = temp.__proto__; }
return temp.hasOwnProperty('name') ? temp.name : undefined;

Please note that this code is an oversimplification, but hopefully helps you to understand what is happening behind the scenes. One of the interesting things you may have noticed from the above code is that when you call ‘new Banana()’, you might not get back what you expect. See one way you can implement the Factory pattern in JavaScript:

function Fruit(name, color) {
  // Delegate to a registered sub-constructor, if there is one for this name.
  if (typeof Fruit[name] === 'function') return new Fruit[name]();
  this.name = name;
  this.color = color;
  return this;
}
Fruit.prototype = { name:null, color:null };

// Build each prototype before registering the constructor on Fruit;
// otherwise new Fruit('Banana', 'yellow') would delegate to the unfinished Banana.
function Banana() { return this; }
Banana.prototype = new Fruit('Banana', 'yellow');
Fruit.Banana = Banana;

function Apple() { return this; }
Apple.prototype = new Fruit('Apple', 'red');
Fruit.Apple = Apple;

var banana = new Fruit('Banana'); // instanceof Fruit and Banana
var apple = new Fruit('Apple'); // instanceof Fruit and Apple
var kiwi = new Fruit('Kiwi'); // instanceof Fruit

As with most of JavaScript’s powerful dynamic features, it could easily be used for evil as well as for good.

function Droid() { return new IPhone(); }
var phone = new Droid(); // this is not the droid you're looking for

wtfjs.com is full of examples where JavaScript does weird things, but almost invariably because you tried to do something weird in the code. With a small amount of restraint on the developer’s part, JavaScript can be powerful and need not be a mystery.

JavaScript Require Update

I’ve updated the code I use to require scripts and styles on my web pages.

Check it out, or fork it at github: http://github.com/thetalecrafter/require

Usage:

main file:

	require.setObjUrl('jQuery', function(name) {
		return name === 'jQuery' ? 'http://code.jquery.com/jquery-1.5.2.min.js' :
			'http://cdn-' + (name.length % 4) + '.example/plugins/' + name + '.js'; });
	require('jQuery.myplugin', function(myplugin) { /* both have loaded when this executes */ });

plugin file:

	require('jQuery', function(jQuery) {
		jQuery.myplugin = ...
	});

require css: Any requirement matching /\.css$/i will be treated as a css requirement.

	require('myplugin.css', function() { /* You can count on styles being available here */ });

require image: Any requirement matching /\.(?:gif|jpe?g|png)$/i will be treated as an image requirement.

	require('myplugin_bg.png', function() { /* You can count on the image being available here */ });
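The dispatch described above can be sketched with the two patterns quoted in this post. `requirementType` is a hypothetical helper for illustration, and the library's actual dispatch may differ.

```javascript
var CSS_RE = /\.css$/i, IMAGE_RE = /\.(?:gif|jpe?g|png)$/i;

// Decide how a requirement should be loaded, based only on its name.
function requirementType(name) {
	if (CSS_RE.test(name)) return 'css';
	if (IMAGE_RE.test(name)) return 'image';
	return 'script'; // anything else is treated as a script
}
```

Both patterns carry the `i` flag, so `myplugin_bg.PNG` counts as an image just like `myplugin_bg.png`.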

JavaScript require in 100 lines of code

UPDATE: I’ve changed up my code a bit in the follow up post: JavaScript Require Update
UPDATE: Although my initial intent was to write require with minimal code, my latest version in github is much longer, but performs better and is much more feature rich. Check it out, or fork it at github: http://github.com/thetalecrafter/require

Lately I’ve been toying with dependency management in JavaScript. Most implementations of require (at least that I’ve seen) use polling, a function in the loaded script, synchronous XMLHttpRequest (dojo.require), or some combination of those.

Polling is less than ideal, since more code runs than is necessary. It can slow down the responsiveness of the page if the interval is too short, and the user waits longer than necessary if the interval is too long.

Putting a function in the loaded file means that everything you load has to understand the system. You cannot load arbitrary files. This makes it harder to do mash-ups involving other peoples’ code.

Synchronous requests lock up the browser. If the server is slow to respond, the user may feel the browser has crashed, and if the server goes down, it can actually crash the browser. In addition, XMLHttpRequest responses are not cached like script tags, meaning that the dynamic packages may need to be reloaded with every page load.

So… when looking at writing my own require function I knew I wanted:

  • Event-driven code. (No polling. No more code execution than necessary.)
  • No requirements on the contents of required files.
  • Asynchronous loads (No chance of freezing or crashing the browser.)
  • Take advantage of the browser’s cache.
  • Nested requires. (A file isn’t loaded until everything it requires is loaded.)
  • Decent browser compatibility (IE6+, FF2+, Chrome, Safari 3+, Opera).
  • No external library requirements.

One thing I ended up giving up to get the aforementioned wants: loading scripts in parallel. Nested requires were unreliable because not all browsers guarantee the execution order of dynamically inserted script tags, which made it too hard to determine the parent requirement. I’m looking at you, Safari. Any pointers to improve that would be appreciated.

My testing has been less than thorough, and there are many situations I didn’t try to handle. (Like checking to see if the script was already included statically.)

Without further ado, here’s my code: (the most up-to-date is available on github)

/**
 * _.require v0.3 by Andy VanWagoner, distributed under the ISC licence.
 * Provides require function for javascript.
 *
 * Copyright (c) 2010, Andy VanWagoner
 *
 * Permission to use, copy, modify, and/or distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
 * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
 * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 */
(function() {
	var map = {}, root = [], reqs = {}, q = [], CREATED = 0, QUEUED = 1, REQUESTED = 2, LOADED = 3, COMPLETE = 4, FAILED = 5;

	function Requirement(url) {
		this.url = url;
		this.listeners = [];
		this.status = CREATED;
		this.children = [];
		reqs[url] = this;
	}

	Requirement.prototype = {
		push: function push(child) { this.children.push(child); },
		check: function check() {
			var list = this.children, i = list.length, l;
			while (i) { if (list[--i].status !== COMPLETE) return; }

			this.status = COMPLETE;
			for (list = this.listeners, l = list.length; i < l; ++i) { list[i](); }
		},
		loaded: function loaded() {
			this.status = LOADED;
			this.check();
			if (q.shift() === this && q.length) q[0].load();
		},
		failed: function failed() {
			this.status = FAILED;
			if (q.shift() === this && q.length) q[0].load();
		},
		load: function load() { // Make request.
			var r = this, d = document, s = d.createElement('script');
			s.type = 'text/javascript';
			s.src = r.url;
			s.requirement = r;
			function cleanup() { // make sure event & cleanup happens only once.
				if (!s.onload) return true;
				s.onload = s.onerror = s.onreadystatechange = null;
				d.body.removeChild(s);
			}
			s.onload = function onload() { if (!cleanup()) r.loaded(); };
			s.onerror = function onerror() { if (!cleanup()) r.failed(); };
			if (s.readyState) { // for IE; note there is no way to detect failure to load.
				s.onreadystatechange = function () { if (s.readyState === 'complete' || s.readyState === 'loaded') s.onload(); };
			}
			r.status = REQUESTED;
			d.body.appendChild(s);
		},
		request: function request(onready) {
			this.listeners.push(onready);
			if (this.status === COMPLETE) { onready(); return; }

			var tags = document.getElementsByTagName('script'), i = tags.length, parent = 0;
			while (i && !parent) { parent = tags[--i].requirement; }
			(parent || root).push(this);
			if (parent) this.listeners.push(function() { parent.check(); });

			if (this.status === CREATED) {
				this.status = QUEUED;
				if (q.push(this) === 1) { this.load(); }
			}
		}
	};

	function resolve(name) {
		if (/\/|\\|\.js$/.test(name)) return name;
		if (map[name]) return map[name];
		var parts = name.split('.'), used = [], ns;
		while (parts.length) {
			if (map[ns = parts.join('.')]) return map[ns] + used.reverse().join('/') + '.js';
			used.push(parts.pop());
		}
		return used.reverse().join('/') + '.js';
	}

	function absolutize(url) {
		if (/^(https?|ftp|file):/.test(url)) return url;
		return (/^\//.test(url) ? absolutize.base : absolutize.path) + url;
	}
	(function () {
		var tags = document.getElementsByTagName('base'), href = (tags.length ? tags[tags.length - 1] : location).href;
		absolutize.path = href.substr(0, href.lastIndexOf('/') + 1) || href;
		absolutize.base = href.split(/\\|\//).slice(0, 3).join('/');
	})();
	
	function require(arr, onready) {
		if (typeof arr === 'string') arr = [ arr ]; // make sure we have an array.
		if (typeof onready !== 'function') onready = false;
		var left = arr.length, i = arr.length;
		if (!left && onready) onready();
		while (i) { // Update or create the requirement node.
			var url = absolutize(resolve(arr[--i])), req = reqs[url] || new Requirement(url);
			req.request(function check() { if (!--left && onready) onready(); });
		}
	}

	require.map = function mapto(name, loc) { map[name] = loc; };
	require.unmap = function unmap(name) { delete map[name]; };
	require.tree = root;
	jQuery.require = require;
})();
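
To make the name resolution easier to follow, here is a minimal standalone sketch of the same rule that `resolve` applies. The `createResolver` helper and the example mappings are hypothetical and exist only for illustration. In the library itself the mappings would be registered with `require.map`.

```javascript
// Standalone copy of the name-resolution rule, for illustration only.
// `createResolver` is a hypothetical helper, not part of the library above.
function createResolver(map) {
	return function resolve(name) {
		// Anything that looks like a path or a .js file is used as-is.
		if (/\/|\\|\.js$/.test(name)) return name;
		// An exact mapping wins.
		if (map[name]) return map[name];
		// Otherwise look for the longest mapped namespace prefix and
		// turn the remaining parts into a relative path.
		var parts = name.split('.'), used = [], ns;
		while (parts.length) {
			if (map[ns = parts.join('.')]) return map[ns] + used.reverse().join('/') + '.js';
			used.push(parts.pop());
		}
		// No mapping at all: dots simply become slashes.
		return used.reverse().join('/') + '.js';
	};
}

// Example mappings (assumed values).
var resolve = createResolver({ app: '/scripts/app/', jquery: '/lib/jquery.min.js' });

resolve('jquery');          // '/lib/jquery.min.js'
resolve('app.views.Home');  // '/scripts/app/views/Home.js'
resolve('util.strings');    // 'util/strings.js'
resolve('plugins/tabs.js'); // 'plugins/tabs.js'
```

The result is then passed through `absolutize`, so relative results are resolved against the page (or its base tag) before the script is requested.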

Accessing AWS SimpleDB from PHP

This week, as I built part of my App Server for Distributed Systems Design, I hit another stumbling block. The library that Amazon provides in PHP for accessing SimpleDB requires PHP 5.2. I should have known that I would need to use the latest version.

Not only did Amazon’s library not work for me, but it was huge and complicated. I found another library on Google Code, but as fate would have it, that library didn’t work either. The code was pretty ugly imho, but at least it was straightforward enough for me to understand how accessing SimpleDB worked, which led me to make my own SimpleDB client.

The script will work with any PHP 5, and doesn’t depend on anything that isn’t built in by default. I hope it is helpful to someone else. It would be really easy to add the SimpleDB requests I haven’t implemented yet.

<?php
/**
 * AWS_SimpleDB_Client v0.1 by Andy VanWagoner, distributed under the ISC licence.
 * Provides simple access to Amazon's SimpleDB from PHP 5.
 *
 * Copyright (c) 2009, Andy VanWagoner
 *
 * Permission to use, copy, modify, and/or distribute this software for any
 * purpose with or without fee is hereby granted, provided that the above
 * copyright notice and this permission notice appear in all copies.
 *
 * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
 * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
 * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 */

class AWS_SimpleDB_Client {

	// AWS SimpleDB API Constants
	private static $service_endpoint	= "sdb.amazonaws.com";
	private static $api_version			= "2007-11-07";
	private static $timestamp_format	= 'Y-m-d\TH:i:s\Z';
	private static $signature_version	= 1;

	private static $user_agent = "AWS_SimpleDB_Client 0.1 - Andy VanWagoner";

	private $access_key;
	private $secret_key;

	/**
	* Constructor
	*
	* @param string $access			// your AWS "Access Key ID"
	* @param string $secret			// your AWS "Secret Access Key"
	*/
	function __construct($access, $secret) {
		$this->access_key = $access;
		$this->secret_key = $secret;
	}

	/**
	* AWS SimpleDB API - CreateDomain
	* NOTE: This call will take a while (AWS says 10 seconds)
	*
	* @param string $domain			// the domain to create
	*
	* @return array('status'=>array('code'=>, 'message'=>), 'RequestId'=>, 'BoxUsage'=>)
	*/
	function create_domain($domain) {
		$params = array(
			'Action' => 'CreateDomain',
			'DomainName' => $domain
		);

		return $this->post($params);
	}

	/**
	* AWS SimpleDB API - DeleteDomain
	* NOTE: This call will take a while (AWS says 10 seconds)
	*
	* @param string $domain			// the domain to delete
	*
	* @return array('status'=>array('code'=>, 'message'=>), 'RequestId'=>, 'BoxUsage'=>)
	*/
	function delete_domain($domain) {
		$params = array(
			'Action' => 'DeleteDomain',
			'DomainName' => $domain
		);

		return $this->post($params);
	}

	/**
	* AWS SimpleDB API - ListDomains
	*
	* @param string $next = ''		// Optional - Sent as NextToken parameter
	* @param string $max = 100		// Optional - Sent as MaxNumberOfDomains
	*
	* @return array('status'=>array('code'=>, 'message'=>), 'RequestId'=>, 'BoxUsage'=>,
	* 				'DomainName'=>array('...', ...) [, 'NextToken'=>])
	*/
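	function list_domains($next = '', $max = 0) {
		$params = array('Action' => 'ListDomains');

		// Only send the optional parameters when they were supplied
		if ($max > 0 && $max <= 100)
			$params['MaxNumberOfDomains'] = $max;
		if ($next)
			$params['NextToken'] = $next;

		return $this->post($params);
	}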

	/**
	* AWS SimpleDB API - PutAttributes
	*
	* @param string $domain			// The domain the item is in
	* @param string $item			// The name of the item
	* @param array  $attributes		// array(array('Name'=>, 'Value'=> [, 'Replace'=>]), ...)
	*
	* @return array('status'=>array('code'=>, 'message'=>), 'RequestId'=>, 'BoxUsage'=>)
	*/
	function put_attributes($domain, $item, $attributes) {
		$params = array(
			'Action' => 'PutAttributes',
			'DomainName' => $domain,
			'ItemName' => $item
		);

		foreach($attributes as $i => $value) {
			$params["Attribute.$i.Name"] = $value['Name'];
			$params["Attribute.$i.Value"] = $value['Value'];
			if (isset($value['Replace']))
				$params["Attribute.$i.Replace"] = $value['Replace'];
		}

		return $this->post($params);
	}

	/**
	* AWS SimpleDB API - DeleteAttributes
	*
	* @param string $domain			// The domain the item is in
	* @param string $item			// The name of the item
	* @param array  $attributes		// array(array('Name'=>, 'Value'=>), ...)
	*
	* @return array('status'=>array('code'=>, 'message'=>), 'RequestId'=>, 'BoxUsage'=>)
	*/
	function delete_attributes($domain, $item, $attributes) {
		$params = array(
			'Action' => 'DeleteAttributes',
			'DomainName' => $domain,
			'ItemName' => $item
		);

		foreach($attributes as $i => $value) {
			$params["Attribute.$i.Name"] = $value['Name'];
			$params["Attribute.$i.Value"] = $value['Value'];
		}

		return $this->post($params);
	}

	/**
	* AWS SimpleDB API - GetAttributes
	*
	* @param string $domain			// the domain name
	* @param string $item			// the item's name
	* @param string $attribute		// Optional - If specified, only this attribute's values are retrieved.
	*
	* @return array('status'=>array('code'=>, 'message'=>), 'RequestId'=>, 'BoxUsage'=>,
	* 				'Attribute'=>array(array('Name'=>,'Value'=>), ...))
	*/
	function get_attributes($domain, $item, $attribute = '') {
		$params = array(
			'Action' => 'GetAttributes',
			'DomainName' => $domain,
			'ItemName' => $item
		);

		if ($attribute)
			$params['AttributeName'] = $attribute;

		return $this->post($params);
	}

	/**
	* AWS SimpleDB API - Query
	*
	* @param string  $domain		// The domain name
	* @param string  $query			// The query to run on this domain
	* @param string  $next = ''		// OPTIONAL - token supplied on last paged call
	* @param integer $max = 100		// OPTIONAL - max items you want returned 1-250, default = 100
	*
	* @return array('status'=>array('code'=>, 'message'=>), 'RequestId'=>, 'BoxUsage'=>,
	* 				'ItemName'=>array('...', ...))
	*/
	function query($domain, $query, $next = '', $max = 0) {
		$params = array(
			'Action' => 'Query',
			'DomainName' => $domain,
			'QueryExpression' => $query
		);

		if ($max > 250) $max = 250;
		if ($max > 0)
			$params['MaxNumberOfItems'] = $max;
		if ($next)
			$params['NextToken'] = $next;

		return $this->post($params);
	}

	/**
	 * Sign the parameters, following AWS version 1 signing
	 *
	 * @param array $params			// array of all params (except the signature) to be passed to Amazon
	 *
	 * @return string				// signature string
	 */
	private function sign($params) {
		// Plain case-insensitive order; natural ordering would put Attribute.2 before Attribute.10
		uksort($params, 'strcasecmp');

		$data = '';
		foreach ($params as $key=>$value) {
			$data .= $key . $value;
		}

		return base64_encode (	pack("H*", sha1((str_pad($this->secret_key, 64, chr(0x00)) ^ (str_repeat(chr(0x5c), 64))) .
								pack("H*", sha1((str_pad($this->secret_key, 64, chr(0x00)) ^ (str_repeat(chr(0x36), 64))) .
								$data)))) );
	}

	/**
	 * POST to AWS SimpleDB and then parse the response.
	 *
	 * @param array $params			// all params to pass on the post
	 *
	 * @return array('status'=>array('code'=>, 'message'=>), 'RequestId'=>, 'BoxUsage'=>, ...)
	 */
	private function post($params) {

		// Add all of the common parameters needed by AWS SimpleDB
		$params['AWSAccessKeyId']	= $this->access_key;
		$params['Timestamp'] 		= gmdate(self::$timestamp_format, time());
		$params['Version'] 			= self::$api_version;
		$params['SignatureVersion']	= self::$signature_version;
		$params['Signature'] 		= $this->sign($params);

		// Generate the POST request
		$content = http_build_query($params);

		$post  = 'POST / HTTP/1.0'															. "\r\n";
		$post .= 'Host: ' 			. self::$service_endpoint 								. "\r\n";
		$post .= 'Content-Type: ' 	. 'application/x-www-form-urlencoded; charset=utf-8'	. "\r\n";
		$post .= 'Content-Length: ' . strlen($content)										. "\r\n";
		$post .= 'User-Agent: ' 	. self::$user_agent 									. "\r\n";
		$post .= 																			  "\r\n";
		$post .= $content;

		$socket = @fsockopen(self::$service_endpoint, 80, $errno, $errstr, 10);
		if ($socket) {
			fwrite($socket, $post);

			$response = stream_get_contents($socket);
			fclose($socket);

			// Parse the response
			return $this->format_result($response);
		}

		// Return a fail result
		return array('status' => array('code' => 404, 'message' => 'Not Found'),
			'Error' => array('Code' => $errno, 'Message' =>
				'Could not connect to ' . self::$service_endpoint . " ($errstr)"
			)
		);
	}

	/**
	 * Take the XML document returned by AWS SimpleDB, and transform it into a hash
	 *
	 * @param string $result		// the full http response string from SimpleDB
	 */
	private function format_result($result) {
		list($http_headers, $content) = explode("\r\n\r\n", $result, 2);
		$header_lines = explode("\r\n", $http_headers);
		list($protocol, $code, $message) = explode(" ", $header_lines[0], 3);

		// record the http status
		$formatted = array('status' => array('code' => $code, 'message' => $message));

		$xml = simplexml_load_string($content);

		// Look for Errors
		if (isset($xml->Errors)) {
			$formatted['RequestId'] = (string)$xml->RequestId;
			$formatted['Error'] = array();
			foreach($xml->Errors->Error as $error) {
				array_push($formatted['Error'], array(
					'Code' => (string)$error->Code,
					'Message' => (string)$error->Message
				));
			}
			return $formatted;
		}

		// Get the metadata for this request
		$metadata = $xml->ResponseMetadata;
		$formatted['RequestId'] = (string)$metadata->RequestId;
		$formatted['BoxUsage'] = (string)$metadata->BoxUsage;

		// GetAttributes Response
		if (isset($xml->GetAttributesResult)) {
			$formatted['Attribute'] = array();
			foreach($xml->GetAttributesResult->Attribute as $attribute) {
				array_push($formatted['Attribute'], array(
					'Name' => (string)$attribute->Name,
					'Value' => (string)$attribute->Value
				));
			}
		}

		// ListDomains Response
		if (isset($xml->ListDomainsResult)) {
			$formatted['DomainName'] = array();
			foreach($xml->ListDomainsResult->DomainName as $domain) {
				array_push($formatted['DomainName'], (string)$domain);
			}
			if (isset($xml->ListDomainsResult->NextToken)) {
				$formatted['NextToken'] = (string)$xml->ListDomainsResult->NextToken;
			}
		}

		// Query Response
		if (isset($xml->QueryResult)) {
			$formatted['ItemName'] = array();
			foreach($xml->QueryResult->ItemName as $item) {
				array_push($formatted['ItemName'], (string)$item);
			}
			if (isset($xml->QueryResult->NextToken)) {
				$formatted['NextToken'] = (string)$xml->QueryResult->NextToken;
			}
		}

		return $formatted;
	}
}

?>
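
For comparison, here is a rough Node.js sketch of the same signing steps the `sign` method performs: sort the parameter names case-insensitively, concatenate each name with its value, compute an HMAC-SHA1 with the secret key, and base64-encode the result. The `signV1` name and the sample parameters are made up for illustration, and this is only a sketch of the steps, not a complete client.

```javascript
var crypto = require('crypto');

// Hypothetical helper mirroring the steps of the PHP sign() method above.
function signV1(params, secretKey) {
	// Sort parameter names, ignoring case.
	var keys = Object.keys(params).sort(function (a, b) {
		a = a.toLowerCase();
		b = b.toLowerCase();
		return a < b ? -1 : a > b ? 1 : 0;
	});

	// Concatenate name + value pairs with no separators.
	var data = keys.map(function (key) { return key + params[key]; }).join('');

	// HMAC-SHA1 keyed with the secret, encoded as base64.
	return crypto.createHmac('sha1', secretKey).update(data, 'utf8').digest('base64');
}

// Sample (made up) parameters and secret.
var signature = signV1({
	Action: 'ListDomains',
	AWSAccessKeyId: 'EXAMPLE-ACCESS-KEY',
	SignatureVersion: '1',
	Timestamp: '2009-01-01T12:00:00Z',
	Version: '2007-11-07'
}, 'EXAMPLE-SECRET-KEY');
```

The PHP version builds the HMAC by hand from `sha1` and `pack`, so it needs nothing beyond what every PHP 5 install ships with. Node's `crypto.createHmac` does the same padding and double hashing internally.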