The WebApp Wizard: Web development made magical

4 Mar 2012

Web development process, from end to end

Just a little post I have wanted to write for a long time. Even if it is mainly aimed at beginner web developers, I think it could be interesting for more experienced ones to read on too. I don't pretend to be one of these kick-ass developers we see out there, just an average one who loves his job and does it as well as he can, yet not perfectly. So here is a quick, honest overview of an average web development process and the tools that come with it, with their advantages and drawbacks.

The IDE

Maybe one of the most important tools, as we use it a great part of our time. I have used a few of them, from the simplest to the most advanced. After Notepad, a few fancy visual editors, and a few awesome editors like TextMate and Zend Studio, I think I have pretty much made up my mind with Aptana.

First, the default theme looks great and has been a real relief for my eyes. At first, I wondered why a dark background was used; it felt like a regression. We used to type light characters on dark backgrounds a long time ago, and then we started writing black characters on white backgrounds, just like pen on paper, which seemed logical to me. But when you think about it, that reasoning is just backwards: a dark background emits far less light than a light one, so it puts far less strain on the eyes. So I'm definitely convinced by this default theme.

Then, even if they are not always perfect, the available bundles that help with auto-completion and documentation work quite well and suit my needs. Plus, there are little bonuses like Capistrano integration that made me adopt it.

The only thing I miss from Zend Studio is the PHPUnit / code coverage / debug integration. That was really great. But I hope Aptana will reintegrate debug support someday.

And last but not least, it is freely available.

There are also really great editors like TextMate, but honestly, I haven't had the courage yet to learn how to use them properly. Just know that they embed tons of shortcuts and features that save you time and effort.

Minification

A key aspect of web development, as many of us know, consists in reducing the number of HTTP requests our pages make, and the weight we transfer over the wire. I became aware of this relatively recently, and I have since spent quite a lot of time on it. Reducing the number and weight of our requests helps a lot in producing fast-responding pages, which is crucial for users with a slow Internet connection, but also counts for users with a good broadband connection.

Combining and minifying our files is good, but we don't want it to slow down or clutter up our development process. Ideally, this should be taken care of automatically. After trying a few ways to do that minification more or less manually, I recently stumbled upon the ideal solution for me, which I have already talked about. Assetic is so good not only because it takes care of combining and minifying your files, but also because it lets you apply all sorts of filters like Less, Sass, or virtually anything else that comes to mind. Moreover, it facilitates caching AND leaves no room for an out-of-date cache. But I'll talk about that in the next section.

Caching

Caching is also very important. I can't make up my mind about this question: which matters more, minifying or caching? After all, if our caching is right, skipping minification only hurts the first time the user hits the page. Once all the files are in the cache, minification doesn't matter anymore.

So I consider caching to be at the same level of importance as minification. Once again, Assetic helped me a lot with this. But whatever tool you use, the key is the key. Nope, that sentence has no mistake in it.

The most basic cache associates a key with some content, and updates the content stored under a key when necessary. The problem with this is that the client can't tell on its own whether its cached copy is up to date. It has to ask the server: "hey, is this cache entry still good, or do I need to update it?". If it doesn't, it can end up with out-of-date entries, which is more or less of an issue depending on the context. So we can't have both the best performance and the best reliability with this kind of cache.
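
Concretely, that question is the classic HTTP conditional request; even when nothing changed, it still costs a round trip (standard HTTP, sketched here):

GET /js/app.js HTTP/1.1
If-Modified-Since: Tue, 15 Nov 2011 08:12:31 GMT

HTTP/1.1 304 Not Modified
(no body: "keep what you have")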

Another approach is to update not the value of a cache entry, but the key. When the client finds a key (filename) it doesn't know about, it is forced to request the content from the server. As long as the content doesn't change, the key won't change. But as soon as the content is modified, a new key/value pair is created (the old one can be kept around or deleted). On one hand we had a single key with multiple values over time; on the other we have multiple keys over time, each with a single value. So the evolution of the cache is not a problem anymore. In other words, you have to version your filenames. You can do it by hand, or let a tool like Assetic take care of it for you. This lets you always serve fresh content with maximum caching, since the client never has to ask whether a cached entry is still OK. Be careful though: putting the version number in the query string isn't always a good idea, as some proxies rely only on the filename to decide whether to download the file or serve it from their cache. So the best option is to change the filename itself.
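
To make this concrete, here is a minimal sketch of filename versioning in Node. It only illustrates the idea (Assetic and similar tools automate it for you), and the helper name is made up:

var crypto = require('crypto');
var fs = require('fs');

// Derive the cache key (the filename) from the file contents:
// same content, same name; new content, new name.
function versionedName(path) {
  var hash = crypto.createHash('md5')
                   .update(fs.readFileSync(path))
                   .digest('hex')
                   .slice(0, 8);
  return path.replace(/(\.[^.]+)$/, '-' + hash + '$1'); // js/app.js -> js/app-1a2b3c4d.js
}

Your templates then reference the versioned name, and a far-future Expires header becomes completely safe.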

Deploying

The last important step is how you deploy your app. Like many people, I started with basic FTP uploads, but as soon as you work on more serious applications, you need something more reliable and more automated. That's why I began writing deployment scripts to help me out. The main problem when we deploy manually is that we are humans, so we are prone to errors and to forgetting things. How many times did I have to put the server configuration file back in place because it had been overwritten by my development one, committed by mistake? I don't know, but what I know for sure is "too many times". A script is more reliable, but when that script is written by one human, it is also prone to errors. The difference is that once an error is spotted and corrected, it is fixed for good.

But we can do better: a deployment script that is written and used by many humans, which reduces the risk of errors even further. Not to mention such a script will probably have more features, which can be good too. That's the case with Capistrano, a really great deployment tool I started using last year. It not only deploys an app from a repository; it also versions the releases and supports a nice rollback feature in case something goes wrong. Another really nice thing is that it lets you store your users' files outside of your code and creates symlinks automatically, so everything works as if the files were right under your code tree.
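
For the record, the directory layout Capistrano maintains on the server looks roughly like this (simplified):

myapp/
  releases/
    20110301120000/                     <- each deploy is a timestamped copy
    20110415093000/
  shared/                               <- users' files, logs: survive deploys
  current -> releases/20110415093000/   <- symlink swapped on deploy or rollback

The "current" symlink is what makes rollbacks instant: pointing it back at the previous release is all it takes.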

To sum up

This was a quick post despite its length, and it only scratches the surface of a few topics, but the aim was not to dive too deeply into them. It is meant to give you a few leads to follow, so you can form your own opinion about which tools to use. I could also have talked about testing, but sadly I am still not using it as a real part of my development process. I write tests more and more, but I don't think I am ready to really talk about this right now. A lot of other people out there will do it way better than me.

I just wanted to share the principles I work with, hoping it will help somebody, as I know I searched for a long time before finding my way of doing these things.

Happy coding!

7 Dec 2011

Web performance: further optimization

If you use tools like YSlow, PageSpeed and WebPageTest, you have probably already come a long way on web performance.

The problem

Working on a website which already had good YSlow / PageSpeed ratings, I just wanted to push a little further: could I get up to 100/100, or very, very close to it? That may seem a bit pointless; I mean: who is going to be able to tell the difference? Will it make any difference for the server? Well, I don't know, but I wanted to try, just for fun. Yeah, a strange kind of fun.

So I looked at the metrics of my favorite tools, and PageSpeed told me something: maybe you should try inlining these scripts. What? Am I not supposed to make my scripts external (rule 8 of my bible)? In fact, not always. Making an extra HTTP request runs contrary to rule 1, after all. A small file is often not worth a request, so we'd better inline it right into the page and avoid unnecessary HTTP overhead.

But hey, I don't want to sacrifice my cleanly organized JS folder just for the sake of performance. So I had to come up with something that would inline my scripts/CSS when necessary, without me copying and pasting the contents of said resources. More importantly, I wanted it to be dynamic: maybe my files will grow large enough to be worth an extra HTTP request again. So there was no way I was going to manage this by hand.

The solution

Since I was working with Smarty on this project, I decided to write a little Smarty plugin to help me do this. The idea is, based on a file size limit, to include scripts the "normal" way or to inline them.

I came up with two little plugins, one for JS files, the other for CSS files.
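
The actual plugins are Smarty (so PHP) plugins; rather than pasting them here, the same decision logic is sketched below in Node, with a made-up 1 KB threshold:

var fs = require('fs');

var INLINE_LIMIT = 1024; // bytes; below this, the HTTP overhead outweighs the file itself

// Build the HTML for a script: inline the file if it is small enough,
// reference it externally otherwise.
function scriptTag(path, url) {
  if (fs.statSync(path).size <= INLINE_LIMIT) {
    return '<script>' + fs.readFileSync(path, 'utf8') + '</script>';
  }
  return '<script src="' + url + '"></script>';
}

Because the size check runs on every render, a file that grows past the threshold automatically becomes external again, which was the whole point.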

The results

Using these plugins resulted in one tiny script (a few hundred bytes) and one CSS file being inlined on some pages. To be perfectly honest, I didn't measure whether there was any "real" performance improvement, and I don't know if it had a big impact from the user's point of view. But it is obvious that this really tiny JavaScript file generated more HTTP overhead than its own content, which is ridiculous. So inlining it can't hurt performance either.

I was quite surprised by the results on the metrics side, though. My YSlow score jumped from 92-93 up to 99! Now that's what I'm talking about: a pretty solid A-grade score :-). I didn't expect much gain on the YSlow side, as it doesn't mention anything about inlining your scripts. I was even expecting a slightly lower score, since YSlow tells you to make your scripts external. But it seems it doesn't rely only on blind rules of thumb, but also on real performance.

[Screenshot: 99 YSlow A-grade score]

The PageSpeed score also jumped from something around 91 to 98, which is less of a surprise, as I just applied its recommendations.

What about the server?

That's nice, but I still have a doubt about overall performance or, more accurately, server load. That's not really a problem in my case, as I don't have thousands of simultaneous users, so my server can take a little extra load. But looking at my plugins' implementation, I wonder if they couldn't be optimized. Each time a plugin is used, it checks the file size to decide whether to inline the file or not. I don't know how heavy that operation is, or whether there is some kind of cache in the system that avoids hitting the disk every time. And when the plugin decides to inline the file, it reads its contents and writes them into the page. I don't know precisely how heavy that is either.

Anyway, as I said, that's not much of an issue for me, so overall performance isn't affected. And that's easy to understand: it's quite easy to shave 100 ms off the front end (any HTTP request easily costs that much), but what is 100 ms on the back end? A whole lot of time: 100 ms of PHP execution (or Ruby, or Python, or Java, or C) is huge, since most operations won't take more than a few ms. So I think it's pretty safe to say that avoiding unnecessary HTTP traffic is worth a little extra work on the server. And that's the whole point! I see people working hard to optimize their server code, just to save 3 ms here and there. On the server side that may matter if you have tons of simultaneous connections, but the user won't even notice. When you start working just a little on front-end optimization, you save milliseconds in packs of 500!

More!

So, how could I get up to 100/100 on YSlow (and maybe PageSpeed)? Well, if I look at YSlow output, I see this:

[Screenshot: the Google Analytics script preventing a 100 YSlow A-grade score]

The Google Analytics script is what keeps me from the holy 100, just because it doesn't set a far-future expiration date, which makes it hard for the browser to cache. I don't know if there is any way to fix this, and I would be glad to hear of one. I'm pretty sure it would be a nice enhancement for the user, as this script doesn't download all that fast.

6 Jul 2011

Distributed, transparent NodeJS architecture in a hostile environment

Hi everyone.

As we're trying to redesign an application at ORU-MiP, we're wondering whether something like this already exists.

Let's settle the context first: it is an application designed for disaster and emergency situations. It allows users (health professionals) to quickly input victims' basic data. There are three main concerns about this application:

  1. Anybody must be able to use it under difficult, extreme circumstances. Just imagine you have a ton of victims coming at you, and you must ask their names, ages, etc. and type them in as fast as possible in an unfamiliar piece of software...
  2. It must just work. You have absolutely no time to configure anything: just plug your tablet PC / iPad / anything you want into the local network, type a URL in your browser, and you're ready to go.
  3. Maybe the most difficult part: we have absolutely no idea what we're going to encounter. We cannot rely on a single medium for the network part. 3G might not be available (imagine a subway crash). The machines to connect might be too far apart for RJ45. There might be radio interference, making any wireless attempt fail. A disaster can happen anywhere, anytime, so we must assume we are going to be in a really unfriendly environment. We're not talking about users in an office, sitting on a chair in front of a 24-inch screen here.

The last two points are especially important in this post.

So, what do we want to achieve? It's quite simple:

  1. Any client must be able to connect to the system and use it just by typing a URL in a browser.
  2. The clients will have everything stored locally after the first step (hello localStorage, hello cache), and must be able to keep working whatever the network state (see the sketch after this list).
  3. The server actually just broadcasts the messages it receives to the other connected clients (synchronization messages: create, update and delete).
  4. The server must be hosted on one of the clients (pre-installed). Less hardware to deploy is better.
  5. There might be more than one server on the network, just in case. In fact, every client will also be a server.
  6. If one server goes down, another must be able to replace it instantly, with no action from the user.
  7. Everything must be absolutely transparent for the user.
  8. The user must have nothing to do (did I already say that?)
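
To illustrate point 2, client-side storage can be as simple as the following sketch (the "victims" key and the record shape are made up for the example):

// Sketch only: keep the whole working set in localStorage so the client
// survives a network outage. Key and record shape are hypothetical.
function saveVictim(victim) {
  var all = JSON.parse(localStorage.getItem('victims') || '{}');
  all[victim.id] = victim;
  localStorage.setItem('victims', JSON.stringify(all));
}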

It seems like an HTML5 web app is a perfect fit for our needs: it combines ease of deployment (just type in a URL) with client-side storage and processing.

It could be really simple: one of the clients also hosts the server, and everything is synchronized through it. But what if this server fails? What if the tablet PC it's installed on runs out of battery? What if the system crashes? What if the tablet PC is destroyed (unlikely, but still possible)? And believe me, these considerations are not here just for fun. This happens, and when it does, the system must carry on with no action from the user.

So the idea is to duplicate the server on each client. We can pre-install the machines, as long as it isn't complicated, time-consuming work. The application is currently coded in C#/.NET, and it is a real pain to install/update the framework, install system updates, and so on. We want to get rid of all this.

OK, so we have a few machines with the same lightweight server on them. NodeJS and Socket.IO are good choices, as they let us build fast, responsive web apps. The clients will be single-page web apps, so this fits well.

Here is how I plan to deal with the complicated part: each client, once connected to its server, would establish a socket connection with every server on the network (we would use some network discovery protocol to achieve this). It would promote one of these sockets to the role of "pub" socket, used to publish messages (adds, updates and deletes of victims, for instance). The other sockets would only be used to subscribe (sub) to other clients' modifications.

So what happens when a client publishes a message on its "pub" socket? The server broadcasts the message to every connected client, which means every client, as every client is connected to every server via its "sub" sockets. Nice!

What happens if a server is down or unreachable? The client has a list of every other server on the network (remember the "sub" sockets). So it can promote a "sub" socket to the role of "pub" socket and use that one instead. The client will then publish its message to another server, chosen at random. That server will be able to broadcast the message to every other client, as every client is connected to every server.
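
Here is a rough Socket.IO sketch of this promotion logic. The URLs, event names and the discovery part are assumptions for illustration, not code from the project:

// Client side (sketch): one socket per known server; one of them has "pub" duty.
var servers = ['http://10.0.0.1:3000', 'http://10.0.0.2:3000']; // from network discovery
var sockets = [];
var pub = null;

servers.forEach(function(url) {
  var socket = io.connect(url);
  socket.on('connect', function() {
    if (sockets.indexOf(socket) < 0) sockets.push(socket);
    if (!pub) pub = socket; // first reachable server gets "pub" duty
  });
  socket.on('sync', applyRemoteChange); // "sub" role: receive broadcasts
  socket.on('disconnect', function() {
    sockets.splice(sockets.indexOf(socket), 1);
    if (socket === pub) { // promote a random surviving "sub" socket
      pub = sockets[Math.floor(Math.random() * sockets.length)] || null;
    }
  });
});

function publish(message) {
  if (pub) pub.emit('publish', message);
}

function applyRemoteChange(message) {
  // apply the remote create/update/delete to the local data
}

// Server side (separate program, sketch): stateless, just rebroadcasts what it receives.
io.sockets.on('connection', function(socket) {
  socket.on('publish', function(msg) { socket.broadcast.emit('sync', msg); });
});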

In the end, after a few network issues or unresponsive servers, we might end up with every client connected to a different server, but that's no big deal, as the servers are stateless and only used to broadcast client messages.

Moreover, any client that was not originally intended to be part of the system could come in and help. Just connect to one of the servers with your browser, and you're ready. That machine just won't be usable as a server, but that's not so important. If somebody walks by, we can ask for help, and in a few seconds they can be welcoming victims and typing in their names, ages, etc.

A little simplified diagram to sum things up (only two clients' sets of socket connections are represented, for the sake of readability):

[Diagram: distributed NodeJS architecture with automatic failover]

In fact, what I really dream of is the client-to-client communication protocol the W3C is working on. But in the meantime, we need a workaround, and for the moment, this is the best one I've found. Every other solution I imagined would fail someday under some circumstances (and remember, we're not talking about the traditional web here: we have to assume everything out there is trying to kill the system, and when it succeeds, it stacks another disaster on top of the first one), and, by failing, would require some user action to keep working.

Technically, it might be pretty simple to implement. But I'm wondering:

  • Is there already some NodeJS framework or module that does the same kind of thing?
  • If not, would you be interested in one?

Thanks for reading this long post.

14 Dec 2010

JavaScript variable declaration and hoisting

Do you know what JS hoisting is? If you do, I guess you already declare your variables properly. If you don't, I am pretty sure you can still improve your code a little and make it much more reliable.

If you have already experienced some strange variable behavior, like a variable's value being unexpectedly changed (or not changed), this post is for you.

Let's see how JS handles variables through a few examples.

JavaScript variable scoping

First, let's have a look at how JS variables are scoped. One may think that JS is a C-like language, and yes, the syntax is quite C-ish. But the language works quite differently, and variable scoping in particular is nothing like what you may know from C-like languages.

Most languages use block-level scoping. JS doesn't. It uses function-level scoping. You should keep this in mind at all times, as it is probably one of the greatest sources of confusion for newcomers.

So what happens if we execute this piece of code?

(function() {
  var a = 10;

  if (true) {
    var a = 35;
  }

  alert(a);
})();

Yes, it alerts 35. The same kind of program written in C would output 10 (provided you actually re-declare the variable inside the block).

Well that's the first surprise. And not the last, nor the least.

JavaScript variable declaration

Well, let's see what happens when we declare a variable.

Let's take the following code:

var a = 10;
alert(a);

(function() {
  alert(a);
  var a = 200;
  alert('a now contains: ' + a);
})();

What do we have here? The first alert says "10", as we could expect. But the second one alerts "undefined". Why on earth would a be undefined? Try removing the "var a = 200;" line: the second alert says "10" again, right? So this declaration/initialization line has something to do with the strange behavior. No matter where you declare a variable inside a function, the declaration holds for the whole function. What about the initialization? It stays right where you wrote it.

Finally, JavaScript hoisting

Wow wow wow... What's that? Does that mean JS doesn't care about how I write my code? Not really. In fact, the previous piece of code is interpreted like so:

var a = 10;
alert(a);

(function() {
  var a;
  alert(a);
  a = 200;
  alert('a now contains: ' + a);
})();

In fact, all the variable declarations, but not the initializations, are moved to the top of the enclosing function. This is commonly called hoisting.

Note that this applies to all declarations, even function declarations. Be careful though: functions defined as variables (function expressions) only get their name hoisted (as we saw in the previous example with variable a), not their body. "Traditionally" declared functions are hoisted entirely, name and body. Here is a little example you can run to see this phenomenon:

(function() {
  f1(); // Will run OK: the whole declaration of f1 is hoisted
  f2(); // Will throw a TypeError: f2 is declared (hoisted) but still undefined

  function f1() {
    alert("I'm in function f1");
  }

  var f2 = function() {
    alert("I'm in function f2, but I will never run before the f2 initialization...");
  };
})();

That may seem quite surprising, but that's how JS works. I can't say whether it's a good or a bad thing; honestly, I don't see any clear advantage or drawback. It's just another way of working. And since it pushes us to keep our code clean, with all vars declared at the top, it may well be a good thing.

Speaking of code quality, your code won't pass JSLint (when you select "the good parts") if you don't declare all your variables at once in each function.

But what should I do then?

This is why you should always declare your variables at the top of your functions. You'll have no bad surprises as long as you respect this rule.
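
For instance, here is a minimal example of the recommended style (the single var statement is what JSLint pushes you towards):

function sum() {
  var total = 0, // every declaration up front, in one var statement
      i;

  for (i = 0; i < 10; i += 1) {
    total += i;
  }
  return total;
}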

I hope this post was useful, and that it will help you achieve a better code quality.

6 Sep 2010

JavaScript Content Assist in Zend Studio 8

Ever wondered how to get the most out of Zend Studio for JavaScript? I had. With Zend Studio 8, JavaScript moves a bit more into the spotlight. So let's take advantage of this by configuring Content Assist for our favorite JavaScript libraries.

Well, Zend Studio 8 now natively supports jQuery. That's good. But what if, for example, I want to add support for OpenLayers? That's really simple, and it is worth it. Tired of checking the online documentation every 5 minutes when you have to create a new layer? This is for you.

Right-click on your project, then go to Properties, then JavaScript => Include Path.

24 Aug 2010

How to localize your JS

i18n (internationalization) and l10n (localization) are two major concerns on today's web, and JS is no exception to the rule. So here I propose a few ways to manage that essential task.

Simple JS files

This is the way I generally use: a single file per language, containing all the translated strings for all my JS. Variables follow the naming convention "i18n_english_word", to distinguish them from other, "standard" variables. For a French translation, it looks like this:

var i18n_ok = 'OK',
    i18n_cancel = 'Annuler',
    i18n_yes = 'Oui',
    i18n_no = 'Non',
    i18n_close = 'Fermer';

I usually suffix the filename with "_fr_FR.js" for a French translation file, "_en_US.js" for an American English one, and so on.
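
For what it's worth, picking the right file at runtime could look like the following sketch (the base filename is hypothetical, and you may prefer to emit the script tag server-side instead):

// Sketch: load the translation file matching the browser locale.
// navigator.language gives e.g. "fr-FR"; our filenames use "fr_FR".
var locale = (navigator.language || 'en-US').replace('-', '_');
var script = document.createElement('script');
script.src = 'i18n_' + locale + '.js';
document.getElementsByTagName('head')[0].appendChild(script);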

I use this technique because I usually don't have tons of strings to translate in my JS code, so it is sufficient. Moreover, there is absolutely no server-side overhead, compared to the other solutions I present here.