1 Introduction
One of the things that makes working with Node fundamentally different from other server-side scripting environments is the functional paradigm provided by JavaScript, which allows for asynchronous, non-blocking operations through the use of callbacks. This means that instead of executing a script from beginning to end, Node is always running, with functions attached as listeners to events, ready to do our bidding.
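As a minimal sketch of that listener model, the snippet below uses Node's built-in events module. The station emitter and the 'reading' event name are made up for illustration.

```javascript
const EventEmitter = require('events');

const station = new EventEmitter(); // hypothetical event source
const readings = [];

// Register a listener. Nothing runs until the event fires.
station.on('reading', function (temperature) {
  readings.push(temperature);
  console.log('received reading:', temperature);
});

console.log('listener registered, script keeps going');

// emit() calls the listeners right away; in a real server the emit
// would be triggered whenever I/O (a request, a file change) arrives.
station.emit('reading', 21.5);
station.emit('reading', 19.0);
```

The script does not "wait" for a reading anywhere; it only declares what should happen when one shows up.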
Node does not force you into an architecture, nor does it offer a firm opinion outside of providing JavaScript as a connector between file systems, operating systems, and the internet at large. This interconnection is the core of what Node provides, supplemented by thousands of community-written modules: the Node 'ecosystem'.
2 Node Modules
Node modules are written in JavaScript and compiled at runtime, which offers developers the ability to see how modules work, evaluate differing development practices, and diagnose the inevitable bugs that show up. Due to their modular implementation, Node modules don't especially need to be compatible with each other. Several standards, such as Express and Connect, have established a common API to handle common problems, so there are many modules that offer Express compatibility, or use Express as part of their core functionality.
Node modules provide an interchangeable extensibility that is important to get comfortable with; we'll dive into npm first so you can find and utilize modules for your project. At the end of the day, much of what Node developers wrangle with are event callbacks and object manipulation, so we will also explore the async and lodash modules which make these two tasks much easier.
3 NPM
The Node philosophy aligns with the UNIX philosophy of modularity and reusability, and this is highlighted by npm, the Node package manager.
3.1 npm init
The best place to start using npm with a project is to run npm init, which will walk you through some questions and create a package.json in your folder.
3.2 package.json
Every Node module uses a package.json file to determine what modules need to be installed in order to function properly. Use npm init to generate your own.
This file also holds [other meta information](https://www.npmjs.org/doc/files/package.json.html), such as version, descriptions, authors, license, and script hooks. After downloading / cloning a Node project, run npm install to download all the dependencies.
Make sure to .gitignore your node_modules folder, lest you embarrassingly commit all your modules to your repository, an act which will require some non-trivial git sorcery to erase.
When using npm in non-trivial projects, append --save, e.g. npm install <package> --save, in order to add the package information to package.json automagically. (npm 5 and later save installed packages to package.json by default.)
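To make the shape of package.json concrete, here is a rough sketch written as a JavaScript object. The project name, description, and version ranges are illustrative assumptions, not taken from a real project.

```javascript
// Hypothetical manifest, roughly the shape `npm init` writes to package.json.
const manifest = {
  name: 'weather-station',          // made-up project name
  version: '0.1.0',
  description: 'Logs daily average temperatures',
  license: 'MIT',
  scripts: { start: 'node index.js' },
  dependencies: {                   // what `npm install <package> --save` adds to
    async: '^3.2.0',
    lodash: '^4.17.21'
  }
};

// package.json is plain JSON, so it round-trips through stringify/parse.
const fileContents = JSON.stringify(manifest, null, 2);
const dependencyNames = Object.keys(JSON.parse(fileContents).dependencies);

console.log(dependencyNames);
```

The dependencies block is what npm install reads to decide which modules to download.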
3.3 npm & npm install (global vs local)
npm offers two 'locations' for packages, local and global. Local packages are installed in a project's node_modules folder, using the command npm install <package>. Global packages are installed using the -g flag, e.g. npm install -g <package>, and -- if they are designed for it -- can be installed in your environment and accessed from the command line.
Here's an example:
Suppose I have a home weather system that writes the average temperature for a day to a file. As I have been deeply immersed with developing Node, I haven't been outside in two weeks. I still want to know the hot days so I can have a point of common reference with other humans when I see them. Reading a line of numbers, however, would be an undue cognitive burden, so I desire a graphical representation that I can summon from my command line. Enter global Node module 'sparkly'.
Installed with npm install -g sparkly, it provides a simple command-line-driven sparkline that can be piped into. Given any set of numeric information, we can get a rough overview of their relationship. tail -n14 weather.tmp | sparkly shows me how the last two weeks would have been had I stepped outside; now I can complain accordingly and accurately, thanks to my weather station, Node and sparkly.
Admittedly, by itself, this is a limited utility, but like every Node module, it can be combined with other modules, or with UNIX tools in the case of global modules. The universe of small pieces can be assembled into greater wholes, and despite there being thousands of modules, there is still plenty of room for improvements and innovations.
One of the headaches of navigating the sea of modules is discovery and documentation. At the time of this writing, npm search cli returns 7500 modules, many of which are referring to 'client', but the rest can be used on the command line.
Searching npm for any sort of adapter, protocol or standard often turns up a match. Many modules are wrappers on top of C++ codebases, which brings many of the advances of the past decades of computing to your JavaScript-sullied fingertips.
3.5 npm repo / npm docs
Navigating the documentation for so many modules can be overwhelming. npm repo <package> or npm docs <package> will bring you to a website (often a GitHub page) where you can read up on how to use a given module. As you will be accessing said documentation very often, it is useful to know how to get there quickly and with a minimum of effort.
As previously mentioned, module source code is readily accessible, where it is hopefully well commented and clearly written, so one can understand the underlying mechanisms which may be causing headaches or frustrations in implementation.
4 Must-use modules
Two modules I use in almost every project are async and lodash.
Async helps to manage the 'callback hell' that nested callbacks can create. Since the depth of a function becomes its execution order, it becomes very visually confusing after a depth of 4 or 5 functions and any amount of conditionals or forking.
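For a feel of how quickly nesting piles up, here is a small sketch using only plain callbacks. getUser, getPosts and getComments are hypothetical stand-ins for real I/O calls.

```javascript
// Hypothetical async steps; each calls back with (err, result) on a later tick.
function getUser(id, cb) { setImmediate(cb, null, { id: id, name: 'sam' }); }
function getPosts(user, cb) { setImmediate(cb, null, ['post-1', 'post-2']); }
function getComments(post, cb) { setImmediate(cb, null, [post + '/comment-1']); }

function loadFirstComments(userId, done) {
  getUser(userId, function (err, user) {
    if (err) return done(err);
    getPosts(user, function (err, posts) {
      if (err) return done(err);
      getComments(posts[0], function (err, comments) {
        if (err) return done(err);
        // Three levels deep already, and every level repeats the error check.
        done(null, { user: user.name, comments: comments });
      });
    });
  });
}

loadFirstComments(1, function (err, result) {
  console.log(err, result);
});
```

Add a conditional or a fourth step and the indentation starts to hide the actual order of operations.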
Lodash helps to manipulate objects and arrays. Moving information between databases, clients and servers is a big part of working with Node, and having powerful tools like lodash helps tremendously in this regard.
4.1 Lodash
The spiritual successor to underscore.js, lodash maintains backwards compatibility with underscore code (if you use the compatible version), and takes bold steps forward to help you manipulate those objects and arrays quickly and easily.
There are many more methods in lodash than I've covered here; these are the ones I find myself using the most.
4.1.1 Array - []
first
Pop, shift, slice? How about just give me the _.first x numbers of elements.
rest
Push, unshift or splice? What about when you just want everything after the first item (or first few items) of an array?
compact
Sometimes you end up with a bunch of empty fields in an array. _.compact removes every falsy value: false, null, 0, "", undefined and NaN. Useful for keeping data tidy.
flatten
Nested arrays are often part of working with JavaScript; use _.flatten to get rid of the unneeded structure.
each
Good old forEach, except it works in all browsers and is a little trimmer.
It's important to notice that lodash uses the following formats on anonymous functions:
Dealing with arrays
Dealing with objects
This may be second nature to anyone used to for-in loops, but it could be a stumbling block for anyone used to the key, value notation.
Collections are arrays of objects, and while the above functions work perfectly well on them, there are other ways we can traverse and manipulate them.
4.1.2 Collections - [{},{},{}]
map
Map is useful when you need to transform values of an array or object. Unlike each, this generates a new array.
reduce
_.reduce turns an array or collection into a single value, using an accumulator as a starting point and passing it into a function for each iteration.
filter
For the next three functions, assume this dataset:
_.filter returns an array of objects which all pass a truth test.
pluck
Pluck is a specialized map: instead of an accumulated value, as with reduce, you get an ordered array holding one property's value from each object in the collection. Useful when you need to build isolated datasets out of large collections.
where
Where is a shorthand for what you might otherwise accomplish with _.filter.
4.1.3 Objects - {}
clone
One of the gotchas in JavaScript-land is storing an object as a property of another object and then changing that property, thinking that we are affecting a copy of the object. Not true. Only a reference to the original object is stored, so the original changes too.
extend
_.extend divides your development experience into before you used extend and after you used extend. Simply, it merges objects. Properties of subsequent objects will override those of earlier ones.
has
Useful as a safe alternative to if (obj.prop) conditional testing.
isEmpty / isPlainObject / isUndefined / isEqual / isNull
These simple conditional tests are extremely useful, abstract away all sorts of edge cases, and are easy to use.
transform
_.transform is similar to reduce, in that it runs a function for each element of an array or object. Transform does not need an explicit starting point for the accumulator however.
pick
_.pick can be handy when you need to turn a large object into a specific smaller one.
4.1.4 Chains - _.()
Chains allow you to put together all your lodash knowledge in a very powerful sequence of code.
Chains can be started with _.chain(value) or the shorthand _(value).
The value is then substituted for the input value of each chainable function that supports it. The lodash documentation lists which methods are and aren't chainable, as well as the native Array methods supported by lodash.
If you want your chain to return a value, not just manipulate provided arrays, you have to use .value() at the end of your chain.
Chains keep code concise and readable for easy maintenance. If, for debugging, or for detailed manipulation, you need to access an array in the middle of chaining, you can use the _.tap method.
4.2 Async
One of the major stumbling blocks for new people trying to use Node is learning the difference between asynchronous and blocking code.
A good rule of thumb that I use to teach new Noders:
If it uses an =, it's blocking, and there is no need for async. If you find this particular calculation is taking too long, you may wish to consider a Worker.
If a function uses a callback, e.g. callFoo(bar, function(err, data) { ... }), it's asynchronous, and you may wish to manage this complexity with async.
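The rule of thumb can be seen with Node's own fs module. This is only a sketch; the scratch file name is an arbitrary choice.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Arbitrary scratch file in the temp directory, created just for this demo.
const file = path.join(os.tmpdir(), 'rule-of-thumb-demo.txt');
fs.writeFileSync(file, 'hello');

// Uses "=": blocking. Nothing else runs until the read has finished.
const blockingResult = fs.readFileSync(file, 'utf8');

// Uses a callback: asynchronous. The script moves on and the callback runs later.
const order = [];
fs.readFile(file, 'utf8', function (err, data) {
  if (err) throw err;
  order.push('callback: ' + data);
  fs.unlink(file, function () {}); // clean up the scratch file
});
order.push('after readFile call');

console.log(blockingResult, order);
```

At the time of the console.log, order contains only the 'after readFile call' entry, because the callback has not run yet.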
4.2.1 Arrays
each
async.each applies an asynchronous function to each element in an array.
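The final function is called once every element has been processed, or as soon as one of them reports an error; for each it receives only that error, while functions such as map and filter also pass along the combined results. This is a pattern you will see frequently throughout async.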
(each and filter examples borrowed from the async docs)
eachSeries
Just like each, except this guarantees the order of operations will be the same as the parent array.
filter
async.filter, just like in lodash, keeps the elements for which a testing function reports true. In this case, fs.exists passes true to its callback for files that exist and false for files that do not. (fs.exists has since been deprecated in Node in favor of fs.access and fs.stat.)
map
async.map runs arrays through an asynchronous transformation function. If your database adapter does not support arrays, this may help get the results you need.
4.2.3 Control flow
Controlling the order of operations is a critically important part of working with Node, as you cannot rely on procedurally executed code to ensure completion. With great power comes great responsibility, so we move forth.
If any of the child functions passes an error to its callback, the operation will stop and the final function will handle the error.
series / parallel
async.series and async.parallel are identical in structure. Each takes an array or an object full of functions as the first parameter, and a final callback as the second.
With series, the array declaration determines the order in which the asynchronous operations are executed. Using an object with series does not guarantee order of execution.
Unless there is a logical reason to use series, most asynchronous operations should be done in parallel: async starts every function without waiting for the previous one to finish, so I/O-bound work overlaps and the whole batch completes sooner. No extra processes or threads are created; everything still runs on Node's single event loop.
Parallel/Series has two styles of declaration: object and array.
Array declaration
Will return the results in the order defined by the array, regardless of execution time.
Object declaration
Will return a keyed object in the final results.
whilst
async.whilst is a while loop, asynchronous style, repeatedly executing a function for as long as a conditional returns true.
whilst runs the test first, then runs the function. doWhilst runs the function first, and then does the test. Note that the order of the parameters is reversed for doWhilst: the first is the function, the second the conditional.
until
async.until is the logical inverse of whilst, executing a function until the conditional returns true.
doUntil runs the test after the function, and the parameters are switched accordingly.
queue
async.queue is a quick worker delegation system, for managing the execution of asynchronous functions with a dynamic approach. Useful for when you have a variable number of tasks, or want to run a bunch of them at once with a cap on concurrency.
Queue can pause and resume, much like a stream, so you can have very fine-grained control over execution. If you need more control, also examine priorityQueue.
waterfall
async.waterfall is a very useful function that will overtake your development world if you let it. It abstracts one of the most common asynchronous patterns, taking information from one callback and using it to inform another in a linear order.
waterfall-queue
Async opens the door for all sorts of design patterns to re-use throughout your projects. One that I find very enjoyable to use is waterfall-queue.
In the waterfall we build up a series of objects with database calls. In the queue, we save all the modified instances.
The following example assumes some sort of ORM like Mongoose or Waterline:
4.3 What's next?
A recently introduced module called Highland (by the creator of Async) seems to combine data manipulation with asynchronicity -- utilizing a stream-based approach -- and seems promising.
5 Conclusion
We learned about the entry point to the Node ecosystem, npm. Now you can find any module for almost any use scenario to make your development faster, or educate yourself about how to approach a problem.
We scratched the surface of lodash, which will save you from writing hundreds of lines of repetitive code, countless iterators, loops and conditionals, and ultimately save the sanity of you and whoever has to maintain your code.
Finally, we dove into the most used parts of async, to help soften the Node learning curve and make it less painful to move from a procedural world to a functional one.
I hope reading this was as enjoyable as writing it was, and if you'd like to engage in irreverent coding banter, I'm @seancanton, or we can AirPair!