2014-01-15

I have to work with a lot (9) of different package managers at my daily work at VersionEye. Part of our mission is it to make manual updating of dependencies extinct, because it’s a manual and time consuming task which nobody enjoys. That’s why we are building a notification system for open source software libraries to make Continuous Updating easy and fun. And since we support several programming languages – 8 at this point! – I get to write crawlers and parsers for all of them. To give you a better overview over the strengths and weaknesses of these package managers, I picked the most popular one for each language and will compare them. The contenders are:

RubyGems / Bundler (Ruby)

PIP / PyPI (Python)

Packagist / Composer (PHP)

NPM (Node.JS)

Bower (JS, CSS, HTML)

CocoaPods (Objective-C)

Maven (Java)

Lein (Clojure)

What are package managers?

Package managers are tools for software developers to help them easily share and consume software libraries and of course to manage dependencies! Specially transitive dependencies. If you are a software developer and you are still downloading software libraries with your browser and placing them into the right directory by hand, then you are doing something wrong! That is totally 1999!

Nowadays every software developer uses at least 1 package manager at work. That means you define the software libraries (dependencies) you want to use in your project in a project file. The package manager is then  responsible for

downloading the software libraries from a repository

placing the downloaded libraries into the right place and linking them correctly into the project

resolving transitive dependencies

A package manager always has 2 parts:

a Repository of binaries or source code

a client which communicates with the repository and performs some work on the client side. This is in most cases a command line tool

The advantages of such a system is pretty clear. It’s much easier to stay in control of dependencies since you just change a number in a text file and execute a command. The package manager is doing all the work for you. Another big advantage is that you don’t need to checkin the actual software libraries into your SCM. You only checkin the project file from your package manager.

RubyGems / Bundler (Ruby)

RubyGems is the package manager for Ruby. Libraries that are bundled as such are also called gems. A gem contains a directory with source code and a *.gemspec file. The *.gemspec file contains meta information to the gem, such as name, version, author, license and of course the dependencies of the gem. A gem can be build with the command line tool “gem” like this:

And published to the central gem repository like this:

In an non library (gem) project the dependencies are mostly managed by bundler, a very popular dependency manager for RubyGems. Bundler manages dependencies defined in a Gemfile, a simple text file. Here is an example:

The file starts with defining the repository. Which is rubygems.org, the central repository for all ruby gems. Each line in the file defines a gem, with name and version. Gems can be grouped together for different environments. E.g. gems in the test group are not required on production, but in test environments.

You don’t need to define an exact version number. You can also define a range of versions and take advantage of pre defined operators. Everything well documented on bundler.io.

Just execute the command in the directory where the Gemfile is placed and it will fetch all dependencies from the repository, resolves transitive dependencies and places them into the right directory.

This command also creates or updates the file Gemfile.lock. This file contains the actually locked versions of the dependencies in the Gemfile and their transitive dependencies. The locked versions in the Gemfile.lock are specially important when you work with pessimistic version constraints. The Gemfile.lock is something you checkin to your SCM and never change by hand! More on that on bundler.io.

Pros

All gems are stored centralized on RubyGems.org. Really all ruby open source developers are using it and pushing their gems to it.

Learning curve is very low, very easy to understand how it works.

It’s very easy to publish new gems on rubygems.org. It takes less then a minute!

It has a very good REST JSON API.

Besides gems, a tag on a git repository can be defined as a dependency. That makes it very easy to fork & patch a software library.

Bundler supports semantic versioning and the pessimistic version constraints specifier enables to fetch the latest patches without breaking to support fetching always the newest patch/minor version of a gem.

There is a mirror at http://mirror1.prod.rhcloud.com/mirror/ruby/

Cons

Usually gems are not cryptographically signed! This can lead to security issues! It is possible to sign gems, but it’s not mandatory and most developers don’t sign their gems.

Defining a license for a gem is not mandatory. But most of the gems are using the MIT license.

You will not find much negative writing about it on the internet, besides the points which are listed here.

NPM (Node.JS)

NPM is the package manager for Node.JS. Libraries are stored as tgz files in the central Node.JS repository, which is npmjs.org. An NPM package is a zipped directory with source code and a package.json file, which contains all the meta information about the package, such as name, version, description, dependencies and so on. A package can be published like this:

Here is an example for a package.json file from a well known Node.JS library. I modified the file a bit to get it shorter.

By executing the following command in a directory with a package.json file, NPM will download all dependencies from the package.json file, resolve transitive dependencies and place them into the right place.

Pros

All packages are centralized at npmjs.org.

Learning curve is very low.

It’s very easy to publish new packages on npmjs.org.

It has a very good REST JSON API.

There is an NPM mirror in Europe.

Besides NPM packages, a tag on a git repository can be defined as a dependency. That makes it very easy to fork & patch a software library.

NPM supports semantic versioning and has an own operator which supports fetching always the newest patch/minor version of a package.

Cons

The NPM packages are not signed! That might lead to security issues!

Defining a license for a package is not mandatory.

NPM is a very young package manager. They learned from the failures of other package managers. It’s almost perfect!

Packagist / Composer (PHP)

Composer is the new package manager for PHP. It was written to replace PEAR. There are still less then 1.000 packages on Pear, but already more then 25K packages on packagist.org, the central repository for Composer packages.

Composer is similar to NPM. Dependencies are defined in a JSON file, called composer.json. Here is a very simple example:

Executing this command in a directory with composer.json will install all dependencies into your project and generate a composer.lock file, same as in Ruby the Gemfile.lock.

One big difference to NPM is that packagist.org doesn’t host any files. It is completely based on Git/Svn/Hn tags. Submitting a package to packagist.org means submitting a link to a public Git, Subversion or Mercurial repository. packagist.org expects a composer.json file in the root directory of the master branch AND tags on the repository. The name of the tag is the version number displayed on packagist.org.

Pros

All packages are centralized at packagist.org.

Learning curve is very low.

It’s very easy to publish new packages.

It has a very good REST JSON API.

Licenses are mandatory.

It supports semantic versioning and has an own operator which supports fetching always the newest patch/minor version of a package.

Composer has a very unique and cool feature called “minimum stability”, which allows to specify the minimum stability for a requested dependency. More on that on the official documentation.

Cons

There are no mirrors. If packagist.org is down you can not fetch packages anymore.

The packages are not signed! It might lead to security issues!

The Composer / Packagist project is even younger then NPM. It is a big step forward for the whole PHP community. All big PHP frameworks moved already to Composer / Packagist.

PyPI (Python)

PyPI is the central repository for python packages. Dependencies are defined in a setup.py file, which is pretty much pure Python code. Here is an example:

Installing the dependencies works like this:

The packages at PyPI are hosted as “*.tar.gz” files. Each packages contains the source code and a setup.py file with meta informations, such as name, version and dependencies.

With a little bit preparation a package can be published to PyPI with this 2 commands:

Pros

All packages are centralized at PyPI.

Learning curve is very low.

It’s easy to publish new packages on PyPI.

There are multiple mirrors for fail-over scenarios.

Cons

The packages are not signed! That can lead to security issues.

Defining a license for a package is not mandatory.

No build-in support for semantic versioning.

PyPI is a robust package manager! There is room for improvements, that for sure. But the core commiters behind the project are aware of that and new features are in the pipeline.

CocoaPods (Objective-C)

CocoaPods.org is the central repository for CocoaPods, a package manager for Objective-C software libraries and mainly used by iOS developers. The CocoaPods command line tool is implemented in Ruby and hosted on RubyGems.org. It can be installed like this:

Dependencies are defined in a simple text file called Podfile. Here a simple example:

Executing this command in the root directory will install the dependencies:

Currently CocoaPods is completely relying on GitHub as backend. There is one CocoaPods/Specs repository with all Pod specifications available on CocoPods.org. Submitting a new pod package works via pull-request. Each pull-request gets reviewed and merged by a human after passing automated tests. This doesn’t scale infinitely but it scales for the current size and guarantees a high quality for the pod specs.

Pros

All packages are centralized at CocoaPods/Specs.

Learning curve is very low.

Very good documentation.

It’s easy to publish new packages, with a pull-request.

License information is mandatory! Pull-requests for Pods without license definition will not be merged.

It supports semantic versioning and has an own operator which supports fetching always the newest patch/minor version of a package.

Cons

The packages are not signed! That can lead to security issues.

No mirrors available.

CocoaPods is a young project. The quality of the Pod specs is very high, because of the human review. If they grow to the size of RubyGems this workflow will not scale anymore. But for right now it’s good enough.

Bower (Frontside JS & CSS)

Bower.io is a package manager for frontside JavaScript and CSS. Pretty much for everything what you are loading from the server to the browser. Packages like jQuery and Bootstrap for example.

The bower command line tool is available as package on NPM. It can be installed globally like this:

Installing a bower package works like this:

It downloads the package and it’s dependencies into a directory called bower_components.

Every bower package is described in a bower.json file. Here’s an example:

Bower is completely Git based and works without any user authentication. Everybody can register new packages like this:

bower register <my-package-name> <git-endpoint>

Bower expects the root of the git-endpoint to contain a bower.json file, describing the bower package. And it expects that there are some tags on the Git repository. The names of the tags are the version numbers.

To unregister a package you have to ask the maintainers in the longest GitHub issue in the history of mankind.

Pros

All packages are centralized at bower.io.

Learning curve is very low.

It’s easy to publish new packages. Works even without registration.

Cons

The packages are not signed! That can lead to security issues.

No mirrors available.

The process for unregistering a package is a No-Go.

License informations are not mandatory.

Many registered packages doesn’t provide a bower.json in their git repository. The package registration process doesn’t check if the package is valid, if it really contains a bower.json file or if there are some tags on the repository.

Bower is kind of cool, but the quality of the packages is many times bad. Unregistering a package is a pain. And a couple hundred registered packages even doesn’t provide a bower.json file in their repositories. User authentication and package validation would be necessary to improve this.

Maven (Java)

Maven is the package manager for Java. The central Maven repository is Search.maven.org.

A project and it’s dependencies are described in an pom.xml file. Here is an example.

Each dependency is identified by:

groupId

artifactId

version

Other properties like scope are optional. As you can see you need at least 5 lines to define 1 single dependency. Dependencies are checked, fetched, resolved and installed on each compile time.

The packages on the Maven Repository Server are pretty much maintained in a pre defined directory structure, based on groupId, artifactId and version of the artifact. The artifacts themselves are binaries, mostly *.jar files. You can see it on this mirror from ibiblio.

Publishing an artifact on the central maven repository takes only 1 week. No! I’m not kidding you! First of all you have to signup at JIRA and open a ticket. The whole process is described here. And be prepared to scroll

Pros

The artifacts are singed!

Licenses are mandatory for publishing!

There are many mirrors available all over the world.

Cons

Pushing artifacts to the maven central repository is not as easy as it should be. Because this process is a pain in the ass ALL big Java Frameworks and Companies are maintaining their own Maven Repository Servers. They push their artifacts first to their own MVN Server and then 1 or 2 weeks later the artifact appears on search.maven.org.

Not really centralized. Because of the previous point. There are at least 30 different public maven repository servers out there, who are not mirroring search.maven.org, but host other artifacts who are not available on search.maven.org. Java developers have to google for the right repository server with the desired artifacts.

Maven is a very complex tool! A pom.xml file can inherit from another pom.xml file. And a pom.xml file can include other pom.xml files. All that leads to high complexity and sometimes to resolution errors. Another side effect of this architecture is that resolving transitive dependencies this way is very slow, because different xml files have to be downloaded and parsed to build a complete model for 1 single artifact.

Maven violates the “Single Responsibility” pattern. Maven is not only doing dependency management! It is doing all kind of things which has nothing to do with dependency management. For example executing tests, generating reports, generating JavaDoc, doing deployments and many other things which are, in other languages, done by other tools.

Not sure why an artifact/package needs a “GroupId” AND an “ArtifactID” AND a version number. All other package managers are satisfied with a name and a version. KIS!

Having 5 lines of XML code for 1 single dependency is kind of overkill!

Maven is by far the most complex package manager I know. And know many of them!

Lein (Clojure)

Leiningen is the package manager for Clojure. It uses a maven repository as backend. Lein itself has the same scope as Maven. It’s not just doing dependency management! At the same time it’s a build tool. Dependencies are defined in a project.clj file. Here an example:

Defining a dependency fits here into 1 line. And “GroupId” is not required for Clojure packages! Leiningen is using 2 sources for the dependency resolution.

For dependencies with GroupId it uses the central maven repository.

For dependencies without a GroupId it uses clojars.org, which is a maven repository, as well.

The dependencies in a project.clj file can be explicitly fetched with this command:

But they will be fetched anyway at compile time.

Publishing a package on clojars.org is pretty easy, after you signed up.

Pros

The artifacts are signed!

It simplifies the dependency definition of maven.

It’s easy to publish new artifacts.

It’s easy to learn how it works.

It allows to reuse Java artifacts from other maven repositories.

Cons

It violates the “Single Responsibility” pattern. Same as Maven.

Licenses are not mandatory.

No Mirrors.

Leiningen is not perfect, but it is doing many things right.

Maven based package managers

There are many languages running on the JVM and most of them have their own package managers. But all of them are more or less based on Maven repositories. That means they all use a Maven Repository as backend and just have a better client (command line tool) and different conventions to deal with the backend.  These are the package managers from this family:

Leiningen (Clojure)

Gradle (Groovy)

SBT (Scala)

Ivy (Based on Ant)

Ivy, SBT and Gradle support own repository types, beside Maven Repositories. But it kind of makes sense that they all access Maven Repositories, because they all want to reuse existing Maven artifacts.

Conclusion

All discussed package managers are open source and can be hosted internally as well. All of them are easy to get started with, except for Maven which is more complex and less fun. I’ve created a comparison matrix to highlight the differences with the most important attributes on the x axis. Unfortunately, there is no single package manager that offers every desired feature:



There is no clear winner and imho, there is no perfect package manager There are of course more package managers, such as CPAN (Perl) and Nuget (.NET), and we hope to be able to cover them in the future. You can find a complete list of package managers here. We are working continuously to add additional languages to our API and make VersionEye available for every software developer in the world!

What is your favorite package manager? And why? And which features are missing in current package managers? I’d love to hear your thoughts! Leave a comment or send me a tweet.

Follow the discussion on hacker news and Reddit.

if you read this far, you might want to monitor your packages on VersionEye ;-)

Show more