I’ve been building enterprise Java web apps since servlets were created. In that time the Java ecosystem has changed a lot but sadly many enterprise Java developers are stuck in some very painful and inefficient ways of doing things. In my travels I continue to see Java The Sucky Parts – but it doesn’t have to be that way. It is time for enterprises to move past the sucky ways they are using the Java platform. Here is a list of the suckiest parts of Java that I see most often and some recommendations for how to move past them.
10 Page Wikis to Setup Dev Environments Suck
Setting up a new development environment should be no more than 3 steps:
Install the JDK
Clone / checkout the SCM repo
Run the build / start the app
Seriously. It can and should be that easy. Modern build tools like Gradle and sbt have launchers that you can drop right into your root source tree so that new developers can just run ./gradlew or./activator (for sbt). The build should have everything needed to get the app up and running – including the server. The easiest way to do this is to go containerless with things like Play Framework and Drop Wizard but if you are stuck in a container then consider things like Webapp Runner. One of the many problems with the container approach is the very high probability of running into the it works on my machine syndrome because environments easily differ when a critical dependency exists outside of the realm of the build and SCM. How many wikis keep the server.xml changes up-to-date? Wiki-based configuration is a great way to cause pain.
What about service dependencies like databases and external web services – don’t developers need to set those things up and configure them? Not if your build can do it for them. Smart build systems should be able to provision the required services either locally or on the cloud. Docker has emerged as a great way to manage local environments that are a replica of the production system.
If your app needs a relational database then use an in-memory db like hsql or cloud services like Heroku Postgres, RDS, Redis Labs, etc. However one risk with most in-memory databases is that they differ from what is used in production. JPA / Hibernate try to hide this but sometimes bugs crop up due to subtle differences. So it is best to mimic the production services for developers even down to the version of the database. Java-based databases like Neo4J work the same in-memory and out-of-process minimizing risk while also making it easy to setup new development environments. External web services should either have a sandbox host that can be used by developers or the web services should be mocked.
Incongruent Deployment Environments Suck
To minimize risk when promoting builds from dev to staging to production, the only thing that should change between each environment is configuration. A deployable artifact should not change as it moves between environments. Continuous Integration systems should run the same build and tests that developers run. Have the CI system do automatic deployment to a testing or staging environment. A proper release pipeline makes it easy to promote a deployable artifact from staging to production.
I used to maintain a Java web app where the deployment process went like this:
Build a WAR file
SCP the WAR file to a server
SSH to the server
Extract the WAR file
Edit the web.xml file so it contains new database connection info
Restart the server
That setup isn’t the worst I’ve seen but it is was always risky. It would have been much better to utilize environment variables so the only thing that changed between environments was those variables. Environment variables can be automatically read by the app so that the artifact stays the exact same. In this setup reproducing an environment is super easy – just set the env vars.
Servers That Take More Than 30 Seconds to Start Suck
For developer productivity and so that scaling up can happen instantly, servers should startup quickly. If your server takes more than 30 seconds to start then break the app into smaller pieces, adopting a Microservices architecture. Going containerless or having a one-app-per-container rule can really help reduce startup time. If your container takes a long time to start you should ask yourself: What are all those container services there for? Can the services be broken out into separate apps? Can they be removed or turned off?
If you need some ammunition to prove to your management that your startup times are killing your team’s productivity then use the stopwatch on your phone to count the total minutes per day wasted by waiting for the app to start. Bonus points if you calculate out how much wasted money that translates to for yourself, your team, and your org. Double bonus points if you show a chart that defeats the “we spent a lot of money on this app server” sunk cost argument.
Manually Managed Dependencies Suck
It sucks if any of your library dependencies aren’t managed by a build tool. Manually copying Jar files into the WEB-INF/lib is horribly error prone. It makes it hard to correlate files to versions. Transitive dependencies are addressed by ClassNotFound errors. Dependencies are brittle. Knowing the libraries’ licenses is hard. Getting your IDE to pull the sources and JavaDocs for the libraries is tough.
So first… Use a build tool. It doesn’t matter if you choose Ant + Ivy, Maven, Gradle, or sbt. Just pick one and use it to automatically pull your dependencies from Maven Central or your own Artifactory / Nexus server. With WebJars you can even manage your JavaScript and CSS library dependencies. Then get fancy by automatically denying SCM check-ins that include Jar files.
Unversioned & Unpublished Libraries Suck
Enterprises usually have many libraries and services shared across apps and teams. To help make teams more productive and to enable managed dependencies these libraries should be versioned and published to internal artifact servers like Nexus and Artifactory. SNAPSHOT releases should be avoided since they break the guarantee of a reproducible build. Instead consider versioning based on your SCM information. For instance, the sbt-git plugin defaults the build version to the git hash or if there is a git tag for the current position then the tag is used instead. This makes published releases immutable so that library consumers know exactly the correlation between the version they are using and the point-in-time in the code.
Long Development / Validation Cycles Really Suck
Billions of dollars a year are probably wasted with developers just waiting to see / test their changes. Modern web frameworks like Play Framework and tools like JRebel can significantly reduce the time to see changes. If every change requires a rebuild of a WAR file or a restart of a container then you are wasting ridiculous amounts of money. Likewise, running tests should happen continuously. Testing a code change (via reloading the browser or running a test) should not take more time than an incremental compile. Web frameworks that display helpful compile and runtime errors in the browser post-refresh are also very helpful to reduce long manual testing cycles.
When I work on Play apps I am continuously rebuilding the source on file save, re-running the tests, and reloading the web page – all automatically. If your dev tools & frameworks can’t support this kind of workflow then it is time to modernize. I’ve used a lot of Java frameworks over the years and Play Framework definitely has the most mature and rapid change cycle support. But if you can’t switch to Play, consider JRebel with a continuous testing plugin for Maven or Gradle.
Monolithic Releases Suck
Unless you work for NASA there is no reason to have release cycles longer than two weeks. It is likely that the reason you have such long release cycles is because a manager somewhere is trying to reduce risk. That manager probably used to do waterfall and then switched to Agile but never changed the actually delivery model to one that is also more Agile. So you have your short sprints but the code doesn’t reach production for months because it would be too risky to release more often. The truth is that Continuous Delivery (CD) actually lowers the cumulative risk of releases. No matter how often you release, things will sometimes break. But with small and more frequent releases fixing that breakage is much easier. When a monolithic release goes south, there goes your weekend, week, or sometimes month. Besides… Releasing feels good. Why not do it all the time?
Moving to Continuous Delivery has a lot of parts and can take years to fully embrace (unless like all startups today, you started with CD). Here are some of the most crucial elements to CD that you can implement one-at-a-time:
Friction-less App Provisioning & Deployment: Every developer should be able to instantly provision & deploy a new app.
Microservices: Logically group services/apps into independent deployables. This makes it easy for teams to move forward at their own pace.
Rollbacks: Make rolling back to a previous version of the app as simple as flipping a switch. There is an obvious deployment side to this but there is also some policy that usually needs to go into place around schema changes.
Decoupled Schema & Code Changes: When schema changes and code changes depend on each other rollbacks are really hard. Decoupling the two isolates risk and makes it possible to go back to a previous version of an app without having to also figure out what schema changes need to be made at the same time.
Immutable Deployments: Knowing the correlation between what is deployed and an exact point-in-time in your SCM is essential to troubleshooting problems. If you ssh into a server and change something on a deployed system you significantly reduce your ability to reproduce and understand the problem.
Zero Intervention Deployments: The environment you are deploying to should own the app’s config. If you have to edit files or perform other manual steps post-deployment then your process is brittle. Deployment should be no more than copying a tested artifact to a server and starting it’s process.
Automate Deployment: Provisioning virtual servers, adding & removing servers behind load balancers, auto-starting server processes, and restarting dead processes should be automated.
Disposable Servers: Don’t let the Chaos Monkey cause chaos. Servers die. Prepare for it by having a stateless architecture and ephemeral disks. Put persistent state in external, persistent data stores.
Central Logging Service: Don’t use the local disk for logs because it prevents disposability and makes it really hard to search across multiple servers.
Monitor & Notify: Setup automated health checks, performance monitoring, and log monitoring. Know before your users when something goes wrong.
There are a ton of details to these that I won’t go into here. If you’d like to see me expand on any of these in a future blog, let me know in the comments.
Sticky Sessions and Server State Suck
Sticky sessions and server state are usually one of the best ways to kill your performance and resilience. Session state (in the traditional Servlet sense) makes it really hard to do Continuous Delivery and scale horizontally. If you want a session cache use a real cache system – something that was designed to deal with multi-node use and failure. e.g. Memcache, ehcache, etc. In-memory caches are fast but hard to invalidate in multi-node environments and are not durable across restarts – they have their place, like calculated / derived properties where invalidation and recalculation are easy.
Web apps should move state to the edges. UI-related state should live on the client (e.g. cookies, local storage, and in-memory) and in external data stores (e.g. SQL/NoSQL databases, Memcache stores, and distributed cache clusters). Keep those REST services 100% stateless or else the state monster will literally eat you in your sleep.
Useless Blocking Sucks
In traditional web apps a request comes in, fetches some data from a database, creates a webpage, and then returns it. In this model it was ok to give that full roundtrip a single thread that remained blocked for the entire duration of the request. In the modern world requests often stay open beyond the life of a single database call because either it is a push connection or because it is composing multiple back-end services together. This new world requires a different model for how the threads / blocking is managed. The modern model for dealing with this is called async & non-blocking or Reactive.
Most of the traditional Java networking libraries (Servlets, JDBC, Apache HTTP, etc) are blocking. So even if a connection is idle (like when a database connection is waiting for the query to return), a thread is still allocated. The blocking model limits parallelism, horizontal scalability, and the number of concurrent push connections. The Reactive model only uses threads when they are actively doing something. Ideally your application is Reactive all the way down to the underlying network events. When a request comes in it gets a thread, then if that request needs to get data from another system the thread handling the request can be returned to the pool while waiting for the data. Once the data has arrived a thread can be reallocated to the request so the response can be returned to the requestor.
Java has a great foundation for Reactive with Java NIO. But unfortunately most of the traditional Java web frameworks, database drivers, and HTTP clients do not use it. Luckily a whole new landscape of Reactive libraries and frameworks is emerging that is built on NIO and Netty (a great NIO library). For example, Play Framework is a fully Reactive web framework which many people use with Reactive database libraries like Reactive Mongo.
To be Reactive means that you also need to have a construct for being asynchronous. The traditional way to do this in Java is with anonymous inner classes, like:
Java 8 provides a much more concise syntax for asynchronous operations with Lambdas. The same Reactive request handler above with Java 8 & Lambdas is:
If your app does things in parallel and/or handles push connections then you really should be going Reactive. Check out my Building Reactive Apps presentation if you want to dive in deeper on this.
The Java Language Kinda Sucks
The Java Language has a lot of great aspects but due to its massive adoption and desire from its enterprise users for very gradual change, the language is showing its age. Luckily there are a ton of other options that run on the JVM. Here is a quick rundown of the most interesting options and my opinions on some positives and negatives:
Scala
Great
Likely the most widely adopted alternative language on the JVM
Fits well with Reactive and Big Data needs
Mature ecosystem for libraries, frameworks, support, etc
Good
Java interoperability is great but often not useful since the Java libraries aren’t built for Reactive and Scala idioms
Modern programming concepts with very powerful & flexible language
Bad
Language flexibility leads to significantly different ways of writing Scala sacrificing universal readability
Huge learning curve due to large number of features
Groovy
Great
Large ecosystem for libraries, frameworks, support, etc
Simple language with a few very useful features
Good
Interoperability with Java works and feels pretty natural
Bad
I prefer good type inference (like Scala) over Groovy’s dynamic and optional static typing
Clojure
Great
The elegance of a Lisp on the JVM
Mature ecosystem for libraries, frameworks, support, etc
Good
JavaScript target seems good but isn’t core
Bad
The lack of some OO constructs makes managing a large code base challenging
Dynamic typing
Kotlin
Great
Interoperability with Java seems natural
JavaScript target is first class
Good
IDE and build tooling seems decent but immature
Modern language features that aren’t overwhelming
Bad
Uncertain where it will be in 5 years – will it catch on and gain critical mass?
Starting with a new / greenfield project can be an easy time to try a new language but most enterprises don’t often do that. For existing projects there some frameworks and build tools that support mixing existing Java with alternative JVM languages better than others. Play Framework / sbt is the one I’ve used for this but I’m sure there are also others that do this well. At the very least, writing just your new tests in an alternative JVM language can be a great place to start experimenting.
Java 8’s Lambdas are a nice upgrade to the Java language. Lambdas do help reduce boilerplate and fit well with the Reactive model. But there is still a lot of other areas where the language is lacking. Now that I know Scala there are a few things I couldn’t live without that are still absent from Java: Type Inference, Pattern Matching, Case Classes, String Interpolation, and Immutability. It is also very nice to have Option and concurrency constructs baked into the core and library ecosystem.
Reality Check
If you are in a typical enterprise then maybe you are lucky and already doing most of this. As shocking as it may seem for some of us, this is really rare. Most of you are probably reading this and feeling sad because moving the enterprise monolith towards a lot of this stuff is really hard. As physics tells us, it is much harder to move large things than small things. But don’t lose heart! I’ve seen a number of stodgy enterprises slowly creep out of Java the Sucky Parts. Walmart Canada recently switched to Play Framework! My recommendation is to pick one of these sucky things and make it your goal to fix it over the next year. Often this requires buy-in from management which can be tough. Here is my suggestion… Spend a couple evenings or weekends working on implementing one of these items. Then show your manager what you did in your own time (that will convey how much you care) and then let them take the credit for the amazing new thing they thought of. Works every time. And if it doesn’t then there are tons of well paying startups who are already doing all of this stuff.
One last thing… Go read The Twelve-Factor App – it was the inspiration for a lot of this content.