Planet.haskell.org

Gabriel Gonzalez: State of the Haskell ecosystem - August 2015

2015-09-01

Note: This went out as an RFC draft a few weeks ago and is now a live wiki. See the Conclusions section at the end for more details.

In this post I will describe the current state of the Haskell ecosystem to the best of my knowledge and its suitability for various programming domains and tasks. The purpose of this post is to discuss both the good and the bad by advertising where Haskell shines while highlighting where I believe there is room for improvement.

This post is grouped into two sections: the first section covers Haskell's suitability for particular programming application domains (e.g. servers, games, or data science) and the second section covers Haskell's suitability for common general-purpose programming needs (such as testing, IDEs, or concurrency).

The topics are roughly sorted from greatest strengths to greatest weaknesses. Each programming area will also be summarized by a single rating of either:

Best in class: the best experience in any language

Mature: suitable for most programmers

Immature: only acceptable for early-adopters

Bad: pretty unusable

The more positive the rating the more I will support the rating with success stories in the wild. The more negative the rating the more I will offer constructive advice for how to improve things.

Disclaimer #1: I obviously don't know everything about the Haskell ecosystem, so whenever I am unsure I will make a ballpark guess and clearly state my uncertainty in order to solicit opinions from others who have more experience. I keep tabs on the Haskell ecosystem pretty well, but even this post is stretching my knowledge. If you believe any of my ratings are incorrect, I am more than happy to accept corrections (both upwards and downwards).

Disclaimer #2: There are some "Educational resource" sections below which are remarkably devoid of books, since I am not as familiar with textbook-related resources. If you have suggestions for textbooks to add, please let me know.

Disclaimer #3: I am very obviously a Haskell fanboy if you haven't guessed from the name of my blog and I am also an author of several libraries mentioned below, so I'm highly biased. I've made a sincere effort to honestly appraise the language, but please challenge my ratings if you believe that my bias is blinding me! I've also clearly marked Haskell sales pitches as "Propaganda" in my external link sections. :)

Table of Contents

Application Domains

Compilers

Server-side programming

Scripting / Command-line applications

Numerical programming

Front-end web programming

Distributed programming

Standalone GUI applications

Machine learning

Data science

Game programming

Systems / embedded programming

Mobile apps

ARM processor support

Common Programming Needs

Maintenance

Single-machine Concurrency

Types / Type-driven development

Domain-specific languages (DSLs)

Testing

Data structures and algorithms

Benchmarking

Unicode

Parsing / Pretty-printing

Stream programming

Serialization / Deserialization

Support for file formats

Package management

Logging

Education

Debugging

Cross-platform support

Databases and data stores

Hot code loading

IDE support

Application Domains

Compilers

Rating: Best in class

Haskell is an amazing language for writing your own compiler. If you are writing a compiler in another language you should genuinely consider switching.

Haskell originated in academia, and most languages of academic origin (such as the ML family of languages) excel at compiler-related tasks for obvious reasons. As a result the language has a rich ecosystem of libraries dedicated to compiler-related tasks, such as parsing, pretty-printing, unification, bound variables, syntax tree manipulations, and optimization.

Anybody who has ever written a compiler knows how difficult they are to implement because by necessity they manipulate very weakly typed data structures (trees and maps of strings and integers). Consequently, there is a huge margin for error in everything a compiler does, from type-checking to optimization, to code generation. Haskell knocks this out of the park, though, with a really powerful type system with many extensions that can eliminate large classes of errors at compile time.
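
As a small illustration of the kind of error the type system can rule out, here is a minimal sketch of a typed syntax tree using the GADTs extension. The `Expr` type and its constructors are hypothetical names invented for this example; they do not come from any of the libraries listed below.

```haskell
{-# LANGUAGE GADTs #-}
module Main where

-- A tiny expression language. The type index records what each term
-- evaluates to, so ill-typed trees cannot be constructed at all.
data Expr a where
  IntLit  :: Int  -> Expr Int
  BoolLit :: Bool -> Expr Bool
  Add     :: Expr Int -> Expr Int -> Expr Int
  IsZero  :: Expr Int -> Expr Bool
  If      :: Expr Bool -> Expr a -> Expr a -> Expr a

-- The evaluator needs no runtime type checks and has no error cases.
eval :: Expr a -> a
eval (IntLit n)  = n
eval (BoolLit b) = b
eval (Add x y)   = eval x + eval y
eval (IsZero x)  = eval x == 0
eval (If c t e)  = if eval c then eval t else eval e

example :: Expr Int
example = If (IsZero (IntLit 0)) (Add (IntLit 1) (IntLit 2)) (IntLit 0)

-- A term such as  Add (BoolLit True) (IntLit 1)  is rejected at compile time.

main :: IO ()
main = print (eval example)  -- prints 3
```

A plain algebraic data type would accept the ill-typed term and push the check into the evaluator; indexing the tree by its result type moves that check to compile time.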

I also believe that there are many excellent educational resources for compiler writers, both papers and books. I'm not the best person to summarize all the educational resources available, but the ones that I have read have been very high quality.

Finally, there are a large number of parsers and pretty-printers for other languages which you can use to write compilers to or from these languages.

Notable libraries:

parsec / attoparsec / trifecta / alex+happy - parsing libraries

bound / unbound - manipulating bound variables

hoopl - optimization

wl-pprint / ansi-wl-pprint - pretty-printing

llvm-general - LLVM API

language-{javascript|python|c-quote|lua|java|objc|cil} - parsers and pretty-printers for other languages

Some compilers written in Haskell:

Elm

Purescript

Idris

Agda

Pugs (the first Perl 6 implementation)

ghc (self-hosting)

frege (very similar to Haskell, also self-hosting)

Educational resources:

Write you a Haskell

A Tutorial Implementation of a Dependently Typed Lambda Calculus

Binders Unbound

Server-side programming

Rating: Mature

Haskell's second biggest strength is the back-end, both for web applications and services. The main features that the language brings to the table are:

Server stability

Performance

Ease of concurrent programming

Excellent support for web standards

The strong type system and polished runtime greatly improve server stability and simplify maintenance. This is the greatest differentiator of Haskell from other backend languages, because it significantly reduces the total-cost-of-ownership. You should expect that you can maintain Haskell-based services with significantly fewer programmers than other languages, even when compared to other statically typed languages.

However, the greatest weakness of server stability is space leaks. The most common solution that I know of is to use ekg (a process monitor) to examine a server's memory stability before deploying to production. The second most common solution is to learn to detect and prevent space leaks with experience, which is not as hard as people think.

Haskell's performance is excellent and currently comparable to Java. Both languages give roughly the same performance in beginner or expert hands, although for different reasons.

Where Haskell shines in usability is the runtime support for the following three features:

lightweight threads (which differentiate Haskell from the JVM)

software transactional memory (which differentiates Haskell from Go)

garbage collection (which differentiates Haskell from Rust)

Many languages support two of the above three features, but Haskell is the only one that I know of that supports all three.

If you have never tried out Haskell's software transactional memory you should really, really, really give it a try, since it eliminates a large number of concurrency logic bugs. STM is far and away the most underestimated feature of the Haskell runtime.
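
To give a flavor of what this looks like, here is a minimal sketch of an atomic transfer between two balances using the stm library. The `transfer` function and the account names are made up for illustration.

```haskell
module Main where

import Control.Concurrent.STM

-- Move an amount between two balances as a single atomic transaction.
-- 'check' makes the transaction wait until the source has enough funds.
transfer :: TVar Int -> TVar Int -> Int -> STM ()
transfer from to amount = do
  balance <- readTVar from
  check (balance >= amount)
  modifyTVar' from (subtract amount)
  modifyTVar' to (+ amount)

main :: IO ()
main = do
  alice <- newTVarIO 100
  bob   <- newTVarIO 0
  atomically (transfer alice bob 30)
  balances <- atomically ((,) <$> readTVar alice <*> readTVar bob)
  print balances  -- prints (70,30)
```

No other thread can observe a state in which the money has left one balance but not yet arrived in the other, and no explicit locks are involved.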

Notable libraries:

warp / wai - the low-level server and API that all server libraries share, with the exception of snap

scotty - A beginner-friendly server framework analogous to Ruby's Sinatra

spock - Lighter than the "enterprise" frameworks, but more featureful than scotty (type-safe routing, sessions, conn pooling, csrf protection, authentication, etc)

yesod / yesod-* / snap / snap-* / happstack-server / happstack-* - "Enterprise" server frameworks with all the bells and whistles

servant / servant-* - This server framework might blow your mind

authenticate / authenticate-* - Shared authentication libraries

ekg / ekg-* - Haskell service monitoring

stm - Software-transactional memory

Some web sites and services powered by Haskell:

Facebook's spam filter: Sigma

IMVU's REST API

Utrecht's bicycle parking guidance system

elm-lang.org

glot.io

The Perry Bible Fellowship

Silk

Shellcheck

instantwatcher.com

Propaganda:

Fighting spam with Haskell - Haskell in production, at scale, at Facebook

IMVU Engineering - What it's like to use Haskell

Haskell-based Bicycle Parking Guidance System in Utrecht

Mio: A High-Performance Multicore IO Manager for GHC

The Performance of Open Source Applications - Warp

Optimising Garbage Collection Overhead in Sigma

instantwatcher.com author comments on rewrite from Ruby to Haskell - [1] [2]

Educational resources:

Making a Website With Haskell

Beautiful concurrency - a software-transactional memory tutorial

The Yesod book

The Servant tutorial

Overview of Happstack

Scripting / Command-line applications

Rating: Mature

Haskell's biggest advantage as a scripting language is that Haskell is the most widely adopted language that supports global type inference. Many languages support local type inference (such as Rust, Go, Java, C#), which means that function argument types and interfaces must be declared but everything else can be inferred. In Haskell, you can omit everything: all types and interfaces are completely inferred by the compiler (with some caveats, but they are minor).

Global type inference gives Haskell the feel of a scripting language while still providing static assurances of safety. Script type safety matters in particular for enterprise environments where glue scripts running with elevated privileges are one of the weakest points in these software architectures.
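
For example, the following small script contains no type signatures at all, yet every definition is statically checked. The function names are invented for this sketch.

```haskell
module Main where

import Data.Char (toUpper)
import Data.List (sortOn)

-- No type signatures anywhere below: GHC infers all of them.
shout text = map toUpper text

wordLengths text = [ (w, length w) | w <- words text ]

longestFirst pairs = sortOn (negate . snd) pairs

main = mapM_ print (longestFirst (wordLengths (shout "glue scripts grow quickly")))
```

Passing a number to `shout`, or a plain string to `longestFirst`, would be reported as a compile-time error even though no types were written down.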

The second benefit of Haskell's type safety is ease of script maintenance. Many scripts grow out of control as they accrete arcane requirements and once they begin to exceed 1000 LOC they become difficult to maintain in a dynamically typed language. People rarely budget sufficient time to create a sufficiently extensive test suite that exercises every code path for each and every one of their scripts. Having a strong type system is like getting a large number of auto-generated tests for free that exercise all script code paths. Moreover, the type system is more resilient to refactoring than a test suite.

However, the main reason I mark Haskell as mature is that the language is also usable even for simple one-off disposable scripts. These Haskell scripts are comparable in size and simplicity to their equivalent Bash or Python scripts. This lets you easily start small and finish big.

Haskell has one advantage over many dynamic scripting languages, which is that Haskell can be compiled into a native and statically linked binary for distribution to others.

Haskell's scripting libraries are feature complete and provide all the niceties that you would expect from scripting in Python or Ruby, including features such as:

rich suite of Unix-like utilities

advanced sub-process management

POSIX support

light-weight idioms for exception safety and automatic resource disposal
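
As one illustration of the last point, here is a minimal sketch of exception-safe resource disposal using `bracket` from base together with the directory package. The helper name `withScratchFile` is hypothetical; shelly and turtle offer higher-level idioms that are not shown here.

```haskell
module Main where

import Control.Exception (bracket, finally)
import System.Directory (doesFileExist, getTemporaryDirectory, removeFile)
import System.IO (Handle, hClose, hFlush, hPutStrLn, openTempFile)

-- Run an action with a scratch file that is closed and deleted afterwards,
-- even if the action throws an exception.
withScratchFile :: (FilePath -> Handle -> IO a) -> IO a
withScratchFile action = do
  tmpDir <- getTemporaryDirectory
  bracket (openTempFile tmpDir "scratch.txt")
          (\(path, h) -> hClose h `finally` removeFile path)
          (\(path, h) -> action path h)

main :: IO ()
main = do
  path <- withScratchFile $ \path h -> do
    hPutStrLn h "temporary data"
    hFlush h
    pure path
  stillThere <- doesFileExist path
  putStrLn ("scratch file still exists: " ++ show stillThere)
```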

Notable libraries:

shelly / turtle - scripting libraries (Full disclosure: I authored turtle)

optparse-applicative / cmdargs - command-line argument parsing

haskeline - a complete Haskell implementation of readline for console building

process - low-level library for sub-process management

Some command-line tools written in Haskell:

pandoc

git-annex

Educational resources:

Shelly: Write your shell scripts in Haskell

Use Haskell for shell scripting

Numerical programming

Rating: Immature? (Uncertain)

Haskell's numerical programming story is not ready, but steadily improving.

My main experience in this area was from a few years ago doing numerical programming for bioinformatics that involved a lot of vector and matrix manipulation and my rating is largely colored by that experience.

The biggest issues that the ecosystem faces are:

Really clunky matrix library APIs

Fickle rewrite-rule-based optimizations

When the optimizations work they are amazing and produce code competitive with C. However, small changes to your code can cause the optimizations to suddenly not trigger and then performance drops off a cliff.

There is one Haskell library that avoids this problem entirely which I believe holds a lot of promise: accelerate generates LLVM and CUDA code at runtime and does not rely on Haskell's optimizer for code generation, which side-steps the problem. accelerate has a large set of supported algorithms that you can find by just checking the library's reverse dependencies:

Reverse dependencies of accelerate

However, I don't have enough experience with accelerate or enough familiarity with numerical programming success stories in Haskell to vouch for this just yet. If somebody has more experience than me in this regard and can provide evidence that the ecosystem is mature then I might consider revising my rating upward.

Notable libraries:

accelerate / accelerate-* - GPU programming

vector - high-performance arrays

repa / repa-* - parallel shape-polymorphic arrays

hmatrix / hmatrix-* - Haskell's BLAS / LAPACK wrapper

ad - automatic differentiation

Propaganda:

Exploiting vector instructions with generalized stream fusion

Type-safe Runtime Code Generation: Accelerate to LLVM

Educational Resources:

Parallel and concurrent programming in Haskell

Front-end web programming

Rating: Immature

This boils down to Haskell's ability to compile to Javascript. ghcjs is the front-runner, but for a while setting up ghcjs was non-trivial. However, ghcjs appears to be very close to having a polished setup story now that ghc-7.10.2 is out (Source).

One of the distinctive features of ghcjs compared to other competing Haskell-to-Javascript compilers is that a huge number of Haskell libraries work out of the box with ghcjs because it supports most Haskell primitive operations.

I would also like to mention that there are two Haskell-like languages that you should also try out for front-end programming: elm and purescript. These are both used in production today and have equally active maintainers and communities of their own.

Areas for improvement:

There needs to be a clear story for smooth integration with existing Javascript projects

There need to be many more educational resources targeted at non-experts explaining how to translate existing front-end programming idioms to Haskell

There need to be several well-maintained and polished Haskell libraries for front-end programming

Notable Haskell-to-Javascript compilers:

ghcjs

haste

Notable libraries:

reflex-dom - Functional reactive programming library for DOM manipulation

Distributed programming

Rating: Immature

This is sort of a broad area since I'm using this topic to refer to both distributed computation (for analytics) and distributed service architectures. However, in both regards Haskell is lagging behind its peers.

The JVM, Go, and Erlang have much better support for this sort of thing, particularly in terms of libraries.

There has been a lot of work in replicating Erlang-like functionality in Haskell through the Cloud Haskell project, not just in creating the low-level primitives for code distribution / networking / transport, but also in assembling a Haskell analog of Erlang's OTP. I'm not that familiar with how far progress is in this area, but people who love Erlang should check out Cloud Haskell.

Areas for improvement:

We need more analytics libraries. Haskell has no analog of scalding or spark. The most we have is just a Haskell wrapper around hadoop.

We need a polished consensus library (e.g. a high quality Raft implementation in Haskell)

Notable libraries:

distributed-process / distributed-process-* - Haskell analog to Erlang

hadron - Haskell wrapper around hadoop

aws / aws-* - Amazon web services libraries

Standalone GUI applications

Rating: Immature

Haskell really lags behind the C# and F# ecosystem in this area.

My experience on this is based on several private GUI projects I wrote several years back. Things may have improved since then so if you think my assessment is too negative just let me know.

All Haskell GUI libraries are wrappers around toolkits written in other languages (such as GTK+ or Qt). The last time I checked the gtk bindings were the most comprehensive, best maintained, and had the best documentation.

However, the Haskell bindings to GTK+ have a strongly imperative feel to them. The way you do everything is communicating between callbacks by mutating IORefs. Also, you can't take extensive advantage of Haskell's awesome threading features because the GTK+ runtime is picky about what needs to happen on certain threads. I haven't really seen a Haskell library that takes this imperative GTK+ interface and wraps it in a more idiomatic Haskell API.
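
The following sketch shows the shape of that callback-and-IORef style using only base. It does not use the gtk package: the two "callbacks" are ordinary IO actions and `simulateClicks` is a hypothetical stand-in for a toolkit's event loop.

```haskell
module Main where

import Data.IORef

-- Two "callbacks" communicate only through a shared mutable reference,
-- which is the pattern the GTK+ bindings push you towards.
simulateClicks :: IO Int
simulateClicks = do
  clicks <- newIORef (0 :: Int)
  let onIncrementClicked = modifyIORef' clicks (+ 1)
      onReportClicked    = readIORef clicks
  -- Stand-in for the event loop firing the handlers.
  onIncrementClicked
  onIncrementClicked
  onReportClicked

main :: IO ()
main = do
  n <- simulateClicks
  putStrLn ("clicks: " ++ show n)  -- prints clicks: 2
```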

My impression is that most Haskell programmers interested in applications programming have collectively decided to concentrate their efforts on improving Haskell web applications instead of standalone GUI applications. Honestly, that's probably the right decision in the long run.

Another post that goes into more detail about this topic is this post written by Keera Studios:

On the state of GUI programming in Haskell

Areas for improvement:

A GUI toolkit binding that is maintained, comprehensive, and easy to use

Polished GUI interface builders

Notable libraries:

gtk / glib / cairo / pango - The GTK+ suite of libraries

wx - wxWidgets bindings

X11 - X11 bindings

threepenny-gui - Framework for local apps that use the web browser as the interface

hsqml - A Haskell binding for Qt Quick, a cross-platform framework for creating graphical user interfaces.

fltkhs - A Haskell binding to FLTK. Easy install/use, cross-platform, self-contained executables.

Some example applications:

xmonad

Educational resources:

Haskell port of the GTK tutorial

Building pragmatic user interfaces in Haskell with HsQML

Machine learning

Rating: Immature? (Uncertain)

This area has been pioneered almost single-handedly by one person: Mike Izbicki. He maintains the HLearn suite of libraries for machine learning in Haskell.

I have essentially no experience in this area, so I can't really rate it that well. However, I'm pretty certain that I would not rate it mature because I'm not aware of any company successfully using machine learning in Haskell.

For the same reason, I can't really offer constructive advice for areas for improvement.

If you would like to learn more about this area the best place to begin is the Github page for the HLearn project:

Github repository for HLearn

Notable libraries:

HLearn-*

Data science

Rating: Immature

Haskell really lags behind Python and R in this area. Haskell is somewhat usable for data science, but probably not ready for expert use under deadline pressure.

I'll primarily compare Haskell to Python since that's the data science ecosystem that I'm more familiar with. Specifically, I'll compare to the scipy suite of libraries:

The Haskell analog of NumPy is the hmatrix library, which provides Haskell bindings to BLAS, LAPACK. hmatrix's main limitation is that the API is a bit clunky, but all the tools are there.

Haskell's charting story is okay. Probably my main criticism of most charting APIs is that their APIs tend to be large, the types are a bit complex, and they have a very large number of dependencies.

Fortunately, Haskell does integrate into IPython so you can use Haskell within an IPython shell or an online notebook. For example, there is an online "IHaskell" notebook that you can use right now located here:

IHaskell notebook - Click on "Welcome to Haskell.ipynb"

If you want to learn more about how to setup your own IHaskell notebook, visit this project:

IHaskell Github repository

The closest thing to Python's pandas is the frames library. I haven't used it that much personally so I won't comment on it much other than to link to some tutorials in the Educational Resources section.

I'm not aware of a Haskell analog to SciPy (the library) or sympy. If you know of an equivalent Haskell library then let me know.

One Haskell library that deserves honorable mention here is the diagrams library which lets you produce complex data visualizations very easily if you want something a little bit fancier than a chart. Check out the diagrams project if you have time:

The Diagrams project

Gallery of example diagrams

Areas for improvement:

Smooth user experience and integration across all of these libraries

Simple types and APIs. The data science programmers I know dislike overly complex or verbose APIs

Beautiful data visualizations with very little investment

Notable libraries:

cassava - CSV encoding and decoding

hmatrix - BLAS / LAPACK wrapper

Frames - Haskell data analysis tool analogous to Python's pandas

statistics - Statistics (duh!)

Chart / Chart-* - Charting library

diagrams / diagrams-* - Vector graphics library

ihaskell - Haskell backend to IPython

Game programming

Rating: Immature? / Bad?

Haskell has SDL and OpenGL bindings, which are actually quite good, but that's about it. You're on your own from that point onward. There is not a rich ecosystem of higher-level libraries built on top of those bindings. There is some work in this area, but I'm not aware of anything production quality.

There is also one really fundamental issue with the language, which is garbage collection, which runs the risk of introducing perceptible pauses in gameplay if your heap grows too large.

For this reason I don't see Haskell ever being used for AAA game programming. I suppose you could use Haskell for simpler games that don't require keeping a lot of resources in memory.

Haskell could maybe be used for the scripting layer of a game or to power the backend for an online game, but for rendering or updating an extremely large graph of objects you should probably stick to another language.

The company that has been doing the most to push the envelope for game programming in Haskell is Keera Studios, so if this is an area that interests you then you should follow their blog:

Keera Studios Blog

Areas for improvement:

Improve the garbage collector and benchmark performance with large heap sizes

Provide higher-level game engines

Improve distribution of Haskell games on proprietary game platforms

Notable libraries:

Helm

gl

SDL / SDL-*

sdl2

SFML

quine (Github-only project)

Systems / embedded programming

Rating: Bad / Immature (?) (See description)

Since systems programming is an abused word, I will clarify that I mean programs where speed, memory layout, and latency really matter.

Haskell fares really poorly in this area because:

The language is garbage collected, so there are no latency guarantees

Executable sizes are large

Memory usage is difficult to constrain (thanks to space leaks)

Haskell has a large and unavoidable runtime, which means you cannot easily embed Haskell within larger programs

You can't easily predict what machine code a given piece of Haskell code will compile to

Typically people approach this problem from the opposite direction: they write the low-level parts in C or Rust and then write Haskell bindings to the low-level code.

It's worth noting that there is an alternative approach which is Haskell DSLs that are strongly typed that generate low-level code at runtime. This is the approach championed by the company Galois.
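
To make the idea concrete, here is a toy sketch of a typed Haskell DSL that emits C source text. It is not the API of atom, ivory, copilot, or improve; the `Expr` type, `toC`, and `cFunction` are hypothetical names for this example only.

```haskell
module Main where

import Data.List (intercalate)

-- A tiny deeply embedded expression language.
data Expr
  = Lit Int
  | Var String
  | Add Expr Expr
  | Mul Expr Expr

-- A Num instance lets DSL terms be written with ordinary arithmetic syntax.
instance Num Expr where
  fromInteger = Lit . fromInteger
  (+)         = Add
  (*)         = Mul
  negate e    = Mul (Lit (-1)) e
  abs         = error "abs: not supported in this sketch"
  signum      = error "signum: not supported in this sketch"

-- Render an expression as C source text.
toC :: Expr -> String
toC (Lit n)   = show n
toC (Var v)   = v
toC (Add a b) = "(" ++ toC a ++ " + " ++ toC b ++ ")"
toC (Mul a b) = "(" ++ toC a ++ " * " ++ toC b ++ ")"

-- Emit a C function over int parameters that returns the expression.
cFunction :: String -> [String] -> Expr -> String
cFunction name args body =
  "int " ++ name ++ "(" ++ params ++ ") { return " ++ toC body ++ "; }"
  where
    params = intercalate ", " (map ("int " ++) args)

main :: IO ()
main = putStrLn (cFunction "scale" ["x", "y"] (2 * Var "x" + Var "y"))
-- prints: int scale(int x, int y) { return ((2 * x) + y); }
```

The Haskell program runs on a development machine with the full runtime, while the generated C has no garbage collector and no Haskell runtime dependency.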

Notable libraries:

atom / ivory - DSL for generating embedded programs

copilot - Stream DSL that generates C code

improve - High-assurance DSL for embedded code that generates C and Ada

Educational resources:

/r/haskell - Haskell compiled down to Embedded Hardware

Mobile apps

Rating: Immature? / Bad? (Uncertain)

This greatly lags behind using the language that is natively supported by the mobile platform (i.e. Java for Android or Objective-C / Swift for iOS).

I don't know a whole lot about this area, but I'm definitely sure it is far from mature. All I can do is link to the resources I know of for Android and iPhone development using Haskell.

I also can't really suggest improvements because I'm pretty out of touch with this branch of the Haskell ecosystem.

Educational resources:

Android development in Haskell

iPhone development in Haskell

ARM processor support

Rating: Immature / Early adopter

On hobbyist boards like the Raspberry Pi it's possible to compile Haskell code with ghc. However, some libraries have problems on the ARM platform, ghci only works on newer compilers, and the newer compilers are flaky.

If Haskell code builds, it runs with respectable performance on these machines.

Raspbian (Raspberry Pi, Pi 2, others)
* current version: ghc 7.4, cabal-install 1.14
* ghci doesn't work

Debian Jessie (Raspberry Pi 2)
* current version: ghc 7.6
* can install the current ghc 7.10.2 binary and ghci starts; however, it fails to build cabal with 'illegal instruction'

Arch (Raspberry Pi 2)
* current version: ghc 7.8.2, but llvm is 3.6, which is too new
* downgrade packages for llvm are not officially available
* with llvm downgraded to 3.4, ghc and ghci work, but there are problems compiling yesod and scotty
* compiler crashes, segfaults, etc.

Arch (Banana Pi)
* similar to Raspberry Pi 2: ghc is 7.8.2 and works with the llvm downgrade
* have had success compiling a yesod project on this platform

Common Programming Needs

Maintenance

Rating: Best in class

Haskell is unbelievably awesome for maintaining large projects. There's nothing that I can say that will fully convey how nice it is to modify existing Haskell code. You can only appreciate this through experience.

When I say that Haskell is easy to maintain, I mean that you can easily approach a large Haskell code base written by somebody else and make sweeping architectural changes to the project without breaking the code.

You'll often hear people say: "if it compiles, it works". I think that is a bit of an exaggeration, but a more accurate statement is: "if you refactor and it compiles, it works". This lets you move fast without breaking things.

Most statically typed languages are easy to maintain, but Haskell is on its own level for the following reasons:

Strong types

Global type inference

Type classes

Laziness

The last two features are what differentiate Haskell from other statically typed languages.

If you've ever maintained code in other languages you know that usually your test suite breaks the moment you make large changes to your code base and you have to spend a significant amount of effort keeping your test suite up to date with your changes. However, Haskell has a very powerful type system that lets you transform tests into invariants that are enforced by the types so that you can statically eliminate entire classes of errors at compile time. These types are much more flexible than tests when modifying code and types require much less upkeep as you make large changes.
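
A small sketch of turning a test into a type-level invariant: instead of writing a test that checks callers never pass an empty list, the argument type makes an empty list unrepresentable. `Data.List.NonEmpty` ships with base in current GHC releases; the function name `largest` is made up for this example.

```haskell
module Main where

import Data.List.NonEmpty (NonEmpty (..))

-- 'maximum' on an ordinary list fails at runtime when the list is empty.
-- Taking a NonEmpty list rules that case out at compile time.
largest :: NonEmpty Int -> Int
largest = maximum

main :: IO ()
main = print (largest (3 :| [9, 4]))  -- prints 9
```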

The Haskell community and ecosystem use the type system heavily to "test" their applications, more so than other programming language communities. That's not to say that Haskell programmers don't write tests (they do), but rather they prefer types over tests when they have the option.

Global type inference means that you don't have to update types and interfaces as you change the code. Whenever I do a large refactor the first thing I do is delete all type signatures and let the compiler infer the types and interfaces for me as I go. When I'm done refactoring I just insert back the type signatures that the compiler infers as machine-checked documentation.

Type classes also assist refactoring because the compiler automatically infers type class constraints (analogous to interfaces in other languages) so that you don't need to explicitly annotate interfaces. This is a huge time saver.
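
As a minimal sketch of constraint inference, the function below has no signature, and GHC works out which type classes its argument must support. The name `describeMax` is invented for this example.

```haskell
module Main where

-- No signature is written here. GHC infers a type equivalent to
--   describeMax :: (Foldable t, Ord a, Show a) => t a -> String
-- because the body uses 'maximum' (Foldable, Ord) and 'show' (Show).
describeMax xs = "max = " ++ show (maximum xs)

main :: IO ()
main = do
  putStrLn (describeMax [3, 1, 2 :: Int])  -- prints max = 3
  putStrLn (describeMax "hello")           -- prints max = 'o'
```

If a refactor changes the body to use another class method, the inferred constraints change with it and nothing needs to be updated by hand.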

Laziness deserves special mention because many outsiders do not appreciate how laziness simplifies maintenance. Many languages require tight coupling between producers and consumers of data structures in order to avoid wasteful evaluation, but laziness avoids this problem by only evaluating data structures on demand. This means that if your refactoring process changes the order in which data structures are consumed or even stops referencing them altogether you don't need to reorder or delete those data structures. They will just sit around patiently waiting until they are actually needed, if ever, before they are evaluated.
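
Here is a small sketch of that decoupling. The producer is an infinite list written without any knowledge of its consumers, and one expensive definition is never demanded and therefore never computed. All names are hypothetical.

```haskell
module Main where

-- Producer: an infinite list, written without knowing what consumers need.
squares :: [Integer]
squares = map (^ (2 :: Int)) [1 ..]

-- A costly value that nothing below demands, so it is never evaluated.
unusedTotal :: Integer
unusedTotal = sum (take 10000000 squares)

main :: IO ()
main = do
  print (takeWhile (< 50) squares)      -- prints [1,4,9,16,25,36,49]
  print (take 3 (filter even squares))  -- prints [4,16,36]
```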

Single-machine Concurrency

Rating: Best in class

I give Haskell a "Best in class" rating because Haskell's concurrency runtime performs as well as or better than those of mainstream languages and is significantly easier to use due to the runtime support for software-transactional memory.

The best explanation of Haskell's threading model is the documentation in Control.Concurrent:

Concurrency is "lightweight", which means that both thread creation and context switching overheads are extremely low. Scheduling of Haskell threads is done internally in the Haskell runtime system, and doesn't make use of any operating system-supplied thread packages.

The best way to explain the performance of Haskell's threaded runtime is to give hard numbers:

The Haskell thread scheduler can easily handle millions of threads

Each thread requires 1 kB of memory, so the hard limit on thread count is memory (1 GB per million threads).

Haskell channel overhead for the standard library (using TQueue) is on the order of one microsecond per message and degrades linearly with increasing contention

Haskell channel overhead using the unagi-chan library is on the order of 100 nanoseconds (even under contention)

Haskell's MVar (a low-level concurrency communication primitive) requires 10-20 ns to add or remove values (roughly on par with acquiring or releasing a lock in other languages)
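
To show how cheap threads are in practice, here is a minimal sketch that forks ten thousand lightweight threads and coordinates them with MVars. It is not a benchmark and makes no claim about the timings above; `countWithThreads` is a hypothetical helper.

```haskell
module Main where

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
import Control.Monad (forM_, replicateM_)

-- Fork n lightweight threads that each bump a shared counter,
-- then wait for all of them and return the final count.
countWithThreads :: Int -> IO Int
countWithThreads n = do
  counter <- newMVar (0 :: Int)
  done    <- newEmptyMVar
  forM_ [1 .. n] $ \_ -> forkIO $ do
    modifyMVar_ counter (\c -> pure $! c + 1)
    putMVar done ()
  replicateM_ n (takeMVar done)  -- one signal per finished thread
  readMVar counter

main :: IO ()
main = countWithThreads 10000 >>= print  -- prints 10000
```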

Haskell also provides software-transactional memory, which allows programmers to build composable and atomic memory transactions. You can compose transactions together in multiple ways to build larger transactions:

You can sequence two transactions to build a larger atomic transaction

You can combine two transactions using alternation, falling back on the second transaction if the first one fails

Transactions can retry, rolling back their state and sleeping until one of their dependencies changes in order to avoid wasteful polling
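
The three forms of composition above can be sketched with the stm library as follows. The queue representation (a list inside a TVar) and the function names are assumptions chosen to keep the example short.

```haskell
module Main where

import Control.Concurrent.STM

-- Take one item from a queue stored as a list in a TVar.
-- 'retry' abandons the transaction until the TVar changes.
pop :: TVar [a] -> STM a
pop queue = do
  items <- readTVar queue
  case items of
    []       -> retry
    (x : xs) -> writeTVar queue xs >> pure x

-- Alternation: try the first queue, fall back to the second.
popEither :: TVar [a] -> TVar [a] -> STM a
popEither q1 q2 = pop q1 `orElse` pop q2

-- Sequencing: two pops combined into one atomic transaction.
popTwo :: TVar [a] -> TVar [a] -> STM (a, a)
popTwo q1 q2 = (,) <$> popEither q1 q2 <*> popEither q1 q2

main :: IO ()
main = do
  urgent <- newTVarIO []
  normal <- newTVarIO ["a", "b"]
  pair <- atomically (popTwo urgent normal)
  print pair  -- prints ("a","b")
```

Each piece is an ordinary `STM` value, so larger transactions are built by combining smaller ones without touching their implementations.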

A few other languages provide software-transactional memory, but Haskell's implementation has two main advantages over other implementations:

The type system enforces that transactions only permit reversible memory modifications. This guarantees at compile time that all transactions can be safely rolled back.

Haskell's STM runtime takes advantage of enforced purity to improve the efficiency of transactions, retries, and alternation.

Notable libraries:

stm - Software transactional memory

unagi-chan - High performance channels

async - Futures library

Educational resources:

Parallel and Concurrent Programming in Haskell

Parallel and Concurrent Programming in Haskell - Software transactional memory

Beautiful concurrency - a software-transactional memory tutorial

Performance numbers for primitive operations - Latency timings for various low-level operations

Types / Type-driven development

Rating: Best in class

Haskell definitely does not have the most advanced type system (not even close if you count research languages) but out of all languages that are actually used in production Haskell is probably at the top. Idris is probably the closest thing to a type system more powerful than Haskell that has a realistic chance of use in production in the foreseeable future.

The killer features of Haskell's type system are:

Type classes

Global type and type class inference

Light-weight type syntax

Haskell's type system really does not get in your way at all. You (almost) never need to annotate the type of anything. As a result, the language feels light-weight to use like a dynamic language, but you get all the assurances of a static language.

Many people are familiar with languages that support "local" type inference (like Rust, Java, C#), where you have to explicitly type function arguments but then the compiler can infer the types of local variables. Haskell, on the other hand, provides "global" type inference, meaning that the types and interfaces of all function arguments are inferred, too. Type signatures are optional (with some minor caveats) and are primarily for the benefit of the programmer.
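As a small sketch of what global inference looks like in practice, the module below contains no type signatures at all. The function names are made up for illustration, and the types in the comments are what I would expect GHC to infer (you can check them with :type in GHCi; the order of constraints may differ).

```haskell
-- Inferred: swapPair :: (a, b) -> (b, a)
swapPair (x, y) = (y, x)

-- Inferred: largerOf :: Ord a => a -> a -> a
-- The Ord "interface" requirement is inferred from the use of (>=).
largerOf x y = if x >= y then x else y

-- Inferred: describe :: (Show a, Ord a) => a -> a -> String
describe x y = show (largerOf x y) ++ " wins"

main = do
    print (swapPair (1 :: Int, "one"))   -- ("one",1)
    putStrLn (describe (3 :: Int) 7)     -- 7 wins
    print (largerOf "apple" "pear")      -- "pear"
```

Every function here is still fully statically typed, so calling describe on a type without Ord and Show instances is rejected at compile time.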

This really benefits projects where you need to prototype quickly but refactor painlessly when you realize you are on the wrong track. You can leave out all type signatures while prototyping but the types are still there even if you don't see them. Then when you dramatically change course those strong and silent types step in and keep large refactors painless.

Some Haskell programmers use a "type-driven development" programming style, analogous to "test-driven development":

they specify desired behavior as a type signature which initially fails to type-check (analogous to adding a test which starts out "red")

they create a quick and dirty solution that satisfies the type-checker (analogous to turning the test "green")

they improve on their initial solution while still satisfying the type-checker (analogous to a "red/green refactor")

"Type-driven development" supplements "test-driven development" and has different tradeoffs:

The biggest disadvantage of types is that they cannot test as many things as full-blown tests, especially because Haskell is not dependently typed

The biggest advantage of types is that they can prove the complete absence of whole classes of programming errors for all possible cases, whereas tests cannot examine every possibility

Type-checking is much faster than running tests

Type error messages are informative: they explain what went wrong and never get stale

Type-checking never hangs and never gives flaky results

Haskell also provides the "Typed Holes" extension, which lets you add an underscore (i.e. "_") anywhere in the code whenever you don't know what expression belongs there. The compiler will then tell you the expected type of the hole and suggest terms in scope with related types that you can use to fill the hole.
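A minimal sketch of the typed-hole workflow follows. The shout function is a hypothetical example. The version with the hole is shown in a comment because it is rejected by the compiler on purpose, and the exact wording of GHC's message varies between versions.

```haskell
import Data.Char (toUpper)

-- Step 1: write the signature and leave a hole where you are unsure:
--
--     shout :: String -> String
--     shout s = map _ s ++ "!"
--
-- GHC refuses to compile this and reports that the hole must have
-- type Char -> Char.
--
-- Step 2: fill the hole with a term of that type.
shout :: String -> String
shout s = map toUpper s ++ "!"

main :: IO ()
main = putStrLn (shout "hello")  -- prints HELLO!
```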

Educational resources:

Learn you a Haskell - Types and type classes

Learn you a Haskell - Making our own types and type classes

Typed holes

Partial type signatures proposal

Propaganda:

What exactly makes the Haskell type system so revered (vs say, Java)?

Difference between OOP interfaces and FP type classes

Domain-specific languages (DSLs)

Rating: Mature

Haskell rocks at DSL-building. While not as flexible as a Lisp language I would venture that Haskell is the most flexible of the non-Lisp languages. You can overload a large amount of built-in syntax for your custom DSL.

The most popular example of overloaded syntax is do notation, which you can overload to work with any type that implements the Monad interface. This syntactic sugar for Monads in turn led to a huge overabundance of Monad tutorials.

However, there are lesser known but equally important things that you can overload, such as:

numeric and string literals

if/then/else expressions

list comprehensions

numeric operators
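To make the list above concrete, here is a small sketch of an arithmetic DSL that overloads numeric literals, numeric operators, and string literals. The Expr type and the eval function are hypothetical names invented for this example. String literal overloading relies on the OverloadedStrings extension.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.String (IsString (..))

-- Hypothetical syntax tree for a tiny arithmetic DSL.
data Expr
    = Lit Integer
    | Var String
    | Add Expr Expr
    | Mul Expr Expr
    deriving (Eq, Show)

-- Overload numeric literals and operators to build syntax trees.
instance Num Expr where
    fromInteger = Lit
    (+)         = Add
    (*)         = Mul
    negate e    = Mul (Lit (-1)) e
    abs         = error "abs: not supported in this sketch"
    signum      = error "signum: not supported in this sketch"

-- Overload string literals to stand for variables.
instance IsString Expr where
    fromString = Var

-- Ordinary-looking arithmetic that is really a syntax tree.
poly :: Expr
poly = 2 * "x" + 1

-- Evaluate an expression given values for its variables.
eval :: [(String, Integer)] -> Expr -> Maybe Integer
eval _   (Lit n)   = Just n
eval env (Var v)   = lookup v env
eval env (Add a b) = (+) <$> eval env a <*> eval env b
eval env (Mul a b) = (*) <$> eval env a <*> eval env b

main :: IO ()
main = do
    print poly                     -- Add (Mul (Lit 2) (Var "x")) (Lit 1)
    print (eval [("x", 20)] poly)  -- Just 41
```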

Educational resources:

You could have invented monads

Rebindable syntax

Monad comprehensions

Overloaded strings

Testing

Rating: Mature

There are a few places where Haskell is the clear leader among all languages:

property-based testing

mocking / dependency injection

Haskell's QuickCheck is the gold standard which all other property-based testing libraries are measured against. QuickCheck works so smoothly in Haskell because of Haskell's type class system and purity. The type class system simplifies automatic generation of random data from the input type of the property test. Purity means that any failing test result can be automatically minimized by rerunning the check on smaller and smaller inputs until QuickCheck identifies the corner case that triggers the failure.
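As a minimal sketch, the two properties below are hypothetical examples. The first should pass and the second is deliberately false, to show QuickCheck shrinking a failure. Note that the test data generator is chosen purely from the [Int] argument type.

```haskell
import Test.QuickCheck (quickCheck)

-- Reversing a list twice gives back the original list.
prop_reverseTwice :: [Int] -> Bool
prop_reverseTwice xs = reverse (reverse xs) == xs

-- Deliberately false: most lists are not palindromes.
prop_reverseIsIdentity :: [Int] -> Bool
prop_reverseIsIdentity xs = reverse xs == xs

main :: IO ()
main = do
    quickCheck prop_reverseTwice       -- expected to pass
    quickCheck prop_reverseIsIdentity  -- expected to fail with a small counterexample
```

For the failing property, QuickCheck should report a minimized counterexample (a short two-element list) rather than the larger random list that first triggered the failure.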

Mocking is another area where Haskell shines because you can overload almost all built-in syntax, including:

do notation

if statements

numeric literals

string literals

Haskell programmers overload this syntax (particularly do notation) to write code that looks like it is doing real work, but the code actually evaluates to a pure syntax tree that you can interpret with mock external inputs and outputs.
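Here is one way this idea can be sketched. This is not necessarily the code the technique is usually shown with: the ConsoleF instruction set and all function names are hypothetical, and the free monad is defined inline (instead of importing it from the free library) so that the example only depends on base.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- Hypothetical instruction set for a tiny console DSL.
data ConsoleF next
    = ReadLine (String -> next)
    | PutLine String next
    deriving Functor

-- A minimal free monad, defined inline for this sketch.
data Free f a = Pure a | Free (f (Free f a))

instance Functor f => Functor (Free f) where
    fmap g (Pure a) = Pure (g a)
    fmap g (Free x) = Free (fmap (fmap g) x)

instance Functor f => Applicative (Free f) where
    pure = Pure
    Pure g <*> x = fmap g x
    Free g <*> x = Free (fmap (<*> x) g)

instance Functor f => Monad (Free f) where
    Pure a >>= k = k a
    Free x >>= k = Free (fmap (>>= k) x)

readLine :: Free ConsoleF String
readLine = Free (ReadLine Pure)

putLine :: String -> Free ConsoleF ()
putLine s = Free (PutLine s (Pure ()))

-- Looks like real I/O, but only builds a pure syntax tree.
echoTwice :: Free ConsoleF ()
echoTwice = do
    line <- readLine
    putLine line
    putLine line

-- Real interpreter: runs the tree against the actual console.
runIO :: Free ConsoleF a -> IO a
runIO (Pure a)                = return a
runIO (Free (ReadLine k))     = getLine >>= runIO . k
runIO (Free (PutLine s next)) = putStrLn s >> runIO next

-- Mock interpreter: feeds canned input lines and collects the output.
runPure :: [String] -> Free ConsoleF a -> [String]
runPure _      (Pure _)                = []
runPure (i:is) (Free (ReadLine k))     = runPure is (k i)
runPure []     (Free (ReadLine _))     = []
runPure is     (Free (PutLine s next)) = s : runPure is next

main :: IO ()
main = print (runPure ["hello"] echoTwice)  -- ["hello","hello"]
```

The same echoTwice program can be passed to runIO in production and to runPure in tests, which is what makes this style useful for mocking.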

Haskell also supports most testing functionality that you expect from other languages, including:

standard package interfaces for testing

unit testing libraries

test result summaries and visualization

Notable libraries:

QuickCheck - property-based testing

doctest - tests embedded directly within documentation

free - Haskell's abstract version of "dependency injection"

hspec - Testing library analogous to Ruby's RSpec

HUnit - Testing library analogous to Java's JUnit

tasty - Combination unit / regression / property testing library

Educational resources:

Why free monads matter

Purify code using free monads

Up-front Unit Testing in Haskell

Data structures and algorithms

Rating: Mature

Haskell primarily uses persistent data structures, meaning that when you "update" a persistent data structure you just create a new data structure and you can keep the old one around (thus the name: persistent). Haskell data structures are immutable, so you don't actually create a deep copy of the data structure when updating; any new structure will reuse as much of the original data structure as possible.
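A small sketch of persistence using Data.Map from the containers library (the variable names are just for illustration):

```haskell
import qualified Data.Map.Strict as Map

original :: Map.Map Int String
original = Map.fromList [(1, "one"), (2, "two")]

-- "Updating" returns a new map; the old one is untouched and still usable.
updated :: Map.Map Int String
updated = Map.insert 3 "three" original

main :: IO ()
main = do
    print (Map.toList original)  -- [(1,"one"),(2,"two")]
    print (Map.toList updated)   -- [(1,"one"),(2,"two"),(3,"three")]
```

Both versions remain valid at the same time, and the new map shares structure with the old one instead of copying it wholesale.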

The Notable libraries section contains links to Haskell collections libraries that are heavily tuned. You should realistically expect these libraries to compete with tuned Java code. However, you should not expect Haskell to match expertly tuned C++ code.

The selection of algorithms is not as broad as in Java or C++ but it is still pretty good and diverse enough to cover the majority of use cases.

Notable libraries:

vector - High-performance arrays

containers - High-performance Maps, Sets, Trees, Graphs, Seqs

unordered-containers - High-performance HashMaps, HashSets

accelerate / accelerate-* - GPU programming

repa / repa-* - parallel shape-polymorphic arrays

Benchmarking

Rating: Mature

This boils down exclusively to the criterion library, which was done so well that nobody bothered to write a competing library. Notable criterion features include:

Detailed statistical analysis of timing data

Beautiful graph output: (Example)

High-resolution analysis (accurate down to nanoseconds)

Customizable HTML/CSV/JSON output

Garbage collection insensitivity
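For a feel of what using criterion looks like, here is a minimal benchmark sketch. The naive fib function is a hypothetical workload chosen only for illustration.

```haskell
import Criterion.Main (bench, bgroup, defaultMain, whnf)

-- Deliberately naive Fibonacci, used only as something to measure.
fib :: Int -> Integer
fib n = if n < 2 then toInteger n else fib (n - 1) + fib (n - 2)

main :: IO ()
main = defaultMain
    [ bgroup "fib"
        [ bench "10" (whnf fib 10)  -- evaluate (fib 10) to weak head normal form
        , bench "20" (whnf fib 20)
        ]
    ]
```

Running the compiled program prints timing estimates for each benchmark. Passing an option such as --output report.html should additionally produce the HTML report with graphs.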

Notable libraries:

criterion

Educational resources:

The criterion tutorial

Unicode

Rating: Mature

Haskell's Unicode support is excellent. Just use the text and text-icu libraries, which provide a high-performance, space-efficient, and easy-to-use API for Unicode-aware text operations.

Note that there is one big catch: the default String type in Haskell is a linked list of characters, which is inefficient in both time and space. You should use Text whenever possible.
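A short sketch of the difference, assuming the OverloadedStrings extension so that string literals can be used as Text. The example word is arbitrary. It compares the Unicode-aware toUpper from the text library with mapping Data.Char.toUpper over a String.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Char as Char
import Data.Text (Text)
import qualified Data.Text as T

word :: Text
word = "Straße"

main :: IO ()
main = do
    print (T.length word)                -- 6
    -- Text performs full case conversion: the eszett becomes "SS".
    print (T.toUpper word == "STRASSE")  -- True
    print (T.length (T.toUpper word))    -- 7
    -- Mapping over a String converts one Char at a time, so the
    -- eszett is left unchanged.
    print (map Char.toUpper "Straße" == "STRAßE")  -- True
```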

Notable libraries:

text

text-icu

Parsing / Pretty-printing

Rating: Mature

Haskell is amazing at parsing. Recursive descent parser combinators are far-and-away the most popular parsing paradigm within the Haskell ecosystem, so much so that people use them even in place of regular expressions. I strongly recommend reading the "Monadic Parsing in Haskell" functional pearl linked below if you want to get a feel for why parser combinators are so dominant in the Haskell landscape.

If you're not sure what library to pick, I generally recommend the parsec library as a default well-rounded choice because it strikes a decent balance between ease-of-use, performance, good error messages, and small dependencies (since it ships with GHC).
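To give a taste of the combinator style, here is a minimal parsec sketch that parses a bracketed, comma-separated list of integers. The parser names (symbol, integer, intList) are hypothetical and chosen for this example.

```haskell
import Text.Parsec (between, char, digit, eof, many1, parse, sepBy, spaces)
import Text.Parsec.String (Parser)

-- Parse a single character and skip any whitespace after it.
symbol :: Char -> Parser Char
symbol c = char c <* spaces

-- Parse a non-negative integer and skip any whitespace after it.
integer :: Parser Int
integer = read <$> many1 digit <* spaces

-- Parse a list such as "[1, 2, 3]", requiring the whole input to be consumed.
intList :: Parser [Int]
intList = between (symbol '[') (symbol ']') (integer `sepBy` symbol ',') <* eof

main :: IO ()
main = do
    print (parse intList "" "[1, 2, 3]")  -- Right [1,2,3]
    print (parse intList "" "[1, 2,")     -- Left with a parse error message
```

Small parsers are ordinary values, so larger parsers are built by combining them with functions like between and sepBy instead of writing a separate grammar file.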

attoparsec deserves special mention as an extremely fast backtracking parsing library. The speed and simplicity of this library will blow you away. The main deficiency of attoparsec is the poor error messages.

The pretty-printing front is also excellent. Academic researchers just really love writing pretty-printing libraries in Haskell for some reason.

Notable libraries:

parsec - best overall "value"

attoparsec - Extremely fast backtracking parser

trifecta - Best error messages (clang-style)

alex / happy - Like lex / yacc but with Haskell integration

Earley - Earley parsing embedded within the Haskell language

ansi-wl-pprint - Pretty-printing library

text-format - High-performance string formatting

Educational resources:

Monadic Parsing in Haskell

Propaganda:

<a href="http://w