2016-07-22

When I first started playing with computers, back in the Pleistocene, writing a few lines of code and watching the machine carry out my instructions was enough to give me a little thrill. “Look! It can count to 10!” Today, learning a new programming language is my way of reviving that sense of wonder.

Lately I’ve been learning Julia, which describes itself as “a high-level, high-performance dynamic programming language for technical computing.” I’ve been dabbling with Julia for a couple of years, but this spring I completed my first serious project—my first attempt to do something useful with the language. I wrote a bunch of code to explore correlations between consecutive prime numbers mod m, inspired by the recent paper of Lemke Oliver and Soundararajan. The code from that project, wrapped up in a Jupyter notebook, is available on GitHub. A bit-player article published last month presented the results of the computation. Here I want to say a few words about my experience with the language.

I also discussed Julia a year ago, on the occasion of JuliaCon 2015, the annual gathering of the Julia community. Parts of this post were written at JuliaCon 2016, held at MIT June 21–25.

This document is itself derived from a Jupyter notebook, although it has been converted to static HTML—meaning that you can only read it, not run the code examples. (I’ve also done some reformatting to save space and improve appearance.) If you have a working installation of Julia and Jupyter, I suggest you download the fully interactive notebook from GitHub, so that you can edit and run the code. If you haven’t installed Julia, download the notebook anyway. You can upload it to JuliaBox.org and run it online.

What Is Julia?

A 2014 paper by the founders of the Julia project sets an ambitious goal. To paraphrase: There are languages that make programming quick and easy, and there are languages that make computation fast and efficient. The two sets of languages have long been disjoint. Julia aims to fix that. It offers high-level programming, where algorithms are expressed succinctly and without much fussing over data types and memory allocation. But it also strives to match the performance of lower-level languages such as Fortran and C. Achieving these dual goals requires attention both to the design of the language itself and to the implementation.

The Julia project was initiated by a small group at MIT, including Jeff Bezanson, Alan Edelman, Stefan Karpinski, and Viral B. Shah, and has since attracted about 500 contributors. The software is open source, hosted at GitHub.

Who Am I?

In my kind of programming, the end product is not the program itself but the answers it computes. As a result, I favor an incremental and interactive style. I want to write and run small chunks of code—typically individual procedures—without having to build a lot of scaffolding to support them. For a long time I worked mainly in Lisp, although in recent years I’ve also written a fair amount of Python and JavaScript. I mention all this because my background inevitably colors my judgment of programming languages and environments. If you’re a software engineer on a large team building large systems, your needs and priorities are surely different from mine.

What Julia Code Looks Like

Enough generalities. Here’s a bit of Julia code, plucked from my recent project:
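
    # a sketch of the function described below (isprime was in Base
    # in the Julia of 2016):
    function next_prime(x)
        if iseven(x)
            x += 1            # make x odd
        else
            x += 2            # x was already odd; move to the next odd number
        end
        while !isprime(x)     # march through the odd numbers
            x += 2
        end
        return x              # x is now prime
    end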

Given an integer x, we add 1 if x is even and otherwise add 2, thereby ensuring that the new value of x is odd. Now we repeatedly check the isprime(x) predicate, or rather its negation !isprime(x). As long as x is not a prime number, we continue incrementing by 2; when x finally lands on a prime value (it must do so eventually, according to Euclid), the while loop terminates and the function returns the prime value of x.

In this code snippet, note the absence of punctuation: No semicolons separate the statements, and no curly braces delimit the function body. The syntax owes a lot to MATLAB—most conspicuously the use of the end keyword to mark the end of a block, without any corresponding begin. The resemblance to MATLAB is more than coincidence; some of the key people in the Julia project have a background in numerical analysis and linear algebra, where MATLAB has long been a standard tool. One of those people is Alan Edelman. I remember asking him, 15 years ago, “How do you compute the eigenvalues of a matrix?” He explained: “You start up MATLAB and type eig.” The same incantation works in Julia.
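
For instance, with a small symmetric matrix of my own choosing:

    vals, vecs = eig([2 1; 1 2])
    vals    # => [1.0, 3.0]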

The sparse punctuation and the style of indentation also make Julia code look a little like Python, but in this case the similarity is only superficial. In Python, indentation determines the scope of statements, but in Julia indentation is merely an aid to the human reader; the program would compute the same result even if all the lines were flush left.

More Basics

The next_prime function above shows a while loop. Here’s a for loop:
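
    # a factorial function built on a for loop (the names are mine):
    function factorial_loop(n)
        result = 1
        for i in 1:n
            result *= i
        end
        return result
    end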

The same idea can be expressed more succinctly and idiomatically:
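
    # the same computation, delegated to the built-in prod function:
    function factorial_prod(n)
        prod(1:n)
    end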

Here 1:n denotes the numeric range \(1, 2, 3, …, n\), and the prod function returns the product of all the numbers in the range. Still too verbose? Using an abbreviated style of function definition, it becomes a true one-liner:
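
    fact(n) = prod(1:n)    # (the abbreviated name fact is my own)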

Recursion

As an immigrant from the land of Lisp, I can’t in good conscience discuss the factorial function without also presenting a recursive version:
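
    # a recursive factorial (a sketch):
    function fact(n)
        if n == 0
            return 1
        else
            return n * fact(n - 1)
        end
    end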

You can save some keystrokes by replacing the if... else... end statement with a ternary operator. The syntax is
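
    predicate ? consequent : alternative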

If predicate evaluates to true, the expression returns the value of consequent; otherwise it returns the value of alternative. With this device, the recursive factorial function looks like this:
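
    fact(n) = n == 0 ? 1 : n * fact(n - 1)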

One sour note on the subject of recursion: The Julia compiler does not perform tail-call conversion, an optimization that gives recursive procedure definitions the same space and time efficiency as iterative structures such as while loops. Consider this pair of daisy-plucking procedures:
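
    # mutually recursive, as described below (a sketch):
    function she_loves_me(petals)
        petals == 0 ? true : she_loves_me_not(petals - 1)
    end

    function she_loves_me_not(petals)
        petals == 0 ? false : she_loves_me(petals - 1)
    end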

These are mutually recursive definitions: Control passes back and forth between them until one or the other terminates when the petals are exhausted. The functions work correctly for small values of petals. But let’s try a more challenging exercise:
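
    she_loves_me(1000)           # => true
    she_loves_me(100_000_000)    # => ERROR: StackOverflowError
                                 # (the exact limit depends on stack size)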

The stack overflows because Julia is pushing a procedure-call record onto the stack for every invocation of either she_loves_me or she_loves_me_not. These records will never be needed; when the answer (true or false) is finally determined, it can be returned directly to the top level, without having to percolate up through the cascade of stack frames. The technique for eliminating such unneeded stack frames has been well known for more than 30 years. Implementing tail-call optimization has been discussed by the Julia developers but is “not a priority.” For me this is not a catastrophic defect, but it’s a disappointment. It rules out a certain style of reasoning that in some cases is the most direct way of expressing mathematical ideas.
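
For comparison, here is the same daisy computation expressed iteratively, running in constant stack space (a sketch of my own):

    function she_loves_me_iterative(petals)
        answer = true
        while petals > 0
            answer = !answer
            petals -= 1
        end
        return answer
    end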

Array Comprehensions

Given Julia’s heritage in linear algebra, it’s no surprise that there are rich facilities for describing matrices and other array-like objects. One handy tool is the array comprehension, which is written as a pair of square brackets surrounding a description of what the compiler should compute to build the array.
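
A couple of small examples of my own:

    [x^2 for x in 1:5]               # => [1, 4, 9, 16, 25]
    [i * j for i in 1:3, j in 1:4]   # a 3×4 multiplication table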

The idea of comprehensions comes from the set-builder notation of mathematics. It was adopted in the programming language SETL in the 1960s, and it has since wormed its way into several other languages, including Haskell and Python. But Julia’s comprehensions are unusual: Whereas Python comprehensions are limited to one-dimensional vectors or lists, Julia comprehensions can specify multidimensional arrays.

Multidimensionality comes at a price. A Python comprehension can have a filter clause that selects some of the array elements and excludes others. If Julia had such comprehension filters, you could generate an array of prime numbers with an expression like this: [x for x in 1:100 if isprime(x)]. But adding filters to multidimensional arrays is problematic; you can’t just pluck values out of a matrix and keep the rest of the structure intact. Nevertheless, it appears a solution is in hand. After three years of discussion, a fix has recently been added to the code base. (I have not yet tried it out.)
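
In the meantime, the one-dimensional case can be handled with the built-in filter function:

    filter(isprime, 1:100)    # => [2, 3, 5, 7, ..., 97], the 25 primes below 100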

Multiple Dispatch

This sounds like something the 911 operator might do in response to a five-alarm fire, but in fact multiple dispatch is a distinctive core idea in Julia. Understanding what it is and why you might care about it requires a bit of context.

In almost all programming languages, an expression along the lines of x * y multiplies x by y, whether x and y are integers, floating-point numbers, or maybe even complex numbers or matrices. All of these operations qualify as multiplication, but the underlying algorithms—the ways the bits are twiddled to compute the product—are quite different. Thus the * operator is polymorphic: It’s a single symbol that evokes many different actions. One way to handle this situation is to write a single big function—call it mul(x, y)—that has a chain of if... then... else statements to select the right multiplication algorithm for each x, y pair. If you want to multiply all possible combinations of \(n\) kinds of numbers, the if statement needs \(n^2\) branches. Maintaining that monolithic mul procedure can become a headache. Suppose you want to add a new type of number, such as quaternions. You would have to patch in new routines throughout the existing mul code.
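
Here is a sketch of what such a monolith might look like; the branch bodies are mere stand-ins for the type-specific algorithms:

    function mul(x, y)
        if isa(x, Integer) && isa(y, Integer)
            return x * y    # stand-in for the integer algorithm
        elseif isa(x, Complex) || isa(y, Complex)
            return x * y    # stand-in for the complex algorithm
        elseif isa(x, Real) && isa(y, Real)
            return x * y    # stand-in for the floating-point algorithm
        else
            error("mul: no rule for this combination of types")
        end
    end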

When object-oriented programming (OOP) swept through the world of computing in the 1980s, it offered an alternative. In an object-oriented setting, each class of numbers has its own set of multiplication procedures, or methods. Instead of one universal mul function, there’s a mul method for integers, another mul for floats, and so on. To multiply x by y, you don’t write mul(x, y) but rather something like x.mul(y). The object-oriented programmer thinks of this protocol as sending a message to the x object, asking it to apply its mul method to the y argument. You can also describe the scheme as a single-dispatch mechanism. The dispatcher is a behind-the-scenes component that keeps track of all methods with the same name, such as mul, and chooses which of those methods to invoke based on the data type of the single argument x. (The x object still needs some internal mechanism to examine the type of y to know which of its own methods to apply.)

Multiple dispatch is a generalization of single dispatch. The dispatcher looks at all the arguments and their types, and chooses the method accordingly. Once that decision is made, no further choices are needed. There might be separate mul methods for combining two integers, for an integer and a float, for a float and a complex, etc. Julia has 138 methods linked to the multiplication operator *, including a few mildly surprising combinations. (Quick, what’s the value of false * 42?)

    138 methods for generic function *:
    *(x::Bool, y::Bool) at bool.jl:38
    *(x::Bool, z::Complex{Bool}) at complex.jl:122
    *(z::Complex{Bool}, x::Bool) at complex.jl:123
    *(x::Real, z::Complex{Bool}) at complex.jl:140
    *(x::Float32, y::Float32) at float.jl:211
    *(x::Float64, y::Float64) at float.jl:212
    *(a::Float16, b::Float16) at float16.jl:136
    *(x::Int8, y::Int8) at int.jl:19
    *(x::UInt8, y::UInt8) at int.jl:19
    *(x::Int64, y::Int64) at int.jl:19
    *(x::Int128, y::Int128) at int.jl:456
    *(x::BigInt, y::BigInt) at gmp.jl:256
    *(a::BigInt, b::BigInt, c::BigInt) at gmp.jl:279
    *(x::BigFloat, y::BigFloat) at mpfr.jl:208
    *(x::Number, y::Number) at promotion.jl:168
    *(x::Irrational{sym}, y::Irrational{sym}) at irrationals.jl:72
    *(y::Real, x::Base.Dates.Period) at dates/periods.jl:55
    *{N}(a::Integer, index::CartesianIndex{N}) at multidimensional.jl:50
    *(index::CartesianIndex{N}, a::Integer) at multidimensional.jl:54
    *{T,S}(A::AbstractArray{T,2}, x::AbstractArray{S,1}) at linalg/matmul.jl:86
    *(A::AbstractArray{T,1}, B::AbstractArray{T,2}) at linalg/matmul.jl:89
    *(Da::Diagonal{T}, Db::Diagonal{T}) at linalg/diagonal.jl:92
    *{T}(A::Bidiagonal{T}, B::AbstractArray{T,1}) at linalg/bidiag.jl:202
    *{Tv,Ti}(A::SparseMatrixCSC{Tv,Ti}, B::SparseMatrixCSC{Tv,Ti}) at sparse/linalg.jl:108
    *{T}(G1::Base.LinAlg.Givens{T}, G2::Base.LinAlg.Givens{T}) at linalg/givens.jl:307
    *(s1::AbstractString, ss::AbstractString...) at strings/basic.jl:50
    ⋮
    *(a, b, c, xs...) at operators.jl:103

[Abridged; most of the remaining entries are methods for matrix, sparse-matrix, and FFT-plan types.]
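
(The answer to the quiz: false * 42 evaluates to 0, because in arithmetic a Bool is treated as the integer 0 or 1.) Here is the dispatch mechanism in miniature, with a toy function of my own:

    combine(x::Integer, y::Integer)       = "integer times integer"
    combine(x::Integer, y::AbstractFloat) = "integer times float"
    combine(x::Number, y::Number)         = "some other pair of numbers"

    combine(2, 3)      # => "integer times integer"
    combine(2, 3.5)    # => "integer times float"
    combine(2.0, 3im)  # => "some other pair of numbers"

In each case the dispatcher selects the most specific method that matches the types of all the arguments.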

Before Julia came along, the best-known example of multiple dispatch was the Common Lisp Object System, or CLOS. As a Lisp guy, I’m familiar with CLOS, but I’ve seldom found it useful; it’s too heavyweight.
