
Mon 2016/Nov/14

2016-11-16

Exposing Rust objects to C code

When librsvg parses an SVG file, it will encounter
elements that generate path-like objects:
lines, rectangles, polylines, circles, and actual path
definitions. Internally, librsvg translates all of
these into path definitions. For example, librsvg will
read an element from the SVG that defines a rectangle,
and translate it into a path definition made of
move-to, line-to, and close-path commands.
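
As a rough sketch of that translation, a rectangle
without rounded corners can be turned into one move-to,
three line-tos, and a close-path. The PathCommand enum
and the rect_to_path() function are hypothetical names
used only for illustration; they are not librsvg's
actual code.

```rust
/// One step of a path definition (hypothetical type, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PathCommand {
    MoveTo(f64, f64),
    LineTo(f64, f64),
    ClosePath,
}

/// Translate a plain rectangle (no rounded corners) into path commands.
/// The outline starts at the top-left corner and goes around clockwise.
pub fn rect_to_path(x: f64, y: f64, width: f64, height: f64) -> Vec<PathCommand> {
    vec![
        PathCommand::MoveTo(x, y),
        PathCommand::LineTo(x + width, y),
        PathCommand::LineTo(x + width, y + height),
        PathCommand::LineTo(x, y + height),
        // Closing the path draws the last edge back to the starting point.
        PathCommand::ClosePath,
    ]
}
```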

But where do those commands live? How are they fed into
Cairo to actually draw a rectangle?

Get your Cairo right here

One of librsvg's public API entry points is
rsvg_handle_render_cairo():

Your program creates an appropriate Cairo surface (a
window, an off-screen image, a PDF surface, whatever),
obtains a cairo_t drawing
context for the surface, and passes the cairo_t
to librsvg using that
rsvg_handle_render_cairo() function. It means,
"take this parsed SVG (the handle), and render
it to this cairo_t drawing context".

SVG files may look like an XML-ization of a tree of
graphical objects: here is a group which contains a
blue rectangle and a green circle, and here is a closed
Bézier curve with a black outline and a red fill.
However, SVG is more complicated than that; it allows
you to define objects once and recall them later many
times, it allows you to use CSS cascading rules for
applying styles to objects ("all the objects in this
group are green unless they define another color on
their own"), to reference other SVG files, etc. The
magic of librsvg is that it resolves all of that into
drawing commands for Cairo.

Feeding a path into Cairo

This is easy enough: Cairo provides an API for its
drawing context with functions like cairo_move_to(),
cairo_line_to(), cairo_curve_to(), and
cairo_close_path().

Librsvg doesn't feed paths to Cairo as soon as
it parses them from the XML; that is not done until
rendering time. In the meantime, librsvg has to keep an
intermediate representation of path data.

Librsvg uses an
RsvgPathBuilder object to hold
on to this path data for as long as needed. The API is
simple enough:
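
A minimal sketch of a builder along these lines is shown
here. Several things in it are assumptions made so that
the example builds on its own: the PathSegment enum
stands in for the segment type that the real code takes
from the cairo crate, the PathSink trait stands in for
the path-building part of a cairo_t, and replay_to() and
TextSink are hypothetical names.

```rust
/// Stand-in for the path segment type that the real code gets from cairo.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PathSegment {
    MoveTo(f64, f64),
    LineTo(f64, f64),
    CurveTo(f64, f64, f64, f64, f64, f64),
    ClosePath,
}

/// Hypothetical stand-in for the path-building sub-API of a cairo_t.
pub trait PathSink {
    fn move_to(&mut self, x: f64, y: f64);
    fn line_to(&mut self, x: f64, y: f64);
    fn curve_to(&mut self, x1: f64, y1: f64, x2: f64, y2: f64, x3: f64, y3: f64);
    fn close_path(&mut self);
}

pub struct RsvgPathBuilder {
    segments: Vec<PathSegment>,
}

impl RsvgPathBuilder {
    /// Constructor: no self parameter.
    pub fn new() -> RsvgPathBuilder {
        RsvgPathBuilder { segments: Vec::new() }
    }

    // Each command method only records a segment; nothing is drawn yet.
    pub fn move_to(&mut self, x: f64, y: f64) {
        self.segments.push(PathSegment::MoveTo(x, y));
    }

    pub fn line_to(&mut self, x: f64, y: f64) {
        self.segments.push(PathSegment::LineTo(x, y));
    }

    pub fn curve_to(&mut self, x1: f64, y1: f64, x2: f64, y2: f64, x3: f64, y3: f64) {
        self.segments.push(PathSegment::CurveTo(x1, y1, x2, y2, x3, y3));
    }

    pub fn close_path(&mut self) {
        self.segments.push(PathSegment::ClosePath);
    }

    /// Replay the recorded segments, in order, into a sink.
    pub fn replay_to<S: PathSink>(&self, sink: &mut S) {
        for segment in &self.segments {
            match *segment {
                PathSegment::MoveTo(x, y) => sink.move_to(x, y),
                PathSegment::LineTo(x, y) => sink.line_to(x, y),
                PathSegment::CurveTo(x1, y1, x2, y2, x3, y3) => {
                    sink.curve_to(x1, y1, x2, y2, x3, y3)
                }
                PathSegment::ClosePath => sink.close_path(),
            }
        }
    }
}

/// Toy sink that writes the replayed commands out as text.
pub struct TextSink {
    pub text: String,
}

impl PathSink for TextSink {
    fn move_to(&mut self, x: f64, y: f64) {
        self.text.push_str(&format!("M {} {} ", x, y));
    }
    fn line_to(&mut self, x: f64, y: f64) {
        self.text.push_str(&format!("L {} {} ", x, y));
    }
    fn curve_to(&mut self, x1: f64, y1: f64, x2: f64, y2: f64, x3: f64, y3: f64) {
        self.text.push_str(&format!("C {} {} {} {} {} {} ", x1, y1, x2, y2, x3, y3));
    }
    fn close_path(&mut self) {
        self.text.push_str("Z ");
    }
}
```

The point of the split is that recording and drawing are
decoupled: the builder can be filled at parse time and
replayed against whatever drawing context shows up at
rendering time.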

This mimics the sub-API of cairo_t to build
paths, except that instead of feeding them immediately
into the Cairo drawing context, RsvgPathBuilder
builds an array of path commands that it will later
replay to a given cairo_t. Let's look at the
methods of RsvgPathBuilder.

"pub fn new () -> RsvgPathBuilder" - this
doesn't take a self parameter; you could call
it a static method in languages that support classes.
It is just a constructor.

"pub fn move_to (&mut self, x: f64, y:
f64)" - This one is a normal method, as it takes a
self parameter. It also takes (x, y)
double-precision floating point values for the move_to
command. Note the "&mut self": this means
that you must pass a mutable reference to an
RsvgPathBuilder, since the method will change the
builder's contents by adding a move_to command. It
is a method that changes the state of the
object, so it must take a mutable object.

The other methods for path commands are similar to
move_to. None of them have return values; if they did,
they would have a "-> ReturnType" after
the argument list.

But that RsvgPathBuilder is a Rust object! And
it still needs to be called from the C code in librsvg
that hasn't been ported over to Rust yet. How do we do that?

Exporting an API from Rust to C

C doesn't know about objects with methods, even though
you can fake
them pretty well with structs and pointers to
functions. Rust doesn't try to export structs with
methods in a fancy way; you have to do that by hand.
This is no harder than writing a GObject implementation
in C, fortunately.

Let's look at the C header file for the
RsvgPathBuilder object, which is entirely
implemented in Rust. The C header file is rsvg-path-builder.h.
Here is part of that file:

Nothing special here. RsvgPathBuilder is an
opaque struct; we declare it like that just so we can
take a pointer to it as in the
rsvg_path_builder_move_to() and
rsvg_path_builder_line_to() functions.

How about the Rust side of things? This is where it
gets more interesting. This is part of path-builder.rs:
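
A sketch of that code, with numbered comments, could
look like the following. Two things are assumptions made
so that it builds on its own: the PathSegment enum
replaces the type from the cairo crate, and the
segments() accessor is a hypothetical addition. The
`extern "C"` spelling is equivalent to a bare `extern`.
On the Rust 2024 edition the attribute has to be written
`#[unsafe(no_mangle)]`; the plain `#[no_mangle]` form
shown here is the one accepted by earlier editions.

```rust
// (1) The real code imports the cairo crate at this point. This sketch
//     defines a stand-in PathSegment instead, so that it builds on its own.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PathSegment {
    MoveTo(f64, f64),
}

pub struct RsvgPathBuilder {                                   // (2)
    path_segments: Vec<PathSegment>,
}

impl RsvgPathBuilder {                                         // (3)
    pub fn new() -> RsvgPathBuilder {
        RsvgPathBuilder { path_segments: Vec::new() }
    }

    pub fn move_to(&mut self, x: f64, y: f64) {                // (4)
        self.path_segments.push(PathSegment::MoveTo(x, y));    // (5)
    }

    /// Hypothetical accessor, only here so the result can be inspected.
    pub fn segments(&self) -> &[PathSegment] {
        &self.path_segments
    }
}

#[no_mangle]                                                   // (6)
pub extern "C" fn rsvg_path_builder_move_to(                   // (7)
    raw_builder: *mut RsvgPathBuilder,
    x: f64,
    y: f64,
) {
    assert!(!raw_builder.is_null());                           // (8)

    // Turn the raw pointer from C into a normal Rust reference.
    let builder: &mut RsvgPathBuilder = unsafe { &mut (*raw_builder) }; // (9)

    builder.move_to(x, y);                                     // (10)
}
```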

Let's look at the numbered lines:

1. We use the cairo crate from the
excellent gtk-rs, the
Rust binding for GTK+ and Cairo.

2. This is our Rust structure. Its
fields are not important for this discussion; they are
just what the struct uses to store Cairo path commands.

3. Now we begin implementing methods
for that structure. These are Rust-side methods, not
visible from C. In 4 and
5 we see the implementation of
::move_to(); it just creates a new
cairo::PathSegment and pushes it to the vector of
segments.

6. The "#[no_mangle]" line
instructs the Rust compiler to put the following
function name in the .a library just as it is, without
any name mangling. The function name without name
mangling looks just like
rsvg_path_builder_move_to to the linker, as we
expect. A name-mangled Rust function looks like
_ZN14rsvg_internals12path_builder15RsvgPathBuilder8curve_to17h1b8f49042ff19daaE
— you can explore these with "objdump -x rust/target/debug/librsvg_internals.a"

7. "pub extern fn
rsvg_path_builder_move_to (raw_builder: *mut
RsvgPathBuilder". This is a public function with
an exported symbol in the .a file, not an internal one,
as it will be called from the C code. And the
"raw_builder: *mut RsvgPathBuilder" is Rust-ese
for "a pointer to an RsvgPathBuilder with mutable
contents". If this were only an accessor function, we
would use a "*const RsvgPathBuilder" argument
type.

8. "assert! (!raw_builder.is_null
());". You can read this as "g_assert
(raw_builder != NULL);" if you come from GObject
land.

9. "let builder: &mut
RsvgPathBuilder = unsafe { &mut (*raw_builder)
}". This declares a builder variable, of
type &mut RsvgPathBuilder, which is a
reference to a mutable path builder. The variable gets
initialized with the result of "&mut
(*raw_builder)": first we de-reference the
raw_builder pointer with the asterisk, and convert that
to a mutable reference with the &mut.
De-referencing pointers that come from who-knows-where
is an unsafe operation in Rust, as the compiler
cannot guarantee their validity, and so we must wrap
that operation with an unsafe{} block.
This is like telling the compiler, "I acknowledge that
this is potentially unsafe". Already this is better
than life in C, where *every* de-reference is
potentially dangerous; in Rust, only those that "bring
in" pointers from the outside are potentially dangerous.

10. Now we have a Rust-side reference
to an RsvgPathBuilder object, and we can call the
builder.move_to() method as in regular Rust code.

Those are methods. And the constructor/destructor?

Excellent question! We defined an absolutely
conventional method, but we haven't created a Rust
object and sent it over to the C world yet. And we
haven't taken a Rust object from the C world
and destroyed it when we are done with it.

Construction

Here is the C prototype for the constructor, exactly as
you would expect from a GObject library:

And here is the corresponding implementation in Rust:
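
A minimal sketch of such a constructor, with numbered
comments, follows. The struct's field and the
segment_count() accessor are placeholders chosen for
this example, not librsvg's actual definitions.

```rust
pub struct RsvgPathBuilder {
    // Placeholder storage; the real struct keeps Cairo path segments.
    path_segments: Vec<(f64, f64)>,
}

impl RsvgPathBuilder {
    pub fn new() -> RsvgPathBuilder {
        RsvgPathBuilder { path_segments: Vec::new() }
    }

    /// Hypothetical accessor, only here so the result can be inspected.
    pub fn segment_count(&self) -> usize {
        self.path_segments.len()
    }
}

#[no_mangle]
pub unsafe extern "C" fn rsvg_path_builder_new() -> *mut RsvgPathBuilder { // (1)
    let builder = RsvgPathBuilder::new();                                  // (2)

    // Move the builder to the heap so it outlives this function.
    let boxed_builder = Box::new(builder);                                 // (3)

    // Give up ownership of the box and hand out a raw pointer instead.
    // No semicolon: this expression is the function's return value.
    Box::into_raw(boxed_builder)                                           // (4)
}
```

From this point on Rust will not free the builder by
itself; whoever holds the pointer has to hand it back to
Rust for destruction.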

1. Again, this is a public function
with an exported symbol. However, this whole function
is marked as unsafe since it returns a pointer,
a *mut RsvgPathBuilder. To Rust this
declaration means, "this pointer will be out of your
control", hence the unsafe. With that we
acknowledge our responsibility in handling the memory to
which the pointer refers.

2. We instantiate an RsvgPathBuilder with normal Rust code...

3. ... and ensure that that object is
put in the heap by Boxing it. This is a common
operation in garbage-collected languages. Boxing is
Rust's primitive for putting data in the program's heap;
it allows the object in question to outlive the scope
where it got created, i.e. the duration of the
rsvg_path_builder_new() function.

4. Finally, we call
Box::into_raw() to ask Rust to give us a
pointer to the contents of the box, i.e. the actual
RsvgPathBuilder struct that lives there. This statement
doesn't end in a semicolon, so it is the return value
for the function.

You could read this as "builder = g_new (...);
initialize (builder); return builder;". Allocate
something in the heap and initialize it, and return a
pointer to it. This is exactly what the Rust code is
doing.

Destruction

This is the C prototype for the destructor. This is
not a reference-counted GObject; it is just an
internal thing in librsvg, which does not need reference
counting.

And this is the implementation in Rust:
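
A minimal sketch of such a destructor, with numbered
comments, follows. The Drop implementation and the
DROPPED_BUILDERS counter are hypothetical additions
whose only purpose is to make the freeing observable;
the struct's field is a placeholder as well.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts dropped builders; only here to make the destructor's effect visible.
pub static DROPPED_BUILDERS: AtomicUsize = AtomicUsize::new(0);

pub struct RsvgPathBuilder {
    // Placeholder storage; the real struct keeps Cairo path segments.
    path_segments: Vec<(f64, f64)>,
}

impl RsvgPathBuilder {
    pub fn new() -> RsvgPathBuilder {
        RsvgPathBuilder { path_segments: Vec::new() }
    }
}

impl Drop for RsvgPathBuilder {
    fn drop(&mut self) {
        DROPPED_BUILDERS.fetch_add(1, Ordering::SeqCst);
    }
}

#[no_mangle]
pub unsafe extern "C" fn rsvg_path_builder_destroy(raw_builder: *mut RsvgPathBuilder) { // (1)
    assert!(!raw_builder.is_null());                                                    // (2)

    // Re-wrap the pointer in a Box so that Rust owns the builder again.
    // Dropping that Box runs Drop for the builder and frees its memory.
    let _ = unsafe { Box::from_raw(raw_builder) };                                      // (3)
}
```

The pointer must have come from Box::into_raw() and must
not be used again after this call; nothing in the
signature can enforce that, which is why the function is
marked unsafe.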

1. Same as before; we declare the whole
function as public, exported, and unsafe since it takes
a pointer from who-knows-where.

2. Same as in the implementation for move_to(), we assert that we
got passed a non-null pointer.

3. Let's take this bit by bit.
"Box::from_raw (raw_builder)" is the
counterpart to Box::into_raw() from above; it
takes a pointer and wraps it with a Box, which Rust
knows how to de-reference into the actual object it
contains. "let _ =" says that we don't care about
keeping the resulting Box around. The _ pattern does
not actually bind the value to a variable, so the Box
is dropped right away, at the end of that statement,
and Rust frees the memory for the RsvgPathBuilder.
You can read this idiom as "g_free (builder)".

Recapitulating

Make your object. Box it. Take a pointer to it with
Box::into_raw(), and send it off into the wild
west. Bring back a pointer to your object. Unbox it
with Box::from_raw(). Let it go out of scope
if you want the object to be freed. Acknowledge your
responsibilities with unsafe and that's all!

Making the functions visible to C

The code we just saw lives in path-builder.rs.
By convention, the place where one actually exports the
visible API from a Rust library is a file called lib.rs,
and here is part of that file's contents in librsvg:

The mod path_builder indicates that
lib.rs will use the path_builder sub-module.
The pub use block exports the functions listed
in it to the outside world. They will be visible as
symbols in the .a file.
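
A single-file sketch of that arrangement is shown here.
In a real crate the module would live in its own
path_builder.rs and lib.rs would only say
"mod path_builder;"; it is inlined here so that the
example builds on its own. The function bodies and the
struct's field are simplified placeholders.

```rust
mod path_builder {
    pub struct RsvgPathBuilder {
        // Placeholder storage for the recorded move_to points.
        segments: Vec<(f64, f64)>,
    }

    #[no_mangle]
    pub unsafe extern "C" fn rsvg_path_builder_new() -> *mut RsvgPathBuilder {
        Box::into_raw(Box::new(RsvgPathBuilder { segments: Vec::new() }))
    }

    #[no_mangle]
    pub extern "C" fn rsvg_path_builder_move_to(
        raw_builder: *mut RsvgPathBuilder,
        x: f64,
        y: f64,
    ) {
        assert!(!raw_builder.is_null());
        let builder = unsafe { &mut *raw_builder };
        builder.segments.push((x, y));
    }

    #[no_mangle]
    pub unsafe extern "C" fn rsvg_path_builder_destroy(raw_builder: *mut RsvgPathBuilder) {
        assert!(!raw_builder.is_null());
        let _ = unsafe { Box::from_raw(raw_builder) };
    }
}

// Re-export the C-callable functions from the crate root.
pub use self::path_builder::{
    rsvg_path_builder_destroy,
    rsvg_path_builder_move_to,
    rsvg_path_builder_new,
};
```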

The Cargo.toml
(akin to a toplevel Makefile.am) for librsvg's little
sub-library has this bit:

This means that the sub-library will be called
librsvg_internals.a, and it is a static
library. I will link that into my master
librsvg.so. If this were a stand-alone shared
library entirely implemented in Rust, I would use the
"cdylib"
crate type instead.

Linking into the main .so

In librsvg/Makefile.am
I have a very simplistic scheme for building the
librsvg_internals.a library with Rust's tools, and
linking the result into the main librsvg.so:

This uses a .PHONY target for
librsvg_internals.a, so "cargo build" will always be
called on it. Cargo already takes care of dependency
tracking; there is no need for make/automake to do that.

I put the filename of my library in a RUST_LIB
variable, which I then reference from LIBADD. This gets
librsvg_internals.a linked into the final
librsvg.so.

When you run "cargo build" just like that, it
creates a debug build in a target/debug
subdirectory. I haven't looked for a way to make it
play together with Automake when one calls "cargo
build --release": that one puts things in a
different directory, called target/release.
Rust's tooling is more integrated that way, while in the
Autotools world I'm expected to pass any CFLAGS for
compilation by hand, depending on whether I'm doing a
debug build or a release build. Any ideas for how to do
this cleanly are appreciated.

I don't have any code in configure.ac to
actually detect if Rust is present. I'm just assuming
that it is for now; fixes are appreciated :)

Using the Rust functions from C

There is no difference from what we had before! This
comes from rsvg-shapes.c:

Note that we are calling
rsvg_path_builder_new() and
rsvg_path_builder_move_to(), and returning a
pointer to an RsvgPathBuilder structure as
usual. However, all of those are implemented in the
Rust code. The C code has no idea!

This is the magic of Rust: it allows you to
move your C code bit by bit into a safe
language. You don't have to do a whole rewrite
in a single step. I don't know any other languages that
let you do that.