From Rust to beyond: The WebAssembly galaxy

This blog post is part of a series explaining how to send Rust beyond Earth, into many different galaxies.


The first galaxy that our Rust parser will explore is the WebAssembly (WASM) galaxy. This post will explain what WebAssembly is, how to compile the parser into WebAssembly, and how to use the WebAssembly binary with Javascript in a browser and with NodeJS.

What is WebAssembly, and why?

If you already know WebAssembly, you can skip this section.

WebAssembly defines itself as:

WebAssembly (abbreviated Wasm) is a binary instruction format for a stack-based virtual machine. Wasm is designed as a portable target for compilation of high-level languages like C/C++/Rust, enabling deployment on the web for client and server applications.

Should I say more? Probably, yes…

WebAssembly is a new portable binary format. Languages like C, C++, or Rust already compile to this target. It is the spiritual successor of ASM.js: the same people who are trying to extend the Web platform and make the Web fast are working on both technologies. They share some design concepts too, but that’s not really important right now.

Before WebAssembly, programs had to compile to Javascript in order to run on the Web platform. The resulting files were often large. And because the Web is a network, the files had to be downloaded, which took time. WebAssembly is designed to be encoded in a size- and load-time-efficient binary format.

WebAssembly is also faster than Javascript for many reasons. Despite all the crazy optimisations engineers have put into Javascript virtual machines, Javascript is a weakly and dynamically typed language, which has to be interpreted. WebAssembly aims to execute at native speed by taking advantage of common hardware capabilities. WebAssembly also loads faster than Javascript because parsing and compiling happen while the binary is streamed from the network. So once the binary is entirely fetched, it is ready to run: no need to wait for the parser and the compiler before running the program.

Today, and this blog series is a perfect example of that, it is possible to write a Rust program and compile it to run on the Web platform. Why? Because WebAssembly is implemented by all major browsers, and because it has been designed for the Web: to live and run on the Web platform (like a browser). But its portability and its safe, sandboxed memory design also make it a good candidate to run outside of the Web platform (see a serverless WASM framework, or an application container built for WASM).

I think it is important to point out that WebAssembly is not here to replace Javascript. It is just another technology which solves many problems we face today, like load time, safety, or speed.

Rust 🚀 WebAssembly

Rust to WASM

The Rust WASM team is a group of people leading the effort of pushing Rust into WebAssembly with a set of tools and integrations. There is a book explaining how to write a WebAssembly program with Rust.

With the Gutenberg Rust parser, I didn’t use tools like wasm-bindgen (which is a pure gem) when I started the project a few months ago because I hit some limitations. Note that some of them have been addressed since then! Anyway, we will do most of the work by hand, and I think this is an excellent way to understand how things work in the background. When you are familiar with WebAssembly interactions, wasm-bindgen is an excellent tool to have within easy reach, because it abstracts all the interactions and lets you focus on your code logic instead.

I would like to remind the reader that the Gutenberg Rust parser exposes one AST, and one root function (the axiom of the grammar), respectively defined as:

pub enum Node<'a> {
    Block {
        name: (Input<'a>, Input<'a>),
        attributes: Option<Input<'a>>,
        children: Vec<Node<'a>>
    },
    Phrase(Input<'a>)
}

and

pub fn root(
    input: Input
) -> Result<(Input, Vec<ast::Node>), nom::Err<Input>>;

Knowing that, let’s go!

General design

Here is our general design or workflow:

  1. Javascript (for instance) writes the blog post to parse into the WebAssembly module memory,
  2. Javascript runs the root function by passing a pointer to the memory, and the length of the blog post,
  3. Rust reads the blog post from the memory, runs the Gutenberg parser, compiles the resulting AST into a sequence of bytes, and returns the pointer to this sequence of bytes to Javascript,
  4. Javascript reads the memory from the received pointer, and decodes the sequence of bytes as Javascript objects in order to recreate an AST with a friendly API.

Why a sequence of bytes? Because WebAssembly only supports integers and floats, not strings or vectors, and also because our Rust parser takes a slice of bytes as input, so this is handy.

We use the term boundary layer to refer to this piece of Javascript code responsible for reading from and writing into the WebAssembly module memory, and for exposing a friendly API.

Now, we will focus on the Rust code. It consists of only 4 functions:

  • alloc to allocate memory (exported),
  • dealloc to deallocate memory (exported),
  • root to run the parser (exported),
  • into_bytes to transform the AST into a sequence of bytes.

The entire code lands here. It is approximately 150 lines of code. Let’s explain it.

Memory allocation

Let’s start with the memory allocator. I chose to use wee_alloc as the memory allocator. It is specifically designed for WebAssembly: it is very small (less than a kilobyte) and efficient.

The following piece of code describes the memory allocator setup and the “prelude” for our code (enabling some compiler features, like alloc, declaring external crates, some aliases, and declaring required functions like panic, oom etc.). This can be considered as boilerplate:

#![no_std]
#![feature(
    alloc,
    alloc_error_handler,
    core_intrinsics,
    lang_items
)]

extern crate gutenberg_post_parser;
extern crate wee_alloc;
#[macro_use] extern crate alloc;

use gutenberg_post_parser::ast::Node;
use alloc::vec::Vec;
use core::{mem, slice};

#[global_allocator]
static ALLOC: wee_alloc::WeeAlloc = wee_alloc::WeeAlloc::INIT;

#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
    unsafe { core::intrinsics::abort(); }
}

#[alloc_error_handler]
fn oom(_: core::alloc::Layout) -> ! {
    unsafe { core::intrinsics::abort(); }
}

// This is the definition of `std::ffi::c_void`, but WASM runs without std in our case.
#[repr(u8)]
#[allow(non_camel_case_types)]
pub enum c_void {
    #[doc(hidden)]
    __variant1,

    #[doc(hidden)]
    __variant2
}

The Rust memory is the WebAssembly memory. Rust will allocate and deallocate memory on its own, but Javascript for instance needs to allocate and deallocate WebAssembly memory in order to communicate/exchange data. So we need to export one function to allocate memory and one function to deallocate memory.

Once again, this is almost boilerplate. The alloc function creates an empty vector of a specific capacity (because it is a linear segment of memory), and returns a pointer to this empty vector:

#[no_mangle]
pub extern "C" fn alloc(capacity: usize) -> *mut c_void {
    let mut buffer = Vec::with_capacity(capacity);
    let pointer = buffer.as_mut_ptr();
    mem::forget(buffer);

    pointer as *mut c_void
}

Note the #[no_mangle] attribute, which instructs the Rust compiler not to mangle the function name, i.e. not to rename it. And extern "C" exports the function in the WebAssembly module, so it is “public” from outside the WebAssembly binary.

The code is pretty straightforward and matches what we announced earlier: a Vec is allocated with a specific capacity, and the pointer to this vector is returned. The important part is mem::forget(buffer). It is required so that Rust will not deallocate the vector once it goes out of scope. Indeed, Rust enforces Resource Acquisition Is Initialization (RAII), so whenever an object goes out of scope, its destructor is called and its owned resources are freed. This behavior shields against resource leak bugs, and this is why we never have to manually free memory or worry about memory leaks in Rust (see some RAII examples). In this case, we want to keep the allocation alive after the function execution, hence the mem::forget call.
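
To illustrate what mem::forget changes, here is a tiny standalone sketch (not part of the parser code): the first scope drops its value normally, the second one forgets it.

use std::mem;

struct Resource;

impl Drop for Resource {
    fn drop(&mut self) {
        println!("dropped");
    }
}

fn main() {
    {
        let _resource = Resource;
        // RAII: “dropped” is printed at the end of this scope.
    }

    {
        let resource = Resource;
        mem::forget(resource);
        // Nothing is printed: the destructor never runs and the value is leaked,
        // which is exactly what `alloc` relies on to keep its buffer alive.
    }
}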

Let’s jump to the dealloc function. The goal is to recreate a vector based on a pointer and a capacity, and to let Rust drop it:

#[no_mangle]
pub extern "C" fn dealloc(pointer: *mut c_void, capacity: usize) {
    unsafe {
        let _ = Vec::from_raw_parts(pointer, 0, capacity);
    }
}

The Vec::from_raw_parts function is marked as unsafe, so we need to delimit it in an unsafe block so that the dealloc function is considered as safe.

The variable _ contains our data to deallocate, and it goes out of scope immediately, so Rust drops it.

From input to a flat AST

Now the core of the binding! The root function reads the blog post to parse based on a pointer and a length, then it parses it. If the result is OK, it serializes the AST into a sequence of bytes, i.e. it flattens it; otherwise it returns an empty sequence of bytes.

Flatten AST
The logic flow of the parser: The input on the left is parsed into an AST, which is serialized into a flat sequence of bytes on the right.
#[no_mangle]
pub extern "C" fn root(pointer: *mut u8, length: usize) -> *mut u8 {
    let input = unsafe { slice::from_raw_parts(pointer, length) };
    let mut output = vec![];

    if let Ok((_remaining, nodes)) = gutenberg_post_parser::root(input) {
        // Compile the AST (nodes) into a sequence of bytes.
    }

    let pointer = output.as_mut_ptr();
    mem::forget(output);

    pointer
}

The variable input contains the blog post. It is fetched from memory with a pointer and a length. The variable output is the sequence of bytes the function will return. gutenberg_post_parser::root(input) runs the parser. If parsing is OK, then the nodes are compiled into a sequence of bytes (omitted for now). Then the pointer to the sequence of bytes is grabbed, the Rust compiler is instructed to not drop it, and finally the pointer is returned. The logic is again pretty straightforward.

Now, let’s focus on the compilation of the AST into a sequence of bytes (u8). All the data the AST holds are already bytes, which makes the process easier. The goal is only to flatten the AST:

  • The first 4 bytes represent the number of nodes at the first level (4 × u8 represents a u32),
  • Next, if the node is Block:
    • The first byte is the node type: 1u8 for a block,
    • The second byte is the size of the block name,
    • The third to the sixth bytes are the size of the attributes,
    • The seventh byte is the number of node children the block has,
    • Next bytes are the block name,
    • Next bytes are the attributes (&b"null"[..] if none),
    • Next bytes are node children as a sequence of bytes,
  • Next, if the node is Phrase:
    • The first byte is the node type: 2u8 for a phrase,
    • The second to the fifth bytes are the size of the phrase,
    • Next bytes are the phrase.

Here is the missing part of the root function:

if let Ok((_remaining, nodes)) = gutenberg_post_parser::root(input) {
    let nodes_length = u32_to_u8s(nodes.len() as u32);

    output.push(nodes_length.0);
    output.push(nodes_length.1);
    output.push(nodes_length.2);
    output.push(nodes_length.3);

    for node in nodes {
        into_bytes(&node, &mut output);
    }
}

And here is the into_bytes function:

fn into_bytes<'a>(node: &Node<'a>, output: &mut Vec<u8>) {
    match *node {
        Node::Block { name, attributes, ref children } => {
            let node_type = 1u8;
            let name_length = name.0.len() + name.1.len() + 1;
            let attributes_length = match attributes {
                Some(attributes) => attributes.len(),
                None => 4
            };
            let attributes_length_as_u8s = u32_to_u8s(attributes_length as u32);

            let number_of_children = children.len();
            output.push(node_type);
            output.push(name_length as u8);
            output.push(attributes_length_as_u8s.0);
            output.push(attributes_length_as_u8s.1);
            output.push(attributes_length_as_u8s.2);
            output.push(attributes_length_as_u8s.3);
            output.push(number_of_children as u8);

            output.extend(name.0);
            output.push(b'/');
            output.extend(name.1);

            if let Some(attributes) = attributes {
                output.extend(attributes);
            } else {
                output.extend(&b"null"[..]);
            }

            for child in children {
                into_bytes(&child, output);
            }
        },

        Node::Phrase(phrase) => {
            let node_type = 2u8;
            let phrase_length = phrase.len();

            output.push(node_type);

            let phrase_length_as_u8s = u32_to_u8s(phrase_length as u32);

            output.push(phrase_length_as_u8s.0);
            output.push(phrase_length_as_u8s.1);
            output.push(phrase_length_as_u8s.2);
            output.push(phrase_length_as_u8s.3);
            output.extend(phrase);
        }
    }
}

What I find interesting with this code is that it reads just like the bullet list above.
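
To make the layout concrete, here is a small worked example (a sketch for illustration, not code from the repository). The input <!-- wp:foo /-->xyz, the same input used later in the Javascript example, parses into one block (named core/foo, no attributes, no children) followed by one phrase (xyz), and flattens to:

// A sketch of the expected flat AST for `<!-- wp:foo /-->xyz`,
// following the layout described by the bullet list above.
fn main() {
    let expected: Vec<u8> = vec![
        0, 0, 0, 2,           // Two nodes at the first level (a u32 as 4 × u8).
        1,                    // Node type: block.
        8,                    // Length of the block name: "core/foo" is 8 bytes.
        0, 0, 0, 4,           // Length of the attributes: "null" is 4 bytes.
        0,                    // Number of children: none.
        b'c', b'o', b'r', b'e', b'/', b'f', b'o', b'o', // The block name.
        b'n', b'u', b'l', b'l',                         // The attributes placeholder.
        2,                    // Node type: phrase.
        0, 0, 0, 3,           // Length of the phrase: "xyz" is 3 bytes.
        b'x', b'y', b'z',     // The phrase itself.
    ];

    assert_eq!(expected.len(), 31);
}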

For the most curious, here is the u32_to_u8s function:

fn u32_to_u8s(x: u32) -> (u8, u8, u8, u8) {
    (
        ((x >> 24) & 0xff) as u8,
        ((x >> 16) & 0xff) as u8,
        ((x >> 8)  & 0xff) as u8,
        ( x        & 0xff) as u8
    )
}

Here we are. alloc, dealloc, root, and into_bytes. Four functions, and everything is done.

Producing and optimising the WebAssembly binary

To get a WebAssembly binary, the project has to be compiled to the wasm32-unknown-unknown target. For now (and it will change in the near future), the nightly toolchain is needed to compile the project, so make sure you have the latest nightly version of rustc & co. installed with rustup update nightly. Let’s run cargo:

$ RUSTFLAGS='-g' cargo +nightly build --target wasm32-unknown-unknown --release

The WebAssembly binary weighs 22kb. Our goal is to reduce the file size. For that, the following tools will be required:

  • wasm-gc to garbage-collect unused imports, internal functions, types etc.,
  • wasm-snip to mark some functions as unreachable, which is useful when the binary includes unused code that the linker was not able to remove,
  • wasm-opt from the Binaryen project, to optimise the binary,
  • gzip and brotli to compress the binary.

Basically, what we do is the following:

$ # Garbage-collect unused data.
$ wasm-gc gutenberg_post_parser.wasm

$ # Mark fmt and panicking as unreachable.
$ wasm-snip --snip-rust-fmt-code --snip-rust-panicking-code gutenberg_post_parser.wasm -o gutenberg_post_parser_snipped.wasm
$ mv gutenberg_post_parser_snipped.wasm gutenberg_post_parser.wasm

$ # Garbage-collect unreachable data.
$ wasm-gc gutenberg_post_parser.wasm

$ # Optimise for small size.
$ wasm-opt -Oz -o gutenberg_post_parser_opt.wasm gutenberg_post_parser.wasm
$ mv gutenberg_post_parser_opt.wasm gutenberg_post_parser.wasm

$ # Compress.
$ gzip --best --stdout gutenberg_post_parser.wasm > gutenberg_post_parser.wasm.gz
$ brotli --best --stdout --lgwin=24 gutenberg_post_parser.wasm > gutenberg_post_parser.wasm.br 

We end up with the following file sizes:

  • .wasm: 16kb,
  • .wasm.gz: 7.3kb,
  • .wasm.br: 6.2kb.

Neat! Brotli is implemented by most browsers, so when the client sends Accept-Encoding: br, the server can respond with the .wasm.br file.

To give you a feeling of what 6.2kb represents, the following image also weighs 6.2kb:

The simplified WordPress logo.

The WebAssembly binary is ready to run!

WebAssembly 🚀 Javascript

WASM to JS

In this section, we assume Javascript runs in a browser. Thus, what we need to do is the following:

  1. Load/stream and instantiate the WebAssembly binary,
  2. Write the blog post to parse in the WebAssembly module memory,
  3. Call the root function on the parser,
  4. Read the WebAssembly module memory to load the flat AST (a sequence of bytes) and decode it to build a “Javascript AST” (with our own objects).

The entire code lands here. It is approximately 150 lines of code too. I won’t explain the whole code since some parts of it are the “friendly API” exposed to the user. I will rather explain the major pieces.

Loading/streaming and instantiating

The WebAssembly API exposes multiple ways to load a WebAssembly binary. The best one you can use is the WebAssembly.instantiateStreaming function: it streams the binary and compiles it at the same time; nothing is blocking. This API relies on the Fetch API. You might have guessed it: it is asynchronous (it returns a promise). WebAssembly itself is not asynchronous (unless you use threads), but the instantiation step is. It is possible to avoid that, but this is tricky, and Google Chrome has a strong limit of 4kb for the binary size which will make you give up quickly.

To be able to stream the WebAssembly binary, the server must send the application/wasm MIME type (with the Content-Type header).

Let’s instantiate our WebAssembly binary:

const url = '/gutenberg_post_parser.wasm';
const wasm =
    WebAssembly.
        instantiateStreaming(fetch(url), {}).
        then(object => object.instance).
        then(instance => { /* step 2 */ });

The WebAssembly binary has been instantiated! Now we can move to the next step.

Last polish before running the parser

Remember that the WebAssembly binary exports 3 functions: alloc, dealloc, and root. They can be found on the exports property, along with the memory. Let’s write that:

        then(instance => {
            const Module = {
                alloc: instance.exports.alloc,
                dealloc: instance.exports.dealloc,
                root: instance.exports.root,
                memory: instance.exports.memory
            };

            runParser(Module, '<!-- wp:foo /-->xyz');
        });

Great, everything is ready to write the runParser function!

The parser runner

As a reminder, this function has to write the input (the blog post to parse) into the WebAssembly module memory (Module.memory), call the root function (Module.root), and read the result from the WebAssembly module memory. Let’s do that:

function runParser(Module, raw_input) {
    const input = new TextEncoder().encode(raw_input);
    const input_pointer = writeBuffer(Module, input);
    const output_pointer = Module.root(input_pointer, input.length);
    const result = readNodes(Module, output_pointer);

    Module.dealloc(input_pointer, input.length);

    return result;
}

In detail:

  • The raw_input is encoded into a sequence of bytes with the TextEncoder API, into input,
  • The input is written into the WebAssembly module memory with writeBuffer and its pointer is returned,
  • Then the root function is called with the pointer to the input and the length of the input as expected, and the pointer to the output is returned,
  • Then the output is decoded,
  • And finally, the input is deallocated. The output of the parser will be deallocated in the readNodes function because its length is unknown at this step.

Great! So we have 2 functions to write right now: writeBuffer and readNodes.

Writing the data in memory

Let’s go with the first one, writeBuffer:

function writeBuffer(Module, buffer) {
    const buffer_length = buffer.length;
    const pointer = Module.alloc(buffer_length);
    const memory = new Uint8Array(Module.memory.buffer);

    for (let i = 0; i < buffer_length; ++i) {
        memory[pointer + i] = buffer[i];
    }

    return pointer;
}

In detail:

  • The length of the buffer is read in buffer_length,
  • A space in memory is allocated to write the buffer,
  • Then a uint8 view of the module memory is instantiated, which means that the memory will be viewed as a sequence of u8, exactly what Rust expects,
  • Finally the buffer is copied into the memory with a loop (very basic), and the pointer is returned.

Note that, unlike with C strings, adding a NUL byte at the end is not mandatory. This is just raw data (on the Rust side, we read it with slice::from_raw_parts; a slice is a very simple structure).

Reading the output of the parser

So at this step, the input has been written in memory, and the root function has been called so it means the parser has run. It has returned a pointer to the output (the result) and we now have to read it and decode it.

Remember that the first 4 bytes encode the number of nodes we have to read. Let’s go!

function readNodes(Module, start_pointer) {
    const buffer = new Uint8Array(Module.memory.buffer.slice(start_pointer));
    const number_of_nodes = u8s_to_u32(buffer[0], buffer[1], buffer[2], buffer[3]);

    if (0 >= number_of_nodes) {
        return null;
    }

    const nodes = [];
    let offset = 4;
    let end_offset;

    for (let i = 0; i < number_of_nodes; ++i) {
        const last_offset = readNode(buffer, offset, nodes);

        offset = end_offset = last_offset;
    }

    Module.dealloc(start_pointer, start_pointer + end_offset);

    return nodes;
}

In detail:

  • A uint8 view of the memory is instantiated… more precisely: a slice of the memory starting at start_pointer,
  • The number of nodes is read, then all nodes are read,
  • And finally, the output of the parser is deallocated.

For the record, here is the u8s_to_u32 function, which is the exact opposite of u32_to_u8s:

function u8s_to_u32(o, p, q, r) {
    return (o << 24) | (p << 16) | (q << 8) | r;
}

And I will also share the readNode function, but I won’t explain the details. This is just the decoding part of the output from the parser.

function readNode(buffer, offset, nodes) {
    const node_type = buffer[offset];

    // Block.
    if (1 === node_type) {
        const name_length = buffer[offset + 1];
        const attributes_length = u8s_to_u32(buffer[offset + 2], buffer[offset + 3], buffer[offset + 4], buffer[offset + 5]);
        const number_of_children = buffer[offset + 6];

        let payload_offset = offset + 7;
        let next_payload_offset = payload_offset + name_length;

        const name = new TextDecoder().decode(buffer.slice(payload_offset, next_payload_offset));

        payload_offset = next_payload_offset;
        next_payload_offset += attributes_length;

        const attributes = JSON.parse(new TextDecoder().decode(buffer.slice(payload_offset, next_payload_offset)));

        payload_offset = next_payload_offset;
        let end_offset = payload_offset;

        const children = [];

        for (let i = 0; i < number_of_children; ++i) {
            const last_offset = readNode(buffer, payload_offset, children);

            payload_offset = end_offset = last_offset;
        }

        nodes.push(new Block(name, attributes, children));

        return end_offset;
    }
    // Phrase.
    else if (2 === node_type) {
        const phrase_length = u8s_to_u32(buffer[offset + 1], buffer[offset + 2], buffer[offset + 3], buffer[offset + 4]);
        const phrase_offset = offset + 5;
        const phrase = new TextDecoder().decode(buffer.slice(phrase_offset, phrase_offset + phrase_length));

        nodes.push(new Phrase(phrase));

        return phrase_offset + phrase_length;
    } else {
        console.error('unknown node type', node_type);
    }
}

Note that this code is pretty simple and easy for the Javascript virtual machine to optimise. It is also important to note that this is not the original code. The original version is a little more optimised here and there, but they are very close.

And that’s all! We have successfully read and decoded the output of the parser! We just need to write the Block and Phrase classes like this:

class Block {
    constructor(name, attributes, children) {
        this.name = name;
        this.attributes = attributes;
        this.children = children;
    }
}

class Phrase {
    constructor(phrase) {
        this.phrase = phrase;
    }
}

The final output will be an array of those objects. Easy!

WebAssembly 🚀 NodeJS

WASM to NodeJS

The differences between the Javascript version and the NodeJS version are few:

  • The Fetch API does not exist in NodeJS, so the WebAssembly binary has to be instantiated with a buffer directly, like this: WebAssembly.instantiate(fs.readFileSync(url), {}),
  • The TextEncoder and TextDecoder objects do not exist as global objects, they are in util.TextEncoder and util.TextDecoder.

In order to share the code between both environments, it is possible to write the boundary layer (the Javascript code we wrote) in a .mjs file, aka an ECMAScript Module. It allows one to write something like import { Gutenberg_Post_Parser } from './gutenberg_post_parser.mjs' for example (considering the whole code we wrote before is a class). On the browser side, the script must be loaded with <script type="module" src="…" />, and on the NodeJS side, node must run with the --experimental-modules flag. I can recommend the talk Please wait… loading: a tale of two loaders by Myles Borins at JSConf EU 2018 to understand the whole story about that.

The entire code lands here.

Conclusion

We have seen in detail how to write a real-world parser in Rust, how to compile it into a WebAssembly binary, and how to use it with Javascript and with NodeJS.

The parser can be used in a browser with regular Javascript code, or as a CLI with NodeJS, or on any platform NodeJS supports.

The Rust part for WebAssembly plus the Javascript part totals 313 lines of code. This is a tiny surface of code to review and to maintain compared to writing a Javascript parser from scratch.

Other arguments are safety and performance. Rust is memory safe, we know that. It is also performant, but is that still true for the WebAssembly target? The following table shows the benchmark results of the actual Javascript parser for the Gutenberg project (implemented with PEG.js), against this project: the Rust parser as a WebAssembly binary.

file                             Javascript parser (ms)   Rust parser as a WebAssembly binary (ms)   speedup
demo-post.html                   13.167                    0.252                                      × 52
shortcode-shortcomings.html      26.784                    0.271                                      × 98
redesigning-chrome-desktop.html  75.500                    0.918                                      × 82
web-at-maximum-fps.html          88.118                    0.901                                      × 98
early-adopting-the-future.html   201.011                   3.329                                      × 60
pygmalian-raw-html.html          311.416                   2.692                                      × 116
moby-dick-parsed.html            2,466.533                 25.14                                      × 98

The WebAssembly binary is on average 86 times faster than the actual Javascript implementation. The median speedup is 98. Some edge cases are very interesting, like moby-dick-parsed.html, which takes 2.5s with the Javascript parser against 25ms with WebAssembly.

So not only is it safer, but it is also faster than Javascript in this case. And it is only 300 lines of code.

Note that WebAssembly does not support SIMD yet: it is still a proposal. Rust is gradually adding support for it (see for example PR #549). It will dramatically improve performance!

We will see in the next episodes of this series that Rust can reach a lot of galaxies, and the more it travels, the more interesting it gets.

Thanks for reading!

From Rust to beyond: Prelude

At my work, I had an opportunity to start an experiment: Writing a single parser implementation in Rust for the new Gutenberg post format, bound to many platforms and environments.

The logo of the Gutenberg post parser project.

This series of posts is about those bindings, and explains how to send Rust beyond Earth, into many different galaxies.

The ship is currently flying into the Java galaxy; this series may continue as long as the ship does not crash and has enough resources to survive!

The Gutenberg post format

Let’s quickly introduce what Gutenberg is, and why a new post format. If you want an in-depth presentation, I highly recommend reading The Language of Gutenberg. Note that this is not required to understand the Gutenberg post format.

Gutenberg is the next WordPress editor. It is a little revolution on its own. The features it unlocks are very powerful.

The editor will create a new page- and post-building experience that makes writing rich posts effortless, and has “blocks” to make it easy what today might take shortcodes, custom HTML, or “mystery meat” embed discovery. — Matt Mullenweg

The format of a blog post was HTML. And it continues to be. However, another semantic layer is added through annotations. Annotations are written in comments and borrow the XML syntax, e.g.:

<!-- wp:ns/block-name {"attributes": "as JSON"} -->
    <p>phrase</p>
<!-- /wp:ns/block-name -->

The Gutenberg format provides 2 constructions: Block, and Phrase. The example above contains both: There is a block wrapping a phrase. A phrase is basically anything that is not a block. Let’s describe the example:

  • It starts with an annotation (<!-- … -->),
  • The wp: is mandatory to represent a Gutenberg block,
  • It is followed by a fully qualified block name, which is a pair of an optional namespace (here set to ns, defaulting to core) and a block name (here set to block-name), separated by a slash,
  • A block has optional attributes encoded as a JSON object (see RFC 7159, Section 4, Objects),
  • Finally, a block has optional children, i.e. a heterogeneous collection of blocks or phrases. In the example above, there is one child, the phrase <p>phrase</p>. The following example shows a block with no children:
<!-- wp:ns/block-name {"attributes": "as JSON"} /-->

The complete grammar can be found in the parser’s documentation.

Finally, the parser is used on the editor side, not on the rendering side. Once rendered, the blog post is a regular HTML file. Some blocks are dynamic though, but this is another topic.

The logic flow of the editor (How Little Blocks Work).

The grammar is relatively small. The challenge, however, is to be as performant and memory-efficient as possible on many platforms. Some posts can reach megabytes, and we don’t want the parser to be the bottleneck. Even though it is only used when creating the post state (cf. the schema above), we have measured several seconds to load some posts: time during which the user is blocked and waits, or sees an error. In other scenarios, we have hit the memory limits of the language’s virtual machines.

Hence this experimental project! The current parsers are written in JavaScript (with PEG.js) and in PHP (with phpegjs). This Rust project proposes a parser written in Rust, that can run in the JavaScript and in the PHP virtual machines, and on many other platforms. Let’s try to be very performant and memory efficient!

Why Rust?

That’s an excellent question! Thanks for asking. I can summarize my choice with a bullet list:

  • It is fast, and we need speed,
  • It is memory safe, and also memory efficient,
  • No garbage collector, which simplifies memory management across environments,
  • It can expose a C API (with Foreign Function Interface, FFI), which eases the integration into multiple environments,
  • It compiles to many targets,
  • Because I love it.

One of the goals of the experiment is to maintain a single implementation (maybe the future reference implementation) with multiple bindings.

The parser

The parser is written in Rust. It relies on the fabulous nom library.

nom
nom will happily take a byte out of your files 🙂.

The source code is available in the src/ directory in the repository. It is very small and fun to read.

The parser produces an Abstract Syntax Tree (AST) of the grammar, where nodes of the tree are defined as:

pub enum Node<'a> {
    Block {
        name: (Input<'a>, Input<'a>),
        attributes: Option<Input<'a>>,
        children: Vec<Node<'a>>
    },
    Phrase(Input<'a>)
}

That’s all! We find again the block name, the attributes and the children, and the phrase. Block children are defined as a collection of nodes; this is recursive. Input<'a> is defined as &'a [u8], i.e. a slice of bytes.
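
In code, that is presumably just a type alias along these lines (a sketch based on the definition above, not copied from the repository):

// The parser input: a borrowed slice of bytes.
pub type Input<'a> = &'a [u8];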

The main parser entry is the root function. It represents the axiom of the grammar, and is defined as:

pub fn root(
    input: Input
) -> Result<(Input, Vec<ast::Node>), nom::Err<Input>>;

So the parser returns a collection of nodes in the best case. Here is a simple example:

use gutenberg_post_parser::{root, ast::Node};

let input = &b"<!-- wp:foo {\"bar\": true} /-->"[..];
let output = Ok(
    (
        // The remaining data.
        &b""[..],

        // The Abstract Syntax Tree.
        vec![
            Node::Block {
                name: (&b"core"[..], &b"foo"[..]),
                attributes: Some(&b"{\"bar\": true}"[..]),
                children: vec![]
            }
        ]
    )
);

assert_eq!(root(input), output);

The root function and the AST will be the items we are going to use and manipulate in the bindings. The internal items of the parser will stay private.

Bindings


From now on, our goal is to expose the root function and the Node enum on different platforms and in different environments. Ready?

3… 2… 1… lift-off!

How Automattic (WordPress.com & co.) partly moved away from PHPUnit to atoum?

Hello fellow developers and testers,

A few months ago at Automattic, my team and I started a new project: having better tests for the payment system. The payment system is used by all the services at Automattic, i.e. WordPress, VaultPress, Jetpack, Akismet, PollDaddy etc. It’s a big challenge! Cherry on the cake: our experiment could define the future of the testing practices for the entire company. No pressure.

This post is a summary of what has been accomplished so far, the achievements, the failures, and the future, focused on manual tests. As the title of this post suggests, we are going to talk about PHPUnit and atoum, which are two PHP test frameworks. This is not a PHPUnit vs. atoum fight. These are observations made for our software, in our context, with our requirements, and our expectations. I think the discussion can be useful for many projects outside Automattic. I would like to apologize in advance if some parts sound too abstract; I hope you understand that I can’t reveal any details about the payment system for obvious reasons.

Where we were, and where to go

For historical reasons, WordPress, VaultPress, Jetpack & siblings use PHPUnit for server-side manual tests. There are unit, integration, and system manual tests. There are also end-to-end tests or benchmarks, but we are not interested in them now. When those products were built, PHPUnit was the main test framework in town. Since then, the test landscape has considerably changed in PHP. New competitors, like atoum or Behat, have a good position in the game.

Those tests have existed for many years. Some of them grew organically. PHPUnit does not require any form of structure, which is, despite being questionable in my opinion, a reason for its success. Not requiring the code to be well-designed in order to be tested is convenient, but too much freedom on the test side comes with a long-term cost if there is not enough attention.

Our situation is the following. The code is complex for justified reasons, and its testability is sometimes lessened. Testing across many services is indubitably difficult. Some parts of the code are really old, mixed with others that are new, shiny, and well-done. In this context, it is really difficult to change something, especially moving to another test framework. The amount of work it represents is colossal. No new test framework is worth the price of this huge refactoring. But maybe the new test frameworks can help us to better test our code?

I’m a long-term contributor to atoum (top 3 contributors). And at the time of writing, I’m a core member. You have to believe me when I say that, at each step of the discussions or the processes, I have been neutral, arguing in favor of or against atoum. The idea to switch to atoum partly came from me actually, but my knowledge about atoum is definitely a plus. I am in a good position to know the pros and the cons of the tool, and I’m perfectly aware of how it could solve issues we have.

So after many debates and discussions, we decided to try to move to atoum. A survey and a meeting were scheduled 2 months later to decide whether we should continue or not. Spoiler: We will partly continue with it.

Our needs and requirements

Our code is difficult to test. In other words, the testability is low for some parts of the code. atoum has features to help increase the testability. I will try to summarize those features in the following short sections.

atoum/phpunit-extension

As I said, it’s not possible to rewrite/migrate all the existing tests. This is a colossal effort with a non-negligible cost. Then, enter atoum/phpunit-extension.

As far as I know, atoum is the only PHP framework that is able to run tests that have been written for another framework. The atoum/phpunit-extension does exactly that. It runs tests written with the PHPUnit API with the atoum engines. This is fabulous! PHPUnit is not required at all. With this extension, we have been able to run our “legacy” (aka PHPUnit) tests with atoum. The following scenarios can be fulfilled:

  • Existing test suites written with the PHPUnit API can be run seamlessly by atoum, no need to rewrite them,
  • Of course, new test suites are written with the atoum API,
  • In case of a test suite migration from PHPUnit to atoum, there are two solutions:
    1. Rewrite the test suite entirely from scratch by logically using the atoum API, or
    2. Only change the parent class from PHPUnit\Framework\TestCase to atoum\phpunit\test, and suddenly it is possible to use both APIs at the same time (and thus migrate one test case after the other, for instance).

This is a very valuable tool for an adventure like ours.

atoum/phpunit-extension is not perfect though. Some PHPUnit APIs are missing. And while the test verdict is strictly the same, error messages can be different, some PHPUnit extensions may not work properly etc. Fortunately, our usage of PHPUnit is pretty raw: no extensions except home-made ones, few hacks… Everything went well. We have also been able to contribute easily to the extension.

Mock engines (plural)

atoum comes with 3 mock engines:

  • Class-like mock engine for classes and interfaces,
  • Function mock engine,
  • Constant mock engine.

Being able to mock global functions or global constants is an important feature for us. It suddenly increases the testability of our code! The following example is fictional, but it’s a good illustration. WordPress is full of global functions, but it is possible to mock them with atoum like this:

public function test_foo()
{
    $this->function->get_userdata = (object) [
        'user_login' => …,
        'user_pass' => …,
        …
    ];
}

In one line of code, it was possible to mock the get_userdata function.

Runner engines

Being able to isolate test execution is a necessity to avoid flaky tests, and to increase the trust we put in the test verdicts. atoum natively comes with 3 runner engines:

  • Inline, one test case after another in the same process,
  • Isolate, one test case after another but each time in a new process (full isolation),
  • Concurrent, like isolate but tests run concurrently (“at the same time”).

I’m not saying PHPUnit doesn’t have those features. It is possible to run tests in a different process each time (with the isolate engine), but test execution time blows up, and the isolation is not strict. We don’t use it. The concurrent runner engine in atoum tends to bring the execution time close to the inline engine, while still ensuring a strict isolation.

Fun fact: By using atoum and the atoum/phpunit-extension, we are able to run PHPUnit tests concurrently with a strict isolation!

Code coverage reports

At the time of writing, PHPUnit is not able to generate code coverage reports containing Branch or Path Coverage Criteria data. atoum supports them natively with the atoum/reports-extension (including nice graphs, see the demonstration). And we need that data.

The difficulties

On paper, most of the pain points sound addressable. It was time to experiment.

Integration to the Continuous Integration server

Our CI does not natively support standard test execution report formats. Thus we had to create the atoum/teamcity-extension. Learn more by reading a blog post I wrote recently. The TeamCity support is native inside PHPUnit (see the --log-teamcity option).

Bootstrap test environments

Our bootstrap files are… challenging. It’s expected though. Setting up a functional test environment for software like WordPress.com is not a task one can accomplish in 2 minutes. Fortunately, we have been able to re-use most of the PHPUnit parts.

Today, our unit tests run in complete isolation and concurrently. Our integration and system tests run in complete isolation but not concurrently, due to MySQL limitations. We have solutions, but time needs to be invested.

Generally, even if it works now, it took time to re-organize the bootstrap so that some parts can be shared between the test runners (because we didn’t switch the whole company to atoum yet, it was an experiment).

Documentation and help

Here is an interesting paradox. The majority of the team recognized that atoum’s documentation is better than PHPUnit’s, even if some parts must be rewritten or reworked. But developers already know PHPUnit, so they don’t look at the documentation. If they have to, they will instead find their answers on StackOverflow, or by talking to someone else in the company, but not by checking the official documentation. atoum does not have many StackOverflow threads, and few people are atoum users within the company.

What we have also observed is that when people create a new test, it’s a copy-paste from an existing one. Let’s admit this is a common and natural practice. When a difficulty is met, it’s legitimate to look somewhere else in the test repository to check whether a similar situation has been resolved. In our context, that information was a little lacking. We tried to write more and more tests, but not fast enough. It should not be an issue if you have time, but in our context, we unfortunately didn’t have this time. The team faced many challenges in the same period, and the tests we are building are not simple Hello, World!s as you might think, which increases the effort.

To be honest, this was not the biggest difficulty, but still, it is important to notice.

Concurrent integration test executions

Due to some MySQL limitations combined with the complexity of our code, we are not able to run integration (and system) tests concurrently yet. Therefore it takes time to run them, probably too much in our development environments. Even if atoum has friendly options to reduce the debug loop (e.g. see the --loop option), the execution is still slow. The problem can be solved but it requires time, and deep modifications of our code.

Note that with our PHPUnit tests, no isolation is used. This is wrong. And thus we have less trust in the test verdict than with atoum. Almost everyone in the team prefers to have slow test execution but isolation, rather than fast test execution but no confidence in the test verdict. So that’s partly a difficulty. It’s a mix of a positive feature and a needle in the foot, and a needle we can live with. atoum is not responsible for this latency: the state of our code is.

The results

First, let’s start with the positive impacts:

  • In 2 months, we have observed that the testability of our code has been increased by using atoum,
  • We have been able to find bugs in our code that were not detected by PHPUnit, mostly because atoum checks the type of the data,
  • We have been able to migrate “legacy tests” (aka PHPUnit tests) to atoum by just moving the files from one directory to another: What a smooth migration!
  • The trust we put in our test verdict has increased thanks to a strict test execution isolation.

Now, the negative impacts:

  • Even if the testability has been increased, it’s not enough. Right now, we are looking at refactoring our code. Introducing atoum right now was probably too early. Let’s refactor first, then use a better test toolchain later when things are cleaner,
  • Moving the whole company at once is hard. There are thousands of manual tests. The atoum/phpunit-extension is not magical. We have to come up with more solid results, stuff that blows minds. It is necessary to set the institutional inertia in motion. For instance, not being able to run integration and system tests concurrently slows down the builds on the CI; it increases the trust we put in the test verdict, but this latency is not acceptable at the company scale,
  • All the issues we faced can be addressed, but it needs time. The experiment time frame was 2 months. We need 1 or 2 more months to solve the majority of the remaining issues. Note that I was kind of in charge of this project, but not full time.

We stopped using atoum for manual tests. It’s likely to be a pause though. The experiment has shown we need to refactor and clean our code; then there will be a good chance for atoum to come back. The experiment has also shown how to increase the testability of our code: not everything can be addressed by using another test framework, even if it helps a lot. We can focus on those points specifically, because we know where they are now. Finally, I reckon it has helped move the test infrastructure forward inside Automattic by showing that something else exists, and that we can go further.

I said we stopped using atoum “for manual tests”. Yes. Because we also have automatically generated tests. The experiment was not only about switching to atoum. Many other aspects of the experiment are still running! For instance, Kitab is used for our code documentation. Kitab is able to (i) render the documentation, and (ii) test the examples written inside the documentation. That way the documentation is ensured to be always up-to-date and working. Kitab generates tests for, and executes tests with, atoum. It was easy to set up: we just had to use the existing test bootstraps designed for atoum. We also have another tool to compile HTTP API Blueprint specifications into executable tests. So far, everyone is happy with those tools, no need to go back, everything is automat(t)ic. Other tools are likely to be introduced in the future to automatically generate tests. I want to detail this particular topic in another blog post.

Conclusion

Moving to another test framework is a huge decision with many factors. The fact that atoum has atoum/phpunit-extension is a time saver. Nonetheless, a new test framework does not mean it will fix all the testability issues of the code. The benefits of the new test framework must largely outweigh the costs. In our current context, it was not the case. atoum solves issues that are not our priorities. So yes, atoum can help us to solve important issues, but since these issues are not priorities, the move to atoum was too early. During the project, we gained new automatic test tools, like Kitab. The experiment is not a failure. Will we retry atoum? It’s very likely. When? I hope in a year.

atoum supports TeamCity

atoum is a popular PHP test framework. TeamCity is a Continuous Integration and Continuous Delivery software developed by JetBrains. Although atoum supports many industry standards for reporting test execution verdicts, TeamCity uses its own non-standard report format, and thus atoum was not compatible with TeamCity… until now.


The atoum/teamcity-extension provides TeamCity support inside atoum. When executing tests, the reported verdicts are understandable by TeamCity, and activate all its UI features.

Install

If you have Composer, just run:

$ composer require atoum/teamcity-extension '~1.0'

From this point, you need to enable the extension in your .atoum.php configuration file. The following example enables the extension for every test execution:

$extension = new atoum\teamcity\extension($script);
$extension->addToRunner($runner);

The following example enables the extension only within a TeamCity environment:

$extension = new atoum\teamcity\extension($script);
$extension->addToRunnerWithinTeamCityEnvironment($runner);

The latter installation is recommended. That’s it 🙂.

Glance

The default CLI report looks like this:

Default atoum CLI report

The TeamCity report looks like this in your terminal (note the TEAMCITY_VERSION variable as a way to emulate a TeamCity environment):

TeamCity report inside the terminal

This is less easy to read. However, in the TeamCity UI, we get the following result:

TeamCity running atoum

We are using it at Automattic. Hope it is useful for someone else!

If you find any bugs, or would like any other features, please use Github at the following repository: https://github.com/Hywan/atoum-teamcity-extension/.

Faster find algorithms in nom

Tagua VM is an experimental PHP virtual machine written in Rust and LLVM. It is composed of a set of libraries. One of them that keeps me busy these days is tagua-parser. It contains the lexical and syntactic analysers for the PHP language, in addition to the AST (Abstract Syntax Tree). If you would like to know more about this project, you can watch this talk I gave at PHPTour last week: Tagua VM, a safe PHP virtual machine.

The library tagua-parser is built with parser combinators. Instead of having a classical grammar, compiled to a parser, we write pure functions acting as small parsers. We then combine them together. This post does not explain why this is a sane approach in our context, but keep in mind this is much easier to test, to maintain, and to optimise.

Because this project is complex enough, we are delegating the parser combinator implementation to nom.

nom is a parser combinators library written in Rust. Its goal is to provide tools to build safe parsers without compromising the speed or memory consumption. To that end, it uses extensively Rust’s strong typing, zero copy parsing, push streaming, pull streaming, and provides macros and traits to abstract most of the error prone plumbing.

Recently, I have been working on optimisations in the FindToken and FindSubstring traits from nom itself. These traits provide methods to find a token (i.e. a lexeme), and to find a substring (crazy naming). However, this is not totally accurate: FindToken expects to find a single item (if implemented for u8, it will look for a u8 in a &[u8]), and FindSubstring really is about finding a substring, so a token of any length.

It appeared that these methods can be optimised in some cases. Both default implementations use Rust iterators: a regular iterator for FindToken, and a window iterator for FindSubstring, i.e. an iterator over overlapping subslices of a given length. We have benchmarked big PHP comments, which are analysed by parsers actively using these two trait implementations.

Here are the results, before and after our optimisations:

test …::bench_span ... bench:      73,433 ns/iter (+/- 3,869)
test …::bench_span ... bench:      15,986 ns/iter (+/- 3,068)

A boost of 78%! Nice!

The pull request has been merged today, thank you Geoffroy Couprie! The new algorithms heavily rely on the memchr crate. So all the credit should really go to Andrew Gallant! This crate provides a safe interface to libc’s memchr and memrchr. It also provides fallback implementations when either function is unavailable.
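
To give a rough idea of what the crate offers, here is a minimal sketch (illustrative only, not the code that was merged into nom) contrasting a plain iterator search with a memchr-based one:

extern crate memchr;

// Searching for a single byte with a regular iterator.
fn find_with_iterator(haystack: &[u8], needle: u8) -> Option<usize> {
    haystack.iter().position(|&byte| byte == needle)
}

// The same search delegated to the memchr crate, which dispatches to an
// optimised implementation.
fn find_with_memchr(haystack: &[u8], needle: u8) -> Option<usize> {
    memchr::memchr(needle, haystack)
}

fn main() {
    let input = &b"/* a big PHP comment ... */ $x = 42;"[..];

    // Both searches find the same position; only the speed differs.
    assert_eq!(find_with_iterator(input, b'$'), find_with_memchr(input, b'$'));
}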

The new algorithms are only implemented for &[u8] though. Fortunately, the implementation for &str falls back to the former.

This is a small contribution, but it brings a very nice boost. I hope it will benefit other projects!

I am also blowing the dust off of Algorithms on Strings, by M. Crochemore, C. Hancart, and T. Lecroq. I am pretty sure it should be useful for nom and tagua-parser. If you haven’t read this book yet, I can only encourage you to do so!

sabre/katana

sabre/katana's logo
Project’s logo.

What is it?

sabre/katana is a contact, calendar, task list and file server. What does that mean? Nowadays, you probably have multiple devices (PC, phones, tablets, TVs…). If you would like to get your address books, calendars, task lists and files synced between all these devices from everywhere, you need a server. All your devices are then considered clients.

But there is an issue with the server. Most of the time, you might choose Google or maybe Apple, but one may wonder: can we trust these servers? Can we give them our private data, like all our contacts, our calendars, all our photos…? What if you are a company or an association and you have sensitive data that are really private or strategic? So, can you still trust them? Where are the data stored? Who can look at these data? More and more, there is a huge need for “personal” servers.

Moreover, servers like Google or Apple are often closed: you reach your data with specific clients and they are not available on all platforms. This is for strategic reasons of course. But with sabre/katana, you are not limited. See the schema above: Firefox OS can talk to iOS or Android at the same time.

sabre/katana is this kind of server. You can install it on your machine and manage users in a minute. Each user will have a collection of address books, calendars, task lists and files. This server can talk to a loong list of devices, mainly thanks to a scrupulous respect of industry standards:

  • Mac OS X:
    • OS X 10.10 (Yosemite),
    • OS X 10.9 (Mavericks),
    • OS X 10.8 (Mountain Lion),
    • OS X 10.7 (Lion),
    • OS X 10.6 (Snow Leopard),
    • OS X 10.5 (Leopard),
    • BusyCal,
    • BusyContacts,
    • Fantastical,
    • Rainlendar,
    • ReminderFox,
    • SoHo Organizer,
    • Spotlife,
    • Thunderbird,
  • Windows:
    • eM Client,
    • Microsoft Outlook 2013,
    • Microsoft Outlook 2010,
    • Microsoft Outlook 2007,
    • Microsoft Outlook with Bynari WebDAV Collaborator,
    • Microsoft Outlook with iCal4OL,
    • Rainlendar,
    • ReminderFox,
    • Thunderbird,
  • Linux:
    • Evolution,
    • Rainlendar,
    • ReminderFox,
    • Thunderbird,
  • Mobile:
    • Android,
    • BlackBerry 10,
    • BlackBerry PlayBook,
    • Firefox OS,
    • iOS 8,
    • iOS 7,
    • iOS 6,
    • iOS 5,
    • iOS 4,
    • iOS 3,
    • Nokia N9,
    • Sailfish.

Did you find your device in this list? Probably yes 😉.

sabre/katana sits in the middle of all your devices and syncs all your data. Of course, it is free and open source. Go check the source!

List of features

Here is a non-exhaustive list of features supported by sabre/katana. Depending on whether you are a user or a developer, the features that might interest you are radically different. I decided to show you a list from the user’s point of view. If you would like to get a list from the developer’s point of view, please see this exhaustive list of supported RFCs for more details.

Contacts

All usual fields are supported, like phone numbers, email addresses, URLs, birthday, ringtone, texttone, related names, postal addresses, notes, HD photos etc. Of course, groups of cards are also supported.

My card on Mac OS X
My card inside the native Contact application of Mac OS X.
My card on Firefox OS
My card inside the native Contact application of Firefox OS.

My photo is not in HD, I really have to update it!

Cards can be encoded into several formats. The most usual format is VCF. sabre/katana allows you to download the whole address book of a user as a single VCF file. You can also create, update and delete address books.

Calendars

A calendar is just a set of events. Each event has several properties, such as a title, a location, a start date, an end date, some notes, URLs, alarms etc. sabre/katana also supports recurring events (“each last Monday of the month, at 11am…”), in addition to scheduling (see below).

My calendars on Mac OS X
My calendars inside the native Calendar application of Mac OS X.
My calendars on Firefox OS
My calendars inside the native Calendar application of Firefox OS.

A few words about calendar scheduling. Let’s say you are organizing an event, like New release (we always enjoy release day!). You would like to invite several people but you don’t know whether they can be present or not. In your event, all you have to do is to add attendees. How are they going to be notified about this event? Two situations:

  1. Either attendees are registered on your sabre/katana server and they will receive an invite inside their calendar application (we call this iTIP),
  2. Or they are not registered on your server and they will receive an email with the event as an attached file (we call this iMIP). All they have to do is to open this event in their calendar application.
Typical mail to invite an attendee to an event
Invite an attendee by email because she is not registered on your sabre/katana server.

Notice the gorgeous map embedded inside the email!

Once they have received the event, they can accept it, decline it, or answer “maybe” (they will try to attend).

Receive an invite to an event
Receive an invite to an event. Here: Gordon is inviting Hywan, who has three choices: accept, decline, or maybe.
Status of all attendees
Hywan has accepted the event. Here is what the event looks like. Hywan can see the response of each attendee.
Notification from attendees
Gordon is even notified that Hywan has accepted the event.

Of course, attendees will be notified too if the event has been moved, canceled, refreshed etc.

Calendars can be encoded into several formats. The most usual format is ICS. sabre/katana allows you to download the whole calendar of a user as a single ICS file. You can also create, update and delete calendars.

Task lists

A task list is exactly like a calendar (from a programming point of view). Instead of containing event objects, it contains todo objects.

sabre/katana supports groups of tasks, reminders, progression, etc.

My task lists on Mac OS X
My task lists inside the native Reminder application of Mac OS X.

Just like calendars, task lists can be encoded into several formats, including ICS. sabre/katana allows you to download the whole task list of a user as a single ICS file. You can also create, update and delete task lists.

Files

Finally, sabre/katana creates a home collection per user: A personal directory that can contain files and directories and… is synced between all your devices (as usual 😄).

sabre/katana also creates a special directory called public/, which is a public directory. Every file and directory stored inside it is accessible to anyone who has the correct link. No listing is shown, to protect your public data.

Just like the contact, calendar and task list applications, you need a client application to connect to your home collection on sabre/katana.

Connect to a server in Mac OS X
Connect to a server with the Finder application of Mac OS X.

Then, your public directory on sabre/katana will be a regular directory like any other.

List of my files
List of my files, right here in the Finder application of Mac OS X.

sabre/katana is able to store any kind of file. Yes, any kind: it’s just files. However, it white-lists the kinds of files that can be shown in the browser. Only images, audio, videos, texts, PDF and some vendor formats (like Microsoft Office) are considered safe (for the server). This way, associations can share music, videos or images, companies can share PDF or Microsoft Word documents, etc. In the future, sabre/katana might white-list more formats. If a format is not white-listed, the file will be forced to download.

How is sabre/katana built?

sabre/katana is based on two big and solid projects:

  1. sabre/dav,
  2. Hoa.

sabre/dav is one of the most powerful CardDAV, CalDAV and WebDAV frameworks on the planet. Trusted by the likes of Atmail, Box, fruux and ownCloud, it powers millions of users world-wide! It is written in PHP and is open source.

Hoa is a modular, extensible and structured set of PHP libraries. Fun fact: Also open source, this project is trusted by ownCloud too, in addition to Mozilla, joliCode, etc. Recently, this project has recorded more than 600,000 downloads and the community is about to reach 1000 people.

sabre/katana is thus a program based on sabre/dav for the DAV part and on Hoa for everything else, like the logic code inside sabre/dav’s plugins. The result is a ready-to-use server with a nice interface for the administration.

To ensure code quality, we use atoum, a popular and modern test framework for PHP. So far, sabre/katana has more than 1000 assertions.

Conclusion

sabre/katana is a server for contacts, calendars, task lists and files. Everything is synced, all the time and everywhere. It connects perfectly to a lot of devices on the market. Several features we need and use daily have been presented. This is the easiest and a secure way to host your own private data.

Go download it!

Control the terminal, the right way

Nowadays, there are plenty of terminal emulators in the wild. Each one has a specific way to handle controls. How many colours does it support? How do we control the style of a character? How do we control more than the style, like the cursor or the window? In this article, we are going to explain and show in action the right way to control your terminal with a portable and easy-to-maintain API. We are going to talk about stat, tput, terminfo, Hoa\Console… but do not be afraid, it’s easy and fun!

Introduction

Terminals. They are ancient interfaces, yet still not old-fashioned. They are fast, efficient, work remotely with low bandwidth, are secure and very simple to use.

A terminal is a canvas composed of columns and lines. Only one character fits at a given position. Depending on the terminal, some features are enabled; for instance, a character might be stylized with a colour, a decoration, a weight, etc. Let’s consider the first one: colours. A colour belongs to a palette, which contains either 2, 8, 256 or more colours. One may wonder:

  • How many colours does a terminal support?
  • How to control the style of a character?
  • How to control more than style, like the cursor or the window?

Well, this article is going to explain how a terminal works and how we interact with it. We are going to talk about terminal capabilities, terminal information (stored in databases) and Hoa\Console, a PHP library that provides advanced terminal controls.

The basis of a terminal

A terminal, or a console, is an interface that allows us to interact with the computer. This interface is textual. Like a graphical interface, there are inputs: The keyboard and the mouse, and outputs: The screen or a file (a real file, a socket, a FIFO, something else…).

There are tons of terminals; you probably know the most famous ones.

Whatever terminal you use, inputs are handled by programs (or processes) and outputs are produced by them. We said outputs can be the screen or a file. Actually, everything is a file, so the screen is also a file. However, the user is able to use redirections to choose where the outputs must go.

Let’s consider the echo program, which prints all its options/arguments on its output. Thus, in the following example, foobar is printed on the screen:

$ echo 'foobar'

And in the following example, foobar is redirected to a file called log:

$ echo 'foobar' > log

We are also able to redirect the output to another program, like wc that counts stuff:

$ echo 'foobar' | wc -c
7

Now we know there are 7 characters in foobar… no! echo automatically adds a new-line (\n) after each line; so:

$ echo -n 'foobar' | wc -c
6

This is more correct!

Detecting type of pipes

Inputs and outputs are called pipes. Yes, trivial, this is nothing more than basic pipes!

Pipes are like a game, see Mario 😉!

There are 3 standard pipes:

  • STDIN, standing for the standard input pipe,
  • STDOUT, standing for the standard output pipe and
  • STDERR, standing for the standard error pipe (also an output one).

If the output is attached to the screen, we say this is a “direct output”. Why is it important? Because if we stylize a text, this is only for the screen, not for a file. A file should receive regular text, not all the decorations and styles.

Fortunately, the Hoa\Console\Console class provides the isDirect, isPipe and isRedirection static methods to know whether the pipe is respectively direct, a pipe or a redirection (damn naming…!). Thus, let Type.php be the following program:

echo 'is direct:      ';
var_dump(Hoa\Console\Console::isDirect(STDOUT));

echo 'is pipe:        ';
var_dump(Hoa\Console\Console::isPipe(STDOUT));

echo 'is redirection: ';
var_dump(Hoa\Console\Console::isRedirection(STDOUT));

Now, let’s test our program:

$ php Type.php
is direct:      bool(true)
is pipe:        bool(false)
is redirection: bool(false)

$ php Type.php | xargs -I@ echo @
is direct:      bool(false)
is pipe:        bool(true)
is redirection: bool(false)

$ php Type.php > /tmp/foo; cat !!$
is direct:      bool(false)
is pipe:        bool(false)
is redirection: bool(true)

The first execution is very classic. STDOUT, the standard output, is direct. The second execution redirects the output to another program, then STDOUT is of kind pipe. Finally, the last execution redirects the output to a file called /tmp/foo, so STDOUT is a redirection.

How does it work? We use fstat to read the mode of the file. The underlying fstat implementation is defined in C, so let’s take a look at the documentation of fstat(2). stat is a C structure that looks like:

struct stat {
    dev_t    st_dev;              /* device inode resides on             */
    ino_t    st_ino;              /* inode's number                      */
    mode_t   st_mode;             /* inode protection mode               */
    nlink_t  st_nlink;            /* number of hard links to the file    */
    uid_t    st_uid;              /* user-id of owner                    */
    gid_t    st_gid;              /* group-id of owner                   */
    dev_t    st_rdev;             /* device type, for special file inode */
    struct timespec st_atimespec; /* time of last access                 */
    struct timespec st_mtimespec; /* time of last data modification      */
    struct timespec st_ctimespec; /* time of last file status change     */
    off_t    st_size;             /* file size, in bytes                 */
    quad_t   st_blocks;           /* blocks allocated for file           */
    u_long   st_blksize;          /* optimal file sys I/O ops blocksize  */
    u_long   st_flags;            /* user defined flags for file         */
    u_long   st_gen;              /* file generation number              */
};

The value of mode returned by the PHP fstat function is equal to st_mode in this structure. And st_mode has the following bits:

#define S_IFMT   0170000 /* type of file mask                */
#define S_IFIFO  0010000 /* named pipe (fifo)                */
#define S_IFCHR  0020000 /* character special                */
#define S_IFDIR  0040000 /* directory                        */
#define S_IFBLK  0060000 /* block special                    */
#define S_IFREG  0100000 /* regular                          */
#define S_IFLNK  0120000 /* symbolic link                    */
#define S_IFSOCK 0140000 /* socket                           */
#define S_IFWHT  0160000 /* whiteout                         */
#define S_ISUID  0004000 /* set user id on execution         */
#define S_ISGID  0002000 /* set group id on execution        */
#define S_ISVTX  0001000 /* save swapped text even after use */
#define S_IRWXU  0000700 /* RWX mask for owner               */
#define S_IRUSR  0000400 /* read permission, owner           */
#define S_IWUSR  0000200 /* write permission, owner          */
#define S_IXUSR  0000100 /* execute/search permission, owner */
#define S_IRWXG  0000070 /* RWX mask for group               */
#define S_IRGRP  0000040 /* read permission, group           */
#define S_IWGRP  0000020 /* write permission, group          */
#define S_IXGRP  0000010 /* execute/search permission, group */
#define S_IRWXO  0000007 /* RWX mask for other               */
#define S_IROTH  0000004 /* read permission, other           */
#define S_IWOTH  0000002 /* write permission, other          */
#define S_IXOTH  0000001 /* execute/search permission, other */

Awesome, we have everything we need! We mask mode with S_IFMT to get the file type. Then we just have to check whether it is a named pipe S_IFIFO, a character special file S_IFCHR, etc. Concretely:

  • isDirect checks that the mode is equal to S_IFCHR, meaning it is attached to the screen (in our case),
  • isPipe checks that the mode is equal to S_IFIFO: This is a special file that behaves like a FIFO queue (see the documentation of mkfifo(1)); everything that is written is read just after, and the reading order is defined by the writing order (first-in, first-out!),
  • isRedirection checks that the mode is equal to S_IFREG, S_IFDIR, S_IFLNK, S_IFSOCK or S_IFBLK; in other words: All kinds of files to which a redirection can apply. Why? Because the STDOUT (or another STD* pipe) of the current process is defined as a file pointer to the redirection destination, and it can only be a file, a directory, a link, a socket or a block file.

I encourage you to read the implementation of the Hoa\Console\Console::getMode method.
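
To make this concrete, here is a rough sketch of the idea, not the actual Hoa\Console\Console implementation: we read the mode of STDOUT with fstat, mask it with S_IFMT and compare the result against the file-type constants listed above.

$stat = fstat(STDOUT);
$type = $stat['mode'] & 0170000; // S_IFMT: keep only the file-type bits.

if (0020000 === $type) {         // S_IFCHR: a character special file, i.e. the screen.
    echo 'direct', "\n";
} elseif (0010000 === $type) {   // S_IFIFO: a named pipe.
    echo 'pipe', "\n";
} else {                         // Regular file, directory, link, socket, block…
    echo 'redirection', "\n";
}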

So yes, this is useful to enable styles on text, but also to define the default verbosity level. For instance, if a program outputs the result of a computation with some explanations around it, the highest verbosity level would output everything (the result and the explanations) while the lowest level would output only the result. Let’s try with the toUpperCase.php program:

// Be verbose only when STDOUT is directly attached to the screen.
$verbose = Hoa\Console\Console::isDirect(STDOUT);
$string  = $argv[1];
$result  = (new Hoa\String\String($string))->toUpperCase();

if (true === $verbose) {
    echo $string, ' becomes ', $result, ' in upper case!', "\n";
} else {
    echo $result, "\n";
}

Then, let’s execute this program:

$ php toUpperCase.php 'Hello world!'
Hello world! becomes HELLO WORLD! in upper case!

And now, let’s execute this program with a pipe:

$ php toUpperCase.php 'Hello world!' | xargs -I@ echo @
HELLO WORLD!

Useful and very simple, isn’t it?

Terminal capabilities

We can control the terminal with the inputs, like the keyboard, but we can also control the outputs. How? With the text itself. Actually, an output does not contain only text: it also includes control functions. It’s like HTML: Around a text, you can have an element specifying that the text is a link. It’s exactly the same for terminals! To specify that a text must be in red, we must add a control function around it.

Fortunately, these control functions have been standardized in the ECMA-48 document: Control Functions for Coded Character Set. However, not all terminals implement the whole standard, and for historical reasons, some terminals use slightly different control functions. Moreover, some information does not belong to this standard (because it is out of its scope), like: How many colours does the terminal support? Or does the terminal support the meta key?

Consequently, each terminal has a list of capabilities. This list is split into 3 categories:

  • boolean capabilities,
  • number capabilities,
  • string capabilities.

For instance:

  • the “does the terminal support the meta key” capability is a boolean capability called meta_key, whose value is true or false,
  • the “number of colours supported by the terminal” is a… number capability called max_colors, whose value can be 2, 8, 256 or more,
  • the “clear screen” control function is a string capability called clear_screen, whose value might be \e[H\e[2J,
  • the “move the cursor one column to the right” control function is also a string capability, called cursor_right, whose value might be \e[C.

All the capabilities can be found in the documentation of terminfo(5) or in the documentation of xcurses. I encourage you to follow these links and see how rich the terminal capabilities are!

Terminal information

Terminal capabilities are stored as information in databases. Where are these databases located? In files with a binary format. Common locations are:

  • /usr/share/terminfo,
  • /usr/share/lib/terminfo,
  • /lib/terminfo,
  • /usr/lib/terminfo,
  • /usr/local/share/terminfo,
  • /usr/local/share/lib/terminfo,
  • etc.
  • or the TERMINFO or TERMINFO_DIRS environment variables.

Inside these directories, we have a tree of the form xx/name, where xx is the hexadecimal ASCII value of the first letter of the terminal name name, or n/name, where n is the first letter of the terminal name. The terminal name is stored in the TERM environment variable. For instance, on my computer:

$ echo $TERM
xterm-256color
$ file /usr/share/terminfo/78/xterm-256color
/usr/share/terminfo/78/xterm-256color: Compiled terminfo entry
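
Just to illustrate this layout, here is a small sketch, not the actual Hoa implementation, that tries to locate the terminfo file of the current terminal by hand (the list of directories is the one given above, shortened):

$term = getenv('TERM') ?: 'xterm';

foreach (['/usr/share/terminfo', '/lib/terminfo', '/usr/lib/terminfo'] as $directory) {
    // Either the hexadecimal ASCII value of the first letter (78 for “x”)…
    $hexadecimal = $directory . '/' . dechex(ord($term[0])) . '/' . $term;

    // … or the first letter itself (x/xterm-256color).
    $letter = $directory . '/' . $term[0] . '/' . $term;

    if (true === file_exists($hexadecimal)) {
        echo $hexadecimal, "\n";

        break;
    }

    if (true === file_exists($letter)) {
        echo $letter, "\n";

        break;
    }
}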

We can use the Hoa\Console\Tput class to retrieve this information. The getTerminfo static method returns the path of the terminal information file. The getTerm static method returns the terminal name. Finally, the whole class allows us to parse a terminal information database (by default, it will use the file returned by getTerminfo). For instance:

$tput = new Hoa\Console\Tput();
var_dump($tput->count('max_colors'));

/**
 * Will output:
 *     int(256)
 */

On my computer, with xterm-256color, I have 256 colours, as expected. If we parse the information of xterm instead of xterm-256color, we get:

$tput = new Hoa\Console\Tput(Hoa\Console\Tput::getTerminfo('xterm'));
var_dump($tput->count('max_colors'));

/**
 * Will output:
 *     int(8)
 */

The power in your hand: Control the cursor

Let’s summarize. We are able to parse and know all the capabilities of a specific terminal (including the current user’s one). If we would like a powerful terminal API, we need to control the basics, like the cursor.

Remember: We said that the terminal is a canvas of columns and lines. The cursor is like a pen. We can move it and write something. We are going to see (partly) how the Hoa\Console\Cursor class works.

I like to move it!

The moveTo static method moves the cursor to an absolute position. For example:

Hoa\Console\Cursor::moveTo($x, $y);

The control function we use is cursor_address. So all we need to do is use the Hoa\Console\Tput class and call its get method to read the value of this string capability. This one is parameterized: On xterm-256color, its value is \e[%i%p1%d;%p2%dH. We replace the parameters with $x and $y and we output the result. That’s all! We are able to move the cursor to an absolute position on all terminals! This is the right way to do it.
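
To give an idea of what happens underneath, here is a tiny sketch, assuming the common xterm value of cursor_address shown above; it is not the Hoa implementation, which evaluates the whole terminfo parameter language:

// In cursor_address, %p1 is the line and %p2 is the column;
// %i increments both parameters (terminals count from 1).
function moveToSketch($x, $y)
{
    echo "\033[" . ($y + 1) . ';' . ($x + 1) . 'H';
}

moveToSketch(7, 42); // Move the cursor to column 7, line 42.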

We use the same strategy for the move static method, which moves the cursor relative to its current position. For example:

Hoa\Console\Cursor::move('right up');

We split the steps, and for each step we read the appropriate string capability using the Hoa\Console\Tput class. For right, we read the parm_right_cursor string capability; for up, we read parm_up_cursor, etc. Note that parm_right_cursor is different from cursor_right: The first one is used to move the cursor a certain number of times while the second one is used to move the cursor only once. With performance in mind, we should use the first one if we have to move the cursor several times.

The getPosition static method returns the position of the cursor. This interaction is a little bit different: We must write a control function on the output, and then the terminal replies on the input. See the implementation for yourself.

print_r(Hoa\Console\Cursor::getPosition());

/**
 * Will output:
 *     Array
 *     (
 *         [x] => 7
 *         [y] => 42
 *     )
 */
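
For the curious, here is a rough sketch of this request/reply dance, under the assumption of the standard “device status report” control function; it is not the Hoa implementation, which handles the terminal state more carefully:

system('stty -icanon -echo'); // Non-canonical mode, so the reply is readable immediately.

echo "\033[6n";               // Ask the terminal: where is the cursor?
$reply = fread(STDIN, 16);    // The terminal answers with something like "\e[42;7R".

system('stty icanon echo');   // Restore the terminal state.

if (1 === preg_match('/(\d+);(\d+)R/', $reply, $matches)) {
    echo 'x: ', $matches[2], ', y: ', $matches[1], "\n";
}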

In the same way, we have the save and restore static methods that save the current position of the cursor and restore it. This is very useful. We use the save_cursor and restore_cursor string capabilities.

Also, the clear static method splits the parts to clear. For each part (direction or way), we read the appropriate string capability from Hoa\Console\Tput: clear_screen to clear the whole screen, clr_eol to clear everything to the right of the cursor, clr_eos to clear everything below the cursor, etc.

Hoa\Console\Cursor::clear('left');

See what we learnt in action:

echo 'Foobar', "\n",
     'Foobar', "\n",
     'Foobar', "\n",
     'Foobar', "\n",
     'Foobar', "\n";

           Hoa\Console\Cursor::save();
sleep(1);  Hoa\Console\Cursor::move('LEFT');
sleep(1);  Hoa\Console\Cursor::move('↑');
sleep(1);  Hoa\Console\Cursor::move('↑');
sleep(1);  Hoa\Console\Cursor::move('↑');
sleep(1);  Hoa\Console\Cursor::clear('↔');
sleep(1);  echo 'Hahaha!';
sleep(1);  Hoa\Console\Cursor::restore();

echo "\n", 'Bye!', "\n";

The result is presented in the following figure.

Saving, moving, clearing and restoring the cursor with Hoa\Console.

The resulting API is portable, clean, simple to read and very easy to maintain! This is the right way to do it.

To get more information, please read the documentation.

Colours and decorations

Now: Colours. This is mainly the reason why I decided to write this article. We see the same libraries, again and again, doing only colours in the terminal, but unfortunately not in the right way 😞.

A terminal has a palette of colours. Each colour is indexed by an integer, from 0 to potentially +∞. The size of the palette is described by the max_colors number capability. Usually, a palette contains 1, 2, 8, 256 or 16 million colours.

The xterm-256color palette.

So the first thing to do is to check whether we have more than 1 colour. If not, we must not colorize the given text. Next, if we have fewer than 256 colours, we have to convert the style into a palette containing 8 colours. Likewise, with fewer than 16 million colours, we have to convert into 256 colours.
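
As an illustration of this degradation, here is a crude sketch, not the Hoa implementation, that maps a true colour such as #932e2e onto the xterm 256-colour cube (colours 16 to 231 form a 6×6×6 cube; a real implementation would also consider the exact cube levels and the 24 grey entries):

function rgbTo256Sketch($red, $green, $blue)
{
    $scale = function ($component) {
        // Scale a 0–255 component down to the 0–5 range of the cube.
        return (int) round($component / 255 * 5);
    };

    return 16 + 36 * $scale($red) + 6 * $scale($green) + $scale($blue);
}

var_dump(rgbTo256Sketch(0x93, 0x2e, 0x2e));

/**
 * Will output:
 *     int(131)
 */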

Moreover, we can define the style of the foreground or of the background with respectively the set_a_foreground and set_a_background string capabilities. Finally, in addition to colours, we can define other decorations like bold, underline, blink or even inverse the foreground and the background colours.

One thing to remember: With this capability, we only define the style at a given “pixel” and it will apply to the following text. It is not exactly like HTML, where we have a beginning and an end; here we only have a beginning. Let’s try!

Hoa\Console\Cursor::colorize('underlined foreground(yellow) background(#932e2e)');
echo 'foo';
Hoa\Console\Cursor::colorize('!underlined background(normal)');
echo 'bar', "\n";

The API is pretty simple: We start underlining the text, we set the foreground to yellow and we set the background to #932e2e. Then we output something. We continue by cancelling the underline decoration, in addition to resetting the background. Finally we output something else. Here is the result:

Fun with Hoa\Console\Cursor::colorize.

What do we observe? My terminal does not support more than 256 colours. Thus, #932e2e is automatically converted into the closest colour in my actual palette! This is the right way to do it.

For fun, you can change the colours in the palette with the Hoa\Console\Cursor::changeColor static method. You can also change the style of the cursor, like a block, an underscore (_) or a vertical bar (|).

To get more information, please read the documentation.

The power in your hand: Readline

A more complete usage of Hoa\Console\Cursor, and even Hoa\Console\Window, is the Hoa\Console\Readline class, which is a powerful readline. Beyond autocompleters, history, key bindings, etc., it makes advanced use of the cursor. See it in action:

An autocompletion menu, made with Hoa\Console\Cursor and Hoa\Console\Window.

We use Hoa\Console\Cursor to move the cursor or change the colours and Hoa\Console\Window to get the dimensions of the window, scroll some text in it etc. I encourage you to read the implementation.

To get more information, please read the documentation.

The power in your hand: Sound 🎵

Yes, even sound is defined by terminal capabilities. The famous beep is given by the bell string capability. Would you like to make a beep? Easy:

$tput = new Hoa\Console\Tput();
echo $tput->get('bell');

That’s it!

Bonus: Window

As a bonus, here is a quick demo of Hoa\Console\Window, because it’s fun.

The video shows the execution of the following code:

Hoa\Console\Window::setSize(80, 35);
var_dump(Hoa\Console\Window::getPosition());

foreach([[100, 100], [150, 150], [200, 100], [200, 80],
         [200,  60], [200, 100]] as list($x, $y)) {

    sleep(1);  Hoa\Console\Window::moveTo($x, $y);
}

sleep(2);  Hoa\Console\Window::minimize();
sleep(2);  Hoa\Console\Window::restore();
sleep(2);  Hoa\Console\Window::lower();
sleep(2);  Hoa\Console\Window::raise();

We resize the window, we get its position, we move the window on the screen, we minimize and restore it, and finally we put it behind all other windows just before raising it.

To get more information, please read the documentation.

Conclusion

In this article, we saw how to control the terminal: Firstly, by detecting the type of pipes, and secondly, by reading and using the terminal capabilities. We know where these capabilities are stored, and we saw a few of them in action.

This approach ensures your code will be portable, easy to maintain and easy to use. Portability is very important because, like browsers and user devices, there are a lot of terminal emulators in the wild, and we have to take them into account.

I encourage you to take a look at the Hoa\Console library and to contribute to make it even more awesome 😄.