One conference per day, for one year (2017)

My self-assigned challenge for 2017 was to watch at least one conference per day, for one year. It was the first time I tried this challenge. Let’s dive in for a recap.

267 conferences

In some ways, I failed the challenge: I was only able to watch 267 conferences. With an average of 34 minutes per conference, I’ve watched 9078 minutes, or 151 hours, of freely available conferences online. Why did I fail to watch 365 of them? Because my first kid was 1.5 years old in January 2017, a new little lady arrived in December 2017, I got a new job, I travelled for my job, I gave talks, I maintain important open source projects requiring a lot of time, I’m building my own self-sufficient ecological house, the vegetable garden requires many hours, I watch other videos, and because I’m lazy sometimes. Most of the time, though, I was able to watch 2 or 3 conferences in a row.

Where to find the resources?

All these conferences are freely available online, most of them on YouTube or on Vimeo. The channels I mostly watch are the following:

As you might have noticed, it’s very Computer Science centric, targeting Rust, C++, Elm, LLVM, or Web technologies (JS, CSS…), but not only: you can sometimes find Haskell or Clojure too.

My best-of list

In March 2017, more and more people were questioning me and asking me to share. I then decided to start a playlist of my “best-of” conferences. I added 78 conferences in 2017, and 3 new conferences have been added since then.

Thumbnails of my “best-of” 2017

Thoughts and conclusion

The challenge was sometimes easy and relaxing; at other times, it was very hard to understand everything, especially at 2am after a long day (looking at you, CppCon). But it has been a very enjoyable way to learn a lot in a very short period of time. Many speakers are talented, and listening to them is a real pleasure. Some others are just… let’s say unprepared, and it’s good to stop and jump onto another talk. It’s also a good way to get inspired by technologies you don’t necessarily know (for instance, I’m not a big fan of Clojure, but some projects are really inspiring, like Proto REPL).

Sometimes I tweeted about the talk I had just watched, and it was quite appreciated too. I reckon that’s because it’s a fun and easy way to learn, especially with the help of video platforms like YouTube.

Am I going to continue this challenge in 2018? Yes! But maybe not at this frequency. It’s now part of my routine to watch conferences many times per week. I like it. I don’t want to stop.

As a closing note, I would like to thank every speaker, and more importantly, every conference organizer. You are doing an amazing job: from the program, to the event, to the final sharing on the Internet with everyone. Most of you are volunteers. I know the work it represents. You are producing extremely valuable resources. Thank you!

Faster find algorithms in nom

Tagua VM is an experimental PHP virtual machine written in Rust and LLVM. It is composed of a set of libraries. One of them, which keeps me busy these days, is tagua-parser. It contains the lexical and syntactic analysers for the PHP language, in addition to the AST (Abstract Syntax Tree). If you would like to know more about this project, you can watch the talk I gave at PHPTour last week: Tagua VM, a safe PHP virtual machine.

The library tagua-parser is built with parser combinators. Instead of having a classical grammar, compiled to a parser, we write pure functions acting as small parsers, and we then combine them together. This post does not explain why this is a sane approach in our context, but keep in mind that it is much easier to test, to maintain, and to optimise.
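To give an idea of what this looks like, here is a minimal sketch in the style of nom’s macro API (nom 3.x at the time of writing). The echo_statement parser is invented for this example; it is not the actual tagua-parser code:

#[macro_use]
extern crate nom;

use nom::{IResult, is_space};

// A small parser recognising an `echo …;` statement, built by combining
// smaller parsers: `tag!` matches a literal, `take_while1!` consumes
// whitespace, and `take_until_and_consume!` grabs everything up to `;`.
named!(echo_statement<&[u8], &[u8]>,
    do_parse!(
        tag!("echo")                        >>
        take_while1!(is_space)              >>
        value: take_until_and_consume!(";") >>
        (value)
    )
);

fn main() {
    assert_eq!(
        echo_statement(&b"echo \"Hello\";"[..]),
        IResult::Done(&b""[..], &b"\"Hello\""[..])
    );
}

Each small parser can be tested in isolation, which is precisely what makes this approach easy to test and maintain.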

Because this project is complex enough, we are delegating the parser combinator implementation to nom.

nom is a parser combinators library written in Rust. Its goal is to provide tools to build safe parsers without compromising the speed or memory consumption. To that end, it uses extensively Rust’s strong typing, zero copy parsing, push streaming, pull streaming, and provides macros and traits to abstract most of the error prone plumbing.

Recently, I have been working on optimisations in the FindToken and FindSubstring traits from nom itself. These traits provide methods to find a token (i.e. a lexeme), and to find a substring; crazy naming, right? However, the names are not totally accurate: FindToken expects to find a single item (if implemented for u8, it will look for a u8 in a &[u8]), while FindSubstring really is about finding a substring, so a token of any length.

It appeared that these methods could be optimised in some cases. Both default implementations use Rust iterators: a regular iterator for FindToken, and a window iterator for FindSubstring, i.e. an iterator over overlapping subslices of a given length. We have benchmarked big PHP comments, which are analysed by parsers actively using these two trait implementations.
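To give an idea of the difference for the substring case, here is a minimal sketch of both strategies. It illustrates the technique only; it is not the exact code that landed in nom:

extern crate memchr;

// Before: scan every overlapping window of `substring.len()` bytes.
fn find_substring_naive(input: &[u8], substring: &[u8]) -> Option<usize> {
    input
        .windows(substring.len())
        .position(|window| window == substring)
}

// After: let `memchr` jump to each candidate position of the first byte,
// and compare the full substring only at those positions.
fn find_substring_fast(input: &[u8], substring: &[u8]) -> Option<usize> {
    let first = match substring.first() {
        Some(&byte) => byte,
        None        => return Some(0),
    };
    let mut offset = 0;

    while let Some(position) = memchr::memchr(first, &input[offset..]) {
        let start = offset + position;
        let end   = start + substring.len();

        if end > input.len() {
            // Not enough bytes left for a full match.
            return None;
        }

        if &input[start..end] == substring {
            return Some(start);
        }

        offset = start + 1;
    }

    None
}

On inputs like large PHP comments, where the first byte of the needle is rare, the second version skips over most of the haystack in a single memchr call.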

Here are the results, before and after our optimisations:

test …::bench_span ... bench:      73,433 ns/iter (+/- 3,869)
test …::bench_span ... bench:      15,986 ns/iter (+/- 3,068)

A boost of 78%! Nice!

The pull request has been merged today, thank you Geoffroy Couprie! The new algorithms heavily rely on the memchr crate, so all the credits should really go to Andrew Gallant! This crate provides a safe interface to libc’s memchr and memrchr. It also provides fallback implementations when either function is unavailable.
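The crate’s API is tiny. A short usage example (the haystack is made up for the illustration):

extern crate memchr;

fn main() {
    let haystack = b"hello world";

    // `memchr` finds the first occurrence of a byte…
    assert_eq!(memchr::memchr(b'o', haystack), Some(4));

    // … while `memrchr` finds the last one.
    assert_eq!(memchr::memrchr(b'o', haystack), Some(7));
}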

The new algorithms are only implemented for &[u8] though. Fortunately, the implementation for &str falls back to the former.

This is a small contribution, but it brings a very nice boost. I hope it will benefit other projects!

I am also blowing the dust off of Algorithms on Strings, by M. Crochemore, C. Hancart, and T. Lecroq. I am pretty sure it should be useful for nom and tagua-parser. If you haven’t read this book yet, I can only encourage you to do so!