The Fenrir Project

Transport, encryption, authentication protocol

This project is maintained by Luca Fulchir

libRaptorQ

It took me a lot more than expected, but I finally have a working implementation of the RaptorQ algorithm (RFC 6330).

I initially thought it would take something like a month for an implementation of RFC 6330.

Guess I was wrong, and it took me two whole months, but now I have a usable version.

You can find it here, it’s written in C++11, and it’s as fast as I could make it.

There were some WTF moments while reading the RFC (ok, lots of WTF moments), but everything seems to work now.

I thought no RFC could be worse than the OAuth ones; guess I was wrong again. I really disliked this one. It was almost as if they tried to make it difficult to understand and implement.

The library cannot be considered a fully compliant en/decoder yet: I would need a cluster of nodes with 6GB of RAM each to really check it. Yes, the requirements for full compliance include really a lot of tests, and the matrices can go up to 57,000×57,000 bytes.

Still, the algorithm looks kinda cool, and I already have some ideas on how to speed it up, at least during encoding.

libRaptorQ will be used by Fenrir, of course, so it’s not time (and code) thrown away :)

Also, it’s the only fast open-source implementation! So go get it, test it, and help me ^^

What is this about

RaptorQ is a forward error correction code, from the fountain code family.

This means that when you want to send K packets, you send those K source packets plus an arbitrary number of repair packets. As long as the receiver gets any K of them (source packets, repair packets, or any combination), it can rebuild the original data.
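As a toy illustration of that property (not RaptorQ itself, whose codes are far more sophisticated), here is a minimal C++11 sketch with K source packets and a single XOR repair packet: any K of the K+1 packets are enough to rebuild the data. All sizes and contents are made up for the example.

#include <cstddef>
#include <cstdint>
#include <iostream>
#include <vector>

// Toy analogue of a repair packet: the XOR of all K source packets.
// Receiving any K of the K+1 packets lets us rebuild the missing one.
// RaptorQ generalizes this to arbitrarily many repair packets.
int main()
{
    const std::size_t K = 4;        // number of source packets (made up)
    const std::size_t len = 8;      // bytes per packet (made up)

    std::vector<std::vector<std::uint8_t>> source(K, std::vector<std::uint8_t>(len));
    for (std::size_t i = 0; i < K; ++i)
        for (std::size_t j = 0; j < len; ++j)
            source[i][j] = static_cast<std::uint8_t>(i * len + j);

    // One repair packet: the XOR of every source packet.
    std::vector<std::uint8_t> repair(len, 0);
    for (const auto &pkt : source)
        for (std::size_t j = 0; j < len; ++j)
            repair[j] ^= pkt[j];

    // Pretend packet 2 was lost in transit: XORing the repair packet
    // with the K-1 surviving source packets yields the lost packet.
    const std::size_t lost = 2;
    std::vector<std::uint8_t> rebuilt = repair;
    for (std::size_t i = 0; i < K; ++i)
        if (i != lost)
            for (std::size_t j = 0; j < len; ++j)
                rebuilt[j] ^= source[i][j];

    std::cout << (rebuilt == source[lost] ? "rebuilt OK" : "FAILED") << "\n";
    return 0;
}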

This is really useful: it lets us send bigger bursts of data without waiting for the RTT, and it helps with retransmission, since we can just send a new repair packet instead of retransmitting a specific one, and so on.

So it is really useful for Fenrir.

How it works

Well, I’m not going to discuss in detail what the algorithm does.

Basically, you feed it your input, which gets divided into blocks.

Each block is encoded/decoded independently.

Each block is divided into source symbols.

Then the source symbols are sent, followed by the repair symbols, which are computed from the combination of an LDPC code, an HDPC code, and an LT (fountain) code. We can generate as many repair symbols as we want. (See the sketch below for the block/symbol split.)
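To make the first steps concrete, here is a hypothetical C++11 sketch of the block/symbol split. The names and parameters are made up, and the real RFC 6330 partitioning (its Partition[] function, sub-blocks, and K' padding) is more involved; this only shows the general shape.

#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <utility>
#include <vector>

// A block holds some number of fixed-size source symbols.
struct Block {
    std::vector<std::vector<std::uint8_t>> symbols;
};

// Cut the input into blocks of at most max_symbols_per_block symbols,
// zero-padding the tail of the last symbol.
static std::vector<Block> split(const std::vector<std::uint8_t> &data,
                                std::size_t symbol_size,
                                std::size_t max_symbols_per_block)
{
    std::vector<Block> blocks;
    std::size_t pos = 0;
    while (pos < data.size()) {
        Block b;
        while (pos < data.size() && b.symbols.size() < max_symbols_per_block) {
            std::vector<std::uint8_t> sym(symbol_size, 0); // zero-pad the tail
            const std::size_t n = std::min(symbol_size, data.size() - pos);
            std::copy(data.begin() + pos, data.begin() + pos + n, sym.begin());
            pos += n;
            b.symbols.push_back(std::move(sym));
        }
        blocks.push_back(std::move(b));
    }
    return blocks;
}

int main()
{
    std::vector<std::uint8_t> data(10000, 0xAB);   // made-up input
    auto blocks = split(data, 1200, 100);          // 1200B symbols, at most 100 per block
    std::cout << blocks.size() << " block(s), each encoded/decoded independently\n";
}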

Details and limits

The catch is how long you are willing to wait for the initial generation of those repair symbols.

Generating a repair symbol, per se, is very fast, but you need to compute some intermediate symbols before that.

And that can be slow. The work basically amounts to solving the system A x C = D, where A is a known matrix, C holds the intermediate symbols we are after, and D is the matrix of our source symbols. The dimension of A depends on the number of symbols in a block.
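For reference, RFC 6330 frames this precomputation roughly as follows (my paraphrase of section 5.3.3.4.2; check the RFC for the exact submatrix layout). With K' the padded number of source symbols, and S and H the numbers of LDPC and HDPC constraint symbols:

\[
L = K' + S + H, \qquad A\,C = D, \qquad
D = \begin{pmatrix} \mathbf{0}_{S+H} \\ \text{the } K' \text{ source symbols} \end{pmatrix}
\]

Solving this system for C once per block is the expensive part; after that, each repair symbol is just a cheap combination of a few intermediate symbols.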

So if your blocks are small, say 10 symbols, the algorithm takes a fraction of a millisecond, but if you are working with big matrices you can wait a long time. Time and space requirements are linear in the number of symbols.

On my 2.4GHz Core i7, the algorithm takes up to an hour for a 27,000×27,000 matrix, and the RFC describes matrices of up to 57,000×57,000 bytes.

But even working with 200 symbols is pretty fast (a couple of milliseconds), so there are a lot of use cases.

Luca Fulchir