Implementation.tex

% !TeX root = Fenrir.tex


\xkcdchapter[0.65]{Implementation choices}{salt_time}{I promise!}
\label{Implementation choices}

\lettrine{I}{n} this chapter we do not aim to provide a full-featured RFC, but it will help in understanding why and how a lot of things that are not present on the previous formal
model are tied to it and why the addition of new data still conforms to the formal model.

The formal model only checks for security leaks, but it does nothing to test against amplification attacks. We will further discuss this treat and the precautions
that Fenrir adopts to avoid this problem.\\ As we can not model all the real-world data structures in proverif (the complexity would explode and there are no
ways to model some data), we will have to explain how and why we can distinguish packets, sessions and user data.

\section{Connections}

The first thing that we find in a packet is the connection id. The obvious use of this is to identify the different connections, but there are more uses
that this field can have.

The most important one is the handshake. We can not grant a connection id every time we receive an handshake packet, or we would expose ourselves
to DoS. All handshakes must be handled by a single connection id, which will be the id 0.

The rest of the packet will be a stream with random id that will contain the handshake data. The stream identifier and counter will act as a tcp sequence number,
providing us with the necessary tracking information so that both client and server can distinguish different connections and fragmented packets.

As per the TCP sequence number, we will use the counter to make sure that the client is not spoofing somebody else's ip address. This assurance is
not complete, as the attacker might be sniffing the victim traffic or might be the next op in the server traffic, but it still avoids lots of easier spoofing attacks.

In the future (see Section \ref{future}) this field might sustain more restrictions to include caching proxies and multicast delivery.

\subsection{Connection id allocation}

Each client should have the full $2^{32}$ connection space available, and we should not try to spend time synchronizing on which connection id
is available between the server and client, as that would require too many round trips and/or data transfer to be feasible.

The obvious choice is that both the client and server will choose one connection id, and will communicate the choice to the other party. The sender will then
use the connection id the receiver requested.\\
Each party now has the full connection space at his disposal, and there are no clashes or synchronization mechanisms.


\section{Streams}

The stream section can be repeated multiple times in the packet to achieve parallel delivery of multiple streams. The concept is very similar to switching from a virtual circuit network to a packet switched one.

Each stream is identified by a \textbf{16 bit} id, which should leave plenty of space to handle all different network flows that a single program needs. Fenrir itself uses one of this streams to carry connection information, ACKs and so on, but it can use multiple streams to implement features like forward error correction.

Each stream identifier must be random. By doing so we will increase the entropy of the encrypted data and make chosen plaintext or known-plaintext attacks impossible or at least more difficult. Then in the future we might be able to apply compression techniques to the whole packet with less worry of attacks like CRIME or BREACH.

\subsection{Flags}

The 1-byte stream flags field could have been used for multiple things. Experiments included padding and tracking of features like forward error corrections and/or type of stream.

Eventually it became clear that 2 bits had to be allocated to signal the beginning and end of the user message, as that lets us break away from the raw-bytestream model used in TCP that forced the users to meddle with message boundaries, lengths of escape sequences. Introducing this leads to a big simplification of the management of user messages for any type of stream.

There were experiments with additional message counters, padding and tracking of stream features, but all these could be derived from other fields or did not change quickly enough to need dedicated space in every packet. This meant that the original 1-byte bitfield for flag was reduced down to only 2 bits.

The next 6 bits were used to increase the maximum transferable unit in the fenrir streams, to bring the counter up to 1GB. The rationale for this is in the next section.

\subsection{Stream counter}
\label{stream counter}

The stream counter works like the tcp sequence number, and also limits the maximum window size. TCP had a segment size of 32 bits, but a window size of 16 bits: this was proved to be inefficient in high-bandwidth, high-delay connections, for example when using satellite links. To solve this problem extensions to the tcp header have been introduced to increase the receiving window.

The well known \textbf{bandwidth-delay product}\index{Bandwidth-Delay product} ($B*D = bytes$) tells us how much unacknowledged data can be in transit at
any given time. TCP had a maximum window size of $2^{16}$ bits, which meant that on a satellite link with a 200ms RTT we could only transfer up to 320KB/s,
or 116KB/s on a link with 550ms delay. Due to this problem a ``window scale'' option was created, were a shift (up to 14 times) could be applied to the receive
 window, bringing the maximum window size up to almost $2^{30}$. This brings the total maximum bandwidth of TCP to 1.8GB/s.

In Fenrir the receiving window is handling is inside the control stream, so there is no need for an ad-hoc field in each stream. Fenrir also has multiple streams, but handling the congestion window and the receiving window is slightly more complicated as multiple interfaces and streams are involved in a single connection.

Initially the counter was set to only 3 bytes (tracking $2^{24}$ bytes). The counter can obviously overflow without having to destroy the stream, but $2^{24}$ bytes can be considered a limit for the receiving/congestion window, thus it was decided to use the last 6 bits on the stream flags to bring the maximum stream counter to $2^{30}$ bytes. By the previous bandwidth delay product, this means we can transfer data with up to 5GB/s on the 200ms-delayed link.

Geostationary satellites have a physical limit of at least 550ms of latency, which means that we will be limited to 1.8GB/s (slightly more than TCP in the same case). The latest satellites have a total bandwidth of about 140Gbit/s, or about 17.8GB/s, which are only 10 Fenrir streams at full speed.

But this is \textit{not} the real limit of a Fenrir connection, because we are still looking at the theoretical maximum for a single stream. Fenrir leaves the user with up to $2^{16}-1$ different streams, for a \textit{total of more than 116TB/s}. This will also mean that the congestion window for the whole connection now needs to be of at least $30+16=46$ bits, so that we can scale up to the maximum theoretical speed. The receiving window will be set per-stream and will thus remain at 30 bits.

\subsection{Error recovery}

Transport protocols that implement reliable delivery must include a way to recover lost or corrupted packets.

The two main ways to implement this mechanism work by using ACKs or NACKs. ACKs work by telling the transmitting endpoint what chunks of data arrived, while NACKs are the opposite, where you signal which chunks are missing to complete the stream.

Google's QUIC protocol uses NACKs, but also includes the RAID methodology in itself: every two packets the protocol sends a third one which is the XOR or the previous two. It's like applying RAID5 to packet networks. The general technique is called \textit{Forward Error Correction}, as it lets the receiver recover from a number of network errors (packet drops) without the need to wait for retransmission.

While this is an interesting approach, it has a very limited flexibility. Fenrir thus includes the RaptorQ algorithm to generalize the RAID concept and is thus able to generate any number of repair packets from any number of source packets.

The RaptorQ implementation is maintained as a separate library
(\href{https://www.fenrirproject.org/Luker/libRaptorQ}{libRaptorQ}\cite{libRaptorQ}), and is fully conformant to RFC6330, so that other projects might benefit from it without having to extract the code from Fenrir.

The protocol will therefore generate and track recover symbols to match the number of packet drops,
so that the total number of retransmissions become close to zero. Sending and tracking of such repair symbols will be done in a dedicated stream, so that the structure and handling of the user stream is not changed.

In the future all three approaches, ACK, NACK and FEC will be implemented in the Fenrir protocol,
so the best approach can be better studied.

\subsubsection{RaptorQ}

RaptorQ is a Forward Error Correction \index{Forward Error Correction} (\textbf{FEC}\index{Forward Error Correction}) algorithm designed to deliver your data efficiently and without retransmissions for lost packets.

After sending K packets of data as-is, RaptorQ generates as many repair
symbols as you need. Once the receiver has at least K symbols, be it the source
symbols, repair symbols or any combination of the two, it can reconstruct the
whole input it was meant to receive.

Since the first K symbols are the raw data to be sent, RaptorQ is considered a systematic fountain code.


Internally the algorithm needs to generate a set of intermediate symbols from the source symbols that need to be transmitted, and a matrix that encodes the different dependencies of the intermediate symbols from the source symbols.\\
This phase complexity is independent from the symbol size, but time-wise is cubic on the number of symbols, while memory-wise is quadratic (always on the number of symbols).

The second phase of the algorithm generates the repair symbols by XOR-ing together the intermediate symbols, and its complexity is linear on the number of symbols.

While the algorithm's complexity is high, encoding/decoding a block of 100 symbols should take no more than a millisecond. Part of the computation can than be cached to be reused for the next block, so the overall impact on performance is both negligible and configurable.


\section{Handshake}

Security and performance are often opposites. While developing the handshake, it became clear that a tradeoff was necessary in terms of performance
and security. Most users will only care about performance, but developers and administrators might care more about security, especially for applications
that require reliability. For this reason Fenrir has multiple handshakes, designed either for speed of security.

The three handshakes for Fenrir are the Full-security, the stateful handshake and the directory-synchronized handshake.

Before looking at the details, lets look at some generic choices.

\subsection{0-RTT protocols}

While we might have incorporated 0-RTT connections, like QUIC, minimaLT and the newer TLS1.3, a brief analysis revealed that any implementation of this
features is a big problem for security and developer usability. The problem was big enough to actively avoid any kind of 0-RTT handshake.

To easier understand why a 0-RTT protocol is a bad idea no matter the implementation, let's start analyzing what 0-RTT means: 0-RTT means that we will need
to keep a state, as (at least) keys and ciphers must be agreed upon. The state can become bigger when we consider session resumption, so how long
will we keep a state? Keeping this state breaks any guarantee of perfect forward secrecy, and if kept too much leaves us open to memory DoSes.

MinimaLT provides a 0-RTT connection, but technically does not implement perfect forward secrecy. All clients use the same server key, until such key is replaced
in the directory service(DNS).

How do we distinguish a legitimate connection from a replay attack? The server can only use the state until the first successful 0-RTT connection, and if the
client loses connection before asking new shared keys and data, it can't be used again, or replay attacks become possible.

Fenrir detaches itself from the IP layer, so that is can use multiple connections and ips without problems. But if we can connect without even one RTT, what
guarantees us that the client didn't spoof the ip address while requesting a big file? This alone can create amplification attacks of an unseen magnitude.
The only reason TLS can get away with this is that the IP confirmation is provided by the 2-RTT TCP handshake, but DTLS servers might suffer from it when
the dtls specification get updated.\\


In short, safe 0-RTT connections can not be used for the first connection, can only come from the previous ip address(es),
can only be used once (then must be renegotiated), and break the perfect forward secrecy model.\\
This features are nearer to a session resumption than to a new connection setup.


For these reasons new 0-RTT connections will not be allowed in Fenrir, and 1-RTT is the minimum by design.

\subsection{Session resumption}

TLS and other protocols have to provide cookies, shared secrets or signed data to implement session resumption. It is not easy to decide what has to
be used to represent the client-server connection, and thus attacks like 3shake\cite{3Shake} were born.

In the case of the 3Shake attack, the problem was that most of the information that identified a client was taken from freely modifiable fields in the lower
layer protocols. As such, an attacker could impersonate any client and steal a user session.

In Fenrir we are completely detached from the lower layers, so there is no need for cookies or signed data to be transmitted. If the server simply keeps
a long time to live for each connection, the connection id is enough to resume the connection from any IP. Much simpler and straightforward.

\subsection{Protocol version and protection}

Protocols can advance, new versions can be developed, and old version will be deprecated. This is normal, and even SSL and TLS have a field
dedicated to version tracking.\\
What has never been tracked however was the \textit{complete list} of the supported versions. And this lead to downgrade attacks in the SSL/TLS protocols.

During the handshake both the client and server list the supported ciphers. The whole list is then used when signing the data, so that both client and
server know that the received list is the correct one, and has not been tampered with. But for some reason, this concept was not applied to protocol versions.

So during the SSL-TLS transition, an attacker could drop the TLS handshake and force a timeout, thus making the client try older (and bugged) SSL version.\\
The solution today is to simply disable the whole SSL protocol, but this is still a poor way of handling a transition, and does not protect anyone during
the actual transition.

To avoid these problems Fenrir will list and sign all supported protocols and handshakes, making sure that the whole protocol can be upgraded in the
future, without dangerous transitional periods.

\section{Programmer's viewpoint}

The first impression of this protocol is that it is much more complicated than the classical application stack. However one of our aims was not the simplification of the protocol, but the simplification of the application developers' job.

While the API of the project is not stable yet, we will give a stub of the intended usage of the protocol, while listing the features provided in each section. The examples are coded in C++.


The most basic and common setup will be that of a client application which  does not need to receive new connections.


\lstset { %
	language=C++,
	backgroundcolor=\color{black!5}, % set backgroundcolor
	basicstyle=\footnotesize,% basic font setting
}

\begin{lstlisting}

#include "Fenrir.h"

Fenrir::Manager M = Fenrir::Manager();

\end{lstlisting}

The Manager is thread-safe, and will be used for everything, from receiving connections and data, to adding new services.

To connect somewhere, we will simply use the following calls:

\begin{lstlisting}

// unauthenticated version:
uint32_t connection = M.connect_anonymous(std::string domain, uint16_t service_id);

// authenticated version:
uint32_t connection;
uint8_t authorization;

std::tie(connection, authorization) = M.connect(std::string domain, uint16_t service_id);
// optional if we want to know our authorization in the lattice:
Fenrir::Lattice L = M.get_lattice(std::string domain, uint16_t service_id);
\end{lstlisting}

and we can now setup the streams and send/receive the data we want:

\begin{lstlisting}

//choose the stream type
enum class Fenrir::StreamType : uint8_t {
	Rel_Ord_Stream = 0x00,		/* reliable,   ordered,   stream */
	Rel_Ord_Datagram = 0x01,	/* reliable,   ordered,   datagram */
	Rel_Unord_Stream = 0x02,	/* reliable,   unordered, stream */
	Rel_Unord_Datagram = 0x03,	/* reliable,   unordered, datagram */
	UnRel_Ord_Stream = 0x04,	/* unreliable, ordered, stream */
	UnRel_Ord_Datagram = 0x05,	/* unreliable, ordered, datagram */
	UnRel_Unord_Datagram = 0x06,/* unreliable, unordered, datagram */
};

// add the streams
uint16_t stream_id = M.add_stream(uint32_t connection_id, Fenrir::StreamType t);

// send the data
Fenrir::error e = M.send(uint32_t connection_id, uint16_t stream,
							std::vector<uint8_t> &data);

// receive the data:
while (1) {
	Fenrir::Data received = M.receive();
}
\end{lstlisting}

The "Fenrir::Data" structure has multiple fields to accommodate all use cases:

\begin{lstlisting}
enum class Fenrir::DataType : uint32_t {
	UserData = 0x00;	/* data of active connection */
	NewConnection = 0x01 /* used by services, new connection/user information */
};
struct Fenrir::Data {
	Fenrir::DataType type;
	void *data;
};
struct Fenrir::UserData {
	uint32_t connection_id;
	uint16_t stream_id;
	uint32_t counter;	/* useful for unreliable data reordering */
	std::vector<uint8_t> data;
}

// the application will cast the data to the right data type:
if (received.type == Fenrir::DataType::UserData)
	Fenrir::UserData d = static_cast<Fenrir::UserData> (received.data);

\end{lstlisting}


Notice that the application never sees a token or the authentication data.
However the services will need a user identifier to associate a user with the right data in the service database, and this information will be provided in the "NewConnection" data type:

\begin{lstlisting}

struct Fenrir::NewConnection {
	uint64_t tmp_connection_id;
	uint64_t user_id;
	uint8_t authorization;
};

// Once the service receives the "NewConnection" data, it will need to reply with
// "ConnectionConfirmation"
struct Fenrir::ConnectionConfirmation {
	uint64_t tmp_connection_id;
	uint64_t[2] keys;
	uint32_t connection_id;	// "0" on failure
} confirmation;

// get a random key and a new connection id.
std::tie(confirmation.connection_id, confirmation.keys) = M.newIncoming(shared_secret);
// send everything to our right authentication server
M.confirm(confirmation);
\end{lstlisting}