\xkcdchapter{The Fenrir Project}{standards}{The good thing about standards is that\\there are many to choose from...}
\label{Fenrir Project}
\lettrine{S}{ince} the current protocol stacks limit developers in application development, and there is no sensible authentication protocol that supports both authentication and authorization, this chapter is dedicated to the definition of a new one that encompasses as many of the requirements of \ref{Requirements} as possible.
The first problem is to understand where we want to place our protocol in the ISO stack. As we have seen in \ref{ISO stack}, an incorrect placement in the stack can result in overly complex layering that pushes the burden of making everything work onto the developer.
As a reminder, the ISO model is composed of 7 layers:
\begin{center}
\begin{tabularx}{0.35\textwidth}{| c | X | X |}
\hline
7&Application & \\ \hline
6&Presentation & JSON\\ \hline
5&Session & RPC\\ \hline
4&Transport & TCP/UDP\\ \hline
3&Network & IP\\ \hline
2&Data Link & MAC \\ \hline
1&Physical & Ethernet\\ \hline
\end{tabularx}
\end{center}
The TCP/IP model, however, does not follow the ISO model strictly, as the layers are not always independent of one another. If we portray what actually happens in a common web connection, we end up with something like this, where multiple layers limit the capabilities of the upper layers, and a lot of features (like session management) are reimplemented multiple times:
\begin{center}
\begin{tabularx}{0.35\textwidth}{| X | X | X |}
\hline
\multicolumn{3}{|c|}{Application}\\ \hline
OAuth& cookie & HTML\\ \hline
\multicolumn{3}{|c|}{HTTP} \\ \hline
\multicolumn{3}{|c|}{TLS} \\ \hline
\multicolumn{3}{|c|}{TCP} \\ \hline
\multicolumn{3}{|c|}{IP}\\ \hline
\end{tabularx}
\end{center}
The reason OAuth, cookies and HTML sit on the same layer is that they all need direct access to the HTTP headers, and they all depend on one another. The application then has to manage all of their interactions.
Keeping in mind what we are trying to avoid, let's analyse our choices. By ``high level'' we mean anything above HTTP.
If we try to fix the situation with yet another high-level protocol (as OpenID Connect is trying to do) we gain ease of implementation thanks to the abstractions of the lower-level protocols and their security properties, but we are also limited by them. Efficiency is also greatly impacted, and we might have to rely on still more protocols to avoid the limitations of the protocol stack we chose (as OpenID Connect has to rely on WebFinger to work around OAuth's lack of interoperability).
This means that our quest for simplicity leads to a contradiction: the more protocols we use, the larger the attack surface becomes, while we still need to handle all the protocol interactions and limitations.
As stated, this is the road chosen by the OAuth and OpenID Connect authors, so there is little to gain from choosing it again.
Since we cannot go much higher in the OSI model, we need to understand how low we should go, and what would change for our protocol. Going lower than, or at, the IP layer would break the very first requirement: compatibility with the existing infrastructure. The same can be said for working at the TCP/UDP layer, as too many firewalls would require reconfiguration. Fortunately, in this case we can envision a transitional period where the protocol is tunnelled over UDP, and once enough support has been gained, we can switch to running directly on top of the IP layer.
We have upper and lower bounds, so let's analyse the problems case by case:
\begin{itemize}
\item \textit{rewrite the HTTP layer}: we are still fairly high in the protocol stack, still limited by TCP, and we still have duplicated session identification (TCP, TLS), but we should be able to implement federated support. However, we would require a new TCP port assignment so as not to interfere with existing protocols, which in turn would make us interfere with strict firewalls.
\item \textit{rewrite the TLS layer}: same as above, but now we also have to handle \textbf{secrecy} and \textbf{authenticity}. We still require a new port assignment.
\item \textit{rewrite the TCP/UDP layer}: we get complete freedom over communication features, and might be able to implement our full list of requirements \ref{Requirements}. We do not require new port assignments, and if we use UDP tunnelling we retain compatibility with existing firewalls. This is the road chosen by QUIC and MinimaLT.
\end{itemize}
At first glance this is much more complex, as we need to reimplement everything from TCP up to OAuth in a single solution, but we can gain a lot of features from experimental protocols, and add federation and authorization support, which is found virtually nowhere else. The overall stack becomes much shorter, and there is less feature duplication (as in the case of session identifiers).
To summarize what we can achieve here, we stand to gain in:
\begin{itemize}
\item \textbf{Robustness}: we can design against amplification attacks, and avoid design problems like the cleartext TCP RST, which can drop any connection.
\item \textbf{Efficiency}: fewer layers mean less encapsulation, and fewer headers before the user data.
\item \textbf{Federation}: we can finally design the protocol so that authentication on multiple domains is the same, by including domain discovery techniques.
\item \textbf{Transport flexibility}: \textbf{multiplexing} support, and per-stream choice of transport features (\textbf{reliability, ordered delivery}, etc.) will increase application features while simplifying application development.
\item \textbf{Multihoming/mobility}: we can finally design a protocol whose connection status does not depend on layer 3 (IP) data.
\item \textbf{Datagram support}: handling message beginning/end regardless of packet fragmentation will further simplify user data management.
\end{itemize}
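To make the transport-flexibility point concrete, the following sketch shows what a per-stream header carrying a connection id, a stream id and per-stream transport flags could look like. The field names, sizes and flag values are purely illustrative assumptions, not part of any Fenrir specification:

```python
import struct

# Hypothetical wire header (illustrative only): connection id (8 bytes),
# stream id (2 bytes), flags (1 byte), payload length (2 bytes).
HEADER = struct.Struct("!QHBH")

FLAG_RELIABLE = 0x01  # assumed flag: retransmit lost packets
FLAG_ORDERED = 0x02   # assumed flag: deliver in order

def pack_packet(conn_id, stream_id, flags, payload):
    """Prepend the sketched header to a payload."""
    return HEADER.pack(conn_id, stream_id, flags, len(payload)) + payload

def unpack_packet(datagram):
    """Split a datagram back into header fields and payload."""
    conn_id, stream_id, flags, length = HEADER.unpack_from(datagram)
    payload = datagram[HEADER.size:HEADER.size + length]
    return conn_id, stream_id, flags, payload

pkt = pack_packet(42, 7, FLAG_RELIABLE | FLAG_ORDERED, b"hello")
```

Because the flags travel with each stream, an application could mix, say, a reliable ordered control stream with an unreliable unordered media stream inside the same connection.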
This is obviously more work, but the overall amount of code for the whole protocol stack will be much smaller, thus reducing the attack surface.
As we are talking about a new, experimental protocol, this is the obvious choice. To avoid the SCTP/DCCP mistakes, the protocol will need to work seamlessly both on top of UDP (to bypass firewall and NAT problems) and directly on top of IP (for efficiency), so we should also plan for a transitional phase between UDP-based and IP-based transport.
Not only will the attack surface be reduced, especially after the code base stabilizes, but there will be no need to analyse the interactions between multiple protocols, thus simplifying both the development and the analysis phase.
By not having to rely on old technology, we will be able to fully control the security properties of the system from the start.
It might seem that we are creating an overly complex protocol, but compared with the number of protocols we aim to replace (TCP/UDP, (D)TLS, OAuth and more), the complexity of our solution will clearly be lower than the total complexity of the possible interactions of the existing solutions, not to mention that the security properties of the interactions between the different possible user configurations have never been analysed.
\begin{figure}[h]
\centering
\includegraphics[width=0.5\textwidth]{images/Fenrir_logo.png}
\caption{Fenrir Logo}
\label{fig:Fenrir_Logo}
\end{figure}
The main feature of our protocol is federated authentication. This means that we will need some form of interaction between the servers of multiple independent domains, with each domain trusting only its own users for authentication. Therefore, each user will be identified by a username and a domain, in an email-like format.
For this, we reuse the distinction provided by Kerberos, and divide the players into three (plus one):
\index{Client Manager}\index{Authentication Server}\index{Federation}
\begin{itemize}
\item \textbf{Authentication Server}: in short: \textbf{AS}. Handles authentication and authorization for its domain.
\item \textbf{Service}: the service the client wants to use. Be it the mail service or a web service.
\item \textbf{Client}: the user program that connects to a \textit{Service}.
\item \textbf{Client Manager}: the program that manages authentication data for the user.
\end{itemize}
\subsection{Decoupling authentication from application}\index{Authentication!Decoupling}
The first two distinctions are fairly straightforward: the authentication server handles authentication, so that the service can be designed without access (for example) to the password database or to the user's login. This is an important distinction, as applications are vulnerable to bugs and have a much higher attack surface. By decoupling the two, the user and password databases should be better protected, as the only application with access to them is the one specifically designed to protect them.
The distinction between \textit{Client} and \textit{Client Manager} has the same purpose. Current applications usually save login information in cleartext, or in a poorly-obfuscated manner (base64 and the like). For the same reasons as before, we want to decouple application authentication from the application itself. This will permit the system-wide usage of strong authentication methods like security tokens or smart cards, and provide better support for authentication algorithms, instead of having clients rely on old methods like the deprecated SSLv3. Over time this will provide better security for tomorrow's legacy applications, as the security of an application will be upgradable independently of the application itself.
Decoupling authentication from the application has one more interesting outcome: as the \textit{Client Manager} handles both authorization and authentication, it will be able to limit an application's scope, so that the user can restrict applications they do not trust, or applications that only need to check for the existence of an account.
This means that both the \textit{Authentication Server} and the \textit{Client Manager} will be the primary targets of attacks against Fenrir, but it also means that the attack surface will be much smaller, and security efforts can be concentrated on a single piece of software. As popular software migrates towards the web, this situation is increasingly common anyway, since web browsers need to track each and every user password. Since users rarely care enough about security, the password database is often as good as cleartext. Moreover, the attack surface of a browser is huge, especially due to its plugin system.
Federated authentication algorithms are nothing new. As per our earlier distinction, we will focus on a Kerberos-like infrastructure.
Due to the previously introduced decoupling, our protocol needs some way to convey to the various interacting players that a user has been granted a certain authentication and authorization. This is done through a token, so that logins and passwords never reach the \textit{Client} or the \textit{Service}, and are used as little as possible.
One characteristic of many authentication algorithms is the use of timestamps to protect the authorization tokens or messages in general. While this provides safety by putting an expiration time on the usage of said tokens, it also means that applications, servers and authentication servers must have at least loosely-synchronized clocks.
Although clock synchronization nowadays seems easy and widespread, it is not yet at a state where we can safely assume that clocks have little discrepancy. Embedded devices are still produced without a clock source, so each time they boot the clock is reset to 1970. The most famous clock synchronization protocol (NTP) is almost always used in cleartext, and basing our clock on a potential attacker's response is not wise.
Requiring all clocks in the world to be synchronized to within a couple of minutes of each other, even for devices that do not have stable clock sources, is in our opinion wrong. Therefore Fenrir will \textit{not} use timestamps. This means that occasional additional round trips will be needed to check the validity of the data, but it also means that tokens can be simplified, as they no longer need signatures, and token revocation becomes effective immediately.
This choice is further supported by the existence of the \textbf{Online Certificate Status Protocol} \cite{OCSP}. X.509 certificates are essentially proofs of authentication granted for a fixed time (usually one year). For years there was no way to revoke a certificate in a timely manner, as the CRLs (Certificate Revocation Lists) were updated very slowly. Once CRLs became not only too slow but also too big, OCSP was introduced to check in real time whether a certificate had been revoked. This alone proves that basing our protocol on time differentials alone is not sufficient.
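The timestamp-free approach can be sketched as follows: tokens are opaque random values whose validity is checked online against the issuing authentication server, so revocation takes effect on the very next check. This is an illustrative sketch of the idea, not the actual Fenrir token format or API:

```python
import secrets

class TokenStore:
    """Illustrative token store for an authentication server.

    Tokens carry no timestamp and no signature: they are random opaque
    values checked online against this store, so revocation is effective
    immediately, at the cost of an extra round trip per validity check.
    """

    def __init__(self):
        self._tokens = {}  # token -> (user, domain)

    def issue(self, user, domain):
        token = secrets.token_hex(16)  # 128 bits of randomness
        self._tokens[token] = (user, domain)
        return token

    def validate(self, token):
        # Online check: returns the identity, or None if unknown/revoked.
        return self._tokens.get(token)

    def revoke(self, token):
        # Effective immediately: the next validate() call fails.
        self._tokens.pop(token, None)

store = TokenStore()
t = store.issue("client", "example.com")
```

Contrast this with a signed, timestamped token: there, revocation before expiry requires an extra mechanism (exactly the role OCSP plays for X.509), while here the online check is the mechanism.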
As figure \ref{fig:FederationOutline} describes, the protocol will rely heavily on the authentication servers, which act as trusted third parties. The image is only an outline, and assumes a shared token between client@example.com and the authentication server of example.com.
\begin{figure}[h]
\centering
\begin{framed}
\centering
\begin{tikzpicture}[node distance=4cm,>=stealth]
\node[label=above:{Auth.Srv example.com}] (AS1) {\includegraphics[width=2cm,keepaspectratio]{images/auth_server.png}};
\node[label=above:{Auth.Srv domain.com}, right= 3.5 cm of AS1] (AS2) {\includegraphics[width=2cm,keepaspectratio]{images/auth_server.png}};
\node[below of=AS2,left=2.5cm of AS2, label=below:{Client example.com}] (C) {\includegraphics[width=2cm,keepaspectratio]{images/computer.png}};
\node[below of=AS2,right=1.5cm of AS2, label=below:{Service domain.com}] (S) {\includegraphics[width=2cm,keepaspectratio]{images/server.png}};
\draw[<-,thick] (AS2.180) -- node[below]{$1: auth, use ``service''$} (C.90);
\draw[<->,thick] (AS1.30) -- node[below]{$2: check account$} (AS2.150);
\draw[->,thick] (AS2.340) -- node[right]{$3: new user: id, keys$} (S.60);
\draw[<-,thick] (AS2.300) -- node[below]{$4: user registered$} (S.120);
\draw[->,thick] (AS2.250) -- node[below]{$5: ok: ip, keys$} (C.20);
\draw[<->,thick] (C.340) -- node[below]{$6: communication$} (S.200);
\end{tikzpicture}
\end{framed}
\caption{Fenrir overview: client@example.com connects to ``service'' in domain.com}
\label{fig:FederationOutline}
\end{figure}
The service never receives the login data; in fact it only receives a connection id, cryptographic keys and an internal user id. As confirmation, the client receives the IP address, connection id and keys needed to connect to the service, so that no further handshake is required. Moreover, since the client is notified only once the connection has been confirmed between the authentication server and the service, we avoid non-intuitive errors like ``authenticated but not connected'', typical of federated protocols like Kerberos or OAuth, where the authentication server is too detached from the service.
The authentication server receives a lot of trust from the user, as it controls the user's authentication and authorization data, but it will not be able to impersonate the user on another domain's services, as the service and the user will also share a secret key. There is still room for impersonating a client on the same domain, although that is technically unavoidable in any design, since the service and the authentication server belong to the same organization (authentication and logs can always be forged by the administrators that control the services).
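Steps 3 and 5 of the figure can be sketched as follows: the service's own authentication server (domain.com) generates a connection id and a symmetric key and hands the same material to both service and client, who then authenticate traffic directly. The construction below (HMAC over connection id plus payload) is our own illustration, not the actual Fenrir key schedule:

```python
import hashlib
import hmac
import os
import secrets

# Generated by the authentication server of domain.com and handed to both
# the service (step 3) and the client (step 5). The user's own AS
# (example.com) never sees this key, so it cannot forge traffic towards
# the service. Illustrative construction only.
conn_id = secrets.token_hex(8)
key = os.urandom(32)

def seal(key, conn_id, payload):
    """Attach a MAC binding the payload to this connection."""
    tag = hmac.new(key, conn_id.encode() + payload, hashlib.sha256).digest()
    return payload, tag

def verify(key, conn_id, payload, tag):
    """Recompute the MAC and compare in constant time."""
    expected = hmac.new(key, conn_id.encode() + payload, hashlib.sha256).digest()
    return hmac.compare_digest(tag, expected)

# Step 6: client talks to the service directly, no further handshake.
payload, tag = seal(key, conn_id, b"request")
```

A real deployment would use authenticated encryption rather than a bare MAC, but the trust argument is the same: whoever lacks the per-connection key can neither read nor forge step-6 traffic.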
~\\
We will now follow a detailed bottom-up approach at designing the protocol.
\section{Transport}
\subsection{Layer 4: UDP tunnel}\index{Transport UDP}
The history of the SCTP protocol tells us that introducing new functionality while being incompatible with existing network infrastructure dooms a protocol to failure. However, it also tells us that a UDP-based protocol can move up to a standalone one, given enough traction. SCTP evolved perhaps too quickly from UDP-based to standalone, firewalls worldwide did not update, and very few applications ended up using it.
As our initial requirement is compatibility with existing infrastructure, our protocol will be based on top of IP, but will include an optional lightweight UDP tunnel among its main components, so that existing infrastructure will have no problems using it.
Using UDP as a lightweight tunnel permits us to use a single socket for transmission and reception of every connection, without having the kernel track every connection for us. Firewalls will permit UDP connections, as the DNS system is based on them, and NATs will continue working as always.
Using UDP also lets us handle everything in user space; the only thing a non-UDP connection needs from the kernel is that it forward packets as-is to the right application.
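The single-socket, user-space design can be sketched as follows: all connections share one UDP socket, and datagrams are demultiplexed by a connection id carried in the packet itself rather than by the kernel's (IP, port) tuple. The 8-byte id prefix is an assumption for illustration, not the real Fenrir packet format:

```python
import socket
import struct

def demultiplex(datagram, handlers):
    """Route a datagram to the handler registered for its connection id."""
    (conn_id,) = struct.unpack_from("!Q", datagram)  # assumed 8-byte id prefix
    handlers[conn_id](datagram[8:])

received = {}
handlers = {
    1: lambda payload: received.setdefault(1, payload),
    2: lambda payload: received.setdefault(2, payload),
}

# One socket serves both "connections"; here we send to ourselves over
# loopback to show that demultiplexing is done entirely in user space.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("127.0.0.1", 0))
sock.settimeout(2)
addr = sock.getsockname()
sock.sendto(struct.pack("!Q", 1) + b"alpha", addr)
sock.sendto(struct.pack("!Q", 2) + b"beta", addr)
for _ in range(2):
    demultiplex(sock.recv(2048), handlers)
sock.close()
```

Because the connection id lives inside the packet, the same dispatch logic would keep working if the tunnel were dropped and packets arrived directly over IP, which is exactly the transitional path described above.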