% !TeX root = Fenrir.tex

\xkcdchapter[0.22]{Validation}{protocol}{Finally Bob arrives, but everyone's already left}
\label{Validation}
\label{Formal Verification}

\lettrine{I}{ntuitively}, formal verification of a protocol is necessary, as it proves the absence of problems in the information flow.

A successful formal verification permits us to spend less energy on the design of the protocol, to quickly test modifications and extensions, and gives us a
structure that our implementation will have to follow.

\section{Formal verification failures}\index{Failed verification!TLS}


We can start from recent SSL/TLS history. After various attacks against different parts of the protocol, researchers finally tried to formally verify
its correctness, as it was clear that the most widely used security protocol could not be left unchecked.

TLS has been verified multiple times\cite{Proverif_TLS}\cite{TLS_1}\cite{TLS_2}\cite{TLS_3}\cite{TLS_4}, especially in the last couple of years, and the result
has always been a positive evaluation.


Although the researchers did a lot of work, and their analyses were correct, a new attack recently came out, called \textbf{3Shake}\cite{3Shake}
because it focuses on the TLS handshake.

The researchers' work was correct and every formal verification passed, but, probably due to the high complexity of the problem, they focused on the core
of TLS, its handshake, and did not consider other extensions. TLS, however, has been extended multiple times since the original specification, and multiple parts
remain optional and were thus not included in the verification.

That turned out to be a big mistake: combining the handshake, connection resumption and rekeying extensions led to a possible hijack
of any user connection.

The problems that let the attack go unnoticed were:
\begin{itemize}
	\item \textbf{Complexity}: only the main parts of the protocol were verified; many extensions, which essentially introduced new key-agreement
	algorithms, were not verified as they were optional.
	\item \textbf{Lax RFC}: due to implementation-defined parts, the specification let programmers derive the session id from
	easy-to-guess or cleartext data, thus making the session id no longer unique.
\end{itemize}

The result is that all the verifications were correct, yet myopic; even by modeling the whole specification this bug might have gone unnoticed, as it
comes from a lax specification.

Due to this, when an RFC or formal description of the Fenrir protocol is created, the advice is to leave no field undescribed or implementation-defined.

\section{Verification}

\lettrine{T}{o} avoid making mistakes in the formal verification, the obvious choice is to use existing tools designed exactly for analyzing protocols and their security
properties.\\
We are talking about \textbf{Proverif}\cite{Proverif}, which uses a variant of the pi calculus to check that one or more properties hold on a given specification.

Handling the complete specification of the protocol would be too complex for Proverif, so we will model a high-level view of the handshakes and of authentication;
authorization will not be modeled, as it can be thought of as integrated into authentication. In general, we will model all the cleartext communication.

As further checks, we will see how the protocol behaves in the various modes after the master key is leaked, so that we can check the perfect forward
secrecy property of the protocol, and how it reacts to a rogue authentication server.

Proverif syntax is very close to pseudocode (as it is based on the pi calculus), so it should be easy to model the various messages and data that need to be
transmitted. These, however, will only be a high-level view of the data, as strictly modeling all the structures would be too much work for Proverif;
the model will still be useful as a guideline to understand which data is derived from previously seen data, and which must be uniquely generated (nonces).
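
As an illustration of this style of modeling, the minimal Proverif sketch below (with our own illustrative names, not part of the Fenrir model itself) describes a message at a high level: an abstract symmetric encryption, a fresh nonce, and a secrecy query, with no byte-level structure.

\begin{verbatim}
(* Minimal sketch of high-level message modeling in Proverif.
   Names (c, senc, payload) are illustrative, not Fenrir's. *)
free c: channel.              (* the public network, controlled by the attacker *)

type key.
type nonce.

(* abstract symmetric encryption: no byte-level structure is modeled *)
fun senc(bitstring, key): bitstring.
reduc forall m: bitstring, k: key; sdec(senc(m, k), k) = m.

free payload: bitstring [private].

(* property to check: the attacker never learns the payload *)
query attacker(payload).

let sender(k: key) =
    new n: nonce;                      (* uniquely generated value *)
    out(c, senc((n, payload), k)).     (* data derived from previously seen data *)

let receiver(k: key) =
    in(c, m: bitstring);
    let (n: nonce, p: bitstring) = sdec(m, k) in 0.

process
    new k: key;                        (* pre-shared key, never sent on c *)
    ( !sender(k) | !receiver(k) )
\end{verbatim}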


\section{The Proverif model}\index{Proverif}


To check the resiliency of the handshakes and guarantee their usability even in the future, when a handshake is broken or a specific algorithm is found to
have vulnerabilities, the model must include an explicitly broken algorithm, and the model must pass the verification when one of the two parties refuses
to use that algorithm.\\
This scenario must be considered to avoid ``downgrade attacks'', which have been exploited multiple times even in TLS. These are attacks where the
attacker forces one or both parties to use older versions of the protocol, through timeouts or message modifications.
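
One possible way to represent such an explicitly broken algorithm, sketched below with our own illustrative names (\texttt{weak\_enc}, \texttt{break}; this is not the Fenrir model itself), is to give the attacker a destructor that strips the encryption without knowing the key; the full model must then show that the secrecy queries still hold when one of the honest parties refuses to negotiate this algorithm.

\begin{verbatim}
(* Sketch of an explicitly broken algorithm: the attacker can remove the
   encryption without the key. Names are illustrative. *)
free c: channel.
type key.

fun weak_enc(bitstring, key): bitstring.
reduc forall m: bitstring, k: key; weak_dec(weak_enc(m, k), k) = m.
(* the "break": anyone can recover the plaintext from the ciphertext alone *)
reduc forall m: bitstring, k: key; break(weak_enc(m, k)) = m.

free secret: bitstring [private].

(* Proverif reports the (expected) attack here; in the full model the same
   query must hold when a party refuses the broken algorithm. *)
query attacker(secret).

process
    new k: key;
    out(c, weak_enc(secret, k))
\end{verbatim}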


The security model that we will use is therefore:
\begin{itemize}
	\item The attacker is the network: it can read, modify, inject or replay any content.
	\item There already exist protocols and versions that are broken yet supported by one (but not both) of the parties.
	\item Symmetric and asymmetric cryptographic security is assumed and not under discussion.
	\item Public key algorithms must support signing and key exchange. Key exchanges remain secure when both public keys leak (but not the private ones).
	\item Master keys can be leaked after the handshake and perfect forward secrecy still holds (see the sketch after this list)\footnote{This will not be applied to the directory-synchronized handshake,
		as it is purposely built without such resilience in mind.}
\end{itemize}
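
The last item, the master-key leak, is usually modeled in Proverif with \emph{phases}: phase 0 is the normal run, phase 1 hands the long-term keys to the attacker, and the secrecy query is asked in phase 1. The sketch below uses a generic signed Diffie-Hellman handshake with illustrative names; it is not Fenrir's actual handshake, only an example of the technique.

\begin{verbatim}
(* Sketch of the master-key leak via phases: phase 0 is the normal run,
   phase 1 leaks the long-term keys. Generic signed DH, not Fenrir's. *)
free c: channel.

type G.
type exponent.
type skey.
type pkey.

const g: G.
fun exp(G, exponent): G.
equation forall x: exponent, y: exponent; exp(exp(g, x), y) = exp(exp(g, y), x).

fun pk(skey): pkey.
fun sign(G, skey): bitstring.
reduc forall m: G, k: skey; checksign(sign(m, k), pk(k)) = m.

fun senc(bitstring, G): bitstring.
reduc forall m: bitstring, k: G; sdec(senc(m, k), k) = m.

free session_secret: bitstring [private].

(* perfect forward secrecy: the secret must stay secret even in phase 1,
   after the long-term keys have been handed to the attacker *)
query attacker(session_secret) phase 1.

let initiator(skI: skey, pkR: pkey) =
    new x: exponent;
    out(c, (exp(g, x), sign(exp(g, x), skI)));
    in(c, (gy: G, sR: bitstring));
    let gy2: G = checksign(sR, pkR) in
    if gy2 = gy then
    out(c, senc(session_secret, exp(gy, x))).

let responder(skR: skey, pkI: pkey) =
    in(c, (gx: G, sI: bitstring));
    let gx2: G = checksign(sI, pkI) in
    if gx2 = gx then
    new y: exponent;
    out(c, (exp(g, y), sign(exp(g, y), skR))).

process
    new skI: skey;
    new skR: skey;
    out(c, (pk(skI), pk(skR)));
    ( !initiator(skI, pk(skR)) | !responder(skR, pk(skI))
    | (phase 1; out(c, (skI, skR))) )
\end{verbatim}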

All Fenrir handshakes are based on asymmetric keys that can sign data and generate shared keys when paired with another public key. This is done
to generalize over algorithms like RSA, where it is also possible to directly encrypt data.

This feature is not common in asymmetric encryption, and in RSA it comes with caveats, too: blindly applying the private-key operation to user-provided data can
end up leaking information about the private key. To avoid such problems, Fenrir will only use signing and secret-key generation on asymmetric algorithms.
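
A minimal sketch of this restriction at the level of the Proverif declarations (our own naming, not the actual Fenrir model): the asymmetric keys only come with signing and key-agreement primitives, and no public-key encryption constructor is declared, so the model cannot accidentally rely on one.

\begin{verbatim}
(* Asymmetric keys restricted to signing and shared-secret generation:
   deliberately, no public-key encryption constructor is declared. *)
type skey.
type pkey.
type sharedkey.

fun pk(skey): pkey.

(* signing *)
fun sign(bitstring, skey): bitstring.
reduc forall m: bitstring, k: skey; checksign(sign(m, k), pk(k)) = m.

(* key agreement: both directions yield the same shared secret *)
fun dh(skey, pkey): sharedkey.
equation forall a: skey, b: skey; dh(a, pk(b)) = dh(b, pk(a)).

process 0
\end{verbatim}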

For completeness, all three handshakes must be present concurrently in our model, so that interactions between them can be considered, even if this might
mean more work for proverif.

Rekeying won't be tested, as it adds too much complexity for Proverif. Rekeying is handled in-band and is a simple signed key exchange that leads to the
generation of a new secret key. The problem with this in TLS was that a different, non-unique session id was generated from user-defined data. Since there is no such
generation or association, the connection id is unique and tied to an encryption key, and everything happens in-band, the key exchange is by definition as secure as
the previously sent data.

\subsection{Model limitations}

For brevity, the specifics of the Proverif model are attached in appendix~\ref{Appendix}.

The Proverif model consists of two files/proofs: one to check whether our rekeying and direct authentication~\ref{Rogue proof}\index{Proof!Direct Authentication}
\index{Direct Authentication!Proof} work as we designed them in the case of a rogue (or compromised) authentication server, and a second to check
whether the handshakes~\ref{Fenrir proof}, their interaction and the perfect forward secrecy are safe and can protect what they are supposed to protect.

The need for multiple models instead of a single, bigger one arises from the complexity of the verification of multiple properties.

This means, however, that the different models must only be split across scenarios that cannot interact with each other, so that we do not repeat errors like forgetting to test session resumption and
rekeying together\cite{3Shake}.


There are also some other limitations imposed by Proverif, like the inability to model the full XOR operation. We thus include a simpler version
of XOR (commutativity and associativity are not considered). The implementation looks a lot like the symmetric encryption one, and it would not be a
good model if the objects being XORed had any structure. However, they actually are both random keys, so there is no way to distinguish any content.
If this assumption were broken, a rogue authentication server might be able to recover the session keys shared between the client and the service.
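
The simplified XOR can be sketched as follows (our own naming); it only keeps the ``undo'' property and, as noted above, mirrors the way symmetric encryption is declared.

\begin{verbatim}
(* Simplified XOR: only the property "a xor b, xored again with b, gives a"
   is kept; commutativity and associativity are deliberately dropped. *)
type key.

fun xor(key, key): key.
reduc forall a: key, b: key; unxor(xor(a, b), b) = a.

(* compare with the usual symmetric encryption declaration: *)
fun senc(bitstring, key): bitstring.
reduc forall m: bitstring, k: key; sdec(senc(m, k), k) = m.

process 0
\end{verbatim}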

Another limitation found in Proverif is the lack of optimization in the if-else branches: if the condition is based on a nonce which is used to model an
enumerator, Proverif will still try to work with all possible enumerations in the else branch, thus increasing complexity. This is quickly resolved, however, by a second
``if'' in the ``else'' branch that captures only the intended condition.
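
A small sketch of this workaround, with the illustrative constants \texttt{MODE\_A} and \texttt{MODE\_B} standing in for the enumerator values: instead of letting the bare ``else'' branch range over every possible value, a second ``if'' pins it to the single value it is meant to handle.

\begin{verbatim}
(* Workaround sketch: restrict the "else" branch with a second "if", so
   Proverif does not explore every possible value of the enumerator there. *)
free c: channel.
type tag.
const MODE_A: tag.
const MODE_B: tag.

let handler(m: tag) =
    if m = MODE_A then
        out(c, MODE_A)
    else
        if m = MODE_B then    (* instead of a bare "else" *)
            out(c, MODE_B).

process
    in(c, m: tag); handler(m)
\end{verbatim}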

Finally, since the computation requires a lot of time, the handshake model will include two independent tests, in order
to get results in a short time (both can be run at the same time if half a day is available). The two tests are independent, as one is aimed at testing the
perfect forward secrecy of the protocol, while the second is aimed at the interaction of the different handshakes when working with the same key.\\
This is necessary as the directory-synchronized handshake does not use a master key to implement PFS, so we cannot leak its key and expect the secrets to
remain secret. Since we cannot leak the key, we need to check the interactions of the handshakes with the same (non-leaked) key again, to be sure that there is no
interference.