There are TCP options that we can use to avoid this problem, but they make the TCP header bigger and are not a requirement for the implementation.
Fenrir tries to do the same as QUIC in regard to the number of packets required to manage a session, but extends the same concept to encryption and authentication.
The main difference is that QUIC seems to be a way to bring TLS to a lower level with a focus on latency, whereas Fenrir focuses a lot more on security and flexibility.
Header & packet size
By having integrated encryption and authentication both QUIC and Fenrir do not need redundant identifiers for the connection, and can optimize the packet structure as a whole.
The result is that you can send encrypted and authenticated messages with Fenrir in just about 30-40 bytes.
QUIC is more difficult to understand, since basically all of its header fields are optional and of variable length. As a rough estimate, a QUIC header can range from 9 bytes to 60 bytes.
QUIC format can sometimes be more efficient than Fenrir, but at the cost of a more difficult parser.
Fenrir also includes the option of multiple alignments, while in QUIC everything is byte-aligned.
In QUIC all multiplexed streams are ordered, reliable streams. There currently is a draft to introduce unreliable transport, but it is not in the main documentation yet.
Because it has only reliable streams, QUIC creates streams implicitly: you just send the data on a new stream, without announcing the stream creation.
In Fenrir you need to explicitly create each stream, but you can choose any combination of (un)reliable, (un)ordered delivery, both stream and datagram.
Fenrir is also the first protocol to support both unicast and multicast connections tied together. This means that while you send data on a multicast stream you can send additional recovery data or lost packets on a normal unicast stream, and have Fenrir combine them, obtaining something akin to reliable multicast.
The connection setup is a 4-way handshake with cookies to avoid SYN flood attacks, much like SCTP. Fenrir uses a similar structure.
QUIC supports a 0-RTT reconnection. Fenrir does not do less than 1-RTT by design. But maintaining long timeouts for the connections will have the same result. 0-RTT connections are avoided in Fenrir as they leave way too many options for amplification attacks.
While QUIC supports 0-RTT connections, applications need to be designed to avoid amplification attacks by explicitly requiring an additional round trip before sending a lot of data. Relying on applications to implement the correct security measures for such a feature is bound to create a lot of amplification attacks.
Here we can see what we meant before by "latency vs security". To reduce latency, QUIC provides neither full PFS (next section) nor full protection against replay attacks. As you can read in the QUIC Crypto document:
QUIC doesn’t provide replay protection for the client’s data prior to the server’s first reply. It’s up to the application to ensure that any such information is safe if replayed by an attacker.
This is the same old mistake of pushing security details onto the developers. Developers will (unknowingly) ignore them, until everyone uses a common framework that limits QUIC features to the safest subset. The same thing happened with OAuth.
Perfect forward secrecy
PFS is mandatory in both Fenrir and QUIC. QUIC changes the ephemeral public key every once in a while, so more authentications are done with the same public key. Fenrir instead has multiple handshakes that provide multiple levels of security.
This enables QUIC to have one RTT less than Fenrir, but the connection is not totally isolated.
As long as the ephemeral public key is changed quickly (let's say, every couple of hours at most) this seems a good idea, and Fenrir implements it with the Stateful and Directory synchronized handshakes.
Another example of latency vs security. To decrease latency, the initial data sent by the client is NOT protected by PFS. Instead, PFS is set up at the first packet sent by the server. The client can, however, still send non-PFS data in this period of time. The first bytes of a connection might contain login data or other sensitive information, and I see no reason to treat them differently from the rest, so in Fenrir such behaviour is avoided by design.
PFS on "resumed" connections
By "resumed" here we merely mean that the client connected once, closed the connection, and then opened a new one not much later.
QUIC simply tries to guess that the last public key used is still in use, so it hopes for a 0-RTT reconnection, and has a fallback to 1-RTT reconnection.
Connection resumption is not a thing in Fenrir (yet), but the directory-synchronized method could implement a similar mechanism, while avoiding 0-RTT.
Forward Error Correction
This was a very nice improvement of QUIC over previous protocols:
Basically, every 2 (or more) packets QUIC sends another packet which is the XOR of the previous packets (like RAID-4). So if one of them is lost, the receiver can reconstruct it without asking for a retransmission.
This is actually a good idea, and I decided to learn from it and put it into Fenrir. However, it is not flexible enough, so I created the libRaptorQ project, which implements the RaptorQ algorithm and generalizes the same concept to any number of source packets and repair packets. It's not as quick as a single XOR, but it is much more flexible.
FEC was later removed from QUIC, on the grounds that it could not provide enough protection to really matter. In my experiments, RaptorQ is instead a huge advantage when the network loses a lot of packets (5% and more), and an even bigger advantage on high-loss, high-latency connections.
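The single-XOR scheme described above can be sketched in a few lines. This is only an illustration of the idea, not QUIC's wire format or libRaptorQ's API: the function names are made up, and the packets are assumed to be padded to the same length.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: List[bytes]) -> bytes:
    """Build one repair packet as the XOR of all source packets.

    Assumption: every packet in the group has the same length.
    """
    return reduce(xor_bytes, packets)


def recover(received: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild at most one missing packet (marked as None) from the parity."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("a single XOR parity can repair only one loss")
    repaired = list(received)
    if missing:
        present = [p for p in received if p is not None]
        # parity XOR (all surviving packets) == the lost packet
        repaired[missing[0]] = reduce(xor_bytes, present, parity)
    return repaired


group = [b"pkt-one!", b"pkt-two!", b"pkt-3333"]
parity = make_parity(group)
damaged = [group[0], None, group[2]]   # the second packet was lost
restored = recover(damaged, parity)
```

Losing two packets of the same group makes `recover` raise, which is exactly the inflexibility that a code such as RaptorQ removes by allowing any number of repair packets.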
Both QUIC and Fenrir have a variable number of streams to be used in a connection, and both use a 32-bit connection identifier at the beginning of the packet. (in QUIC the id length is actually variable)
QUIC stream identifiers are actually larger (1-8 bytes) than Fenrir's (2 bytes), but today's applications really use just one stream (TCP) for everything, so 65534 streams should be enough. Even assuming they are not, introducing another multiplexing layer before the application is trivial. Getting proper prioritization out of all those streams is a separate problem in itself. Fenrir might move to 64-bit connection identifiers, but currently I don't think it would be a big improvement.
QUIC uses a 64-bit sequence number for its streams, while Fenrir has a 30-bit one, like extended TCP. Having a bigger sequence number is actually a good thing, as it helps a lot on high-latency, high-bandwidth connections, such as satellite ones, but we hope 30 bits will be enough. Putting together the stream multiplexing and the 30-bit counter already lets us have connections in the range of Tb/s over satellite links.
This has some limitations, however. It means that with a 30-bit offset we can't send more than 1 GB of data without an acknowledgment.
1 GB should be enough, as reaching it means that the sender has to saturate a Gbit connection while the receiver can not send even one packet per second. As this situation seems a little extreme, I don't think it makes much sense to support it.
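To put the 30-bit window in numbers, here is a small back-of-the-envelope calculation. It is a sketch under stated assumptions, not a measurement: the 0.6 s round trip is an assumed figure for a geostationary satellite link, each stream is assumed to have its own 2^30-byte window, and the function names are purely illustrative.

```python
WINDOW_BYTES = 2 ** 30   # 30-bit offset: at most 2^30 bytes unacknowledged
MAX_STREAMS = 65534      # stream count quoted in the text
RTT_SECONDS = 0.6        # assumption: geostationary satellite round trip


def max_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput when one full window is sent per round trip."""
    return window_bytes * 8 / rtt_seconds


def seconds_to_fill(window_bytes: int, link_bps: float) -> float:
    """Time a sender needs to fill the whole window at a given link speed."""
    return window_bytes * 8 / link_bps


per_stream = max_throughput_bps(WINDOW_BYTES, RTT_SECONDS)
all_streams = per_stream * MAX_STREAMS
fill_time = seconds_to_fill(WINDOW_BYTES, 1e9)

print(f"per stream:  {per_stream / 1e9:.1f} Gbit/s")
print(f"all streams: {all_streams / 1e12:.0f} Tbit/s")
print(f"time to fill the window at 1 Gbit/s: {fill_time:.1f} s")
```

Under these assumptions a single stream tops out at roughly 14 Gbit/s, all streams together reach the hundreds of Tbit/s, and a sender on a 1 Gbit/s link needs several seconds without any acknowledgment before the window runs out.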
Does the 30-bit offset mean that the maximum message size is 2^30 bytes = 1 GB? Not really. That is the maximum window size. As soon as the initial part of the message is complete, we can put it into a buffer and wait for the rest, and we can keep doing this as many times as we want.
Of course, you should not expect Fenrir to handle a multi-GB buffer. The message will be truncated at a certain (user-defined) length and a partial message will be delivered to the user application. Big messages are usually just file transfers anyway, so chunking them does not cause any problems. And do you really want Fenrir to keep a 2^64 buffer?
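The buffering and partial delivery described above could look roughly like the following. This is a hypothetical sketch, not Fenrir's implementation: the `Reassembler` class and its `max_chunk` parameter are invented for illustration, and segments are assumed never to overlap.

```python
from typing import Dict, List


class Reassembler:
    """Collect out-of-order segments and hand over fixed-size partial messages."""

    def __init__(self, max_chunk: int) -> None:
        self.max_chunk = max_chunk           # user-defined delivery size
        self.next_offset = 0                 # first byte not yet contiguous
        self.pending: Dict[int, bytes] = {}  # out-of-order segments by offset
        self.buffer = bytearray()            # contiguous, undelivered bytes

    def receive(self, offset: int, data: bytes) -> List[bytes]:
        """Store a segment and return any chunks that are ready for the user."""
        self.pending[offset] = data
        # Move every segment that extends the contiguous prefix into the buffer.
        while self.next_offset in self.pending:
            segment = self.pending.pop(self.next_offset)
            self.buffer += segment
            self.next_offset += len(segment)
        # Deliver the prefix in chunks so the buffer never has to grow unbounded.
        chunks = []
        while len(self.buffer) >= self.max_chunk:
            chunks.append(bytes(self.buffer[:self.max_chunk]))
            del self.buffer[:self.max_chunk]
        return chunks


r = Reassembler(max_chunk=4)
first = r.receive(4, b"efgh")    # arrives early: nothing contiguous yet
second = r.receive(0, b"abcd")   # fills the gap: both chunks become available
```

The point of the design is that memory use is bounded by the window plus one chunk, no matter how long the whole message is.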
Proof of ownership
NOTE: I can't find this part anymore in the QUIC draft, was it removed?
This means that we want to be sure that the packet has not been spoofed.
QUIC uses the usual RTT method to be sure of IP ownership, but also another, peculiar method: it makes a specific bit in the stream field random, and the two endpoints continually exchange the hash of these random bits. This means that they run into problems when a packet is dropped or lost.
Fenrir only makes sure that at least one RTT has been completed before "trusting" an IP address. A message is sent to the new IP, and the answer can come back from any IP.
The difference here is that QUIC tries to do continuous IP-ownership checking. This is however already present in the form of the receiving transmission window.
The only sensible limit is a timeout on the last packet received for that IP. But the receiving window would be exhausted before the timeout anyway, so the only sensible option seems to be dropping an IP from the pool of "trusted" IPs once the timeout associated with 2 receiving windows has expired.
Every protocol that wants to provide reliability has to include some form of acknowledgment (ACK).
In QUIC ACKs are the same as in TCP. The whole packet is ACK'd, using the packet sequence number.
The SCTP and Fenrir way of doing ACKs is not only to acknowledge the last stream offset, but to send acknowledgments also for incomplete chunks (SACK).
So if we received bytes 1...10 and 15...30 of a message, because the bytes in between were lost, we acknowledge both the 1...10 and 15...30 chunks in a single message. This helps the sender retransmit only the part that is missing, so retransmissions should produce far fewer duplicate packets.
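The example above can be made concrete with a small sketch that merges the received byte ranges into acknowledged chunks and derives what the sender still has to retransmit. The helper names and the inclusive `(first, last)` range representation are assumptions for illustration, not Fenrir's or SCTP's actual SACK encoding.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # inclusive (first_byte, last_byte)


def merge_ranges(ranges: List[Range]) -> List[Range]:
    """Coalesce overlapping or adjacent inclusive byte ranges."""
    merged: List[Range] = []
    for first, last in sorted(ranges):
        if merged and first <= merged[-1][1] + 1:
            # Touches or overlaps the previous block: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], last))
        else:
            merged.append((first, last))
    return merged


def holes(acked: List[Range]) -> List[Range]:
    """Byte ranges between consecutive acknowledged blocks (still missing)."""
    return [(a_last + 1, b_first - 1)
            for (_, a_last), (b_first, _) in zip(acked, acked[1:])]


# Segments as they arrived, out of order.
received = [(15, 22), (1, 10), (23, 30)]
sack_blocks = merge_ranges(received)   # what the receiver acknowledges
missing = holes(sack_blocks)           # what the sender must resend
```

With a plain cumulative ACK the sender would only learn that bytes up to 10 arrived and might resend everything after that; with the two blocks it can resend just the missing range.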
NACK support is being developed at the moment.
A second, new ACK type is under development: it is based on the Forward Error Correction idea (see earlier section), where we only need to know how many repair symbols the receiver needs in order to recover the lost data, without listing all the holes in the received message. This should be much more efficient, but it brings some problems due to the probabilistic nature of the FEC algorithm.