An analysis of TCP secure SN generation in Linux and its privacy issues

The standards of the TCP/IP protocol suite recommend that operating systems include a timer when generating the initial sequence number (ISN) for a new TCP connection. The ISN is the first sequence number sent by either party in a TCP connection. The timer helps the operating system detect stale segments from old incarnations of the same connection by making ISNs that are generated consecutively and close together in time monotonically increasing. (Monotonicity holds only over short intervals: in Linux the 4-byte ISN is taken from the low-order bits of a larger 64-bit number, so the value periodically wraps around and a later ISN can be numerically smaller than an earlier one.) The formula proposed in RFC 6528 for creating an ISN is as follows:

ISN = M + F(localip, localport, remoteip, remoteport, secretkey)

where M is the timer value in question. The timer should not be so slow that two connections can be initiated between two consecutive ticks. The Linux kernel uses a timer with a 64 ns tick for this purpose, which is fast enough to prevent duplicate instances of the same connection from being accepted in most cases. The primary goal of the timer is to improve protocol efficiency and implementation performance.

This feature, however, can harm the anonymity of users. The load on the CPU influences its temperature, which in turn affects the crystal oscillator on which the accuracy of the hardware clock depends. The result is a change in the clock skew of the timer, which has been shown to be remotely detectable [1]. Clock skew in ISNs can in theory be used to deanonymize users [2][3].

To see the chain of ISNs affected by the 64 ns timer in Linux, I wrote a small program that initiates multiple connections from a constant source port to a remote server. This can happen in real-world scenarios, and malicious code may also make the OS do it. (Sometimes malware cannot reveal the identity and location of the victim by itself, for example when the user is running Whonix, unless it mounts a side-channel attack like this one.) The chart of the generated ISNs (the horizontal axis is the connection number) shows the result of running the program against the current implementation in the Linux kernel.
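As a rough illustration of the RFC 6528 formula with a 64 ns tick, here is a minimal user-space model in Python. It is a sketch, not the kernel's code: the keyed hash F is modelled with HMAC-SHA256 purely as a stand-in, and the names `keyed_hash`, `timer_isn` and `SECRET_KEY` are hypothetical.

```python
import hashlib
import hmac
import os
import socket
import struct

TICK_NS = 64            # timer granularity described in the text
ISN_MASK = 0xFFFFFFFF   # ISNs are 32-bit values

SECRET_KEY = os.urandom(16)  # hypothetical per-boot secret


def keyed_hash(local_ip, local_port, remote_ip, remote_port, key=SECRET_KEY):
    """F(...) from the formula; HMAC-SHA256 is an assumption, not the kernel's hash."""
    msg = (socket.inet_aton(local_ip) + struct.pack("!H", local_port)
           + socket.inet_aton(remote_ip) + struct.pack("!H", remote_port))
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def timer_isn(now_ns, local_ip, local_port, remote_ip, remote_port, key=SECRET_KEY):
    """ISN = M + F(...), where M counts 64 ns ticks, truncated to 32 bits."""
    m = now_ns // TICK_NS
    f = keyed_hash(local_ip, local_port, remote_ip, remote_port, key)
    return (m + f) & ISN_MASK


# Time after which the 32-bit ISN space wraps around with a 64 ns tick.
WRAP_SECONDS = (2 ** 32) * TICK_NS / 1e9   # roughly 274.9 seconds
```

For a fixed 4-tuple, F is constant, so the difference between two ISNs in this model is exactly the number of 64 ns ticks that elapsed between them (modulo 2^32). That property is what the rest of the post relies on.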

The sequence numbers are indeed the way they are supposed to be: monotonic. But since the difference between each pair is driven by the system timer, the differences can be correlated with the real time elapsed between the observations of the two ISNs. The differences between consecutive pairs in this case are as follows.

These differences show the number of 64 ns time slices between ISNs. An adversary who has access to the victim's network traffic can fairly easily compare them with the real time differences observed between the packets. By changing the load on the CPU, the adversary can detect a change in the pattern of differences and then verify with high precision whether this is the user initiating the new connections, effectively breaking anonymity. Even when the user is behind a gateway firewall such as the Whonix gateway, it is still possible to change the load on the gateway and observe the clock skew. The differences listed above are well within a reasonable time range to be used in an adversarial calculation. They are the result of only 10 consecutive connections to a specific host. More connections increase the accuracy of the verification and significantly reduce the chance of false positives.
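The comparison described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the measurement method of the cited papers: it assumes all ISNs belong to the same 4-tuple, that consecutive connections are less than one wrap period (about 275 seconds) apart, and it ignores network jitter. The function names are hypothetical and the input data is synthetic.

```python
TICK_NS = 64
ISN_MODULUS = 2 ** 32


def isn_deltas_to_ns(isns):
    """Sender-clock time (ns) implied by consecutive ISNs of one 4-tuple."""
    return [((b - a) % ISN_MODULUS) * TICK_NS for a, b in zip(isns, isns[1:])]


def estimate_skew_ppm(isns, observed_ns):
    """Rate difference between sender clock and observer clock, in ppm.

    observed_ns holds the observer's timestamps (ns) for each SYN.
    """
    sender_elapsed = sum(isn_deltas_to_ns(isns))
    observer_elapsed = observed_ns[-1] - observed_ns[0]
    return (sender_elapsed - observer_elapsed) / observer_elapsed * 1e6


# Synthetic data: one SYN per second from a sender whose clock runs 50 ppm fast.
observed = [i * 1_000_000_000 for i in range(5)]
isns = [(12345 + (t + t * 50 // 1_000_000) // TICK_NS) % ISN_MODULUS
        for t in observed]
print(estimate_skew_ppm(isns, observed))  # close to 50
```

An adversary would repeat such an estimate while the CPU load is varied and look for a matching shift in the skew. A real analysis needs many more samples and a way to filter out network delay variation.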

Solution

To cope with this problem we need to get rid of the timer completely. For our tests we chose Linux and wrote a kernel patch. Here I briefly explain how Linux behaves in this respect and the results of patching the kernel to eliminate the problem. Normally, Linux checks for old duplicates while a connection is in the TIME_WAIT state, following the execution flow below (it may differ slightly between kernel versions):

---> net/ipv4/af_inet.c: init_inet--------------
                                                |
     net/ipv4/protocol.c: inet_add_protocol    <-
                  .
                  .
                  .
---> net/ipv4/tcp_ipv4.c: tcp_v4_rcv --------
                                             |
     net/ipv4/tcp_minisocks.c           <----
                   .
                   .
if (th->syn && !th->rst && !th->ack && !paws_reject &&
        (after(TCP_SKB_CB(skb)->seq, tcptw->tw_rcv_nxt) ||
                   .
                   .
                   .

With the timer removed, if an old instance of the same connection has been stuck somewhere on the path and arrives after a later instance, the receiver may mistakenly initiate the connection using the old SYN. First, this is unlikely to happen at all. Second, the cost of recovering from such an error is negligible: it depends on the exact error-handling mechanism implemented in each kernel, but both parties recover from the problematic instance once they have agreed on new sequence numbers. When anonymity is a serious concern, we can ignore this potential efficiency cost. To test this, I changed the kernel's ISN generator so that it creates secure random ISNs independent of any timer. I then ran the same program again and charted the resulting ISNs.
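The idea of a timer-free ISN can be illustrated in user space with a few lines of Python. This is only a sketch of the concept, not the kernel patch itself, whose generator is not shown in this post. The function name `random_isn` is hypothetical, and Python's `secrets` module stands in for the kernel's secure random source.

```python
import secrets

ISN_MODULUS = 2 ** 32


def random_isn():
    """Return a 32-bit ISN drawn from the OS CSPRNG, with no timer component."""
    return secrets.randbits(32)


# Ten ISNs for the same 4-tuple and the differences between consecutive pairs.
isns = [random_isn() for _ in range(10)]
deltas = [(b - a) % ISN_MODULUS for a, b in zip(isns, isns[1:])]
print(deltas)
```

Because each value is drawn independently, the differences carry no information about how much time passed between connections, which is the property the patched kernel is meant to provide.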

In this case there is no meaningful pattern in the observed ISNs, and the differences between them cannot be related to the actual transmission times of the packets. This addresses the anonymity flaw in the way TCP ISNs are currently created in different operating systems.

The Cost

This kernel patch adds no time cost to generating a secure initial sequence number.

Usage

This patch adds a new algorithm for creating TCP initial sequence numbers and does not replace the original one. The code is currently under test and has not yet been added to the official kernel. Once it is applied, switching between the old and new behavior is as easy as changing a sysctl option in Linux. It is up to users whether they want the kernel's regular ISN generation or the new one. We keep the option turned off at boot so users can activate it when necessary. It is of course also possible to turn the option on by default on operating systems that demand high privacy, for example with a user-space boot script.

[1] https://dl.acm.org/citation.cfm?id=1180410
[2] https://phabricator.whonix.org/T543
[3] https://trac.torproject.org/projects/tor/ticket/16659
