New research from MIT (Massachusetts Institute of Technology) shows how malicious Tor entry guards can strip away the Dark Web’s anonymity features, exposing users and the hidden websites they visit.
Tor is a sophisticated anonymity tool that can be used for browsing the regular World Wide Web or the so-called Dark Web.
On the regular web the locations of the websites are public but the Tor user is anonymous. On the Dark Web both the user and the websites they visit are anonymous and neither party knows the true IP address of the other.
That’s because between the user and the hidden service there’s a chain of computers, known as a Tor circuit, that uses encryption to hide the two ends of the connection from each other (and anyone trying to eavesdrop).
The attack outlined in the paper Circuit Fingerprinting Attacks: Passive Deanonymization of Tor Hidden Services comes in two parts and is performed from an entry guard – the first computer in the circuit.
Entry guard status is bestowed upon relays in the Tor network that offer plenty of bandwidth and demonstrate reliable uptime for a few days or weeks. To become one, an attacker only needs to join the network as a relay, keep their head down and wait.
Once they become an entry guard, an attacker can examine the stream of data passing through their node or nodes.
Only a small proportion of that data is actually hidden services traffic and the MIT researchers’ first insight was to find characteristics that allowed them to separate that valuable stream from everything else.
... paths established through the Tor network, used to communicate with hidden services exhibit a very different behavior compared to a general circuit. We found that we can identify the users’ involvement with hidden services with more than 98% true positive rate and less than 0.1% false positive rate with the first attack.
The attacker can now focus their efforts to deanonymise users and hidden services on a much smaller amount of traffic.
The next step is to observe the traffic and identify what’s going on inside it – something the researchers achieved with a technique called website fingerprinting.
Because each web page is different, the network traffic it generates as it’s downloaded is different too. Even if you can’t see the content inside the traffic, you can identify the page from the way it passes through the network if you’ve seen it before.
The MIT researchers visited the home pages of a number of different hidden services and used the traffic that was generated to train a computer program to identify them.
... [the attacker] trains a supervised classifier with many identifying features of a network traffic of a website, such as the sequences of packets, size of the packets, and inter-packet timings. Using the model built from the samples, the attacker then attempts to classify the network traces of users on the live network.
Once it had been trained, the computer program was able to tell if a user was visiting a hidden site that it had learned to identify:
... we show that we can ... correctly deanonymize 50 monitored hidden service servers with true positive rate of 88% and false positive rate of 7.8% in an open world setting.
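To make the idea concrete, here is a minimal sketch of website fingerprinting in Python. It is not the researchers' classifier: it is a simplified nearest-centroid stand-in, and the trace format, the function names (features, train, classify) and the max_distance cut-off are all assumptions made for illustration. It only uses the kinds of features named in the quote above (packet counts, packet sizes and inter-packet timings), and it answers "unknown" for traffic that doesn't resemble any page it was trained on, which is a rough nod to the open world setting.

```python
import math


def features(trace):
    """Turn a trace into a small feature vector.

    trace: list of (timestamp_seconds, signed_size) pairs, where a positive
    size is an outgoing packet and a negative size is an incoming one.
    This trace format is an assumption made for this sketch.
    """
    sizes = [size for _, size in trace]
    times = [stamp for stamp, _ in trace]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return [
        len(trace),                               # number of packets
        sum(s for s in sizes if s > 0),           # bytes sent
        -sum(s for s in sizes if s < 0),          # bytes received
        sum(gaps) / len(gaps) if gaps else 0.0,   # mean inter-packet gap
    ]


def train(labelled_traces):
    """labelled_traces: dict of page label -> list of example traces.

    Returns one averaged feature vector (centroid) per label.
    """
    centroids = {}
    for label, traces in labelled_traces.items():
        vectors = [features(t) for t in traces]
        centroids[label] = [sum(col) / len(col) for col in zip(*vectors)]
    return centroids


def classify(centroids, trace, max_distance):
    """Return the closest known label, or "unknown" if nothing is close.

    Features are left unscaled to keep the sketch short; a real classifier
    would normalise them so that byte counts don't dominate.
    """
    vector = features(trace)
    best_label, best_distance = "unknown", math.inf
    for label, centroid in centroids.items():
        distance = math.dist(vector, centroid)
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label if best_distance <= max_distance else "unknown"
```

The attacker in this sketch would call train once on traces they generated themselves by visiting the pages they care about, then call classify on each trace seen at the entry guard.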
An attacker who is trying to target a specific user also needs something else: a slice of luck, a lot of patience or a data centre full of Tor relays.
That’s because when your target’s Tor browser creates a circuit it chooses an entry guard at random from a pool of a few thousand. Of course, the more entry guards you control, the luckier you can be.
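The arithmetic behind that luck is simple enough to sketch. The Python below assumes, as described above, that a guard is picked uniformly at random from the pool; real Tor clients weight their choice by relay bandwidth and keep the same guard for a long time, so treat the result as a rough illustration only. The function name and the example numbers are hypothetical.

```python
def chance_of_guarding_target(attacker_guards, total_guards):
    """Probability that a single, uniformly random guard choice lands on
    one of the attacker's relays.

    Assumes uniform selection (a simplification: Tor weights relays by
    bandwidth), so the answer is just the attacker's share of the pool.
    """
    if total_guards <= 0 or not 0 <= attacker_guards <= total_guards:
        raise ValueError("need 0 <= attacker_guards <= total_guards, total_guards > 0")
    return attacker_guards / total_guards


# Hypothetical numbers: a pool of 3,000 guards, of which the attacker runs 1 or 30.
one_relay = chance_of_guarding_target(1, 3000)
thirty_relays = chance_of_guarding_target(30, 3000)
print(f"1 relay: {one_relay:.4%}, 30 relays: {thirty_relays:.4%}")
```

Under these assumptions the odds scale linearly with the number of guards the attacker runs, which is why a data centre full of relays is a substitute for luck.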
The Tor project has responded to the coverage generated by the research with an article of its own written by Roger Dingledine, Tor’s project leader and one of the project’s original developers.
Dingledine uses his platform to welcome the research and to put the dangers the attack poses into perspective.
He highlights the element of chance that’s involved (“…they hope to get lucky and end up operating the entry guard for the Tor user they’re trying to target”) and the real-world difficulties of website fingerprinting attacks.
Fingerprinting home pages is all well and good, he suggests, but hidden services aren’t just home pages:
...is their website fingerprinting classifier actually accurate in practice? They consider a world of 1000 front pages, but ahmia.fi and other onion-space crawlers have found millions of pages by looking beyond front pages. Their 2.9% false positive rate becomes enormous in the face of this many pages—and the result is that the vast majority of the classification guesses will be mistakes.
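Dingledine's point is a base rate argument, and a few lines of Python show why the numbers turn against the attacker. The sketch below takes the 88% true positive rate and 2.9% false positive rate from the quotes, and assumes 50 monitored pages, a world of either 1,000 or 1,000,000 pages, and that every page is visited equally often. Those last three are assumptions for illustration, not figures from the paper or from Tor.

```python
def precision(monitored, total_pages, tpr, fpr):
    """Fraction of "this is a monitored page" guesses that are correct.

    Assumes every page in the world is visited equally often, which is a
    strong simplification made only to keep the arithmetic visible.
    """
    if not 0 < monitored <= total_pages:
        raise ValueError("need 0 < monitored <= total_pages")
    true_positives = monitored * tpr                    # monitored pages correctly flagged
    false_positives = (total_pages - monitored) * fpr   # other pages wrongly flagged
    if true_positives + false_positives == 0:
        raise ValueError("classifier never flags anything")
    return true_positives / (true_positives + false_positives)


# Hypothetical worlds: 1,000 pages versus 1,000,000 pages, 50 of them monitored.
small_world = precision(50, 1_000, tpr=0.88, fpr=0.029)
large_world = precision(50, 1_000_000, tpr=0.88, fpr=0.029)
print(f"1,000 pages: {small_world:.1%} of guesses correct")
print(f"1,000,000 pages: {large_world:.2%} of guesses correct")
```

With 1,000 pages most guesses are right; with a million pages well under 1% of them are, because the unmonitored pages vastly outnumber the monitored ones and even a small false positive rate swamps the true hits.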
So, there are certainly limits to the attack proposed in the research paper and, unlike the poisoned Tor exit nodes I wrote about last month, this attack has not been observed in the wild.
But what sets this research apart is that it only needs an entry guard.
There have been real world instances of traffic confirmation attacks that make use of an entry guard and an exit node together.
Indeed, there are concerns within the Tor development community that network nodes may have been compromised during Operation Onymous, the global law enforcement action that took down 400 hidden services including Silk Road 2.0.
Given the difficulty of hacking Tor, the seriousness of the crimes that it’s used to mask and the privileged position that entry guards occupy, it would be a surprise if governments and law enforcement didn’t show a great deal of interest in them.
Eight years ago researcher Dan Egerstad set up five Tor exit nodes of his own to demonstrate how useful they can be if you want to spy on people.
He used them to harvest thousands of emails and messages from embassies in Australia, Japan, Iran, India and Russia, as well as the Iranian Foreign Ministry and the Indian Ministry of Defence.
He was running exit nodes rather than entry guards but his conclusion applies to both – he was convinced (although he provided no proof of it) that governments would surely be running or spying on Tor relays too:
I am absolutely positive that I am not the only one to figure this out ... I'm pretty sure there are governments doing the exact same thing. There's probably a reason why people are volunteering to set up a node.