What is Tor and how can we detect Tor users?

Tor is commonly used to anonymise internet traffic. The core ethos behind Tor is that internet users should have private access to an uncensored web. That same anonymity makes it attractive for illegal activities, such as purchasing weapons or drugs. These illegal sites, known collectively as the “dark web”, can only be reached through Tor using special .onion links – for example, the original Silk Road URL was silkroad6ownowfk.onion. On the public internet, Tor is often used by credit card fraudsters to hide their true identity when they’re using stolen card information.

Due to a large amount of illegal use, some internet providers and websites block access from Tor. We’ll explore how to detect Tor traffic, how to block it, and analyse whether blocking is a good approach.

How does Tor work?

Tor uses a process called “onion routing”, and its name was originally an acronym for “The Onion Router”. Onion routing was developed by the U.S. Naval Research Lab and is named “onion” due to the layers of privacy and encryption.

To help explain it, let’s look at a simplified diagram of how your computer connects to a website.

Connecting to a website without a VPN or Tor

When you open a website, your real IP address is exposed, which can be used to determine your location, ISP and much more. Using HTTPS means the contents of your connection are encrypted, so your data is safe from interception, snooping, or tampering. However, your ISP can still see which website you’re connecting to and can block or redirect the connection, which is why ISPs are often asked to block websites such as The Pirate Bay.

To get around this, some users will use a VPN, which sits between your computer and the website, helping to hide your real IP and identity.

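Tor is like chaining together multiple VPNs. Your laptop connects to a Tor entry node, which passes your traffic on to a Tor relay, and so on until it reaches an exit node (a circuit is typically three hops in total). Before sending anything, your Tor client wraps the data in one layer of encryption per hop. Each node removes only its own layer, so no single node knows both who you are and what you’re requesting.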

A Tor exit node ends up connecting to the website you want to visit. Only the Tor entry node knows your real IP address.
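To make the layering concrete, here is a minimal sketch of the idea using Node’s built-in crypto module. It is not Tor’s actual cell format, key exchange or cipher suite; the helper names (addLayer, peelLayer, wrapForCircuit, relayThroughCircuit) and the choice of AES-256-GCM are assumptions made purely for illustration.

```javascript
const crypto = require("crypto");

// Hypothetical helper: encrypt `data` under `key` with AES-256-GCM.
// Output layout: 12-byte IV | 16-byte auth tag | ciphertext.
function addLayer(key, data) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv("aes-256-gcm", key, iv);
  const body = Buffer.concat([cipher.update(data), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), body]);
}

// Hypothetical helper: remove one layer using the matching key.
// Throws if the key is wrong or the data was tampered with.
function peelLayer(key, layer) {
  const iv = layer.subarray(0, 12);
  const tag = layer.subarray(12, 28);
  const body = layer.subarray(28);
  const decipher = crypto.createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(body), decipher.final()]);
}

// One key per hop, in path order: entry node, middle relay, exit node.
const hopKeys = [0, 1, 2].map(() => crypto.randomBytes(32));

// The client encrypts for the exit node first and the entry node last,
// so the entry node's layer ends up on the outside.
function wrapForCircuit(keys, message) {
  return [...keys]
    .reverse()
    .reduce((data, key) => addLayer(key, data), Buffer.from(message));
}

// Each hop peels exactly one layer, in path order.
function relayThroughCircuit(keys, onion) {
  return keys.reduce((data, key) => peelLayer(key, data), onion);
}

const onion = wrapForCircuit(hopKeys, "GET / HTTP/1.1");
const delivered = relayThroughCircuit(hopKeys, onion).toString();
```

In this sketch only the last hop ever sees the plain request, and a node that tries to peel a layer with the wrong key gets an authentication error rather than readable data.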

How can I detect and block Tor traffic?

To detect Tor traffic, we need a list of all of the known IPs of Tor exit nodes. We don’t need any information about the relays or the entry nodes, as they’ll never connect to our website.

Tor publishes an official list of exit node IP addresses. There are fewer than 2000 IPs in that list, so it’s not too difficult to check if the connecting IP is a known Tor exit node. If we had a simple Express application, we could detect and block this traffic using a middleware.

const fs = require("fs");
const express = require("express");
const app = express();

// Download the official list to a txt file, then split it into an array of IP addresses.
// Passing "utf8" makes readFileSync return a string rather than a Buffer.
const torIPs = fs
  .readFileSync("./tor-exit-nodes.txt", "utf8")
  .split("\n")
  .map((line) => line.trim())
  .filter(Boolean);

app.use((req, res, next) => {
  // req.connection is deprecated; req.socket exposes the same remote address
  const ip = req.socket.remoteAddress;
  // Block any known Tor exit node IP addresses
  if (torIPs.includes(ip)) {
    res.status(403).send("Tor is not allowed");
    return;
  }
  next();
});

// Serve the home page
app.get("/", (req, res) =>
  res.sendFile("./index.html", { root: __dirname })
);

app.listen(8000);
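One practical wrinkle with comparing the remote address directly: when Node listens on a dual-stack socket, IPv4 clients are reported in IPv4-mapped form such as ::ffff:192.0.2.10, which will never match a plain IPv4 entry in the list. The sketch below shows one way to handle that, and swaps the array for a Set so lookups stay fast. The helper names (parseExitList, normalizeIp, isTorExit) are hypothetical, and the sample addresses come from the reserved documentation ranges rather than the real list.

```javascript
const net = require("net");

// Hypothetical helper: turn the downloaded list into a Set for constant-time lookups.
// Blank lines and anything that is not a valid IP address are skipped.
function parseExitList(text) {
  return new Set(
    text
      .split(/\r?\n/)
      .map((line) => line.trim())
      .filter((line) => net.isIP(line) !== 0)
  );
}

// Hypothetical helper: strip the "::ffff:" prefix from IPv4-mapped IPv6 addresses
// so they can be compared against plain IPv4 entries.
function normalizeIp(address) {
  if (!address) return "";
  const match = /^::ffff:(\d+\.\d+\.\d+\.\d+)$/i.exec(address);
  return match ? match[1] : address;
}

// True if the (normalised) address is in the exit node set.
function isTorExit(exitNodes, address) {
  return exitNodes.has(normalizeIp(address));
}

// Example data only; these are documentation addresses, not real exit nodes.
const sampleExitNodes = parseExitList("192.0.2.10\n# comment\n\n198.51.100.7\r\n");
```

Inside the middleware, the check would then become something like isTorExit(exitNodes, req.socket.remoteAddress). If your app sits behind a proxy or load balancer, the socket address will be the proxy’s, so you’d need to take the client IP from wherever your proxy puts it instead.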

This approach will catch most Tor connections, but there are also some unofficial exit nodes which aren’t on the published list. ipdata maintains an up-to-date list of the official exit nodes, and combines that with a proprietary list of unofficial Tor IPs. The data is aggregated every 15 minutes and updates every hour, meaning you’ll get the most accurate Tor detection possible.

Let’s update our application to use ipdata’s API instead.

const express = require("express");
const axios = require("axios");
const app = express();

// Get an ipdata API Key from here: https://ipdata.co/sign-up.html
const IPDATA_API_KEY = "test";

// Look up a single IP address using the ipdata API
const getIpData = async (ip) => {
  const response = await axios.get(
    `https://api.ipdata.co/${ip}?api-key=${IPDATA_API_KEY}`
  );
  return response.data;
};

app.use(async (req, res, next) => {
  try {
    // req.connection is deprecated; req.socket exposes the same remote address
    const ip = req.socket.remoteAddress;
    const ipdata = await getIpData(ip);
    // Block any known Tor traffic
    if (ipdata.threat && ipdata.threat.is_tor) {
      res.status(403).send("Tor is not allowed");
      return;
    }
    next();
  } catch (err) {
    // Hand lookup failures to Express instead of leaving the request hanging
    next(err);
  }
});

// Serve the home page
app.get("/", (req, res) =>
  res.sendFile("./index.html", { root: __dirname })
);

app.listen(8000);
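Calling an external API on every request adds latency and uses up quota, so in practice you’d probably want to remember recent answers. Here’s a rough sketch of a small in-memory cache that can wrap any async lookup function. The withCache helper and the 15-minute default are assumptions for illustration, not part of ipdata’s API or recommendations.

```javascript
// Hypothetical helper: wrap an async lookup(ip) function with an in-memory cache.
// `now` is injectable so the expiry logic can be exercised without waiting.
function withCache(lookup, ttlMs = 15 * 60 * 1000, now = Date.now) {
  const cache = new Map(); // ip -> { promise, expires }

  return (ip) => {
    const hit = cache.get(ip);
    if (hit && hit.expires > now()) {
      return hit.promise; // still fresh: reuse the earlier (possibly in-flight) lookup
    }

    // Cache the promise itself so concurrent requests share one lookup
    const promise = Promise.resolve(lookup(ip));
    cache.set(ip, { promise, expires: now() + ttlMs });

    // Don't keep failed lookups around; the next request should retry
    promise.catch(() => {
      const current = cache.get(ip);
      if (current && current.promise === promise) cache.delete(ip);
    });

    return promise;
  };
}

// Usage with the getIpData function from the example above:
//   const getIpDataCached = withCache(getIpData);
//   const ipdata = await getIpDataCached(ip);
```

This version never evicts old entries, so a busy site would want a size limit or an off-the-shelf LRU cache. It’s also per-process, which means each server instance keeps its own copy.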

Should I block Tor traffic?

Now that we’ve shown how to block Tor traffic, it’s important to understand that Tor isn’t only used for illegal activities. Tor can be used to bypass the Great Firewall and other regulating systems to allow the free movement of information – currently all versions of Wikipedia are blocked in China. Users are becoming more privacy-conscious, leading to increased usage of VPNs and Tor. So the question should be, “do I want to risk blocking legitimate traffic?”

If you’re running an e-commerce store, you’ll need to be wary of fraud, so you may want to block Tor. It’s unlikely that your store would be blocked by any firewalls, and unless you’re selling sensitive products, blocking Tor should have minimal impact on your real customers.

If you’re running a news site, blocking Tor could restrict people in certain areas from accessing the information they need. There’s minimal risk of fraud, unless you’re taking payments from your site, so it’d usually be best to allow Tor traffic, but monitor it closely for fraudulent activity.
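If you go the “allow but monitor” route, the middleware can tag Tor requests rather than reject them. The sketch below assumes you already have some isTor(ip) check (a list lookup or an API call result); flagTorTraffic, req.isTor and the onTorRequest callback are hypothetical names chosen for this example.

```javascript
// Hypothetical factory: `isTor(ip)` returns true for Tor addresses, and
// `onTorRequest` is where you would log the event or record a metric.
function flagTorTraffic(isTor, onTorRequest = () => {}) {
  return (req, res, next) => {
    const ip = req.socket.remoteAddress;
    req.isTor = Boolean(isTor(ip));
    if (req.isTor) {
      onTorRequest({ ip, path: req.url, at: new Date().toISOString() });
    }
    // Never blocks: later handlers decide what to do with req.isTor
    next();
  };
}

// Usage in an Express app, assuming a synchronous isTor check:
//   app.use(flagTorTraffic(isTor, (event) => console.log("Tor request", event)));
```

A later route could then read req.isTor and, for example, require extra verification at checkout while leaving articles freely readable.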

Conclusions

Tor traffic can be blocked using a list of IP addresses, or by using the ipdata API, which also exposes other data such as the user’s location and more detailed threat data. However, it’s worth considering whether blocking all Tor traffic is the right option for you.

What’s your experience with Tor? Have you had issues with visitors hitting your website via Tor? Let us know in the comments!
