Masscan is a classic, extremely high-speed network scanner. This article examines its implementation details: how Masscan sends and receives packets directly in user space, how it recognizes responses to its own probes, how it randomizes the target order, and how it uses high-performance network card access.
1. Masscan’s User-Level Sending/Receiving (libpcap)
Masscan does not use the operating system’s full protocol stack. Instead, on Linux it sends and receives raw packets directly through libpcap. This has several important consequences and limitations:
- Bypassing Kernel TCP/IP Stack: Masscan can capture and construct arbitrary raw Ethernet frames, which allows it to achieve extremely high-speed packet sending and fine-grained packet control, but also causes some system-level behaviors to no longer take effect automatically.
- Unexpected Behavior in Gateway/Direct-Connection Scenarios: Two machines connected directly by a network cable and placed in the same subnet should be able to communicate without a gateway. However, because Masscan bypasses the kernel’s routing/forwarding logic, an arbitrarily configured gateway can cause it to fail under certain configurations.
- Source IP Spoofing and ARP: When using `--source-ip` to forge a source address, if the forged address does not have a real MAC in the target network segment, the target host will send an ARP request to resolve the MAC of that IP. Masscan can listen to broadcast ARP to judge whether the target is reachable/alive.
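To make "constructing raw Ethernet frames" concrete, here is a minimal sketch in Python (not Masscan's own C code) that assembles a complete Ethernet + IPv4 + TCP SYN frame by hand, including both checksums the kernel would normally compute. The helper names `checksum16` and `build_syn_frame`, the MAC addresses, and the field defaults (TTL 64, window 1024) are illustrative assumptions.

```python
import socket
import struct


def checksum16(data: bytes) -> int:
    """Internet checksum (RFC 1071): one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_syn_frame(src_mac, dst_mac, src_ip, dst_ip, src_port, dst_port, seq):
    """Return a complete Ethernet + IPv4 + TCP SYN frame as bytes."""
    eth = dst_mac + src_mac + struct.pack("!H", 0x0800)  # EtherType: IPv4

    src = socket.inet_aton(src_ip)
    dst = socket.inet_aton(dst_ip)

    # TCP header: 20 bytes, no options, SYN flag (0x02), checksum patched in below.
    tcp = struct.pack("!HHIIBBHHH",
                      src_port, dst_port, seq, 0, 5 << 4, 0x02, 1024, 0, 0)
    pseudo = src + dst + struct.pack("!BBH", 0, socket.IPPROTO_TCP, len(tcp))
    tcp = tcp[:16] + struct.pack("!H", checksum16(pseudo + tcp)) + tcp[18:]

    # IPv4 header: version 4, IHL 5, TTL 64, protocol TCP, checksum patched in below.
    ip = struct.pack("!BBHHHBBH4s4s",
                     0x45, 0, 20 + len(tcp), 0, 0, 64, socket.IPPROTO_TCP, 0, src, dst)
    ip = ip[:10] + struct.pack("!H", checksum16(ip)) + ip[12:]

    return eth + ip + tcp


frame = build_syn_frame(
    src_mac=b"\x02\x00\x00\x00\x00\x01",   # hypothetical local MAC
    dst_mac=b"\x02\x00\x00\x00\x00\x02",   # hypothetical gateway/target MAC
    src_ip="192.0.2.1", dst_ip="198.51.100.7",
    src_port=40000, dst_port=80, seq=0x12345678,
)
print(len(frame), "bytes")
```

Because the program fills in the destination MAC itself, it also has to know which MAC to use (the target's or the gateway's), which is where the gateway and ARP effects described above come from. Handing the resulting bytes to the network card would be done through libpcap or a similar raw-frame interface; that step is omitted here.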
2. Asynchronous Sending and Receiving -> Syncookie Idea
Masscan is mainly composed of two relatively independent threads/modules internally:
- Sending Thread: Responsible for sending out constructed SYN (or other) packets according to the rate, including rate control and target order (see Section 3).
- Receiving Thread: Listens to all inbound packets on the network card at the user level, responsible for filtering and matching which ones are responses corresponding to requests initiated by Masscan.
Because it sees all packets at the link layer, the receiver needs an efficient way to distinguish “responses belonging to this scan” from “other normal traffic”. Masscan borrows the idea of syncookie:
- When sending a SYN, it hashes the connection elements (the source and destination IP addresses and ports, together with a secret seed) using a hash function (SipHash in the implementation) and takes part of the result (e.g., 32 bits) as the Sequence Number (Seq).
- When a SYN/ACK is received, it recalculates the hash in the same way and checks it against the packet: a genuine reply acknowledges the original Seq plus one, so the acknowledgment number minus one must equal the recomputed value. If it matches, the packet is a response to a Masscan probe. This eliminates the need to maintain a large state table while still identifying return packets efficiently.
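The stateless check described above can be sketched as follows. This is an illustration under stated assumptions, not Masscan's code: Python's standard library does not expose SipHash, so keyed BLAKE2b from `hashlib` stands in for it, and the names `syn_cookie`, `is_our_synack`, and `SEED` are hypothetical.

```python
import hashlib
import socket
import struct

SEED = b"per-scan-secret!"  # hypothetical secret; a real scan would randomize it per run


def syn_cookie(ip_them: str, port_them: int, ip_me: str, port_me: int,
               seed: bytes = SEED) -> int:
    """Derive a 32-bit sequence number from the connection 4-tuple and a secret."""
    msg = (socket.inet_aton(ip_them) + socket.inet_aton(ip_me)
           + struct.pack("!HH", port_them, port_me))
    # Keyed BLAKE2b is a stand-in for SipHash here.
    digest = hashlib.blake2b(msg, digest_size=8, key=seed).digest()
    return int.from_bytes(digest[:4], "big")


def is_our_synack(ip_them: str, port_them: int, ip_me: str, port_me: int,
                  ack_number: int, seed: bytes = SEED) -> bool:
    """A genuine SYN/ACK acknowledges our sequence number plus one (mod 2**32)."""
    expected = (syn_cookie(ip_them, port_them, ip_me, port_me, seed) + 1) & 0xFFFFFFFF
    return ack_number == expected


# Sender side: pick the Seq for a probe to 198.51.100.7:80.
seq = syn_cookie("198.51.100.7", 80, "192.0.2.1", 40000)

# Receiver side: no table lookup, just recompute and compare.
print(is_our_synack("198.51.100.7", 80, "192.0.2.1", 40000, (seq + 1) & 0xFFFFFFFF))
```

The sender and receiver share nothing except the seed, which is why the two threads can run independently without a per-probe state table.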
3. Target Randomization: BlackRock Cipher (Finite Domain Permutation)
Scanning large-scale targets sequentially is very easily detected by protection devices. Masscan’s target randomization uses the BlackRock cipher idea to generate a pseudo-random but non-repeating (bijective) target order.
- Core Idea: Map the target set `[0..N-1]` to `[0..N-1]` through a reversible pseudo-random permutation. The mapping is a bijection, so no index will be missed or repeated, but the sequence looks “random”.
- Implementation Technology: Using a Feistel network (refer to John Black & Phillip Rogaway’s paper Ciphers with Arbitrary Finite Domains), performing multiple rounds of permutation within a finite domain to achieve a reversible and uniform mapping.
- Source Code Reference: Masscan’s implementation can be found in `crypto-blackrock.c` (specific details of implementing Feistel / BlackRock in the project repository). For example, this implementation takes an integer index as input, and through several rounds of mixing/permutation, outputs a pseudo-random index.
Advantage: It can achieve overall scan coverage (no omissions) while presenting a “network-wide randomized” scanning behavior, reducing the risk of detection/rate limiting.
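The finite-domain permutation idea can be sketched in a few lines. The version below follows the general construction from the Black and Rogaway paper (a Feistel network over a domain of size a*b >= N, plus "cycle-walking" to stay inside [0, N)); it is not a port of `crypto-blackrock.c`. The round function is a BLAKE2b-based stand-in, and the names `shuffle_index`, `_feistel`, and `_round_fn` as well as the choice of 4 rounds are assumptions made for illustration.

```python
import hashlib
import math


def _round_fn(round_no: int, value: int, seed: int) -> int:
    """Keyed pseudo-random round function (stand-in; not Masscan's own)."""
    data = f"{seed}:{round_no}:{value}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def _feistel(m: int, a: int, b: int, seed: int, rounds: int = 4) -> int:
    """Permute m within [0, a*b) using an alternating-modulus Feistel network."""
    left, right = m % a, m // a
    for j in range(1, rounds + 1):
        modulus = a if j % 2 else b
        left, right = right, (left + _round_fn(j, right, seed)) % modulus
    return a * left + right if rounds % 2 else a * right + left


def shuffle_index(i: int, n: int, seed: int = 0) -> int:
    """Map index i in [0, n) to a unique pseudo-random index in [0, n)."""
    if not 0 <= i < n:
        raise ValueError("index out of range")
    a = max(1, math.isqrt(n))
    b = -(-n // a)  # ceil(n / a), so a * b >= n
    c = _feistel(i, a, b, seed)
    while c >= n:  # cycle-walk: re-apply until the value lands inside [0, n)
        c = _feistel(c, a, b, seed)
    return c


order = [shuffle_index(i, 10, seed=42) for i in range(10)]
print(order)                               # a scrambled ordering of 0..9
print(sorted(order) == list(range(10)))    # every index appears exactly once
```

A scanner built this way only needs a counter `i` and the seed: it walks `i` from 0 to N-1 and probes target number `shuffle_index(i, N, seed)`, so no shuffled list of targets ever has to be stored in memory.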
4. High-Performance Network Card Access: PF_RING
The conventional Linux network stack has a bottleneck in high packet rate scenarios: the number of data packets that ordinary drivers and kernel paths can handle per second is limited. To solve this problem, Masscan supports direct interaction with high-performance packet capture/sending frameworks such as PF_RING:
- Function: Allows user-space programs to access network card transmit/receive queues more directly and efficiently (bypassing some kernel overhead), significantly increasing the number of data packets that can be sent/received per second.
- Magnitude of Effect: Ordinary drivers may only handle a few million packets per second, while with PF_RING enabled, measured performance can reach higher packet rates (depending on hardware and drivers; rates of tens of millions of packets/second have been reported).
- Reference: The PF_RING project and documentation provided by ntop (the Masscan community has also discussed high-performance packet sending schemes based on this technology).
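To put these packet rates in context, the short calculation below estimates the theoretical ceiling of an Ethernet link for minimum-size frames, which is what a SYN scan sends. The function name `max_packets_per_second` is illustrative; the 20 bytes of per-frame overhead are the standard Ethernet preamble/start delimiter (8 bytes) and inter-frame gap (12 bytes).

```python
def max_packets_per_second(link_bps: float, frame_bytes: int = 64) -> float:
    """Theoretical Ethernet packet rate for a given frame size.

    Each frame also occupies 8 bytes of preamble/start delimiter and a
    12-byte inter-frame gap on the wire, so 20 bytes of overhead are added.
    """
    wire_bits = (frame_bytes + 20) * 8
    return link_bps / wire_bits


for gbps in (1, 10):
    rate = max_packets_per_second(gbps * 1e9)
    print(f"{gbps:>2} GbE: {rate / 1e6:.2f} Mpps")
```

A 10 GbE link therefore tops out at about 14.88 million minimum-size frames per second, so reaching "tens of millions" in total means running close to line rate, typically across more than one port.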
5. Summary and Practical Suggestions
- Masscan is designed to quickly scan a large number of IPs and ports, while Nmap is a comprehensive scanning tool. Their design goals differ, so choose between them based on your requirements.
- Masscan’s design trade-off aims for extreme packet sending rate: realized through user-level raw frames, lightweight stateless verification (similar to syncookie), and high-performance network card access.
- When using Masscan, note:
  - Network card drivers and kernel configurations have a huge impact on performance; for extremely high-speed packet sending, it is recommended to use network cards/drivers that support technologies like PF_RING/AF_XDP.
  - Source address spoofing will trigger ARP, which may affect scan results and observability. Use it with caution and comply with legal and compliance requirements.
  - Target randomization reduces detection risk but does not equal “invisibility”; compliance and security boundaries are always the priority.