In security testing and information gathering, operations such as port scanning, directory scanning, and vulnerability detection often send a large number of requests to the target. Without sensible rate limiting, this traffic not only consumes excessive network bandwidth and system resources but may also trigger firewall/IDS detection and blocking. Rate limiting is how a scanner balances efficiency against stealth. The Token Bucket is a common rate limiting algorithm; it is best known for traffic limiting at gateways, but it works equally well inside scanners.


Principle of Token Bucket Algorithm

The basic idea of the Token Bucket:

  • The system puts tokens into the bucket at a fixed rate.
  • The bucket can hold a maximum number of tokens.
  • A token must be obtained before each request.
  • If there are no tokens in the bucket, the request waits or is discarded.

This achieves smooth rate limiting: the long-run rate never exceeds the refill rate, while short bursts of up to the bucket's capacity are still allowed.

flowchart TD
    A[Start Request] --> B{Token in Bucket?}
    B -- No --> C[Wait or Discard Request]
    B -- Yes --> D[Take a Token]
    D --> E[Request Executed]
    E --> F[End]
    subgraph TokenBucket[Token Bucket Mechanism]
        T1[Add tokens at fixed rate]
        T2[Bucket capacity limit]
        T1 --> T2
        T2 --> B
    end

Application Scenarios in Scanners

1. Port Scanning Rate Limiting

  • Avoid being identified as a Denial of Service attack by the target firewall due to a large number of probe packets in a short time.
  • Maintain a low but stable rate to improve scan success rate.

2. Directory/URL Brute-forcing Rate Limiting

  • Avoid causing significant load on Web services.
  • Keep the number of requests per second at or below a fixed ceiling, which makes the scan less conspicuous.

3. Multi-thread Scheduling Control

  • Scanners often use multi-threaded concurrency.
  • The Token Bucket algorithm can serve as a global scheduler to ensure the total rate of all threads does not exceed the set threshold.

Practice in ProjectDiscovery Scanners

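The rate limiter used across ProjectDiscovery’s scanners is a custom implementation of the token bucket algorithm, a variation on the classic design. See: https://github.com/projectdiscovery/ratelimit. For comparison, the classic token bucket is implemented in the Go team's golang.org/x/time/rate package (maintained by the Go project, though outside the standard library). Its Every function, shown below, converts the minimum interval between events into a rate limit: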

// Every converts a minimum time interval between events to a Limit.
func Every(interval time.Duration) Limit {
	if interval <= 0 {
		return Inf
	}
	return 1 / Limit(interval.Seconds())
}

The core loop of the ProjectDiscovery implementation is as follows (excerpt):

func (limiter *Limiter) run(ctx context.Context) {
	defer close(limiter.tokens)
	for {
		if limiter.count.Load() == 0 {
			<-limiter.ticker.C
			limiter.count.Store(limiter.maxCount.Load())
		}
	...
...

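The two behave differently. The official implementation (golang.org/x/time/rate) is very smooth: the limit is expressed as a rate derived from the interval between events, so with a small burst size requests are spread evenly over time. The ProjectDiscovery loop instead waits for the ticker once the count reaches zero and then restores the full maxCount in a single step, so a large batch of requests can go out together at the start of each interval. In other words, it is optimized to handle bursts of large numbers of requests.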

Aside: Masscan’s Rate Limiting Principle

Masscan is known for its high-speed scanning, but it also has rate limiting parameters (such as --rate). Masscan limits its rate by controlling the interval between transmitted packets. Because a single thread is responsible for sending packets, it only needs to keep the gap between consecutive packets at 1/rate seconds.