When using Logstash for log collection, a common scenario is to use the syslog plugin to collect logs from network devices or security devices.
But when you search for "logstash syslog plugin", search engines or ChatGPT often return answers about the udp or tcp plugin instead.
This raises some questions:
- What is the difference between the syslog plugin and the tcp/udp plugins?
- Why does the official syslog plugin perform poorly?
- Which plugin should be used in a real production environment?
This article attempts to analyze the differences between the syslog plugin and tcp/udp plugins, and provides best practices.
Implementation of the Syslog Plugin
Logstash provides a syslog plugin that supports listening on both TCP and UDP.
You only need to configure the port, without specifying the protocol.
Part of its implementation is shown below:
    def udp_listener(output_queue)
      @logger.info("Starting syslog udp listener", :address => "#{@host}:#{@port}")
      @udp.close if @udp
      @udp = UDPSocket.new (IPAddr.new(@host).ipv6? rescue nil) ? Socket::AF_INET6 : Socket::AF_INET
      @udp.do_not_reverse_lookup = true
      @udp.bind(@host, @port)

      while !stop?
        payload, client = @udp.recvfrom(65507)
        metric.increment(:messages_received)
        decode(client[3], output_queue, payload)
      end
    ensure
      close_udp
    end # def udp_listener
    ...
    ...
    def tcp_listener(output_queue)
      @logger.info("Starting syslog tcp listener", :address => "#{@host}:#{@port}")
      @tcp = TCPServer.new(@host, @port)
      @tcp.do_not_reverse_lookup = true

      while !stop?
        socket = @tcp.accept
        @tcp_sockets << socket
        metric.increment(:connections)

        Thread.new(output_queue, socket) do |output_queue, socket|
          tcp_receiver(output_queue, socket)
        end
      end
    ensure
      close_tcp
    end # def tcp_listener
From this code we can see:
- UDP: Receives data with a blocking recvfrom call and decodes each datagram inside the same loop.
- TCP: Starts a new thread for each accepted connection.
In other words, the syslog plugin handles TCP with a thread-per-connection model. Every connected device costs one thread, so performance drops sharply when there are many concurrent connections.
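To make the thread-per-connection model concrete, here is a minimal sketch using only Ruby's standard socket library. It is not the plugin's code: the helper name thread_per_connection_server and the use of a plain Queue are assumptions made for illustration.

```ruby
require "socket"

# Hypothetical helper, not part of Logstash: accepts connections on `server`
# and reads each one in its own thread, pushing every line onto `queue`.
# Returns the thread that runs the accept loop.
def thread_per_connection_server(server, queue)
  Thread.new do
    loop do
      begin
        socket = server.accept             # blocks until a client connects
      rescue IOError, Errno::EBADF
        break                              # the listening socket was closed
      end

      Thread.new(socket) do |client|       # one new thread per connection
        begin
          while (line = client.gets)       # blocks this thread until data or EOF
            queue << line.chomp
          end
        rescue IOError, SystemCallError
          # connection reset or closed; nothing more to read
        ensure
          client.close
        end
      end
    end
  end
end
```

With this shape, a thousand connected devices means a thousand reader threads that are mostly parked in a blocking read, which is the cost the section above describes.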
Implementation of the UDP Plugin
Let’s look at the implementation of the udp plugin:
    def udp_listener(output_queue)
      @logger.info("Starting UDP listener", :address => "#{@host}:#{@port}")
      ...
      # some code
      ...
      while !stop?
        next if IO.select([@udp], [], [], 0.5).nil?
        # collect datagram messages and add to inputworker queue
        @queue_size.times do
          begin
            payload, client = @udp.recvfrom_nonblock(@buffer_size)
            break if payload.empty?
            push_data(payload, client)
          rescue IO::EAGAINWaitReadable
            break
          end
        end
      end
    ensure
      if @udp
        @udp.close_read rescue ignore_close_and_log($!)
        @udp.close_write rescue ignore_close_and_log($!)
      end
    end
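Here the listener uses I/O multiplexing: IO.select waits up to 0.5 seconds for the socket to become readable, and recvfrom_nonblock then drains up to @queue_size datagrams without blocking, handing each one to the input worker queue. Unlike the syslog plugin's UDP listener, which blocks in recvfrom and decodes every datagram inside the receive loop, this loop only collects data. Therefore, it keeps up better under heavy UDP traffic.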
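The select-then-drain pattern can be tried outside Logstash with a few lines of plain Ruby. This is only a sketch of the technique: drain_datagrams and its parameters (max_batch, buffer_size, timeout) are hypothetical names, and the default values are illustrative, not the plugin's.

```ruby
require "socket"

# Hypothetical helper: waits up to `timeout` seconds for `udp` to become
# readable, then reads up to `max_batch` datagrams without blocking and
# pushes each payload onto `queue`. Returns the number of datagrams read.
def drain_datagrams(udp, queue, max_batch: 2000, buffer_size: 65_536, timeout: 0.5)
  return 0 if IO.select([udp], nil, nil, timeout).nil?   # nothing arrived

  received = 0
  max_batch.times do
    begin
      payload, _client = udp.recvfrom_nonblock(buffer_size)
    rescue IO::WaitReadable
      break                      # socket buffer is empty, go back to select
    end
    break if payload.empty?
    queue << payload             # parsing happens elsewhere, not in this loop
    received += 1
  end
  received
end
```

A caller would run drain_datagrams in a loop on one thread and let separate worker threads pop from the queue, so a slow parse never stalls the socket read.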
Implementation of the TCP Plugin
The TCP plugin is based on Netty, a high-performance event-driven framework:
    java_import 'org.logstash.tcp.InputLoop'
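Netty serves a large number of connections from a small pool of event-loop threads instead of dedicating one thread to each connection. Compared with the thread-per-connection implementation in the syslog plugin, this scales much better as the number of connections grows.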
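The event-driven idea can be illustrated in Ruby with a single thread that multiplexes all connections through IO.select. This is an analogy only: the tcp plugin delegates this work to Netty in Java, and the helper event_loop_tick below is a hypothetical name, not something taken from the plugin.

```ruby
require "socket"

# Hypothetical helper: one pass of a single-threaded event loop.
# `clients` maps each connected socket to its buffer of not-yet-complete
# input; every complete line is pushed onto `queue`.
def event_loop_tick(server, clients, queue, timeout: 0.5)
  ready, = IO.select([server, *clients.keys], nil, nil, timeout)
  return if ready.nil?                       # nothing happened this tick

  ready.each do |io|
    if io.equal?(server)
      begin
        clients[server.accept_nonblock] = String.new   # register new connection
      rescue IO::WaitReadable
        # the pending connection went away before we accepted it
      end
      next
    end

    begin
      clients[io] << io.read_nonblock(4096)  # read whatever is available now
      while (line = clients[io].slice!(/\A.*\n/))
        queue << line.chomp
      end
    rescue IO::WaitReadable
      # spurious wakeup; try again on the next tick
    rescue EOFError, Errno::ECONNRESET
      clients.delete(io)                     # peer closed the connection
      io.close
    end
  end
end
```

Calling event_loop_tick repeatedly from one thread serves every connection; adding more devices grows the clients hash rather than the thread count.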
Overhead of the Syslog Plugin
In addition to the differences in I/O model, the syslog plugin also parses every log according to the syslog protocol, for example by applying grok rules to split each syslog message into fields. The tcp/udp plugins simply collect messages without any protocol parsing. Therefore:
- Syslog Plugin: Collection + Parsing (more features, but poor performance). In addition, many devices today emit syslog messages that do not follow the relevant protocol specifications, so the built-in parsing may not match them.
- TCP/UDP Plugin: Collection Only (requires subsequent manual parsing, but good performance).
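To give a feel for the per-message work that protocol parsing adds, here is a small Ruby sketch that splits an RFC 3164 style line into fields. It is not the plugin's grok pattern: the regular expression, the parse_syslog helper, and the _parse_failure tag are all assumptions made for illustration. The only protocol fact relied on is that the priority value equals facility * 8 + severity.

```ruby
# Assumed layout (RFC 3164 style): "<PRI>Mmm dd hh:mm:ss host message"
SYSLOG_LINE = /\A<(?<pri>\d{1,3})>(?<timestamp>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) (?<host>\S+) (?<message>.*)\z/

# Hypothetical helper: returns a hash of parsed fields, or the raw line with
# a failure tag when the message does not follow the expected layout.
def parse_syslog(line)
  m = SYSLOG_LINE.match(line.chomp)
  return { "message" => line.chomp, "tags" => ["_parse_failure"] } if m.nil?

  pri = m[:pri].to_i
  {
    "priority"  => pri,
    "facility"  => pri / 8,     # upper bits of the priority value
    "severity"  => pri % 8,     # lowest three bits
    "timestamp" => m[:timestamp],
    "host"      => m[:host],
    "message"   => m[:message]
  }
end
```

A regex match and several field assignments per message is cheap for a handful of devices but adds up at high volume, and a line that does not fit the layout pays the matching cost and still comes out unparsed.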
Conclusion and Recommended Practices
- Performance First: It is recommended to use tcp or udp plugins to collect syslog logs, and then parse them via grok or dissect plugins.
- Simple Configuration: If there are few devices and the log volume is not large, you can use the syslog plugin directly.
- Production Environment Best Practice:
  - Use udp/tcp plugins to collect logs.
  - Use grok/dissect in the pipeline to parse the syslog format.
  - Output to storage systems like Elasticsearch / Kafka.