Logstash vs rsyslog for aggregation of log files

I am working on a solution for centrally aggregating log files from our CentOS 6.x servers. After installing the Elasticsearch / Logstash / Kibana (ELK) stack, I came across the rsyslog omelasticsearch plugin, which can send messages from rsyslog to Elasticsearch in Logstash format, and started asking myself why I need Logstash at all.

Logstash has many different input plugins, including one for receiving syslog messages. Is there a reason to use Logstash for my use case, where I need to collect the contents of log files from multiple servers? Also, is it possible to send messages from rsyslog to Logstash instead of sending them directly to Elasticsearch?
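For context, the rsyslog-to-Logstash forwarding being asked about would look roughly like this; the host name and port here are placeholders, not values from the question:

```conf
# rsyslog side (e.g. a drop-in under /etc/rsyslog.d/):
# forward all messages over TCP (@@) to the Logstash host
*.* @@logstash.example.com:5514
```

```conf
# Logstash side: listen with the syslog input and index into Elasticsearch
input {
  syslog {
    port => 5514
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
```

The exact option names (e.g. `hosts`) vary between Logstash versions, so treat this as a sketch of the topology rather than a copy-paste config.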

+7
logstash rsyslog
4 answers

I would put Logstash in the middle only if it offers something I need that rsyslog does not have, for example enriching events with GeoIP data derived from an IP address.

If, on the other hand, I just need syslog or the contents of a file indexed into Elasticsearch, I would use rsyslog directly. It can do buffering (disk + memory) and filtering, you can choose how the resulting document will look (for example, you can store the severity as text instead of a number), and it can parse unstructured data. But its main advantage is performance, which is what rsyslog is focused on. Here's a presentation with some numbers (and tips and tricks) on Logstash, rsyslog and Elasticsearch: http://blog.sematext.com/2015/05/18/tuning-elasticsearch-indexing-pipeline-for-logs/
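A direct rsyslog-to-Elasticsearch setup with the buffering mentioned above could be sketched like this; the template string, queue file name, and index naming are illustrative, and the JSON template that shapes the actual document is omitted for brevity:

```conf
module(load="omelasticsearch")

# Daily Logstash-style index name, e.g. logstash-2015.05.18
template(name="logstash-index" type="string"
         string="logstash-%$YEAR%.%$MONTH%.%$DAY%")

action(type="omelasticsearch"
       server="localhost"
       serverport="9200"
       searchIndex="logstash-index"
       dynSearchIndex="on"            # resolve the index name per message
       bulkmode="on"                  # use the Elasticsearch bulk API
       queue.type="linkedlist"        # in-memory queue...
       queue.filename="es-queue"      # ...that spills to disk when full
       action.resumeRetryCount="-1")  # retry forever if Elasticsearch is down
```

Parameter names follow the rsyslog omelasticsearch documentation, but check them against the rsyslog version you run; older 6.x-era releases used the legacy config syntax instead.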

+3

If you go directly from the server to Elasticsearch, you get the documents essentially as-is (assuming the source is JSON, etc.). For me, the power of Logstash is in adding value to logs, applying business logic to modify and enrich them.

Here is an example: syslog provides a severity level (0-7). I don't want a pie chart whose values are 0-7, so I create a new field containing readable names ("emergency", "debug", etc.) that can be used for display.
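That mapping could be done with Logstash's translate filter; this is a sketch, and the field names (`syslog_severity_code`, `severity_name`) and the exact dictionary syntax depend on your plugin version:

```conf
filter {
  # Turn the numeric syslog severity into a human-readable label
  translate {
    field       => "syslog_severity_code"
    destination => "severity_name"
    dictionary  => {
      "0" => "emergency"
      "1" => "alert"
      "2" => "critical"
      "3" => "error"
      "4" => "warning"
      "5" => "notice"
      "6" => "informational"
      "7" => "debug"
    }
  }
}
```

Note that the syslog input plugin can also populate a severity label for you when it parses the priority field, so check what your input already provides before adding a filter.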

Just one example ...

+2

I would recommend Logstash. It is easier to set up, there are more examples, and the components are tested to work with each other.

In addition, there are some advantages: in Logstash you can filter and modify your logs.

  • You can enrich the logs with useful data: server name, timestamp, ...
  • Cast types, e.g. string to int (useful for correct Elasticsearch index mappings)
  • Filter logs by rules

In addition, you can adjust the batch size to optimize writes to Elasticsearch. Another feature: if something goes wrong and the rate of incoming logs exceeds what Elasticsearch can handle, you can configure Logstash to keep a queue of events or drop events that cannot be saved.
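The batching and queueing knobs mentioned above live in `logstash.yml` in recent Logstash versions (persistent queues arrived in 5.x, after this answer was written); the values below are illustrative, not recommendations:

```yaml
# logstash.yml
pipeline.batch.size: 500    # events per worker sent in one bulk request
pipeline.batch.delay: 50    # ms to wait before flushing a partial batch
queue.type: persisted       # buffer events on disk if Elasticsearch falls behind
queue.max_bytes: 4gb        # cap on the on-disk queue
```

With `queue.type: memory` (the default) a crash loses in-flight events, so the persisted queue is the safer choice when Elasticsearch can stall.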

+2

None of these are viable options if you really need a system that works under load and is highly available.

We found that the best option is using rsyslog to ship logs to a central location, buffering them there with Redis or Kafka, and then using Logstash to do its magic and send the result to Elasticsearch.
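The Kafka variant of that pipeline could be sketched as follows; broker address and topic name are placeholders, and the parameter names should be checked against the rsyslog omkafka and Logstash kafka-input versions you deploy:

```conf
# rsyslog: publish messages to a Kafka topic as the buffer
module(load="omkafka")
action(type="omkafka"
       broker=["kafka1:9092"]
       topic="syslog")
```

```conf
# Logstash: consume from the same topic, enrich, and index
input {
  kafka {
    bootstrap_servers => "kafka1:9092"
    topics            => ["syslog"]
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
```

The point of the broker in the middle is that Logstash and Elasticsearch can fall behind or restart without losing messages, since Kafka retains them until consumed.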

Read our blog post about it here: http://logz.io/blog/deploy-elk-production/

(Disclaimer: I am VP Product at logz.io, and we offer ELK as a service)

+2
