I am parsing a set of data into an ELK stack for some non-technical users to view. As part of this, I want to remove all fields except a specific, known subset of fields from the events before sending them to Elasticsearch.
I can explicitly specify each field to drop in a mutate filter, like this:
filter {
  mutate {
    remove_field => [ "throw_away_field1", "throw_away_field2" ]
  }
}
In this case, any time a new field is added to the input data (which can happen often, since the data is pulled from a queue and consumed by multiple systems for multiple purposes), the filtering would need to be updated as well, which adds overhead that isn't needed. Not to mention, if some sensitive data slipped through in the window between when the input streams were updated and when the filtering was updated, that could be bad.
Is there a way to have the Logstash filter iterate over each field of an event and remove_field it if it is not in a provided list of field names? Or would I have to write a custom filter for this? Basically, for every single event, I just want to keep 8 specific fields and toss absolutely everything else.
It looks like some very minimal conditional logic, such as if ![field] =~ /^value$/, is available in logstash.conf, but I don't see any examples that iterate over the fields themselves in a for-each style and compare each field name against a list of values.
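Conceptually, what I'm after is something like the following sketch using the stock ruby filter. The field names are placeholders, and I haven't verified this is the idiomatic approach; the @-prefixed check is just to avoid dropping internal fields like @timestamp and @version:

filter {
  ruby {
    code => "
      # placeholder list of fields to keep
      keep = [ 'fieldtokeep1', 'fieldtokeep2' ]
      event.to_hash.keys.each do |name|
        # keep internal fields such as @timestamp and @version
        event.remove(name) unless keep.include?(name) || name.start_with?('@')
      end
    "
  }
}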
Answer:
After upgrading Logstash to 1.5.0 to be able to use plugin extensions such as prune, the solution ended up looking like this:
filter {
  prune {
    interpolate => true
    whitelist_names => [ "fieldtokeep1", "fieldtokeep2" ]
  }
}
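One caveat, as far as I understand the prune filter docs: whitelist_names entries are matched as regular expressions, so an unanchored name like "fieldtokeep1" will also keep any field whose name merely contains that substring. Anchoring the patterns, and whitelisting core fields such as @timestamp if your pipeline needs them, may be safer:

filter {
  prune {
    interpolate => true
    # anchored patterns match field names exactly
    whitelist_names => [ "^fieldtokeep1$", "^fieldtokeep2$", "^@timestamp$" ]
  }
}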