Logstash grok filter for custom logs

I have two related questions. Firstly, what is the best way to grok logs that have "random" spacing between fields, and the second, which I will ask separately, is how to handle logs that have arbitrary attribute-value pairs. (See: logstash grok filter for logs with arbitrary attribute-value pairs)

So, for the first question, I have a log line that looks like this:

14:46:16.603 [http-nio-8080-exec-4] INFO METERING - msg=93e6dd5e-c009-46b3-b9eb-f753ee3b889a CREATE_JOB job=a820018e-7ad7-481a-97b0-bd705c3280ad data=71b1652e-16c8-4b33-9a57-f5fcb3d5de92 

Using http://grokdebug.herokuapp.com/ I ended up creating the following grok pattern, which works for this line:

 %{TIME:timestamp} %{NOTSPACE:http} %{WORD:loglevel}%{SPACE}%{WORD:logtype} - msg=%{NOTSPACE:msg}%{SPACE}%{WORD:action}%{SPACE}job=%{NOTSPACE:job}%{SPACE}data=%{NOTSPACE:data} 

With the following configuration file:

 input {
   file {
     path => "/home/robyn/testlogs/trimmed_logs.txt"
     start_position => beginning
     sincedb_path => "/dev/null" # for testing; allows reparsing
   }
 }

 filter {
   grok {
     match => { "message" => "%{TIME:timestamp} %{NOTSPACE:http} %{WORD:loglevel}%{SPACE}%{WORD:logtype} - msg=%{NOTSPACE:msg}%{SPACE}%{WORD:action}%{SPACE}job=%{NOTSPACE:job}%{SPACE}data=%{NOTSPACE:data}" }
   }
 }

 output {
   file {
     path => "/home/robyn/filteredlogs/trimmed_logs.out.txt"
   }
 }

I get the following output:

 {"message":"14:46:16.603 [http-nio-8080-exec-4] INFO METERING - msg=93e6dd5e-c009-46b3-b9eb-f753ee3b889a CREATE_JOB job=a820018e-7ad7-481a-97b0-bd705c3280ad data=71b1652e-16c8-4b33-9a57-f5fcb3d5de92","@version":"1","@timestamp":"2015-08-07 T17:55:16.529Z","host":"hlt-dev","path":"/home/robyn/testlogs/trimmed_logs.txt","timestamp":"14:46:16.603","http":"[http-nio-8080-exec-4]","loglevel":"INFO","logtype":"METERING","msg":"93e6dd5e-c009-46b3-b9eb-f753ee3b889a","action":"CREATE_JOB","job":"a820018e-7ad7-481a-97b0-bd705c3280ad","data":"71b1652e-16c8-4b33-9a57-f5fcb3d5de92"} 

This is pretty much what I want, but I feel it's a really clunky pattern, especially the need for %{SPACE} and %{NOTSPACE}. That suggests I'm not doing this the best way. Should I create a more specific pattern for the hex identifiers? I think I need %{SPACE} between loglevel and logtype because of the extra space between INFO and METERING in the log, but that also feels kludgy.

Also, how do I get the log's own timestamp to replace @timestamp, which appears to be the time Logstash ingested the log line, and which we don't want/need?

Obviously, I'm just getting started with ELK and grok, so pointers to useful resources would also be appreciated.

2 answers

There is an existing pattern you can use instead of NOTSPACE: UUID. Also, where there is only a single space, there is no need to use the SPACE pattern; you can just write the space literally. I also used the USERNAME pattern (perhaps badly named) to capture just the http field, without the surrounding brackets.

So it would look like the following, with only one SPACE pattern left, to capture the run of multiple spaces.

Example log line:

 14:46:16.603 [http-nio-8080-exec-4] INFO METERING - msg=93e6dd5e-c009-46b3-b9eb-f753ee3b889a CREATE_JOB job=a820018e-7ad7-481a-97b0-bd705c3280ad data=71b1652e-16c8-4b33-9a57-f5fcb3d5de92 

Grok pattern:

 %{TIME:timestamp} \[%{USERNAME:http}\] %{WORD:loglevel}%{SPACE}%{WORD:logtype} - msg=%{UUID:msg} %{WORD:action} job=%{UUID:job} data=%{UUID:data} 

Grok will spit out:

 { "timestamp": [ [ "14:46:16.603" ] ], "HOUR": [ [ "14" ] ], "MINUTE": [ [ "46" ] ], "SECOND": [ [ "16.603" ] ], "http": [ [ "http-nio-8080-exec-4" ] ], "loglevel": [ [ "INFO" ] ], "SPACE": [ [ " " ] ], "logtype": [ [ "METERING" ] ], "msg": [ [ "93e6dd5e-c009-46b3-b9eb-f753ee3b889a" ] ], "action": [ [ "CREATE_JOB" ] ], "job": [ [ "a820018e-7ad7-481a-97b0-bd705c3280ad" ] ], "data": [ [ "71b1652e-16c8-4b33-9a57-f5fcb3d5de92" ] ] } 

It is also possible to use \s* instead of the SPACE pattern.
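That is, the one remaining SPACE in the pattern from the first answer could be swapped out like so (\s* is plain regex and matches any run of whitespace without a named pattern):

 %{TIME:timestamp} \[%{USERNAME:http}\] %{WORD:loglevel}\s*%{WORD:logtype} - msg=%{UUID:msg} %{WORD:action} job=%{UUID:job} data=%{UUID:data}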

You can use the mutate filter to remove fields; it has a remove_field option: https://www.elastic.co/guide/en/logstash/current/plugins-filters-mutate.html#plugins-filters-mutate-remove_field
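A minimal sketch, assuming you want to drop the ingestion timestamp (the field discussed just below):

 filter {
   mutate {
     # remove Logstash's own ingestion timestamp; see the Kibana caveat below
     remove_field => [ "@timestamp" ]
   }
 }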

If you delete this field, you will need to create a new index pattern in Kibana, because Kibana sorts events by the @timestamp field when nothing else is selected.
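Alternatively, to do what the question actually asks (replace @timestamp with the log's own time rather than deleting it), the date filter is the usual tool. A minimal sketch, assuming the timestamp field captured by the grok pattern; note the log line carries only a time of day with no date, so the missing date portion is filled with defaults, and in practice you would want a full date in the log:

 filter {
   date {
     # parse the grok-captured time of day into @timestamp
     match => [ "timestamp", "HH:mm:ss.SSS" ]
   }
 }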
