Log Analysis: Search for Lines by Time Difference

I have a long log file generated with log4j, with 10 threads writing to the log. I am looking for a log analyzer tool that can find the lines where the user waited a long time (i.e., the gap between consecutive log entries for the same thread is more than a minute).

PS: I'm trying to use OtrosLogViewer, but it filters by certain values (for example, by thread ID) and does not compare lines with each other.

PPS: The new version of OtrosLogViewer has a "Delta" column, which calculates the difference between adjacent log lines (in ms).

Thank you

2 answers

This simple Python script may be enough. For testing, I analyzed my local Apache log, which BTW uses the Common Log Format, so you can reuse it as is. I simply compute the difference between two consecutive requests and print the request line for deltas that exceed a certain threshold (1 second in my test). You may want to wrap the code in a function that also accepts a thread identifier as a parameter, so you can optionally filter by thread.

    #!/usr/bin/env python
    import re
    from datetime import datetime

    THRESHOLD = 1  # seconds

    last = None
    with open("/var/log/apache2/access.log") as log:
        for line in log:
            # You may insert here something like
            # if not re.match(THREAD_ID, line):
            #     continue
            # strptime in Python 2 does not support %z, hence the [:-6]
            # that drops the trailing UTC offset
            current = datetime.strptime(
                re.search(r"\[([^]]+)]", line).group(1)[:-6],
                "%d/%b/%Y:%H:%M:%S")
            # total_seconds() also counts whole days, unlike .seconds
            if last is not None and (current - last).total_seconds() > THRESHOLD:
                print(re.search('"([^"]+)"', line).group(1))
            last = current
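The suggestion to wrap the logic in a function with an optional thread filter could look roughly like the sketch below. The line layout (`2024-01-01 12:00:00,123 [thread-name] LEVEL message`), the `LINE_RE` pattern, and the `find_gaps` name are assumptions for illustration, not something taken from the question's actual log4j configuration, so the regular expression would need to be adapted to the real pattern layout.

```python
import re
from datetime import datetime

# Assumed (hypothetical) log4j-style layout:
#   2024-01-01 12:00:00,123 [thread-name] LEVEL message
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \[([^\]]+)\]")
TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def find_gaps(lines, threshold_seconds=60, thread_id=None):
    """Yield (gap_seconds, line) for entries that follow a long pause.

    If thread_id is given, only lines written by that thread are compared.
    """
    last = None
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # e.g. stack traces and other continuation lines
        if thread_id is not None and match.group(2) != thread_id:
            continue
        current = datetime.strptime(match.group(1), TIME_FORMAT)
        if last is not None:
            gap = (current - last).total_seconds()
            if gap > threshold_seconds:
                yield gap, line.rstrip("\n")
        last = current
```

Since an open file is an iterable of lines, a call such as `find_gaps(open("app.log"), 60, "worker-1")` (file and thread names are placeholders) would report only the pauses of that one thread.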

Based on @Raffaele's answer, I made some corrections so that it works with any log file (skipping lines that do not start with the expected date, as in a Jenkins console log, for example). In addition, I added minimum and maximum thresholds to filter lines by duration.

    #!/usr/bin/env python
    import re
    from datetime import datetime

    MIN_THRESHOLD = 80   # seconds
    MAX_THRESHOLD = 100  # seconds

    regCompile = r"\w+\s+(\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d).*"

    filePath = "C:/Users/user/Desktop/temp/jenkins.log"

    lastTime = None
    lastLine = ""

    with open(filePath, 'r') as f:
        for line in f:
            regexp = re.search(regCompile, line)
            # Lines without a timestamp are skipped
            if regexp:
                currentTime = datetime.strptime(regexp.group(1), "%Y-%m-%d %H:%M:%S")
                if lastTime is not None:
                    # total_seconds() also counts whole days, unlike .seconds
                    duration = (currentTime - lastTime).total_seconds()
                    if MIN_THRESHOLD <= duration <= MAX_THRESHOLD:
                        print("#######################################################################################################################################")
                        print(lastLine)
                        print(line)
                lastTime = currentTime
                lastLine = line
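For the original case of several threads writing to one file, a possible variation is to remember the last entry per thread instead of a single `lastTime`. This is only a sketch: the line layout (`2024-01-01 12:00:00 [thread-name] ...`), the `LINE_RE` pattern, and the `gaps_per_thread` name are hypothetical and would have to match the real log4j pattern.

```python
import re
from datetime import datetime

# Assumed (hypothetical) layout: 2024-01-01 12:00:00 [thread-name] LEVEL message
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[([^\]]+)\]")
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def gaps_per_thread(lines, min_seconds=60, max_seconds=None):
    """Yield (thread, gap_seconds, previous_line, line) for long pauses.

    Each line is compared with the previous line of the same thread.
    """
    last_seen = {}  # thread name -> (timestamp, line)
    for line in lines:
        line = line.rstrip("\n")
        match = LINE_RE.match(line)
        if match is None:
            continue  # no timestamp/thread prefix, skip
        current = datetime.strptime(match.group(1), TIME_FORMAT)
        thread = match.group(2)
        if thread in last_seen:
            previous_time, previous_line = last_seen[thread]
            gap = (current - previous_time).total_seconds()
            if gap >= min_seconds and (max_seconds is None or gap <= max_seconds):
                yield thread, gap, previous_line, line
        last_seen[thread] = (current, line)
```

A dictionary keyed by thread name keeps this a single pass over the file, so interleaved entries from other threads do not hide a long wait.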
