Skype log parsing

I need to parse a Skype log, capture every call duration, and add them up to get the total call duration for the entire chat history.

Example:

[3/12/2012 11:36:44 AM] * The call is over, duration 21:33 *

I think I need to use preg_match with the right regular expression. If the timestamp could be stored in the array at the same time, that would be even better.

What I'm really stuck on is the regular expression itself. Matching just the call duration would be enough.

2 answers

Try

(?i)\[(?P<time_stamp>[^[]+)\]\s*[*]\s*[a-z ,]+(?P<duration>(?:\d{2}:?){2,3})\s*[*]

Explanation

  (?i)               # Match the remainder of the regex case-insensitively
  \[                 # Match the character "[" literally
  (?P<time_stamp>    # Capture the match into the named group "time_stamp"
    [^[]+            #   Any character that is NOT a "[", one or more times (greedy)
  )
  \]                 # Match the character "]" literally
  \s*                # Whitespace (spaces, tabs, line breaks), zero or more times (greedy)
  [*]                # Match the character "*"
  \s*                # Whitespace, zero or more times (greedy)
  [a-z ,]+           # A character in the range "a" to "z", a space, or a comma; one or more times (greedy)
  (?P<duration>      # Capture the match into the named group "duration"
    (?:              #   Non-capturing group
      \d{2}          #     A single digit 0..9, exactly 2 times
      :?             #     The character ":", zero or one time (greedy)
    ){2,3}           #   Repeat the group 2 to 3 times (greedy)
  )
  \s*                # Whitespace, zero or more times (greedy)
  [*]                # Match the character "*"
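
To show how the pattern above could be used to collect timestamps and add up the durations, here is a minimal sketch. It is written in Python rather than PHP; the same pattern should also work with PHP's preg_match_all, since PCRE accepts the (?P<name>...) syntax. The function names parse_calls, to_seconds and total_seconds are hypothetical, and the sketch assumes every duration is written as MM:SS or HH:MM:SS with two digits per field.

```python
import re

# Pattern from the answer above, with the letter range written as a-z.
CALL_RE = re.compile(
    r"(?i)\[(?P<time_stamp>[^[]+)\]\s*[*]\s*[a-z ,]+"
    r"(?P<duration>(?:\d{2}:?){2,3})\s*[*]"
)


def to_seconds(duration):
    """Convert 'MM:SS' or 'HH:MM:SS' (two digits per field) to seconds."""
    seconds = 0
    for field in re.findall(r"\d{2}", duration):
        seconds = seconds * 60 + int(field)
    return seconds


def parse_calls(log_text):
    """Return a list of (time_stamp, duration_in_seconds) pairs."""
    return [
        (m.group("time_stamp"), to_seconds(m.group("duration")))
        for m in CALL_RE.finditer(log_text)
    ]


def total_seconds(log_text):
    """Sum the durations of all calls found in the log text."""
    return sum(seconds for _, seconds in parse_calls(log_text))


if __name__ == "__main__":
    line = "[3/12/2012 11:36:44 AM] * The call is over, duration 21:33 *"
    print(parse_calls(line))    # [('3/12/2012 11:36:44 AM', 1293)]
    print(total_seconds(line))  # 1293
```

Keeping the timestamp next to each duration makes it possible to filter by date later without parsing the log again.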

You can use this:

  \*.+?([0-9]+:){1,2}([0-9]+) 

This matches both HH:MM:SS and MM:SS durations that appear after the first * on the line.
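
One thing to watch with this pattern: a repeated capturing group keeps only its last repetition, so for an HH:MM:SS duration the first group holds just the minutes field. The sketch below, in Python, wraps the whole duration in a single group instead. The name find_durations is hypothetical, and the adjusted pattern is a variation on the one above rather than the answer's own. Note that it would also pick up any chat message that contains a * followed by a time.

```python
import re

# Variation on the pattern above: the inner repeated group is made
# non-capturing and one outer group captures the whole duration.
DURATION_RE = re.compile(r"\*.+?((?:[0-9]+:){1,2}[0-9]+)")


def find_durations(log_text):
    """Return every duration string that follows a '*' on a line."""
    return DURATION_RE.findall(log_text)


if __name__ == "__main__":
    line = "[3/12/2012 11:36:44 AM] * The call is over, duration 21:33 *"
    print(find_durations(line))  # ['21:33']
```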
