How to extract a string using JavaScript Regex

This may seem obvious, but I spent too much time trying to get it to work...

I am trying to extract a substring from a file using a JavaScript regex. Here is a snippet from the file:

    DATE:20091201T220000
    SUMMARY:Dad birthday

The field I want to extract is SUMMARY, so I'm trying to write a method that returns only the summary text. Here is my attempt:

    extractSummary : function(iCalContent) {
      /*
      input  : iCal file content
      return : Event summary
      */
      var arr = iCalContent.match(/^SUMMARY\:(.)*$/g);
      return(arr);
    }

Clearly I'm a regex noob :)) Could you fix this, please? Thanks.

+92
javascript string regex icalendar
Nov 10 '09 at 11:30
5 answers

You need to use the m flag:

multiline; treat the beginning and end characters (^ and $) as working over multiple lines (i.e., match the beginning or end of each line (delimited by \n or \r), not only the very beginning or end of the whole input string)
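
To make the effect of the m flag concrete, here is a minimal sketch. The sample string is based on the snippet in the question; the CRLF line endings and the trailing END:VEVENT line are assumptions added for illustration.

```javascript
// Sample input based on the question (CRLF line endings are an assumption).
const text = "DATE:20091201T220000\r\nSUMMARY:Dad birthday\r\nEND:VEVENT";

// Without the m flag, ^ matches only at the very start of the input,
// so a SUMMARY line in the middle of the text is never found.
const withoutM = text.match(/^SUMMARY:.*$/);

// With the m flag, ^ and $ also match at every line boundary.
const withM = text.match(/^SUMMARY:.*$/m);

console.log(withoutM);   // null
console.log(withM[0]);   // "SUMMARY:Dad birthday"
```

Since `.` does not match `\r` or `\n`, the match stops cleanly at the end of the line even with CRLF endings.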

Also put the * in the right place:

    "DATE:20091201T220000\r\nSUMMARY:Dad birthday".match(/^SUMMARY\:(.*)$/gm);
    // the * now sits inside the parentheses: (.*)
    // the m flag is added after g
+69
Nov 10 '09 at 13:18
    function extractSummary(iCalContent) {
      var rx = /\nSUMMARY:(.*)\n/g;
      var arr = rx.exec(iCalContent);
      return arr[1];
    }

You need the following changes:

  • Put the * inside the parentheses, as suggested above. Otherwise the group will contain only one character.

  • Get rid of ^ and $. With only the global flag, they match the beginning and end of the whole string, not the beginning and end of each line. Match on explicit newlines instead.

  • I assume you want the matching group (the part inside the parentheses) rather than the full array? arr[0] is the full match ("\nSUMMARY:...") and the following indexes contain the group matches.

  • String.match(regexp) is supposed to return an array with the matches. It doesn't in my browser (Safari on Mac returns only the full match, not the groups), but RegExp.exec(string) works.
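
The difference described in the last point can be reproduced with a short sketch. When the regex carries the g flag, `String.prototype.match` returns only the full matches, which is likely what the Safari observation reflects, while `RegExp.prototype.exec` returns a single match together with its groups. The sample string is an assumption made up for illustration.

```javascript
// Sample input (hypothetical); each line ends with "\n".
const text = "BEGIN:VEVENT\nSUMMARY:Dad birthday\nEND:VEVENT\n";

// With the g flag, match() lists every full match and drops the groups.
const viaMatch = text.match(/\nSUMMARY:(.*)\n/g);

// exec() returns one match at a time: index 0 is the full match,
// index 1 and up hold the captured groups.
const viaExec = /\nSUMMARY:(.*)\n/g.exec(text);

console.log(viaMatch.length);  // 1 (the full match only, no group)
console.log(viaExec[1]);       // "Dad birthday"
```

Both calls return null when nothing matches, so real code would need a check before indexing into the result.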

+80
Nov 10 '09 at 12:34

Your regular expression most likely wants to be

 /\nSUMMARY:(.*)$/g 

A useful little trick I like to use is to fall back to a default array when match finds nothing.

    var arr = iCalContent.match(/\nSUMMARY:(.*)$/g) || [""]; // could also use null for empty value
    return arr[0];

This way you don't get annoying errors when you use arr.
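
As a rough sketch of the same fallback idea, the helper below returns an empty string when there is no SUMMARY line. The name `summaryOrEmpty` is hypothetical, and the regex uses the m flag without g (a swap from the pattern above) so that the captured group is available at index 1.

```javascript
// Hypothetical helper: returns the summary text, or "" when there is none.
function summaryOrEmpty(iCalContent) {
  // match() returns null on no match, so fall back to an array of the
  // same shape: index 0 is the full match, index 1 is the captured group.
  const arr = iCalContent.match(/^SUMMARY:(.*)$/m) || ["", ""];
  return arr[1];
}

console.log(summaryOrEmpty("DATE:20091201T220000\nSUMMARY:Dad birthday")); // "Dad birthday"
console.log(summaryOrEmpty("DATE:20091201T220000"));                        // ""
```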
+16
Nov 10 '09 at 12:36

(.*) instead of (.)* would be a start. The latter will only capture the last character on the line.

Also, there is no need to escape the colon (:).
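
A small sketch of both points, using a single made-up line as input:

```javascript
const line = "SUMMARY:Dad birthday";

// (.)* repeats a one-character group; only the last repetition is kept.
const repeatedGroup = line.match(/^SUMMARY:(.)*$/);

// (.*) puts the repetition inside the group, so the whole value is captured.
const wholeGroup = line.match(/^SUMMARY:(.*)$/);

// Escaping the colon is allowed but makes no difference to the match.
const escapedColon = line.match(/^SUMMARY\:(.*)$/);

console.log(repeatedGroup[1]); // "y"
console.log(wholeGroup[1]);    // "Dad birthday"
console.log(escapedColon[1]);  // "Dad birthday"
```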

+6
Nov 10 '09 at 11:32

Here's how you can parse iCal files using JavaScript:

    function calParse(str) {

      // Recursively builds an object from the remaining lines in str.
      function parse() {
        var obj = {};
        while (str.length) {
          var p = str.shift().split(":");
          var k = p.shift();
          p = p.join(":"); // rejoin with ":" so values containing colons survive
          switch (k) {
            case "BEGIN":
              obj[p] = parse();
              break;
            case "END":
              return obj;
            default:
              obj[k] = p;
          }
        }
        return obj;
      }

      // Join folded lines, then split the content into one entry per line.
      str = str.replace(/\n /g, " ").split("\n");
      return parse().VCALENDAR;
    }

    var example =
      'BEGIN:VCALENDAR\n' +
      'VERSION:2.0\n' +
      'PRODID:-//hacksw/handcal//NONSGML v1.0//EN\n' +
      'BEGIN:VEVENT\n' +
      'DTSTART:19970714T170000Z\n' +
      'DTEND:19970715T035959Z\n' +
      'SUMMARY:Bastille Day Party\n' +
      'END:VEVENT\n' +
      'END:VCALENDAR\n';

    var cal = calParse(example);
    alert(cal.VEVENT.SUMMARY);
-1
Nov 10 '09 at 13:04