How to parse a string into words and punctuation using JavaScript

I have the string test = "hello how are you all doing, I hope that it's good! and fine. Looking forward to see you."

I am trying to parse a string into words and punctuation using JavaScript. I can separate the words, but the punctuation disappears when I use this regular expression:

var result = test.match(/\b(\w|')+\b/g);

My expected result is:

hello
how 
are 
you
all
doing
,
I
hope
that
it's
good
!
and 
fine
.
Looking
forward
to
see
you
2 answers

Simple approach

This first approach works if your definition of a "word" matches JavaScript's. A more customizable approach is below.

Try test.split(/\s*\b\s*/). It splits on word boundaries (\b) and eats the surrounding whitespace.

"hello how are you all doing, I hope that it's good! and fine. Looking forward to see you."
    .split(/\s*\b\s*/);
// Returns:
["hello",
"how",
"are",
"you",
"all",
"doing",
",",
"I",
"hope",
"that",
"it",
"'",
"s",
"good",
"!",
"and",
"fine",
".",
"Looking",
"forward",
"to",
"see",
"you",
"."]

How it works:

var test = "This is. A test?"; // Test string.

// First consider splitting on word boundaries (\b).
test.split(/\b/); //=> ["This"," ","is",". ","A"," ","test","?"]
// This almost works but there is some unwanted whitespace.

// So we change the split regex to gobble the whitespace using \s*
test.split(/\s*\b\s*/) //=> ["This","is",".","A","test","?"]
// Now the whitespace is included in the separator
// and not included in the result.
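
One thing the split approach does not handle by itself is whitespace at the very start or end of the input, which leaves empty strings in the result. A minimal sketch of a wrapper that deals with this; the name tokenizeBySplit is hypothetical and not part of the original answer:

```javascript
// Hypothetical helper (not from the original answer).
function tokenizeBySplit(text) {
  return text
    .trim()                // leading/trailing whitespace would otherwise yield "" entries
    .split(/\s*\b\s*/)     // split on word boundaries, swallowing adjacent whitespace
    .filter(function (token) {
      return token.length > 0;   // covers the empty-input case, where split returns [""]
    });
}

console.log(tokenizeBySplit("  This is. A test?  "));
//=> ["This","is",".","A","test","?"]
```

Without the trim() call, the same input would come back with an empty string at each end of the array.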

More customizable approach

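If you want words like "isn't" and "one-thousand" to be treated as single words, even though the JavaScript regex engine sees them as several, you need to write your own definition of a word.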

test.match(/[\w-']+|[^\w\s]+/g) //=> ["This","is",".","A","test","?"]

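This matches words and punctuation separately, using an alternation. The first half, [\w-']+, matches whatever you consider a word, and the second half, [^\w\s]+, matches whatever you consider punctuation; in this example, that is anything that is neither a word character nor whitespace. The + on the second half means that multi-character punctuation (such as ?!) comes out as a single token; if you don't want that, remove the +.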
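
To make the difference between the two approaches concrete, here is a small sketch. The function name tokenize and the sample sentence are made up for illustration. The character class is written as [\w'-] with the hyphen last, which is assumed to match the same characters as [\w-'] while avoiding the ambiguous range-like form.

```javascript
// Hypothetical helper: words (including ' and -) or runs of punctuation.
function tokenize(text) {
  // match() with the g flag returns null when nothing matches,
  // so fall back to an empty array.
  return text.match(/[\w'-]+|[^\w\s]+/g) || [];
}

var sample = "It isn't one-thousand, is it?!";

console.log(tokenize(sample));
//=> ["It","isn't","one-thousand",",","is","it","?!"]

// For comparison, the simple split approach breaks the same words apart:
console.log(sample.split(/\s*\b\s*/));
//=> ["It","isn","'","t","one","-","thousand",",","is","it","?!"]
```

Adding or removing characters in the first class is how you adjust what counts as part of a word.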


Try this regex:

[,.!?;:]|\b[a-z']+\b

In JavaScript:

var resultArray = yourString.match(/[,.!?;:]|\b[a-z']+\b/ig);

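  • [,.!?;:] matches a single punctuation character
  • | means "or"
  • \b matches a word boundary
  • [a-z']+ matches one or more letters or apostrophes (the i flag makes it case-insensitive)
  • \b matches a word boundary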
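
As a quick check, here is a sketch that runs this regex on the sentence from the question. The variable names sentence and tokens are chosen for illustration only.

```javascript
var sentence = "hello how are you all doing, I hope that it's good! " +
               "and fine. Looking forward to see you.";

// match() returns null when nothing matches, hence the fallback array.
var tokens = sentence.match(/[,.!?;:]|\b[a-z']+\b/ig) || [];

console.log(tokens.join(" | "));
//=> hello | how | are | you | all | doing | , | I | hope | that | it's | good | ! | and | fine | . | Looking | forward | to | see | you | .
```

Note that anything not listed in [,.!?;:] and not made of letters or apostrophes, such as parentheses or digits, is skipped rather than returned as a token.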