
JavaScript RegExp: matching text while ignoring HTML

Is it possible to match "the dog is really really fat" in "The <strong>dog</strong> is really <em>really</em> fat!" and wrap whatever was matched as "<span class="highlight">WHAT WAS MATCHED</span>"?

I don't mean this case specifically, but in general: is there a way to search the text while ignoring the HTML, keep the HTML in the end result, and just add the span above around everything that matched?

EDIT:
Given the problem of overlapping HTML tags, is it possible to match a phrase and just add a span around each of the matched words? The catch is that I don't want the word "dog" to be matched when it is not in the searched context, in this case "the dog is really really fat".

+7
6 answers

Update:

Here is a working fiddle that does what you want. However, you will need to update htmlTagRegEx to handle matching for any HTML tag, as it just does a simple match and will not handle all cases.

http://jsfiddle.net/briguy37/JyL4J/

The code is also below. Basically, it takes the HTML tags out one by one, does a replace in the remaining text to add the highlight span around the matched selection, and then pushes the HTML tags back in one by one. It's ugly, but it's the easiest way I could think of to get it to work...

    function highlightInElement(elementId, text) {
        var element = document.getElementById(elementId);
        var elementHtml = element.innerHTML;
        var tags = [];
        var tagLocations = [];
        // Simple pattern: only matches attribute-free tags such as <em> or </em>
        var htmlTagRegEx = /<\/?\w+>/;

        // Strip the tags from elementHtml and keep track of them
        var htmlTag;
        while ((htmlTag = elementHtml.match(htmlTagRegEx))) {
            tagLocations.push(htmlTag.index);
            tags.push(htmlTag[0]);
            elementHtml = elementHtml.replace(htmlTag[0], '');
        }

        // Search for the literal text in the stripped HTML
        var textLocation = elementHtml.indexOf(text);
        if (textLocation === -1) {
            return; // no match: leave the element untouched
        }

        // Add the highlight
        var highlightHTMLStart = '<span class="highlight">';
        var highlightHTMLEnd = '</span>';
        var textEndLocation = textLocation + text.length;
        elementHtml = elementHtml.substring(0, textLocation) +
            highlightHTMLStart + text + highlightHTMLEnd +
            elementHtml.substring(textEndLocation);

        // Plug the HTML tags back in, last tag first
        for (var i = tagLocations.length - 1; i >= 0; i--) {
            var location = tagLocations[i];
            if (location > textEndLocation) {
                location += highlightHTMLStart.length + highlightHTMLEnd.length;
            } else if (location > textLocation) {
                location += highlightHTMLStart.length;
            }
            elementHtml = elementHtml.substring(0, location) + tags[i] + elementHtml.substring(location);
        }

        // Update the innerHTML of the element
        element.innerHTML = elementHtml;
    }
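
Regarding the overlapping-tags problem from the question's edit, here is a rough sketch of a variant that wraps each text run inside the match in its own span, so no span ever straddles a tag. It is not the code from the fiddle: `highlightPhrase` is a hypothetical name, the tag pattern is an assumed broader one (it allows attributes but will break on a ">" inside an attribute value), and the search is a case-sensitive literal match.

```javascript
// Hypothetical helper: highlight `phrase` in `html`, one span per text run.
function highlightPhrase(html, phrase) {
  var open = '<span class="highlight">';
  var close = '</span>';
  // Assumed tag pattern. Because of the capture group, split() keeps the tags:
  // even indexes are text runs, odd indexes are tags.
  var parts = html.split(/(<\/?[a-zA-Z][^>]*>)/);

  // Build the tag-free text and look for the phrase in it.
  var plain = '';
  for (var i = 0; i < parts.length; i += 2) {
    plain += parts[i];
  }
  var start = phrase.length ? plain.indexOf(phrase) : -1;
  if (start === -1) {
    return html; // nothing to highlight
  }
  var end = start + phrase.length;

  var out = '';
  var offset = 0; // position of the current text run within `plain`
  for (var j = 0; j < parts.length; j++) {
    var part = parts[j];
    if (j % 2 === 1) { // tag: copy through unchanged
      out += part;
      continue;
    }
    // Clip the match range [start, end) to this text run.
    var from = Math.max(start, offset) - offset;
    var to = Math.min(end, offset + part.length) - offset;
    if (from < to) {
      out += part.slice(0, from) + open + part.slice(from, to) + close + part.slice(to);
    } else {
      out += part;
    }
    offset += part.length;
  }
  return out;
}

var sample = 'The <strong>dog</strong> is really <em>really</em> fat!';
console.log(highlightPhrase(sample, 'dog is really really fat'));
```

With the sample string, "dog" inside <strong> and "really" inside <em> each get their own span, and a lone "dog" elsewhere in the text is not touched because only the first full-phrase match is wrapped.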
+8

Naah... just use good old RegExp ;)

    var htmlString = "The <strong>dog</strong> is really <em>really</em> fat!";
    // Matches opening, closing and self-closing tags, including attributes
    var regexp = /<\/?\w+((\s+\w+(\s*=\s*(?:\".*?"|'.*?'|[^'\">\s]+))?)+\s*|\s*)\/?>/gi;
    // Strips every tag, then wraps the remaining text in the highlight span
    var result = '<span class="highlight">' + htmlString.replace(regexp, '') + '</span>';
+5

A simple way with jQuery would be:

    var originalHtml = $("#div").html();
    // (?![^<>]*>) skips occurrences of keyword that sit inside a tag
    var newHtml = originalHtml.replace(new RegExp(keyword + "(?![^<>]*>)", "g"), function (e) {
        return "<span class='highlight'>" + e + "</span>";
    });
    $("#div").html(newHtml);

This works great for me.
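
To show what the lookahead is doing, here is a minimal sketch on a plain string, without jQuery. The names `escapeRegExp` and `highlightKeyword` are made up for the example, and the escaping step is an addition of mine so that a keyword containing characters like "." or "(" is matched literally. Note that this approach still cannot match a phrase that spans a tag boundary.

```javascript
// Hypothetical helper: escape regex metacharacters so the keyword is literal.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: wrap every occurrence of keyword found outside of tags.
function highlightKeyword(html, keyword) {
  // (?![^<>]*>) rejects a match when the next angle bracket ahead is ">",
  // which means the match sits inside a tag such as <p class="dog">.
  var re = new RegExp(escapeRegExp(keyword) + '(?![^<>]*>)', 'g');
  return html.replace(re, function (m) {
    return '<span class="highlight">' + m + '</span>';
  });
}

console.log(highlightKeyword('<p class="dog">dog</p>', 'dog'));
// <p class="dog"><span class="highlight">dog</span></p>
```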

+1

Here is a working example of a regex that excludes matches inside HTML tags as well as inside script blocks:

http://refiddle.com/lwy6

Use this regex in your replace() call.

  /(a)(?!([^<])*?>)(?!<script[^>]*?>)(?![^<]*?<\/script>|$)/gi 
+1

You can use a string replace with the expression </?\w*> and you will get the plain string.
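
A quick sketch of that replacement, using the sample string from the question. The `g` flag is my addition (without it only the first tag is removed), and the second call shows a limitation: the pattern does not match tags that carry attributes.

```javascript
var html = 'The <strong>dog</strong> is really <em>really</em> fat!';

// Remove every attribute-free tag; the g flag makes replace() hit all of them.
var plain = html.replace(/<\/?\w*>/g, '');
console.log(plain); // The dog is really really fat!

// Limitation: a tag with attributes is left in place.
var withAttr = '<a href="#">x</a>'.replace(/<\/?\w*>/g, '');
console.log(withAttr); // <a href="#">x
```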

0

If you use jQuery, you can call the text() method on the element containing the text you are looking for. Given this markup:

 <p id="the-text"> The <strong>dog</strong> is really <em>really</em> fat! </p> 

This will give "The dog is really really fat!":

 $('#the-text').text(); 

You can run the regular expression search on that text instead of trying to do it on the markup.

Without jQuery, I'm not sure that you can extract and merge text nodes from all children.
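
For what it's worth, plain DOM can do this too: the standard `textContent` property of an element returns the concatenated text of all its descendants. As a sketch of what that merging amounts to, here is a hand-rolled version; `collectText` is a hypothetical name, and it only relies on the standard `nodeType`, `nodeValue` and `childNodes` properties.

```javascript
// Hypothetical helper: concatenate all text nodes under `node`, in document order.
function collectText(node) {
  if (node.nodeType === 3) { // 3 is Node.TEXT_NODE
    return node.nodeValue;
  }
  var text = '';
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    text += collectText(children[i]);
  }
  return text;
}
```

In a browser you would call it as `collectText(document.getElementById('the-text'))`, although for an element `document.getElementById('the-text').textContent` should give the same string with less code.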

-2
