Getting MySQL full-text search context in PHP (and security)

I do a full-text search on the MySQL table "pages". I show a list of pages that match the keyword in their "title" (plain text, VARCHAR, 255) or "content" (HTML, TEXT). When a match is found in the "content" field, I would like to display the fragment in which the match was found. I have no idea how to do this.

Can you point me in the right direction?

    $query = '
        SELECT *, MATCH(title, content) AGAINST("'.$keyword.'") AS score
        FROM page
        WHERE MATCH(title, content) AGAINST("'.$keyword.'")
        ORDER BY score DESC
    ';
    $result = mysql_query($query) or die (mysql_error());
    if(mysql_num_rows($result) > 0) {
        $output .= '<p>Your keyword matches the following pages:</p>';
        while($row = mysql_fetch_assoc($result)){
            $title = htmlentities($row['title']);
            $content = htmlentities(strip_tags($row['content']));
            $content = limit_text($content, 250); // Cuts it down to 250 characters plus ...
            $output .= '<h2>'.$title.'</h2>';
            if(trim($content) != '') {
                $output .= '<p>'.$content.'</p>'; // I'd like to place a snippet here with the matched context
            }
        }
    } else {
        $output .= '<p>Keyword not found...</p>';
    }

In addition, I have a security question. Right now I am checking $keyword three ways:

  • Not empty?
  • More than 2 characters?
  • Not dangerous? (see below).

I use a regex that matches the following to check whether the user input is dangerous:

 <script|&lt;script|&gt;script|document.|alert|bcc:|cc:|x-mailer:|to:|recipient|truncate|drop table 

This may be a little ridiculous and easy to work around, but it is at least a minimal form of protection against XSS exploits. What is the recommended way to filter a search keyword securely? Is PHPIDS overkill?

+6
security mysql search full-text-search
3 answers

This should help you get started on the context part...

    // return the part of the content where the keyword was matched
    function get_surrounding_text($keyword, $content, $padding)
    {
        $position = strpos($content, $keyword);
        // starting at (where keyword was found - padding), retrieve
        // (padding + keyword length + padding) characters from the content
        $snippet = substr($content, $position - $padding, (strlen($keyword) + $padding * 2));
        return '...' . $snippet . '...';
    }

    $content = 'this is a really long string of characters with a magic word buried somewhere in it';
    $keyword = 'magic';
    echo get_surrounding_text($keyword, $content, 15);
    // echoes '...racters with a magic word buried so...'

This function does not handle cases where the padding would extend beyond the bounds of the content string, for example when the keyword is found near the beginning or end of the content. It also does not account for multiple matches, a keyword that is not found at all, and so on. But it should hopefully at least point you in the right direction.
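
The boundary cases mentioned above could be handled roughly as follows. This is only a sketch, not part of the original answer: `get_snippet` is a hypothetical name, it works on bytes rather than characters, and it only looks at the first match.

```php
<?php
// Hypothetical helper: like get_surrounding_text(), but the window is clamped
// to the string bounds and null is returned when the keyword does not occur.
function get_snippet(string $keyword, string $content, int $padding): ?string
{
    if ($keyword === '') {
        return null;
    }

    // stripos() is case-insensitive; it only finds the first occurrence
    $position = stripos($content, $keyword);
    if ($position === false) {
        return null;
    }

    // Clamp the window so it never starts before 0 or ends past the string
    $length = strlen($content);
    $start  = max(0, $position - $padding);
    $end    = min($length, $position + strlen($keyword) + $padding);

    $snippet = substr($content, $start, $end - $start);

    // Only add an ellipsis on a side where text was actually cut off
    return ($start > 0 ? '...' : '') . $snippet . ($end < $length ? '...' : '');
}
```

Because the "content" column holds HTML, it probably makes sense to run `strip_tags()` on it before extracting the snippet, as the question's code already does. For UTF-8 text, the `mb_*` equivalents (`mb_stripos`, `mb_substr`, `mb_strlen`) would avoid cutting a multibyte character in half.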

+6

Instead of trying to filter the $keyword variable yourself, you can simply use a prepared statement and never worry about missing a potential exploit:

    <?php
    $stmt = $dbh->prepare("INSERT INTO REGISTRY (name, value) VALUES (:name, :value)");
    $stmt->bindParam(':name', $name);
    $stmt->bindParam(':value', $value);

    // insert one row
    $name = 'one';
    $value = 1;
    $stmt->execute();

    // insert another row with different values
    $name = 'two';
    $value = 2;
    $stmt->execute();
    ?>
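
Applied to the search query from the question, the same idea might look like the sketch below. It assumes `$dbh` is an existing PDO connection to the MySQL database. The table and column names are taken from the question's query, while `search_pages` and `SEARCH_SQL` are names made up for illustration.

```php
<?php
// Sketch only: the keyword is never concatenated into the SQL string.
const SEARCH_SQL = '
    SELECT *, MATCH(title, content) AGAINST(:kw_score) AS score
    FROM page
    WHERE MATCH(title, content) AGAINST(:kw_where)
    ORDER BY score DESC';

function search_pages(PDO $dbh, string $keyword): array
{
    $stmt = $dbh->prepare(SEARCH_SQL);

    // Two distinct placeholders are used because, with native (non-emulated)
    // prepares, PDO does not allow the same named parameter to appear twice.
    $stmt->execute([
        ':kw_score' => $keyword,
        ':kw_where' => $keyword,
    ]);

    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}
```

The prepared statement only covers SQL injection. XSS is a separate concern and is handled at output time, by escaping whatever is echoed back into the page (for example with `htmlspecialchars()`), which the question's code already does for the title and content.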
+2

If I were you, I would most likely pass $keyword through a function that cleans it first. And for the record, you are better off putting all the words in $keyword into an array, so that you can use boolean search if necessary (for example, put + in front of each word to get an AND effect).
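
A minimal sketch of that idea, assuming the goal is to require every word. The helper name `to_boolean_query` is hypothetical. Characters that act as operators in MySQL boolean mode are replaced by spaces, so they cannot change the meaning of the search.

```php
<?php
// Hypothetical helper: turns "mysql search" into "+mysql +search" for use
// with AGAINST(... IN BOOLEAN MODE), so that every word must be present.
function to_boolean_query(string $keyword): string
{
    // Replace boolean-mode operator characters with spaces
    $clean = preg_replace('/[+\-<>()~*"@]+/', ' ', $keyword);

    // Split on whitespace and drop empty pieces
    $words = preg_split('/\s+/', $clean, -1, PREG_SPLIT_NO_EMPTY);

    $terms = [];
    foreach ($words as $word) {
        $terms[] = '+' . $word;
    }

    return implode(' ', $terms);
}
```

The resulting string would still be passed as a bound parameter, for example to `MATCH(title, content) AGAINST(:kw IN BOOLEAN MODE)`, rather than concatenated into the query.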

0