Proposal for Solution: Banned Word List -Ban only Exact Word (Technics)

by Milo ⌂, Thursday, February 26, 2015, 13:29 (1001 days ago) @ SDN001

Hi,

Having looked through this forum, it looks like the bad language filter does a regex search on the entire post, so it finds even partial matches like the ones I have above.

At the moment, the function get_not_accepted_words uses strpos, i.e. a plain substring search.

foreach($not_accepted_words as $not_accepted_word)
     {
      if($not_accepted_word!='' && my_strpos($string, my_strtolower($not_accepted_word, CHARSET), 0, CHARSET)!==false)
       {
        $found_not_accepted_words[] = $not_accepted_word;
       }
     }

Thus, the character sequence POS will also be found in position, positive etc. To prevent this behaviour, a regular expression can be used (untested):

foreach($not_accepted_words as $not_accepted_word)
     {
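      # preg_quote escapes regex metacharacters (and the / delimiter) in the banned word
      if($not_accepted_word!='' && preg_match("/\\b".preg_quote($not_accepted_word, "/")."\\b/i",$string))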
      #if($not_accepted_word!='' && my_strpos($string, my_strtolower($not_accepted_word, CHARSET), 0, CHARSET)!==false)
       {
        $found_not_accepted_words[] = $not_accepted_word;
       }
     }

The \b is the so-called word boundary. It ensures that only the whole word POS is matched (or pos, because of the i flag) and that words like position, positive etc. are ignored.
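
To illustrate the difference between the substring search and the word-boundary match, here is a minimal standalone sketch. The function name find_whole_words is hypothetical and not part of my little forum; it only mirrors the loop above and uses PHP's built-in preg_match, preg_quote and stripos instead of the forum's own my_strpos/my_strtolower helpers.

```php
<?php
// Hypothetical helper (not forum code): returns the banned words
// that occur in $text as whole words, ignoring case.
function find_whole_words(array $words, $text)
{
    $found = [];
    foreach ($words as $word) {
        if ($word === '') {
            continue; // skip empty list entries, as the original loop does
        }
        // preg_quote() escapes regex metacharacters in the word;
        // its second argument also escapes the "/" delimiter.
        $pattern = '/\b' . preg_quote($word, '/') . '\b/i';
        if (preg_match($pattern, $text) === 1) {
            $found[] = $word;
        }
    }
    return $found;
}

// Substring search: reports a hit although POS is only part of other words.
var_dump(stripos('A positive position.', 'POS') !== false);

// Word-boundary search: no hit for the same text ...
var_dump(find_whole_words(['POS'], 'A positive position.'));

// ... but a hit when pos stands alone.
var_dump(find_whole_words(['POS'], 'The pos terminal failed.'));
```

Note that \b relies on what the regex engine treats as a word character, so banned words that begin or end with punctuation, or that contain non-ASCII letters, may need extra handling (for UTF-8 text the u modifier is probably required). That part is an assumption and not tested against the forum code.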

regards
Micha

--
Surveyor-Software: Geodetic Network Adjustment & Deformation-Analysis and Transformation

