Calculate Differences Between Strings of Text With PHP

If you've ever dealt with quality and version management, you will appreciate the need to record every change made to a piece of text for quality assurance. Some time back I was working on a small project with the following requirements:

  • Record specific changes to the published version of a document.
  • Provide archived versions of all published data.
  • Render all changes to text on a screen before it was saved to the database.
  • Provide a "roll-back" feature to the last saved version.

The following are a few options I investigated before implementing my basic solution.

PHP's array_diff() Function

My first instinct was PHP's native array_diff() function. It compares two or more arrays and returns an array containing the entries of the first array (keys preserved) whose values are not present in any of the other arrays.

Example:

<?php
$array1 = array("yellow", "green", "blue", "red", "white", "black", "purple");
$array2 = array("yellow", "orange", "purple", "blue", "red");
$result = array_diff($array1, $array2);

print_r($result);

This will output:

Array
(
    [1] => green
    [4] => white
    [5] => black
)

Keep in mind that array_diff() preserves the keys of the first array, so the result is not indexed consecutively from zero. Although not entirely relevant here, PHP's array_merge() function is an easy way to overcome this: called with a single array, it renumbers the numeric keys from zero.

<?php
$result2 = array_merge(array_diff($array1, $array2));
print_r($result2);

Result:

Array
(
    [0] => green
    [1] => white
    [2] => black
)
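If the only goal is to renumber the keys, array_values() states that intent a little more directly than a single-argument array_merge(). A minimal sketch with two colour lists:

```php
<?php
$array1 = array("yellow", "green", "blue", "red", "white", "black", "purple");
$array2 = array("yellow", "orange", "purple", "blue", "red");

// array_diff() keeps the original keys (1, 4, 5);
// array_values() discards them and renumbers from zero.
$reindexed = array_values(array_diff($array1, $array2));
print_r($reindexed);
```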

You can print out the values as text (or do anything else with them) with code similar to the following:

<?php
$new_array = array_diff($array1, $array2);
// each() was removed in PHP 8; foreach does the same job
foreach ($new_array as $key => $val) {
    echo "$key -> $val <br>";
}

Result:

1 -> green
4 -> white
5 -> black

The array_diff() function was a nice start, but it only reports which values are missing from the other array and knows nothing about word order or position. I was somewhat clueless about how to build the recursive, element-by-element comparison I needed on top of it. Thankfully, others had already addressed the problem.
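To see why array_diff() alone falls short for text, note that it only asks whether a value occurs anywhere in the other array, so word order and repeated words are ignored. A small sketch with two made-up sentences:

```php
<?php
// Two sentences made of the same words in a different order.
$old = explode(' ', "the cat sat on the mat");
$new = explode(' ', "the mat sat on the cat");

// Every word of $old occurs somewhere in $new (and vice versa),
// so nothing is reported even though the sentences differ.
$removed = array_diff($old, $new);
$added   = array_diff($new, $old);

var_dump($removed); // empty array
var_dump($added);   // empty array
```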

Simple Diff Algorithm in PHP

The following functions compare two strings word by word and mark up the differences:

<?php
/*
    Calculate Differences Between Strings of Text With PHP
    http://www.beliefmedia.com/code/php-snippets/string-differences
*/

// Compare two arrays of words. Unchanged words are returned as plain strings;
// changes are returned as arrays: 'd' => deleted words, 'i' => inserted words.
function diff($old, $new){
    $matrix = array();
    $maxlen = 0;
    foreach ($old as $oindex => $ovalue){
        // Positions in $new holding the same word (strict comparison, so
        // numeric-looking strings such as "10" and "1e1" are not treated as equal)
        $nkeys = array_keys($new, $ovalue, true);
        foreach ($nkeys as $nindex){
            // Length of the common run of words ending at this pair of positions
            $matrix[$oindex][$nindex] = isset($matrix[$oindex - 1][$nindex - 1]) ?
                $matrix[$oindex - 1][$nindex - 1] + 1 : 1;
            if ($matrix[$oindex][$nindex] > $maxlen) {
                $maxlen = $matrix[$oindex][$nindex];
                $omax = $oindex + 1 - $maxlen;
                $nmax = $nindex + 1 - $maxlen;
            }
        }
    }
    // Nothing in common: all old words are deleted, all new words are inserted
    if ($maxlen == 0) return array(array('d'=>$old, 'i'=>$new));

    // Recurse on the words before and after the longest common run
    return array_merge(
        diff(array_slice($old, 0, $omax), array_slice($new, 0, $nmax)),
        array_slice($new, $nmax, $maxlen),
        diff(array_slice($old, $omax + $maxlen), array_slice($new, $nmax + $maxlen)));
}

// Split both strings on spaces, diff them, and wrap changes in <del>/<ins> tags
function htmlDiff($old, $new) {
    $ret = '';
    $diff = diff(explode(' ', $old), explode(' ', $new));
    foreach ($diff as $k) {
        if (is_array($k))
            $ret .= (!empty($k['d'])?"<del>".implode(' ',$k['d'])."</del> ":'').
                (!empty($k['i'])?"<ins>".implode(' ',$k['i'])."</ins> ":'');
        else $ret .= $k . ' ';
    }
    return $ret;
}

Usage:

<?php
$text1 = "The quick brown fox jumped over the lazy dog.";
$text2 = "The slow yellow cat walked over the lazy fox.";
echo htmlDiff($text1, $text2);

Result:

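The <del>quick brown fox jumped</del> <ins>slow yellow cat walked</ins> over the lazy <del>dog.</del> <ins>fox.</ins>

In a browser, the <del> text is shown struck through and the <ins> text underlined by default.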
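One thing to watch when rendering the result on screen: htmlDiff() concatenates the words straight into HTML, so any markup in the text is passed through untouched. The sketch below shows one possible way to escape the words first. It is an illustration only: renderDiff() is a hypothetical helper, not part of the original snippet, and it takes an array in the shape that diff() returns (written out by hand here so the example runs on its own).

```php
<?php
// Hypothetical variant of htmlDiff(): takes the array returned by diff()
// and escapes every word before wrapping changes in <del>/<ins>.
function renderDiff(array $ops) {
    $esc = function (array $words) {
        return htmlspecialchars(implode(' ', $words), ENT_QUOTES, 'UTF-8');
    };
    $ret = '';
    foreach ($ops as $op) {
        if (is_array($op)) {
            $ret .= (!empty($op['d']) ? '<del>' . $esc($op['d']) . '</del> ' : '')
                  . (!empty($op['i']) ? '<ins>' . $esc($op['i']) . '</ins> ' : '');
        } else {
            $ret .= $esc(array($op)) . ' ';
        }
    }
    return $ret;
}

// Hand-written input in the shape diff() produces.
$ops = array(
    'Use',
    array('d' => array('<b>'), 'i' => array('<strong>')),
    'tags',
);

$html = renderDiff($ops);
echo $html;
```

Only the <del> and <ins> wrappers reach the page as real tags; the angle brackets inside the text come out as &lt; and &gt;.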

The function works by "finding the longest sequence of words common to both strings, and recursively finding the longest sequences of the remainders of the string until the substrings have no words in common. At this point it adds the remaining new words as an insertion and the remaining old words as a deletion".
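The array returned by diff() carries enough information to rebuild either version of the text, which is one possible way to approach the roll-back requirement listed at the start. The sketch below is an illustration only: rebuildText() is a hypothetical helper, not part of the original snippet, and the $ops array is written out by hand in the shape diff() returns for the two example sentences (plain strings for unchanged words, arrays with 'd' and 'i' keys for changes).

```php
<?php
// Hypothetical helper: rebuild one side of a diff.
// $side is 'd' for the old text or 'i' for the new text.
function rebuildText(array $ops, $side) {
    $words = array();
    foreach ($ops as $op) {
        if (is_array($op)) {
            // Changed run: keep only the words belonging to the requested side.
            $words = array_merge($words, $op[$side]);
        } else {
            // Unchanged word: present in both versions.
            $words[] = $op;
        }
    }
    return implode(' ', $words);
}

// Hand-written ops in the shape diff() produces for the example sentences.
$ops = array(
    array('d' => array(), 'i' => array()),
    'The',
    array('d' => array('quick', 'brown', 'fox', 'jumped'),
          'i' => array('slow', 'yellow', 'cat', 'walked')),
    'over', 'the', 'lazy',
    array('d' => array('dog.'), 'i' => array('fox.')),
);

$oldText = rebuildText($ops, 'd');
$newText = rebuildText($ops, 'i');

echo $oldText, "\n"; // The quick brown fox jumped over the lazy dog.
echo $newText, "\n"; // The slow yellow cat walked over the lazy fox.
```

Storing the diff alongside the latest version would, in principle, let the previous version be recovered without a second full copy, although keeping complete archived versions as well is the more cautious choice.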
