Package com.logicaldoc.comparison.basic
Class DiffMatch
- java.lang.Object
-
- com.logicaldoc.comparison.basic.DiffMatch
-
public class DiffMatch extends Object
Class containing the diff, match and patch methods. Also contains the behaviour settings.
-
-
Nested Class Summary
Nested Classes (Modifier and Type, Class, Description)
static class DiffMatch.Diff
Class representing one diff operation.
static class DiffMatch.Operation
The data structure representing a diff is a linked list of Diff objects: {Diff(Operation.DELETE, "Hello"), Diff(Operation.INSERT, "Goodbye"), Diff(Operation.EQUAL, " world.")}, which means: delete "Hello", add "Goodbye" and keep " world.".
static class DiffMatch.Patch
Class representing one patch operation.
-
Field Summary
Fields (Modifier and Type, Field, Description)
short Diff_EditCost
Cost of an empty edit operation in terms of edit characters.
float Diff_Timeout
Number of seconds to map a diff before giving up (0 for infinity).
int Match_Distance
How far to search for a match (0 = exact location, 1000+ = broad match).
float Match_Threshold
At what point is no match declared (0.0 = perfection, 1.0 = very loose).
float Patch_DeleteThreshold
When deleting a large block of text (over ~64 characters), how close do the contents have to be to match the expected contents.
short Patch_Margin
Chunk size for context length.
-
Constructor Summary
Constructors (Constructor, Description)
DiffMatch()
-
Method Summary
All Methods, Instance Methods, Concrete Methods (Modifier and Type, Method, Description)
void diff_cleanupEfficiency(LinkedList<DiffMatch.Diff> diffs)
Reduce the number of edits by eliminating operationally trivial equalities.
void diff_cleanupMerge(LinkedList<DiffMatch.Diff> diffs)
Reorder and merge like edit sections.
void diff_cleanupSemantic(LinkedList<DiffMatch.Diff> diffs)
Reduce the number of edits by eliminating semantically trivial equalities.
void diff_cleanupSemanticLossless(LinkedList<DiffMatch.Diff> diffs)
Look for single edits surrounded on both sides by equalities which can be shifted sideways to align the edit to a word boundary.
int diff_commonPrefix(String text1, String text2)
Determine the common prefix of two strings.
int diff_commonSuffix(String text1, String text2)
Determine the common suffix of two strings.
LinkedList<DiffMatch.Diff> diff_fromDelta(String text1, String delta)
Given the original text1, and an encoded string which describes the operations required to transform text1 into text2, compute the full diff.
int diff_levenshtein(LinkedList<DiffMatch.Diff> diffs)
Compute the Levenshtein distance; the number of inserted, deleted or substituted characters.
LinkedList<DiffMatch.Diff> diff_main(String text1, String text2)
Find the differences between two texts.
LinkedList<DiffMatch.Diff> diff_main(String text1, String text2, boolean checklines)
Find the differences between two texts.
String diff_prettyHtml(LinkedList<DiffMatch.Diff> diffs)
Convert a Diff list into a pretty HTML report.
String diff_text1(LinkedList<DiffMatch.Diff> diffs)
Compute and return the source text (all equalities and deletions).
String diff_text2(LinkedList<DiffMatch.Diff> diffs)
Compute and return the destination text (all equalities and insertions).
String diff_toDelta(LinkedList<DiffMatch.Diff> diffs)
Crush the diff into an encoded string which describes the operations required to transform text1 into text2.
int diff_xIndex(LinkedList<DiffMatch.Diff> diffs, int loc)
Given loc, a location in text1, compute and return the equivalent location in text2.
int match_main(String text, String pattern, int loc)
Locate the best instance of 'pattern' in 'text' near 'loc'.
String patch_addPadding(LinkedList<DiffMatch.Patch> patches)
Add some padding on text start and end so that edges can match something.
Object[] patch_apply(LinkedList<DiffMatch.Patch> patches, String text)
Merge a set of patches onto the text.
LinkedList<DiffMatch.Patch> patch_deepCopy(LinkedList<DiffMatch.Patch> patches)
Given a list of patches, return another list that is identical.
List<DiffMatch.Patch> patch_fromText(String textline)
Parse a textual representation of patches and return a List of Patch objects.
LinkedList<DiffMatch.Patch> patch_make(String text1, String text2)
Compute a list of patches to turn text1 into text2.
LinkedList<DiffMatch.Patch> patch_make(String text1, LinkedList<DiffMatch.Diff> diffs)
Compute a list of patches to turn text1 into text2.
LinkedList<DiffMatch.Patch> patch_make(LinkedList<DiffMatch.Diff> diffs)
Compute a list of patches to turn text1 into text2.
void patch_splitMax(LinkedList<DiffMatch.Patch> patches)
Look through the patches and break up any which are longer than the maximum limit of the match algorithm.
String patch_toText(List<DiffMatch.Patch> patches)
Take a list of patches and return a textual representation.
-
-
-
Field Detail
-
Diff_Timeout
public float Diff_Timeout
Number of seconds to map a diff before giving up (0 for infinity).
-
Diff_EditCost
public short Diff_EditCost
Cost of an empty edit operation in terms of edit characters.
-
Match_Threshold
public float Match_Threshold
At what point is no match declared (0.0 = perfection, 1.0 = very loose).
-
Match_Distance
public int Match_Distance
How far to search for a match (0 = exact location, 1000+ = broad match). A match this many characters away from the expected location will add 1.0 to the score (0.0 is a perfect match).
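As a rough illustration of how Match_Threshold and Match_Distance could interact, the sketch below scores a candidate match. Only the distance term (a match Match_Distance characters away adds 1.0) comes from the description above. The class and method names are hypothetical, and both the error-ratio term and the handling of a zero distance are assumptions, not the documented implementation.

```java
final class MatchScoreSketch {
    /**
     * Hypothetical score: 0.0 is a perfect match, larger values are worse.
     * errors / patternLength is an ASSUMED accuracy term; the proximity term
     * follows the Match_Distance description (matchDistance chars away adds 1.0).
     */
    static float score(int errors, int patternLength,
                       int expectedLoc, int foundLoc, int matchDistance) {
        float accuracy = (float) errors / patternLength;
        int proximity = Math.abs(expectedLoc - foundLoc);
        if (matchDistance == 0) {
            // ASSUMPTION: with distance 0 only the exact location is acceptable.
            return proximity == 0 ? accuracy : 1.0f;
        }
        return accuracy + proximity / (float) matchDistance;
    }

    /** ASSUMPTION: a candidate is kept while its score stays within the threshold. */
    static boolean accepted(float score, float matchThreshold) {
        return score <= matchThreshold;
    }
}
```

Under these assumptions, an error-free match found 1000 characters from the expected location with a distance of 1000 scores 1.0, which only a threshold of 1.0 (very loose) would accept.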
-
Patch_DeleteThreshold
public float Patch_DeleteThreshold
When deleting a large block of text (over ~64 characters), how close do the contents have to be to match the expected contents. (0.0 = perfection, 1.0 = very loose). Note that Match_Threshold controls how closely the end points of a delete need to match.
-
Patch_Margin
public short Patch_Margin
Chunk size for context length.
-
-
Method Detail
-
diff_main
public LinkedList<DiffMatch.Diff> diff_main(String text1, String text2)
Find the differences between two texts. This overload makes the checklines argument of diff_main() optional: most of the time checklines is wanted, so it defaults to true, which runs a faster, slightly less optimal diff.
Parameters:
text1 - Old string to be diffed.
text2 - New string to be diffed.
Returns:
Linked list of Diff objects.
-
diff_main
public LinkedList<DiffMatch.Diff> diff_main(String text1, String text2, boolean checklines)
Find the differences between two texts.
Parameters:
text1 - Old string to be diffed.
text2 - New string to be diffed.
checklines - Speedup flag. If false, then don't run a line-level diff first to identify the changed areas. If true, then run a faster, slightly less optimal diff.
Returns:
Linked list of Diff objects.
-
diff_commonPrefix
public int diff_commonPrefix(String text1, String text2)
Determine the common prefix of two strings.
Parameters:
text1 - First string.
text2 - Second string.
Returns:
The number of characters common to the start of each string.
-
diff_commonSuffix
public int diff_commonSuffix(String text1, String text2)
Determine the common suffix of two strings.
Parameters:
text1 - First string.
text2 - Second string.
Returns:
The number of characters common to the end of each string.
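The two methods above return counts rather than substrings. A minimal sketch of that contract, using a plain linear scan, is shown below. The class and method names are hypothetical, and the actual implementation may use a different search strategy.

```java
final class CommonAffixSketch {
    /** Number of characters shared at the start of both strings. */
    static int commonPrefix(String text1, String text2) {
        int n = Math.min(text1.length(), text2.length());
        for (int i = 0; i < n; i++) {
            if (text1.charAt(i) != text2.charAt(i)) {
                return i;
            }
        }
        return n;
    }

    /** Number of characters shared at the end of both strings. */
    static int commonSuffix(String text1, String text2) {
        int len1 = text1.length();
        int len2 = text2.length();
        int n = Math.min(len1, len2);
        for (int i = 1; i <= n; i++) {
            if (text1.charAt(len1 - i) != text2.charAt(len2 - i)) {
                return i - 1;
            }
        }
        return n;
    }
}
```

For instance, commonPrefix("1234abcdef", "1234xyz") yields 4, and the shared prefix itself can be recovered with text1.substring(0, 4).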
-
diff_cleanupSemantic
public void diff_cleanupSemantic(LinkedList<DiffMatch.Diff> diffs)
Reduce the number of edits by eliminating semantically trivial equalities.
Parameters:
diffs - LinkedList of Diff objects.
-
diff_cleanupSemanticLossless
public void diff_cleanupSemanticLossless(LinkedList<DiffMatch.Diff> diffs)
Look for single edits surrounded on both sides by equalities which can be shifted sideways to align the edit to a word boundary. e.g.: The c<ins>at c</ins>ame. -> The <ins>cat </ins>came.
Parameters:
diffs - LinkedList of Diff objects.
-
diff_cleanupEfficiency
public void diff_cleanupEfficiency(LinkedList<DiffMatch.Diff> diffs)
Reduce the number of edits by eliminating operationally trivial equalities.
Parameters:
diffs - LinkedList of Diff objects.
-
diff_cleanupMerge
public void diff_cleanupMerge(LinkedList<DiffMatch.Diff> diffs)
Reorder and merge like edit sections. Merge equalities. Any edit section can move as long as it doesn't cross an equality.
Parameters:
diffs - LinkedList of Diff objects.
-
diff_xIndex
public int diff_xIndex(LinkedList<DiffMatch.Diff> diffs, int loc)
Given loc, a location in text1, compute and return the equivalent location in text2. e.g. "The cat" vs "The big cat", 1->1, 5->8
Parameters:
diffs - LinkedList of Diff objects.
loc - Location within text1.
Returns:
Location within text2.
-
diff_prettyHtml
public String diff_prettyHtml(LinkedList<DiffMatch.Diff> diffs)
Convert a Diff list into a pretty HTML report.
Parameters:
diffs - LinkedList of Diff objects.
Returns:
HTML representation.
-
diff_text1
public String diff_text1(LinkedList<DiffMatch.Diff> diffs)
Compute and return the source text (all equalities and deletions).
Parameters:
diffs - LinkedList of Diff objects.
Returns:
Source text.
-
diff_text2
public String diff_text2(LinkedList<DiffMatch.Diff> diffs)
Compute and return the destination text (all equalities and insertions).
Parameters:
diffs - LinkedList of Diff objects.
Returns:
Destination text.
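The relationship between a diff list and its two texts can be sketched with a small stand-in model. The Op and Edit types below are hypothetical substitutes for DiffMatch.Operation and DiffMatch.Diff, whose fields and constructors are not shown on this page. The example list is the Hello/Goodbye diff from the class summary.

```java
import java.util.Arrays;
import java.util.List;

final class DiffTextSketch {
    // Hypothetical stand-ins for DiffMatch.Operation and DiffMatch.Diff.
    enum Op { DELETE, INSERT, EQUAL }

    static final class Edit {
        final Op op;
        final String text;

        Edit(Op op, String text) {
            this.op = op;
            this.text = text;
        }
    }

    /** Source text: every equality and deletion, in order. */
    static String sourceText(List<Edit> diffs) {
        StringBuilder sb = new StringBuilder();
        for (Edit e : diffs) {
            if (e.op != Op.INSERT) {
                sb.append(e.text);
            }
        }
        return sb.toString();
    }

    /** Destination text: every equality and insertion, in order. */
    static String destinationText(List<Edit> diffs) {
        StringBuilder sb = new StringBuilder();
        for (Edit e : diffs) {
            if (e.op != Op.DELETE) {
                sb.append(e.text);
            }
        }
        return sb.toString();
    }

    /** The example diff: delete "Hello", add "Goodbye", keep " world.". */
    static List<Edit> example() {
        return Arrays.asList(
                new Edit(Op.DELETE, "Hello"),
                new Edit(Op.INSERT, "Goodbye"),
                new Edit(Op.EQUAL, " world."));
    }
}
```

With this model, sourceText(example()) gives "Hello world." and destinationText(example()) gives "Goodbye world.".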
-
diff_levenshtein
public int diff_levenshtein(LinkedList<DiffMatch.Diff> diffs)
Compute the Levenshtein distance; the number of inserted, deleted or substituted characters.
Parameters:
diffs - LinkedList of Diff objects.
Returns:
Number of changes.
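One way to derive such a count from a diff list is sketched below. It assumes that a deletion and an insertion in the same run of edits pair up as substitutions, so each run between equalities costs the larger of its inserted and deleted lengths. The Op and Edit types are hypothetical stand-ins for DiffMatch.Operation and DiffMatch.Diff, and the method is an illustration, not the documented implementation.

```java
import java.util.List;

final class LevenshteinSketch {
    // Hypothetical stand-ins for DiffMatch.Operation and DiffMatch.Diff.
    enum Op { DELETE, INSERT, EQUAL }

    static final class Edit {
        final Op op;
        final String text;

        Edit(Op op, String text) {
            this.op = op;
            this.text = text;
        }
    }

    static int levenshtein(List<Edit> diffs) {
        int total = 0;
        int insertions = 0;
        int deletions = 0;
        for (Edit e : diffs) {
            switch (e.op) {
                case INSERT:
                    insertions += e.text.length();
                    break;
                case DELETE:
                    deletions += e.text.length();
                    break;
                case EQUAL:
                    // An equality closes the current run of edits.
                    total += Math.max(insertions, deletions);
                    insertions = 0;
                    deletions = 0;
                    break;
            }
        }
        // Account for a trailing run that is not followed by an equality.
        return total + Math.max(insertions, deletions);
    }
}
```

For the diff that deletes "Hello" (5 characters), inserts "Goodbye" (7 characters) and keeps " world.", this sketch counts 7 changes: five substitutions plus two insertions.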
-
diff_toDelta
public String diff_toDelta(LinkedList<DiffMatch.Diff> diffs)
Crush the diff into an encoded string which describes the operations required to transform text1 into text2. E.g. =3\t-2\t+ing -> Keep 3 chars, delete 2 chars, insert 'ing'. Operations are tab-separated. Inserted text is escaped using %xx notation.
Parameters:
diffs - LinkedList of Diff objects.
Returns:
Delta text.
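The delta format described above can be sketched as follows. The Op and Edit types are hypothetical stand-ins for DiffMatch.Operation and DiffMatch.Diff. Escaping here goes through java.net.URLEncoder with spaces rewritten as %20, which is an assumption: the exact set of characters the real method leaves unescaped is not stated on this page.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;

final class DeltaSketch {
    // Hypothetical stand-ins for DiffMatch.Operation and DiffMatch.Diff.
    enum Op { DELETE, INSERT, EQUAL }

    static final class Edit {
        final Op op;
        final String text;

        Edit(Op op, String text) {
            this.op = op;
            this.text = text;
        }
    }

    /** ASSUMED escaping: URL-encode as UTF-8, then write spaces as %20. */
    static String escape(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    static String toDelta(List<Edit> diffs) {
        List<String> tokens = new ArrayList<>();
        for (Edit e : diffs) {
            switch (e.op) {
                case EQUAL:
                    tokens.add("=" + e.text.length());  // keep N chars
                    break;
                case DELETE:
                    tokens.add("-" + e.text.length());  // delete N chars
                    break;
                case INSERT:
                    tokens.add("+" + escape(e.text));   // insert literal text
                    break;
            }
        }
        return String.join("\t", tokens);
    }
}
```

Equalities and deletions carry only a length because the receiver already holds text1. Only inserted text has to travel in the delta, which is why it alone needs escaping.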
-
diff_fromDelta
public LinkedList<DiffMatch.Diff> diff_fromDelta(String text1, String delta) throws IllegalArgumentException
Given the original text1, and an encoded string which describes the operations required to transform text1 into text2, compute the full diff.
Parameters:
text1 - Source string for the diff.
delta - Delta text.
Returns:
LinkedList of Diff objects.
Throws:
IllegalArgumentException - If invalid input.
-
match_main
public int match_main(String text, String pattern, int loc)
Locate the best instance of 'pattern' in 'text' near 'loc'. Returns -1 if no match found.
Parameters:
text - The text to search.
pattern - The pattern to search for.
loc - The location to search around.
Returns:
Best match index or -1.
-
patch_make
public LinkedList<DiffMatch.Patch> patch_make(String text1, String text2)
Compute a list of patches to turn text1 into text2. A set of diffs will be computed.
Parameters:
text1 - Old text.
text2 - New text.
Returns:
LinkedList of Patch objects.
-
patch_make
public LinkedList<DiffMatch.Patch> patch_make(LinkedList<DiffMatch.Diff> diffs)
Compute a list of patches to turn text1 into text2. text1 will be derived from the provided diffs.
Parameters:
diffs - LinkedList of Diff objects for text1 to text2.
Returns:
LinkedList of Patch objects.
-
patch_make
public LinkedList<DiffMatch.Patch> patch_make(String text1, LinkedList<DiffMatch.Diff> diffs)
Compute a list of patches to turn text1 into text2. text2 is not provided; diffs are the delta between text1 and text2.
Parameters:
text1 - Old text.
diffs - LinkedList of Diff objects for text1 to text2.
Returns:
LinkedList of Patch objects.
-
patch_deepCopy
public LinkedList<DiffMatch.Patch> patch_deepCopy(LinkedList<DiffMatch.Patch> patches)
Given a list of patches, return another list that is identical.
Parameters:
patches - LinkedList of Patch objects.
Returns:
LinkedList of Patch objects.
-
patch_apply
public Object[] patch_apply(LinkedList<DiffMatch.Patch> patches, String text)
Merge a set of patches onto the text. Return a patched text, as well as an array of true/false values indicating which patches were applied.
Parameters:
patches - LinkedList of Patch objects.
text - Old text.
Returns:
Two element Object array, containing the new text and an array of boolean values.
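Because the result is an untyped Object[], callers have to cast each element. The helper below sketches one way to unpack it. It assumes the first element is a String and the second is a primitive boolean[], which this page does not state outright (it could be Boolean[] instead). The class and method names are hypothetical.

```java
final class PatchApplyResultSketch {
    /** ASSUMPTION: element 0 holds the patched text as a String. */
    static String patchedText(Object[] result) {
        return (String) result[0];
    }

    /** ASSUMPTION: element 1 holds one primitive boolean flag per patch. */
    static boolean[] appliedFlags(Object[] result) {
        return (boolean[]) result[1];
    }

    /** True only if every patch reported success. */
    static boolean allApplied(Object[] result) {
        for (boolean ok : appliedFlags(result)) {
            if (!ok) {
                return false;
            }
        }
        return true;
    }
}
```

A caller would pass the array returned by patch_apply straight to these helpers and treat a false from allApplied as a sign that the patched text needs review.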
-
patch_addPadding
public String patch_addPadding(LinkedList<DiffMatch.Patch> patches)
Add some padding on text start and end so that edges can match something. Intended to be called only from within patch_apply.
Parameters:
patches - LinkedList of Patch objects.
Returns:
The padding string added to each side.
-
patch_splitMax
public void patch_splitMax(LinkedList<DiffMatch.Patch> patches)
Look through the patches and break up any which are longer than the maximum limit of the match algorithm. Intended to be called only from within patch_apply.
Parameters:
patches - LinkedList of Patch objects.
-
patch_toText
public String patch_toText(List<DiffMatch.Patch> patches)
Take a list of patches and return a textual representation.
Parameters:
patches - List of Patch objects.
Returns:
Text representation of patches.
-
patch_fromText
public List<DiffMatch.Patch> patch_fromText(String textline) throws IllegalArgumentException
Parse a textual representation of patches and return a List of Patch objects.
Parameters:
textline - Text representation of patches.
Returns:
List of Patch objects.
Throws:
IllegalArgumentException - If invalid input.
-
-