Package com.logicaldoc.comparison.basic
Class DiffMatch
java.lang.Object
com.logicaldoc.comparison.basic.DiffMatch
Class containing the diff, match and patch methods. Also contains the
behaviour settings.
-
Nested Class Summary
Modifier and Type | Class | Description
static class | DiffMatch.Diff | Class representing one diff operation.
static enum | DiffMatch.Operation | The data structure representing a diff is a linked list of Diff objects: {Diff(Operation.DELETE, "Hello"), Diff(Operation.INSERT, "Goodbye"), Diff(Operation.EQUAL, " world.")} which means: delete "Hello", add "Goodbye" and keep " world."
static class | DiffMatch.Patch | Class representing one patch operation.
-
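The diff list described for DiffMatch.Operation can be pictured with a small standalone sketch. The `Operation` and `Diff` types below are hypothetical stand-ins written for illustration; the field names and constructor shape are assumptions, not the actual members of DiffMatch.Diff.

```java
import java.util.LinkedList;

// Hypothetical stand-ins for DiffMatch.Operation and DiffMatch.Diff (names assumed).
final class DiffListSketch {
    enum Operation { DELETE, INSERT, EQUAL }

    static final class Diff {
        final Operation operation;
        final String text;

        Diff(Operation operation, String text) {
            this.operation = operation;
            this.text = text;
        }

        @Override
        public String toString() {
            return "Diff(" + operation + ", \"" + text + "\")";
        }
    }

    // Builds the example list: delete "Hello", add "Goodbye", keep " world."
    static LinkedList<Diff> helloToGoodbye() {
        LinkedList<Diff> diffs = new LinkedList<>();
        diffs.add(new Diff(Operation.DELETE, "Hello"));
        diffs.add(new Diff(Operation.INSERT, "Goodbye"));
        diffs.add(new Diff(Operation.EQUAL, " world."));
        return diffs;
    }

    public static void main(String[] args) {
        for (Diff d : helloToGoodbye()) {
            System.out.println(d);
        }
    }
}
```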
Constructor Summary
DiffMatch()
-
Method Summary
Modifier and Type | Method | Description
void | diffCleanupEfficiency | Reduce the number of edits by eliminating operationally trivial equalities.
void | diffCleanupMerge(LinkedList<DiffMatch.Diff> diffs) | Reorder and merge like edit sections.
void | diffCleanupSemantic | Reduce the number of edits by eliminating semantically trivial equalities.
void | diffCleanupSemanticLossless | Look for single edits surrounded on both sides by equalities which can be shifted sideways to align the edit to a word boundary.
int | diffCommonPrefix(String text1, String text2) | Determine the common prefix of two strings.
int | diffCommonSuffix(String text1, String text2) | Determine the common suffix of two strings.
LinkedList<DiffMatch.Diff> | diffFromDelta(String text1, String delta) | Given the original text1, and an encoded string which describes the operations required to transform text1 into text2, compute the full diff.
int | diffLevenshtein(LinkedList<DiffMatch.Diff> diffs) | Compute the Levenshtein distance; the number of inserted, deleted or substituted characters.
 | diffMain | Find the differences between two texts.
 | diffMain | Find the differences between two texts.
 | diffPrettyHtml(List<DiffMatch.Diff> diffs) | Convert a Diff list into a pretty HTML report.
 | diffText1(LinkedList<DiffMatch.Diff> diffs) | Compute and return the source text (all equalities and deletions).
 | diffText2(LinkedList<DiffMatch.Diff> diffs) | Compute and return the destination text (all equalities and insertions).
 | diffToDelta(LinkedList<DiffMatch.Diff> diffs) | Crush the diff into an encoded string which describes the operations required to transform text1 into text2.
int | diffXIndex(LinkedList<DiffMatch.Diff> diffs, int loc) | loc is a location in text1, compute and return the equivalent location in text2.
int | matchMain | Locate the best instance of 'pattern' in 'text' near 'loc'.
 | patchAddPadding(LinkedList<DiffMatch.Patch> patches) | Add some padding on text start and end so that edges can match something.
Object[] | patchApply(LinkedList<DiffMatch.Patch> patches, String text) | Merge a set of patches onto the text.
 | patchDeepCopy(LinkedList<DiffMatch.Patch> patches) | Given an array of patches, return another array that is identical.
List<DiffMatch.Patch> | patchFromText(String textline) | Parse a textual representation of patches and return a List of Patch objects.
 | patchMake | Compute a list of patches to turn text1 into text2.
 | patchMake(String text1, LinkedList<DiffMatch.Diff> diffs) | Compute a list of patches to turn text1 into text2.
 | patchMake(LinkedList<DiffMatch.Diff> diffs) | Compute a list of patches to turn text1 into text2.
void | patchSplitMax(LinkedList<DiffMatch.Patch> patches) | Look through the patches and break up any which are longer than the maximum limit of the match algorithm.
 | patchToText(List<DiffMatch.Patch> patches) | Take a list of patches and return a textual representation.
-
Constructor Details
-
DiffMatch
public DiffMatch()
-
-
Method Details
-
diffMain
Find the differences between two texts. Runs a faster, slightly less optimal diff. This overload makes the 'checklines' argument of diffMain() optional: checklines is wanted most of the time, so it defaults to true.
Parameters:
text1 - Old string to be diffed.
text2 - New string to be diffed.
Returns:
Linked List of Diff objects.
-
diffMain
Find the differences between two texts.
Parameters:
text1 - Old string to be diffed.
text2 - New string to be diffed.
checklines - Speedup flag. If false, then don't run a line-level diff first to identify the changed areas. If true, then run a faster, slightly less optimal diff.
Returns:
Linked List of Diff objects.
-
diffCommonPrefix
Determine the common prefix of two strings.
Parameters:
text1 - First string.
text2 - Second string.
Returns:
The number of characters common to the start of each string.
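A minimal sketch of the prefix count, assuming a plain character-by-character scan (the actual implementation may use a different strategy). `CommonPrefixSketch` and `commonPrefix` are illustrative names, not part of DiffMatch.

```java
// Illustrative sketch only; not the DiffMatch implementation.
final class CommonPrefixSketch {
    // Counts how many leading characters the two strings share.
    static int commonPrefix(String text1, String text2) {
        int limit = Math.min(text1.length(), text2.length());
        for (int i = 0; i < limit; i++) {
            if (text1.charAt(i) != text2.charAt(i)) {
                return i; // first mismatch ends the shared prefix
            }
        }
        return limit; // the shorter string is entirely a prefix of the other
    }

    public static void main(String[] args) {
        System.out.println(commonPrefix("1234abcdef", "1234xyz")); // prints 4
    }
}
```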
-
diffCommonSuffix
Determine the common suffix of two strings.
Parameters:
text1 - First string.
text2 - Second string.
Returns:
The number of characters common to the end of each string.
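The suffix count mirrors the prefix count, scanning from the ends of both strings. This is a sketch under the same assumption of a simple linear scan; `CommonSuffixSketch` is a hypothetical name.

```java
// Illustrative sketch only; not the DiffMatch implementation.
final class CommonSuffixSketch {
    // Counts how many trailing characters the two strings share.
    static int commonSuffix(String text1, String text2) {
        int len1 = text1.length();
        int len2 = text2.length();
        int limit = Math.min(len1, len2);
        for (int i = 1; i <= limit; i++) {
            if (text1.charAt(len1 - i) != text2.charAt(len2 - i)) {
                return i - 1; // the previous i - 1 characters matched
            }
        }
        return limit;
    }

    public static void main(String[] args) {
        System.out.println(commonSuffix("abcdef1234", "xyz1234")); // prints 4
    }
}
```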
-
diffCleanupSemantic
Reduce the number of edits by eliminating semantically trivial equalities.
Parameters:
diffs - LinkedList of Diff objects.
-
diffCleanupSemanticLossless
Look for single edits surrounded on both sides by equalities which can be shifted sideways to align the edit to a word boundary. E.g.: The c<ins>at c</ins>ame. -> The <ins>cat </ins>came.
Parameters:
diffs - LinkedList of Diff objects.
-
diffCleanupEfficiency
Reduce the number of edits by eliminating operationally trivial equalities.
Parameters:
diffs - LinkedList of Diff objects.
-
diffCleanupMerge
Reorder and merge like edit sections. Merge equalities. Any edit section can move as long as it doesn't cross an equality.
Parameters:
diffs - LinkedList of Diff objects.
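The merge rule can be illustrated with a simplified sketch: between two equalities, all deletions are gathered into one DELETE and all insertions into one INSERT, and adjacent equalities are joined. The real method modifies the list in place and may do more (this page does not say); the version below returns a new list and uses hypothetical `Op` and `Diff` stand-ins.

```java
import java.util.LinkedList;
import java.util.List;

// Simplified illustration of "reorder and merge like edit sections".
final class MergeSketch {
    enum Op { DELETE, INSERT, EQUAL }

    static final class Diff {
        final Op op;
        final String text;
        Diff(Op op, String text) { this.op = op; this.text = text; }
        @Override public String toString() { return op + ":" + text; }
    }

    static LinkedList<Diff> merge(List<Diff> diffs) {
        LinkedList<Diff> out = new LinkedList<>();
        StringBuilder deleted = new StringBuilder();
        StringBuilder inserted = new StringBuilder();
        for (Diff d : diffs) {
            if (d.op == Op.DELETE) {
                deleted.append(d.text);
            } else if (d.op == Op.INSERT) {
                inserted.append(d.text);
            } else {
                if (d.text.isEmpty()) {
                    continue; // an empty equality separates nothing
                }
                flush(out, deleted, inserted); // edits never cross an equality
                if (!out.isEmpty() && out.getLast().op == Op.EQUAL) {
                    Diff previous = out.removeLast();
                    out.add(new Diff(Op.EQUAL, previous.text + d.text));
                } else {
                    out.add(d);
                }
            }
        }
        flush(out, deleted, inserted);
        return out;
    }

    // Emits the pending deletion and insertion (if any) and resets the buffers.
    private static void flush(LinkedList<Diff> out, StringBuilder deleted, StringBuilder inserted) {
        if (deleted.length() > 0) {
            out.add(new Diff(Op.DELETE, deleted.toString()));
        }
        if (inserted.length() > 0) {
            out.add(new Diff(Op.INSERT, inserted.toString()));
        }
        deleted.setLength(0);
        inserted.setLength(0);
    }
}
```

For example, DELETE "a", INSERT "b", DELETE "c", INSERT "d", EQUAL "e", EQUAL "f" collapses to DELETE "ac", INSERT "bd", EQUAL "ef".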
-
diffXIndex
loc is a location in text1; compute and return the equivalent location in text2. E.g. "The cat" vs "The big cat", 1->1, 5->9.
Parameters:
diffs - LinkedList of Diff objects.
loc - Location within text1.
Returns:
Location within text2.
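The location mapping can be sketched by walking the diff list while counting characters consumed on each side. The code below is an illustration with hypothetical `Op` and `Diff` stand-ins; the choice to map a location inside a deletion to the start of that deletion is an assumption. Counting offsets from zero, it maps 1 to 1 and 5 to 9 for "The cat" vs "The big cat", because the 4 inserted characters "big " shift everything after them.

```java
import java.util.List;

// Illustrative sketch only; not the DiffMatch implementation.
final class XIndexSketch {
    enum Op { DELETE, INSERT, EQUAL }

    static final class Diff {
        final Op op;
        final String text;
        Diff(Op op, String text) { this.op = op; this.text = text; }
    }

    static int xIndex(List<Diff> diffs, int loc) {
        int chars1 = 0;      // characters of text1 consumed so far
        int chars2 = 0;      // characters of text2 consumed so far
        int lastChars1 = 0;  // totals before the diff that contains loc
        int lastChars2 = 0;
        Diff containing = null;
        for (Diff d : diffs) {
            if (d.op != Op.INSERT) {
                chars1 += d.text.length(); // equalities and deletions exist in text1
            }
            if (d.op != Op.DELETE) {
                chars2 += d.text.length(); // equalities and insertions exist in text2
            }
            if (chars1 > loc) {
                containing = d;
                break;
            }
            lastChars1 = chars1;
            lastChars2 = chars2;
        }
        if (containing != null && containing.op == Op.DELETE) {
            return lastChars2; // loc was deleted: map to where the deletion started
        }
        return lastChars2 + (loc - lastChars1);
    }
}
```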
-
diffPrettyHtml
Convert a Diff list into a pretty HTML report.
Parameters:
diffs - List of Diff objects.
Returns:
HTML representation.
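A rough sketch of such a report, assuming insertions are wrapped in `<ins>`, deletions in `<del>` and equalities in `<span>`, with HTML-sensitive characters escaped first. The exact markup and styling produced by diffPrettyHtml are not described on this page, so the tags below are assumptions.

```java
import java.util.List;

// Illustrative sketch only; the real report markup may differ.
final class PrettyHtmlSketch {
    enum Op { DELETE, INSERT, EQUAL }

    static final class Diff {
        final Op op;
        final String text;
        Diff(Op op, String text) { this.op = op; this.text = text; }
    }

    static String prettyHtml(List<Diff> diffs) {
        StringBuilder html = new StringBuilder();
        for (Diff d : diffs) {
            // Escape '&' first so the entities added afterwards are not escaped again.
            String text = d.text.replace("&", "&amp;")
                    .replace("<", "&lt;")
                    .replace(">", "&gt;")
                    .replace("\n", "&para;<br>");
            switch (d.op) {
                case INSERT:
                    html.append("<ins>").append(text).append("</ins>");
                    break;
                case DELETE:
                    html.append("<del>").append(text).append("</del>");
                    break;
                default:
                    html.append("<span>").append(text).append("</span>");
                    break;
            }
        }
        return html.toString();
    }
}
```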
-
diffText1
Compute and return the source text (all equalities and deletions).
Parameters:
diffs - LinkedList of Diff objects.
Returns:
Source text.
-
diffText2
Compute and return the destination text (all equalities and insertions).
Parameters:
diffs - LinkedList of Diff objects.
Returns:
Destination text.
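Both reconstructions follow directly from the definitions: the source text skips insertions, the destination text skips deletions. The sketch below uses hypothetical `Op` and `Diff` stand-ins rather than the DiffMatch types.

```java
import java.util.List;

// Illustrative sketch only; not the DiffMatch implementation.
final class DiffTextSketch {
    enum Op { DELETE, INSERT, EQUAL }

    static final class Diff {
        final Op op;
        final String text;
        Diff(Op op, String text) { this.op = op; this.text = text; }
    }

    // Source text: all equalities and deletions, in order.
    static String text1(List<Diff> diffs) {
        StringBuilder text = new StringBuilder();
        for (Diff d : diffs) {
            if (d.op != Op.INSERT) {
                text.append(d.text);
            }
        }
        return text.toString();
    }

    // Destination text: all equalities and insertions, in order.
    static String text2(List<Diff> diffs) {
        StringBuilder text = new StringBuilder();
        for (Diff d : diffs) {
            if (d.op != Op.DELETE) {
                text.append(d.text);
            }
        }
        return text.toString();
    }
}
```

Applied to the list DELETE "Hello", INSERT "Goodbye", EQUAL " world.", the first method yields "Hello world." and the second "Goodbye world.".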
-
diffLevenshtein
Compute the Levenshtein distance; the number of inserted, deleted or substituted characters.
Parameters:
diffs - LinkedList of Diff objects.
Returns:
Number of changes.
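One way to derive that count from a diff list, shown here as a sketch: within each run of edits between equalities, a deleted character paired with an inserted one counts as a single substitution, so the run contributes the larger of the two totals. The `Op` and `Diff` types are hypothetical stand-ins, and the pairing rule is an assumption about how substitutions are counted.

```java
import java.util.List;

// Illustrative sketch only; not the DiffMatch implementation.
final class LevenshteinSketch {
    enum Op { DELETE, INSERT, EQUAL }

    static final class Diff {
        final Op op;
        final String text;
        Diff(Op op, String text) { this.op = op; this.text = text; }
    }

    static int levenshtein(List<Diff> diffs) {
        int total = 0;
        int insertions = 0;
        int deletions = 0;
        for (Diff d : diffs) {
            switch (d.op) {
                case INSERT:
                    insertions += d.text.length();
                    break;
                case DELETE:
                    deletions += d.text.length();
                    break;
                case EQUAL:
                    // An equality closes the current run of edits.
                    total += Math.max(insertions, deletions);
                    insertions = 0;
                    deletions = 0;
                    break;
            }
        }
        return total + Math.max(insertions, deletions);
    }
}
```

With this rule, DELETE "abc", INSERT "1234", EQUAL "xyz" counts as 4 changes, while DELETE "abc", EQUAL "xyz", INSERT "1234" counts as 7 because the equality keeps the two edits from pairing up.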
-
diffToDelta
Crush the diff into an encoded string which describes the operations required to transform text1 into text2. E.g. =3\t-2\t+ing -> Keep 3 chars, delete 2 chars, insert 'ing'. Operations are tab-separated. Inserted text is escaped using %xx notation.
Parameters:
diffs - LinkedList of Diff objects.
Returns:
Delta text.
Throws:
UnsupportedEncodingException - If the system does not support UTF-8.
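The delta format can be sketched as follows. This is an illustration, not the DiffMatch code: it uses hypothetical `Op` and `Diff` stand-ins, escapes inserted text with `java.net.URLEncoder` (which may escape more characters than the real method does), and wraps the checked `UnsupportedEncodingException` in an unchecked exception instead of declaring it.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.List;

// Illustrative sketch only; the exact escaping rules of DiffMatch are assumed.
final class DeltaEncodeSketch {
    enum Op { DELETE, INSERT, EQUAL }

    static final class Diff {
        final Op op;
        final String text;
        Diff(Op op, String text) { this.op = op; this.text = text; }
    }

    // %xx-escapes inserted text. URLEncoder writes spaces as '+' and a literal
    // '+' as %2B, so turning '+' back into ' ' loses nothing.
    static String escape(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8").replace('+', ' ');
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is not supported", e);
        }
    }

    static String toDelta(List<Diff> diffs) {
        StringBuilder delta = new StringBuilder();
        for (Diff d : diffs) {
            if (delta.length() > 0) {
                delta.append('\t'); // operations are tab-separated
            }
            switch (d.op) {
                case INSERT:
                    delta.append('+').append(escape(d.text)); // carries the new text
                    break;
                case DELETE:
                    delta.append('-').append(d.text.length()); // only a length
                    break;
                case EQUAL:
                    delta.append('=').append(d.text.length()); // only a length
                    break;
            }
        }
        return delta.toString();
    }
}
```

Only insertions carry text, because kept and deleted characters can be recovered from text1 by length alone.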
-
diffFromDelta
public LinkedList<DiffMatch.Diff> diffFromDelta(String text1, String delta) throws IllegalArgumentException, UnsupportedEncodingException
Given the original text1, and an encoded string which describes the operations required to transform text1 into text2, compute the full diff.
Parameters:
text1 - Source string for the diff.
delta - Delta text.
Returns:
LinkedList of Diff objects or null if invalid.
Throws:
IllegalArgumentException - If invalid input.
UnsupportedEncodingException - If the system does not support UTF-8.
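Decoding reverses the delta format: each tab-separated token either carries inserted text or a character count to take from text1. The sketch below is an assumption-laden illustration with hypothetical `Op` and `Diff` stand-ins; it rejects malformed input with `IllegalArgumentException` and wraps the checked encoding exception rather than declaring it.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedList;

// Illustrative sketch only; not the DiffMatch implementation.
final class DeltaDecodeSketch {
    enum Op { DELETE, INSERT, EQUAL }

    static final class Diff {
        final Op op;
        final String text;
        Diff(Op op, String text) { this.op = op; this.text = text; }
        @Override public String toString() { return op + ":" + text; }
    }

    static String unescape(String encoded) {
        try {
            // Protect literal '+' so URLDecoder does not turn it into a space.
            return URLDecoder.decode(encoded.replace("+", "%2B"), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is not supported", e);
        }
    }

    static LinkedList<Diff> fromDelta(String text1, String delta) {
        LinkedList<Diff> diffs = new LinkedList<>();
        int pointer = 0; // cursor into text1
        for (String token : delta.split("\t")) {
            if (token.isEmpty()) {
                continue; // tolerate blank tokens
            }
            char kind = token.charAt(0);
            String param = token.substring(1);
            if (kind == '+') {
                diffs.add(new Diff(Op.INSERT, unescape(param)));
            } else if (kind == '-' || kind == '=') {
                int n;
                try {
                    n = Integer.parseInt(param);
                } catch (NumberFormatException e) {
                    throw new IllegalArgumentException("Invalid number in delta: " + param, e);
                }
                if (n < 0 || pointer + n > text1.length()) {
                    throw new IllegalArgumentException("Delta runs past the source text");
                }
                String text = text1.substring(pointer, pointer + n);
                pointer += n;
                diffs.add(new Diff(kind == '=' ? Op.EQUAL : Op.DELETE, text));
            } else {
                throw new IllegalArgumentException("Invalid diff operation: " + kind);
            }
        }
        if (pointer != text1.length()) {
            throw new IllegalArgumentException("Delta covers " + pointer + " of "
                    + text1.length() + " source characters");
        }
        return diffs;
    }
}
```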
-
matchMain
Locate the best instance of 'pattern' in 'text' near 'loc'. Returns -1 if no match found.
Parameters:
text - The text to search.
pattern - The pattern to search for.
loc - The location to search around.
Returns:
Best match index or -1.
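To illustrate the "near loc" contract only, the sketch below finds the closest exact occurrence of the pattern on either side of loc. This is a deliberately simplified substitute: it does no approximate matching, and how matchMain scores a "best" instance is not described on this page. `NearestMatchSketch` is a hypothetical name.

```java
// Simplified stand-in: nearest exact occurrence, no fuzzy matching.
final class NearestMatchSketch {
    static int nearestExactMatch(String text, String pattern, int loc) {
        // Clamp loc into the valid range of the text.
        loc = Math.max(0, Math.min(loc, text.length()));
        int after = text.indexOf(pattern, loc);       // first match starting at or after loc
        int before = text.lastIndexOf(pattern, loc);  // last match starting at or before loc
        if (after == -1) {
            return before; // may itself be -1 when there is no match at all
        }
        if (before == -1) {
            return after;
        }
        // Prefer whichever start index is closer to loc; ties go to the earlier one.
        return (loc - before <= after - loc) ? before : after;
    }

    public static void main(String[] args) {
        System.out.println(nearestExactMatch("abc abc abc", "abc", 5)); // prints 4
    }
}
```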
-
patchMake
Compute a list of patches to turn text1 into text2. A set of diffs will be computed.
Parameters:
text1 - Old text.
text2 - New text.
Returns:
LinkedList of Patch objects.
-
patchMake
Compute a list of patches to turn text1 into text2. text1 will be derived from the provided diffs.
Parameters:
diffs - LinkedList of Diff objects for text1 to text2.
Returns:
LinkedList of Patch objects.
-
patchMake
Compute a list of patches to turn text1 into text2. text2 is not provided; diffs are the delta between text1 and text2.
Parameters:
text1 - Old text.
diffs - LinkedList of Diff objects for text1 to text2.
Returns:
LinkedList of Patch objects.
-
patchDeepCopy
Given a list of patches, return another list that is identical.
Parameters:
patches - LinkedList of Patch objects.
Returns:
Copy of the list of Patch objects.
-
patchApply
Merge a set of patches onto the text. Return a patched text, as well as an array of true/false values indicating which patches were applied.
Parameters:
patches - LinkedList of Patch objects.
text - Old text.
Returns:
Two element Object array, containing the new text and an array of boolean values.
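Because the result is an untyped `Object[]`, callers have to cast each element. The sketch below shows one way to unpack such a result, assuming the text comes first and the flags are a primitive `boolean[]` in second position; both the ordering and the element types are assumptions, and `sampleResult()` is a hypothetical stand-in for a real patchApply call.

```java
// Illustrative sketch of unpacking a {text, flags} result; layout is assumed.
final class PatchResultSketch {
    // Hypothetical stand-in for the value returned by patchApply.
    static Object[] sampleResult() {
        return new Object[] { "The big cat", new boolean[] { true, false } };
    }

    static String patchedText(Object[] result) {
        return (String) result[0];
    }

    static boolean[] appliedFlags(Object[] result) {
        return (boolean[]) result[1];
    }

    // Counts the patches that could not be applied.
    static int failedCount(Object[] result) {
        int failed = 0;
        for (boolean applied : appliedFlags(result)) {
            if (!applied) {
                failed++;
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        Object[] result = sampleResult();
        System.out.println(patchedText(result) + " (" + failedCount(result) + " patch failed)");
    }
}
```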
-
patchAddPadding
Add some padding on text start and end so that edges can match something. Intended to be called only from within patchApply.
Parameters:
patches - LinkedList of Patch objects.
Returns:
The padding string added to each side.
-
patchSplitMax
Look through the patches and break up any which are longer than the maximum limit of the match algorithm. Intended to be called only from within patchApply.
Parameters:
patches - LinkedList of Patch objects.
-
patchToText
Take a list of patches and return a textual representation.
Parameters:
patches - List of Patch objects.
Returns:
Text representation of patches.
-
patchFromText
public List<DiffMatch.Patch> patchFromText(String textline) throws IllegalArgumentException, UnsupportedEncodingException
Parse a textual representation of patches and return a List of Patch objects.
Parameters:
textline - Text representation of patches.
Returns:
List of Patch objects.
Throws:
IllegalArgumentException - If invalid input.
UnsupportedEncodingException - If the system does not support UTF-8.
-