Advertisement
Not a member of Pastebin yet?
Sign Up,
it unlocks many cool features!
- 7. Longest Common substring of two string :
- Given two string S and T. It is required to find their
- longest common substring, that is, a string X, which is a substring of S and T at the same time.
- Complexity Requirement's : O(length(S) + length(T))
- Algorithm : We build the suffix automaton for the string S.
- We follow the string T on the automaton and look for the longest suffix that each prefix has
- appeared in S. In other words, for each position in the string T, we want to find the longest
- common substring for which S and T ends at that position.
- To achieve this, we define two variables
- v = current state
- l = current length
- These two variables describe the current matching part, its length and state, which string
- (which cannot be determined if the length is not stored, since a state may match multiple
- strings of different lengths).
- Initially, v = To
- l = 0
- Now, we consider the character T[i], we want to find the answer to this position.
- 1. If the state v in the automaton has a transfer of the symbol T[i], we can simply take
- the transfer and then add the length l to one.
- 2. If the state v does not have the branch, we should try to shorten the current matching
- part, for which we go along the suffix link v = link(v).
- In this case, the current matching length must be reduced, but the remaining part is
- as much as possible many. Obviously, let l = len(v).
- If the newly arrived state still does not have the transfer we want, then we must go along
- the suffix link again and decrease the value of l until we find (That is, the character T[i]
- does not appear in S, so let v = l = 0 and then continue processing the next i).
- The answer to the original question is the maximum that has been reached.
- The progressive complexity of this traversal method is O(length(T)), because in a move we either add
- one, or follow the suffix link several times, each time will be strictly reduced l. Thus, the sum of
- the total reductions cannot exceed length(T), which means linear time complexity.
- int LCS(string s, string t)
- {
- init();
- int len_s = s.length();
- int len_t = t.length();
- for (int i=0; i<len_s; i++)
- {
- sz_extend(s[i]);
- }
- int v = 0;
- int l = 0;
- int best = 0; // LCS length
- int bestPos = 0; // LCS starting pos in T
- for (int i=0; i<len_t; i++)
- {
- while (v && !st[v].next.count(t[i]))
- {
- v = st[v].link;
- l = st[v].len;
- }
- if (st[v].next.count(t[i]))
- {
- v = st[v].next[t[i]];
- l++;
- }
- if (l > best)
- {
- best = l;
- bestPos = i;
- }
- }
- return t.substr(bestPos-best+1, best);
- }
Advertisement
Add Comment
Please, Sign In to add comment
Advertisement