471 Encode String with Shortest Length

1. Question

Given anon-emptystring, encode the string such that its encoded length is the shortest.

The encoding rule is:k[encoded_string], where theencoded_stringinside the square brackets is being repeated exactlyktimes.

Note:

  1. k will be a positive integer and encoded string will not be empty or have extra space.

  2. You may assume that the input string contains only lowercase English letters. The string's length is at most 160.

  3. If an encoding process does not make the string shorter, then do not encode it. If there are several solutions, return any of them is fine.

Example 1:

Input: "aaa"
Output: "aaa"
Explanation: There is no way to encode it such that it is shorter than the input string, so we do not encode it.

Example 2:

Input: "aaaaa"
Output: "5[a]"
Explanation: "5[a]" is shorter than "aaaaa" by 1 character.

Example 3:

Input: "aaaaaaaaaa"
Output: "10[a]"
Explanation: "a9[a]" or "9[a]a" are also valid solutions, both of them have the same length = 5, which is the same as "10[a]".

Example 4:

Input: "aabcaabcd"
Output: "2[aabc]d"
Explanation: "aabc" occurs twice, so one answer can be "2[aabc]d".

Example 5:

Input: "abbbabbbcabbbabbbc"
Output: "2[2[abbb]c]"
Explanation: "abbbabbbc" occurs twice, but "abbbabbbc" can also be encoded to "2[abbb]c", so one answer can

2. Implementation

(1) Interval DP

思路:这题是要用区间dp做,dp[i][j]代表string s[i...j]之间的最短encoded string(如果encoded的string大于原来string的长度,则保留原来string,比如aaa => 3[a], 3[a]比aaa长,所以这种情况不encode). 我们按照区间DP的模板,遍历每个可能的substring长度(len = 1 ~ n), 找到substring的起始点i和结束点j,先将dp[i][j]初始化为s.substring(i, i + len). 然后我们枚举s[i...j]中的任意一个点k将dp[i...j]分成两半,dp[i....k] 和dp[k + 1...j]查看dp[i...k]和dp[k + 1...j]的长度和是否比s[i....j]的小,如果是的话dp[i][j] 变为 dp[i][k] + dp[k + 1][j].

最后我们检查原来的substring s[i...j]是否可以进一步压缩,检查的方法是(s[i...j] + s[i...j].indexof(s[i..j], 1),这里因为我们要查看s[i...j]中间是否有重复substring,所以我们在indexof用1作为起始点,如果indexof返回的值大于等于s.length(),说明s[i...j]中间没有重复substring无法进一步压缩,否则的话重复substring的次数等于 s.length() / index,其中index是indexof返回的值,重复的substring表示为dp[start][start + index - 1]

class Solution {
    public String encode(String s) {
        int n = s.length();

        if (s.length() <= 3) {
            return s;
        }

        //dp[i][j]表示s[i...j]中最短的encoded string
        String[][] dp = new String[n][n];

        for (int len = 1; len <= n; len++) {
            for (int i = 0; i < n - len + 1; i++) {
                int j = i + len - 1;

                dp[i][j] = s.substring(i, i + len);

                // No need to encode if the current len of substring is less than 4
                if (len <= 3) {
                    continue;
                }

                for (int k = i; k < j; k++) {
                    String left = dp[i][k];
                    String right = dp[k + 1][j];

                    if (left.length() + right.length() < dp[i][j].length()) {
                        dp[i][j] = dp[i][k] + dp[k + 1][j];
                    }
                }

                String collapsedStr = collapse(dp, s.substring(i, i + len), i);
                if (collapsedStr.length() < dp[i][j].length()) {
                    dp[i][j] = collapsedStr;
                }
            }
        }
        return dp[0][n - 1];
    }

    public String collapse(String[][] dp, String s, int start) {        
        // Check if there is repeated pattern in s
        int index = (s + s).indexOf(s, 1);

        // No repeated pattern in this case
        if (index >= s.length()) {
            return s;
        }
        else {
            return (s.length() / index) + "[" + dp[start][start + index - 1] + "]";
        }
    }
}

3. Time & Space Complexity

时间复杂度O(n^3), 空间复杂度O(n^2)

Last updated