Time complexity analysis for the 3-sum/4-sum problem

The 3-Sum problem is defined as follows: Given an array S of n integers, are there elements a, b, c in S such that a + b + c = 0? Find all unique triplets in the array which gives the sum of zero.
Note: Elements in a triplet (a,b,c) must be in non-descending order. (i.e., a ≤ b ≤ c) The solution set must not contain duplicate triplets.
For example, given array S = {-1 0 1 2 -1 -4}, A solution set is: (-1, 0, 1) (-1, -1, 2)

As always brute force approach is the most intuitive method which can help us work out a solution to a problem very quickly. Another consideration about using brute force at first is that the creation of unit tests will be easier. For the 3-Sum problem, the time complexity when using brute force is O(N³), say it will try every possible combinations of the a+b+c=0. Let’s see the code:

public List<List<Integer>> threeSum(int[] num) {
	int len = num.length;
	Set<List<Integer>> set = new HashSet<>();
	for (int i = 0; i < len - 2; i++) {
		for (int j = i + 1; j < len - 1; j++) {
			for (int k = j + 1; k < len; k++) {
				if (num[i] + num[j] + num[k] == 0) {
					List<Integer> l = Arrays.asList(num[i], num[j], num[k]);
					Collections.sort(l);
					set.add(l);
				}
			}
		}
	}
	return new ArrayList<List<Integer>>(set);
}

After the first two FOR-loops, we already traversal all the combinations of a+b and implicitly also get the c=0-a-b, we can alternatively use binary-search instead of traversal to find out the last summand c. Since the time complexity of binary-search is O(logN), thus the total time complexity of the program is about O(N²logN), this is able to make the program a little bit faster.

public List<List<Integer>> threeSum(int[] num) {
	int len = num.length;
	Arrays.sort(num);
	Set<List<Integer>> set = new HashSet<>();
	for (int i = 0; i < len - 2; i++) {
		for (int j = i + 1; j < len - 1; j++) {
			int k = Arrays.binarySearch(num, 0 - num[i] - num[j]);
			if (k > 0 && k != i && k != j) {
				List<Integer> l = Arrays.asList(num[i], num[j], num[k]);
				Collections.sort(l);
				set.add(l);
			}
		}
	}
	return new ArrayList<List<Integer>>(set);
}

The basic idea of the implementation above is that we want to find out the third summand c faster, what if we use HashMap instead of binary-search to index the c? We can traversal the input which is an int array using a separate loop and put all the elements into a HashMap. After all, the time complexity can be reduced to O(N+N²) = O(N²)

public List<List<Integer>> threeSum(int[] num) {
	int len = num.length;
	Set<List<Integer>> set = new HashSet<>();
	Map<Integer, Integer> map = new HashMap<>(len);
	for (int i = 0; i < len; i++) {
		map.put(num[i], i);
	}
	for (int i = 0; i < len - 2; i++) {
		for (int j = i + 1; j < len - 1; j++) {
			if (map.containsKey(0 - num[i] - num[j])) {
				int k = map.get(0 - num[i] - num[j]);
				if (k != i && k != j) {
					List<Integer> l = Arrays.asList(num[i], num[j], num[k]);
					Collections.sort(l);
					set.add(l);
				}
			}
		}
	}
	return new ArrayList<List<Integer>>(set);
}

O(N²) is though much better then O(N³), but do you remember the problem we solved in the “container with most water“? If two-pointer can be applied to the program, it might be faster since we don’t really need to traversal the whole input at the second loop. The time complexity should be theoretically a little bit lower than O(N²) . Please note, we must sort the input before applying the two-pointer approach. It’s proven faster than the former O(N²) algorithm. You can find the complete unit test cases with time out check ThreeSumTest.java on my Github.

public List<List<Integer>> threeSum(int[] num) {
	int len = num.length;
	Arrays.sort(num);
	Set<List<Integer>> set = new HashSet<>();
	for (int i = 0; i < len - 2; i++) {
		for (int j = i + 1, k = len - 1; j < k;) {
			if (num[i] + num[j] + num[k] < 0) {
				j++;
			} else if (num[i] + num[j] + num[k] > 0) {
				k--;
			} else {
				set.add(Arrays.asList(num[i], num[j], num[k]));
				j++;
				k--;
			}
		}
	}
	return new ArrayList<List<Integer>>(set);
}

So far so good, but can we make it faster? Probably you have noticed as well that we were using HashSet to avoid the duplicates in the output. I think if we don’t choose the duplicates as candidates to the output at the very beginning, then we don’t need to use Set at all.

public List<List<Integer>> threeSum(int[] num) {
	int len = num.length;
	Arrays.sort(num);
	List<List<Integer>> res = new ArrayList<>();
	for (int i = 0; i < len - 2; i++) {
		if (i == 0 || (i > 0 && num[i] != num[i - 1])) {
			for (int j = i + 1, k = len - 1; j < k;) {
				if (num[i] + num[j] + num[k] < 0) {
					j++;
				} else if (num[i] + num[j] + num[k] > 0) {
					k--;
				} else {
					res.add(Arrays.asList(num[i], num[j], num[k]));
					while (j < k && num[j] == num[j + 1])
						j++;
					while (j < k && num[k] == num[k - 1])
						k--;
					j++;
					k--;
				}
			}
		}
	}
	return res;
}

The time complexity of this solution is about O(N²) in which the two-pointer was used and the Set data structure was removed. Removing Set is a little bit tricky, however it is proven faster ideed than all other algorithms that are discussed above. Someone might interested in whether there is algorithm even faster than O(N²) especially for this problem, here is a paper that proves a lower bound of Ω(n^ceil(k/2)) for the k-SUM problem, so we can’t do better than Ω(N²) when k=3 in this case.

The 4-Sum problem with dynamic target can just be solved with the same idea as 3-Sum:

public List<List<Integer>> fourSum(int[] num, int target) {
	int len = num.length;
	Arrays.sort(num);
	Set<List<Integer>> set = new HashSet<>();
	for (int l = 0; l < len - 3; l++) {
		for (int i = l + 1; i < len - 2; i++) {
			for (int j = i + 1, k = len - 1; j < k;) {
				if (num[l] + num[i] + num[j] + num[k] < target) {
					j++;
				} else if (num[l] + num[i] + num[j] + num[k] > target) {
					k--;
				} else {
					set.add(Arrays.asList(num[l], num[i], num[j], num[k]));
					j++;
					k--;
				}
			}
		}
	}
	return new ArrayList<List<Integer>>(set);
}

Find container with most water

I’ve recently worked on an interesting problem: Given n non-negative integers a1, a2, …, an, where each represents a point at coordinate (i, ai). n vertical lines are drawn such that the two endpoints of line i is at (i, ai) and (i, 0). Find two lines, which together with x-axis forms a container, such that the container contains the most water. Note: You may not slant the container.

Explanation: to simply the question, we can just imagine that we want to build a water container using two walls with different height in a river. Because of the gravity, only the lowest wall decides how much water it can retain. Suppose we get the height of the walls in certain order, for instance: if the height of the walls are { 9, 1, 1, 6, 1, 1, 1, 1, 1, 1, 1 }, the maximal volume should be 6*3=18 (lowest wall=6, distance=3); Or if the height of walls are { 1, 2, 8, 9, 1, 1, 1, 1, 1, 1, 2 }, the maximal volume should be 2*9=18 (lowest wall=2, distance=9).

The first consideration to this question was straight forward, we could use brute force approach to solve this problem, say we want to try all the combinations of two walls and compute the maximal volume out. let’s see:

public int maxArea1(int[] height) {
	int maxArea = 0;
	for (int i = 0; i < height.length; i++) {
		for (int j = 1; j < height.length; j++) {
			maxArea = Math.max(maxArea, Math.min(height[j], height[i]) * (j - i));
		}
	}
	return maxArea;
}

However it’s time complexity is O(N²), means the performance is not good at all. Is it possible just using only one pass to solve this problem?

The basic idea was that we hope the lowest wall will be as high as possible and the distance between two walls as far as possible. So we can put two pointers start at the head and tail of the wall list and let them moving towards. One pointer moves one step forward only when it’s height is lower than the other side, say we want to find the highest walls from both side of the river and updating the maximal volume each step while they moving towards.

public int maxArea(int[] height) {
	int maxArea = 0, l = 0, r = height.length - 1;
	while (l < r) {
		maxArea = Math.max(maxArea, Math.min(height[l], height[r]) * (r - l));
		if (height[l] < height[r]) {
			l++;
		} else {
			r--;
		}
	}
	return maxArea;
}

with this approach the time complexity can be reduced to O(N), which will achieve a much better performance than the former one.