0

I'm studying dynamic programming by doing Leetcode problems, and I frequently face time limit exceeded errors even though I'm caching my results. Can anyone explain why my version is so much slower than the official version for this problem?

There are obviously differences in the code, e.g., I use a class function for recursion while the official answer does not. My recursive function returns numeric values, the official one does not, etc. None of these seem like meaningful differences though, but the performance difference is nonetheless dramatic.

My version. This takes 0.177669 seconds to run, and receives a time limit exceeded error.

import datetime as dt
from typing import List
from functools import lru_cache


class Solution:
    def canPartition(self, nums: List[int]) -> bool:
        self.nums = nums
        total = sum(self.nums)
        if total % 2 == 1:
            return False
        half_total = total // 2
        return self.traverse(half_total, 0) == 0

    @lru_cache(maxsize=None)
    def traverse(self, subset_sum, index):
        if subset_sum < 0:
            return float('inf')
        elif index == len(self.nums):
            return subset_sum
        else:
            include = self.traverse(subset_sum - self.nums[index], index + 1)
            exclude = self.traverse(subset_sum, index + 1)
            best = min(include, exclude)
            return best


test_case = [20,68,68,11,48,18,50,5,3,51,52,11,13,11,38,100,30,87,1,56,85,63,14,96,7,17,54,11,32,61,94,13,85,10,78,57,69,92,66,28,70,20,3,29,10,73,89,86,28,48,69,54,87,11,91,32,59,4,88,20,81,100,29,75,79,82,6,74,66,30,9,6,83,54,54,53,80,94,64,77,22,7,22,26,12,31,23,26,65,65,35,36,34,1,12,44,22,73,59,99]
solution = Solution()
start = dt.datetime.now()
print(solution.canPartition(test_case))
end = dt.datetime.now()
print((end-start).total_seconds())

This is the official answer. It takes only 0.000165 seconds!

import datetime as dt
from typing import List, Tuple
from functools import lru_cache


class Solution:
    def canPartition(self, nums: List[int]) -> bool:
        @lru_cache(maxsize=None)
        def dfs(nums: Tuple[int], n: int, subset_sum: int) -> bool:
            # Base cases
            if subset_sum == 0:
                return True
            if n == 0 or subset_sum < 0:
                return False
            result = (dfs(nums, n - 1, subset_sum - nums[n - 1])
                    or dfs(nums, n - 1, subset_sum))
            return result

        # find sum of array elements
        total_sum = sum(nums)

        # if total_sum is odd, it cannot be partitioned into equal sum subsets
        if total_sum % 2 != 0:
            return False

        subset_sum = total_sum // 2
        n = len(nums)
        return dfs(tuple(nums), n - 1, subset_sum)


test_case = [20,68,68,11,48,18,50,5,3,51,52,11,13,11,38,100,30,87,1,56,85,63,14,96,7,17,54,11,32,61,94,13,85,10,78,57,69,92,66,28,70,20,3,29,10,73,89,86,28,48,69,54,87,11,91,32,59,4,88,20,81,100,29,75,79,82,6,74,66,30,9,6,83,54,54,53,80,94,64,77,22,7,22,26,12,31,23,26,65,65,35,36,34,1,12,44,22,73,59,99]
solution = Solution()
start = dt.datetime.now()
print(solution.canPartition(test_case))
end = dt.datetime.now()
print((end-start).total_seconds())

3 Answers 3

1

In the former version, all possible cases are searched. While in the latter, the algorithm stops when a feasible solution has been found.

In the first version:

include = self.traverse(subset_sum - self.nums[index], index + 1)
# Suppose {include} is zero, the answer is already obtained, 
# but the algorithm still try to compute {exclude}, which is not neccessary.
exclude = self.traverse(subset_sum, index + 1)

In the second version:

result = (dfs(nums, n - 1, subset_sum - nums[n - 1])
                    or dfs(nums, n - 1, subset_sum))
# Because of the short-circuit behavior of logical operator,
# if the first branch has already obtained the solution, 
# the second branch will not be executed.

Just adding a if-check will improve the performance:

include = self.traverse(subset_sum - self.nums[index], index + 1)
# Check whether we are already done:
if include == 0:
    return include
exclude = self.traverse(subset_sum, index + 1)
Sign up to request clarification or add additional context in comments.

1 Comment

Wow, that was it! Time went down to 7.8e-05. That beats the official solution! Very interesting. I like the explicit check you have at the end, but it's good to know about the short-circuit behavior too.
1

It you want to know about performance, you need to profile your code. Profiling lets you see where your code spends its time.

CPython comes with built-in profiling module called cProfile. But you might want to look at e.g. line_profiler.

Comments

1
  • your function returns a number that alters between a float and an int during the execution. python has to cope with it during the whole execution. whereas you need to return an answer of "yes or no" to the question "can it be partitioned?" and simple boolean Ture/False is enough and suggested.
  • for the same reason above, you are using a comparison function min on two recursively gained result where you have to run two recursion to their deepest levels. by using booleans, you can shortcut this process and this other program uses that shortcut.

3 Comments

I switched to booleans after reading your post and it didn't make a difference in time.
Actually, I take that back, it did help a bit. Went from .177 to .146. Still a lot slower than the solution, but an improvement nonetheless. Upvoted.
@user554481, yep this float/int conversion has a small impact yet can be painful if dataset gets very large. the main impact is in the use of the recursion. we would not want to run it for all possibilites exists as it may take an infinite amount of time. the above answer you accepted, and the one in your question uses shortcut logic to make it fast: "return a result as soon as it fits your needs"

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.