
While rewriting old MATLAB code in NumPy, I noticed differences in the logarithmic calculation. In NumPy I use np.log; MATLAB uses the log function.

% MATLAB
b = [1 1 2 3 5 1 1];
p = b ./ sum(b);
sprintf('log(%.20f) = %.20f', p(5), log(p(5)))

# Python
import numpy as np
b = np.array([1, 1, 2, 3, 5, 1, 1])
p = b.astype('float64') / np.sum(b)
print(f'log({p[4]:.20f}) = {np.log(p[4]):.20f}')

On my MacBook Pro 2020 with an M1 chip, I get a mismatch at the 16th decimal digit.

log(0.35714285714285715079) = -1.02961941718115834732  # Matlab
log(0.35714285714285715079) = -1.02961941718115812527  # NumPy

I would like to get exactly the same results. Any idea, how to modify my Python code?

  • Does that difference really matter for your specific application? Keep in mind that neither of those numbers is actually correct. Per WolframAlpha, ln(5/14) = -1.029619417181158 239921825531675168658... Commented Dec 10, 2023 at 20:41
  • For practical applications doing a single log call, no. But for research this is important. What if the imprecisions cumulate, and the results deviate even more? Imagine a deep neural network with a log-based activation function that learns something different when implemented in Python versus MATLAB. Commented Dec 10, 2023 at 20:56
  • It's difficult to say how MATLAB gets its value without seeing the implementation. Using the bilinear expansion in Python produces the same result as NumPy, so they might be using that (I don't know where in the NumPy source code it is implemented). I computed the natural log in C and that gave yet another value, -1.029619417181158 21776. The result is inconsistent between languages, which is unsurprising. Commented Dec 10, 2023 at 21:31
  • "What if the imprecisions cumulate, and the results deviate even more?" This is why you need numerically stable algorithms, so that rounding errors don't accumulate. Scientific computing is a field of research for a reason. "Imagine a deep neural network…" DL will never depend on the 16th digit. DL is usually implemented with 16-bit or even 8-bit floats; speed and energy efficiency are more important than precision. Commented Dec 10, 2023 at 22:28
  • My colleagues presented a paper on the impact of numerical deviations on ML at NeurIPS, so it is not completely off topic: informationsecurity.uibk.ac.at/pdfs/SHB2023_NEURIPS.pdf Commented Dec 11, 2023 at 8:41

1 Answer


Both MATLAB and NumPy use 64-bit floats by default, which have a 52-bit mantissa. This means the smallest relative step between two float64 numbers is 2**-52 = 2.2e-16, so any decimal digit after the 16th has no significance. The difference you're seeing is probably due to a slightly different implementation. You can check this by using

np.nextafter(a, 1)-a

For a = np.log(0.35714285714285715079) you get 2.2e-16, which is roughly the machine precision np.finfo(np.float64).eps.
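Concretely, the spacing check can be run like this (a minimal sketch of the nextafter/eps comparison described above):

```python
import numpy as np

a = np.log(np.float64(0.35714285714285715079))

# Distance from a to the next representable float64 toward 1 (one ULP).
ulp = np.nextafter(a, 1) - a
print(ulp)                       # ~2.2e-16

# Machine epsilon: the relative spacing of float64 near 1.
print(np.finfo(np.float64).eps)  # 2.220446049250313e-16
```

Since |a| lies in [1, 2), the absolute spacing equals the machine epsilon exactly, so the two MATLAB/NumPy results differ by only a few ULPs.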

Even the input illustrates this: you're providing more decimals than necessary to completely define a 64-bit float. We can set the number of displayed decimals to 100 and it will still print only 17 digits for this reason:

>>> np.set_printoptions(precision=100)
>>> np.array([0.35714285714285715079])
array([0.35714285714285715])

The difference between MATLAB and NumPy might even be caused by a reordered sum, as floating-point addition is not associative. If you do depend on the 16th decimal place, you should use something other than 64-bit floats. I'd recommend familiarizing yourself with how floating-point types are implemented, as that is vital when working on scientific software. And if you'd like, take a look at the NumPy source code to see how log is implemented, and compare it to other open libraries.
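The non-associativity of floating-point addition is easy to demonstrate with a classic example:

```python
# Floating-point addition is not associative: grouping changes which
# intermediate results get rounded, so a reordered sum can differ in
# the last bits.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

print(left)           # 0.6000000000000001
print(right)          # 0.6
print(left == right)  # False
```

The same effect means that summing b in a different order (as MATLAB and NumPy may do internally) can shift sum(b), p, and ultimately log(p) by an ULP or so.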


2 Comments

Related to this answer, it is worth noting that MATLAB's sum was recently changed to better deal with round-off errors.
And likewise, np.sum is inferior to math.fsum if you want the last bit of precision.
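A minimal sketch of the accumulation difference mentioned in the last comment (Python's built-in sum stands in for naive left-to-right accumulation; np.sum uses pairwise summation, which typically lands in between):

```python
import math
import numpy as np

xs = [0.1] * 10  # 0.1 is not exactly representable in binary

print(sum(xs))        # 0.9999999999999999 – naive left-to-right accumulation
print(math.fsum(xs))  # 1.0 – exactly rounded (Shewchuk's algorithm)
print(np.sum(xs))     # pairwise summation; error grows more slowly
```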
