
I just bumped into some weird performance "issue"/"gain" with Python 3. The following code loads 5 weight matrices and applies them to a fairly large dataset, writing each row out to disk as it goes.

When I execute this program, all eight processor cores are occupied at 100% by it.

Does Python automatically execute a program on multiple threads? If so, is there any documentation on this? If not, how can it be that this program keeps all 8 cores of an octa-core machine busy?

#!/usr/bin/python3
import numpy
import struct
from scipy.special import expit
from dA import load_data
from dA import load_wb
import sys

if __name__ == '__main__':
   # load the weight matrix and bias vector for each network stage
   stages=[2223,723,172,84,21]
   wb=[]
   for stage in stages:
      w,b=load_wb("%d" % (stage))
      print(numpy.max(w))
      wb.append((w,b))

   data=load_data()
   n=data.shape[0]
   dimensions=stages[-1]
   filename="%d.data" % (dimensions)
   chunk=">"+('f'*dimensions)
   with open(filename,"wb") as f:
      for i in range(n):
         row=data[i]
         for (w, b) in wb:
            # affine transform followed by a rescaled logistic (tanh-like) activation
            row=2*expit(2*(numpy.dot(row,w)+b))-1
         s=struct.pack(chunk,*row)
         f.write(s)
  • The numpy module releases Python's GIL when possible and uses multiple processors. CPython in general only uses one processor because of the GIL. Commented Jun 8, 2016 at 19:21
  • But releasing a lock does not make everything else run in parallel, or does it? Commented Jun 8, 2016 at 19:23
  • Okay, so if the GIL is released, Python can actually call functions in parallel. But in this case, there are dependencies between all the numpy interactions. These cannot be parallelized automatically. Commented Jun 8, 2016 at 19:24
  • scipy-cookbook.readthedocs.io/items/ParallelProgramming.html has some info suggesting that numpy does not automatically parallelize these things. Commented Jun 8, 2016 at 19:28
  • Not all operations, but those that make sense and can profit from parallelization, like dot. numpy releasing the GIL only means that other Python threads can run while numpy calculates the result (parallelized or not). When returning to Python the GIL is reacquired. I mentioned the GIL only to emphasize that pure Python code does not generally run in parallel; numpy running in parallel is not really related to that (see the sketch below these comments). Commented Jun 8, 2016 at 19:39
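
To illustrate that last point, here is a minimal sketch (the matrix size and thread count are arbitrary): on a BLAS-backed NumPy the dot products below release the GIL and can keep several cores busy at once, while an equivalent pure-Python loop would not.

import threading
import numpy

a = numpy.random.rand(2000, 2000)

def work():
   # numpy hands the dot product to BLAS and releases the GIL while it runs,
   # so these threads can execute on different cores at the same time
   for _ in range(5):
      a.dot(a)

threads = [threading.Thread(target=work) for _ in range(4)]
for t in threads:
   t.start()
for t in threads:
   t.join()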

2 Answers


The reason the eight cores are used is that, on Linux, BLAS will automatically use them if available. So the processor time is mostly spent in the numpy.dot operation.
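
If you want to confirm this, or keep the program on a single core, you can cap the BLAS thread count before NumPy is imported. A minimal sketch, assuming NumPy is linked against OpenBLAS, MKL, or an OpenMP-based BLAS (which environment variable takes effect depends on the build):

import os

# must be set before numpy is imported; which variable matters depends on the
# BLAS library numpy was built against (assumption: OpenBLAS, MKL, or OpenMP)
os.environ["OPENBLAS_NUM_THREADS"] = "1"
os.environ["MKL_NUM_THREADS"] = "1"
os.environ["OMP_NUM_THREADS"] = "1"

import numpy

a = numpy.random.rand(3000, 3000)
a.dot(a)   # with the caps above, this should stay on a single core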




I recently read an interesting article on this topic that I think may be interesting for you: Python has been held back by its inability to natively use multiple CPU cores. Now Pythonistas are aiming to find a solution. To answer your question, I quote the first paragraph of the article.

For all of Python's great and convenient features, one goal remains out of reach: Python apps running on the CPython reference interpreter and using multiple CPU cores in parallel. ... In truth, it's possible to use threads in Python applications -- plenty of them already do. What's not possible is for CPython to run multithreaded applications with each thread executing in parallel on a different core. CPython's internal memory management isn't thread-safe, so the interpreter runs only one thread at a time, switching between them as needed and controlling access to the global state.

So, the short answer is: it is possible to run multiple threads in Python, but at the moment they do not run in parallel on multiple cores automatically. However, work on implementing this feature has begun.

Also, I should point out that this is about CPython, the reference implementation of Python. Unfortunately, I do not know whether other implementations have the same limitation.
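
As an illustration (not taken from the article), here is a minimal sketch of a CPU-bound pure-Python workload; the iteration count is arbitrary. On CPython the threaded run takes roughly as long as the sequential one, because only one thread holds the GIL at a time:

import threading
import time

def count(n):
   # pure-Python loop: the thread holds the GIL for the whole computation
   while n:
      n -= 1

N = 50_000_000

start = time.perf_counter()
count(N)
count(N)
print("sequential :", time.perf_counter() - start)

start = time.perf_counter()
t1 = threading.Thread(target=count, args=(N,))
t2 = threading.Thread(target=count, args=(N,))
t1.start(); t2.start()
t1.join(); t2.join()
print("two threads:", time.perf_counter() - start)

Swapping the loop for a call that releases the GIL, such as a large numpy.dot, is what allows the threads to overlap in the question's scenario.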

