Solving multiprocessing PicklingError with pathos
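Python's multiprocessing module is handy for running tasks concurrently across several processes. However, everything sent to a worker process, including the function being run, must be picklable, and the standard pickle module cannot always serialize things such as class instance methods and static methods. pathos provides a multiprocessing implementation that uses dill as its serialization backend, and dill can serialize and deserialize almost every Python type.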
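To see the limitation in isolation, here is a minimal sketch (not from the original post) that uses only the standard library. The helper name can_pickle and the function top_level are hypothetical, chosen for illustration. It relies on the fact that pickle stores a function as a reference to its importable name, so a function with no such name, like a lambda, cannot be serialized.

```python
import pickle


def top_level(item):
    """A module-level function: pickle stores only its importable name."""
    return item * 2


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    # A named module-level function is pickled by reference.
    print("top-level function:", can_pickle(top_level))
    # A lambda has no importable name, so pickle refuses it.
    print("lambda:", can_pickle(lambda item: item * 2))
```

multiprocessing runs the same kind of serialization on every task it hands to a worker, which is why the failure surfaces inside pool.map rather than where the function is defined.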
Example using builtin multiprocessing that would raise PicklingError
import os
from multiprocessing import Pool


class Tasks:
    @staticmethod
    def process_some_task(item):
        print("Processing...", item, "by pid:", os.getpid())


if __name__ == "__main__":
    with Pool(4) as pool:
        pool.map(Tasks.process_some_task, range(10))
Error raised when running the script above
(venv) vagrant@vagrant-ubuntu-trusty-64:~/test$ python test.py
Traceback (most recent call last):
  File "test.py", line 13, in <module>
    pool.map(Tasks.process_some_task, range(10))
  File "/usr/lib/python3.4/multiprocessing/pool.py", line 260, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/lib/python3.4/multiprocessing/pool.py", line 599, in get
    raise self._value
  File "/usr/lib/python3.4/multiprocessing/pool.py", line 383, in _handle_tasks
    put(task)
  File "/usr/lib/python3.4/multiprocessing/connection.py", line 206, in send
    self._send_bytes(ForkingPickler.dumps(obj))
  File "/usr/lib/python3.4/multiprocessing/reduction.py", line 50, in dumps
    cls(buf, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function Tasks.process_some_task at 0x7f45b3e626a8>:
attribute lookup process_some_task on __main__ failed
Solution with pathos
- Install pathos
$ pip install pathos
- Replace the multiprocessing import with pathos.multiprocessing
import os
from pathos.multiprocessing import ProcessingPool as Pool


class Tasks:
    @staticmethod
    def process_some_task(item):
        print("Processing...", item, "by pid:", os.getpid())


if __name__ == "__main__":
    with Pool(4) as pool:
        pool.map(Tasks.process_some_task, range(10))
Successful output with pathos
(venv) vagrant@vagrant-ubuntu-trusty-64:~/test$ python test.py
Processing... 0 by pid: 3827
Processing... 1 by pid: 3828
Processing... 2 by pid: 3826
Processing... 3 by pid: 3829
Processing... 4 by pid: 3827
Processing... 5 by pid: 3828
Processing... 6 by pid: 3826
Processing... 7 by pid: 3829
Processing... 8 by pid: 3827
Processing... 9 by pid: 3828
References
- What can multiprocessing and dill do together?
- pathos: a framework for parallel graph management and execution in heterogeneous computing
- dill: serialize all of python