Solving PicklingError in multiprocessing with Pathos

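Python's multiprocessing module is useful for running concurrent tasks across multiple processes. However, it requires everything it sends to the worker processes to be picklable, and that is not always the case for callables such as class instance methods and static methods. Pathos provides its own multiprocessing implementation that uses dill as the serialization backend, which can serialize and deserialize almost all Python types.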
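To make the pickling requirement concrete, here is a minimal sketch that uses only the standard library. The helper name `can_pickle` is hypothetical and introduced purely for illustration. It checks whether the standard pickle module accepts an object, which is roughly the question multiprocessing has to answer before it can ship a callable to a worker.

```python
import pickle


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # Different Python versions report unpicklable objects with
        # different exception types, so several are caught here.
        return False


# A built-in function can be found again by name, so it pickles.
print(can_pickle(len))

# A lambda has no importable name, so the standard pickler rejects it.
print(can_pickle(lambda x: x + 1))
```

A lambda is used here because the standard pickler rejects it on every Python 3 version. Objects like this are the kind that dill, and therefore pathos, is designed to handle.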

Example using the built-in multiprocessing module that raises PicklingError

import os
from multiprocessing import Pool


class Tasks:

    @staticmethod
    def process_some_task(item):
        print("Processing...", item, "by pid:", os.getpid())

if __name__ == "__main__":
    with Pool(4) as pool:
        pool.map(Tasks.process_some_task, range(10))

Error raised when running the script above (the traceback is from Python 3.4)

(venv) vagrant@vagrant-ubuntu-trusty-64:~/test$ python test.py
Traceback (most recent call last):
  File "test.py", line 13, in <module>
    pool.map(Tasks.process_some_task, range(10))
  File "/usr/lib/python3.4/multiprocessing/pool.py", line 260, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/lib/python3.4/multiprocessing/pool.py", line 599, in get
    raise self._value
  File "/usr/lib/python3.4/multiprocessing/pool.py", line 383, in _handle_tasks
    put(task)
  File "/usr/lib/python3.4/multiprocessing/connection.py", line 206, in send
    self._send_bytes(ForkingPickler.dumps(obj))
  File "/usr/lib/python3.4/multiprocessing/reduction.py", line 50, in dumps
    cls(buf, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function Tasks.process_some_task at 0x7f45b3e626a8>: 
attribute lookup process_some_task on __main__ failed
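The "attribute lookup ... failed" message reflects how pickle treats functions: it stores a reference (a module name plus an attribute name) rather than the function's code, and that name must be resolvable again when the object is loaded. The sketch below illustrates this by-reference behaviour with a built-in function. `describe_reference` is a hypothetical helper, not part of any library.

```python
import pickle


def describe_reference(func):
    """Pickle func and report whether its name appears in the payload."""
    payload = pickle.dumps(func)
    # Functions are pickled by reference, so the payload carries the
    # function's name instead of its bytecode.
    name_in_payload = func.__name__.encode() in payload
    return name_in_payload, payload


found, payload = describe_reference(len)
print("name stored in payload:", found)

# Unpickling resolves the stored name back to the very same object.
print("same object after round trip:", pickle.loads(payload) is len)
```

When the stored name cannot be resolved to the same object, as with the static method in the traceback above, the pickler gives up and raises PicklingError.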

Solution using pathos

  • Install pathos
# install pathos
$ pip install pathos
  • Replace multiprocessing's Pool with pathos's ProcessingPool
import os
from pathos.multiprocessing import ProcessingPool as Pool


class Tasks:

    @staticmethod
    def process_some_task(item):
        print("Processing...", item, "by pid:", os.getpid())

if __name__ == "__main__":
    with Pool(4) as pool:
        pool.map(Tasks.process_some_task, range(10))

Successful output with pathos

(venv) vagrant@vagrant-ubuntu-trusty-64:~/test$ python test.py
Processing... 0 by pid: 3827
Processing... 1 by pid: 3828
Processing... 2 by pid: 3826
Processing... 3 by pid: 3829
Processing... 4 by pid: 3827
Processing... 5 by pid: 3828
Processing... 6 by pid: 3826
Processing... 7 by pid: 3829
Processing... 8 by pid: 3827
Processing... 9 by pid: 3828
