Original Sources:
- StackOverflow - dump and load
- StackOverflow - dump with ‘recurse’
- StackOverflow - if you want to use cPickle
- Code Redirect - if you want to use PiCloud
- Program Creek - ‘dill’ code examples
- ‘dill’ documentation
Used in this file:
~ Ex.:
import multiprocessing as mp
import dill

def helperFunction(payload, inp, *args, **kwargs):
    import dill  # re-import, in case dill is not loaded in the worker process
    f = dill.loads(payload)  # converts bytes back to a (potentially lambda) function
    return f(inp, *args, **kwargs)

def mapStuff(f, inputs, *args, **kwargs):
    payload = dill.dumps(f, recurse=True)  # converts a (potentially lambda) function to bytes
    with mp.Pool(6) as pool:  # 6-worker pool, cleaned up automatically on exit
        futures = [pool.apply_async(helperFunction, [payload, inp, *args], kwargs) for inp in inputs]
        return [fut.get() for fut in futures]  # collect results before the with-block closes the pool

if __name__ == "__main__":
    mapStuff(lambda x: x**2, [2, 3])              # returns [4, 9]
    mapStuff(lambda x, b: x**2 + b, [2, 3], 1)    # returns [5, 10]
    mapStuff(lambda x, b: x**2 + b, [2, 3], b=1)  # also returns [5, 10]

    def f(x):
        return x**2
    mapStuff(f, [4, 5])  # returns [16, 25]
Pickle is a pain
The pickle module that multiprocessing uses under the hood cannot serialize lambda functions, and it cannot serialize nested functions either, so neither can be passed to worker processes directly. The dill module (third-party, not in the standard library) can dump even a lambda or a nested function to bytes and load it back again, which is what makes the multiprocessing example above work.
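A minimal sketch of the difference, runnable on its own (the try/except around the dill import is just a guard, since dill has to be installed separately with pip):

```python
import pickle

square = lambda x: x**2

# pickle refuses to serialize a lambda: it can only look functions
# up by name, and a lambda has no importable name.
try:
    pickle.dumps(square)
    print("pickle succeeded (unexpected)")
except Exception as e:
    print("pickle failed:", e)

# dill serializes the function object itself, so the round trip works.
try:
    import dill  # third-party: pip install dill
    restored = dill.loads(dill.dumps(square))
    print("dill round-trip ok:", restored(4))  # restored(4) == 16
except ImportError:
    print("dill not installed")
```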