I use autovivification to store data in a multiprocessing setup. However, I cannot figure out how to incorporate it into the multiprocessing Manager.
My autovivification code comes from "Multiple levels of 'collections.defaultdict' in Python" and works fine when no multiprocessing is involved.
class Vividict(dict):
    def __getitem__(self, item):
        try:
            return dict.__getitem__(self, item)
        except KeyError:
            # Missing key: create a nested Vividict, store it, and return it.
            value = self[item] = type(self)()
            return value
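For illustration, here is a minimal sketch of how this class behaves: indexing a missing key creates the intermediate levels on the fly, so a five-level assignment needs no setup. The keys and counts are made up for the example.

```python
class Vividict(dict):
    """dict that creates nested Vividicts for missing keys."""

    def __getitem__(self, item):
        try:
            return dict.__getitem__(self, item)
        except KeyError:
            value = self[item] = type(self)()
            return value


counts = Vividict()
# The chained lookups create every intermediate level.
counts["the"]["quick"]["brown"]["fox"]["jumps"] = 3
counts["the"]["quick"]["brown"]["fox"]["jumps"] += 2

# Merely reading a missing key also creates it, as an empty (falsy) Vividict.
missing = counts["a"]["b"]

print(counts["the"]["quick"]["brown"]["fox"]["jumps"])
print("a" in counts, bool(missing))
```

Note the side effect in the last lines: a lookup alone is enough to insert keys, which matters when the structure is later written out or compared.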
My multiprocessing code is relatively simple:
from multiprocessing import Manager, Process, Queue, current_process

if __name__ == "__main__":
    man = Manager()
    ngramDict = man.dict()
    print(ngramDict)
    s_queue = Queue()
    aProces = Process(target=insert_ngram, args=(s_queue, ngramDict))
    aProces.start()
    aProces.join()
    print(ngramDict)
    write_to_file()
In insert_ngram, the dictionary is read, written, and updated:
def insert_ngram(sanitize_queue, ngramDict):
    ngramDict = Vividict()
    try:
        # Read records until the None sentinel arrives.
        for w in iter(sanitize_queue.get, None):
            if ngramDict[w[0]][w[1]][w[2]][w[3]][w[4]]:
                ngramDict[w[0]][w[1]][w[2]][w[3]][w[4]] += int(w[5])
            else:
                ngramDict[w[0]][w[1]][w[2]][w[3]][w[4]] = int(w[5])
        print(ngramDict)
        return
    except KeyError as e:
        print("Key %s not found in %s" % (e, ngramDict))
    except Exception as e:
        print("%s failed with: %s" % (current_process().name, e))
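The if/else above relies on a freshly vivified leaf being an empty, and therefore falsy, Vividict. A possible alternative is sketched below with a hypothetical helper, `add_count`, which walks to the parent level and uses `dict.get` with a default of 0 so that no empty leaf is created. The sample records are invented.

```python
class Vividict(dict):
    def __getitem__(self, item):
        try:
            return dict.__getitem__(self, item)
        except KeyError:
            value = self[item] = type(self)()
            return value


def add_count(ngrams, record):
    """Add int(record[5]) to the count stored under the keys record[0..4]."""
    *path, leaf = record[:5]
    node = ngrams
    for key in path:
        node = node[key]  # autovivifies missing intermediate levels
    # dict.get does not go through __getitem__, so a missing leaf is not vivified.
    node[leaf] = node.get(leaf, 0) + int(record[5])


ngrams = Vividict()
sample_records = [
    ("a", "b", "c", "d", "e", "2"),
    ("a", "b", "c", "d", "e", "3"),
    ("a", "b", "c", "d", "x", "1"),
]
for rec in sample_records:
    add_count(ngrams, rec)

print(ngrams)
```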
I have tried a number of what I thought were good solutions, but I can't get any of them to work, except by calling write_to_file inside insert_ngram, which is not a very neat solution.
Is it possible to get Manager.dict() to autovivify?
--------- 6-12-2013 --------
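Since Manager() provides a proxy, mutations made in the subprocess to nested values inside the manager.dict() are not tracked. (See also: How does multiprocessing.Manager() work in Python?)
I worked around this as follows: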
def insert_ngram(sanitize_queue, ngramDict):
    localDict = Vividict()
    localDict.update(ngramDict)
    # ... fill localDict from the queue ...
    ngramDict.update(localDict)
Note that the dicts involved are large (200 MB+).
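The workaround can be sketched as a function that counts into a local Vividict and writes the result back once. This is only a sketch under assumptions: `count_locally`, `to_vivid` and `to_plain` are hypothetical names, the records arrive as a plain iterable rather than a queue, and a regular dict stands in for the Manager dict so the example runs without starting any processes. Converting back to plain dicts before the write is a choice made for this sketch, not something the original code does.

```python
class Vividict(dict):
    def __getitem__(self, item):
        try:
            return dict.__getitem__(self, item)
        except KeyError:
            value = self[item] = type(self)()
            return value


def to_vivid(node):
    """Recursively convert nested dicts into Vividicts."""
    if isinstance(node, dict):
        return Vividict((key, to_vivid(value)) for key, value in node.items())
    return node


def to_plain(node):
    """Recursively convert nested Vividicts back into plain dicts."""
    if isinstance(node, dict):
        return {key: to_plain(value) for key, value in node.items()}
    return node


def count_locally(records, shared):
    """Count records in a local Vividict, then write back to `shared` once."""
    local = to_vivid(dict(shared.items()))  # copy the current shared state
    for *path, leaf, n in records:
        node = local
        for key in path:
            node = node[key]
        node[leaf] = node.get(leaf, 0) + int(n)
    # Single top-level write; `shared` is assumed to be a dict-like object.
    shared.update(to_plain(local))


shared = {"a": {"b": {"c": {"d": {"e": 1}}}}}  # stand-in for the Manager dict
records = [("a", "b", "c", "d", "e", "2"), ("v", "w", "x", "y", "z", "4")]
count_locally(records, shared)
print(shared)
```

With a real Manager dict, only operations on the proxy itself, such as update() or assignment to a top-level key, reach the manager process, which is why the write-back happens at the top level.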
--------- 8-12-2013 --------
dict.update(), with a dict of ~200 MB+, ...