How to start a dask.distributed cluster in a single thread?

How can I run a full Dask.distributed cluster in a single thread? I want to use this for debugging or profiling.

Note: this is a frequently asked question; I am posting both the question and an answer here for future reuse.


Local scheduler

If you can get by with the single-machine scheduler API (just calling compute), then you can use the single-threaded scheduler:

x.compute(get=dask.get)
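
For example, here is a minimal sketch using dask.array (the array and chunk sizes are arbitrary; any Dask collection works the same way). Because every task runs in the calling thread, pdb breakpoints and cProfile output point directly at your task code:

import dask
import dask.array as da

x = da.ones((1000, 1000), chunks=(100, 100))

# All tasks execute sequentially in this thread, so debuggers and
# profilers see them directly.
result = x.sum().compute(get=dask.get)

# Newer Dask versions replaced the get= keyword with scheduler=:
# result = x.sum().compute(scheduler='synchronous')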

Distributed Scheduler - Single Machine

If you want to run a dask.distributed cluster on a single machine, you can start a client with no arguments:

from dask.distributed import Client
client = Client()  # Starts local cluster
x.compute()

This uses many threads, but everything runs on one machine.
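
If you want a more reproducible layout while debugging, you can pin down the size of the local cluster. A sketch, assuming a distributed version where Client forwards these keywords to the underlying LocalCluster:

from dask.distributed import Client

# One single-threaded worker keeps execution serial while still
# exercising the full scheduler/worker machinery.
client = Client(n_workers=1, threads_per_worker=1)
x.compute()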

Distributed Scheduler - Single Process

Alternatively, if you want to run everything in a single process, you can pass the processes=False keyword:

from dask.distributed import Client
client = Client(processes=False)  # Starts local cluster
x.compute()

This keeps the scheduler, workers, and client in the same process; control and communication stay on one event loop, though computation still happens in a thread pool.
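
A quick sanity check, sketched here with os.getpid, shows that the workers share the client's process:

import os
from dask.distributed import Client

client = Client(processes=False)

# The task runs in the same process as this script, so the PIDs match.
assert client.submit(os.getpid).result() == os.getpid()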

Distributed Scheduler - Single Thread

To run control, communication, and computation all in a single thread, you need to give the worker an executor that runs tasks synchronously, such as Tornado's concurrent.futures-compatible DummyExecutor. Something like the code below should do the trick, though it relies on internal APIs of dask.distributed and Tornado that may change between versions. The final check confirms that a submitted task runs in the same thread as the caller:

from dask.distributed import Scheduler, Worker, Client
from tornado.concurrent import DummyExecutor
from tornado.ioloop import IOLoop
import threading

loop = IOLoop()
e = DummyExecutor()  # runs submitted work immediately in the calling thread
s = Scheduler(loop=loop)
s.start()
w = Worker(s.address, loop=loop, executor=e)
loop.add_callback(w._start)  # start the worker on the shared event loop

async def f():
    async with Client(s.address, start=False) as c:
        future = c.submit(threading.get_ident)  # record which thread ran the task
        result = await future
        return result

>>> threading.get_ident() == loop.run_sync(f)  # the task ran in this very thread
True