"-cluster-store" and "-cluster-advertising" do not work

I am trying to configure a Docker cluster with Swarm and Consul. I have three machines: manager, host1, and host2.
I run the Consul and Swarm manager containers on the manager.

    $ docker run --rm -p 8500:8500 progrium/consul -server -bootstrap
    $ docker run -d -p 2377:2375 swarm manage consul://<manager>:8500
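
To sanity-check the manager side before joining the hosts, both services can be queried through their published ports; a quick sketch (<manager> is the placeholder used throughout):

    # Consul's HTTP API should answer on 8500
    $ curl http://<manager>:8500/v1/catalog/nodes

    # The Swarm manager is published on 2377, so point a Docker client at it
    $ docker -H tcp://<manager>:2377 info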

On host1 and host2, I change the daemon settings with --cluster-store and --cluster-advertise and restart the Docker daemon.

    # host1
    DOCKER_OPTS="--cluster-store=consul://<manager>:8500 --cluster-advertise=<host1>:2375"

    # host2
    DOCKER_OPTS="--cluster-store=consul://<manager>:8500 --cluster-advertise=<host2>:2375"

When I join host1 and host2 to the swarm, it fails.

    host1 $ docker run --rm swarm join --advertise=<host1>:2375 consul://<manager>:8500
    host2 $ docker run --rm swarm join --advertise=<host2>:2375 consul://<manager>:8500

The Swarm manager's log shows these errors:

 time="2016-01-20T02:17:17Z" level=error msg="Get http://<host1>:2375/v1.15/info: dial tcp <host1>:2375: getsockopt: connection refused" time="2016-01-20T02:17:20Z" level=error msg="Get http://<host2>:2375/v1.15/info: dial tcp <host2>:2375: getsockopt: connection refused" 
+6
3 answers

I had a similar problem and finally figured out why this did not work. In my setup, I have several boxes on the local network 192.168.10.0/24 that I want to manage with Swarm, allowing external access only to specific containers; the following examples are executed on the box 192.168.10.1:

  • Configure the daemons with --cluster-store consul://192.168.10.1:8500 (Consul and Registrator are deployed to each daemon as its first containers) and --cluster-advertise 192.168.10.1:2375, as well as -H tcp://192.168.10.1:2375 -H unix:///var/run/docker.sock -H tcp://127.0.0.1:2375 (see the daemon-options sketch after this list). Note that I do not bind to all available addresses the way you do with tcp://0.0.0.0:2375; I bind only to the local 192.168.10.0/24 network. If you also want containers bound only to the local network, you can pass the additional --ip option to the daemon. For containers that should be reachable from everywhere (in my case only an nginx load balancer, made fault-tolerant with keepalived), you bind the port on all interfaces at startup: docker run ... -p 0.0.0.0:host_port:container_port ... <image>
  • Start the daemons.
  • Deploy gliderlabs/registrator and Consul with Compose (this is the example from the first box in my setup, but I run the equivalent on all daemons for a full Consul HA failover installation) via docker-compose -p bootstrap up -d, which creates the containers bootstrap_registrator_1 and bootstrap_consul_1 in the private network bootstrap:

    version: '2'
    services:
      registrator:
        image: gliderlabs/registrator
        command: consul://192.168.10.1:8500
        depends_on:
          - consul
        volumes:
          - /var/run/docker.sock:/tmp/docker.sock
        restart: unless-stopped
      consul:
        image: consul
        command: agent -server -bootstrap -ui -advertise 192.168.10.1 -client 0.0.0.0
        hostname: srv-0
        network_mode: host
        ports:
          - "8300:8300"     # Server RPC, Server Use Only
          - "8301:8301/tcp" # Serf Gossip Protocol for LAN
          - "8301:8301/udp" # Serf Gossip Protocol for LAN
          - "8302:8302/tcp" # Serf Gossip Protocol for WAN, Server Use Only
          - "8302:8302/udp" # Serf Gossip Protocol for WAN, Server Use Only
          - "8400:8400"     # CLI RPC
          - "8500:8500"     # HTTP API & Web UI
          - "53:8600/tcp"   # DNS Interface
          - "53:8600/udp"   # DNS Interface
        restart: unless-stopped
  • The daemons now register themselves and set locks in the KV store (Consul) under docker/nodes, but Swarm does not read from this location automatically, so when it tries to find out which daemons are available, it finds none (a way to verify the registrations is sketched after this list). This bit cost me the most time: to solve it, I had to specify --discovery-opt kv.path=docker/nodes and start Swarm with docker-compose -p bootstrap up -d on all the boxes, ultimately ending up with replicated Swarm HA managers:

    version: '2'
    services:
      swarm-manager:
        image: swarm
        command: manage -H :3375 --replication --advertise 192.168.10.1:3375 --discovery-opt kv.path=docker/nodes consul://192.168.10.1:8500
        hostname: srv-0
        ports:
          - "192.168.10.1:3375:3375"
        restart: unless-stopped
  • Now I have a working Swarm that is reachable only on the 192.168.10.0/24 network on port 3375. All running containers are likewise reachable only on this network unless I specify -p 0.0.0.0:host_port:container_port at startup (with docker run).

  • Further scaling: when I add more boxes to the local network to increase capacity, my idea is to add more daemons, and possibly non-manager Swarm instances, configured the same way but running Consul as clients (not as servers started with -server).
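
As a concrete version of the daemon options from the first bullet, here is a minimal sketch of what the startup options could look like on the 192.168.10.1 box (the /etc/default/docker path and the DOCKER_OPTS variable are assumptions for an upstart/sysvinit install; adjust for your init system):

    # /etc/default/docker (sketch; flags as described in the first bullet)
    DOCKER_OPTS="-H tcp://192.168.10.1:2375 -H tcp://127.0.0.1:2375 -H unix:///var/run/docker.sock \
      --cluster-store consul://192.168.10.1:8500 \
      --cluster-advertise 192.168.10.1:2375"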
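
To verify that the daemons really registered under docker/nodes, the keys can be listed through Consul's standard /v1/kv HTTP endpoint:

    # List the registrations the daemons wrote into the KV store
    $ curl "http://192.168.10.1:8500/v1/kv/docker/nodes?keys"

If this returns one key per daemon but swarm manage still reports no nodes, the --discovery-opt kv.path=docker/nodes option described above is the missing piece.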
+3

Are you running Consul to discover multiple hosts, or to discover the Swarm agents?

Have you tried checking consul members? Why not run the Docker daemons so they connect to a local Consul agent, and then join the agents with consul join? Is there a reason not to?
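
If the consul binary is not available on the host, the member list can also be fetched from the HTTP API (8500 being the port published in the question):

    # Equivalent of "consul members" over the HTTP API
    $ curl http://<manager>:8500/v1/agent/members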

I would also suggest the static method for discovering Swarm agents: it is the fastest, easiest, and most reliable approach I know.
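
For reference, a minimal sketch of static discovery with standalone Swarm, which skips the KV store entirely (host placeholders as in the question):

    # Manager with a static node list instead of Consul discovery
    $ docker run -d -p 2377:2375 swarm manage nodes://<host1>:2375,<host2>:2375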

You should also take a look at "How to create an overlay network between multiple hosts?", which may help you.

0

Remove "docker.pid" and "docker.sock" in / var / run. Then restart the host computer and restart the service docker using the "reboot dock supo-service"

Good luck to you!

0
