Apparently, when running Docker Swarm on ESXi 6.7 in certain rare setups, there is a port conflict on 4789/udp: ESXi drops packets on that port, which Docker Swarm uses as its default data path port for overlay (VXLAN) traffic. There are two ways to avoid the issue. One is to use a different data path port for Docker Swarm, and the other is to patch your ESXi host. Pick whichever suits your use case.
Symptoms
If this is happening on your machines, you will mostly notice timeouts and connection errors between nodes. You will still be able to join nodes to the swarm, and they will appear to connect successfully. No error will tell you that packets on the default port 4789/udp are being dropped.
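Since nothing reports the problem directly, a first check is whether a node is still on the default data path port. The following is a minimal sketch, assuming that `docker info` prints a line of the form `Data Path Port: 4789` in its Swarm section, as recent Docker releases do. The function names `data_path_port` and `check_data_path_port` are made up for this example and are not part of Docker.

```shell
#!/bin/sh
# Extract the swarm data path port from `docker info` output read on stdin.
# Assumption: the plain-text output contains a line like "Data Path Port: 4789".
data_path_port() {
    sed -n 's/^[[:space:]]*Data Path Port:[[:space:]]*\([0-9][0-9]*\).*$/\1/p' | head -n 1
}

# Classify the port: "default" means 4789/udp, the port affected here.
check_data_path_port() {
    port=$(data_path_port)
    if [ -z "$port" ]; then
        echo "unknown"
    elif [ "$port" = "4789" ]; then
        echo "default"
    else
        echo "custom:$port"
    fi
}

# Usage on a swarm node:
#   docker info 2>/dev/null | check_data_path_port
```

A result of `default` only means the node uses 4789/udp. It does not prove that ESXi is dropping packets, but combined with the timeouts described above it makes this issue a likely candidate.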
Solution 1
Initialize the swarm with a different data path port (14789 in this example):
docker swarm init --default-addr-pool 10.55.128.0/17 --data-path-port=14789
Solution 2
You can patch ESXi. More information is available in the release notes at this link (PR 2766401):
https://docs.vmware.com/en/VMware-vSphere/6.7/rn/esxi670-202111001.html