Increase overlay network capacity

Busy CapRover hosts can eventually hit a swarm networking limit: the shared captain-overlay-network runs out of addresses to allocate for service VIPs and task IPs.

Symptoms

In the CapRover browser deployment flow, you may see:

  • No NodeId was found. Try again in a minute...

On the Docker host, docker service ps ... or daemon logs may show:

  • could not find an available IP while allocating VIP
  • service ... has pending allocations

This is not tied to one specific app. It means the shared overlay network CapRover uses for service-to-service traffic does not have enough free addresses.
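To check how often the two Docker-side messages above appear, a small filter over the daemon logs can help. This is only a sketch: the function name count_vip_errors is illustrative, and the journalctl invocation in the comment assumes a systemd host where the Docker daemon logs to journald.

```shell
# count_vip_errors: read log lines on stdin and print how many of them
# contain either of the two allocation messages listed above.
count_vip_errors() {
  grep -c -e 'could not find an available IP while allocating VIP' \
          -e 'has pending allocations'
}

# Example (assumption: systemd host, unit named docker.service):
#   journalctl -u docker.service --since '1 hour ago' | count_vip_errors
```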

Why this happens

CapRover one-click apps cannot reliably define their own Docker swarm networks, so every service in a large template draws its addresses from the shared captain-overlay-network.

If that network was created as a /24, its pool is only 256 addresses, shared by every service VIP and every running task attached to it. A sufficiently busy CapRover instance can exhaust it.
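For comparison, the size of an IPv4 subnet is 2^(32 - prefix length). A minimal shell sketch of that arithmetic follows; cidr_size is an illustrative helper, not a Docker or CapRover command.

```shell
# cidr_size PREFIX: print the total number of addresses in an IPv4 subnet
# with the given prefix length.
cidr_size() {
  echo $(( 1 << (32 - $1) ))
}

cidr_size 24   # prints 256
cidr_size 20   # prints 4096
```

Moving from a /24 to a /20 therefore multiplies the pool by 16. A few addresses in each subnet are reserved (for example the gateway), so the usable count is slightly lower than the total.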

Quick relief

Before migrating the overlay network, try the simpler fixes:

  • remove unused apps and services
  • remove broken or half-deployed stacks
  • retry the install

If the host is still tight on overlay addresses, migrate the network.

Tested migration: /24 to /20

This is the exact sequence that was used successfully on a single-node CapRover swarm host.

1. Inspect the current shared overlay

docker network inspect captain-overlay-network

If it shows a /24 subnet and the symptoms above are present, continue.
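Rather than reading the full inspect output, the subnet can be extracted and its prefix checked directly. The helper names below are illustrative, and the commented docker command assumes the network has a single IPAM configuration entry.

```shell
# subnet_prefix CIDR: print the prefix length of a CIDR string.
subnet_prefix() {
  echo "${1##*/}"
}

# is_small_subnet CIDR: succeed when the prefix is /24 or longer,
# i.e. the subnet holds 256 addresses or fewer.
is_small_subnet() {
  [ "$(subnet_prefix "$1")" -ge 24 ]
}

# On the Docker host:
#   subnet=$(docker network inspect captain-overlay-network \
#     --format '{{range .IPAM.Config}}{{.Subnet}}{{end}}')
#   is_small_subnet "$subnet" && echo "small pool: $subnet"
```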

2. Create a temporary migration network

Use a non-overlapping subnet that is not already in use on the host.
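One way to check "non-overlapping" is to convert each subnet to its first and last address as integers and compare the two ranges. The sketch below does that in plain shell arithmetic; all function names are illustrative. It only compares the subnets you pass in, so the list of subnets already in use still has to be gathered from the host.

```shell
# ip_to_int A.B.C.D: print the IPv4 address as a 32-bit integer.
ip_to_int() {
  IFS=. read -r o1 o2 o3 o4 <<EOF
$1
EOF
  echo $(( ($o1 << 24) + ($o2 << 16) + ($o3 << 8) + $o4 ))
}

# cidr_range CIDR: print "FIRST LAST", the subnet's address range as integers.
cidr_range() {
  base=$(ip_to_int "${1%/*}")
  size=$(( 1 << (32 - ${1#*/}) ))
  first=$(( base / size * size ))   # align down to the subnet boundary
  echo "$first $(( first + size - 1 ))"
}

# cidrs_overlap A B: succeed when the two subnets share at least one address.
cidrs_overlap() {
  set -- $(cidr_range "$1") $(cidr_range "$2")
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

# The two /20 subnets used in this guide do not overlap:
if cidrs_overlap 10.0.32.0/20 10.0.16.0/20; then echo overlap; else echo "no overlap"; fi

# To list the subnets Docker already uses on the host:
#   docker network ls -q | xargs docker network inspect \
#     --format '{{.Name}} {{range .IPAM.Config}}{{.Subnet}} {{end}}'
```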

docker network create \
  --driver overlay \
  --attachable \
  --subnet 10.0.32.0/20 \
  captain-overlay-network-migrate

3. Attach every swarm service that currently uses captain-overlay-network

This adds the temporary network first so services stay reachable during the move.

# For each service attached to captain-overlay-network, also attach the temporary network.
for svc in $(docker service ls --format '{{.Name}}'); do
  if docker service inspect "$svc" --format '{{range .Spec.TaskTemplate.Networks}}{{println .Target}}{{end}}' | grep -qx 'captain-overlay-network'; then
    docker service update --network-add captain-overlay-network-migrate "$svc"
  fi
done
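The filter above compares .Target against the network name. Depending on the Docker version and on how a service was created or last updated, Target may hold the network ID instead, in which case a name-only comparison matches nothing and the loop silently skips the service. A more defensive sketch accepts either value; targets_include is an illustrative name.

```shell
# targets_include NAME ID: read one network target per line on stdin and
# succeed if any line is exactly the network name or exactly its ID.
targets_include() {
  grep -qxF -e "$1" -e "$2"
}

# On the Docker host, list the services attached to the shared overlay:
#   net=captain-overlay-network
#   net_id=$(docker network inspect "$net" --format '{{.Id}}')
#   for svc in $(docker service ls --format '{{.Name}}'); do
#     docker service inspect "$svc" \
#       --format '{{range .Spec.TaskTemplate.Networks}}{{println .Target}}{{end}}' |
#       targets_include "$net" "$net_id" && echo "$svc"
#   done
```

Printing the matched services before running any update is a cheap way to confirm the filter selects what you expect.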

4. Remove the old network from those services

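# Detach the old network from every service that still uses it; the temporary network keeps them connected.
for svc in $(docker service ls --format '{{.Name}}'); do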
  if docker service inspect "$svc" --format '{{range .Spec.TaskTemplate.Networks}}{{println .Target}}{{end}}' | grep -qx 'captain-overlay-network'; then
    docker service update --network-rm captain-overlay-network "$svc"
  fi
done

5. Remove and recreate captain-overlay-network with a larger subnet

The tested replacement was 10.0.16.0/20.

docker network rm captain-overlay-network

docker network create \
  --driver overlay \
  --attachable \
  --subnet 10.0.16.0/20 \
  captain-overlay-network

6. Move services back onto the recreated shared overlay

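# Attach the recreated captain-overlay-network to every service currently on the temporary network.
for svc in $(docker service ls --format '{{.Name}}'); do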
  if docker service inspect "$svc" --format '{{range .Spec.TaskTemplate.Networks}}{{println .Target}}{{end}}' | grep -qx 'captain-overlay-network-migrate'; then
    docker service update --network-add captain-overlay-network "$svc"
  fi
done

7. Detach the temporary migration network

# Detach the temporary network from every service that still uses it.
for svc in $(docker service ls --format '{{.Name}}'); do
  if docker service inspect "$svc" --format '{{range .Spec.TaskTemplate.Networks}}{{println .Target}}{{end}}' | grep -qx 'captain-overlay-network-migrate'; then
    docker service update --network-rm captain-overlay-network-migrate "$svc"
  fi
done

8. Remove the temporary network

docker network rm captain-overlay-network-migrate

If Docker still shows a ghost entry for the temporary overlay even after all services have left it, a Docker daemon restart usually clears it. Do that only after verifying no service still depends on it.
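The loops in steps 3, 4, 6 and 7 differ only in which network they match and which flag they pass to docker service update. If you prefer a single helper, a sketch along these lines does the same work; update_services_on is an illustrative name, and it uses the same name-based match as the loops above.

```shell
# update_services_on MATCH FLAG NET: for every service whose network targets
# include MATCH, run `docker service update FLAG NET SERVICE`.
update_services_on() {
  for svc in $(docker service ls --format '{{.Name}}'); do
    if docker service inspect "$svc" \
        --format '{{range .Spec.TaskTemplate.Networks}}{{println .Target}}{{end}}' |
        grep -qxF "$1"; then
      docker service update "$2" "$3" "$svc"
    fi
  done
}

# Steps 3, 4, 6 and 7 expressed with the helper:
#   update_services_on captain-overlay-network         --network-add captain-overlay-network-migrate
#   update_services_on captain-overlay-network         --network-rm  captain-overlay-network
#   update_services_on captain-overlay-network-migrate --network-add captain-overlay-network
#   update_services_on captain-overlay-network-migrate --network-rm  captain-overlay-network-migrate
```

The network removal and recreation in step 5 still has to happen between the second and third call.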

Verification

Check the recreated network:

docker network inspect captain-overlay-network --format '{{json .IPAM.Config}}'

The working state should show:

[{"Subnet":"10.0.16.0/20","Gateway":"10.0.16.1"}]

Then verify the core CapRover services are healthy again:

docker service ls --format '{{.Name}} {{.Replicas}}' | grep '^captain-'
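To avoid reading the replica counts by eye, the same output can be filtered down to services whose running count differs from the desired count. A minimal sketch, assuming the "NAME RUNNING/DESIRED" line shape produced by the format string above; not_fully_running is an illustrative name.

```shell
# not_fully_running: read "NAME RUNNING/DESIRED" lines on stdin and print
# the names of services whose running count differs from the desired count.
not_fully_running() {
  awk '{ split($2, r, "/"); if (r[1] != r[2]) print $1 }'
}

# On the Docker host:
#   docker service ls --format '{{.Name}} {{.Replicas}}' | grep '^captain-' | not_fully_running
```

Empty output means every listed service has all of its replicas running.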

And finally retry the deployment that was failing.