Alpaca

Technology

Run Multiple Docker Daemons with Net Container

October 24th, 2015 12:10

Docker is awesome software that could disrupt how we run software in the new cloud age.  At Alpaca, we run dozens of servers for production cloud services as well as development environments, from virtualized servers in AWS and monthly bare metal rentals to in-house machines in our office.  We run everything inside Docker, so we don’t care much about the underlying host OS or provisioning.  We clone the private GitHub repository and type ‘make’, which pulls pre-built Docker images with all dependencies installed and has the services running in a few minutes.  When it comes to scientific computing in Python with GPUs, despite the big community effort making it easier than ever, setup is still bumpy and hard to maintain.  Docker images reduce it to doing it once and only once.  Beautiful.

As a tiny startup company, we try to minimize operating costs literally every day, but running GPUs in the cloud is inherently expensive.  Our application is not CPU or memory intensive, but it is very GPU intensive, and running a fleet of small GPU servers, each with unnecessary CPU and memory, for every developer on the team would not be affordable for us.  So we wanted to share one powerful GPU server among a handful of developers, which would reduce the server cost dramatically compared to running a small GPU instance for each developer.  The question was how to share a Docker environment on one machine without interrupting others.  It may sound easy given the vision of Docker, which says everything is encapsulated and isolated, but the reality is not there yet.  We expose a fixed set of port numbers, and some of our internal processes may start listening on unpredictable ports.  We start containers with predictable names, which could also conflict with other developers’ containers.  We could avoid such conflicts by offsetting port numbers and suffixing container names, but even worse was the shared Docker cache: when someone tags a new Docker image for a new feature, others may accidentally run that image even though it doesn’t work with the old version of the application.

It was not entirely clear to us how we could achieve this.  I looked at “Docker in Docker” or DnD (sounds crazy…) for example, but it didn’t help our situation much.  The basic idea of DnD is to allow a container process to access the host’s Docker daemon, which wouldn’t isolate anything.  Our conclusion was that we had no choice but to run multiple Docker daemons on one host to isolate the Docker environment for each developer.  We found it is possible to run multiple daemons, but there were a couple of issues to resolve.  One is the network bridge.  You may not have heard of the docker0 bridge, but it is the trick Docker uses to let containers communicate with the external network, and it also provides inter-container communication.  The bridge is created behind the scenes by the daemon when it first runs.  We need multiple bridges, but the daemon only creates docker0 automatically, so any additional bridges have to be created by hand.  Did creating the bridges solve the network issue?  No.  We were using --net=host, and our application assumes each process can talk to the others without the hassle of container linking (personally I think container linking is one of Docker’s bad designs), so breaking that assumption was just not possible.  I dug into the documentation and found there is another net mode, --net=container.  It lets containers share the same network namespace: a set of containers can behave as if they ran with --net=host and communicate with each other that way, while all of that stays inside one network container and doesn’t contaminate the host’s network namespace.
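As a quick illustration of what --net=container gives you (this little demo is our own sketch; busybox and port 8080 are arbitrary choices), containers joined to another container's network namespace can talk to each other over localhost:

$ docker run -d --name ns busybox sleep 3600
$ docker run -d --name listener --net=container:ns busybox nc -l -p 8080
$ docker run --rm --net=container:ns busybox sh -c 'echo hi | nc 127.0.0.1 8080'
$ docker logs listener   # prints "hi": both containers share one loopback interface
hi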

[Diagram: one host running multiple Docker daemons, each with its own bridge and its own net container whose network namespace is shared by that daemon's application containers]

So in the diagram above, the server container is free to listen on any port number as long as it works within the set of containers (collectively “one application”).  In other words, the server in daemon A can listen on the same port as the server in daemon B.  This effectively makes each daemon space a virtualized host, where multiple processes run and communicate with each other much like they do on a bare metal host.  You don’t have to worry about container linking but still gain the benefit of Docker’s pre-built dependency mechanism.  Bravo.
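For example (a sketch with hypothetical user names; the per-user sockets come from the script below), the same container name can exist under two daemons without clashing:

$ DOCKER_HOST=unix:///home/alice/docker/docker.sock docker run -d --name redis redis
$ DOCKER_HOST=unix:///home/bob/docker/docker.sock docker run -d --name redis redis   # no name conflict: different daemon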

Here is the snippet that shows exactly how we do what I just described.

function start_dockers {
  OFFSET=0

  for u in $USERS
  do
    BRIDGE_NAME=br_${u}
    DOCKER_ROOT=/home/${u}/docker

    echo "Creating the bridge and starting the daemon for ${u}"

    # create a per-user bridge with its own subnet
    brctl addbr ${BRIDGE_NAME}
    SUBNET=$(expr 52 + ${OFFSET})
    ip addr add 172.18.${SUBNET}.1/24 dev ${BRIDGE_NAME}
    ip link set dev ${BRIDGE_NAME} up

    # IP masquerade, if not set yet (needed because the daemon runs with --iptables=false)
    if ! iptables -t nat -C POSTROUTING -j MASQUERADE -s 172.18.${SUBNET}.0/24 -d 0.0.0.0/0 2>/dev/null; then
      iptables -t nat -A POSTROUTING -j MASQUERADE -s 172.18.${SUBNET}.0/24 -d 0.0.0.0/0
    fi

    # create a per-user docker working directory
    mkdir -p ${DOCKER_ROOT}
    chown $u:$u ${DOCKER_ROOT}

    # launch a docker daemon with its own graph dir, exec root, bridge, socket, and pid file
    docker daemon -D \
      -g ${DOCKER_ROOT}/g \
      --exec-root=${DOCKER_ROOT}/e \
      -b ${BRIDGE_NAME} \
      --dns=8.8.8.8 \
      --iptables=false \
      -H unix://${DOCKER_ROOT}/docker.sock \
      -p ${DOCKER_ROOT}/docker.pid > ${DOCKER_ROOT}/docker-${u}.log 2>&1 &

    OFFSET=$(expr ${OFFSET} + 1)
  done
}

First, create a bridge for each user, with subnets starting from 172.18.52.0/24, then set up IP masquerading, which is necessary if you run the daemon with --iptables=false.  Then create the user’s own Docker working directories, both for the exec root and the graph (container caches and filesystem images).  Also, give each daemon its own unix domain socket and pid file location.  Set DOCKER_HOST to the unix domain socket specified in the -H option.
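As a sketch of how this might be driven (the user list here is a made-up example; the function needs to run as root for brctl, ip, and iptables):

USERS="alice bob"
start_dockers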

 

Then make sure you run a net container before starting the other containers.  The net container can be anything; it just needs to be there.

FROM ubuntu:14.04
CMD ["/bin/sh", "-c", "yes | while read line; do sleep 1; done"]

Build this Dockerfile as an image named net (the CMD simply keeps the container alive forever), then start it:

$ docker build -t net .
$ docker run -t -d --name net net

Let’s run Redis in the network namespace of this net container.

$ docker run --net=container:net -d --name redis redis redis-server --appendonly yes
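As a quick check that the namespace is really shared (our own sanity test, not part of the setup), any container joined to net can reach that Redis over localhost:

$ docker run --rm --net=container:net redis redis-cli -h 127.0.0.1 ping
PONG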

 

The last piece of this work: don’t forget to update your .bash_profile to set DOCKER_HOST to point to your unix domain socket path.  Then, when you log in to the host, you see your own Docker process space, as if you were exclusively using the host machine with a single Docker environment.
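For example, matching the DOCKER_ROOT layout used in the script above:

# in each developer's ~/.bash_profile
export DOCKER_HOST=unix:///home/${USER}/docker/docker.sock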

 

We’ve just launched our deep learning-based trading platform, Capitalico, which makes heavy use of Docker, and we are welcoming beta test users.  If you are interested, please sign up for the waiting list so we can contact you!

http://www.capitalico.co/