Docker I: Discovering Docker and Cassandra

In this part we will learn how to run Docker containers. We will explore the basic Docker commands while deploying a small Cassandra cluster on separate hosts of my cluster. To keep things simple, we will use the official Cassandra image from Docker Hub to create the Cassandra containers. I will also explain a few basic Cassandra principles and keep it simple for people who have no knowledge of Cassandra.

Docker Setup

Docker Engine is the core library to build and run Docker images and containers on a Linux host. The easiest way to install it on common Linux distributions is to run the remote “get docker” installation script:
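```shell
# Download and run Docker's installation script (requires curl and sudo rights)
curl -sSL https://get.docker.com/ | sh
```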

If you experience difficulties with the script, or are using an unsupported OS/distribution, you can find details for manual installation here.

I have installed Docker Engine on my 5 nodes (ubuntu[0-4]) running Ubuntu Server 14.04 LTS.

For reference, here are the commands for a full manual installation, testing, and setup of a new user called “docker” on my Ubuntu nodes:
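The exact commands are distribution-specific; a rough sketch for Ubuntu 14.04 (the repository setup details may differ on your system) looks like this:

```shell
# 1. Install the Docker Engine package (Ubuntu 14.04 used the docker-engine package)
sudo apt-get update
sudo apt-get install -y docker-engine

# 2. Create a new "docker" user and add it to the docker group,
#    so that it can use docker without sudo
sudo adduser docker
sudo usermod -aG docker docker

# 3. Start the docker daemon, then run a test container
sudo service docker start
docker run hello-world
```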

The third part of the commands starts the docker daemon, and then runs a Hello-World container. The Hello-World image is automatically pulled (i.e. downloaded) from Docker Hub, because it was not found locally. Once the container is deployed, its main process prints a “hello world” message. Then, when the main process exits, the container automatically stops.

Overview of Cassandra

Cassandra is a NoSQL database. More specifically, it is a wide column store, and an AP system which offers tunable consistency to reach C, at the cost of performance.

Pros

Its main advantages are:

  • Decentralized: all nodes have the same role. There is no master or slave, which makes configuration easier.
  • Linear Scalability: offers the best read/write throughputs for very large clusters (although latency is only mediocre).
  • Fault-Tolerant: data is replicated across datacenters and failed nodes can be replaced without downtime.
  • Tunable Consistency: a level of consistency can be chosen on a per-query basis.

Cassandra is easy to set up and play with, because it has auto-discovery of nodes, and does not need a load balancer or a specific master configuration.

We can simply install 3 instances of Cassandra on 3 different nodes and they can form a cluster automatically (each node only needs to be informed of another node’s IP address at first). Then queries can be run against any instance.

Cons

Cassandra is a very efficient distributed database, but is not appropriate for all use-cases because:

  • Tables are made to serve a single query. To query on different criteria or with different ordering fields, extra tables or Materialized Views must be created for those queries. The Cassandra Query Language (CQL) makes it sound like you can query anything as in SQL, but because of this, you can’t.
  • No aggregation or joining.

In our case we will only use one table to store posts from users, and always query them in the same order for our webapp, so we should be fine.

Using Containers

Let’s create 3 instances of Cassandra on 3 nodes of my cluster: ubuntu1, ubuntu2 and ubuntu3. Their respective containers will be called cass1, cass2 and cass3.

Starting a first Cassandra container

Let’s first start a Cassandra container on ubuntu1 (192.168.0.201):
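```shell
docker run -d --name=cass1 --net=host cassandra
```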

The docker run command was used above to run the Hello-World container. But this time it pulls the official Cassandra image from Docker Hub instead of Hello-World. An image is a mold from which containers are spawned. It normally contains a Linux distribution (in this case Debian) with all of the necessary resources (in this case Cassandra binaries, scripts, folder structure, etc.) to provide the service that you need.

The container is started with the following options:

  • -d : run the container in detached mode, meaning in the background.
  • --name=cass1 : give it a name so that we can easily call commands on it later on.
  • --net=host : use the host’s network stack directly, and expose all container ports directly on the host IP address.

❗ The last network option is the easiest way to get started, but it is bad practice and should be used only when testing or playing around. The default behavior is to containerize the container’s networking, placing it into a separate network stack. We’ll see more about that a bit later.

Basic Docker Commands

To view all running containers:
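```shell
docker ps
```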

To view the container logs:
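```shell
docker logs cass1
```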

Stop the container:
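```shell
docker stop cass1
```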

View all containers, even the ones which are stopped:
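```shell
docker ps -a
```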

Note the CONTAINER ID field. The value from this column can be used instead of the container name when running docker commands. This is the way to specify a container if you haven’t given it a name. A prefix of the ID also works, as long as it doesn’t conflict with another ID. For example, you can also call docker stop 4cbe to stop the cass1 container.

Start the container again:
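```shell
docker start cass1
```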

Remove a container (all data in the container will be lost):
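```shell
docker rm -f cass1
```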

If the -f option is not used, an error occurs if the container has not been stopped first.

Cached Images

Now that the container has been deleted, you’ll have to execute the docker run command above again. But this time it won’t need to download the Cassandra image, because it was cached in your local repository the first time.

To view docker images cached locally:
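```shell
docker images
```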

As you can see, the Cassandra image is about 379.8 MB. It is bigger than the Debian image (125 MB), which is logical since the Cassandra image was in fact extended from the Debian image. The Hello-World image is very small (< 1 KB) because it doesn’t even contain a Linux distribution!

To delete an image, use  $ docker rmi <name_or_id>.

Instead of doing docker run to start a container, you can also simply download the image into your local cache first using docker pull cassandra, and then use docker create with the same options to create the container, which will exist but won’t be started until you call docker start <name_or_id>.
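For example, reusing the same options as our earlier run command:

```shell
docker pull cassandra
docker create --name=cass1 --net=host cassandra
docker start cass1
```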

Executing Commands inside a Container

The Cassandra container is in fact a containerized Debian system which runs Cassandra. We can run commands in that system using the following command on the host:
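```shell
docker exec <name_or_id> <command>
```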

For example, you can check what process is running in the container by doing:
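```shell
docker exec cass1 ps aux
```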

Or check some networking information in the container:
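```shell
docker exec cass1 ip addr
docker exec cass1 cat /etc/hosts
```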

Since we used the --net=host mode, the container has the same network interfaces as the host, and it also has the same /etc/hosts file.

To run commands interactively, use the -i (interactive) and -t (tty) options. You can for example run a bash session in the container Debian:
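```shell
docker exec -it cass1 bash
```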

You can see that you are logged in as root in the Debian Jessie command line.

Creating the Cassandra Cluster

Now that we have a first Cassandra container running on the first node, let’s create 2 other containers on the ubuntu2 (192.168.0.202) and ubuntu3 (192.168.0.203) hosts.
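```shell
# On ubuntu2 (192.168.0.202):
docker run -d --name=cass2 --net=host -e CASSANDRA_SEEDS=192.168.0.201 cassandra

# On ubuntu3 (192.168.0.203):
docker run -d --name=cass3 --net=host -e CASSANDRA_SEEDS=192.168.0.201 cassandra
```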

We ran the same command as on ubuntu1, but this time we added a -e option to define the environment variable CASSANDRA_SEEDS.

The usable environment variables are generally specified or explained on the Docker Hub page of the image. In this case, this variable takes a list of IP addresses of nodes to join in a cluster.

We can check the status of our cluster using Cassandra’s nodetool utility from inside any of the cluster’s containers:
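```shell
docker exec cass1 nodetool status
```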

We can see that our 3 nodes are “UN” (Up & Normal). Our cluster is ready!

Running the CQL Shell in a Container

Our cluster is running, so let’s create a table and insert data in it. To do this we can use the CQL Shell which comes with Cassandra.

We can call it from within any of the cluster’s containers, with the docker exec command in interactive mode.
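```shell
docker exec -it cass1 cqlsh
```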

A better way is to create a new Cassandra container for the sole purpose of executing the CQL Shell inside of it:
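```shell
docker run -it --rm cassandra cqlsh 192.168.0.202
```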

This technique is more docker-ish in philosophy. It respects functional isolation and does not run extra processes in the Cassandra cluster’s containers. Instead, it runs a CQL Shell in a dedicated container (on any host, even one which is not part of the cluster) and communicates with one of the cluster’s nodes remotely through the CQL port 9042.

The command cqlsh 192.168.0.202 is specified as a parameter after the image name cassandra. In this case, Docker uses the specified command as the container’s main process instead of its default one, which is normally the cassandra daemon.

The container is created with the --rm option so that once the container is stopped, which happens automatically when exiting the shell, it will be automatically removed.

Cassandra Data Creation and Querying

We can now create a small table and insert some data into it. We must first create a Keyspace to define our cluster organisation and set how many replicas we want. Then in that keyspace we can create a table and insert some data:
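Something like the following (the keyspace and table names here, demo and posts, are illustrative):

```sql
-- Keyspace with simple replication: 2 copies of each row across the cluster
CREATE KEYSPACE demo
  WITH REPLICATION = { 'class': 'SimpleStrategy', 'replication_factor': 2 };

USE demo;

-- A table of user posts: partitioned by username, sorted by creation time
CREATE TABLE posts (
  username text,
  creation timeuuid,
  content  text,
  PRIMARY KEY ((username), creation)
);

INSERT INTO posts (username, creation, content)
  VALUES ('nicolas', now(), 'hello world');
```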

The only thing which looks different from SQL here is the PRIMARY KEY definition.

It is composed of 2 parts, in the format ((PARTITION_KEY), CLUSTERING_KEY). In our case, the partition key is the username field and the clustering key is the creation field, but both keys could be composite if you specify multiple field names separated by commas.

  • The partition key defines how to distribute data across nodes. In our case this means that all rows with the same username will be stored together in the same partition, on the same node.
  • The clustering key defines how the rows will be sorted within a partition.

The timeuuid data type serves as a uuid but also contains a timestamp.

Let’s look at a few CQL queries to see what Cassandra can and cannot do.

Allowed Queries
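Assuming a posts table like the one created above (the names are illustrative):

```sql
SELECT * FROM posts WHERE username = 'nicolas' ORDER BY creation;
```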

The first query selects all rows for a given value of username, the partition key, and asks for them ordered by creation, the clustering key. This is the perfect query for the table we created, and it is the one we will use in our webapp later on.

The results are easy to obtain, because Cassandra simply needs to find the partition (on one of our 3 nodes) which contains all the rows for “nicolas”. It then reads those rows sequentially, and they are conveniently sorted by creation already, because it is the clustering key. In fact, if you don’t specify the ORDER BY clause you get exactly the same result, because the rows are already sorted anyway. However, you can use the ORDER BY clause with DESC to get the results in reverse order.
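A second query adds a condition on the clustering key (the timeuuid literal here is a made-up example value):

```sql
SELECT * FROM posts
  WHERE username = 'nicolas'
  AND creation = 50554d6e-29bb-11e5-b345-feff819cdc9f;
```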

This query is also allowed. We still have our condition on the partition key, so Cassandra can go into the partition where the “nicolas” rows are, and from there it can easily find the rows where the creation field matches the value we asked for, since they are already ordered. Inequalities can also be used this way. Removing the condition on the partition key (username) is possible but not recommended, because it requires joining results from multiple nodes. If you do so, a warning will appear and tell you to add ALLOW FILTERING at the end of your query to force it.

Forbidden Queries
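A first example of a query Cassandra will reject (again on our illustrative posts table) orders by a non-clustering column:

```sql
SELECT * FROM posts WHERE username = 'nicolas' ORDER BY content;
```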

This query cannot be executed, because we are trying to sort by content value. But the values are sorted by order of creation within the target partition containing the “nicolas” rows. Sorting by content would require performing random reads, which is bad for performance, or re-sorting the data in memory. Cassandra doesn’t bother to do that; it simply throws an error. Using a condition on the content field is not allowed either, for the same reason.
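A second forbidden query omits the partition key entirely:

```sql
SELECT * FROM posts ORDER BY creation;
```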

This query cannot be executed either, because we did not specify a username. This means that we are trying to sort rows from all partitions on all nodes. This can’t be done with huge amounts of data, so Cassandra doesn’t bother either, and throws an error. As the error suggests, it is possible to get a result if you define a few desired username values using an IN clause. But this is not recommended, because it requires calling multiple nodes and merging their results, which can become quite messy.

Networking and Data Volumes

Our cluster is now working and usable. But there are still a couple of important configurations we need to know about.

Using the Bridge Network

Previously we used the “host” network mode, which puts our containers directly on the host’s network stack. I did this first because it’s a lazy and quick way to get things started, but it is bad practice in Docker. The better way is to put the containers on the host’s virtual Ethernet bridge.

By specifying --net=bridge, or not specifying anything, since it is the default behavior, each container has its own IP:
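```shell
docker run -d --name=cass1 cassandra   # --net=bridge is the default
docker exec cass1 ip addr show eth0    # the container's own IP, e.g. 172.17.0.3
```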

Compared to when we used  --net=host, the following things have changed:

  • Calling localhost or 127.0.0.1 from within the container now calls the container itself instead of the host ubuntu1 machine.
  • The container has its own IP address, on the bridge network, which is 172.17.0.3 in the example above.
  • The /etc/hosts file is no longer the same as the one on the host. Now it only contains the container’s data.

Container Communication

When using the bridge network, the following observations can be made:

  1. The container can ping the host (e.g. at 192.168.0.201), and the host can ping the container (e.g. at 172.17.0.3).
  2. Two containers on the bridge network of a same host can ping each other using their bridge 172.17.x.x IP addresses.
  3. A container can ping another host using its IP address. But this is not mutual. See last point.
  4. However containers are not aware of other containers’ or hosts’ names, because their /etc/hosts file only contains their own information.
  5. A container cannot be pinged directly from outside its host. For example ubuntu2 cannot ping cass1 on ubuntu1, because the container’s IP address is on a virtual network, and only ubuntu1 knows about it.
    • This means that two containers on different hosts cannot ping each other.
    • It is however possible to ask the host to forward a port to one of its containers. That way, the container can be reached from outside through that port. More on this in the following section.

💡 It is possible with Docker to make containers from different hosts sit on the same network, by using an overlay network instead of the bridge network. We will see more about this in future posts. For now we are stuck with the bridge network.

Based on these points, we might run into a few problems when starting our cluster. When passing the CASSANDRA_SEEDS variable to a container, we must make sure to pass the IP address instead of the hostname, because of (4).

A bigger problem is that each container cass1, cass2 and cass3, sitting on separate hosts, will auto-discover each other by exchanging IP addresses. They will broadcast their own IP address, which is their bridge IP address. But this will lead to failed communications because of (5). Luckily, the Cassandra image takes an environment variable, CASSANDRA_BROADCAST_ADDRESS, in which we can tell the container to broadcast its host’s IP address instead.

So to overcome these networking obstacles, this is the command we can use to start our second container cass2:
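```shell
docker run -d --name=cass2 \
  -e CASSANDRA_SEEDS=192.168.0.201 \
  -e CASSANDRA_BROADCAST_ADDRESS=192.168.0.202 \
  cassandra
# The container's ports must also be published on the host,
# which is covered in the next section.
```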

Exposing Ports

When using the bridge network, all the container’s ports are exposed on its own bridge IP, but they are not reachable on the host’s IP, which means that they are not reachable from other hosts ubuntu2, ubuntu3, etc…

Fortunately, it is possible to expose ports through the host, using 2 options of the run command:

  • -P : publishes all the ports exposed by the image, mapping each to a random port number on the host.
  • -p <hp>:<cp> : publishes the container port cp on the host port hp.

We need to publish our Cassandra container ports on the host because they will be called by other nodes of the cluster. So we must use the second option listed above and map each port to the same value on the host. But how can we know in advance which ports to map? One way is to look at the Dockerfile of the Cassandra image. Check it out on GitHub here. This link can be found on the Docker Hub page of the image, which also normally documents how to run the container.

The Dockerfile defines how the image was built. If you look at the end of the file, just before the last line, you can see:
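```dockerfile
# 7000: intra-node communication
# 7001: TLS intra-node communication
# 7199: JMX
# 9042: CQL
# 9160: thrift service
EXPOSE 7000 7001 7199 9042 9160
```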

These are the ports which are used by the container. So to run a Cassandra container and expose all of its ports:
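```shell
docker run -d --name=cass2 \
  -e CASSANDRA_SEEDS=192.168.0.201 \
  -e CASSANDRA_BROADCAST_ADDRESS=192.168.0.202 \
  -p 7000-7001:7000-7001 \
  -p 7199:7199 \
  -p 9042:9042 \
  -p 9160:9160 \
  cassandra
```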

The first -p option uses a range of ports instead of a single port. This can save a bit of typing if you have a lot of consecutive ports.

Now our containers are containerized in their own network stack, but can still be accessed through their host’s IP address, thanks to port forwarding. One advantage of this, compared to using the host network stack, is that you could deploy 3 HTTP server containers on the same host, each listening on port 80, without port conflicts, by mapping those ports 80 to, for example, 8001, 8002 and 8003 on the host.

Mounting a Data Volume

Until now, all the data created in our Cassandra containers was stored in the container file system. This means that if we remove the containers and create new ones, we lose all of our Cassandra data. One way to persist data so that it can be kept after container destruction is to mount a host directory on the container.

This can be done using the -v <host_dir>:<mount_point> option in the run command:
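```shell
docker run -d --name=cass1 \
  -v /data/cassandra:/var/lib/cassandra \
  -e CASSANDRA_BROADCAST_ADDRESS=192.168.0.201 \
  -p 7000-7001:7000-7001 -p 7199:7199 -p 9042:9042 -p 9160:9160 \
  cassandra
```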

In this case, we have mounted the /data/cassandra directory of the ubuntu1 host on the /var/lib/cassandra mount point of the Cassandra container. The mount points available on a container are usually documented on the Docker Hub page. If not, you can take a look at the image’s Dockerfile and find the VOLUME values:
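```dockerfile
VOLUME /var/lib/cassandra
```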

This way, after destroying a Cassandra node container, its data will still be present on the host’s  /data/cassandra directory. If we recreate a new container, we can re-mount the same directory and continue using the same data.

Conclusion

With everything we have seen in this post, and with the use of ssh, we can deploy a cluster from a single client node, for example ubuntu0, in a few commands:
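For example, reusing the options introduced above:

```shell
# From ubuntu0, start one Cassandra container per host
ssh ubuntu1 docker run -d --name=cass1 -v /data/cassandra:/var/lib/cassandra \
  -e CASSANDRA_BROADCAST_ADDRESS=192.168.0.201 \
  -p 7000-7001:7000-7001 -p 7199:7199 -p 9042:9042 -p 9160:9160 cassandra

ssh ubuntu2 docker run -d --name=cass2 -v /data/cassandra:/var/lib/cassandra \
  -e CASSANDRA_SEEDS=192.168.0.201 \
  -e CASSANDRA_BROADCAST_ADDRESS=192.168.0.202 \
  -p 7000-7001:7000-7001 -p 7199:7199 -p 9042:9042 -p 9160:9160 cassandra

ssh ubuntu3 docker run -d --name=cass3 -v /data/cassandra:/var/lib/cassandra \
  -e CASSANDRA_SEEDS=192.168.0.201 \
  -e CASSANDRA_BROADCAST_ADDRESS=192.168.0.203 \
  -p 7000-7001:7000-7001 -p 7199:7199 -p 9042:9042 -p 9160:9160 cassandra
```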

As you can imagine, you could scale this up to thousands of nodes with a bit of scripting. In later posts we will have a look at tools which can orchestrate the deployment of containers in a more sophisticated way.
