Kent O. Johnson wrote:
Regarding database I/O performance, I had an issue where using a PostgreSQL database container with volumes from a data-only container, all from within a VM, gave very unpredictable and erratic I/O behavior, which I am thinking is because of multiple layers of swapping between the physical machine and the virtual machine. Since I was using volumes, I am thinking the problem was not a container problem at all, just the configuration of the I/O driver for VMware, the hypervisor I used.
Switching everything to physical machines outside of containers made things much faster and more consistent. I considered trying containers on the physical machine, though I haven't gotten to that yet. I am glad you mentioned that the different storage drivers can affect performance, though it sounds like they are only relevant for in-container I/O. Is that right? If I wanted to run a PostgreSQL database container (or any database container) without mounting volumes, how would I go about achieving performance similar to running the DBMS on bare metal?
Using databases inside of containers also seems to unnecessarily complicate backups, since there is no SSHing into a container, though I could see an automated backup writing to a directory mounted as a volume to solve that problem. Would you solve the backup use case like that?
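Something like this minimal sketch is what I have in mind, assuming the official postgres image (the container name pg, database mydb, and host path /srv/backups are just placeholders):

# start the database with a host directory mounted as a volume for dumps
docker run -d --name pg -v /srv/backups:/backups postgres
# dump the database to the mounted directory from inside the container;
# a host cron job could run this on a schedule (assumes in-container auth is set up)
docker exec pg pg_dump -U postgres -f /backups/mydb.sql mydb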
Regarding network I/O, do you find that the virtual network device layer used by the Docker Engine adds enough overhead to warrant bypassing it altogether with the "--net=host" option? How much of a performance gain have you seen it give? This is the first time I have seen that option, and I would like to learn more about it and where it would make sense.
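For my own understanding, here is how I picture the two setups differing (postgres is just an example image):

# default bridge networking: traffic passes through Docker's virtual bridge,
# with iptables NAT rules handling the port mapping
docker run -d -p 5432:5432 postgres
# host networking: the container shares the host's network stack directly,
# so there is no bridge or port mapping involved (-p is not needed)
docker run -d --net=host postgres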
Kent
Hi Kent
As far as I've seen, databases (large binary files that change frequently) are about the worst thing you can have inside a container. Aside from the storage driver issue, Docker layering will also perform very poorly: for example, if you load your schema in a new layer on top of a large dbspace, the whole dbspace will be copied into that layer. I generally avoid doing I/O-heavy things inside a container if possible.
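As a sketch of the layering problem (the image name mydb-base and script load-schema.sh are invented for illustration), consider a Dockerfile like this on a file-level driver such as aufs:

# parent image containing, say, a 10GB dbspace file at /data/big.db
FROM mydb-base
# the first write to /data/big.db in this layer copies the whole 10GB file
# up into the new layer, costing both build time and duplicated disk space
RUN ./load-schema.sh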
Storage drivers are a fundamental part of Docker, but generally you can get by without knowing anything about them. However, knowing how they work gives you some insight into edge cases (like databases). I'll briefly note how each one works below:
aufs (https://docs.docker.com/engine/userguide/storagedriver/aufs-driver/) - each layer of an image consists of a set of files, and looking at the whole filesystem of a running container works by looking at the files of all layers for the container image (where files in layers higher up 'hide' files from layers lower down). When a container wants to write to a file, the file is copied from the highest layer with that file and used as the container copy.

overlayfs (https://docs.docker.com/engine/userguide/storagedriver/overlayfs-driver/) - basically the same as aufs

devicemapper (https://docs.docker.com/engine/userguide/storagedriver/device-mapper-driver/) - layers are a bunch of pointers to the actual block contents of files, and file contents are stored in a big pool shared between layers. When a container wants to write to a file, it just needs to copy the appropriate block of the file and alter one of the block pointers.

btrfs (https://docs.docker.com/engine/userguide/storagedriver/btrfs-driver/) - approximately the same idea as devicemapper

zfs (https://docs.docker.com/engine/userguide/storagedriver/zfs-driver/) - again, approximately the same idea as devicemapper

vfs - a 'fake' layer driver that copies the entire contents of the parent layer on creation
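As an aside, if you're not sure which driver your own daemon is running, docker info reports it:

# prints a line like "Storage Driver: aufs"
docker info | grep 'Storage Driver'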
We can make some observations just from the designs outlined above:
aufs/overlayfs 'copy-file-on-write' means the first time you try to write to a large file (e.g. a database) the whole file will be copied, which will be slow, but once done it should give fast access because you have your own copy (see the quick demonstration after this list)

devicemapper/btrfs/zfs 'copy-block-on-write' means there isn't a huge penalty the first time you write to a file, so you can get started quicker and the disk space reuse is better, but it may not be as fast as aufs or native access

vfs is astoundingly slow to start up and shares no disk space between layers...but should be as fast as raw filesystem access once created
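To make the aufs copy-up cost concrete, here is a rough experiment (bigfile-image is a hypothetical image containing a large file at /data/big.db and including bash):

# the first append forces a copy-up of the whole file into the container layer;
# the second append hits the already-copied file and returns almost instantly
docker run --rm bigfile-image bash -c 'time (echo x >> /data/big.db); time (echo x >> /data/big.db)'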
My experience is mainly with aufs and devicemapper, so you should definitely (at minimum) benchmark the others before accepting my claims. It doesn't hurt to benchmark them all! You can also look at the table at the bottom of https://docs.docker.com/engine/userguide/storagedriver/selectadriver/ for what Docker Inc suggests are the important points about each driver.
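As a crude starting point for such a benchmark, something like this writes into the container's own filesystem, which is exactly where the storage driver sits (the image, size, and block size are arbitrary choices):

# write 256MB into the container layer and flush it to disk before exiting
docker run --rm ubuntu dd if=/dev/zero of=/benchfile bs=1M count=256 conv=fsync

Run it once per storage driver configuration and compare the results against the same dd command on bare metal.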
Unfortunately, there's more to consider than just performance. For example,
AUFS is considered a highly stable driver by Docker Inc, but for a particular, unusually I/O-heavy application I was using 6 months ago, I would see weekly kernel panics bringing down my system!

Devicemapper is currently the only commercially supported driver on CentOS/Red Hat.

BTRFS/overlayfs/devicemapper (on a partition)/zfs can require special setup which may be nontrivial.
Honestly, I look at heavy I/O in containers with great scepticism, just because I've lived through a lot of pain with unstable drivers. Of course, things are always improving! But volumes (and possibly volume drivers) feel like the most reliable approach to me for now.
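To be concrete about the volume approach (the host path /srv/pgdata is a placeholder; /var/lib/postgresql/data is the data directory the official postgres image uses):

# keep the database files on the host via a volume, so the heavy
# database I/O bypasses the storage driver entirely
docker run -d --name pg -v /srv/pgdata:/var/lib/postgresql/data postgres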