This article was published on January 12th 2015 and takes about 3 minutes to read.
I'm not particularly proud of the mistake that delayed the work on my projects for the last few days, but at least I can present you with one possible cause of strange networking behaviour with virtual machines.
I am very happy with how my new book is coming along (about half of the book is feature-complete) but about a week ago I started to notice some strange networking behaviour with the virtual machine I use as a staging environment in the book.
I am using one of my self-built Vagrant boxes to spin up a minimal CentOS 7 system with an additional network interface (with an IP address in my private network) which gets provisioned with Ansible (having spent a great amount of time on Puppet before, I cannot tell you how much more I like Ansible, but that's for another article).
This scenario is quite common, so I began to worry when my virtual machine lost its network connection a few times a day. Most of the time that meant Ansible complained about not being able to reach the server halfway through the provisioning process.
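When this happened, the server could not be reached with SSH and did not respond to ping requests either, so it seemed like it had completely lost its network connection. The strange thing was that accessing the machine via Vagrant (with vagrant ssh) continued to work. That command goes through the separate NAT'ed interface Vagrant sets up, not through the additional interface in my private network.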
This problem was very hard to track down because as soon as I wanted to use tcpdump (or similar tools) to take a look under the hood, the connection was up again, sometimes after a few minutes, but mostly a few seconds later.
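In hindsight, a tiny probe that runs in the background and timestamps every change in reachability would have made these short outages much easier to catch. The following is only a minimal sketch: the function names, the choice of TCP port 22 (SSH) as the thing to probe, and the default interval are my own assumptions for illustration.

```python
import socket
import time
from datetime import datetime


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


def watch(host, port=22, interval=1.0, checks=60):
    """Probe host:port `checks` times and record every change of state.

    Returns a list of (timestamp, reachable) tuples, one per state change.
    """
    changes = []
    last_state = None
    for attempt in range(checks):
        up = is_port_open(host, port)
        if up != last_state:
            stamp = datetime.now()
            changes.append((stamp, up))
            print(f"{stamp:%H:%M:%S} {host}:{port} is {'up' if up else 'DOWN'}")
            last_state = up
        if attempt < checks - 1:
            time.sleep(interval)
    return changes
```

Calling something like watch("192.168.0.100", checks=3600) in a second terminal while Ansible is provisioning would log exactly when the machine drops off the network and when it comes back.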
This was something I did not want my readers to experience, so I had to make sure that there was a problem with my configuration and not a bug in one of the tools I am recommending.
The first thing I did was to switch from my CentOS 7 box to my CentOS 6 box. After all, CentOS 7 holds quite a few surprises if you are coming from an earlier version.
This did not change anything, so I wanted to make sure that it was not a bug in Vagrant. I started the virtual machine directly from VirtualBox, removed the NAT'ed network interface Vagrant requires to operate and rebooted the machine.
Since reinstalling VirtualBox also changed nothing, it became clear that there had to be a problem with my network configuration.
When examining my main router's configuration, I quickly realized that I had changed my DHCP range some time ago (giving our main devices fixed IP addresses solved a problem I had with Mac OS X Yosemite switching hostnames).
I configured my DHCP range to start at 192.168.0.100, an IP address I obviously have a soft spot for, because I had also assigned my virtual machine that very same address as its fixed IP. Presumably, whenever the router leased that address to another device, two hosts on my network were claiming the same IP.
After modifying my Vagrantfile to use 192.168.0.50, everything started working as expected and there have not been any networking problems since.
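A check like this is easy to automate before handing out a fixed address. Here is a small sketch using Python's standard ipaddress module. The function name is made up for this example, and since I only mentioned where my DHCP range starts, the end address of 192.168.0.200 is purely an assumption; substitute whatever your router is actually configured with.

```python
import ipaddress


def in_dhcp_range(ip, range_start, range_end):
    """Return True if ip lies inside the inclusive DHCP range."""
    addr = ipaddress.ip_address(ip)
    start = ipaddress.ip_address(range_start)
    end = ipaddress.ip_address(range_end)
    return start <= addr <= end


# The range start is taken from my router; the range end is an assumed value.
DHCP_START = "192.168.0.100"
DHCP_END = "192.168.0.200"

for candidate in ("192.168.0.100", "192.168.0.50"):
    if in_dhcp_range(candidate, DHCP_START, DHCP_END):
        print(f"{candidate} is inside the DHCP range, do not use it as a fixed IP")
    else:
        print(f"{candidate} is outside the DHCP range")
```

With these values the old address is flagged as conflicting, while the new one is reported as safe to assign statically.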
So kids, do not assign fixed IP addresses in a dynamic DHCP range! When you call yourself an experienced operator, this is actually quite embarrassing. But at least it shows that even small configuration mistakes can lead to errors which are difficult to track down. So try to always know what you are configuring!