Cluster Update: New Apartment, New Network Hell
I am so excited! For the first time, I am living in a full-fledged apartment instead of a dorm room. After moving in, I spend my first day relaxing. The next day, I attempt to connect my computer cluster to my campus's network, but I have no luck!
After connecting it to an Ethernet port, there are no blinking LED lights on the port or on the device's network jack. Well, the port is dead, right? Nope! Connecting other devices to the port works just fine. ???
I investigate further by connecting to the main network board of my cluster, the Espressobin board. There is no link. This is intriguing because, at power-up, the link lights on the port stay active for a few seconds.
I have a wild idea! From a few random conversations in the past, I know that my college campus network uses PacketFence. PacketFence manages network devices and isolates misbehaving ones. My Espressobin could definitely fit this category: it's running its own DHCP server. That server should only serve the internal cluster network. However, if it were misconfigured and handing out addresses on the campus side, then the campus network would no doubt issue an "arrest warrant" for my cluster.
All of my network and DHCP configurations begin with the rc.local file (super stable, right!). I disable the rc.local script and run each command within it after boot. Still no luck, ahhhhhhhhhhhhh!
Next up, I test a USB Ethernet adapter instead of the internal port.
Espressobin --> USB Ethernet --> Campus Network
It works! What in the world!!!!!!!!!!!!! Ok, this narrows down the problem to the exact port. It has nothing to do with my device "misbehaving", or a misconfigured DHCP client/server.
Side note: I tested the cluster on a different network and it connected fine. So, this confirms there is no physical damage to the Espressobin's port.
One lshw -C network command later, I found something. It could be nothing (again) or it could be everything. The lshw -C network output showed the link speed to be 1Gbps. This stood out because, when I tested the port with my laptop earlier, a speed test topped out at 100Mbps. I remembered that number because I was a little disappointed that the apartment was limited to 100Mbps, unlike the 1000Mbps dorm link.
To continue the investigation, I connected a different Linux board to the port, the Odroid XU4. It is also capable of 1Gbps. However, its lshw -C network report showed its speed to be 100Mbps.
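For a quicker check than scanning the full lshw report, the kernel also exposes a per-interface speed through sysfs. This is a minimal sketch, not what I actually ran: the link_speed helper and the SYSFS_NET override are names made up for illustration, and the value sysfs reports may differ from lshw's when no link is up (it can be -1 or unreadable).

```shell
#!/bin/sh
# Print the speed (in Mbps) the kernel reports for a network interface.
# SYSFS_NET is a hypothetical override so the function can be pointed at a
# test directory; on a real system it defaults to /sys/class/net.
link_speed() {
    speed_file="${SYSFS_NET:-/sys/class/net}/$1/speed"
    # Reading the file can fail outright when the interface has no link.
    speed=$(cat "$speed_file" 2>/dev/null) || speed=""
    case "$speed" in
        ""|-1) echo "unknown"; return 1 ;;
        *)     echo "$speed" ;;
    esac
}
```

Running link_speed wan on the Espressobin and link_speed eth0 on the Odroid would give the two numbers side by side. The interface names here are assumptions and will vary by board.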
From various background knowledge, I know that Gigabit Ethernet uses more of the wires in an Ethernet cable than 100Mbps Ethernet does (all four twisted pairs instead of two). The two ends should auto-negotiate a speed they both support when the link is first established. However, this isn't happening with my Espressobin.
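To see what auto-negotiation actually settled on, ethtool prints the current speed, duplex, auto-negotiation state, and whether a link was detected. The sketch below only condenses those four fields from ethtool's output into one line; link_summary is a hypothetical helper name, and it assumes the usual "Speed:", "Duplex:", "Auto-negotiation:", and "Link detected:" lines are present.

```shell
#!/bin/sh
# Condense the negotiation-related fields of `ethtool <iface>` output
# (read from stdin) into a single line.
link_summary() {
    awk -F': *' '
        /^[ \t]*Speed:/            { speed = $2 }
        /^[ \t]*Duplex:/           { duplex = $2 }
        /^[ \t]*Auto-negotiation:/ { autoneg = $2 }
        /^[ \t]*Link detected:/    { link = $2 }
        END {
            printf "speed=%s duplex=%s autoneg=%s link=%s\n",
                   speed, duplex, autoneg, link
        }
    '
}
```

Usage would look like ethtool wan | link_summary. A result with link=no is the situation I was stuck in: both ends claim to negotiate, yet no link comes up.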
A Google search reveals the command needed to force the Espressobin link down to 100Mbps. It works! I have blinking lights! ...and an internet connection.
I added this command to my rc.local so the 100Mbps setting is reapplied on every boot.
ethtool -s wan speed 100 duplex full autoneg on
Note to self: I'll need to remove that line from the rc.local when I move back to a 1Gbps capable network.
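One way to avoid having to remember that might be to apply the cap only when the port fails to get a link on its own. This is an untested idea rather than what is in my rc.local: has_carrier and cap_speed_if_no_link are hypothetical helper names, SYSFS_NET and ETHTOOL are made-up overrides for dry runs, and the check relies on the kernel's carrier flag in sysfs.

```shell
#!/bin/sh
# Succeed if the kernel reports a carrier (link up) on the interface.
# SYSFS_NET is a hypothetical override; it defaults to /sys/class/net.
has_carrier() {
    carrier_file="${SYSFS_NET:-/sys/class/net}/$1/carrier"
    [ "$(cat "$carrier_file" 2>/dev/null)" = "1" ]
}

# Force 100 Mbps only when the port came up without a link.
# ETHTOOL is a hypothetical override, e.g. ETHTOOL=echo for a dry run.
cap_speed_if_no_link() {
    iface="$1"
    if has_carrier "$iface"; then
        echo "$iface: link is up, leaving speed alone"
    else
        "${ETHTOOL:-ethtool}" -s "$iface" speed 100 duplex full autoneg on
    fi
}
```

In rc.local this would be called as cap_speed_if_no_link wan, probably after a short sleep so negotiation has time to finish. How long to wait is a guess I have not measured.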