Sunday, February 15, 2015

ubuntu - Troubleshooting SmartOS VM that won't start



I'm running SmartOS 20130405T010449Z with an Ubuntu KVM guest. The Ubuntu VM ran without problems for months, but after a reboot of the physical hardware the VM no longer connects to the network on startup, so I can't ssh into it to check its health.



I can log into SmartOS and start the VM:



$ vmadm start [uuid]


verify that it's running:




$ vmadm list
UUID TYPE RAM STATE ALIAS
[uuid] KVM 10240 running steve


and ping it:



$ ping steve
steve is alive



but when I attempt to drop into the VM's console, the command simply hangs forever:



$ vmadm console [uuid]
[hangs forever]


I get the same result when I attempt to ssh from inside SmartOS:




$ ssh steve
[hangs forever]


I can't ssh from other machines on the network, because the Ubuntu VM's IP address never comes up on the network.



What should I try next to access this VM?


Answer



OK, I eventually recovered what I wanted from the VM. For posterity, here is what I did:




First, I updated SmartOS. I was hesitant at first, fearing data loss, but the upgrade was totally painless: put a new version on a new USB stick, shut down, swap the sticks, and reboot.



After the update, vmadm console and ssh still hung when connecting to the VM. The key insight (which I was unaware of before) was to connect via VNC instead:



root@smartos $ vmadm info [UUID] vnc
{
  "vnc": {
    "host": "192.168.1.7",
    "port": 64762,
    "display": 58862
  }
}

me@anotherMachine $ xtightvncviewer 192.168.1.7::64762


There, the problem was immediately apparent: the VM was stuck at the boot menu, waiting for a boot option to be selected. I selected the default option and hey presto, the VM came up perfectly healthy.
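For anyone repeating this, the host and port lookup above can be scripted. This is a minimal sketch, not something from my original session: vnc_target is a hypothetical helper name, and it assumes the vmadm info output has the same shape as shown above (a quoted "host" and a numeric "port"). It uses only sed, so it does not depend on a JSON tool being installed.

```shell
#!/bin/sh
# vnc_target (hypothetical helper): read the JSON printed by
# `vmadm info [UUID] vnc` on stdin and print "host::port", the
# form used with xtightvncviewer above (double colon = TCP port).
vnc_target() {
    # Slurp stdin once so both fields are read from the same text.
    info=$(cat)
    host=$(printf '%s\n' "$info" |
        sed -n 's/.*"host"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' |
        head -n 1)
    port=$(printf '%s\n' "$info" |
        sed -n 's/.*"port"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' |
        head -n 1)
    if [ -z "$host" ] || [ -z "$port" ]; then
        echo "vnc_target: could not find host/port in input" >&2
        return 1
    fi
    printf '%s::%s\n' "$host" "$port"
}
```

On SmartOS, vmadm info [UUID] vnc | vnc_target prints the value to pass to xtightvncviewer on the other machine. The port can differ between VM starts, so it is worth looking it up again each time rather than hard-coding it.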



There was a catch, though: presumably when I updated SmartOS, I lost the "external" NIC, so the VM came up without a channel to the outside world. I had to manually edit /usbkey/config in SmartOS and add these lines, which were missing:




external_nic=[MAC address]
external0_ip=192.168.1.20
external0_netmask=255.255.255.0
external0_gateway=192.168.1.1


and then add the external NIC to the VM:



root@smartos $ cat add_nic.json
{
  "add_nics": [
    {
      "physical": "net1",
      "index": 1,
      "nic_tag": "external",
      "mac": "[MAC address]",
      "ip": "192.168.1.8",
      "netmask": "255.255.255.0",
      "gateway": "192.168.1.1"
    }
  ]
}
root@smartos $ cat add_nic.json | vmadm update [UUID]


I had to reboot SmartOS to pick up the configuration change, and then the VM came up with a network interface.
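Since the missing external_* lines in /usbkey/config were what cost me the NIC, a quick check before rebooting can save a round trip. The following is only a sketch under my own assumptions: missing_external_keys is a hypothetical helper name, and the key list is just the four lines I had to add above, not a complete list of what SmartOS reads from that file.

```shell
#!/bin/sh
# missing_external_keys FILE (hypothetical helper): print every expected
# key that has no "key=" line in FILE. Returns 0 if all keys are present,
# 1 if at least one is missing.
missing_external_keys() {
    config=$1
    rc=0
    for key in external_nic external0_ip external0_netmask external0_gateway; do
        # Match only lines that start with the key, so that
        # commented-out entries are not counted as present.
        if ! grep -q "^${key}=" "$config"; then
            echo "$key"
            rc=1
        fi
    done
    return $rc
}
```

Running missing_external_keys /usbkey/config prints nothing when all four lines are present. Any key it prints still has to be added by hand, as described above.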



Caveat: for some reason, vmadm console still doesn't work; it hangs indefinitely. However, ssh steve works from inside SmartOS, and I can ssh to the VM's IP address from other machines on my network.

