Err Powercut has just changed something?

T4LLGUY · February 22, 2024, 12:35pm

Hey guys, so this morning I was awoken with a bang… a mighty clap of thunder and my UPSes beeping. As it was a rather unexpected break in power delivery I decided to manually shut down the server.

I’m running an old HP-ML330-V6, with a dedicated boot drive, 8 bays and an old LSI RAID card. Running a few VMs.

To switch off, I hook my laptop into the UPS, navigate to the Proxmox IP address, bulk shut down all VMs/containers, then shut down the server. However, I noticed a pause in the shutdown of the server, so I flicked on the screen and made a note of “Failed deactivating swap /dev/pve/swap”, which I was going to look into further upon powering back up.

Powering back up, I had all the usual joys, some VMs stating “The volume ‘Filesystem root’ only has xxx MB disk space remaining” etc. etc., but all manageable, if time-consuming.

Then comes my most important VM, and that being STORJ! Proxmox says the VM is running, err, technically, but not in reality. So it’s attempting to boot but falling over.

Traditionally I have always used this thread (as I created it many years ago) - Oh NOOO! - Bloody Updates - Grub broken - #6 by CutieePie

However about a year ago I did actually get around to updating grub so I’m guessing the specific steps may no longer work? (especially the version number?)

Currently all I’m getting is

Booting ‘Debian GNU/Linux’

error: attempt to read or write outside of disk ‘hd0’.
error: attempt to read or write outside of disk ‘hd0’.
Loading Linux 4.19.0-21-amd64 …
error: attempt to read or write outside of disk ‘hd0’.
Loading initial ramdisk …
error: attempt to read or write outside of disk ‘hd0’.

Press any key to continue…

Failed to boot both default and fallback entries.

Press any key to continue…

So guys… over to you once again…
Please help me get my node back online!
Thank-you in advance.

Roxor · February 22, 2024, 2:20pm

So you can’t boot the OS your node is running on… but you’re asking for help in a forum for an app that runs on that OS. Would that be a better question for the ProxMox community?

It’s like saying you can’t start your car because it won’t even turn over… but instead of the dealer or a mechanic… you ask for help from the dude that installed your car stereo

T4LLGUY · February 22, 2024, 2:35pm

Ohh I see what you’re saying… sorry I’m not ‘very’ ‘linux’…

However Proxmox and all of my other VMs are working perfectly

Confused dot com

ProxMox Thread - https://forum.proxmox.com/threads/powercut-changes-something-on-one-vm.142083/

Alexey · February 23, 2024, 4:20am

If you have the node’s data and identity on a separate drive, and it’s intact, I would recommend just re-creating this VM or, even better, running the node directly on the Proxmox server in an LXC container.

Fixing the Proxmox server should definitely be discussed on the Proxmox forum. But we would like to get feedback and hear how you managed to restore your node.

T4LLGUY · February 23, 2024, 7:23am

Thank you for your reply Alexey (and also Roxor)

My server contains 8*1TB SAS drives, totalling 6TB in ZFS. I believe my Storj partition is only 1TB or maybe 2TB (I’d have to double check)

Suffice to say, it’s not on one dedicated drive…

I’m intrigued though about running a container directly, that sounds great. However, for me to be messing with live data sounds like a bad idea

I’m willing to ‘do as instructed’ if anybody knows what exactly to do??

Thanks again and hopefully some clever person will get back to me!

Alexey · February 23, 2024, 7:25am

But at least it is accessible from another VM or from Proxmox itself?
Then the only thing that you need to do is to make sure that the identity is on the same volume/partition/whatever, not inside the broken VM.

T4LLGUY · February 23, 2024, 7:40am

Seemingly it is possible, but it’s beyond me I’m afraid. I’m happy to act as the eyes, ears and hands on this one. But what I really need, is somebody who knows what they’re doing to guide me please…

Toyoo · February 23, 2024, 11:02pm

I have to say that I really wonder what went wrong during this downtime, I cannot imagine any reasonable scenario which would result in this type of problem.

Can you show the contents of the /etc/pve file that describes your Storj VM, and the partition table of that node? The easiest way to get the latter is probably to mount some sort of a linux livecd to the VM, boot from that livecd, and run sudo fdisk -l /dev/sda (or whichever boot drive is there) within that livecd.
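
For anyone following along, here is a minimal sketch of what to look for in that fdisk output. The GRUB message means a read was attempted past what GRUB believes is the end of the disk, so one sanity check is whether every partition ends before the disk's last sector. The helper name part_fits is made up for illustration; the three numbers are the Start and Sectors columns of a partition row and the sector count from the "Disk /dev/sda: ..." header line, all assumed to be in the same sector units.

```shell
#!/bin/sh
# Sketch only: part_fits is a hypothetical helper, not part of fdisk or GRUB.
# Usage: part_fits START SECTORS DISK_SECTORS
#   START, SECTORS  -> taken from one partition row of `fdisk -l`
#   DISK_SECTORS    -> total sectors from the "Disk ..." header line
part_fits() {
    start=$1
    count=$2
    disk=$3
    end=$((start + count - 1))   # last sector used by the partition
    if [ "$end" -lt "$disk" ]; then
        echo "ok: ends at sector $end of $disk"
    else
        echo "OUT OF RANGE: ends at sector $end, disk has $disk"
        return 1
    fi
}
```

For example, `part_fits 2048 1000 4096` reports that the partition fits, while a partition whose last sector is at or beyond the disk's sector count is flagged. A clean result here does not rule out file system damage; it only confirms the partition table itself is plausible.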

T4LLGUY · February 24, 2024, 11:01pm

Hey Toyoo, thanks for your reply man!

If I try, I get the following:

ls /etc/pve
ls: cannot access ‘/etc/pve’: No such file or directory

Screenshot

Hope this helps? I am ready to assist, sorry for my delay as I’ve been a little unwell.

Toyoo · February 24, 2024, 11:21pm

/etc/pve exists on the host system, not inside the guest.

T4LLGUY · February 24, 2024, 11:55pm

Oh… I’d have to pull down my other VMs then, hmmm

If I have to go to that extreme, are there any other commands you would like me to run?

Toyoo · February 25, 2024, 10:34am

Just ssh into the proxmox host, you don’t need to stop VMs.

T4LLGUY · February 25, 2024, 11:41am

Ahh that’s a relief, thank-you!

Toyoo · February 25, 2024, 2:22pm

So now please find the file in /etc/pve that describes your VM, and paste it.

T4LLGUY · February 25, 2024, 2:42pm

? Sorry, what do you mean?

I am in the directory /etc/pve/ but nothing jumps out at me as being about my VM?

storage.cfg mentions mountpoints and local/network drives, so that doesn’t seem applicable.
Users.cfg is my log on to Proxmox, so again it’s not that… All the .key files, it’s hardly likely to be these, and then there are the multiple directories… any clue as to which one to start looking into?

Toyoo · February 25, 2024, 7:31pm

You might need to learn Proxmox more. The /etc/pve directory holds pretty much all of the configuration for your Proxmox cluster: hosts, storage, VMs, etc. Files there are mostly text files that just describe parts of the configuration. I don’t remember now how the files describing VMs are named, as I am no longer an active user of Proxmox, but they should stand out with names coming from their numerical IDs.

peter_linder · February 25, 2024, 8:14pm

On a single node proxmox system they are in /etc/pve/qemu-server
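
A small sketch of how that lookup could go from a shell on the Proxmox host, assuming the VM config is named after the numeric VM ID with a .conf suffix. The function names vm_conf_path and show_vm_conf are made up for this example.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical names for illustration.

# vm_conf_path VMID: print the expected config path for a QEMU VM
vm_conf_path() {
    printf '/etc/pve/qemu-server/%s.conf\n' "$1"
}

# show_vm_conf VMID: print the config file if it is readable
show_vm_conf() {
    conf=$(vm_conf_path "$1")
    if [ -r "$conf" ]; then
        cat "$conf"
    else
        echo "no readable config at $conf" >&2
        return 1
    fi
}
```

Calling `show_vm_conf` with your VM's ID prints the text file to paste into the thread. On the host, `qm config` followed by the VM ID should show the same settings without needing to know the path.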

T4LLGUY · February 25, 2024, 9:41pm

I’m already aware of this node’s ID, as it’s 108

T4LLGUY · February 25, 2024, 9:45pm

Thank-you, OK… contents of 108.conf

Bearing in mind, I’d altered the boot sequence to run Linux Mint from the ISO file… I’ll change it back now

So boot order now is only SCSI0

Here’s a quick video: Watch boot | Streamable

Toyoo · February 25, 2024, 11:55pm

So at least after booting the guest things look ok on both sides, host and guest. And grub is supposed to read the same partition table as fdisk under your livecd. The only thing that comes to mind now is that maybe the guest file system got corrupted and points to outside of the drive, though given the partition sizes look ok, this should be caught… So, I’d try doing fsck on the guest file system as the next step, for example through live CD again.
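
A cautious way to start, sketched under some assumptions: the guest root is an ext-family file system, you are in a root shell inside the live CD session, and you pass whatever partition fdisk showed for the guest (the device name is yours to fill in, I am not guessing it). The check_fs name is made up for this example. The -n option is handed through to the file system checker, which for ext file systems means "open read-only and answer no to every repair question", so nothing gets changed on the first pass.

```shell
#!/bin/sh
# Sketch only: check_fs is a hypothetical wrapper, not a standard tool.
# Usage: check_fs DEVICE   (run as root from the live session)
check_fs() {
    dev=$1
    # Refuse anything that is not a block device (typo protection).
    if [ ! -b "$dev" ]; then
        echo "not a block device: $dev" >&2
        return 2
    fi
    # Checking a mounted file system can give misleading results.
    if grep -q "^$dev " /proc/mounts; then
        echo "$dev is mounted; unmount it first" >&2
        return 3
    fi
    # Read-only pass: report problems, change nothing.
    fsck -n "$dev"
}
```

If the read-only pass reports errors, it is worth taking a snapshot or backup of the VM disk on the Proxmox side before running fsck again in repair mode.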