My first experience using zfs!
Introduction to zfs from a newbie pov
Just sayin’ zfs is way more than just a filesystem.
Zfs is a way of living maaan…
What? What do you mean?
You have probably heard of software raid if you are a data hoarder or you selfhost things on a homelab. Well, there’s a “better software raid” that you should try out: an entire filesystem built to repair itself and to be very resistant to data loss and bit flips from cosmic rays! Yep, that’s a thing.
Magic?
All of this is done with some open-source dark magic (OpenZFS): data parity, hashes, checksums and more. It’s not magic btw, it’s hard work from many devs. <3
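To make the “checksums plus redundancy” idea a bit more concrete, here is a tiny Python sketch of how a mirror can notice a flipped bit and repair it from the good copy. It is only a toy model of the idea, not how OpenZFS is implemented: the function names are made up for this post, and SHA-256 is just a stand-in for whatever checksum the pool is configured to use.

```python
import hashlib


def checksum(block: bytes) -> str:
    # Stand-in checksum for this sketch (assumption: SHA-256).
    return hashlib.sha256(block).hexdigest()


def read_with_self_heal(copies: list[bytearray], expected: str) -> bytes:
    """Return a copy whose checksum matches and overwrite the bad copies with it."""
    good = next((bytes(c) for c in copies if checksum(bytes(c)) == expected), None)
    if good is None:
        # Every copy is damaged: redundancy cannot help, only a backup can.
        raise IOError("all copies fail their checksum")
    for c in copies:
        if checksum(bytes(c)) != expected:
            c[:] = good  # repair the damaged copy in place
    return good


data = b"docker-compose config"
expected = checksum(data)                    # stored separately from the data
mirror = [bytearray(data), bytearray(data)]  # two "disks" holding the same block

mirror[0][3] ^= 0b00000100                   # simulate a single flipped bit on disk 0

result = read_with_self_heal(mirror, expected)
healed = bytes(mirror[0]) == data
print(result == data, healed)
```

The important bit is that the expected checksum lives apart from the block it describes, so a damaged block cannot vouch for itself.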
Speed
It is simply great and can make spinning rust feel faster by using the “ARC” (Adaptive Replacement Cache), a caching layer that keeps frequently accessed data in your ram (preferably ECC) so that repeated reads are served from memory instead of from disk.
Corruption
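It can also avoid creating corrupted files when, for example, a sudden power loss happens, because it is a “COW filesystem”, aka copy on write: changed data is written to new blocks instead of overwriting the old ones in place, so an interrupted write leaves the previous version intact. This same concept is used by other filesystems such as btrfs, but zfs is generally considered the more mature of the two for parity raid and for managing drives and snapshots.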
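Here is a very small Python sketch of the copy-on-write idea, as I understand it. The `CowStore` class and its `crash_before_commit` flag are invented for illustration and have nothing to do with the real on-disk format: new data always goes to a fresh block, and only the last step (moving the root pointer) makes it visible.

```python
class CowStore:
    """Toy copy-on-write store: blocks are never overwritten in place."""

    def __init__(self):
        self.blocks = {}   # block id -> contents
        self.root = None   # id of the block that is currently "live"
        self.next_id = 0

    def write(self, data: bytes, crash_before_commit: bool = False) -> None:
        new_id = self.next_id
        self.next_id += 1
        self.blocks[new_id] = data   # step 1: write the new data to a fresh block
        if crash_before_commit:
            return                   # simulated power loss: root still points at the old block
        self.root = new_id           # step 2: switch the pointer to the new block

    def read(self) -> bytes:
        return self.blocks[self.root]


store = CowStore()
store.write(b"version 1")
snapshot = store.root                # a snapshot is just a remembered old root

store.write(b"version 2", crash_before_commit=True)
after_crash = store.read()           # still the old, consistent data

store.write(b"version 2")
after_commit = store.read()
old_version = store.blocks[snapshot]
print(after_crash, after_commit, old_version)
```

This is also roughly why snapshots are cheap on this kind of filesystem: the old blocks are still there, you just keep a pointer to them.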
I’m very new to zfs, so if something is factually wrong please contact me on Mastodon: @cosipa@social.linux.pizza.
How did my first zpool do?
Actually not that well.
But it was my fault <3 love zfs
Context
I have a small and cheap server for my homelab, and it doesn’t have a lot of storage expandability. To address this issue I decided to purchase a “4 bay enclosure” that connects directly to my server. It is pretty nice: it has a smart fan and stuff, looks good and is not too expensive. It doesn’t have a raid mode, which is good, because I just want it to be a JBOD so I can use zfs on it without issues.
Well… It wasn’t that simple, I wish. The enclosure I received looked like it had already been returned by someone else (RMA’d). It was scratched and stuff, which is not too important if it works…
After realising this I decided to ignore the fact that it looked used and test it anyway. Maybe it was OK, and if it turned out to be damaged I would still be able to return it.
Testing the pool
First I tested the 2 new disks that I bought for this project with SMART long tests and badblocks.
The disks were just fine. Yayyy
I initially created a zpool with those 2 drives as a mirror and it seemed to run just fine. I set up compression and encryption on some datasets inside of the pool.
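For reference, the setup boils down to roughly two commands. The little Python helper below only builds and prints them, it does not run anything. The pool name `tank`, the dataset name, the disk paths and the choice of `lz4` with a passphrase key are all placeholders and assumptions for this sketch, not necessarily what I used.

```python
import shlex


def mirror_pool_commands(pool: str, disks: list[str], dataset: str) -> list[list[str]]:
    """Build (but do not run) the commands for a mirrored pool with one encrypted dataset."""
    create_pool = ["zpool", "create", pool, "mirror", *disks]
    create_dataset = [
        "zfs", "create",
        "-o", "compression=lz4",       # assumption: lz4, the post does not name an algorithm
        "-o", "encryption=on",
        "-o", "keyformat=passphrase",  # assumption: passphrase-based key
        f"{pool}/{dataset}",
    ]
    return [create_pool, create_dataset]


# Placeholder names: replace them with your own pool, disks and dataset.
commands = mirror_pool_commands(
    "tank",
    ["/dev/disk/by-id/DISK_A", "/dev/disk/by-id/DISK_B"],
    "documents",
)
for cmd in commands:
    print(shlex.join(cmd))
```

Printing the commands first is a cheap way to double check the disk paths before creating a pool on the wrong drives.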
I moved a bunch of files to the zfs mirror and the speed was as expected, decent for my use case.
The problems arose (all my fault)
For starters:
- LOTS of checksum errors on the pool, 400+ on each drive, meaning a scrub could very likely leave some files corrupted. And ultimately it happened: some config files were turned from normal config files into folders, and many docker.yml files disappeared…
- Luckily I follow everyone’s advice on the internet and I have multiple backups on different systems and on the cloud (encrypted). People, zfs is not a backup, just like raid isn’t either. Have proper backups.
A zpool status gave me a “too many errors” on the checksums. I looked it up to see what could be the problem, and what I found made sense…
Possible issues:
- The controller (the usb enclosure in this case).
- Usb ports on the host.
- Usb cable is defective.
- Power supply is not receiving good power.
- Ram is defective (mine is non ECC, I know…)
- Can’t install ECC ram on my “server”. :(
Troubleshooting
I troubleshot for several days and tried everything: running Memtest86+ overnight, plugging the enclosure’s power supply into a different socket, and swapping the included usb cable for a different one.
Nothing fixed the issue, and I destroyed and rebuilt the array multiple times and ran many scrubs.
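When you are swapping cables and re-running scrubs over and over, it helps to pull the checksum counters out of the status text automatically. Below is a rough Python sketch that parses the device table of a zpool status style output. The sample text, the device names and the numbers are made up for illustration, and the parser assumes plain integer counters in the fifth column.

```python
SAMPLE = """\
  pool: tank
 state: DEGRADED
config:

        NAME        STATE     READ WRITE CKSUM
        tank        DEGRADED     0     0     0
          mirror-0  DEGRADED     0     0     0
            sda     DEGRADED     0     0   412  too many errors
            sdb     DEGRADED     0     0   407  too many errors
"""


def checksum_errors(status_text: str) -> dict[str, int]:
    """Map each row of the device table to its CKSUM counter."""
    counts = {}
    in_table = False
    for line in status_text.splitlines():
        parts = line.split()
        if not in_table:
            # The table starts right after the NAME / STATE header row.
            in_table = parts[:2] == ["NAME", "STATE"]
            continue
        if not parts:
            break  # a blank line ends the table
        # Columns: NAME STATE READ WRITE CKSUM [notes...]
        if len(parts) >= 5 and parts[4].isdigit():
            counts[parts[0]] = int(parts[4])
    return counts


errors = checksum_errors(SAMPLE)
bad = {name: count for name, count in errors.items() if count > 0}
print(bad)  # {'sda': 412, 'sdb': 407}
```

On a real system the text would come from the actual command instead of `SAMPLE`, for example via `subprocess.run(["zpool", "status", "tank"], capture_output=True, text=True).stdout`.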
After all of that troubleshooting I decided to return the usb enclosure to Amazon, since all the clues led me to believe that the problem was caused by it (remember, it appears to have already been returned).
In the end…
I will make a new post about the subject once I get the new usb enclosure from Amazon.
- I even have a name for the next post: “My second experience using zfs”. I am so creative, I know.
Want to reach out?
Contact me on Mastodon: @cosipa@social.linux.pizza