Description
First time posting to GitHub, be gentle. :)
System information
Type | Version/Name |
---|---|
Distribution Name | Ubuntu |
Distribution Version | 16.04 LTS / 16.10 on USB |
Linux Kernel | Will post soon, current in LTS / 4.8.0-22 on USB |
Architecture | Intel Xeon |
ZFS Version | Will post soon, current in LTS / zfsutils-linux 0.6.5.8 on USB |
SPL Version | Will post soon, current in LTS / ? |
RAM | 96GB of ECC RAM |
Other Config Information
- 2 mirrored vdevs for a total of 10TB usable.
- L2ARC, metadata only.
- Dedup and compression are enabled. (Yeah, I know… but I have 96GB of RAM; that should be plenty.)
- About 11TB of data and about 2TB free (after dedup and compression).
- Just a handful of apps and users. No major load on the system (it just hosts files).
- Exact details to follow once I can get back in.
- Will post these as soon as I can:
  - `modinfo zfs | grep -iw version`
  - `modinfo spl | grep -iw version`
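
For what it's worth, here is a rough back-of-envelope check on my "96GB should be plenty" assumption for dedup. It is only a sketch: the ~320 bytes of RAM per dedup table (DDT) entry is a commonly quoted rule of thumb, not something I measured, and the average block sizes (128K and 16K) are guesses, as is treating every block as unique. The function name `ddt_estimate_mib` is just made up for this example.

```shell
#!/bin/sh
# Rough DDT (dedup table) RAM estimate.
# Assumptions (NOT measured on this pool):
#   - ~320 bytes of RAM per DDT entry (common rule of thumb)
#   - every block is unique and has the given average size

# ddt_estimate_mib DATA_TIB AVG_BLOCK_KIB -> estimated DDT size in MiB
ddt_estimate_mib() {
    data_tib=$1
    block_kib=$2
    bytes_per_entry=320
    # Number of blocks = total data (converted to KiB) / average block size in KiB
    blocks=$(( data_tib * 1024 * 1024 * 1024 / block_kib ))
    # Total bytes for all entries, converted to MiB
    echo $(( blocks * bytes_per_entry / 1024 / 1024 ))
}

# 11 TiB of data at an assumed 128K average block size
ddt_estimate_mib 11 128
# Same data at an assumed 16K average block size
ddt_estimate_mib 11 16
```

Under these assumptions the result swings a lot with the average block size, so the actual DDT histogram from the pool is the number to trust once I can get it.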
Trigger
Delete a large file (1TB+).
Issue
The system slowly consumes all memory over the course of several hours (about 12) and then hard locks. This happens both after the delete and while importing the zpool on reboot.
This has happened before. That time I added a 32GB swap file (on SSD), which seemed to help. It eventually cleared up after several reboot attempts (it took about two weeks, at 12 hours a pop). I assumed that the delete was making progress but that something was preventing the memory from being released. So eventually...
This time I booted off a live USB and installed zfs-utils. I was surprised that not only did it attempt to mount the zpool right away (while still in apt), but after about an hour it succeeded!
I thought “Cool, it cleared!” and rebooted. No go: 12 hours later it was out of memory and locked up (still at the boot screen, showing an out-of-memory error).
All right, I booted back into the USB stick. Again it hung for about an hour, then booted! “Alright, that’s odd.”
At this point I noticed that the mount point for the tank was already taken and I could not access the volume. So I exported the zpool, which took a bit but completed, then moved the folder and re-mounted. I watched the memory slowly climb and the system lock up after 12 hours.
I moved the USB boot to another system and removed ZFS. I now have the box booted again, have re-blocked the mount point, and have just re-installed ZFS. I am waiting for the mount to complete, and I am hoping it will finish in an hour or so.
FYI, I will be on vacation for several days and unable to access the server after tomorrow.
What else should I grab, given that I am limited in what I can get right now? Is this a known issue? Should I go to a newer build?
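
In case it helps narrow down what is useful, this is a minimal sketch of what I could capture into a single log file before I leave. It assumes the pool is named `tank` (adjust if not), and the helper names `run_section` and `collect_diag` are made up for this example. Anything that is not installed is skipped rather than failing. I am not sure the `zpool` commands will return while the import is still running, so they go last.

```shell
#!/bin/sh
# Sketch: append each diagnostic to one log file, skipping anything missing.
# The pool name defaults to "tank", which is an assumption.

# run_section LOGFILE COMMAND [ARGS...]
run_section() {
    log=$1
    shift
    printf '=== %s ===\n' "$*" >> "$log"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" >> "$log" 2>&1 || printf '(command failed)\n' >> "$log"
    else
        printf '(not available: %s)\n' "$1" >> "$log"
    fi
}

# collect_diag LOGFILE [POOL]
collect_diag() {
    log=$1
    pool=${2:-tank}
    : > "$log"                                   # start with an empty log
    run_section "$log" uname -r                  # kernel version
    run_section "$log" modinfo zfs               # full module info, includes version
    run_section "$log" modinfo spl
    run_section "$log" cat /proc/meminfo         # overall memory usage
    run_section "$log" cat /proc/spl/kstat/zfs/arcstats   # ARC statistics
    run_section "$log" zpool status -D "$pool"   # pool status with dedup stats
    run_section "$log" zpool get freeing "$pool" # space still pending release
}
```

Usage would be something like `collect_diag /tmp/zfs-diag.log tank`, repeated every so often while the memory climbs so the snapshots can be compared.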
I have looked at several other open and closed issues including:
#3725
#5706
#5449
#3976
#5923