Auto-commit from giteapush.sh at 2025-05-08 15:44:43

DocTator 2025-05-08 15:44:43 -04:00
parent 21d02c69cc
commit 5d231afbf8

@ -1,8 +1,8 @@
# Postmortem: Genesis I/O Realignment

**Date:** May 8, 2025
**Author:** Doc
**Systems Involved:** minioraid5, shredder, chatwithus.live, zcluster.technodrome1/2, thevault
**Scope:** Local-first mirroring, permission normalization, MinIO transition

---
@ -11,73 +11,90 @@
To realign the Genesis file flow architecture by:

* Making local block storage the **primary source** of truth for AzuraCast and Genesis buckets
* Transitioning FTP uploads to target local storage instead of MinIO directly
* Establishing **two-way mirroring** between local paths and MinIO buckets
* Correcting inherited permission issues across `/mnt/raid5` using `find + chmod`
* Preserving MinIO buckets as **backup mirrors**, not primary data stores
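
The `find + chmod` normalization mentioned in the list above could look roughly like the sketch below. The function name `normalize_perms` and the modes (755 for directories, 644 for files) are assumptions for illustration; the postmortem does not record which modes were actually applied to `/mnt/raid5`.

```bash
#!/usr/bin/env bash
# Sketch: normalize permissions under a tree with find + chmod.
# ASSUMPTION: 755 for directories and 644 for files are illustrative defaults;
# the modes really used on /mnt/raid5 are not stated in this postmortem.
set -euo pipefail

normalize_perms() {
    local root="$1" dir_mode="${2:-755}" file_mode="${3:-644}"
    # '-exec ... {} +' batches many paths into each chmod invocation,
    # which is much cheaper than spawning one chmod per file.
    find "$root" -type d -exec chmod "$dir_mode" {} +
    find "$root" -type f -exec chmod "$file_mode" {} +
}

# Hypothetical invocation on the affected volume:
#   normalize_perms /mnt/raid5
```

Handling directories and files in separate passes keeps the execute bit off regular files while leaving directories traversable.
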

---

## 🔧 Work Performed

### ✅ Infrastructure changes:

* Deployed block storage volume to Linode Mastodon instance
* Mirrored MinIO buckets (`genesisassets`, `genesislibrary`, `azuracast`) to local paths
* Configured cron-based `mc mirror` jobs:
  * Local ➜ MinIO: every 5 minutes with `--overwrite --remove`
  * MinIO ➜ Local: nightly pull, no `--remove`
* Prepared 5TB local drive for AzuraCast asset mirroring (pending full sync)
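
As a rough illustration of the two cron-based `mc mirror` jobs listed above, the sketch below prints a pair of crontab entries for one bucket. The `minio` alias, the local path, the 02:30 nightly time, the lock-file location, and the helper name `mirror_cron_lines` are all hypothetical placeholders, and the use of `flock` to avoid overlapping runs is a suggestion rather than something the original setup is known to do.

```bash
#!/usr/bin/env bash
# Sketch: emit crontab entries for the push (every 5 min) and pull (nightly) jobs.
# ASSUMPTIONS: alias "minio", the local path, the 02:30 schedule and the
# lock file under /tmp are placeholders, not values from the real deployment.
set -euo pipefail

mirror_cron_lines() {
    local local_dir="$1" remote="$2"
    local lock="/tmp/mirror-$(basename "$local_dir").lock"
    # Local -> MinIO every 5 minutes. --remove deletes remote objects that no
    # longer exist locally; flock -n skips the run if the previous one is active.
    printf '*/5 * * * * flock -n %s mc mirror --overwrite --remove %s %s\n' \
        "$lock" "$local_dir" "$remote"
    # MinIO -> Local nightly, without --remove so nothing local is deleted.
    # Whether the real pull job also passes --overwrite is not stated.
    printf '30 2 * * * flock -n %s mc mirror %s %s\n' \
        "$lock" "$remote" "$local_dir"
}

mirror_cron_lines /path/to/local/genesisassets minio/genesisassets
```

Sharing one lock file between the push and pull entries means the two directions never run against the same bucket at the same time.
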

### ✅ FTP Pipeline Adjustments:

* Users now upload to `/mnt/spl/ftp/uploads` (local)
* Permissions set so only admins access full `/mnt/spl/ftp`
* FTP directory structure created for SPL automation

### ✅ System Tuning:

* Set `vm.swappiness=10` on all nodes
* Apache disabled where not in use
* Daily health checks via `pull_health_everywhere.sh`
* Krang Telegram alerts deployed for cleanup and system state
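
For the `vm.swappiness=10` setting, one common way to make the value survive reboots is a sysctl drop-in file. The sketch below assumes that approach; the helper name `write_swappiness_conf` and the file name `99-swappiness.conf` are hypothetical, and the postmortem does not say how the value was actually persisted.

```bash
#!/usr/bin/env bash
# Sketch: persist vm.swappiness through a sysctl drop-in.
# ASSUMPTION: the drop-in name 99-swappiness.conf is a placeholder.
set -euo pipefail

write_swappiness_conf() {
    local value="$1" conf_dir="${2:-/etc/sysctl.d}"
    printf 'vm.swappiness = %s\n' "$value" > "$conf_dir/99-swappiness.conf"
}

# On a real node (as root):
#   write_swappiness_conf 10
#   sysctl --system                 # reload all sysctl configuration files
#   cat /proc/sys/vm/swappiness     # confirm the running value
```
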

---

## 🧠 Observations

* **High load** on `minioraid5` while `mc mirror` and `chmod` ran at the same time
  * Load \~6.5 due to concurrent I/O pressure
  * `chmod` stuck in `D` state (uninterruptible sleep, waiting on I/O) while `mc` dominated disk queues
  * Resolved once `mc` completed; `chmod` then resumed and finished
* **MinIO buckets were temporarily inaccessible** due to permissions accidentally inherited by FTP group
  * Resolved by recursively resetting permissions on `/mnt/raid5`
* **Krang telemetry** verified:
  * Mastodon swap usage rising under asset load
  * All nodes had Apache disabled or dormant
  * Health alerts triggered on high swap or load
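
To spot the kind of `D`-state stall described in the first observation, a quick check of process states alongside the load average is usually enough. The sketch below is one way to do that with standard `ps` and `awk`; the function names are hypothetical and this is not necessarily how the stall was diagnosed at the time.

```bash
#!/usr/bin/env bash
# Sketch: list processes in uninterruptible sleep (state D), the state
# chmod was observed in while mc saturated the disks.
set -euo pipefail

# Keep only rows whose second column (process state) starts with D.
filter_d_state() {
    awk '$2 ~ /^D/'
}

# Columns: PID, state, command name. The trailing '=' suppresses headers.
list_d_state() {
    ps -eo pid=,stat=,comm= | filter_d_state
}

# Typical use on a loaded node:
#   cat /proc/loadavg    # 1, 5 and 15 minute load averages
#   list_d_state         # processes currently blocked on I/O
```
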

---

## ✅ Outcome

* Full Genesis and AzuraCast data now reside locally with resilient S3 mirrors
* Mastodon running on block storage, no longer dependent on MinIO latency
* FTP integration with SPL directory trees complete
* Cleanup script successfully deployed across all nodes via Krang
* Daily health reports operational with alerts for high swap/load

---

## 🔁 Recommendations

* Proceed with AzuraCast mirror only after:
  * Mastodon asset storage transition is confirmed stable
  * All `/mnt/raid5` permission fixes are complete
* Consider adding snapshot-based ZFS backups for `/mnt/raid5`
* Build `verify_mirror.sh` to detect drift between MinIO and local storage
* Auto-trigger `chmod` only after `mc mirror` finishes
* Monitor long-running background jobs with Krang watchdogs
* Finalize and launch AzuraCast 5TB mirror sync
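
A possible starting point for the proposed `verify_mirror.sh` is sketched below, built on `mc diff`. It assumes `mc diff` prints one line per differing object and nothing when both sides match; the function name, the `MC` override variable, and the example paths are hypothetical, and the script itself does not exist yet.

```bash
#!/usr/bin/env bash
# Sketch of verify_mirror.sh: report drift between a local path and a bucket.
# ASSUMPTIONS: 'mc diff' prints nothing when both sides are in sync; the
# alias and paths in the usage comment are placeholders.
set -euo pipefail

MC="${MC:-mc}"   # overridable so the logic can be exercised with a stub

verify_mirror() {
    local local_dir="$1" remote="$2" drift
    # 'mc diff' lists objects that differ between the two sides.
    drift="$("$MC" diff "$local_dir" "$remote")"
    if [ -n "$drift" ]; then
        printf 'DRIFT between %s and %s:\n%s\n' "$local_dir" "$remote" "$drift"
        return 1
    fi
    printf 'OK: %s and %s match\n' "$local_dir" "$remote"
}

# Hypothetical use from cron or a Krang health check:
#   verify_mirror /path/to/local/genesisassets minio/genesisassets \
#       || echo "mirror drift detected"
```

Returning a non-zero status on drift lets the same function feed the existing alerting without parsing its output.
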

---

**Signed,**
Doc
Genesis Hosting Technologies