Auto-commit from giteapush.sh at 2025-04-30 08:55:37
This commit is contained in:
parent
c1127be692
commit
b40dd0828a
77
postmortem/genesisradiozfsmigration.md
Normal file
77
postmortem/genesisradiozfsmigration.md
Normal file
@ -0,0 +1,77 @@
|
||||
from datetime import datetime
|
||||
|
||||
# Define the content of the post-mortem
|
||||
post_mortem_content = f"""# 🔧 Post-Mortem: Genesis Radio Storage Migration
|
||||
**Date:** April 30, 2025
|
||||
**Prepared by:** Doc
|
||||
**Systems Affected:** StationPlaylist (SPL), Voice Tracker, Genesis Radio media backend
|
||||
|
||||
---
|
||||
|
||||
## 🧠 Executive Summary
|
||||
|
||||
Genesis Radio’s backend was migrated from a legacy MinIO instance using local disk (ext4) to a new **ZFS-based, encrypted MinIO deployment on `shredderv2`**. This change was driven by a need for more stable performance, improved security, and a cleaner storage architecture with proper bucket separation.
|
||||
|
||||
This migration was completed **without touching production** until final validation, and all critical services remained online throughout the transition. We also revamped the rclone caching strategy to reduce freeze-ups and playback hiccups.
|
||||
|
||||
---
|
||||
|
||||
## ✅ What We Did
|
||||
|
||||
- Created **three new secure buckets**: `genesislibrary-secure`, `genesisassets-secure`, and `genesisshows-secure`
|
||||
- Migrated data from backup server using `rclone sync`:
|
||||
- `genesislibrary` came directly from backup
|
||||
- `genesisassets` and `genesisshows` were pulled from the same bucket, with de-duping and cleanup to be completed post-migration
|
||||
- Retained **original SPL drive letters** (`Q:\\`, `R:\\`) to avoid changes to the playout config
|
||||
- Switched rclone mounts to point to the new secure buckets, with **aggressive VFS caching** using SSD-backed cache directories
|
||||
- Took a clean **ZFS snapshot** (`@pre-s3-switch`) before switching over
|
||||
- Confirmed no regression in SPL, VT Tracker, or streaming audio
|
||||
|
||||
---
|
||||
|
||||
## ⚙️ Technical Improvements
|
||||
|
||||
- **VFS caching overhaul**:
|
||||
- Increased read-ahead (`1G`), lowered write-back wait
|
||||
- Split cache between `X:\\librarycache` and `L:\\assetcache`
|
||||
- No more rclone choking on large files or freezing during transitions
|
||||
- **Encrypted S3 storage** with isolated buckets per functional role
|
||||
- **TLS-secured** Console and MinIO endpoints with automated renewal
|
||||
- Mounted buckets at startup via batch script (future systemd equivalents to be implemented)
|
||||
- Snapshot-based rollback in ZFS enabled post-deployment resilience
|
||||
|
||||
---
|
||||
|
||||
## 🩹 What Went Weird (and We Fixed It)
|
||||
|
||||
- SPL froze during initial `mc mirror` attempts — solution: switched to `rclone`, which performed exponentially faster
|
||||
- Some hiccups during early cache tuning, including sparse file support issues — solved by switching to ZFS
|
||||
- Missing media files in Mastodon were traced to uploads during sync; resolved with staged sync + retry before final switch
|
||||
- Certbot automation wasn’t configured — resolved with a systemd timer that stops nginx, renews, and restarts nginx automatically
|
||||
|
||||
---
|
||||
|
||||
## 🧯 What We Learned
|
||||
|
||||
- MinIO is solid, but **rclone wins for bulk sync performance**
|
||||
- VFS cache settings **make or break** media-heavy workloads like SPL
|
||||
- ZFS is a game-changer: no sparse file errors, reliable snapshots, clean rollback
|
||||
- Planning matters: pre-syncing from backup avoided downtime
|
||||
- Not touching prod until ready keeps stress and screwups to a minimum
|
||||
|
||||
---
|
||||
|
||||
## 📦 Next Steps
|
||||
|
||||
- [ ] Clean `genesisassets-secure` of misplaced show files
|
||||
- [ ] Sync `azuracast` from live system (no backup copy yet)
|
||||
- [ ] Build automated snapshot send-to-backup workflow (`zfs send | ssh backup zfs recv`)
|
||||
- [ ] Stage full failover simulation (optional but fun)
|
||||
"""
|
||||
|
||||
# Save it as a Markdown file
|
||||
file_path = "/mnt/data/genesis_radio_migration_postmortem.md"
|
||||
with open(file_path, "w") as f:
|
||||
f.write(post_mortem_content)
|
||||
|
||||
file_path
|
Loading…
x
Reference in New Issue
Block a user