feat(#15): add PM2 ecosystem config with restart_delay and port monitoring
Problem
PM2 crash-restarts of viewerv2-backend fail with EADDRINUSE :8998 because the OS hasn't released the port from TIME_WAIT before PM2 re-binds it.
Task / Link
GitHub Issue: AgentSDE/meridian-backend#15
Changes
- Added `ecosystem.config.cjs` with `restart_delay: 3000`, `max_restarts: 5`, `min_uptime: 5000` to give the OS time to release the port before PM2 restarts the process
- Added `scripts/check-port.sh` that polls `localhost:8998/health` for up to 30 seconds post-restart and exits 1 if still unreachable (alertable in CI)
- Updated `deploy.yml` "Restart service" step to use `pm2 startOrRestart ecosystem.config.cjs --env production` so config is always applied on deploy
- Added "Post-restart port monitoring" step in `deploy.yml` that runs `check-port.sh`
Notes
- The 3s `restart_delay` adds ~3s to recovery time, an acceptable tradeoff vs. indefinite EADDRINUSE crash loops
- `pm2 startOrRestart` requires `ecosystem.config.cjs` to be present on the VPS; the deploy workflow pulls latest master before restarting, so this is always satisfied
Testing
- 651 unit tests pass
- Lint passes (0 errors)