Railway - Observed latency spikes on dashboard. Looking into it – Incident details

Observed latency spikes on dashboard. Looking into it

Resolved
Degraded performance
Started 10 months ago
Lasted about 1 hour

Affected

Dashboard

Degraded performance from 3:08 AM to 4:28 AM

Updates
  • Resolved
    Resolved

    This incident has been resolved.

  • Monitoring
    Update

    The issue seems to have abated. During our investigation, we saw timeouts to GitHub and a few other external services. Because Cloud NAT from Google is the only egress product we use, we've escalated this to Google and are awaiting a response.

  • Monitoring
    Monitoring

    Moving back to monitoring as we're observing it again.

  • Resolved
    Resolved

    We've identified the issue as a bottleneck in our service mesh under heavier-than-average load which causes higher-than-normal latency.

  • Monitoring
    Monitoring

    Recovered. The initial cause appears to be a flood of deployments with lots of variables, which put temporary lag on the database that then propagated to the dashboard. We will keep monitoring as we continue to dig in.

  • Investigating
    Investigating

    Observed latency spikes on dashboard. Looking into it.