    Respond to DB health in background migrations · 91b752dc
    Yorick Peterse authored
    This changes the BackgroundMigration worker so it checks for the health
    of the DB before performing a background migration. This in turn allows
    us to reduce the minimum interval, without having to worry about blowing
    things up if we schedule too many migrations.
    
    In this setup, the BackgroundMigration worker will reschedule jobs as
    long as the database is considered to be in an unhealthy state. Once the
    database has recovered, the migration can be performed.
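
    The rescheduling behaviour described above can be sketched roughly as
    follows. This is a minimal illustration, not the actual worker: the
    class name, the 60 second delay, and the injected health check are all
    assumptions made for the example, and the in-memory arrays stand in for
    a real job queue.

```ruby
# Hypothetical stand-in for the BackgroundMigration worker; the names and the
# delay below are illustrative assumptions, not the real implementation.
class BackgroundMigrationSketch
  RESCHEDULE_DELAY = 60 # seconds; assumed value

  attr_reader :performed, :rescheduled

  # health_check: any callable that returns true when the database is healthy.
  def initialize(health_check)
    @health_check = health_check
    @performed = []
    @rescheduled = []
  end

  def perform(class_name, arguments = [])
    if @health_check.call
      # Database is healthy: run the migration (recorded here, not executed).
      @performed << [class_name, arguments]
    else
      # Database is unhealthy: put the job back for a later attempt. A real
      # worker would re-enqueue itself instead of appending to an array.
      @rescheduled << [RESCHEDULE_DELAY, class_name, arguments]
    end
  end
end
```

    A job that arrives while the check fails is only recorded as
    rescheduled; the same job is performed once the check passes again.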
    
    To determine if the database is in a healthy state, we look at the
    replication lag of any replication slots defined on the primary. If the
    lag is deemed too great (100 MB by default) for too many slots, the
    migration is rescheduled for a later point in time.
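
    As a rough sketch of this check, the following counts the slots whose
    lag exceeds the limit. The method name, the max_lagging parameter, and
    its default of 0 are assumptions for illustration; the message does not
    say how many slots count as "too many". Treating 100 MB as
    100 * 1024 * 1024 bytes is likewise an assumption.

```ruby
# 100 MB default limit named above, assumed here to mean 1024-based megabytes.
MAX_LAG_BYTES = 100 * 1024 * 1024

# lags:        replication lag in bytes, one entry per replication slot. On
#              PostgreSQL 10+ such values could be obtained with, for example,
#              pg_wal_lsn_diff(pg_current_wal_insert_lsn(), restart_lsn)
#              over pg_replication_slots.
# max_lag:     per-slot limit in bytes.
# max_lagging: hypothetical parameter; how many slots may exceed the limit
#              before the database is considered unhealthy.
def lag_too_great?(lags, max_lag: MAX_LAG_BYTES, max_lagging: 0)
  lags.count { |lag| lag > max_lag } > max_lagging
end
```

    With no replication slots defined the check never reports a problem,
    so migrations proceed as before.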
    
    The health checking code is hidden behind a feature flag, allowing us to
    disable it if necessary.