Runbook name | Sidekiq Redis Migration |
|---|---|
Runbook description | Sidekiq was always meant to operate in its own Redis db (not necessarily its own instance, but definitely its own db). Currently, it shares a db with Rails caching and the Rails algorithm locks. A few Sidekiq tasks scan the entire db keyspace, which does not scale as we continue to onboard customers and leads to errors like this one: https://sentry.io/organizations/swipesense/issues/1796500526/?project=219898&query=is%3Aignored This playbook documents the steps for migrating the current Redis queues to a new db. |
Owner | |
Version | 1.0 |
Version date |
|
🏛 Architecture
💬 Support contacts
Expertise level | Team | Team lead | Contact info |
|---|---|---|---|
Level 1 | Engineering | ||
Level 2 | Engineering |
🎽 Runs
Name | State | Start time | Completed time | Duration |
|---|---|---|---|---|
Staging | DONE | 2021-06-16 11:01:42 -0600 | 2021-06-16 13:08:42 -0600 | 2h 7m |
Production | TODO |
🎢 Process
Step instructions | |
|---|---|
| 1 | Stop the Kubernetes rails-server-worker deployment from autoscaling. With the Kubernetes dashboard reachable on localhost:8001 (the address served by `kubectl proxy`), go to: http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/#/deployment/default/rails-server-worker?namespace=default and scroll down to “Horizontal Pod Autoscalers”. Click the 3 vertical dots to the right of rails-server-worker and click edit. Change minReplicas AND maxReplicas to the value of currentReplicas. This ensures that pods are not scaled up or down while the queue is paused. |
| 2 | Ship the new schedulers: https://github.com/swipesense/visit_algorithm_scheduler/pull/22 and https://github.com/swipesense/contact_tracing_service/pull/18 Wait for both to fully deploy. From then on, new algorithm jobs are submitted to the new Redis db; for now, they will sit there with nothing to process them. |
| 3 | Go to the sidekiq dashboard in admin. Go to the “Busy” tab. Click “Quiet All”. Wait for all current workers to have the “quiet” tag in the process list. At this point, no jobs are being processed. |
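| 4 | Migrate keys to the new db. You will need to be on the VPN, and you must replace REDIS_HOST in the commands below with the proper Redis host. The first command moves every sidekiq:* key except the historical stat keys (sidekiq:stat:...). There are thousands of those; they are only used for the Sidekiq dashboard charts and are non-critical. sidekiq:stat:failed and sidekiq:stat:processed are then moved explicitly. The stat keys are filtered out with grep because a Redis glob class such as [^stat] excludes single characters rather than the word "stat", so it would also skip any key starting with s, t, or a after the prefix (for example a key like sidekiq:schedule).
redis-cli -h REDIS_HOST --scan --pattern 'sidekiq:*' | grep -v '^sidekiq:stat:' | xargs -I {} redis-cli -h REDIS_HOST MOVE {} 1
redis-cli -h REDIS_HOST MOVE sidekiq:stat:failed 1
redis-cli -h REDIS_HOST MOVE sidekiq:stat:processed 1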
|
| 5 | Ship the updated admin code: https://github.com/swipesense/admin_app/pull/2273 When it is deployed, the sidekiq instances will restart automatically using the new queue. This will also automatically re-enable horizontal pod autoscaling in kubernetes by reapplying the original autoscaler values. |
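Before shipping the admin code in step 5, it may be worth confirming that step 4 left no non-stat Sidekiq keys behind in the old db. The following is a minimal sketch, not part of the original procedure. It assumes the old db is db 0 (the redis-cli default, since step 4 passes no db flag) and the new db is db 1 (the target of the MOVE commands). The function names are hypothetical, and REDIS_HOST is the same placeholder used in step 4.

```shell
#!/usr/bin/env bash
# Sketch only: check that the step 4 migration is complete.
# Assumptions: source db is 0, target db is 1, REDIS_HOST is set by the caller.

# Drop historical stat keys (sidekiq:stat:...) from a key list read on stdin.
non_stat_keys() {
  # grep exits 1 when no line survives the filter; that is not an error here.
  grep -v '^sidekiq:stat:' || true
}

# List the sidekiq:* keys in the given db using incremental SCAN.
sidekiq_keys_in_db() {
  local db="$1"
  redis-cli -h "${REDIS_HOST}" -n "${db}" --scan --pattern 'sidekiq:*'
}

# Print the number of non-stat Sidekiq keys still in the source db.
# A result of 0 suggests every key that step 4 targets has been moved.
remaining_in_source() {
  sidekiq_keys_in_db 0 | non_stat_keys | wc -l | tr -d ' '
}
```

After sourcing the file with REDIS_HOST exported, run `remaining_in_source`. If it prints anything other than 0, `sidekiq_keys_in_db 0 | non_stat_keys` lists the keys that were left behind. Those can then be moved by hand with the same MOVE command as in step 4.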