Locked Shared records with missing Gitlab-Runner

I ran into this problem during my first CI/CD functional testing using our GitLab environment. At some point, the system was updated and the AutoDevops configuration for every repository was enabled. It so happens that we were also testing the Kubernetes integration, and created a shared runner to deal with Docker container deployment in our OpenShift cluster. Fast-Forward pass the failures and bad house cleaning and what you find today is approximately 100,000 records of failed automatic deployments headed for a shared runner that doesn’t exist.

image

I didn’t find a way to auto-select and clear the records and since the runner was deleted some time ago, my only remediation was to select the delete option to the side of each record. Even if I did have time for 10,000 mouse clicks, I refuse to do that on principle.

Note: Do whatever safeguard method you prefer before trying this; snapshots of the virtual machine, application database backups, etc… This is just how I fixed my problem and in no way a guarantee to not break something else.

After some research, in which, I looked for rake tools, migration files, etc, I found that there was a way to get into the Database console.

$ gitlab-rails dbconsole

Inserting this command at the root prompt on your Gitlab instance will drop you into the database. From here, you are basically working in the Gitlab database, so be careful. After getting into the database, I took a look at the tables, using \d, which is standard commands for my chosen PostgreSQL back end.

gitlabhq_production=> \d


                                  List of relations
 Schema |                          Name                          |   Type   | Owner
--------+--------------------------------------------------------+----------+--------
 public | abuse_reports                                          | table    | gitlab
 public | abuse_reports_id_seq                                   | sequence | gitlab
 public | appearances                                            | table    | gitlab
 public | appearances_id_seq                                     | sequence | gitlab
 public | application_setting_terms                              | table    | gitlab
 public | application_setting_terms_id_seq                       | sequence | gitlab
 public | application_settings                                   | table    | gitlab
 public | application_settings_id_seq                            | sequence | gitlab
 public | approval_merge_request_rule_sources                    | table    | gitlab
 public | approval_merge_request_rule_sources_id_seq             | sequence | gitlab
 public | approval_merge_request_rules                           | table    | gitlab
 public | approval_merge_request_rules_approved_approvers        | table    | gitlab
 public | approval_merge_request_rules_approved_approvers_id_seq | sequence | gitlab
 public | approval_merge_request_rules_groups                    | table    | gitlab
 public | approval_merge_request_rules_groups_id_seq             | sequence | gitlab
 public | approval_merge_request_rules_id_seq                    | sequence | gitlab
 public | approval_merge_request_rules_users                     | table    | gitlab

.....

From there I found a table that was likely relevant to my view; ci_runners, so I ran a simple all records select to verify that the records in the view matched the information in the database table.

gitlabhq_production=> select * from ci_runners;

Bingo!!! Now that I had the table, I wanted to make sure that removing the records didn’t impact the application negatively, so I took a look at all the reference keys contained within that ci_runners.

gitlabhq_production=> \d ci_runners;
Indexes:
    "ci_runners_pkey" PRIMARY KEY, btree (id)
    "index_ci_runners_on_contacted_at" btree (contacted_at)
    "index_ci_runners_on_is_shared" btree (is_shared)
    "index_ci_runners_on_locked" btree (locked)
    "index_ci_runners_on_runner_type" btree (runner_type)
    "index_ci_runners_on_token" btree (token)
    "index_ci_runners_on_token_encrypted" btree (token_encrypted)
Referenced by:
    TABLE "clusters_applications_runners" CONSTRAINT "fk_02de2ded36" FOREIGN KEY (runne
r_id) REFERENCES ci_runners(id) ON DELETE SET NULL
    TABLE "ci_runner_namespaces" CONSTRAINT "fk_rails_8767676b7a" FOREIGN KEY (runner_i
d) REFERENCES ci_runners(id) ON DELETE CASCADE

In my case, there was no associated data, so with my snapshot running, I deleted the rows.

gitlabhq_production=> select * from clusters_applications_runners;

gitlabhq_production=> select * from ci_runner_namespaces;

gitlabhq_production=> delete from ci_runners where ip_address = 'XXX.XXX.XXX.XXX';
DELETE 94514

gitlabhq_production=> \q

This cleared up the view and at the point of this post, after deploying a new runner and running some dedicated pipeline jobs, I have seen no negative impacts. This may not be the proper way to approach this solution, but it definitely saved me from clicking a button 94,514 times.

Published by

Carl

In this order, Father, Husband, Technology Professional. I love football (aka Soccer).

Leave a Reply

Your email address will not be published. Required fields are marked *