Less is more.

When it comes to pipelines, velocity is key, though only one of many. Everything should be as lean as possible, which saves compute time and money. Optimizations can be done in a lot of ways, but one is very simple and, after reading this great Medium blog post [1], very obvious, so it is worth implementing immediately. Combined with Bitbucket's clone feature, described in their blog [2], it is easy to integrate into a pipeline.

This website is deployed through a Bitbucket pipeline. A Bitbucket pipeline is defined in a bitbucket-pipelines.yml file, in which each stage of the pipeline is described by a so-called step. By default, each step clones the repository. Cloning the whole repository is often unnecessary in a pipeline. For example, the unit tests are run against the latest commit, so the repository's full history does not have to be available.

The clone feature makes it possible to switch off this default by changing enabled from true to false. The git command then has to be written manually into the script, but the extra effort is limited to a single line. It is a git clone command with some modifications regarding the depth. Setting the depth to 1 clones only the latest commit, nothing more. Only a single branch is needed as well. A checkout of HEAD is omitted too, as it was not needed in this case. Note that --no-checkout leaves the working directory without files, so drop the flag if later commands need the sources.

              
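              - step:
                  name: 'Build and Test'
                  # Switch off the default clone; enabled takes a boolean, not a string.
                  clone:
                    enabled: false
                  script:
                    # Manual shallow clone: latest commit of master only, no checkout of HEAD.
                    - git clone --depth 1 --single-branch -b master --no-checkout git@bitbucket.org:senorlowbob/repo-name.git .
                    - echo "Your build and test goes here..."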
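The effect of these flags can be checked locally before touching the pipeline. The following is a minimal sketch, assuming git is installed; the temporary directory, the demo identity and the three empty commits are hypothetical stand-ins for a real repository. It clones a small local repository with the same flags and counts the commits that arrive.

```shell
#!/bin/sh
set -eu

# Throwaway workspace; every path and name below is hypothetical.
work=$(mktemp -d)
src="$work/source"
dst="$work/shallow"

# Build a small source repository with three commits on master.
git init -q "$src"
git -C "$src" symbolic-ref HEAD refs/heads/master
for n in 1 2 3; do
  git -C "$src" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "commit $n"
done

# Same flags as in the pipeline step. The file:// URL makes git use its
# regular transport, so --depth is honoured for a local source as well.
git clone -q --depth 1 --single-branch -b master --no-checkout "file://$src" "$dst"

# Compare the history available in the source and in the shallow clone.
full_count=$(git -C "$src" rev-list --count HEAD)
shallow_count=$(git -C "$dst" rev-list --count HEAD)
is_shallow=$(git -C "$dst" rev-parse --is-shallow-repository)

echo "commits in source: $full_count"
echo "commits in clone:  $shallow_count"
echo "shallow clone:     $is_shallow"

rm -rf "$work"
```

The source repository holds three commits, while the clone should report a single commit and identify itself as shallow.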

One problem that occurs after switching the step to this manual clone is that the pipeline's host has no permission to access the repository. Bitbucket offers a way around this obstacle: it can generate a public and private RSA key pair that is used by the pipeline's hosts. The public key then has to be copied and added to the list of keys accepted by the repository itself.

              Cloning into '.'...
              Warning: Permanently added the RSA host key for IP address '104.192.141.1' to the list of known hosts.
              Permission denied (publickey).

This Stack Overflow post [3] explains the fix. In Repository settings there is a section called Pipelines, which contains SSH keys. After a key pair has been generated there, the public key can be copied and pasted into Repository settings, Access keys in the General section.

The compute instances, the bandwidth and the copied content are not constant. Still, as a very basic empirical comparison, the mean step time dropped by around 44%. After the modification the mean time per step is around (6s+7s+9s)/3 ≈ 7s, whereas before it was (13s+10s+11s+21s+15s+13s+12s+12s+12s+15s+15s+15s+10s+12s+13s+15s+11s+12s)/18 ≈ 13s.
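The comparison can be recomputed from the durations listed above. This is a small sketch, assuming a POSIX shell with awk; the helper name mean and the variable names are made up for illustration.

```shell
#!/bin/sh
# Step durations in seconds, copied from the text above.
after="6 7 9"
before="13 10 11 21 15 13 12 12 12 15 15 15 10 12 13 15 11 12"

# mean: print the arithmetic mean of the whitespace-separated numbers in $1.
mean() {
  echo "$1" | awk '{ for (i = 1; i <= NF; i++) sum += $i; printf "%.2f", sum / NF }'
}

mean_after=$(mean "$after")
mean_before=$(mean "$before")

# Relative reduction of the mean step time, in percent.
reduction=$(awk -v a="$mean_after" -v b="$mean_before" \
  'BEGIN { printf "%.0f", (b - a) / b * 100 }')

echo "mean after:  ${mean_after}s"
echo "mean before: ${mean_before}s"
echo "reduction:   ${reduction}%"
```

With only three measurements after the change, the result is a rough indication rather than a benchmark.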

Fig. 1 - An extract of the step times, with the steps using the modified git clone command at the top, and six previous steps with the default configuration.

Of course, this improvement seems quite large. But as the number of unit tests or other tasks within the pipeline grows, the saving becomes marginal relative to the total run time. Nevertheless, it is worth implementing: the change costs only a few extra lines, and the return is saved time and bandwidth.