Sheogorath's Blog

Carving little helper scripts

Over the past few weeks I’ve been writing some scripts and bash commands to generate version numbers as well as to automate compliance around repositories. Since I usually just throw them into the Snippets section on SI-GitLab or even more often some repository there, let me share them here, so you’ll have an easier time finding them.

Automated CalVer for Git(Lab)

While SemVer is super popular, it’s far from the only versioning system. For system compositions and similar complex artefacts it’s often not trivial to use SemVer. Also, depending on the size of your composition, it often results in constant major version bumps, taking the whole idea behind SemVer ad absurdum.

A good alternative to SemVer is CalVer. CalVer stands for “Calendar Versioning” and uses “date stamps” for versioning instead of trying to provide a semantic meaning in the version number. This makes it super easy to automate and still useful if you are into change management. You probably know it from projects like Ubuntu, which uses CalVer to version their OS releases (e.g. 20.04).

For my infrastructure repository I use the following script to generate the release:

#!/bin/sh
export INFRA_RELEASE_VERSION="$(date +%y.%m)"
export INFRA_RELEASE_NAME="Infrastructure $INFRA_RELEASE_VERSION"

release-cli create --name "$INFRA_RELEASE_NAME" \
      --tag-name "v${INFRA_RELEASE_VERSION}" --ref "$CI_COMMIT_SHA"

While release-cli and $CI_COMMIT_SHA are GitLab-specific, you can easily adjust this script to work in any environment. However, simply using the date command will fail you as soon as you want to release your software more than once a month, because the YY.0M version schema only allows one version per month.

To allow multiple releases per month, I recommend a YY.0M.MICRO version schema, which you can generate easily from your git history like this:

#!/bin/sh
# Count only tags that start with this month's base version (e.g. v21.03.)
base="v$(date +%y.%m)"
git tag "${base}.$(git tag --list "${base}.*" | grep -c .)"

This takes the current year and month to generate the base version (e.g. v21.03) and derives the MICRO part by counting the existing git tags that start with that base version. With the tags v21.02.0, v21.03.0 and v21.03.1, two of them (v21.03.0 and v21.03.1) match, so the next release is tagged v21.03.2.
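To try the counting step without touching a real repository, here is a minimal sketch that wraps it in a function. The name next_calver is made up for this example and is not part of the script above. It reads tag names on stdin and takes the base version as its only argument.

```shell
#!/bin/sh
# next_calver is a hypothetical helper name, not part of the original script.
# Intended use: git tag | next_calver "v$(date +%y.%m)"
next_calver() {
    base="$1"
    # Count the tags on stdin that start with "<base>."; index() compares
    # plain strings, so the dots in the base version are not wildcards.
    micro="$(awk -v p="${base}." 'index($0, p) == 1 { n++ } END { print n + 0 }')"
    printf '%s.%s\n' "$base" "$micro"
}
```

Fed with the tags v21.02.0, v21.03.0 and v21.03.1 and the base v21.03, it prints v21.03.2. Note that counting assumes tags are never deleted; if one is removed, the next count can collide with an existing tag.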

Generate a simple changelog from git history

A changelog is a nice thing to have as part of the release generation. A simple chronological changelog can be generated by using printf, a little bit of grep, sort, uniq, as well as head and of course git log:

#!/bin/sh
set -e

printf '## Changelog\n\n'
# Everything since the most recent tag, one bullet per commit
git log --no-merges --pretty="- %s (%h)" "HEAD...$(git tag | sort -V -r | head -1)"
printf '\n\n## External Contributors\n\n'
# "|| true" keeps set -e from aborting when grep filters out every name
git log --pretty="- %an%n- %cn" "HEAD...$(git tag | sort -V -r | head -1)" | sort | uniq | grep -v 'Sheogorath' || true
printf '\n\n---\n*This is an automated release. See [#%s](%s) for details.*' "$CI_JOB_ID" "$CI_JOB_URL"

Like release-cli, some variables, such as CI_JOB_ID and CI_JOB_URL, are GitLab-specific, but they can easily be replaced by functionally similar variables in your own setup.

You might also notice that I filter myself out of the “External Contributors” list, as I dislike crediting myself in my own projects, hence the section is called “External Contributors”.
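The filtering step can be looked at in isolation. The following is only a sketch: external_contributors is a name invented for this example, and it expects the "- Name" lines that git log produces on stdin.

```shell
#!/bin/sh
# external_contributors is a hypothetical helper: it reads "- Name" lines on
# stdin and prints each name once, leaving out the maintainer given as $1.
external_contributors() {
    # -F matches the name as a fixed string; "|| true" keeps the function
    # from failing when every line is filtered out (grep exits 1 then).
    sort -u | grep -v -F -- "$1" || true
}
```

The `|| true` matters in combination with set -e: if the maintainer is the only contributor since the last tag, grep finds nothing to print and exits with status 1, which would otherwise stop the script before the footer is written.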

Overall the script makes heavy use of the --pretty parameter of the git log command that allows easy formatting of commit messages.
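Both git log calls depend on picking the most recent tag for the range. A small sketch of that step on its own, assuming a sort that supports -V (GNU coreutils does); latest_tag is a name made up for this example:

```shell
#!/bin/sh
# latest_tag is a hypothetical helper: it reads tag names on stdin and
# prints the highest one according to version sort (-V).
latest_tag() {
    sort -V -r | head -n 1
}
```

Given v21.03.9, v21.02.0 and v21.03.10, it prints v21.03.10. A plain sort -r would rank v21.03.9 above v21.03.10, which is why version sort matters once the MICRO part reaches two digits.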

Automated compliance with reuse

reuse is a CLI tool that checks your repository for license compliance and tries to help developers properly license their software.

However, introducing this tool to an existing codebase will easily result in a “non-compliant” state. One of the first steps is to add the required license headers to your source files. But doing this manually would take way too long, and attributing the right people can be a real pain. A few days ago, I wrote a neat little helper script that extracts all authorship information from the git history and uses reuse’s addheader function to generate a minimal script that applies the headers.

#!/bin/sh
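byFile() {
    # List every tracked file below the given path (default: current directory)
    git ls-tree -r --name-only "$(git rev-parse --abbrev-ref HEAD)" "${1:-./}" | while IFS= read -r file ; do
        echo "# $file"
        # One "reuse addheader" line per author and year that touched the file
        git log --date=format:%Y --follow --pretty="format:reuse addheader --copyright \"%an <%ae>\" --year \"%ad\" \"$file\"" -- "$file" | sort | uniq
    done
}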

byFile "$1"

You can now run the script mentioned above, let’s call it contributors.sh, with bash contributors.sh > ./reuse.sh while you are in your repository, and it’ll spit out a nice script that takes care of calling reuse for you.

IMPORTANT: The resulting information might be neither correct nor complete. If you have code or artefacts that are vendored into your repository, such as fonts, jQuery or a CSS file, keep in mind that the script only knows who committed them, not who actually wrote them, and will therefore attribute the wrong copyright ownership.
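One way to reduce that risk is to leave vendored paths out before any headers are generated. This is only a sketch under assumptions: skip_vendored is an invented name, and the vendor/ prefix is a placeholder for wherever your repository keeps third-party files.

```shell
#!/bin/sh
# skip_vendored is a hypothetical filter: it reads file paths on stdin and
# drops every path that starts with the directory prefix given as $1.
skip_vendored() {
    awk -v p="$1" 'index($0, p) != 1'
}
```

It could sit between git ls-tree and the while loop, for example `git ls-tree -r --name-only HEAD ./ | skip_vendored 'vendor/' | while IFS= read -r file ...`. The generated script should still be reviewed by hand before you run it.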

Conclusion

None of those scripts is universal enough or large enough to become its own project or to warrant a whole blog article on its own. But this article should help to share the knowledge and show what a little bit of CLI wizardry can do.

I’m very happy with those scriptlets that I throw into projects here and there. They usually get the job done sufficiently well. Obviously one could “write a proper tool” for those use-cases and even handle various edge cases way better than it’s done in those scripts, but following XKCD 1205, if the task takes me less than an hour manually and I only have to do it once (a year), then it doesn’t warrant investing more than 5 hours into any automation. And that’s exactly what these scripts do: getting work done.

All in all, I hope you consider them helpful and if not, well, then I can claim to have published a new article this month.

Photo by Jo Szczepanska on Unsplash