Storing sensitive data in a git repository using git-crypt

Estimated time reading
5 m 10 sec

TL;DR in this article I show how to use git crypt to store sensitive data in a git repo. I describe the steps using the context of my dotfiles repository but the same thing can be done to store any other kind of sensitive data inside any other git repository.

The context

In my previous article I showed how to use a git repository to store your dotfiles. Someone in the comments pointed out the fact that you should pay attention not to commit sensitive data inside your dotfiles repo.

Indeed, dotfiles usually end up containing several pieces of sensitive data related to your company or your clients. Examples of this are ssh private keys, aws certificates, api credentials, ... The list goes on, inside your personal dotfiles repo you may also want to store personal data like your password wallet data.

In this article I will describe how to store sensitive data in your dotfiles git repo, but this is just an example, you can use the approach outlined here to store any other kind of sensitive data inside any other git repository.

The way I solved this issue in the past was to use another "private" repository, nested inside the main one as a submodule.
As you can see from this old blob content the submodule was actually a bare repository whose content was synced using Dropbox.

"Hosting" a personal bare repo using Dropbox is actually not a problem if you are the only one using it since there are no race condition or publication issues. The problem with that approach was security: anyone at Dropbox HQ, or anyone managing to get their hands on my backups could potentially access my files.

So I started looking into using git-crypt, to perform the same task.

A small disclaimer

Now before continuing, a small disclaimer: I'm not an expert in security and cryptography, please let me know in the comments below what are the weaknesses using this approach.

Using git crypt

Here's the project description copied verbatim:

git-crypt enables transparent encryption and decryption of files in a git
repository. Files which you choose to protect are encrypted when committed,
and decrypted when checked out. git-crypt lets you freely share a repository
containing a mix of public and private content. git-crypt gracefully
degrades, so developers without the secret key can still clone and commit to
a repository with encrypted files. This lets you store your secret material
(such as keys or passwords) in the same repository as your code, without
requiring you to lock down your entire repository.

Using git crypt is pretty straightforward and well documented on the project page. The usage scenarios are as follows:

  1. An initial step:
    Execute git crypt keygen /path/to/keyfile to generate a cryptographic key used for symmetric encryption/decription.
    The key then has to be stored securely somewhere else.
  2. Configure the repository to use this key:
    Execute git crypt init /path/to/keyfile.
    Git-crypt uses the smudge/clean facilities. This step writes into the repo's .git/config file the entries filter.git-crypt.smudge, filter.git-crypt.clean and diff.git-crypt.textconv which you can look up using for example git config filter.git-crypt.clean.
  3. Specify what files need to be encrypted:
    This is done preparing one or more .gitattributes files and checking them in. You want for example to crypt all the files ending in .ssh you would commit the following .gitattributes file:


.gitattributes example file

*.ssh filter=git-crypt diff=git-crypt ```

If another developer clones your repo he only needs to do step 2 above; failing that he will only be able to access the encrypted version of the files (an unintelligible blob).

A git repo entirely dedicated to sensitive data

Git-crypt uses .gitattributes so that you can (but also must) explicitly specify what files you want to commit encrypted.
This approach seemed too much error prone for my use case because even if you specify a glob pattern like private/*, a file like private/subdir/file does not match that pattern and is commited into git unencrypted.

To avoid this problem I wanted to have a git repository where I was sure that every file was commited encrypted. So I fell back using the submodule approach; you can take a look at the repo here.

The most important thing is the root level .gitattributes: it defaults to crypt everything except the .gitmodules and the .gitattributes file themselves:


.gitattributes file

  • filter=git-crypt diff=git-crypt .gitattributes !filter !diff .gitmodules !filter !diff ```

As you can see this file for example was commited as an encrypted blob, even if it is just a plain text file in its unencrypted form.

Abusing git-crypt

On the LIMITATIONS section of git-crypt's README it states:

git-crypt is not designed to encrypt an entire repository. Not only does
that defeat the aim of git-crypt, which is the ability to selectively
encrypt files and share the repository with less-trusted developers, there
are probably better, more efficient ways to encrypt an entire repository,
such as by storing it on an encrypted filesystem.
git-crypt does not itself provide any authentication. It assumes that
either the master copy of your repository is stored securely, or that
you are using git's existing facilities to ensure integrity (signed tags,
remembering commit hashes, etc.).

I actually wanted a git repo where each file was encrypted, and at the same time I wanted an offsite backup system with a remote in the "cloud": git-crypt with its clean/smudge approach (and a simple git push) provided a solution with very few moving parts.

Latest posts from Giuseppe Rota

Keep in touch

Enter your email address below to receive all our latest articles and announcements via email.