Github-like repository hosting on your own server

This article was published on August 18th 2014 and takes about 6 minutes to read.

Use it with caution — it is probably still valid, but it has not been updated for over a year.

You need to provide an SSH key to Github to clone a repository - but have you ever asked yourself how Github distinguishes you from millions of other developers when we all access repositories as the same user?

Disclaimer
A short introduction to key-based SSH authentication
But there is a problem
Finding a solution
Going one step further
A starting point
Troubleshooting
Further reading

When you want to fork a project hosted at Github, you clone it with the following command:

git clone git@github.com:rails/rails.git

When I want to fork it, I use the exact same command. How does Github distinguish us to judge whether we are allowed to clone this repository or not (maybe it's private)?

Disclaimer

I have no idea of the inner workings of Github - maybe they do things completely differently. But I want to show you an easy way to set up an environment that behaves the way Github's does, which can be useful when not all members of your team should have access to all of your private repositories.

A short introduction to key-based SSH authentication

If you have configured key-based SSH authentication before, you know that a user on a Linux machine can have a file called authorized_keys in the .ssh directory of their home folder. It lists the public keys of everyone who is allowed to authenticate as this user via SSH without knowing the associated password.

Let's say we have a user called git on our server and his authorized_keys file (/home/git/.ssh/authorized_keys) looks like the following:

ssh-rsa [some cryptic characters] user1@example.com
ssh-rsa [some cryptic characters] user2@example.com
ssh-rsa [some cryptic characters] user3@example.com

Each line allows one specific user to log in as the git user on this machine. The cryptic characters are the users' public keys; the email addresses at the end are free-form comments that make it easier to tell the keys apart (you can put any text there).

In other words: Assuming that I am "user1@example.com", I can log in to this server with ssh git@server without having to know the git user's password (maybe he does not even have one).

But there is a problem

Once we are logged in, there is no way to know who initiated this SSH session. No matter which user opened the connection, our server just sees that the user git is logged in:

[git@server ~]$ whoami
git

So we need a way to find out which user logged in via SSH.

Finding a solution

When you take a look at the SSH daemon's manpage (man sshd) there is a section called "AUTHORIZED_KEYS FILE FORMAT". Further reading reveals that each key can be prefixed with options: you can set environment variables (using environment=) and/or force a specific command to run (using command=) for whoever authenticates with that key.

Let's try setting an environment variable first.

In order for this to work, we have to configure the SSH daemon to permit user environments. Switch to root, open sshd's config file (/etc/ssh/sshd_config) and search for PermitUserEnvironment. This is set to no by default, so set it to yes and restart the SSH daemon (/etc/init.d/sshd restart).

Switch back to the git user and modify his authorized_keys file:

environment="SSH_USER=user1" ssh-rsa [some cryptic characters] user1@example.com
environment="SSH_USER=user2" ssh-rsa [some cryptic characters] user2@example.com
environment="SSH_USER=user3" ssh-rsa [some cryptic characters] user3@example.com

After logging in to the server with ssh git@server the environment variable SSH_USER is set which holds the name of the user who opened the SSH session:

[git@server ~]$ env | grep SSH_USER
SSH_USER=user1

This is cool. However, it is not of great help, since users can easily modify their environment variables and pretend to be someone else. And they should not be able to log in in the first place - we just want to allow some specific git commands.

So: Dead end. Clean up by switching back to PermitUserEnvironment no as root and restart the SSH daemon.

Going one step further

We will write a simple script which acts as a kind of shell, and use the command directive in the authorized_keys file to force all users to run this script.

Once again edit the git user's authorized_keys file, this time forcing users to run our soon-to-be script and provide their usernames as the only argument:

command="/home/git/shell user1" ssh-rsa [some cryptic characters] user1@example.com
command="/home/git/shell user2" ssh-rsa [some cryptic characters] user2@example.com
command="/home/git/shell user3" ssh-rsa [some cryptic characters] user3@example.com
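Writing these lines by hand gets tedious with more than a handful of users, so they could be generated. The following is only a sketch: the method name, the placeholder keys and the username check are assumptions made for illustration. It also appends the sshd options no-port-forwarding, no-X11-forwarding, no-agent-forwarding and no-pty, which are not part of the setup described in this article but are commonly combined with command= to lock a key down further.

```ruby
# Extra sshd key options (an addition, not required by the setup above).
RESTRICTIONS = 'no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty'

# Builds one authorized_keys line that forces the key's owner into our script.
# `shell` defaults to the script path used in this article.
def authorized_keys_line(user, public_key, shell: '/home/git/shell')
  # Refuse usernames that could break out of the quoted command string.
  unless /\A[A-Za-z0-9_-]+\z/.match?(user)
    raise ArgumentError, "unsafe username: #{user.inspect}"
  end

  %(command="#{shell} #{user}",#{RESTRICTIONS} #{public_key})
end

# Hypothetical users with placeholder keys.
users = {
  'user1' => 'ssh-rsa PLACEHOLDER_KEY_1 user1@example.com',
  'user2' => 'ssh-rsa PLACEHOLDER_KEY_2 user2@example.com'
}

users.each { |user, key| puts authorized_keys_line(user, key) }
```

The output could be redirected into /home/git/.ssh/authorized_keys once real public keys are filled in.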

When user1 connects via SSH now, our script is run with user1 as its argument, no matter what the client asked for. More importantly, if the client supplied a command instead of requesting a login shell, that command is not executed but stored in the environment variable SSH_ORIGINAL_COMMAND (which is not set otherwise).

Depending on the intended action, SSH_ORIGINAL_COMMAND will either start with git-upload-pack (when someone wants to clone the repository, or fetch or pull from it) or git-receive-pack (when someone pushes to the repository), followed by the repository's path in single quotes.
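To make this concrete, here is a minimal sketch of how the variable's content could be turned into "which repository, and read or write?". git-upload-pack serves clone, fetch and pull (read access), while git-receive-pack serves push (write access). The names ACCESS_BY_PROGRAM and parse_git_command are made up for this example, and the pattern assumes the path arrives wrapped in single quotes.

```ruby
# Maps the server-side git program to the kind of access it needs.
ACCESS_BY_PROGRAM = {
  'git-upload-pack'  => :read,  # clone, fetch, pull
  'git-receive-pack' => :write  # push
}.freeze

# Returns { access:, repository: } or nil when the string is not one of
# the two git transport commands. \A and \z anchor the whole string.
def parse_git_command(original_command)
  match = /\A(git-(?:upload|receive)-pack) '([^']+)'\z/.match(original_command.to_s)
  return nil unless match

  { access: ACCESS_BY_PROGRAM.fetch(match[1]), repository: match[2] }
end

p parse_git_command("git-upload-pack 'rails/rails.git'")
p parse_git_command("git-receive-pack 'rails/rails.git'")
p parse_git_command(nil) # an interactive login attempt yields nil
```

In the forced command, the argument would be ENV['SSH_ORIGINAL_COMMAND'].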

Now we have all information at hand to build a system for authorizing individual users to individual repositories.

If we want to deny access, the script must exit with a code greater than 0. And since this script is the end point for our users, we must not forget to actually run the intended command before exiting when the user is allowed to clone, push or pull.
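The decision itself can be sketched in a few lines. Everything here is hypothetical: the PERMISSIONS table, the repository name and the method names only illustrate the idea of "deny with a non-zero status, otherwise run the command and pass its status on". The block given to handle stands in for running the real git command.

```ruby
# Hypothetical hardcoded permissions: repository => user => allowed access.
PERMISSIONS = {
  'project.git' => { 'user1' => [:read, :write], 'user2' => [:read] }
}.freeze

def allowed?(user, repository, access)
  PERMISSIONS.fetch(repository, {}).fetch(user, []).include?(access)
end

# Returns the exit status the forced command should finish with.
# The block must return the exit status of the real git command.
def handle(user, repository, access)
  unless allowed?(user, repository, access)
    warn 'Permission denied.'
    return 1
  end

  yield
end

# user2 may only read, so a push is rejected with status 1.
status = handle('user2', 'project.git', :write) { 0 }
puts status
```

In a real forced command the result would be passed to exit, so that the git client sees the failure.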

A starting point

Here is a small Ruby script to illustrate the concept. Be careful though, it does no strict checking so it probably is not very secure.

Whether you hardcode permissions, read them from a database or even from a text file is up to you.

Save it as shell in the git user's home folder (/home/git) and make it executable (chmod +x /home/git/shell).

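#!/usr/bin/env ruby

# The username is passed in by the command= option in authorized_keys.
user = ARGV.shift
full_command = ENV['SSH_ORIGINAL_COMMAND']

# \A and \z anchor the whole string; ^ and $ would only anchor single lines.
if full_command =~ %r(\Agit-(receive|upload)-pack '(.+?)'\z)
  method = $1
  repository = $2

  # Placeholder: replace this with a real check of whether `user` is
  # allowed to read (upload) and/or write (receive) to `repository`.
  allowed = true
  abort('Permission denied.') unless allowed

  # Run the intended command without going through a shell (so the
  # repository name cannot inject commands) and pass on its exit status.
  success = system("git-#{method}-pack", repository)
  exit(success ? 0 : 1)
else
  abort('This is not a shell.')
end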
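If you go for the text file option, YAML is one convenient choice because Ruby ships with a parser. The sketch below assumes a made-up file layout (repository, then user, then a list of read and/or write) and made-up method names; it is not the format of any existing tool.

```ruby
require 'yaml'

# Parses YAML text of the assumed form
#   repository:
#     user: [read, write]
# into a nested hash. An empty file yields an empty hash.
def parse_permissions(yaml_text)
  YAML.safe_load(yaml_text) || {}
end

# `access` is the string 'read' or 'write'.
def permitted?(permissions, user, repository, access)
  permissions.fetch(repository, {}).fetch(user, []).include?(access)
end

example = <<~PERMS
  project.git:
    user1: [read, write]
    user2: [read]
PERMS

permissions = parse_permissions(example)
puts permitted?(permissions, 'user1', 'project.git', 'write')
puts permitted?(permissions, 'user2', 'project.git', 'write')
```

On the server, the text would come from a file instead, for example parse_permissions(File.read('/home/git/permissions.yml')), where the path is again just an assumption.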

I will release my own script as an open source project in the near future, but it needs a little polishing for which I have not found the time yet.

If you have any problems setting this up, please do not hesitate to contact me via email or on Twitter - I'll be glad to help.

Troubleshooting

Problems with key-based authentication are caused by wrong permissions most of the time. So if you cannot get it to work, make sure you meet the following requirements:

The .ssh directory and the authorized_keys file must both belong to the user in whose home folder they reside (chown -R git:git /home/git/.ssh in our example).
They must not be readable or writeable by anyone else but their owner (chmod 0700 /home/git/.ssh and chmod 0600 /home/git/.ssh/authorized_keys).
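Both requirements can also be checked from a script. This is a rough sketch with made-up method names. It tests exactly the two rules listed above and nothing else that sshd might additionally complain about.

```ruby
# True when group or others have any permission bit set.
def too_open?(mode)
  (mode & 0o077) != 0
end

# Returns a list of human-readable problems; an empty list means both
# the directory and the authorized_keys file passed the two checks.
def ssh_permission_problems(ssh_dir, owner_uid)
  paths = [ssh_dir, File.join(ssh_dir, 'authorized_keys')]

  paths.flat_map do |path|
    next ["#{path} does not exist"] unless File.exist?(path)

    stat = File.stat(path)
    problems = []
    problems << "#{path} is not owned by uid #{owner_uid}" unless stat.uid == owner_uid
    problems << "#{path} is accessible by group or others" if too_open?(stat.mode)
    problems
  end
end

# Example usage on the server (assumes a user named git exists):
#   require 'etc'
#   uid = Etc.getpwnam('git').uid
#   puts ssh_permission_problems('/home/git/.ssh', uid)
```

An empty result means the ownership and mode requirements are met; anything else names the path that needs a chown or chmod.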
