As many of you know, the KDE Project is transitioning to using Git with Gitolite and CGit. As such, I thought I’d update my aging Gitweb/posix-permissions installation of git to use CGit and Gitolite, and now my public git repository is kicking away. (If you’d like commit access anywhere or would like to host your own repo on my server, drop me a line.)

Since Gitolite manages git repositories, it has the option of generating the necessary information for Git’s shipped gitweb. This includes making a static list of repository names that should be included in gitweb as well as optionally adding the gitweb.owner entry inside .git/config and the description file at .git/description. The static list of repository names is boring and standard and easy. The owner and description specifications are standards set by the Git project for this kind of information. Hence, Gitolite supports interfacing with them.

Meanwhile, CGit uses its own configuration format for determining the owner, description, and repository path. To interface with Gitolite, in the past I created a hook that writes out a CGit-formatted configuration file, which is then pulled into the main cgitrc with the include directive. Essentially I had to do this:

gitcode@starfox ~ $ cat web/cgit/generaterepos.sh 
#!/bin/sh
 
cd "$(dirname "$0")" || exit 1
rm -f repos.tmp
 
# projects.list holds one repository directory name per line.
while IFS= read -r gitname; do
        name=${gitname%.*}
        fullpath=/home/gitcode/repositories/$gitname
        owner=$(git --git-dir="$fullpath" config --get gitweb.owner)
        desc=$(cat "$fullpath/description")
        # Emit one cgitrc repo section per listed repository.
        (
                echo "repo.url=$name"
                echo "repo.name=$name"
                echo "repo.path=$fullpath"
                echo "repo.desc=$desc"
                echo "repo.owner=$owner"
                echo "repo.enable-log-filecount=1"
                echo "repo.enable-log-linecount=1"
        ) >> repos.tmp
done < ~/projects.list
 
mv repos.tmp repos
 
gitcode@starfox ~ $ tail -n 1 web/cgit/cgitrc 
include=/home/gitcode/web/cgit/repos
 
gitcode@starfox ~ $ cat repositories/gitolite-admin.git/hooks/post-update.secondary 
#!/bin/sh
exec /home/gitcode/web/cgit/generaterepos.sh

This worked decently, but it was cumbersome and ugly, and was unlikely to scale as features in both Gitolite and CGit are added and changed. Luckily, CGit supports the scan-path option, which builds an internal list of repositories automatically by scanning a directory for git folders. One solution for integrating with Gitolite would be to simply point scan-path at Gitolite’s repository directory. This works fine, but it has three main shortcomings, which I’ve addressed in a generic, non-Gitolite-specific way in three patches. Let’s walk through them one by one.

project-list

We don’t want all Gitolite repositories showing up on CGit, and Gitolite provides a generic mechanism for controlling this: it writes a list of all the repositories selected for Gitweb to a file called projects.list. It’s just a flat file with each repository’s name written on a new line:

CheeseWhiz.git
Geoemail.git
MyCoolThangs.git

So, what about augmenting CGit’s scan-path feature with another setting called “project-list” that points to this file? That’s what this patch does. If project-list is set before scan-path is set, then scan-path only scans the git folders at scan-path/${a line in the project-list file}. Problem solved, and this is a pretty generic way of doing it too.
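
As a rough illustration of that selection logic (a shell sketch, not the patch’s C code), the function below prints the directories that would be considered when both settings are given. The name list_scanned_repos and its arguments are hypothetical, and it assumes each project-list entry is a path relative to scan-path:

```shell
#!/bin/sh
# Hypothetical helper: print the repository directories that a
# project-list-restricted scan would consider.
#   $1 = scan-path directory, $2 = project-list file
list_scanned_repos() {
    scan_path=$1
    project_list=$2
    while IFS= read -r entry || [ -n "$entry" ]; do
        # Skip blank lines in the list.
        [ -n "$entry" ] || continue
        # Only directories that actually exist under scan-path count.
        if [ -d "$scan_path/$entry" ]; then
            printf '%s\n' "$scan_path/$entry"
        fi
    done < "$project_list"
}
```

Called as list_scanned_repos /home/gitcode/repositories /home/gitcode/projects.list, it would print one path per listed repository and silently ignore any directory that is not in the list.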

remove-suffix

Most people store git repositories on disk at MyGitRepository.git. Notice the .git ending. However, most people prefer to see it listed as just “MyGitRepository” and they especially would like to clone it at gituser@domain.com:MyGitRepository, without needing the .git ending. Usually, CGit’s scan-path infers the repository name directly from the folder name. This patch adds a setting called “remove-suffix” that, if set to 1 (default is 0) before scan-path is set, will remove the .git suffix from the repository name and url while still pointing to the correct physical path. This as well is fairly generic and not specific to Gitolite or Gitweb, but rather Git’s usual conventions.
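
The name mapping itself is small enough to sketch in shell. This is only an illustration of the behavior described above, not the patch; repo_display_name is a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical helper: derive the displayed name from a directory name.
#   $1 = directory name relative to scan-path
#   $2 = 1 to strip a trailing ".git", 0 (the default) to keep the name unchanged
repo_display_name() {
    dir_name=$1
    remove_suffix=${2:-0}
    if [ "$remove_suffix" = 1 ]; then
        # ${var%.git} removes one trailing ".git" if present.
        printf '%s\n' "${dir_name%.git}"
    else
        printf '%s\n' "$dir_name"
    fi
}

repo_display_name MyGitRepository.git 1   # prints MyGitRepository
repo_display_name MyGitRepository.git 0   # prints MyGitRepository.git
```

Only the displayed name and url change; the path on disk still ends in .git.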

enable-gitweb-owner

CGit’s scan-path infers the owner of the repository from the posix owner’s UID name. But there is an additional Git standard for overriding this for any interface: the “gitweb.owner” configuration key in .git/config, which Gitolite understands and respects, as well as Gitweb. This patch simply calls Git’s internal C functions for fetching this information from the current repository’s config, and prefers this as the owner to the posix owner’s UID name. If gitweb.owner is not set in the configuration, it falls back to the posix owner’s UID name. This is a standard Git behavior. This occurs only for scan-path — cgitrc specified owners are preferred over these former two, obviously. Again, this configuration standard has been determined by the Git project, and both Gitolite and Gitweb respect it. So, this patch adds support inside CGit for it.
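
The order of preference described above can be sketched in shell. This is an approximation for illustration only (the patch does this in C through Git’s config functions); repo_owner is a hypothetical helper, and reading the directory owner from ls -ld is an assumption made to keep the sketch portable:

```shell
#!/bin/sh
# Hypothetical helper: pick the owner to display for a scanned repository.
#   $1 = path to the repository's git directory
#   $2 = owner given explicitly in cgitrc, if any (wins when non-empty)
repo_owner() {
    git_dir=$1
    cgitrc_owner=$2
    if [ -n "$cgitrc_owner" ]; then
        printf '%s\n' "$cgitrc_owner"
        return
    fi
    # Next choice: gitweb.owner from the repository's own config, if set.
    owner=$(git --git-dir="$git_dir" config --get gitweb.owner 2>/dev/null) || owner=
    if [ -z "$owner" ]; then
        # Last resort: the user name owning the directory (third field of ls -ld).
        owner=$(ls -ld "$git_dir" | awk '{print $3}')
    fi
    printf '%s\n' "$owner"
}
```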

it works

Now instead of the include and the ugly set of scripts and hooks, I can just place this at the bottom of my cgitrc:

enable-gitweb-owner=1
remove-suffix=1
project-list=/home/gitcode/projects.list
scan-path=/home/gitcode/repositories 

and this integrates perfectly with Gitolite. All is harmonious in the Git universe.

On top of all this, I’ve cooked up a wicked good .htaccess file for CGit that allows me to have anonymous http pull at the same time as it rewrites the CGit urls to be pretty. Check it out:

Options FollowSymlinks ExecCGI
 
DirectoryIndex cgit.cgi
Allow from all
Order allow,deny
 
RewriteEngine on
 
SetEnv GIT_PROJECT_ROOT /home/gitcode/repositories
 
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d 
RewriteRule "^(.*)/(.*)/(HEAD|info/refs|objects/(info/[^/]+|[0-9a-f]{2}/[0-9a-f]{38}|pack/pack-[0-9a-f]{40}\.(pack|idx))|git-(upload|receive)-pack)$" /git-http-backend.cgi/$1.git/$2 [NS,L,QSA]
 
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^.* /cgit.cgi/$0 [L,PT,NS]

A strange combination of stopping internal redirects and partial rewritings and odd stop conditions has made it so that the request gets forwarded and reformatted to git-http-backend if and only if it is first valid with cgit.cgi. Is this crackable? Can anyone figure out a backdoor to grab a repository that isn’t in projects.list?

I’ve also written a super generic script for uploading new repositories to my gitolite/cgit installation. From a git working directory, I run ~/Projects/uploadNewGit.sh "This is a description of my new git repo.", and wham-shabam, all the permissions get set and everything is uploaded just fine. Here is uploadNewGit, the latest version of which you can always find in my GitTools repository:

#!/bin/bash
# bash rather than sh: pushd and popd below are bash builtins.
 
GITOLITE_ADMIN="$HOME/Projects/gitolite-admin"
 
gitdir=$(readlink -f "$(pwd)")
name=`basename "$gitdir" | cut -d / -f 2 | cut -d ' ' -f 1`
description="$1"
 
if [ ! -d "$gitdir/.git" ]; then
        echo Not a git repo.
        exit 1
fi
if [ -z "$description" ]; then
        echo You need to specify a description argument.
        exit 1
fi
 
pushd "$GITOLITE_ADMIN/conf" > /dev/null
echo "Writing config.."
(echo
echo "  repo    $name"
echo "          RW+CD   =   $(whoami)"
echo "          R       =   @all"
echo "          $name \"$(git config --get user.name)\" = \"$description\"") >> gitolite.conf
git commit -a -m "Adding $name to repository."
git push
popd > /dev/null
 
url=`git --git-dir=$GITOLITE_ADMIN/.git remote -v | grep push | cut -f 2 | cut -d ' ' -f 1 | sed "s/$(basename $GITOLITE_ADMIN)/$name/"`
git remote rm origin 2> /dev/null
git remote add origin $url
git push -u --all
git push --tags
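
The url= line in the script is the least obvious step, so here is the same pipeline pulled out into a function. sibling_url is a hypothetical name and example.org is a placeholder host; the sketch assumes the admin remote’s URL ends in the admin repository’s name:

```shell
#!/bin/sh
# Hypothetical helper: turn the gitolite-admin push URL into the URL of a
# sibling repository by swapping the repository name.
#   $1 = one line of `git remote -v` output
#   $2 = admin repository name, $3 = new repository name
sibling_url() {
    printf '%s\n' "$1" |
        cut -f 2 |            # drop the remote name (tab-separated)
        cut -d ' ' -f 1 |     # drop the trailing "(push)"
        sed "s/$2/$3/"        # replace the admin repo name with the new one
}

# example.org is a placeholder host.
sibling_url "$(printf 'origin\tgitcode@example.org:gitolite-admin (push)')" gitolite-admin MyCoolThangs
```

Because sed treats the names as a pattern and a replacement, a repository name containing a slash or other special characters would need escaping first.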

(As a side note, I’m not really sure of the best way to quote commands inside of commands with variables that have spaces. something=$(command $(othercommand $argument)) has issues if argument has a space or if othercommand produces something with a space or if command produces something with a space (not totally certain about the latter two; I should check). But I can’t do this: something="$(command "$(othercommand "$argument")")" because of obvious quoting problems. What’s the common solution to this? I’ve been using an awkward combination of the backtick operator `…` and the $(…) syntax but the backtick has some weird rules too. What’s the deal? Can somebody point me to a good place to read about this?)
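
On the quoting question: POSIX shells parse each $(…) as its own quoting context, so the nested form is accepted as written. A minimal sketch, where othercommand and command_ are hypothetical stand-ins that report how many arguments they received:

```shell
#!/bin/sh
# Each $(...) starts a fresh quoting context, so the inner double quotes
# do not terminate the outer ones.
argument="two words"

# Hypothetical stand-ins for "othercommand" and "command": each prints
# the number of arguments it received, then its first argument.
othercommand() { printf '%s:%s' "$#" "$1"; }
command_() { printf '%s:%s' "$#" "$1"; }

something="$(command_ "$(othercommand "$argument")")"
printf '%s\n' "$something"   # prints 1:1:two words
```

Both counts are 1, so the space in the argument never split it. The outermost quotes are optional in a plain assignment, since the shell does not word-split there, but the inner ones are what keep each argument in one piece.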

Anyway, most of what I’ve written about in this post is new to me. Or at the very least, I’m a bit uneasy. So if you have any suggestions, by all means please tell me. I’m looking forward to seeing what the KDE sysadmins do in the end. Hopefully the CGit authors accept my patches.

Update: After some back and forth with Lars, the CGit maintainer, I’ve added a few more patches, including putting the gitweb.owner functionality behind a configuration setting and also caching the scan, among various other improvements. You can check out all the commits I’ve made on this at the cgit for my cgit clone.

Update 2: I’ve gotten rid of my branch because my commits have been merged to cgit!

July 30, 2010

12 Comments to “Interfacing CGit and Gitolite”

  1. Eike Hein says:

    Actually, the KDE sysadmin team has abandoned using CGit due to a number of bugs (toma mentioned that in an earlier blog). Note however that CGit was always just a “bonus app”; the main focus of effort is on Redmine. We’ve set up Gitweb as another such bonus app instead of CGit, though.

  2. Jason says:

    @Eike
    What a shame. Which bugs?

  3. Eike Hein says:

    There’s a list of ’em in the comments here: http://www.omat.nl/2010/07/07/move-to-git-the-progress-so-far/

  4. larsh says:

    As the cgit maintainer, I’d certainly like to fix these issues:

    1. It’s not displaying this blob correctly: http://cgit.kde.org/amarok/amarok/tree/README (“plain” works)

    Do you have a copy of the generated html? It sounds like some error occurred while generating the reply, and if so the html would lack closing tags. Also, the httpd errorlog generated at the time of the incident might prove helpful.

    2. It’s showing a bogus diff for this merge commit (while git show, Redmine, Gitorious and gitweb show none, as expected): http://cgit.kde.org/konversation/konversation/commit/?id=7d1cb2fae86290da76ee0d0ded602c777f6e0ac8

    This is a known issue in cgit – it shows diffs against the first parent instead of the (usually empty) merge conflict diff. A proper fix for this requires some extensions to git.git, but a quickfix is just to disable diffs for merge commits. I’ll add a cgitrc-option for the latter, and try to extend git.git with the needed callbacks to show proper combined diffs in cgit.

    3. The caching appears to be wonky in practice, e.g. with caching enabled it won’t update the Branch table on the “summary” page to show the actual latest commit.

    I’ve seen similar behavior when the cgit process has lacked permissions to replace cache files (e.g. if the cgit binary had previously been invoked from the command line). When processing a request, cgit will favor fast serving of stale cache entries when a concurrent request has already locked the selected cache entry.

    A copy of the cgitrc-file used in the setup of cgit.kde.org and the httpd errorlog for the period when you discovered these problems might help me debug/fix these issues.

  5. Eike Hein says:

    @larsh: Thanks for your interest. Unfortunately I don’t have a copy of the generated HTML, no. Ditto for the cgitrc most likely, since we stopped using CGit. I’ll ask Jeff to chime in though, he set it up back then.

  6. Rsh says:

    Can anybody please tell me why git makes hosting repositories so complicated? There is a lot of system-specific stuff. I would like a single daemon that I can run, add users to some config (or by some simple tool) and be done with it. Currently I am still stuck on my server with svn; every setup description of hosting git scares the hell out of me. I am not a beginner at setting servers up, but the more complicated things get, the more I feel like I lose control over what is happening.

    Another thing that makes my transition impossible is the limited hosting possibility on Windows. I have some Windows guys on the teams of a few of my projects, and they refuse to use a DVCS that cannot be hosted properly on Windows, even though they can happily use it as a client. I guess everybody has his pride.

    Sorry for my rant but what I hope to get is some answers why things are the way they are and why is it not changing?

    Cheers

  7. […] we need to do is make an rst2html wrapper and push certain URLs to it. My cgit installation already has a pretty advanced .htaccess for rewriting cgit urls, and here we’ll augment the one in that post with: RewriteCond […]

  8. […] to Github’s readme feature. Now it’s in cgit. Hopefully Lars will pull this soon, as he did my other patches. August 3, 2010 · […]

  9. […] I’ve decided to mirror Qt’s repository on my personal server running cgit, which is much much faster than Gitorious. It’s synced once an hour. And if […]

  10. Interfacing CGit and Gitolite | Nerdling Sapple…

  11. The Compiler says:

    > But I can’t do this: something="$(command "$(othercommand "$argument")")" because of obvious quoting problems.

    Why not?

    $ echo "$(date "+$(echo %H %M)")"
    08 40

    Bash can nest quotes just fine.

  12. Peter Wu says:

    Note that since 0.9.1, the options enable-gitweb-desc and enable-gitweb-owner are replaced by “enable-git-config”.
