A project of mine (which I’ll announce in a few days) uses reStructuredText and git, so showing an HTML-ified version of the reStructuredText was the logical thing to do. CGit dumps the plain text of a file at http://git.website.com/ProjectName/plain/path/to/myfile. I wanted to add a rendered reStructuredText dump at http://git.website.com/ProjectName/rst/path/to/myfile. One solution would be to patch cgit, but instead I chose to build around it and use the handy docutils to do the conversion.

The tool rst2html reads reStructuredText on stdin and writes HTML to stdout. The first thing we need to do is make an rst2html wrapper and route certain URLs to it. My cgit installation already has a pretty advanced .htaccess for rewriting cgit URLs, and here we’ll augment the one in that post with:

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.*)/rst/(.*) /rst.cgi/$1/plain/$2 [L,PT,NS]
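
As a rough illustration of what the pattern captures, the same substitution can be mimicked with sed. This is only a sketch: the function name rewrite_rst_path is made up, and it assumes the path arrives without its leading slash, as it does for a RewriteRule in .htaccess. Apache's mod_rewrite does the real work.

```shell
#!/bin/sh
# Hypothetical helper: mimics the RewriteRule pattern (.*)/rst/(.*) with sed.
# Both (.*) groups are greedy, so the LAST "/rst/" in the path is the split point.
rewrite_rst_path() {
    printf '%s\n' "$1" | sed 's#^\(.*\)/rst/\(.*\)$#/rst.cgi/\1/plain/\2#'
}

rewrite_rst_path "ProjectName/rst/path/to/myfile"
# prints /rst.cgi/ProjectName/plain/path/to/myfile
```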

The rule internally rewrites all traffic at http://git.website.com/ProjectName/rst/path/to/myfile to http://git.website.com/rst.cgi/ProjectName/plain/path/to/myfile. This means that the CGI environment variable PATH_INFO is set to /ProjectName/plain/path/to/myfile, exactly what it would be for a request to cgit’s own plain URL. CGit looks at PATH_INFO to determine which page to show, so all we have to do is call cgit.cgi and pipe it to rst2html:

#!/bin/sh
echo "Content-Type: text/html"
echo
./cgit.cgi | ../rst2html/rst2html

However, cgit spits out HTTP headers of its own, which we need to strip:

#!/bin/sh
echo "Content-Type: text/html"
echo
./cgit.cgi | sed -n '/^$/,$p' | ../rst2html/rst2html

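The sed command prints everything from the first empty line onward, which drops cgit’s headers. The empty line itself is passed through, which is harmless as input to rst2html.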
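
To see the filter in isolation, here is a small sketch that runs it over a simulated response. The fake_cgit function and its header values and body are made up for illustration, and the sketch assumes the headers end with a plain LF-terminated empty line.

```shell
#!/bin/sh
# Simulated CGI response; the header values and body are invented.
fake_cgit() {
    printf 'Content-Type: text/plain\n'
    printf 'Last-Modified: Tue, 03 Aug 2010 12:00:00 GMT\n'
    printf '\n'
    printf 'Title\n=====\n'
}

# The filter from the script: print from the first empty line to the end.
strip_headers() {
    sed -n '/^$/,$p'
}

# Output is the empty separator line followed by the two body lines.
fake_cgit | strip_headers
```

If the leading empty line ever matters, sed '1,/^$/d' would drop it along with the headers.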

This is all fine, but cgit nicely caches output, and so should our rst script. To do this we need to look at the Last-Modified and Expires headers that cgit spits out and compare them to a cache file if it already exists. If the cache is dirty, we call rst2html on cgit’s output and tee it to the cache file to update it for the next call. If it’s clean, we just cat the cache. Along the way, we make sure to copy cgit’s HTTP cache headers for the rst file.

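#!/bin/sh
# Fetch cgit's full response (headers and body) once.
plain=$(./cgit.cgi)
# Read the cache headers, stopping at the empty line that ends the header block
# so that matching lines in the file body are ignored.
expiration=$(printf '%s\n' "$plain" | sed -n '/^$/q;s/^Expires: \(.*\)$/\1/p')
lastmodified=$(printf '%s\n' "$plain" | sed -n '/^$/q;s/^Last-Modified: \(.*\)$/\1/p')
# Emit our own headers, copying cgit's cache headers.
echo "Content-Type: text/html"
echo "Expires: $expiration"
echo "Last-Modified: $lastmodified"
echo
# Convert both dates to seconds since the epoch (date -d is a GNU extension).
expiration=$(date -d "$expiration" +%s)
lastmodified=$(date -d "$lastmodified" +%s)
# One cache file per requested path, named by the MD5 of PATH_INFO.
cache="../rst2html/cache/$(echo "$PATH_INFO" | md5sum | cut -d ' ' -f 1)"
cachetime=0
if [ -f "$cache" ]; then
        cachetime=$(stat -c %Y "$cache")
fi
# The cache is clean if it exists, has not expired, and is not older than the file.
if [ "$cachetime" -ne 0 ] && [ "$cachetime" -lt "$expiration" ] && [ "$cachetime" -ge "$lastmodified" ]; then
        cat "$cache"
else
        # printf rather than echo, so backslashes in the file survive in every shell.
        printf '%s\n' "$plain" | sed -n '/^$/,$p' | ../rst2html/rst2html | tee "$cache"
fi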
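
The freshness test at the heart of the script can be pulled out and tried on its own. The sketch below uses a made-up function name, cache_is_fresh, and invented timestamps. It assumes all three values are seconds since the epoch, with 0 standing for "no cache file".

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the cache file may be served as-is.
# Arguments: cache mtime, cgit's Expires time, cgit's Last-Modified time.
cache_is_fresh() {
    cachetime=$1
    expiration=$2
    lastmodified=$3
    [ "$cachetime" -ne 0 ] &&
    [ "$cachetime" -lt "$expiration" ] &&
    [ "$cachetime" -ge "$lastmodified" ]
}

# Cache written after the last modification and before expiry: clean.
if cache_is_fresh 1500 2000 1000; then echo "serve cache"; else echo "regenerate"; fi
# Cache older than the last modification: dirty.
if cache_is_fresh 500 2000 1000; then echo "serve cache"; else echo "regenerate"; fi
```

The first call prints "serve cache" and the second prints "regenerate".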

August 3, 2010

