As you’re probably aware by now, I really like Git. It took some time, but things finally started clicking. One of the things I wanted to do was make it easier to interact with Git from Python / Django projects.
I searched around for a Python Git module. I didn’t find anything that looked complete to me, although I didn’t look too hard. Not being the creative type, I noticed that Ruby has the grit library created by Tom Preston-Werner and Chris Wanstrath, which is very nice. I decided to port it because I can use it for some cool stuff, and because I figured it would help me learn a lot about Python. So here it is.
About
GitPython is a python library used to interact with Git repositories.
GitPython is a port of the grit library in Ruby created by Tom Preston-Werner and Chris Wanstrath.
The method_missing stuff was taken from this blog post.
REQUIREMENTS
- Git (tested with 1.5.3.7)
- – used for running the tests
- Mock by Michael Foord – used for tests.
INSTALL
You can download the code from CheeseShop or alternatively pull the source.
python setup.py install
SOURCE
GitPython’s git repo is available on Gitorious, which can be browsed at:
http://gitorious.org/projects/git-python/
and cloned from:
git://gitorious.org/git-python/mainline.git
USAGE
GitPython provides object model access to your git repository. Once you have created a repository object, you can traverse it to find parent commit(s), trees, blobs, etc.
Initialize a Repo object
The first step is to create a Repo object to represent your repository.
>>> from git_python import *
>>> repo = Repo("/Users/mtrier/Development/git-python")
In the above example, the directory /Users/mtrier/Development/git-python is my working repository and contains the .git directory. You can also initialize GitPython with a bare repository.
>>> repo = Repo.init_bare("/var/git/git-python.git")
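The difference between the two layouts can be sketched without GitPython: a working repository keeps its metadata in a .git subdirectory, while a bare repository is the metadata directory itself. The helper name find_git_dir below is hypothetical and not part of GitPython, and the check is only a rough approximation of what git itself does.

```python
import os


def find_git_dir(path):
    """Return (git_dir, is_bare) for a repository path, or None if not found.

    Hypothetical helper for illustration; not part of GitPython.
    """
    dot_git = os.path.join(path, ".git")
    if os.path.isdir(dot_git):
        # Working repository: metadata lives in the .git subdirectory.
        return dot_git, False
    # Bare repository: the directory itself holds HEAD, objects/ and refs/.
    has_head = os.path.isfile(os.path.join(path, "HEAD"))
    has_dirs = all(
        os.path.isdir(os.path.join(path, name)) for name in ("objects", "refs")
    )
    if has_head and has_dirs:
        return path, True
    return None
```

Something along these lines could be used to decide which of the two constructors shown above applies to a given path.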
Getting a list of commits
From the Repo object, you can get a list of Commit objects.
>>> repo.commits()
[<Commit ...>,
 <Commit ...>,
 <Commit ...>,
 <Commit ...>]
Called without arguments, Repo.commits returns a list of up to ten commits reachable from the master branch (starting at the latest commit). You can ask for commits beginning at a different branch, commit, tag, etc.
>>> repo.commits('mybranch')
>>> repo.commits('40d3057d09a7a4d61059bca9dca5ae698de58cbe')
>>> repo.commits('v0.1')
You can specify the maximum number of commits to return.
>>> repo.commits('master', 100)
If you need paging, you can specify a number of commits to skip.
>>> repo.commits('master', 10, 20)
The above will return commits 21-30 from the commit list.
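The paging arithmetic is easy to check with a plain list. This is only a sketch of the semantics described above (a maximum count plus a number of entries to skip); the function name page is made up for the example and says nothing about how GitPython implements it internally.

```python
def page(commits, max_count=10, skip=0):
    """Return at most max_count items after skipping the first `skip`."""
    return commits[skip:skip + max_count]


# Stand-in for a commit list, numbered 1..100 from newest to oldest.
history = list(range(1, 101))

print(page(history))          # the first ten entries
print(page(history, 10, 20))  # entries 21 through 30
```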
The Commit object
Commit objects contain information about a specific commit.
>>> head = repo.commits()[0]
>>> head.id
'207c0c4418115df0d30820ab1a9acd2ea4bf4431'
>>> head.parents
[<Commit ...>]
>>> head.tree
<Tree ...>
>>> head.author
<Actor ...>
>>> head.authored_date
(2008, 5, 7, 5, 0, 56, 2, 128, 0)
>>> head.committer
<Actor ...>
>>> head.committed_date
(2008, 5, 7, 5, 0, 56, 2, 128, 0)
>>> head.message
'cleaned up a lot of test information. Fixed escaping so it works with subprocess.'
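The dates above are printed as nine-field time tuples in the layout used by Python's standard time module. Assuming that is what you get back, a readable timestamp is one call away. The sketch below uses the literal tuple from the example rather than a live repository, and makes no claim about which time zone the value is in.

```python
import time

# The nine-field tuple shown above for head.authored_date.
authored_date = (2008, 5, 7, 5, 0, 56, 2, 128, 0)

# time.strftime accepts a plain nine-item tuple as well as a struct_time.
formatted = time.strftime("%Y-%m-%d %H:%M:%S", authored_date)
print(formatted)  # 2008-05-07 05:00:56
```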
You can traverse a commit’s ancestry by chaining calls to parents.
>>> repo.commits()[0].parents[0].parents[0].parents[0]
The above corresponds to master^^^ or master~3 in git parlance.
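If you find yourself chaining parents[0] a lot, a small loop does the same job. The sketch below uses a stand-in FakeCommit class (hypothetical, so the example runs without a repository) and a made-up helper nth_ancestor; neither is part of GitPython.

```python
class FakeCommit:
    """Minimal stand-in for a commit: an id plus a list of parent commits."""

    def __init__(self, id, parents=()):
        self.id = id
        self.parents = list(parents)


def nth_ancestor(commit, n):
    """Follow the first parent n times, like <rev>~n in git."""
    for _ in range(n):
        commit = commit.parents[0]
    return commit


# Build a linear history: root <- c1 <- c2 <- head
root = FakeCommit("root")
c1 = FakeCommit("c1", [root])
c2 = FakeCommit("c2", [c1])
head = FakeCommit("head", [c2])

print(nth_ancestor(head, 3).id)  # root
```

The same helper should work on any object that exposes a parents list, which is the only thing it relies on.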
The Tree object
A tree records pointers to the contents of a directory. Let’s say you want the root tree of the latest commit on the master branch.
>>> tree = repo.commits()[0].tree
>>> tree.id
'a006b5b1a8115185a228b7514cdcd46fed90dc92'
Once you have a tree, you can get the contents.
>>> tree.contents
[<Blob ...>,
 <Blob ...>,
 <Tree ...>,
 <Blob ...>]
This tree contains three Blob objects and one Tree object. The trees are subdirectories and the blobs are files. Trees below the root have additional attributes.
>>> contents = tree.contents[-2]
>>> contents.name
'test'
>>> contents.mode
'040000'
There is a convenience method that allows you to get a named sub-object from a tree.
>>> tree/"lib"
You can also get a tree directly from the repository if you know its name.
>>> repo.tree()
>>> repo.tree("c1c7214dde86f76bc3e18806ac1f47c38b2b7a30")
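For the curious, the tree/"lib" syntax shown earlier is ordinary operator overloading. Here is a rough sketch of the idea with a toy FakeTree class; the class and its entries dict are assumptions made for the example, not GitPython's actual implementation.

```python
class FakeTree:
    """Toy tree: maps entry names to child objects (trees or blobs)."""

    def __init__(self, name, entries=None):
        self.name = name
        self.entries = dict(entries or {})

    def __truediv__(self, name):
        # Python 3 spelling of the "/" operator.
        return self.entries[name]

    __div__ = __truediv__  # Python 2 spelling of the same operator


lib = FakeTree("lib", {"urls.py": "blob for urls.py"})
root = FakeTree("", {"lib": lib})

print((root / "lib").name)       # lib
print(root / "lib" / "urls.py")  # blob for urls.py
```

Because each lookup returns the child object, the operator chains naturally and reads like a file path.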
The Blob object
A blob represents a file. Trees often contain blobs.
>>> blob = tree.contents[-1]
A blob has certain attributes.
>>> blob.name
'urls.py'
>>> blob.mode
'100644'
>>> blob.mime_type
'text/x-python'
>>> len(blob)
415
You can get the data of a blob as a string.
>>> blob.data
"from django.conf.urls.defaults import *\nfrom django.conf..."
You can also get a blob directly from the repo if you know its name.
>>> repo.blob("b19574431a073333ea09346eafd64e7b1908ef49")
What Else?
There is more stuff in there, like the ability to tar or gzip repos, stats, blame, and probably a few other things. Additionally, calls to the git instance are handled through a method_missing construct, which makes any git command available directly, with a nice conversion of Python dicts to command line parameters.
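To give a feel for the method_missing idea, here is a minimal sketch built on Python's __getattr__ hook. The class name GitCommandBuilder and the exact conversion rules (single-letter keys become short flags, longer keys become --key=value, underscores become dashes) are assumptions for illustration. It only builds the argument list and never runs git, and it is not a copy of the GitPython code.

```python
class GitCommandBuilder:
    """Turns attribute access into git command lines (illustration only).

    Hypothetical class: it builds argument lists and never runs git.
    """

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, which is the
        # closest Python equivalent of Ruby's method_missing.
        def build(*args, **options):
            command = ["git", name.replace("_", "-")]
            command.extend(self.transform_options(options))
            command.extend(str(arg) for arg in args)
            return command
        return build

    @staticmethod
    def transform_options(options):
        """Convert a dict of options into command line parameters."""
        params = []
        for key, value in sorted(options.items()):
            if value is False or value is None:
                continue  # option switched off
            flag = key.replace("_", "-")
            if len(key) == 1:
                params.append("-" + flag)
                if value is not True:
                    params.append(str(value))
            elif value is True:
                params.append("--" + flag)
            else:
                params.append("--%s=%s" % (flag, value))
        return params


git = GitCommandBuilder()
print(git.rev_list("master", max_count=10, skip=20))
# ['git', 'rev-list', '--max-count=10', '--skip=20', 'master']
```

Options are sorted by key here purely so the output is predictable; a real wrapper would hand the resulting list to something like subprocess.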
Check the unit tests; they’re pretty exhaustive.
What is Next?
There are a couple of tests that don’t pass due to an inability to mock them properly, so I’m going to get those fixed up.
I also plan to restructure some of the object relationships. A few of them feel a little dirty to me.
LICENSE
New BSD License. See the LICENSE file.