
GitPython

written by Michael Trier, on May 8, 2008 12:17:00 AM.

As you’re probably aware by now, I really like Git. It took some time, but things finally started clicking. One of the things I wanted to do was make it easier to interact with Git from Python and Django projects.

I searched around for a Python Git module and didn’t find anything that looked complete to me, although I didn’t look too hard. Not being the creative type, I noticed that Ruby has the very nice grit library, created by Tom Preston-Werner and Chris Wanstrath. I decided to port it because I can use it for some cool stuff, and because I figured it would help me learn a lot about Python. So here it is.

About

GitPython is a Python library used to interact with Git repositories.

GitPython is a port of the grit library in Ruby created by Tom Preston-Werner and Chris Wanstrath.

The method_missing stuff was taken from this blog post.

REQUIREMENTS

  • Git tested with 1.5.3.7
  • – used for running the tests
  • Mock by Michael Foord – used for tests.

INSTALL

You can download the code from CheeseShop or alternatively pull the source.


python setup.py install

SOURCE

GitPython’s git repo is available on Gitorious, which can be browsed at:

http://gitorious.org/projects/git-python/

and cloned from:

git://gitorious.org/git-python/mainline.git

USAGE

GitPython provides object model access to your git repository. Once you have created a repository object, you can traverse it to find parent commit(s), trees, blobs, etc.

Initialize a Repo object

The first step is to create a Repo object to represent your repository.


>>> from git_python import *
>>> repo = Repo("/Users/mtrier/Development/git-python")

In the above example, the directory /Users/mtrier/Development/git-python is my working repository and contains the .git directory. You can also initialize GitPython with a bare repository.


>>> repo = Repo.init_bare("/var/git/git-python.git")

Getting a list of commits

From the Repo object, you can get a list of Commit objects.


>>> repo.commits()
# a list of four Commit objects

Called without arguments, Repo.commits returns a list of up to ten commits reachable by the master branch (starting at the latest commit). You can ask for commits beginning at a different branch, commit, tag, etc.


>>> repo.commits('mybranch')
>>> repo.commits('40d3057d09a7a4d61059bca9dca5ae698de58cbe')
>>> repo.commits('v0.1')

You can specify the maximum number of commits to return.


>>> repo.commits('master', 100)

If you need paging, you can specify a number of commits to skip.


>>> repo.commits('master', 10, 20)

The above will return commits 21-30 from the commit list.
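
The paging arithmetic can be sketched with plain list slicing. This is a stand-in that does not use GitPython: `page` and its parameter names are hypothetical and only mirror the argument order of the call above (maximum count, then number to skip).

```python
def page(items, max_count=10, skip=0):
    """Return up to max_count items after skipping the first `skip` items."""
    return items[skip:skip + max_count]

# Pretend commits numbered 1..100, latest first.
commits = list(range(1, 101))

third_page = page(commits, 10, 20)
print(third_page[0], third_page[-1])  # 21 30
```

Skipping 20 and taking 10 is what yields positions 21 through 30.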

The Commit object

Commit objects contain information about a specific commit.


>>> head = repo.commits()[0]

>>> head.id
'207c0c4418115df0d30820ab1a9acd2ea4bf4431'

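>>> head.parents
# a list of the parent Commit objects

>>> head.tree
# the Tree object for this commit

>>> head.author
# the author of the commit

>>> head.authored_date
(2008, 5, 7, 5, 0, 56, 2, 128, 0)

>>> head.committer
# the committer of the commit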

>>> head.committed_date
(2008, 5, 7, 5, 0, 56, 2, 128, 0)

>>> head.message
'cleaned up a lot of test information. Fixed escaping so it works with subprocess.'
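
The date tuples shown above have the nine-field layout of Python's `time.struct_time`, so the standard `time` module can format them. A small sketch, assuming that layout (the variable name `authored` is only for illustration):

```python
import time

# The tuple printed for head.authored_date in the session above.
authored = (2008, 5, 7, 5, 0, 56, 2, 128, 0)

# time.strftime accepts a plain nine-field tuple as well as a struct_time.
formatted = time.strftime("%Y-%m-%d %H:%M:%S", authored)
print(formatted)  # 2008-05-07 05:00:56
```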

You can traverse a commit’s ancestry by chaining calls to parents.


>>> repo.commits()[0].parents[0].parents[0].parents[0]

The above corresponds to master^^^ or master~3 in git parlance.
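
To make the `master~3` correspondence concrete, here is a sketch with a hypothetical stand-in `Commit` class (not GitPython's) and a helper, `nth_ancestor`, that follows the first parent a given number of times.

```python
class Commit:
    """Hypothetical stand-in holding an id and a list of parent commits."""

    def __init__(self, id, parents=()):
        self.id = id
        self.parents = list(parents)


def nth_ancestor(commit, n):
    """Follow the first parent n times, like `rev~n` in git."""
    for _ in range(n):
        commit = commit.parents[0]
    return commit


# Build a linear history: c0 <- c1 <- c2 <- c3 (c3 is the newest).
c0 = Commit("c0")
c1 = Commit("c1", [c0])
c2 = Commit("c2", [c1])
c3 = Commit("c3", [c2])

print(nth_ancestor(c3, 3).id)  # c0
```

Chaining `.parents[0]` three times and calling `nth_ancestor(c3, 3)` land on the same commit.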

The Tree object

A tree records pointers to the contents of a directory. Let’s say you want the root tree of the latest commit on the master branch.


>>> tree = repo.commits()[0].tree


>>> tree.id
'a006b5b1a8115185a228b7514cdcd46fed90dc92'

Once you have a tree, you can get the contents.


>>> tree.contents
# a list of four Blob and Tree objects

This tree contains three Blob objects and one Tree object. The trees are subdirectories and the blobs are files. Trees below the root have additional attributes.


>>> contents = tree.contents[-2]


>>> contents.name
'test'

>>> contents.mode
'040000'

There is a convenience method that allows you to get a named sub-object from a tree.


>>> tree/"lib" 
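
One way such a convenience can be built in Python is by overloading the division operator. The sketch below uses hypothetical `Tree` and `Blob` stand-ins, not GitPython's classes, and the Python 3 method name `__truediv__` (Python 2 spelled it `__div__`).

```python
class Blob:
    """Hypothetical stand-in for a file entry."""

    def __init__(self, name):
        self.name = name


class Tree:
    """Hypothetical stand-in for a directory entry."""

    def __init__(self, name, contents=()):
        self.name = name
        self.contents = list(contents)

    def __truediv__(self, name):
        """Return the child whose name matches, or None if there is none."""
        for item in self.contents:
            if item.name == name:
                return item
        return None


root = Tree("", [Blob("README"), Tree("lib", [Blob("urls.py")])])
lib = root / "lib"
print(lib.name)                 # lib
print((lib / "urls.py").name)   # urls.py
```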

You can also get a tree directly from the repository if you know its name.


>>> repo.tree()


>>> repo.tree("c1c7214dde86f76bc3e18806ac1f47c38b2b7a30")

The Blob object

A blob represents a file. Trees often contain blobs.


>>> blob = tree.contents[-1]

A blob has certain attributes.


>>> blob.name
'urls.py'

>>> blob.mode
'100644'

>>> blob.mime_type
'text/x-python'

>>> len(blob)
415
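
The mode strings seen above ('040000' for the tree, '100644' for the blob) are octal file modes, so Python's `stat` module can interpret them. A rough sketch; `describe_mode` is a hypothetical helper and only distinguishes directories from regular files.

```python
import stat


def describe_mode(mode):
    """Classify a git mode string such as '040000' or '100644'."""
    bits = int(mode, 8)
    if stat.S_ISDIR(bits):
        kind = "tree"
    elif stat.S_ISREG(bits):
        kind = "blob"
    else:
        kind = "other"
    # S_IMODE keeps only the permission bits.
    return kind, oct(stat.S_IMODE(bits))


print(describe_mode("040000"))  # ('tree', '0o0')
print(describe_mode("100644"))  # ('blob', '0o644')
```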

You can get the data of a blob as a string.


>>> blob.data
"from django.conf.urls.defaults import *\nfrom django.conf..." 

You can also get a blob directly from the repo if you know its name.


>>> repo.blob("b19574431a073333ea09346eafd64e7b1908ef49")
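
The "name" here is the object's SHA-1 id. Git derives a blob's id from a short header plus the file contents, which can be reproduced with the standard library alone. `git_blob_id` is a hypothetical helper, not part of GitPython.

```python
import hashlib


def git_blob_id(data):
    """Compute the id git assigns to a blob with the given bytes."""
    # Git hashes "blob <size>\0" followed by the raw contents.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

The value printed is the well-known id of the empty blob, which `git hash-object` also reports for an empty file.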

What Else?

There is more stuff in there, like the ability to tar or gzip repos, stats, blame, and probably a few other things. Additionally, calls to the git instance are handled through a method_missing construct, which makes any git command available directly, with a nice conversion of Python dicts to command-line parameters.
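
In Python, the closest thing to Ruby's method_missing is `__getattr__`. The following is only a sketch of the idea under stated assumptions, not GitPython's actual code: the `Git` class here is a hypothetical stand-in that builds the command line as a list instead of running git, and the rules for turning keyword arguments into flags are illustrative.

```python
class Git:
    """Hypothetical stand-in: builds git command lines instead of running them."""

    def transform_kwargs(self, **kwargs):
        """Turn keyword arguments into command-line flags (illustrative rules)."""
        args = []
        for key, value in sorted(kwargs.items()):
            if value is False or value is None:
                continue  # flag switched off
            flag = ("-" if len(key) == 1 else "--") + key.replace("_", "-")
            if value is True:
                args.append(flag)                      # e.g. --all
            elif len(key) == 1:
                args.extend([flag, str(value)])        # e.g. -n 3
            else:
                args.append("%s=%s" % (flag, value))   # e.g. --max-count=5
        return args

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails,
        # much like Ruby's method_missing.
        if name.startswith("_"):
            raise AttributeError(name)

        def command(*args, **kwargs):
            return (["git", name.replace("_", "-")]
                    + self.transform_kwargs(**kwargs)
                    + [str(a) for a in args])
        return command


git = Git()
print(git.log("master", max_count=5, pretty="oneline"))
# ['git', 'log', '--max-count=5', '--pretty=oneline', 'master']
print(git.rev_list("HEAD", n=3))
# ['git', 'rev-list', '-n', '3', 'HEAD']
```

Returning the argument list keeps the sketch safe to run anywhere; a real wrapper would hand that list to `subprocess`.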

Check the unit tests; they’re pretty exhaustive.

What is Next?

There are a couple of tests that don’t pass due to an inability to mock them properly, so I’m going to get those fixed up.

I also plan to restructure some of the object relationships. A few of them feel a little dirty to me.

LICENSE

New BSD License. See the LICENSE file.