Entries tagged “git”

GitPython Release 0.1.6

written by Michael Trier, on Jan 24, 2009 9:33:00 PM.

I just released GitPython version 0.1.6. This version has some backwards-incompatible changes, so be sure to read through the changes below before upgrading.

GitPython is a Python library that makes it easy to interact with Git repositories. The emphasis so far has been on introspection and less on manipulation, although some manipulation-level functionality is present. For a good tutorial on getting started, check out the documentation files in the source distribution.

This version includes Sphinxification of the documentation. If you haven’t worked with Sphinx yet, I highly recommend it for documenting your projects. I have only scratched the surface with it and I was really amazed by the “out-of-the-box” functionality.

I hope you enjoy GitPython.

CHANGES

General

  • Added in Sphinx documentation.
  • Removed ambiguity between paths and treeishes. When a command accepts both treeish and path arguments and there is a path with the same name as a treeish, git cowardly refuses to pick one and asks for the unambiguous syntax, in which '--' separates the treeish from the paths.
  • Repo.commits, Repo.commits_between, Repo.commits_since, Repo.commit_count, Repo.commit, Commit.count and Commit.find_all all now optionally take a path argument which constrains the lookup by path. This changes the order of the positional arguments in Repo.commits and Repo.commits_since.
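
As a rough illustration of git's '--' separator, here is a minimal sketch. The helper name build_log_args is hypothetical and is not part of GitPython; it only shows how an argument list stays unambiguous when a path shares its name with a treeish.

```python
def build_log_args(treeish, paths=()):
    """Build an argument list for `git log` (hypothetical helper).

    git reads everything before '--' as a revision and everything after
    it as a path, so a file named like a branch cannot be mistaken for
    the branch itself.
    """
    args = ["git", "log", treeish]
    if paths:
        args.append("--")
        args.extend(paths)
    return args


# A branch and a file that are both called "master":
print(build_log_args("master", ["master"]))
# ['git', 'log', 'master', '--', 'master']
```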

Commit

  • Commit.message now contains the full commit message (rather than just the first line) and a new property Commit.summary contains the first line of the commit message.
  • Fixed a failure when trying to lookup the stats of a parentless commit from a bare repo.
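
The relationship between a full message and its summary can be sketched in a few lines. This is only an illustration of the idea under the assumption that the summary is the text up to the first newline; the function name summarize is made up and this is not GitPython's own code.

```python
def summarize(message):
    """Return the first line of a commit message (illustrative only)."""
    return message.split("\n", 1)[0]


message = "Fix diff parser\n\nHandle rename headers and a missing b_mode."
print(summarize(message))  # Fix diff parser
```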

Diff

  • The diff parser is now far faster and also addresses a bug where sometimes b_mode was not set.
  • Added support for parsing rename info to the diff parser. Addition of new properties Diff.renamed, Diff.rename_from, and Diff.rename_to.
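
To give a feel for what parsing rename info involves, here is a small sketch that pulls the "rename from" and "rename to" lines out of a git diff header. The function parse_rename and the returned dictionary are assumptions made for this example; the real parser in GitPython may work quite differently.

```python
import re

# git emits these two lines back to back when it detects a rename.
RENAME_RE = re.compile(
    r"^rename from (?P<src>.+)\nrename to (?P<dst>.+)$", re.MULTILINE
)


def parse_rename(diff_header):
    """Extract rename details from a diff header (hypothetical helper)."""
    match = RENAME_RE.search(diff_header)
    if match is None:
        return {"renamed": False, "rename_from": None, "rename_to": None}
    return {
        "renamed": True,
        "rename_from": match.group("src"),
        "rename_to": match.group("dst"),
    }


header = (
    "diff --git a/old.txt b/new.txt\n"
    "similarity index 100%\n"
    "rename from old.txt\n"
    "rename to new.txt\n"
)
print(parse_rename(header)["rename_to"])  # new.txt
```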

Head

  • Corrected problem where branches was only returning the last path component instead of the entire path component following refs/heads/.
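
The difference matters for branch names that contain slashes. The sketch below contrasts the two behaviours; branch_name is a hypothetical helper written for this example, not the function used inside GitPython.

```python
PREFIX = "refs/heads/"


def branch_name(ref):
    """Return everything after refs/heads/ (hypothetical helper)."""
    if not ref.startswith(PREFIX):
        raise ValueError("not a branch ref: %r" % ref)
    return ref[len(PREFIX):]


ref = "refs/heads/feature/login"
print(ref.split("/")[-1])  # login          (last path component only)
print(branch_name(ref))    # feature/login  (the full branch name)
```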

Repo

  • Modified the gzip archive creation to use the Python gzip module.
  • Corrected commits_between always returning None instead of the reversed list.
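
For readers who have not used the standard-library gzip module, compressing an archive in memory looks roughly like this. The bytes below are a stand-in for real tar output, and gzip_bytes is a name invented for this sketch rather than anything in GitPython.

```python
import gzip
import io


def gzip_bytes(raw):
    """Compress bytes in memory with the standard-library gzip module."""
    buffer = io.BytesIO()
    with gzip.GzipFile(fileobj=buffer, mode="wb") as stream:
        stream.write(raw)
    return buffer.getvalue()


tar_bytes = b"stand-in for the tar output of an archive command"
compressed = gzip_bytes(tar_bytes)
print(gzip.decompress(compressed) == tar_bytes)  # True
```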

GitPython Ported to FreeBSD

written by Michael Trier, on Jul 24, 2008 2:50:00 PM.

Wen Heping submitted a port of GitPython to be included with FreeBSD. It has been great to see all the people that have taken interest in GitPython and have worked to support the project.

GitPython 0.1.4 Released

written by Michael Trier, on Jul 16, 2008 11:24:00 PM.

I’m pleased to announce the release of GitPython 0.1.4. I appreciate all of the work from contributors on this release, especially from Florian Apolloner who has really taken the lead and managed everything.

DOWNLOAD

You can get it directly from cheeseshop at: http://pypi.python.org/pypi/GitPython/

Or check out the tag with:


$ git fetch --tags 
$ git checkout -b 0.1.4 0.1.4 

CHANGES

  • Renamed git_python to git. Be sure to delete all .pyc files before testing.

Commit

  • Fixed problem with commit stats not working under all conditions.

Git

  • Renamed module to cmd.
  • Removed shell escaping completely.
  • Added support for stderr, stdin, and with_status.
  • git_dir is now optional in the constructor for git.Git. Git now falls back to os.getcwd() when git_dir is not specified.
  • Added a with_exceptions keyword argument to git commands. GitCommandError is raised when the exit status is non-zero.
  • Added support for a GIT_PYTHON_TRACE environment variable, which lets us debug GitPython’s usage of git.
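
The behaviour of with_exceptions and GIT_PYTHON_TRACE can be sketched with a small command runner. Everything here is an assumption made for illustration: the run function, the CommandError class standing in for GitCommandError, and the trace format are not taken from GitPython. The demo runs the Python interpreter instead of git so that it works on a machine without git installed.

```python
import os
import subprocess
import sys


class CommandError(Exception):
    """Stand-in for GitCommandError (hypothetical, for this sketch only)."""


def run(argv, with_exceptions=True):
    """Run a command and return (exit status, stdout)."""
    # When the trace variable is set, echo the command line to stderr.
    if os.environ.get("GIT_PYTHON_TRACE"):
        print(" ".join(argv), file=sys.stderr)
    proc = subprocess.run(argv, capture_output=True, text=True)
    if with_exceptions and proc.returncode != 0:
        raise CommandError("%r exited with status %d" % (argv, proc.returncode))
    return proc.returncode, proc.stdout


status, output = run([sys.executable, "-c", "print('ok')"])
print(status, output.strip())
```

Passing with_exceptions=False returns the non-zero status to the caller instead of raising.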

Tree

  • Fixed up problem where name doesn’t exist on root of tree.

Repo

  • Corrected problem with creating bare repo. Added Repo.create alias.

Please let us know if you find problems.

We’re at a point where we have to decide where to go with the library, so if you have ideas, we’d like to know that as well.

GitPython

written by Michael Trier, on May 8, 2008 12:17:00 AM.

As you’re probably aware by now, I really like Git. It took some time but things finally started clicking. One of the things I wanted to do was make it easier to interact with Git from Python / Django projects.

I searched around for a Python Git module. I really didn’t find anything that looked complete to me, although I didn’t look too hard. Not being the creative type I noticed that Ruby has the grit library created by Tom Preston-Werner and Chris Wanstrath, which is very nice. I decided to port it because I can use it for some cool stuff, and because I figured it would help me learn a lot about Python. So here it is.

About

GitPython is a Python library used to interact with Git repositories.

GitPython is a port of the grit library in Ruby created by Tom Preston-Werner and Chris Wanstrath.

The method_missing stuff was taken from this blog post.

REQUIREMENTS

  • Git tested with 1.5.3.7
  • – used for running the tests
  • Mock by Michael Foord – used for tests.

INSTALL

You can download the code from CheeseShop or alternatively pull the source.


python setup.py install

SOURCE

GitPython’s git repo is available on Gitorious, which can be browsed at:

http://gitorious.org/projects/git-python/

and cloned from:

git://gitorious.org/git-python/mainline.git

USAGE

GitPython provides object model access to your git repository. Once you have created a repository object, you can traverse it to find parent commit(s), trees, blobs, etc.

Initialize a Repo object

The first step is to create a Repo object to represent your repository.


>>> from git_python import *
>>> repo = Repo("/Users/mtrier/Development/git-python")

In the above example, the directory /Users/mtrier/Development/git-python is my working repository and contains the .git directory. You can also initialize GitPython with a bare repository.


>>> repo = Repo.init_bare("/var/git/git-python.git")

Getting a list of commits

From the Repo object, you can get a list of Commit objects.


>>> repo.commits()
[<GitPython.Commit "...">,
 <GitPython.Commit "...">,
 <GitPython.Commit "...">,
 <GitPython.Commit "...">]

Called without arguments, Repo.commits returns a list of up to ten commits reachable by the master branch (starting at the latest commit). You can ask for commits beginning at a different branch, commit, tag, etc.


>>> repo.commits('mybranch')
>>> repo.commits('40d3057d09a7a4d61059bca9dca5ae698de58cbe')
>>> repo.commits('v0.1')

You can specify the maximum number of commits to return.


>>> repo.commits('master', 100)

If you need paging, you can specify a number of commits to skip.


>>> repo.commits('master', 10, 20)

The above will return commits 21-30 from the commit list.
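
The arithmetic behind that paging call can be checked with a plain list. This is a sketch of the semantics only, assuming the second argument is a maximum count and the third is the number of commits to skip; the page function is made up for the example.

```python
def page(commits, max_count=10, skip=0):
    """Return at most max_count items after skipping the first skip items."""
    return commits[skip:skip + max_count]


# Stand-in history: commit 1 is the newest, commit 100 the oldest.
history = list(range(1, 101))
print(page(history, 10, 20))
# [21, 22, 23, 24, 25, 26, 27, 28, 29, 30]
```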

The Commit object

Commit objects contain information about a specific commit.


>>> head = repo.commits()[0]

>>> head.id
'207c0c4418115df0d30820ab1a9acd2ea4bf4431'

>>> head.parents
[<GitPython.Commit "...">]

>>> head.tree
<GitPython.Tree "...">

>>> head.author
<GitPython.Actor "...">

>>> head.authored_date
(2008, 5, 7, 5, 0, 56, 2, 128, 0)

>>> head.committer
<GitPython.Actor "...">

>>> head.committed_date
(2008, 5, 7, 5, 0, 56, 2, 128, 0)

>>> head.message
'cleaned up a lot of test information. Fixed escaping so it works with subprocess.'

You can traverse a commit’s ancestry by chaining calls to parents.


>>> repo.commits()[0].parents[0].parents[0].parents[0]

The above corresponds to master^^^ or master~3 in git parlance.
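
If you need to walk further back, the chain of parents[0] lookups generalizes to a loop. The Commit class below is a toy stand-in with just the attributes used here, and nth_ancestor is a hypothetical helper rather than a GitPython function.

```python
class Commit:
    """Toy commit with an id and a list of parents (not GitPython's class)."""

    def __init__(self, id, parents=()):
        self.id = id
        self.parents = list(parents)


def nth_ancestor(commit, n):
    """Follow the first parent n times, like master~n in git."""
    for _ in range(n):
        commit = commit.parents[0]
    return commit


root = Commit("c0")
c1 = Commit("c1", [root])
c2 = Commit("c2", [c1])
head = Commit("c3", [c2])

print(nth_ancestor(head, 3).id)  # c0
```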

The Tree object

A tree records pointers to the contents of a directory. Let’s say you want the root tree of the latest commit on the master branch.


>>> tree = repo.commits()[0].tree


>>> tree.id
'a006b5b1a8115185a228b7514cdcd46fed90dc92'

Once you have a tree, you can get the contents.


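>>> contents = tree.contents
>>> contents
[<GitPython.Blob "...">,
 <GitPython.Blob "...">,
 <GitPython.Tree "...">,
 <GitPython.Blob "...">]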

This tree contains three Blob objects and one Tree object. The trees are subdirectories and the blobs are files. Trees below the root have additional attributes.


>>> contents = tree.contents[-2]


>>> contents.name
'test'

>>> contents.mode
'040000'

There is a convenience method that allows you to get a named sub-object from a tree.


>>> tree/"lib" 
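
The slash syntax is ordinary operator overloading. The sketch below shows one way to get the same effect with toy classes; it is not GitPython's implementation. In Python 3 the / operator is handled by __truediv__, whereas Python 2 code of this era would define __div__.

```python
class Blob:
    """Toy file entry (not GitPython's class)."""

    def __init__(self, name):
        self.name = name


class Tree:
    """Toy directory entry that supports tree / "name" lookups."""

    def __init__(self, name, contents=()):
        self.name = name
        self.contents = list(contents)

    def __truediv__(self, name):
        # Look up a direct child by name.
        for item in self.contents:
            if item.name == name:
                return item
        raise KeyError(name)


root = Tree("", [Blob("README"), Tree("lib", [Blob("urls.py")])])
print((root / "lib" / "urls.py").name)  # urls.py
```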

You can also get a tree directly from the repository if you know its name.


>>> repo.tree()


>>> repo.tree("c1c7214dde86f76bc3e18806ac1f47c38b2b7a30")

The Blob object

A blob represents a file. Trees often contain blobs.


>>> blob = tree.contents[-1]

A blob has certain attributes.


>>> blob.name
'urls.py'

>>> blob.mode
'100644'

>>> blob.mime_type
'text/x-python'

>>> len(blob)
415

You can get the data of a blob as a string.


>>> blob.data
"from django.conf.urls.defaults import *\nfrom django.conf..." 

You can also get a blob directly from the repo if you know its name.


>>> repo.blob("b19574431a073333ea09346eafd64e7b1908ef49")

What Else?

There is more stuff in there, like the ability to tar or gzip repos, stats, blame, and probably a few other things. Additionally, calls to the git instance are handled through a method_missing construct, which makes any git command available directly, with a nice conversion of Python dicts to command-line parameters.
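
Python has no method_missing, but __getattr__ gives a similar effect. The sketch below shows the general idea under stated assumptions: the option mapping (one-letter keys become short flags, longer keys become --key=value, True means a bare flag) is a guess at a sensible convention rather than a description of GitPython's exact rules, and the toy class only builds the argument list instead of running git.

```python
class Git:
    """Toy dispatcher: any unknown attribute becomes a git subcommand."""

    def __getattr__(self, name):
        # __getattr__ runs only when normal attribute lookup fails, which
        # makes it Python's closest analogue to Ruby's method_missing.
        if name.startswith("_"):
            raise AttributeError(name)

        def call(*args, **kwargs):
            command = name.replace("_", "-")  # rev_list -> rev-list
            return ["git", command] + transform_kwargs(kwargs) + list(args)

        return call


def transform_kwargs(kwargs):
    """Turn keyword arguments into command-line options (assumed mapping)."""
    options = []
    for key, value in sorted(kwargs.items()):
        if value is None or value is False:
            continue  # unset options are simply omitted
        if len(key) == 1:
            options.append("-" + key)
            if value is not True:
                options.append(str(value))
        else:
            flag = "--" + key.replace("_", "-")
            options.append(flag if value is True else "%s=%s" % (flag, value))
    return options


print(Git().rev_list("master", max_count=10, skip=20))
# ['git', 'rev-list', '--max-count=10', '--skip=20', 'master']
```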

Check the unit tests, they’re pretty exhaustive.

What is Next?

There are a couple of tests that don’t pass due to an inability to mock them properly, so I’m going to get those fixed up.

I also plan to restructure some of the object relationships. A few of them feel a little dirty to me.

LICENSE

New BSD License. See the LICENSE file.

Relax, It's Forking Okay

written by Michael Trier, on May 1, 2008 11:57:00 AM.

There’s been a lot of commotion over the recent fork of the Pidgin IM client by the folks behind FunPidgin. The community at large has grown accustomed to the idea that forking is a “bad thing.” On the other hand, the use of distributed version control systems (DVCS), like Git, Mercurial, and Bazaar, has made forking a project part of the standard process. The fact that there is an “official” repository is a matter of community agreement and not anything dictated by the software. Large open source projects are investigating how to gain the benefits of a DVCS while still maintaining control of their code base. Some, like the Ruby on Rails project, have decided recently to make the jump. I am encouraged by this, and I’ve been watching the IRC discussions in #github pretty closely as they attempt to figure out the logistics of working in a distributed yet overall centralized fashion.

So the real question in my mind is why do we, as a community, have a general distaste for forking? A lot of people will say it is because forking divides the resources and the end result is two half-assed projects instead of something that could adequately compete against closed source solutions. Although that may be the case in some situations, it certainly doesn’t have to be the case. The drive for providing a permissive license is so that we can all benefit from the work that came before us and help propel it to that next level. In a lot of ways that is what forking is all about. The research world offers an analogous situation: it depends on the idea of forking research. Each project takes prior research and throws something new into the mix, and moves it along to the next level, for the benefit of all. Why do we not see it the same way in the software world?

What is interesting to me is that we seem to have no problem with forking ideas when the goal is to put them on another platform. Consider blogging software. Every time a new framework / language comes out there are multiple development efforts to do a WordPress clone, or a Typo clone, etc… on the new platform. The same holds true for Content Management Systems, and a host of other types of applications. We could argue that those efforts do not work to help strengthen the open source offering. But I feel differently. I think competition and choice are beneficial to the community at large. If WordPress provides integrated Flickr support, then Mephisto feels compelled to add Flickr and Magnolia support. We also cannot discount the learning opportunities that these types of forks provide. In the end we all win.

I think one of the real reasons that many are uncomfortable with forking is that the most successful open source examples we have to look at, projects like Linux, Gnome, or Rails, have all been monolithic efforts driven by a single individual (Linus Torvalds, Miguel de Icaza, or David Heinemeier Hansson). The success of those projects is in large part a result of the charisma, intelligence, and management of those individuals. This has left us with the impression that this is how it should be done. While it has clearly been a successful approach, it is not the only approach. There have been truly community-driven projects like PostgreSQL that have been hugely successful, with no one individual clearly at the helm.

Clearly a lot of open source dictators of the day have no interest in seeing their projects forked. A fork has a direct impact on their ability to exert influence in the community, and works to dilute their personal brand. Quite honestly this is a real issue because this influential capital translates into real dollars for these individuals. That said, where I have seen forking or competitive offerings, the end result is that both projects benefit. Most recently I feel that the Merb project has forced Ruby on Rails to get after it and address some of its performance issues. Even if Merb has no impact at all on Ruby on Rails, for a lot of people it will be exactly what they are looking for in a framework. That does not make the Merb approach correct and the Ruby on Rails approach incorrect, just different.

I am actually quite excited that this issue is being brought to the forefront by the DVCS systems. I am also eager to see the forking of more projects. There is no requirement that says the things I’m interested in seeing in a blog platform need to be the same things that you are interested in seeing. There is room for both of our ideas, whether we work together or separately. Some people will be compelled by my offerings and some by others. In my opinion the community benefits from this plethora of ideas and options. This is a “good thing.”

Great List of Git Resources

written by Michael Trier, on Mar 22, 2008 11:33:00 PM.

Hardcore Forking Action

written by Michael Trier, on Feb 17, 2008 11:01:00 AM.

Someone at github has a sense of humor.

Django Git Screencast

written by Michael Trier, on Jan 23, 2008 9:22:00 PM.

As promised, Brian Rosner, co-host of This Week in Django, has just released a screencast on Using Git with Django. Although this is Brian’s first screencast, he has put together a very professional tutorial. It’s enough to whet your appetite for using Git with Django.

I expect, and hope, that we will see more great screencasts from Brian in the future.

Learning Git

written by Michael Trier, on Dec 29, 2007 4:35:00 PM.

Learning Git has been on my todo list for some time. But, like most things, until you really have the need for it in a real-world situation it rarely rises to the top of the list. I had such a situation today and so I had to get familiar with Git fairly quickly.

Three resources proved to be invaluable to me. The first is this post on Installing Git for Mac OS X Leopard. It worked effortlessly. The only change I made was to pull the latest version of Git.

The second resource, which is just amazing, is the Git PeepCode Screencast. The PeepCode Screencast is $9 but well worth the money. Geoffrey Grosenbach walks you through the Git SCM, including installing, basic commands, branching, more advanced style branching, and integration with Subversion.

The final resource is not one that I can offer. Brian Rosner (brosner) from the Django IRC channel gave me a lot of insight and walked me through a couple of things. He also graciously gave me access to his Git repository.

Although I’m just getting my feet wet, Git seems to be the next generation source code management tool. If you’re looking to get familiar with Git quickly, I highly recommend these resources as well as the excellent documentation available from the Git project page.