Empty Thoughts

Feed

Living on the Edge

Entries tagged “sqlalchemy”

Creating a Custom Table Object in SQLAlchemy

written by Michael Trier, on Aug 3, 2008 11:39:00 PM.

A frequently asked question on the and on the IRC channel (#sqlalchemy on freenode) is how to go about implementing a custom Table object, which automatically adds specific columns to every table.

For example, if you want all of your tables to contain created_at and updated_at columns, you might be inclined to try and subclass Table. Unfortunately, Table is not easy to subclass due to the metaclass which controls its creational pattern. Instead we can use a factory function to handle it for us:


def TimeStampTable(*args, **kwargs):
    args = args + (
        Column('created_at', DateTime, nullable=False, default=datetime.now),
        Column('updated_at', DateTime, onupdate=datetime.now)
    )
    return Table(*args, **kwargs)

This function simply adds the extra columns to the Table args and returns the new Table. We can then use it in our code like so:


my_table = TimeStampTable('my_table', metadata, 
    Column('id', Integer, primary_key=True)
)

I’ve been using this approach on a recent project and it works quite well. This idea comes from Mike Bayer.

Replacing Django's ORM with SQLAlchemy

written by Michael Trier, on Jul 23, 2008 9:41:00 PM.

Will Larson just created a post titled Replacing Django’s ORM with SQLAlchemy that covers using SQLAlchemy with a Django project. It’s a great write-up, and generally what Django folks are referring to when they say, “of course you can use SQLAlchemy with Django.” I think Will does a good job of stepping you through the important bits.

One thing I appreciate is that the post hints at some of the negatives of doing something like this. You lose things like generic views, all contrib apps including the admin, most of the management commands, and some smaller pieces that are embedded in various parts of the framework. Tread this way cautiously.

Another item in the post that got my attention was when Will mentioned the following in reference to the possibility of doing a full drop in replacement for the Django ORM, essentially allowing you to use fully the Django bits alongside of the SQLAlchemy pieces:

This tutorial won’t build that Frankenstein, since it’s only interested in exploring the loose coupling aspect, but it wouldn’t be prohibitively difficult to do so. Although, it would be awkward in some regards.

Well I, and others, have been building “that Frankenstein” for some time now with a project we call Django-SQLAlchemy. There’s also a somewhat defunct project called that is a hybrid between what Will demonstrates and what we’re trying to do with django-sqlalchemy.

I’m not sure whether or not these projects are “awkward” or “Frankensteins,” but I can assure you thus far they have been prohibitively difficult, as we see no full implementation for a drop-in replacement still today. Getting the initial stuff in place is pretty darn easy, but as you get into it there are lots of little edge cases that complicate things significantly.

I want to thank Will for the great post. I think he does a great job of talking about the strengths of SQLAlchemy, and hopefully people will be inclined to experiment with the idea of using SQLAlchemy in addition to, or instead of, Django’s ORM.

Two New Great Books

written by Michael Trier, on Jul 12, 2008 11:57:00 AM.

There seems to be a plethora of Python related books coming out these days. I just stumbled across two interesting ones. First up is . I desperately want this book but I’m tired of paper books. It’s available on Safari Online, but I don’t have that. I’d really like to see a Kindle version. I’ll probably end up buying the paper, because it looks to be very exciting.

Secondly is . This book is offered as a free beta book, but will also be available for purchase through Apress. I’ve skimmed the contents but plan to spend some time with it in the coming week. I’m really interested in the Deployment chapter, so I’ll probably start there first.

SQLAlchemy Code Swarm Visualization

written by Michael Trier, on Jun 24, 2008 7:55:00 PM.

Someone please stop Brian Rosner. He’s out of control. Although I must say, no surprises on this one.

Django-SQLAlchemy Update

written by Michael Trier, on Apr 7, 2008 1:32:00 AM.

It has been a couple of weeks since I announced the Django-SQLAlchemy project. I thought it would be a good idea to provide somewhat regular updates for those that are interested, but not following the project at a code level.

Summary

At the overall level we’ve seen tremendous progress on the project. In the last couple of week there were 77 commits affecting about 4500 lines of code. We implemented the testing framework into the project and as a result we have added about 45 tests. We are just getting started in the testing area and have a ways to go before we have adequately tested the existing code. But it is moving along nicely.

Additionally, we have a new regular committer with Samuel Hsiung coming on board and submitting patches to implement the inspectdb management command as well as all of the introspection stuff related to that. We really appreciate his help.

Details

As I mentioned in the previous post, we see the integration of Django and SQLAlchemy breaking down into three primary areas of development. I’ll outline how we’re progressing in each of these areas below:

1. Management Commands

As I mentioned above, Sam Hsiung implemented the inspectdb command. In addition to that we have implemented flush, reset, sql, sqlreset, and syncdb. Not all of these are what we would consider 100% complete, but they’re there and working for our current use case.

We are kind of stalled on the loaddata and dumpdata management commands until we implement extra. But there are lots of others that can be implemented. This is a great area for someone new to the project to get involved.

2. Declarative Mapper

This one is more difficult to gauge. Currently we have the declarative stuff working for our current test cases. Which includes straight models, Foreign Keys, and Many To Many. We’ve have not got into model inheritance nor have we really tested out the M2M relationships very extensively at this point.

All field mappings are done, but we continue to tweak them as we run into issues. There are currently very few tests related to the declarative area of things. This is an area that I plan to focus on over the next week.

Related fields are mapped although currently we’re letting Django handle the RelatedManager for us. Although this works it does so at the expense of some functionality that we’d like to provide down the road. We’ve modified the declarative mapper to support overriding this behavior but I’d like to provide a default implementation that takes advantage of SQLAlchemy’s attribute instrumentation to allow SQLAlchemy to manage this relationship but in a way that mirrors how Django manages it. Similar to how we’re managing QuerySet.

3. QuerySet Mapping

This area has been a lot of fun and also the area where we’ve provided the most tests so far. As I stated previously we are providing our own DjangoSQLAlchemyQuerySet in place of the default QuerySet. In doing so it’s necessary to map all of the methods that are available to Django. We’ve broken this into three smaller subsection of code:

a. QuerySet Methods – We have mapped a lot of the methods on the QuerySet. So instead of listing them all, I’ll list what is left. The QuerySet methods that still need some love are: dates, complex_filter, select_related, and extra (ugh).

Additionally there are a couple that need some work or further testing. order_by is implemented but we have not tested it for all forms of it. Remember order_by used to use one format and now uses a format similar to filters. We have a potential problem with delete regarding invalidating session information. I have not yet tested it so I can not say for sure if it is an issue.

We have added in an options method for passing extended options to SQLAlchemy. We have also added in a first method, just because it is convenient. Items like this are just in there for now. The end result of where Django stops and SQLAlchemy begins is to be determined.

b. Field Lookups – We have created a little mapper that handles the field lookups, and does the right thing in creating the join conditions needed. Currently we provide support for everything but range, year, month, day, search, regex, and iregex.

In addition to the above we have added two new constructs, just because we can, like and ilike. These two provide raw like ability for doing more complex LIKE matching in your field lookups.

Join conditions are working and tested for Foreign Key situations. They’ve yet to be tested for Many to Many relations.

c. Subclasses – This includes subclasses of QuerySet that are implemented within Django. Technically these probably do not need to be implemented as such, but the design makes sense on its own and it makes sense for what we are attempting to do.

So far we have implemented a DjangoSQLAlchemyValuesQuerySet and a DjangoSQLAlchemyValuesListQuerySet. Both of these function for the common cases although we have not tested them with Many to One values at this point.

We still need to implement the DateQuerySet and the EmptyQuerySet. So there’s some low hanging fruit there.

Wrap Up

Up to now a lot of the mapping that we’ve been doing has been just a lot of grunt work. Turning the crank and making sure things work as expected. We needed some early wins to know that we were heading in the right direction and we have that now. The next step is really hitting more of the edge cases, the more difficult areas of implementation. Before we get into that though I want to be sure we have good overall test coverage for what is implemented up to this point. That will be my focus over the next week or two.

It’s amazing to me how similar many of the approaches are between Django’s ORM and SQLAlchemy’s ORM, for the subset of functionality provided by Django. That makes the job a lot easier. Although there are also interesting differences.

For instance, Django’s order_by is not generative (they can not be chained), whereas with SQLAlchemy they are generative. Django’s get method respects current filters, whereas SQLAlchemy’s get method ignores any pre-existing filtering. (Note this is what I’m experiencing so if I got it wrong I’m sure I’ll be set straight.)

I must admit that the more I learn about SQLAlchemy the more I’m impressed with the flexibility and power of the framework. It has some amazing code, and the potential for Django that it provides through this project is exciting.

Once again I would like to thank Mike Bayer and Jason Kirtland for providing a ton of assistance. I know that I’ve been annoying at times, but they’ve been great in nudging us in the right direction when we lose our way.

Elegant Testing Decorators

written by Michael Trier, on Mar 28, 2008 10:20:00 PM.

I’ve been working on fleshing out the tests for Django-SQLAlchemy and right away I discovered a problem. Incidentally, this is why I love unit testing. The problems was that a very simple test of the query contains filter syntax was failing. I was expecting four items but was getting five back. After digging into it I discovered the problem was with SQLite. SQLite doesn’t seem to respect case when using the LIKE syntax.

Let us say we have the following setup:


from apps.blog.models import Category

class TestContains(object):
    def setup(self):
        Category.__table__.insert().execute({'name': 'Python'}, 
            {'name': 'PHP'}, {'name': 'Ruby'}, {'name': 'Smalltalk'}, 
            {'name': 'CSharp'}, {'name': 'Modula'}, {'name': 'Algol'},
            {'name': 'Forth'}, {'name': 'Pascal'})

a query with contains in Django on SQLite, like the following, will return five results, instead of the expected four results:


>>> Category.objects.filter(name__contains='a').count()
2008-03-28 20:41:43,228 INFO sqlalchemy.engine.base.Engine.0x..f0 BEGIN
2008-03-28 20:41:43,229 INFO sqlalchemy.engine.base.Engine.0x..f0 SELECT count(foo_category.id) AS count_1 
FROM foo_category 
WHERE foo_category.name LIKE ?
2008-03-28 20:41:43,229 INFO sqlalchemy.engine.base.Engine.0x..f0 ['%a%']
5

(Incidentally that’s actually going through the Django-SQLAlchemy backend and not Django’s ORM.)

So I kind of hemmed and hawed for a bit trying to figure out how I could make this work under different databases, at least from the testing standpoint. It finally occurred to me that SQLAlchemy has to be able to test their ORM against a lot of different backends so they must have a nice solution to this.

Decorators to the Rescue

It turns out that SQLAlchemy has implemented an elegant set of decorators for just this problem. They were also written in such a way that it was quite easy for me to extract them and modify them slightly to work with Django-SQLAlchemy tests. So what’s in this package?

fails_if(callable_) – Mark a test as expected to fail if callable_ returns True.
future – Mark a test as expected to unconditionally fail.
fails_on(dbs) – Mark a test as expected to fail on one or more database implementations.
fails_on_everything_except(dbs) – Mark a test as expected to fail on most database implementations.
unsupported(dbs) – Mark a test as unsupported by one or more database implementations.
exclude(db, op, spec) – Mark a test as unsupported by specific database server versions. This decorator allows an impressive list of options, for example @exclude('mydb', '<', (1,0))

There’s a lot more than that, but I will not detail them all here. If you want to dig through it all check out test/testlib/testing.py module.

The Implementation

So once I was able to extract and modify these decorators I ended up with very elegant syntax for my tests. Here is a sample:


class TestContains(object):
    def setup(self):
        Category.__table__.insert().execute({'name': 'Python'}, 
            {'name': 'PHP'}, {'name': 'Ruby'}, {'name': 'Smalltalk'}, 
            {'name': 'CSharp'}, {'name': 'Modula'}, {'name': 'Algol'},
            {'name': 'Forth'}, {'name': 'Pascal'})

    @fails_on('sqlite')
    def test_should_contain_string_in_name(self):
        assert 4 == Category.objects.filter(name__contains='a').count()
        assert 1 == Category.objects.filter(name__contains='A').count()

    @fails_on_everything_except('sqlite')
    def test_should_contain_string_in_name_on_sqlite(self):
        assert 5 == Category.objects.filter(name__contains='a').count()
        assert 5 == Category.objects.filter(name__contains='A').count()

    def test_should_contain_string_in_name_regardless_of_case(self):
        assert 5 == Category.objects.filter(name__icontains='a').count()
        assert 5 == Category.objects.filter(name__icontains='A').count()

Special thanks goes to Mike Bayer and the rest of the contributors to SQLAlchemy for providing such a great solution. I am constantly amazed by their code.

Django-SQLAlchemy

written by Michael Trier, on Mar 21, 2008 12:59:00 PM.

One of the things that has kept me very busy over the past several months is something called Django-SQLAlchemy. I started toying with the idea back in the fall and it just kind of rumbled around in my head for a while as I thought through different approaches to integrate SQLAlchemy into my Django projects. Yes, I’m aware you can just import sqlalchemy and you’re done. But no one is every really talking about that, are they? What people want is the ability to keep everything the same but still have the power and flexibility of SQLAlchemy available to them. That’s what this project aims to do.

Finally, some time in or around December 2007, Brian Rosner and I began discussing the project. I was glad to hear that he had been thinking about similar things, and we decided that we would put forth some effort on the project. Brian and I have been working for the past several months trying to do a proof of concept and develop a roadmap for the project. A lot of hackish code was developed, and we frequently went down one road only to discover there was a much easier way. In the end we figured out what we think is a plausible plan for implementing SQLAlchemy into Django.

At PyCon 2008 in Chicago Brian and I decided to set up a Birds of a Feather session in one of the Open Spaces to discuss django-sqlalchemy. Much to our surprise we had a great turnout. First of all Mike Bayer, creator of SQLAlchemy, and Jason Kirtland, core contributor to SQLAlchemy, both showed up and were extremely helpful. They indicated that they are really interested in seeing the project become a success, and they would help in any way possible. During the meeting they even discussed modifying some functionality within SQLAlchemy in order to make our job a bit easier. Adam Gomaa also showed up and expressed an interest in helping on the project. We added Adam as a contributor immediately and we all began hacking away.

Over the next several days, Adam, Brian, and I made really good progress. We solidified some of the hackish filter translation code, we created a test runner, We made a lot of progress on mapping the related fields, and most importantly we completely replace the declarative layer (previously Elixir based), with code based on the declarative layer that Mike created in SQLAlchemy. We also bugged Mike and Jason continuously, and every time they provided us with valuable guidance that saved us hours of grief. We can not thank them enough for their help.

Development Roadmap

The fundamental approach to the django-sqlalchemy project is to implement SQLAlchemy as a custom database backend. Actual SQLAlchemy backends are made available through custom project settings. By doing so we can “hijack” all calls to the QuerySet with our own custom QuerySet that interprets these calls and passes them on to SQLAlchemy. We also need to be able to provide SQLAlchemy hooks into our Django models and this is handled by the declarative layer of the project. Finally, things like management commands that have affect on the database backend need to be overridden to do the right thing in SQLAlchemy terms.

1. Management Commands – Management commands that affect the database backend in any way need to be overridden to pass on calls to the SQLAlchemy layer. This affects at least the following: syncdb, sql*, reset, dbshell, createcachetable, dbshell, dumpdata, flush, inspectdb, loaddata.

2. Declarative Mapper – mapping the Django model structure into SQLAlchemy Table and Mapper information so that access to SQLAlchemy is available while still providing access to all of Django’s expectations for functionality of a model. This includes mapping all of the fields, related fields, polymorphic relations, etc. Plus once we get it functionally compliant, there will be a lot of hooks for extensions that will provide greater access to SQLAlchemy power.

3. QuerySet Mapping – Map the backend django-sqlalchemy database so that we can override Django’s QuerySet into a SqlAlchemyQuerySet that interprets and passes on all backend query functionality to SQLAlchemy. This includes the handing of sessions, metadata, and transactions.

Benefits

So why are we doing all of this? Well mostly we are scratching our own itch. We also sense that there is a lot of interest in the community for this type of thing. Additionally we realize that there are a lot of benefits provided by a django-sqlalchemy integration, that will help make Django even more powerful:

Addition database support: MS-SQL, Firebird, MaxDB, MS Access, Sybase and Informix. IBM has also released a DB2 driver

Connection Pooling

Multi-Database Support, including things like sharding

Extremely powerful query language

Initially all of these things may not be available as we first aim to achieve functional equivalence to Django’s ORM, but over time we will provide hooks and ways to tie into the additional functionality provided by SQLAlchemy.

Interest

So if you’re still reading along at this point, you probably have some interest in the project. We welcome you to get involved in any way that works best for you. We have set up the following resources for the project:

Gitorious for Source Control

IRC Channel #django-sqlalchemy

I will also try to be good about posting progress updates here.

Getting a SqlAlchemy Quoted Name

written by Michael Trier, on Mar 5, 2008 9:19:00 PM.

Just because I spent so much time tracking it down, you can get the SqlAlchemy database-dependent quoted name for a column by doing:


>>> metadata.bind.dialect.identifier_preparer.quote_identifier("test")
'"test"'
>>>