1. Skip to navigation
  2. Skip to content

Full-Text Search on SQLite

Although some people may be aware that SQLite supports Full-Text Search (FTS) I would guess that not many have much experience with getting it setup and using it. I only base this information on the fact that the documentation on the SQLite site as well as what I could find on the Googles has been pretty slim. As a result I thought it might would to provide a walkthrough on getting SQLite FTS setup and working. That said I must preface this by saying that the instructions here apply specifically to Mac OS X.

Too Many Options

There are a lot of downloads available for SQLite, and that alone can cause some confusion. The thing to keep in mind is that with SQLite you’re really dealing with two parts: the engine and the client. The client is what you use to interact with a SQLite database on the command-line. The engine is the functionality itself and can operate independently of the command-line application.

When it comes to building from source code, there’s a couple of different options as well. First there’s the raw source. But you can also get an amalgamation of the source. This comes in two different flavors: with and without configure scripts and makefiles. You might be curious as to why there are so many options, and it really comes down to being able to support the many different ways people use SQLite. Some folks are only interested in using SQLite in an embedded environment where they might be linking to the engine directly. On the other spectrum some people are using SQLite as a nice testing platform for doing web development. This often requires the SQLite engine, a language-engine interface (dbapi) library, and the client application.

To support FTS within SQLite we have a couple options. SQLite supports the idea of loadable plugin modules. In other words technically we could build the FTS plugin as a dynamically loadable module and then load it before use. The second option is to statically link the FTS plugin into the SQLite source code at build time. This is the approach I will take here.

Compiling and Installing

The source code bundle that we need is the sqlite-amalgamation package. The amalgamation is a single large source code file that contains the entire SQLite implementation, and (as of version 3.5.3) the Full-text search engine. Therefore we want to be sure we get at least release version 3.5.3, just to make things simple. Since there’s two amalgamation packages be sure to select the one that contains the configure script and makefile for building it.

First let’s download and extract the source code:


$ curl -O http://www.sqlite.org/sqlite-amalgamation-3.5.9.tar.gz
$ tar -xvzf sqlite-amalgamation-3.5.9.tar.gz
$ cd sqlite-amalgamation-3.5.9

Now that we have the source we just need to follow the standard process for configuring, compiling, and installing the package. The most important piece is setting configuration parameters appropriately for our platform. Since we want to statically include FTS we need to let the compile know that’s what we plan to do. To do this let’s issue the following command:


$ CFLAGS="-DSQLITE_ENABLE_FTS3=1" ./configure

Note that we’ve enabled FTS3, the latest version.

UPDATE In the original version of this post I suggested changing the default install location to /usr from /user/local. While this will work fine in most cases, it’s not a good thing to do because the underlying operating system is expecting a specific version in the default location. Read the comments below for a good way around overriding the default implementation.

Along these lines, it is important to not get confused by the actual _sqlite3.so library used by Python. On my Mac this is located in the following directory:

/System/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/lib-dynload/

It appears that this loadable module simply calls out to the installation in /usr/lib/sqlite3.

The next thing we need to do is actually compile the source and install it. So to do that issue the following two statements:


$ make
$ sudo make install

At this point the latest SQLite should be installed and ready to go on your system. We can verify this two different ways. First let’s verify that the command-line client is in fact updated:


$ sqlite3 --version
3.5.9

If you got something other than that you might want to re-check the steps. Secondly we want to be sure that Python is using the correct version.

Note: I assume Python here because that’s what I’m using but if you’re using a different language you’ll want to modify the checks accordingly. If you are still seeing the older version then you might want to investigate how SQLite is being used in your system. Since it has such a small footprint it’s not uncommon for it to be just bundled in with the program. I know that a search on my system found at least 4 or 5 different installations of SQLite.


$ python
>>> import sqlite3
>>> sqlite3.sqlite_version
'3.5.9'

The above assumes that you’re using Python 2.5+. If you’re using an older version of Python then you will likely be using the pysqlite module.

Usage

Once we have SQLite with Full-Text Search capability built in the next step is to just get familiar with the how it all works. I’m not going to go into too much detail here, because there is pretty good usage documentation on the SQLite website.

The key piece is that when you want to utilize FTS on SQLite you need to create your table as a virtual table that utilizes the FTS engine. To do that we need to modify the table creation syntax slightly:


CREATE VIRTUAL TABLE posts using FTS3(title, body);

After we’ve created the appropriate virtual table we can interact with it just like any other table. The difference is we have access to a new keyword syntax: the MATCH operator. The MATCH operator is used to access the FTS functionality.

For instance if we want to find all posts that contain the word python in them we can issue the following SQL:


SELECT * FROM posts WHERE posts MATCH 'python';

This will actually find all posts where the title or body columns contain the text python. In addition to this simple syntax the FTS engine supports things like OR and not (-) types of statements, as well as more complex items like prefix searching:


SELECT * FROM posts WHERE posts MATCH 'py*';

Note: one of the limitations (there’s many) of SQLite’s FTS is that you can’t use more than one MATCH operator in a query. To counteract this just be sure to combine search terms into a single MATCH operator statement and take advantage of the search syntax to do and, or, not, etc…

Have Fun

Well I hope this is enough to get you on your way to using Full-Text Search within SQLite. If you have any special tips, please be sure to leave me a comment.


Discussion