yyafl Release Soon
I’m just about ready to push out the first release of yyafl, if you’re not already running it from the repository.
Stay tuned over the next 3 days.
Just finished adding a flexible layout system to yyafl, my reimplementation of Django newforms for other Python web frameworks or WSGI adapters. Now it’s simple to add default layouts or render a different layout on demand.
For example, to add a simple TableLayout() to a form:
class Form1(yyafl.Form):
    name = fields.CharField(label = "User name", required = True)
    email = fields.CharField(label = "Your e-mail address", required = True)
    hidden = fields.CharField(widget = HiddenInput, default = "123")
    _layout = yyafl.layout.TableLayout()
and to render the form to HTML:
# Render using the layout specified in _layout above.
content += f.render()
# Or invoke the layout explicitly
l = yyafl.layout.TableLayout()
content += l.render(f)
yyafl Layout()s also understand decorators:
decorators = {
    '*' : MarkRequiredDecorator(),
    'name_field' : HighlightDecorator()
}
layout = yyafl.layout.TableLayout(decorators = decorators)
layout.render(f)
Decorators can change attributes and wrap fields in markup blocks. In the above example, the ‘*’ signifies the decorator should apply to all fields, and the HighlightDecorator should only apply to the field called ‘name_field’.
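To make the '*' lookup rule concrete, here is a minimal, self-contained sketch. It does not use yyafl itself: the MarkRequired and Highlight classes, the decorate() method, and the dict-based fields are all hypothetical stand-ins, and the real yyafl decorator interface may well differ.

```python
class MarkRequired(object):
    """Hypothetical decorator: append an asterisk to labels of required fields."""
    def decorate(self, field):
        if field.get("required"):
            field = dict(field, label=field["label"] + " *")
        return field


class Highlight(object):
    """Hypothetical decorator: tag the field with a CSS class."""
    def decorate(self, field):
        return dict(field, css_class="highlight")


def decorators_for(name, decorators):
    """Return the decorators for `name`: the '*' entry first, then the exact match."""
    return [decorators[key] for key in ("*", name) if key in decorators]


def apply_decorators(fields, decorators):
    """Run every applicable decorator over each field and return the new fields."""
    decorated = {}
    for name, field in fields.items():
        for decorator in decorators_for(name, decorators):
            field = decorator.decorate(field)
        decorated[name] = field
    return decorated


# Fields are plain dicts here purely for illustration.
fields = {
    "name_field": {"label": "User name", "required": True},
    "email": {"label": "Your e-mail address", "required": False},
}
decorators = {"*": MarkRequired(), "name_field": Highlight()}
result = apply_decorators(fields, decorators)
```

In this sketch the wildcard decorator runs before the field-specific one. That ordering is an assumption of the sketch, not something the post specifies.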
There are still some not yet implemented ideas:
You can find these latest changes in the Git repository linked from the main yyafl web page.
With the popularity of Django and Rails, people have asked why I use CherryPy. The reason is simple: I love the flexibility. I get to pick what I want to use inside; CherryPy simply worries about how to get the information to a web browser. I found this post to be a good overview of the strengths of CherryPy. Give it a good read. I completely agree with the author’s points, and wish CherryPy would get a bit more attention in the do-everything web-framework era.
qsgen, the single-script static web-site generator, has just hit its first release, version 0.1.
In short, qsgen is a pure Python functional wrapper around Mako templates and the Pygments library. It lets you build a hierarchical set of .html pages and associated base templates using Mako, which will be transformed into static pages for serving on the web. In addition, it also includes support for Pygments syntax highlighting in-line with the HTML or even with source code sourced from separate files.
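The general shape of such a generator can be sketched with the standard library alone. This is not qsgen's code: string.Template stands in for Mako, there is no Pygments step, and the build_site name and the title/body placeholders are assumptions made for illustration.

```python
import os
from string import Template


def build_site(src_dir, out_dir, base_template):
    """Wrap every .html page under src_dir in base_template and write it to out_dir."""
    written = []
    for root, _dirs, files in os.walk(src_dir):
        for filename in sorted(files):
            if not filename.endswith(".html"):
                continue
            src_path = os.path.join(root, filename)
            # Keep the directory hierarchy of the source tree in the output.
            rel_path = os.path.relpath(src_path, src_dir)
            with open(src_path) as handle:
                body = handle.read()
            page = base_template.substitute(title=rel_path, body=body)
            dest_path = os.path.join(out_dir, rel_path)
            dest_dir = os.path.dirname(dest_path)
            if not os.path.isdir(dest_dir):
                os.makedirs(dest_dir)
            with open(dest_path, "w") as handle:
                handle.write(page)
            written.append(rel_path)
    return sorted(written)


# A hypothetical base template with two placeholders.
BASE = Template("<html><head><title>$title</title></head><body>$body</body></html>")
```

Calling build_site("pages", "output", BASE) would mirror the pages tree into output, one static file per source page.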
I’m using qsgen on both www.stackfoundry.com and tomeapi.com. If you feel that making a dynamically scripted website just to display some simple content is silly, or that the complexity and headaches of a true CMS outweigh the benefits, then qsgen is for you.
(qsgen is very similar to a script called Maehan, which I used for many years. Consider qsgen a more feature-packed and cleaner version of the old Perl Maehan.)
You may be like me, and have a large (20″+, or especially 30″ in this case) monitor attached to your computer. You’re also an avid computer user and have more than one program open at a time which you want to view simultaneously. You also use the keyboard much more than the mouse (except when lazily surfing the web). How often have you noticed spending large amounts of time moving your windows around, resizing, moving, rearranging, and all the normal window management jazz? How often do you find yourself switching among windows with alt-tab, just to refer to some information which is currently obscured by the window you’re currently in? If your answer is very often, then you may be a candidate for what is known as a Tiling window manager.
I’ve been using both KDE and Gnome on Linux for many years, and also use the Redmond user-interface (aka Windows). I also own a Mac. The Macintosh (OS X) and Windows are similar models, and Gnome and KDE are even more similar to Windows. Now, I am not saying that either interface is unusable, but it is an inefficient choice if you fall into the categories above.
Granted, the classic window management paradigms are very familiar. I won’t say natural, since computers are really not natural, they’re a learned behavior. Computers only feel natural when they operate in a similar fashion to other computers you’ve used or seen in the past. The problem with this user interface paradigm is becoming apparent as screen sizes grow (and shrink!). Your desktop is simply far too large to use properly in its overlapping window mode. Plus all the mouse work moves you away from your primary user interface device: the keyboard.
Many program designers have noticed this. They’ve moved away from the nightmare of MDI (multiple overlapping windows inside of your overlapping window), and developed docking elements. The toolbar you can dock and move. The properties editor on the right. You can’t (generally) overlap the dockable elements. But in all cases, the elements are on the side or around your document or main view. Stuff stays put, is intelligently placed, and doesn’t get in your way constantly. Why can’t a window manager work the same way? But it can! Enter the Tiling window manager.
I like StumpWM, which is a tiling window manager modeled after ratpoison (and all other tiling managers before it). Its command set is a good mix of GNU Screen and emacs. It’s also written in pure Common Lisp. There are other tiling managers out there, such as my second favorite, XMonad, which can even work flawlessly in Gnome and KDE (or mostly flawlessly anyway). XMonad has the advantage of automatic layout modes, which StumpWM currently lacks. I didn’t particularly like Ion3 or WmII: their multi-monitor support is not as developed as StumpWM’s, plus Ion3 open-source development could be at risk.
I’ve been using StumpWM for a couple of days. I am trying to quit Gnome cold-turkey, and so far have been successful. All of my applications work. The few Gnome-specific management applications I used I’ve managed to replace (such as NetworkManager for the laptop). My environment is lighter without the 50 background processes Gnome uses for automagic abilities (HAL, DBUS, etc). It doesn’t suffer the “users like simple, so let’s not complicate stuff” philosophy Gnome uses in several places (an example: gnome-screensaver vs xscreensaver). Plus, I can use the keyboard almost exclusively, only touching the mouse to use certain applications. All window operations are entirely keyboard driven.
But if you’re not trying to quit Gnome completely, both window managers can show the normal Gnome panel for you. One thing you’ll have to scrap is the Desktop paradigm - you can put a picture on your root window, but the cluttered mass of icons isn’t available - you’ve put something far more useful there instead. Your applications.
It may seem scary at first, but I encourage you to try a tiling manager. Give yourself several hours of dedicated time. Will yourself not to switch back to the old familiar interface. Print out the quick reference or the whole manual, just in case you get lost in how to move around :). It’s well worth the effort.
Today I’m going to show you how to interface Python to Apache HBase using Facebook’s Thrift package. HBase is a document-oriented database which is very similar to Google’s BigTable (in fact it’s more or less a clone of BigTable as seen in the BigTable paper). HBase has two primary interfaces: a REST API, which is relatively slow, and a Thrift interface, which is recommended for high-speed communication. For speed and other reasons, we’re going to be using the Thrift API.
Note that I am going to be touching on some HBase jargon (such as column families). It’s not essential to understand what those are if you are just trying to build a Python Thrift client. But if you’re trying to use HBase, I would consider that knowledge essential.
First things first, you need to grab a copy of both HBase and Thrift. For this tutorial, I am using the Subversion copy of HBase (as of July 18th) and Thrift version 20080411p1. Thrift is shipped as a source package, so you will need a compiler toolchain, as well as any Python development packages or header files your system may require (such as python-dev on Debian/Ubuntu). You’ll also need the Java JDK package (such as sun-java6-jdk on Ubuntu).
Thrift can be compiled using the standard routine:
./configure
make -j4
sudo make install
After installing thrift, you should have a system-wide ‘thrift’ command available, which should provide some usage information. Thrift uses a descriptor file for the communication layer, available as a .thrift file. I’m not going to describe how to create such a descriptor file here (perhaps in a later blog post), as we’ll be using the one provided by HBase (with one small tweak). You will need the HBase source package for this exercise.
Open up [hbasesrc]/src/java/org/apache/hadoop/hbase/thrift/Hbase.thrift in your favorite text editor. Search for lines containing ruby_namespace, and add the following line in the same region:
namespace py hbase
(Alert readers will wonder why we didn’t use py_namespace. The reason is simple: the xxx_namespace Thrift commands are deprecated, replaced with namespace xxx.)
Next up, we’ll generate our Python HBase thrift interface. Fire up your shell to the same location, and run
thrift --gen py Hbase.thrift
Now we have generated a set of Python classes in the gen-py folder which will allow you to talk to the HBase Thrift server. Let’s set up our Python Thrift client now. Grab the hbase folder inside of the gen-py folder, and move it to a project directory of your choosing.
Next up, we’ll need to work on the Python Thrift client application. I suggest starting with the Thrift server tutorial for a boilerplate template. Below is the file we’re going to use (let’s just assume it is called client.py for this discussion):
#!/usr/bin/env python
import sys

from thrift import Thrift
from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol

from hbase import Hbase
from hbase.ttypes import *
This is general Thrift boilerplate. The application specific portions up to now are the last two lines. Hbase is the name of the service as described in the Hbase.thrift file.
Next up, we’re going to try to connect to our HBase instance. To do that, we will first create a TSocket, then add a TBufferedTransport over the raw socket, and then wrap that in a TBinaryProtocol. If someone has studied too much Java, it was the Thrift developers ;).
# Make socket
transport = TSocket.TSocket('localhost', 9090)
# Buffering is critical. Raw sockets are very slow
transport = TTransport.TBufferedTransport(transport)
# Wrap in a protocol
protocol = TBinaryProtocol.TBinaryProtocol(transport)
Now two application specific lines - we’re going to build a Hbase.Client() object, and then finally open up our transport.
client = Hbase.Client(protocol)
transport.open()
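The script never closes the transport, which is fine for a quick experiment. For longer-lived code, one option is a small context manager. The sketch below only assumes the transport has open() and close() methods, as Thrift's Python transports do; the opened helper and the FakeTransport stand-in are hypothetical names, and the stand-in exists only so the sketch runs without a server.

```python
from contextlib import contextmanager


@contextmanager
def opened(transport):
    """Open `transport` and guarantee it is closed again, even on errors."""
    transport.open()
    try:
        yield transport
    finally:
        transport.close()


class FakeTransport(object):
    """Stand-in that records calls; a real Thrift transport would go here."""
    def __init__(self):
        self.events = []

    def open(self):
        self.events.append("open")

    def close(self):
        self.events.append("close")


fake = FakeTransport()
with opened(fake):
    # With a real transport, client calls would happen inside this block.
    fake.events.append("call")
```

With the real objects from the script, the equivalent would wrap the client calls in a with opened(transport) block instead of calling transport.open() directly.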
We can do a quick validation pass now, and start up Hbase (if you have a running Hbase server somewhere, you can omit this step of course). If you have a source checkout of Hbase, compiling is as simple as running the ant tool. Assuming you have the JDK installed, Hbase should be ready for action in under a minute. Start up a master Hbase instance by running bin/hbase master start &. Then, start up a thrift server for Hbase, by running bin/hbase thrift start.
Running our client script now should lead to no errors. If it does, stop, and try to figure out what is wrong (did you move the gen-py/hbase directory to where your client.py script is or set the python path appropriately?).
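If you would rather leave the generated code where the Thrift compiler put it, a short helper can add it to the Python path before the imports run. This is a sketch under assumptions: the helper name is made up, and it expects a gen-py directory inside whatever project directory you pass in.

```python
import os
import sys


def add_generated_code_to_path(project_dir, gen_dir="gen-py"):
    """Put the Thrift-generated package directory at the front of sys.path."""
    gen_path = os.path.abspath(os.path.join(project_dir, gen_dir))
    if not os.path.isdir(gen_path):
        # Fail early with a clearer message than a later ImportError.
        raise IOError("generated code not found at %s" % gen_path)
    if gen_path not in sys.path:
        sys.path.insert(0, gen_path)
    return gen_path
```

In client.py this would be called with the script's own directory, before the line that imports Hbase from the hbase package.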
Let’s call our first method: getTableNames(). Add this to the end of our script:
print client.getTableNames()
By default, it will simply print a blank list ([]), unless of course you have created tables. This is the simplest example of using Thrift with HBase and Python, where no special data structures are needed or passed around. But if we look at the HBase Thrift API (not up to date - for full details look at the Hbase.thrift file), we can see some methods will require parameters in the form of Thrift structs.
Let’s try to create a table in HBase. When we consult Hbase.thrift, we can see it requires a list of ColumnDescriptors.
/**
* Create a table with the specified column families. The name
* field for each ColumnDescriptor must be set and must end in a
* colon (:). All other fields are optional and will get default
* values if not explicitly specified.
*
* @param tableName name of table to create
* @param columnFamilies list of column family descriptors
*
* @throws IllegalArgument if an input parameter is invalid
* @throws AlreadyExists if the table name already exists
*/
void createTable(1:Text tableName, 2:list<ColumnDescriptor> columnFamilies)
  throws (1:IOError io, 2:IllegalArgument ia, 3:AlreadyExists exist)
Luckily, the thrift compiler has generated a Python class for this ColumnDescriptor (which we acquired by importing hbase.ttypes.*). Sadly, this isn’t the most Pythonic of classes, but it will be quite serviceable for our needs. Let’s build a ColumnDescriptor for a column family called foo. For HBase, we need to specify the column family in the name: format, so don’t forget that colon, or you will be faced with an IllegalArgument exception.
desc = ColumnDescriptor( { 'name' : 'foo:' } )
Note that there are many more fields you can use. Either consult the Hbase.thrift file or the hbase/ttypes.py file for details.
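Since the trailing colon is easy to forget, a tiny helper can normalize family names before they reach the server. This is a sketch, not part of the generated code: the family_name function is hypothetical, and rejecting names with a colon in the middle is an extra assumption of the sketch rather than a documented HBase rule.

```python
def family_name(name):
    """Return `name` in the 'family:' form, adding the trailing colon if missing."""
    if not name or name == ":":
        raise ValueError("column family name must not be empty")
    # Assumption: a colon anywhere but the end means a column qualifier slipped in.
    if ":" in name[:-1]:
        raise ValueError("expected a bare family name, got %r" % name)
    return name if name.endswith(":") else name + ":"
```

The descriptor above could then be built from family_name('foo') instead of the hand-written 'foo:' string.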
Now we’re ready to create our table!
client.createTable('our_table', [desc])
print client.getTableNames()
Running this script should yield a [] followed by ['our_table']. Now we have a table in Hbase! Congratulations!
If you run the script again, you’ll notice that you get an exception since the table name is already in use. This is of course expected, but also highlights Thrift’s ability to propagate exceptions from the remote system.
Exceptions must be predefined in the .thrift interface file. For the case of the createTable method, there are three possible exceptions. Catching them is much like any other exception. Here is our program, changed to catch the AlreadyExists exception:
try:
    desc = ColumnDescriptor( d = { 'name' : 'foo:' } )
    client.createTable('our_table', [desc])
    print client.getTableNames()
except AlreadyExists, tx:
    print "Thrift exception"
    print '%s' % (tx.message)
Note specifically the presence of the message attribute. The Thrift compiler doesn’t generate a nice __str__ or __repr__ method for Python exceptions, so in many cases to determine the exact cause of the error, you need to grab the message attribute.
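One way to avoid repeating that attribute lookup in every except block is a small formatting helper. The sketch below is hypothetical: describe_error is not part of Thrift, and AlreadyExistsStandIn only imitates a generated exception that carries nothing but a message attribute.

```python
def describe_error(exc):
    """Prefer the `message` attribute; fall back to str() and then repr()."""
    message = getattr(exc, "message", None)
    if message:
        return "%s: %s" % (type(exc).__name__, message)
    text = str(exc)
    return "%s: %s" % (type(exc).__name__, text) if text else repr(exc)


class AlreadyExistsStandIn(Exception):
    """Stand-in for a generated exception without a useful __str__."""
    def __init__(self, message=None):
        Exception.__init__(self)
        self.message = message


summary = describe_error(AlreadyExistsStandIn("table name already in use"))
```

The except block above could then print describe_error(tx) rather than reaching into tx.message directly.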
Before this turns into an exhaustive documentation of the HBase Thrift API, I’m going to put a close on this post :). I hope this short example will help you with using Hbase and Python, and combining Hbase and Thrift. In a future post, I will touch upon how to create a Python Thrift server, and define your own Thrift interface file.
So, I’m a copious project and idea mill. Sometimes my ideas actually gain some traction, but often they languish on my hard drive. The useful ones I eventually get around to releasing though, which is what this blog post is all about.
If you’ll head over to StackFoundry.com you can see my recently updated Other section. Highlights here include:
Beyond that, you can find the anyvcs and softwedge projects on the site. I’m looking for someone interested in anyvcs to help maintain a Mercurial port of it (and complete the git port ;)). I will be resuming work on anyvcs when I pick up wedge again, which is going to be after a first pass at tome.
As I said, I’m a copious project starter.
Ever heard of VGA Planets? If not, then you can probably safely skip this post. But if you have, let me introduce you to one of my first released programming projects. I made this in Junior High, using QBasic(!!). It was an addon for the VGA Planets HOST program which let you use the (very clunky) friendly code system to plant hidden bombs on players’ ships.
To be honest, I don’t think anyone ever used it. But, alas, here it is, in all its glory. Note that source is not included, since I don’t have it anymore. I’ve learned my lesson, and decided backups and perpetual archives are a good thing.
The relational database is dead. Long live the document database!
Ok, maybe that statement is far too gloomy and patently untrue. Relational databases have their place and their uses. But as a general purpose web application data storage system, they are not always the best tool for the job. There are many use cases where the RDBMS is not an effective storage engine.
Enter the ‘document database’, otherwise known as the big and fancy hash map. The concept is gaining momentum in the industry, particularly for online applications which are made up of structured data and documents: social sites, search engines, blogging applications, etc. It is an extension of data de-normalization. By de-normalizing data, you can safely distribute and parallelize its storage and representation. You can do this with an RDBMS as well, but at its heart, you are still using an RDBMS.
Document databases are basic key/value storage systems. They are distinctly different from relational databases in that queries can only be performed (efficiently) on a key. Documents should also be de-normalized, minimizing external references so a complete view can be obtained without having to fall back to relational semantics (primarily JOINs).
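To illustrate the difference, here is a small sketch that uses plain Python dicts in place of both kinds of storage. The data, the denormalize function, and the "post:10" key format are all invented for the example.

```python
# Normalized: rendering one post means lookups across three collections (a JOIN).
users = {1: {"name": "alice"}}
posts = {10: {"author_id": 1, "title": "Hello"}}
comments = [{"post_id": 10, "author_id": 1, "text": "First!"}]


def denormalize(post_id):
    """Build one self-contained document for a post, copying in what it references."""
    post = posts[post_id]
    return {
        "title": post["title"],
        "author": users[post["author_id"]]["name"],
        "comments": [
            {"author": users[c["author_id"]]["name"], "text": c["text"]}
            for c in comments
            if c["post_id"] == post_id
        ],
    }


# The "big and fancy hash map": one key, one complete document.
store = {}
store["post:10"] = denormalize(10)
document = store["post:10"]  # a single key lookup, no joins needed
```

The trade-off shows up on writes: if alice changes her name, every document that embeds it has to be rewritten.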
Projects such as Hadoop/HBase, CouchDB, and even Google’s BigTable are great examples of emerging (and successful) document oriented databases. The problem is they all have non-standard access modules (if any), and are “bare bones” in the access model.
What Python needs is a specification and implementation of a document database access API. On top of this could be a layer similar in ideology to SQLAlchemy, in which Python classes could represent documents and any links amongst them.
I am working on an implementation of the lower-level access layers for HBase and CouchDB (leveraging the excellent CouchDB Python module). In addition, there will be a “DB-API” adapter and anydbm adapter for developer prototyping.
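As a rough idea of what a thin access layer over interchangeable backends could look like, here is a sketch. It is not the tome API: the DocumentStore class and its put/get/delete methods are hypothetical, and JSON is just one convenient serialization choice.

```python
import json


class DocumentStore(object):
    """Hypothetical key/document interface over any dict-like backend."""

    def __init__(self, backend):
        # A plain dict here; a dbm-style mapping could be swapped in for prototyping.
        self.backend = backend

    def put(self, key, document):
        """Serialize `document` and store it under `key`."""
        self.backend[key] = json.dumps(document, sort_keys=True)

    def get(self, key, default=None):
        """Return the document stored under `key`, or `default` if absent."""
        try:
            raw = self.backend[key]
        except KeyError:
            return default
        return json.loads(raw)

    def delete(self, key):
        """Remove the document stored under `key`."""
        del self.backend[key]


docs = DocumentStore({})
docs.put("user:alice", {"name": "alice", "projects": ["tome", "yyafl"]})
fetched = docs.get("user:alice")
```

Because the class only relies on item access, swapping the backend does not change the calling code, which is the appeal of a common access API.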
It’s not specification worthy at this point in time, but will hopefully foster growth in this emerging field.
And since every project needs a nifty name, I dub thee tome.
Just a quick little update not related to any ongoing projects (more on the projects later).
I’ve created a profile on Ohloh.net and imported several projects into it. I’m still trying to hunt down my old subversion repositories for Cymbeline and PyAudioPlay, but I’ve done something with them and can’t seem to find them. But on the flip side, Ohloh seems to think IRM is worth $2 million, or about $2.50 Canadian.