Saturday, January 14, 2006

Relation-chaining gimmick in Python

This is just a piece of a whole idea, but I thought you might find it amusing.

There is an RDF notation called N3 that has an interesting syntax for quickly describing a chain of relations linking one node to another.

Here are some examples from the N3 paper:
:joe!fam:mother!loc:office!loc:zip   Joe's mother's office's zipcode
:joe!fam:mother^fam:mother Anyone whose mother is Joe's mother.
:joe ("Colon Joe" -- sounds like a brand of medicinal coffee, doesn't it?) is an RDF node. The other strings with colons in them, fam:mother, loc:office, and loc:zip, represent binary relations. I think the part before the colon tells you the namespace it's from, but ignore that for now.

The ! is like the "." operator in most object-oriented languages. :joe!fam:mother means "Joe's Mother", and you might say this as joe.mother in some language like Python or C.

The ^ is the converse of !; so if joe!mother == sue, then sue^mother == joe. You can think of !mother as meaning "mother" and ^mother as meaning "child". ^employer means employee, ^owner means possession. "hello, world"!length == 12, so 12^length == "hello, world" (among many other strings).

My mind was brought back to this when I was working on a project with a bunch of related objects in Python, and I was looking for a clean way of expressing complicated queries among them. In my day job I do a lot of ad-hoc SQL querying, and while I don't love SQL, it is nice to be able to fluently, readably, ask some complex questions of a bunch of relations. I'd read that list comprehensions were more or less isomorphic to SQL, but I found when playing with them that my queries were inefficient and not very readable; they were no better than just writing explicit code to traverse the object tree and collect the information I wanted.

I'm not sure if these N3-style paths will be useful in practice, yet, but it was worth playing around with them. The biggest hurdle with them, as I see it, is that there's no easy way when presented with 12^length to know what string is wanted, so you have to have a well-defined universe of objects to work with, and be willing to accept multiple results from every query.

So anyway, here's some code to give you an idea what I've come up with so far:

import pprint

class relation:
def __init__(self):
self.forward = dict()
self.backward = dict()

def add(self, arg1, arg2):
if not self.forward.has_key(arg1):
self.forward[arg1] = dict()
if not (arg2 in self.forward[arg1]):
self.forward[arg1][arg2] = True
if not self.backward.has_key(arg2):
self.backward[arg2] = dict()
if not (arg1 in self.backward[arg2]):
self.backward[arg2][arg1] = True

def __call__(self, arg1, arg2):
try:
return self.forward[arg1][arg2]
except Exception, e:
return false

def get_all_keys(self, hash, key):
try:
return hash[key].keys()
except Exception, e:
return []

def __rlshift__(self, other):
result = []
for item in other:
result = result + self.get_all_keys(self.forward, item)
return result

def __rrshift__(self, other):
result = []
for item in other:
result = result + self.get_all_keys(self.backward, item)
return result

father = relation()
phone = relation()

class frank: pass
class jeff: pass
class chris: pass
class andy: pass

father.add(frank, jeff)
father.add(jeff, chris)
father.add(jeff, andy)
phone.add(andy, "101-555-1249")
assert [chris] >> father == [jeff]
assert [jeff] << father == [chris, andy]
assert [chris] >> father == [jeff]
assert [jeff] << father == [chris, andy]
assert [andy] << phone == ["101-555-1249"]
assert [chris] << phone == []
assert [andy, chris] << phone == ["101-555-1249"]
assert (([chris] >> father) << father) << phone == ["101-555-1249"]
assert [chris] >> father << father << phone == ["101-555-1249"]

Some things to note about this implementation:
Next time I'll try to apply this to a more interesting problem to see if it holds up.

Comments: Post a Comment


Links to this post:

Create a Link

This page is powered by Blogger. Isn't yours?