reload_lamb()
This notebook outlines one way to implement (part of) compositional DRT as developed in Reinhard Muskens, "Combining Montague semantics and discourse representation," Linguistics and Philosophy 19, 1996.
First, I define a new type $b$, which will be the type of a DRS box in the typed lambda calculus.
# Add a type for boxes
drt_types = meta.get_type_system()
type_b = types.BasicType("b") # Type of boxes
drt_types.add_atomic(type_b)
meta.set_type_system(drt_types)
drt_types
Next, I define a new binding operator, $\text{Box}$, in the metalanguage.
The metalanguage expression $\text{Box}~u_1, u_2, \ldots, u_n~.~\phi(u_1, u_2, \ldots, u_n)$ is equivalent to the more conventional linearized box expression $[\;u_1, u_2, \ldots, u_n \mid \phi(u_1, u_2, \ldots, u_n)\;]$.
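The correspondence between the two notations can be illustrated with a small plain-Python sketch. It does not use the Lambda Notebook API at all; `SimpleBox`, `binder_form`, and `linear_form` are hypothetical names introduced only for this illustration, and the condition is stored as an uninterpreted string.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SimpleBox:
    """Hypothetical stand-in for a DRS box: a universe plus one condition."""
    universe: Tuple[str, ...]
    condition: str = ""

    def binder_form(self) -> str:
        # Binding-operator style: "Box u1, u2 : phi"
        return "Box %s : %s" % (", ".join(self.universe), self.condition or "True")

    def linear_form(self) -> str:
        # Conventional linearized style: "[ u1, u2 | phi ]"
        return "[ %s | %s ]" % (", ".join(self.universe), self.condition)


box = SimpleBox(("u1", "u2"), "Adores(u1, u2)")
print(box.binder_form())
print(box.linear_form())
```

Both methods read the same two fields, which is the sense in which the notations are equivalent: they differ only in how the universe and the condition are delimited.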
class DRTBox(meta.BindingOp):
    canonical_name = "Box"
    op_name_latex = "\\text{Box}~"
    allow_multivars = True  # A box can introduce more than one
                            # discourse referent.
    allow_novars = True     # A box can also introduce no new
                            # discourse referents.

    # Many of the following methods will be implemented in a
    # future version of meta.BindingOp, so DRTBox will inherit
    # them automatically.
    def __init__(self, var_sequence, body, assignment=None):
        self.derivation = None
        self.type_guessed = False
        self.defer = False
        self.let = False
        self.type = type_b
        new_seq = list()
        if isinstance(var_sequence, meta.Tuple):
            var_sequence = var_sequence.tuple()
        for v in var_sequence:
            # Accept (name, type) pairs as well as typed expressions.
            if isinstance(v, tuple):
                v = meta.TypedExpr.term_factory(v[0], typ=v[1])
            v = self.ensure_typed_expr(v)
            if not isinstance(v.type, types.BasicType):
                raise types.TypeMismatch(v, v.type, "DRTBox requires atomic non-variable type for universe")
            if not meta.is_var_symbol(v.op):
                raise ValueError("Need variable name (got '%s')" % v.op)
            new_seq.append(v)
        self.var_sequence = new_seq
        self.init_body(self.ensure_typed_expr(body, types.type_t, assignment=self.scope_assignment(assignment)))
        self.op = "%s:" % (self.canonical_name)
        self.args[0] = meta.Tuple(self.var_sequence)

    def scope_assignment(self, assignment=None):
        # Extend a copy of the assignment with the box's own referents.
        if assignment is None:
            assignment = dict()
        else:
            assignment = assignment.copy()
        for v in self.var_sequence:
            assignment[v.op] = v
        return assignment

    @property
    def varname(self):
        return None

    @property
    def vartype(self):
        return None

    @property
    def var_instance(self):
        return meta.Tuple(self.var_sequence)

    def latex_str(self, **kwargs):
        var_repr = [v.latex_str() for v in self.var_sequence]
        if self.body == meta.true_term:
            # A trivially true body is rendered as an empty condition list.
            return meta.ensuremath("[~%s~\\mid~]" % (",".join(var_repr)))
        else:
            return meta.ensuremath("[~%s~\\mid~%s~]" % (",".join(var_repr),
                                                        self.body.latex_str()))

    def copy(self):
        return DRTBox(self.var_sequence, self.body)

    def copy_local(self, var_seq, body):
        return DRTBox(var_seq, body)
meta.BindingOp.add_op(DRTBox)
DRTBox([te("x_e"), te("y_e")], te("P_<e,t>(x_e)"))
The next cell demonstrates how to create a box in the Lambda Notebook metalanguage.
The following points are particularly important:
%%lamb
# This is the denotation of example (1), "A man adores a woman. She abhors him.", in Muskens 1996.
box1 = Box x1_e, x2_e : Man(x1) & Woman(x2) & Adores(x1, x2) & Abhors(x2, x1)
# An example of a box with an empty variable list
box2 = Box : Adores(John_e, Mary_e)
# An example of a box with an "empty" body
box3 = Box x_e, y_e, z_e : True
INFO (meta): Coerced guessed type for 'Man_t' into <e,t>, to match argument 'x1_e'
INFO (meta): Coerced guessed type for 'Woman_t' into <e,t>, to match argument 'x2_e'
INFO (meta): Coerced guessed type for 'Adores_t' into <(e,e),t>, to match argument '(x1_e, x2_e)'
INFO (meta): Coerced guessed type for 'Abhors_t' into <(e,e),t>, to match argument '(x2_e, x1_e)'
INFO (meta): Coerced guessed type for 'Adores_t' into <(e,e),t>, to match argument '(John_e, Mary_e)'
Next, I define the semicolon operator, which "chains" two boxes together. Muskens 1996 writes this operator as a semicolon; because it corresponds to sentential conjunction in dynamic semantics, it is written '&' in the metalanguage. Additionally, I define a reduction operation on boxes that merges them as described by Muskens's Merging Lemma.
class BinaryJoinExpr(meta.BinaryOpExpr):
    def __init__(self, arg1, arg2):
        super().__init__(type_b, "&", arg1, arg2, op_name_latex=";")

    def reducible(self):
        return all(isinstance(x, DRTBox) for x in self.args)

    def reduce(self):
        b1 = self.args[0]
        b2 = self.args[1]
        b1_free_vars = b1.body.free_variables()
        # Only merge if none of the variables introduced by the second
        # argument are free in the body of the first
        if all(x.op not in b1_free_vars for x in b2.var_sequence):
            combined_vars = b1.var_sequence + b2.var_sequence
            combined_body = meta.BinaryAndExpr(b1.body, b2.body).simplify_all()
            return meta.derived(DRTBox(combined_vars, combined_body), self, desc="Merging Lemma")
        else:
            return BinaryJoinExpr(b1, b2)
# Add the new operation to the metalanguage
def and_factory(arg1, arg2):
    arg1 = meta.TypedExpr.ensure_typed_expr(arg1)
    arg2 = meta.TypedExpr.ensure_typed_expr(arg2)
    ts = meta.get_type_system()
    # Dispatch on the type of the first argument: truth values get
    # ordinary conjunction, boxes get the join (semicolon) operator.
    if ts.eq_check(arg1.type, types.type_t):
        return meta.BinaryAndExpr(arg1, arg2)
    elif ts.eq_check(arg1.type, type_b):
        return BinaryJoinExpr(arg1, arg2)
    else:
        raise types.TypeMismatch(arg1, arg2, "Unknown types for operator &")
meta.binary_symbols_to_op_exprs['&'] = and_factory
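The side condition on merging can be seen in isolation in the following plain-Python sketch. It is a simplified model, not the Lambda Notebook implementation: `Cond`, `Box`, `body_names`, and `merge` are hypothetical names, conditions are flat predicate applications, and every name mentioned in a condition is treated as occurring free in the body.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Cond:
    """A condition such as Man(x1): a predicate name plus argument names."""
    pred: str
    args: Tuple[str, ...]


@dataclass(frozen=True)
class Box:
    universe: Tuple[str, ...]
    conds: Tuple[Cond, ...]

    def body_names(self) -> FrozenSet[str]:
        # Assumption of this sketch: every argument name counts as free.
        return frozenset(a for c in self.conds for a in c.args)


def merge(b1: Box, b2: Box) -> Optional[Box]:
    """Merge b1 ; b2 into one box, or return None if the side condition fails."""
    # The referents introduced by b2 must not occur in the body of b1.
    if any(u in b1.body_names() for u in b2.universe):
        return None
    return Box(b1.universe + b2.universe, b1.conds + b2.conds)


man_x1 = Cond("Man", ("x1",))
merged = merge(Box(("x1", "x2"), ()), Box((), (man_x1,)))
blocked = merge(Box((), (man_x1,)), Box(("x1",), ()))
print(merged)
print(blocked)
```

The second call is blocked because the right-hand box introduces `x1`, which already occurs in the body of the left-hand box; merging would change which occurrences the referent binds.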
The following cell shows the semicolon operator in action.
%%lamb
box1 = Box x1_e, x2_e : True
box2 = Box : Man(x1_e)
box3 = Box : Woman(x2_e)
box4 = box1 & box2 & box3
INFO (meta): Coerced guessed type for 'Man_t' into <e,t>, to match argument 'x1_e'
INFO (meta): Coerced guessed type for 'Woman_t' into <e,t>, to match argument 'x2_e'
The last box, which contains several boxes linked by the semicolon operator, can be reduced with the Merging Lemma; note that the compositional system will automatically apply this operation by default.
box4.reduce_all()
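Worked by hand, and assuming that '&' associates to the left and that conjunction with a trivially true body simplifies away, the reduction should proceed roughly as follows. This is a hand derivation, not captured notebook output, so the rendered result may differ in detail.

$$
\begin{aligned}
[\;x_1, x_2 \mid\;] \,;\, [\;\mid Man(x_1)\;] \,;\, [\;\mid Woman(x_2)\;]
&= [\;x_1, x_2 \mid Man(x_1)\;] \,;\, [\;\mid Woman(x_2)\;] \\
&= [\;x_1, x_2 \mid Man(x_1) \wedge Woman(x_2)\;]
\end{aligned}
$$

Both steps satisfy the side condition trivially, because in each case the right-hand box introduces no discourse referents.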
We now have all the machinery needed to define some simple lexical entries from Muskens 1996.
%%lamb
||man|| = L u_e : (Box : Man(u))
||runs|| = L u_e : (Box : Runs(u))
||fluffy|| = L p_<e,b> : p(Fluffy_e)
||loves|| = L p_<<e,b>,b> : L u_e : p(L v_e : (Box : Loves(u, v)))
||cat|| = L u_e : (Box : Cat(u))
# The next entry is the indefinite article "a" with the subscript 1;
# later, we will see a more elegant way to handle indexed lexical entries.
||a1|| = L p_<e,b> : L q_<e,b> : (Box u1 : True_t) & p(u1) & q(u1)
# The indefinite article "a" with the subscript 2
||a2|| = L p_<e,b> : L q_<e,b> : (Box u2 : True_t) & p(u2) & q(u2)
INFO (meta): Coerced guessed type for 'Man_t' into <e,t>, to match argument 'u_e'
INFO (meta): Coerced guessed type for 'Runs_t' into <e,t>, to match argument 'u_e'
INFO (meta): Coerced guessed type for 'Loves_t' into <(e,e),t>, to match argument '(u_e, v_e)'
INFO (meta): Coerced guessed type for 'Cat_t' into <e,t>, to match argument 'u_e'
Composition now works as expected:
(fluffy * runs).trace()
r = ((a1 * cat) * (loves * (a2 * man)))
r
r.tree()
r[0].content.derivation # show the reduction / simplification of the last step
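For reference, the derivation of the second example can be worked by hand from the lexical entries above, using beta reduction and the Merging Lemma. This is a sketch under the assumption that trivially true conjuncts are simplified away; the order of conjuncts in the actual notebook output may differ.

$$
\begin{aligned}
\|\text{a2 man}\| &= \lambda q.\; [\;u_2 \mid\;] \,;\, [\;\mid Man(u_2)\;] \,;\, q(u_2) \\
\|\text{loves (a2 man)}\| &= \lambda u.\; [\;u_2 \mid\;] \,;\, [\;\mid Man(u_2)\;] \,;\, [\;\mid Loves(u, u_2)\;] \\
&= \lambda u.\; [\;u_2 \mid Man(u_2) \wedge Loves(u, u_2)\;] \\
\|\text{(a1 cat) (loves (a2 man))}\| &= [\;u_1 \mid\;] \,;\, [\;\mid Cat(u_1)\;] \,;\, [\;u_2 \mid Man(u_2) \wedge Loves(u_1, u_2)\;] \\
&= [\;u_1, u_2 \mid Cat(u_1) \wedge Man(u_2) \wedge Loves(u_1, u_2)\;]
\end{aligned}
$$

The final merge is licensed because $u_2$, the referent introduced by the right-hand box, does not occur in $Cat(u_1)$.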
Finally, the current solution of defining a separate lexical entry for each index that a word like "a" or "himself" can take is cumbersome. The indexed_item function defined in the next cell is one way around this problem. The first argument of indexed_item is a string defining the name of the lexical item, and the second is a lambda calculus expression defining its content. Wherever something should depend on the value of an index, such as in the name of a discourse referent introduced by "a", use the '#' character.
def indexed_item(name, raw_string):
    # Turn every "#" into a format placeholder for the index.
    new_name = name + "{0}"
    ex_string = raw_string.replace("#", "{0}")
    return lambda n: lang.Item(new_name.format(n), te(ex_string.format(n)))
a = indexed_item("a", "L p_<e,b> : L q_<e,b> : (Box u# : True_t) & p(u#) & q(u#)")
himself = indexed_item("himself", "L p_<e,b> : p(u#)")
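The string substitution that indexed_item performs can be checked on its own, without the Lambda Notebook. In the sketch below, `expand_index` is a hypothetical helper that isolates just the templating step; it does not build a lexical item.

```python
def expand_index(template: str, n: int) -> str:
    """Replace every '#' in the template with the index n."""
    # Same two-step approach as indexed_item: '#' -> '{0}' -> str.format.
    return template.replace("#", "{0}").format(n)


a_template = "L p_<e,b> : L q_<e,b> : (Box u# : True_t) & p(u#) & q(u#)"
himself_template = "L p_<e,b> : p(u#)"

print(expand_index(a_template, 3))
print(expand_index(himself_template, 1))
```

One caveat of going through `str.format` is that a template containing literal curly braces would be misread as containing format fields; replacing '#' directly with `str(n)` would avoid that, at the cost of departing from the definition above.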
The following cells show how these indexed items can be used in composition.
((a(1) * man) * (loves * himself(1)))
(a(3) * cat) * (loves * (a(5) * man))