This notebook illustrates how to quickly show the heads (as computed by Cody Kingham and Christiaan Erwich).
%load_ext autoreload
%autoreload 2
import os, collections, random
from tf.fabric import Fabric
from tf.extra.bhsa import Bhsa
DATA = [
'~/github/etcbc/bhsa/tf',
'~/github/etcbc/lingo/heads/tf',
]
TF = Fabric(locations=DATA, modules='c')
api = TF.load('heads')
api.makeAvailableIn(globals())
B = Bhsa(api, name='showHeads', version='c')
This is Text-Fabric 3.4.12
Api reference : https://github.com/Dans-labs/text-fabric/wiki/Api
Tutorial      : https://github.com/Dans-labs/text-fabric/blob/master/docs/tutorial.ipynb
Example data  : https://github.com/Dans-labs/text-fabric-data

116 features found and 0 ignored
  0.00s loading features ...
   |     0.86s B heads from /Users/dirk/github/etcbc/lingo/heads/tf/c
   |     0.00s Feature overview: 109 for nodes; 6 for edges; 1 configs; 7 computed
  7.93s All features loaded/computed - for details use loadLog()
Documentation: BHSA, Feature docs, BHSA API, Text-Fabric API 3.4.12, Search Reference
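We pick all phrases of at least 5 words. In the template, =: makes the first word start where the phrase starts, and each <: requires the next word to follow immediately.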
query = '''
phrase
  =: word
  <: word
  <: word
  <: word
  <: word
'''
phrases = [r[0] for r in B.search(query)]
heads = [(p, *E.heads.f(p)) for p in phrases]
11086 results
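To make the template concrete, here is a minimal sketch in plain Python, without Text-Fabric, of the condition it expresses: the first word starts on the phrase's first slot and four more words follow on adjacent slots, so only phrases of five or more words match. The `toy_phrases` mapping, its node numbers, and the helper `has_at_least` are invented for illustration and are not part of the BHSA data or the Text-Fabric API.

```python
# Hypothetical toy data: phrase node -> the word slots it contains, in order.
toy_phrases = {
    101: [1, 2],
    102: [3, 4, 5, 6, 7],
    103: [8, 9, 10, 11, 12, 13],
    104: [14, 15, 16, 17],
}


def has_at_least(words, n=5):
    """Return True if the phrase starts with n words on adjacent slots."""
    first = words[:n]
    if len(first) < n:
        return False
    # Each word must sit on the slot right after the previous one.
    return all(b == a + 1 for a, b in zip(first, first[1:]))


long_phrases = [p for p, words in toy_phrases.items() if has_at_least(words)]
print(long_phrases)
```

Only the two toy phrases with five and six words pass the check; the shorter ones are dropped, which mirrors what the search template does on the real corpus.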
query = '''
p1:phrase
  -heads> word
p2:phrase
  =: word
  <: word
  <: word
  <: word
  <: word
p1 = p2
'''
heads = [(r[0], r[1]) for r in B.search(query)]
18843 results
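This query returns one result per (phrase, head word) pair, so a phrase with several heads occurs several times in `heads`. The sketch below, on invented pairs, groups such a list by phrase to show why the number of pairs can exceed the number of distinct phrases. The name `toy_pairs` and all node numbers are hypothetical.

```python
from collections import defaultdict

# Hypothetical (phrase, head word) pairs, shaped like the `heads` list above.
toy_pairs = [(201, 11), (201, 14), (202, 20), (203, 31), (203, 33), (203, 35)]

# Collect the head words of each phrase.
heads_by_phrase = defaultdict(list)
for phrase, head in toy_pairs:
    heads_by_phrase[phrase].append(head)

n_pairs = len(toy_pairs)
n_phrases = len(heads_by_phrase)
print(n_pairs, n_phrases)
```

The same grouping applied to the real `heads` list would give the heads per phrase in one dictionary, which may be handy when inspecting phrases with more than one head.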
Check whether we have the same phrases.
phrases2 = set(r[0] for r in heads)
print(len(phrases2))
print(set(phrases) - phrases2)
11085
{842051}
nbPhrase = 842051
print(E.heads.f(nbPhrase))
B.pretty(nbPhrase, withNodes=True)
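There seems to be one headless phrase: phrase 842051 matches the first query but has no heads edge to any word, so the second query misses it.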
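The check above boils down to a set difference between the phrases found by the first query and the phrases that occur in at least one (phrase, head) pair. A minimal sketch with invented node numbers, assuming nothing about the real data:

```python
# Hypothetical node numbers, standing in for the two result sets.
all_long_phrases = [301, 302, 303, 304]
phrase_head_pairs = [(301, 11), (301, 12), (302, 21), (304, 41)]

# Phrases that have at least one head word.
phrases_with_heads = {phrase for phrase, _head in phrase_head_pairs}

# Phrases from the first set that never show up with a head.
headless = set(all_long_phrases) - phrases_with_heads
print(sorted(headless))
```

Using sets keeps the comparison independent of how often a phrase is repeated in the pair list.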
We show the first ten phrases with their heads. You'll see the phrases in question highlighted and within them the head words. Note that the head of a prepositional phrase is the preposition.
B.show(heads, start=1, end=10)