Relation extraction using distant supervision: Experiments

In [1]:
__author__ = "Bill MacCartney"
__version__ = "CS224U, Stanford, Spring 2019"

Overview

OK, it's time to get (halfway) serious. Let's apply real machine learning to train a classifier on the training data, and see how it performs on held-out data. We'll begin with one of the simplest machine learning setups: a bag-of-words feature representation, and a linear model trained using logistic regression.

Just like we did in the unit on supervised sentiment analysis, we'll leverage the sklearn library, and we'll introduce functions for featurizing instances, training models, making predictions, and evaluating results.

Set-up

See the first notebook in this unit for set-up instructions.

In [2]:
from collections import Counter
import os
import rel_ext
In [3]:
rel_ext_data_home = os.path.join('data', 'rel_ext_data')

With the following steps, we build up the dataset we'll use for experiments; it unites a corpus and a knowledge base in the way we described in the previous notebook.

In [4]:
corpus = rel_ext.Corpus(os.path.join(rel_ext_data_home, 'corpus.tsv.gz'))
In [5]:
kb = rel_ext.KB(os.path.join(rel_ext_data_home, 'kb.tsv.gz'))
In [6]:
dataset = rel_ext.Dataset(corpus, kb)

The following code splits up our data in a way that supports experimentation:

In [7]:
splits = dataset.build_splits()

splits
Out[7]:
{'tiny': Corpus with 3,474 examples; KB with 445 triples,
 'train': Corpus with 249,003 examples; KB with 34,229 triples,
 'dev': Corpus with 79,219 examples; KB with 11,210 triples,
 'all': Corpus with 331,696 examples; KB with 45,884 triples}

Building a classifier

Featurizers

Featurizers are functions which define the feature representation for our model. The primary input to a featurizer will be the KBTriple for which we are generating features. But since our features will be derived from corpus examples containing the entities of the KBTriple, we must also pass in a reference to a Corpus. And in order to make it easy to combine different featurizers, we'll also pass in a feature counter to hold the results.

Here's an implementation for a very simple bag-of-words featurizer. It finds all the corpus examples containing the two entities in the KBTriple, breaks the phrase appearing between the two entity mentions into words, and counts the words. Note that it makes no distinction between "forward" and "reverse" examples.

In [8]:
def simple_bag_of_words_featurizer(kbt, corpus, feature_counter):
    for ex in corpus.get_examples_for_entities(kbt.sbj, kbt.obj):
        for word in ex.middle.split(' '):
            feature_counter[word] += 1
    for ex in corpus.get_examples_for_entities(kbt.obj, kbt.sbj):
        for word in ex.middle.split(' '):
            feature_counter[word] += 1
    return feature_counter

Here's how this featurizer works on a single example:

In [9]:
kbt = kb.kb_triples[0]

kbt
Out[9]:
KBTriple(rel='contains', sbj='Brickfields', obj='Kuala_Lumpur_Sentral_railway_station')
In [10]:
corpus.get_examples_for_entities(kbt.sbj, kbt.obj)[0].middle
Out[10]:
'it was just a quick 10-minute walk to'
In [11]:
simple_bag_of_words_featurizer(kb.kb_triples[0], corpus, Counter())
Out[11]:
Counter({'it': 1,
         'was': 1,
         'just': 1,
         'a': 1,
         'quick': 1,
         '10-minute': 1,
         'walk': 1,
         'to': 2,
         'the': 1})

You can experiment with adding new kinds of features just by implementing additional featurizers, following simple_bag_of_words_featurizer as an example.
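For instance, here's a hypothetical second featurizer (not part of rel_ext) that counts bigrams of the middle phrase instead of individual words. It follows the same (kbt, corpus, feature_counter) signature, so it could be passed alongside simple_bag_of_words_featurizer in the featurizers list:

def middle_bigram_featurizer(kbt, corpus, feature_counter):
    # Count bigrams of the words appearing between the two entity mentions,
    # pooling "forward" and "reverse" examples just as the simple featurizer does.
    for sbj, obj in ((kbt.sbj, kbt.obj), (kbt.obj, kbt.sbj)):
        for ex in corpus.get_examples_for_entities(sbj, obj):
            words = ex.middle.split(' ')
            for bigram in zip(words, words[1:]):
                feature_counter[' '.join(bigram)] += 1
    return feature_counter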

Now, in order to apply machine learning algorithms such as those provided by sklearn, we need a way to convert datasets of KBTriples into feature matrices. The following steps achieve that:

In [12]:
kbts_by_rel, labels_by_rel = dataset.build_dataset()

featurized = dataset.featurize(kbts_by_rel, featurizers=[simple_bag_of_words_featurizer])

Experiments

Now we need some functions to train models, make predictions, and evaluate the results. We'll start with train_models(). This function takes as arguments a dictionary of data splits, a list of featurizers, the name of the split on which to train, and a model factory, which is a function that initializes an sklearn classifier. It returns a dictionary holding the featurizers, the vectorizer that was used to generate the training matrix, and a dictionary holding the trained models, one per relation.

In [13]:
train_result = rel_ext.train_models(
    splits, 
    featurizers=[simple_bag_of_words_featurizer])
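
The call above relies on the default training split and classifier. If you want a different classifier or different hyperparameters, you can supply your own model factory. Here's a minimal sketch; the split_name and model_factory keyword arguments are assumptions based on the description of train_models() above, not verified against the library:

from sklearn.linear_model import LogisticRegression

# A sketch, not the canonical configuration: the keyword arguments below are
# assumed from the prose description of train_models().
train_result_custom = rel_ext.train_models(
    splits,
    featurizers=[simple_bag_of_words_featurizer],
    split_name='train',
    model_factory=lambda: LogisticRegression(
        fit_intercept=True, solver='liblinear', C=0.5))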

Next comes predict(). This function takes as arguments a dictionary of data splits, the outputs of train_models(), and the name of the split for which to make predictions. It returns two parallel dictionaries: one holding the predictions (grouped by relation), the other holding the true labels (again, grouped by relation).

In [14]:
predictions, true_labels = rel_ext.predict(
    splits, train_result, split_name='dev')

Now evaluate_predictions(). This function takes as arguments the parallel dictionaries of predictions and true labels produced by predict(). It prints summary statistics for each relation, including precision, recall, and F0.5-score, and it returns the macro-averaged F0.5-score.

In [15]:
rel_ext.evaluate_predictions(predictions, true_labels)
relation              precision     recall    f-score    support       size
------------------    ---------  ---------  ---------  ---------  ---------
adjoins                   0.877      0.386      0.699        407       7057
author                    0.810      0.519      0.728        657       7307
capital                   0.652      0.238      0.484        126       6776
contains                  0.778      0.605      0.736       4487      11137
film_performance          0.782      0.597      0.736        984       7634
founders                  0.822      0.414      0.686        469       7119
genre                     0.517      0.151      0.348        205       6855
has_sibling               0.858      0.251      0.578        625       7275
has_spouse                0.892      0.338      0.672        754       7404
is_a                      0.705      0.217      0.486        618       7268
nationality               0.578      0.192      0.412        386       7036
parents                   0.827      0.538      0.747        390       7040
place_of_birth            0.558      0.206      0.415        282       6932
place_of_death            0.415      0.105      0.261        209       6859
profession                0.659      0.188      0.439        308       6958
worked_at                 0.705      0.261      0.526        303       6953
------------------    ---------  ---------  ---------  ---------  ---------
macro-average             0.715      0.325      0.560      11210     117610
Out[15]:
0.5596964276671111
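
As a quick sanity check on the table above, the F0.5-score is just the F_beta score with beta = 0.5, which weights precision more heavily than recall. The following snippet reproduces the f-score reported for the contains relation:

def f_beta(precision, recall, beta=0.5):
    # F_beta: beta < 1 weights precision more heavily than recall.
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)

print(round(f_beta(0.778, 0.605), 3))    # 0.736, matching the 'contains' row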

Finally, we introduce rel_ext.experiment(), which chains together rel_ext.train_models(), rel_ext.predict(), and rel_ext.evaluate_predictions(). For convenience, this function returns the output of rel_ext.train_models() as its result.

Running rel_ext.experiment() in its default configuration will give us a baseline result for machine-learned models.

In [16]:
_ = rel_ext.experiment(
    splits,
    featurizers=[simple_bag_of_words_featurizer])
relation              precision     recall    f-score    support       size
------------------    ---------  ---------  ---------  ---------  ---------
adjoins                   0.877      0.386      0.699        407       7057
author                    0.810      0.519      0.728        657       7307
capital                   0.652      0.238      0.484        126       6776
contains                  0.778      0.605      0.736       4487      11137
film_performance          0.782      0.597      0.736        984       7634
founders                  0.822      0.414      0.686        469       7119
genre                     0.517      0.151      0.348        205       6855
has_sibling               0.858      0.251      0.578        625       7275
has_spouse                0.892      0.338      0.672        754       7404
is_a                      0.705      0.217      0.486        618       7268
nationality               0.578      0.192      0.412        386       7036
parents                   0.827      0.538      0.747        390       7040
place_of_birth            0.558      0.206      0.415        282       6932
place_of_death            0.415      0.105      0.261        209       6859
profession                0.659      0.188      0.439        308       6958
worked_at                 0.705      0.261      0.526        303       6953
------------------    ---------  ---------  ---------  ---------  ---------
macro-average             0.715      0.325      0.560      11210     117610

Considering how vanilla our model is, these results are surprisingly good! We see huge gains for every relation over our top_k_middles_classifier from the previous notebook. This strong performance is a powerful testament to the effectiveness of even the simplest forms of machine learning.

But there is still much more we can do. To make further gains, we must not treat the model as a black box. We must open it up and get visibility into what it has learned, and more importantly, where it still falls down.

Analysis

Examining the trained models

One important way to gain understanding of our trained model is to inspect the model weights. What features are strong positive indicators for each relation, and what features are strong negative indicators?

In [17]:
rel_ext.examine_model_weights(train_result)
Highest and lowest feature weights for relation adjoins:

     2.556 Taluks
     2.483 Córdoba
     2.481 Valais
     ..... .....
    -1.316 Cook
    -1.438 he
    -1.459 who

Highest and lowest feature weights for relation author:

     2.699 book
     2.566 musical
     2.507 books
     ..... .....
    -2.791 1945
    -2.885 17th
    -2.998 1818

Highest and lowest feature weights for relation capital:

     3.700 capital
     1.718 km
     1.459 posted
     ..... .....
    -1.165 southwestern
    -1.612 Dehradun
    -1.870 state

Highest and lowest feature weights for relation contains:

     2.288 bordered
     2.119 Ontario
     2.021 third-largest
     ..... .....
    -2.347 Midlands
    -2.496 who
    -2.718 Mile

Highest and lowest feature weights for relation film_performance:

     4.404 alongside
     4.049 starring
     3.604 movie
     ..... .....
    -1.578 poem
    -1.718 tragedy
    -1.756 or

Highest and lowest feature weights for relation founders:

     3.993 founded
     3.865 founder
     3.435 co-founder
     ..... .....
    -1.587 band
    -1.673 novel
    -1.764 Bauhaus

Highest and lowest feature weights for relation genre:

     2.792 series
     2.776 movie
     2.635 album
     ..... .....
    -1.326 's
    -1.410 and
    -1.664 at

Highest and lowest feature weights for relation has_sibling:

     5.362 brother
     4.208 sister
     2.790 Marlon
     ..... .....
    -1.350 alongside
    -1.414 Her
    -1.999 formed

Highest and lowest feature weights for relation has_spouse:

     5.038 wife
     4.283 widow
     4.221 married
     ..... .....
    -1.227 which
    -1.265 reported
    -1.298 Sir

Highest and lowest feature weights for relation is_a:

     2.789 
     2.692 order
     2.467 philosopher
     ..... .....
    -1.741 birds
    -3.094 cat
    -4.383 characin

Highest and lowest feature weights for relation nationality:

     2.932 born
     1.859 leaving
     1.839 Set
     ..... .....
    -1.406 or
    -1.608 1961
    -1.710 American

Highest and lowest feature weights for relation parents:

     4.626 daughter
     4.525 father
     4.495 son
     ..... .....
    -1.487 defeated
    -1.524 Sonam
    -1.584 filmmaker

Highest and lowest feature weights for relation place_of_birth:

     3.997 born
     3.004 birthplace
     2.905 mayor
     ..... .....
    -1.319 American
    -1.412 or
    -1.507 and

Highest and lowest feature weights for relation place_of_death:

     2.330 died
     1.821 where
     1.660 living
     ..... .....
    -1.225 as
    -1.232 and
    -1.283 created

Highest and lowest feature weights for relation profession:

     3.338 
     2.538 philosopher
     2.377 American
     ..... .....
    -1.298 Texas
    -1.302 in
    -1.972 on

Highest and lowest feature weights for relation worked_at:

     3.077 CEO
     2.922 professor
     2.818 employee
     ..... .....
    -1.406 bassist
    -1.684 family
    -1.730 or

By and large, the high-weight features for each relation are pretty intuitive — they are words that are used to express the relation in question. (The counter-intuitive results merit a bit of investigation!)

The low-weight features (that is, features with large negative weights) may be a bit harder to understand. In some cases, however, they can be interpreted as features which indicate some other relation which is anti-correlated with the target relation. (As an example, "directed" is a negative indicator for the author relation.)

Optional exercise: Investigate one of the counter-intuitive high-weight features. Find the training examples which caused the feature to be included. Given the training data, does it make sense that this feature is a good predictor for the target relation?
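One way to approach this exercise is a simple lookup over the KB and the Corpus, using only the methods we've already seen. The helper below is a rough sketch of our own (not part of rel_ext); for instance, you might use it to see why 'Ontario' carries a high weight for the contains relation:

def show_examples_with_feature(word, rel, kb, corpus, limit=5):
    # Print corpus middles containing `word` for entity pairs that the KB
    # relates via `rel` -- these are the training examples that gave rise
    # to the feature.
    shown = 0
    for kbt in kb.kb_triples:
        if kbt.rel != rel:
            continue
        for ex in corpus.get_examples_for_entities(kbt.sbj, kbt.obj):
            if word in ex.middle.split(' '):
                print(kbt.sbj, kbt.obj, '|', ex.middle)
                shown += 1
                if shown >= limit:
                    return

show_examples_with_feature('Ontario', 'contains', kb, corpus)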

Discovering new relation instances

Another way to gain insight into our trained models is to use them to discover new relation instances that don't currently appear in the KB. In fact, this is the whole point of building a relation extraction system: to extend an existing KB (or build a new one) using knowledge extracted from natural language text at scale. Can the models we've trained do this effectively?

Because the goal is to discover new relation instances which are true but absent from the KB, we can't evaluate this capability automatically. But we can generate candidate KB triples and manually evaluate them for correctness.

To do this, we'll start from corpus examples containing pairs of entities which do not belong to any relation in the KB (earlier, we described these as "negative examples"). We'll then apply our trained models to each pair of entities, and sort the results by probability assigned by the model, in order to find the most likely new instances for each relation.

In [18]:
rel_ext.find_new_relation_instances(
    dataset,
    featurizers=[simple_bag_of_words_featurizer])
Highest probability examples for relation adjoins:

     1.000 KBTriple(rel='adjoins', sbj='Canada', obj='Vancouver')
     1.000 KBTriple(rel='adjoins', sbj='Vancouver', obj='Canada')
     1.000 KBTriple(rel='adjoins', sbj='Mexico', obj='Atlantic_Ocean')
     1.000 KBTriple(rel='adjoins', sbj='Atlantic_Ocean', obj='Mexico')
     1.000 KBTriple(rel='adjoins', sbj='Pakistan', obj='Lahore')
     1.000 KBTriple(rel='adjoins', sbj='Lahore', obj='Pakistan')
     1.000 KBTriple(rel='adjoins', sbj='Sicily', obj='Italy')
     1.000 KBTriple(rel='adjoins', sbj='Italy', obj='Sicily')
     1.000 KBTriple(rel='adjoins', sbj='Great_Britain', obj='Europe')
     1.000 KBTriple(rel='adjoins', sbj='Europe', obj='Great_Britain')

Highest probability examples for relation author:

     1.000 KBTriple(rel='author', sbj='Charles_Dickens', obj='A_Christmas_Carol')
     1.000 KBTriple(rel='author', sbj='Aldous_Huxley', obj='Brave_New_World')
     1.000 KBTriple(rel='author', sbj='Aldous_Huxley', obj='The_Doors_of_Perception')
     1.000 KBTriple(rel='author', sbj='The_Doors_of_Perception', obj='Aldous_Huxley')
     1.000 KBTriple(rel='author', sbj='Pride_and_Prejudice', obj='Jane_Austen')
     1.000 KBTriple(rel='author', sbj='Brave_New_World', obj='Aldous_Huxley')
     1.000 KBTriple(rel='author', sbj='Jane_Austen', obj='Pride_and_Prejudice')
     1.000 KBTriple(rel='author', sbj='A_Christmas_Carol', obj='Charles_Dickens')
     1.000 KBTriple(rel='author', sbj='Oliver_Twist', obj='Charles_Dickens')
     1.000 KBTriple(rel='author', sbj='Charles_Dickens', obj='Oliver_Twist')

Highest probability examples for relation capital:

     1.000 KBTriple(rel='capital', sbj='Dhaka', obj='Bangladesh')
     1.000 KBTriple(rel='capital', sbj='Bangladesh', obj='Dhaka')
     1.000 KBTriple(rel='capital', sbj='Chengdu', obj='Sichuan')
     1.000 KBTriple(rel='capital', sbj='Sichuan', obj='Chengdu')
     1.000 KBTriple(rel='capital', sbj='Delhi', obj='India')
     1.000 KBTriple(rel='capital', sbj='India', obj='Delhi')
     1.000 KBTriple(rel='capital', sbj='Lucknow', obj='Uttar_Pradesh')
     1.000 KBTriple(rel='capital', sbj='Uttar_Pradesh', obj='Lucknow')
     1.000 KBTriple(rel='capital', sbj='Pakistan', obj='Lahore')
     1.000 KBTriple(rel='capital', sbj='Lahore', obj='Pakistan')

Highest probability examples for relation contains:

     1.000 KBTriple(rel='contains', sbj='Canada', obj='Vancouver')
     1.000 KBTriple(rel='contains', sbj='Sydney', obj='New_South_Wales')
     1.000 KBTriple(rel='contains', sbj='Tenerife', obj='Canary_Islands')
     1.000 KBTriple(rel='contains', sbj='Vancouver', obj='Canada')
     1.000 KBTriple(rel='contains', sbj='Melbourne', obj='Australia')
     1.000 KBTriple(rel='contains', sbj='Dhaka', obj='Bangladesh')
     1.000 KBTriple(rel='contains', sbj='Campania', obj='Naples')
     1.000 KBTriple(rel='contains', sbj='Edmonton', obj='Canada')
     1.000 KBTriple(rel='contains', sbj='Pakistan', obj='Lahore')
     1.000 KBTriple(rel='contains', sbj='Australia', obj='Melbourne')

Highest probability examples for relation film_performance:

     1.000 KBTriple(rel='film_performance', sbj='Mohabbatein', obj='Amitabh_Bachchan')
     1.000 KBTriple(rel='film_performance', sbj='Amitabh_Bachchan', obj='Mohabbatein')
     1.000 KBTriple(rel='film_performance', sbj='Charles_Dickens', obj='A_Christmas_Carol')
     1.000 KBTriple(rel='film_performance', sbj='A_Christmas_Carol', obj='Charles_Dickens')
     1.000 KBTriple(rel='film_performance', sbj='Akshay_Kumar', obj='Sonakshi_Sinha')
     1.000 KBTriple(rel='film_performance', sbj='Sonakshi_Sinha', obj='Akshay_Kumar')
     1.000 KBTriple(rel='film_performance', sbj='Kevin_Kline', obj='De-Lovely')
     1.000 KBTriple(rel='film_performance', sbj='De-Lovely', obj='Kevin_Kline')
     1.000 KBTriple(rel='film_performance', sbj='Hrithik_Roshan', obj='Kaho_Naa..._Pyaar_Hai')
     1.000 KBTriple(rel='film_performance', sbj='Kaho_Naa..._Pyaar_Hai', obj='Hrithik_Roshan')

Highest probability examples for relation founders:

     1.000 KBTriple(rel='founders', sbj='Homer', obj='Iliad')
     1.000 KBTriple(rel='founders', sbj='Iliad', obj='Homer')
     1.000 KBTriple(rel='founders', sbj='William_C._Durant', obj='Louis_Chevrolet')
     1.000 KBTriple(rel='founders', sbj='Louis_Chevrolet', obj='William_C._Durant')
     1.000 KBTriple(rel='founders', sbj='Stan_Lee', obj='Marvel_Comics')
     1.000 KBTriple(rel='founders', sbj='Marvel_Comics', obj='Stan_Lee')
     1.000 KBTriple(rel='founders', sbj='SpaceX', obj='Elon_Musk')
     1.000 KBTriple(rel='founders', sbj='Elon_Musk', obj='SpaceX')
     1.000 KBTriple(rel='founders', sbj='Genghis_Khan', obj='Mongol_Empire')
     1.000 KBTriple(rel='founders', sbj='Mongol_Empire', obj='Genghis_Khan')

Highest probability examples for relation genre:

     0.999 KBTriple(rel='genre', sbj='Mark_Twain_Tonight', obj='Hal_Holbrook')
     0.999 KBTriple(rel='genre', sbj='Hal_Holbrook', obj='Mark_Twain_Tonight')
     0.997 KBTriple(rel='genre', sbj='Oliver_Twist', obj='Charles_Dickens')
     0.997 KBTriple(rel='genre', sbj='Charles_Dickens', obj='Oliver_Twist')
     0.989 KBTriple(rel='genre', sbj='Pink_Floyd', obj='The_Dark_Side_of_the_Moon')
     0.989 KBTriple(rel='genre', sbj='The_Dark_Side_of_the_Moon', obj='Pink_Floyd')
     0.986 KBTriple(rel='genre', sbj='Sam_Raimi', obj='Andrew_Garfield')
     0.986 KBTriple(rel='genre', sbj='Andrew_Garfield', obj='Sam_Raimi')
     0.953 KBTriple(rel='genre', sbj='Ronald_Reagan', obj='Jurassic_Park_III')
     0.953 KBTriple(rel='genre', sbj='Jurassic_Park_III', obj='Ronald_Reagan')

Highest probability examples for relation has_sibling:

     1.000 KBTriple(rel='has_sibling', sbj='Jess_Margera', obj='April_Margera')
     1.000 KBTriple(rel='has_sibling', sbj='April_Margera', obj='Jess_Margera')
     1.000 KBTriple(rel='has_sibling', sbj='Lincoln_Borglum', obj='Gutzon_Borglum')
     1.000 KBTriple(rel='has_sibling', sbj='Gutzon_Borglum', obj='Lincoln_Borglum')
     1.000 KBTriple(rel='has_sibling', sbj='Rufus_Wainwright', obj='Kate_McGarrigle')
     1.000 KBTriple(rel='has_sibling', sbj='Kate_McGarrigle', obj='Rufus_Wainwright')
     1.000 KBTriple(rel='has_sibling', sbj='Nicole_Brown_Simpson', obj='Ronald_Goldman')
     1.000 KBTriple(rel='has_sibling', sbj='Ronald_Goldman', obj='Nicole_Brown_Simpson')
     1.000 KBTriple(rel='has_sibling', sbj='Aretha_Franklin', obj='Dionne_Warwick')
     1.000 KBTriple(rel='has_sibling', sbj='Dionne_Warwick', obj='Aretha_Franklin')

Highest probability examples for relation has_spouse:

     1.000 KBTriple(rel='has_spouse', sbj='Akhenaten', obj='Tutankhamun')
     1.000 KBTriple(rel='has_spouse', sbj='Tutankhamun', obj='Akhenaten')
     1.000 KBTriple(rel='has_spouse', sbj='William_C._Durant', obj='Louis_Chevrolet')
     1.000 KBTriple(rel='has_spouse', sbj='Louis_Chevrolet', obj='William_C._Durant')
     1.000 KBTriple(rel='has_spouse', sbj='Nicole_Brown_Simpson', obj='Ronald_Goldman')
     1.000 KBTriple(rel='has_spouse', sbj='Ronald_Goldman', obj='Nicole_Brown_Simpson')
     1.000 KBTriple(rel='has_spouse', sbj='Douglas_Fairbanks', obj='United_Artists')
     1.000 KBTriple(rel='has_spouse', sbj='United_Artists', obj='Douglas_Fairbanks')
     1.000 KBTriple(rel='has_spouse', sbj='Charles_II_of_England', obj='England')
     1.000 KBTriple(rel='has_spouse', sbj='England', obj='Charles_II_of_England')

Highest probability examples for relation is_a:

     1.000 KBTriple(rel='is_a', sbj='Canada', obj='Vancouver')
     1.000 KBTriple(rel='is_a', sbj='Vancouver', obj='Canada')
     1.000 KBTriple(rel='is_a', sbj='Felidae', obj='Panthera')
     1.000 KBTriple(rel='is_a', sbj='Panthera', obj='Felidae')
     1.000 KBTriple(rel='is_a', sbj='Automobile', obj='South_Korea')
     1.000 KBTriple(rel='is_a', sbj='South_Korea', obj='Automobile')
     1.000 KBTriple(rel='is_a', sbj='Hibiscus', obj='Malvaceae')
     1.000 KBTriple(rel='is_a', sbj='Malvaceae', obj='Hibiscus')
     1.000 KBTriple(rel='is_a', sbj='Bird', obj='Phasianidae')
     1.000 KBTriple(rel='is_a', sbj='Phasianidae', obj='Bird')

Highest probability examples for relation nationality:

     1.000 KBTriple(rel='nationality', sbj='Cambodia', obj='Suryavarman_II')
     1.000 KBTriple(rel='nationality', sbj='Suryavarman_II', obj='Cambodia')
     1.000 KBTriple(rel='nationality', sbj='Titus', obj='Roman_Empire')
     1.000 KBTriple(rel='nationality', sbj='Roman_Empire', obj='Titus')
     1.000 KBTriple(rel='nationality', sbj='Norodom_Sihamoni', obj='Cambodia')
     1.000 KBTriple(rel='nationality', sbj='Cambodia', obj='Norodom_Sihamoni')
     1.000 KBTriple(rel='nationality', sbj='Jess_Margera', obj='April_Margera')
     1.000 KBTriple(rel='nationality', sbj='April_Margera', obj='Jess_Margera')
     1.000 KBTriple(rel='nationality', sbj='Genghis_Khan', obj='Mongol_Empire')
     1.000 KBTriple(rel='nationality', sbj='Mongol_Empire', obj='Genghis_Khan')

Highest probability examples for relation parents:

     1.000 KBTriple(rel='parents', sbj='Lincoln_Borglum', obj='Gutzon_Borglum')
     1.000 KBTriple(rel='parents', sbj='Gutzon_Borglum', obj='Lincoln_Borglum')
     1.000 KBTriple(rel='parents', sbj='Philip_II_of_Macedon', obj='Alexander_the_Great')
     1.000 KBTriple(rel='parents', sbj='Alexander_the_Great', obj='Philip_II_of_Macedon')
     1.000 KBTriple(rel='parents', sbj='Thomas_Boleyn,_1st_Earl_of_Wiltshire', obj='Anne_Boleyn')
     1.000 KBTriple(rel='parents', sbj='Anne_Boleyn', obj='Thomas_Boleyn,_1st_Earl_of_Wiltshire')
     1.000 KBTriple(rel='parents', sbj='Jess_Margera', obj='April_Margera')
     1.000 KBTriple(rel='parents', sbj='April_Margera', obj='Jess_Margera')
     1.000 KBTriple(rel='parents', sbj='Prince_Philip,_Duke_of_Edinburgh', obj='Anne,_Princess_Royal')
     1.000 KBTriple(rel='parents', sbj='Anne,_Princess_Royal', obj='Prince_Philip,_Duke_of_Edinburgh')

Highest probability examples for relation place_of_birth:

     1.000 KBTriple(rel='place_of_birth', sbj='Cambodia', obj='Suryavarman_II')
     1.000 KBTriple(rel='place_of_birth', sbj='Suryavarman_II', obj='Cambodia')
     1.000 KBTriple(rel='place_of_birth', sbj='Nepal', obj='Bagmati_Zone')
     1.000 KBTriple(rel='place_of_birth', sbj='Bagmati_Zone', obj='Nepal')
     0.999 KBTriple(rel='place_of_birth', sbj='San_Antonio', obj='Actor')
     0.999 KBTriple(rel='place_of_birth', sbj='Actor', obj='San_Antonio')
     0.999 KBTriple(rel='place_of_birth', sbj='Titus', obj='Roman_Empire')
     0.999 KBTriple(rel='place_of_birth', sbj='Roman_Empire', obj='Titus')
     0.998 KBTriple(rel='place_of_birth', sbj='Roman_Empire', obj='Septimius_Severus')
     0.998 KBTriple(rel='place_of_birth', sbj='Septimius_Severus', obj='Roman_Empire')

Highest probability examples for relation place_of_death:

     1.000 KBTriple(rel='place_of_death', sbj='Titus', obj='Roman_Empire')
     1.000 KBTriple(rel='place_of_death', sbj='Roman_Empire', obj='Titus')
     1.000 KBTriple(rel='place_of_death', sbj='Philip_II_of_Macedon', obj='Alexander_the_Great')
     1.000 KBTriple(rel='place_of_death', sbj='Alexander_the_Great', obj='Philip_II_of_Macedon')
     1.000 KBTriple(rel='place_of_death', sbj='England', obj='Elizabeth_I_of_England')
     1.000 KBTriple(rel='place_of_death', sbj='Elizabeth_I_of_England', obj='England')
     1.000 KBTriple(rel='place_of_death', sbj='Uruguay', obj='World_Trade_Organization')
     1.000 KBTriple(rel='place_of_death', sbj='World_Trade_Organization', obj='Uruguay')
     0.999 KBTriple(rel='place_of_death', sbj='Roman_Empire', obj='Tiberius_Julius_Alexander')
     0.999 KBTriple(rel='place_of_death', sbj='Tiberius_Julius_Alexander', obj='Roman_Empire')

Highest probability examples for relation profession:

     1.000 KBTriple(rel='profession', sbj='Canada', obj='Vancouver')
     1.000 KBTriple(rel='profession', sbj='Vancouver', obj='Canada')
     1.000 KBTriple(rel='profession', sbj='Little_Women', obj='Louisa_May_Alcott')
     1.000 KBTriple(rel='profession', sbj='Louisa_May_Alcott', obj='Little_Women')
     1.000 KBTriple(rel='profession', sbj='Jess_Margera', obj='April_Margera')
     1.000 KBTriple(rel='profession', sbj='April_Margera', obj='Jess_Margera')
     1.000 KBTriple(rel='profession', sbj='Aldous_Huxley', obj='Eyeless_in_Gaza')
     1.000 KBTriple(rel='profession', sbj='Eyeless_in_Gaza', obj='Aldous_Huxley')
     0.999 KBTriple(rel='profession', sbj='Hrithik_Roshan', obj='Kaho_Naa..._Pyaar_Hai')
     0.999 KBTriple(rel='profession', sbj='Kaho_Naa..._Pyaar_Hai', obj='Hrithik_Roshan')

Highest probability examples for relation worked_at:

     1.000 KBTriple(rel='worked_at', sbj='William_C._Durant', obj='Louis_Chevrolet')
     1.000 KBTriple(rel='worked_at', sbj='Louis_Chevrolet', obj='William_C._Durant')
     1.000 KBTriple(rel='worked_at', sbj='SpaceX', obj='Elon_Musk')
     1.000 KBTriple(rel='worked_at', sbj='Elon_Musk', obj='SpaceX')
     1.000 KBTriple(rel='worked_at', sbj='Leonard_Chess', obj='Chess_Records')
     1.000 KBTriple(rel='worked_at', sbj='Chess_Records', obj='Leonard_Chess')
     1.000 KBTriple(rel='worked_at', sbj='Genghis_Khan', obj='Mongol_Empire')
     1.000 KBTriple(rel='worked_at', sbj='Mongol_Empire', obj='Genghis_Khan')
     1.000 KBTriple(rel='worked_at', sbj='Marvel_Comics', obj='Comic_book')
     1.000 KBTriple(rel='worked_at', sbj='Comic_book', obj='Marvel_Comics')

There are actually some good discoveries here! The predictions for the author relation seem especially good. Of course, there are also plenty of bad results, and a few that are downright comical. We may hope that as we improve our models and optimize performance in our automatic evaluations, the results we observe in this manual evaluation improve as well.

Optional exercise: Note that every time we predict that a given relation holds between entities X and Y, we also predict, with equal confidence, that it holds between Y and X. Why? How could we fix this?
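One possible fix, sketched below: simple_bag_of_words_featurizer pools "forward" and "reverse" examples into the same counts, so the KBTriples (X, Y) and (Y, X) always receive identical feature vectors, and hence identical predictions. A directional variant that prefixes each feature with the direction of the example breaks that symmetry:

def directional_bag_of_words_featurizer(kbt, corpus, feature_counter):
    # Prefix each word with the direction of the example so that (X, Y)
    # and (Y, X) no longer produce the same feature vector.
    for ex in corpus.get_examples_for_entities(kbt.sbj, kbt.obj):
        for word in ex.middle.split(' '):
            feature_counter['forward_' + word] += 1
    for ex in corpus.get_examples_for_entities(kbt.obj, kbt.sbj):
        for word in ex.middle.split(' '):
            feature_counter['reverse_' + word] += 1
    return feature_counter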
