First Analysis of the Experiment Data

In this notebook we analyze the first results from the 50 experiment parameters defined in parameters.py. These parameters have been run and fed to the experiment classes (experiments2.py) in data_validation.ipynb. That process is computationally intensive because it relies on search templates, requiring about 30 minutes of processing. The notebook exports a pickled version of the data, which is imported here.

The bulk of this analysis notebook takes place in the similarity analysis below, where we take the top 10 most common verbs and analyze their top 5 matches based on averaged scores across all applicable experiment parameters.

To see the Hebrew search results rendered properly, view this notebook in Jupyter nbviewer.

In [1]:
import numpy as np
from numpy.polynomial.polynomial import polyfit
import pandas as pd
import collections, os, sys, random, time, pickle, dill, copy, re
from IPython.display import clear_output
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.cluster import KMeans
from scipy.stats import iqr
from tf.fabric import Fabric
from tf.extra.bhsa import Bhsa
from project_code.experiments2 import Experiment
from project_code.semspace import SemSpace

bhsa_data_paths=['~/github/etcbc/bhsa/tf/c',
                 '~/github/verb_semantics/project_code/lingo/heads/tf/c',
                 '~/github/verb_semantics/project_code/sdbh']

TF = Fabric(bhsa_data_paths)
tf_api = TF.load('''
                function lex lex_utf8 vs language
                pdp freq_lex gloss domain ls
                mother rela typ sp st code txt instruction
                heads prep_obj 
                prs prs_gn prs_nu prs_ps
                sem_domain sem_domain_code
              ''', silent=True)

tf_api.makeAvailableIn(globals())
B = Bhsa(api=tf_api, name='', version='c')
This is Text-Fabric 5.4.3
Api reference : https://dans-labs.github.io/text-fabric/Api/General/
Tutorial      : https://github.com/Dans-labs/text-fabric/blob/master/docs/tutorial.ipynb
Example data  : https://github.com/Dans-labs/text-fabric-data

118 features found and 0 ignored
In [2]:
print('Loading cached experiments...')
with open('/Users/cody/Documents/experiments.dill', 'rb') as infile:
    experiments = dill.load(infile)
    print(f'{len(experiments)} experiments loaded.')
Loading cached experiments...
50 experiments loaded.
In [3]:
print('Loading semantic space data: adjusting counts with pmi, preparing plotter functions, etc...')

spaces = dict((exp_name, SemSpace(exp, info=False)) for exp_name, exp in experiments.items())

print(f'{len(spaces)} experiments successfully loaded into semantic spaces.')
Loading semantic space data: adjusting counts with pmi, preparing plotter functions, etc...
50 experiments successfully loaded into semantic spaces.

Experiments and their Dimensions

In [4]:
print('Experiments (co-occurring features x target words):\n')
for shape, exp in sorted((experiments[exp].data.shape, exp) for exp in experiments):
    print(f'{exp}:\t{shape}')
Experiments (co-occurring features x target words):

vi_subj_animacy:	(2, 180)
vi_allarg_pa:	(2, 694)
vi_objc_pa:	(2, 714)
vi_adj+_pa:	(2, 734)
vi_cmpl_pa:	(2, 734)
vi_coad_pa:	(2, 734)
vf_obj_pa:	(3, 694)
vd_domain_simple:	(3, 704)
vf_cmpl_pa:	(4, 725)
vi_objc_animacy:	(5, 173)
vf_adju_pa:	(7, 733)
vf_obj_animacy:	(8, 127)
vf_coad_pa:	(8, 734)
vg_tense:	(8, 734)
vi_cmpl_animacy:	(39, 174)
vf_argAll_pa:	(43, 703)
vi_adj+_animacy:	(46, 108)
vi_coad_animacy:	(51, 241)
vd_domain_embed:	(73, 646)
vf_cmpl_animacy:	(88, 158)
vi_allarg_animacy:	(92, 370)
vf_adju_animacy:	(96, 78)
vf_coad_animacy:	(200, 192)
vi_subj_domain:	(247, 231)
vd_par_lex:	(305, 365)
vf_argAll_animacy:	(378, 207)
vi_objc_domain:	(448, 245)
vf_obj_domain:	(584, 213)
vi_cmpl_domain:	(1033, 223)
vf_cmpl_domain:	(1128, 207)
vi_adj+_domain:	(1219, 217)
vf_adju_domain:	(1575, 180)
vi_coad_domain:	(1735, 386)
vi_subj_lex:	(1959, 290)
vi_objc_lex:	(2853, 305)
vi_allarg_domain:	(2902, 527)
vf_obj_lex:	(3012, 274)
vf_coad_domain:	(3055, 301)
vi_adj+_lex:	(3478, 295)
vf_cmpl_lex:	(4074, 267)
vi_cmpl_lex:	(4079, 281)
vf_adju_lex:	(4184, 263)
vd_con_window:	(4463, 790)
vf_argAll_domain:	(5122, 339)
vd_con_clause:	(5477, 900)
vi_coad_lex:	(6765, 475)
vd_con_chain:	(8308, 1218)
vf_coad_lex:	(9337, 418)
vi_allarg_lex:	(11180, 652)
vf_argAll_lex:	(14761, 482)

Experiment names are coded by their strategy, the verbal argument they denote, and the level of information they record for that argument (a small parsing example follows the lists below):

strategy_verbArgument_dataPoint

There are 4 strategies:

  • vi - Verb inventory (vi) makes inventories of verbal arguments based on their simple presence within the clause. The presence of multiple instances will be counted multiple times.
  • vf - Verb frame (vf) experiments take into account all elements at once within the clause. If multiple instances are present, they are all recorded into a single string or frame.
  • vd - Verb domain experiments (vd) create an inventory of surrounding words within a given discourse or context.
  • vg - The single verb grammar experiment (vg) simply measures how often a term occurs in a specific verbal tense, based on the assumption that similar verbs might also prefer similar tenses.

The arguments which are tested by verb inventory and frame experiments are as follows:

  • subj - subject, which is only counted by verb inventory experiments
  • objc - object
  • cmpl - complement (including lexicalized preposition info, e.g. L+COMPLEMENT)
  • adj+/adju - adjunct, which includes tags for location, time, and predicate adjuncts (hence the + sign) (including lexicalized preposition info).
  • coad - complement and adjuncts together (including lexicalized preposition info)
  • allArg - object, complement, and adjunct arguments together (including lexicalized preposition info)

For verb domain we test the following contexts that enclose the verb:

  • simple_domain - i.e. N, Q, D for narrative, quotation, discourse
  • domain_embed - these are the nested codes of simple_domain, e.g. NQ (a quotation within narrative)
  • con_chain - this gathers terms (excluding stop words) across a clause chain for all of the clause's daughter clauses
  • con_clause - this gathers terms (excluding stop words) across the enclosing clause
  • con_window - this gathers terms within a window of 2 to the left and right of the verb
  • par_lex - this looks at instances of parallelism in poetry to find corresponding verbs between various bicola

Data points which are gathered for all of these categories are:

  • pa - the simple presence/absence of a given verbal argument. Absent values receive a null string (ø)
  • lex - lexemes of the co-occurring terms
  • domain - semantic domain of the co-occurring terms based on the Semantic Dictionary of Biblical Hebrew (De Blois, UBS). N.B. these semantic "domains" are different from the verb discourse domain data listed above.
  • animacy - the animacy of a given term based on a custom mapping to SDBH semantic domains
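
As a quick illustration of this naming scheme, the helper below (hypothetical, not part of the project code) splits a vi or vf experiment name into its three parts; vd and vg names follow their own patterns.

def parse_experiment_name(name):
    '''
    Split a vi/vf experiment name into (strategy, argument, datapoint).
    Hypothetical helper for illustration only.
    '''
    strategy, argument, datapoint = name.split('_')
    return strategy, argument, datapoint

parse_experiment_name('vf_cmpl_animacy')  # ('vf', 'cmpl', 'animacy')
parse_experiment_name('vi_adj+_pa')       # ('vi', 'adj+', 'pa')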

Count Adjustments

For experiments that include a lot of lexical data, we employ the positive pointwise mutual information score or PPMI to adjust the raw counts (following Levshina 2015, 327). But for experiments with small-scale, categorical variables this does not make sense. The goal with categorical data is not to weigh surprising lexical occurrences, but to measure tendencies. For these it is more appropriate to use a ratio normalization.

Animacy experiments contain mixtures of lexical and categorical data. But lexical data is minimal in experiments that do not strongly weigh prepositional phrases, such as the object experiments. Thus, for these experiments we normalize with a ratio rather than PPMI. In contrast to the vf_obj_animacy experiment, which has only 8 unique values, the vf_adju_animacy, vf_cmpl_animacy, and vf_coad_animacy experiments have 96, 88, and 200 unique values, respectively. These are more suitable for PPMI adjustment.
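
For reference, here is a minimal sketch of the PPMI adjustment over a feature × target count matrix (assuming a pandas DataFrame shaped like experiment.data; numpy and pandas are already imported above, and the actual adjustment is performed inside SemSpace and may differ in its details):

def ppmi(counts):
    '''
    Positive pointwise mutual information over a (features x targets) count matrix.
    Minimal sketch only; the project's adjustment lives in SemSpace.
    '''
    total = counts.values.sum()
    p_joint = counts.values / total                  # P(feature, target)
    p_feature = counts.sum(axis=1).values / total    # row marginals, P(feature)
    p_target = counts.sum(axis=0).values / total     # column marginals, P(target)
    expected = np.outer(p_feature, p_target)         # P(feature) * P(target)
    with np.errstate(divide='ignore', invalid='ignore'):
        pmi = np.log2(p_joint / expected)
    return pd.DataFrame(np.where(pmi > 0, pmi, 0.0), # negative / undefined PMI -> 0
                        index=counts.index, columns=counts.columns)

The ratio normalization used for the small categorical experiments is simply counts / counts.sum() per target column.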

In [5]:
countType = {}

for exp_name, exp in experiments.items():
    if exp.data.shape[0] < 9:
#         print(exp_name)
#         print(list(exp.data.index.unique()))
#         print()
        countType[exp_name] = 'sim_rRatio_maxNorm'
    else:
        countType[exp_name] = 'sim_pmi_maxNorm'

Deriving a Verb Specimen Set

Which verbs will comprise this study? This depends on which verbs are accounted for in the dataset. The presence/absence (pa) experiments count the most basic features, namely, the simple presence or absence of a given verbal argument: objects, complements, adjuncts. I will first try to derive a specimen set by taking the intersection of all the pa experiments.

In [6]:
specimens = set(experiments['vi_objc_pa'].data.columns)
freq_count = collections.Counter()
print(f'Starting with specimens count of {len(specimens)}')

for exp_name, experiment in experiments.items():
    
    if not re.match('.*_pa', exp_name): # only use presence/absence (pa) experiments
        continue
        
    exp_lexemes = set(experiment.data.columns)
    specimens = exp_lexemes & specimens
    
    # make frequency count of lexemes
    for lex in experiment.data.columns:
        if lex in specimens:
            freq_count[lex] += experiment.data[lex].sum()
    
print(f'Complete with specimen count of {len(specimens)}')
Starting with specimens count of 714
Complete with specimen count of 292

Count Lexemes without Stems

All verbs have their stem appended. How many plain lexemes are in the specimen set?

In [7]:
plain_lexs = set(lex.split('.')[0] for lex in specimens)
lex_nodes = [(next(l for l in F.otype.s('lex') if F.lex.v(l) == lex),) for lex in plain_lexs]

print(f'Number of plain lexemes accounted for: {len(plain_lexs)}\n')
B.show(lex_nodes)
Number of plain lexemes accounted for: 238

We have a good and diverse dataset here. Let's see what the makeup of stems is.

Stem Counts

In [8]:
stem_counts = collections.Counter(lex.split('.')[1] for lex in specimens)

stem_counts.most_common()
Out[8]:
[('qal', 165),
 ('hif', 57),
 ('piel', 40),
 ('nif', 24),
 ('hit', 4),
 ('hof', 1),
 ('hsht', 1)]

Coverage of Specimens throughout the Experiments

There are 292 verb specimens present with 238 plain lexical forms. Note that many of the experiments have different numbers of specimens present, due to the various requirements per experiment. We intend to use as much data as is present to inform the verb clustering. But are there some verbs whose coverage throughout all 50 experiments is especially lacking?

Below we count, for each lexeme, the number of different experiments in which it occurs. We then count how many lexemes share each coverage number.

In [9]:
spread_count = collections.Counter()
accounted_exps = set()

for lex in specimens:
    count = 0
    for exp_name, exp in experiments.items():
        if lex in exp.data.columns:
            count += 1
            accounted_exps.add(exp_name)
    spread_count[count] += 1
    
spread_count.most_common()
Out[9]:
[(50, 26),
 (17, 14),
 (48, 14),
 (25, 13),
 (44, 13),
 (41, 12),
 (21, 11),
 (19, 11),
 (40, 11),
 (42, 10),
 (49, 10),
 (32, 10),
 (18, 10),
 (29, 9),
 (39, 8),
 (23, 8),
 (34, 8),
 (37, 8),
 (22, 7),
 (33, 7),
 (46, 7),
 (36, 7),
 (30, 6),
 (24, 6),
 (31, 6),
 (47, 5),
 (27, 5),
 (43, 5),
 (26, 5),
 (38, 4),
 (20, 4),
 (45, 4),
 (35, 3),
 (28, 2),
 (16, 2),
 (15, 1)]

What is the lowest number of experiments in which a lexeme is accounted for, and how many such lexemes are there?

In [10]:
min(spread_count.most_common())
Out[10]:
(15, 1)
In [11]:
sum([amount for count, amount in spread_count.most_common() if count<20])
Out[11]:
38
In [12]:
38/297 # ratio of total specimens within 20 or less experiments
Out[12]:
0.12794612794612795

Only 1 lexeme is accounted for in as few as 15 of the experiments. 38 of the 292 verb specimens (about 13%) have fewer than 20 applicable experiments. These are acceptable proportions. Verbs that do not have an attested object, for instance, will not appear in 3 of the 4 object experiments, but they will occur in the presence/absence object experiment. So it can be expected that certain verbs will be less accounted for in various experiments.

Examining Distributions of Presence/Absence Inventory Experiments

Presence/absence experiments test whether and how often a given verb occurs with a given argument. Each experiment tests a single argument or combined set of arguments. These are: objects, complements, adjuncts, complements + adjuncts, and all arguments. For each argument type, there is a simple binary variable: present or not. Below we normalize across verbs to observe the distribution of verbs with the given argument.

In [13]:
vi_datas = pd.DataFrame((experiments['vi_objc_pa'].data / experiments['vi_objc_pa'].data.sum()).loc['Objc']) # normalized objc data
vi_datas.columns = ('ratio of attestation',)
vi_datas['experiment'] = 'vi_objc_pa'

for exp_name, experiment in experiments.items():
    
    if exp_name == 'vi_objc_pa' or not re.match('vi_.*_pa', exp_name): # skip any non-pa experiments
        continue
        
    exp = experiment.data / experiment.data.sum() # normalize

    arg = next(i for i in exp.index if i != 'ø')
    
    datas = pd.DataFrame(exp.loc[arg])
    datas.columns = ('ratio of attestation',)
    datas['experiment'] = exp_name
    
    vi_datas = pd.concat((vi_datas, datas))

plt.figure(figsize=(15,8))
plt.title('Distributions of Features Per Presence/Absence Experiment', fontsize=20)
sns.stripplot(x='experiment', y='ratio of attestation', data=vi_datas, jitter=0.3)
plt.xlabel('experiment', fontsize=18)
plt.ylabel('% of argument attestation', fontsize=18)
plt.axhline(0.5, color='black', linestyle='dotted')
Out[13]:
<matplotlib.lines.Line2D at 0x2f2361518>

Points on the graph are individual verbs, for each of which we look at the percentage of its overall uses with the argument in question (e.g. the blue points show how often verbs occur with an object).

Initial Observations

The combined, complement + adjunct experiment seems to have the most even distribution of features. The object presence/absence experiment appears to show the most separation between verbs. The adjunct tends to have <50% attestation, while the opposite is true for the allarg experiment.

It is possible to venture a few generalizations based on this visual:

  • Verbs appear to exhibit a more even polarity between the presence/absence of an object argument.
  • The number of verbs that occur with or without a complement appears to be fairly balanced, with a slight preference for <50% of those verbs' occurrences having a complement.
  • More verbs occur with an adjunct in ≤50% of their occurrences.
  • The number of verbs with a higher proportion of complement or adjunct arguments seems evenly distributed, with a slight preference for >50%.
  • The number of verbs with a proportion >50% of any of the arguments appears to be high.

Number of Verbs with 100% or 0% Objc

In [14]:
ratioedO = experiments['vi_objc_pa'].data / experiments['vi_objc_pa'].data.sum()
allO = ratioedO.loc['Objc'][ratioedO.loc['Objc'] == 1.0]
noO = ratioedO.loc['Objc'][ratioedO.loc['ø'] == 1.0]

print(f'{allO.shape[0]} verbs that have an object 100% of the time...')
print(f'{noO.shape[0]} verbs that never occur with an object...')
39 verbs that have an object 100% of the time...
148 verbs that never occur with an object...
Verbs with 100% Objects
In [15]:
#print(' | '.join(allO.index))
Verbs with 0% Objects
In [16]:
#print(' | '.join(noO.index))
In [17]:
# B.show(B.search('''

# clause
#     phrase function=Pred
#         word lex=BW>[ vs=qal

#     phrase function=Objc
# '''))

BW> with objects is a good example of the shortcomings of the ETCBC "object" label, which does not sharply distinguish between "objects" and "complements." Note however the group of verbs just beneath the 20% marker in the strip chart. Are many of these motion verbs like BW>?

In [18]:
smallO = ratioedO.loc['Objc'][ratioedO.loc['Objc'] <= 0.20]
print(f'{smallO.shape[0]} verbs that have an object ≤20% of the time...')
289 verbs that have an object ≤20% of the time...
In [19]:
#print(' | '.join(smallO.index))

Preparing for Similarity Analysis

For each specimen, we calculate its similarity with all other specimens pairwise, based on all experiments they have in common. The similarity scores are then averaged across the shared experiments. Raw counts in the larger lexical experiments have been adjusted with the positive pointwise mutual information (PPMI) score; counts for the small categorical experiments (those with fewer than 9 unique values) have instead been normalized with a ratio, as described under Count Adjustments above.

Averaged Similarity Scores

Average similarities across all experiments to derive a list of test cases.

In [20]:
sim_matrix = pd.DataFrame(np.zeros(shape=(len(specimens), len(specimens))), columns=specimens, index=specimens) # for pairwise similarities
common_matrix = pd.DataFrame(np.zeros(shape=(len(specimens), len(specimens))), columns=specimens, index=specimens) # for counting number of common experiments

for lex in specimens:
    for space_name, space in spaces.items():
        
        sim_measure = getattr(space, countType[space_name])
        
        if lex not in sim_measure.columns:
            continue
        
        sim_matrix[lex] = sim_matrix[lex].add(sim_measure[lex], fill_value=0)
        common_matrix[lex] += 1
        
sim = sim_matrix / common_matrix
In [21]:
#list(sim.columns)
In [22]:
sim['PYH[.qal']['PTX[.qal'] # random test
Out[22]:
0.5903673728972471
In [23]:
sim['HJH[.qal'].sort_values(ascending=False).head(10)
Out[23]:
NTN[.qal    0.612082
<FH[.qal    0.603892
BW>[.qal    0.547493
JY>[.qal    0.478234
QR>[.qal    0.471106
JCB[.qal    0.468395
HLK[.qal    0.467038
NPL[.qal    0.456240
<LH[.qal    0.455061
>MR[.qal    0.447714
Name: HJH[.qal, dtype: float64
In [24]:
sim['NTN[.qal'].sort_values(ascending=False).head(10)
Out[24]:
FJM[.qal    0.743461
<FH[.qal    0.695756
BW>[.hif    0.649756
LQX[.qal    0.615713
HJH[.qal    0.612082
NF>[.qal    0.526348
QR>[.qal    0.519454
>MR[.qal    0.509996
CWB[.hif    0.502046
NGD[.hif    0.496463
Name: NTN[.qal, dtype: float64

Looking at Surprising Word Similarities

One of the major goals of this project is to determine how verb meanings and classes are naturally distinguished by their context. היה is an interesting case of a verb which likely has no exact synonyms, but which exhibits similarity across all averaged experiments with words that are seemingly dissimilar, such as נתן, עשה, בוא, etc. (see above). The purpose of gathering various kinds of contexts via the experiment parameters is to discover which contexts in particular are similar or different across various kinds of verbs.

Deriving Plot Arrangements for Similarity Analysis

The arrangement of experiment parameters on the plot is important, since I hope to use it to identify patterns. So here I develop an ordering and color scheme to be used with all of the plots.

Concept: Develop an ordering scheme based on specificity of parameters

I can attempt to arrange elements from more to less specific parameters. An example of a "specific" parameter would be one with lexical arguments, while a "general" parameter might only count the presence or absence of a given argument. Below is a concept of priorities:

  1. verb grammar (vg)
    1. verb tense
  2. verb domain (vd)
    1. simple domain
    2. embedded domain
    3. context clause chain
    4. context clause
    5. parallelism
    6. context window
  3. verb inventories (allargs, adjunct, complement, object)
    1. presence
    2. animacy
    3. domain
    4. lexemes
  4. verb frames
    1. same order as verb inventories
In [25]:
# make bar orders and map to colors

verb_grammar = ['vg_tense']

verb_domain = '''
vd_domain_simple
vd_domain_embed
vd_con_chain
vd_con_clause
vd_con_window
vd_par_lex
'''.strip().split('\n')

vi_pa = '''
vi_allarg_pa
vi_coad_pa
vi_adj+_pa
vi_cmpl_pa
vi_objc_pa
'''.strip().split('\n')

vi_animacy = '''
vi_subj_animacy
vi_allarg_animacy
vi_coad_animacy
vi_adj+_animacy
vi_cmpl_animacy
vi_objc_animacy
'''.strip().split('\n')

vi_domain = '''
vi_subj_domain
vi_allarg_domain
vi_coad_domain
vi_adj+_domain
vi_cmpl_domain
vi_objc_domain
'''.strip().split('\n')

vi_lex = '''
vi_allarg_lex
vi_coad_lex
vi_subj_lex
vi_adj+_lex
vi_cmpl_lex
vi_objc_lex
'''.strip().split('\n')

vf_pa = '''
vf_argAll_pa
vf_coad_pa
vf_adju_pa
vf_cmpl_pa
vf_obj_pa
'''.strip().split('\n')

vf_animacy = '''
vf_argAll_animacy
vf_coad_animacy
vf_adju_animacy
vf_cmpl_animacy
vf_obj_animacy
'''.strip().split('\n')

vf_domain = '''
vf_argAll_domain
vf_coad_domain
vf_adju_domain
vf_cmpl_domain
vf_obj_domain
'''.strip().split('\n')

vf_lex = '''
vf_argAll_lex
vf_coad_lex
vf_adju_lex
vf_cmpl_lex
vf_obj_lex
'''.strip().split('\n')

blues = sns.color_palette(palette='Blues')
reds = sns.color_palette(palette='Reds')

# map experiment groups to colors
color2experiment = (('gold', verb_grammar),
                    ('purple', verb_domain),
                    (blues[1], vi_pa),
                    (blues[2], vi_animacy),
                    (blues[4], vi_domain),
                    (blues[5], vi_lex),
                    (reds[1], vf_pa),
                    (reds[2], vf_animacy),
                    (reds[4], vf_domain),
                    (reds[5], vf_lex))

# make ordered tuple of colors
expcolors = tuple((exp, color) for color, exp_list in color2experiment # make tuple of colors
                        for exp in exp_list) 

Below I prepare visualizer functions for examining similarities and differences in similarity scores.

In [26]:
def get_sim_experiments(lex1, lex2, colors=expcolors, show=True, returnData=True, wordSet=None):
    '''
    Exports a barchart that illustrates the level
    of similarity between two lexemes per all experiments
    in the dataset.
    '''
    
    sims = dict() # temporarily hold similarity scores here
    
    # gather rated similarities between the two terms for every provided experiment
    for sp_name, space in spaces.items():
        
        sim_measure = getattr(space, countType[sp_name])
        wordSet = wordSet or {lex1, lex2}

        if wordSet & set(sim_measure.columns) != wordSet: # ensure both terms are in the space
            continue
            
        sims[sp_name] = sim_measure[lex1][lex2] # get similarity
        
    sims_ordered = collections.OrderedDict((exp[0], sims[exp[0]]) for exp in colors if exp[0] in sims) # customized order
    sims = pd.DataFrame(list(sims_ordered.items())) # drop into Dataframe
    sims.columns = ['experiment', 'score'] # set col names
    
    if show:
        # plot:
        plot_colors = tuple(expcol[1] for expcol in colors if expcol[0] in sims['experiment'].values)        
        plt.figure(figsize=(18, 8)) 
        sns.barplot(sims['experiment'], sims['score'], palette=plot_colors, orient='v')
        plt.xticks(rotation='vertical', fontsize=12)
        plt.xlabel('experiment', fontsize=18)
        plt.ylabel('adjusted similarity score', fontsize=18)
        plt.title(f'Scored Similarities between {lex1} and {lex2}', fontsize=20)
        
    if returnData:
        return sims
    
def compare_simPatterns(dataset1, dataset2, pairs='', colors=expcolors):
    '''
    Compare barchart similarity patterns across two barcharts.
    The first dataset is the primary dataset and will be colored
    normally. The second dataset will be colored grey for comparison.
    '''
    
    data1Sims = list(reversed(dataset1['score']))
    data2Sims = list(reversed(dataset2['score']))
    xLocations = np.arange(dataset1.shape[0])
    barwidth = 0.8       # the width of the bars: can also be len(x) sequence
    plt.figure(figsize=(10, 15)) 
    p1 = plt.barh(xLocations, data1Sims, barwidth, color=sns.color_palette(palette='Reds')[3])
    p2 = plt.barh(xLocations, data2Sims, barwidth, color=sns.color_palette(palette='Blues')[4])
    plt.yticks(xLocations, list(reversed(dataset1['experiment'])), fontsize=12)
    
    plt.ylabel('experiment', fontsize=18)
    plt.xlabel('adjusted similarity scores', fontsize=18)
    plt.title(f'Similarity Differences between {pairs[0]} and {pairs[1]}', fontsize=20)
    
    plt.legend((p1[0], p2[0]), (f'{pairs[0]}', f'{pairs[1]}'), fontsize=12)
    
def compareSets(set1, set2):
    '''
    Compare two pairs of words and their similarity differences.
    '''
    dataset1 = get_sim_experiments(set1[0], set1[1], show=False, wordSet={set1[0], set1[1], set2[0], set2[1]})
    dataset2 = get_sim_experiments(set2[0], set2[1], show=False, wordSet={set1[0], set1[1], set2[0], set2[1]})
    
    compare_simPatterns(dataset1, dataset2, pairs=(f'{set1[0]} & {set1[1]}', f'{set2[0]} & {set2[1]}'))
    
def compareChange(set1, set2, colors=expcolors, returnData=False, showPlot=True):
    '''
    Compare two pairs of words and plot the positive or negative
    difference in similarities from set1 to set2.
    '''
    dataset1 = get_sim_experiments(set1[0], set1[1], show=False, wordSet={set1[0], set1[1], set2[0], set2[1]}, colors=colors)
    dataset2 = get_sim_experiments(set2[0], set2[1], show=False, wordSet={set1[0], set1[1], set2[0], set2[1]}, colors=colors)
    
    change = dataset1.copy() 
    change['score'] = change['score'] - dataset2['score']
    #change['score'] = change['score'] / dataset1['score'] # don't divide since all experiments are now fairly normal
    
    datachange = list(reversed(change['score']))
    blue = sns.color_palette(palette='Blues')[3]
    red = sns.color_palette(palette='Reds')[4]
    colors = [blue if score > 0 else red for score in datachange]
    
    pairs = f'{set1[0]} & {set1[1]}', f'{set2[0]} & {set2[1]}'
    
    if showPlot:
        xLocations = np.arange(change.shape[0])
        barwidth = 0.8       # the width of the bars: can also be len(x) sequence
        plt.figure(figsize=(10, 15)) 
        plt.barh(xLocations, datachange, barwidth, color=colors)
        plt.yticks(xLocations, list(reversed(dataset1['experiment'])), fontsize=12)
        plt.ylabel('experiment', fontsize=18)
        plt.xlabel('adjusted similarity score differences', fontsize=18)
        plt.title(f'Similarity Differences between {pairs[0]} and {pairs[1]}', fontsize=20)
    
    if returnData:
        return (dict(zip(list(reversed(dataset1['experiment'])), datachange)))
    
def plotTop(scores, lex):
    '''
    Plot the top most similar terms
    using their rank and similarity score.
    '''
    plt.xticks(range(0, scores.shape[0]), scores.index)
    plt.title(f'Plotted Similarity Scores to {lex}')
    plt.ylabel('similarity score')
    plt.xlabel('Most Similar Terms in Ranked Order')
    plt.scatter(range(0, scores.shape[0]), scores)
    plt.plot(range(0, scores.shape[0]), scores)
    
def plotPa(term1, term2, experiment):
    '''
    Plot side-by-side bar charts for presence
    absence experiments to show the raw differences
    in tendencies.
    '''
    
    width = 0.4
    fig, ax = plt.subplots(figsize=(6, 5))    
    term1Dat = spaces[experiment].raw_norm[term1]
    term2Dat = spaces[experiment].raw_norm[term2]
    
    print(term1)
    print(term1Dat, '\n')
    print(term2)
    print(term2Dat, '\n')
    
    index = np.arange(term1Dat.shape[0], step=1)
    ax.bar(index, term1Dat, width, label=term1)
    ax.bar(index+width, term2Dat, width, label=term2)
    ax.legend()
    ax.set_xticks(index+width/2)
    ax.set_xticklabels(term1Dat.index, fontsize=14)
    ax.set_ylabel('attested proportion', fontsize=14)
    ax.set_title(f'scores for {experiment}', fontsize=15)
    
def topCommon(term1, term2, experiment, count_type='pmi'):
    '''
    Find the top common values between two terms within
    a given experiment.
    '''  
    
    count_measure = getattr(spaces[experiment], count_type)
    
    t1Top = count_measure[term1].sort_values(ascending=False)
    t2Top = count_measure[term2].sort_values(ascending=False)
    
    common = t1Top[t2Top > 0]
    common = common[common > 0]
    averaged_common = (common + t2Top[common.index]) / 2
    
    averaged_common = pd.DataFrame(averaged_common, columns=['averaged'])
    averaged_common[term1] = t1Top[common.index]
    averaged_common[term2] = t2Top[common.index]
    
    return averaged_common.sort_values(by='averaged', ascending=False)

def topUncommon(term1, term2, experiment, count_type='pmi', focus=''):
    '''
    Find the top uncommon values between two terms within
    a given experiment.
    '''  
    
    count_measure = getattr(spaces[experiment], count_type)
    
    t1Top = count_measure[term1]
    t2Top = count_measure[term2]
    
    uncommon = abs(t1Top - t2Top) # absolute diff
    uncommon = pd.DataFrame(uncommon, columns=['difference'])
    uncommon[term1] = t1Top[uncommon.index]
    uncommon[term2] = t2Top[uncommon.index]
        
    if not focus:
        return uncommon.sort_values(by='difference', ascending=False)
    else:
        return uncommon[uncommon[focus] > 0].sort_values(by='difference', ascending=False)
    
def sampleBases(targets, basis, exp_name):
    '''
    Returns examples of given bases
    from the Hebrew Bible.
    '''
    results =  list(sample for target in targets
                        for sample in experiments[exp_name].target2basis2result[target][basis])
    random.shuffle(results)
    return results

Similarity Analyses

We will analyze the top ten words ranked by frequency of experiment data counts. For each word we look at the top 5 most similarly ranked words and analyze two categories: intuitive matches and surprising matches. Intuitive matches are those for which we can recognize a valid (as humanly judged) semantic similarity. We visualize the specific similarity scores per experiment with barcharts. The charts allow us to map specific areas of similarity and dissimilarity based on the test parameters of the experiments. In doing so, we explore how similarity is multifaceted and whether there are any perceptible differences between intuitively and surprisingly matched terms. A further barchart shows the score differences between an intuitively matched pair and a surprisingly matched pair, which helps to isolate which experiment parameters consistently distinguish intuitive from surprising matches.
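
In outline, each case study below repeats the same steps. The small wrapper here is hypothetical, built only from the sim matrix and the plotting functions defined above, and is shown just to summarize the workflow:

def analyze_verb(target, n=5):
    '''
    Rank the top-n most similar verbs to `target` by averaged score,
    plot the ranked scores, then chart per-experiment similarities for each pair.
    Hypothetical convenience wrapper around the existing functions.
    '''
    top = sim[target].sort_values(ascending=False).head(n)
    plotTop(top, target)
    for match in top.index:
        get_sim_experiments(target, match)  # per-experiment barchart for the pair
    return top

# e.g. analyze_verb('>MR[.qal')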

Deriving a Case Study Set

Further above we have generated a frequency count for each lexeme in the specimen set. This allows us to select the top most common lexemes for the case study as shown below.

In [27]:
freq_count.most_common(10)
Out[27]:
[('>MR[.qal', 47974.0),
 ('HJH[.qal', 34658.0),
 ('<FH[.qal', 22837.0),
 ('BW>[.qal', 19062.0),
 ('NTN[.qal', 18194.0),
 ('HLK[.qal', 13360.0),
 ('R>H[.qal', 10784.0),
 ('DBR[.piel', 10002.0),
 ('CM<[.qal', 9999.0),
 ('LQX[.qal', 9136.0)]

We can create a loose list of groups for these lexemes based on perceived intuitive similarities. These groups only serve the purpose of getting a sense of the lexical variety reflected in the most common set, and thus they do not go towards the analysis itself.

In this top ten set, we can see six general groups of verbs:

  1. communication - >MR[.qal, DBR[.piel
  2. state - HJH[.qal
  3. doing/making - <FH[.qal
  4. moving - BW>[.qal, HLK[.qal
  5. manipulating - NTN[.qal, LQX[.qal
  6. perceiving - R>H[.qal, CM<[.qal

There is good variety in this top 10 set. We begin with these top ten, analyzing their top-rated most-similar terms and comparing surprising similarities against intuitive ones.

אמר.qal

We begin the word-by-word evaluation with אמר ("to say"). The top five most similar terms are based on the average of similarity scores across all available experiments for אמר. They are listed below along with the averaged similarity score.

In [28]:
say_top_sim = sim['>MR[.qal'].sort_values(ascending=False).head(5) # get top 5 elements
plotTop(say_top_sim, '>MR[.qal')
print('Similarity Scores with >MR\n', say_top_sim, '\n')
Similarity Scores with >MR
 QR>[.qal     0.643052
DBR[.piel    0.629496
NTN[.qal     0.531245
NGD[.hif     0.529970
<FH[.qal     0.510799
Name: >MR[.qal, dtype: float64 

קרא is ranked most similar at 0.64. דבר is close behind at 0.63. There is a roughly 10-point decrease at the third-ranked word, נתן, which the plot illustrates. נתן and עשה are surprising similarities which will need to be explored.

We can note that קרא, דבר, and נגד are good intuitive matches. The other two, נתן and עשה, are surprising matches. What is the cause of the similarity of נתן and עשה with אמר?

To answer these questions, we now turn to the analysis and comparison of similarity scores across the experiments.

Similarity Patterns for Intuitive Matches to אמר

אמר and קרא

In [29]:
say_call = get_sim_experiments('>MR[.qal', 'QR>[.qal')

Each score has been normalized against the 75th-percentile maximum of similarity values per experiment, so that scores are judged in relation to that experiment's overall performance.
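
The gist of that normalization is sketched below. This is an assumption about the sim_*_maxNorm attributes of SemSpace, shown only to make the idea explicit, not the verified implementation: scale each experiment's scores against an upper-quantile ceiling drawn from that experiment's own similarity values.

def quantile_max_norm(sim_values, pct=75):
    '''
    Scale similarity scores relative to the experiment's own distribution.
    Assumed behaviour of the "maxNorm" step; illustration only.
    '''
    ceiling = np.percentile(sim_values, pct)    # e.g. the 75th-percentile value
    return np.clip(sim_values / ceiling, 0, 1)  # cap anything above the ceiling at 1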

Each color group represents a different experiment strategy while different shades represent different data points, with increasing darkness for increasing specificity of the experiments' datapoint. Thus, higher bars within darker shaded groups will presumably show more specific similarities.

Starting with the vg and vd experiments, we see a high agreement in verb tense as well as simple domain. For the other vd experiments there is decreasing similarity as the specificity of the datapoint increases to the context of a window. There does not appear to be any parallelism shared between אמר and קרא (vd_par_lex).

In the inventory pa experiments we see high levels of agreement between the adju and cmpl experiments. The allarg experiment is probably higher due to the high agreement in adju and cmpl. The object experiment is lower (<0.4) in comparison to the adju and cmpl experiments.

In the inventory animacy experiments the subj and cmpl scores stand out above the others. There is a lot of agreement, then, on the animacy of the actant and the animacy of the complement element. For communicative verbs this is probably significant: a living subject communicating to a living complement. The coad experiment is also slightly higher, perhaps due to the increase in the cmpl agreement.

In the inventory domain experiments there is highest agreement in the coad and cmpl experiments. Allarg and subject are also high. Since allarg includes the cmpl and adju, it makes sense that it would also be slightly higher. The high level of agreement with subject domain corresponds with a high level of subject animacy agreement. The adjunct domain is the second lowest. Probably most notable of all, though, is the very small level of agreement in the object experiment. This corresponds with lower levels of agreement seen in the pa experiments (N.B. objc is absent from the animacy experiment due to lack of data).

In the inventory lex experiments coad again stands out. Could it be that cmpl and adju agreement are combining to create even stronger levels of agreement in the coad experiment? Subject is the highest of the levels, just as it is in the animacy experiments. Which lexemes might be co-occurring in this slot regularly? The adjunct experiment is again second lowest, as in the domain experiments. But the object experiment again shows the least amount of agreement.

Moving into the frame pa experiments, the coad, adju, and cmpl experiments show high levels of agreement. The pattern between those three experiments is nearly exactly the same as the vi pa experiments. The allarg experiment, on the other hand, is much lower than its pa counterpart. This may be expected since the allarg frame experiment would require similarity across more diverse categories. Like with the vi experiments, objc has a much lower similarity at ~0.60. It's noteworthy that the adju experiment here is relatively equivalent with the cmpl (as in the inventory experiments) since the adju has consistently scored lower across animacy, domain, and lex experiments. This shows that while >MR and QR> have a similar tendency to attract adjuncts, the adjuncts themselves are particularly different. This could be an interesting area of investigation.

In the frame animacy experiments, we see that adju has essentially zero similarity. This needs to be investigated. It should be noted that animacy experiments have much less data to work with, so a slight deficiency in one or another value could result in changes to this value. However, the lower level of animacy agreement in the adju experiment does correspond with its lower attestation in the vi experiments. The coad experiment is slightly higher than both argAll and cmpl, but in general all three are fairly close in agreement.

In the frame domain experiments the cmpl experiment shows the strongest similarity, which results in a higher score for the coad experiment. As in the inventory versions, the adju is second lowest while the object shows the most separation.

The lexical frame experiments show the same pattern and shape as the domain: high agreement in cmpl with a corresponding highest agreement in the coad experiment. argall, as a result, is also higher. The adju experiment is lower while the objc experiment is lowest.

Synthesis

>MR and QR> share very similar contexts at the discourse domain and clause levels. They also have very similar preferences for attracting the presence of adjuncts and complements, but they differ in their simple preference for an object. >MR and QR> differ even more in the kind of objects which they prefer, both by semantic domain and lexeme. The two verbs also tend to select different adjuncts in the animacy, semantic domain, and lexeme experiments. However, >MR and QR> share a high similarity in their complements across animacy, semantic domain, and lexical categories. Finally, both >MR and QR> share similar kinds of subjects. Notably, they share similarity in subject and complement animacy, which may be a key factor in differentiating communication verbs like these.

Investigation

The analysis of the similarity plot shows high agreement in complement categories as well as subject animacy and lexemes. Here we dive deeper into the specific similarities with topCommon, which pulls out the top similar features in common for a given experiment.

First, we see high levels of similarity in the discourse context experiments. We look at the top rated similarities in the window experiment, which records the lexemes to the left and right of the verbs.

In [30]:
topCommon('>MR[.qal', 'QR>[.qal', 'vd_con_window').head(10)
Out[30]:
averaged >MR[.qal QR>[.qal
L>_<MJ/ 8.625709 8.625709 8.625709
J<BY/ 7.803781 7.303781 8.303781
N<MJ=/ 7.717750 8.510232 6.925269
CMCWN/ 7.378714 8.247197 6.510232
GJXZJ/ 7.333228 8.625709 6.040746
NPTLJ/ 7.303781 7.303781 7.303781
BRWK/ 7.303781 7.303781 7.303781
MJKJHW/ 7.248265 8.040746 6.455784
>KJC/ 7.217750 8.510232 5.925269
CM<J=/ 7.110835 7.403316 6.818354

The frame, complement, animacy experiment yields important similarities in the shared complements of אמר and קרא:

In [31]:
topCommon('>MR[.qal', 'QR>[.qal', 'vf_cmpl_animacy')
Out[31]:
averaged >MR[.qal QR>[.qal
>L_animate 4.417922 6.139394 2.696450
L_animate 3.551028 4.445276 2.656780
L_inanimate 2.891350 3.454115 2.328584
<L_inanimate 1.054641 0.811928 1.297354

N.B. the counts shown are PPMI-adjusted. The complement also held high similarity in the frame domain experiment:

In [32]:
topCommon('>MR[.qal', 'QR>[.qal', 'vf_cmpl_domain').head(10) # show top 10
Out[32]:
averaged >MR[.qal QR>[.qal
>L_Professions 6.371559 6.371559 6.371559
L_Gather 5.901006 6.693487 5.108524
L_Control 5.871559 6.371559 5.371559
L_Professions 5.386132 5.886132 4.886132
>L_People 5.168315 6.398031 3.938599
L_Waterbodies 5.108524 5.108524 5.108524
L_Names of Landforms 5.108524 5.108524 5.108524
L_Names of Groups 4.898505 5.190987 4.606024
>L_Names of People 4.866702 6.918107 2.815297
L_Names of People 4.844022 5.447248 4.240797

These commonalities show the actions of both קרא and אמר being directed towards people (professions, gather[ings], people, names of groups). On a few occasions the action is directed towards performing an act of taking control (e.g. L_Control) using the verb שבה (Isa 61:1), אסר (Isa 49:9), or כבשׁ (as in 2 Chr 28:10). In the cases of L_Names of Landforms, the referents are Zion (Isa 52:7) and Hermon (Deut 3:9) as personified cities. In Ezek 6:3, YHWH speaks "to the channels" which is the source of L_Waterbodies.

The analysis also showed key differences in object and adjunct use. Here we will try to tease this out a bit more, and find out which object and adjunct instances in particular are different between the two terms. First we look at the raw counts in the inventory and frame presence/absence experiments.

In [33]:
plotPa('>MR[.qal', 'QR>[.qal', 'vi_objc_pa')
>MR[.qal
Objc    0.921019
ø       0.078981
Name: >MR[.qal, dtype: float64 

QR>[.qal
Objc    0.489666
ø       0.510334
Name: QR>[.qal, dtype: float64 

The data shows that >MR nearly always (92%) occurs alongside an object. Our export of the data counts the beginning of speech as an object. QR>, on the other hand, is evenly divided between its use with and without an object. Let's take a closer look at the cases where QR> occurs without an object.

In [34]:
B.show(experiments['vi_objc_pa'].target2basis2result['QR>[.qal']['ø'][:10])

verse 1

clause WayX
phrase Conj CP
conj and
phrase Pred VP
verb call qal wayq
phrase Cmpl PP
art the
subs human, mankind
clause Way0
phrase Conj CP
conj and
phrase Pred VP
verb say qal wayq
phrase Cmpl PP

verse 2

clause WxQX
phrase Conj CP
conj and
phrase Cmpl PP
phrase Pred VP
verb bear pual perf
phrase Subj NP
clause Way0
phrase Conj CP
conj and
phrase Pred VP
verb call qal wayq
phrase Objc PP
prep <object marker>
phrase Objc PrNP
clause xQt0
phrase Time AdvP
phrase Pred VP
verb defile hof perf
clause Adju InfC

verse 3

clause Way0
phrase Conj CP
conj and
phrase Pred VP
verb advance hif wayq
phrase Cmpl PP
phrase Cmpl AdvP
art the
phrase Cmpl AdvP
clause Way0
phrase Conj CP
conj and
phrase Pred VP
verb extend qal wayq
clause NmCl
clause NmCl
phrase Conj CP
conj and
phrase Subj PrNP
phrase PreC PP
clause Way0
phrase Conj CP
conj and
phrase Pred VP
verb build qal wayq
phrase Loca AdvP
clause Way0
phrase Conj CP
conj and
phrase Pred VP
verb call qal wayq

verse 4

clause WayX
phrase Conj CP
conj and
phrase Pred VP
verb call qal wayq
phrase Subj NP
clause Way0
phrase Conj CP
conj and
phrase Pred VP
verb say qal wayq
clause xQt0
phrase Objc IPrP
phrase Pred VP
verb make qal perf
phrase Cmpl PP
clause xQt0
phrase Ques InrP
phrase Nega NegP
phrase Pred VP
verb report hif perf
phrase Cmpl PP
clause Objc NmCl

verse 5

clause Attr xQt0
phrase Rela CP
conj <relative>
phrase Pred VP
verb make qal perf
phrase Loca AdvP
clause WayX
phrase Conj CP
conj and
phrase Pred VP
verb call qal wayq
phrase Loca AdvP
phrase Subj PrNP

verse 6

clause Way0
phrase Conj CP
conj and
phrase Pred VP
clause Way0
phrase Conj CP
conj and
phrase Pred VP
phrase Cmpl PP
clause Attr xQt0
phrase Rela CP
conj <relative>
phrase Pred VP
verb come qal perf
clause ZIm0
phrase PreO VP
verb go out hif impv
clause WYq0
phrase Conj CP
conj and
phrase Pred VP
verb know qal impf
phrase Objc PP
prep <object marker>

verse 7

clause WayX
phrase Conj CP
conj and
phrase Pred VP
verb rise early hif wayq
phrase Time PP
clause Way0
phrase Conj CP
conj and
phrase Pred VP
verb call qal wayq
clause Way0
phrase Conj CP
conj and
phrase Pred VP
verb speak piel wayq
phrase Objc PP
clause WayX
phrase Conj CP
conj and
phrase Pred VP
phrase Modi AdvP

verse 8

clause WayX
phrase Conj CP
conj and
phrase Pred VP
verb call qal wayq
clause Way0
phrase Conj CP
conj and
phrase Pred VP
verb say qal wayq
phrase Cmpl PP
clause xQt0
phrase Objc IPrP
phrase Pred VP
verb make qal perf
clause WxQ0
phrase Conj CP
conj and
phrase Objc IPrP
phrase Pred VP
verb miss qal perf
phrase Cmpl PP
clause Attr xYq0
phrase Rela CP
conj <relative>
phrase Nega NegP
phrase Pred VP
clause xQt0
phrase Pred VP
verb make qal perf

verse 9

clause WayX
phrase Conj CP
conj and
phrase Pred VP
phrase Objc PP
prep <object marker>
art the
clause Way0
phrase Conj CP
conj and
phrase Pred VP
verb say qal wayq
phrase Cmpl PP
clause NmCl
phrase Subj IPrP
prin what
clause Voct
phrase Voct PrNP
clause xYq0
phrase Nega NegP
phrase Pred VP
clause xQtX
phrase Conj CP
phrase Pred VP
verb hear qal perf
clause Adju NmCl
phrase Rela CP
phrase Subj PPrP
phrase PreC AdvP

verse 10

clause Way0
phrase Conj CP
conj and
phrase Pred VP
verb plant qal wayq
phrase Objc NP
clause Way0
phrase Conj CP
conj and
phrase Pred VP
verb call qal wayq
phrase Loca AdvP

We can see that a common reason קרא often does not take an object is that it is used in conjunction with אמר, which takes the object instead. This happens most clearly in the formulaic statement קרא לאמר. The raw counts for this pattern are below.

In [35]:
print('Occurrences of QR> + L + >MR: \t', experiments['vi_adj+_lex'].data['QR>[.qal']['L_>MR['])
Occurrences of QR> + L + >MR: 	 23.0

We can make a further note here that this pattern gives some corroborating evidence as to the close meaning of קרא and אמר, especially since אמר often occurs with this same pattern:

In [36]:
print('Occurrences of >MR + L + >MR: \t', experiments['vi_adj+_lex'].data['>MR[.qal']['L_>MR['])
Occurrences of >MR + L + >MR: 	 113.0

Besides this formula, we see in the occurrences above that קרא is also frequently followed by אמר in the next sentence (i.e. call out -> say).

The experiments have thus noticed a subtle difference in meaning between אמר and קרא in their tendencies to attract objects. Namely, אמר almost always requires the beginning of a direct speech section (marked herein as the object), whereas קרא takes an object only about half of the time.

Another observation from the analysis above is that >MR and QR> frequently take different kinds of objects. This is seen in the relatively lower scores for object experiments in the remaining categories. Let's take a look at the top scores for the two terms to see if we can discern where major differences lie.

In [37]:
topUncommon('>MR[.qal', 'QR>[.qal', 'vi_objc_domain').head(20)
Out[37]:
difference >MR[.qal QR>[.qal
>M_Bless 7.936638 7.936638 0.000000
L_Strong 7.936638 7.936638 0.000000
Heat 7.936638 0.000000 7.936638
>CR_Open 7.936638 7.936638 0.000000
L_Speak 7.936638 7.936638 0.000000
>CR_Ingest 7.936638 7.936638 0.000000
L_Possess 7.936638 7.936638 0.000000
L_Non-Exist 7.936638 7.936638 0.000000
>CR_Flee 7.936638 7.936638 0.000000
KJ_Witnesses 7.936638 0.000000 7.936638
Soft 7.936638 0.000000 7.936638
Names of Constructions 7.351675 0.000000 7.351675
Names of Landforms 7.003752 0.000000 7.003752
KJ_Sin 6.936638 6.936638 0.000000
Contain 6.936638 0.000000 6.936638
L_Space 6.936638 6.936638 0.000000
L_Move 6.936638 6.936638 0.000000
Search 6.936638 6.936638 0.000000
Name 6.745695 0.000000 6.745695
KJ_Birth 6.351675 6.351675 0.000000

The data for אמר show a stronger tendency for its objects to consist of prepositional phrases, which would be part of object clauses. קרא, on the other hand, mostly takes simple noun phrases. We can also see that "Names" categories occur at least 3 times in קרא's top list. Keep in mind that these scores are PPMI-adjusted, which weighs surprising lexical occurrences in proportion to their overall sample sizes. If we look at the raw counts, we can see this "naming" tendency come out even more:

In [38]:
topUncommon('>MR[.qal', 'QR>[.qal', 'vi_objc_domain', count_type='raw').head(20)
Out[38]:
difference >MR[.qal QR>[.qal
Name 106.0 0.0 106.0
Names of People 73.0 0.0 73.0
Names of Locations 34.0 0.0 34.0
Names of Landforms 11.0 0.0 11.0
Happen 10.0 2.0 12.0
Ingest 8.0 0.0 8.0
Names of Deities 7.0 0.0 7.0
Groups 5.0 0.0 5.0
Names of Constructions 4.0 0.0 4.0
Kinship 4.0 0.0 4.0
Free 4.0 0.0 4.0
Know 3.0 3.0 0.0
People 3.0 0.0 3.0
Land 3.0 0.0 3.0
Festivals 3.0 0.0 3.0
Towns 3.0 0.0 3.0
Speak 3.0 5.0 2.0
Universe 2.0 0.0 2.0
Gather 2.0 0.0 2.0
Priests 2.0 0.0 2.0

If we look at the lexical experiments, we can see the same tendency reflected:

In [39]:
topUncommon('>MR[.qal', 'QR>[.qal', 'vi_objc_lex').head(20)
Out[39]:
difference >MR[.qal QR>[.qal
NPTLJ/ 8.252665 0.000000 8.252665
YPNT_P<NX/ 8.252665 0.000000 8.252665
JKJN/ 8.252665 0.000000 8.252665
<MQ/ 8.252665 0.000000 8.252665
<MNW_>L/ 8.252665 0.000000 8.252665
KJ_CGGH/ 8.252665 8.252665 0.000000
GRCM/ 8.252665 0.000000 8.252665
XRVM/ 8.252665 0.000000 8.252665
RXBWT/ 8.252665 0.000000 8.252665
KJ_BL<==[ 8.252665 8.252665 0.000000
FNJR/ 8.252665 0.000000 8.252665
XRMH/ 8.252665 0.000000 8.252665
JMJMH/ 8.252665 0.000000 8.252665
XRB=/ 8.252665 0.000000 8.252665
C_HBL/ 8.252665 8.252665 0.000000
>BL_MYRJM/ 8.252665 0.000000 8.252665
L>_<MJ/ 8.252665 0.000000 8.252665
FRJN/ 8.252665 0.000000 8.252665
>BN_H<ZR/ 8.252665 0.000000 8.252665
L_DRC[ 8.252665 8.252665 0.000000

In the אמר top list we see content of speech with verbs such as לדרש (say -> to seek) or לכנס (say -> to gather). In the קרא top list we see nouns and proper nouns. We can say, then, that one of the primary differences between אמר and קרא is the tendency of קרא to take nouns as direct objects in a naming context; אמר's objects are distinguished by marking action.

One of the distinguishing similarities between אמר and קרא was in the subject animacy experiment. These two verbs have high preferences for animate subjects, as would be expected:

In [40]:
plotPa('>MR[.qal', 'QR>[.qal', 'vi_subj_animacy')
>MR[.qal
animate      0.992513
inanimate    0.007487
Name: >MR[.qal, dtype: float64 

QR>[.qal
animate      0.991453
inanimate    0.008547
Name: QR>[.qal, dtype: float64 

There are cases in our data where inanimate subjects are personified and begin speaking. We show a few of those cases below (results cut off at 5).

In [41]:
B.show(experiments['vi_subj_animacy'].target2basis2result['>MR[.qal']['inanimate'][:5], condenseType='clause')

clause 1

clause xYqX
phrase Conj CP
phrase Pred VP
phrase Subj NP

clause 2

clause WayX
phrase Conj CP
conj and
phrase Pred VP
verb say qal wayq
phrase Subj NP

clause 3

clause WayX
phrase Conj CP
conj and
phrase Pred VP
verb say qal wayq

clause 4

clause WayX
phrase Conj CP
conj and
phrase Pred VP
verb say qal wayq
phrase Subj NP

clause 5

clause WayX
phrase Conj CP
conj and
phrase Pred VP
verb say qal wayq
phrase Subj NP
art the
subs thorny bush

The strong agreement in subject animacy similarity is paired with a high agreement of complement similarity. Here are the top common scores for both terms in the complement animacy frames:

In [42]:
topCommon('>MR[.qal', 'QR>[.qal', 'vf_cmpl_animacy')
Out[42]:
averaged >MR[.qal QR>[.qal
>L_animate 4.417922 6.139394 2.696450
L_animate 3.551028 4.445276 2.656780
L_inanimate 2.891350 3.454115 2.328584
<L_inanimate 1.054641 0.811928 1.297354

Both verbs often direct their action from an animate subject, with either the preposition אל or ל, towards an animate complement. We can note an interesting difference with קרא that ties back to its naming tendency: "animate" without a qualifying preposition is included, presumably the person who is being named.

Finally, אמר and קרא consistently had lower levels of similarity in the adjunct experiments. This happens even though both have relatively higher similarity in the presence/absence experiment (closer to the complement). Why might that be the case?

In [43]:
topUncommon('>MR[.qal', 'QR>[.qal', 'vi_adj+_domain').head(20)
Out[43]:
difference >MR[.qal QR>[.qal
>L_Shame 7.761551 7.761551 0.000000
B<D/_Evil 7.761551 0.000000 7.761551
J<N/>CR_Space 7.761551 7.761551 0.000000
>L_Domestic Animals 7.761551 0.000000 7.761551
L_Bald 7.761551 0.000000 7.761551
MN_Shine 7.761551 0.000000 7.761551
<L_Crops 7.761551 0.000000 7.761551
K_Name 6.761551 0.000000 6.761551
L_Hang 6.761551 6.761551 0.000000
>L_Possess 6.761551 0.000000 6.761551
>L_Leaders 6.761551 6.761551 0.000000
>L_Kinship 6.761551 6.761551 0.000000
B_Curse 6.761551 6.761551 0.000000
>CR_Move 6.761551 6.761551 0.000000
L_Event Referents: Time 6.761551 0.000000 6.761551
L_Curse 6.761551 0.000000 6.761551
L_Think 6.761551 6.761551 0.000000
L_Small 6.761551 0.000000 6.761551
B_Lament 6.761551 6.761551 0.000000
LM<N_Move 6.761551 6.761551 0.000000

The differences here are diverse. We can see a slight tendency for the preposition ב with אמר in conjunction with various kinds of "speaking" verbs like curse and lament. קרא has more references to landscape and time (possibly from the ETCBC Loca and Time tags). At this point it is difficult to perceive where the differences are.

Diving into the raw scores helps:

In [44]:
topUncommon('>MR[.qal', 'QR>[.qal', 'vi_adj+_domain', count_type='raw').head(20)
Out[44]:
difference >MR[.qal QR>[.qal
L_Speak 88.0 112.0 24.0
<D_Time 8.0 1.0 9.0
L_Move 8.0 10.0 2.0
MN_Orientation 5.0 0.0 5.0
B_Name 5.0 1.0 6.0
B_Perception 5.0 0.0 5.0
L_Dead 4.0 4.0 0.0
B_Time 4.0 17.0 13.0
MN_Universe 3.0 0.0 3.0
Event Referents: Location 3.0 0.0 3.0
K_Speak 3.0 3.0 0.0
Event Referents: Time 3.0 6.0 3.0
L_Exist 3.0 3.0 0.0
L_Know 3.0 3.0 0.0
B_Vision 3.0 3.0 0.0
Buildings 3.0 0.0 3.0
B_Just 2.0 0.0 2.0
B_Distress 2.0 0.0 2.0
L_Dwell 2.0 2.0 0.0
B_Afraid 2.0 2.0 0.0

We see references to space and time amongst the קרא data (though time is also frequent with אמר). Adjuncts with מן (MN_Orientation, MN_Universe) appear in the קרא list but never with אמר. Let's look closer at these instances.

In [45]:
B.show(experiments['vi_adj+_domain'].target2basis2result['QR>[.qal']['MN_Universe'], condenseType='clause')

clause 1

clause WayX

clause 2

clause WayX

clause 3

In [46]:
B.show(experiments['vi_adj+_domain'].target2basis2result['QR>[.qal']['MN_Orientation'], condenseType='clause')

clause 1

clause WayX
phrase Conj CP
conj and
phrase Pred VP
verb call qal wayq
phrase Loca PP
prep from
art the
subs blackberry bush

clause 2

clause 3

clause xYq0

clause 4

clause Way0
phrase Conj CP
conj and
phrase Pred VP
verb call qal wayq
phrase Objc NP

clause 5

clause xYq0

The cell below looks for any cases where אמר occurs alongside מן in our dataset; it finds none:

In [47]:
[experiments['vi_adj+_domain'].target2basis2result['>MR[.qal'][b] for b in experiments['vi_adj+_domain'].target2basis2result['QR>[.qal'] if b.startswith('MN')]
Out[47]:
[[], [], [], [], [], [], [], [], []]

This difference shows that קרא occurs with a preposition that indicates separation or source. We can tease this out further:

In [48]:
spaces['vf_argAll_animacy'].pmi['QR>[.qal'].sort_values(ascending=False).head(5) # N.B. these are ppmi adjusted scores
Out[48]:
Cmpl.L_animate|Cmpl.L_inanimate                  7.693487
Cmpl.L_animate|adj+.>L_animate|adj+.inanimate    7.693487
Cmpl.>L_animate|adj+.MN_inanimate                7.693487
Cmpl.L_inanimate|Objc.KJ_animate                 7.693487
Cmpl.B_inanimate|Cmpl.L_animate                  7.693487
Name: QR>[.qal, dtype: float64

Note the third-ranked frame with an animate complement and an MN+inanimate adjunct. This shows an action directed towards an animate complement, a feature which קרא and אמר have in common, combined with an adjunct that they do not have in common. This is a nice distinction between these two verbs.

Conclusion on אמר and קרא

אמר and קרא share common contexts and frequently occur together. They are similar in their preference for animate subjects and complements, and they share similar prepositional phrases with אל or ל directing the action towards the animate complement. The verbs differ, though, in their object and adjunct preferences. אמר nearly always takes speech as a direct object, while קרא is more flexible and occurs with an object only about half of the time. When either verb does take an object, there are differences in the domain and lexical choices of their objects. Namely, קרא more frequently takes proper noun phrases as objects, reflecting its use for naming. In its adjunct preferences, קרא is also distinct in its use of מן to indicate the source of the action. It seems that קרא may also occur more often with adjuncts representing points in time or space, but this hypothesis needs to be more thoroughly tested.

אמר and דבר

In [49]:
say_speak = get_sim_experiments('>MR[.qal', 'DBR[.piel')

The highest levels of agreement between אמר and דבר are seen in the vd experiments, where the clause-chain context is highest, followed by the window context. The top common context words are shown below:

In [50]:
topCommon('>MR[.qal', 'DBR[.piel', 'vd_con_window').head(10)
Out[50]:
averaged >MR[.qal DBR[.piel
NTN/ 7.803781 8.888743 6.718818
GJXZJ/ 7.333228 8.625709 6.040746
MJKJHW/ 7.248265 8.040746 6.455784
>XJTPL/ 7.125709 7.625709 6.625709
>LJC</ 7.060638 8.268157 5.853119
MRJ/ 7.040746 7.040746 7.040746
NBWT/ 7.040746 7.040746 7.040746
JRMJHW/ 6.946828 7.835632 6.058024
NBL/ 6.925269 6.925269 6.925269
CVR[ 6.833228 6.040746 7.625709

The majority of these terms are proper nouns. We do see שׁטר (participle, "officer").

We see high levels of agreement in the pa inventory experiments; allarg is higher due to the combined value of those experiments. We can notice that the object experiment, though, is much lower in its similarity, as was the case between אמר and קרא. Let's look more closely at vi_objc_pa.

In [51]:
plotPa('>MR[.qal', 'DBR[.piel', 'vi_objc_pa')
>MR[.qal
Objc    0.921019
ø       0.078981
Name: >MR[.qal, dtype: float64 

DBR[.piel
Objc    0.283664
ø       0.716336
Name: DBR[.piel, dtype: float64 

דבר has an aversion to objects that is even stronger than what we saw for קרא (which was about 50/50). We can look at some examples:

In [52]:
B.show(experiments['vi_objc_pa'].target2basis2result['DBR[.piel']['ø'][:10])

verse 1

clause WayX
phrase Conj CP
conj and
phrase Pred VP
verb speak piel wayq
clause Adju InfC
phrase Pred VP
prep to
verb say qal infc

verse 2

clause WayX
phrase Conj CP
conj and
phrase Pred VP
verb walk qal wayq
phrase Subj PrNP
clause Adju xQtX
phrase Conj CP
phrase Pred VP
verb speak piel perf
phrase Subj PrNP
clause WayX
phrase Conj CP
conj and
phrase Pred VP
verb walk qal wayq
phrase Adju PP
prep together with
phrase Subj PrNP

verse 3

clause WayX
phrase Conj CP
conj and
phrase Pred VP
verb fall qal wayq
phrase Subj PrNP
clause WayX
phrase Conj CP
conj and
phrase Pred VP
verb speak piel wayq
phrase Cmpl PP
prep together with
clause Adju InfC
phrase Pred VP
prep to
verb say qal infc

verse 4

clause Way0
phrase Conj CP
conj and
phrase Pred VP
verb be complete piel wayq
clause Adju InfC
phrase Pred VP
prep to
verb speak piel infc
phrase Cmpl PP
prep together with
clause WayX
phrase Conj CP
conj and
phrase Pred VP
verb ascend qal wayq

verse 5

clause Way0
phrase Conj CP
conj and
phrase Pred VP
verb circumcise qal wayq
phrase Objc PP
clause Adju xQtX
phrase Conj CP
phrase Pred VP
verb speak piel perf
phrase Cmpl PP
prep together with

verse 6

clause WYq0
phrase Conj CP
conj and
phrase Pred VP
verb take qal impf
clause WIm0
phrase Conj CP
conj and
phrase Pred VP
verb support qal impv
clause xYq0
phrase Time AdvP
phrase Pred VP
clause xQt0
phrase Conj CP
phrase Adju PP
phrase Pred VP
phrase Cmpl PP