Export to Excel

In a notebook, you can run searches, view the results in a tabular display, and zoom in on individual items with pretty displays.

But there are times when you want to take your results outside Text-Fabric, outside a notebook, outside Python, and just work with them in other programs, such as Excel.

You want to do that not only with query results, but with all kinds of lists of tuples of nodes.

There is a function for that, A.export(), and here we show what it can do.

In [1]:
%load_ext autoreload
%autoreload 2

Incantation

The ins and outs of installing Text-Fabric, getting the corpus, and initializing a notebook are explained in the start tutorial.

In [2]:
import os
from tf.app import use
In [24]:
A = use('bhsa', hoist=globals())
Using bhsa commit 9374f7a8d075a94bc6b7e69e08a7ca86e725215f
  in /Users/dirk/text-fabric-data/__apps__/bhsa
Using etcbc/bhsa/tf - c r1.5 in /Users/dirk/text-fabric-data
Using etcbc/phono/tf - c r1.2 in /Users/dirk/text-fabric-data
Using etcbc/parallels/tf - c r1.2 in /Users/dirk/text-fabric-data

Inspect the contents of a file

We write a function that can peek into a file on your system and show the first few lines. We'll use it to inspect the exported files that we are going to produce.

In [4]:
EXPORT_FILE = os.path.expanduser('~/Downloads/results.tsv')
UPTO = 10

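def checkout():
    """Print the first UPTO lines of the exported file."""
    # the export is UTF-16 encoded, so read it with that encoding
    with open(EXPORT_FILE, encoding='utf_16') as fh:
        for (i, line) in enumerate(fh):
            if i >= UPTO:
                break
            # each line keeps its own newline, so print() leaves a blank line after it
            print(line)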

Encoding

Our exported .tsv files open in Excel without hassle, even if they contain non-Latin characters. That is because TF writes such files in an encoding that works well with Excel: utf_16_le. You can open them in Excel directly; no conversion is needed before or after opening these files.

Should you want to process these files by means of a (Python) program, take care to read them with encoding utf_16.
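A minimal sketch of reading such a file back into Python with the standard library only. The helper name read_export and the small stand-in file are made up for illustration, and the sketch assumes that the fields are plain tab-separated values without quoting.

```python
import csv
import os
import tempfile


def read_export(path):
    """Read a tab-separated export into its header and a list of row dicts."""
    # newline='' leaves the handling of line endings to the csv module
    with open(path, encoding='utf_16', newline='') as fh:
        # assumption: fields are not quoted, so take quote characters literally
        reader = csv.DictReader(fh, delimiter='\t', quoting=csv.QUOTE_NONE)
        rows = list(reader)
        return reader.fieldnames, rows


# Stand-in file with made-up content, only there to make the sketch runnable
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.tsv')
with open(demo_path, 'w', encoding='utf_16', newline='') as fh:
    fh.write('R\tNODE1\tTEXT1\n1\t141566\tחנה\n')

header, rows = read_export(demo_path)
print(header)
print(rows[0]['NODE1'])
```

With a real export, pass the path of the exported file instead of demo_path.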

Example query

We first run a query in order to export the results.

In [5]:
query = '''
book book=Samuel_I
  clause
    word sp=nmpr
'''
results = A.search(query)
  0.53s 1868 results

Bare export

You can export the table of results to Excel.

The following command writes a tab-separated file results.tsv to your downloads directory.

You can specify arguments toDir=directory and toFile=file name to write to a different file. If the directory does not exist, it will be created.
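As a sketch of the non-default case: the directory and file name below are hypothetical, and the A.export call is commented out so that the snippet runs on its own. The arguments toDir and toFile are the ones named above.

```python
import os

# Hypothetical destination; any directory and file name will do
export_dir = os.path.expanduser('~/Documents/tf-exports')
export_file = 'samuel-names.tsv'

# In the notebook, with A and results as defined above:
# A.export(results, toDir=export_dir, toFile=export_file)

# Path from which to read the export back afterwards
custom_export_path = os.path.join(export_dir, export_file)
print(custom_export_path)
```

Note that checkout() reads from EXPORT_FILE, so that constant would have to point at the new path as well.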

We stick to the default, however.

In [6]:
A.export(results)

Check out the contents:

In [7]:
checkout()
R	S1	S2	S3	NODE1	TYPE1	book1	NODE2	TYPE2	TEXT2	NODE3	TYPE3	TEXT3	sp3

1	1_Samuel	1	1	426592	book	Samuel_I	453942	clause	וַיְהִי֩ אִ֨ישׁ אֶחָ֜ד מִן־הָרָמָתַ֛יִם צֹופִ֖ים מֵהַ֣ר אֶפְרָ֑יִם 	141547	word	אֶפְרָ֑יִם 	nmpr

2	1_Samuel	1	1	426592	book	Samuel_I	453943	clause	וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ 	141550	word	אֶ֠לְקָנָה 	nmpr

3	1_Samuel	1	1	426592	book	Samuel_I	453943	clause	וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ 	141552	word	יְרֹחָ֧ם 	nmpr

4	1_Samuel	1	1	426592	book	Samuel_I	453943	clause	וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ 	141554	word	אֱלִיה֛וּא 	nmpr

5	1_Samuel	1	1	426592	book	Samuel_I	453943	clause	וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ 	141556	word	תֹּ֥חוּ 	nmpr

6	1_Samuel	1	1	426592	book	Samuel_I	453943	clause	וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ 	141558	word	צ֖וּף 	nmpr

7	1_Samuel	1	2	426592	book	Samuel_I	453945	clause	שֵׁ֤ם אַחַת֙ חַנָּ֔ה 	141566	word	חַנָּ֔ה 	nmpr

8	1_Samuel	1	2	426592	book	Samuel_I	453946	clause	וְשֵׁ֥ם הַשֵּׁנִ֖ית פְּנִנָּ֑ה 	141571	word	פְּנִנָּ֑ה 	nmpr

9	1_Samuel	1	2	426592	book	Samuel_I	453948	clause	לִפְנִנָּה֙ יְלָדִ֔ים 	141575	word	פְנִנָּה֙ 	nmpr

You see the following columns:

  • R the sequence number of the result tuple in the result list
  • S1 S2 S3 the section as book, chapter, verse, in separate columns
  • NODEi TYPEi the node and its type, for each node i in the result tuple
  • TEXTi the full text of node i, if the node type admits a concise text representation
  • sp3 the value of feature sp for node 3, since our query mentions the feature sp on node 3
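The column layout described above can be taken apart programmatically. The following is only a sketch of the naming pattern visible in the header, not of how TF builds it; the function name group_columns is made up, and a feature whose own name ends in a digit would confuse it.

```python
import re


def group_columns(header):
    """Split export column names into leading columns and per-member groups."""
    # Everything before the first NODE column is general information (R, S1, ...)
    first_node = next(i for i, name in enumerate(header) if name.startswith('NODE'))
    leading = header[:first_node]
    members = {}
    for name in header[first_node:]:
        # The trailing digits give the 1-based position of the tuple member
        match = re.fullmatch(r'(.*?)(\d+)', name)
        label, position = match.group(1), int(match.group(2))
        members.setdefault(position, []).append(label)
    return leading, members


# Header row copied from the export above
header = 'R S1 S2 S3 NODE1 TYPE1 book1 NODE2 TYPE2 TEXT2 NODE3 TYPE3 TEXT3 sp3'.split()
leading, members = group_columns(header)
print(leading)
print(members)
```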

Richer exports

If we want to see the clause type (feature typ) and the word gender (feature gn) as well, we must mention them in the query.

We can do so as follows:

In [8]:
query = '''
book book=Samuel_I
  clause typ*
    word sp=nmpr gn*
'''
results = A.search(query)
  0.93s 1868 results

We get the same number of results as before: the * is a trivial condition that is always true.

We do the export again and peek at the results.

In [9]:
A.export(results)
checkout()
R	S1	S2	S3	NODE1	TYPE1	book1	NODE2	TYPE2	TEXT2	typ2	NODE3	TYPE3	TEXT3	gn3	sp3

1	1_Samuel	1	1	426592	book	Samuel_I	453942	clause	וַיְהִי֩ אִ֨ישׁ אֶחָ֜ד מִן־הָרָמָתַ֛יִם צֹופִ֖ים מֵהַ֣ר אֶפְרָ֑יִם 	WayX	141547	word	אֶפְרָ֑יִם 	unknown	nmpr

2	1_Samuel	1	1	426592	book	Samuel_I	453943	clause	וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ 	NmCl	141550	word	אֶ֠לְקָנָה 	m	nmpr

3	1_Samuel	1	1	426592	book	Samuel_I	453943	clause	וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ 	NmCl	141552	word	יְרֹחָ֧ם 	m	nmpr

4	1_Samuel	1	1	426592	book	Samuel_I	453943	clause	וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ 	NmCl	141554	word	אֱלִיה֛וּא 	m	nmpr

5	1_Samuel	1	1	426592	book	Samuel_I	453943	clause	וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ 	NmCl	141556	word	תֹּ֥חוּ 	m	nmpr

6	1_Samuel	1	1	426592	book	Samuel_I	453943	clause	וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ 	NmCl	141558	word	צ֖וּף 	m	nmpr

7	1_Samuel	1	2	426592	book	Samuel_I	453945	clause	שֵׁ֤ם אַחַת֙ חַנָּ֔ה 	NmCl	141566	word	חַנָּ֔ה 	f	nmpr

8	1_Samuel	1	2	426592	book	Samuel_I	453946	clause	וְשֵׁ֥ם הַשֵּׁנִ֖ית פְּנִנָּ֑ה 	NmCl	141571	word	פְּנִנָּ֑ה 	f	nmpr

9	1_Samuel	1	2	426592	book	Samuel_I	453948	clause	לִפְנִנָּה֙ יְלָדִ֔ים 	NmCl	141575	word	פְנִנָּה֙ 	f	nmpr

As you see, you have two extra columns: typ2 and gn3.

This gives you a lot of control over the generation of spreadsheets.
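Once a feature sits in a column of its own, it is easy to summarize outside TF. As a small illustration, here is a tally of the gn3 column over the nine rows shown above; the rows are typed in by hand as dictionaries, which is also the shape you would get from csv.DictReader.

```python
from collections import Counter

# NODE3 and gn3 values copied from the nine rows shown above
rows = [
    {'NODE3': '141547', 'gn3': 'unknown'},
    {'NODE3': '141550', 'gn3': 'm'},
    {'NODE3': '141552', 'gn3': 'm'},
    {'NODE3': '141554', 'gn3': 'm'},
    {'NODE3': '141556', 'gn3': 'm'},
    {'NODE3': '141558', 'gn3': 'm'},
    {'NODE3': '141566', 'gn3': 'f'},
    {'NODE3': '141571', 'gn3': 'f'},
    {'NODE3': '141575', 'gn3': 'f'},
]

# Count how often each gender value occurs among the proper nouns
gender_counts = Counter(row['gn3'] for row in rows)
print(gender_counts.most_common())
```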

Not from queries

You can also export lists of node tuples that are not obtained by a query:

In [10]:
tuples = (
    tuple(results[0][1:3]),
    tuple(results[1][1:3]),
)

tuples
Out[10]:
((453942, 141547), (453943, 141550))

Two rows, each row has a clause node and a word node.
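For more than a handful of rows, the same slicing can be written as a comprehension. This sketch uses the first two result tuples from above as sample data; the names sample_results and clause_word_pairs are chosen here so that they do not overwrite the notebook's own variables.

```python
# First two result tuples (book, clause, word), as seen in the exports above
sample_results = [
    (426592, 453942, 141547),
    (426592, 453943, 141550),
]

# Keep only the clause and word members of every result
clause_word_pairs = tuple(tuple(result[1:3]) for result in sample_results)
print(clause_word_pairs)
```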

Let's do a bare export:

In [11]:
A.export(tuples)
checkout()
R	S1	S2	S3	NODE1	TYPE1	TEXT1	book1	NODE2	TYPE2	TEXT2	typ2

1	1_Samuel	1	1	453942	clause	וַיְהִי֩ אִ֨ישׁ אֶחָ֜ד מִן־הָרָמָתַ֛יִם צֹופִ֖ים מֵהַ֣ר אֶפְרָ֑יִם 		141547	word	אֶפְרָ֑יִם 	

2	1_Samuel	1	1	453943	clause	וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ 		141550	word	אֶ֠לְקָנָה 	

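Wait a minute: why are the columns book1 and typ2 there, both of them empty?

It is because we have run a query before in which we mentioned book on the first node and typ on the second. Those settings are still in effect, but now the first member is a clause and the second a word, so the columns remain empty.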

If we do not want to be influenced by previous things we've run, we need to reset the display:

In [12]:
A.displayReset('tupleFeatures')

Again:

In [13]:
A.export(tuples)
checkout()
R	S1	S2	S3	NODE1	TYPE1	TEXT1	NODE2	TYPE2	TEXT2

1	1_Samuel	1	1	453942	clause	וַיְהִי֩ אִ֨ישׁ אֶחָ֜ד מִן־הָרָמָתַ֛יִם צֹופִ֖ים מֵהַ֣ר אֶפְרָ֑יִם 	141547	word	אֶפְרָ֑יִם 

2	1_Samuel	1	1	453943	clause	וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ 	141550	word	אֶ֠לְקָנָה 

Display setup

We can get richer exports by means of A.displaySetup(), using the parameter tupleFeatures:

In [14]:
A.displaySetup(tupleFeatures=(
    (0, 'typ rela'),
    (1, 'sp gn nu pdp'),
))

We assign extra features per member of the tuple.

In the above case:

  • the first (0) member (the clause node), gets features typ and rela;
  • the second (1) member (the word node), gets features sp, gn, nu and pdp.
In [15]:
A.export(tuples)
checkout()
R	S1	S2	S3	NODE1	TYPE1	TEXT1	typ1	rela1	NODE2	TYPE2	TEXT2	sp2	gn2	nu2	pdp2

1	1_Samuel	1	1	453942	clause	וַיְהִי֩ אִ֨ישׁ אֶחָ֜ד מִן־הָרָמָתַ֛יִם צֹופִ֖ים מֵהַ֣ר אֶפְרָ֑יִם 	WayX	NA	141547	word	אֶפְרָ֑יִם 	nmpr	unknown	sg	nmpr

2	1_Samuel	1	1	453943	clause	וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ 	NmCl	NA	141550	word	אֶ֠לְקָנָה 	nmpr	m	sg	nmpr
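The extra columns in the header above follow a simple pattern: the feature name, followed by the 1-based position of the tuple member, while tupleFeatures itself counts from 0. The sketch below only mimics that naming pattern as observed in the output; feature_columns is a made-up helper, not part of TF.

```python
def feature_columns(tuple_features):
    """Derive the extra column names from a tupleFeatures specification."""
    columns = []
    for position, features in tuple_features:
        for feature in features.split():
            # tupleFeatures is 0-based, the column names are 1-based
            columns.append(f'{feature}{position + 1}')
    return columns


spec = (
    (0, 'typ rela'),
    (1, 'sp gn nu pdp'),
)
extra_columns = feature_columns(spec)
print(extra_columns)
```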

Talking about display setup: other parameters have an effect as well, e.g. the text format.

Let's change it to the phonetic representation.

In [16]:
A.export(tuples, fmt='text-phono-full')
checkout()
R	S1	S2	S3	NODE1	TYPE1	TEXT1	typ1	rela1	NODE2	TYPE2	TEXT2	sp2	gn2	nu2	pdp2

1	1_Samuel	1	1	453942	clause	wayᵊhˌî ʔˌîš ʔeḥˈāḏ min-hārāmāṯˈayim ṣôfˌîm mēhˈar ʔefrˈāyim 	WayX	NA	141547	word	ʔefrˈāyim 	nmpr	unknown	sg	nmpr

2	1_Samuel	1	1	453943	clause	ûšᵊmˈô ʔelqānˌā ben-yᵊrōḥˈām ben-ʔᵉlîhˈû ben-tˌōḥû ven-ṣˌûf ʔefrāṯˈî . 	NmCl	NA	141550	word	ʔelqānˌā 	nmpr	m	sg	nmpr

Chained queries

You can chain queries like this:

In [17]:
results = (
    A.search('''
book book=Samuel_I
  chapter chapter=1
    verse verse=1
      clause
        word sp=nmpr
''')
    +
    A.search('''
book book=Samuel_I
  chapter chapter=1
    verse verse=1
      clause
        word sp=verb nu=pl
''')
)
  0.46s 6 results
  0.63s 1 result

In such cases, it is better to set up the features yourself:

In [18]:
A.displaySetup(
    tupleFeatures=(
        (3, 'typ rela'),
        (4, 'sp gn vt vs'),
    ),
    fmt='text-phono-full',
)

Now we can do a fine export:

In [19]:
A.export(results)
checkout()
R	S1	S2	S3	NODE1	TYPE1	NODE2	TYPE2	NODE3	TYPE3	NODE4	TYPE4	TEXT4	typ4	rela4	NODE5	TYPE5	TEXT5	sp5	gn5	vt5	vs5

1	1_Samuel	1	1	426592	book	426856	chapter	1421483	verse	453942	clause	wayᵊhˌî ʔˌîš ʔeḥˈāḏ min-hārāmāṯˈayim ṣôfˌîm mēhˈar ʔefrˈāyim 	WayX	NA	141547	word	ʔefrˈāyim 	nmpr	unknown	NA	NA

2	1_Samuel	1	1	426592	book	426856	chapter	1421483	verse	453943	clause	ûšᵊmˈô ʔelqānˌā ben-yᵊrōḥˈām ben-ʔᵉlîhˈû ben-tˌōḥû ven-ṣˌûf ʔefrāṯˈî . 	NmCl	NA	141550	word	ʔelqānˌā 	nmpr	m	NA	NA

3	1_Samuel	1	1	426592	book	426856	chapter	1421483	verse	453943	clause	ûšᵊmˈô ʔelqānˌā ben-yᵊrōḥˈām ben-ʔᵉlîhˈû ben-tˌōḥû ven-ṣˌûf ʔefrāṯˈî . 	NmCl	NA	141552	word	yᵊrōḥˈām 	nmpr	m	NA	NA

4	1_Samuel	1	1	426592	book	426856	chapter	1421483	verse	453943	clause	ûšᵊmˈô ʔelqānˌā ben-yᵊrōḥˈām ben-ʔᵉlîhˈû ben-tˌōḥû ven-ṣˌûf ʔefrāṯˈî . 	NmCl	NA	141554	word	ʔᵉlîhˈû 	nmpr	m	NA	NA

5	1_Samuel	1	1	426592	book	426856	chapter	1421483	verse	453943	clause	ûšᵊmˈô ʔelqānˌā ben-yᵊrōḥˈām ben-ʔᵉlîhˈû ben-tˌōḥû ven-ṣˌûf ʔefrāṯˈî . 	NmCl	NA	141556	word	tˌōḥû 	nmpr	m	NA	NA

6	1_Samuel	1	1	426592	book	426856	chapter	1421483	verse	453943	clause	ûšᵊmˈô ʔelqānˌā ben-yᵊrōḥˈām ben-ʔᵉlîhˈû ben-tˌōḥû ven-ṣˌûf ʔefrāṯˈî . 	NmCl	NA	141558	word	ṣˌûf 	nmpr	m	NA	NA

7	1_Samuel	1	1	426592	book	426856	chapter	1421483	verse	453942	clause	wayᵊhˌî ʔˌîš ʔeḥˈāḏ min-hārāmāṯˈayim ṣôfˌîm mēhˈar ʔefrˈāyim 	WayX	NA	141544	word	ṣôfˌîm 	verb	m	ptca	qal

Next steps

Now you know how to escape from Text-Fabric.

We hope that this makes your stay in TF more comfortable. It's not a Hotel California.

  • display: become an expert in creating pretty displays of your text structures
  • search: turbo charge your hand-coding with search templates
  • share: draw in other people's data and let them use yours
  • export: export your dataset as an Emdros database