Start¶

This notebook gets you started with using Text-Fabric for coding in the letters of René Descartes.

Familiarity with the underlying data model is recommended.

For provenance, see the documentation: about.

Overview¶

we tell you how to get Text-Fabric on your system;
we tell you how to get the Descartes corpus on your system.

Installing Text-Fabric¶

See the installation instructions.

Running Text-Fabric¶

We will run computer code in the cells below. This code uses the text-fabric library, tf for short.

We import some standard Python modules and then we import the use function from text-fabric.

In [1]:

import sys, os
from tf.app import use

Now we call the use function. We tell it which corpus we want, and text-fabric will fetch the data for us.

If you have cloned the CLARIAH/descartes-tf repository to your local machine under the directory

~/github/CLARIAH/descartes-tf

then you already have the data. In that case you have to call the use command like this:

A = use("CLARIAH/descartes-tf:clone", checkout="clone", hoist=globals())

Below we give the command for the case where you have not cloned the repository. Text-Fabric will fetch the data from the internet and store it in your directory

~/text-fabric-data/github/CLARIAH/descartes-tf.

In both cases, the corpus data will be optimised for fast processing, which is a one-time job.
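The choice between the two calls can be sketched in code. The helper below is hypothetical (use_arguments is not part of Text-Fabric): it only checks whether the clone directory mentioned above exists and returns the matching arguments for use. It assumes the clone lives under ~/github/CLARIAH/descartes-tf, as described above.

```python
from pathlib import Path

ORG_REPO = "CLARIAH/descartes-tf"


def use_arguments(home=None):
    """Return (spec, extra keyword arguments) for `use`.

    Hypothetical helper, not part of Text-Fabric: it only inspects the
    file system and never calls `use` itself.
    """
    home = Path(home) if home is not None else Path.home()
    clone_dir = home / "github" / "CLARIAH" / "descartes-tf"
    if clone_dir.is_dir():
        # A local clone exists: mirror the call shown above for that case.
        return f"{ORG_REPO}:clone", {"checkout": "clone"}
    # No clone: Text-Fabric downloads the data to ~/text-fabric-data.
    return ORG_REPO, {}


# With Text-Fabric installed, one could then write (not executed here):
# spec, extra = use_arguments()
# A = use(spec, hoist=globals(), **extra)
```

This keeps a notebook usable on machines with and without a clone, at the cost of hiding which data source is actually in use.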

In [2]:

A = use("CLARIAH/descartes-tf", hoist=globals())

Locating corpus resources ...

app: ~/text-fabric-data/github/CLARIAH/descartes-tf/app

data: ~/text-fabric-data/github/CLARIAH/descartes-tf/tf/1.1

data: ~/text-fabric-data/github/CLARIAH/descartes-tf/parallels/tf/1.1

Text-Fabric: Text-Fabric API 11.2.0, CLARIAH/descartes-tf/app v3, Search Reference
Data: CLARIAH - descartes-tf 1.1, Character table, Feature docs

Node types

Name	# of nodes	# slots/node	% coverage
volume	8	85241.88	100
letter	725	940.60	100
page	2884	236.45	100
postscriptum	56	46.79	0
opener	545	1.97	0
closer	541	13.10	1
address	86	15.22	0
head	725	23.37	2
p	8438	80.82	100
sentence	13074	50.14	96
hi	5972	4.63	4
formula	6200	1.21	1
figure	319	1.00	0
word	681935	1.00	100

Sets: no custom sets
Features:

Similar Sentences

Feature	Type	Description
sim	int	similarity between sentences based on the Levenshtein ratio

Descartes = Descartes, all letters

Feature	Type	Description
alt_date	str	alternative date of a letter
alt_id	str	alternative ids of a letter, comma separated
cert	str	certainty of something
date	str	date of a letter
id	str	id of a letter
intermediary	str	person involved in the transmission of the letter from sender to receiver
isitalic	str	whether the word is in italic
ismargin	str	whether the word is in the margin
issub	str	whether the word is in subscript
issup	str	whether the word is in supscript
language	str	language of a letter
level	str	level of a paragraph when it acts like a heading
n	int	number of whatever element
notation	str	notation method of a formula
otype	str	
punc	str	nonword chars after a word
recipient	str	recipient of a letter
recipientloc	str	location from where a letter was received
resp	str	person responsible for something
sender	str	sender of a letter
senderloc	str	location from where a letter was sent
tex	str	unformatted TeX code of a formula, without the `$`
trans	str	transcription of a word
typ	str	kind of a node; "empty"; "formula", "head", "symbol", "illustration"
url	str	url of a graphic node
oslots	none	

Text-Fabric API: names N F E L T S C TF directly usable

data: ~/text-fabric-data/github/CLARIAH/descartes-tf/source/illustrations

Found 5 symbols

Found 310 illustrations

The following loads will be much quicker!¶

To show the effect of the optimization step: if we give the same command again, the data loads much faster.
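If you want to measure the difference rather than judge it by eye, a small timing wrapper is enough. The sketch below is an illustration only: timed is a hypothetical helper built on the standard library, and the Text-Fabric calls appear in comments because they need the library and the corpus to be present.

```python
import time


def timed(fn):
    """Call fn() and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


# Hypothetical usage with Text-Fabric installed (not executed here):
# A, first = timed(lambda: use("CLARIAH/descartes-tf", hoist=globals()))
# A, second = timed(lambda: use("CLARIAH/descartes-tf", hoist=globals()))
# print(f"first load: {first:.1f}s, second load: {second:.1f}s")
```

Note that the first measurement only includes the optimization step if the corpus has not been loaded on this machine before.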

In [3]:

A = use("CLARIAH/descartes-tf", hoist=globals())

Locating corpus resources ...

app: ~/text-fabric-data/github/CLARIAH/descartes-tf/app

data: ~/text-fabric-data/github/CLARIAH/descartes-tf/tf/1.1

data: ~/text-fabric-data/github/CLARIAH/descartes-tf/parallels/tf/1.1

Text-Fabric: Text-Fabric API 11.1.2, CLARIAH/descartes-tf/app v3, Search Reference
Data: DESCARTES-TF, Character table, Feature docs


The output¶

The messages after loading the corpus contain a lot of information about it.

Tip: click the triangles and the links, and have a quick look.

The Text-Fabric line has various links to the API docs.

Under Node types you find statistics about the corpus.

Under Descartes = Descartes, all letters you find the features of the corpus with short descriptions.

This corpus has additional material: illustrations. They have been downloaded automatically in the process, and you see how many there are.
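The columns of the Node types table are related by simple arithmetic. As a sketch, and assuming that "% coverage" is the share of all slots (words) contained in nodes of a type, and that nodes of one type do not overlap, the coverage can be approximated as the number of nodes times the average number of slots per node, divided by the total number of slots. The figures below are copied from the table above; the formula itself is our assumption, not taken from the Text-Fabric documentation.

```python
# (name, number of nodes, average slots per node, reported % coverage)
NODE_TYPES = [
    ("volume", 8, 85241.88, 100),
    ("letter", 725, 940.60, 100),
    ("page", 2884, 236.45, 100),
    ("postscriptum", 56, 46.79, 0),
    ("opener", 545, 1.97, 0),
    ("closer", 541, 13.10, 1),
    ("address", 86, 15.22, 0),
    ("head", 725, 23.37, 2),
    ("p", 8438, 80.82, 100),
    ("sentence", 13074, 50.14, 96),
    ("hi", 5972, 4.63, 4),
    ("formula", 6200, 1.21, 1),
    ("figure", 319, 1.00, 0),
    ("word", 681935, 1.00, 100),
]

TOTAL_SLOTS = 681935  # the number of word nodes; words are the slots


def coverage(n_nodes, slots_per_node, total=TOTAL_SLOTS):
    """Approximate % of slots covered, assuming non-overlapping nodes."""
    return round(100 * n_nodes * slots_per_node / total)


for name, n_nodes, slots_per_node, reported in NODE_TYPES:
    computed = coverage(n_nodes, slots_per_node)
    print(f"{name:<13} computed {computed:>3}  reported {reported:>3}")
```

For example, the 13074 sentences average 50.14 words each, which accounts for about 96% of the 681935 words, in line with the reported value.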

Highlights¶

This corpus is special in that it has mathematical formulas and illustrations.

We show some of them to whet your appetite.

Formulas¶

There are simple formulas and complex formulas. The latter are represented as TeX codes, and will be typeset nicely.

Let's find the complex ones.

In [4]:

query = """
formula notation=TeX
"""

results = A.search(query)

  0.01s 219 results
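The template above is a single line: a node type followed by a feature=value constraint. If you build such lines programmatically, a small helper can do it. The function search_template below is hypothetical (it is not part of Text-Fabric) and handles only this simplest case: one node type, plain values without spaces or special characters.

```python
def search_template(node_type, **features):
    """Build a one-line template: a node type plus feature=value pairs.

    Hypothetical helper: it does not escape values and does not support
    nested or multi-line templates.
    """
    constraints = " ".join(f"{name}={value}" for name, value in features.items())
    return f"{node_type} {constraints}".strip()


query = search_template("formula", notation="TeX")
print(query)  # formula notation=TeX

# With the corpus loaded, this string could be passed on (not executed here):
# results = A.search(query)
```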

Let's show a few.

In [5]:

A.table(results, end=3)

n	p	formula
1	1 1046:11	${1\over 3} {4\over 9} {16\over 27} {64\over 81}$
2	1 1060:3	$4.900x^{6} \ {\it aequat}\ - 4.899x^{5} + 2.354x^{4} + 16.858x^{3} + 9.458xx + 429x - 4.900$
3	1 1060:9	${\displaystyle\strut {3xx - 1x}\over \displaystyle\strut 2}$

You can see them in context as well:

In [6]:

A.show(results, end=3)

result 1

1 1046:11

sentence 1

Vous me demandez, en troisième lieu, comment se meut une pierre [hi] in vacuo; mais parce que vous avez oublié à mettre la figure, que vous supposez être à la marge de votre lettre, je ne puis bien entendre ce que vous proposez, et il ne me semble point que les proportions que vous mettez, se rapportent à celles que je vous ai autrefois mandées, ou au lieu de, etc., comme vous m'écrivez, je mettais [formula TeX, notation=TeX] ${1\over 3} {4\over 9} {16\over 27} {64\over 81}$ , etc., ce qui donne bien d'autres conséquences.

result 2

1 1060:3

sentence 2

Et je trouve que la proportion, qui est entre le moindre côté du triangle [formula] ABC et le plus grand, est comme l'unité à l'une des deux racines qui peuvent être tirées de cette équation: [formula TeX, notation=TeX] $4.900x^{6} \ {\it aequat}\ - 4.899x^{5} + 2.354x^{4} + 16.858x^{3} + 9.458xx + 429x - 4.900$

result 3

1 1060:9

sentence 1

(Lequel est nombre figuré comme 5, 12, 22, sont nombres pentagonaux et [formula TeX, notation=TeX] ${\displaystyle\strut {3xx - 1x}\over \displaystyle\strut 2}$ sont les termes d'algebra qui expriment leurs racines, et ils contiennent 6 unités).

Next steps¶

By now you have an impression of how to orient yourself in this corpus. The next steps show you how to become more powerful: by searching and computing.

After that it is time for collecting results, using them in new annotations, and sharing them.

start: intro and highlights
search: turbo charge your hand-coding with search templates
compute: sink down a level and compute it yourself
exportExcel: make tailor-made spreadsheets out of your results

Advanced

similar sentences: find similar sentences

CC-BY Dirk Roorda