You might want to consider the start of this tutorial.

Search Introduction

Search in Text-Fabric is a template-based way of looking for structural patterns in your dataset.

Within Text-Fabric we have the unique possibility to combine the ease of formulating search templates for complicated syntactical patterns with the power of programmatically processing the results.

This notebook will show you how to get up and running.

Alternative for hand-coding

Search is a powerful feature for a wide range of purposes.

Quite a bit of the implementation work has been dedicated to optimizing performance. Yet I do not pretend to have found optimal strategies for all possible search templates. Some search tasks may turn out to be somewhat costly or even very costly.

That being said, I think search might turn out helpful in many cases, especially by reducing the amount of hand-coding needed to work with special subsets of your data.

Easy command

Search is as simple as saying (just an example):

results = A.search(template)
A.show(results)

See all ins and outs in the search template docs.

In [1]:

%load_ext autoreload
%autoreload 2

Incantation

The ins and outs of installing Text-Fabric, getting the corpus, and initializing a notebook are explained in the start tutorial.

In [3]:

from tf.app import use

In [5]:

A = use("etcbc/dhammapada", hoist=globals())

TF-app: ~/text-fabric-data/etcbc/dhammapada/app

data: ~/text-fabric-data/etcbc/dhammapada/tf/0.2

This is Text-Fabric 9.2.0
Api reference : https://annotation.github.io/text-fabric/tf/cheatsheet.html

16 features found and 0 ignored

Text-Fabric: Text-Fabric API 9.2.0, etcbc/dhammapada/app v3, Search Reference
Data: DHAMMAPADA, Character table, Feature docs
Features:

Dhammapada-Latine

clarity

int

word is inserted for clarity, marked by inclusion in ( and ); only in Latin translation

converters:

Dirk Roorda (Text-Fabric)

copynote1:

Digitisation supported by Shri Brihad Bhartiya Samaj 20 February 2020

dateWritten:

2021-12-24T14:49:10Z

digitizers:

Bee Scherer, Yvonne Mataar

edition:

2nd

editor:

V. Fausboll

format:

1 (=true) or absent (=false)

institute:

Text and Traditions, VU Amsterdam

language:

pli,lat

place:

London

project:

Dhammapada-latine

publisher:

Luzac & Co.

researcher:

Bee Scherer

sourceFormat:

plain text

stamp:

50480

subtitle:

being a collection of moral verses in Pali

title:

The Dhammapada

version:

0.2

writtenBy:

Text-Fabric

yearPublished:

1900

extrastanza

int

word is outside a stanza, between stanzas or in pre/post vagga material

format:

1 (=true) or absent (=false)

freq_occ

int

the number of times that this word occurs

format:

positive integer

latin

str

bare word (without non-word-letters)

format:

string (for Latin translation) or empty (for Pali original)

latinpost

str

non-word letters after word, with trailing spaces

format:

string (for Latin translation) or empty (for Pali original)

latinpre

str

non-word letters before word, no leading spaces

format:

string (for Latin translation) or empty (for Pali original)

n

int

number of vagga, stanza (relative to work), sentence, clause (both relative to vagga)

format:

positive number, 0 for pre-stanza material in a vagga

otype

str

pali

str

bare word (without non-word-letters)

format:

string (for Pali original) or empty (for Latin translation)

palipost

str

non-word letters after word, with trailing spaces

format:

string (for Pali original) or empty (for Latin translation)

palipre

str

non-word letters before word, no leading spaces

format:

string (for Pali original) or empty (for Latin translation)

quote

int

word is inside a quote

format:

1 (=true) or absent (=false)

trans

int

whether the node belongs to the original text or a translation

format:

1 (=Latin translation) or absent (=Pali original)

uncertain

int

word is marked as uncertain by inclusion in [ and ]; only in Pali original

format:

1 (=true) or absent (=false)

oslots

none

Text-Fabric API: names N F E L T S C TF directly usable

Basic search command

We start with the simplest form of issuing a query. Let's search for the word Māro in the Pali text. We also want to show the clauses in which it occurs.

But first: how do you type that ā? To be honest: I don't know either.

Text-Fabric has a handy function to give you a palette of all the non-ASCII characters in the corpus:

In [6]:

A.specialCharacters()

Special characters in text-orig-full â ā ḍ ê ë ḥ î ī ḷ ṃ ñ ṅ ṇ ȏ ṭ û ū

Now, if you click on a letter, it is stored on your clipboard, ready to paste. To help you remember where you clicked last, the letter becomes yellow.
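
If clicking is not an option, for example inside a script, the same characters can be produced in Python itself. The sketch below is plain Python and not part of Text-Fabric; the helper name `non_ascii_chars` is hypothetical and only imitates the idea of a character palette.

```python
import unicodedata

# Write the character by its Unicode code point, or look it up by its official name.
a_macron = "\u0101"
assert a_macron == unicodedata.lookup("LATIN SMALL LETTER A WITH MACRON")

word = "M" + a_macron + "ro"  # the search term used in the next query


def non_ascii_chars(text):
    """Return the sorted non-ASCII characters of text (hypothetical helper)."""
    return sorted({ch for ch in text if ord(ch) > 127})


sample = "taṃ ve pasahatī Māro vāto rukkhaṃ va dubbalaṃ"
for ch in non_ascii_chars(sample):
    # Show each special character with its code point and Unicode name.
    print(ch, f"U+{ord(ch):04X}", unicodedata.name(ch))
```

Whether a corpus stores such letters as single precomposed characters or as base letter plus combining mark is an assumption worth checking before you rely on exact string matches.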

In [7]:

query = """
clause
  word pali=Māro
"""
results = A.search(query)

  0.01s 5 results

We have the results. We only need to display them. Here they are in a table:

In [8]:

A.table(results)

n	p	clause	word
1	1 7	subhānupassiṃ viharantaṃ indriyesu asaṃvutaṃ bhojanamhi câmattaññuṃ kusītaṃ hīnavīriyaṃ taṃ ve pasahatī Māro vāto rukkhaṃ va dubbalaṃ.	Māro
2	1 8	asubhānupassiṃ viharantaṃ indriyesu susaṃvutaṃ bhojanamhi ca mattaññuṃ saddhaṃ āraddhavīriyaṃ taṃ [ve] na-ppasahatī Māro vāto selaṃ va pabbataṃ.	Māro
3	4 57	tesaṃ sampannasīlānaṃ appamādavihārinaṃ sammadaññāvimuttānaṃ Māro maggaṃ na vindati.	Māro
4	8 105	n' eva devo na gandhabbo na Māro saha Brahmunā jitaṃ apajitaṃ kayrā tathārūpassa jantuno.	Māro
5	24 337	taṃ vo vadāmi bhaddaṃ vo yāvant' ettha samāgatā taṇhāya mūlaṃ khanatha usīrattho va bīraṇaṃ mā vo naḷaṃ va soto va Māro bhañji punappunaṃ.	Māro

The hyperlinks in the p column point to the Tipitaka site, to the stanza most relevant to the individual results.
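
Because a search template is just a string, it can also be generated from Python. The following is a minimal sketch under that assumption; `make_template` is a hypothetical helper, not a Text-Fabric function, and it only covers the two-line shape used in this notebook.

```python
def make_template(container, word_type, feature, value):
    """Build a two-level search template (hypothetical helper).

    Indentation matters in a template: the indented line describes a node
    that is embedded in the node described by the line above it.
    """
    return f"{container}\n  {word_type} {feature}={value}\n"


for value in ("Māro", "maṃ"):
    query = make_template("clause", "word", "pali", value)
    print(query)
    # With the corpus loaded, each query could be run as: A.search(query)
```

Values that contain spaces or other special characters may need escaping; the search template docs are the place to check that.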

Here is the first one in a pretty display:

In [9]:

A.show(results, end=1)

result 1

1 7

stanza

sentence

clause

subhānupassiṃ

pali=subhānupassiṃ

viharantaṃ

pali=viharantaṃ

indriyesu

pali=indriyesu

asaṃvutaṃ

pali=asaṃvutaṃ

bhojanamhi

pali=bhojanamhi

câmattaññuṃ

pali=câmattaññuṃ

kusītaṃ

pali=kusītaṃ

hīnavīriyaṃ

pali=hīnavīriyaṃ

taṃ

pali=taṃ

ve

pali=ve

pasahatī

pali=pasahatī

Māro

pali=Māro

vāto

pali=vāto

rukkhaṃ

pali=rukkhaṃ

va

pali=va

dubbalaṃ.

pali=dubbalaṃ

sentence

clause

Iucunda

spectantem

viventem,

clause

sensus

non

coercentem

et

in

cibo

modi

nescium,

clause

socordem,

clause

viribus

destitutum,

clause

eum

certe

superat

Māras,

clause

ventus

arborem

sicut

infirmam.

We can also stop unravelling structure at the clause level:

In [10]:

A.show(results, end=2, baseTypes={"clause"})

result 1

1 7

stanza

sentence

clause subhānupassiṃ viharantaṃ indriyesu asaṃvutaṃ bhojanamhi câmattaññuṃ kusītaṃ hīnavīriyaṃ taṃ ve pasahatī Māro vāto rukkhaṃ va dubbalaṃ.

sentence

clause Iucunda spectantem viventem,

clause sensus non coercentem et in cibo modi nescium,

clause socordem,

clause viribus destitutum,

clause eum certe superat Māras,

clause ventus arborem sicut infirmam.

result 2

1 8

stanza

sentence

clause asubhānupassiṃ viharantaṃ indriyesu susaṃvutaṃ bhojanamhi ca mattaññuṃ saddhaṃ āraddhavīriyaṃ taṃ [ve] na-ppasahatī Māro vāto selaṃ va pabbataṃ.

sentence

clause Iucunda non spectantem viventem,

clause sensus bene coercentem et in cibo modum noscentem,

clause fidem habentem,

clause intentis viribus praeditum,

clause eum certe non superat Māras,

clause ventus saxeum volut montem.

Condense results

There are two fundamentally different ways of presenting the results: condensed and uncondensed.

In uncondensed view, all results are listed individually. You can keep track of which parts belong to which results, but the display can become unwieldy.

This is the default view, because it is the most direct and logical answer to your query.

In condensed view, all nodes of all results are first grouped in containers (e.g. stanzas) and then presented container by container. You lose the information about which parts belong to which result.

Here is an example of the difference.

In [11]:

query = """
clause
  word pali=maṃ
"""

results = A.search(query)

  0.01s 7 results

In [12]:

A.table(results)

n	p	clause	word
1	1 3	"akkocchi maṃ avadhi maṃ ajini maṃ ahāsi me",	maṃ
2	1 3	"akkocchi maṃ avadhi maṃ ajini maṃ ahāsi me",	maṃ
3	1 3	"akkocchi maṃ avadhi maṃ ajini maṃ ahāsi me",	maṃ
4	1 4	"akkocchi maṃ avadhi maṃ ajini maṃ ahāsi me",	maṃ
5	1 4	"akkocchi maṃ avadhi maṃ ajini maṃ ahāsi me",	maṃ
6	1 4	"akkocchi maṃ avadhi maṃ ajini maṃ ahāsi me",	maṃ
7	26 414	yo' maṃ palipathaṃ duggaṃ saṃsāraṃ moham accagā tiṇṇo pāragato jhāyī anejo akathaṃkathī anupādāya nibbuto tam -	maṃ

There are multiple occurrences of maṃ in the clauses.

Now in condensed mode:

In [13]:

A.table(results, condensed=True)

n	p	clause	word	word	word
1	1 3	"akkocchi maṃ avadhi maṃ ajini maṃ ahāsi me",	maṃ	maṃ	maṃ
2	1 4	"akkocchi maṃ avadhi maṃ ajini maṃ ahāsi me",	maṃ	maṃ	maṃ
3	26 414	yo' maṃ palipathaṃ duggaṃ saṃsāraṃ moham accagā tiṇṇo pāragato jhāyī anejo akathaṃkathī anupādāya nibbuto tam -	maṃ

Much more compact.
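
The grouping behind a condensed table can be imitated in a few lines of plain Python. This is a toy model, not Text-Fabric's implementation: the node numbers and the `container_of` mapping are invented for illustration.

```python
from collections import defaultdict

# Hypothetical node numbers: clauses 101 and 102, word nodes 2, 4, 6 and 7.
# Each result is a (clause, word) tuple, as produced by a two-line template.
results = [(101, 2), (101, 4), (101, 6), (102, 7)]


def condense(results, container_of):
    """Group all nodes of all results by their container (toy model)."""
    grouped = defaultdict(set)
    for result in results:
        for node in result:
            grouped[container_of[node]].add(node)
    return {container: sorted(nodes) for container, nodes in grouped.items()}


# Here every node is condensed into its clause; a clause contains itself.
container_of = {101: 101, 2: 101, 4: 101, 6: 101, 102: 102, 7: 102}

condensed = condense(results, container_of)
print(len(results), "results become", len(condensed), "rows")
for container, nodes in condensed.items():
    print(container, nodes)
```

The sets make the trade-off visible: once nodes are pooled per container, nothing records which result a node came from.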

In a pretty display, the first two stanzas cover the first 6 hits:

In [14]:

A.show(results, end=2, condensed=True)

stanza 1

1 3

stanza

sentence

clause

"akkocchi

pali=akkocchi

maṃ

pali=maṃ

avadhi

pali=avadhi

maṃ

pali=maṃ

ajini

pali=ajini

maṃ

pali=maṃ

ahāsi

pali=ahāsi

me",

pali=me

clause

ye

pali=ye

taṃ

pali=taṃ

upanayihanti

pali=upanayihanti

veraṃ

pali=veraṃ

tesaṃ

pali=tesaṃ

na

pali=na

sammati.

pali=sammati

sentence

clause

"Conviciis

me

obruit,

clause

verberavit

me,

clause

vicit

me,

clause

spoliavit

me";

clause

qui

isto

(animo)

sese

induunt,

clause

iracundia

eorum

non

sedatur.

stanza 2

1 4

stanza

sentence

clause

"akkocchi

pali=akkocchi

maṃ

pali=maṃ

avadhi

pali=avadhi

maṃ

pali=maṃ

ajini

pali=ajini

maṃ

pali=maṃ

ahāsi

pali=ahāsi

me",

pali=me

clause

ye

pali=ye

taṃ

pali=taṃ

na

pali=na

upanayhanti

pali=upanayhanti

veraṃ

pali=veraṃ

tes'

pali=tes'

ūpasammati.

pali=ūpasammati

sentence

clause

"Conviciis

etc.";

sentence

clause

qui

isto

(animo)

sese

non

induunt,

clause

iracundia

in

iis

sedatur.

We can make it more compact by condensing into clauses instead of stanzas:

In [15]:

A.show(results, end=2, condensed=True, condenseType="clause")

clause 1

1 3

clause

"akkocchi

pali=akkocchi

maṃ

pali=maṃ

avadhi

pali=avadhi

maṃ

pali=maṃ

ajini

pali=ajini

maṃ

pali=maṃ

ahāsi

pali=ahāsi

me",

pali=me

clause 2

1 4

clause

"akkocchi

pali=akkocchi

maṃ

pali=maṃ

avadhi

pali=avadhi

maṃ

pali=maṃ

ajini

pali=ajini

maṃ

pali=maṃ

ahāsi

pali=ahāsi

me",

pali=me

Custom highlighting

We can apply different highlight colours to different parts of the result. The clause is member 1 of each result tuple, and the two words are members 2 and 3. The members that we do not map will not be highlighted. The members that we map to the empty string will be highlighted with the default colour.

NB: Choose your colours from the CSS specification.
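
To make the mapping rule concrete, here is a small sketch of how a colour map keyed by 1-based tuple positions could be resolved for a single result. It is a toy model with invented node numbers; `colours_for` and `DEFAULT_COLOUR` are hypothetical names, and the actual default colour is an assumption.

```python
DEFAULT_COLOUR = "yellow"  # assumption: stand-in for whatever the default is


def colours_for(result, color_map):
    """Map each node of one result tuple to its highlight colour (toy model).

    Positions in color_map are 1-based. Unmapped positions get no colour,
    and the empty string selects the default colour.
    """
    colours = {}
    for position, node in enumerate(result, start=1):
        if position in color_map:
            colours[node] = color_map[position] or DEFAULT_COLOUR
    return colours


# Hypothetical nodes: clause 101, then two word nodes 2 and 3.
result = (101, 2, 3)
print(colours_for(result, {1: "", 2: "cyan", 3: "magenta"}))
print(colours_for(result, {3: "magenta"}))  # only the last member is coloured
```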

In [16]:

query = """
clause
  word pali=maṃ
  word pali=avadhi
"""

results = A.search(query)

  0.02s 6 results

In [17]:

A.table(results, condensed=False, colorMap={1: "", 2: "cyan", 3: "magenta"})

n	p	clause	word	word
1	1 3	"akkocchi maṃ avadhi maṃ ajini maṃ ahāsi me",	maṃ	avadhi
2	1 3	"akkocchi maṃ avadhi maṃ ajini maṃ ahāsi me",	maṃ	avadhi
3	1 3	"akkocchi maṃ avadhi maṃ ajini maṃ ahāsi me",	maṃ	avadhi
4	1 4	"akkocchi maṃ avadhi maṃ ajini maṃ ahāsi me",	maṃ	avadhi
5	1 4	"akkocchi maṃ avadhi maṃ ajini maṃ ahāsi me",	maṃ	avadhi
6	1 4	"akkocchi maṃ avadhi maṃ ajini maṃ ahāsi me",	maṃ	avadhi

Or with more glory:

In [18]:

A.show(results, end=2, condensed=False, condenseType="sentence", colorMap={1: "", 2: "cyan", 3: "magenta"})

result 1

1 3

sentence

clause

"akkocchi

pali=akkocchi

maṃ

pali=maṃ

avadhi

pali=avadhi

maṃ

pali=maṃ

ajini

pali=ajini

maṃ

pali=maṃ

ahāsi

pali=ahāsi

me",

pali=me

clause

ye

pali=ye

taṃ

pali=taṃ

upanayihanti

pali=upanayihanti

veraṃ

pali=veraṃ

tesaṃ

pali=tesaṃ

na

pali=na

sammati.

pali=sammati

result 2

1 3

sentence

clause

"akkocchi

pali=akkocchi

maṃ

pali=maṃ

avadhi

pali=avadhi

maṃ

pali=maṃ

ajini

pali=ajini

maṃ

pali=maṃ

ahāsi

pali=ahāsi

me",

pali=me

clause

ye

pali=ye

taṃ

pali=taṃ

upanayihanti

pali=upanayihanti

veraṃ

pali=veraṃ

tesaṃ

pali=tesaṃ

na

pali=na

sammati.

pali=sammati

Color mapping works best for uncondensed results. If you condense results, some nodes may occupy different positions in different results. It is unpredictable which color will be used for such nodes:

In [19]:

A.show(results, end=1, condensed=True, condenseType="sentence", colorMap={1: "", 2: "cyan", 3: "magenta"})

sentence 1

1 3

sentence

clause

"akkocchi

pali=akkocchi

maṃ

pali=maṃ

avadhi

pali=avadhi

maṃ

pali=maṃ

ajini

pali=ajini

maṃ

pali=maṃ

ahāsi

pali=ahāsi

me",

pali=me

clause

ye

pali=ye

taṃ

pali=taṃ

upanayihanti

pali=upanayihanti

veraṃ

pali=veraṃ

tesaṃ

pali=tesaṃ

na

pali=na

sammati.

pali=sammati
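
Why the colour becomes unpredictable can be seen by checking which nodes turn up at more than one tuple position. The sketch below uses invented results, of the kind a template with two identical word lines might give; `ambiguous_nodes` is a hypothetical helper, not a Text-Fabric function.

```python
from collections import defaultdict

# Hypothetical results: node 2 is the second member of one result
# and the third member of the other, and likewise for node 4.
results = [(101, 2, 4), (101, 4, 2)]


def ambiguous_nodes(results):
    """Return nodes that occur at more than one tuple position (toy model)."""
    positions = defaultdict(set)
    for result in results:
        for position, node in enumerate(result, start=1):
            positions[node].add(position)
    return {node: sorted(pos) for node, pos in positions.items() if len(pos) > 1}


print(ambiguous_nodes(results))
```

Any node reported here would be eligible for two different colours once the results are pooled into one container.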

Constraining order

You can stipulate an order on the words in your template. You only have to put a relational operator between them. Say we want only results where maṃ follows avadhi.

In [20]:

A.specialCharacters()

Special characters in text-orig-full â ā ḍ ê ë ḥ î ī ḷ ṃ ñ ṅ ṇ ȏ ṭ û ū

In [21]:

query = """
clause
  word pali=maṃ
  > word pali=avadhi
"""

results = A.search(query)

  0.02s 4 results

In [22]:

A.table(results, colorMap={1: "", 2: "cyan", 3: "magenta"})

n	p	clause	word	word
1	1 3	"akkocchi maṃ avadhi maṃ ajini maṃ ahāsi me",	maṃ	avadhi
2	1 3	"akkocchi maṃ avadhi maṃ ajini maṃ ahāsi me",	maṃ	avadhi
3	1 4	"akkocchi maṃ avadhi maṃ ajini maṃ ahāsi me",	maṃ	avadhi
4	1 4	"akkocchi maṃ avadhi maṃ ajini maṃ ahāsi me",	maṃ	avadhi

We can also require the words to be adjacent.

In [23]:

query = """
clause
  word pali=maṃ
  :> word pali=avadhi
"""

results = A.search(query)

  0.02s 2 results

In [24]:

A.table(results, colorMap={1: "", 2: "cyan", 3: "magenta"})

n	p	clause	word	word
1	1 3	"akkocchi maṃ avadhi maṃ ajini maṃ ahāsi me",	maṃ	avadhi
2	1 4	"akkocchi maṃ avadhi maṃ ajini maṃ ahāsi me",	maṃ	avadhi
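
The effect of the two operators can be checked by hand on the clause itself. The sketch below is plain Python with word positions counted from 1 inside the clause (an assumption for illustration, not real slot numbers); it models `>` as "comes after" and `:>` as "comes immediately after".

```python
# Word positions inside the clause, counted from 1 (hypothetical numbering).
words = "akkocchi maṃ avadhi maṃ ajini maṃ ahāsi me".split()
slots = {position: word for position, word in enumerate(words, start=1)}

mam = [s for s, w in slots.items() if w == "maṃ"]
avadhi = [s for s, w in slots.items() if w == "avadhi"]

# No operator: every combination of a maṃ and an avadhi in the clause.
unordered = [(m, a) for m in mam for a in avadhi]
# ">": the first word comes after the second one.
after = [(m, a) for m, a in unordered if m > a]
# ":>": the first word comes immediately after the second one.
adjacent = [(m, a) for m, a in unordered if m == a + 1]

print(len(unordered), len(after), len(adjacent))
```

The per-clause counts are 3, 2 and 1. Since the same clause occurs in two stanzas, this is consistent with the 6, 4 and 2 results reported by the three queries.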

Custom feature display

We would like to see the frequency of each word. The way to do that is to perform a display setup first. By the way, we can also include the highlight colours in the display setup.

In [25]:

A.displaySetup(
    extraFeatures="freq_occ", colorMap={2: "lightsalmon", 3: "mediumaquamarine"}
)

In [26]:

A.show(results, condensed=False, condenseType="sentence")

result 1

1 3

sentence

clause

"akkocchi

freq_occ=2 pali=akkocchi

maṃ

freq_occ=7 pali=maṃ

avadhi

freq_occ=2 pali=avadhi

maṃ

freq_occ=7 pali=maṃ

ajini

freq_occ=2 pali=ajini

maṃ

freq_occ=7 pali=maṃ

ahāsi

freq_occ=2 pali=ahāsi

me",

freq_occ=14 pali=me

clause

ye

freq_occ=12 pali=ye

taṃ

freq_occ=30 pali=taṃ

upanayihanti

freq_occ=1 pali=upanayihanti

veraṃ

freq_occ=3 pali=veraṃ

tesaṃ

freq_occ=6 pali=tesaṃ

na

freq_occ=143 pali=na

sammati.

freq_occ=1 pali=sammati

result 2

1 4

sentence

clause

"akkocchi

freq_occ=2 pali=akkocchi

maṃ

freq_occ=7 pali=maṃ

avadhi

freq_occ=2 pali=avadhi

maṃ

freq_occ=7 pali=maṃ

ajini

freq_occ=2 pali=ajini

maṃ

freq_occ=7 pali=maṃ

ahāsi

freq_occ=2 pali=ahāsi

me",

freq_occ=14 pali=me

clause

ye

freq_occ=12 pali=ye

taṃ

freq_occ=30 pali=taṃ

na

freq_occ=143 pali=na

upanayhanti

freq_occ=1 pali=upanayhanti

veraṃ

freq_occ=3 pali=veraṃ

tes'

freq_occ=1 pali=tes'

ūpasammati.

freq_occ=1 pali=ūpasammati

Now we completely reset the display customization.

In [27]:

A.displayReset()

As you see, you have total control.

All steps

start: your first step in mastering the bible computationally
search: turbo-charge your hand-coding with search templates

CC-BY Dirk Roorda