Documentation Tooling Evaluation

In the Hexatomic research project, we evaluated several software documentation tools for their suitability for sustainable software documentation (see https://hexatomic.github.io/documentation/tooling/evaluation.html).

We present the evaluation data here.

In [1]:
# Prepare the raw data
import pandas as pd

# The evaluation scores on a scale from 1 to 5 (one row per tool, one column per category)
data = [
    [5, 3, 1, 3, 4, 3, 4, 4], # Scores for Sphinx (rST)
    [3, 5, 1, 3, 2, 3, 4, 4], # Scores for Sphinx (CM)
    [3, 4, 1, 4, 3, 3, 3, 3], # Scores for Asciidoctor
    [3, 5, 1, 3, 4, 3, 2, 1], # Scores for mkDocs
    [4, 5, 1, 3, 4, 4, 5, 3], # Scores for mdBook
    [4, 5, 1, 3, 4, 2, 2, 2] # Scores for Jekyll
]

# The row index: one label per tool
idx = ['Sphinx (rST)', 'Sphinx (CM)', 'Asciidoctor', 'mkDocs', 'mdBook', 'Jekyll']

# The column labels: one per evaluation category
cols = ['1', '3a', '3b', '3c', '3d', '3e', '3f', '4']

# Create a data frame and print it
df = pd.DataFrame(data, index=idx, columns=cols)
df
Out[1]:
              1  3a  3b  3c  3d  3e  3f  4
Sphinx (rST)  5   3   1   3   4   3   4  4
Sphinx (CM)   3   5   1   3   2   3   4  4
Asciidoctor   3   4   1   4   3   3   3  3
mkDocs        3   5   1   3   4   3   2  1
mdBook        4   5   1   3   4   4   5  3
Jekyll        4   5   1   3   4   2   2  2
In [2]:
# Select the sub-categories of category 3 (usability); label slices include both ends
cat_cols = df.loc[:, '3a':'3f']

# Calculate the row-wise mean of the sub-categories and add it as a new column
df['mean(3)'] = cat_cols.mean(axis=1)

# Change the position of the mean column to go before the '4' column
new_order = [0, 1, 2, 3, 4, 5, 6, 8, 7]
df = df[df.columns[new_order]]

# Print the data frame
df
Out[2]:
              1  3a  3b  3c  3d  3e  3f   mean(3)  4
Sphinx (rST)  5   3   1   3   4   3   4  3.000000  4
Sphinx (CM)   3   5   1   3   2   3   4  3.000000  4
Asciidoctor   3   4   1   4   3   3   3  3.000000  3
mkDocs        3   5   1   3   4   3   2  3.000000  1
mdBook        4   5   1   3   4   4   5  3.666667  3
Jekyll        4   5   1   3   4   2   2  2.833333  2
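
The cell above moves the new column by listing column positions by hand, which has to be updated whenever a column is added. As a possible alternative, the following minimal sketch places the column by label using `DataFrame.insert`. The frame `demo` and the variable `usability_mean` are illustrative names, not part of the original notebook, and only two rows of the data above are used.

```python
import pandas as pd

# Illustrative frame: two rows copied from the evaluation data above
demo = pd.DataFrame(
    [[5, 3, 1, 3, 4, 3, 4, 4],
     [4, 5, 1, 3, 4, 4, 5, 3]],
    index=['Sphinx (rST)', 'mdBook'],
    columns=['1', '3a', '3b', '3c', '3d', '3e', '3f', '4'],
)

# Row-wise mean over the usability sub-categories 3a..3f
usability_mean = demo.loc[:, '3a':'3f'].mean(axis=1)

# Insert the new column at the position currently held by '4',
# which pushes '4' one place to the right (insert works in place)
demo.insert(demo.columns.get_loc('4'), 'mean(3)', usability_mean)

print(list(demo.columns))
```

With this approach the column order does not depend on how many sub-categories category 3 has.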

Evaluation

The evaluation categories 1, mean(3) and 4 are weighted equally, so the overall score of a tool is the unweighted mean of these three values.

In [3]:
# Create a copy of the original data frame for this evaluation
df1 = df.copy()

# Create a sub-data frame
sum_cols1 = df1[['1', 'mean(3)', '4']]

# Print the sub-data frame
sum_cols1
Out[3]:
              1   mean(3)  4
Sphinx (rST)  5  3.000000  4
Sphinx (CM)   3  3.000000  4
Asciidoctor   3  3.000000  3
mkDocs        3  3.000000  1
mdBook        4  3.666667  3
Jekyll        4  2.833333  2
In [4]:
# Calculate the row-wise mean over the columns in sum_cols1
simple_avg1 = sum_cols1.mean(axis=1)

# Add a 'score' column with the simple averages to the data frame copy
df1['score'] = simple_avg1

# Order the data frame by score
scored1 = df1.sort_values('score', ascending=False)

# Print the data frame
scored1
Out[4]:
              1  3a  3b  3c  3d  3e  3f   mean(3)  4     score
Sphinx (rST)  5   3   1   3   4   3   4  3.000000  4  4.000000
mdBook        4   5   1   3   4   4   5  3.666667  3  3.555556
Sphinx (CM)   3   5   1   3   2   3   4  3.000000  4  3.333333
Asciidoctor   3   4   1   4   3   3   3  3.000000  3  3.000000
Jekyll        4   5   1   3   4   2   2  2.833333  2  2.944444
mkDocs        3   5   1   3   4   3   2  3.000000  1  2.333333
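
As a cross-check of the equal weighting used above, the same scores can be reproduced as a weighted sum with explicit weights of 1/3 per category. This is only a sketch: the names `cats`, `weights`, `weighted` and `ranking` are illustrative and do not appear in the original notebook, and the category values are copied from the tables above.

```python
import pandas as pd

# The three category values per tool, copied from the sub-data frame above
# (mean(3) written as an exact fraction where it is not a whole number)
cats = pd.DataFrame(
    {
        '1': [5, 3, 3, 3, 4, 4],
        'mean(3)': [3.0, 3.0, 3.0, 3.0, 22 / 6, 17 / 6],
        '4': [4, 4, 3, 1, 3, 2],
    },
    index=['Sphinx (rST)', 'Sphinx (CM)', 'Asciidoctor', 'mkDocs', 'mdBook', 'Jekyll'],
)

# Illustrative explicit weights; equal weighting means 1/3 for each category
weights = pd.Series({'1': 1 / 3, 'mean(3)': 1 / 3, '4': 1 / 3})

# Multiply each column by its weight and sum per row;
# with equal weights this is the same as the row-wise mean
weighted = cats.mul(weights, axis=1).sum(axis=1)

# Rank the tools from highest to lowest score
ranking = weighted.sort_values(ascending=False)
print(ranking.round(6))
```

Writing the weights out explicitly makes the weighting assumption visible in the code.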