Open Access versions of articles in Australian HASS journals

Previously I attempted some analysis of the open access status of research articles published in Australian Historical Studies. I thought it would be interesting to try some comparisons with other Australian HASS subscription-based journals.

I've simplified the process here to make it easier to run an analysis of any journal. The steps are:

  1. Get a list of articles published in the journal by querying the CrossRef API with the journal's ISSN.
  2. Remove recurring sections such as 'Editorial' and 'Book reviews' from the list of articles.
  3. Look up the OA status of each remaining article by querying the Unpaywall API with the article's DOI.

I then do some simple analysis of the OA status, and visualise the results over time.

Understanding the OA status

The Unpaywall API returns one of five values for the OA status of an article – 'Gold', 'Hybrid', 'Green', 'Bronze', and 'Closed'. There's some more information on how these are determined on the Unpaywall site. Put simply:

  • Gold – the article is freely available, openly licensed, and published in an open access journal
  • Hybrid – the article is freely available, openly licensed, and published in a subscription journal
  • Green – a version of the article (usually the Author's Accepted Manuscript) is freely available from a public repository
  • Bronze – the article is published in a subscription journal, but is freely available from the journal's website
  • Closed – the article is behind a paywall
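
As a rough illustration of how the five categories relate to each other, the definitions above can be read as a simple decision procedure. The sketch below is a simplification based only on this list, not Unpaywall's actual classification logic; the function name `classify_oa` and its boolean parameters are hypothetical.

```python
def classify_oa(free_from_publisher, openly_licensed, oa_journal, in_repository):
    """
    Simplified reading of the OA status definitions (not Unpaywall's own algorithm).
    All four parameters are hypothetical boolean flags describing an article.
    """
    if free_from_publisher and openly_licensed:
        # Freely available and openly licensed: the type of journal decides
        return 'gold' if oa_journal else 'hybrid'
    if free_from_publisher:
        # Free to read on the journal's website (assumed here to lack an open licence)
        return 'bronze'
    if in_repository:
        # A version of the article is available from a public repository
        return 'green'
    return 'closed'


# An article in a subscription journal with only a repository copy
print(classify_oa(False, False, False, True))
```

An article that is only available through a repository is classed as 'green', while one with no free version anywhere falls through to 'closed'.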

Caveats

  • The data might not be up-to-date. In particular, I've noticed that some 'bronze' status articles are reported as 'closed'. Presumably this is because the Unpaywall database is running a bit behind changes in the publishers' websites.
  • The definition of an 'article' is not consistent. In earlier issues of some journals it seems that things like book reviews are grouped together under a single DOI, while recent issues have a DOI for each review.

Journals

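So far I've looked at the following journals (more suggestions welcome):

  • Australian Historical Studies
  • History Australia
  • Australian Journal of Politics and History
  • Journal of Australian Studies
  • Australian Archaeology
  • Archives and Manuscripts
  • Journal of the Australian Library and Information Association
  • Labour History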

Of course, this analysis is focused on subscription journals. There are also open access journals like the Public History Review where all the articles would be 'Gold'!

Results (12 January 2021)

The results are not good. Articles published in Australia's main subscription history journals are about 94% closed. This is despite the fact that Green OA policies allow authors to deposit versions of their articles in public repositories (often after an embargo period).

Journal                                                         Closed
Australian Historical Studies                                   94.6%
History Australia                                               94.9%
Australian Journal of Politics and History                      95.7%*
Journal of Australian Studies                                   94.2%
Australian Archaeology                                          83.4%
Archives and Manuscripts (2012-)                                24.8%
Journal of the Australian Library and Information Association   52.5%*
Labour History                                                  93.9%

* Problems with data noted below
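
The percentages in the table can be checked against the raw counts reported in each journal's results further down. A minimal sketch, using the counts for the first four journals; the helper name `percent_closed` is hypothetical.

```python
def percent_closed(closed, total):
    """Return the share of closed articles as a percentage, rounded to one decimal place."""
    return round(closed / total * 100, 1)


# (closed, total) counts taken from the value_counts() output for each journal
counts = {
    'Australian Historical Studies': (1235, 1305),
    'History Australia': (986, 1039),
    'Australian Journal of Politics and History': (1340, 1400),
    'Journal of Australian Studies': (1632, 1732),
}

for journal, (closed, total) in counts.items():
    print(f'{journal}: {percent_closed(closed, total)}%')
```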

This can be fixed! If you're in a university, talk to your librarians about depositing a Green OA version of your article in an institutional repository. If not, you can use the Share your paper service to upload a Green OA version to Zenodo. Your research will be easier to find, easier to access, easier to use, and available to everyone – not just those with the luxury of an institutional subscription.

Import what we need

In [34]:
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry
import requests_cache
from tqdm.auto import tqdm
import pandas as pd
import altair as alt
import collections

s = requests_cache.CachedSession()
retries = Retry(total=5, backoff_factor=1, status_forcelist=[ 502, 503, 504 ])
s.mount('https://', HTTPAdapter(max_retries=retries))
s.mount('http://', HTTPAdapter(max_retries=retries))

tqdm.pandas(desc="records")
In [35]:
# the APIs are open, but it's polite to let the APIs know who you are
email = '[email protected]'
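
Both APIs accept a contact email, though in different places: the CrossRef requests below send it in the `User-Agent` header, while the Unpaywall requests pass it as an `email` query parameter. A minimal sketch of how the two are assembled; the helper names `crossref_headers` and `unpaywall_url` are hypothetical, the address is a placeholder, and percent-encoding the DOI is an extra precaution that the functions below don't take.

```python
from urllib.parse import quote, urlencode


def crossref_headers(email):
    """Build request headers that identify the caller to CrossRef."""
    return {'User-Agent': f'Jupyter Notebook (mailto:{email})'}


def unpaywall_url(doi, email):
    """Build an Unpaywall lookup URL for a DOI."""
    # quote() leaves '/' untouched by default, so the DOI's prefix/suffix separator survives
    query = urlencode({'email': email})
    return f'https://api.unpaywall.org/v2/{quote(doi)}?{query}'


# Placeholder DOI and address, for illustration only
print(crossref_headers('me@example.org'))
print(unpaywall_url('10.1000/abc', 'me@example.org'))
```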

Define some functions to do the work

In [36]:
def get_total_results(issn):
    '''
    Get the total number of articles in CrossRef for this journal.
    '''
    response = s.get(f'https://api.crossref.org/journals/{issn}/works/', params={'rows': 0})
    data = response.json()
    try:
        total_works = data['message']['total-results']
    except KeyError:
        total_works = 0
    return total_works

def get_title(record):
    '''
    Titles are in a list – join any values
    '''
    title = record.get('title')
    if isinstance(title, list):
        title = ' – '.join(title)
    return title

def harvest_works(issn):
    '''
    Harvest basic details (DOI, title, date) of articles from the journal with the supplied ISSN from CrossRef.
    '''
    harvested = 0
    works = []
    total_results = get_total_results(issn)
    params = {
        'rows': 100,
        'offset': 0
    }
    headers = {
        'User-Agent': f'Jupyter Notebook (mailto:{email})'
    }
    with tqdm(total=total_results) as pbar:
        # Stop once every record has been requested; '<=' would make an extra,
        # empty request whenever the total is an exact multiple of the page size
        while harvested < total_results:
            params['offset'] = harvested
            response = s.get(f'https://api.crossref.org/journals/{issn}/works/', params=params, headers=headers)
            data = response.json()
            try:
                records = data['message']['items']
            except (KeyError, TypeError):
                # Unexpected response, so show it and move on to the next page
                print('UNEXPECTED RESPONSE')
                print(data)
            else:
                for record in records:
                    try:
                        works.append({'doi': record.get('DOI'), 'title': get_title(record), 'year': record['issued']['date-parts'][0][0]})
                    except KeyError:
                        print('KEYERROR')
                        print(record)
                # Only update the progress bar when a page of records was actually returned
                pbar.update(len(records))
            harvested += 100
    return works

def get_oa_status(doi):
    '''
    Get OA status of DOI from the Unpaywall API.
    '''
    response = s.get(f'https://api.unpaywall.org/v2/{doi}?email={email}')
    data = response.json()
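    # If Unpaywall has no record for the DOI the response may lack an 'oa_status' key,
    # so return None rather than raising a KeyError
    return data.get('oa_status')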

def create_scale(df):
    '''
    Set colour range to match the OA status types.
    '''
    scale = []
    colours = collections.OrderedDict()
    colours['hybrid'] = 'gold'
    colours['green'] = 'green'
    colours['bronze'] = 'brown'
    colours['closed'] = 'lightgrey'
    status_values = list(df['oa_status'].unique())
    for status, colour in colours.items():
        if status in status_values:
            scale.append(colour)
    return scale

def chart_oa_status(df, title):
    '''
    Display a stacked bar chart of the number of articles per year, coloured by OA status.
    '''
    # Adding a numeric order column makes it easy to sort by oa_status
    df['order'] = df['oa_status'].replace({val: i for i, val in enumerate(['closed', 'bronze', 'green', 'hybrid'])})
    # Get colour values
    scale = create_scale(df)
    chart = alt.Chart(df).mark_bar().encode(
        x=alt.X('year:O', title='Year'),
        y=alt.Y('count():Q', title='Number of articles', axis=alt.Axis(tickMinStep=1)),
        color=alt.Color('oa_status:N', scale=alt.Scale(range=scale), legend=alt.Legend(title='OA type'), sort=alt.EncodingSortField('order', order='descending')),
        order='order',
        tooltip=[alt.Tooltip('count():Q', title='Number of articles'), alt.Tooltip('oa_status', title='OA type')]
    ).properties(title=title)
    display(chart)
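
The paging in `harvest_works` essentially requests successive offsets until the total reported by CrossRef has been covered. As a sketch of just that arithmetic, with no network calls, the hypothetical helper `page_offsets` lists the offsets needed for a given total and page size.

```python
def page_offsets(total_results, rows=100):
    """Return the offsets needed to page through total_results records, rows at a time."""
    return list(range(0, total_results, rows))


# 1548 is the number of records harvested for Australian Historical Studies
offsets = page_offsets(1548)
print(len(offsets), offsets[-1])
```

In this sketch a total that is an exact multiple of the page size needs no extra request, and a total of zero needs none at all.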

Australian Historical Studies

In [37]:
works_ahs = harvest_works('1031-461X')

In [38]:
df_ahs = pd.DataFrame(works_ahs)
df_ahs.shape
Out[38]:
(1548, 3)
In [39]:
# Make sure there are no duplicates
df_ahs.drop_duplicates(inplace=True)
df_ahs.shape
Out[39]:
(1548, 3)
In [40]:
# Show repeated titles
df_ahs['title'].value_counts()[:25]
Out[40]:
Editorial board                                                                                                               36
Books                                                                                                                         35
Book notes                                                                                                                    30
In this issue                                                                                                                 20
Notes on Contributors                                                                                                         16
In This Issue                                                                                                                 16
Exhibitions                                                                                                                   12
Book reviews                                                                                                                  12
Book Notes                                                                                                                    10
Exhibition                                                                                                                     8
Communications                                                                                                                 7
Reviews                                                                                                                        6
Exhibition review                                                                                                              6
Editorial Board                                                                                                                6
Exhibition reviews                                                                                                             4
Introduction                                                                                                                   4
Editorial                                                                                                                      4
Book Note                                                                                                                      4
Notes on contributors                                                                                                          3
BOOKS                                                                                                                          2
Communication                                                                                                                  2
‘A study corner in the kitchen’: Australian graduate women negotiate family, nation and work in the 1950s and early 1960s1     1
The Snub: Robert Menzies and the Melbourne Club                                                                                1
Historical Thinking for History Teachers: A New Approach to Engaging Students and Developing Historical Consciousness          1
Australian Soldiers in Asia-Pacific in World War II.                                                                           1
Name: title, dtype: int64
In [41]:
# Get rid of titles that appear more than once
df_ahs_unique = df_ahs.copy().drop_duplicates(subset='title', keep=False)
df_ahs_unique.shape
Out[41]:
(1305, 3)
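
One limitation of matching on exact titles is that variants such as 'Editorial board' and 'Editorial Board' are counted separately. Both are removed here because each appears more than once, but a recurring section whose variant appears only once would slip through. A possible refinement, sketched in plain Python rather than pandas, is to compare titles after normalising case and whitespace. The helpers `normalise_title` and `drop_repeated_titles` are hypothetical and not part of this analysis, and the DOIs in the sample are placeholders.

```python
from collections import Counter


def normalise_title(title):
    """Lower-case a title and collapse runs of whitespace."""
    return ' '.join(title.lower().split())


def drop_repeated_titles(works):
    """Keep only works whose normalised title occurs exactly once."""
    counts = Counter(normalise_title(w['title']) for w in works)
    return [w for w in works if counts[normalise_title(w['title'])] == 1]


sample = [
    {'doi': 'a', 'title': 'Editorial board'},
    {'doi': 'b', 'title': 'Editorial Board'},
    {'doi': 'c', 'title': 'The Snub: Robert Menzies and the Melbourne Club'},
]
print([w['doi'] for w in drop_repeated_titles(sample)])
```

Only the article with the placeholder DOI 'c' survives, since the two editorial board entries collapse to the same normalised title.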
In [42]:
df_ahs_unique['oa_status']  = df_ahs_unique['doi'].progress_apply(get_oa_status)

Results

In [43]:
df_ahs_unique['oa_status'].value_counts()
Out[43]:
closed    1235
green       37
bronze      28
hybrid       5
Name: oa_status, dtype: int64
In [44]:
df_ahs_unique['oa_status'].value_counts(normalize=True).mul(100).round(1).astype(str) + '%'
Out[44]:
closed    94.6%
green      2.8%
bronze     2.1%
hybrid     0.4%
Name: oa_status, dtype: object
In [45]:
chart_oa_status(df_ahs_unique, title='Australian Historical Studies')

History Australia

In [46]:
works_ha = harvest_works('1449-0854')

In [47]:
df_ha = pd.DataFrame(works_ha)
df_ha.shape
Out[47]:
(1249, 3)
In [48]:
df_ha.drop_duplicates(inplace=True)
df_ha.shape
Out[48]:
(1249, 3)
In [49]:
df_ha.loc[df_ha['title'].isnull()]
Out[49]:
doi title year
107 10.2104/ha.2007.4.issue-2 None 2007
220 10.2104/ha.2006.3.issue-1 None 2006
764 10.2104/ha.2008.5.issue-2 None 2008
873 10.2104/ha.2008.5.issue-3 None 2008
1000 10.2104/ha.2006.3.issue-2 None 2006
1046 10.2104/ha.2007.4.issue-1 None 2007
In [50]:
df_ha.dropna(subset=['title'], inplace=True)
df_ha.shape
Out[50]:
(1243, 3)
In [51]:
df_ha['title'].value_counts()[:30]
Out[51]:
From the President                                                                                                                 46
From the Editors                                                                                                                   35
AHA Honour Roll                                                                                                                    15
Exhibition Reviews                                                                                                                 14
AHA Calendar of Events                                                                                                             12
Book Reviews                                                                                                                       11
From the Editor                                                                                                                    10
AHA Prize and Award Winners                                                                                                         9
Australian Historical Association (AHA)                                                                                             5
Film and Radio Reviews                                                                                                              4
AHA Code of Conduct                                                                                                                 4
AHA Affiliates/Network                                                                                                              4
From the editors                                                                                                                    3
Imprint information                                                                                                                 3
From the Guest Editors                                                                                                              3
Review Policy for History Australia                                                                                                 3
Film, Television, Radio and Theatre Reviews                                                                                         3
AHA Prizes and Awards                                                                                                               3
AboutHistory Australia                                                                                                              3
Introduction                                                                                                                        2
AHA Prizes 2008                                                                                                                     2
AHA Prizes 2009–10 in Brief                                                                                                         2
Submitting a manuscript toHistory Australia                                                                                         2
Prizes and Awards                                                                                                                   2
Submitting a manuscript to History Australia                                                                                        2
AHA Prizes 2006 and Beyond                                                                                                          2
Historical Novels Challenging the National Story                                                                                    1
Parallels on the Periphery: The Exploration of Aboriginal History by Local Historical Societies in New South Wales, 1960s-1970s     1
Review of Australianscreen and Moving History: 60 Years of Film Australia                                                           1
The spoils of opportunity: Janet Mitchell and Australian internationalism in the interwar Pacific                                   1
Name: title, dtype: int64
In [52]:
df_ha_unique = df_ha.copy().drop_duplicates(subset='title', keep=False)
df_ha_unique.shape
Out[52]:
(1039, 3)
In [53]:
df_ha_unique['oa_status']  = df_ha_unique['doi'].progress_apply(get_oa_status)

Results

In [54]:
df_ha_unique['oa_status'].value_counts()
Out[54]:
closed    986
green      27
bronze     25
hybrid      1
Name: oa_status, dtype: int64
In [55]:
df_ha_unique['oa_status'].value_counts(normalize=True).mul(100).round(1).astype(str) + '%'
Out[55]:
closed    94.9%
green      2.6%
bronze     2.4%
hybrid     0.1%
Name: oa_status, dtype: object
In [56]:
chart_oa_status(df_ha_unique, title='History Australia')

Australian Journal of Politics and History

There are clearly some problems with dates in the CrossRef data.

In [57]:
works_ajph = harvest_works('1467-8497')

In [58]:
df_ajph = pd.DataFrame(works_ajph)
df_ajph.shape
Out[58]:
(1944, 3)
In [59]:
df_ajph.drop_duplicates(inplace=True)
df_ajph.shape
Out[59]:
(1944, 3)
In [60]:
df_ajph.loc[df_ajph['title'].isnull()]
Out[60]:
doi title year
58 10.1111/ajph.2008.54.issue-4 None 2008
65 10.1111/ajph.2009.55.issue-1 None 2009
82 10.1111/ajph.2008.54.issue-3 None 2008
88 10.1111/ajph.2000.46.issue-1 None 2000
89 10.1111/ajph.2002.48.issue-1 None 2002
... ... ... ...
1819 10.1111/ajph.v66.1 None 2020
1869 10.1111/ajph.v66.4 None 2020
1870 10.1111/ajph.v65.4 None 2019
1908 10.1111/ajph.v66.2 None 2020
1928 10.1111/ajph.v66.3 None 2020

157 rows × 3 columns

In [61]:
df_ajph.dropna(subset=['title'], inplace=True)
df_ajph.shape
Out[61]:
(1787, 3)
In [62]:
df_ajph['title'].value_counts()[:40]
Out[62]:
Book Reviews                                                                               106
Book Notes                                                                                  52
QUEENSLAND                                                                                  18
TASMANIA                                                                                    17
VICTORIA                                                                                    17
SOUTH AUSTRALIA                                                                             17
Political Chronicles                                                                        16
NEW SOUTH WALES                                                                             15
WESTERN AUSTRALIA                                                                           15
BOOK REVIEWS                                                                                14
Journal Notes                                                                               11
Issues in Australian Foreign Policy                                                         10
Issue Information                                                                            8
Review Article                                                                               8
Australian Political Chronicle                                                               5
Introduction                                                                                 4
Political Chronicle: Australia and Papua New Guinea                                          4
Problems of Australian Foreign Policy                                                        4
THE COMMONWEALTH                                                                             4
Queensland                                                                                   3
Books Received                                                                               3
THE TERRITORY OF PAPUA AND NEW GUINEA                                                        3
Western Australia                                                                            3
Northern Territory                                                                           3
ERRATA                                                                                       3
Other Books Received                                                                         2
Commonwealth                                                                                 2
NORTHERN TERRITORY                                                                           2
Victoria                                                                                     2
Tasmania                                                                                     2
PAPUA NEW GUINEA                                                                             2
Rejoinder                                                                                    2
Foreword                                                                                     2
BOOK NOTES                                                                                   2
Volume Index                                                                                 2
Problems in Australian Foreign Policy, July-December 1994                                    2
South Australia                                                                              2
Reflections on the Role of the Military in Civilian Politics: the Case of Sierra Leone*      1
South Australia July to December 1997                                                        1
HITLER AND THE SPANISH CIVIL WAR. A CASE STUDY OF NAZI FOREIGN POLICY.                       1
Name: title, dtype: int64
In [63]:
df_ajph_unique = df_ajph.copy().drop_duplicates(subset='title', keep=False)
df_ajph_unique.shape
Out[63]:
(1400, 3)
In [64]:
df_ajph_unique['oa_status']  = df_ajph_unique['doi'].progress_apply(get_oa_status)

Results

In [65]:
df_ajph_unique['oa_status'].value_counts()
Out[65]:
closed    1340
bronze      36
green       22
hybrid       2
Name: oa_status, dtype: int64
In [66]:
df_ajph_unique['oa_status'].value_counts(normalize=True).mul(100).round(1).astype(str) + '%'
Out[66]:
closed    95.7%
bronze     2.6%
green      1.6%
hybrid     0.1%
Name: oa_status, dtype: object
In [67]:
chart_oa_status(df_ajph_unique, title='Australian Journal of Politics and History')

Journal of Australian Studies

In [68]:
works_jas = harvest_works('1444-3058')

In [69]:
df_jas = pd.DataFrame(works_jas)
df_jas.shape
Out[69]:
(2113, 3)
In [70]:
df_jas.drop_duplicates(inplace=True)
df_jas.shape
Out[70]:
(2113, 3)
In [71]:
df_jas.loc[df_jas['title'].isnull()]
Out[71]:
doi title year
In [72]:
df_jas.dropna(subset=['title'], inplace=True)
df_jas.shape
Out[72]:
(2113, 3)
In [73]:
df_jas['title'].value_counts()[:30]
Out[73]:
Editorial board                                                                                    49
Notes on contributors                                                                              40
Notes                                                                                              32
Contributors                                                                                       31
Reviews                                                                                            28
Notes on Contributors                                                                              28
Book reviews                                                                                       27
NOTES ON CONTRIBUTORS                                                                              19
Introduction                                                                                       16
BOOK REVIEWS                                                                                       16
Short reviews and notices                                                                          12
Editorial                                                                                          12
JAS review of books                                                                                11
John Barrett prize in Australian studies                                                           10
Erratum                                                                                             7
The John Barrett Award for Australian Studies                                                       6
Shorter notices and reviews                                                                         5
Editorial Board                                                                                     5
Acknowledgements                                                                                    5
Book Reviews                                                                                        5
Foreword                                                                                            3
The John Barrett prize                                                                              2
Book review                                                                                         2
Acknowledgments                                                                                     2
Shorter reviews and notices                                                                         2
Australian studies report                                                                           2
Short notices and reviews                                                                           2
Biographical notes on contributors                                                                  2
Health, Medicine and the Sea: Australian Voyages c.1815–1860                                        1
‘O Brave new social order’: The controversy over planning in Australia and Britain in the 1940s     1
Name: title, dtype: int64
In [74]:
df_jas_unique = df_jas.copy().drop_duplicates(subset='title', keep=False)
df_jas_unique.shape
Out[74]:
(1732, 3)
In [75]:
df_jas_unique['oa_status']  = df_jas_unique['doi'].progress_apply(get_oa_status)

Results

In [76]:
df_jas_unique['oa_status'].value_counts()
Out[76]:
closed    1632
green       71
bronze      26
hybrid       3
Name: oa_status, dtype: int64
In [77]:
df_jas_unique['oa_status'].value_counts(normalize=True).mul(100).round(1).astype(str) + '%'
Out[77]:
closed    94.2%
green      4.1%
bronze     1.5%
hybrid     0.2%
Name: oa_status, dtype: object
In [78]:
chart_oa_status(df_jas_unique, title='Journal of Australian Studies')

Australian Archaeology

In [79]:
works_aa = harvest_works('0312-2417')

In [80]:
df_aa = pd.DataFrame(works_aa)
df_aa.shape
Out[80]:
(1485, 3)
In [81]:
df_aa.drop_duplicates(inplace=True)
df_aa.shape
Out[81]:
(1485, 3)
In [82]:
df_aa.loc[df_aa['title'].isnull()]
Out[82]:
doi title year
In [83]:
df_aa.dropna(subset=['title'], inplace=True)
df_aa.shape
Out[83]:
(1485, 3)
In [84]:
df_aa['title'].value_counts()[:30]
Out[84]:
Editorial                                                                                                              57
Book Reviews                                                                                                           34
Front Matter                                                                                                           27
Thesis Abstracts                                                                                                       26
Backfill                                                                                                               23
editorial                                                                                                               8
debitage                                                                                                                7
Debitage                                                                                                                4
Forthcoming Fieldwork                                                                                                   3
Obituary                                                                                                                3
Fieldwork Calendar                                                                                                      3
Archaeologists and Aborigines                                                                                           3
Excavation Calendar                                                                                                     2
Front matter                                                                                                            2
The Aboriginal People of Tasmania, by Julia Clark                                                                       2
Obituaries                                                                                                              2
Honours Theses in Prehistory                                                                                            2
backfill                                                                                                                2
A Technological Analysis Of Stone Artefacts From Big Foot Art Site, Cania Gorge, Central Queensland                     2
Thesis Abstract                                                                                                         2
Trench Shoring For Archaeologists And The Randwick Grave Digging Course                                                 2
Useless graduates?: Why do we all think that something has gone wrong with Australian archaeological training?          1
Broadcasting, listening and the mysteries of public engagement: an investigation of the AAA online audience             1
Gendered Archaeology                                                                                                    1
Bottles For Jam? An Example Of Recycling From A Post-Contact Archaeological Site                                        1
The Patina of Nostalgia                                                                                                 1
Department of Archaeology La Trobe University                                                                           1
Colonial Archaeology in Australia                                                                                       1
Birriwilk rockshelter: A mid- to late Holocene site in ManilikarrCountry, southwest Arnhem Land, Northern Territory     1
Apology from UNSW P