Generate bitcoin addresses for Irving & Holden's 2016 clinical trial Word document

By Daniel Himmelstein (@dhimmel)

This notebook computes the bitcoin addresses for the Word document from the following study:

How blockchain-timestamped protocols could improve the trustworthiness of medical science [version 2; referees: 3 approved]
Greg Irving, John Holden
F1000Research (2016) DOI: 10.12688/f1000research.8114.2

It uses the method described by Benjamin Gregory Carlisle in a 2014 blog post titled Proof of prespecified endpoints in medical research with the bitcoin blockchain.

Warning: the Carlisle method is not the recommended approach for proof of existence using Bitcoin. This notebook is not an endorsement of the method, but rather a demonstration that the address generation in the Irving & Holden study is flawed.

Dependencies

This is a Python 3 notebook. It requires Python Bitcoin Tools, which can be installed with pip install bitcoin. This notebook was generated using bitcoin==1.1.42 from PyPI.

In [1]:
from urllib.request import urlopen
import hashlib

import bitcoin

Generate the private key

Get the sha256 hash for Dataset 1. Unformatted text file.

In [2]:
url = 'https://f1000researchdata.s3.amazonaws.com/datasets/8114/9c9f9a18-a852-40c6-953e-c75107abc714_Appendix_1_-_unformatted_text_file_.docx'
response = urlopen(url)
data = response.read()
checksum = hashlib.sha256(data)
private_key = checksum.hexdigest()
private_key
Out[2]:
'8da3088936035521f9e9b57963679d89e306a06c6aebd1167b4d198e79562326'
In [3]:
# Get the private key's format
bitcoin.get_privkey_format(private_key)
Out[3]:
'hex'

Generate the corresponding public keys

There are two common types of bitcoin public keys (compressed and uncompressed) that result in different bitcoin addresses. Neither Carlisle nor Irving & Holden report which type of public key they use, so we'll try both.

In [4]:
# Uncompressed public key
public_key = bitcoin.privkey_to_pubkey(private_key)
public_key
Out[4]:
'04a1582612a51aa8cea8e8ced2078d01141ff941e6c5d985bbae2536ce33ef5396bc7946f188aeb99d6b575b935c218ae19780ef77a535107a5272e7390e1001e4'
In [5]:
# Compressed public key
public_key_compressed = bitcoin.compress(public_key)
public_key_compressed
Out[5]:
'02a1582612a51aa8cea8e8ced2078d01141ff941e6c5d985bbae2536ce33ef5396'
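
To make the two key types concrete, here is a minimal pure-Python sketch of how a private key maps to both public key encodings on the secp256k1 curve. It is an illustration only, not the code that Python Bitcoin Tools runs; the function names (point_add, scalar_multiply, public_keys_from_private_hex) are hypothetical, and it needs Python 3.8 or later for the modular inverse via pow.

```python
# secp256k1 domain parameters (standard published constants).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
GX = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
GY = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8
G = (GX, GY)


def point_add(a, b):
    """Add two curve points. None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    x1, y1 = a
    x2, y2 = b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a point plus its negation
    if a == b:
        # Tangent slope for point doubling (curve is y^2 = x^3 + 7)
        slope = 3 * x1 * x1 * pow((2 * y1) % P, -1, P) % P
    else:
        # Chord slope for two distinct points
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_multiply(k, point):
    """Compute k * point with the double-and-add method."""
    result = None
    addend = point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def public_keys_from_private_hex(private_key_hex):
    """Return (uncompressed, compressed) public keys as hex strings."""
    k = int(private_key_hex, 16)
    if not 0 < k < N:
        raise ValueError('private key is outside the valid range')
    x, y = scalar_multiply(k, G)
    # Uncompressed: 04 prefix followed by both coordinates
    uncompressed = '04' + format(x, '064x') + format(y, '064x')
    # Compressed: x only, with a prefix that records the parity of y
    prefix = '02' if y % 2 == 0 else '03'
    compressed = prefix + format(x, '064x')
    return uncompressed, compressed


private_key = '8da3088936035521f9e9b57963679d89e306a06c6aebd1167b4d198e79562326'
uncompressed, compressed = public_keys_from_private_hex(private_key)
print(uncompressed)
print(compressed)
```

The printed values can be compared with Out[4] and Out[5]. The range check matters for the Carlisle method in principle: a sha256 digest is only usable as a private key if it falls strictly between 0 and the curve order, which holds for all but a negligible fraction of digests.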

Generate the corresponding addresses

Note that neither address matches the address reported by Irving & Holden, which was 1AHjCz2oEUTH8js4S8vViC8NKph4zCACXH.

In [6]:
# Uncompressed address
address = bitcoin.pubkey_to_address(public_key)
address
Out[6]:
'1P6cxmuSsjDqUGCsyaEzgcj7iTEPsMAjhU'
In [7]:
# Compressed address
address_compressed = bitcoin.pubkey_to_address(public_key_compressed)
address_compressed
Out[7]:
'17pJjJGJJTzVsJx9JSfbx6vp1sGkPNoDoA'
In [8]:
# Check whether the Irving & Holden address is wrong
address_irving = '1AHjCz2oEUTH8js4S8vViC8NKph4zCACXH'
if address_irving not in {address, address_compressed}:
    print('Irving & Holden have a big problem.')
Irving & Holden have a big problem.
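
The addresses in this section are Base58Check strings: a version byte (0x00 for standard mainnet addresses), a 20-byte hash of the public key, and a 4-byte checksum taken from a double sha256. The sketch below covers only that encoding layer, using the standard library. It deliberately skips the step that hashes the public key (sha256 followed by RIPEMD-160), because RIPEMD-160 is not available in every hashlib build. The function names are hypothetical and are not part of Python Bitcoin Tools.

```python
import hashlib

# Base58 alphabet used by Bitcoin (no 0, O, I, or l)
ALPHABET = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'


def double_sha256(data):
    """Apply sha256 twice, as Base58Check does for its checksum."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def base58check_encode(version, payload):
    """Encode a version byte and payload as a Base58Check string."""
    raw = bytes([version]) + payload
    raw += double_sha256(raw)[:4]
    number = int.from_bytes(raw, 'big')
    encoded = ''
    while number:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded
    # Each leading zero byte is written as a leading '1'
    n_zeros = len(raw) - len(raw.lstrip(b'\x00'))
    return '1' * n_zeros + encoded


def base58check_decode(address):
    """Return (version, payload), raising ValueError on a bad checksum."""
    number = 0
    for char in address:
        number = number * 58 + ALPHABET.index(char)
    n_zeros = len(address) - len(address.lstrip('1'))
    body = number.to_bytes((number.bit_length() + 7) // 8, 'big')
    raw = b'\x00' * n_zeros + body
    data, checksum = raw[:-4], raw[-4:]
    if double_sha256(data)[:4] != checksum:
        raise ValueError('Base58Check checksum mismatch')
    return data[0], data[1:]


# Decode the uncompressed address from Out[6] into its parts
version, key_hash = base58check_decode('1P6cxmuSsjDqUGCsyaEzgcj7iTEPsMAjhU')
print(version, key_hash.hex())
```

A valid checksum only shows that a string is a well-formed address. It says nothing about whether the address was derived from a particular document, which is why the comparison in In [8] is needed.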

Check whether either of the correct addresses has ever been used

As of March 6, 2017, neither address has been used.

In [9]:
# URLs for blockchain.info address details
# Use a separate loop variable so that `address` is not overwritten
for addr in (address, address_compressed):
    url = 'https://blockchain.info/address/{}'.format(addr)
    print(url)
https://blockchain.info/address/1P6cxmuSsjDqUGCsyaEzgcj7iTEPsMAjhU
https://blockchain.info/address/17pJjJGJJTzVsJx9JSfbx6vp1sGkPNoDoA

Alternative implementation

For an alternative implementation, you can generate the sha256 checksum via the Unix shell:

URL=https://f1000researchdata.s3.amazonaws.com/datasets/8114/9c9f9a18-a852-40c6-953e-c75107abc714_Appendix_1_-_unformatted_text_file_.docx
curl --silent "$URL" | shasum --algorithm 256

Then you can use bitaddress.org to generate the bitcoin addresses. Just go to the "Wallet Details" page and paste the sha256 hash into the "Enter Private Key" field. This approach generates the same addresses as this notebook.

Plain text hashes

Since Xorbin appears to only support hashing of pasted text rather than an uploaded file, it's likely Irving & Holden pasted the Word document contents into Xorbin. It's difficult to recreate exactly how the formatted Word document was converted to plain text. Below we compute addresses for one possible plain text representation.

In [10]:
# See carlisle.py for the source of the carlisle_method function
# that implements the address generation logic above.
from carlisle import carlisle_method
In [11]:
# This data was produced by selecting all from the Word document, copying,
# and pasting on macOS 10.12.3 using Microsoft Word for Mac 2011 Version 14.0.0.
# It's entirely possible the version below has already been corrupted by automated
# newline encoding conversions.
data = b'''\
Study Type:	Interventional 
Study Design:	Allocation: Randomized
Endpoint Classification: Safety/Efficacy Study
Intervention Model: Parallel Assignment
Masking: Open Label
Primary Purpose: Prevention
Official Title:	Cardiovascular and Metabolic Effects of Moderate Alcohol Consumption in Type 2 Diabetes

Further study details as provided by Ben-Gurion University of the Negev:

Primary Outcome Measures: 
Glycemic control [ Time Frame: 6 months ] [ Designated as safety issue: Yes ]

Secondary Outcome Measures: 
CVD status [ Time Frame: 6 months ] [ Designated as safety issue: Yes ]

'''
In [12]:
carlisle_method(data, compress=False)
Out[12]:
{'address': '1Evz8cSTcq4JtYfHHNk6tjXj9rUW88orUt',
 'compressed': False,
 'private_key': 'e32dbf3fb5525d006af1881809269cca7f749b0e6b82e505ecf690ab9c33ad60',
 'public_key': '04616f8ead7203b881f36e4e3ac7dd98ee982611b468ddeee81f94ca8c0b3564100dcca6fc76920b938806b8a560361a1957f1c2a72b62758cc97c1c70228ed220',
 'url ': 'https://blockchain.info/address/1Evz8cSTcq4JtYfHHNk6tjXj9rUW88orUt'}
In [13]:
carlisle_method(data, compress=True)
Out[13]:
{'address': '1JqGTvoGYPRHf3r8vC2nws5BApnFu7wa8V',
 'compressed': True,
 'private_key': 'e32dbf3fb5525d006af1881809269cca7f749b0e6b82e505ecf690ab9c33ad60',
 'public_key': '02616f8ead7203b881f36e4e3ac7dd98ee982611b468ddeee81f94ca8c0b356410',
 'url ': 'https://blockchain.info/address/1JqGTvoGYPRHf3r8vC2nws5BApnFu7wa8V'}
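
Because the method hashes raw bytes, a difference in line endings alone is enough to produce a different private key and therefore different addresses. The following sketch illustrates this with a short excerpt of the text above; the sha256_hex helper and the two variable names are hypothetical and exist only for this example.

```python
import hashlib


def sha256_hex(data):
    """Return the sha256 hex digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


# The same excerpt with Unix (LF) and Windows (CRLF) line endings
text_lf = b'Study Type:\tInterventional \nStudy Design:\tAllocation: Randomized\n'
text_crlf = text_lf.replace(b'\n', b'\r\n')

print('LF:  ', sha256_hex(text_lf))
print('CRLF:', sha256_hex(text_crlf))
print('Digests match:', sha256_hex(text_lf) == sha256_hex(text_crlf))
```

The same applies to trailing spaces, tabs versus spaces, and text encoding, so a plain text reconstruction has to match the original byte for byte to reproduce a reported address.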

Update for the protocol in manuscript version 3

On March 30, 2017, Irving & Holden posted version 3 of their study to F1000Research. This version contains a new "Dataset 1.Unformatted text file", which is a text document rather than a Word document. Below we find the hash and addresses for this text.

In [14]:
url = 'https://f1000researchdata.s3.amazonaws.com/datasets/8114/da88d341-eeed-4630-b120-78e9ff8a9d38_CASCADE.txt'
response = urlopen(url)
data = response.read()
data
Out[14]:
b'The CArdiovasCulAr Diabetes & Ethanol (CASCADE) Trial (CASCADE)\r\n\r\n\r\nStudy Type:\tInterventional \r\nStudy Design:\tAllocation: Randomized\r\nEndpoint Classification: Safety/Efficacy Study\r\nIntervention Model: Parallel Assignment\r\nMasking: Open Label\r\nPrimary Purpose: Prevention\r\nOfficial Title:\tCardiovascular and Metabolic Effects of Moderate Alcohol Consumption in Type 2 Diabetes\r\n\r\nFurther study details as provided by Ben-Gurion University of the Negev:\r\n\r\nPrimary Outcome Measures: \r\nGlycemic control [ Time Frame: 6 months ] [ Designated as safety issue: Yes ]\r\n\r\nSecondary Outcome Measures: \r\nCVD status [ Time Frame: 6 months ] [ Designated as safety issue: Yes ]\r\n\r\n\r\nEstimated Enrollment:\t200\r\nStudy Start Date:\tMay 2010\r\nEstimated Study Completion Date:\tMay 2012\r\nEstimated Primary Completion Date:\tMay 2012 (Final data collection date for primary outcome measure)\r\n'
In [15]:
carlisle_method(data, compress=False)
Out[15]:
{'address': '1HZ5Cw2iXcXZpBKwWncmCQYyz4Zn5Mj4qk',
 'compressed': False,
 'private_key': '9072d05a5e95d783a6d28745c19ce8c47eac93cb1bedbde1cf43a192287288f3',
 'public_key': '0441dd7bb4328026f417e41d9582214fd0ddba3e6e8649e3ad59a8db055a44746ee6357b0dacc503f75bd2a7cb25f4886e9975e468e98aceecfb3f15493b4aa2d6',
 'url ': 'https://blockchain.info/address/1HZ5Cw2iXcXZpBKwWncmCQYyz4Zn5Mj4qk'}
In [16]:
carlisle_method(data, compress=True)
Out[16]:
{'address': '1HvdUMh6BFrBdxBehMkgTVJydHHaBL1Na6',
 'compressed': True,
 'private_key': '9072d05a5e95d783a6d28745c19ce8c47eac93cb1bedbde1cf43a192287288f3',
 'public_key': '0241dd7bb4328026f417e41d9582214fd0ddba3e6e8649e3ad59a8db055a44746e',
 'url ': 'https://blockchain.info/address/1HvdUMh6BFrBdxBehMkgTVJydHHaBL1Na6'}