Last updated: February 2021
Darshan is a diagnostic tool for HPC I/O profiling. Its design makes it suitable for continuous monitoring, not just sporadic I/O performance investigations. Darshan consists of two independent components: one for generating profiling information and one for analyzing it. The source code and its release archives are available from https://xgitlab.cels.anl.gov/darshan/darshan.
The profiling information includes basic runtime information, mounted file systems, records, and modules. Records are numeric identifiers representing the storage objects for which profiling information is available. Modules represent different types of I/O operations, such as POSIX, MPI-IO, STDIO, etc. Each module has a number of counters that represent various statistical properties of the module's I/O operations.
Complete Darshan documentation is available at https://www.mcs.anl.gov/research/projects/darshan/documentation/.
PyDarshan is a Python package for Darshan profiling data analysis and visualization.
There are two ways to install PyDarshan. Those interested only in Darshan data analysis should use the standard Python method with the pip command. For PyDarshan development, install from the source code in the Darshan repository mentioned above.
First, obtain Darshan source code and install a collection of tools for parsing and summarizing Darshan output files:
$ cd darshan-util
$ ./configure --prefix=INSTALLDIR --enable-shared
$ make install
If the build succeeds, add the folder that contains the darshan-parser tool to the $PATH environment variable. Then install PyDarshan:
$ pip install darshan
If interested in PyDarshan development, the final step above is slightly different:
$ cd darshan-util/pydarshan
$ pip install -e .
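Since the instructions above ask for the darshan-parser folder to be on $PATH, it may be worth confirming that the tool is discoverable before going further. The following is a minimal sketch using only the Python standard library; `find_tool` is a hypothetical helper name, not part of PyDarshan.

```python
import shutil


def find_tool(name="darshan-parser"):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


location = find_tool()
if location is None:
    # Hint only: the tool normally lives under the --prefix given to configure.
    print("darshan-parser not found on PATH; check the darshan-util install prefix")
else:
    print(f"darshan-parser found at {location}")
```

If the tool is not found, the usual cause is that the `bin` folder under the chosen install prefix has not been added to $PATH in the current shell.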
PyDarshan documentation is for now only available from the repository in its "source" reStructuredText format. Use the standard Python method to convert it into any of the typical viewable formats.
HDF5 is a data model, library, and file format for storing and managing data. It supports an unlimited variety of datatypes, and is designed for flexible and efficient I/O and for high volume and complex data. The HDF5 Technology suite includes tools and applications for managing, manipulating, viewing, and analyzing data in the HDF5 format.
Darshan provides HDF5-specific profiling data in two modules, H5F and H5D, when the Darshan profiling tool is built with the --enable-hdf5-mod=/path/to/hdf5/install configure option.
PyDarshan is at an early stage of development, so its API may change. We point out below wherever a feature currently marked experimental is used.
Python version:
from platform import python_version
python_version()
'3.8.7'
Packages and their versions:
from pprint import pprint
import pandas as pd
import numpy as np
import darshan
for _ in (pd, darshan, np):
    print(f'{_.__name__} v{_.__version__}')
pandas v1.2.1
darshan v0.0.7
numpy v1.20.0
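Not every package exposes a `__version__` attribute, so an alternative is to ask the installation metadata instead. The sketch below uses `importlib.metadata` from the standard library (Python 3.8 and later); `installed_version` is a hypothetical helper, and the distribution names are assumed to match the names used with pip.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names are assumed to be the ones used with "pip install".
for dist in ("pandas", "darshan", "numpy"):
    print(f"{dist} v{installed_version(dist)}")
```

This approach does not import the packages themselves, which can be convenient when only recording the environment.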
We are using Darshan profiling data that includes the two HDF5-specific modules. By default, PyDarshan converts module counter data to pandas.DataFrame objects. The other option is NumPy arrays (dtype='numpy'), which we chose because we create our own pandas.DataFrame objects for the HDF5 module counters later in this notebook.
rep = darshan.DarshanReport('./example.darshan', dtype='numpy')
Skipping. Log does not contain data for mod: LUSTRE
The Python object representing the Darshan report:
rep
<darshan.report.DarshanReport at 0x11d7beb20>
Below is the information derived from the default header section:
rep.info(metadata=False)
Filename: ./example.darshan
Times: 2020-09-03 16:02:26 to 2020-09-03 16:02:38 (Duration 0:00:12)
Executeable: ./hdf5_iotest run-1.ini
Processes: 4
JobID: 22156
UID: 1000
Modules in Log: ['POSIX', 'MPI-IO', 'H5F', 'H5D', 'STDIO']
Loaded Records: {'POSIX': 1, 'MPI-IO': 1, 'H5F': 1, 'H5D': 1359, 'STDIO': 3}
Name Records: 803
Darshan/Hints: {'lib_ver': '3.2.1', 'h': 'romio_no_indep_rw=true;cb_nodes=4'}
DarshanReport: id(4789627680) (tmp)
Note the Modules in Log and Loaded Records lines in the above report. The first lists all modules present in the Darshan file, while the second lists only the modules whose data were read in, together with the number of records loaded for each.
Setting metadata=True adds more metadata to the display:
rep.info(metadata=True)
Filename: ./example.darshan
Times: 2020-09-03 16:02:26 to 2020-09-03 16:02:38 (Duration 0:00:12)
Executeable: ./hdf5_iotest run-1.ini
Processes: 4
JobID: 22156
UID: 1000
Modules in Log: ['POSIX', 'MPI-IO', 'H5F', 'H5D', 'STDIO']
Loaded Records: {'POSIX': 1, 'MPI-IO': 1, 'H5F': 1, 'H5D': 1359, 'STDIO': 3}
Name Records: 803
Darshan/Hints: {'lib_ver': '3.2.1', 'h': 'romio_no_indep_rw=true;cb_nodes=4'}
DarshanReport: id(4789627680) (tmp)
metadata['job']['uid'] = 1000
metadata['job']['start_time'] = 1599163346
metadata['job']['end_time'] = 1599163358
metadata['job']['nprocs'] = 4
metadata['job']['jobid'] = 22156
metadata['job']['metadata'] = {'lib_ver': '3.2.1', 'h': 'romio_no_indep_rw=true;cb_nodes=4'}
metadata['exe'] = ./hdf5_iotest run-1.ini
The "metadata" part is also available separately as a dictionary:
pprint(rep.metadata)
{'exe': './hdf5_iotest run-1.ini', 'job': {'end_time': 1599163358, 'jobid': 22156, 'metadata': {'h': 'romio_no_indep_rw=true;cb_nodes=4', 'lib_ver': '3.2.1'}, 'nprocs': 4, 'start_time': 1599163346, 'uid': 1000}}
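The 'h' entry in the job metadata packs several settings into one string. A minimal sketch for unpacking it follows, assuming the format is always semicolon-separated `key=value` pairs as in this log; `parse_hints` is a hypothetical helper, not a PyDarshan function.

```python
def parse_hints(hints):
    """Split a 'key=value;key=value' string into a dictionary of strings."""
    result = {}
    for item in hints.split(";"):
        if not item:
            continue  # tolerate empty segments such as a trailing ';'
        key, _, value = item.partition("=")
        result[key] = value
    return result


# String copied from the metadata shown above.
hints = parse_hints("romio_no_indep_rw=true;cb_nodes=4")
print(hints)
```

The values stay as strings; converting 'cb_nodes' to an integer or 'romio_no_indep_rw' to a boolean is left to the caller, since the source does not document the value types.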
Darshan report's file name:
rep.filename
'./example.darshan'
The start and end times of the profiling job are available as datetime.datetime objects, but without a time zone specified:
rep.start_time
datetime.datetime(2020, 9, 3, 16, 2, 26)
rep.end_time
datetime.datetime(2020, 9, 3, 16, 2, 38)
rep.start_time.tzinfo is None
True
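When an unambiguous time is needed, one option is to build time-zone-aware values from the raw `start_time` and `end_time` numbers in the job metadata. The sketch below assumes those numbers are seconds since the Unix epoch, which is consistent with the 12-second duration reported above. The UTC result differs from the naive values by four hours, which suggests (but the source does not state) that the naive values are in the local time of the machine that read the log.

```python
from datetime import datetime, timezone

# Values copied from metadata['job'] shown earlier; assumed to be Unix epoch seconds.
start_epoch = 1599163346
end_epoch = 1599163358

start_utc = datetime.fromtimestamp(start_epoch, tz=timezone.utc)
end_utc = datetime.fromtimestamp(end_epoch, tz=timezone.utc)
duration = end_utc - start_utc

print(start_utc.isoformat())  # 2020-09-03T20:02:26+00:00
print(duration)               # 0:00:12
```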
Modules and their basic information are available as a dictionary of dictionaries:
rep.modules
{'POSIX': {'len': 270, 'ver': 4, 'idx': 1, 'num_records': 1}, 'MPI-IO': {'len': 212, 'ver': 3, 'idx': 2, 'num_records': 1}, 'H5F': {'len': 50, 'ver': 3, 'idx': 3, 'num_records': 1}, 'H5D': {'len': 91756, 'ver': 1, 'idx': 4, 'num_records': 1359}, 'STDIO': {'len': 174, 'ver': 2, 'idx': 8, 'num_records': 3}}
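Because this is a plain dictionary, ordinary Python is enough to summarize it. As a small sketch, the code below ranks the modules by their number of records; the dictionary literal is copied from the output above so that the example runs without a Darshan log.

```python
# Copied from the rep.modules output above.
modules = {
    'POSIX': {'len': 270, 'ver': 4, 'idx': 1, 'num_records': 1},
    'MPI-IO': {'len': 212, 'ver': 3, 'idx': 2, 'num_records': 1},
    'H5F': {'len': 50, 'ver': 3, 'idx': 3, 'num_records': 1},
    'H5D': {'len': 91756, 'ver': 1, 'idx': 4, 'num_records': 1359},
    'STDIO': {'len': 174, 'ver': 2, 'idx': 8, 'num_records': 3},
}

# Sort module names by record count, largest first.
by_records = sorted(modules.items(),
                    key=lambda item: item[1]['num_records'],
                    reverse=True)
for name, info in by_records:
    print(f"{name:8s} {info['num_records']:5d} records")

total_records = sum(info['num_records'] for info in modules.values())
print("total:", total_records)
```

With a live report, `rep.modules` could be used in place of the literal.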
The mounted file systems and their types are available as a list of two-item tuples (<mount point>, <fs type>):
rep.mounts
[('/proc/sys/fs/binfmt_misc', 'autofs'), ('/sys/fs/cgroup/unified', 'cgroup2'), ('/sys/fs/pstore', 'pstore'), ('/mnt/scratch', 'ext4'), ('/sys/fs/bpf', 'bpf'), ('/dev/mqueue', 'mqueue'), ('/boot/efi', 'vfat'), ('/dev', 'devtmpfs'), ('/', 'ext4')]
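A common use of this list is to find out which file system a given file lives on. The sketch below picks the deepest mount point that is an ancestor of the path; `find_mount` is a hypothetical helper, and the matching rule (longest matching mount point wins) is an assumption rather than something PyDarshan provides.

```python
from pathlib import PurePosixPath

# Copied from the rep.mounts output above.
mounts = [('/proc/sys/fs/binfmt_misc', 'autofs'), ('/sys/fs/cgroup/unified', 'cgroup2'),
          ('/sys/fs/pstore', 'pstore'), ('/mnt/scratch', 'ext4'), ('/sys/fs/bpf', 'bpf'),
          ('/dev/mqueue', 'mqueue'), ('/boot/efi', 'vfat'), ('/dev', 'devtmpfs'),
          ('/', 'ext4')]


def find_mount(path, mounts):
    """Return the (mount point, fs type) tuple whose mount point is the deepest ancestor of path."""
    target = PurePosixPath(path)
    best, best_depth = None, -1
    for mount_point, fs_type in mounts:
        candidate = PurePosixPath(mount_point)
        # Compare whole path components so '/mnt/scratch' does not match '/mnt/scratchy'.
        if candidate == target or candidate in target.parents:
            depth = len(candidate.parts)
            if depth > best_depth:
                best, best_depth = (mount_point, fs_type), depth
    return best


print(find_mount('/mnt/scratch/out-1.h5', mounts))
```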
Record IDs and their full file system paths are available as a dictionary:
len(rep.name_records)
803
rep.name_records
59226111675751: '/mnt/scratch/out-1.h5:/step=0/array=490', 5075477941040549916: '/mnt/scratch/out-1.h5:/step=0/array=491', 2656080831511421580: '/mnt/scratch/out-1.h5:/step=0/array=492', 981813292813725046: '/mnt/scratch/out-1.h5:/step=0/array=493', 7088267117835563390: '/mnt/scratch/out-1.h5:/step=0/array=494', 9797097948874073156: '/mnt/scratch/out-1.h5:/step=0/array=495', 2470883244632348843: '/mnt/scratch/out-1.h5:/step=0/array=496', 5946348691341731964: '/mnt/scratch/out-1.h5:/step=0/array=497', 14802219618710625791: '/mnt/scratch/out-1.h5:/step=0/array=498', 17606145139435315027: '/mnt/scratch/out-1.h5:/step=0/array=499', 561704199900244003: '/mnt/scratch/out-1.h5:/step=1/array=0', 12109967703595006251: '/mnt/scratch/out-1.h5:/step=1/array=1', 17085615950698936687: '/mnt/scratch/out-1.h5:/step=1/array=2', 11597017616278610449: '/mnt/scratch/out-1.h5:/step=1/array=3', 462746467154257129: '/mnt/scratch/out-1.h5:/step=1/array=4', 11381211498974871650: '/mnt/scratch/out-1.h5:/step=1/array=5', 381896227973188897: '/mnt/scratch/out-1.h5:/step=1/array=6', 4181124608011994175: '/mnt/scratch/out-1.h5:/step=1/array=7', 15797554345251048702: '/mnt/scratch/out-1.h5:/step=1/array=8', 354837074424737607: '/mnt/scratch/out-1.h5:/step=1/array=9', 10422339527635718786: '/mnt/scratch/out-1.h5:/step=1/array=10', 7298714799233703336: '/mnt/scratch/out-1.h5:/step=1/array=11', 15485236581301179800: '/mnt/scratch/out-1.h5:/step=1/array=12', 1573396254114704258: '/mnt/scratch/out-1.h5:/step=1/array=13', 16098597253403614910: '/mnt/scratch/out-1.h5:/step=1/array=14', 6191641084346171120: '/mnt/scratch/out-1.h5:/step=1/array=15', 3482351709367614931: '/mnt/scratch/out-1.csv', 8492687671952229618: '/mnt/scratch/out-1.h5:/step=1/array=16', 2499097183607786734: '/mnt/scratch/out-1.h5:/step=1/array=17', 2262264678947598372: '/mnt/scratch/out-1.h5:/step=1/array=18', 14486595813084181844: '/mnt/scratch/out-1.h5:/step=1/array=19', 18367691124428588668: 
'/mnt/scratch/out-1.h5:/step=1/array=20', 16525337597565834004: '/mnt/scratch/out-1.h5:/step=1/array=21', 11486521408555970373: '/mnt/scratch/out-1.h5:/step=1/array=22', 8528817359234897307: '/mnt/scratch/out-1.h5:/step=1/array=23', 18343014211853338238: '/mnt/scratch/out-1.h5:/step=1/array=24', 8985365463734415873: '/mnt/scratch/out-1.h5:/step=1/array=25', 13552016402151065975: '/mnt/scratch/out-1.h5:/step=1/array=26', 1357004102189539208: '/mnt/scratch/out-1.h5:/step=1/array=27', 11096963204780887793: '/mnt/scratch/out-1.h5:/step=1/array=28', 11293896398886417737: '/mnt/scratch/out-1.h5:/step=1/array=29', 8313173431484861461: '/mnt/scratch/out-1.h5:/step=1/array=30', 6996726802751031583: '/mnt/scratch/out-1.h5:/step=1/array=31', 14637035027740216104: '/mnt/scratch/out-1.h5:/step=1/array=32', 782416802494760780: '/mnt/scratch/out-1.h5:/step=1/array=33', 11344265823939533452: '/mnt/scratch/out-1.h5:/step=1/array=34', 3840640611309374573: '/mnt/scratch/out-1.h5:/step=1/array=35', 3180570969674819858: '/mnt/scratch/out-1.h5:/step=1/array=36', 12888867272604818507: '/mnt/scratch/out-1.h5:/step=1/array=37', 8811484014485052125: '/mnt/scratch/out-1.h5:/step=1/array=38', 12724915768824409474: '/mnt/scratch/out-1.h5:/step=1/array=39', 5938487574049987820: '/mnt/scratch/out-1.h5:/step=1/array=40', 12309154115217485949: '/mnt/scratch/out-1.h5:/step=1/array=41', 16868223250887198785: '/mnt/scratch/out-1.h5:/step=1/array=42', 13695880838830110428: '/mnt/scratch/out-1.h5:/step=1/array=43', 4208275541788785666: '/mnt/scratch/out-1.h5:/step=1/array=44', 15726436027570255532: '/mnt/scratch/out-1.h5:/step=1/array=45', 5949796798382404939: '/mnt/scratch/out-1.h5:/step=1/array=46', 11661495366053669815: '/mnt/scratch/out-1.h5:/step=1/array=47', 9498495257352117373: '/mnt/scratch/out-1.h5:/step=1/array=48', 7815144126927863944: '/mnt/scratch/out-1.h5:/step=1/array=49', 2609568482075845697: '/mnt/scratch/out-1.h5:/step=1/array=50', 14925484947544368862: 
'/mnt/scratch/out-1.h5:/step=1/array=51', 11711039316704227550: '/mnt/scratch/out-1.h5:/step=1/array=52', 10171820264134867795: '/mnt/scratch/out-1.h5:/step=1/array=53', 266626120084254889: '/mnt/scratch/out-1.h5:/step=1/array=54', 12208899145237471808: '/mnt/scratch/out-1.h5:/step=1/array=55', 3385810819039426223: '/mnt/scratch/out-1.h5:/step=1/array=56', 3333482965068458420: '/mnt/scratch/out-1.h5:/step=1/array=57', 10027142894750181256: '/mnt/scratch/out-1.h5:/step=1/array=58', 17630578062694934470: '/mnt/scratch/out-1.h5:/step=1/array=59', 13880074753203162246: '/mnt/scratch/out-1.h5:/step=1/array=60', 2896299428069923108: '/mnt/scratch/out-1.h5:/step=1/array=61', 17955598013618121875: '/mnt/scratch/out-1.h5:/step=1/array=62', 8372167644642791076: '/mnt/scratch/out-1.h5:/step=1/array=63', 12724842676536952319: '/mnt/scratch/out-1.h5:/step=1/array=64', 12252030019844466818: '/mnt/scratch/out-1.h5:/step=1/array=65', 15220373791737783225: '/mnt/scratch/out-1.h5:/step=1/array=66', 15868723069398515280: '/mnt/scratch/out-1.h5:/step=1/array=67', 17946025449403849568: '/mnt/scratch/out-1.h5:/step=1/array=68', 16165682303639493070: '/mnt/scratch/out-1.h5:/step=1/array=69', 3601971039620088059: '/mnt/scratch/out-1.h5:/step=1/array=70', 16491614312024757241: '/mnt/scratch/out-1.h5:/step=1/array=71', 6008115188592411958: '/mnt/scratch/out-1.h5:/step=1/array=72', 1906609873437085429: '/mnt/scratch/out-1.h5:/step=1/array=73', 14799586160471801708: '/mnt/scratch/out-1.h5:/step=1/array=74', 13688163233387649113: '/mnt/scratch/out-1.h5:/step=1/array=75', 13121856434634578064: '/mnt/scratch/out-1.h5:/step=1/array=76', 1089150648672285312: '/mnt/scratch/out-1.h5:/step=1/array=77', 8987733434988144274: '/mnt/scratch/out-1.h5:/step=1/array=78', 2377239564957602663: '/mnt/scratch/out-1.h5:/step=1/array=79', 17657308812462719678: '/mnt/scratch/out-1.h5:/step=1/array=80', 2459168049857030150: '/mnt/scratch/out-1.h5:/step=1/array=81', 10752310756898053890: 
'/mnt/scratch/out-1.h5:/step=1/array=82', 9593947436425940930: '/mnt/scratch/out-1.h5:/step=1/array=83', 6557792700872530572: '/mnt/scratch/out-1.h5:/step=1/array=84', 5253505383643202909: '/mnt/scratch/out-1.h5:/step=1/array=85', 14380658721197727042: '/mnt/scratch/out-1.h5:/step=1/array=86', 14547840845451002146: '/mnt/scratch/out-1.h5:/step=1/array=87', 188841098596331698: '/mnt/scratch/out-1.h5:/step=1/array=88', 5768208784889972376: '/mnt/scratch/out-1.h5:/step=1/array=89', 8150498927495525811: '/mnt/scratch/out-1.h5:/step=1/array=90', 6578637384901469592: '/mnt/scratch/out-1.h5:/step=1/array=91', 17464250643379467332: '/mnt/scratch/out-1.h5:/step=1/array=92', 16757742259315597488: '/mnt/scratch/out-1.h5:/step=1/array=93', 4060020236450382802: '/mnt/scratch/out-1.h5:/step=1/array=94', 11785288893647270054: '/mnt/scratch/out-1.h5:/step=1/array=95', 9778000083824306255: '/mnt/scratch/out-1.h5:/step=1/array=96', 16147508975796932732: '/mnt/scratch/out-1.h5:/step=1/array=97', 4331029224511751024: '/mnt/scratch/out-1.h5:/step=1/array=98', 12492657802168721065: '/mnt/scratch/out-1.h5:/step=1/array=99', 2340543408247271029: '/mnt/scratch/out-1.h5:/step=1/array=100', 16243026693249396888: '/mnt/scratch/out-1.h5:/step=1/array=101', 11760073706901699366: '/mnt/scratch/out-1.h5:/step=1/array=102', 15268507328603955581: '/mnt/scratch/out-1.h5:/step=1/array=103', 1790376444317548074: '/mnt/scratch/out-1.h5:/step=1/array=104', 5284590180552083804: '/mnt/scratch/out-1.h5:/step=1/array=105', 5329266020236953846: '/mnt/scratch/out-1.h5:/step=1/array=106', 13398955324908961778: '/mnt/scratch/out-1.h5:/step=1/array=107', 537535326296437356: '/mnt/scratch/out-1.h5:/step=1/array=108', 3622825790868987581: '/mnt/scratch/out-1.h5:/step=1/array=109', 15640472814973728910: '/mnt/scratch/out-1.h5:/step=1/array=110', 13191585128402812568: '/mnt/scratch/out-1.h5:/step=1/array=111', 9383811978843031732: '/mnt/scratch/out-1.h5:/step=1/array=112', 14781827792227424118: 
'/mnt/scratch/out-1.h5:/step=1/array=113', 3580884761892438237: '/mnt/scratch/out-1.h5:/step=1/array=114', 4708449054934979350: '/mnt/scratch/out-1.h5:/step=1/array=115', 7229313780532420690: '/mnt/scratch/out-1.h5:/step=1/array=116', 4523877881711801366: '/mnt/scratch/out-1.h5:/step=1/array=117', 339629479976377994: '/mnt/scratch/out-1.h5:/step=1/array=118', 14380309210193521963: '/mnt/scratch/out-1.h5:/step=1/array=119', 2770164396205864027: '/mnt/scratch/out-1.h5:/step=1/array=120', 416587486720117586: '/mnt/scratch/out-1.h5:/step=1/array=121', 17576728895611870381: '/mnt/scratch/out-1.h5:/step=1/array=122', 5209118542157154034: '/mnt/scratch/out-1.h5:/step=1/array=123', 3840143255128576240: '/mnt/scratch/out-1.h5:/step=1/array=124', 8494016601954548887: '/mnt/scratch/out-1.h5:/step=1/array=125', 6252344094267030681: '/mnt/scratch/out-1.h5:/step=1/array=126', 2466922389477395590: '/mnt/scratch/out-1.h5:/step=1/array=127', 4184448917532659469: '/mnt/scratch/out-1.h5:/step=1/array=128', 17283914592905121063: '/mnt/scratch/out-1.h5:/step=1/array=129', 17457647645988245344: '/mnt/scratch/out-1.h5:/step=1/array=130', 4108251999178579314: '/mnt/scratch/out-1.h5:/step=1/array=131', 758944213053569953: '/mnt/scratch/out-1.h5:/step=1/array=132', 1759338332309225301: '/mnt/scratch/out-1.h5:/step=1/array=133', 4455906451963497041: '/mnt/scratch/out-1.h5:/step=1/array=134', 10020113759078375205: '/mnt/scratch/out-1.h5:/step=1/array=135', 5293096336862554307: '/mnt/scratch/out-1.h5:/step=1/array=136', 2792004489258821091: '/mnt/scratch/out-1.h5:/step=1/array=137', 12228124600909326748: '/mnt/scratch/out-1.h5:/step=1/array=138', 5365805258179520232: '/mnt/scratch/out-1.h5:/step=1/array=139', 6985320848945709407: '/mnt/scratch/out-1.h5:/step=1/array=140', 6932779135261729626: '/mnt/scratch/out-1.h5:/step=1/array=141', 2881371782936455662: '/mnt/scratch/out-1.h5:/step=1/array=142', 14099892794203978245: '/mnt/scratch/out-1.h5:/step=1/array=143', 14119702524200067500: 
'/mnt/scratch/out-1.h5:/step=1/array=144', 7707880820729110147: '/mnt/scratch/out-1.h5:/step=1/array=145', 11264796391684315095: '/mnt/scratch/out-1.h5:/step=1/array=146', 7044656468063715408: '/mnt/scratch/out-1.h5:/step=1/array=147', 3969301042393807139: '/mnt/scratch/out-1.h5:/step=1/array=148', 7133891392710228195: '/mnt/scratch/out-1.h5:/step=1/array=149', 1726397550907497696: '/mnt/scratch/out-1.h5:/step=1/array=150', 7845647934224172702: '/mnt/scratch/out-1.h5:/step=1/array=151', 4115309852847441160: '/mnt/scratch/out-1.h5:/step=1/array=152', 7890283024075287234: '/mnt/scratch/out-1.h5:/step=1/array=153', 15823808760815115347: '/mnt/scratch/out-1.h5:/step=1/array=154', 3027476250788859106: '/mnt/scratch/out-1.h5:/step=1/array=155', 6233504853289609296: '/mnt/scratch/out-1.h5:/step=1/array=156', 14418791009176562009: '/mnt/scratch/out-1.h5:/step=1/array=157', 16927700438821638987: '/mnt/scratch/out-1.h5:/step=1/array=158', 11950642777297831247: '/mnt/scratch/out-1.h5:/step=1/array=159', 16038497320366066421: '/mnt/scratch/out-1.h5:/step=1/array=160', 5656296207968503656: '/mnt/scratch/out-1.h5:/step=1/array=161', 3625882615105569521: '/mnt/scratch/out-1.h5:/step=1/array=162', 12647051291472033497: '/mnt/scratch/out-1.h5:/step=1/array=163', 10464229845305901757: '/mnt/scratch/out-1.h5:/step=1/array=164', 8836573323694182769: '/mnt/scratch/out-1.h5:/step=1/array=165', 5772443994698361284: '/mnt/scratch/out-1.h5:/step=1/array=166', 2383794540360303689: '/mnt/scratch/out-1.h5:/step=1/array=167', 16890790681556349777: '/mnt/scratch/out-1.h5:/step=1/array=168', 4057394635380387205: '/mnt/scratch/out-1.h5:/step=1/array=169', 8164817535178530484: '/mnt/scratch/out-1.h5:/step=1/array=170', 4745671710088766566: '/mnt/scratch/out-1.h5:/step=1/array=171', 16376736157462215156: '/mnt/scratch/out-1.h5:/step=1/array=172', 3625382156623038781: '/mnt/scratch/out-1.h5:/step=1/array=173', 6244835532012121663: '/mnt/scratch/out-1.h5:/step=1/array=174', 11028287198362022104: 
'/mnt/scratch/out-1.h5:/step=1/array=175', 7798802387717506786: '/mnt/scratch/out-1.h5:/step=1/array=176', 949978033827754391: '/mnt/scratch/out-1.h5:/step=1/array=177', 4009183876348383782: '/mnt/scratch/out-1.h5:/step=1/array=178', 8546685463415630195: '/mnt/scratch/out-1.h5:/step=1/array=179', 1458851021576757657: '/mnt/scratch/out-1.h5:/step=1/array=180', 2589279716036368063: '/mnt/scratch/out-1.h5:/step=1/array=181', 4371158008596151013: '/mnt/scratch/out-1.h5:/step=1/array=182', 4889750843583301195: '/mnt/scratch/out-1.h5:/step=1/array=183', 7799307494257798225: '/mnt/scratch/out-1.h5:/step=1/array=184', 15134267188139554977: '/mnt/scratch/out-1.h5:/step=1/array=185', 14630049580125274520: '/mnt/scratch/out-1.h5:/step=1/array=186', 7667829098065046245: '/mnt/scratch/out-1.h5:/step=1/array=187', 4372738983834902358: '/mnt/scratch/out-1.h5:/step=1/array=188', 9794561781051234097: '/mnt/scratch/out-1.h5:/step=1/array=189', 6749028482376053626: '/mnt/scratch/out-1.h5:/step=1/array=190', 5190447444741708478: '/mnt/scratch/out-1.h5:/step=1/array=191', 14368996360224851406: '/mnt/scratch/out-1.h5:/step=1/array=192', 17664398278311790902: '/mnt/scratch/out-1.h5:/step=1/array=193', 9674440713368121895: '/mnt/scratch/out-1.h5:/step=1/array=194', 10993792742483248672: '/mnt/scratch/out-1.h5:/step=1/array=195', 13081601435063105520: '/mnt/scratch/out-1.h5:/step=1/array=196', 12430848352397509439: '/mnt/scratch/out-1.h5:/step=1/array=197', 11391855561279195525: '/mnt/scratch/out-1.h5:/step=1/array=198', 1240381618261707807: '/mnt/scratch/out-1.h5:/step=1/array=199', 12363358693379175255: '/mnt/scratch/out-1.h5:/step=1/array=200', 2956569643850076632: '/mnt/scratch/out-1.h5:/step=1/array=201', 16259393615048411240: '/mnt/scratch/out-1.h5:/step=1/array=202', 8174953751763769767: '/mnt/scratch/out-1.h5:/step=1/array=203', 11863153683078094481: '/mnt/scratch/out-1.h5:/step=1/array=204', 2791742745406159755: '/mnt/scratch/out-1.h5:/step=1/array=205', 3444726039295902896: 
'/mnt/scratch/out-1.h5:/step=1/array=206', 3040820162201698614: '/mnt/scratch/out-1.h5:/step=1/array=207', 16673097992785029298: '/mnt/scratch/out-1.h5:/step=1/array=208', 9413163635742963782: '/mnt/scratch/out-1.h5:/step=1/array=209', 4012499465552658214: '/mnt/scratch/out-1.h5:/step=1/array=210', 7790254887466621171: '/mnt/scratch/out-1.h5:/step=1/array=211', 16940845500114402536: '/mnt/scratch/out-1.h5:/step=1/array=212', 5420406653431326142: '/mnt/scratch/out-1.h5:/step=1/array=213', 7123021168216731493: '/mnt/scratch/out-1.h5:/step=1/array=214', 10453144517164273873: '/mnt/scratch/out-1.h5:/step=1/array=215', 971338099479365109: '/mnt/scratch/out-1.h5:/step=1/array=216', 9224065359425143872: '/mnt/scratch/out-1.h5:/step=1/array=217', 14387921930234892980: '/mnt/scratch/out-1.h5:/step=1/array=218', 14701509124960351843: '/mnt/scratch/out-1.h5:/step=1/array=219', 16068115537479898974: '/mnt/scratch/out-1.h5:/step=1/array=220', 8855002340048769154: '/mnt/scratch/out-1.h5:/step=1/array=221', 13588490864352657253: '/mnt/scratch/out-1.h5:/step=1/array=222', 1118978315013496171: '/mnt/scratch/out-1.h5:/step=1/array=223', 1811788658729126331: '/mnt/scratch/out-1.h5:/step=1/array=224', 169739931440299243: '/mnt/scratch/out-1.h5:/step=1/array=225', 6800764775952023347: '/mnt/scratch/out-1.h5:/step=1/array=226', 6484301029912163552: '/mnt/scratch/out-1.h5:/step=1/array=227', 17513687111072836006: '/mnt/scratch/out-1.h5:/step=1/array=228', 12333080959601745640: '/mnt/scratch/out-1.h5:/step=1/array=229', 10000055674676306919: '/mnt/scratch/out-1.h5:/step=1/array=230', 11207914033545770601: '/mnt/scratch/out-1.h5:/step=1/array=231', 1511841531474710419: '/mnt/scratch/out-1.h5:/step=1/array=232', 9221725147699572432: '/mnt/scratch/out-1.h5:/step=1/array=233', 17361088067867371472: '/mnt/scratch/out-1.h5:/step=1/array=234', 7770278504457245581: '/mnt/scratch/out-1.h5:/step=1/array=235', 13228306207780695827: '/mnt/scratch/out-1.h5:/step=1/array=236', 14000305091957180577: 
'/mnt/scratch/out-1.h5:/step=1/array=237', 8422058623420515987: '/mnt/scratch/out-1.h5:/step=1/array=238', 15877134259397614362: '/mnt/scratch/out-1.h5:/step=1/array=239', 2075362978812343108: '/mnt/scratch/out-1.h5:/step=1/array=240', 6722063665390505408: '/mnt/scratch/out-1.h5:/step=1/array=241', 17454934141774426937: '/mnt/scratch/out-1.h5:/step=1/array=242', 13616661398097527790: '/mnt/scratch/out-1.h5:/step=1/array=243', 16610874974763805440: '/mnt/scratch/out-1.h5:/step=1/array=244', 8299053928509615006: '/mnt/scratch/out-1.h5:/step=1/array=245', 15453191289806336053: '/mnt/scratch/out-1.h5:/step=1/array=246', 2966931707055620987: '/mnt/scratch/out-1.h5:/step=1/array=247', 12313572122113761730: '/mnt/scratch/out-1.h5:/step=1/array=248', 14894119045303489837: '/mnt/scratch/out-1.h5:/step=1/array=249', 7031522720113835363: '/mnt/scratch/out-1.h5:/step=1/array=250', 16816546610626596993: '/mnt/scratch/out-1.h5:/step=1/array=251', 12387644909739249225: '/mnt/scratch/out-1.h5:/step=1/array=252', 6023179503171797323: '/mnt/scratch/out-1.h5:/step=1/array=253', 2102469118652088723: '/mnt/scratch/out-1.h5:/step=1/array=254', 14411085020931401567: '/mnt/scratch/out-1.h5:/step=1/array=255', 15550004437032332724: '/mnt/scratch/out-1.h5:/step=1/array=256', 11271486308255737607: '/mnt/scratch/out-1.h5:/step=1/array=257', 4223584569477104569: '/mnt/scratch/out-1.h5:/step=1/array=258', 1456818863408842272: '/mnt/scratch/out-1.h5:/step=1/array=259', 2206875509381200322: '/mnt/scratch/out-1.h5:/step=1/array=260', 18350719498899752920: '/mnt/scratch/out-1.h5:/step=1/array=261', 12137342194959676710: '/mnt/scratch/out-1.h5:/step=1/array=262', 8635509251551616132: '/mnt/scratch/out-1.h5:/step=1/array=263', 7363533760576044577: '/mnt/scratch/out-1.h5:/step=1/array=264', 11818763711670039604: '/mnt/scratch/out-1.h5:/step=1/array=265', 8507330079988830259: '/mnt/scratch/out-1.h5:/step=1/array=266', 10603285669663188614: '/mnt/scratch/out-1.h5:/step=1/array=267', 
16707904568593317053: '/mnt/scratch/out-1.h5:/step=1/array=268', 228462414771196728: '/mnt/scratch/out-1.h5:/step=1/array=269', 4682796405398838664: '/mnt/scratch/out-1.h5:/step=1/array=270', 16834851164073663638: '/mnt/scratch/out-1.h5:/step=1/array=271', 17678721908080518110: '/mnt/scratch/out-1.h5:/step=1/array=272', 15575225497957393486: '/mnt/scratch/out-1.h5:/step=1/array=273', 6090734001272452052: '/mnt/scratch/out-1.h5:/step=1/array=274', 12654044241780822548: '/mnt/scratch/out-1.h5:/step=1/array=275', 2539334396108649334: '/mnt/scratch/out-1.h5:/step=1/array=276', 7380736162144973446: '/mnt/scratch/out-1.h5:/step=1/array=277', 198640175704748004: '/mnt/scratch/out-1.h5:/step=1/array=278', 2818297949462573753: '/mnt/scratch/out-1.h5:/step=1/array=279', 7438996831317166754: '/mnt/scratch/out-1.h5:/step=1/array=280', 16778590793803016004: '/mnt/scratch/out-1.h5:/step=1/array=281', 14352401806309419301: '/mnt/scratch/out-1.h5:/step=1/array=282', 1210512577728056775: '/mnt/scratch/out-1.h5:/step=1/array=283', 2008725607192955236: '/mnt/scratch/out-1.h5:/step=1/array=284', 17329728312354990775: '/mnt/scratch/out-1.h5:/step=1/array=285', 16376123998073351106: '/mnt/scratch/out-1.h5:/step=1/array=286', 4140211037130493944: '/mnt/scratch/out-1.h5:/step=1/array=287', 14919157349669063705: '/mnt/scratch/out-1.h5:/step=1/array=288', 16368330118496349649: '/mnt/scratch/out-1.h5:/step=1/array=289', 9242312965968887304: '/mnt/scratch/out-1.h5:/step=1/array=290', 534956015317185232: '/mnt/scratch/out-1.h5:/step=1/array=291', 3607870649597862489: '/mnt/scratch/out-1.h5:/step=1/array=292', 1365004249758095482: '/mnt/scratch/out-1.h5:/step=1/array=293', 1355275519819362448: '/mnt/scratch/out-1.h5:/step=1/array=294', 1310890006768924292: '/mnt/scratch/out-1.h5:/step=1/array=295', 4213945677347829095: '/mnt/scratch/out-1.h5:/step=1/array=296'}
HDF5 records also include dataset path names:
rep.name_records[4213945677347829095]
'/mnt/scratch/out-1.h5:/step=1/array=296'
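Such names can also be split programmatically. The sketch below assumes the first colon marks the boundary between the file path and the dataset path, which holds for the names in this log but would fail for file paths that themselves contain a colon. `split_record_name` is a hypothetical helper, not part of PyDarshan.

```python
def split_record_name(name):
    """Split 'file:dataset' into (file_path, dataset_path); dataset_path is None for plain files."""
    file_path, _, dataset_path = name.partition(":")
    return file_path, dataset_path or None


print(split_record_name('/mnt/scratch/out-1.h5:/step=1/array=296'))
print(split_record_name('/mnt/scratch/out-1.csv'))
```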
An HDF5 dataset record name has two parts separated by a colon: the file path (/mnt/scratch/out-1.h5) and the dataset path within that file (/step=1/array=296).
All the counters present in the Darshan file are available as a dictionary from the .counters property. They are organized by module:
rep.counters.keys()
dict_keys(['POSIX', 'MPI-IO', 'H5F', 'H5D', 'STDIO'])
The counters for each module are further separated into integer and floating-point groups:
rep.counters['H5F'].keys()
dict_keys(['counters', 'fcounters'])
For example, integer counters for the H5F module are:
rep.counters['H5F']['counters']
['H5F_OPENS', 'H5F_FLUSHES', 'H5F_USE_MPIIO']
and H5F's floating-point counters are:
rep.counters['H5F']['fcounters']
['H5F_F_OPEN_START_TIMESTAMP', 'H5F_F_CLOSE_START_TIMESTAMP', 'H5F_F_OPEN_END_TIMESTAMP', 'H5F_F_CLOSE_END_TIMESTAMP', 'H5F_F_META_TIME']
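These name lists carry no values themselves; the values live in each record, in the same order. A minimal sketch of pairing the two follows. The `record` dictionary is a stand-in for a PyDarshan record with 'counters' and 'fcounters' entries, and its values are copied from the H5F table that appears later in this document.

```python
# Counter names copied from the rep.counters['H5F'] output above.
counter_names = ['H5F_OPENS', 'H5F_FLUSHES', 'H5F_USE_MPIIO']
fcounter_names = ['H5F_F_OPEN_START_TIMESTAMP', 'H5F_F_CLOSE_START_TIMESTAMP',
                  'H5F_F_OPEN_END_TIMESTAMP', 'H5F_F_CLOSE_END_TIMESTAMP',
                  'H5F_F_META_TIME']

# Stand-in for one record; values taken from the H5F table later in this document.
record = {
    'counters': [9, 0, 1],
    'fcounters': [0.001401, 9.218065, 11.637774, 11.637802, 0.315327],
}

# Pair each name with the value at the same position.
named = dict(zip(counter_names, record['counters']))
named.update(zip(fcounter_names, record['fcounters']))

print(named['H5F_OPENS'], named['H5F_F_META_TIME'])
```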
H5D module counters are:
pprint(rep.counters['H5D'])
{'counters': ['H5D_OPENS', 'H5D_READS', 'H5D_WRITES', 'H5D_FLUSHES', 'H5D_BYTES_READ', 'H5D_BYTES_WRITTEN', 'H5D_RW_SWITCHES', 'H5D_REGULAR_HYPERSLAB_SELECTS', 'H5D_IRREGULAR_HYPERSLAB_SELECTS', 'H5D_POINT_SELECTS', 'H5D_MAX_READ_TIME_SIZE', 'H5D_MAX_WRITE_TIME_SIZE', 'H5D_SIZE_READ_AGG_0_100', 'H5D_SIZE_READ_AGG_100_1K', 'H5D_SIZE_READ_AGG_1K_10K', 'H5D_SIZE_READ_AGG_10K_100K', 'H5D_SIZE_READ_AGG_100K_1M', 'H5D_SIZE_READ_AGG_1M_4M', 'H5D_SIZE_READ_AGG_4M_10M', 'H5D_SIZE_READ_AGG_10M_100M', 'H5D_SIZE_READ_AGG_100M_1G', 'H5D_SIZE_READ_AGG_1G_PLUS', 'H5D_SIZE_WRITE_AGG_0_100', 'H5D_SIZE_WRITE_AGG_100_1K', 'H5D_SIZE_WRITE_AGG_1K_10K', 'H5D_SIZE_WRITE_AGG_10K_100K', 'H5D_SIZE_WRITE_AGG_100K_1M', 'H5D_SIZE_WRITE_AGG_1M_4M', 'H5D_SIZE_WRITE_AGG_4M_10M', 'H5D_SIZE_WRITE_AGG_10M_100M', 'H5D_SIZE_WRITE_AGG_100M_1G', 'H5D_SIZE_WRITE_AGG_1G_PLUS', 'H5D_ACCESS1_ACCESS', 'H5D_ACCESS1_LENGTH_D1', 'H5D_ACCESS1_LENGTH_D2', 'H5D_ACCESS1_LENGTH_D3', 'H5D_ACCESS1_LENGTH_D4', 'H5D_ACCESS1_LENGTH_D5', 'H5D_ACCESS1_STRIDE_D1', 'H5D_ACCESS1_STRIDE_D2', 'H5D_ACCESS1_STRIDE_D3', 'H5D_ACCESS1_STRIDE_D4', 'H5D_ACCESS1_STRIDE_D5', 'H5D_ACCESS2_ACCESS', 'H5D_ACCESS2_LENGTH_D1', 'H5D_ACCESS2_LENGTH_D2', 'H5D_ACCESS2_LENGTH_D3', 'H5D_ACCESS2_LENGTH_D4', 'H5D_ACCESS2_LENGTH_D5', 'H5D_ACCESS2_STRIDE_D1', 'H5D_ACCESS2_STRIDE_D2', 'H5D_ACCESS2_STRIDE_D3', 'H5D_ACCESS2_STRIDE_D4', 'H5D_ACCESS2_STRIDE_D5', 'H5D_ACCESS3_ACCESS', 'H5D_ACCESS3_LENGTH_D1', 'H5D_ACCESS3_LENGTH_D2', 'H5D_ACCESS3_LENGTH_D3', 'H5D_ACCESS3_LENGTH_D4', 'H5D_ACCESS3_LENGTH_D5', 'H5D_ACCESS3_STRIDE_D1', 'H5D_ACCESS3_STRIDE_D2', 'H5D_ACCESS3_STRIDE_D3', 'H5D_ACCESS3_STRIDE_D4', 'H5D_ACCESS3_STRIDE_D5', 'H5D_ACCESS4_ACCESS', 'H5D_ACCESS4_LENGTH_D1', 'H5D_ACCESS4_LENGTH_D2', 'H5D_ACCESS4_LENGTH_D3', 'H5D_ACCESS4_LENGTH_D4', 'H5D_ACCESS4_LENGTH_D5', 'H5D_ACCESS4_STRIDE_D1', 'H5D_ACCESS4_STRIDE_D2', 'H5D_ACCESS4_STRIDE_D3', 'H5D_ACCESS4_STRIDE_D4', 'H5D_ACCESS4_STRIDE_D5', 'H5D_ACCESS1_COUNT', 'H5D_ACCESS2_COUNT', 'H5D_ACCESS3_COUNT', 
'H5D_ACCESS4_COUNT', 'H5D_DATASPACE_NDIMS', 'H5D_DATASPACE_NPOINTS', 'H5D_DATATYPE_SIZE', 'H5D_CHUNK_SIZE_D1', 'H5D_CHUNK_SIZE_D2', 'H5D_CHUNK_SIZE_D3', 'H5D_CHUNK_SIZE_D4', 'H5D_CHUNK_SIZE_D5', 'H5D_USE_MPIIO_COLLECTIVE', 'H5D_USE_DEPRECATED', 'H5D_FASTEST_RANK', 'H5D_FASTEST_RANK_BYTES', 'H5D_SLOWEST_RANK', 'H5D_SLOWEST_RANK_BYTES'], 'fcounters': ['H5D_F_OPEN_START_TIMESTAMP', 'H5D_F_READ_START_TIMESTAMP', 'H5D_F_WRITE_START_TIMESTAMP', 'H5D_F_CLOSE_START_TIMESTAMP', 'H5D_F_OPEN_END_TIMESTAMP', 'H5D_F_READ_END_TIMESTAMP', 'H5D_F_WRITE_END_TIMESTAMP', 'H5D_F_CLOSE_END_TIMESTAMP', 'H5D_F_READ_TIME', 'H5D_F_WRITE_TIME', 'H5D_F_META_TIME', 'H5D_F_MAX_READ_TIME', 'H5D_F_MAX_WRITE_TIME', 'H5D_F_FASTEST_RANK_TIME', 'H5D_F_SLOWEST_RANK_TIME', 'H5D_F_VARIANCE_RANK_TIME', 'H5D_F_VARIANCE_RANK_BYTES']}
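With this many counters, selecting a family by its name prefix is a practical way to narrow things down. The sketch below extracts the write access-size histogram bins from a short excerpt of the list above; `bins_for` is a hypothetical helper, and with a live report the full `rep.counters['H5D']['counters']` list could be passed instead.

```python
# Short excerpt of the H5D integer counter names listed above.
counter_names = [
    'H5D_OPENS', 'H5D_WRITES',
    'H5D_SIZE_READ_AGG_0_100',
    'H5D_SIZE_WRITE_AGG_0_100', 'H5D_SIZE_WRITE_AGG_100_1K',
    'H5D_SIZE_WRITE_AGG_1K_10K',
    'H5D_DATATYPE_SIZE',
]


def bins_for(names, prefix):
    """Return the suffixes of all names that start with prefix, in list order."""
    return [name[len(prefix):] for name in names if name.startswith(prefix)]


write_bins = bins_for(counter_names, 'H5D_SIZE_WRITE_AGG_')
print(write_bins)
```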
Plotting in PyDarshan is currently marked as an experimental feature. It must be enabled first; the plotting functions can then be imported from their experimental module:
darshan.enable_experimental()
from darshan.experimental.plots.matplotlib import plot_access_histogram, plot_opcounts
Below are the kinds of plots that do work for our Darshan file with HDF5-specific profiling information. It is important to note that the plots presented here may or may not work for every Darshan report as they are dependent on what profiling data are available in the actual file.
plot_access_histogram(rep, 'POSIX')
Summarizing... iohist POSIX
<module 'matplotlib.pyplot' from '/Users/ajelenak/.pyenv/darshan/lib/python3.8/site-packages/matplotlib/pyplot.py'>
plot_access_histogram(rep, 'MPI-IO')
Summarizing... iohist MPI-IO
<module 'matplotlib.pyplot' from '/Users/ajelenak/.pyenv/darshan/lib/python3.8/site-packages/matplotlib/pyplot.py'>
plot_opcounts(rep)
Summarizing... agg_ioops
Read,Write,Open,Stat,Seek,Mmap,Fsync,Layer
4049960,4012490,10,2,6,0,0,POSIX
4049956,4012490,0,0,0,0,0,MPIIND
0,0,8,0,0,0,0,MPICOL
31,14,3,0,0,0,0,STDIO
<module 'matplotlib.pyplot' from '/Users/ajelenak/.pyenv/darshan/lib/python3.8/site-packages/matplotlib/pyplot.py'>
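The operation counts printed by plot_opcounts are comma-separated, so they can be parsed if the numbers are needed outside the plot. The sketch below uses the standard csv module on text copied from the output above; it assumes each layer is printed on its own line.

```python
import csv
import io

# Copied from the plot_opcounts output above (one layer per line is assumed).
text = """Read,Write,Open,Stat,Seek,Mmap,Fsync,Layer
4049960,4012490,10,2,6,0,0,POSIX
4049956,4012490,0,0,0,0,0,MPIIND
0,0,8,0,0,0,0,MPICOL
31,14,3,0,0,0,0,STDIO
"""

# Build {layer: {operation: count}} with integer counts.
opcounts = {}
for row in csv.DictReader(io.StringIO(text)):
    layer = row.pop('Layer')
    opcounts[layer] = {op: int(count) for op, count in row.items()}

print(opcounts['POSIX']['Read'], opcounts['MPICOL']['Open'])
```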
PyDarshan currently does not support HDF5-specific modules, so below is a function that reformats Darshan non-DXT module data into a pandas.DataFrame.
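def module2df(rep: darshan.DarshanReport, module: str) -> pd.DataFrame:
    """Convert the records of one Darshan module to a long-format pandas DataFrame."""
    frames = []
    for e in rep.records[module]:
        # Pair the integer counter names with this record's integer values.
        df1 = pd.DataFrame({'counter': rep.counters[module]['counters'],
                            'value': e['counters']})
        # Do the same for the floating-point counters.
        df2 = pd.DataFrame({'counter': rep.counters[module]['fcounters'],
                            'value': e['fcounters']})
        df1 = pd.concat([df1, df2], axis=0, ignore_index=True)
        # Tag every row with the rank and the resolved record name.
        df1 = df1.assign(rank=e['rank'], record=rep.name_records[e['id']])
        frames.append(df1)
    # Concatenate once at the end; a module without records gives an empty DataFrame.
    return pd.concat(frames, axis=0, ignore_index=True) if frames else pd.DataFrame()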
Converting the H5F module data to a pandas.DataFrame gives:
h5f = module2df(rep, 'H5F')
h5f.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8 entries, 0 to 7
Data columns (total 4 columns):
 #   Column   Non-Null Count  Dtype
---  ------   --------------  -----
 0   counter  8 non-null      object
 1   value    8 non-null      float64
 2   rank     8 non-null      int64
 3   record   8 non-null      object
dtypes: float64(1), int64(1), object(2)
memory usage: 384.0+ bytes
h5f
| | counter | value | rank | record |
|---|---|---|---|---|
| 0 | H5F_OPENS | 9.000000 | -1 | /mnt/scratch/out-1.h5 |
| 1 | H5F_FLUSHES | 0.000000 | -1 | /mnt/scratch/out-1.h5 |
| 2 | H5F_USE_MPIIO | 1.000000 | -1 | /mnt/scratch/out-1.h5 |
| 3 | H5F_F_OPEN_START_TIMESTAMP | 0.001401 | -1 | /mnt/scratch/out-1.h5 |
| 4 | H5F_F_CLOSE_START_TIMESTAMP | 9.218065 | -1 | /mnt/scratch/out-1.h5 |
| 5 | H5F_F_OPEN_END_TIMESTAMP | 11.637774 | -1 | /mnt/scratch/out-1.h5 |
| 6 | H5F_F_CLOSE_END_TIMESTAMP | 11.637802 | -1 | /mnt/scratch/out-1.h5 |
| 7 | H5F_F_META_TIME | 0.315327 | -1 | /mnt/scratch/out-1.h5 |
There is not much data above since only one HDF5 file was involved. Still, it is possible to illustrate how to extract useful information in case (many) more HDF5 files were present.
For example, to list all HDF5 files:
h5f['record'].unique()
array(['/mnt/scratch/out-1.h5'], dtype=object)
Grouping the H5F module data by counter makes it possible to compare any single counter across all files:
h5f_grp = h5f.groupby('counter')
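The largest number of HDF5 file open operations. Note that max() is applied to each column separately, so with more than one file the record shown would be the lexicographically largest file name, not necessarily the file with the most opens: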
h5f_grp.get_group('H5F_OPENS').max()
counter                H5F_OPENS
value                        9.0
rank                          -1
record     /mnt/scratch/out-1.h5
dtype: object
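With a single file the distinction does not matter, but it would once several files are present. The sketch below contrasts the column-wise maximum with a row-wise lookup. It uses a small invented DataFrame in the same long format; the file names and values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical long-format data shaped like the output of module2df();
# the file names and values are invented for illustration.
h5f_demo = pd.DataFrame({
    'counter': ['H5F_OPENS'] * 3,
    'value': [9.0, 12.0, 3.0],
    'rank': [-1, -1, -1],
    'record': ['/tmp/z.h5', '/tmp/a.h5', '/tmp/m.h5'],
})

opens = h5f_demo[h5f_demo['counter'] == 'H5F_OPENS']

# Column-wise max: 'value' and 'record' may come from different rows.
colwise = opens.max()

# Row-wise lookup: the complete row that holds the largest 'value'.
busiest = opens.loc[opens['value'].idxmax()]

print(colwise['record'])   # /tmp/z.h5 (largest string, not the busiest file)
print(busiest['record'])   # /tmp/a.h5
```

The nlargest() call used next for H5F_F_META_TIME also keeps whole rows together, so it is a reasonable alternative when more than one top entry is wanted.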
Top 5 HDF5 files with the longest cumulative times spent in open, close, or flush operations:
h5f_grp.get_group('H5F_F_META_TIME').nlargest(5, 'value', keep='all')
| | counter | value | rank | record |
|---|---|---|---|---|
| 7 | H5F_F_META_TIME | 0.315327 | -1 | /mnt/scratch/out-1.h5 |
Reformatting the H5D module data to a pandas.DataFrame yields:
h5d = module2df(rep, 'H5D')
h5d.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150849 entries, 0 to 150848
Data columns (total 4 columns):
 #   Column   Non-Null Count   Dtype
---  ------   --------------   -----
 0   counter  150849 non-null  object
 1   value    150849 non-null  float64
 2   rank     150849 non-null  int64
 3   record   150849 non-null  object
dtypes: float64(1), int64(1), object(2)
memory usage: 4.6+ MB
This DataFrame has many more rows, so let's show just the first 10:
h5d.head(10)
| | counter | value | rank | record |
|---|---|---|---|---|
| 0 | H5D_OPENS | 8.0 | -1 | /mnt/scratch/out-1.h5:/step=0/array=409 |
| 1 | H5D_READS | 4.0 | -1 | /mnt/scratch/out-1.h5:/step=0/array=409 |
| 2 | H5D_WRITES | 4.0 | -1 | /mnt/scratch/out-1.h5:/step=0/array=409 |
| 3 | H5D_FLUSHES | 0.0 | -1 | /mnt/scratch/out-1.h5:/step=0/array=409 |
| 4 | H5D_BYTES_READ | 640000.0 | -1 | /mnt/scratch/out-1.h5:/step=0/array=409 |
| 5 | H5D_BYTES_WRITTEN | 640000.0 | -1 | /mnt/scratch/out-1.h5:/step=0/array=409 |
| 6 | H5D_RW_SWITCHES | 4.0 | -1 | /mnt/scratch/out-1.h5:/step=0/array=409 |
| 7 | H5D_REGULAR_HYPERSLAB_SELECTS | 8.0 | -1 | /mnt/scratch/out-1.h5:/step=0/array=409 |
| 8 | H5D_IRREGULAR_HYPERSLAB_SELECTS | 0.0 | -1 | /mnt/scratch/out-1.h5:/step=0/array=409 |
| 9 | H5D_POINT_SELECTS | 0.0 | -1 | /mnt/scratch/out-1.h5:/step=0/array=409 |
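The long format above stores one counter per row. When it is more convenient to see all counters of a record on a single line, the data can be pivoted to a wide format. The following is a minimal sketch on invented rows (the record names and values are assumptions, not taken from the log); passing a list as `index` to `DataFrame.pivot` needs pandas 1.1 or newer.

```python
import pandas as pd

# Hypothetical long-format rows shaped like module2df() output.
long_df = pd.DataFrame({
    'counter': ['H5D_OPENS', 'H5D_READS', 'H5D_OPENS', 'H5D_READS'],
    'value': [8.0, 4.0, 2.0, 1.0],
    'rank': [-1, -1, 3, 3],
    'record': ['out.h5:/a', 'out.h5:/a', 'out.h5:/b', 'out.h5:/b'],
})

# One row per (record, rank) pair, one column per counter.
wide_df = long_df.pivot(index=['record', 'rank'], columns='counter',
                        values='value')
print(wide_df)
```

The pivot assumes that each (record, rank, counter) combination occurs only once; duplicates would make `pivot` raise an error.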
The number of records in this module is:
h5d['record'].unique().size
797
H5D module records are HDF5 datasets. Let's show the last 5:
h5d['record'].unique()[-5:]
array(['/mnt/scratch/out-1.h5:/step=1/array=68',
       '/mnt/scratch/out-1.h5:/step=1/array=62',
       '/mnt/scratch/out-1.h5:/step=1/array=24',
       '/mnt/scratch/out-1.h5:/step=1/array=261',
       '/mnt/scratch/out-1.h5:/step=1/array=20'], dtype=object)
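As the names above show, an H5D record combines the HDF5 file path and the dataset path, separated by a colon. A possible way to separate the two is sketched below. It assumes the file path itself contains no colon, and the column names `file` and `dataset` are chosen here for illustration.

```python
import pandas as pd

# Two record names copied from the outputs shown in this document.
records = pd.Series([
    '/mnt/scratch/out-1.h5:/step=1/array=68',
    '/mnt/scratch/out-1.h5:/step=0/array=409',
], name='record')

# Split at the first colon only: file path on the left,
# dataset path inside the file on the right.
parts = records.str.split(':', n=1, expand=True)
parts.columns = ['file', 'dataset']
print(parts)
```

With the two parts in separate columns, the H5D data can be grouped per file or per dataset path independently.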
Grouping on counters:
h5d_grp = h5d.groupby('counter')
With this grouping, we can start looking at I/O statistics across all HDF5 datasets or broken down per dataset. For example, descriptive statistics for the size (in bytes) of the slowest H5D read operation in each record:
h5d_grp.get_group('H5D_MAX_READ_TIME_SIZE')['value'].describe()
count      1359.0
mean     160000.0
std           0.0
min      160000.0
25%      160000.0
50%      160000.0
75%      160000.0
max      160000.0
Name: value, dtype: float64
Descriptive stats about the total number of bytes written to each HDF5 dataset:
h5d_grp.get_group('H5D_BYTES_WRITTEN')['value'].describe()
count      1359.000000
mean     342251.655629
std      233034.521727
min      160000.000000
25%      160000.000000
50%      160000.000000
75%      640000.000000
max      640000.000000
Name: value, dtype: float64
The 10 smallest cumulative dataset read times:
h5d_grp.get_group('H5D_F_READ_TIME').nsmallest(10, 'value')
| | counter | value | rank | record |
|---|---|---|---|---|
| 150285 | H5D_F_READ_TIME | 0.000179 | 3 | /mnt/scratch/out-1.h5:/step=1/array=272 |
| 120648 | H5D_F_READ_TIME | 0.000179 | 3 | /mnt/scratch/out-1.h5:/step=1/array=108 |
| 141072 | H5D_F_READ_TIME | 0.000179 | 3 | /mnt/scratch/out-1.h5:/step=1/array=252 |
| 130083 | H5D_F_READ_TIME | 0.000179 | 3 | /mnt/scratch/out-1.h5:/step=1/array=213 |
| 147843 | H5D_F_READ_TIME | 0.000179 | 3 | /mnt/scratch/out-1.h5:/step=1/array=244 |
| 143514 | H5D_F_READ_TIME | 0.000179 | 3 | /mnt/scratch/out-1.h5:/step=1/array=282 |
| 120870 | H5D_F_READ_TIME | 0.000180 | 3 | /mnt/scratch/out-1.h5:/step=1/array=33 |
| 134079 | H5D_F_READ_TIME | 0.000180 | 3 | /mnt/scratch/out-1.h5:/step=1/array=151 |
| 127752 | H5D_F_READ_TIME | 0.000180 | 3 | /mnt/scratch/out-1.h5:/step=1/array=287 |
| 132303 | H5D_F_READ_TIME | 0.000180 | 3 | /mnt/scratch/out-1.h5:/step=1/array=31 |
Total number of bytes read from and written to each record (HDF5 dataset):
h5d_grp.get_group('H5D_BYTES_READ').groupby('record')['value'].sum()
record
/mnt/scratch/out-1.h5:/step=0/array=0      640000.0
/mnt/scratch/out-1.h5:/step=0/array=1      640000.0
/mnt/scratch/out-1.h5:/step=0/array=10     640000.0
/mnt/scratch/out-1.h5:/step=0/array=100    640000.0
/mnt/scratch/out-1.h5:/step=0/array=101    640000.0
                                             ...
/mnt/scratch/out-1.h5:/step=1/array=95     480000.0
/mnt/scratch/out-1.h5:/step=1/array=96     480000.0
/mnt/scratch/out-1.h5:/step=1/array=97     480000.0
/mnt/scratch/out-1.h5:/step=1/array=98     480000.0
/mnt/scratch/out-1.h5:/step=1/array=99     480000.0
Name: value, Length: 797, dtype: float64
h5d_grp.get_group('H5D_BYTES_WRITTEN').groupby('record')['value'].sum()
record
/mnt/scratch/out-1.h5:/step=0/array=0      640000.0
/mnt/scratch/out-1.h5:/step=0/array=1      640000.0
/mnt/scratch/out-1.h5:/step=0/array=10     640000.0
/mnt/scratch/out-1.h5:/step=0/array=100    640000.0
/mnt/scratch/out-1.h5:/step=0/array=101    640000.0
                                             ...
/mnt/scratch/out-1.h5:/step=1/array=95     480000.0
/mnt/scratch/out-1.h5:/step=1/array=96     480000.0
/mnt/scratch/out-1.h5:/step=1/array=97     480000.0
/mnt/scratch/out-1.h5:/step=1/array=98     480000.0
/mnt/scratch/out-1.h5:/step=1/array=99     480000.0
Name: value, Length: 797, dtype: float64
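The read and write totals per record can also be placed side by side in one table by grouping on both the record and the counter. The sketch below uses invented rows in the same long format; the record names, values, and the derived column name `H5D_BYTES_TOTAL` are assumptions for illustration and are not Darshan counters.

```python
import pandas as pd

# Hypothetical long-format rows shaped like module2df() output.
demo = pd.DataFrame({
    'counter': ['H5D_BYTES_READ', 'H5D_BYTES_WRITTEN'] * 3,
    'value': [100.0, 400.0, 50.0, 200.0, 25.0, 0.0],
    'rank': [0, 0, 1, 1, -1, -1],
    'record': ['out.h5:/a'] * 4 + ['out.h5:/b'] * 2,
})

wanted = ['H5D_BYTES_READ', 'H5D_BYTES_WRITTEN']

# Sum over ranks for each (record, counter) pair, then move the
# counters into columns so each record occupies one row.
totals = (demo[demo['counter'].isin(wanted)]
          .groupby(['record', 'counter'])['value']
          .sum()
          .unstack('counter'))

# Derived column (not a Darshan counter): bytes moved in either direction.
totals['H5D_BYTES_TOTAL'] = totals[wanted].sum(axis=1)
print(totals)
```

Sorting this table by the derived column is then one way to rank datasets by overall traffic.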
PyDarshan currently does not support plotting HDF5 module data. Below are two example functions modeled on PyDarshan's current plotting capabilities. They show that H5D module data held in a pandas DataFrame is as easy to plot as it is to analyze.
Another difference from PyDarshan plotting is the use of hvPlot, a high-level plotting API for the PyData ecosystem. hvPlot is an alternative to the plotting API provided by pandas and generates interactive plots that allow panning, zooming, hover data display, etc.
import hvplot
hvplot.__version__
'0.7.0'
def plot_h5d_access_histogram(grpby: pd.core.groupby.DataFrameGroupBy):
    """Plot a bar chart of H5D read and write access sizes.

    ``grpby`` is the H5D module DataFrame grouped by the 'counter' column.
    """
    ranges = ['0_100', '100_1K', '1K_10K', '10K_100K', '100K_1M', '1M_4M',
              '4M_10M', '10M_100M', '100M_1G', '1G_PLUS']
    labels = ['0-100', '100-1K', '1K-10K', '10K-100K', '100K-1M', '1M-4M',
              '4M-10M', '10M-100M', '100M-1G', '1G+']
    # Use the grpby argument rather than the global h5d_grp variable.
    read_vals = [grpby.get_group('H5D_SIZE_READ_AGG_' + rng)['value'].sum()
                 for rng in ranges]
    write_vals = [grpby.get_group('H5D_SIZE_WRITE_AGG_' + rng)['value'].sum()
                  for rng in ranges]
    df = pd.DataFrame(data={'read': read_vals, 'write': write_vals},
                      index=labels)
    # Route DataFrame.plot through hvPlot.
    pd.options.plotting.backend = 'holoviews'
    return df.plot.bar(title='Histogram of Access Sizes: H5D', rot=45, ylabel='count')
def plot_h5d_opcounts(grpby: pd.core.groupby.DataFrameGroupBy):
    """Plot a bar chart of H5D open, read, write, and flush counts.

    ``grpby`` is the H5D module DataFrame grouped by the 'counter' column.
    """
    # Use the grpby argument rather than the global h5d_grp variable.
    opens = grpby.get_group('H5D_OPENS')['value'].sum()
    reads = grpby.get_group('H5D_READS')['value'].sum()
    writes = grpby.get_group('H5D_WRITES')['value'].sum()
    flushes = grpby.get_group('H5D_FLUSHES')['value'].sum()
    df = pd.DataFrame(data={'open': opens, 'read': reads, 'write': writes,
                            'flush': flushes},
                      index=[0])
    # Route DataFrame.plot through hvPlot.
    pd.options.plotting.backend = 'holoviews'
    return df.plot.bar(title='H5D I/O Operations', ylabel='count')
plot_h5d_access_histogram(h5d_grp)
plot_h5d_opcounts(h5d_grp)
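Because the plots themselves are interactive, it can help to inspect the numbers behind them separately. The sketch below builds the access-size table without plotting it. The helper name `h5d_access_size_table` and the sample rows are hypothetical. Unlike the get_group() calls above, which raise a KeyError for an absent counter, this version treats a missing counter as zero, which is a design choice of the sketch.

```python
import pandas as pd

RANGES = ['0_100', '100_1K', '1K_10K', '10K_100K', '100K_1M', '1M_4M',
          '4M_10M', '10M_100M', '100M_1G', '1G_PLUS']


def h5d_access_size_table(df: pd.DataFrame) -> pd.DataFrame:
    """Sum the H5D access-size counters into a read/write table."""
    totals = df.groupby('counter')['value'].sum()
    return pd.DataFrame({
        # Series.get() falls back to 0.0 when a counter is absent.
        'read': [totals.get('H5D_SIZE_READ_AGG_' + r, 0.0) for r in RANGES],
        'write': [totals.get('H5D_SIZE_WRITE_AGG_' + r, 0.0) for r in RANGES],
    }, index=RANGES)


# Hypothetical long-format rows shaped like module2df() output.
demo = pd.DataFrame({
    'counter': ['H5D_SIZE_READ_AGG_0_100', 'H5D_SIZE_READ_AGG_0_100',
                'H5D_SIZE_WRITE_AGG_100K_1M'],
    'value': [2.0, 3.0, 4.0],
    'rank': [-1, 0, -1],
    'record': ['out.h5:/a', 'out.h5:/b', 'out.h5:/a'],
})

table = h5d_access_size_table(demo)
print(table.loc[['0_100', '100K_1M']])
```

The resulting DataFrame can be passed to any plotting backend, or simply printed when running outside a notebook.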