Internally, LIMIX uses the HDF5 file format (http://www.hdfgroup.org) to handle genotype and phenotype data. This file format is flexible and supported by a number of data analysis tools and languages, including R (e.g. rhdf5), Python (e.g. h5py or pandas) and Perl.
There is also a growing list of bioinformatics tools and pipelines that build on HDF5.
LIMIX offers a simple conversion tool, limix_converter, which can convert [plink](http://pngu.mgh.harvard.edu/~purcell/plink/) binary files (.bed), CSV files and 0/1/2 genotype files (G012), which can be generated using VCFtools.
limix_converter --outfile=./my_file.hdf5 --plink=./my_file
Note that the .bed ending is omitted. If the file my_file.hdf5 already exists, its genotype group (but not the phenotypes) is deleted. An example plink file set is included in the tutorial folder in "data/importer/genotype.(bed/fam/bim)".
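Because the converter takes the plink prefix rather than a file name, a quick check that the whole file set exists can catch path mistakes before conversion. The following is a minimal sketch in Python and not part of LIMIX: the helper name `missing_plink_files` is hypothetical, and it assumes the standard plink binary triplet (.bed, .bim, .fam).

```python
from pathlib import Path

# Standard plink binary file set endings (an assumption, not read from LIMIX).
PLINK_ENDINGS = (".bed", ".bim", ".fam")


def missing_plink_files(prefix):
    """Return the plink files expected for `prefix` that do not exist.

    `prefix` is the path without a file ending, i.e. the value that would be
    passed to --plink.
    """
    # Append the ending to the full prefix string so that prefixes containing
    # dots (e.g. "./study.v2") are handled correctly.
    return [prefix + ending for ending in PLINK_ENDINGS
            if not Path(prefix + ending).is_file()]


if __name__ == "__main__":
    missing = missing_plink_files("./my_file")
    if missing:
        print("Missing plink files:", ", ".join(missing))
    else:
        print("All plink files found; ready to run limix_converter.")
```

An empty list means that all three files are present and the prefix can be passed to --plink as shown above.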
VCF files first need to be converted into a G012 file. This can be achieved with VCFtools:
vcftools --vcf INFILE --012 --out OUTFILE
If the VCF file is gzip-compressed (.gz), use the --gzvcf option instead:
vcftools --gzvcf INFILE --012 --out OUTFILE
Subsequently, the file can be imported into a LIMIX HDF5 file using:
limix_converter --outfile=./my_file.hdf5 --g012=./OUTFILE
Note again that the file endings are omitted. VCFtools writes several output files (OUTFILE.012, OUTFILE.012.indv and OUTFILE.012.pos), and both limix_converter and vcftools expect the shared prefix without any file ending. An example VCF file is included in the tutorial folder in "data/importer/vcf_sample.vcf.gz".
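To sanity-check the VCFtools output before conversion, the three files can be read with a few lines of Python. This is a minimal sketch, not part of LIMIX: the function name `read_g012` is hypothetical, and it assumes the usual VCFtools layout, where each row of OUTFILE.012 holds one individual, the first column is a running index, genotypes are coded 0, 1, 2 with -1 for missing, OUTFILE.012.indv lists one sample ID per line, and OUTFILE.012.pos lists chromosome and position per line.

```python
def read_g012(prefix):
    """Read vcftools --012 output for `prefix` (path without file ending).

    Returns (sample_ids, positions, genotypes), where genotypes[i][j] is the
    genotype of sample i at site j (0, 1, 2, or -1 for missing).
    """
    with open(prefix + ".012.indv") as fh:
        sample_ids = [line.strip() for line in fh if line.strip()]

    positions = []
    with open(prefix + ".012.pos") as fh:
        for line in fh:
            if line.strip():
                # Each line: chromosome <whitespace> position
                chrom, pos = line.split()[:2]
                positions.append((chrom, int(pos)))

    genotypes = []
    with open(prefix + ".012") as fh:
        for line in fh:
            fields = line.split()
            if fields:
                # Drop the leading row index (assumed to be written by vcftools).
                genotypes.append([int(g) for g in fields[1:]])

    # Basic consistency checks between the three files.
    if len(genotypes) != len(sample_ids):
        raise ValueError("number of genotype rows does not match .012.indv")
    for row in genotypes:
        if len(row) != len(positions):
            raise ValueError("number of genotype columns does not match .012.pos")

    return sample_ids, positions, genotypes
```

Calling `read_g012("./OUTFILE")` uses the same prefix that is passed to --g012; a `ValueError` indicates that the three files do not describe the same samples and sites.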
Phenotype data in CSV format can be imported using:
limix_converter --outfile=./my_file.hdf5 --csv=./phenotype_sample.csv
Note that the phenotype file is expected to be in the format [samples (rows) x phenotypes (columns)], including column headers (phenotype IDs) and row headers (sample IDs). An example CSV file is included in the tutorial folder in "data/importer/phenotype.csv".
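The expected layout can be illustrated by reading such a file with the Python standard library. This is a sketch only: the function `read_phenotype_csv` and the example sample and phenotype IDs are hypothetical, and the comma delimiter and the empty top-left header cell are assumptions rather than documented requirements of limix_converter.

```python
import csv


def read_phenotype_csv(path):
    """Read a [samples x phenotypes] CSV with column and row headers.

    Returns (sample_ids, phenotype_ids, values); empty or non-numeric cells
    are returned as float("nan").
    """
    with open(path, newline="") as fh:
        rows = [row for row in csv.reader(fh) if row]

    # First row: top-left corner cell followed by the phenotype IDs.
    phenotype_ids = rows[0][1:]
    sample_ids, values = [], []
    for row in rows[1:]:
        if len(row) != len(phenotype_ids) + 1:
            raise ValueError("row for sample %r has the wrong number of columns" % row[0])
        sample_ids.append(row[0])  # first column: sample ID
        values.append([_to_float(cell) for cell in row[1:]])
    return sample_ids, phenotype_ids, values


def _to_float(cell):
    """Convert a cell to float, mapping empty or non-numeric cells to NaN."""
    try:
        return float(cell)
    except ValueError:
        return float("nan")


if __name__ == "__main__":
    # Hypothetical example data: two samples (rows), two phenotypes (columns).
    with open("phenotype_example.csv", "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["", "height", "weight"])
        writer.writerow(["sample_1", 1.72, 65.0])
        writer.writerow(["sample_2", 1.80, ""])
    print(read_phenotype_csv("phenotype_example.csv"))
```

Checking that every row has one sample ID plus one value per phenotype ID is a cheap way to detect a transposed or headerless file before running the converter.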