Institut Pasteur | C3BI | Bioinformatics and Biostatistics Hub | GIPhy

Supplementary Data GIPhy



Clostridium baratii core-genome
  Aligned recombination-purged core-genome estimated from 7 C. baratii isolates in FASTA format
  Mazuet C, Legeay C, Sautereau J, Bouchier C, Criscuolo A, Bouvet P, Trehard H, Jourdan Da Silva N, Popoff M (2017) Characterization of Clostridium baratii Type F strains responsible for an outbreak of botulism linked to beef meat consumption in France. PLOS Currents Outbreaks, Edition 1. doi:10.1371/currents.outbreaks.6ed2fe754b58a5c42d0c33d586ffc606


Bordetella pertussis core-gene
  Concatenated multiple sequence alignments of 2,038 cgMLST loci from 55 B. pertussis isolates in FASTA format
  Bouchez V, Guglielmini J, Dazas M, Landier A, Toubiana J, Guillot S, Criscuolo A, Brisse S (2018) Genomic sequencing of Bordetella pertussis for epidemiology and global surveillance of whooping cough. Emerging Infectious Diseases, 24:988–994. doi:10.3201/eid2406.171464


Mucor circinelloides f. circinelloides de novo genome assemblies
  25 FASTA-formatted files, each corresponding to assembled scaffold sequences of the genome of a M. circinelloides f. circinelloides isolate
  Garcia-Hermoso D, Criscuolo A, Lee SC, Legrand M, Chaouat M, Denis B, Lafaurie M, Rouveau M, Soler C, Schaal JV, Mimoun M, Mebazaa A, Heitman J, Dromer F, Brisse S, Bretagne S, Alanio A (2018) Outbreak of invasive wound mucormycosis in a burn unit due to multiple strains of Mucor circinelloides f. circinelloides resolved by whole-genome sequencing. mBio, 9:e00573-18. doi:10.1128/mBio.00573-18


Clostridium perfringens core-genome
  Aligned recombination-purged core-genome estimated from 115 C. perfringens isolates in FASTA format
  Diancourt L, Sautereau J, Criscuolo A, Popoff MR (2019) Two Clostridium perfringens type E isolates in France. Toxins, 11:E138. doi:10.3390/toxins11030138


Corynebacterium belfantii core-genome
  Read mapping-based core-genome estimated from 6 C. belfantii isolates in FASTA format
  Pivot D, Fanton A, Badell-Ocando E, Benouachkou M, Astruc K, Huet F, Amoureux L, Neuwirth C, Criscuolo A, Aho S, Toubiana J, Brisse S (2019) Carriage of a single strain of nontoxigenic Corynebacterium diphtheriae bv. Belfanti (Corynebacterium belfantii) in four patients with cystic fibrosis. Journal of Clinical Microbiology, 57:e00042-19. doi:10.1128/JCM.00042-19


Simulation data for testing alignment-free genome distance estimates
  Each XZ-compressed file contains 500 pairs of genome sequences simulated using SeqGen under the GTR+Γ evolutionary model, one pair per line. Each line contains 18 space-separated fields: [1] seed value used during simulation, [2] true evolutionary distance d between the two simulated genomes, [3] total number of simulated characters, [4] number of non-indel characters with a nucleotide mismatch, [5] number of non-indel characters, [6-9] A, C, G, T frequencies used during simulation, [10-15] GTR parameters used during simulation, [16] Γ distribution parameter used during simulation, [17-18] the two simulated sequences, with indel events represented as gaps. Note that each pair of aligned sequences without gaps can be regenerated using SeqGen with the parameters from fields [1] and [6-16] and the model tree (t1:d,t2:0.000); where d is given in field [2].
  500 simulation parameters and associated genome pairs: d = 0.10 nucleotide substitutions per character, indel probability = 0.0012
  500 simulation parameters and associated genome pairs: d = 0.20 nucleotide substitutions per character, indel probability = 0.0024
  500 simulation parameters and associated genome pairs: d = 0.30 nucleotide substitutions per character, indel probability = 0.0036
  500 simulation parameters and associated genome pairs: d = 0.40 nucleotide substitutions per character, indel probability = 0.0048
  500 simulation parameters and associated genome pairs: d = 0.50 nucleotide substitutions per character, indel probability = 0.0060
  500 simulation parameters and associated genome pairs: d = 0.60 nucleotide substitutions per character, indel probability = 0.0072
  Criscuolo A (2019) A fast alignment-free bioinformatics procedure to infer accurate distance-based phylogenetic trees from genome assemblies. Research Ideas and Outcomes, 5:e36178. doi:10.3897/rio.5.e36178
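
  The 18-field line layout described above can be read with a few lines of Python. The following is a minimal sketch, not part of the original data release: the names SimulatedPair, parse_line, read_pairs and p_distance are hypothetical, the seed is kept as a string because its exact format is not specified, and fields [3-5] are assumed to be integer counts. The p_distance helper simply divides field [4] by field [5], giving the observed proportion of mismatches among non-indel characters.

```python
import lzma
from typing import Iterator, NamedTuple, Tuple


class SimulatedPair(NamedTuple):
    """One line of a simulation file (hypothetical container, 18 fields)."""
    seed: str                      # field 1, kept as text (format not specified)
    true_distance: float           # field 2, d
    n_total: int                   # field 3, total simulated characters
    n_mismatch: int                # field 4, non-indel characters that differ
    n_non_indel: int               # field 5, non-indel characters
    base_freqs: Tuple[float, ...]  # fields 6-9, A, C, G, T frequencies
    gtr_params: Tuple[float, ...]  # fields 10-15, GTR parameters
    gamma_shape: float             # field 16, Γ distribution parameter
    seq1: str                      # field 17, first sequence (with gaps)
    seq2: str                      # field 18, second sequence (with gaps)


def parse_line(line: str) -> SimulatedPair:
    """Split one whitespace-separated line into its 18 typed fields."""
    fields = line.split()
    if len(fields) != 18:
        raise ValueError(f"expected 18 fields, got {len(fields)}")
    return SimulatedPair(
        seed=fields[0],
        true_distance=float(fields[1]),
        n_total=int(fields[2]),
        n_mismatch=int(fields[3]),
        n_non_indel=int(fields[4]),
        base_freqs=tuple(float(x) for x in fields[5:9]),
        gtr_params=tuple(float(x) for x in fields[9:15]),
        gamma_shape=float(fields[15]),
        seq1=fields[16],
        seq2=fields[17],
    )


def read_pairs(path: str) -> Iterator[SimulatedPair]:
    """Stream records from an XZ-compressed file without unpacking it on disk."""
    with lzma.open(path, "rt") as handle:
        for line in handle:
            if line.strip():  # skip empty lines, if any
                yield parse_line(line)


def p_distance(pair: SimulatedPair) -> float:
    """Observed mismatch proportion among non-indel characters (field 4 / field 5)."""
    return pair.n_mismatch / pair.n_non_indel
```

  Under these assumptions, iterating over read_pairs(path) for one of the files listed above yields one record per simulated genome pair, so that p_distance(pair) can be compared against pair.true_distance.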