Discussion
Antigenic variation is a strategy widely used by pathogens to evade the host immune response. It requires controlled recombination and gene expression to ensure diversification and mutually exclusive expression of antigens. Both gene expression and recombination are strongly affected by genome architecture, yet the repetitive nature of antigen arrays has complicated their assembly. As a result, the causal links between genome architecture and mutually exclusive antigen expression remain unknown. To better study the mechanism underlying antigenic variation, we developed a strategy for de novo assembly and scaffolding of complex genomes. Taking advantage of PacBio long-read sequencing technology and conserved features of chromosome folding, we assembled the genome of the protozoan parasite Trypanosoma brucei, one of the most important model organisms in antigenic variation research. We found that T. brucei chromosomes contain homozygous cores and long heterozygous subtelomeric regions, which encode the extensive variant surface glycoprotein (VSG) repertoire. Using genome-wide chromosome conformation capture (Hi-C) analyses, we determined the 3D genome architecture of bloodstream-form parasites. Our data revealed that the transcriptionally repressed subtelomeric VSG arrays fold into distinct, highly compacted compartments, characterized by very little interaction with other regions of the same chromosome. We suspect that this folding may serve to maintain the subtelomeric VSG arrays in a transcriptionally repressed state, ensuring the expression of a single VSG.