Tue 22 Mar, 10:30am (10 mins)
Where: T/005
Multicopy genes and other repetitive elements cause assembly fragmentation in complex eukaryotic genomes, limiting the study of their variability. The genome of Trypanosoma cruzi, the protozoan parasite that causes Chagas disease, has a high repetitive content, which consists of multigene families, transposable elements, tandem repeats, and satellite sequences. Although many T. cruzi multigene families encode surface proteins that play pivotal roles in host-parasite interactions, their variability is currently underestimated, because their high repetitive content results in collapsed gene variants, even in current long-read assemblies. In addition, few studies compare multigene family variability among Discrete Typing Units (DTUs), and those that do usually work at the level of assembled genomes, using a limited number of strains. To estimate sequence variability and copy number variation of repetitive multigene families, we have developed a whole-genome-sequencing read-based approach that is independent of gene-specific mapping and de novo assembly. Reads from each parasite isolate are mapped to a reference containing genomic sequences from representative strains, and reads that map to any gene of a family of interest are recovered and fragmented into 30-nucleotide k-mers. These k-mers are clustered by sequence similarity to reduce redundancy. Finally, the sum of the counts of all k-mers in each cluster is taken as the cluster copy number. This methodology was used to estimate the copy number and variability of MASP, TcMUC and Trans-Sialidase (TS), the three largest T. cruzi multigene families, in 36 T. cruzi strains, including members of all six parasite DTUs. The analysis showed that T. cruzi multigene families present a specific pattern of variability and copy number among the distinct parasite DTUs.
TcI isolates had the lowest sequence variability and hybrid strains the highest, suggesting that retaining a larger repertoire of family members after hybridization could be advantageous. Differences were observed between the hybrid strains CL Brener and Tulahuen, which suggests that they may have resolved the hybridization differently. The three multigene families evaluated vary in antigenicity in a murine model, in which the antibody response to MASP showed the highest, and the response to TS the lowest, diversification with chronification. The reactivity of sera from human patients with chronic Chagas disease was focused on TS antigens, suggesting that targeting conserved TS sequences could be a potential avenue to improve diagnosis and vaccine design against Chagas disease. Finally, the proposed approach can be applied to study multicopy genes in any organism, providing new possibilities to assess sequence variability in complex genomes.
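
The k-mer steps described in the abstract (fragmenting recovered reads into k-mers, clustering by similarity, and summing counts per cluster) could be sketched roughly as below. This is an illustration only, not the authors' pipeline: the abstract does not state the similarity measure or clustering algorithm, so the greedy Hamming-distance clustering, the `max_mismatches` threshold, and all function names here are assumptions. Read mapping and any coverage normalization are outside this sketch.

```python
from collections import Counter


def kmers(read, k=30):
    """Yield every k-length substring of a read (none if the read is shorter than k)."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def cluster_kmer_counts(reads, k=30, max_mismatches=2):
    """Count k-mers across reads, then greedily cluster them.

    Assumption: k-mers are visited from most to least abundant, and each
    joins the first existing cluster whose representative is within
    `max_mismatches` substitutions; otherwise it founds a new cluster.
    Returns {representative k-mer: summed count of all k-mers in its cluster}.
    """
    counts = Counter(km for read in reads for km in kmers(read, k))
    clusters = {}
    # Ties in count are broken alphabetically so the result is deterministic.
    for km, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        for rep in clusters:
            if hamming(km, rep) <= max_mismatches:
                clusters[rep] += n  # the summed count stands in for cluster copy number
                break
        else:
            clusters[km] = n
    return clusters
```

Calling `cluster_kmer_counts(reads)` on the reads recovered for one gene family returns one summed count per cluster. The greedy loop compares each k-mer against every existing representative, so it is quadratic in the worst case; a whole-genome data set would need a dedicated k-mer counter and clustering tool.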