A statistical model applied to 544 in vivo HIV-1 recombinants reveals that viral genomic features, especially RNA structure, promote recombination

Mathematical Biology and Ecology Seminar
Wednesday, April 20, 2011 - 11:00
1 hour (actually 50 minutes)
Skiles 005
Departments of Statistics and of Genetics, Development and Cell Biology, Iowa State University
It has long been postulated, and to some extent confirmed by limited biological experiments, that RNA structure affects the propensity of HIV-1 reverse transcriptase to undergo strand transfer, a prerequisite for recombination.  Our goal was to use the large collection of in vivo recombinants isolated from patients and stored in the HIV database to determine whether signals in the HIV-1 genetic sequence, such as the propensity to form RNA secondary structure, promote recombination.  Starting from 65,000 HIV-1 sequences at least 400 nucleotides long, we identified 2,360 recombinants involving exactly two distinct subtypes.  Because we were interested in mechanistic rather than selective causes, we reduced the set of recombinants to 544 verifiably unique events.  We then fit a Gaussian Markov random field model with covariates in the mean to assess the impact of genetic features on recombination.  SHAPE reactivities were the most strongly, and negatively, correlated with recombination rates, consistent with the observation that pairing probabilities had a strong relationship in the opposite direction.  Less strongly associated, but still significant, were G-rich stretches (positively correlated), thermal stability (negatively correlated), and GC content (positively correlated).  Interestingly, known in vitro hotspots did not explain much of the in vivo recombination.
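
For readers unfamiliar with the model class, the following is a minimal sketch of a Gaussian Markov random field with covariates in the mean. It is not the speaker's implementation. Everything in it is an assumption made for illustration: the first-order random-walk neighborhood over ordered genomic windows, the ridge term kappa, the single simulated covariate (a stand-in for something like a standardized SHAPE reactivity), the coefficient values, and the function names rw1_precision and gls_mean_coefficients. The response y is simulated, not taken from the HIV database.

```python
import numpy as np

def rw1_precision(n, tau=1.0, kappa=1e-2):
    """Precision matrix of a first-order random-walk GMRF on n ordered sites.

    Assumption: neighbors are adjacent windows along the genome.
    kappa > 0 adds a small ridge so the precision matrix is invertible.
    """
    D = np.diff(np.eye(n), axis=0)  # (n-1) x n first-difference operator
    return tau * (D.T @ D + kappa * np.eye(n))

def gls_mean_coefficients(y, X, Q):
    """Generalized least squares estimate of beta for y ~ N(X beta, Q^{-1})."""
    XtQ = X.T @ Q
    return np.linalg.solve(XtQ @ X, XtQ @ y)

rng = np.random.default_rng(0)
n = 200  # hypothetical number of genomic windows

# Design matrix: intercept plus one simulated, standardized covariate.
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([-1.0, -0.5])  # arbitrary values used only for simulation

Q = rw1_precision(n)

# Draw a spatially correlated residual with covariance Q^{-1}:
# if Q = L L^T and z ~ N(0, I), then solving L^T r = z gives Cov(r) = Q^{-1}.
L = np.linalg.cholesky(Q)
resid = np.linalg.solve(L.T, rng.normal(size=n))

y = X @ beta_true + resid  # simulated response, e.g. a log recombination rate
beta_hat = gls_mean_coefficients(y, X, Q)
print("estimated coefficients:", beta_hat)
```

In this sketch the precision matrix is treated as known, so the covariate effects have a closed-form estimate. A full analysis would also estimate the precision parameters and attach uncertainty to each coefficient.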