Neo X Gene Movement

WARNING: Preliminary Analysis

We are interested in the evolutionary impact of X inactivation. Several hypotheses have been proposed for why the X chromosome would evolve meiotic sex chromosome inactivation (MSCI) during spermatogenesis. Here we focus on the idea that genes important for spermatogenesis move off the X to avoid silencing. To investigate this question, we look at gene movement off the Neo X chromosome in D. pseudoobscura and D. willistoni. In these species, Muller element A (X) is fused with Muller element D (3L) (Figure 1). We therefore expect genes on Muller element D that are important for spermatogenesis to have moved to another chromosome or to have been lost.

Figure 1: Chromosome synteny across the Drosophila genus. Image modified from http://flybase.org/maps/synteny.

First, we need to characterize gene movement in the Drosophila genus. We can then test whether genes upregulated in primary spermatocytes are enriched among the genes that moved off Muller D. To characterize gene movement, we compared gene locations in five members of the genus: D. pseudoobscura, D. melanogaster, D. willistoni, D. mojavensis, and D. virilis. I downloaded updated gene annotations and ortholog mappings from GSE99574 and supplemented these ortholog mappings with those provided by FlyBase. We selected a set of 13,473 orthologous genes present in D. melanogaster and at least one of the other species. Using these gene locations, we can determine how each gene has behaved in D. pseudoobscura and D. willistoni.

Methods

Identification of orthologs

The quality of assemblies and annotations is highly variable across the Drosophila genus. Genes from D. willistoni, D. mojavensis, and D. virilis are currently mapped to scaffolds instead of chromosome arms. We used annotations from D. melanogaster to assign a Muller element to each of these scaffolds using a consensus approach. For each scaffold with > 20 genes, we looked up where the orthologs of its genes are located in D. melanogaster and assigned the Muller element by majority rule. To validate this approach, we applied the same process to D. pseudoobscura, which has chromosomal annotation, and achieved 100% agreement with the D. pseudoobscura annotation.
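The majority-rule assignment described above can be sketched roughly as follows. This is a minimal illustration, not the code used for the analysis: the function name, the dictionary inputs, and the toy gene IDs are all hypothetical, and it assumes the "> 20 genes" threshold counts every gene on the scaffold rather than only genes with an ortholog.

```python
from collections import Counter, defaultdict


def assign_muller(gene_scaffold, ortholog, dmel_muller, min_genes=20):
    """Assign a Muller element to each scaffold by majority rule.

    gene_scaffold: gene ID -> scaffold name (species mapped to scaffolds)
    ortholog:      gene ID -> D. melanogaster ortholog ID
    dmel_muller:   D. melanogaster gene ID -> Muller element (A-F)
    min_genes:     scaffolds need more than this many genes (assumed: all genes)
    """
    genes_per_scaffold = Counter(gene_scaffold.values())

    # Each gene with a located D. melanogaster ortholog casts one vote.
    votes = defaultdict(Counter)
    for gene, scaffold in gene_scaffold.items():
        element = dmel_muller.get(ortholog.get(gene))
        if element is not None:
            votes[scaffold][element] += 1

    # Keep only large scaffolds and take the most common element.
    # Ties are broken by first-seen order, which is an arbitrary choice here.
    assignment = {}
    for scaffold, counts in votes.items():
        if genes_per_scaffold[scaffold] > min_genes:
            assignment[scaffold] = counts.most_common(1)[0][0]
    return assignment


# Toy data: scaffold_1 has 25 genes (20 vote D, 5 vote E); scaffold_2 has 3.
gene_scaffold = {f"g{i}": "scaffold_1" for i in range(25)}
gene_scaffold.update({f"s{i}": "scaffold_2" for i in range(3)})
ortholog = {gene: f"dmel_{gene}" for gene in gene_scaffold}
dmel_muller = {f"dmel_g{i}": ("D" if i < 20 else "E") for i in range(25)}
dmel_muller.update({f"dmel_s{i}": "A" for i in range(3)})

print(assign_muller(gene_scaffold, ortholog, dmel_muller))
# {'scaffold_1': 'D'}
```

Running the same function on a species with known chromosome arms and comparing the result to the existing annotation would mirror the D. pseudoobscura validation step.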

Classification of evolutionary state

We classified each gene as conserved, gene_death, moved_on, or moved_off based on its localization across the genus (Table 1). For example, if a gene is located on Muller D in all species, it is conserved (Table 1a). Similarly, recent_conserved genes are on Muller D in the Sophophora branch but elsewhere in D. virilis and D. mojavensis. To determine movement in D. pseudoobscura, we did the following. If the gene was on Muller D in all other species but not found in D. pseudoobscura, it was considered gene_death. If the gene was on Muller D in all other species but on another Muller element in D. pseudoobscura, it was considered moved_off. Conversely, if the gene was on Muller D in D. pseudoobscura but on another Muller element in the other species, it was considered moved_on. We used the same logic for D. willistoni. We then further grouped genes by whether selection favors the gene remaining where it is (conserved, moved_on) or acts against its location (gene_death, moved_off).

Table 1a. Classification of genes on Muller element D.

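| Dmel | Dpse | (Dwil/Dvir/Dmoj) | Class | Group |
|---|---|---|---|---|
| D | D | D | conserved | Selection Favor D |
| Other | D | Other | moved_on | Selection Favor D |
| D | Other | D | moved_off | Selection Against D |
| D | Not_found | D | gene_death | Selection Against D |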

Table 1b. Classification of genes on Muller element E.

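| Dmel | Dpse | (Dwil/Dvir/Dmoj) | Class | Group |
|---|---|---|---|---|
| E | E | E | conserved | Selection Favor E |
| Other | E | Other | moved_on | Selection Favor E |
| E | Other | E | moved_off | Selection Against E |
| E | Not_found | E | gene_death | Selection Against E |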
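The rules in Tables 1a and 1b can be written as a small decision function. The sketch below is only an illustration under stated assumptions: the function and argument names are hypothetical, "Other" is taken to mean "found on a different Muller element", and recent_conserved is not modeled, so such genes fall into "other" here.

```python
def classify(element, dmel, target, others):
    """Classify one ortholog relative to a focal Muller element.

    element: focal Muller element, e.g. "D" or "E"
    dmel:    Muller element of the gene in D. melanogaster
    target:  Muller element in the target species (D. pseudoobscura or
             D. willistoni), or None if the gene was not found there
    others:  Muller elements of the gene in the remaining species
    """
    others_on = all(o == element for o in others)
    # Assumption: "Other" means present, but on a different element.
    others_off = all(o is not None and o != element for o in others)

    if dmel == element and others_on:
        if target == element:
            return "conserved"
        if target is None:
            return "gene_death"
        return "moved_off"
    if dmel != element and target == element and others_off:
        return "moved_on"
    return "other"  # includes recent_conserved, which this sketch does not model


# Grouping used in the final column of Tables 1a and 1b.
GROUP = {
    "conserved": "Selection Favor",
    "moved_on": "Selection Favor",
    "moved_off": "Selection Against",
    "gene_death": "Selection Against",
}

print(classify("D", "D", "D", ["D", "D", "D"]))   # conserved
print(classify("D", "E", "D", ["E", "E", "E"]))   # moved_on
print(classify("D", "D", "A", ["D", "D", "D"]))   # moved_off
print(classify("D", "D", None, ["D", "D", "D"]))  # gene_death
```

Passing "E" as the focal element reproduces Table 1b with the same logic.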

Results

Ortholog selection

Yang et al. generated orthologs only for genes expressed in their data, so a number of FlyBase-annotated orthologs were excluded from their list. The majority of orthologs present in both data sets mapped to the same Muller element (Table 2a). In contrast, orthologs missing from one of the two sets showed much lower similarity (Table 2a). I merged these ortholog sets by updating the FlyBase annotation with the Yang et al. mappings (Table 2a). The number of genes on each Muller element is similar across species, with a standard deviation of ≤ 175 genes for each element (Table 2b).
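One way to read "updating the FlyBase annotation with the Yang et al. mappings" is as a dictionary update in which Yang et al. entries take precedence wherever both sources have a mapping. The sketch below shows that reading only; the function name and gene IDs are hypothetical placeholders, and the precedence rule is an assumption.

```python
def merge_orthologs(flybase, yang):
    """Merge two ortholog maps (D. melanogaster gene ID -> ortholog ID).

    Starts from the FlyBase map and overwrites it with every non-missing
    Yang et al. entry (assumed precedence; missing values are None).
    """
    merged = dict(flybase)
    merged.update({gene: orth for gene, orth in yang.items() if orth is not None})
    return merged


# Placeholder IDs for illustration only.
flybase = {"dmel_1": "dpse_1", "dmel_2": "dpse_2", "dmel_3": "dpse_3"}
yang = {"dmel_2": "dpse_2b", "dmel_3": None, "dmel_4": "dpse_4"}

print(merge_orthologs(flybase, yang))
# {'dmel_1': 'dpse_1', 'dmel_2': 'dpse_2b', 'dmel_3': 'dpse_3', 'dmel_4': 'dpse_4'}
```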

Table 2a. Similarity of Yang et al. vs FlyBase.

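| Species | Num Non Missing | Similarity of Non Missing | Num Yang Missing | Num FlyBase Missing | Similarity of Missing | Number Orthologs |
|---|---|---|---|---|---|---|
| dmel | 12,599 | 1 | 961 | 887 | -0.0573347 | 13,559 |
| dpse | 9,533 | 0.987294 | 4,597 | 2,250 | 0.301446 | 12,358 |
| dvir | 9,657 | 0.993625 | 4,427 | 2,663 | 0.393496 | 11,948 |
| dmoj | 9,747 | 0.994893 | 4,316 | 2,772 | 0.427417 | 11,855 |
| dwil | 9,751 | 0.987739 | 4,337 | 2,544 | 0.384694 | 12,102 |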

Similarity is reported as the adjusted Rand score.
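For readers unfamiliar with the similarity measure, the following is a minimal sketch of the standard adjusted Rand index applied to two lists of Muller element labels for the same genes. It assumes the scores in Table 2a follow the usual definition (as implemented, for example, by scikit-learn's adjusted_rand_score); the source does not state how they were computed, and the function name is hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to compare

    # Pairs of items grouped together in both labelings, and in each one.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_a * sum_b / comb(n, 2)  # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. everything in one group
    return (sum_cells - expected) / (max_index - expected)


# Identical assignments score 1, regardless of the label names used.
print(adjusted_rand_index(["A", "A", "D", "D"], ["A", "A", "D", "D"]))  # 1.0

# Partial agreement scores between 0 and 1.
print(round(adjusted_rand_index([0, 0, 1, 2], [0, 0, 1, 1]), 4))  # 0.5714
```

Because the index is corrected for chance, it can be slightly negative when two labelings agree less often than random assignment would, which is how a value such as -0.0573347 can arise.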

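Table 2b. Number of genes per Muller element.

| Species | A | B | C | D | E | F | Unknown Scaffold |
|---|---|---|---|---|---|---|---|
| dmel | 2,102 | 2,582 | 2,766 | 2,656 | 3,351 | 78 | 24 |
| dpse | 1,850 | 2,281 | 2,458 | 2,310 | 3,087 | 0 | 372 |
| dvir | 1,913 | 2,184 | 2,419 | 2,316 | 2,930 | 77 | 109 |
| dmoj | 1,858 | 2,222 | 2,398 | 2,262 | 2,970 | 93 | 52 |
| dwil | 1,984 | 2,175 | 2,489 | 2,219 | 3,075 | 0 | 160 |
| Std Dev | 104.55 | 169.13 | 149.52 | 174.11 | 164.38 | 45.72 | 138.14 |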
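As a quick arithmetic check, the Std Dev row is consistent with the sample standard deviation (n - 1 denominator) taken across the five species. The snippet below recomputes it for Muller elements A and D from the counts in Table 2b; the variable names are only for illustration.

```python
from statistics import stdev

# Gene counts per species (dmel, dpse, dvir, dmoj, dwil) from Table 2b.
muller_a = [2102, 1850, 1913, 1858, 1984]
muller_d = [2656, 2310, 2316, 2262, 2219]

# statistics.stdev computes the sample standard deviation.
print(f"A: {stdev(muller_a):.2f}")  # A: 104.55
print(f"D: {stdev(muller_d):.2f}")  # D: 174.11
```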

Gene movement classification

For gene movement results, browse to the respective page using either the links below or the sidebar.

Table 3. D. pseudoobscura movement classes.

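| Class | muller_A | muller_D | muller_E |
|---|---|---|---|
| conserved | 1,571 | 1,919 | 2,557 |
| recent_conserved | 140 | 159 | 214 |
| moved_on | 21 | 10 | 19 |
| gene_death | 245 | 373 | 390 |
| moved_off | 64 | 55 | 44 |
| other | 247 | 351 | 433 |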

Table 4. D. willistoni movement classes.

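| Class | muller_A | muller_D | muller_E |
|---|---|---|---|
| conserved | 1,571 | 1,919 | 2,557 |
| recent_conserved | 140 | 159 | 214 |
| moved_on | 0 | 0 | 0 |
| gene_death | 256 | 388 | 487 |
| moved_off | 32 | 141 | 81 |
| other | 289 | 260 | 318 |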

Gene movement cell type enrichment

D. pseudoobscura
D. willistoni