The University of Melbourne
9 files

Data repository for "The genomic basis of temporal niche evolution in a diurnal rodent"

Version 4 2023-07-13, 06:24
Version 3 2023-07-13, 06:22
Version 2 2022-12-06, 00:17
Version 1 2022-07-15, 17:28
posted on 2023-07-13, 06:24 authored by CHARLES FEIGIN

This repository contains datasets associated with the publication "The genomic basis of temporal niche evolution in a diurnal rodent", a collaboration between the Mallarino Lab at Princeton University and the Lucas Lab at the University of Manchester. This study examined the evolution of diel temporal niche traits in the diurnal African striped mouse (Rhabdomys pumilio) from a comparative and functional perspective. Additionally, this study presents the first genome assembly for this species, deposited at NCBI under BioProject: PRJNA858857.

This repository contains the following:

1) Rhabdomys_pumilio_final.gff.gz: A raw GFF-formatted gene annotation set for the Rhabdomys pumilio genome, produced by Funannotate and used in transcriptomic analyses.

2) Rhabdomys_pumilio.mouse_gene_name_final.gff.gz: A copy of the above GFF-formatted gene annotation set for the Rhabdomys pumilio genome, in which gene symbols from the laboratory mouse (Mus musculus) have been assigned to their predicted R. pumilio orthologs.

3) CompGenAnno.tar.gz: A folder of GFF-formatted annotations used in comparative genomic analyses, produced by directly lifting over gene annotations from the Mus musculus genome (annotation: GCF_000001635.27_GRCm39_genomic.gff, assembly: GCF_000001635.27_GRCm39_genomic.fna) onto each of 23 other murid genome assemblies. Also provided are a tab-separated manifest of the reference genomes (manifest_of_genome_assemblies_and_liftover_annotations.txt) and a file of locus trees (locus_trees.txt) covering each orthologous group of genes used in comparative genomic analyses.

4) Table_of_RER_data_for_examined_species.xlsx: A large table of relative evolutionary rate (RER) measurements for each orthologous group of gene sequences (referenced against a Mus musculus transcript), for each species examined. A species may appear in multiple columns, reflecting differences in species representation among orthologous groups (i.e. cases in which the branch to a given leaf node originates at a different ancestral node because sister species are not represented in that alignment).

5) Rhabdomys_pumilio_Princeton_asm1.0_preNCBI.fasta.gz: A copy of the genome assembly prior to any re-formatting that NCBI performs after upload.

6) A script used to filter multi-FASTA files of orthologous transcripts by the percentage of sequences with recovered start codons (prior to MAFFT alignment).

7) A script used to filter orthologous transcripts based on their pre-alignment length relative to the reference Mus musculus ortholog used to annotate them via Liftoff (prior to MAFFT alignment).

8) A script used to filter orthologous transcripts based on the presence of gaps or insertions (i.e. gaps introduced into the reference Mus musculus ortholog) after initial MAFFT alignment.

9) calc_plot_murid_RERs.r: A script used to calculate and plot relative evolutionary rates based on RAxML trees for processed murid ortholog alignments.
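The three filtering steps described in items 6 to 8 can be illustrated with a minimal Python sketch. This is not the deposited code: the function names, the thresholds (min_fraction, min_ratio, max_ratio) and the choice of Python are all assumptions made for illustration, and the actual scripts in this repository may apply different criteria.

```python
from typing import Dict, Tuple

START_CODON = "ATG"  # canonical start codon; an assumption of this sketch


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse multi-FASTA text into {sequence_id: uppercase sequence}."""
    records: Dict[str, str] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line.upper()
    return records


def fraction_with_start_codon(records: Dict[str, str]) -> float:
    """Fraction of sequences that begin with the start codon (cf. item 6)."""
    if not records:
        return 0.0
    hits = sum(1 for seq in records.values() if seq.startswith(START_CODON))
    return hits / len(records)


def passes_start_codon_filter(records: Dict[str, str],
                              min_fraction: float = 0.8) -> bool:
    """Keep an orthologous group only if enough sequences have a start codon.

    min_fraction is a hypothetical threshold, not taken from the study.
    """
    return fraction_with_start_codon(records) >= min_fraction


def filter_by_relative_length(records: Dict[str, str], reference_id: str,
                              min_ratio: float = 0.9,
                              max_ratio: float = 1.1) -> Dict[str, str]:
    """Keep unaligned sequences whose length is close to the reference (cf. item 7).

    min_ratio and max_ratio are hypothetical bounds.
    """
    ref_len = len(records[reference_id])
    return {name: seq for name, seq in records.items()
            if min_ratio <= len(seq) / ref_len <= max_ratio}


def count_gaps_and_insertions(aligned_seq: str,
                              aligned_ref: str) -> Tuple[int, int]:
    """Count gap and insertion columns relative to the reference (cf. item 8).

    A gap is a column where the reference has a base and the sequence has '-'.
    An insertion is a column where the reference has '-' and the sequence has
    a base, i.e. a gap introduced into the reference by the alignment.
    """
    if len(aligned_seq) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    gaps = sum(1 for s, r in zip(aligned_seq, aligned_ref)
               if s == "-" and r != "-")
    insertions = sum(1 for s, r in zip(aligned_seq, aligned_ref)
                     if s != "-" and r == "-")
    return gaps, insertions
```

In a workflow of this shape, the first two functions would be applied to unaligned multi-FASTA files and the last to rows of an alignment, with the Mus musculus sequence as the reference. Whether a group or transcript is kept then depends on cutoffs that would need to be taken from the deposited scripts themselves.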


NIH F32 GM139240-01

NIH F32 GM139253

NIH R35GM133758

Sir Henry Dale Fellowship, jointly funded by the Wellcome Trust and the Royal Society (Grant Number 218556/Z/19/Z)

Wellcome Investigator Award 210684/Z/18/Z

BBSRC project grant BB/V011111/1