Below is an outline of the steps I followed to create this Quickload:
1. Use wget to obtain the .2bit file from UCSC's track hub directory, then rename it
wget https://hgdownload.soe.ucsc.edu/hubs/GCF/905/220/365/GCF_905220365.1/GCF_905220365.1.2bit
mv GCF_905220365.1.2bit ilVanCard2.1.2bit
2. Create genome.txt, then check that the chromosomes are ordered logically (i.e., numerically)
./twoBitInfo ilVanCard2.1.2bit genome.txt
cat genome.txt
3. Use Vanessa cardui's taxID (171605) to get the information needed from gene2accession.gz and gene_info.gz to create the BED14 file in a later step
gunzip -c gene2accession.gz | grep $'^171605\t' > 171605.gene2accession.txt
gunzip -c gene_info.gz | grep $'^171605\t' > 171605.gene_info.txt
4. Download the RefSeqAll BED file from UCSC's table browser (Link: https://genome.ucsc.edu/cgi-bin/hgTables), then create the BED14 file using the following code:
cd ~/Documents/Repos/genomesource/
./ucscToBedDetail.py -a ~/Downloads/171605.gene2accession.txt -g ~/Downloads/171605.gene_info.txt ~/Downloads/V_cardui_ncbiRefSeq.bed.gz ~/Downloads/V_cardui_Feb_2021_ncbiRefSeq.bed
5. Sort, bgzip, and tabix-index the BED14 file
cd ~/Downloads/
sort -k1,1 -k2,2n V_cardui_Feb_2021_ncbiRefSeq.bed | bgzip > V_cardui_Feb_2021_ncbiRefSeq.bed.gz
tabix -0 -s 1 -b 2 -e 3 V_cardui_Feb_2021_ncbiRefSeq.bed.gz
6. Sanity-check the 2bit and BED files: add the 2bit file as a reference, then drag/drop the BED files into IGB. Confirm that gene models are present and labeled correctly, and that no error messages appear in the Log.
7. Create a new directory in the quickload repo, then create annots.xml
cd ~/Documents/Repos/quickload/
svn mkdir V_cardui_Feb_2021
svn cp A_gambiae_Feb_2003/annots.xml V_cardui_Feb_2021
nano V_cardui_Feb_2021/annots.xml
8. Add V_cardui_Feb_2021 to contents.txt and .htaccess
V_cardui_Feb_2021 Vanessa cardui (Feb 2021) painted lady (ilVanCard2.1)
AddDescription "Vanessa cardui (Feb 2021) painted lady (ilVanCard2.1)" V_cardui_Feb_2021
9. Create HEADER.md
../genomesource/writeQuickLoadHeaderUCSC.py V_cardui_Feb_2021 > V_cardui_Feb_2021/HEADER.md
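Some of the checks in steps 2 and 6 can also be run on the command line before opening IGB. The following is a minimal sketch, not part of genomesource or IGB: the function names are hypothetical, it assumes genome.txt is the tab-separated name/size output of twoBitInfo, it is meant to be run on the uncompressed BED14 file, and the ordering check assumes a sort that supports -V (GNU coreutils does).

```shell
# Hypothetical helper functions; the names are illustrative only.

# Succeeds if the sequence names in genome.txt are already in version-sort
# order (chr1, chr2, chr10, ...). Assumes sort supports -V.
check_genome_order() {
    cut -f1 "$1" | LC_ALL=C sort -V -c 2>/dev/null
}

# Prints each sequence name used in the BED file that is absent from genome.txt.
missing_chroms() {
    awk -F'\t' 'NR == FNR { known[$1] = 1; next }
                !($1 in known) && !seen[$1]++ { print $1 }' "$2" "$1"
}

# Prints BED lines whose end coordinate exceeds the sequence size in genome.txt.
out_of_bounds() {
    awk -F'\t' 'NR == FNR { size[$1] = $2; next }
                ($1 in size) && ($3 + 0 > size[$1] + 0)' "$2" "$1"
}

# Prints the number of lines whose column count differs from the expected one
# (14 for a BED14 file).
wrong_column_count() {
    awk -F'\t' -v n="$2" 'NF != n { c++ } END { print c + 0 }' "$1"
}

# Example usage with the file names from the steps above:
#   check_genome_order genome.txt || echo "genome.txt is not in version-sort order"
#   missing_chroms V_cardui_Feb_2021_ncbiRefSeq.bed genome.txt
#   out_of_bounds V_cardui_Feb_2021_ncbiRefSeq.bed genome.txt
#   wrong_column_count V_cardui_Feb_2021_ncbiRefSeq.bed 14
```

If the last three commands print nothing, nothing, and 0 respectively, the BED file is at least consistent with the 2bit file; the visual check in IGB is still needed to confirm that gene models are labeled correctly.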