This page lists every 2bit file in our svn repository, as of last week. Working from that list, I used "svn rm" to remove 2bit files. I also edited the corresponding annots.xml files so that each contains a reference genome sequence "file" tag pointing to a genome sequence 2bit file hosted in the "downloads" sections of the UCSC Genome Browser web site.

Before I made this listing, I ran a test: I removed three 2bit files, edited their annots.xml files, and checked whether the repository got smaller. One of the three was a recently added 2bit file that Paige had made from a FASTA file obtained from the NCBI web site. I was able to find a 2bit file for that very same genome. It turns out that the GenArk project at UCSC hosts a great many 2bit files for genome assemblies available at NCBI. This means that for the new genomes Paige has added using NCBI as the source, we can also replace those 2bit files with references to the GenArk copies of those NCBI assemblies.

However, I decided to first remove the 2bit files that are available from the older location on the UCSC web site. I'm calling that older location the "bigzips" location. In the listing below, an "x" at the start of a line marks a file that I removed.

After doing this work, the final version of the repository was revision 223. I "dumped" that revision of the repository and observed that its size was now only 11 GB, about a third of the size of the full repository with all versions.
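The removal step described above could be scripted roughly as follows. This is only a sketch: the function name `svn_rm_commands` and the commit message are made up for illustration, and the script prints the commands rather than running them, so the output can be reviewed before anything is removed.

```python
import shlex


def svn_rm_commands(paths, message="Remove 2bit files now referenced from UCSC"):
    """Build (but do not run) the svn commands that remove the given files.

    The commit message is a placeholder, not the one actually used.
    """
    commands = [f"svn rm {shlex.quote(path)}" for path in paths]
    commands.append(f"svn commit -m {shlex.quote(message)}")
    return commands


# Two of the paths marked with "x" in the listing on this page.
paths = [
    "./P_troglodytes_Oct_2010/P_troglodytes_Oct_2010.2bit",
    "./L_africana_Jul_2009/L_africana_Jul_2009.2bit",
]

for command in svn_rm_commands(paths):
    print(command)
```

Note that "svn rm" only removes a file from the newest revision. Earlier revisions still contain it, which is why the size comparison above was made on a dump of the final revision rather than on the full repository.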
x 822M ./P_troglodytes_Oct_2010/P_troglodytes_Oct_2010.2bit
809M ./H_sapiens_Dec_2013/H_sapiens_Dec_2013.2bit - keeping for now because I'm not sure if ours is the latest patch
x 800M ./L_africana_Jul_2009/L_africana_Jul_2009.2bit
x 786M ./F_catus_Sep_2011/F_catus_Dec_2008.2bit
x 778M ./H_sapiens_Feb_2009/H_sapiens_Feb_2009.2bit
x 745M ./B_taurus_Oct_2011/B_taurus_Oct_2011.2bit
x 725M ./R_norvegicus_Mar_2012/R_norvegicus_Mar_2012.2bit
x 718M ./O_aries_Feb_2010/O_aries_Feb_2010.2bit
x 716M ./R_norvegicus_Jul_2014/R_norvegicus_Jul_2014.2bit
x 708M ./M_musculus_Aug_2005/M_musculus_Aug_2005.2bit
x 697M ./C_sabaeus_Mar_2014/C_sabaeus_Mar_2014.2bit
x 682M ./M_musculus_Dec_2011/M_musculus_Dec_2011.2bit
x 681M ./M_musculus_Jun_2020/M_musculus_Jun_2020.2bit
x 680M ./M_musculus_Jul_2007/M_musculus_Jul_2007.2bit
x 676M ./C_porcellus_Feb_2008/C_porcellus_Feb_2008.2bit
x 669M ./B_taurus_Jun_2014/B_taurus_Jun_2014.2bit
x 664M ./M_musculus_Feb_2006/M_musculus_Feb_2006.2bit
x 660M ./R_norvegicus_Nov_2020/R_norvegicus_Nov_2020.2bit
x 617M ./E_caballus_Sep_2007/E_caballus_Sep_2007.2bit
x 577M ./A_melanoleuca_Dec_2009/A_melanoleuca_Dec_2009.2bit
494M ./Z_mays_B73_Mar_2010/Z_mays_B73_Mar_2010.2bit - not in bigzips
x 421M ./D_rerio_May_2017/D_rerio_May_2017.2bit
x 373M ./X_tropicalis_Nov_2009/X_tropicalis_Nov_2009.2bit
370M ./A_albopictus_Jan_2024/A_albopictus_Jan_2024.2bit - not in bigzips
x 357M ./X_tropicalis_Jul_2016/X_tropicalis_Jul_2016.2bit
x 355M ./D_rerio_Jul_2010/D_rerio_Jul_2010.2bit
342M ./A_rostrata_Oct_2016/A_rostrata_Oct_2016.2bit - not in bigzips
318M ./F_albicollis_Jun_2013/F_albicollis_Jun_2013.2bit - not in bigzips
x 296M ./G_gallus_Nov_2011/G_gallus_Nov_2011.2bit
279M ./H_vulgaris_Apr_2024/H_vulgaris_Apr_2024.2bit - not in bigzips
233M ./G_max_Jan_2014/G_max_Jan_2014.2bit - not in bigzips
232M ./G_max_Aug_2010/G_max_Aug_2010.2bit - not in bigzips
214M ./S_bicolor_Jan_2009/S_bicolor_Jan_2009.2bit - not in bigzips
210M ./O_latipes_Jul_2017/O_latipes_Jul_2017.2bit - this *might* be the same as oryLat2 in "bigzips", but oryLat2 is from 2005; not sure where this sequence came from; we need to use the genbank accessions in the contents.txt second column for data obtained from NCBI; check the HEADER links (missing https in ncbi link)
209M ./C_sativa_CBDRx_Dec_2018/C_sativa_CBDRx_Dec_2018.2bit - not in bigzips
198M ./S_lycopersicum_Feb_2017/S_lycopersicum_Feb_2017.2bit - not in bigzips
197M ./S_lycopersicum_Feb_2014/S_lycopersicum_Feb_2014.2bit - not in bigzips
191M ./S_lycopersicum_Jun_2022/S_lycopersicum_Jun_2022.2bit - not in bigzips
187M ./S_lycopersicum_Sep_2019/S_lycopersicum_Sep_2019.2bit - not in bigzips
187M ./S_lycopersicum_May_2012/S_lycopersicum_May_2012.2bit - not in bigzips
173M ./S_tuberosum_DM_Jul_2011/S_tuberosum_DM_Jul_2011.2bit - not in bigzips
173M ./S_bicolor_Jun_2013/S_bicolor_Jun_2013.2bit - not in bigzips
169M ./M_domestica_Borkh_Jun_2017/M_domestica_Borkh_Jun_2017.2bit
155M ./C_sativa_DH55_Apr_2014/C_sativa_DH55_Apr_2014.2bit - not in bigzips
127M ./V_cardui_Feb_2021/V_cardui_Feb_2021.2bit - not in bigzips (apple)
x 123M ./A_mellifera_Jan_2005/A_mellifera_Jan_2005.2bit
116M ./V_vinifera_Mar_2010/V_vinifera_Mar_2010.2bit - not in bigzips (grape)
115M ./P_patens_Jan_2008/P_patens_Jan_2008.2bit - not in bigzips
113M ./P_patens_Jan_2018/P_patens_Jan_2018.2bit - not in bigzips
108M ./M_acuminata_DH_Pahang_Jan_2016/M_acuminata_DH_Pahang_Jan_2016.2bit - not in bigzips (banana)
107M ./T_rubripes_Oct_2011/T_rubripes_Oct_2011.2bit
104M ./P_trichocarpa_Aug_2012/P_trichocarpa_Aug_2012.2bit
100M ./P_trichocarpa_Jan_2010/P_trichocarpa_Jan_2010.2bit
90M ./C_teleta_Jan_2013/C_teleta_Jan_2013.2bit
89M ./O_sativa_japonica_Oct_2011/O_sativa_japonica_Oct_2011.2bit
89M ./O_sativa_japonica_Jun_2009/O_sativa_japonica_Jun_2009.2bit
86M ./N_vectensis_Jun_2007/N_vectensis_Jun_2007.2bit
84M ./R_communis_May_2008/R_communis_May_2008.2bit
82M ./C_papaya_Aug_2010/C_papaya_Aug_2010.2bit
73M ./M_truncatula_Mar_2009/M_truncatula_Mar_2009.2bit
70M ./A_gambiae_Feb_2003/A_gambiae_Feb_2003.2bit
65M ./B_distachyon_Aug_2010/B_distachyon_Aug_2010.2bit
65M ./A_gambiae_Oct_2006/A_gambiae_Oct_2006.2bit
55M ./T_castaneum_Mar_2008/T_castaneum_Mar_2008.2bit
54M ./P_persica_Apr_2010/P_persica_Apr_2010.2bit
49M ./F_vesca_Sep_2011/F_vesca_Sep_2011.2bit
49M ./A_lyrata_Apr_2011/A_lyrata_Apr_2011.2bit
47M ./D_pulex_Feb_2011/D_pulex_Feb_2011.2bit
42M ./D_melanogaster_Apr_2006/D_melanogaster_Apr_2006.2bit
37M ./D_pulex_Sep_2016/D_pulex_Sep_2016.2bit
35M ./D_melanogaster_Aug_2014/D_melanogaster_Aug_2014.2bit
35M ./D_magna_Jan_2019/D_magna_Jan_2019.2bit
34M ./D_melanogaster_Jul_2014/D_melanogaster_Jul_2014.2bit
33M ./V_carteri_Jul_2010/V_carteri_Jul_2010.2bit
32M ./D_magna_Apr_2010/D_magna_Apr_2010.2bit
30M ./T_parvula_May_2012/T_parvula_May_2012.2bit
29M ./H_exemplaris_Z151_Apr_2017/H_exemplaris_Z151_Apr_2017.2bit
29M ./A_thaliana_Jun_2009/A_thaliana_Jun_2009.2bit
29M ./A_thaliana_Apr_2008/A_thaliana_Apr_2008.2bit
27M ./C_reinhardtii_Nov_2010/C_reinhardtii_Nov_2010.2bit
25M ./C_elegans_Oct_2010/C_elegans_Oct_2010.2bit
25M ./C_elegans_May_2008/C_elegans_May_2008.2bit
24M ./C_elegans_Feb_2013/C_elegans_Feb_2013.2bit
15M ./R_varieornatus_YOKOZUNA-1_Nov_2016/R_varieornatus_YOKOZUNA-1_Nov_2016.2bit
8.2M ./D_discoideum_Ax4_May_2009/D_discoideum_Ax4_May_2009.2bit
7.9M ./L_passim_Apr_2014/L_passim_Apr_2014.2bit
6.9M ./C_albicans_SC5314_Jun_2014/C_albicans_SC5314_Jun_2014.2bit
4.9M ./Y_lipolytica_PO1f_May_2014/Y_lipolytica_PO1f_May_2014.2bit
4.9M ./Y_lipolytica_CLIB122_Jul_2004/Y_lipolytica_CLIB122_Jul_2004.2bit
4.8M ./Y_lipolytica_WSH-Z06_Mar_2014/Y_lipolytica_WSH-Z06_Mar_2014.2bit
3.4M ./C_albicans_WO-1_Jul_2010/C_albicans_WO-1_Jul_2010.2bit
3.4M ./C_albicans_WO-1_Dec_2013/C_albicans_WO-1_Dec_2013.2bit
3.4M ./C_albicans_SC5314_Nov_2008/C_albicans_SC5314_Nov_2008.2bit
3.0M ./S_pombe_May_2012/S_pombe_May_2012.2bit
2.9M ./S_cerevisiae_Jun_2008/S_cerevisiae_Jun_2008.2bit
2.9M ./S_cerevisiae_Apr_2011/S_cerevisiae_Apr_2011.2bit
1.1M ./E_coli_K12-MG1655_Jan_2012/E_coli_K12-MG1655_Jan_2012.2bit
512K ./T_kodakaraensis_KOD1_May_2004/T_kodakaraensis_KOD1_May_2004.2bit
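
To see how much of the head revision the "x"-marked files account for, the listing above can be tallied with a short script. This is a rough sketch. It assumes the sizes are in the K/M/G suffix style printed by tools like "du -h", with 1024-based units, and the names `ENTRY`, `TO_MB` and `tally` are made up for illustration. The sample holds only four entries copied from the listing; paste in the whole listing to get real totals.

```python
import re

# One listing entry: optional "x" mark, a size such as 822M or 8.2M, then a path.
ENTRY = re.compile(r"(?<!\S)(?:(x)\s+)?(\d+(?:\.\d+)?)([KMG])\s+(\./\S+\.2bit)")

# Assumption: sizes use 1024-based units, as "du -h" prints them.
TO_MB = {"K": 1 / 1024, "M": 1, "G": 1024}


def tally(listing):
    """Return (removed_mb, kept_mb) summed over every entry in the text."""
    removed = kept = 0.0
    for mark, number, unit, _path in ENTRY.findall(listing):
        size_mb = float(number) * TO_MB[unit]
        if mark:
            removed += size_mb
        else:
            kept += size_mb
    return removed, kept


sample = """
x 822M ./P_troglodytes_Oct_2010/P_troglodytes_Oct_2010.2bit
809M ./H_sapiens_Dec_2013/H_sapiens_Dec_2013.2bit - keeping for now
x 800M ./L_africana_Jul_2009/L_africana_Jul_2009.2bit
512K ./T_kodakaraensis_KOD1_May_2004/T_kodakaraensis_KOD1_May_2004.2bit
"""

removed, kept = tally(sample)
print(f"removed: {removed:.1f} MB, kept: {kept:.1f} MB")
# prints "removed: 1622.0 MB, kept: 809.5 MB"
```

The pattern searches the text as a whole instead of splitting it into lines, so it gives the same totals whether the entries sit one per line or run together on a single line.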