Tardigrade genome data found here
Hd-act-1 GenBank accession number: CK326228
Found 745 bp of this gene, interrupted by introns, within scaffold 209 of the tardigrade genome.
Scaffold 209 has been saved as its own file for convenience.
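Pulling a single scaffold out of a multi-record FASTA file can be sketched in plain Python as shown here. This is only an illustration of the idea, not necessarily how the scaffold file was produced. The record IDs used in the demo (`scaffold_1`, `scaffold_2`) are hypothetical; the real header names in the genome file may differ.

```python
import io
from typing import Iterator, Optional, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def extract_record(handle: TextIO, record_id: str) -> Optional[str]:
    """Return the sequence whose header's first word equals record_id, else None."""
    for header, seq in read_fasta(handle):
        if header.split()[:1] == [record_id]:
            return seq
    return None


def write_fasta(handle: TextIO, header: str, seq: str, width: int = 60) -> None:
    """Write one record, wrapping the sequence at `width` characters per line."""
    handle.write(f">{header}\n")
    for i in range(0, len(seq), width):
        handle.write(seq[i:i + width] + "\n")


# Toy demonstration with in-memory data (hypothetical record names).
demo = io.StringIO(">scaffold_1 len=4\nACGT\n>scaffold_2\nGG\nCC\n")
scaffold = extract_record(demo, "scaffold_2")
out = io.StringIO()
write_fasta(out, "scaffold_2", scaffold)
```

With real files, the two `io.StringIO` objects would be replaced by `open(...)` handles on the genome FASTA and on the new single-scaffold file.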
Introns denoted by indent
AAA
TGGTTTTGTGAACTCTCCGCCAGAGCCGGT
AGACATCTTTTCCAAATCAGTCAGTTCCAGCACCGACGCCACCCGGATTCCCCCGTCATCGCATCGCTGTTCGCTATTCATCCCTGCGCTATTTTTCACTTT
AGGTGAGTGAATTTTAATCGCACCGGGAGGATTTGTTGGTGTTTTTTGTAGGGGGAGAGTGTGAAGAGAGGGTTTCTAGCAGGTCGCCAGCGTTGGAATTTTTTTTCGCCCGATTTTCTGAATCTTCGGGAAAGTTTCATTTTTGGGAATTTATTCAGGCATAGGCAGGACACAGGAATTTAATGCCAGGCAGCGGGAGATTAATTTTGTGGTGAGAGAGGGACAGGTCGAGTAAAGTGGTTTGGCTCGGGCCTGTATGGTATGCGCGTTCGTTTCAGGCCTGCATTCTCGTATCGAGCTTGTGAATAACCGTATTTTGGCCTTGAATTAATTATTCTGTCCCCCTCAGGCAGTTCCGTGTGCAAAATTGCTTGACCTCGGCTTACTTTCCGCTCAATCGATAGTCAGTAGCCCCATCTTGGAACCTAAAGTGAGGTTCTGAAATTTCAGTCCAGGATTTCAAGAAGCAAATTAAAAAGGCGTCATTTTGTGTGCGGTCTGTTGGGCAGAGCGATTGATTTCTATTTACTGTGCATCATTCAGGCTTTTTGAGTGGTGTGGGGGAGTTTCCCCCACTTCAGAGGTCGGTTAAAGAACTCCATTTAACATTTTTCGGACCCCTTTTTTCTTGAGAGATTCAAGTGGATGGTCAGACGTTCGCTCGAATTAATTTGCATTCAATAAATTCACGAATAGATTCTATGTTTGTCTGAAAAAATTCTCCCATTTCATTTTTCATTAGTACTAGTAAATCCTTCATTACTGAATGGTAACAACGCTCTTTTCATTCATTTCATTTCTAACTTCATTGTTAGTAATTTTTAAACACAGAAAAATTTCTTTGTATGCTACTTCGTCGGCCATGTCGGAGGGCGTCAAAGATGACACATCGACGACGCTCTCTTTCAGAACCTTGTGTGTATTTGCCTCCCGCTAGGTGGCGGTATCGGTCACTGGGCCCTTGAATCAGGTGCAGGCGGCGTAAGGCGTTTCTCTCTCCTCCACGTTCCCAGGCGTTGCTTTGCATTTCGAAATATATATCGAGCAAAATCTCCGTTTTTCCTTTAAAAGTTTCAAATTCTCCTCCCCCCATTTTCGTTCATTCCACCACATCGCTAGCCAACGCCTCTCCAGTTTTCGGGACAAATAAGGCATGTGTCTGTTAATATCTATGTGTATATGTAACACTTCAGCAGCTGAGGAAAGGGATATCTCTCTGAGGGTGTTAAAAATAACATTTTTCCCCTATCTCAACGCACGACGCGGGCCATGTGTGGTTCTGGAGCGGTTGCGCGGATTACTTAACTTGGTCATCGCTTCTGAGTGTGCGTGTCTGTGAGGGGAGAAGATATAGCGCGCGGACCGGAGATGCCGAAGAGTGAGATCTAATTTTGGAGTGAGACAGACAGACACGACACACACATAGGATGCTGTAGCGTGGCGTGCACAGTGAAAAAGTTACACTCGCGCTTGCCGGGCGCGCGGAAGAGGGCGTCAGAGGCGGCGGTGTACTAGTCGCTCTCTCGCCTAGTCTCTCATCGCCATTTAAGGCCTGTGTGTGCTCTTGTGCTGTGTGTGTGTGTGTGTGTACTAGTATGACTTTGACTGTCAGTCTCTGGGCTCATCCATCAGTCATCCATCCTGCCATTCCAATCCGCTGGGGCCCCTCTCTCATACTCTCTTTAGTTCGGCATTATCAGTTTCCGTTCATCCGGTCAACTTTTATTNNNNNNNNNNNNNNNNNNNGTCAGAGGCGGCGGTGTACTAGTCGCTCTCTCGCCTAGTCTCTCATCGCCATTTAAGGCCTGTGTGTGCTCTTGTGCTGTGTGCGTGTGTACTAGTATGACTTTGACTGTCAGTCTGTGGGCTCATCCATCAGTCATCCATCCTGCCATTCCAATCCGCTGGGGCCCCTCTCTCATACTCTCTTTAGTTCGGCATTATCAGTTTCCGTTCATCCGGTCAACTTTTATTATTTATCAGTTTGAAGAAGCTTTTCC
GCTAACGTGTCTGTTCTCTCTCCCCGTTGCAGTTTCGCGTAATCCCTCAGAACAGTCGCAATGGAAGACGAAGTTGCCGCCTTGGTCGTGGACAATGGATCCGGTATGTGCAAGGCCGGATTTGCCGGAGATGACGCTCCCCGCGCCGTCTTCCCCTCCNTCATCCATCAGTCATCCATCCTGCCATTCCAATCCGCTGGGGCCCCTCTCTCATACTCTCTTTAGTTCGGCATTATCAGTTTCCGTTCATCCGGTCAACTTTTATTTATCAGTTTGAAGAAGCTTTTCCGCTAACGTGTCTGTTCTCTCCCCGTTGC
AGTTTCGCGTAATCCCTCAGAACAGTCGCAATGGAAGACGAAGTTGCCGCCTTGGTCGTGGACAATGGATCCGGTATGTGCAAGGCCGGATTTGCCGGAGATGACGCTCCCCGCGCCGTCTTCCCCTCCATCGTTGGCCGACCCCGTCATC
AGGTATGTCTGGTTTACACTAGCACTTGGAAGTCAACTGTAGGCAGACAGACACGAAGGATCAGACAATCAAGGTACACAGACAACCCGCTTTGACAATGCAGCCAGGCAGGCATGAAAGTACAGACAGTCAAGGCAGACGGACCAGACAGACGTACCGCTTACGTTGCAGACGAACGAAGGTGGTGACAGTAGTATATGCAGACAGACAGACACGGTTCAGACAAGGCGTAGATGATGGTGTCGGATGTGTTCGGTGCCGGTCGGCCACTGCTGGGTGTGACTGATTGACATTTGCCTCCGTGTTACCTTGT
AGGGTGTCATGGTCGGTATGGGTCAAAAGGACAGCTACGTCGGTGATGAGGCCCAGAGCAAGCGCGGTATCCTGACGCTCAAGTACCCCATCGAGCACGGCATCGTCACCAACTGGGATGACATGGAGAAGATCTGGCATCACACCTTCTACAACGAGCTCCGCGTGGCTCCCGAGGAACACCCCGTCCTCCTGACTGAGGCTCCCCTCAACCCCAAGGCCAACAGGGAAAAGATGACCCAG
GTATGAACCATGAACTCACTCGCCCTGCTCGAACCTCAGATGTCTGTGGCCGAATCGACTATGGACATGCCTGTGATTTGGATGATGAATTCTCACTGACTGGGTTGTTCTATTTTGCAG
ATCATGTTCGAGACATTCAACACCCCCGCCATGTACGTCGCGATTCAGGCTGTGCTCTCCCTGTACGCGTCCGGTCGTACCACCGGTATCGTGCTGGACTCTGGTGATGGTGTCTCCCACACTGTCCCCATCTACGAAGGTTATGCTCTGCCTCACGCCATCCTCCGTCTGGATTTGGCCGGTCGCGACTTGACTGACTACTTGATGAAGATCCTGACTGAGCGCGGT
T
ACAGCTTCGTCACAAC
CGG TGC GTG ACA TAG GCC
GCT
Hd-mag-1 GenBank accession number: CK326599
Found the following 508 bp of the 586 bp of this gene within scaffold 20 of the tardigrade genome.
AGGAAAACGAAAATGACGGAGGACCGTTTCTACGTGCGATACTACGTCGGCCATAAAGGCAAGTTCGGGCATGAATTTTTGGAATTCGAATTTCGCCCGGACGGTCGCCTGCGTTACGCCAACAATTCCAACTACAAGAACGACACGATGATCCGCAAGGAGGTGGTCGTCCATCCGGCTATCTTGGAAGAGGTGAAGCGCATCATTCAGGACAGTGAGATTCTCCGGGAGAACGACGCCAAGTGGCCGCAGCCGGACCGCGTGGGCCGGCAGGAGTTAGAGATTCTGCTGGACGACGAGCACATCTCCTTCAACACGGGAAAGATCGGCTCGCTGATGGACGTCAACAACAGCCCCGATCCAGAAGGTCTGCGTTGCTTCTACTACCTCGTGCAGGATCTCAAGTGTCTCGTCTTCTCCCTCATTGCGCTGCACTTTAAAATCAAGCCGATTTGAACGCTTGTAATCAACCAATCAAGCGC
T
AAGATTTGCACTAGCACCCTCGTCGAC
Note: the T should actually be a G according to the Hd-mag-1.fa file. This is possibly due to a read error.
Scaffold 20 has been saved as its own file for convenience, and the 508 bp region has been colored within a scaffold 20 SnapGene file.
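A mismatch like this can be located with a simple position-by-position comparison. The following is a minimal sketch, assuming the two sequences are already aligned and of equal length (it does not handle insertions or deletions). The short test strings are the bases flanking the mismatch in the scaffold 20 sequence above, with G substituted in the reference as the note describes; the function name is illustrative only.

```python
from typing import List, Tuple


def mismatches(reference: str, observed: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, reference base, observed base) for each difference."""
    if len(reference) != len(observed):
        raise ValueError("sequences must be the same length (no indel handling)")
    pairs = zip(reference.upper(), observed.upper())
    return [(i, r, o) for i, (r, o) in enumerate(pairs, start=1) if r != o]


# Bases around the mismatch: reference (G) versus scaffold 20 (T).
reference = "CAAGCGCGAAGATTTGC"
scaffold = "CAAGCGCTAAGATTTGC"
diffs = mismatches(reference, scaffold)
print(diffs)  # [(8, 'G', 'T')]
```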
Please see iGEM Tardigrade CRISPR Analysis notebook for gRNA identification.
The ngg2 Python script produced by UW's Roberson Lab was used to identify all PAM sites, and to test each for uniqueness, within the Hypsibius dujardini tardigrade genome.
Running this:
ngg2 --outputFile pam_sites.csv nHd.2.3.abv500.fna
Produced a file, pam_sites.csv, of all possible PAM sites within the nHd.2.3.abv500.fna genome, as well as the following output:
2016-09-11 15:00:50,294 - ngg2 - INFO - ngg2 vv1.3.0
Options
=======
FASTA: nHd.2.3.abv500.fna
Output file: pam_sites.csv
Target: All contigs
Allow non-canonical starts?: False
Max G-bases per site: 15
Scan type: Exhaustive
Buffered scan: Yes
Test site uniqueness: Yes
Only unique sites: No
Processes: 1
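To illustrate what a PAM site scan involves, here is a minimal Python sketch that reports every 20-nt protospacer followed by an NGG PAM on both strands of a sequence. It is not ngg2's implementation and does not reproduce its options (canonical starts, the G-base limit, or the uniqueness test); the function names and the returned tuple layout are assumptions made for this example.

```python
import re
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

# Zero-width lookahead so that overlapping sites are all reported.
_SITE = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")

SITE_LEN = 23  # 20-nt protospacer + 3-nt PAM


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_pam_sites(seq: str) -> List[Tuple[int, str, str, str]]:
    """Return (start, strand, protospacer, pam) tuples.

    `start` is the 0-based forward-strand index of the leftmost base
    of the 23-nt site, whichever strand the site lies on.
    """
    seq = seq.upper()
    n = len(seq)
    sites = []
    for m in _SITE.finditer(seq):
        sites.append((m.start(), "+", m.group(1), m.group(2)))
    for m in _SITE.finditer(reverse_complement(seq)):
        # Convert the reverse-strand offset back to forward coordinates.
        sites.append((n - m.start() - SITE_LEN, "-", m.group(1), m.group(2)))
    return sorted(sites)


# Toy input: 20 A's followed by a TGG PAM.
plus_hits = find_pam_sites("A" * 20 + "TGG")
print(plus_hits)
```

Sites containing N or other ambiguous bases are skipped by this sketch, since the pattern only accepts A, C, G and T.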
Unfortunately, we were unable to get CRISPRseek to work. Please refer to the crispr_seek.R script, which is based on the following Bioconductor script. Running it resulted in the following errors:
Running this:
results <- offTargetAnalysis(inputFilePath, findgRNAsWithREcutOnly = TRUE, REpatternFile = REpatternFile, findPairedgRNAOnly = TRUE, chromToSearch = "", outputDir = outputDir, overwrite = TRUE)
Produced the following output:
Error in offTargetAnalysis(inputFilePath, findgRNAsWithREcutOnly = TRUE, :
Please specify an REpattern file as fasta file with
restriction enzyme recognition sequences!
Similarly running this:
results <- offTargetAnalysis('nHd.2.3.abv500.fna', findgRNAsWithREcutOnly = TRUE, REpatternFile = 'HD-act-1.fa', findPairedgRNAOnly = TRUE, chromToSearch = "", outputDir = outputDir, overwrite = TRUE)
Produced the following output:
Error in fromXStringViewsToStringSet(x, out.of.limits = out.of.limits, :
'x' has "out of limits" views