genspace/iGEM-Tardigrade-CRISPR-Analysis


Hypsibius dujardini Tardigrade genome

Tardigrade genome data found here

Target Knockouts

Actin-1 Gene

Hd-act-1 GenBank accession number: CK326228

Found 745 bp of this gene, interrupted by introns, within scaffold 209 of the tardigrade genome.

Scaffold 209 has been saved as its own file for convenience.
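Pulling a single scaffold out of a multi-record FASTA file can be sketched in a few lines of Python. This is only an illustration of the idea, not the procedure actually used for this repository; the function names and the record names in the toy input are assumptions made up for the example.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def extract_record(lines, name):
    """Return the sequence of the first record whose ID equals name.

    The ID is taken to be the first whitespace-delimited word of the header.
    Returns None when no record matches.
    """
    for header, seq in read_fasta(lines):
        if header.split()[:1] == [name]:
            return seq
    return None


# Toy input; these record names are hypothetical, not the real scaffold IDs.
demo = [">scaffold0208", "ACGT", "ACGT", ">scaffold0209 example", "TTGG", "CCAA"]
print(extract_record(demo, "scaffold0209"))  # TTGGCCAA
```

For a real genome file, passing an open file handle as `lines` avoids loading the whole FASTA into memory at once.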

Introns denoted by indent

AAA

TGGTTTTGTGAACTCTCCGCCAGAGCCGGT

AGACATCTTTTCCAAATCAGTCAGTTCCAGCACCGACGCCACCCGGATTCCCCCGTCATCGCATCGCTGTTCGCTATTCATCCCTGCGCTATTTTTCACTTT

AGGTGAGTGAATTTTAATCGCACCGGGAGGATTTGTTGGTGTTTTTTGTAGGGGGAGAGTGTGAAGAGAGGGTTTCTAGCAGGTCGCCAGCGTTGGAATTTTTTTTCGCCCGATTTTCTGAATCTTCGGGAAAGTTTCATTTTTGGGAATTTATTCAGGCATAGGCAGGACACAGGAATTTAATGCCAGGCAGCGGGAGATTAATTTTGTGGTGAGAGAGGGACAGGTCGAGTAAAGTGGTTTGGCTCGGGCCTGTATGGTATGCGCGTTCGTTTCAGGCCTGCATTCTCGTATCGAGCTTGTGAATAACCGTATTTTGGCCTTGAATTAATTATTCTGTCCCCCTCAGGCAGTTCCGTGTGCAAAATTGCTTGACCTCGGCTTACTTTCCGCTCAATCGATAGTCAGTAGCCCCATCTTGGAACCTAAAGTGAGGTTCTGAAATTTCAGTCCAGGATTTCAAGAAGCAAATTAAAAAGGCGTCATTTTGTGTGCGGTCTGTTGGGCAGAGCGATTGATTTCTATTTACTGTGCATCATTCAGGCTTTTTGAGTGGTGTGGGGGAGTTTCCCCCACTTCAGAGGTCGGTTAAAGAACTCCATTTAACATTTTTCGGACCCCTTTTTTCTTGAGAGATTCAAGTGGATGGTCAGACGTTCGCTCGAATTAATTTGCATTCAATAAATTCACGAATAGATTCTATGTTTGTCTGAAAAAATTCTCCCATTTCATTTTTCATTAGTACTAGTAAATCCTTCATTACTGAATGGTAACAACGCTCTTTTCATTCATTTCATTTCTAACTTCATTGTTAGTAATTTTTAAACACAGAAAAATTTCTTTGTATGCTACTTCGTCGGCCATGTCGGAGGGCGTCAAAGATGACACATCGACGACGCTCTCTTTCAGAACCTTGTGTGTATTTGCCTCCCGCTAGGTGGCGGTATCGGTCACTGGGCCCTTGAATCAGGTGCAGGCGGCGTAAGGCGTTTCTCTCTCCTCCACGTTCCCAGGCGTTGCTTTGCATTTCGAAATATATATCGAGCAAAATCTCCGTTTTTCCTTTAAAAGTTTCAAATTCTCCTCCCCCCATTTTCGTTCATTCCACCACATCGCTAGCCAACGCCTCTCCAGTTTTCGGGACAAATAAGGCATGTGTCTGTTAATATCTATGTGTATATGTAACACTTCAGCAGCTGAGGAAAGGGATATCTCTCTGAGGGTGTTAAAAATAACATTTTTCCCCTATCTCAACGCACGACGCGGGCCATGTGTGGTTCTGGAGCGGTTGCGCGGATTACTTAACTTGGTCATCGCTTCTGAGTGTGCGTGTCTGTGAGGGGAGAAGATATAGCGCGCGGACCGGAGATGCCGAAGAGTGAGATCTAATTTTGGAGTGAGACAGACAGACACGACACACACATAGGATGCTGTAGCGTGGCGTGCACAGTGAAAAAGTTACACTCGCGCTTGCCGGGCGCGCGGAAGAGGGCGTCAGAGGCGGCGGTGTACTAGTCGCTCTCTCGCCTAGTCTCTCATCGCCATTTAAGGCCTGTGTGTGCTCTTGTGCTGTGTGTGTGTGTGTGTGTACTAGTATGACTTTGACTGTCAGTCTCTGGGCTCATCCATCAGTCATCCATCCTGCCATTCCAATCCGCTGGGGCCCCTCTCTCATACTCTCTTTAGTTCGGCATTATCAGTTTCCGTTCATCCGGTCAACTTTTATTNNNNNNNNNNNNNNNNNNNGTCAGAGGCGGCGGTGTACTAGTCGCTCTCTCGCCTAGTCTCTCATCGCCATTTAAGGCCTGTGTGTGCTCTTGTGCTGTGTGCGTGTGTACTAGTATGACTTTGACTGTCAGTCTGTGGGCTCATCCATCAGTCATCCATCCTGCCATTCCAATCCGCTGGGGCCCCTCTCTCATACTCTCTTTAGTTCGGCATTATCAGTTTCCGTTCATCCGGTCAACTTTTATTATTTATCAGTTTGAAGAAGCTTTTCC
GCTAACGTGTCTGTTCTCTCTCCCCGTTGCAGTTTCGCGTAATCCCTCAGAACAGTCGCAATGGAAGACGAAGTTGCCGCCTTGGTCGTGGACAATGGATCCGGTATGTGCAAGGCCGGATTTGCCGGAGATGACGCTCCCCGCGCCGTCTTCCCCTCCNTCATCCATCAGTCATCCATCCTGCCATTCCAATCCGCTGGGGCCCCTCTCTCATACTCTCTTTAGTTCGGCATTATCAGTTTCCGTTCATCCGGTCAACTTTTATTTATCAGTTTGAAGAAGCTTTTCCGCTAACGTGTCTGTTCTCTCCCCGTTGC

AGTTTCGCGTAATCCCTCAGAACAGTCGCAATGGAAGACGAAGTTGCCGCCTTGGTCGTGGACAATGGATCCGGTATGTGCAAGGCCGGATTTGCCGGAGATGACGCTCCCCGCGCCGTCTTCCCCTCCATCGTTGGCCGACCCCGTCATC

AGGTATGTCTGGTTTACACTAGCACTTGGAAGTCAACTGTAGGCAGACAGACACGAAGGATCAGACAATCAAGGTACACAGACAACCCGCTTTGACAATGCAGCCAGGCAGGCATGAAAGTACAGACAGTCAAGGCAGACGGACCAGACAGACGTACCGCTTACGTTGCAGACGAACGAAGGTGGTGACAGTAGTATATGCAGACAGACAGACACGGTTCAGACAAGGCGTAGATGATGGTGTCGGATGTGTTCGGTGCCGGTCGGCCACTGCTGGGTGTGACTGATTGACATTTGCCTCCGTGTTACCTTGT

AGGGTGTCATGGTCGGTATGGGTCAAAAGGACAGCTACGTCGGTGATGAGGCCCAGAGCAAGCGCGGTATCCTGACGCTCAAGTACCCCATCGAGCACGGCATCGTCACCAACTGGGATGACATGGAGAAGATCTGGCATCACACCTTCTACAACGAGCTCCGCGTGGCTCCCGAGGAACACCCCGTCCTCCTGACTGAGGCTCCCCTCAACCCCAAGGCCAACAGGGAAAAGATGACCCAG

GTATGAACCATGAACTCACTCGCCCTGCTCGAACCTCAGATGTCTGTGGCCGAATCGACTATGGACATGCCTGTGATTTGGATGATGAATTCTCACTGACTGGGTTGTTCTATTTTGCAG

ATCATGTTCGAGACATTCAACACCCCCGCCATGTACGTCGCGATTCAGGCTGTGCTCTCCCTGTACGCGTCCGGTCGTACCACCGGTATCGTGCTGGACTCTGGTGATGGTGTCTCCCACACTGTCCCCATCTACGAAGGTTATGCTCTGCCTCACGCCATCCTCCGTCTGGATTTGGCCGGTCGCGACTTGACTGACTACTTGATGAAGATCCTGACTGAGCGCGGT

T

ACAGCTTCGTCACAAC

CGG TGC GTG ACA TAG GCC

GCT

Mago Nashi Gene (incomplete)

Hd-mag-1 GenBank accession number: CK326599

Found the following 508 bp of the 586 bp of this gene within scaffold 20 of the tardigrade genome.

AGGAAAACGAAAATGACGGAGGACCGTTTCTACGTGCGATACTACGTCGGCCATAAAGGCAAGTTCGGGCATGAATTTTTGGAATTCGAATTTCGCCCGGACGGTCGCCTGCGTTACGCCAACAATTCCAACTACAAGAACGACACGATGATCCGCAAGGAGGTGGTCGTCCATCCGGCTATCTTGGAAGAGGTGAAGCGCATCATTCAGGACAGTGAGATTCTCCGGGAGAACGACGCCAAGTGGCCGCAGCCGGACCGCGTGGGCCGGCAGGAGTTAGAGATTCTGCTGGACGACGAGCACATCTCCTTCAACACGGGAAAGATCGGCTCGCTGATGGACGTCAACAACAGCCCCGATCCAGAAGGTCTGCGTTGCTTCTACTACCTCGTGCAGGATCTCAAGTGTCTCGTCTTCTCCCTCATTGCGCTGCACTTTAAAATCAAGCCGATTTGAACGCTTGTAATCAACCAATCAAGCGC

T

AAGATTTGCACTAGCACCCTCGTCGAC

Note: the T above should be a G according to the Hd-mag-1.fa file; the mismatch is possibly due to a read error.
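A single-base discrepancy like the one noted above can be located by comparing the two sequences position by position. The following is a minimal sketch under the assumption that both sequences are already aligned and of equal length; the function name and the toy sequences are hypothetical and are not taken from the repository files.

```python
def mismatches(reference, observed):
    """Return (1-based position, reference base, observed base) for each difference."""
    if len(reference) != len(observed):
        raise ValueError("sequences must be the same length")
    pairs = zip(reference.upper(), observed.upper())
    return [(i, r, o) for i, (r, o) in enumerate(pairs, start=1) if r != o]


# Toy sequences differing at one position (G in the reference, T observed).
print(mismatches("AAGCGCGAAG", "AAGCGCTAAG"))  # [(7, 'G', 'T')]
```

Sequences of different lengths would need a proper alignment first, which this sketch deliberately does not attempt.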

Scaffold 20 has been saved as its own file for convenience. The 508 bp region is also highlighted in a scaffold 20 SnapGene file.

Methodology

Please see iGEM Tardigrade CRISPR Analysis notebook for gRNA identification.

ngg2

The ngg2 Python script produced by UW's Roberson Lab was used to identify PAM sites within the Hypsibius dujardini tardigrade genome and to test each site for uniqueness.

Running this:

ngg2 --outputFile pam_sites.csv nHd.2.3.abv500.fna

Produced a file, pam_sites.csv, of all possible PAM sites within the nHd.2.3.abv500.fna genome, as well as the following output:

2016-09-11 15:00:50,294 - ngg2 - INFO - ngg2 vv1.3.0

Options
=======
FASTA: nHd.2.3.abv500.fna
Output file: pam_sites.csv
Target: All contigs
Allow non-canonical starts?: False
Max G-bases per site: 15
Scan type: Exhaustive
Buffered scan: Yes
Test site uniqueness: Yes
Only unique sites: No
Processes: 1
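To make the idea of a PAM-site scan concrete, here is a minimal Python sketch that reports every 20-nt protospacer followed by an NGG PAM on both strands of a sequence. It is not ngg2's implementation and does not reproduce its options (uniqueness testing, G-base limits, buffering); the function names, the 20-nt protospacer length, and the output tuple layout are assumptions chosen for illustration.

```python
import re

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_ngg_sites(seq, protospacer_len=20):
    """Return (start, strand, protospacer, pam) for every NGG PAM site.

    start is the 0-based forward-strand index of the leftmost base covered
    by the site (protospacer plus PAM). Sites containing N are skipped.
    """
    seq = seq.upper()
    site_len = protospacer_len + 3
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    sites = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            start = match.start()
            if strand == "-":
                # Convert reverse-strand coordinates back to the forward strand.
                start = len(seq) - (match.start() + site_len)
            sites.append((start, strand, match.group(1), match.group(2)))
    return sites


# Toy sequence with exactly one site on the forward strand.
demo = "A" * 20 + "TGG"
for site in find_ngg_sites(demo):
    print(site)  # (0, '+', 'AAAAAAAAAAAAAAAAAAAA', 'TGG')
```

A genome-scale scan would also need to check each protospacer for uniqueness across all scaffolds, which is the part a dedicated tool such as ngg2 handles.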

CRISPRseek

CRISPRseek Documentation

Unfortunately, we were unable to get CRISPRseek to work. Please refer to the crispr_seek.R script, which is based on the following Bioconductor script. Running it resulted in the following errors:

Running this:

results <- offTargetAnalysis(inputFilePath, findgRNAsWithREcutOnly = TRUE, REpatternFile = REpatternFile, findPairedgRNAOnly = TRUE, chromToSearch = "", outputDir = outputDir, overwrite = TRUE)

Produced the following output:

Error in offTargetAnalysis(inputFilePath, findgRNAsWithREcutOnly = TRUE,  :
  Please specify an REpattern file as fasta file with
            restriction enzyme recognition sequences!

Similarly, running this:

results <- offTargetAnalysis('nHd.2.3.abv500.fna', findgRNAsWithREcutOnly = TRUE, REpatternFile = 'HD-act-1.fa', findPairedgRNAOnly = TRUE, chromToSearch = "", outputDir = outputDir, overwrite = TRUE)

Produced the following output:

Error in fromXStringViewsToStringSet(x, out.of.limits = out.of.limits,  :
  'x' has "out of limits" views

About

Designing gRNAs to knock out genes within the tardigrade genome
