Bioinformatic_Notes
  • Bioinformatic_Notes by HUXI

Blast


Last updated 6 years ago

Pipeline

Data Structure

Learn the common file formats used to store sequences.

FASTA format (.fasta or .fa)

>gi|47115317|emb|CAG28618.1| VIM [Homo sapiens]
MSTRSVSSSSYRRMFGGPGTASRPSSSRSYVTTSTRTYSLGSALRPSTSRSLYASSPGGVYATRSSAVRL

The word following the ">" symbol is the identifier of the sequence, and the rest of the line is an optional description. Identifiers are normally protein accessions, names, or Entrez GIs (e.g., Q5I7T1, AG10B_HUMAN, 129295), but a bar-separated NCBI sequence identifier (e.g., gi|129295) is also accepted. Any arbitrary user-specified sequence identifier can also be used (e.g., CLONE00073452).
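The header rule described above can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for an established parser: the function name `parse_fasta` is hypothetical, and it assumes the header and the sequence sit on separate lines, with the sequence possibly wrapped over several lines.

```python
from typing import Iterator, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (identifier, description, sequence) for each FASTA record."""
    identifier = None
    description = ""
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if identifier is not None:
                yield identifier, description, "".join(chunks)
            # First word after ">" is the identifier; the rest is the
            # optional description.
            header = line[1:].split(maxsplit=1)
            identifier = header[0] if header else ""
            description = header[1] if len(header) > 1 else ""
            chunks = []
        elif identifier is not None:
            # Sequence lines are joined into one string.
            chunks.append(line)
    if identifier is not None:
        yield identifier, description, "".join(chunks)


example = """>gi|47115317|emb|CAG28618.1| VIM [Homo sapiens]
MSTRSVSSSSYRRMFGGPGTASRPSSSRSYVTTSTRTYSLGSALRPSTSRSLYASSPGGVYATRSSAVRL
"""

for seq_id, desc, seq in parse_fasta(example):
    print(seq_id)
    print(desc)
    print(len(seq))
```

For the record shown earlier, the identifier comes out as the bar-separated string `gi|47115317|emb|CAG28618.1|` and the description as `VIM [Homo sapiens]`. Splitting the bar-separated identifier into its individual fields is left out on purpose, since the text only defines the identifier as the first word of the header.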

Blast

Tips/Utilities

Homework and more