Bioinformatics
STAT115-2020
Lect1.2 Big Data Challenge

Sequencing in 2001

Genome Center for Sequencing in 2001:

| Aspect | Production | Sequencing |
| --- | --- | --- |
| Location | | |
| - Rooms of equipment | Yes | |
| - Sample preparation | Yes | |
| Number of people | 35 | 10 |
| Duration (weeks) | 3-4 | |
| Sequencing details | | |
| - Capillary sequencers | | 74x |
| - Runs per day | | 15-40 |
| - Mb per instrument per day | | 1-2 |
| - Total capacity per day (Mb) | | 120 |

Sequencing in 2007

| Aspect | Production | Sequencing |
| --- | --- | --- |
| Equipment | 1x cluster station | 1x Genome Analyzer |
| Number of people | 1 | Same person |
| Duration (days) | 1 | 3-5 days per run |
| Data output per day (Gb) | Not applicable | 0.5 Gb per instrument |
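
The 2001 and 2007 figures can be checked against each other with a little arithmetic. The sketch below is only an illustration: it assumes that "74x" means 74 capillary instruments and that 1 Gb = 1000 Mb, and the function and variable names are made up for this example, not taken from the lecture.

```python
# Figures copied from the 2001 and 2007 tables above.
CAPILLARY_INSTRUMENTS_2001 = 74          # assumption: "74x" is an instrument count
MB_PER_INSTRUMENT_PER_DAY_2001 = (1, 2)  # reported range, in Mb
REPORTED_CAPACITY_MB_2001 = 120          # whole center, Mb per day
MB_PER_INSTRUMENT_PER_DAY_2007 = 500     # 0.5 Gb, assuming 1 Gb = 1000 Mb


def capacity_range_mb(instruments, per_instrument_range):
    """Return the (low, high) daily capacity in Mb for a set of instruments."""
    low, high = per_instrument_range
    return instruments * low, instruments * high


low_2001, high_2001 = capacity_range_mb(
    CAPILLARY_INSTRUMENTS_2001, MB_PER_INSTRUMENT_PER_DAY_2001
)

# How one 2007 instrument compares with the entire 2001 center.
ratio = MB_PER_INSTRUMENT_PER_DAY_2007 / REPORTED_CAPACITY_MB_2001

print(f"2001 center capacity: {low_2001}-{high_2001} Mb/day "
      f"(reported: {REPORTED_CAPACITY_MB_2001} Mb/day)")
print(f"One 2007 instrument vs. the whole 2001 center: {ratio:.1f}x")
```

Under these assumptions the reported 120 Mb per day falls inside the range implied by 74 instruments, and a single 2007 instrument run by one person produces roughly four times the daily output of the whole 2001 center.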

A few sequencing machines used for this:

  • iSeq 100
  • MiniSeq
  • MiSeq Series
  • NextSeq 550 Series
  • NextSeq 2000
  • NovaSeq 6000

Initially, sequencing the human genome cost about $3 billion over 13 years (the Human Genome Project). In 2001 a genome would have cost about $100 million; now it can cost a couple of thousand dollars.

Every time you sequence a genome, you get terabytes and terabytes of data. Algorithms are needed to process it quickly, because keeping the raw data takes a lot of storage space. For mutation-calling algorithms, about 99 percent of our genome is the same from one person to the next, so there is no need to save the similar sequence; it is enough to store where a genome differs.
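
The idea of saving only what differs can be sketched in a few lines. This is a toy illustration, not the method used in the lecture or by any particular tool: it assumes the sample is already aligned to a reference of the same length and handles single-base substitutions only (no insertions or deletions). The function names and the two short sequences are hypothetical. Real pipelines rely on dedicated formats such as VCF or CRAM for this.

```python
def diff_against_reference(reference, sample):
    """Record (position, base) for every site where the sample differs."""
    if len(reference) != len(sample):
        raise ValueError("this toy sketch only handles equal-length sequences")
    return [
        (position, sample_base)
        for position, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


def rebuild_from_reference(reference, differences):
    """Recover the full sample sequence from the reference plus its differences."""
    bases = list(reference)
    for position, base in differences:
        bases[position] = base
    return "".join(bases)


# Hypothetical 20-base sequences, for illustration only.
reference = "ACGTACGTACGTACGTACGT"
sample = "ACGTACGTACCTACGTACGA"

differences = diff_against_reference(reference, sample)
print(differences)  # [(10, 'C'), (19, 'A')]

# The sample can be reconstructed exactly, so nothing is lost.
assert rebuild_from_reference(reference, differences) == sample
```

Here two small records stand in for the whole 20-base sequence. The saving grows with the length of the genome and with how similar it is to the reference.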

Example: Angelina Jolie's preventive double mastectomy, prompted by an inherited risk of developing cancer. Sequencing can also help estimate how much Asian or African ancestry you have, which can be associated with an inherited risk of certain diseases.