Bioinformatics Essentials
33 Flashcards
FASTA Format
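A text-based format for representing nucleotide or peptide sequences, in which each record begins with a '>' header line followed by one or more lines of sequence; widely used for storing sequence data and as input to database searches.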
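To make the format concrete, here is a minimal sketch of a FASTA reader in Python. It assumes the common convention that each record starts with a '>' header line whose first word is the identifier, and that the sequence may wrap over several lines. The function name parse_fasta and the example records are hypothetical, made up for illustration.

```python
from typing import Dict, Iterable, Optional


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {identifier: sequence} mapping."""
    records: Dict[str, str] = {}
    current: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: keep the first word after '>' as the identifier.
            parts = line[1:].split()
            current = parts[0] if parts else ""
            records[current] = ""
        elif current is not None:
            # Sequence line: append to the record opened by the last header.
            records[current] += line
    return records


# Hypothetical two-record example: one nucleotide and one peptide sequence.
example = """>seq1 made-up nucleotide record
ACGTACGT
ACGT
>seq2 made-up peptide record
MKVLA
"""

records = parse_fasta(example.splitlines())
print(records)
```

For real data, an established parser such as the one in Biopython is likely a better choice than this sketch, which does no validation of the sequence alphabet.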
GenBank
A genetic sequence database, maintained by the National Center for Biotechnology Information (NCBI), that provides an annotated collection of all publicly available nucleotide sequences and their protein translations.
BLAST
Basic Local Alignment Search Tool; it is used to find regions of similarity between biological sequences, helping to identify homologous genes and proteins.
Multiple Sequence Alignment
A method for aligning three or more biological sequences so that conserved regions, which are often important for protein or gene function, can be identified.
Hidden Markov Model (HMM)
A statistical model, in which observed symbols are emitted by a sequence of hidden states, used for pattern recognition in sequences; particularly useful in predicting protein domains and gene structures.
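As an illustration of how an HMM can label a sequence, the sketch below runs the Viterbi algorithm (the standard way to find the most probable hidden-state path) on a toy two-state model. Everything about the model is an assumption made for this example: the states "E" (GC-rich) and "I" (AT-rich), and all start, transition, and emission probabilities. Real gene or domain models are far larger and are trained from data.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path for obs (log-space Viterbi)."""
    # Best log-probability of any path ending in each state after the first symbol.
    scores = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    backpointers = []
    for symbol in obs[1:]:
        current, pointers = {}, {}
        for s in states:
            # Pick the predecessor state that maximises the path score into s.
            prev = max(states, key=lambda p: scores[-1][p] + math.log(trans_p[p][s]))
            current[s] = (scores[-1][prev] + math.log(trans_p[prev][s])
                          + math.log(emit_p[s][symbol]))
            pointers[s] = prev
        scores.append(current)
        backpointers.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: scores[-1][s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy model (all numbers are made up for illustration).
states = ["E", "I"]
start_p = {"E": 0.5, "I": 0.5}
trans_p = {"E": {"E": 0.9, "I": 0.1}, "I": {"E": 0.1, "I": 0.9}}
emit_p = {
    "E": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},  # GC-rich state
    "I": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},  # AT-rich state
}

path = viterbi("GGCCAATT", states, start_p, trans_p, emit_p)
print("".join(path))  # EEEEIIII
```

Working in log space avoids numerical underflow on long sequences; the toy parameters must all be non-zero for the logarithms to be defined.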
Phylogenetic Tree
A branching diagram or 'tree' showing the evolutionary relationships among various biological species based on their genetic or physical characteristics.
Microarray Analysis
A process used to assess gene expression levels of thousands of genes simultaneously to study the effects of certain treatments, diseases, and developmental stages on gene expression.
Next Generation Sequencing (NGS)
Encompasses several high-throughput sequencing technologies that allow for rapid sequencing of the whole genome or targeted regions of the genome.
RNA-seq
A sequencing technique which uses next-generation sequencing to reveal the presence and quantity of RNA in a biological sample at a given moment.
Protein Data Bank (PDB)
A database for the three-dimensional structural data of large biological molecules, such as proteins and nucleic acids.
Sequence Homology
The relationship between DNA, RNA, or protein sequences that are derived from a common ancestor, indicating shared evolutionary origins.
Protein Domain
A part of a protein sequence and structure that can evolve, function, and exist independently of the rest of the protein chain.
STRING Database
A database of known and predicted protein–protein interactions, which includes direct (physical) and indirect (functional) associations.
ChIP-sequencing (ChIP-seq)
A method used to analyze protein interactions with DNA; it combines chromatin immunoprecipitation with massively parallel DNA sequencing.
SNP Genotyping
The measurement of genetic variation at single nucleotide polymorphisms (SNPs) between members of a species; it is important for genetic mapping and for associating variants with traits.
Orthologs
Genes in different species that evolved from a common ancestral gene by speciation; they often, though not always, retain the same function in the course of evolution.
Paralogs
Genes related by duplication within a genome that may evolve new functions.
Proteomics
The large-scale study of proteomes and their functions; includes analyzing the structure, function, and interactions of proteins expressed by a genome.
Metagenomics
The study of genetic material recovered directly from environmental samples, bypassing the need for isolating and lab-culturing individual species.
Bioconductor
An open-source software project that provides tools for the analysis and comprehension of high-throughput genomic data in the R statistical programming environment.
KEGG
Kyoto Encyclopedia of Genes and Genomes; a collection of databases dealing with genomes, enzymatic pathways, and biological chemicals; useful for understanding high-level functions of the biological system.
Pairwise Sequence Alignment
The alignment of two sequences, used to find the best-matching local (piecewise) or global alignment between them.
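A global pairwise alignment can be sketched with the Needleman-Wunsch dynamic-programming algorithm. The version below is a minimal illustration, not a production aligner: the scoring values (match +1, mismatch -1, linear gap penalty -2) are arbitrary assumptions, and real tools typically use substitution matrices and affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for a global alignment of a and b."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


result = needleman_wunsch("ACGT", "AGT")
print(result)  # (1, 'ACGT', 'A-GT')
```

In the example, the "-" in the second aligned string marks a gap: the C in the first sequence has no counterpart in the second, which is scored as one gap penalty against three matches.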
De novo Sequencing
The process of sequencing a novel genome where there is no reference sequence available for alignment.
CRISPR
A gene editing tool that allows for precise, directed changes to genomic DNA; it has revolutionized genetics and genomics by enabling genome modifications in living organisms.
MicroRNA (miRNA)
Small non-coding RNA molecules that function in RNA silencing and post-transcriptional regulation of gene expression.
Pfam
A database of protein families, including their alignments and hidden Markov models (HMMs), which can be used to identify and classify protein domains.
GO Terms
Gene Ontology terms form a structured, controlled vocabulary for describing the attributes of genes and gene products across all species.
Read Alignment
The process of aligning short sequencing reads to a reference genome to infer where each read originated; commonly used in re-sequencing.
Gapped Alignment
An extension of sequence alignment that allows for insertions and deletions in one or both of the sequences being compared.
Tandem Repeat
A pattern of one or more nucleotides that is repeated, with the copies directly adjacent to each other; tandem repeats can be used as markers in genetic mapping.
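A simple way to spot exact tandem repeats of a fixed unit length is a regular expression with a backreference. The sketch below is only an illustration under stated assumptions: the function name find_tandem_repeats is hypothetical, the sequence is made up, only the uppercase letters A, C, G, T are considered, and imperfect (mutated) repeats are not detected.

```python
import re


def find_tandem_repeats(seq, unit_len, min_copies=2):
    """Find exact tandem repeats whose repeat unit is unit_len bases long.

    Returns a list of (start_index, unit, copy_count) tuples.
    """
    # ([ACGT]{k}) captures one unit; \1{n,} requires further adjacent copies of it.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)) // unit_len)
        for m in pattern.finditer(seq)
    ]


# Made-up sequence containing the dinucleotide AC repeated four times.
repeats = find_tandem_repeats("TTACACACACGG", unit_len=2)
print(repeats)  # [(2, 'AC', 4)]
```

Note that a run of a single base also matches longer unit lengths (for example, "AAAA" is reported as two copies of "AA" when unit_len is 2), so results for different unit lengths may overlap.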
Ensembl
A genome browser providing access to vertebrate genomes, allowing users to retrieve and visualize genes and other genomic information from many species.
UCSC Genome Browser
An online tool that integrates genomic data, including genomes of various organisms, gene predictions, expression and disease data, and other large-scale sequencing projects.
Quantitative Trait Loci (QTL) Mapping
A statistical method that links certain complex phenotypes to specific regions of chromosomes; used to identify the locations and effects of genes related to those traits.
© Hypatia.Tech. 2024 All rights reserved.