Bioinformatics Essentials
Bioconductor
An open-source software project that provides tools for the analysis and comprehension of high-throughput genomic data in the R statistical programming environment.
KEGG
Kyoto Encyclopedia of Genes and Genomes; a collection of databases dealing with genomes, enzymatic pathways, and biological chemicals; useful for understanding high-level functions of the biological system.
CRISPR
A gene editing tool that allows for precise, directed changes to genomic DNA; revolutionized fields of genetics and genomics by enabling genome modifications in living organisms.
STRING Database
A database of known and predicted protein–protein interactions, which includes direct (physical) and indirect (functional) associations.
Pairwise Sequence Alignment
The alignment of two sequences against each other, used to find either the best-matching local alignment (restricted to the most similar regions) or the best global alignment (spanning both sequences end to end).
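To make the local/global distinction concrete, here is a minimal dynamic-programming sketch that scores two sequences either way. The function name `alignment_score` and the scoring values (match +1, mismatch -1, gap -2) are illustrative assumptions, not values prescribed by this card; real tools use substitution matrices and tuned gap penalties.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-2, local=False):
    """Return the best global score, or the best local score if local=True.

    Scoring values are illustrative assumptions, not a standard.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment: leading gaps are penalised.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            cell = max(diag, up, left)
            if local:
                # Local alignment: a cell never drops below zero,
                # so an alignment can start anywhere.
                cell = max(cell, 0)
                best = max(best, cell)
            score[i][j] = cell
    return best if local else score[-1][-1]


print(alignment_score("ACGT", "AGT"))                      # global score
print(alignment_score("TTACGTTT", "GGACGGG", local=True))  # local score
```

The global mode charges for every unaligned position, while the local mode only rewards the best-matching region and ignores the flanks.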
De novo Sequencing
The process of sequencing a novel genome where there is no reference sequence available for alignment.
Pfam
A database of protein families, including their alignments and hidden Markov models (HMMs), which can be used to identify and classify protein domains.
Microarray Analysis
A process used to assess gene expression levels of thousands of genes simultaneously to study the effects of certain treatments, diseases, and developmental stages on gene expression.
RNA-seq
A sequencing technique which uses next-generation sequencing to reveal the presence and quantity of RNA in a biological sample at a given moment.
Paralogs
Genes related by duplication within a genome that may evolve new functions.
Orthologs
Genes in different species that evolved from a common ancestral gene by speciation; they typically, though not always, retain the same function over the course of evolution.
Gapped Alignment
An extension of sequence alignment that allows for insertions and deletions in one or both of the sequences being compared.
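As a rough illustration of how gaps appear in an alignment, the sketch below fills a global alignment table and traces back through it, writing `-` wherever an insertion or deletion is chosen. The name `global_align` and the scores (match +1, mismatch -1, gap -2) are assumptions for this example only; it uses a simple linear gap penalty rather than the affine penalties common in practice.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Return one optimal gapped global alignment of a and b as two strings."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b (deletion)
                score[i][j - 1] + gap,            # gap in a (insertion)
            )

    # Trace back from the bottom-right corner, preferring the diagonal move.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


top, bottom = global_align("ACGT", "AGT")
print(top)
print(bottom)
```

Removing the `-` characters from each returned string gives back the original sequences, which is a handy sanity check.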
Quantitative Trait Loci (QTL) Mapping
A statistical method that links certain complex phenotypes to specific regions of chromosomes; used to identify the locations and effects of genes related to those traits.
Phylogenetic Tree
A branching diagram or 'tree' showing the evolutionary relationships among various biological species based on their genetic or physical characteristics.
Read Alignment
The process of aligning short sequencing reads to a reference genome to infer where each read originated; a core step in re-sequencing projects.
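The idea of placing reads on a reference can be sketched with plain exact string matching. This is a deliberately naive illustration: the helper `map_reads` is hypothetical, it tolerates no mismatches or indels, and it ignores the reverse strand, all of which production aligners handle with indexed data structures.

```python
def map_reads(reference, reads):
    """Map each read to every 0-based position where it matches exactly."""
    positions = {}
    for read in reads:
        hits = []
        start = reference.find(read)
        while start != -1:
            hits.append(start)
            # Continue one base further so overlapping hits are found too.
            start = reference.find(read, start + 1)
        positions[read] = hits
    return positions


reference = "ACGTACGTTAGC"
for read, hits in map_reads(reference, ["ACGT", "TAGC", "GGGG"]).items():
    print(read, hits)
```

A read with several hits is ambiguous (a multi-mapping read), and a read with none either contains a variant or sequencing error or does not come from this reference.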
Protein Data Bank (PDB)
A database for the three-dimensional structural data of large biological molecules, such as proteins and nucleic acids.
MicroRNA (miRNA)
Small non-coding RNA molecules that function in RNA silencing and post-transcriptional regulation of gene expression.
FASTA Format
Text-based format for representing nucleotide sequences or peptide sequences; used for storing sequence data and database searches.
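A FASTA record is a header line beginning with `>` followed by one or more lines of sequence. The sketch below parses such text into a dictionary; `parse_fasta` is a hypothetical helper written for this card, and it assumes the record identifier is the first whitespace-delimited token of the header.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into {identifier: sequence}."""
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            fields = line[1:].split()
            current = fields[0] if fields else ""
            records[current] = []
        elif current is not None:
            # Sequence lines may be wrapped; collect them and join later.
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


example = """>seq1 example nucleotide record
ACGT
TTAA
>seq2
MKV
"""
print(parse_fasta(example))
```

For real work, an established parser such as Biopython's `SeqIO` is usually a safer choice than a hand-rolled one.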
Tandem Repeat
A pattern of two or more nucleotides that is repeated, with the copies lying directly adjacent to one another; tandem repeats can be used as markers in genetic mapping.
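A quick way to spot perfect tandem repeats is a regular expression with a backreference. In this sketch, `find_tandem_repeats`, `unit_len`, and `min_copies` are illustrative names chosen here; the approach only finds exact copies of a fixed-length unit, so it would miss degenerate repeats, and it also reports homopolymer runs (for example `AAAAAA` as three copies of `AA`).

```python
import re


def find_tandem_repeats(seq, unit_len=2, min_copies=3):
    """Return (start, unit, copies) for each perfect tandem repeat found."""
    # ([ACGT]{n}) captures one unit; \1{m,} requires at least m more copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        copies = len(m.group(0)) // unit_len
        hits.append((m.start(), unit, copies))
    return hits


print(find_tandem_repeats("GGCACACACATT"))
```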
UCSC Genome Browser
An online tool that integrates genomic data, including genomes of various organisms, gene predictions, expression and disease data, and other large-scale sequencing projects.
GO Terms
Gene Ontology terms form a structured, controlled vocabulary for describing the attributes of genes and gene products across all species.
Ensembl
A genome browser providing access to many vertebrate genomes, allowing users to retrieve and visualize gene annotations and other genomic information across species.
GenBank
A genetic sequence database, maintained by the National Center for Biotechnology Information, that provides annotated collections of all publicly available DNA sequences.
Next Generation Sequencing (NGS)
A group of high-throughput sequencing technologies that allow rapid sequencing of whole genomes or of targeted genomic regions.
Sequence Homology
The relationship between DNA, RNA, or protein sequences that are derived from a common ancestor; indicating shared evolutionary origins.
SNP Genotyping
The measurement of genetic variation at single nucleotide polymorphisms (SNPs) between members of a species; it is widely used in genetic mapping and association studies.
Metagenomics
The study of genetic material recovered directly from environmental samples, bypassing the need for isolating and lab-culturing individual species.
Multiple Sequence Alignment
A method for aligning three or more biological sequences, allowing the identification of conserved regions, which are often important for protein or gene function.
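Once sequences have been aligned, conserved positions can be read off column by column. The sketch below assumes the alignment has already been computed by some other tool and is supplied as equal-length strings with `-` for gaps; `conserved_columns` is a hypothetical helper, and "conserved" here means strictly identical in every sequence.

```python
def conserved_columns(aligned):
    """Return 0-based indices of columns identical in all aligned sequences."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    return [
        i
        for i, column in enumerate(zip(*aligned))
        # A fully conserved column has one distinct symbol that is not a gap.
        if len(set(column)) == 1 and column[0] != "-"
    ]


msa = ["ACG-T", "ACGAT", "TCG-T"]
print(conserved_columns(msa))
```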
Hidden Markov Model (HMM)
A statistical model which is used for pattern recognition in sequences, particularly useful in predicting protein domains and gene structures.
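One common use of an HMM is decoding: finding the most likely sequence of hidden states behind an observed sequence, usually with the Viterbi algorithm. The sketch below labels each base as coming from an AT-rich or GC-rich region. The two states and all probabilities are made-up toy values for illustration, and the code assumes a non-empty observation sequence with no zero probabilities.

```python
import math

# Toy model: every number below is an assumption chosen for illustration.
STATES = ("AT-rich", "GC-rich")
START = {"AT-rich": 0.5, "GC-rich": 0.5}
TRANS = {
    "AT-rich": {"AT-rich": 0.9, "GC-rich": 0.1},
    "GC-rich": {"AT-rich": 0.1, "GC-rich": 0.9},
}
EMIT = {
    "AT-rich": {"A": 0.4, "T": 0.4, "C": 0.1, "G": 0.1},
    "GC-rich": {"A": 0.1, "T": 0.1, "C": 0.4, "G": 0.4},
}


def viterbi(observations, states=STATES, start_p=START, trans_p=TRANS, emit_p=EMIT):
    """Return the most probable hidden-state path (log-space Viterbi)."""
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
             for s in states}]
    back = [{}]
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            # Best previous state from which to enter state s.
            prev = max(states, key=lambda p: best[-1][p] + math.log(trans_p[p][s]))
            scores[s] = (best[-1][prev] + math.log(trans_p[prev][s])
                         + math.log(emit_p[s][obs]))
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    return path[::-1]


print(viterbi("AAAAGGGG"))
```

Profile HMMs used for protein domains follow the same principle but with match, insert, and delete states per alignment column.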
Proteomics
The large-scale study of proteomes and their functions; includes analyzing the structure, function, and interactions of proteins expressed by a genome.
BLAST
Basic Local Alignment Search Tool; it is used to find regions of similarity between biological sequences, helping to identify homologous genes and proteins.
Protein Domain
A part of a protein sequence and structure that can evolve, function, and exist independently of the rest of the protein chain.
ChIP-sequencing (ChIP-seq)
A method used to analyze protein interactions with DNA; it combines chromatin immunoprecipitation with massively parallel DNA sequencing.
© Hypatia.Tech. 2024 All rights reserved.