C1: Introduction to Protein Modeling

Welcome to the Protein Modeling Lab Class! This first session will introduce you to the fundamental concepts of proteins and the field of protein modeling. Understanding these basics is crucial for all subsequent topics in this course.

Before diving in, watch this short video for a quick overview of proteins:

Video: What Is A Protein? | Proteins | Biology | FuseSchool (Source: FuseSchool - Global Education YouTube Channel)

Proteins are large, complex macromolecules that play a myriad of critical roles in all living organisms. They are composed of smaller units called amino acids, linked together in long chains called polypeptides. The sequence of these amino acids dictates the protein’s structure, which in turn determines its function.

Think of proteins as the workhorses of the cell, involved in virtually every biological process, including:

  • Catalyzing metabolic reactions (enzymes)
  • Replicating DNA (e.g., DNA polymerase)
  • Responding to stimuli (receptors)
  • Providing structural support (e.g., collagen)
  • Transporting molecules (e.g., hemoglobin)
Formation of a peptide bond between two amino acids

Figure 1: Amino acids (spheres) linking via peptide bonds to form a polypeptide chain. (Source: Wikimedia Commons)
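Each peptide bond forms in a condensation reaction that releases one water molecule, so a linear chain of n amino acids contains n - 1 peptide bonds. The sketch below works through that arithmetic for a four-residue chain. The masses are rounded, approximate values included only for illustration, and the function names are hypothetical.

```python
# Approximate average masses of free amino acids in daltons (Da).
# Rounded illustrative values; consult a reference table for precise work.
FREE_AMINO_ACID_MASS = {
    "Gly": 75.07,
    "Ala": 89.09,
    "Ser": 105.09,
    "Val": 117.15,
}
WATER_MASS = 18.02  # one water molecule is released per peptide bond formed


def count_peptide_bonds(residues):
    """A linear chain of n residues is held together by n - 1 peptide bonds."""
    return max(len(residues) - 1, 0)


def peptide_mass(residues):
    """Sum the free amino acid masses, then subtract the water lost to condensation."""
    total = sum(FREE_AMINO_ACID_MASS[r] for r in residues)
    return total - count_peptide_bonds(residues) * WATER_MASS


chain = ["Ala", "Gly", "Ser", "Val"]
print(count_peptide_bonds(chain))  # 3
print(round(peptide_mass(chain), 2))
```

For this chain the result is roughly 332 Da, about 54 Da less than the four free amino acids combined, because three water molecules are lost.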

Protein modeling, also known as computational protein structure prediction, is the process of deriving a three-dimensional (3D) model of a protein from its amino acid sequence. Since a protein’s function is intimately linked to its 3D structure, knowing this structure is vital for understanding how it works, how it interacts with other molecules, and how it might be affected by mutations.

Experimental methods such as X-ray crystallography, NMR spectroscopy, and cryo-electron microscopy can determine protein structures, but they are often time-consuming and expensive, and they are not feasible for every protein. Protein modeling offers a computational alternative or complement to these methods.

Conceptual diagram of protein sequence to structure

Figure 2: Conceptual representation of predicting a 3D protein structure (right) from its amino acid sequence (left). (Image adapted for illustrative purposes - original source context: Institute for Systems Biology)

Protein modeling matters for several reasons:

  • Understanding Function: The 3D structure provides insights into how a protein performs its specific biological role.
  • Drug Design: Models can be used to design drugs that bind to a protein target, for example, to inhibit an enzyme in a pathogen.
  • Disease Mechanisms: Understanding how mutations alter protein structure can explain their link to diseases.
  • Protein Engineering: Designing new proteins or modifying existing ones for industrial or therapeutic applications.

The complexity of protein structure is typically described at four hierarchical levels. This video provides a great visual explanation:

Video: Protein Structure and Folding (Source: Amoeba Sisters YouTube Channel)

Levels of Protein Structure

Figure 3: The four levels of protein structure: Primary, Secondary, Tertiary, and Quaternary. (Source: Wikimedia Commons, User: Scurran)

Primary structure example: ...-Alanine-Glycine-Serine-Valine-...
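The fragment above spells out full residue names, while sequence databases usually store the same information as one-letter codes. Here is a minimal sketch of that conversion, using the standard one-letter codes for only the four residues in the example. The lookup table is deliberately partial and the function name is hypothetical.

```python
# Standard one-letter codes for the four residues in the example fragment.
# The table is deliberately partial; a full version would list all 20 amino acids.
ONE_LETTER = {
    "Alanine": "A",
    "Glycine": "G",
    "Serine": "S",
    "Valine": "V",
}


def to_one_letter(fragment):
    """Convert a hyphen-separated list of residue names to one-letter codes.

    Empty pieces and "..." placeholders (elided residues) are skipped.
    """
    names = [part for part in fragment.split("-") if part and part != "..."]
    return "".join(ONE_LETTER[name] for name in names)


print(to_one_letter("...-Alanine-Glycine-Serine-Valine-..."))  # AGSV
```

The elided residues on either side stay unknown; only the four named residues are converted.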

Secondary structure visuals:

  • α-helix (Source: Wikimedia Commons)
  • β-sheet (Source: Wikimedia Commons)

Tertiary structure: this level defines the functional form of many proteins.

Test Your Understanding: Protein Structure

Complete the sentences about protein structure:

The sequence of amino acids is known as the ______ structure. Local folding patterns like α-helices and β-sheets constitute the ______ structure. The overall 3D shape of a single polypeptide is its ______ structure, and some proteins have a ______ structure, formed by multiple polypeptide subunits.

4. Introduction to Protein Modeling Approaches

There are several computational strategies to predict protein structure. We will delve deeper into some of these later in the course, but here’s a brief overview:

Major Protein Modeling Approaches

Homology modeling (comparative modeling): Builds a model using a known experimental structure of a related (homologous) protein as a template. Most accurate when a close homolog exists. (Covered in C7, C8, C9)

Threading (fold recognition): Identifies the known protein fold that best matches the query sequence, even when sequence similarity is low. The sequence is "threaded" through a library of known folds.

Ab initio (de novo) modeling: Predicts structure from the amino acid sequence alone, based on physical and chemical principles, without relying on known structural templates. Computationally intensive and often less accurate for larger proteins.
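Which approach fits a given protein depends largely on how similar the query is to a protein of known structure. The sketch below computes percent identity over an existing pairwise alignment and maps the result to one of the three approaches. The 30% cutoff is an assumed rule of thumb used here for illustration, not a fixed standard, and the function names and example sequences are hypothetical.

```python
def percent_identity(aligned_query, aligned_template):
    """Percent of gap-free aligned columns where both sequences share a residue.

    Both inputs must already be aligned to equal length, with '-' marking gaps.
    """
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    compared = matches = 0
    for q, t in zip(aligned_query, aligned_template):
        if q == "-" or t == "-":
            continue  # skip columns that contain a gap
        compared += 1
        if q == t:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def suggest_approach(identity, fold_match_found=False, homology_cutoff=30.0):
    """Pick a modeling strategy from template similarity.

    homology_cutoff is an assumed rule-of-thumb threshold, not a fixed standard.
    """
    if identity >= homology_cutoff:
        return "homology modeling"
    if fold_match_found:
        return "threading"
    return "ab initio"


# Made-up aligned sequences, used only to exercise the functions.
identity = percent_identity("MKTAYIAK-QR", "MKSAYLAKGQR")
print(round(identity, 1), suggest_approach(identity))  # 80.0 homology modeling
```

In practice the alignment itself would come from a tool such as BLAST, and template choice also weighs coverage and structure quality, which this sketch ignores.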

5. Key Tools and Resources in Protein Modeling

As you embark on your journey in protein modeling, you’ll encounter several essential databases and software tools:

  • Protein Data Bank (PDB): The single global archive for information about the 3D structures of proteins, nucleic acids, and complex assemblies. Structures are typically determined by X-ray crystallography, NMR spectroscopy, or cryo-electron microscopy. (https://www.rcsb.org/)
  • UniProt: A comprehensive, high-quality, and freely accessible database of protein sequence and functional information. (https://www.uniprot.org/)
  • NCBI Databases: Including GenBank (DNA sequences), RefSeq (curated reference sequences), and tools like BLAST (which we will cover in C5). (https://www.ncbi.nlm.nih.gov/)
  • Visualization Software: Tools like PyMOL, UCSF Chimera/ChimeraX, and VMD are used to view, analyze, and generate images of protein structures. We will explore some of these later.
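Entries in these databases are addressed by identifiers, which makes it easy to script retrieval. The sketch below builds download URLs and parses a FASTA record without making any network request. The URL patterns are assumptions based on how the RCSB file server and the UniProt REST service are commonly accessed, so check them against the current documentation. The identifiers 4HHB and P69905 (human hemoglobin) are illustrative choices, and the function names are hypothetical.

```python
def pdb_download_url(pdb_id):
    """URL for a PDB-format coordinate file (assumed RCSB file-server pattern)."""
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb"


def uniprot_fasta_url(accession):
    """URL for a FASTA sequence (assumed UniProt REST pattern)."""
    return f"https://rest.uniprot.org/uniprotkb/{accession}.fasta"


def parse_fasta(text):
    """Return (header, sequence) for the first record in FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                break  # stop at the start of a second record
            header = line[1:]
        elif header is not None:
            chunks.append(line)
    return header, "".join(chunks)


# A made-up record used only to show the parser; not a real database entry.
example = ">demo|EXAMPLE|hypothetical peptide\nAGSV\nAGSV\n"
print(pdb_download_url("4hhb"))
print(uniprot_fasta_url("P69905"))
print(parse_fasta(example))
```

To fetch a real entry, the resulting URL could be passed to `urllib.request.urlopen` from the standard library. That step is left out so the example runs offline.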

Here’s a short introduction to UCSF ChimeraX, a popular visualization tool:

Video: Introduction to UCSF ChimeraX - PDB Fetch and Basic Display (Source: UCSF ChimeraX YouTube Channel)

This course aims to provide you with a theoretical understanding and practical skills in protein modeling. Here’s a glimpse of what we’ll cover:

  • C1 (This session): Introduction to Protein Modeling
  • C2: Amino acids and protein qualitative tests
  • C3: Amino acid titration curves
  • C4: Protein Structural Features
  • C5: Advanced Protein BLAST: PSI-BLAST
  • C6: Protein-Protein Interaction Analysis
  • C7: Protein Homology Modeling
  • C8 & C9: Protein Modeling: Modeller (Parts I & II)

We will combine lectures with hands-on lab exercises to help you grasp the concepts and tools effectively.

Let’s see what you’ve learned so far!

What are the fundamental building blocks of proteins?

True or false: Protein modeling can only be done if an experimentally determined structure of the exact same protein already exists.

Which level of protein structure refers to the overall 3D shape of a single polypeptide chain?


We hope this introduction gives you a good starting point for our exploration into protein modeling. Come prepared with questions for our first class!
