Proteins

Dr. Lisa Bartee; Jack Brook

14 Proteins

Proteins are one of the most abundant organic molecules in living systems and have the most diverse range of functions of all macromolecules. Proteins may be structural, regulatory, contractile, or protective; they may serve in transport, storage, or membranes; or they may be toxins or enzymes. Each cell in a living system may contain thousands of different proteins, each with a unique function. Their structures, like their functions, vary greatly. They are all, however, polymers of amino acids, arranged in a linear sequence and connected together by covalent bonds.

Amino acids are the monomers that make up proteins (Figure 1). Each amino acid has the same fundamental structure, which consists of a central carbon atom, also known as the alpha (α) carbon, bonded to an amino group (NH₂), a carboxyl group (COOH), and a hydrogen atom. Every amino acid also has another atom or group of atoms, known as the R group or side chain, bonded to the central carbon.

amino acid structure. On the left is the amino group, composed of two white balls labeled H representing hydrogens connected to a blue ball labeled N representing nitrogen. The N is connected to a black ball containing a C to the right. Connected above is a white ball labeled H. Connected below is a white box labeled R (side chain). To the right of the black ball labeled C is a second black ball labeled C. This ball is connected with one line representing a covalent bond to a red ball labeled O. The O is connected to a white ball labeled H. The black ball is also connected with two lines to another red ball labeled O. This end of the amino acid is labeled carboxyl group. — **Figure 1** Amino acids have a central asymmetric carbon to which an amino group, a carboxyl group, a hydrogen atom, and a side chain (R group) are attached.

Function	Examples	Description
Defense	Immunoglobulins	Antibodies bind to specific foreign particles, such as viruses and bacteria, to help protect the body.
Enzyme	Digestive enzymes such as amylase, lipase, pepsin, trypsin	Enzymes carry out almost all of the thousands of chemical reactions that take place in cells. They also assist with the formation of new molecules by reading the genetic information stored in DNA.
Messenger	Insulin, thyroxine	Messenger proteins, such as some types of hormones, transmit signals to coordinate biological processes between different cells, tissues, and organs.
Structural component	Actin, tubulin, keratin	These proteins provide structure and support for cells. On a larger scale, they also allow the body to move.
Transport/storage	Hemoglobin, albumin, legume storage proteins, egg white (albumin)	These proteins bind and carry atoms and small molecules within cells and throughout the body. Some provide nourishment during early development of the embryo and the seedling.
Contractile	Actin, myosin	These proteins carry out muscle contraction.

You may have noticed that “source of energy” was not listed among the functions of proteins. This is because proteins in our diet are typically broken down into individual amino acids that our cells then assemble into our own proteins. Humans cannot build some amino acids inside our own cells, so we must obtain them from our diet (these are the so-called “essential” amino acids). Our cells can break down proteins to release energy, but will usually do so only when carbohydrates or lipids are not available.

photo of meat and seafood — **Figure 2** Examples of foods that contain high levels of protein. (“Protein” by National Cancer Institute is in the Public Domain)

The functions of proteins can be very diverse because proteins are made up of 20 chemically distinct amino acids that form long chains, and the amino acids can be in any order. The function of a protein depends on its shape, and the shape is determined by the order of the amino acids. Proteins are often hundreds of amino acids long, and they can have very complex shapes because there are so many possible orders for the 20 amino acids (Figure 3).

amino acid structures — **Figure 3** There are 20 amino acids commonly found in proteins, each with a different R group (variant group) that determines its chemical nature.

The chemical nature of the side chain determines the nature of the amino acid (that is, whether it is acidic, basic, polar, or nonpolar). For example, the amino acid glycine has a hydrogen atom as the R group. Amino acids such as valine, methionine, and alanine are nonpolar or hydrophobic in nature, while amino acids such as serine, threonine, and cysteine are polar and have hydrophilic side chains. The side chains of lysine and arginine are positively charged, and therefore these amino acids are also known as basic amino acids. Proline has an R group that is linked to the amino group, forming a ring-like structure. Proline is an exception to the standard structure of an amino acid since its amino group is not separate from the side chain (Figure 3). Amino acids are represented by a single uppercase letter as well as a three-letter abbreviation. For example, valine is known by the letter V or the three-letter symbol Val.
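
The naming and grouping described above can be sketched as a small lookup table. This is only an illustration: it covers just the amino acids that the paragraph explicitly classifies, and the names `SIDE_CHAIN_CLASS`, `describe`, and `count_by_class` are made up for this example.

```python
from collections import Counter

# One-letter code -> (three-letter code, side-chain class).
# Only the amino acids classified in the text are listed here.
SIDE_CHAIN_CLASS = {
    "V": ("Val", "nonpolar"),
    "M": ("Met", "nonpolar"),
    "A": ("Ala", "nonpolar"),
    "S": ("Ser", "polar"),
    "T": ("Thr", "polar"),
    "C": ("Cys", "polar"),
    "K": ("Lys", "basic"),
    "R": ("Arg", "basic"),
}


def describe(one_letter_sequence):
    """Return (three-letter code, class) for each residue in the sequence."""
    return [SIDE_CHAIN_CLASS[residue] for residue in one_letter_sequence.upper()]


def count_by_class(one_letter_sequence):
    """Count how many residues fall into each side-chain class."""
    return dict(Counter(cls for _, cls in describe(one_letter_sequence)))


print(describe("VK"))              # [('Val', 'nonpolar'), ('Lys', 'basic')]
print(count_by_class("VMASTCKR"))  # {'nonpolar': 3, 'polar': 3, 'basic': 2}
```

A full table would need all 20 amino acids, including the acidic ones, which the paragraph mentions as a class but does not name.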

Just as some fatty acids are essential to a diet, some amino acids are necessary as well. They are known as essential amino acids, and in humans they include isoleucine, leucine, and lysine. Essential amino acids are those that are needed to build proteins in the body but are not produced by the body; which amino acids are essential varies from organism to organism.

The sequence and the number of amino acids ultimately determine the protein’s shape, size, and function. Each amino acid is attached to another amino acid by a covalent bond, known as a peptide bond, which is formed by a dehydration reaction. The carboxyl group of one amino acid and the amino group of the incoming amino acid combine, releasing a molecule of water. The resulting bond is the peptide bond (Figure 4).

chemical structure of peptide bond — **Figure 4** Peptide bond formation is a dehydration synthesis reaction. The carboxyl group of one amino acid is linked to the amino group of the incoming amino acid. In the process, a molecule of water is released.
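
Because each peptide bond releases one molecule of water, a chain of n amino acids contains n − 1 peptide bonds and weighs less than the sum of its free amino acids. The sketch below works through that arithmetic. The molar masses are approximate textbook values that are not given in this chapter, and the function names are hypothetical.

```python
WATER_MASS = 18.02  # g/mol, approximate

# Approximate molar masses of three free amino acids (g/mol).
FREE_AMINO_ACID_MASS = {"Gly": 75.07, "Ala": 89.09, "Ser": 105.09}


def peptide_bonds(n_amino_acids):
    """A linear chain of n amino acids is joined by n - 1 peptide bonds."""
    return max(n_amino_acids - 1, 0)


def peptide_mass(residues):
    """Sum the free amino acid masses, then subtract one water per bond."""
    total = sum(FREE_AMINO_ACID_MASS[name] for name in residues)
    return total - peptide_bonds(len(residues)) * WATER_MASS


# A Gly-Ala dipeptide has one peptide bond, so one water is lost.
print(round(peptide_mass(["Gly", "Ala"]), 2))  # 146.14
```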

Protein Structure

As discussed earlier, the shape of a protein is critical to its function. For example, an enzyme can bind to a specific substrate at a site known as the active site. If this active site is altered because of local changes or changes in overall protein structure, the enzyme may be unable to bind to the substrate. To understand how the protein gets its final shape or conformation, we need to understand the four levels of protein structure: primary, secondary, tertiary, and quaternary (Figure 5).

Primary Structure

The unique sequence of amino acids in a polypeptide chain is its primary structure. For example, the pancreatic hormone insulin is made up of two polypeptide chains, A and B. The primary structures (amino acid sequences) of the A and B chains are unique to insulin (Figure 6). Amino acids are linked together in polypeptide chains by strong covalent bonds.

amino acid chain for insulin represented with green balls containing names of amino acids. Insulin is composed of an A chain of green balls and a separate B chain, which is slightly longer. The two chains are connected together with lines representing disulfide bonds. — **Figure 6** Bovine serum insulin is a protein hormone made of two peptide chains, A (21 amino acids long) and B (30 amino acids long). In each chain, primary structure is indicated by three-letter abbreviations that represent the names of the amino acids in the order they are present. The amino acid cysteine (cys) has a sulfhydryl (SH) group as a side chain. Two sulfhydryl groups can react in the presence of oxygen to form a disulfide (S-S) bond. Two disulfide bonds connect the A and B chains together, and a third helps the A chain fold into the correct shape. Note that all disulfide bonds are the same length, but are drawn different sizes for clarity.
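
A primary structure is simply an ordered string of amino acids, so it can be written down and checked directly. The sketch below uses the one-letter sequences commonly reported for bovine insulin and the usual disulfide positions. Neither is spelled out in this chapter, so treat both as assumptions to verify against a sequence database. The helper names are hypothetical.

```python
# One-letter sequences as commonly reported for bovine insulin (assumed).
A_CHAIN = "GIVEQCCASVCSLYQLENYCN"
B_CHAIN = "FVNQHLCGSHLVEALYLVCGERGFFYTPKA"
CHAINS = {"A": A_CHAIN, "B": B_CHAIN}

# Disulfide bonds as ((chain, position), (chain, position)), counted from 1.
DISULFIDE_BONDS = [
    (("A", 6), ("A", 11)),   # within the A chain
    (("A", 7), ("B", 7)),    # links the A and B chains
    (("A", 20), ("B", 19)),  # links the A and B chains
]


def residue_at(chain, position):
    """Return the one-letter residue at a 1-indexed position in a chain."""
    return CHAINS[chain][position - 1]


def bond_is_between_cysteines(bond):
    """A disulfide bond can only join two cysteine (C) residues."""
    return all(residue_at(chain, pos) == "C" for chain, pos in bond)


def interchain_bonds(bonds):
    """Keep only the bonds whose two ends lie on different chains."""
    return [bond for bond in bonds if bond[0][0] != bond[1][0]]


print(len(A_CHAIN), len(B_CHAIN))                                # 21 30
print(all(bond_is_between_cysteines(b) for b in DISULFIDE_BONDS))  # True
print(len(interchain_bonds(DISULFIDE_BONDS)))                    # 2
```

The checks mirror the caption of Figure 6: chain lengths of 21 and 30, two disulfide bonds between the chains, and a third within the A chain.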

Secondary Structure

The local folding of the polypeptide in some regions gives rise to the secondary structure of the protein. The most common secondary structures are the α-helix and the β-pleated sheet (Figure 7). Both are held in shape by hydrogen bonds between carbonyl and amino groups in the peptide backbone. In the α-helix, the hydrogen bonds form between the oxygen atom in the carbonyl group of one amino acid and the amino group of the amino acid four positions farther along the chain. In the β-pleated sheet, the hydrogen bonds form between backbone groups on adjacent strands of the polypeptide.

alpha-helix shown as green spiral, next to a chemical structure showing hydrogen bonds connecting the spiral into shape. Blue anti-parallel arrows represent beta-pleated sheet, next to a chemical structure showing hydrogen bonds holding the strands parallel to each other. — **Figure 7** The α-helix and β-pleated sheet are secondary structures of proteins that form because of hydrogen bonding between carbonyl and amino groups in the peptide backbone. Certain amino acids have a propensity to form an α-helix, while others have a propensity to form a β-pleated sheet.
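
The four-residue spacing of hydrogen bonds in an α-helix can be made concrete by listing which positions pair up. This is a minimal sketch that assumes an idealized helix in which every eligible residue takes part. The function name is made up for this example.

```python
def helix_hydrogen_bonds(n_residues, spacing=4):
    """List (i, i + spacing) pairs for an idealized alpha-helix.

    Residue i contributes its backbone carbonyl oxygen and residue
    i + spacing contributes its backbone amino group. Positions count from 1.
    """
    return [(i, i + spacing) for i in range(1, n_residues - spacing + 1)]


# In a 10-residue helix, residue 1 pairs with 5, 2 with 6, and so on.
print(helix_hydrogen_bonds(10))
# [(1, 5), (2, 6), (3, 7), (4, 8), (5, 9), (6, 10)]
```

The last four residues have no partner farther along the chain, which is why a 10-residue helix has only six such bonds in this simplified picture.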

Tertiary Structure

The unique three-dimensional structure of a polypeptide is its tertiary structure (Figure 8). This structure is in part due to chemical interactions at work on the polypeptide chain. Primarily, the interactions among R groups (the variable parts of the amino acids) create the complex three-dimensional tertiary structure of a protein. For example, R groups with like charges repel each other, and those with unlike charges attract each other (ionic bonds). Partially charged atoms within the R groups can form hydrogen bonds. When protein folding takes place, the hydrophobic R groups of nonpolar amino acids lie in the interior of the protein, whereas the hydrophilic R groups lie on the outside. Interactions between cysteine side chains form disulfide linkages in the presence of oxygen; these are the only covalent bonds that form during protein folding.

A red tube labeled polypeptide backbone squiggles around the image. Connecting two sections into a loop are a positively charged amino acid on the left connected with an ionic bond to a negatively charged amino acid at the other end of the loop. A second loop is formed below connected by two amino acids in a pink box labeled hydrophobic interactions. A third loop is connected by lines connecting two S's, labeled disulfide linkage. Within this loop is a second connection with two amino acids connected by a blue rectangle labeled hydrogen bond. — **Figure 8** The tertiary structure of proteins is determined by a variety of chemical interactions. These include hydrophobic interactions, ionic bonding, hydrogen bonding, and disulfide linkages.

Quaternary Structure

In nature, some proteins are formed from several polypeptides, also known as subunits, and the interaction of these subunits forms the quaternary structure. Weak interactions between the subunits help to stabilize the overall structure. For example, insulin (a globular protein) has a combination of hydrogen bonds and disulfide bonds that cause it to be mostly clumped into a ball shape. Insulin starts out as a single polypeptide; after the disulfide linkages form, post-translational modification removes an internal sequence, leaving two chains held together by those linkages. Silk (a fibrous protein), by contrast, has a β-pleated sheet structure that is the result of hydrogen bonding between different chains.

The four levels of protein structure (primary, secondary, tertiary, and quaternary) are illustrated in Figure 9.

another diagram of the levels of protein structure. A chain of blue balls is labeled primary protein structure: sequence of a chain of amino acids. A spiral labeled alpha-helix and a pleated sheet are labeled secondary protein structure: hydrogen bonding of the peptide backbone causes the amino acids to fold into a repeating pattern. A 3-dimensional squiggle is labeled: tertiary protein structure: three-dimensional folding pattern of a protein due to side chain interactions. Two complex 3-d squiggles right next to each other are labeled quaternary protein structure: protein consisting of more than one amino acid chain. — **Figure 9** The four levels of protein structure can be observed in these illustrations. (credit: modification of work by National Human Genome Research Institute)

The unique shape of every protein is ultimately determined by the gene that encodes the protein. Any change in the gene sequence may lead to a different amino acid being added to the polypeptide chain, causing a change in protein structure and function. Sickle cell anemia is an example. In this disease, the hemoglobin β chain has a single amino acid substitution, causing a change in both the structure (shape) and function (job) of the protein. Affected individuals can have a variety of serious health problems, such as breathlessness, dizziness, headaches, and abdominal pain. What is most remarkable is that a hemoglobin molecule is made up of about 600 amino acids in four chains (two α chains and two β chains), yet the structural difference between a normal hemoglobin molecule and a sickle cell molecule is a single amino acid in each β chain (Figure 10).

ribbon protein structure: complex 3-d shape composed of spirals (alpha helices). Different helices are different colors. — **Figure 10** The unique shape of the normal hemoglobin protein. (“Structure of hemoglobin Gower 2” by Emw is licensed under CC BY-SA 3.0)
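
A single amino acid substitution is easy to see when two sequences are compared position by position. The sketch below uses the first eight residues widely reported for the mature normal and sickle cell β chains, and chain lengths of 141 (α) and 146 (β) amino acids. None of these values appear in this chapter, so they are assumptions to check against a sequence database. The function name is hypothetical.

```python
# First eight residues of the mature hemoglobin beta chain (assumed values).
NORMAL_BETA_START = "VHLTPEEK"
SICKLE_BETA_START = "VHLTPVEK"

# Commonly cited chain lengths (assumed): two alpha and two beta chains.
ALPHA_LENGTH = 141
BETA_LENGTH = 146


def substitutions(reference, variant):
    """Return (position, reference residue, variant residue) for each mismatch.

    Positions count from 1. Both sequences must have the same length.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length")
    return [
        (position, ref, var)
        for position, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]


print(substitutions(NORMAL_BETA_START, SICKLE_BETA_START))  # [(6, 'E', 'V')]
print(2 * ALPHA_LENGTH + 2 * BETA_LENGTH)                   # 574
```

Under these assumptions the only difference is at position 6, where glutamic acid (E) is replaced by valine (V), and the whole molecule has 574 amino acids, consistent with the figure of about 600.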

Denaturation and Protein Folding

Each protein has its own unique sequence and shape, and the shape is held together by chemical interactions. If the protein is subjected to changes in temperature or pH, or is exposed to certain chemicals, it may lose its shape without losing its primary sequence, a process known as denaturation. Denaturation is often reversible because the primary structure of the polypeptide is conserved; if the denaturing agent is removed, the protein can refold and resume its function. Sometimes denaturation is irreversible, leading to a permanent loss of function. One example of irreversible protein denaturation is the frying of an egg: the albumin protein in the liquid egg white is denatured when placed in a hot pan (Figure 11). Not all proteins are denatured at high temperatures; for instance, bacteria that survive in hot springs have proteins that function at temperatures close to boiling. The stomach is also very acidic (it has a low pH) and denatures proteins as part of the digestion process; however, the digestive enzymes of the stomach retain their activity under these conditions.

a photo of a fried egg. — **Figure 11** An egg white turns white as you cook it because the albumin in the white denatures and then reconnects in an abnormal fashion. Credit Matthew Murdock; https://www.flickr.com/photos/54423233@N05/13916201522

Correct folding is critical to a protein’s function. It was originally thought that the proteins themselves were solely responsible for the folding process. It was later found that they often receive assistance from protein helpers known as chaperones (or chaperonins) that associate with the target protein during the folding process. Chaperones act by preventing aggregation of the polypeptides that make up the complete protein structure, and they dissociate from the protein once it is folded.

How does protein structure relate to function?

Recall that a protein is built from a long chain of amino acids connected together in a specific order. The specific order of amino acids determines how they will interact together to form the 3-D shape of the protein. The shape of a protein determines its function. Therefore, the order of the amino acids determines the protein’s shape, which determines its function.

Because there are 20 different amino acids, they can be combined in an enormous number of ways: a chain only 100 amino acids long has 20¹⁰⁰ (more than 10¹³⁰) possible sequences. This means that there is a huge number of different protein shapes that can be assumed based on the amino acid order. This is very important since proteins fulfill so many different functions within cells.
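
The size of this number is easy to work out: each position in the chain can hold any of the 20 amino acids, so a chain of n amino acids has 20 × 20 × … × 20 (n times) possible orders. A minimal sketch of the calculation, with a function name chosen only for this example:

```python
NUM_AMINO_ACIDS = 20


def possible_sequences(chain_length):
    """Number of distinct orders for a chain of the given length."""
    return NUM_AMINO_ACIDS ** chain_length


print(possible_sequences(2))               # 400
print(possible_sequences(3))               # 8000
print(len(str(possible_sequences(100))))   # 131 digits
```

Even a short chain of 100 amino acids has a number of possible sequences that is 131 digits long, and many proteins are several times longer than that.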

References

Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Biology. OpenStax CNX. May 27, 2016 http://cnx.org/contents/s8Hh0oOc@9.10:QhGQhr4x@6/Biological-Molecules