Principles of Biology: Biology 211, 212, and 213

Lisa Bartee

Open Oregon Educational Resources

Portland, Ore.



Principles of Biology: The Principles of Biology sequence (BI 211, 212 and 213) introduces biology as a scientific discipline for students planning to major in biology and other science disciplines. Laboratories and classroom activities introduce techniques used to study biological processes and provide opportunities for students to develop their ability to conduct research.


The Process of Science

Learning Objectives

Course Outcomes for this section:

Apply the scientific method to biological questions by designing experiments and using the resulting data to form a conclusion.

  1. Design a controlled experiment to answer a biological question.
  2. Predict the outcome of an experiment.
  3. Collect, manipulate, and analyze quantitative and qualitative data.
  4. Answer a biological question using data.

Select, evaluate, and utilize discipline-specific information and literature to research a biological topic.

  1. Differentiate between questions that can and cannot be answered using science.
  2. Identify appropriate credible sources of information to research a topic.
  3. Evaluate sources of information for their strengths and weaknesses.
  4. Differentiate between popular and scholarly sources.

Like geology, physics, and chemistry, biology is a science that gathers knowledge about the natural world. Specifically, biology is the study of life. The discoveries of biology are made by a community of researchers who work individually and together using agreed-on methods. In this sense, biology, like all sciences, is a social enterprise like politics or the arts. The methods of science include careful observation, record keeping, logical and mathematical reasoning, experimentation, and submitting conclusions to the scrutiny of others. Science also requires considerable imagination and creativity; a well-designed experiment is commonly described as elegant, or beautiful. Like politics, science has considerable practical implications, and some science is dedicated to practical applications, such as the prevention of disease (see Figure 1). Other science proceeds largely motivated by curiosity. Whatever its goal, there is no doubt that science, including biology, has transformed human existence and will continue to do so.

Figure 1 Escherichia coli (E. coli) bacteria, seen in this scanning electron micrograph, are normal residents of our digestive tracts that aid in the absorption of vitamin K and other nutrients. However, virulent strains are sometimes responsible for disease outbreaks. (credit: Eric Erbe, digital colorization by Christopher Pooley, both of USDA, ARS, EMU)


OpenStax, Biology. OpenStax CNX. May 27, 2016


The Nature of Science

Biology is a science, but what exactly is science? What does the study of biology share with other scientific disciplines? Science (from the Latin scientia, meaning “knowledge”) can be defined as knowledge about the natural world. Science is a very specific way of learning, or knowing, about the world. The history of the past 500 years demonstrates that science is a very powerful way of knowing about the world; it is largely responsible for the technological revolutions that have taken place during this time. There are, however, areas of knowledge and human experience that the methods of science cannot be applied to. These include such things as answering purely moral questions, aesthetic questions, or what can be generally categorized as spiritual questions. Science cannot investigate these areas because they are outside the realm of material phenomena, the phenomena of matter and energy, and cannot be observed and measured.

The scientific method is a method of research with defined steps that include experiments and careful observation. One of the most important aspects of this method is the testing of hypotheses. A hypothesis is a suggested explanation for an event, which can be tested. Hypotheses, or tentative explanations, are generally produced within the context of a scientific theory. A scientific theory is a generally accepted, thoroughly tested and confirmed explanation for a set of observations or phenomena. Scientific theory is the foundation of scientific knowledge. In addition, in many scientific disciplines (less so in biology) there are scientific laws, often expressed in mathematical formulas, which describe how elements of nature will behave under certain specific conditions. There is not an evolution of hypotheses through theories to laws as if they represented some increase in certainty about the world. Hypotheses are the day-to-day material that scientists work with and they are developed within the context of theories. Laws are concise descriptions of parts of the world that are amenable to formulaic or mathematical description.

The scientific community has been debating for the last few decades about the value of different types of science. Is it valuable to pursue science for the sake of simply gaining knowledge, or does scientific knowledge only have worth if we can apply it to solving a specific problem or bettering our lives? This question focuses on the differences between two types of science: basic science and applied science.

Some individuals may perceive applied science as “useful” and basic science as “useless.” A question these people might pose to a scientist advocating knowledge acquisition would be, “What for?” A careful look at the history of science, however, reveals that basic knowledge has resulted in many remarkable applications of great value. Many scientists think that a basic understanding of science is necessary before an application is developed; therefore, applied science relies on the results generated through basic science. Other scientists think that it is time to move on from basic science and instead to find solutions to actual problems. Both approaches are valid. It is true that there are problems that demand immediate attention; however, few solutions would be found without the help of the knowledge generated through basic science.

One example of how basic and applied science can work together to solve practical problems occurred after the discovery of DNA structure led to an understanding of the molecular mechanisms governing DNA replication. Strands of DNA, unique in every human, are found in our cells, where they provide the instructions necessary for life. During DNA replication, new copies of DNA are made, shortly before a cell divides to form new cells. Understanding the mechanisms of DNA replication enabled scientists to develop laboratory techniques that are now used to identify genetic diseases, pinpoint individuals who were at a crime scene, and determine paternity. Without basic science, it is unlikely that applied science would exist.

Another example of the link between basic and applied research is the Human Genome Project, a study in which each human chromosome was analyzed and mapped to determine the precise sequence of DNA subunits and the exact location of each gene. (The gene is the basic unit of heredity; an individual’s complete collection of genes is his or her genome.) Other organisms have also been studied as part of this project to gain a better understanding of human chromosomes. The Human Genome Project (Figure 1) relied on basic research carried out with non-human organisms and, later, with the human genome. An important end goal eventually became using the data for applied research seeking cures for genetically related diseases.

Figure 1 The Human Genome Project was a 13-year collaborative effort among researchers working in several different fields of science. The project, which sequenced the entire human genome, was completed in 2003. (credit: the U.S. Department of Energy Genome Programs)

While research efforts in both basic science and applied science are usually carefully planned, it is important to note that some discoveries are made by serendipity, that is, by means of a fortunate accident or a lucky surprise. Penicillin was discovered when biologist Alexander Fleming accidentally left a petri dish of Staphylococcus bacteria open. An unwanted mold grew, killing the bacteria. The mold turned out to be Penicillium, and a new antibiotic was discovered. Even in the highly organized world of science, luck—when combined with an observant, curious mind—can lead to unexpected breakthroughs.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Biology. OpenStax CNX. May 27, 2016


The Scientific Process

Biologists study the living world by posing questions about it and seeking science-based responses. This approach is common to other sciences as well and is often referred to as the scientific method. The scientific process was used even in ancient times, but it was first documented by England’s Sir Francis Bacon (1561–1626) (Figure 1), who set up inductive methods for scientific inquiry. The scientific method is not used exclusively by biologists; it can be applied to almost anything as a logical problem-solving method.

Figure 1 Sir Francis Bacon (1561–1626) is credited with being the first to define the scientific method. (credit: Paul van Somer)

The scientific process typically starts with an observation (often a problem to be solved) that leads to a question. Remember that science is very good at answering questions having to do with observations about the natural world, but is very bad at answering questions having to do with morals, ethics, or personal opinions.

Questions that can be answered using science:

  • What is the optimum temperature for the growth of E. coli bacteria?
  • Do birds prefer bird feeders of a specific color?
  • What is the cause of this disease?
  • How effective is this drug in treating this disease?

Questions that cannot be answered using science:

  • How tall is Santa Claus?
  • Do angels exist?
  • Which is better: classical music or rock and roll?
  • What are the ethical implications of human cloning?

Let’s think about a simple problem that starts with an observation and apply the scientific method to solve the problem. Imagine that one morning when you wake up and flip the switch to turn on your bedside lamp, the light won’t turn on. That is an observation that also describes a problem: the light won’t turn on. Of course, you would next ask the question: “Why won’t the light turn on?”

Recall that a hypothesis is a suggested explanation that can be tested. A hypothesis is NOT the question you are trying to answer – it is what you think the answer to the question will be. To solve a problem, several hypotheses may be proposed. For example, one hypothesis might be, “The light won’t turn on because the bulb is burned out.” But there could be other answers to the question, and therefore other hypotheses may be proposed. A second hypothesis might be, “The light won’t turn on because the lamp is unplugged” or “The light won’t turn on because the power is out.”

A hypothesis must be testable to ensure that it is valid. For example, a hypothesis that depends on what a dog thinks is not testable, because it can never be known what a dog thinks. It should also be falsifiable, meaning that it can be disproven by experimental results. An example of an unfalsifiable hypothesis is “Red is a better color than blue.” There is no experiment that might show this statement to be false. To test a hypothesis, a researcher will conduct one or more experiments designed to eliminate one or more of the hypotheses. This is important. A hypothesis can be disproven, or eliminated, but it can never be proven. Science does not deal in proofs like mathematics. If an experiment fails to disprove a hypothesis, then we find support for that explanation, but this is not to say that down the road a better explanation will not be found, or a more carefully designed experiment will be found to falsify the hypothesis.

Once a hypothesis has been selected, a prediction can be made about what you would observe if you tested this hypothesis. A prediction is different from a hypothesis because a prediction describes what you will actually observe in your experiment. The hypothesis is the reason why you will observe your prediction. Your prediction can only be made after you have designed your experiment so that you know specifically what you will be testing.

A variable is any part of the experiment that can vary or change during the experiment. Typically, an experiment only tests one variable and all the other conditions in the experiment are held constant.

A prediction often has the format “If [I change the independent variable in this way] then [I will observe that the dependent variable does this].” For example, the prediction for the first hypothesis might be, “If you change the light bulb, then the light will turn on.” In this experiment, the independent variable (the thing that you are testing) would be changing the light bulb and the dependent variable is whether or not the light turns on. It would be important to hold all the other aspects of the environment constant, for example not messing with the lamp cord or trying to turn the lamp on using a different light switch. If the entire house had lost power during the experiment because a car hit the power pole, that would be a confounding variable.

You may have learned that a hypothesis can be phrased as an “If…then…” statement, which is what I just told you was often the format for a prediction. The part of the statement that is an “If…then…” is the prediction. The hypothesis is the “because” at the end. For example, in the light experiment, the prediction and hypothesis could both be given as “If you change the light bulb, then the light will turn on (prediction) because the bulb is burned out (hypothesis).”

The results of your experiment are the data that you collect as the outcome. In the light experiment, your results are either that the light turns on or the light doesn’t turn on. Based on your results, you can make a conclusion. Your conclusion uses the results to answer your original question.

Figure 4 The basic process of the scientific method. This is what science looks like in a simplified world.

We can put the experiment with the light that won’t go on into the figure above:

  1. Observation: the light won’t turn on.
  2. Question: why won’t the light turn on?
  3. Hypothesis: the lightbulb is burned out.
  4. Prediction: if I change the lightbulb (independent variable), then the light will turn on (dependent variable).
  5. Experiment: change the lightbulb while leaving all other variables the same.
  6. Analyze the results: the light didn’t turn on.
  7. Conclusion: The lightbulb isn’t burned out. The results do not support the hypothesis, so it’s time to develop a new one!
  8. Hypothesis 2: the lamp is unplugged.
  9. Prediction 2: if I plug in the lamp, then the light will turn on.
  10. Experiment: plug in the lamp.
  11. Analyze the results: the light turned on!
  12. Conclusion: The light wouldn’t turn on because the lamp was unplugged. The results support the hypothesis, so it’s time to move on to the next experiment!

In practice, the scientific method is not as rigid and structured as it might at first appear. Sometimes an experiment leads to conclusions that favor a change in approach; often, an experiment brings entirely new scientific questions to the puzzle. Many times, science does not operate in a linear fashion; instead, scientists continually draw inferences and make generalizations, finding patterns as their research proceeds. Scientific reasoning is more complex than the scientific method alone suggests.

Figure 5 The actual process of using the scientific method. “The general process of scientific investigations” by Laura Guerin, CK-12 Foundation is licensed under CC BY-NC 3.0

Another important aspect of designing an experiment is the presence of one or more control groups. A control group is a sample that is not treated with the independent variable, but is otherwise treated the same way as your experimental sample. Control groups allow you to make a comparison that is important for interpreting your results. The control group contains every feature of the experimental group except it is not given the manipulation that is hypothesized about (it does not get treated with the independent variable). Therefore, if the results of the experimental group differ from the control group, the difference must be due to the hypothesized manipulation, rather than some outside factor.

Example 1

Question: Which fertilizer will produce the greatest number of tomatoes when applied to the plants?

Prediction and Hypothesis: If I apply different brands of fertilizer to tomato plants, the most tomatoes will be produced from plants treated with Brand A because Brand A advertises that it produces twice as many tomatoes as other leading brands.

Experiment: Purchase 10 tomato plants of the same type from the same nursery. Pick plants that are similar in size and age. Divide the plants into two groups of 5. Apply Brand A to the first group and Brand B to the second group according to the instructions on the packages. After 10 weeks, count the number of tomatoes on each plant.

Independent Variable: Brand of fertilizer.

Dependent Variable: Number of tomatoes.

The number of tomatoes produced depends on the brand of fertilizer applied to the plants.

Constants: amount of water, type of soil, size of pot, amount of light, type of tomato plant, length of time plants were grown.

Confounding variables: any of the above that are not held constant, plant health, diseases present in the soil or plant before it was purchased.

Results: Tomatoes fertilized with Brand A produced an average of 20 tomatoes per plant, while tomatoes fertilized with Brand B produced an average of 10 tomatoes per plant.

You’d want to use Brand A next time you grow tomatoes, right? But what if I told you that plants grown without fertilizer produced an average of 30 tomatoes per plant! Now what will you use on your tomatoes?


Results including control group: Tomatoes which received no fertilizer produced more tomatoes than either brand of fertilizer.

Conclusion: Although Brand A fertilizer produced more tomatoes than Brand B, neither fertilizer should be used because plants grown without fertilizer produced the most tomatoes!
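The effect of the control group on the conclusion in Example 1 can be shown with a few lines of Python. This is a minimal sketch that uses only the three averages given in the example; the function name and dictionary keys are chosen here for illustration.

```python
# Average tomatoes per plant, taken from Example 1.
average_tomatoes = {"Brand A": 20, "Brand B": 10}


def best_treatment(averages):
    """Return the treatment with the highest average yield."""
    return max(averages, key=averages.get)


# Comparing only the two fertilizer groups.
without_control = best_treatment(average_tomatoes)

# Adding the control group (no fertilizer) changes the interpretation.
with_control = best_treatment({**average_tomatoes, "No fertilizer": 30})

print("Best treatment without a control group:", without_control)
print("Best treatment with a control group:", with_control)
```

The data for the two fertilizer brands are identical in both comparisons. Only the presence of the control group changes which option looks best.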

Example 2

You are interested in testing a new brand of natural cleaning product. You spray it around your kitchen sink and then take a sample of the bacteria remaining near the drain. You find, to your horror, that there are still 100 bacteria per square inch after cleaning! That seems awful, unless you have the proper control to compare it to: the number of bacteria present on the surface before it was cleaned. According to WebMD, there are more than 500,000 bacteria per square inch around kitchen drains. That means the cleaner actually killed well over 99.9% of the bacteria around the drain.

In this experiment:

  • Question: Is the new brand of natural cleaning product effective in killing bacteria in a kitchen sink?
  • Hypothesis: Yes, it is effective based on its advertising campaign.
  • Prediction: The natural cleaning product will kill all the bacteria in the sink.
  • Independent variable: use of cleaning product
  • Dependent variable: number of bacteria present on surface
  • Constants: same sink, same sampling process for bacteria
  • Confounding variables: not all bacteria can be grown in culture in a lab – what if the cleaner doesn’t kill these bacteria? We’d never be able to tell!
  • Control Group: Number of bacteria in the sink before use of the cleaner.
  • Results: 99.9% of the bacteria in the sink were killed.
  • Conclusion: The natural cleaning product seems effective, despite not killing all the bacteria present in the sink.
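The percentage in Example 2 comes from comparing the count after cleaning with the control count. The sketch below works through that arithmetic with the two numbers quoted in the example. The function name is chosen for illustration, and the calculation assumes, as the example does, that this sink started at the quoted figure of 500,000 bacteria per square inch.

```python
def percent_killed(before, after):
    """Percentage of bacteria removed, relative to the starting count."""
    return (before - after) / before * 100


before_cleaning = 500_000  # bacteria per square inch (control, quoted figure)
after_cleaning = 100       # bacteria per square inch after using the cleaner

killed = percent_killed(before_cleaning, after_cleaning)
print(f"{killed:.2f}% of the bacteria were killed")
```

Without the control value, the 100 remaining bacteria per square inch could not be interpreted at all.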


Text adapted from: OpenStax, Biology. OpenStax CNX. May 27, 2016


Presenting Data - Graphs and Tables

Types of Data

There are different types of data that can be collected in an experiment. Typically, we try to design experiments that collect objective, quantitative data.

Objective data is fact-based, measurable, and observable. This means that if two people made the same measurement with the same tool, they would get the same answer. The measurement is determined by the object that is being measured. The length of a worm measured with a ruler is an objective measurement. The observation that a chemical reaction in a test tube changed color is an objective measurement. Both of these are observable facts.

Subjective data is based on opinions, points of view, or emotional judgment. Subjective data might give two different answers when collected by two different people. The measurement is determined by the subject who is doing the measuring. Surveying people about which of two chemicals smells worse is a subjective measurement. Grading the quality of a presentation is a subjective measurement. Rating your relative happiness on a scale of 1-5 is a subjective measurement. All of these depend on the person who is making the observation – someone else might make these measurements differently.

Quantitative measurements gather numerical data. For example, measuring a worm as being 5 cm in length is a quantitative measurement.

Qualitative measurements describe a quality, rather than a numerical value. Saying that one worm is longer than another worm is a qualitative measurement.

Examples of each combination of data types:

  • Objective, quantitative: The chemical reaction has produced 5 cm of bubbles.
  • Objective, qualitative: The chemical reaction has produced a lot of bubbles.
  • Subjective, quantitative: I give the amount of bubbles a score of 7 on a scale of 1-10.
  • Subjective, qualitative: I think the bubbles are pretty.

After you have collected data in an experiment, you need to figure out the best way to present that data in a meaningful way. Depending on the type of data, and the story that you are trying to tell using that data, you may present your data in different ways.

Data Tables

The easiest way to organize data is by putting it into a data table. In most data tables, the independent variable (the variable that you are testing or changing on purpose) will be in the column to the left and the dependent variable(s) will be across the top of the table.

Be sure to:


You are evaluating the effect of different types of fertilizers on plant growth. You plant 12 tomato plants and divide them into three groups, where each group contains four plants. To the first group, you do not add fertilizer and the plants are watered with plain water. The second and third groups are watered with two different brands of fertilizer. After three weeks, you measure the growth of each plant in centimeters and calculate the average growth for each type of fertilizer.

The effect of different brands of fertilizer on tomato plant growth over three weeks (growth in cm)

Treatment      Plant 1   Plant 2   Plant 3   Plant 4   Average
No treatment   10        12        8         9         9.75
Brand A        15        16        14        12        14.25
Brand B        22        25        21        27        23.75
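The averages in the table can be checked with a short calculation. This sketch uses the twelve measurements from the table and Python's standard `statistics` module; the variable names are chosen for illustration.

```python
from statistics import mean

# Growth of each plant in centimeters, from the data table.
growth_cm = {
    "No treatment": [10, 12, 8, 9],
    "Brand A": [15, 16, 14, 12],
    "Brand B": [22, 25, 21, 27],
}

# Average growth for each treatment (the dependent variable, summarized).
average_growth = {
    treatment: mean(values) for treatment, values in growth_cm.items()
}

for treatment, average in average_growth.items():
    print(f"{treatment}: {average} cm")
```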

Scientific Method Review: Can you identify the key parts of the scientific method from this experiment?

  • Independent variable – Type of treatment (brand of fertilizer)
  • Dependent variable – plant growth in cm
  • Control group(s) – Plants treated with no fertilizer
  • Experimental group(s) – Plants treated with different brands of fertilizer

Graphing data

Graphs are used to display data because it is easier to see trends in the data when it is displayed visually compared to when it is displayed numerically in a table. Complicated data can often be displayed and interpreted more easily in a graph format than in a data table.

In a graph, the X-axis runs horizontally (side to side) and the Y-axis runs vertically (up and down). Typically, the independent variable will be shown on the X axis and the dependent variable will be shown on the Y axis (just like you learned in math class!).

Line Graph

Line graphs are the best type of graph to use when you are displaying a change in something over a continuous range. For example, you could use a line graph to display a change in temperature over time. Time is a continuous variable because it can have any value between two given measurements. It is measured along a continuum. Between 1 minute and 2 minutes are an infinite number of values, such as 1.1 minutes or 1.93456 minutes.

Changes in several different samples can be shown on the same graph by using lines that differ in color, symbol, etc.

Figure 1: Change in bubble height in centimeters over 120 seconds for three samples containing different amounts of enzyme. Sample A contained no enzyme, sample B contained 1 mL of enzyme, sample C contained 2 mL of enzyme.

Bar Graph

Bar graphs are used to compare measurements between different groups. Bar graphs should be used when your data is not continuous, but rather is divided into different categories. If you counted the number of birds of different species, each species of bird would be its own category. There is no value between “robin” and “eagle”, so this data is not continuous.

Figure 2: Final bubble height after 120 seconds for three samples containing different amounts of enzyme. Sample A contained no enzyme, sample B contained 1 mL of enzyme, sample C contained 2 mL of enzyme.

Scatter Plot

Scatter plots are used to evaluate the relationship between two different continuous variables. These graphs compare changes in two different variables at once. For example, you could look at the relationship between height and weight. Both height and weight are continuous variables. You could not use a scatter plot to look at the relationship between number of children in a family and weight of each child because the number of children in a family is not a continuous variable: you can’t have 2.3 children in a family.

Figure 3: The relationship between height (in meters) and weight (in kilograms) of members of the girls’ softball team. “OLS example weight vs height scatterplot” by Stpasha is in the Public Domain.

How to make a graph

  1. Identify your independent and dependent variables.
  2. Choose the correct type of graph by determining whether each variable is continuous or not.
  3. Determine the values that are going to go on the X and Y axes. If the values are continuous, they need to be evenly spaced based on the value.
  4. Label the X and Y axes, including units.
  5. Graph your data.
  6. Add a descriptive caption to your graph. Note that data tables are titled above the figure and graphs are captioned below the figure.
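Step 2 can be summarized as a simple decision rule. The sketch below encodes only the three cases described in this section, and it is a simplification: the function name and its parameters are assumptions made for this example, and it returns `None` for combinations the text does not discuss.

```python
def choose_graph_type(independent_is_continuous, dependent_is_continuous,
                      shows_change_over_range=False):
    """Suggest a graph type using the rules described in this section."""
    if not independent_is_continuous and dependent_is_continuous:
        # Data divided into categories, such as fertilizer brands or species.
        return "bar graph"
    if independent_is_continuous and dependent_is_continuous:
        if shows_change_over_range:
            # A change over a continuous range, such as temperature over time.
            return "line graph"
        # A relationship between two continuous variables, such as
        # height and weight.
        return "scatter plot"
    return None  # combination not covered in this section


print(choose_graph_type(False, True))       # fertilizer brand vs. growth
print(choose_graph_type(True, True, True))  # bubble height over time
print(choose_graph_type(True, True))        # height vs. weight
```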


Let’s go back to the data from our fertilizer experiment and use it to make a graph. I’ve decided to graph only the average growth for the four plants because that is the most important piece of data. Including every single data point would make the graph very confusing.

  1. The independent variable is type of treatment and the dependent variable is plant growth (in cm).
  2. Type of treatment is not a continuous variable. There is no midpoint value between fertilizer brands (Brand A 1/2 doesn’t make sense). Plant growth is a continuous variable. It makes sense to sub-divide centimeters into smaller values. Since the independent variable is categorical and the dependent variable is continuous, this graph should be a bar graph.
  3. Plant growth (the dependent variable) should go on the Y axis and type of treatment (the independent variable) should go on the X axis.
  4. Notice that the values on the Y axis are continuous and evenly spaced. Each line represents an increase of 5 cm.
  5. Notice that both the X and the Y axes have labels that include units (when required).
  6. Notice that the graph has a descriptive caption that allows the figure to stand alone without additional information given from the procedure: you know that this graph shows the average of the measurements taken from four tomato plants.

Figure 4: Average growth (in cm) of tomato plants when treated with different brands of fertilizer. There were four tomato plants in each group (n = 4).

Descriptive captions

All figures that present data should stand alone – this means that you should be able to interpret the information contained in the figure without referring to anything else (such as the methods section of the paper). This means that all figures should have a descriptive caption that gives information about the independent and dependent variable. Another way to state this is that the caption should describe what you are testing and what you are measuring. A good starting point to developing a caption is “the effect of [the independent variable] on the [dependent variable].”

Here are some examples of good captions for figures:

Here are a few less effective captions:


Writing for Science

Whether scientific research is basic science or applied science, scientists must share their findings for other researchers to expand and build upon their discoveries. Communication and collaboration within and between subdisciplines of science are key to the advancement of knowledge in science. For this reason, an important aspect of a scientist’s work is disseminating results and communicating with peers. Scientists can share results by presenting them at a scientific meeting or conference, but this approach can reach only the limited few who are present. Instead, most scientists present their results in peer-reviewed articles that are published in scientific journals. Peer-reviewed articles are scientific papers that are reviewed, usually anonymously, by a scientist’s colleagues, or peers. These colleagues are qualified individuals, often experts in the same research area, who judge whether or not the scientist’s work is suitable for publication. The process of peer review helps to ensure that the research described in a scientific paper or grant proposal is original, significant, logical, and thorough. Grant proposals, which are requests for research funding, are also subject to peer review. Scientists publish their work so other scientists can reproduce their experiments under similar or different conditions to expand on the findings. The experimental results must be consistent with the findings of other scientists.

Many journals, and the popular press, do not use a peer-review system. A large number of online open-access journals (journals with articles available without cost) are now available; many of them use rigorous peer-review systems, but some do not. Results of any studies published in these forums without peer review are not reliable and should not form the basis for other scientific work. In one exception, journals may allow a researcher to cite a personal communication from another researcher about unpublished results with the cited author’s permission.


Scientific articles are not literary works. Instead, they are meant to transmit information effectively and concisely. The need for clarity and brevity is especially important for other forms of science communication such as posters where the audience must be able to understand the significance of your research in just a few minutes, but the need is there for all forms of scientific communication.

There is an explicit format that scientific papers follow, with relatively small variations in style among journals. Papers are broken down into the following sections: title, abstract, introduction, methods, results, and discussion. Every section, except the title, should be labeled as such. Generally the section name is centered and underlined (or bold-faced) over the text. Although posters follow the same format as a paper, each section is abbreviated (once again, clarity is critical).


The title should give the reader a concise, informative description of the content and scope of the paper.



The abstract is a concise summary of the major findings of the study. It should be no longer than 9-10 sentences. It should summarize every subsequent section of the paper: state the purposes of the study, and then briefly summarize the methods, results, and conclusions. The abstract should be able to stand alone. Do not refer to any figures or tables, or cite any references. Because the abstract is a distillation of the paper, it is often written last. It is typically the hardest part of the paper to write.



In many journals, the introduction is also unlabeled, and simply starts after the abstract.

The introduction gives the rationale for the research. It answers the question “Why should anyone be interested in this work?” It usually includes background information, including the work of others, and a description of your objectives. If you are studying a particular species, give both the scientific (Latin) name and the common name the first time you mention your study animal. The scientific name is always underlined or italicized, and the genus name is capitalized while the species name is not. Cite only references pertinent to your study. Direct quotations are rarely used in scientific writing; instead state the findings of others in your own words. Furthermore, footnotes are rarely used in a scientific paper. Instead cite the author by last name, and the year that the source was published.

For instance:

Smith (1987) found that male mice prefer the odor of non-pregnant female mice to that of pregnant female mice.

Male mice prefer the odor of non-pregnant female mice to that of pregnant female mice (Smith, 1987).

When two people co-author a paper, both are cited:

Harrett and Garrett (1999) found no differences between male and female elephants in their response to the tape of a female vocalization.

When more than two people co-author a paper, cite only the first author, and refer to the other authors with the Latin phrase “et al.”, meaning “and others”:

Undergraduate students who came to lectures were more likely to receive a high grade on the exams (Thatcher et al., 2000).

The full reference for each work must be given in the literature cited section at the end of the paper. For references, select work from the primary literature: that is, work that is published by the same people who did it. In general, citing an encyclopedia or textbook is not appropriate for a scientific paper.

When organizing your introduction, begin with a general description of the topic, and then become more specific. For example, in a study of the olfaction in the reproductive behavior of mice, the skeleton of the introduction might be:

For reproduction to be successful, animals must be able to correctly assess the reproductive condition of a potential partner. Many different signals have evolved in animals to facilitate such assessment.
Olfactory signals seem to be particularly important in mammals.

Mice are particularly suited for studying the role of olfaction in reproductive behavior. Odor cues are involved in several aspects of mouse reproductive behavior, including… The aim of this study was…

Each of these sentences would be a good topic sentence of a different paragraph in the introduction. References should be cited where appropriate.

In sum, an introduction should convey your overall purpose in conducting the experiment as well as your specific objectives.


This section is also often called Materials and Methods. It is a very concise summary of the subjects, equipment, and procedures used, and it should contain enough information that someone else could replicate your work. It is NOT a list, but a narrative description. Because it is a narrative, it should not include a list of your materials. Rather, they should be described in the narrative as required. For example, you could say: “We measured 5 mL of enzyme solution into a test tube and heated it on a hot plate until it boiled.” From this, it is obvious that you used some sort of tool to accurately measure 5 mL of solution, as well as a test tube and a hot plate.

Only include information that is relevant to your experiment: do not include information that any scientist should know to do or that won’t affect the results (label the tubes, clean up afterwards, make a graph). If you are following the methods of another paper or a lab manual, simply cite the source. Then, you can concentrate on describing any changes that you made to that procedure. A common mistake is to let results creep into this section. 



The results section includes presentations of your data and the results of statistical analysis of your data. First, state the overall trend of the data. Did the majority of the data statistically support or contradict the null hypothesis?

Address each statistical test separately, often in separate paragraphs. For each type of data analyzed say whether your results are statistically significant, and in parentheses give the statistical test used, the value of the test statistic, and the probability level for that computed value. For example, “Male mice visited non-pregnant females significantly more often than pregnant females (chi square = 4.69; p < 0.05).”
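As a rough illustration of where a reported test statistic and probability level come from, the sketch below computes a chi-square goodness-of-fit statistic by hand. The visit counts are hypothetical numbers invented for this example (they are not data from any study and will not reproduce the value 4.69 quoted above), and the function names are assumptions made for illustration. The p-value shortcut used here applies only to tests with 1 degree of freedom.

```python
import math

def chi_square_goodness_of_fit(observed, expected):
    """Return the chi-square statistic: the sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_value_one_df(chi_square):
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2))

# Hypothetical counts, invented for illustration only:
# visits by male mice to non-pregnant and to pregnant females.
observed = [34, 18]

# Null hypothesis: no preference, so visits are expected to split evenly.
total = sum(observed)
expected = [total / 2, total / 2]

chi_square = chi_square_goodness_of_fit(observed, expected)
p = p_value_one_df(chi_square)
significant = p < 0.05

print(f"chi square = {chi_square:.2f}; p = {p:.3f}; significant at 0.05: {significant}")
```

In a results section you would report only the outcome, for example the test used, the value of the test statistic, and the probability level, not the calculation itself.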

Do not present your raw data. Instead, present data in an easy-to-read form. You will probably use a figure or a table to present your results. Refer to each table by a number (Table 1, Table 2, etc.). Each table should have a concise heading at the top. Graphs and diagrams are both called figures and are numbered consecutively (Fig. 1, Fig. 2, etc.). They have headings at the bottom. Axes on graphs should be clearly labeled. See the section on Presenting Your Data for more information.

You must refer to every table and figure at least once in the text. Often this can be done parenthetically: “Male mice visited non-pregnant females significantly more often than pregnant females (chi square = 4.69; p < 0.05; Fig. 2).”

Do not use the word “significant” unless it can be supported by statistical evidence.
A common mistake is to discuss the implications of your findings. Save that for the discussion section.


Here you are to give a reader the “take home” message of the study. Begin by briefly summarizing the major findings of your study. Then discuss each finding one at a time (usually in separate paragraphs).

Interpret your results in light of the biology you are studying. Your discussion section should parallel your introduction: if you discussed the role of reproductive biology of the mouse at the beginning of your study, come back to it again here. The paper should come full circle.

Use references throughout your discussion to support your points. Compare your findings with those of similar studies.

Do not make statements that cannot be supported by the data, and be sure none of your conclusions are contradicted by the data. Discuss unexpected results or possible errors in the experiment, but don’t focus on “what didn’t work”. We all know this was a classroom research project! 

Literature cited 

Each academic discipline uses a different format to cite the references it uses. These differences can be dramatic (English vs. Science, for example) or small (Psychology vs. Biology), but they are based on what information is seen as important. In this course, we follow the format of most biology journals by using CSE format. See the section on Citing Your Sources for more specific information.

General hints

For stylistic hints, browse one of the many books in the library on scientific writing. Remember, being a good writer in English “121” doesn’t mean your skills will translate to science writing without work (though you have a great start!).

Outline your paper. Use topic sentences for every paragraph. You should be able to go back and underline each topic sentence after you are finished.

Keep your report as short as you can, consistent with clarity and completeness. Do not “pad” with a lot of irrelevant information just to show you know a lot.


A note on Plagiarism: Plagiarism is a serious academic offense. However, most instances of plagiarism are the result of a lack of care and effort, and not intentional misbehavior. Here is a general rule to follow: Don’t Cut and Paste! Accidental or not, any occurrence of academic dishonesty will be treated seriously. Ignorance is no defense.

Be sure to proofread for typographical errors, poor grammar, or unclear sentence structure.

Try to start paragraphs with a topic sentence or a summary statement. Then follow it with supporting statements. This technique makes your writing clearer and easier to follow. Ideally, someone could read the first sentence of each paragraph and still understand the gist of your paper.

PLEASE avoid dull scientific writing, particularly the use of the passive voice. As much as possible, use an active voice. Passive writing takes up more space and is dull, dull, dull. Look at the example here; see how this is more exciting and can lead to an interesting ecological observation about the importance of the predator – prey relationship involved?

BAD: Mussels are eaten by sea stars.
GOOD: Sea stars eat mussels.
BETTER STILL: Sea stars are voracious predators of mussels.

Make sure the object to which words such as “this” or “it” refer is clear.

Combine sentences with low information content into one sentence. This will make your writing more streamlined and less repetitive. But don’t write run-on sentences either!

Always refer to work people have done in the past in the past tense. Refer to species attributes or other ongoing, continuing states in the present tense.

The word “data” is plural. Say either “these data are…” or “this datum is…”

BAD: Wentworth (1985) studied vegetation in Arizona. He found that tree species distributions followed gradients.
GOOD: In the Huachuca Mountains of Arizona, both elevation and the amount of light influenced tree species distributions (Wentworth 1985).

Scientific names of animals and plants, such as Homo sapiens, are underlined or italicized (as are most Latin words). Genera and all higher taxa are capitalized; species names are lowercase.

Do not anthropomorphize. A honeybee or a dandelion does not have the same consciousness or emotional life as your roommate. In extreme forms, this type of writing is appropriate for the tabloids in supermarket checkout lines…

BAD: Kudzu, an Asian super weed, intends to dominate and conquer the entire southeastern United States.
GOOD: Kudzu is a noxious weed introduced from Asia that has quickly spread from its point of introduction throughout the southeastern United States.

Try varying the length of your sentences, and keep in mind that a sentence with 4 words is probably too short, and one with 20 too long.

Avoid using too many clauses in one sentence. If you see that you have a lot of commas, that is a clue that you’ve overdone the number of clauses in the sentence.

Try reading your work out loud. Anything that is written poorly will be difficult to read. This technique will alert you to problem areas in your writing.

BE PREPARED TO WRITE SEVERAL DRAFTS! Good, hard editing will turn you from a mediocre to a good writer. And with good writing, you are able to show your GREAT thinking!


These instructions are adapted by Walter Shriner. Originally from Jakob, E. 1995. Laboratory manual for animal behavior. Bowling Green University and Muller, K. 1991. Ornithology laboratory. University of California, Davis.

Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Biology. OpenStax CNX. May 27, 2016


Using Credible Sources

Learning Objectives

Course Objective for this section: Select, evaluate, and utilize discipline-specific information and literature to explore topics.

  • Differentiate between questions that can and cannot be answered using science.
  • Identify appropriate credible sources of information to research a topic.
  • Evaluate sources of information for their strengths and weaknesses.


Science is a very specific way of learning, or knowing, about the world. Humans have used the process of science to learn a huge amount about the way the natural world works. Science is responsible for amazing innovations in medicine, hygiene, and technology. There are, however, areas of knowledge and human experience to which the methods of science cannot be applied. These include such things as answering purely moral questions, aesthetic questions, or what can be generally categorized as spiritual questions. Science cannot investigate these areas because they are outside the realm of material phenomena, the phenomena of matter and energy, and cannot be observed and measured.

Questions that can be answered using science

·       What is the optimum temperature for the growth of E. coli bacteria?

·       Do birds prefer bird feeders of a specific color?

·       What is the cause of this disease?

·       How effective is this drug in treating this disease?

Questions that cannot be answered using science

·       How tall is Santa Claus?

·       Do angels exist?

·       Which is better: classical music or rock and roll?

·       What are the ethical implications of human cloning?

Since this is a biology class, we will be focusing on questions that can be answered scientifically. Remember that in the scientific process, observations lead to questions. A scientific question is one that can be answered by using the process of science (testing hypotheses, making observations about the natural world, designing experiments).

Sometimes you will directly make observations yourself about the natural world that lead you to ask scientific questions; other times you might hear or read something that leads you to ask a question. Regardless of how you make your initial observation, you will want to do research about your topic before you start setting up an experiment. When you’re learning about a topic, it’s important to use credible sources of information.

Types of Sources

Whether conducting research in the social sciences, humanities (especially history), arts, or natural sciences, the ability to distinguish between primary and secondary source material is essential. Basically, this distinction illustrates the degree to which the author of a piece is removed from the actual event being described. This means whether the author is reporting information first hand (or is first to record these immediately following an event), or conveying the experiences and opinions of others—that is, second hand. In biology, the distinction would be between the person (or people) who conducted the research and someone who didn’t actually do the research, but is merely reporting on it.

Primary sources

These are contemporary accounts of an event, written by someone who experienced or witnessed the event in question. In general, these original documents (i.e., they are not about another document or account) are often diaries, letters, memoirs, journals, speeches, manuscripts, interviews, photographs, audio or video recordings, or original literary or theatrical works.

In science, a “primary source” or the “primary literature” refers to the original publication of a scientist’s new data, results, and conclusions. These articles are written for other experts in a specific scientific field.

You’ve probably done a writing assignment or other project during which you have participated in a peer review process. During this process, your project was critiqued and evaluated by people of similar competence to yourself (your peers). This gave you feedback on which to improve your work. Scientific articles typically go through a peer review process before they are published in an academic journal. In this case, the peers who are reviewing the article are other experts in the specific field about which the paper is written. This allows other scientists to critique experimental design, data, and conclusions before that information is published in an academic journal. Often, the scientists who did the experiment and who are trying to publish it are required to do additional work or edit their paper before it is published. The goal of the scientific peer review process is to ensure that published primary articles contain the best possible science.

Secondary sources

The function of a secondary source is to interpret the primary source. A secondary source can be described as at least one step removed from the event or phenomenon under review. Secondary source materials interpret, assign value to, conjecture upon, and draw conclusions about the events reported in primary sources. These are usually in the form of published works such as magazine articles or books, but may include radio or television documentaries, or conference proceedings.

Popular vs. Scholarly Sources

| Popular | Scholarly |
|---|---|
| Broad range of topics, presented in shorter articles | Specific, narrowly focused topics in lengthy, in-depth articles |
| Articles offer overview of subject matter; interpretation, rather than original research; sometimes contain feature articles and reports on current social issues and public opinion | Articles often contain previously unpublished research and detail new developments in field |
| Intended to attract a general readership without any particular expertise or advanced education | Intended for specialist readership of researchers, academics, students and professionals |
| Written by staff (not always attributed) or freelance writers using general, popular language | Written by identified specialists and researchers in subject area, usually employing technical, subject-specific language and jargon |
| Edited and approved for publication in-house (not peer-reviewed) | Critically evaluated by peers (fellow scholars) in field for content, scholarly soundness, and academic value |
| Articles rarely contain references or footnotes and follow no specific format | Well-researched, documented articles nearly always follow standard format: abstract, introduction, literature review, methodology, results, conclusion, bibliography/references |
| Designed to attract eye of potential newsstand customers: usually filled with photographs or illustrations, printed on glossier paper | Sober design: mostly text with some tables or graphs accompanying articles; usually little or no photography; negligible, if any, advertising; rarely printed on high-gloss paper |
| Each issue begins with page number ‘1’ | Page numbers of issues within a volume (year) are usually consecutive (i.e., first page of succeeding issue is number following last page number of previous issue) |
| Presented to entertain, promote point of view, and/or sell products | Intended to present researchers’ opinions and findings based on original research |
| Examples: Newsweek, Rolling Stone, Vogue | Examples: Science, Nature, Journal of Microbial and Biochemical Technology |

In science, it is often extremely difficult to read and understand primary articles unless you are an expert in that specific scientific field. Secondary sources are typically easier to read and can give you the important information from a primary source, but only if the secondary source has interpreted the information correctly! It is generally better to go to the primary source if possible because otherwise you are relying on someone else’s interpretation of the information. However, it is better to use a source that you can read and understand than a source that you can’t. For this reason, it is very important to be able to identify credible secondary sources.

Evaluating Credibility

When you write a scientific paper (or any paper, really), you want to back up your statements with credible sources. You will need to identify credible sources so that you can research scientific topics and develop interesting scientific questions. You will also need sources to help you form a well-educated hypothesis that is not just based on your guess about what will happen. A credible source is one that is trustworthy, so the information in it can be believed. Credible sources are written by people who are experts in the field (or at least are very knowledgeable) about the subject that they are commenting on.

We will be using a variation of the CRAAP test to help you determine whether the sources that you find are credible. The CRAAP Test was created by Sarah Blakeslee, of the Meriam Library at California State University, Chico. It is adapted below. When evaluating the credibility of sources using this method, if it’s CRAAP, it’s good!

You can use the table below to help you evaluate the credibility of your sources.

Credibility Table

| Factors to consider | Least reliable (0 points) | Possibly reliable (1 point) | Most reliable (2 points) |
|---|---|---|---|
| Currency | No date of publication or revision given | Outdated for this particular topic | Recently published or revised |
| Reliable source | Unreliable website, no additional info available | Possibly reliable | Official government or organization, institutional sites, academic journals |
| Author | No author is given / the author is not qualified to write about this topic | Author is educated on topic or is staff of an organization assumed to be knowledgeable on this specific topic | Specifically identified expert in this field with degrees / credentials in this subject |
| Accuracy | No review process and information is not supported by evidence from cited sources | The information may have been reviewed or edited by someone knowledgeable in the field. It mentions but does not directly cite other sources | The information has been peer reviewed and is supported by evidence from cited credible sources |
| Purpose | Obviously biased or trying to sell you something | Sponsored source; may present unbalanced information | Balanced, neutral, presents all sides of the issue fully |

In general, do not use a source if it doesn’t pass the CRAAP test! For our purposes, do not use any sources that score less than 6 points using the credibility table.

Several examples are given below for sources that you might come across if you were researching the topic of vaccine safety.

Example 1:

CDC (Centers for Disease Control and Prevention). Aug 28, 2015. Vaccine Safety [Internet]. [cited May 12, 2016]. Available from:

| Factor | Score | Discussion – why did you give that score? |
|---|---|---|
| Currency | 2 | Aug 28 2015 is recent and shows that this information is updated frequently. |
| Reliable source | 2 | I looked at the “about this organization” and learned that the CDC is a major government organization that works to protect Americans from health, safety, and security threats. They are a division of the US Department of Health and Human Services. |
| Author | 1 | A specific author was not identified, but the page states that the content is from the CDC, which suggests that it was written by a knowledgeable staff member. |
| Accuracy | 1.5 | No information is given about the review process, but it was probably edited by staff at the CDC. There is a list of citations and links to primary scientific articles supporting the information. |
| Purpose | 2 | The point of view does not appear to be biased because it seems to be presenting factual information. Admittedly, it only presents the pro-vaccine side of the argument. There are no ads on the page or other information trying to change the reader’s viewpoint. |
| Credibility Score | 8.5/10 | This seems like an excellent source to use for research. It’s readable and I could look at the primary articles if I wanted to check them out. |

Example 2: 

Stop Mandatory Vaccination. N.d. The Dangers of Vaccines and Vaccinations [Internet]. [cited May 12, 2016]. Available from:


| Factor | Score | Discussion – why did you give that score? |
|---|---|---|
| Currency | 1 | The copyright is given as 2015, but there is no date for this specific article. It does reference something that took place in 2015, so it is likely written after that. |
| Reliable source | 0 | The “About” page states that the organization was started by Larry Cook using a GoFundMe platform. |
| Author | 0 | Larry Cook has been devoted to the natural lifestyle for 25 years, but doesn’t appear to have any degrees or specific expertise on this topic. Other contributing authors include Landee Martin, who has a Bachelor’s of Science in Psychology (which isn’t related to vaccine safety), and Brittney Kara, who is a mother who has studied holistic living for the last 17 years. None of the individuals specifically identified on the website appear to be experts in the field. |
| Accuracy | 0.5 | It seems unlikely that there is any review process. There are links to several sources, but none of them appear to be primary scientific articles. Several are links to interviews. |
| Purpose | 0 | This source is extremely biased. Even the name of the website is biased. There is a link to donate to the webpage. There are at least 10 ads for anti-vaccine books and websites. |
| Credibility Score | 1.5/10 | I would not want to use this source to research this topic. It’s extremely biased and doesn’t seem to offer much evidence for its assertions. |
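The scoring in the two examples above is simple addition: five factors, each worth 0 to 2 points, are summed to a total out of 10, and a source scoring less than 6 points is not used. A minimal sketch of that arithmetic follows. The function and variable names are assumptions made for illustration; the factor scores are the ones given in the two examples.

```python
FACTORS = ("Currency", "Reliable source", "Author", "Accuracy", "Purpose")
MIN_SCORE = 6  # sources that score less than 6 points should not be used

def credibility_score(scores):
    """Sum the five factor scores (each from 0 to 2 points) into a total out of 10."""
    for factor in FACTORS:
        if not 0 <= scores[factor] <= 2:
            raise ValueError(f"{factor} must be scored between 0 and 2")
    return sum(scores[factor] for factor in FACTORS)

def is_usable(scores):
    """Return True when the total reaches the 6-point cutoff."""
    return credibility_score(scores) >= MIN_SCORE

# Scores taken from Example 1 (CDC) and Example 2 (Stop Mandatory Vaccination).
cdc = {"Currency": 2, "Reliable source": 2, "Author": 1, "Accuracy": 1.5, "Purpose": 2}
smv = {"Currency": 1, "Reliable source": 0, "Author": 0, "Accuracy": 0.5, "Purpose": 0}

for name, scores in (("CDC", cdc), ("Stop Mandatory Vaccination", smv)):
    print(f"{name}: {credibility_score(scores)}/10, usable: {is_usable(scores)}")
```

The number is only a summary. The written discussion for each factor is what justifies the score, so record your reasoning alongside it.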

Citing Your Sources

One of the goals for any class is to help students become better scholars. And, one of the important skills of scholarship is proper citation of resources used. Citations demonstrate your “credentials” as a scholar, and provide a resource to your readers of good reference material.

Why do you have to cite your sources?

No research paper is complete without a list of the sources that you used in your writing.  Scholars are very careful to keep accurate records of the resources they’ve used, and of the ideas and concepts they’ve quoted or used from others. This record keeping is generally presented in the form of citations.

A citation is a description of a book, article, URL, etc. that provides enough information so that others can locate the source you used themselves. It allows you to credit the authors of the sources you use and clarify which ideas belong to you and which belong to other sources. And providing a citation or reference will allow others to find and use these sources as well.  Most research papers have a list of citations or cited references and there are special formatting guidelines for different types of research.

However, there are many “proper” formats because each discipline has its own rules. In general we ask only that you use one of the “official” formats and that you use it consistently. To understand what we mean by “consistent”, compare the citations in two scientific journals. You will notice that each journal has its own rules for whether an article title is in quotes, bold, underlined, etc., but within each journal the rule applies to all reference citations. Below is a condensed guide to the general format used in science (CSE). For more detailed information consult one of the online citation guides and generators.


Plagiarism is presenting the words or ideas of someone else as your own without proper acknowledgment of the source. When you work on a research paper you will probably find supporting material for your paper from works by others. It’s okay to quote people and use their ideas, but you do need to correctly credit them. Even when you summarize or paraphrase information found in books, articles, or Web pages, you must acknowledge the original author. To avoid plagiarism, include a reference to any material you use that provides a fact not commonly known, or whenever you use information from another author.  In short, if you didn’t collect the data or reach the conclusion on your own, cite it!

These are all examples of plagiarism:

 Tips for Avoiding Plagiarism:

Citing Sources in CSE Format

The Council of Science Editors (CSE) citation format is commonly used in scientific writing. CSE format emphasizes the information that is important when writing scientifically: who wrote the information and when they wrote it. In different fields, there is an emphasis on different types of information. In the humanities, MLA format is commonly used. This style emphasizes the author’s name and the page number. This information allows a reader to track down the exact quotes that are being discussed. Another commonly used format, APA, emphasizes the author’s name and the year the information was published.

The standard format for citing a source in science writing is the Name-year format. In this format, the first author’s last name is followed by the year of publication. For example: Not all populations of alligators in the Everglades are at risk from habitat loss (Nicholson, 2002).

If you are not familiar with the CSE citation style, you can get additional information and examples at

Beware of computerized “citation creators.”  While they can get you part way to a correct citation, they rarely are 100% correct.  For example, they often fail to put the last name first.


Citing a scientific journal article

Author’s last name first initial, next author’s last name first initial. Date published. Title of Article. Journal Name. Volume (issue): pages.

Please note that you need to cite the JOURNAL, not the DATABASE that you got it from. Citing the database in which you found a scientific journal article is like citing Google for an internet resource that you are using.

Flores-Cruz Z, Allen C. 2011. Necessity of OxyR for the hydrogen peroxide stress response and full virulence in Ralstonia solanacearum. Appl Environ Microbiol. 77(18):6426-6432.

Werling BP, Lowenstein DM, Straub CS, Gratton C. 2012. Multi-predator effects produced by functionally distinct species vary with prey density. J Insect Sci. 12(30):346-378.

Shriner WM. 1998. Yellow-bellied marmot and golden-mantled ground squirrel responses to heterospecific alarm calls. Animal Behaviour. 55:529-536.
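To make the journal-article template concrete, here is a minimal sketch that assembles a reference from its parts in the order given above. The function name and its parameters are hypothetical, chosen for this illustration; it is not an official CSE tool and does not handle every case (for example, articles without page numbers).

```python
def format_journal_citation(authors, year, title, journal, volume, pages, issue=None):
    """Build a reference as: Authors. Year. Title. Journal. Volume(issue):pages.

    `authors` is a list of (last_name, initials) pairs, in the order they
    appear on the article.
    """
    author_text = ", ".join(f"{last} {initials}" for last, initials in authors)
    volume_text = f"{volume}({issue})" if issue is not None else str(volume)
    return f"{author_text}. {year}. {title}. {journal}. {volume_text}:{pages}."

# Rebuild the first example reference from its parts.
citation = format_journal_citation(
    authors=[("Flores-Cruz", "Z"), ("Allen", "C")],
    year=2011,
    title=("Necessity of OxyR for the hydrogen peroxide stress response "
           "and full virulence in Ralstonia solanacearum"),
    journal="Appl Environ Microbiol",
    volume=77,
    issue=18,
    pages="6426-6432",
)
print(citation)
```

Note that the database where you found the article never appears among the parts: only the journal information is cited.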

Citing an internet resource

Author’s last name, first initial. Date published. Title of Website [Internet]. Publisher information. [cited on date that you accessed the information]. Available from: URL where you accessed the source.

Williamson RC. 2004. Deciduous tree galls [Internet]. Madison (WI): University of Wisconsin-Madison; [cited 2013 Sep 12]. Available from:

[BP] The Biology Project. 2003. The chemistry of amino acids [Internet]. University of Arizona; [cited 2004 Mar 17]. Available from:

Hilton-Taylor C, compiler. 2000. 2000 IUCN red list of threatened species [Internet]. Gland (Switzerland) and Cambridge (UK): IUCN; [cited 2002 Feb 12]. Available from:

Citing Sources Within Text

We will be using the CSE Name-year format for citations. When you want to provide a citation reference for a statement that you are making, you should end the sentence with (First author’s last name, year). If the article was written by an organization and not a specific author, you can use the name of the organization (or an abbreviation for the name).

Example: Sickle cell anemia is caused by abnormally-shaped haemoglobin proteins (NIH, 2012).
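The in-text rules can be summarized as a small decision: one author (or an organization) is named alone, two authors are both named, and more than two are shortened to the first author plus “et al.” The sketch below encodes that decision. The function name is hypothetical, and the punctuation around “et al.” is an assumption made here; check the style your instructor or journal expects.

```python
def in_text_citation(last_names, year):
    """Return a Name-year citation such as "(Smith, 1987)".

    `last_names` lists the authors' last names in order; an organization
    name or abbreviation can be passed as a single entry.
    """
    if not last_names:
        raise ValueError("at least one author or organization is required")
    if len(last_names) == 1:
        names = last_names[0]
    elif len(last_names) == 2:
        names = f"{last_names[0]} and {last_names[1]}"
    else:
        # More than two authors: cite only the first, followed by "et al."
        names = f"{last_names[0]} et al."
    return f"({names}, {year})"

print(in_text_citation(["NIH"], 2012))
print(in_text_citation(["Flores-Cruz", "Allen"], 2011))
print(in_text_citation(["Werling", "Lowenstein", "Straub", "Gratton"], 2012))
```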

In the References Cited section (a.k.a. Literature cited…) list all the sources you cited in your paper, but do not include any items that you did not specifically cite within the body of your paper or project, even if you read them!  Except in rare instances, do not cite a reference that you have not personally read.

You should then list all your references in the Literature Cited section alphabetically by author’s last name.
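The two requirements above (every cited source is listed, nothing uncited is listed, and the list is alphabetical by author) can be checked mechanically before you hand in a paper. The following is a minimal sketch under stated assumptions: the function name and the dictionary keys are hypothetical, and the set of in-text citations is made up for illustration so that the check has something to report.

```python
def check_reference_list(cited, references):
    """Compare in-text citations with the reference list.

    `cited` is a set of (first_author_last_name, year) pairs used in the text.
    `references` is a list of dicts with "author" and "year" keys.
    Returns (missing, uncited, ordered):
      missing - cited in the text but absent from the list
      uncited - listed but never cited in the text
      ordered - the reference list sorted alphabetically by author
    """
    listed = {(ref["author"], ref["year"]) for ref in references}
    missing = sorted(cited - listed)
    uncited = sorted(listed - cited)
    ordered = sorted(references, key=lambda ref: (ref["author"].lower(), ref["year"]))
    return missing, uncited, ordered

references = [
    {"author": "Werling", "year": 2012, "journal": "J Insect Sci"},
    {"author": "Flores-Cruz", "year": 2011, "journal": "Appl Environ Microbiol"},
    {"author": "Shriner", "year": 1998, "journal": "Animal Behaviour"},
]
# Illustrative set of citations found in the body of a paper.
cited = {("Flores-Cruz", 2011), ("Shriner", 1998), ("Smith", 1987)}

missing, uncited, ordered = check_reference_list(cited, references)
print("Add to the list:", missing)
print("Remove from the list (or cite in the text):", uncited)
print("Alphabetical order:", [ref["author"] for ref in ordered])
```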

For more information and lots of examples of what to do in specific instances, please visit


“Cite Your Sources” by University of California, Santa Cruz, University Library is licensed under CC BY 3.0

“What is plagiarism?” by University of California, Santa Cruz, University Library is licensed under CC BY 3.0


“Distinguish Between Primary and Secondary Sources” by University of California, Santa Cruz, University Library is licensed under CC BY 3.0

“Distinguish between Popular and Scholarly Journals” by University of California, Santa Cruz, University Library is licensed under CC BY 3.0


Chemistry for Biology

Learning Outcomes

Course Outcomes for this section: 

  • Describe the structure of biologically-important molecules (carbohydrates, lipids, proteins, nucleic acids, water) and how their structure leads to their function.

Living things are highly organized and structured, following a hierarchy that can be examined on a scale from small to large. The examination of the smallest parts involves a knowledge of chemistry. We can put the levels of organization of living things in order from smallest to largest.

These first 2 levels (or 3, depending on how you categorize macromolecules) are typically studied in chemistry or biochemistry courses. However, a working knowledge of atoms and molecules is required to understand how these small pieces work to make larger, living organisms.

Once we move beyond one single organism, we have reached the study of ecology. You’ll look at these topics in BI213.

A flow chart shows the hierarchy of living organisms. From smallest to largest, this hierarchy includes: (1) Organelles, such as nuclei, that exist inside cells. (2) Cells, such as a red blood cell. (3) Tissues, such as human skin tissue. (4) Organs such as the stomach make up the human digestive system, an example of an organ system. (5) Organisms, populations, and communities. In a forest, each pine tree is an organism. Together, all the pine trees make up a population. All the plant and animal species in the forest comprise a community. (6) Ecosystems: the coastal ecosystem in the Southeastern United States includes living organisms and the environment in which they live. (7) The biosphere: encompasses all the ecosystems on Earth.

Figure 1 The biological levels of organization of living things are shown. From a single organelle to the entire biosphere, living organisms are parts of a highly structured hierarchy. (credit “organelles”: modification of work by Umberto Salvagnin; credit “cells”: modification of work by Bruce Wetzel, Harry Schaefer/ National Cancer Institute; credit “tissues”: modification of work by Kilbad; Fama Clamosa; Mikael Häggström; credit “organs”: modification of work by Mariana Ruiz Villareal; credit “organisms”: modification of work by “Crystal”/Flickr; credit “ecosystems”: modification of work by US Fish and Wildlife Service Headquarters; credit “biosphere”: modification of work by NASA)


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

Text adapted from: OpenStax, Concepts of Biology. OpenStax CNX. May 25, 2017



An atom is the smallest component of an element that retains all of the chemical properties of that element. For example, one hydrogen atom has all of the properties of the element hydrogen: it exists as a gas at room temperature, and it bonds with oxygen to create a water molecule. Hydrogen atoms cannot be broken down into anything smaller while still retaining the properties of hydrogen. If a hydrogen atom were broken down into subatomic particles, it would no longer have the properties of hydrogen.

At the most basic level, all organisms are made of a combination of elements. They contain atoms that combine together to form molecules. In multicellular organisms, such as animals, molecules can interact to form cells that combine to form tissues, which make up organs. These combinations continue until entire multicellular organisms are formed.

All atoms contain protons, electrons, and neutrons (Figure 1). The only exception is the most common form of hydrogen (H), which is made of one proton and one electron and has no neutrons. A proton is a positively charged particle that resides in the nucleus (the core) of an atom and has a mass of 1 and a charge of +1. An electron is a negatively charged particle that travels in the space around the nucleus; in other words, it resides outside of the nucleus. It has a negligible mass and a charge of –1.

Illustration of an atom showing two neutrons and two protons in the center, with a circle labeled as the nucleus around them. Another circle shows an orbit with two electrons outside of the nucleus

Figure 1 Atoms are made up of protons and neutrons located within the nucleus, and electrons surrounding the nucleus.

Neutrons, like protons, reside in the nucleus of an atom. They have a mass of 1 and no charge. The positive (protons) and negative (electrons) charges balance each other in a neutral atom, which has a net zero charge.

Because protons and neutrons each have a mass of 1, the mass of an atom is equal to the number of protons and neutrons of that atom. The number of electrons does not factor into the overall mass, because their mass is so small.

As stated earlier, each element has its own unique properties. Each contains a different number of protons and neutrons, giving it its own atomic number and mass number. The atomic number of an element is equal to the number of protons that element contains. The mass number, or atomic mass, is the number of protons plus the number of neutrons of that element. Therefore, it is possible to determine the number of neutrons by subtracting the atomic number from the mass number.

These numbers provide information about the elements and how they will react when combined. Different elements have different melting and boiling points, and are in different states (liquid, solid, or gas) at room temperature. They also combine in different ways. Some form specific types of bonds, whereas others do not. How they combine is based on the number of electrons present. Because of these characteristics, the elements are arranged into the periodic table of elements, a chart of the elements that includes the atomic number and relative atomic mass of each element. The periodic table also provides key information about the properties of elements (Figure 2) —often indicated by color-coding. The arrangement of the table also shows how the electrons in each element are organized and provides important details about how atoms will react with each other to form molecules.

Isotopes are different forms of the same element that have the same number of protons, but a different number of neutrons. Some elements, such as carbon, potassium, and uranium, have naturally occurring isotopes. Carbon-12, the most common isotope of carbon, contains six protons and six neutrons. Therefore, it has a mass number of 12 (six protons and six neutrons) and an atomic number of 6 (which makes it carbon). Carbon-14 contains six protons and eight neutrons. Therefore, it has a mass number of 14 (six protons and eight neutrons) and an atomic number of 6, meaning it is still the element carbon. These two alternate forms of carbon are isotopes. Some isotopes are unstable and will lose protons, other subatomic particles, or energy to form more stable elements. These are called radioactive isotopes or radioisotopes.
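The relationship between atomic number, mass number, and neutron count can be sketched in a few lines of Python. This is only an illustration of the subtraction described above; the function name `neutron_count` is a hypothetical helper, not part of any library.

```python
def neutron_count(mass_number: int, atomic_number: int) -> int:
    """Return the number of neutrons in an atom.

    mass_number   = protons + neutrons
    atomic_number = protons
    """
    if atomic_number < 1 or mass_number < atomic_number:
        raise ValueError("mass number must be at least the atomic number")
    return mass_number - atomic_number


# Carbon-12 and carbon-14 share atomic number 6 but differ in neutrons.
print(neutron_count(12, 6))  # 6
print(neutron_count(14, 6))  # 8
```

Both isotopes return a different neutron count for the same atomic number, which is what makes them isotopes of the same element.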

Periodic table of elements.

Figure 2 Arranged in columns and rows based on the characteristics of the elements, the periodic table provides key information about the elements and how they might interact with each other to form molecules. Most periodic tables provide a key or legend to the information they contain.


Evolution in Action

Carbon Dating: Carbon-14 (14C) is a naturally occurring radioisotope that is created in the atmosphere by cosmic rays. This is a continuous process, so more 14C is always being created. As a living organism develops, the relative level of 14C in its body is equal to the concentration of 14C in the atmosphere. When an organism dies, it is no longer ingesting 14C, so the ratio will decline. 14C decays to 14N by a process called beta decay; it gives off energy in this slow process.

After approximately 5,730 years, only one-half of the starting concentration of 14C will have been converted to 14N. The time it takes for half of the original concentration of an isotope to decay to its more stable form is called its half-life. Because the half-life of 14C is long, it is used to date formerly living objects, such as fossils. Using the ratio of the 14C concentration found in an object to the amount of 14C detected in the atmosphere, the amount of the isotope that has not yet decayed can be determined. Based on this amount, the age of the fossil can be calculated, for objects up to about 50,000 years old (Figure 3). Isotopes with longer half-lives, such as potassium-40, are used to calculate the ages of older fossils. Through the use of carbon dating, scientists can reconstruct the ecology and biogeography of organisms living within the past 50,000 years.

Photograph shows scientists digging pygmy mammoth skeleton fossils from the ground.

Figure 3 The age of remains that contain carbon and are less than about 50,000 years old, such as this pygmy mammoth, can be determined using carbon dating. (credit: Bill Faulkner/NPS)
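The half-life reasoning in this box can be turned into a small calculation. The sketch below assumes simple exponential decay and uses the approximate 5,730-year half-life given above; the function names are hypothetical, and real radiocarbon dating also involves calibration steps that are not shown here.

```python
import math

HALF_LIFE_C14_YEARS = 5730  # approximate half-life of carbon-14 from the text


def fraction_remaining(years: float, half_life: float = HALF_LIFE_C14_YEARS) -> float:
    """Fraction of the original 14C still present after a given time."""
    return 0.5 ** (years / half_life)


def estimated_age(fraction: float, half_life: float = HALF_LIFE_C14_YEARS) -> float:
    """Age in years implied by the fraction of 14C that has not decayed."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be greater than 0 and at most 1")
    # Each halving of the remaining fraction corresponds to one half-life.
    return half_life * math.log2(1 / fraction)


print(estimated_age(0.5))   # 5730.0 (one half-life)
print(estimated_age(0.25))  # 11460.0 (two half-lives)
print(fraction_remaining(50000))  # a very small fraction of the original 14C
```

The last line suggests why the method stops being useful at around 50,000 years: so little 14C remains that it becomes hard to measure.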


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Concepts of Biology. OpenStax CNX. March 22, 2017


Chemical Bonds

How elements interact with one another depends on how their electrons are arranged and how many openings for electrons exist at the outermost region where electrons are present in an atom. Electrons exist at energy levels that form shells around the nucleus. The closest shell can hold up to two electrons. The closest shell to the nucleus is always filled first, before any other shell can be filled. Hydrogen has one electron; therefore, it has only one spot occupied within the lowest shell. Helium has two electrons; therefore, it can completely fill the lowest shell with its two electrons. If you look at the periodic table, you will see that hydrogen and helium are the only two elements in the first row. This is because they only have electrons in their first shell. Hydrogen and helium are the only two elements that have the lowest shell and no other shells.

The second and third energy levels can hold up to eight electrons. The eight electrons are arranged in four pairs and one position in each pair is filled with an electron before any pairs are completed.

Looking at the periodic table again (Figure 1), you will notice that there are seven rows. These rows correspond to the number of shells that the elements within that row have. The elements within a particular row have increasing numbers of electrons as the columns proceed from left to right. Although every element in a row has the same number of shells, not all of the shells are completely filled with electrons. If you look at the second row of the periodic table, you will find lithium (Li), beryllium (Be), boron (B), carbon (C), nitrogen (N), oxygen (O), fluorine (F), and neon (Ne). These all have electrons that occupy only the first and second shells. Lithium has only one electron in its outermost shell, beryllium has two electrons, boron has three, and so on, until the entire shell is filled with eight electrons, as is the case with neon.

Not all elements have enough electrons to fill their outermost shells, but an atom is at its most stable when all of the electron positions in the outermost shell are filled. Because of these vacancies in the outermost shells, we see the formation of chemical bonds, or interactions between two or more of the same or different elements that result in the formation of molecules. To achieve greater stability, atoms will tend to completely fill their outer shells and will bond with other elements to accomplish this goal by sharing electrons, accepting electrons from another atom, or donating electrons to another atom. Because the outermost shells of the elements with low atomic numbers (up to calcium, with atomic number 20) can hold eight electrons, this is referred to as the octet rule. An element can donate, accept, or share electrons with other elements to fill its outer shell and satisfy the octet rule.
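The shell-filling rules described above (two electrons in the first shell, up to eight in the second and third, innermost shell filled first) can be sketched as a short program. This is a simplified model that only covers atomic numbers 1 to 18; the names `SHELL_CAPACITIES`, `shell_configuration`, and `vacancies_in_outer_shell` are hypothetical and chosen for illustration.

```python
SHELL_CAPACITIES = [2, 8, 8]  # simplified capacities of shells 1 to 3, as in the text


def shell_configuration(atomic_number: int) -> list[int]:
    """Electrons per shell for a neutral atom, filling the innermost shell first."""
    if not 1 <= atomic_number <= 18:
        raise ValueError("this simplified model covers atomic numbers 1 to 18 only")
    remaining = atomic_number  # a neutral atom has as many electrons as protons
    shells = []
    for capacity in SHELL_CAPACITIES:
        if remaining == 0:
            break
        placed = min(capacity, remaining)
        shells.append(placed)
        remaining -= placed
    return shells


def vacancies_in_outer_shell(atomic_number: int) -> int:
    """Open electron positions in the outermost occupied shell."""
    shells = shell_configuration(atomic_number)
    outer_capacity = SHELL_CAPACITIES[len(shells) - 1]
    return outer_capacity - shells[-1]


print(shell_configuration(3))        # lithium: [2, 1]
print(shell_configuration(10))       # neon: [2, 8]
print(vacancies_in_outer_shell(6))   # carbon: 4
print(vacancies_in_outer_shell(10))  # neon: 0
```

An atom with zero vacancies, such as neon, already has a full outer shell, while an atom with vacancies tends to share, accept, or donate electrons to fill it.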

When an atom does not contain equal numbers of protons and electrons, it is called an ion. Because the number of electrons does not equal the number of protons, each ion has a net charge. Positive ions are formed by losing electrons and are called cations. Negative ions are formed by gaining electrons and are called anions.

For example, sodium only has one electron in its outermost shell. It takes less energy for sodium to donate that one electron than it does to accept seven more electrons to fill the outer shell. If sodium loses an electron, it now has 11 protons and only 10 electrons, leaving it with an overall charge of +1. It is now called a sodium ion.

The chlorine atom has seven electrons in its outer shell. Again, it is more energy-efficient for chlorine to gain one electron than to lose seven. Therefore, it tends to gain an electron to create an ion with 17 protons and 18 electrons, giving it a net negative (–1) charge. It is now called a chloride ion. This movement of electrons from one element to another is referred to as electron transfer. As Figure 1 illustrates, a sodium atom (Na) only has one electron in its outermost shell, whereas a chlorine atom (Cl) has seven electrons in its outermost shell. A sodium atom will donate its one electron to empty its shell, and a chlorine atom will accept that electron to fill its shell, becoming chloride. Both ions now satisfy the octet rule and have complete outermost shells. Because the number of electrons is no longer equal to the number of protons, each is now an ion and has a +1 (sodium) or –1 (chloride) charge.

Diagram shows electron transfer between elements.

Figure 1 Elements tend to fill their outermost shells with electrons. To do this, they can either donate or accept electrons from other elements.
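The charge arithmetic for sodium and chloride ions can be checked with a minimal sketch. It simply subtracts the electron count from the proton count; the function names `net_charge` and `classify` are hypothetical.

```python
def net_charge(protons: int, electrons: int) -> int:
    """Net charge: each proton contributes +1 and each electron contributes -1."""
    return protons - electrons


def classify(protons: int, electrons: int) -> str:
    """Label a particle as a cation, an anion, or a neutral atom."""
    charge = net_charge(protons, electrons)
    if charge > 0:
        return "cation"
    if charge < 0:
        return "anion"
    return "neutral atom"


# Sodium ion: 11 protons, 10 electrons after donating one electron.
print(net_charge(11, 10), classify(11, 10))  # 1 cation
# Chloride ion: 17 protons, 18 electrons after accepting one electron.
print(net_charge(17, 18), classify(17, 18))  # -1 anion
```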


Ionic Bonds

There are four types of bonds or interactions: ionic, covalent, hydrogen bonds, and van der Waals interactions. Ionic and covalent bonds are strong interactions that require a larger energy input to break apart. When an element donates an electron from its outer shell, as in the sodium atom example above, a positive ion is formed (Figure 2). The element accepting the electron is now negatively charged. Because positive and negative charges attract, these ions stay together and form an ionic bond, or a bond between ions. The elements bond together with the electron from one element staying predominantly with the other element. When Na+ and Cl– ions combine to produce NaCl, an electron from a sodium atom stays with the other seven from the chlorine atom, and the sodium and chloride ions attract each other in a lattice of ions with a net zero charge.

Sodium donates an electron to chlorine.

Figure 2 In the formation of an ionic compound, metals lose electrons and nonmetals gain electrons to achieve an octet.

Covalent Bonds

Another type of strong chemical bond between two or more atoms is a covalent bond. These bonds form when an electron is shared between two elements and are the strongest and most common form of chemical bond in living organisms. Covalent bonds form between the elements that make up the biological molecules in our cells. Unlike ionic bonds, covalent bonds do not dissociate in water.

The hydrogen and oxygen atoms that combine to form water molecules are bound together by covalent bonds. The electron from the hydrogen atom divides its time between the outer shell of the hydrogen atom and the incomplete outer shell of the oxygen atom. To completely fill the outer shell of an oxygen atom, two electrons from two hydrogen atoms are needed, hence the subscript “2” in H2O. The electrons are shared between the atoms, dividing their time between them to “fill” the outer shell of each. This sharing is a lower energy state for all of the atoms involved than if they existed without their outer shells filled.

There are two types of covalent bonds: polar and nonpolar. Nonpolar covalent bonds form between two atoms of the same element or between different elements that share the electrons equally. For example, an oxygen atom can bond with another oxygen atom to fill their outer shells. This association is nonpolar because the electrons will be equally distributed between each oxygen atom. Two covalent bonds form between the two oxygen atoms because oxygen requires two shared electrons to fill its outermost shell. Nitrogen atoms will form three covalent bonds (also called triple covalent) between two atoms of nitrogen because each nitrogen atom needs three electrons to fill its outermost shell. Another example of a nonpolar covalent bond is found in the methane (CH4) molecule. The carbon atom has four electrons in its outermost shell and needs four more to fill it. It gets these four from four hydrogen atoms, each atom providing one. These elements all share the electrons equally, creating four nonpolar covalent bonds (Figure 3).

In a polar covalent bond, the electrons shared by the atoms spend more time closer to one nucleus than to the other nucleus. Because of the unequal distribution of electrons between the different nuclei, a slightly positive (δ+) or slightly negative (δ–) charge develops. The covalent bonds between hydrogen and oxygen atoms in water are polar covalent bonds. The shared electrons spend more time near the oxygen nucleus, giving it a small negative charge, than they spend near the hydrogen nuclei, giving the hydrogen atoms a small positive charge.

Diagram depicting polar and nonpolar covalent bonds

Figure 3 The water molecule (left) depicts a polar bond with a slightly positive charge on the hydrogen atoms and a slightly negative charge on the oxygen. Examples of nonpolar bonds include methane (middle) and oxygen (right).

Hydrogen Bonds

Ionic and covalent bonds are strong bonds that require considerable energy to break. However, not all bonds between elements are ionic or covalent bonds. Weaker bonds can also form. These are attractions that occur between positive and negative charges that do not require much energy to break. Two weak bonds that occur frequently are hydrogen bonds and van der Waals interactions. These bonds give rise to the unique properties of water and the unique structures of DNA and proteins.

When polar covalent bonds containing a hydrogen atom form, the hydrogen atom in that bond has a slightly positive charge. This is because the shared electron is pulled more strongly toward the other element and away from the hydrogen nucleus. Because the hydrogen atom is slightly positive (δ+), it will be attracted to neighboring negative partial charges (δ–). When this happens, a weak interaction occurs between the δ+ charge of the hydrogen atom of one molecule and the δ– charge of the other molecule. This interaction is called a hydrogen bond. This type of bond is common; for example, the liquid nature of water is caused by the hydrogen bonds between water molecules (Figure 4). Hydrogen bonds give water the unique properties that sustain life. If it were not for hydrogen bonding, water would be a gas rather than a liquid at room temperature.

Diagram showing hydrogen bonds formed between adjacent water molecules.

Figure 4 Hydrogen bonds form between slightly positive (δ+) and slightly negative (δ–) charges of polar covalent molecules, such as water.

Hydrogen bonds can form between different molecules and they do not always have to include a water molecule. Hydrogen atoms in polar bonds within any molecule can form bonds with other adjacent molecules. For example, hydrogen bonds hold together two long strands of DNA to give the DNA molecule its characteristic double-stranded structure. Hydrogen bonds are also responsible for some of the three-dimensional structure of proteins.

van der Waals Interactions

Like hydrogen bonds, van der Waals interactions are weak attractions or interactions between molecules. They occur between polar, covalently bound, atoms in different molecules. Some of these weak attractions are caused by temporary partial charges formed when electrons move around a nucleus. These weak interactions between molecules are important in biological systems.

References

Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Concepts of Biology. OpenStax CNX. March 22, 2017



Drop bouncing up out of a drip of water

Figure 1 Water: without it, life wouldn’t exist. Photo credit ronymichaud; CC0 license.

Do you ever wonder why scientists spend time looking for water on other planets? It is because water is essential to life; even minute traces of it on another planet can indicate that life could or did exist on that planet. Water is one of the more abundant molecules in living cells and the one most critical to life as we know it. Approximately 60–70 percent of your body is made up of water. Without it, life simply would not exist.

Water Is Polar

The hydrogen and oxygen atoms within water molecules form polar covalent bonds. The shared electrons spend more time associated with the oxygen atom than they do with hydrogen atoms. There is no overall charge to a water molecule, but there is a slight positive charge on each hydrogen atom and a slight negative charge on the oxygen atom. Because of these charges, the slightly positive hydrogen atoms repel each other and form the unique shape. Each water molecule attracts other water molecules because of the positive and negative charges in the different parts of the molecule.

structure of a water molecule showing electrons

Figure 2 The electrons in the covalent bond connecting the two hydrogens to the atom of oxygen in a water molecule spend more time on the oxygen atom. This gives the oxygen atom a slightly negative charge (since electrons are negatively charged). Credit Anatomy & Physiology, Connexions Web site, Jun 19, 2013.

Water also attracts other polar molecules (such as sugars), forming hydrogen bonds. When a substance readily forms hydrogen bonds with water, it can dissolve in water and is referred to as hydrophilic (“water-loving”). Hydrogen bonds are not readily formed with nonpolar substances like oils and fats (Figure 3). These nonpolar compounds are hydrophobic (“water-fearing”) and will not dissolve in water.

Picture of oil in water.

Figure 3 As this macroscopic image of oil and water shows, oil is a nonpolar compound and, hence, will not dissolve in water. Oil and water do not mix. (credit: Gautam Dogra)

Water Stabilizes Temperature

The hydrogen bonds in water allow it to absorb and release heat energy more slowly than many other substances. Temperature is a measure of the motion (kinetic energy) of molecules. As the motion increases, energy is higher and thus temperature is higher. Water absorbs a great deal of energy before its temperature rises. Increased energy disrupts the hydrogen bonds between water molecules. Because these bonds can be created and disrupted rapidly, water absorbs an increase in energy and temperature changes only minimally. This means that water moderates temperature changes within organisms and in their environments. As energy input continues, the balance between hydrogen-bond formation and destruction swings toward the destruction side. More bonds are broken than are formed. This process results in the release of individual water molecules at the surface of the liquid (such as a body of water, the leaves of a plant, or the skin of an organism) in a process called evaporation. Evaporation of sweat, which is 90 percent water, allows for cooling of an organism, because breaking hydrogen bonds requires an input of energy and takes heat away from the body.

Conversely, as molecular motion decreases and temperatures drop, less energy is present to break the hydrogen bonds between water molecules. These bonds remain intact and begin to form a rigid, lattice-like structure (e.g., ice) (Figure 4a). When frozen, water is less dense than in its liquid form (the molecules are farther apart). This means that ice floats on the surface of a body of water (Figure 4b). In lakes, ponds, and oceans, ice will form on the surface of the water, creating an insulating barrier to protect the animal and plant life beneath from freezing in the water. If this did not happen, plants and animals living in water would freeze in a block of ice and could not move freely, making life in cold temperatures difficult or impossible.

Part A shows the lattice-like molecular structure of ice. Part B is a photo of ice on water.

Figure 4 (a) The lattice structure of ice makes it less dense than the freely flowing molecules of liquid water. Ice’s lower density enables it to (b) float on water. (credit a: modification of work by Jane Whitney; credit b: modification of work by Carlos Ponte)

Water Is an Excellent Solvent

Because water is polar, with slight positive and negative charges, ionic compounds and polar molecules can readily dissolve in it. Water is, therefore, what is referred to as a solvent: a substance capable of dissolving another substance. The charged particles will form hydrogen bonds with a surrounding layer of water molecules. This is referred to as a sphere of hydration and serves to keep the particles separated or dispersed in the water. In the case of table salt (NaCl) mixed in water (Figure 5), the sodium and chloride ions separate, or dissociate, in the water, and spheres of hydration are formed around the ions. A positively charged sodium ion is surrounded by the partially negative charges of oxygen atoms in water molecules. A negatively charged chloride ion is surrounded by the partially positive charges of hydrogen atoms in water molecules. These spheres of hydration are also referred to as hydration shells. The polarity of the water molecule makes it an effective solvent and is important in its many roles in living systems.

Illustration of spheres of hydration around sodium and chlorine ions.

Figure 5 When table salt (NaCl) is mixed in water, spheres of hydration form around the ions.


Water Is Cohesive

Have you ever filled up a glass of water to the very top and then slowly added a few more drops? Before it overflows, the water actually forms a dome-like shape above the rim of the glass. This water can stay above the glass because of the property of cohesion. In cohesion, water molecules are attracted to each other (because of hydrogen bonding), keeping the molecules together at the liquid-air (gas) interface, although there is no more room in the glass. Cohesion gives rise to surface tension, the capacity of a substance to withstand rupture when placed under tension or stress. When you drop a small scrap of paper onto a droplet of water, the paper floats on top of the water droplet, although the object is denser (heavier) than the water. This occurs because of the surface tension that is created by the water molecules. Cohesion and surface tension keep the water molecules intact and the item floating on the top. It is even possible to “float” a steel needle on top of a glass of water if you place it gently, without breaking the surface tension (Figure 6).

Picture of a needle floating on top of water because of cohesion and surface tension.

Figure 6 The weight of a needle on top of water pulls the surface tension downward; at the same time, the surface tension of the water is pulling it up, suspending the needle on the surface of the water and keeping it from sinking. Notice the indentation in the water around the needle. (credit: Cory Zanker)


These cohesive forces are also related to the water’s property of adhesion, or the attraction between water molecules and other molecules. This is observed when water “climbs” up a straw placed in a glass of water. You will notice that the water appears to be higher on the sides of the straw than in the middle. This is because the water molecules are attracted to the straw and therefore adhere to it.

Cohesive and adhesive forces are important for sustaining life. For example, because of these forces, water can flow up from the roots to the tops of plants to feed the plant.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Concepts of Biology. OpenStax CNX. March 22, 2017


Buffers, pH, Acids, and Bases

The pH of a solution is a measure of its acidity or alkalinity. You have probably used litmus paper, paper that has been treated with a natural water-soluble dye so it can be used as a pH indicator, to test how much acid or base (alkalinity) exists in a solution. You might have even used some to make sure the water in an outdoor swimming pool is properly treated. In both cases, this pH test measures the amount of hydrogen ions that exists in a given solution. High concentrations of hydrogen ions yield a low pH, whereas low levels of hydrogen ions result in a high pH. The overall concentration of hydrogen ions is inversely related to its pH and can be measured on the pH scale (Figure 1). Therefore, the more hydrogen ions present, the lower the pH; conversely, the fewer hydrogen ions, the higher the pH.

The pH scale ranges from 0 to 14. A change of one unit on the pH scale represents a change in the concentration of hydrogen ions by a factor of 10; a change of two units represents a change by a factor of 100. Thus, small changes in pH represent large changes in the concentrations of hydrogen ions. Pure water is neutral. It is neither acidic nor basic, and has a pH of 7.0. Anything below 7.0 (ranging from 0.0 to 6.9) is acidic, and anything above 7.0 (from 7.1 to 14.0) is alkaline (basic). The blood in your veins is slightly alkaline (pH = 7.4). The environment in your stomach is highly acidic (pH = 1 to 2). Orange juice is mildly acidic (pH = approximately 3.5), whereas baking soda is basic (pH = 9.0).

The pH scale with representative substances and their pHs.

Figure 1 The pH scale measures the amount of hydrogen ions (H+) in a substance. (credit: modification of work by Edward Stevens)
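The factor-of-10 relationship follows from the standard definition of pH as the negative base-10 logarithm of the hydrogen ion concentration, a formula the text above does not spell out. The sketch below assumes concentrations in moles per liter; the function names are hypothetical.

```python
import math


def ph_from_concentration(hydrogen_ions_mol_per_l: float) -> float:
    """pH as the negative base-10 logarithm of the H+ concentration."""
    if hydrogen_ions_mol_per_l <= 0:
        raise ValueError("concentration must be positive")
    return -math.log10(hydrogen_ions_mol_per_l)


def fold_difference_in_hydrogen_ions(ph_a: float, ph_b: float) -> float:
    """How many times more concentrated H+ is at ph_a than at ph_b."""
    return 10 ** (ph_b - ph_a)


# Pure water has about 1e-7 mol/L of H+, which corresponds to pH 7.
print(ph_from_concentration(1e-7))
# A solution at pH 5 has 100 times more H+ than one at pH 7.
print(fold_difference_in_hydrogen_ions(5, 7))
```

Comparing orange juice (about pH 3.5) with blood (pH 7.4) in the same way gives a difference of nearly four pH units, or several thousand times more hydrogen ions in the juice.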


Acids are substances that provide hydrogen ions (H+) and lower pH, whereas bases provide hydroxide ions (OH–) and raise pH. The stronger the acid, the more readily it donates H+. For example, hydrochloric acid and lemon juice are very acidic and readily give up H+ when added to water. Conversely, bases are those substances that readily donate OH–. The OH– ions combine with H+ to produce water, which raises a substance’s pH. Sodium hydroxide and many household cleaners are very alkaline and give up OH– rapidly when placed in water, thereby raising the pH.

Most cells in our bodies operate within a very narrow window of the pH scale, typically ranging only from 7.2 to 7.6. If the pH of the body is outside of this range, the respiratory system malfunctions, as do other organs in the body. Cells no longer function properly, and proteins will break down. Deviation outside of the pH range can induce coma or even cause death.

So how is it that we can ingest or inhale acidic or basic substances and not die? Buffers are the key. Buffers readily absorb excess H+ or OH–, keeping the pH of the body carefully maintained in the aforementioned narrow range. Carbon dioxide is part of a prominent buffer system in the human body; it keeps the pH within the proper range. This buffer system involves carbonic acid (H2CO3) and the bicarbonate anion (HCO3–). If too much H+ enters the body, bicarbonate will combine with the H+ to create carbonic acid and limit the decrease in pH. Likewise, if too much OH– is introduced into the system, carbonic acid will rapidly dissociate into bicarbonate and H+ ions. The H+ ions can combine with the OH– ions, limiting the increase in pH. While carbonic acid is an important product in this reaction, its presence is fleeting because the carbonic acid is released from the body as carbon dioxide gas each time we breathe. Without this buffer system, the pH in our bodies would fluctuate too much and we would fail to survive.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Concepts of Biology. OpenStax CNX. March 22, 2017


Absolutely Necessary Chemistry Summary


Chemical Bonds


pH, Acids, Bases, and Buffers


Biological Molecules

Learning Outcomes

  • Describe the structure of biologically-important molecules (carbohydrates, lipids, proteins, nucleic acids, water) and how their structure leads to their function.

Food provides an organism with nutrients—the matter it needs to survive. Many of these critical nutrients come in the form of biological macromolecules, or large molecules necessary for life. These macromolecules are built from different combinations of smaller organic molecules. What specific types of biological macromolecules do living things require? How are these molecules formed? What functions do they serve? In this chapter, we will explore these questions.

There are four major classes of biological macromolecules (carbohydrates, lipids, proteins, and nucleic acids), and each is an important component of the cell and performs a wide array of functions. Combined, these molecules make up the majority of a cell’s mass. Biological macromolecules are organic, meaning that they contain carbon atoms. In addition, they may contain atoms of hydrogen, oxygen, nitrogen, phosphorus, sulfur, and additional minor elements.

These molecules are made up of subunits called monomers. Each type of biological molecule is made up of different monomers. The monomers are connected together into a chain by strong covalent bonds. It is important that covalent bonds connect the monomers. If they were connected by hydrogen bonds, the monomers would easily separate from each other and the biological molecule would come apart. If ionic bonds connected the monomers, the biological molecule would be likely to fall apart if it came into contact with water.

Figure 1 The structure of a macromolecule can be compared to a necklace: both are larger structures that are built out of small pieces connected together into a chain. The “string” in a macromolecule would be strong covalent bonds connecting the individual subunits together. (“Beads on a string” by Daniel is licensed under CC BY-NC-ND 2.0)


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Biology. OpenStax CNX. May 27, 2016



Carbohydrates are macromolecules with which most consumers are somewhat familiar. To lose weight, some individuals adhere to “low-carb” diets. Athletes, in contrast, often “carb-load” before important competitions to ensure that they have sufficient energy to compete at a high level. Carbohydrates are, in fact, an essential part of our diet; grains, fruits, and vegetables are all natural sources of carbohydrates. Carbohydrates provide energy to the body, particularly through glucose, a simple sugar. Carbohydrates also have other important functions in humans, animals, and plants.

Figure 1 Bread, pasta, and sugar all contain high levels of carbohydrates. (“Wheat products” by US Department of Agriculture is in the Public Domain)

Carbohydrates can be represented by the stoichiometric formula (CH2O)n, where n is the number of carbons in the molecule. In other words, the ratio of carbon to hydrogen to oxygen is 1:2:1 in carbohydrate molecules. This formula also explains the origin of the term “carbohydrate”: the components are carbon (“carbo”) and the components of water (hence, “hydrate”). Carbohydrates are classified into three subtypes: monosaccharides, disaccharides, and polysaccharides.
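The 1:2:1 ratio can be checked with a few lines of code. This is a minimal sketch that simply expands (CH2O)n for a chosen n; the function name is hypothetical, and the formula is a general rule of thumb rather than a description of every carbohydrate.

```python
def carbohydrate_formula(n: int) -> str:
    """Expand the stoichiometric formula (CH2O)n into a molecular formula."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    carbons, hydrogens, oxygens = n, 2 * n, n  # 1:2:1 ratio of C:H:O
    return f"C{carbons}H{hydrogens}O{oxygens}"


for n in (3, 5, 6):
    print(n, carbohydrate_formula(n))
```

For n = 6 the function returns C6H12O6, the formula of glucose.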


Monosaccharides (mono- = “one”; sacchar- = “sweet”) are simple sugars, the most common of which is glucose. In monosaccharides, the number of carbons usually ranges from three to seven. Most monosaccharide names end with the suffix -ose.

The chemical formula for glucose is C6H12O6. In humans, glucose is an important source of energy. During cellular respiration, energy is released from glucose, and that energy is used to help make adenosine triphosphate (ATP). Plants synthesize glucose using carbon dioxide and water, and glucose in turn is used for energy requirements for the plant. Excess glucose is often stored as starch that is catabolized (the breakdown of larger molecules by cells) by humans and other animals that feed on plants.

Galactose (part of lactose, or milk sugar) and fructose (found in sucrose, in fruit) are other common monosaccharides. Although glucose, galactose, and fructose all have the same chemical formula (C6H12O6), they differ structurally and chemically (and are known as isomers) because of the different arrangement of functional groups around the asymmetric carbon; all of these monosaccharides have more than one asymmetric carbon. Within one monosaccharide, all of the atoms are connected to each other with strong covalent bonds.

Figure 2 Glucose, galactose, and fructose are all hexoses. They are structural isomers, meaning they have the same chemical formula (C6H12O6) but a different arrangement of atoms. The lines between atoms represent covalent bonds.


Disaccharides (di- = “two”) form when two monosaccharides undergo a dehydration reaction (also known as a condensation reaction or dehydration synthesis). During this process, the hydroxyl (OH) group of one monosaccharide combines with the hydrogen of another monosaccharide, releasing a molecule of water and forming a covalent bond which joins the two monosaccharides together.

Common disaccharides include lactose, maltose, and sucrose (Figure 3). Lactose is a disaccharide consisting of the monomers glucose and galactose. It is formed by a dehydration reaction between the glucose and the galactose molecules, which removes a water molecule and forms a covalent bond. It is found naturally in milk. Maltose, or malt sugar, is a disaccharide composed of two glucose molecules connected by a covalent bond. The most common disaccharide is sucrose, or table sugar, which is composed of the monomers glucose and fructose, also connected by a covalent bond.

Figure 3 Common disaccharides include maltose (grain sugar), lactose (milk sugar), and sucrose (table sugar).
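The bookkeeping of a dehydration reaction can be sketched by counting atoms. The example below treats each molecule as a simple tally of atoms (an illustrative simplification that ignores structure) and removes one water molecule when two hexoses are joined; all names are hypothetical.

```python
from collections import Counter

# Atom counts for the monosaccharides named in the text (all C6H12O6).
HEXOSE = Counter({"C": 6, "H": 12, "O": 6})
WATER = Counter({"H": 2, "O": 1})


def dehydration_product(first: Counter, second: Counter) -> Counter:
    """Combine two molecules and remove one water molecule."""
    product = first + second   # all atoms of both monomers
    product.subtract(WATER)    # one H2O is released per bond formed
    return product


def formula(atoms: Counter) -> str:
    """Write the atom counts as a molecular formula."""
    return "".join(f"{element}{atoms[element]}" for element in ("C", "H", "O"))


disaccharide = dehydration_product(HEXOSE, HEXOSE)
print(formula(disaccharide))  # C12H22O11
```

Because glucose, galactose, and fructose share the same formula, the same count applies to lactose, maltose, and sucrose.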


A long chain of monosaccharides linked by glycosidic bonds is known as a polysaccharide (poly- = “many”). The chain may be branched or unbranched, and it may contain different types of monosaccharides. All of the monosaccharides are connected together by covalent bonds. The molecular weight may be 100,000 daltons or more depending on the number of monomers joined. Starch, glycogen, cellulose, and chitin are primary examples of polysaccharides.

Starch is the stored form of sugars in plants and is made up of a mixture of amylose and amylopectin (both polymers of glucose). Basically, starch is a long chain of glucose monomers. Plants are able to synthesize glucose, and the excess glucose, beyond the plant’s immediate energy needs, is stored as starch in different plant parts, including roots and seeds. The starch in the seeds provides food for the embryo as it germinates and can also act as a source of food for humans and animals. The starch that is consumed by humans is broken down by enzymes, such as salivary amylases, into smaller molecules, such as maltose and glucose. The cells can then absorb the glucose.

Glycogen is the storage form of glucose in humans and other vertebrates and is made up of monomers of glucose. Glycogen is the animal equivalent of starch and is a highly branched molecule usually stored in liver and muscle cells. Whenever blood glucose levels decrease, glycogen is broken down to release glucose in a process known as glycogenolysis.

Figure 4 Amylose and amylopectin are two different forms of starch. Amylose is composed of unbranched chains of glucose monomers. Amylopectin is composed of branched chains of glucose monomers. Because of the way the subunits are joined, the glucose chains have a helical structure. Glycogen (not shown) is similar in structure to amylopectin but more highly branched.

Cellulose is the most abundant natural biopolymer. The cell wall of plants is mostly made of cellulose; this provides structural support to the cell. Wood and paper are mostly cellulosic in nature. Cellulose is made up of glucose monomers (Figure 5).

Figure 5 In cellulose, glucose monomers are linked in unbranched chains. Because of the way the glucose subunits are joined, every glucose monomer is flipped relative to the next one resulting in a linear, fibrous structure.

Carbohydrates serve various functions in different animals. Arthropods (insects, crustaceans, and others) have an outer skeleton, called the exoskeleton, which protects their internal body parts (as seen in the bee in Figure 6). This exoskeleton is made of the biological macromolecule chitin, which is a nitrogen-containing polysaccharide. It is made of repeating units of N-acetyl-β-D-glucosamine, a modified sugar. Chitin is also a major component of fungal cell walls; fungi are neither animals nor plants and form a kingdom of their own in the domain Eukarya.

Figure 6 Insects have a hard outer exoskeleton made of chitin, a type of polysaccharide. (credit: Louise Docker)

How does carbohydrate structure relate to function?

Energy can be stored within the bonds of a molecule. Bonds connecting two carbon atoms or connecting a carbon atom to a hydrogen atom are high energy bonds. When cells break down a molecule such as glucose (C6H12O6), these bonds are broken and the atoms form new, more stable bonds (in carbon dioxide and water), which releases energy. This is why our cells can get energy from a molecule of glucose.

Polysaccharides form long, fibrous chains which are able to build strong structures such as cell walls.





Lipids are a diverse group of compounds that are united by a common feature. Lipids are hydrophobic (“water-fearing”), or insoluble in water. Lipids perform many different functions in a cell. Cells store energy for long-term use in the form of lipids called fats. Lipids also provide insulation from the environment for plants and animals. For example, they help keep aquatic birds and mammals dry because of their water-repelling nature. Lipids are also the building blocks of many hormones and are an important constituent of the plasma membrane. Lipids include fats, oils, waxes, phospholipids, and steroids.


Fats and Oils

A fat molecule consists of two main components—glycerol and fatty acids. Glycerol is an organic compound (an alcohol) that contains three carbons, five hydrogens, and three hydroxyl (OH) groups. Fatty acids have a long chain of hydrocarbons to which a carboxyl group is attached, hence the name “fatty acid.” The number of carbons in the fatty acid may range from 4 to 36; most common are those containing 12–18 carbons. In a fat molecule, the fatty acids are attached to each of the three carbons of the glycerol molecule with a covalent bond. This molecule is called a triglyceride.

Figure 4 Triacylglycerol is formed by the joining of three fatty acids to a glycerol backbone in a dehydration reaction (remember this removes a water molecule and forms a covalent bond). Three molecules of water are released in the process.
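The release of three water molecules can be followed with a simple atom count. This sketch assumes palmitic acid (C16H32O2, a 16-carbon fatty acid) as the example fatty acid; that choice is not taken from the text above, and the names used are illustrative only.

```python
from collections import Counter

GLYCEROL = Counter({"C": 3, "H": 8, "O": 3})          # C3H5(OH)3
PALMITIC_ACID = Counter({"C": 16, "H": 32, "O": 2})   # assumed example fatty acid
WATER = Counter({"H": 2, "O": 1})


def triglyceride(fatty_acids: list) -> Counter:
    """Join three fatty acids to glycerol, releasing one water per bond."""
    if len(fatty_acids) != 3:
        raise ValueError("a triglyceride has exactly three fatty acids")
    atoms = Counter(GLYCEROL)
    for fatty_acid in fatty_acids:
        atoms.update(fatty_acid)   # add the fatty acid's atoms
        atoms.subtract(WATER)      # dehydration reaction removes H2O
    return atoms


fat = triglyceride([PALMITIC_ACID] * 3)
print("".join(f"{element}{fat[element]}" for element in ("C", "H", "O")))  # C51H98O6
```

Three bonds are formed, so three water molecules (six hydrogens and three oxygens) are missing from the product compared with the starting molecules.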


Wax covers the feathers of some aquatic birds and the leaf surfaces of some plants. Because of the hydrophobic nature of waxes, they prevent water from sticking on the surface (Figure 5). Waxes are made up of long fatty acid chains covalently bonded to long-chain alcohols.

Figure 5 Waxy coverings on some leaves are made of lipids. (credit: Roger Griffith)


Phospholipids are major constituents of the plasma membrane, the outermost layer of animal cells. Like fats, they are composed of fatty acid chains covalently bonded to a glycerol or sphingosine backbone. Instead of three fatty acids attached as in triglycerides, however, there are two fatty acids forming diacylglycerol, and the third carbon of the glycerol backbone is occupied by a modified phosphate group (Figure 6). Phosphatidylcholine and phosphatidylserine are two important phospholipids that are found in plasma membranes.

Figure 6 A phospholipid is a molecule with two fatty acids and a modified phosphate group attached to a glycerol backbone. The phosphate may be modified by the addition of charged or polar chemical groups. Two chemical groups that may modify the phosphate, choline and serine, are shown here. Both choline and serine attach to the phosphate group at the position labeled R via the hydroxyl group indicated in green.

A phospholipid is an amphipathic molecule, meaning it has a hydrophobic and a hydrophilic part. The fatty acid chains are hydrophobic and cannot interact with water, whereas the phosphate-containing group is hydrophilic and interacts with water (Figure 7). The head is the hydrophilic part, and the tail contains the hydrophobic fatty acids. In a membrane, a bilayer of phospholipids forms the matrix of the structure: the fatty acid tails of the phospholipids face inside, away from water, whereas the phosphate groups face the outside, aqueous side. This forms a hydrophobic layer on the inside of the bilayer, where the tails are located.

Figure 7 The phospholipid bilayer is the major component of all cellular membranes. The hydrophilic head groups of the phospholipids face the aqueous solution. The hydrophobic tails are sequestered in the middle of the bilayer.

Phospholipids are responsible for the dynamic nature of the plasma membrane. If a drop of phospholipids is placed in water, it spontaneously forms a structure known as a micelle, where the hydrophilic phosphate heads face the outside and the fatty acids face the interior of this structure (Figure 8).

Figure 8 A micelle may be the very early precursor of a cell. It is a single layer of phospholipids that forms spontaneously. (credit: AmitWo, Wikimedia)


Unlike the phospholipids and fats discussed earlier, steroids have a fused ring structure. Although they do not resemble the other lipids, they are grouped with them because they are also hydrophobic and insoluble in water. All steroids have four linked carbon rings and several of them, like cholesterol, have a short tail (Figure 9). Many steroids also have the –OH functional group, which puts them in the alcohol classification (sterols). Remember that each line in these diagrams of chemical structures represents a covalent bond. The points where the lines connect to each other show the location of carbon atoms – these carbon atoms are not labeled, but their existence is implied in the chemical structure.

Figure 9 Steroids such as cholesterol and cortisol are composed of four fused hydrocarbon rings.

Cholesterol is the most common steroid. Cholesterol is mainly synthesized in the liver and is the precursor to many steroid hormones such as testosterone and estradiol, which are secreted by the gonads and endocrine glands. It is also the precursor to Vitamin D. Cholesterol is also the precursor of bile salts, which help in the emulsification of fats and their subsequent absorption by cells. Although cholesterol is often spoken of in negative terms by lay people, it is necessary for proper functioning of the body. It is a component of the plasma membrane of animal cells and is found within the phospholipid bilayer. Being the outermost structure in animal cells, the plasma membrane is responsible for the transport of materials and cellular recognition and it is involved in cell-to-cell communication.

How does lipid structure relate to function?

Fats (triglycerides) are made up of three fatty acid hydrocarbon chains connected to a glycerol. Fatty acid chains contain large numbers of carbon-carbon and carbon-hydrogen bonds – they are typically made up of between 4 and 28 carbons connected together in a chain. Just like the carbon-carbon and carbon-hydrogen bonds in glucose allow that molecule to store energy, the bonds in fatty acids allow triglycerides to store energy. In fact, triglycerides can store much more energy than carbohydrates because they contain so many more bonds! This is why fats contain more calories (a measure of energy) than sugars do.

Waxes function to provide a waterproof coating on a surface. Because they are hydrophobic, they can form a coating that repels water.

The structure of phospholipids is very important to their function. Because they are amphipathic (have a hydrophobic and a hydrophilic portion), they self-assemble into structures where the hydrophobic tails are hidden away from the watery environment. This gives the cell membrane a structure that prevents many molecules from moving through it.

Cholesterol is also amphipathic. It can insert into cell membranes in a manner similar to phospholipids. The presence of cholesterol within a membrane prevents the phospholipid tails from packing together tightly. This allows the membrane to remain fluid at lower temperatures.





Proteins are one of the most abundant organic molecules in living systems and have the most diverse range of functions of all macromolecules. Proteins may be structural, regulatory, contractile, or protective; they may serve in transport, storage, or membranes; or they may be toxins or enzymes. Each cell in a living system may contain thousands of different proteins, each with a unique function. Their structures, like their functions, vary greatly. They are all, however, polymers of amino acids, arranged in a linear sequence and connected together by covalent bonds.

Amino acids are the monomers that make up proteins (Figure 1). Each amino acid has the same fundamental structure, which consists of a central carbon atom, also known as the alpha (α) carbon, bonded to an amino group (NH2), a carboxyl group (COOH), and to a hydrogen atom. Every amino acid also has another atom or group of atoms bonded to the central atom known as the R group.

Figure 1 Amino acids have a central asymmetric carbon to which an amino group, a carboxyl group, a hydrogen atom, and a side chain (R group) are attached.

Function | Examples | Description
Defense | Immunoglobulins | Antibodies bind to specific foreign particles, such as viruses and bacteria, to help protect the body.
Enzyme | Digestive enzymes such as amylase, lipase, pepsin, trypsin | Enzymes carry out almost all of the thousands of chemical reactions that take place in cells. They also assist with the formation of new molecules by reading the genetic information stored in DNA.
Messenger | Insulin, thyroxine | Messenger proteins, such as some types of hormones, transmit signals to coordinate biological processes between different cells, tissues, and organs.
Structural component | Actin, tubulin, keratin | These proteins provide structure and support for cells. On a larger scale, they also allow the body to move.
Transport/storage | Hemoglobin, albumin, legume storage proteins, egg white (albumin) | These proteins bind and carry atoms and small molecules within cells and throughout the body. Some provide nourishment in early development of the embryo and the seedling.
Contractile | Actin, myosin | Affect muscle contraction.

You may have noticed that “source of energy” was not listed among the functions of proteins. This is because proteins in our diet are typically broken back down into individual amino acids that our cells then assemble into our own proteins. Humans are actually unable to build some amino acids inside our own cells – we require them in our diet (these are the so-called “essential” amino acids). Our cells can digest proteins to release energy, but will usually only do so when carbohydrates or lipids are not available.

Figure 2 Examples of foods that contain high levels of protein. (“Protein” by National Cancer Institute is in the Public Domain)

The functions of proteins can be very diverse because they are made up of 20 chemically distinct amino acids that form long chains, and the amino acids can be in any order. The function of the protein is dependent on the protein’s shape. The shape of a protein is determined by the order of the amino acids. Proteins are often hundreds of amino acids long and they can have very complex shapes because there are so many different possible orders for the 20 amino acids (Figure 3)!

Figure 3 There are 20 amino acids commonly found in proteins, each with a different R group (variant group) that determines its chemical nature.

The chemical nature of the side chain determines the nature of the amino acid (that is, whether it is acidic, basic, polar, or nonpolar). For example, the amino acid glycine has a hydrogen atom as the R group. Amino acids such as valine, methionine, and alanine are nonpolar or hydrophobic in nature, while amino acids such as serine, threonine, and cysteine are polar and have hydrophilic side chains. The side chains of lysine and arginine are positively charged, and therefore these amino acids are also known as basic amino acids. Proline has an R group that is linked to the amino group, forming a ring-like structure. Proline is an exception to the standard structure of an amino acid since its amino group is not separate from the side chain (Figure 3). Amino acids are represented by a single upper case letter as well as a three-letter abbreviation. For example, valine is known by the letter V or the three-letter symbol val.
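The two naming schemes can be related with a simple lookup table. The sketch below covers only the amino acids named in the paragraph above, using their standard one-letter codes; the dictionary and function names are illustrative, and a full table would list all 20 amino acids.

```python
# One-letter code -> three-letter abbreviation, for the amino acids
# mentioned in the text only (not the complete set of 20).
THREE_LETTER = {
    "G": "gly", "A": "ala", "V": "val", "M": "met",   # glycine and nonpolar examples
    "S": "ser", "T": "thr", "C": "cys",               # polar examples
    "K": "lys", "R": "arg",                           # basic examples
    "P": "pro",                                       # ring-forming R group
}


def to_three_letter(sequence: str) -> list:
    """Translate a string of one-letter codes into three-letter abbreviations."""
    result = []
    for residue in sequence.upper():
        if residue not in THREE_LETTER:
            raise ValueError(f"amino acid {residue!r} is not in this partial table")
        result.append(THREE_LETTER[residue])
    return result


print("-".join(to_three_letter("GVKP")))  # gly-val-lys-pro
```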

Just as some fatty acids are essential to a diet, some amino acids are necessary as well. They are known as essential amino acids, and in humans they include isoleucine, leucine, and lysine. Essential amino acids are those that are necessary for the construction of proteins in the body but are not produced by the body; which amino acids are essential varies from organism to organism.

The sequence and the number of amino acids ultimately determine the protein’s shape, size, and function. Each amino acid is attached to another amino acid by a covalent bond, known as a peptide bond, which is formed by a dehydration reaction. The carboxyl group of one amino acid and the amino group of the incoming amino acid combine, releasing a molecule of water. The resulting bond is the peptide bond (Figure 4).

Figure 4 Peptide bond formation is a dehydration synthesis reaction. The carboxyl group of one amino acid is linked to the amino group of the incoming amino acid. In the process, a molecule of water is released.
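Since each peptide bond joins two neighboring amino acids and releases one water molecule, a single unbranched chain of n amino acids contains n − 1 peptide bonds. A minimal sketch, assuming one linear polypeptide (the function name is hypothetical):

```python
def peptide_bonds(n_amino_acids: int) -> int:
    """Peptide bonds (and water molecules released) in one linear chain."""
    if n_amino_acids < 1:
        raise ValueError("a chain needs at least one amino acid")
    return n_amino_acids - 1


# 21 and 30 are the chain lengths quoted for insulin's A and B chains in Figure 6.
for length in (2, 21, 30):
    print(f"{length} amino acids -> {peptide_bonds(length)} peptide bonds")
```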

Protein Structure

As discussed earlier, the shape of a protein is critical to its function. For example, an enzyme can bind to a specific substrate at a site known as the active site. If this active site is altered because of local changes or changes in overall protein structure, the enzyme may be unable to bind to the substrate. To understand how the protein gets its final shape or conformation, we need to understand the four levels of protein structure: primary, secondary, tertiary, and quaternary (Figure 5).

Figure 5 Main levels of protein structure. (“Main protein structure levels en” by LadyofHats is in the Public Domain)

Primary Structure

The unique sequence of amino acids in a polypeptide chain is its primary structure. For example, the pancreatic hormone insulin has two polypeptide chains, A and B, and they are linked together by disulfide bonds. The N terminal amino acid of the A chain is glycine, whereas the C terminal amino acid is asparagine (Figure 6). The sequences of amino acids in the A and B chains are unique to insulin.

Figure 6 Bovine serum insulin is a protein hormone made of two peptide chains, A (21 amino acids long) and B (30 amino acids long). In each chain, primary structure is indicated by three-letter abbreviations that represent the names of the amino acids in the order they are present. The amino acid cysteine (cys) has a sulfhydryl (SH) group as a side chain. Two sulfhydryl groups can react in the presence of oxygen to form a disulfide (S-S) bond. Two disulfide bonds connect the A and B chains together, and a third helps the A chain fold into the correct shape. Note that all disulfide bonds are the same length, but are drawn different sizes for clarity.

Secondary Structure

The local folding of the polypeptide in some regions gives rise to the secondary structure of the protein. The most common are the α-helix and β-pleated sheet structures (Figure 7). Both structures are held in shape by hydrogen bonds. In the α-helix, the hydrogen bonds form between the oxygen atom in the carbonyl group of one amino acid and the amino group of another amino acid that is four amino acids farther along the chain.

Figure 7 The α-helix and β-pleated sheet are secondary structures of proteins that form because of hydrogen bonding between carbonyl and amino groups in the peptide backbone. Certain amino acids have a propensity to form an α-helix, while others have a propensity to form a β-pleated sheet.

Tertiary Structure

The unique three-dimensional structure of a polypeptide is its tertiary structure (Figure 8). This structure is in part due to chemical interactions at work on the polypeptide chain. Primarily, the interactions among R groups (the variable part of the amino acid) create the complex three-dimensional tertiary structure of a protein. The nature of the R groups found in the amino acids involved can counteract the formation of the hydrogen bonds described for standard secondary structures. For example, R groups with like charges are repelled by each other and those with unlike charges are attracted to each other (ionic bonds). When protein folding takes place, the hydrophobic R groups of nonpolar amino acids lie in the interior of the protein, whereas the hydrophilic R groups lie on the outside. The former types of interactions are also known as hydrophobic interactions. Interaction between cysteine side chains forms disulfide linkages in the presence of oxygen, the only covalent bond forming during protein folding.

Figure 8 The tertiary structure of proteins is determined by a variety of chemical interactions. These include hydrophobic interactions, ionic bonding, hydrogen bonding and disulfide linkages.

Quaternary Structure

In nature, some proteins are formed from several polypeptides, also known as subunits, and the interaction of these subunits forms the quaternary structure. Weak interactions between the subunits help to stabilize the overall structure. For example, insulin (a globular protein) has a combination of hydrogen bonds and disulfide bonds that cause it to be mostly clumped into a ball shape. Insulin starts out as a single polypeptide and loses some internal sequences during post-translational modification, after the formation of the disulfide linkages that hold the remaining chains together. Silk (a fibrous protein), however, has a β-pleated sheet structure that is the result of hydrogen bonding between different chains.

The four levels of protein structure (primary, secondary, tertiary, and quaternary) are illustrated in Figure 9.

Figure 9 The four levels of protein structure can be observed in these illustrations. (credit: modification of work by National Human Genome Research Institute)

The unique shape for every protein is ultimately determined by the gene that encodes the protein. Any change in the gene sequence may lead to a different amino acid being added to the polypeptide chain, causing a change in protein structure and function. Individuals who are affected by sickle cell anemia can have a variety of serious health problems, such as breathlessness, dizziness, headaches, and abdominal pain. In this disease, the hemoglobin β chain has a single amino acid substitution, causing a change in both the structure (shape) and function (job) of the protein. What is most remarkable to consider is that a hemoglobin molecule is made up of about 600 amino acids. The structural difference between a normal hemoglobin molecule and a sickle cell molecule is a single amino acid of the 600 (Figure 10).

Figure 10 The unique shape of the normal hemoglobin protein. (“Structure of hemoglobin Gower 2” by Emw is licensed under CC BY-SA 3.0)
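A single substitution can be located by comparing two sequences position by position. The sketch below assumes the commonly reported first eight amino acids of the normal and sickle cell hemoglobin β chains, written in one-letter codes; these sequences are outside knowledge rather than something stated in the text above, and the function name is illustrative.

```python
# Assumed sequences: first eight residues of the mature beta chain.
NORMAL_BETA_START = "VHLTPEEK"
SICKLE_BETA_START = "VHLTPVEK"


def substitutions(reference: str, variant: str) -> list:
    """List (position, reference residue, variant residue) for each mismatch.

    Positions are counted from 1. Only substitutions are handled, so both
    sequences must have the same length.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length")
    return [
        (position, ref, var)
        for position, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]


print(substitutions(NORMAL_BETA_START, SICKLE_BETA_START))  # [(6, 'E', 'V')]
```

In this comparison the only difference is at position 6, where glutamic acid (E) is replaced by valine (V).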

Denaturation and Protein Folding

Each protein has its own unique sequence and shape that are held together by chemical interactions. If the protein is subject to changes in temperature, pH, or exposure to chemicals, the protein structure may change, losing its shape without losing its primary sequence in what is known as denaturation. Denaturation is often reversible because the primary structure of the polypeptide is conserved in the process if the denaturing agent is removed, allowing the protein to resume its function. Sometimes denaturation is irreversible, leading to loss of function. One example of irreversible protein denaturation is when an egg is fried. The albumin protein in the liquid egg white is denatured when placed in a hot pan. Not all proteins are denatured at high temperatures; for instance, bacteria that survive in hot springs have proteins that function at temperatures close to boiling. The stomach is also very acidic, has a low pH, and denatures proteins as part of the digestion process; however, the digestive enzymes of the stomach retain their activity under these conditions.

Figure 11 An egg white turns white as you cook it because the albumin in the white denatures and then reconnects in an abnormal fashion. (credit: Matthew Murdock)

Correct folding is critical to a protein’s function. It was originally thought that the proteins themselves were responsible for the folding process. Only recently was it found that often they receive assistance in the folding process from protein helpers known as chaperones (or chaperonins) that associate with the target protein during the folding process. They act by preventing aggregation of polypeptides that make up the complete protein structure, and they dissociate from the protein once the target protein is folded.

How does protein structure relate to function?

Recall that a protein is built from a long chain of amino acids connected together in a specific order. The specific order of amino acids determines how they will interact together to form the 3-D shape of the protein. The shape of a protein determines its function. Therefore, the order of the amino acids determines the protein’s shape, which determines its function.

Because there are 20 different amino acids, they can be combined together in a practically infinite number of ways. This means that there is a huge number of different protein shapes that can be assumed based on the amino acid order. This is very important since proteins fulfill so many different functions within cells.
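The phrase "practically infinite" can be made concrete with a quick calculation. If any of the 20 amino acids can occupy each position, a chain of length n has 20 to the power n possible sequences. The sketch below is a simple count under that assumption; it says nothing about which sequences actually fold into working proteins.

```python
AMINO_ACID_TYPES = 20


def possible_sequences(chain_length: int) -> int:
    """Number of distinct amino acid orders for a chain of the given length."""
    if chain_length < 0:
        raise ValueError("chain length cannot be negative")
    return AMINO_ACID_TYPES ** chain_length


for length in (2, 3, 10, 100):
    count = possible_sequences(length)
    print(f"length {length}: {count:.3e} possible sequences")
```

Even a modest chain of 100 amino acids has a number of possible sequences that is 131 digits long.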




Nucleic Acids

Nucleic acids are key macromolecules in the continuity of life. They carry the genetic blueprint of a cell and carry instructions for the functioning of the cell. The two main types of nucleic acids are deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). DNA is the genetic material found in all living organisms, ranging from single-celled bacteria to multicellular mammals. The other type of nucleic acid, RNA, is mostly involved in protein synthesis. In eukaryotic cells, the DNA molecules never leave the nucleus, but instead use an RNA intermediary to communicate with the rest of the cell. Other types of RNA are also involved in protein synthesis and its regulation. We will be going into more detail about nucleic acids in a later section.

DNA and RNA are made up of monomers known as nucleotides connected together in a chain with covalent bonds. Each nucleotide is made up of three components: a nitrogenous base, five-carbon sugar, and a phosphate group (Figure 1). The nitrogenous base in one nucleotide is attached to the sugar molecule, which is attached to the phosphate group.

Figure 1 A nucleotide is made up of three components: a nitrogenous base, a pentose sugar, and one or more phosphate groups.

The nitrogenous bases, important components of nucleotides, are organic molecules and are so named because they contain carbon and nitrogen. They are bases because they contain an amino group that can bind an extra hydrogen ion, which decreases the hydrogen ion concentration in the environment and makes it more basic. Each nucleotide in DNA contains one of four possible nitrogenous bases: adenine (A), guanine (G), cytosine (C), and thymine (T). RNA contains the base uracil (U) instead of thymine. The order of the bases in a nucleic acid determines the information that the molecule of DNA or RNA carries. This is because the order of the bases in a DNA gene determines the order in which amino acids will be assembled to form a protein.

The pentose sugar in DNA is deoxyribose, and in RNA, the sugar is ribose (Figure 1). The difference between the sugars is the presence of the hydroxyl group on the second carbon of the ribose and hydrogen on the second carbon of the deoxyribose. The carbon atoms of the sugar molecule are numbered as 1′, 2′, 3′, 4′, and 5′ (1′ is read as “one prime”). The phosphate residue is attached to the hydroxyl group of the 5′ carbon of one sugar and the hydroxyl group of the 3′ carbon of the sugar of the next nucleotide, which forms a 5′–3′ phosphodiester linkage (a specific type of covalent bond). A polynucleotide may have thousands of such phosphodiester linkages.

DNA Double-Helical Structure

DNA has a double-helical structure (Figure 2). It is composed of two strands, or chains, of nucleotides. The double helix of DNA is often compared to a twisted ladder. The strands (the outside parts of the ladder) are formed by linking the phosphates and sugars of adjacent nucleotides with strong chemical bonds, called covalent bonds. The rungs of the twisted ladder are each made up of two bases attached to each other by weak chemical bonds, called hydrogen bonds. Two bases hydrogen-bonded together are called a base pair. The ladder twists along its length, hence the “double helix” description, which means a double spiral.

twisted ladder of DNA

Figure 2 The double-helix model shows DNA as two intertwined strands of nucleotides. (credit: Jerome Walker, Dennis Myts).

The alternating sugar and phosphate groups lie on the outside of each strand, forming the backbone of the DNA. The nitrogenous bases are stacked in the interior, like the steps of a staircase, and these bases pair; the pairs are bound to each other by hydrogen bonds. The bases pair in such a way that the distance between the backbones of the two strands is the same all along the molecule.
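Base pairing can be illustrated with a few lines of code. The sketch below assumes the standard DNA pairing partners (adenine with thymine, guanine with cytosine), which this section does not spell out. The sequence and the names `PAIRING` and `paired_strand` are made up for illustration, and the sketch ignores the direction in which each strand is read.

```python
# Standard DNA base-pairing partners (an assumption; not listed in this section).
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}


def paired_strand(strand: str) -> str:
    """Return the base that sits opposite each base of the given strand."""
    return "".join(PAIRING[base] for base in strand)


example = "ATGCCGTA"  # made-up sequence for illustration
print(example)
print(paired_strand(example))
```

Because every base has exactly one partner, the order of bases on one strand fixes the order on the other.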

How does nucleic acid structure determine function?

The major function of both DNA and RNA is to store and carry genetic information. The specific order of nucleotides in the molecule of DNA or RNA is what determines the genetic information it carries. You can think of it like letters in a book – if the order of the letters were changed, the book would no longer contain the same (or correct) information.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Biology. OpenStax CNX. May 27, 2016.


What is Life?

What is biology? In simple terms, biology is the study of living organisms and their interactions with one another and their environments. This is a very broad definition because the scope of biology is vast. Biologists may study anything from the microscopic or submicroscopic view of a cell to ecosystems and the whole living planet. Listening to the daily news, you will quickly realize how many aspects of biology are discussed every day. For example, recent news topics include Zika virus and using a new technology called CRISPR to specifically target and edit human genes. Other subjects include efforts toward finding a cure for AIDS, Alzheimer’s disease, and cancer. On a global scale, many researchers are committed to finding ways to protect the planet, solve environmental issues, and reduce the effects of climate change. All of these diverse endeavors are related to different facets of the discipline of biology.


What makes something living?

All living organisms share several key characteristics or functions: order, sensitivity or response to the environment, reproduction, adaptation, growth and development, homeostasis, energy processing, and evolution. When viewed together, these characteristics serve to define life. Different sources may use slightly different terms to describe these characteristics, but the basic ideas are always present.


Order

A photo shows a light-colored toad covered in bright green spots.

Figure 1 A toad represents a highly organized structure consisting of cells, tissues, organs, and organ systems. (credit: “Ivengo”/Wikimedia Commons)

Organisms are highly organized, coordinated structures that consist of one or more cells. Even very simple, single-celled organisms are remarkably complex: inside each cell, atoms make up molecules; these in turn make up cell organelles and other cellular inclusions. In multicellular organisms, such as the toad seen in Figure 1, similar cells form tissues. Tissues, in turn, collaborate to create organs (body structures with a distinct function). Organs work together to form organ systems.

In this class, we will be focusing on how cells function, so we will be concentrating on biological molecules, how they make up cells, and how those cells function.

Sensitivity or Response to Stimuli

A photograph of the Mimosa pudica shows a plant with many tiny leaves connected to a central stem. Four of these stems connect together.

Figure 2 The leaves of this sensitive plant (Mimosa pudica) will instantly droop and fold when touched. After a few minutes, the plant returns to normal. (credit: Alex Lomas)

Organisms respond to diverse stimuli. For example, plants can bend toward a source of light, climb on fences and walls, or respond to touch (Figure 2). Even tiny bacteria can move toward or away from chemicals (a process called chemotaxis) or light (phototaxis).


Reproduction

Single-celled organisms reproduce by first duplicating their DNA and then dividing it equally as the cell prepares to divide to form two new cells. Multicellular organisms often produce specialized reproductive (germline) cells that will form new individuals. When reproduction occurs, DNA is passed from the organism to that organism’s offspring. DNA contains the instructions to produce all the physical traits for the organism. Because parents and offspring share DNA, the offspring will belong to the same species and will have similar characteristics, such as size and shape.

Growth and Development

All living things increase in size and/or change over their lifespan. For example, a human grows from a baby into an adult and goes through developmental processes such as puberty. Organisms grow and develop following specific instructions coded for by their genes (DNA). These genes provide instructions that will direct cellular growth and development, ensuring that a species’ young will grow up to exhibit many of the same characteristics as its parents, like the kittens seen in Figure 3.

A photograph depicts a mother cat nursing three kittens: one has an orange and white tabby coat, another is black with a white foot, while the third has a black and white tabby coat.

Figure 3 Although no two look alike, these kittens have inherited genes from both parents and share many of the same characteristics. (credit: Rocky Mountain Feline Rescue)

Homeostasis and Regulation

A photo shows a white, furry polar bear.

Figure 4 Polar bears (Ursus maritimus) and other mammals living in ice-covered regions maintain their body temperature by generating heat and reducing heat loss through thick fur and a dense layer of fat under their skin. (credit: “longhorndave”/Flickr)

In order to function properly, cells need to have appropriate conditions such as proper temperature, pH, and appropriate concentration of diverse chemicals. These conditions may, however, change from one moment to the next. Organisms are able to maintain internal conditions within a narrow range almost constantly, despite environmental changes, through homeostasis (literally, “steady state”)—the ability of an organism to maintain constant internal conditions. For example, an organism needs to regulate body temperature through a process known as thermoregulation. Organisms that live in cold climates, such as the polar bear (Figure 4), have body structures that help them withstand low temperatures and conserve body heat. Structures that aid in this type of insulation include fur, feathers, blubber, and fat. In hot climates, organisms have methods (such as perspiration in humans or panting in dogs) that help them to shed excess body heat.

Even the smallest organisms are complex and require multiple regulatory mechanisms to coordinate internal functions, respond to stimuli, and cope with environmental stresses. Two examples of internal functions regulated in an organism are nutrient transport and blood flow. Organs (groups of tissues working together) perform specific functions, such as carrying oxygen throughout the body, removing wastes, delivering nutrients to every cell, and cooling the body.

Energy Processing

Photo shows a California condor in flight with a tag on its wing.

Figure 5 The California condor (Gymnogyps californianus) uses chemical energy derived from food to power flight. California condors are an endangered species; this bird has a wing tag that helps biologists identify the individual. (credit: Pacific Southwest Region U.S. Fish and Wildlife Service)

All organisms use a source of energy for their metabolic activities. Some organisms (such as grass and bacteria that can perform photosynthesis) capture energy from the sun and convert it into chemical energy in food; others (such as the condor seen in Figure 5) use chemical energy in molecules they take in as food.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

Text adapted from: OpenStax, Concepts of Biology. OpenStax CNX. May 25, 2017


Levels of Organization

Living things are highly organized and structured, following a hierarchy that can be examined on a scale from small to large. The atom is the smallest and most fundamental unit of matter. It consists of a nucleus surrounded by electrons. Atoms form molecules. A molecule is a chemical structure consisting of at least two atoms held together by one or more chemical bonds. Many molecules that are biologically important are macromolecules, large molecules that are typically formed by polymerization (a polymer is a large molecule that is made by combining smaller units called monomers, which are simpler than macromolecules). An example of a macromolecule is deoxyribonucleic acid (DNA) (Figure 1), which contains the instructions for the structure and functioning of all living organisms. See the section of your textbook about the chemistry of biological molecules for more information.

Molecular model depicts a DNA molecule, showing its double helix structure.

Figure 1 All molecules, including this DNA molecule, are composed of atoms. (credit: “brian0918”/Wikimedia Commons)

Some cells contain aggregates of macromolecules surrounded by membranes; these are called organelles. Organelles are small structures that exist within cells. Examples of organelles include mitochondria and chloroplasts, which carry out indispensable functions: mitochondria produce energy to power the cell, while chloroplasts enable green plants to utilize the energy in sunlight to make sugars. All living things are made of cells; the cell itself is the smallest fundamental unit of structure and function in living organisms. This requirement is one of the reasons why viruses are not considered living: they are not made of cells. To make new viruses, they have to invade and hijack the reproductive mechanism of a living cell; only then can they obtain the materials they need to reproduce. Some organisms consist of a single cell and others are multicellular. Cells are classified as prokaryotic or eukaryotic. Prokaryotes are single-celled or colonial organisms that do not have membrane-bound nuclei; in contrast, the cells of eukaryotes do have membrane-bound organelles and a membrane-bound nucleus.

In larger organisms, cells combine to make tissues, which are groups of similar cells carrying out similar or related functions. Organs are collections of tissues grouped together performing a common function. Organs are present not only in animals but also in plants. An organ system is a higher level of organization that consists of functionally related organs. Mammals have many organ systems. For instance, the circulatory system transports blood through the body and to and from the lungs; it includes organs such as the heart and blood vessels. Organisms are individual living entities. For example, each tree in a forest is an organism. Single-celled prokaryotes and single-celled eukaryotes are also considered organisms and are typically referred to as microorganisms.

All the individuals of a species living within a specific area are collectively called a population. For example, a forest may include many pine trees. All of these pine trees represent the population of pine trees in this forest. Different populations may live in the same specific area. For example, the forest with the pine trees includes populations of flowering plants and also insects and microbial populations. A community is the sum of populations inhabiting a particular area. For instance, all of the trees, flowers, insects, and other populations in a forest form the forest’s community. The forest itself is an ecosystem. An ecosystem consists of all the living things in a particular area together with the abiotic, non-living parts of that environment such as nitrogen in the soil or rain water. At the highest level of organization, the biosphere is the collection of all ecosystems, and it represents the zones of life on earth. It includes land, water, and even the atmosphere to a certain extent.

A flow chart shows the hierarchy of living organisms. From smallest to largest, this hierarchy includes: (1) Organelles, such as nuclei, that exist inside cells. (2) Cells, such as a red blood cell. (3) Tissues, such as human skin tissue. (4) Organs such as the stomach make up the human digestive system, an example of an organ system. (5) Organisms, populations, and communities. In a forest, each pine tree is an organism. Together, all the pine trees make up a population. All the plant and animal species in the forest comprise a community. (6) Ecosystems: the coastal ecosystem in the Southeastern United States includes living organisms and the environment in which they live. (7) The biosphere: encompasses all the ecosystems on Earth.

Figure 2 The biological levels of organization of living things are shown. From a single organelle to the entire biosphere, living organisms are parts of a highly structured hierarchy. (credit “organelles”: modification of work by Umberto Salvagnin; credit “cells”: modification of work by Bruce Wetzel, Harry Schaefer/ National Cancer Institute; credit “tissues”: modification of work by Kilbad; Fama Clamosa; Mikael Häggström; credit “organs”: modification of work by Mariana Ruiz Villareal; credit “organisms”: modification of work by “Crystal”/Flickr; credit “ecosystems”: modification of work by US Fish and Wildlife Service Headquarters; credit “biosphere”: modification of work by NASA)


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

Text adapted from: OpenStax, Concepts of Biology. OpenStax CNX. May 25, 2017


The Diversity of Life

Biology has such a broad scope because of the tremendous diversity of life on Earth. The source of this diversity is evolution, the process of gradual change during which new species arise from older species. Evolutionary biologists study the evolution of living things in everything from the microscopic world to ecosystems.

The evolution of various life forms on Earth can be summarized in a phylogenetic tree (Figure 1). A phylogenetic tree is a diagram showing the evolutionary relationships among biological species based on similarities and differences in genetic or physical traits or both. A phylogenetic tree is composed of branches (the lines) and nodes (places where two lines diverge). The internal nodes represent ancestors and are points in evolution when, based on scientific evidence, an ancestor is thought to have diverged to form two new species. The length of each branch is proportional to the time elapsed since the split.
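The branch-and-node structure of a phylogenetic tree maps naturally onto a nested data structure. The sketch below is an illustration only: it encodes the three domains of life shown in Figure 1 as nested tuples, assuming the commonly drawn arrangement in which Archaea and Eukarya share a more recent ancestor with each other than with Bacteria, and it ignores branch lengths. The names `tree`, `tips`, and `count_internal_nodes` are hypothetical.

```python
# Each internal node is a tuple of its two descendant branches;
# each tip (leaf) is the name of a domain.
tree = ("Bacteria", ("Archaea", "Eukarya"))


def tips(node):
    """Collect the tip names below a node, left to right."""
    if isinstance(node, str):
        return [node]
    names = []
    for child in node:
        names.extend(tips(child))
    return names


def count_internal_nodes(node):
    """Count the branching points (inferred ancestors) in the tree."""
    if isinstance(node, str):
        return 0
    return 1 + sum(count_internal_nodes(child) for child in node)


print(tips(tree))
print(count_internal_nodes(tree))
```

In this encoding, each tuple corresponds to an internal node, that is, a point where an ancestor is thought to have diverged into two lineages.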

This phylogenetic tree shows that the three domains of life, bacteria, archaea and eukarya, all arose from a common ancestor.

Figure 1 This phylogenetic tree was constructed by microbiologist Carl Woese using data obtained from sequencing ribosomal RNA genes. The tree shows the separation of living organisms into three domains: Bacteria, Archaea, and Eukarya. Bacteria and Archaea are prokaryotes, single-celled organisms lacking intracellular organelles. (credit: Eric Gaba; NASA Astrobiology Institute)

While the three-domain system is the most common way to group organisms, other divisions have been proposed.

Figure 2 The relationship between Archaea (in red) and Eukaryotes (green) may be closer than you think. Figure credit: Crion, Wikimedia.


None of these systems currently includes non-cellular life. As of 2011, nucleocytoplasmic large DNA viruses had been discussed as a possible fourth domain of life, a view supported by researchers in 2012.

Stefan Luketa in 2012 proposed a five-domain system, adding Prionobiota (acellular and without nucleic acid) and Virusobiota (acellular but with nucleic acid) to the traditional three domains.

Evolution Connection

Carl Woese and the Phylogenetic Tree

In the past, biologists grouped living organisms into five kingdoms: animals, plants, fungi, protists, and bacteria. The organizational scheme was based mainly on physical features, as opposed to physiology, biochemistry, or molecular biology, all of which are used by modern systematics. The pioneering work of American microbiologist Carl Woese in the early 1970s has shown, however, that life on Earth has evolved along three lineages, now called domains: Bacteria, Archaea, and Eukarya. The first two consist of prokaryotes, microbes whose cells lack membrane-enclosed nuclei and organelles. The third domain contains the eukaryotes and includes unicellular microorganisms together with the four original kingdoms (excluding bacteria). Woese defined Archaea as a new domain, and this resulted in a new taxonomic tree (Figure 1). Many organisms belonging to the Archaea domain live under extreme conditions and are called extremophiles. To construct his tree, Woese used genetic relationships rather than similarities based on morphology (shape).

Woese’s tree was constructed from comparative sequencing of the genes that are universally distributed, present in every organism, and conserved (meaning that these genes have remained essentially unchanged throughout evolution). Woese’s approach was revolutionary because comparisons of physical features are insufficient to differentiate between the prokaryotes that appear fairly similar in spite of their tremendous biochemical diversity and genetic variability (Figure 3). The comparison of homologous DNA and RNA sequences provided Woese with a sensitive device that revealed the extensive variability of prokaryotes, and which justified the separation of the prokaryotes into two domains: bacteria and archaea.

Photos depict: (a) bacterial cells, (b) a natural hot vent, (c) a sunflower, and (d) a lion.

Figure 3 These images represent different domains. The (a) bacteria in this micrograph belong to Domain Bacteria, while the (b) extremophiles (not visible) living in this hot vent belong to Domain Archaea. Both the (c) sunflower and (d) lion are part of Domain Eukarya. (credit a: modification of work by Drew March; credit b: modification of work by Steve Jurvetson; credit c: modification of work by Michael Arrighi; credit d: modification of work by Leszek Leszcynski)



Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

Text adapted from:

OpenStax, Concepts of Biology. OpenStax CNX. May 25, 2017

Eocyte Hypothesis, Wikipedia. May 25, 2017.

Domain (biology), Wikipedia. May 25, 2017.


Cell Structure and Function

Learning Objectives

Course Objectives for this section:

  1. Explain how basic units of cellular structure define the function of all living things.
  • Explain how various cell structures participate in the function of a cell and/or organism.
  • Discuss the role of evolution in shaping cellular structure and function.

Close your eyes and picture a brick wall. What is the basic building block of that wall? It is a single brick, of course. Like a brick wall, your body is composed of basic building blocks, and the building blocks of your body are cells (Figure 1a-c).

Your body has many kinds of cells, each specialized for a specific purpose. Just as a home is made from a variety of building materials, the human body is constructed from many cell types. For example, epithelial cells protect the surface of the body and cover the organs and body cavities within. Bone cells help to support and protect the body. Cells of the immune system fight invading bacteria. Additionally, red blood cells carry oxygen throughout the body. Each of these cell types plays a vital role during the growth, development, and day-to-day maintenance of the body. In spite of their enormous variety, however, all cells share certain fundamental characteristics.

cells viewed under a microscope

Figure 1 (a) Nasal sinus cells (viewed with a light microscope), (b) onion cells (viewed with a light microscope), and (c) Vibrio tasmaniensis bacterial cells (viewed using a scanning electron microscope) are from very different organisms, yet all share certain characteristics of basic cell structure. (credit a: modification of work by Ed Uthman, MD; credit b: modification of work by Umberto Salvagnin; credit c: modification of work by Anthony D’Onofrio; scale-bar data from Matt Russell)


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

Text adapted from: OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016


How Cells Are Studied

A cell is the smallest unit of a living thing. A living thing, like you, is called an organism. Thus, cells are the basic building blocks of all organisms.

In multicellular organisms, cells of one particular cell type interconnect with each other and perform shared functions to form tissues (for example, muscle tissue, connective tissue, and nervous tissue). Several tissues combine to form an organ (for example, stomach, heart, or brain), and several organs make up an organ system (such as the digestive system, circulatory system, or nervous system). Several systems functioning together form an organism (such as an elephant).

There are many types of cells, and all are grouped into one of two broad categories: prokaryotic and eukaryotic. Animal cells, plant cells, fungal cells, and protist cells are classified as eukaryotic, whereas bacteria and archaea cells are classified as prokaryotic. Before discussing the criteria for determining whether a cell is prokaryotic or eukaryotic, let us first examine how biologists study cells.


Cells vary in size. With few exceptions, individual cells are too small to be seen with the naked eye, so scientists use microscopes to study them. A microscope is an instrument that magnifies an object. Most images of cells are taken with a microscope and are called micrographs.

Light Microscopes

To give you a sense of the size of a cell, a typical human red blood cell is about eight millionths of a meter, or eight micrometers (abbreviated as μm), in diameter; the head of a pin is about two thousandths of a meter, or two millimeters (abbreviated as mm), in diameter. That means that approximately 250 red blood cells lined up side by side would span the head of a pin.
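The figure of 250 follows from dividing the two diameters, which gives the number of cells that would fit side by side across the pin head. A minimal check, with variable names chosen for illustration:

```python
RED_BLOOD_CELL_DIAMETER_UM = 8   # eight micrometers
PIN_HEAD_DIAMETER_UM = 2_000     # two millimeters, expressed in micrometers

# Number of red blood cells lined up edge to edge across the pin head.
cells_across = PIN_HEAD_DIAMETER_UM / RED_BLOOD_CELL_DIAMETER_UM
print(cells_across)
```

Converting both measurements to the same unit (here, micrometers) before dividing is the step most easily overlooked.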

The optics of the lenses of a light microscope change the orientation of the image. A specimen that is right-side up and facing right on the microscope slide will appear upside-down and facing left when viewed through a microscope, and vice versa. Similarly, if the slide is moved left while looking through the microscope, it will appear to move right, and if moved down, it will seem to move up. This occurs because microscopes use two sets of lenses to magnify the image. Due to the manner in which light travels through the lenses, this system of lenses produces an inverted image (binoculars and a dissecting microscope work in a similar manner, but include an additional magnification system that makes the final image appear to be upright).

pictures of microscopes

Figure 1 (a) Most light microscopes used in a college biology lab can magnify cells up to approximately 400 times. (b) Dissecting microscopes have a lower magnification than light microscopes and are used to examine larger objects, such as tissues.

Most student microscopes are classified as light microscopes (Figure 1a). Visible light both passes through and is bent by the lens system to enable the user to see the specimen. Light microscopes are advantageous for viewing living organisms, but since individual cells are generally transparent, their components are not distinguishable unless they are colored with special stains. Staining, however, usually kills the cells.

Light microscopes commonly used in the undergraduate college laboratory magnify up to approximately 400 times. Two parameters that are important in microscopy are magnification and resolving power. Magnification is the degree of enlargement of an object. Resolving power is the ability of a microscope to allow the eye to distinguish two adjacent structures as separate; the higher the resolution, the closer those two objects can be, and the better the clarity and detail of the image. When oil immersion lenses are used, magnification is usually increased to 1,000 times for the study of smaller cells, like most prokaryotic cells. In light microscopy, light enters the specimen from below and is focused onto the eye of the observer. For light to pass through, the sample must therefore be thin or translucent.

A second type of microscope used in laboratories is the dissecting microscope (Figure 1b). These microscopes have a lower magnification (20 to 80 times the object size) than light microscopes and can provide a three-dimensional view of the specimen. Thick objects can be examined with many components in focus at the same time. These microscopes are designed to give a magnified and clear view of tissue structure as well as the anatomy of the whole organism.

Like light microscopes, most modern dissecting microscopes are also binocular, meaning that they have two separate lens systems, one for each eye. The lens systems are separated by a certain distance, and therefore provide a sense of depth in the view of their subject to make manipulations by hand easier. Dissecting microscopes also have optics that correct the image so that it appears as if being seen by the naked eye and not as an inverted image. The light illuminating a sample under a dissecting microscope typically comes from above the sample, but may also be directed from below.

Electron Microscopes

In contrast to light microscopes, electron microscopes use a beam of electrons instead of a beam of light (compare Figures 2 and 3). Not only does this allow for higher magnification and, thus, more detail, it also provides higher resolving power. Preparation of a specimen for viewing under an electron microscope will kill it; therefore, live cells cannot be viewed using this type of microscopy. In addition, the electron beam moves best in a vacuum, making it impossible to view living materials. There are two major types of electron microscopes, the scanning electron microscope (SEM) and the transmission electron microscope (TEM), which differ in the images they provide.

salmonella bacteria

Figure 2 Salmonella bacteria are viewed with a light microscope. (credit: modification of work by CDC, Armed Forces Institute of Pathology, Charles N. Farmer)


salmonella SEM

Figure 3 This scanning electron micrograph (SEM) shows Salmonella bacteria (in red) invading human cells. (credit: modification of work by Rocky Mountain Laboratories, NIAID, NIH; scale-bar data from Matt Russell)

Cell Theory

The microscopes we use today are far more complex than those used in the 1600s by Antony van Leeuwenhoek, a Dutch shopkeeper who had great skill in crafting lenses. Despite the limitations of his now-ancient lenses, van Leeuwenhoek observed the movements of protists (a type of single-celled organism) and sperm, which he collectively termed “animalcules.”

In a 1665 publication called Micrographia, experimental scientist Robert Hooke coined the term “cell” (from the Latin cella, meaning “small room”) for the box-like structures he observed when viewing cork tissue through a lens. In the 1670s, van Leeuwenhoek discovered bacteria and protozoa. Later advances in lenses and microscope construction enabled other scientists to see different components inside cells.

By the late 1830s, botanist Matthias Schleiden and zoologist Theodor Schwann were studying tissues and proposed the unified cell theory. This theory has three principles which still stand today. They are:

  1. All living things are composed of one or more cells.
  2. The cell is the basic unit of life.
  3. All new cells arise from existing cells.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

Text adapted from: OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016


Comparing Prokaryotic and Eukaryotic Cells

Cells fall into one of two broad categories: prokaryotic and eukaryotic. The predominantly single-celled organisms of the domains Bacteria and Archaea are classified as prokaryotes (pro– = before; –karyon– = nucleus). Animal cells, plant cells, fungi, and protists are eukaryotes (eu– = true).

All cells share four common components: 1) a plasma membrane, an outer covering that separates the cell’s interior from its surrounding environment; 2) cytoplasm, consisting of a gel-like region within the cell in which other cellular components are found; 3) DNA, the genetic material of the cell; and 4) ribosomes, particles that synthesize proteins.

Components of Prokaryotic Cells

Prokaryotes differ from eukaryotic cells in several important ways. A prokaryotic cell is a simple, single-celled (unicellular) organism that lacks a nucleus, or any other membrane-bound organelle. We will shortly come to see that this is significantly different in eukaryotes. Prokaryotic DNA is found in the central part of the cell: a darkened region called the nucleoid (Figure 1).

prokaryote

Figure 1 This figure shows the generalized structure of a prokaryotic cell.

Unlike Archaea and eukaryotes, bacteria have a cell wall made of peptidoglycan, which is composed of sugars and amino acids, and many have a polysaccharide (carbohydrate) capsule (Figure 1). The cell wall acts as an extra layer of protection, helps the cell maintain its shape, and prevents dehydration. The capsule enables the cell to attach to surfaces in its environment. Some prokaryotes have flagella, pili, or fimbriae. Flagella are used for locomotion, while most pili are used to exchange genetic material during a type of reproduction called conjugation.

Components of Eukaryotic Cells

In nature, the relationship between form and function is apparent at all levels, including the level of the cell, and this will become clear as we explore eukaryotic cells. The principle “form follows function” is found in many contexts. For example, birds and fish have streamlined bodies that allow them to move quickly through the medium in which they live, be it air or water. It means that, in general, one can deduce the function of a structure by looking at its form, because the two are matched.

A eukaryotic cell is a cell that has a membrane-bound nucleus and other membrane-bound compartments or sacs, called organelles, which have specialized functions. The rest of this chapter will discuss functions of the various organelles. The word eukaryotic means “true kernel” or “true nucleus,” alluding to the presence of the membrane-bound nucleus in these cells. The word “organelle” means “little organ,” and, as already mentioned, organelles have specialized cellular functions, just as the organs of your body have specialized functions. 

Figure 2 A generalized eukaryotic cell showing some of the organelles.

Both animals and plants are eukaryotes. Despite their fundamental similarities, there are some striking differences between animal and plant cells. Animal cells have centrioles, centrosomes (discussed under the cytoskeleton), and lysosomes, whereas plant cells do not. Plant cells have a cell wall, chloroplasts, plasmodesmata, plastids used for storage, and a large central vacuole, whereas animal cells do not.


Cell Size

At 0.1–5.0 μm in diameter, prokaryotic cells are significantly smaller than eukaryotic cells, which have diameters ranging from 10 to 100 μm (Figure 3). The small size of prokaryotes allows ions and organic molecules that enter them to quickly spread to other parts of the cell. Similarly, any wastes produced within a prokaryotic cell can quickly move out. Larger eukaryotic cells, however, have evolved structural adaptations to enhance cellular transport. Indeed, the large size of these cells would not be possible without these adaptations. In general, cell size is limited because volume increases much more quickly than does cell surface area. As a cell becomes larger, it becomes more and more difficult for the cell to acquire sufficient materials to support the processes inside the cell, because the relative size of the surface area across which materials must be transported declines.

Figure 3 This figure shows the relative sizes of different kinds of cells and cellular components. An adult human is shown for comparison.

Small size, in general, is necessary for all cells, whether prokaryotic or eukaryotic. Let’s examine why that is so. First, we’ll consider the area and volume of a typical cell. Not all cells are spherical in shape, but most tend to approximate a sphere. You may remember from your geometry course that the formula for the surface area of a sphere is 4πr², while the formula for its volume is 4πr³/3. Thus, as the radius of a cell increases, its surface area increases as the square of its radius, but its volume increases as the cube of its radius (much more rapidly). Therefore, as a cell increases in size, its surface area-to-volume ratio decreases. This same principle would apply if the cell had the shape of a cube (Figure 4). If the cell grows too large, the plasma membrane will not have sufficient surface area to support the rate of diffusion required for the increased volume. In other words, as a cell grows, it becomes less efficient. One way to become more efficient is to divide; another way is to develop organelles that perform specific tasks. These adaptations lead to the development of more sophisticated cells called eukaryotic cells.
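The two sphere formulas above can be combined into a single expression for the surface area-to-volume ratio. The short derivation below is only a worked illustration; the symbols SA, V, and r are labels chosen here for the surface area, volume, and radius of an idealized spherical cell.

```latex
\[
\frac{SA}{V} \;=\; \frac{4\pi r^{2}}{\tfrac{4}{3}\pi r^{3}} \;=\; \frac{3}{r}
\]
```

Because the ratio works out to 3/r, doubling the radius of a spherical cell cuts its surface area-to-volume ratio in half.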

Figure 4 Volume increases faster than surface area. The surface area of the small cell is 1 mm × 1 mm × 6 sides = 6 mm². The volume of the small cell is 1 mm × 1 mm × 1 mm = 1 mm³. This gives a surface area-to-volume ratio of 6:1. The surface area of the larger cell is 2 mm × 2 mm × 6 sides = 24 mm². The volume of the larger cell is 2 mm × 2 mm × 2 mm = 8 mm³. This gives a surface area-to-volume ratio of 3:1 (24:8 reduces to 3:1).
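The arithmetic in Figure 4 can be repeated for cube-shaped cells of any size. The Python sketch below is an illustration only: the function names are made up for this example, and the cube is the same simplification of cell shape used in the figure.

```python
def cube_surface_area(side):
    """Surface area of a cube: six square faces, each side x side."""
    return 6 * side ** 2


def cube_volume(side):
    """Volume of a cube: side x side x side."""
    return side ** 3


def surface_to_volume(side):
    """Surface area-to-volume ratio of a cube-shaped cell."""
    return cube_surface_area(side) / cube_volume(side)


# Side lengths in mm; the first two match the cells in Figure 4.
for side_mm in (1, 2, 4, 8):
    area = cube_surface_area(side_mm)
    volume = cube_volume(side_mm)
    ratio = surface_to_volume(side_mm)
    print(f"side {side_mm} mm: area {area} mm^2, volume {volume} mm^3, ratio {ratio}:1")
```

Each doubling of the side length halves the ratio, so a larger cell has proportionally less membrane available to supply each unit of its volume.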


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

Text adapted from: OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016


The Plasma Membrane and The Cytoplasm

At this point, it should be clear that eukaryotic cells have a more complex structure than do prokaryotic cells. Organelles allow for various functions to occur in the cell at the same time. Before discussing the functions of organelles within a eukaryotic cell, let us first examine two important components of all cells (prokaryotic and eukaryotic): the plasma membrane and the cytoplasm.


Figure 1 A prokaryotic cell. The cytoplasm is not labeled, but is the light blue area inside the cell membrane. The ribosome label is pointing to one of the small brown dots that represent ribosomes.


Figure 2 This figure shows a typical animal cell.

Figure 3 This figure shows a typical plant cell.

The Plasma Membrane

Like prokaryotes, eukaryotic cells have a plasma membrane (find it in Figures 1–3, then look at the detailed structure in Figure 4). The plasma membrane is a phospholipid bilayer with embedded proteins, and it separates the internal contents of the cell from its surrounding environment. A phospholipid is a lipid molecule composed of two fatty acid chains, a glycerol backbone, and a phosphate group. The plasma membrane regulates the passage of substances such as organic molecules, ions, and water: it prevents the passage of some to maintain internal conditions, while actively bringing in or removing others. Other compounds move passively across the membrane.

Figure 4 The plasma membrane is a phospholipid bilayer with embedded proteins. There are other components, such as cholesterol and carbohydrates, which can be found in the membrane in addition to phospholipids and proteins.

The plasma membranes of cells that specialize in absorption are folded into fingerlike projections called microvilli (singular = microvillus). This folding increases the surface area of the plasma membrane. Such cells are typically found lining the small intestine, the organ that absorbs nutrients from digested food (Figure 5). This is an excellent example of form matching the function of a structure.

Figure 5 Microvilli, shown here as they appear on cells lining the small intestine, increase the surface area available for absorption. These microvilli are only found on the area of the plasma membrane that faces the cavity from which substances will be absorbed. (credit “micrograph”: modification of work by Louisa Howard)

The Cytoplasm

The cytoplasm comprises the contents of a cell between the plasma membrane and the nuclear envelope (a structure to be discussed shortly). It is made up of organelles suspended in the gel-like cytosol, the cytoskeleton, and various chemicals (Find it in Figures 1-3). Even though the cytoplasm consists of 70 to 80 percent water, it has a semi-solid consistency, which comes from the proteins within it. However, proteins are not the only organic molecules found in the cytoplasm. Glucose and other simple sugars, polysaccharides, amino acids, nucleic acids, fatty acids, and derivatives of glycerol are found there too. Ions of sodium, potassium, calcium, and many other elements are also dissolved in the cytoplasm. Many metabolic reactions, including protein synthesis, take place in the cytoplasm. Take note that the cytoplasm is not “empty” or “filler” – it is a vitally important component of cells that allows chemical reactions to take place!




Ribosomes

Ribosomes are the cellular structures responsible for protein synthesis. The word “synthesis” means “to combine things to produce something else.” In this context, protein synthesis means combining different amino acids together to form a protein. Ribosomes join amino acids together in a chain to form a protein (Figure 1). This amino acid chain then folds into a complex 3-dimensional structure. The shape of a protein is what gives the protein its specific function.


Figure 1 Protein structure. The colored balls at the top of this diagram represent different amino acids. Amino acids are the subunits that are joined together by the ribosome to form a protein. This chain of amino acids then folds to form a complex 3D structure. (Credit: Lady of Hats from Wikipedia; public domain)

Helpful Hint: Proteins are not typically used as a source of energy for the body. Protein from your diet is broken down into individual amino acids, which are reassembled by your ribosomes into the proteins that your cells need. Ribosomes do not produce energy.

When viewed through an electron microscope, free ribosomes appear as either clusters or single tiny dots floating freely in the cytoplasm. Other ribosomes are attached to the cytoplasmic side of the plasma membrane, to the cytoplasmic side of the rough endoplasmic reticulum, or to the outer membrane of the nuclear envelope (Figure 2).

Figure 2 Ribosomes can be found free in the cytoplasm (not shown in this diagram), or attached to the outer membrane of the nucleus and the rough endoplasmic reticulum (RER). Credit CFCF; Wikimedia; CC license.

Because protein synthesis is essential for all cells, ribosomes are found in practically every cell, although they are smaller in prokaryotic cells. They are particularly abundant in immature red blood cells for the synthesis of hemoglobin, which functions in the transport of oxygen throughout the body. Electron microscopy has shown us that ribosomes, which are large complexes of protein and RNA, consist of two subunits, aptly called large and small (Figure 3). In eukaryotes, ribosomes receive their “orders” for protein synthesis from the nucleus, where the DNA is transcribed into messenger RNA (mRNA). The mRNA travels to the ribosomes, which translate the code provided by the sequence of the nitrogenous bases in the mRNA into a specific order of amino acids in a protein. Amino acids are the building blocks of proteins.
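To make the idea of translating an mRNA base sequence into an order of amino acids more concrete, here is a minimal Python sketch. It assumes details this section has not introduced: the bases are read three at a time (each group of three is called a codon), and only a small, deliberately incomplete subset of the standard genetic code is included. The function name and the example sequence are hypothetical and chosen only for illustration.

```python
# A small, deliberately incomplete subset of the standard genetic code.
# Keys are three-base mRNA codons; values are amino acid abbreviations.
CODON_TABLE = {
    "AUG": "Met",   # methionine; also the usual start signal
    "UUU": "Phe",   # phenylalanine
    "GGC": "Gly",   # glycine
    "GCU": "Ala",   # alanine
    "AAA": "Lys",   # lysine
    "UAA": "STOP",  # stop signals end the protein chain
    "UAG": "STOP",
    "UGA": "STOP",
}


def translate(mrna):
    """Read an mRNA string three bases at a time and return the
    resulting list of amino acids, stopping at the first stop signal."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        amino_acid = CODON_TABLE[codon]  # KeyError if codon is not in this subset
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


# Hypothetical mRNA sequence, used only for illustration.
example_mrna = "AUGUUUGGCGCUAAAUAA"
print("-".join(translate(example_mrna)))
```

A real ribosome does far more than look up a table, but the sketch captures the central point of the paragraph above: the order of bases in the mRNA determines the order of amino acids in the protein.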

Figure 3 Ribosomes are made up of a large subunit (top) and a small subunit (bottom). During protein synthesis, ribosomes assemble amino acids into proteins.



The Cytoskeleton

If you were to remove all the organelles from a cell, would the plasma membrane and the cytoplasm be the only components left? No. Within the cytoplasm, there would still be ions and organic molecules, plus a network of protein fibers known as the cytoskeleton.

Both prokaryotes and eukaryotes have a cytoskeleton. Both types of organisms use their cytoskeleton for cell division, protection, and shape determination.

In eukaryotes, the cytoskeleton also secures certain organelles in specific positions and allows cytoplasm and vesicles to move within the cell. It also enables unicellular organisms to move independently. There are three types of fibers within the cytoskeleton: microfilaments (also known as actin filaments), intermediate filaments, and microtubules (Figure 1).

Figure 1 Microfilaments, intermediate filaments, and microtubules compose a cell’s cytoskeleton.


Microfilaments

Of the three types of protein fibers in the cytoskeleton, microfilaments are the narrowest. They function in cellular movement, have a diameter of about 7 nm, and are made of two intertwined strands of a globular protein called actin. For this reason, microfilaments are also known as actin filaments.

ATP is required for actin proteins to assemble into long filaments. These long actin filaments serve as a track for the movement of a motor protein called myosin. Actin and myosin are plentiful in muscle cells. When your actin and myosin filaments slide past each other, your muscles contract. Actin also enables your cells to engage in cellular events requiring motion, such as cell division in animal cells and cytoplasmic streaming, which is the circular movement of the cell cytoplasm in plant cells.

Microfilaments also provide some rigidity and shape to the cell. They can depolymerize (disassemble) and reform quickly, thus enabling a cell to change its shape and move. White blood cells (your body’s infection-fighting cells) make good use of this ability. They can move to the site of an infection and phagocytize the pathogen.

Intermediate Filaments

Intermediate filaments are made of several strands of fibrous proteins that are wound together. These elements of the cytoskeleton get their name from the fact that their diameter, 8 to 10 nm, is between those of microfilaments and microtubules.

Intermediate filaments have no role in cell movement. Their function is purely structural. They bear tension, thus maintaining the shape of the cell, and anchor the nucleus and other organelles in place. Figure 1 shows how intermediate filaments create a supportive scaffolding inside the cell.

The intermediate filaments are the most diverse group of cytoskeletal elements. Several types of fibrous proteins are found in the intermediate filaments. You are probably most familiar with keratin, the fibrous protein that strengthens your hair, nails, and the epidermis of the skin.


Microtubules

As their name implies, microtubules are small hollow tubes. The walls of the microtubule are made of polymerized dimers of α-tubulin and β-tubulin, two globular proteins. With a diameter of about 25 nm, microtubules are the widest components of the cytoskeleton. They help the cell resist compression, provide a track along which vesicles move through the cell, and pull replicated chromosomes to opposite ends of a dividing cell. Like microfilaments, microtubules can disassemble and re-form quickly.

Microtubules are also the structural elements of flagella, cilia, and centrioles (the latter are the two perpendicular bodies of the centrosome). In fact, in animal cells, the centrosome is the microtubule-organizing center. In eukaryotic cells, flagella and cilia are quite different structurally from their counterparts in prokaryotes, as discussed below.

The centrosome replicates itself before a cell divides, and the centrioles appear to play some role in pulling the duplicated chromosomes to opposite ends of the dividing cell. However, the exact function of the centrioles in cell division is not clear, since cells that have the centrioles removed can still divide, and plant cells, which lack centrioles, are capable of cell division.



Flagella and Cilia

Flagella (singular = flagellum) are long, hair-like structures that extend from the plasma membrane and are used to move an entire cell (for example, sperm or Euglena). When flagella are present, the cell has just one flagellum or a few flagella. Prokaryotes sometimes have flagella, and can have more than one flagellum, but prokaryotic flagella are structurally very different from eukaryotic flagella. Flagella serve the same function in both prokaryotes and eukaryotes: moving an entire cell.

Figure 1 Examples of bacterial flagella arrangement schemes. Credit Adenosine; Wikimedia.

When cilia (singular = cilium) are present, however, they are many in number and extend along the entire surface of the plasma membrane. They are short, hair-like structures that are used to move entire cells (such as Paramecium) or to move substances along the outer surface of the cell. For example, the cilia of cells lining the fallopian tubes move the ovum toward the uterus, and the cilia of cells lining the respiratory tract move particulate matter that mucus has trapped toward the throat. Cilia are not found on prokaryotes.

Figure 2 Scanning electron microscope image of lung trachea epithelium. There are both ciliated and non-ciliated cells in this epithelium. Note the difference in size between the cilia and the microvilli (on the non-ciliated cell surface). Photo credit Charles Daghlian; Wikimedia; public domain.



The Endomembrane System

The endomembrane system (endo- = within) is a group of membranes and organelles (see Figure 1) in eukaryotic cells that work together to modify, package, and transport lipids and proteins. It includes the nuclear envelope, lysosomes, vesicles, the endoplasmic reticulum, and the Golgi apparatus, which we will cover shortly. Although not technically within the cell, the plasma membrane is included in the endomembrane system because, as you will see, it interacts with the other endomembranous organelles. With the exception of the plasma membrane, none of the components of the endomembrane system are found in prokaryotes.

Figure 1 Membrane and secretory proteins are synthesized in the rough endoplasmic reticulum (RER). The RER also sometimes modifies proteins. In this illustration, a (green) integral membrane protein in the ER is modified by attachment of a (purple) carbohydrate. Vesicles with the integral protein bud from the ER and fuse with the cis face of the Golgi apparatus. As the protein passes along the Golgi’s cisternae, it is further modified by the addition of more carbohydrates. After its synthesis is complete, it exits as an integral membrane protein of a vesicle that buds from the Golgi’s trans face. When the vesicle fuses with the cell membrane, the protein becomes an integral part of that cell membrane. (credit: modification of work by Magnus Manske)



The Nucleus

Typically, the nucleus is the most prominent organelle in a cell. The nucleus (plural = nuclei) houses the cell’s DNA in the form of chromatin and directs the synthesis of ribosomes and proteins. Let us look at it in more detail (Figure 1).

Figure 1 The outermost boundary of the nucleus is the nuclear envelope. Notice that the nuclear envelope consists of two phospholipid bilayers (membranes)—an outer membrane and an inner membrane—in contrast to the plasma membrane, which consists of only one phospholipid bilayer. (credit: modification of work by NIGMS, NIH)

The nuclear envelope is a double-membrane structure that constitutes the outermost portion of the nucleus (Figure 2). Both the inner and outer membranes of the nuclear envelope are phospholipid bilayers.

Figure 2 This illustration shows the double membrane structure surrounding the nucleus. Notice that both membranes are composed of a phospholipid bilayer. Credit Boumphreyfr; Wikimedia


The nuclear envelope is punctuated with pores that control the passage of ions, molecules, and RNA between the nucleoplasm and the cytoplasm (Figure 2). The nucleoplasm is the semi-solid fluid inside the nucleus, where we find the chromatin and the nucleolus.

You may remember that in prokaryotes, DNA is organized into a single circular chromosome. In eukaryotes, chromosomes are linear structures within the nucleus that are made up of DNA, the hereditary material, and proteins. This combination of DNA and proteins is called chromatin. Every species has a specific number of chromosomes in the nucleus of its body cells. For example, in humans, the chromosome number is 46, whereas in fruit flies, the chromosome number is eight.

Figure 3 This image shows paired chromosomes. Each pair of chromosomes is shown in a different color. In reality, chromosomes are not colorful and typically look grayish. (Credit: modification of work by NIH; scale-bar data from Matt Russell)

Chromosomes are only visible and distinguishable from one another when the cell is getting ready to divide. When the cell is in the growth and maintenance phases of its life cycle, the chromosomes resemble an unwound, jumbled bunch of threads. These unwound DNA-protein complexes are called chromatin (Figure 4); the term chromatin describes the material that makes up the chromosomes both when condensed and when decondensed.


Figure 4 This image shows various levels of the organization of chromatin (DNA and protein).


We already know that the nucleus directs the synthesis of ribosomes, but how does it do this? Some chromosomes have sections of DNA that encode ribosomal RNA. A darkly staining area within the nucleus, called the nucleolus (plural = nucleoli; see Figure 1), aggregates the ribosomal RNA with associated proteins to assemble the ribosomal subunits, which are then transported through the nuclear pores into the cytoplasm.



The Endoplasmic Reticulum

The endoplasmic reticulum (ER) is a series of interconnected membranous tubules that collectively modify proteins and synthesize lipids. However, these two functions are performed in separate areas of the endoplasmic reticulum: the rough endoplasmic reticulum and the smooth endoplasmic reticulum, respectively.

Figure 1 The rough and smooth endoplasmic reticulum are part of the endomembrane system.

The hollow portion of the ER tubules is called the lumen or cisternal space. The membrane of the ER, which is a phospholipid bilayer embedded with proteins, is continuous with the nuclear envelope (Figure 1).

The rough endoplasmic reticulum (RER) is so named because the ribosomes attached to its cytoplasmic surface give it a studded appearance when viewed through an electron microscope (Figure 2). The ribosomes synthesize proteins while attached to the ER, resulting in transfer of their newly synthesized proteins into the lumen of the RER where they undergo modifications such as folding or addition of sugars. The RER also makes phospholipids for cell membranes.

Figure 2 (a) The ER is a winding network of thin membranous sacs found in close association with the cell nucleus. The smooth and rough endoplasmic reticula are very different in appearance and function (source: mouse tissue). (b) Rough ER is studded with numerous ribosomes, which are sites of protein synthesis (source: mouse tissue). EM × 110,000. (c) Smooth ER synthesizes phospholipids and steroid hormones, regulates the concentration of cellular Ca²⁺, metabolizes some carbohydrates, and breaks down certain toxins (source: mouse tissue). EM × 110,510. (Micrographs provided by the Regents of University of Michigan Medical School © 2012). Figure from The Cytoplasm and Cellular Organelles; OpenStax.

If the phospholipids or modified proteins are not destined to stay in the RER, they will be packaged within vesicles and transported from the RER by budding from the membrane (Figure 3). Since the RER is engaged in modifying proteins that will be secreted from the cell, it is abundant in cells that secrete proteins, such as liver cells.

The smooth endoplasmic reticulum (SER) is continuous with the RER but has few or no ribosomes on its cytoplasmic surface (see Figures 1-3). The SER’s functions include synthesis of carbohydrates, lipids, and steroid hormones; detoxification of medications and poisons; alcohol metabolism; and storage of calcium ions.

In muscle cells, a specialized SER called the sarcoplasmic reticulum is responsible for storage of the calcium ions that are needed to trigger the coordinated contractions of the muscle cells.

Figure 3 The endomembrane system works to modify, package, and transport lipids and proteins. (credit: modification of work by Magnus Manske)



The Golgi Apparatus

We have already mentioned that vesicles can bud from the ER, but where do the vesicles go? Before reaching their final destination, the lipids or proteins within the transport vesicles need to be sorted, packaged, and tagged so that they wind up in the right place. The sorting, tagging, packaging, and distribution of lipids and proteins take place in the Golgi apparatus (also called the Golgi body), a series of flattened membranous sacs (Figure 1).

Figure 1 The Golgi apparatus in this transmission electron micrograph of a white blood cell is visible as a stack of semicircular flattened rings in the lower portion of this image. Several vesicles can be seen near the Golgi apparatus. (credit: modification of work by Louisa Howard; scale-bar data from Matt Russell)

The Golgi apparatus has a receiving face near the endoplasmic reticulum (the cis face) and a releasing face on the side away from the ER, toward the cell membrane (the trans face) (Figure 2). The transport vesicles that form from the ER travel to the receiving face, fuse with it, and empty their contents into the lumen (empty space inside) of the Golgi apparatus. As the proteins and lipids travel through the Golgi, they undergo further modifications. The most frequent modification is the addition of short chains of sugar molecules. The newly modified proteins and lipids are then tagged with small molecular groups to enable them to be routed to their proper destinations.

Figure 2 Diagram of the Golgi apparatus showing the cis and trans faces. The cis face would be near the nucleus while the trans face would be facing the cell membrane. Credit Kelvinsong; Wikimedia

Finally, the modified and tagged proteins are packaged into vesicles that bud from the opposite face of the Golgi. While some of these vesicles, transport vesicles, deposit their contents into other parts of the cell where they will be used, others, secretory vesicles, fuse with the plasma membrane and release their contents outside the cell.

The amount of Golgi in different cell types again illustrates that form follows function within cells. Cells that engage in a great deal of secretory activity (such as cells of the salivary glands that secrete digestive enzymes or cells of the immune system that secrete antibodies) have abundant Golgi.

In plant cells, the Golgi has an additional role of synthesizing polysaccharides, some of which are incorporated into the cell wall and some of which are used in other parts of the cell.



Vesicles and Vacuoles, Lysosomes, and Peroxisomes

Vesicles and Vacuoles

Vesicles and vacuoles are membrane-bound sacs that function in storage and transport. Vacuoles are somewhat larger than vesicles, and the membrane of a vacuole does not fuse with the membranes of other cellular components. Vesicles can fuse with other membranes within the cell system (Figure 1). Additionally, enzymes within plant vacuoles can break down macromolecules.

Figure 1 The endomembrane system works to modify, package, and transport lipids and proteins. (credit: modification of work by Magnus Manske)

The Central Vacuole (plants)

Previously, we mentioned vacuoles as essential components of plant cells. If you look at Figure 2, you will see that plant cells each have a large, central vacuole that occupies most of the cell.

Figure 2 Diagram of a plant cell.

The central vacuole plays a key role in regulating the cell’s concentration of water in changing environmental conditions. In plant cells, the liquid inside the central vacuole provides turgor pressure, which is the outward pressure caused by the fluid inside the cell. Have you ever noticed that if you forget to water a plant for a few days, it wilts? That is because as the water concentration in the soil becomes lower than the water concentration in the plant, water moves out of the central vacuoles and cytoplasm and into the soil. As the central vacuole shrinks, it leaves the cell wall unsupported. This loss of support to the cell walls of a plant results in the wilted appearance. Additionally, the fluid in the central vacuole has a very bitter taste, which discourages consumption by insects and animals. The central vacuole also functions to store proteins in developing seed cells.


Lysosomes

In animal cells, the lysosomes are the cell’s “garbage disposal.” Digestive enzymes within the lysosomes aid the breakdown of proteins, polysaccharides, lipids, nucleic acids, and even worn-out organelles. In single-celled eukaryotes, lysosomes are important for digestion of the food they ingest and the recycling of organelles. These enzymes are active at a much lower pH (more acidic) than those located in the cytoplasm. Many reactions that take place in the cytoplasm could not occur at a low pH, thus the advantage of compartmentalizing the eukaryotic cell into organelles is apparent.

Lysosomes also use their hydrolytic enzymes to destroy disease-causing organisms that might enter the cell. A good example of this occurs in a group of white blood cells called macrophages, which are part of your body’s immune system. In a process known as phagocytosis, a section of the plasma membrane of the macrophage invaginates (folds in) and engulfs a pathogen. The invaginated section, with the pathogen inside, then pinches itself off from the plasma membrane and becomes a vesicle. The vesicle fuses with a lysosome. The lysosome’s hydrolytic enzymes then destroy the pathogen (Figure 3).

Lysosomes are basically small bags of membrane containing enzymes, so they look structurally similar to a small vacuole.

Figure 3 A macrophage has phagocytized a potentially pathogenic bacterium into a vesicle, which then fuses with a lysosome within the cell so that the pathogen can be destroyed. Other organelles are present in the cell, but for simplicity, are not shown.


Peroxisomes

Peroxisomes are small, round organelles enclosed by single membranes (so again, they look similar to small vacuoles). They carry out oxidation reactions that break down fatty acids and amino acids. They also detoxify many poisons that may enter the body. Alcohol is detoxified by peroxisomes in liver cells. A byproduct of these oxidation reactions is hydrogen peroxide, H₂O₂, which is contained within the peroxisomes to prevent the chemical from causing damage to cellular components outside of the organelle. Hydrogen peroxide is safely broken down by peroxisomal enzymes into water and oxygen.



Mitochondria and Chloroplasts


Mitochondria

Mitochondria (singular = mitochondrion) are often called the “powerhouses” or “energy factories” of a cell because they are responsible for making adenosine triphosphate (ATP), the cell’s main energy-carrying molecule. The formation of ATP from the breakdown of glucose is known as cellular respiration. Mitochondria are oval-shaped, double-membrane organelles (Figure 1) that have their own ribosomes and DNA. Each membrane is a phospholipid bilayer embedded with proteins. The inner membrane has folds called cristae, which increase its surface area. The area surrounded by the folds is called the mitochondrial matrix. The cristae and the matrix have different roles in cellular respiration.

In keeping with our theme of form following function, it is important to point out that muscle cells have a very high concentration of mitochondria because muscle cells need a lot of energy to contract.


Figure 1 This transmission electron micrograph shows a mitochondrion as viewed with an electron microscope. Notice the inner and outer membranes, the cristae, and the mitochondrial matrix. (credit: modification of work by Matthew Britton; scale-bar data from Matt Russell)

Chloroplasts

Like mitochondria, chloroplasts have their own DNA and ribosomes. Chloroplasts function in photosynthesis and can be found in the cells of eukaryotes such as plants and algae. Carbon dioxide (CO₂), water, and light energy are used to make glucose and oxygen in photosynthesis. This is the major difference between plants and animals: plants (autotrophs) are able to make their own food, like glucose, whereas animals (heterotrophs) must rely on other organisms for their organic compounds or food source.

Like mitochondria, chloroplasts have outer and inner membranes, but within the space enclosed by a chloroplast’s inner membrane is a set of interconnected and stacked, fluid-filled membrane sacs called thylakoids (Figure 2). Each stack of thylakoids is called a granum (plural = grana). The fluid enclosed by the inner membrane and surrounding the grana is called the stroma.


Figure 2 This simplified diagram of a chloroplast shows the outer membrane, inner membrane, thylakoids, grana, and stroma.

The chloroplasts contain a green pigment called chlorophyll, which captures the energy of sunlight for photosynthesis. Like plant cells, photosynthetic protists also have chloroplasts. Some bacteria also perform photosynthesis, but they do not have chloroplasts. Their photosynthetic pigments are located in the thylakoid membrane within the cell itself.

Theory of Endosymbiosis

We have mentioned that both mitochondria and chloroplasts contain DNA and ribosomes. Have you wondered why? Strong evidence points to endosymbiosis as the explanation.

Symbiosis is a relationship in which organisms from two separate species live in close association and typically exhibit specific adaptations to each other. Endosymbiosis (endo- = within) is a relationship in which one organism lives inside the other. Endosymbiotic relationships abound in nature. Microbes that produce vitamin K live inside the human gut. This relationship is beneficial for us because we are unable to synthesize vitamin K. It is also beneficial for the microbes because they are protected from other organisms and are provided a stable habitat and abundant food by living within the large intestine.

Scientists have long noticed that bacteria, mitochondria, and chloroplasts are similar in size. We also know that mitochondria and chloroplasts have DNA and ribosomes, just as bacteria do. Scientists believe that host cells and bacteria formed a mutually beneficial endosymbiotic relationship when the host cells ingested aerobic bacteria and cyanobacteria but did not destroy them. Through evolution, these ingested bacteria became more specialized in their functions, with the aerobic bacteria becoming mitochondria and the photosynthetic bacteria becoming chloroplasts.




The Cell Wall

The cell wall is a rigid covering that protects the cell, provides structural support, and gives shape to the cell. Cell walls are found in both prokaryotes and eukaryotes, although not all cells have cell walls. In Figure 1, the diagram of a plant cell, the structure external to the plasma membrane is the cell wall. The cell wall is the reason why vegetables such as celery crunch when you bite into them.

Fungal and protist cells also have cell walls, but they are structurally different from those found in plants.

Figure 1 Note that the cell wall is located outside the cell membrane.

While the chief component of prokaryotic cell walls is peptidoglycan, the major organic molecule in the plant cell wall is cellulose, a polysaccharide made up of long, straight chains of glucose units. When nutritional information refers to dietary fiber, it is referring to the cellulose content of food. Fungal cell walls are made up of a molecule called chitin.

Animal cells do not have cell walls. Steak does not crunch when you bite it.




Extracellular matrix and intercellular junctions

Extracellular Matrix of Animal Cells

Most animal cells release materials into the extracellular space. The primary components of these materials are glycoproteins and the protein collagen. Collectively, these materials are called the extracellular matrix (Figure 1). Not only does the extracellular matrix hold the cells together to form a tissue, but it also allows the cells within the tissue to communicate with each other.


Figure 1 The extracellular matrix consists of a network of substances secreted by cells.

Blood clotting provides an example of the role of the extracellular matrix in cell communication.

When the cells lining a blood vessel are damaged, they display a protein receptor called tissue factor. When tissue factor binds with another factor in the extracellular matrix, it causes platelets to adhere to the wall of the damaged blood vessel, stimulates adjacent smooth muscle cells in the blood vessel to contract (thus constricting the blood vessel), and initiates a series of steps that stimulate the platelets to produce clotting factors.

Intercellular Junctions

Cells can also communicate with each other by direct contact, referred to as intercellular junctions. There are some differences in the ways that plant and animal cells do this. Plasmodesmata (singular = plasmodesma) are junctions between plant cells, whereas animal cell contacts include tight and gap junctions, and desmosomes.

In general, long stretches of the plasma membranes of neighboring plant cells cannot touch one another because they are separated by the cell walls surrounding each cell. Plasmodesmata are numerous channels that pass between the cell walls of adjacent plant cells, connecting their cytoplasm and enabling signal molecules and nutrients to be transported from cell to cell (Figure 2a).


Figure 2 There are four kinds of connections between cells. (a) A plasmodesma is a channel between the cell walls of two adjacent plant cells. (b) Tight junctions join adjacent animal cells. (c) Desmosomes join two animal cells together. (d) Gap junctions act as channels between animal cells. (credit b, c, d: modification of work by Mariana Ruiz Villarreal)

A tight junction is a watertight seal between two adjacent animal cells (Figure 2b). Proteins hold the cells tightly against each other. This tight adhesion prevents materials from leaking between the cells. Tight junctions are typically found in the epithelial tissue that lines internal organs and cavities, and composes most of the skin. For example, the tight junctions of the epithelial cells lining the urinary bladder prevent urine from leaking into the extracellular space.

Also found only in animal cells are desmosomes, which act like spot welds between adjacent epithelial cells (Figure 2c). They keep cells together in a sheet-like formation in organs and tissues that stretch, like the skin, heart, and muscles. 

Gap junctions in animal cells are like plasmodesmata in plant cells in that they are channels between adjacent cells that allow for the transport of ions, nutrients, and other substances that enable cells to communicate (Figure 2d). Structurally, however, gap junctions and plasmodesmata differ.




The Production of a Protein

Proteins are one of the most abundant organic molecules in living systems and have an incredibly diverse range of functions. Proteins are used to:

Each cell in a living system may contain thousands of different proteins, each with a unique function. Their structures, like their functions, vary greatly. They are all, however, polymers of amino acids, arranged in a linear sequence (Figure 1).

The functions of proteins are very diverse because proteins are made up of 20 chemically distinct amino acids that form long chains, and the amino acids can be in any order. The function of a protein depends on its shape, and its shape is determined by the order of its amino acids. Proteins are often hundreds of amino acids long, and they can have very complex shapes because there are so many different possible orders for the 20 amino acids!
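To get a feel for how many orders are possible, the short sketch below counts the distinct linear sequences for chains of a few lengths. It assumes only what the text states (20 amino acid types, any order allowed); the function name and the chosen chain lengths are illustrative, and real proteins occupy only a tiny fraction of these theoretical possibilities.

```python
AMINO_ACID_TYPES = 20  # number of distinct amino acids, as stated in the text


def possible_sequences(length: int) -> int:
    """Count the distinct amino acid orders for a chain of `length` residues."""
    if length < 0:
        raise ValueError("length must be non-negative")
    # Each position can hold any of the 20 amino acids independently.
    return AMINO_ACID_TYPES ** length


# Illustrative chain lengths (assumed for this example, not taken from the text).
for n in (2, 3, 10, 100):
    count = possible_sequences(n)
    print(f"{n:>3} amino acids: about {count:.3e} possible sequences")
```

Even a chain of only 100 amino acids, short by protein standards, has a number of possible sequences that is 131 digits long.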

Figure 1 Protein structure. The colored balls at the top of this diagram represent different amino acids. Amino acids are the subunits that are joined together by the ribosome to form a protein. This chain of amino acids then folds to form a complex 3D structure. (Credit: Lady of Hats from Wikipedia; public domain)

Contrary to what you may believe, proteins are not typically used as a source of energy by cells. Protein from your diet is broken down into individual amino acids which are reassembled by your ribosomes into proteins that your cells need. Ribosomes do not produce energy.


Figure 2 Examples of foods that contain high levels of protein. (“Protein” by National Cancer Institute is in the Public Domain)

The information to produce a protein is encoded in the cell’s DNA. When a protein is produced, a copy of the DNA is made (called mRNA) and this copy is transported to a ribosome. Ribosomes read the information in the mRNA and use that information to assemble amino acids into a protein. If the protein is going to be used within the cytoplasm of the cell, the ribosome creating the protein will be free-floating in the cytoplasm. If the protein is going to be targeted to the lysosome, become a component of the plasma membrane, or be secreted outside of the cell, the protein will be synthesized by a ribosome located on the rough endoplasmic reticulum (RER). After being synthesized, the protein will be carried in a vesicle from the RER to the cis face of the Golgi (the side facing the inside of the cell). As the protein moves through the Golgi, it can be modified. Once the final modified protein has been completed, it exits the Golgi in a vesicle that buds from the trans face. From there, the vesicle can be targeted to a lysosome or targeted to the plasma membrane. If the vesicle fuses with the plasma membrane, the protein will become part of the membrane or be ejected from the cell.


Figure 3 Diagram of a eukaryotic cell. (Photo credit: Mediran, Wikimedia. 14 Aug 2002)


Insulin is a protein hormone that is made by specific cells inside the pancreas called beta cells. When the beta cells sense that glucose (sugar) levels in the bloodstream are high, they produce insulin protein and secrete it outside of the cells into the bloodstream. Insulin signals cells to absorb sugar from the bloodstream. Many cells, such as muscle and fat cells, cannot absorb sugar efficiently without insulin. Insulin protein is first produced as an immature, inactive chain of amino acids (preproinsulin; see Figure 4). It contains a signal sequence that targets the immature protein to the rough endoplasmic reticulum, where it folds into the correct shape. The targeting sequence is then cut off of the amino acid chain to form proinsulin. This trimmed, folded protein is then shipped to the Golgi inside a vesicle. In the Golgi, more amino acids (chain C) are trimmed off of the protein to produce the final mature insulin. Mature insulin is stored inside special vesicles until a signal is received for it to be released into the bloodstream.


Figure 4 Insulin maturation. (Photo credit: Beta Cell Biology Consortium, Wikimedia. 2004. This picture is in the public domain.)




Summary Table of Prokaryotic and Eukaryotic Cells and Functions

Table 1 Components of Prokaryotic and Eukaryotic Cells and Functions

Cell Component | Function | Present in Prokaryotes | Present in Animal Cells | Present in Plant Cells
Plasma Membrane | Separates cell from external environment; controls passage of organic molecules, ions, water, oxygen, and wastes into and out of the cell | Yes | Yes | Yes
Cytoplasm | Provides structure to cell; site of many metabolic reactions; medium in which organelles are found | Yes | Yes | Yes
Nucleoid | Location of DNA | Yes | No | No
Nucleus | Cell organelle that houses DNA and directs synthesis of ribosomes and proteins | No | Yes | Yes
Ribosomes | Protein synthesis | Yes | Yes | Yes
Mitochondria | ATP production/cellular respiration | No | Yes | Yes
Peroxisomes | Oxidizes and breaks down fatty acids and amino acids, and detoxifies poisons | No | Yes | Yes
Vesicles and vacuoles | Storage and transport; digestive function in plant cells | No | Yes | Yes
Centrosome | Unspecified role in cell division in animal cells; organizing center of microtubules in animal cells | No | Yes | No
Lysosomes | Digestion of macromolecules; recycling of worn-out organelles | No | Yes | No
Cell wall | Protection, structural support and maintenance of cell shape | Yes, primarily peptidoglycan in bacteria but not Archaea | No | Yes, primarily cellulose
Chloroplasts | Photosynthesis | No | No | Yes
Endoplasmic reticulum | Modifies proteins and synthesizes lipids | No | Yes | Yes
Golgi apparatus | Modifies, sorts, tags, packages, and distributes lipids and proteins | No | Yes | Yes
Cytoskeleton | Maintains cell’s shape, secures organelles in specific positions, allows cytoplasm and vesicles to move within the cell, and enables unicellular organisms to move independently | Yes | Yes | Yes
Flagella | Cellular locomotion | Some | Some | No, except for some plant sperm.
Cilia | Cellular locomotion, movement of particles along extracellular surface of plasma membrane, and filtration | No | Some | No

Table 1 This table provides the components of prokaryotic and eukaryotic cells and their respective functions.




The Cell Membrane and Transport

Learning Objectives

By the end of this section, you will be able to:

  • Explain how the structure of cell membranes leads to their various functions, including selective permeability, transport, and cell signaling.

The plasma membrane, which is also called the cell membrane, has many functions, but the most basic one is to define the borders of the cell and keep the cell functional. The plasma membrane is selectively permeable. This means that the membrane allows some materials to freely enter or leave the cell, while other materials cannot move freely, but require the use of a specialized structure, and occasionally, even energy investment for crossing.




The Plasma Membrane

Cells closely control the exchange of substances in and out of the cell.  Some substances are excluded, others are taken in, and still others are excreted – all in controlled quantities. Although the plasma membrane encloses the cell’s borders, it is far from being a static barrier; it is dynamic and constantly in flux. The plasma membrane must be sufficiently flexible to allow certain cells, such as red blood cells and white blood cells, to change shape as they pass through narrow capillaries. In addition to these more obvious functions, the surface of the plasma membrane carries markers which allow cells to recognize one another.  This is vital as these markers play a role in the “self” versus “non-self” distinction of the immune response.

Fluid Mosaic Model

In 1972, S. J. Singer and Garth L. Nicolson proposed a new model of the plasma membrane. This theory, compared to earlier theories, best explains both microscopic observations and the function of the plasma membrane. This theory is called the fluid mosaic model. The model has evolved somewhat over time, but still best accounts for the structure and functions of the plasma membrane as we now understand them. The fluid mosaic model describes the structure of the plasma membrane as comprised of diverse components—including phospholipids, cholesterol, proteins, and carbohydrates—that are able to flow and change position, while maintaining the basic integrity of the membrane. Both phospholipid molecules and embedded proteins are able to move laterally in the membrane. The fluidity of the plasma membrane is necessary for the activities of certain enzymes and transport molecules within the membrane.

Plasma membranes range from 5 to 10 nm in thickness. As a comparison, human red blood cells, visible via light microscopy, are approximately 8 μm wide, or approximately 1,000 times wider than a plasma membrane is thick.
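The "approximately 1,000 times" comparison can be checked with a quick unit conversion. The sketch below uses only the figures given above (a membrane of 5 to 10 nm and a red blood cell of about 8 μm); the function and variable names are illustrative.

```python
NM_PER_UM = 1_000  # 1 micrometer = 1,000 nanometers


def size_ratio(cell_um: float, membrane_nm: float) -> float:
    """Return how many times larger the cell is than the membrane is thick."""
    # Convert the cell size to nanometers so both values share one unit.
    return cell_um * NM_PER_UM / membrane_nm


red_blood_cell_um = 8.0  # approximate red blood cell size from the text

for membrane_nm in (5.0, 10.0):  # the two ends of the stated membrane range
    ratio = size_ratio(red_blood_cell_um, membrane_nm)
    print(f"{membrane_nm:.0f} nm membrane: red blood cell is about {ratio:,.0f} times larger")
```

The two ends of the range give ratios of 1,600 and 800, which is why the comparison is stated as roughly 1,000.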

Figure 1 The fluid mosaic model of the plasma membrane structure describes the plasma membrane as a fluid combination of phospholipids, cholesterol, proteins, and carbohydrates.

Components of the Plasma Membrane

The plasma membrane is made up primarily of a bilayer of phospholipids with embedded proteins, carbohydrates, glycolipids, and glycoproteins, and, in animal cells, cholesterol (Figure 1).


The main fabric of the membrane is composed of two layers of phospholipid molecules, and the polar ends of these molecules (which look like a collection of balls in an artist’s rendition of the model) (Figure 2) are in contact with aqueous fluid both inside and outside the cell. Thus, both surfaces of the plasma membrane are hydrophilic (“water loving”). In contrast, the interior of the membrane, between its two surfaces, is a hydrophobic (“water fearing”) or nonpolar region because of the fatty acid tails. This region has no attraction for water or other polar molecules.


Figure 2 Phospholipid bilayer. “Extracellular” = outside the cell; “Intracellular” = inside the cell. Photo credit: OpenStax Anatomy and Physiology.

A phospholipid molecule (Figure 3) consists of a three-carbon glycerol backbone with two fatty acid molecules attached to carbons 1 and 2, and a phosphate-containing group attached to the third carbon. This arrangement gives the overall molecule an area described as its head (the phosphate-containing group), which has a polar character or negative charge, and an area called the tail (the fatty acids), which has no charge. The head can form hydrogen bonds, but the tail cannot.


Figure 3 This phospholipid molecule is composed of a hydrophilic head and two hydrophobic tails. The hydrophilic head group consists of a phosphate-containing group attached to a glycerol molecule. The hydrophobic tails, each containing either a saturated or an unsaturated fatty acid, are long hydrocarbon chains.


Proteins make up the second major chemical component of plasma membranes (see Figure 1). Integral proteins are embedded in the plasma membrane and may span all or part of the membrane (Figure 1). Integral proteins may serve as channels or pumps to move materials into or out of the cell. Peripheral proteins are found on the exterior or interior surfaces of membranes, attached either to integral proteins or to phospholipid molecules (Figure 1). Both integral and peripheral proteins may serve as enzymes, as structural attachments for the fibers of the cytoskeleton, or as part of the cell’s recognition sites.

The recognition sites on the plasma membrane are called receptors, which are attachment sites for substances that interact with the cell. Each receptor is structured to bind with a specific substance. The binding of a specific substance to its receptor on the plasma membrane can activate processes within the interior of the cell, such as activating enzymes involved in metabolic pathways. These metabolic pathways might be vital for providing the cell with energy, making substances for the cell, or breaking down cellular waste or toxins for disposal. Likewise, extracellular hormones and neurotransmitters bind to plasma membrane receptors, which transmit a signal into the cell to intracellular molecules. Some recognition sites are used by viruses as attachment points. Although receptors are highly specific, pathogens like viruses may evolve to exploit them to gain entry to a cell by mimicking the specific substance that the receptor is meant to bind. This specificity helps to explain why human immunodeficiency virus (HIV) or any of the five types of hepatitis viruses invades only specific cells.

Cystic fibrosis is caused by a defect in an integral protein in the cell membrane that acts as a channel. This protein, CFTR, moves ions from one side of the membrane to the other. When CFTR does not function correctly, very thick mucus builds up in the lungs and digestive tract.


When the CFTR channel protein is functioning correctly (1), ions (small balls) are able to pass through the membrane. When it is not functioning correctly (2), ions are unable to cross the membrane. Photo credit: LBudd14,  May, 2013. Wikimedia.


Carbohydrates are the third major component of plasma membranes. They are always found on the exterior surface of cells and are bound either to proteins (forming glycoproteins) or to lipids (forming glycolipids). These carbohydrate chains may consist of 2–60 monosaccharide units and may be either straight or branched. Along with peripheral proteins, carbohydrates form specialized sites on the cell surface that allow cells to recognize each other. These sites have unique patterns that allow the cell to be recognized, much the way that the facial features unique to each person allow him or her to be recognized. This recognition function is very important to cells, as it allows the immune system to differentiate between body cells (called “self”) and foreign cells or tissues (called “non-self”). Similar types of glycoproteins and glycolipids are found on the surfaces of viruses and may change frequently, preventing immune cells from recognizing and attacking them.

The carbohydrates that make up glycoproteins are responsible for determining human A, B, O blood types. These glycoproteins are recognized by the immune system, which leads to incompatibility in blood types.

ABO Blood types. In this figure, the membrane carbohydrate is represented by the “lollipops”. They are termed “antigens”. Photo credit: InvictaHOG, 2006. Wikimedia.

Membrane Fluidity

The mosaic characteristic of the membrane, described in the fluid mosaic model, helps to illustrate its nature. The proteins and other components that exist in the membrane can move with respect to each other, rather like boats floating on a lake. The membrane is not like a balloon, however, that can expand and contract; rather, it is fairly rigid and can burst if penetrated or if a cell takes in too much water. However, because of its mosaic nature, a very fine needle can easily penetrate a plasma membrane without causing it to burst, and the membrane will flow and self-seal when the needle is extracted.

The mosaic characteristics of the membrane explain some but not all of its fluidity. There are two other factors that help maintain this fluid characteristic. One factor is the nature of the phospholipids themselves. The structure of the fatty acid tails in each phospholipid can make the membrane more dense and rigid, or less dense and flexible.  The relative fluidity of the membrane is particularly important in a cold environment. A cold environment tends to make membranes less fluid and more susceptible to rupturing. Many organisms (fish are one example) are capable of adapting to cold environments by changing the proportion of different types of fatty acids in their membranes in response to the lowering of the temperature.

Animals have an additional membrane constituent that assists in maintaining fluidity. Cholesterol, which lies alongside the phospholipids in the membrane, tends to dampen the effects of temperature on the membrane. This lipid functions as a buffer, preventing lower temperatures from inhibiting fluidity and preventing increased temperatures from increasing fluidity too much. Cholesterol thus extends, in both directions, the range of temperature in which the membrane is appropriately fluid and consequently functional. Cholesterol also serves other functions, such as organizing clusters of transmembrane proteins into lipid rafts.




Transport Across Membranes

Plasma membranes act not only as barriers, but also as gatekeepers. They must allow needed substances to enter and cell products to leave the cell, while preventing entrance of harmful material and exit of essential material. In other words, plasma membranes are selectively permeable: they allow some substances through but not others. If the membrane were to lose this selectivity, the cell would no longer be able to maintain homeostasis, or to sustain itself, and it would be destroyed. Some cells require larger amounts of specific substances than other cells; they must have a way of obtaining these materials from the extracellular fluids.

This may happen passively, as certain materials move back and forth, or the cell may have special mechanisms that ensure transport. Most cells expend most of their energy, in the form of adenosine triphosphate (ATP), to create and maintain an uneven distribution of ions on the opposite sides of their membranes. The structure of the plasma membrane contributes to these functions.

Selective Permeability

Plasma membranes are asymmetric, meaning that despite the mirror image formed by the phospholipids, the interior of the membrane is not identical to the exterior of the membrane. Integral proteins that act as channels or pumps work in one direction. Carbohydrates, attached to lipids or proteins, are also found on the exterior surface of the plasma membrane.

These carbohydrate complexes help the cell bind substances in the extracellular fluid that the cell needs. This adds considerably to the selective nature of plasma membranes.

Recall that plasma membranes have hydrophilic and hydrophobic regions. This characteristic helps the movement of certain materials through the membrane and hinders the movement of others. Lipid-soluble material can easily slip through the hydrophobic lipid core of the membrane. Substances such as the fat-soluble vitamins A, D, E, and K readily pass through the plasma membranes in the digestive tract and other tissues. Fat-soluble drugs also gain easy entry into cells and are readily transported into the body’s tissues and organs. Molecules of oxygen and carbon dioxide have no charge and pass through by simple diffusion.

Polar substances, with the exception of water, present problems for the membrane. While some polar molecules connect easily with the outside of a cell, they cannot readily pass through the lipid core of the plasma membrane. Additionally, whereas small ions could easily slip through the spaces in the mosaic of the membrane, their charge prevents them from doing so. Ions such as sodium, potassium, calcium, and chloride must have a special means of penetrating plasma membranes. Simple sugars and amino acids also need help with transport across plasma membranes.




Passive Transport: Diffusion

The most direct forms of membrane transport are passive. Passive transport is a naturally occurring phenomenon and does not require the cell to expend energy to accomplish the movement. In passive transport, substances move from an area of higher concentration to an area of lower concentration in a process called diffusion. A physical space in which the concentration of a single substance differs from one region to another is said to have a concentration gradient.


Diffusion is a passive process of transport. A single substance tends to move from an area of high concentration to an area of low concentration until the concentration is equal across the space. You are familiar with diffusion of substances through the air. For example, think about someone opening a bottle of perfume in a room filled with people. The perfume is at its highest concentration in the bottle and is at its lowest at the edges of the room. The perfume vapor will diffuse, or spread away, from the bottle, and gradually, more and more people will smell the perfume as it spreads. Materials move within the cell’s cytosol by diffusion, and certain materials move through the plasma membrane by diffusion (Figure 1). Diffusion expends no energy. Rather, the different concentrations of materials in different areas are a form of potential energy, and diffusion is the dissipation of that potential energy as materials move down their concentration gradients, from high to low.

Figure 1 Diffusion through a permeable membrane follows the concentration gradient of a substance, moving the substance from an area of high concentration to one of low concentration. (credit: modification of work by Mariana Ruiz Villarreal)

Each separate substance in a medium, such as the extracellular fluid, has its own concentration gradient, independent of the concentration gradients of other materials. Additionally, each substance will diffuse according to that gradient.
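The behavior described above can be sketched with a toy model of two equal-volume compartments separated by a permeable membrane. This is a deliberately simplified illustration, not a physical simulation: the `rate` value, the number of steps, the starting concentrations, and the substance names are all assumptions chosen for the example. At each step, a fixed fraction of the concentration difference moves from the more concentrated side to the less concentrated side, and each substance is tracked independently of the other.

```python
def diffuse(left, right, rate=0.1, steps=50):
    """Return the concentration history of two equal-volume compartments.

    At every step, a fraction `rate` of the concentration difference moves
    from the more concentrated side to the less concentrated side.
    """
    if not 0 < rate <= 0.5:
        # A larger fraction would overshoot equilibrium in a single step.
        raise ValueError("rate must be in the interval (0, 0.5]")
    history = [(left, right)]
    for _ in range(steps):
        flux = rate * (left - right)  # positive when the left side is more concentrated
        left -= flux
        right += flux
        history.append((left, right))
    return history


# Two hypothetical substances with opposite gradients across the same membrane.
substances = {
    "substance A": (10.0, 0.0),
    "substance B": (2.0, 6.0),
}

for name, (left, right) in substances.items():
    history = diffuse(left, right)
    for step in (0, 5, 50):
        left_c, right_c = history[step]
        print(f"{name}, step {step:>2}: left={left_c:.3f} right={right_c:.3f}")
```

In this sketch, substance A moves left to right while substance B moves right to left, and each settles at its own equal concentration. The total amount of each substance is unchanged; only its distribution evens out.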

Several factors affect the rate of diffusion:




Passive Transport: Facilitated Transport

In facilitated transport, also called facilitated diffusion, material moves across the plasma membrane with the assistance of transmembrane proteins down a concentration gradient (from high to low concentration) without the expenditure of cellular energy. However, the substances that undergo facilitated transport would otherwise not diffuse easily or quickly across the plasma membrane. The solution to moving polar substances and other substances across the plasma membrane rests in the proteins that span the membrane. The material being transported is first attached to protein or glycoprotein receptors on the exterior surface of the plasma membrane. This allows the material that is needed by the cell to be removed from the extracellular fluid. The substances are then passed to specific integral proteins that facilitate their passage, because they form channels or pores that allow certain substances to pass through the membrane.


The integral proteins involved in facilitated transport are collectively referred to as transport proteins, and they function as either channels for the material or carriers. In both cases, they are transmembrane proteins (they span across the membrane). Channels are specific for the substance that is being transported. Channel proteins have hydrophilic domains exposed to the intracellular and extracellular fluids; they additionally have a hydrophilic channel through their core that provides a hydrated opening through the membrane layers (Figure 1). Passage through the channel allows polar compounds to avoid the nonpolar central layer of the plasma membrane that would otherwise slow or prevent their entry into the cell. Aquaporins are channel proteins that allow water to pass through the membrane at a very high rate.

Figure 1 Facilitated transport moves substances down their concentration gradients. They may cross the plasma membrane with the aid of channel proteins. (credit: modification of work by Mariana Ruiz Villarreal)


Carrier Proteins

Another type of protein embedded in the plasma membrane is a carrier protein. This aptly named protein binds a substance and, in doing so, triggers a change of its own shape, moving the bound molecule from the outside of the cell to its interior (Figure 2); depending on the gradient, the material may move in the opposite direction. Carrier proteins are typically specific for a single substance, which adds to the overall selectivity of the plasma membrane. The exact mechanism for the change of shape is poorly understood. Proteins can change shape when their hydrogen bonds are affected, but this may not fully explain the mechanism. Because there are a finite number of carrier proteins in any membrane, the cell can have trouble transporting enough of a material to function properly. When all of the carrier proteins are bound to their ligands, they are saturated and the rate of transport is at its maximum. Increasing the concentration gradient at this point will not result in an increased rate of transport.
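Saturation can be illustrated with a simple saturable-binding curve. The text does not give an equation, so the form used below (rate = maximum rate × concentration / (half-saturation constant + concentration)) is an assumption borrowed from standard enzyme-style kinetics, and the parameter values are hypothetical numbers chosen only to show the shape of the curve.

```python
def carrier_transport_rate(concentration, max_rate, half_saturation):
    """Transport rate for a finite pool of carrier proteins (assumed model).

    `max_rate` is the rate when every carrier is occupied; `half_saturation`
    is the concentration at which half of the carriers are occupied.
    """
    if concentration < 0 or half_saturation <= 0:
        raise ValueError("concentration must be >= 0 and half_saturation > 0")
    occupied_fraction = concentration / (half_saturation + concentration)
    return max_rate * occupied_fraction


MAX_RATE = 1000.0        # hypothetical molecules per second at full saturation
HALF_SATURATION = 10.0   # hypothetical concentration, arbitrary units

for concentration in (1, 10, 100, 1_000, 10_000):
    rate = carrier_transport_rate(concentration, MAX_RATE, HALF_SATURATION)
    print(f"concentration {concentration:>6}: rate = {rate:7.1f} molecules/s")
```

At low concentrations the rate rises almost in proportion to concentration, but each further tenfold increase adds less and less. The rate approaches the maximum without exceeding it, which is the plateau the paragraph above describes.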

An example of this process occurs in the kidney. Glucose, water, salts, ions, and amino acids needed by the body are filtered in one part of the kidney. This filtrate, which includes glucose, is then reabsorbed in another part of the kidney. Because there are only a finite number of carrier proteins for glucose, if more glucose is present than the proteins can handle, the excess is not transported and it is excreted from the body in the urine. In a diabetic individual, this is described as “spilling glucose into the urine.” A different group of carrier proteins called glucose transport proteins, or GLUTs, is involved in transporting glucose and other hexose sugars through plasma membranes within the body.

Channel and carrier proteins transport material at different rates. Channel proteins transport much more quickly than do carrier proteins. Channel proteins facilitate diffusion at a rate of tens of millions of molecules per second, whereas carrier proteins work at a rate of a thousand to a million molecules per second.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

Text adapted from: OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016


Passive Transport: Osmosis

Osmosis is the diffusion of water through a semipermeable membrane according to the concentration gradient of water across the membrane. Whereas diffusion transports material across membranes and within cells, osmosis transports only water across a membrane and the membrane limits the diffusion of solutes in the water. Osmosis is a special case of diffusion. Water, like other substances, moves from an area of higher concentration to one of lower concentration. Imagine a beaker with a semipermeable membrane, separating the two sides or halves (Figure 3). On both sides of the membrane, the water level is the same, but there are different concentrations on each side of a dissolved substance, or solute, that cannot cross the membrane. If the volume of the water is the same, but the concentrations of solute are different, then there are also different concentrations of water, the solvent, on either side of the membrane.


Figure 3 In osmosis, water always moves from an area of higher concentration (of water) to one of lower concentration (of water). In this system, the solute cannot pass through the selectively permeable membrane.

A principle of diffusion is that the molecules move around and will spread evenly throughout the medium if they can. However, only the material capable of getting through the membrane will diffuse through it. In this example, the solute cannot diffuse through the membrane, but the water can. Water has a concentration gradient in this system. Therefore, water will diffuse down its concentration gradient, crossing the membrane to the side where it is less concentrated. This diffusion of water through the membrane, called osmosis, will continue until the concentration gradient of water goes to zero. Osmosis proceeds constantly in living systems.


Tonicity describes the amount of solute in a solution. The measure of the tonicity of a solution, or the total amount of solutes dissolved in a specific amount of solution, is called its osmolarity. Three terms—hypotonic, isotonic, and hypertonic—are used to relate the osmolarity of a cell to the osmolarity of the extracellular fluid that contains the cells. All three of these terms are a comparison between two different solutions (for example, inside a cell compared to outside the cell).

In a hypotonic solution, such as tap water, the extracellular fluid has a lower concentration of solutes than the fluid inside the cell, and water enters the cell. (In living systems, the point of reference is always the cytoplasm, so the prefix hypo– means that the extracellular fluid has a lower concentration of solutes, or a lower osmolarity, than the cell cytoplasm.) It also means that the extracellular fluid has a higher concentration of water than does the cell. In this situation, water will follow its concentration gradient and enter the cell. This may cause an animal cell to burst, or lyse.

In a hypertonic solution, such as seawater, the extracellular fluid has a higher concentration of solutes than the cell’s cytoplasm (the prefix hyper– refers to this higher concentration), so the fluid contains less water than the cell does. Because the cell has a lower concentration of solutes, the water will leave the cell. In effect, the solute is drawing the water out of the cell. This may cause an animal cell to shrivel, or crenate.

In an isotonic solution, the extracellular fluid has the same osmolarity as the cell. If the concentration of solutes of the cell matches that of the extracellular fluid, there will be no net movement of water into or out of the cell. The cell will retain its “normal” appearance. Blood cells in hypertonic, isotonic, and hypotonic solutions take on characteristic appearances (Figure 4).

Remember that all three of these terms are comparisons between two solutions (i.e., inside and outside the cell). A solution can’t simply be hypotonic; that would be like saying that Bob is taller. That doesn’t make sense: you need to say that Bob is taller than Mike. You can say that the solution inside the cell is hypotonic to the solution outside the cell. That also means that the solution outside is hypertonic to the solution inside (just like Mike would be shorter than Bob).


Figure 4 Osmotic pressure changes the shape of red blood cells in hypertonic, isotonic, and hypotonic solutions. (credit: modification of work by Mariana Ruiz Villarreal)
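The three comparisons described above can be written as a short sketch. The function name is hypothetical, and the osmolarity values are rough, illustrative numbers (assumptions, not measurements from this text). As in the text, the cytoplasm is the point of reference.

```python
def describe_extracellular_fluid(extracellular_osmolarity, cytoplasm_osmolarity):
    """Compare the extracellular fluid with the cytoplasm, which is the reference."""
    if extracellular_osmolarity < cytoplasm_osmolarity:
        return "hypotonic", "water enters the cell"
    if extracellular_osmolarity > cytoplasm_osmolarity:
        return "hypertonic", "water leaves the cell"
    return "isotonic", "no net movement of water"


# Illustrative osmolarities in mOsm/L (assumed values).
CYTOPLASM = 300
for label, osmolarity in [("tap water", 5), ("matched saline", 300), ("seawater", 1000)]:
    tonicity, water_movement = describe_extracellular_fluid(osmolarity, CYTOPLASM)
    print(f"{label}: {tonicity} to the cell, {water_movement}")
```

The function always takes two values, which mirrors the point that tonicity is a comparison between two solutions and never a property of one solution alone.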

Some organisms, such as plants, fungi, bacteria, and some protists, have cell walls that surround the plasma membrane and prevent cell lysis. The plasma membrane can only expand to the limit of the cell wall, so the cell will not lyse. In fact, the cytoplasm in plants is always slightly hypertonic compared to the cellular environment, and water will always enter the plant cell if water is available. This influx of water produces turgor pressure, which stiffens the cell walls of the plant (Figure 5). In nonwoody plants, turgor pressure supports the plant. If the fluid surrounding the plant cells becomes hypertonic to the cytoplasm, as occurs in drought or if a plant is not watered adequately, water will leave the cell. Plants lose turgor pressure in this condition and wilt.


Figure 5 The turgor pressure within a plant cell depends on the tonicity of the solution that it is bathed in. (credit: modification of work by Mariana Ruiz Villarreal)


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

Text adapted from: OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016


Active Transport

Active transport mechanisms require the use of the cell’s energy, usually in the form of adenosine triphosphate (ATP). If a substance must move into the cell against its concentration gradient, that is, if the concentration of the substance inside the cell must be greater than its concentration in the extracellular fluid, the cell must use energy to move the substance. Some active transport mechanisms move small-molecular weight material, such as ions, through the membrane.

In addition to moving small ions and molecules through the membrane, cells also need to remove and take in larger molecules and particles. Some cells are even capable of engulfing entire unicellular microorganisms. You might have correctly hypothesized that the uptake and release of large particles by the cell requires energy. A large particle, however, cannot pass through the membrane, even with energy supplied by the cell.

Electrochemical Gradient

We have discussed simple concentration gradients, which are differential concentrations of a substance across a space or a membrane. However, in living systems gradients are more complex. Cells contain many proteins, most of which are negatively charged. Due to these negatively charged proteins, coupled with the movement of ions into and out of cells, there is an electrical gradient (a difference of charge) across the plasma membrane. The interior of living cells is electrically negative as compared to the extracellular fluid in which cells are bathed; at the same time, cells contain higher concentrations of potassium (K+) and lower concentrations of sodium (Na+) than does the extracellular fluid. Thus, in a living cell, the concentration gradient of Na+ promotes diffusion of the ion into the cell, and the electrical gradient of Na+ (a positive ion) also tends to drive it inward to the negatively charged interior. The situation is more complex, however, for other ions such as potassium. The electrical gradient of K+ promotes diffusion of the ion into the cell, but the concentration gradient of K+ promotes diffusion out of the cell (Figure 5). The combined gradient that affects an ion is called its electrochemical gradient, and it is especially important to muscle and nerve cells.


Figure 5 Electrochemical gradients arise from the combined effects of concentration gradients and electrical gradients. (credit: modification of work by “Synaptitude”/Wikimedia Commons)
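The two components of an electrochemical gradient can be compared with a small qualitative sketch. It reports only the direction in which each gradient pushes an ion, not the size of the force. The function name is hypothetical, and the concentrations are approximate, illustrative values in millimolar (assumptions, not figures from this text).

```python
def gradient_directions(charge, conc_inside, conc_outside, interior_is_negative=True):
    """Return (concentration direction, electrical direction) for one ion.

    Each direction is 'inward', 'outward', or 'none'.
    """
    # Concentration gradient: ions diffuse toward the side where they are less concentrated.
    if conc_outside > conc_inside:
        concentration = "inward"
    elif conc_outside < conc_inside:
        concentration = "outward"
    else:
        concentration = "none"

    # Electrical gradient: an ion is attracted to the side with the opposite charge.
    if charge == 0:
        electrical = "none"
    else:
        attracted_inward = (charge > 0) == interior_is_negative
        electrical = "inward" if attracted_inward else "outward"
    return concentration, electrical


# Approximate, illustrative concentrations in mM (assumed values).
print("Na+:", gradient_directions(charge=+1, conc_inside=12, conc_outside=145))
print("K+: ", gradient_directions(charge=+1, conc_inside=140, conc_outside=4))
```

For Na+ both gradients point inward, while for K+ they point in opposite directions, which is the more complex situation described above.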

Moving Against a Gradient

To move substances against a concentration or an electrochemical gradient, the cell must use energy. This energy is harvested from ATP that is generated through cellular metabolism. Active transport mechanisms, collectively called pumps or carrier proteins, work against electrochemical gradients. With the exception of ions, small substances constantly pass through plasma membranes. Active transport maintains concentrations of ions and other substances needed by living cells in the face of these passive changes. Much of a cell’s supply of metabolic energy may be spent maintaining these processes. As active transport mechanisms depend on cellular metabolism for energy, they are sensitive to many metabolic poisons that interfere with the supply of ATP.

Two mechanisms exist for the transport of small-molecular weight material and macromolecules. Primary active transport moves ions across a membrane and creates a difference in charge across that membrane. The primary active transport system uses ATP to move a substance, such as an ion, into the cell, and often at the same time, a second substance is moved out of the cell. The sodium-potassium pump, an important pump in animal cells, expends energy to move potassium ions into the cell and a different number of sodium ions out of the cell (Figure 6). The action of this pump results in a concentration and charge difference across the membrane.
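The bookkeeping of the sodium-potassium pump can be sketched as follows. The text above says only that the numbers of sodium and potassium ions differ; the sketch assumes the commonly reported stoichiometry of three Na+ out and two K+ in per ATP, which is not stated in this text. The function and constant names are hypothetical.

```python
# Commonly reported stoichiometry per pump cycle (an assumption, not given in the text).
SODIUM_OUT_PER_CYCLE = 3
POTASSIUM_IN_PER_CYCLE = 2
ATP_PER_CYCLE = 1


def run_pump(cycles):
    """Tally ions moved and ATP used over a number of pump cycles."""
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    sodium_out = SODIUM_OUT_PER_CYCLE * cycles
    potassium_in = POTASSIUM_IN_PER_CYCLE * cycles
    return {
        "sodium_out": sodium_out,
        "potassium_in": potassium_in,
        "atp_used": ATP_PER_CYCLE * cycles,
        # Both ions carry one positive charge, so the difference is net charge exported.
        "net_positive_charges_out": sodium_out - potassium_in,
    }


print(run_pump(1000))
```

Because more positive charges leave than enter in each cycle, the tally shows how the pump contributes to both a concentration difference and a charge difference across the membrane.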

Secondary active transport describes the movement of material using the energy of the electrochemical gradient established by primary active transport. Using the energy of the electrochemical gradient created by the primary active transport system, other substances such as amino acids and glucose can be brought into the cell through membrane channels. ATP itself is formed through secondary active transport using a hydrogen ion gradient in the mitochondrion.


Endocytosis is a type of active transport that moves particles, such as large molecules, parts of cells, and even whole cells, into a cell. There are different variations of endocytosis, but all share a common characteristic: The plasma membrane of the cell invaginates, forming a pocket around the target particle. The pocket pinches off, resulting in the particle being contained in a newly created vacuole that is formed from the plasma membrane.

Figure 7 Three variations of endocytosis are shown. (a) In one form of endocytosis, phagocytosis, the cell membrane surrounds the particle and pinches off to form an intracellular vacuole. (b) In another type of endocytosis, pinocytosis, the cell membrane surrounds a small volume of fluid and pinches off, forming a vesicle. (c) In receptor-mediated endocytosis, uptake of substances by the cell is targeted to a single type of substance that binds at the receptor on the external cell membrane. (credit: modification of work by Mariana Ruiz Villarreal) 

Phagocytosis is the process by which large particles, such as cells, are taken in by a cell. For example, when microorganisms invade the human body, a type of white blood cell called a neutrophil removes the invader through this process, surrounding and engulfing the microorganism, which is then destroyed by the neutrophil (Figure 7a).

A variation of endocytosis is called pinocytosis. This literally means “cell drinking” and was named at a time when the assumption was that the cell was purposefully taking in extracellular fluid. In reality, this process takes in solutes that the cell needs from the extracellular fluid (Figure 7b).

A targeted variation of endocytosis employs binding proteins in the plasma membrane that are specific for certain substances (Figure 7c). The particles bind to the proteins and the plasma membrane invaginates, bringing the substance and the proteins into the cell. If receptor-mediated endocytosis of a target substance is ineffective, the substance will not be removed from the tissue fluids or blood. Instead, it will stay in those fluids and increase in concentration.

Some human diseases are caused by a failure of receptor-mediated endocytosis. For example, the form of cholesterol termed low-density lipoprotein or LDL (also referred to as “bad” cholesterol) is removed from the blood by receptor-mediated endocytosis. In the human genetic disease familial hypercholesterolemia, the LDL receptors are defective or missing entirely. People with this condition have life-threatening levels of cholesterol in their blood, because their cells cannot clear the chemical from their blood.


In contrast to these methods of moving material into a cell is the process of exocytosis. Exocytosis is the opposite of the processes discussed above in that its purpose is to expel material from the cell into the extracellular fluid. A particle enveloped in membrane fuses with the interior of the plasma membrane. This fusion opens the membranous envelope to the exterior of the cell, and the particle is expelled into the extracellular space (Figure 8).



Figure 8 In exocytosis, a vesicle migrates to the plasma membrane, binds, and releases its contents to the outside of the cell. (credit: modification of work by Mariana Ruiz Villarreal)


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

Text adapted from: OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016


Cell Communication

Learning Objectives

Course Outcomes for this section:

Explain how basic units of cellular structure define the function of all living things.

d. Explain how the structure of cell membranes leads to their various functions, including selective permeability and transport, and cell signaling.

Imagine what life would be like if you and the people around you could not communicate. You would not be able to express your wishes to others, nor could you ask questions to find out more about your environment. Social organization is dependent on communication between the individuals that comprise that society; without communication, society would fall apart.

As with people, it is vital for individual cells to be able to interact with their environment. This is true whether a cell is growing by itself in a pond or is one of many cells that form a larger organism. In order to properly respond to external stimuli, cells have developed complex mechanisms of communication that can receive a message, transfer the information across the plasma membrane, and then produce changes within the cell in response to the message.

In multicellular organisms, cells send and receive chemical messages constantly to coordinate the actions of distant organs, tissues, and cells. The ability to send messages quickly and efficiently enables cells to coordinate and fine-tune their functions.

While the necessity for cellular communication in larger organisms seems obvious, even single-celled organisms communicate with each other. Yeast cells signal each other to aid mating. Some forms of bacteria coordinate their actions in order to form large complexes called biofilms or to organize the production of toxins to remove competing organisms. The ability of cells to communicate through chemical signals originated in single cells and was essential for the evolution of multicellular organisms. The efficient and error-free function of communication systems is vital for all life as we know it.


OpenStax, Biology. OpenStax CNX. October 13, 2017.


A Summary of Cell Communication

Receptors are protein molecules inside the target cell or on its surface that receive a chemical signal. Chemical signals are released by signaling cells in the form of small, usually volatile or soluble molecules called ligands. A ligand is a molecule that binds another specific molecule, in some cases, delivering a signal in the process. Ligands can thus be thought of as signaling molecules. Ligands and receptors exist in several varieties; however, a specific ligand will have a specific receptor that typically binds only that ligand.

There are two basic types of receptors: internal receptors and cell surface receptors.

There are several different types of ligands.

Once a ligand binds to a receptor, the signal is transmitted through the membrane and into the cytoplasm. Continuation of a signal in this manner is called signal transduction. Signal transduction only occurs with cell-surface receptors because internal receptors are able to interact directly with DNA in the nucleus to initiate protein synthesis.

Signal transduction pathways can be extremely complicated and involve large numbers of enzymes and other proteins. These pathways can help amplify a signal received by one receptor. There can also be different effects from the same ligand in different cell types due to different proteins present in different types of cells.

There are several categories of cellular responses to signals.

Stopping cell signaling pathways at the right time is just as important as starting them correctly. Tumors often display abnormal responses to cell signaling pathways.


Types of Receptors

A cell within a multicellular organism may need to signal to other cells that are at various distances from the original cell (Figure 1). Not all cells are affected by the same signals. Different types of signaling are used for different purposes.

The illustration shows four forms of chemical signaling. In autocrine signaling, a cell targets itself. In signaling across a gap junction, a cell targets a cell connected via gap junctions. In paracrine signaling, a cell targets a nearby cell. In endocrine signaling, a cell targets a distant cell via the bloodstream

Figure 1 In chemical signaling, a cell may target itself (autocrine signaling), a cell connected by gap junctions, a nearby cell (paracrine signaling), or a distant cell (endocrine signaling). Paracrine signaling acts on nearby cells, endocrine signaling uses the circulatory system to transport ligands, and autocrine signaling acts on the signaling cell. Signaling via gap junctions involves signaling molecules moving directly between adjacent cells.

Receptors are protein molecules inside the target cell or on its surface that receive a chemical signal. Chemical signals are released by signaling cells in the form of small, usually volatile or soluble molecules called ligands. A ligand is a molecule that binds another specific molecule, in some cases, delivering a signal in the process. Ligands can thus be thought of as signaling molecules. Ligands and receptors exist in several varieties; however, a specific ligand will have a specific receptor that typically binds only that ligand.

Internal receptors

Internal receptors, also known as intracellular or cytoplasmic receptors, are found in the cytoplasm of the cell and respond to hydrophobic ligand molecules that are able to travel across the plasma membrane. Once inside the cell, many of these molecules bind to proteins that act as regulators of mRNA synthesis. Recall that mRNA carries genetic information from the DNA in a cell’s nucleus out to the ribosome, where the protein is assembled. When the ligand binds to the internal receptor, a change in shape is triggered that exposes a DNA-binding site on the receptor protein. The ligand-receptor complex moves into the nucleus, then binds to specific regions of the DNA and promotes the production of mRNA from specific genes (Figure 2). Internal receptors can directly influence gene expression (how much of a specific protein is produced from a gene) without having to pass the signal on to other receptors or messengers.

This illustration shows a hydrophobic signaling molecule that diffuses across the plasma membrane and binds an intracellular receptor in the cytoplasm. The intracellular receptor-signaling molecule complex then travels to the nucleus and binds DNA.

Figure 2 Hydrophobic signaling molecules typically diffuse across the plasma membrane and interact with intracellular receptors in the cytoplasm. Many intracellular receptors are transcription factors that interact with DNA in the nucleus and regulate gene expression.

Cell-Surface Receptors

Cell-surface receptors, also known as transmembrane receptors, are proteins that are found attached to the cell membrane. These receptors bind to external ligand molecules (ligands that do not travel across the cell membrane). This type of receptor spans the plasma membrane and performs signal transduction, in which an extracellular signal is converted into an intracellular signal. Ligands that interact with cell-surface receptors do not have to enter the cell that they affect. Cell-surface receptors are also called cell-specific proteins or markers because they are specific to individual cell types.

Each cell-surface receptor has three main components: an external ligand-binding domain, a hydrophobic membrane-spanning region, and an intracellular domain inside the cell. The size and extent of each of these domains vary widely, depending on the type of receptor.

Figure 3 Cell-surface receptors function by transmitting a signal through the cell membrane. The ligand does not directly enter the cell. Photo credit Laozhengzz; Wikimedia Commons.

Cell-surface receptors are involved in most of the signaling in multicellular organisms. There are three general categories of cell-surface receptors: ion channel-linked receptors, G-protein-coupled receptors (also called G-protein-linked receptors), and enzyme-linked receptors.

Ion channel-linked receptors

Ion channel-linked receptors bind a ligand and open a channel through the membrane that allows specific ions to pass through. To form a channel, this type of cell-surface receptor has an extensive membrane-spanning region. When a ligand binds to the extracellular region of the channel, there is a conformational change in the protein’s structure that allows ions such as sodium, calcium, magnesium, and hydrogen to pass through (Figure 4).

This illustration shows a gated ion channel that is closed in the absence of a signaling molecule. When a signaling molecule binds, a pore in the middle of the channel opens, allowing ions to enter the cell.

Figure 4 Gated ion channels form a pore through the plasma membrane that opens when the signaling molecule binds. The open pore then allows ions to flow into or out of the cell.

G-protein-coupled receptors

G-protein-coupled receptors bind a ligand and activate a membrane protein called a G-protein. The activated G-protein then interacts with either an ion channel or an enzyme in the membrane (Figure 5). Before the ligand binds, the inactive G-protein can bind to a site on a specific receptor. Once the G-protein binds to the receptor, the G-protein changes shape, becomes active, and splits into two different subunits. One or both of these subunits may be able to activate other proteins as a result.

This illustration shows the activation pathway for a heterotrimeric G-protein, which has three subunits: alpha, beta, and gamma, all associated with the inside of the plasma membrane. When a signaling molecule binds to a G-protein-coupled receptor in the plasma membrane, a GDP molecule associated with the alpha subunit is exchanged for GTP. The alpha subunit dissociates from the beta and gamma subunits and triggers a cellular response. Hydrolysis of GTP to GDP terminates the signal.

Figure 5 When a signaling molecule binds to a G-protein-coupled receptor in the plasma membrane, a GDP molecule associated with the G-protein is exchanged for GTP. The subunits come apart from each other, and a cellular response is triggered either by one or both of the subunits. Hydrolysis of GTP to GDP terminates the signal.

Enzyme-linked receptors

Enzyme-linked receptors are cell-surface receptors with intracellular domains that are associated with an enzyme. In some cases, the intracellular domain of the receptor itself is an enzyme. Other enzyme-linked receptors have a small intracellular domain that interacts directly with an enzyme. When a ligand binds to the extracellular domain, a signal is transferred through the membrane, activating the enzyme. Activation of the enzyme sets off a chain of events within the cell that eventually leads to a response.

How Viruses Recognize a Host

Unlike living cells, many viruses do not have a plasma membrane or any of the structures necessary to sustain life. Some viruses are simply composed of an inert protein shell containing DNA or RNA. To reproduce, viruses must invade a living cell, which serves as a host, and then take over the host’s cellular apparatus. But how does a virus recognize its host?

Viruses often bind to cell-surface receptors on the host cell. For example, the virus that causes human influenza (flu) binds specifically to receptors on membranes of cells of the respiratory system. Chemical differences in the cell-surface receptors among hosts mean that a virus that infects a specific species (for example, humans) often cannot infect another species (for example, chickens).

However, viruses have very small amounts of DNA or RNA compared to humans, and, as a result, viral reproduction can occur rapidly. Viral reproduction invariably produces errors that can lead to changes in newly produced viruses; these changes mean that the viral proteins that interact with cell-surface receptors may evolve in such a way that they can bind to receptors in a new host. Such changes happen randomly and quite often in the reproductive cycle of a virus, but the changes only matter if a virus with new binding properties comes into contact with a suitable host. In the case of influenza, this situation can occur in settings where animals and people are in close contact, such as poultry and swine farms (Sigalov, 2010). Once a virus jumps to a new host, it can spread quickly. Scientists watch newly appearing viruses (called emerging viruses) closely in the hope that such monitoring can reduce the likelihood of global viral epidemics.



Text adapted from: OpenStax, Biology. OpenStax CNX. October 13, 2017.

A. B. Sigalov, The School of Nature. IV. Learning from Viruses, Self/Nonself 1, no. 4 (2010): 282-298.

Y. Cao, X. Koh, L. Dong, X. Du, A. Wu, X. Ding, H. Deng, Y. Shu, J. Chen, T. Jiang, Rapid Estimation of Binding Activity of Influenza Virus Hemagglutinin to Human and Avian Receptors, PLoS One 6, no. 4 (2011): e18664.


Types of signaling molecules

Ligands are produced by signaling cells and act as chemical signals that travel to target cells to coordinate responses. The types of molecules that serve as ligands are incredibly varied and range from small proteins to small ions like calcium (Ca2+).

Small Hydrophobic Ligands

Small hydrophobic ligands can directly diffuse through the plasma membrane and interact with internal receptors. Important members of this class of ligands are the steroid hormones. Steroids are lipids that have a hydrocarbon skeleton with four fused rings; different steroids have different functional groups attached to the carbon skeleton. Steroid hormones include the female sex hormone estradiol, which is a type of estrogen, and the male sex hormone testosterone. Cholesterol, another steroid, is an important structural component of biological membranes and a precursor of steroid hormones (Figure 1). Other hydrophobic hormones include thyroid hormones and vitamin D.

The molecular structures of estradiol, testosterone, and cholesterol are shown. All three molecules share a four-ring structure but differ in the types of functional groups attached to it.

Figure 1 Steroid hormones have similar chemical structures to their precursor, cholesterol. Because these molecules are small and hydrophobic, they can diffuse directly across the plasma membrane into the cell, where they interact with internal receptors.

Water-Soluble Ligands

Water-soluble ligands are polar and therefore cannot pass through the plasma membrane unaided; sometimes, they are too large to pass through the membrane at all. Instead, most water-soluble ligands bind to the portion of a cell-surface receptor which is on the outside of the cell. This group of ligands is quite diverse and includes small molecules, peptides (short chains of amino acids), and proteins.

Other Ligands

Nitric oxide (NO) is a gas that also acts as a ligand. It is able to diffuse directly across the plasma membrane, and one of its roles is to interact with receptors in smooth muscle and induce relaxation of the tissue. NO has a very short half-life and therefore only functions over short distances. Nitroglycerin, a treatment for heart disease, acts by triggering the release of NO, which causes blood vessels to dilate (expand), thus restoring blood flow to the heart. NO has become better known recently because the pathway that it affects is targeted by prescription medications for erectile dysfunction, such as Viagra (erection involves dilated blood vessels).


OpenStax, Biology. OpenStax CNX. October 13, 2017.


Propagation of the signal

Once a ligand binds to a receptor, the signal is transmitted through the membrane and into the cytoplasm. Continuation of a signal in this manner is called signal transduction. Signal transduction only occurs with cell-surface receptors because internal receptors are able to interact directly with DNA in the nucleus to initiate protein synthesis.

Binding Initiates a Signaling Pathway

After the ligand binds to the cell-surface receptor, the activation of the receptor’s intracellular components sets off a chain of events that is called a signaling pathway or a signaling cascade. Signaling pathways can get very complicated very quickly because most cellular proteins can affect different downstream events, depending on the conditions within the cell. A single pathway can branch off toward different endpoints based on the interplay between two or more signaling pathways, and the same ligands are often used to initiate different signals in different cell types. This variation in response is due to differences in protein expression in different cell types. Another complicating element is signal integration of the pathways, in which signals from two or more different cell-surface receptors merge to activate the same response in the cell. This process can ensure that multiple external requirements are met before a cell commits to a specific response.

The effects of extracellular signals can also be amplified by enzymatic cascades. At the initiation of the signal, a single ligand binds to a single receptor. However, activation of a receptor-linked enzyme can activate many copies of a component of the signaling cascade, which amplifies the signal.

Figure 1 Example of a signal transduction cascade. In this example, insulin serves as the ligand and activates a cascade that leads to activation of the GLUT4 glucose transporter on the cell membrane. Photo credit Luuis12321; Wikimedia Commons.
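The amplification produced by an enzymatic cascade can be illustrated with a short sketch. It assumes that each activated molecule at one step activates a fixed number of molecules at the next step. The amplification factors are hypothetical and do not describe any particular pathway.

```python
def activated_molecules(amplification_per_step):
    """Count active molecules after each step, starting from one ligand-bound receptor."""
    counts = [1]  # a single ligand bound to a single receptor
    for factor in amplification_per_step:
        # Each active molecule activates `factor` copies of the next component.
        counts.append(counts[-1] * factor)
    return counts


# Hypothetical amplification factors for a three-step cascade.
steps = [10, 10, 100]
print(activated_molecules(steps))  # [1, 10, 100, 10000]
```

Even with modest factors at each step, one binding event at the cell surface ends up affecting thousands of molecules inside the cell.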

Methods of Intracellular Signaling

The activation of a signaling pathway depends on the modification of a cellular component by an enzyme. There are numerous types of enzymatic modifications that can occur, and they are recognized in turn by the next component downstream. The following are some of the more common events in intracellular signaling.


Phosphorylation

One of the most common chemical modifications that occurs in signaling pathways is the addition of a phosphate group (PO₄³⁻) to a molecule such as a protein, in a process called phosphorylation. The transfer of the phosphate is catalyzed by an enzyme called a kinase. Various kinases are named for the substrate they phosphorylate. Phosphorylation can create a binding site that interacts with downstream components in the signaling cascade. Phosphorylation may activate or inactivate enzymes, and the reversal of phosphorylation (dephosphorylation by a phosphatase) will reverse the effect.

Second Messengers

Second messengers are small molecules that help to spread a signal through the cytoplasm after a ligand binds to a receptor. They do this by altering the behavior of certain cellular proteins. Some examples of second messengers are cAMP (a modified version of AMP, which is related to ATP but only contains one phosphate) and calcium ions.




Response to the signal

Ligands that can enter the cell and bind to internal receptors are able to directly affect the cell’s DNA and protein-producing machinery. Ligands that cannot enter the cell bind to receptors in the plasma membrane and use signal transduction pathways to produce a variety of effects. The results of signaling pathways are extremely varied and depend on the type of cell involved as well as the external and internal conditions. A small sampling of responses is described below.

Gene Expression

Some signal transduction pathways regulate the production of mRNA. Others regulate the synthesis of proteins from mRNA by ribosomes. Typically these pathways increase gene expression so that more of a specific protein is produced from a gene, but some pathways do decrease gene expression.

An example of a protein that regulates translation in the nucleus is the MAP kinase ERK. ERK is activated in a phosphorylation cascade when epidermal growth factor (EGF) binds the EGF receptor. Upon phosphorylation, ERK enters the nucleus and activates a protein kinase that, in turn, regulates protein translation (Figure 1).


Figure 1 ERK is a MAP kinase that activates translation when it is phosphorylated. ERK phosphorylates MNK1, which in turn phosphorylates eIF-4E, a eukaryotic translation initiation factor that, with other initiation factors, is associated with mRNA. When eIF-4E becomes phosphorylated, the mRNA unfolds, allowing protein synthesis in the nucleus to begin.

Increase in Cellular Metabolism

Another signaling pathway affects metabolism in muscle cells. The activation of β-adrenergic receptors in muscle cells by adrenaline leads to an increase in cyclic AMP (cAMP) inside the cell. Also known as epinephrine, adrenaline is a hormone (produced by the adrenal gland attached to the kidney) that readies the body for short-term emergencies. Cyclic AMP activates PKA (protein kinase A), which in turn phosphorylates two enzymes. The first enzyme, glycogen phosphorylase kinase (GPK), promotes the degradation of glycogen by activating glycogen phosphorylase (GP), which catabolizes glycogen into glucose. (Recall that your body converts excess glucose to glycogen for short-term storage. When energy is needed, glycogen is quickly reconverted to glucose.) Phosphorylation of the second enzyme, glycogen synthase (GS), inhibits its ability to form glycogen from glucose. In this manner, a muscle cell obtains a ready pool of glucose by activating its formation via glycogen degradation and by inhibiting the use of glucose to form glycogen, thus preventing a futile cycle of glycogen degradation and synthesis. The glucose is then available for use by the muscle cell in response to a sudden surge of adrenaline: the “fight or flight” reflex.

Cell Growth

Cell signaling pathways also play a major role in cell division. Cells do not normally divide unless they are stimulated by signals from other cells. The ligands that promote cell growth are called growth factors. Most growth factors bind to cell-surface receptors that are linked to tyrosine kinases. These cell-surface receptors are called receptor tyrosine kinases (RTKs). Activation of RTKs initiates a signaling pathway that includes a G-protein called RAS, which activates the MAP kinase pathway described earlier. The enzyme MAP kinase then stimulates the expression of proteins that interact with other cellular components to initiate cell division.

Cell Death

When a cell is damaged, superfluous, or potentially dangerous to an organism, a cell can initiate a mechanism to trigger programmed cell death, or apoptosis. Apoptosis allows a cell to die in a controlled manner that prevents the release of potentially damaging molecules from inside the cell. There are many internal checkpoints that monitor a cell’s health; if abnormalities are observed, a cell can spontaneously initiate the process of apoptosis. However, in some cases, such as a viral infection or uncontrolled cell division due to cancer, the cell’s normal checks and balances fail. External signaling can also initiate apoptosis. For example, most normal animal cells have receptors that interact with the extracellular matrix, a network of glycoproteins that provides structural support for cells in an organism. The binding of cellular receptors to the extracellular matrix initiates a signaling cascade within the cell. However, if the cell moves away from the extracellular matrix, the signaling ceases, and the cell undergoes apoptosis. This system keeps cells from traveling through the body and proliferating out of control, as happens with tumor cells that metastasize.

Another example of external signaling that leads to apoptosis occurs in T-cell development. T-cells are immune cells that bind to foreign macromolecules and particles, and target them for destruction by the immune system. Normally, T-cells do not target “self” proteins (those of their own organism); when they do, autoimmune disease can result. In order to develop the ability to discriminate between self and non-self, immature T-cells undergo screening to determine whether they bind to so-called self proteins. If the T-cell receptor binds to self proteins, the cell initiates apoptosis to remove the potentially dangerous cell.

Apoptosis is also essential for normal embryological development. In vertebrates, for example, early stages of development include the formation of web-like tissue between individual fingers and toes (Figure 2). During the course of normal development, these unneeded cells must be eliminated, enabling fully separated fingers and toes to form. A cell signaling mechanism triggers apoptosis, which destroys the cells between the developing digits.


Figure 2 The histological section of a foot of a 15-day-old mouse embryo, visualized using light microscopy, reveals areas of tissue between the toes, which apoptosis will eliminate before the mouse reaches its full gestational age of about 19 to 21 days. (credit: modification of work by Michal Mañas)

Termination of the Signal Cascade

The aberrant signaling often seen in tumor cells shows that terminating a signal at the appropriate time can be just as important as initiating it. One method of stopping a specific signal is to degrade the ligand or remove it so that it can no longer access its receptor. One reason that hydrophobic hormones like estrogen and testosterone trigger long-lasting events is that they bind carrier proteins. These proteins allow the insoluble molecules to be soluble in blood, but they also protect the hormones from degradation by circulating enzymes.

Inside the cell, many different enzymes reverse the cellular modifications that result from signaling cascades. For example, phosphatases are enzymes that remove the phosphate group attached to proteins by kinases in a process called dephosphorylation. Cyclic AMP (cAMP) is degraded into AMP by phosphodiesterase, and the release of calcium stores is reversed by the Ca²⁺ pumps that are located in the external and internal membranes of the cell.




Enzyme-catalyzed reactions

Learning Objectives

By the end of this section, you will be able to:

  • Explain the role of enzyme-catalyzed reactions in cellular metabolism. 

Figure 1 A hummingbird needs energy to maintain prolonged flight. The bird obtains its energy from taking in food and transforming the energy contained in food molecules into forms of energy to power its flight through a series of biochemical reactions. (credit: modification of work by Cory Zanker)

Virtually every task performed by living organisms requires energy. Energy is needed to perform heavy labor and exercise, but humans also use energy while thinking, and even during sleep. In fact, the living cells of every organism constantly use energy. Nutrients and other molecules that are imported into the cell have many different potential fates: they may be metabolized (broken down) and used for energy, synthesized into new molecules, modified if needed, transported around the cell, or even distributed to the entire organism. For example, the large proteins that make up muscles are built from amino acids obtained from the diet. Complex carbohydrates are broken down into simple sugars that the cell uses for energy. Just as energy is required to both build and demolish a building, energy is required for the synthesis and breakdown of molecules as well as the transport of molecules into and out of cells. In addition, processes such as ingesting and breaking down pathogenic bacteria and viruses, exporting wastes and toxins, and movement of the cell require energy.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

Text adapted from: OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016



Scientists use the term bioenergetics to describe the concept of energy flow (Figure 2) through living systems, such as cells. Cellular processes such as the building and breaking down of complex molecules occur through stepwise chemical reactions. Some of these chemical reactions are spontaneous and release energy, whereas others require energy to proceed.

Just as living things must continually consume food to replenish their energy supplies, cells must continually produce more energy to replenish that used by the many energy-requiring chemical reactions that constantly take place. Together, all of the chemical reactions that take place inside cells, including those that consume or generate energy, are referred to as the cell’s metabolism.


Figure 2 Ultimately, most life forms get their energy from the sun. Plants use photosynthesis to capture sunlight, and herbivores eat the plants to obtain energy. Carnivores eat the herbivores, and eventual decomposition of plant and animal material contributes to the nutrient pool.


Thermodynamics refers to the study of energy and energy transfer involving physical matter. The matter relevant to a particular case of energy transfer is called a system, and everything outside of that matter is called the surroundings. For instance, when heating a pot of water on the stove, the system includes the stove, the pot, and the water. Energy is transferred within the system (between the stove, pot, and water). There are two types of systems: open and closed. In an open system, energy can be exchanged with its surroundings. The stovetop system is open because heat can be lost to the air. A closed system cannot exchange energy with its surroundings.

Biological organisms are open systems. Energy is exchanged between them and their surroundings as they use energy from the sun to perform photosynthesis or consume energy-storing molecules and release energy to the environment by doing work and releasing heat. Like all things in the physical world, energy is subject to physical laws. The laws of thermodynamics govern the transfer of energy in and among all systems in the universe.

In general, energy is defined as the ability to do work, or to create some kind of change. Energy exists in different forms. For example, electrical energy, light energy, and heat energy are all different types of energy. To appreciate the way energy flows into and out of biological systems, it is important to understand two of the physical laws that govern energy.




Metabolic Pathways

Consider the metabolism of sugar. This is a classic example of one of the many cellular processes that use and produce energy. Living things consume sugars as a major energy source, because sugar molecules have a great deal of energy stored within their bonds. For the most part, photosynthesizing organisms like plants produce these sugars. During photosynthesis, plants use energy (originally from sunlight) to convert carbon dioxide gas (CO₂) into sugar molecules (like glucose: C₆H₁₂O₆). They consume carbon dioxide and produce oxygen as a waste product. This reaction is summarized as:

6CO₂ + 6H₂O → C₆H₁₂O₆ + 6O₂

Because this process involves synthesizing an energy-storing molecule, it requires energy input to proceed. During photosynthesis, part of this energy is delivered by a molecule called adenosine triphosphate (ATP), which is made in the light reactions and is the primary energy currency of all cells. Just as the dollar is used as currency to buy goods, cells use molecules of ATP as energy currency to perform immediate work. In contrast, energy-storage molecules such as glucose are consumed only to be broken down to use their energy. The reaction that harvests the energy of a sugar molecule in cells requiring oxygen to survive can be summarized as the reverse of the photosynthesis reaction. In this reaction, oxygen is consumed and carbon dioxide is released as a waste product. The reaction is summarized as:

C₆H₁₂O₆ + 6O₂ → 6H₂O + 6CO₂

Both of these reactions involve many steps.
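
Both summary equations are balanced: every atom that appears on the left also appears on the right. As a quick check, the sketch below counts the atoms on each side. The helper names and the hand-written table of molecular formulas are illustrative assumptions, not part of the original text.

```python
from collections import Counter

# Atom counts per molecule, written out by hand for the four molecules
# that appear in the summary equations.
MOLECULES = {
    "CO2": {"C": 1, "O": 2},
    "H2O": {"H": 2, "O": 1},
    "C6H12O6": {"C": 6, "H": 12, "O": 6},
    "O2": {"O": 2},
}


def count_atoms(side):
    """Total atoms on one side of an equation.

    side: list of (coefficient, molecule_name) pairs.
    """
    total = Counter()
    for coefficient, name in side:
        for element, n in MOLECULES[name].items():
            total[element] += coefficient * n
    return total


def is_balanced(reactants, products):
    """True when both sides contain the same number of each atom."""
    return count_atoms(reactants) == count_atoms(products)


photosynthesis = ([(6, "CO2"), (6, "H2O")], [(1, "C6H12O6"), (6, "O2")])
respiration = (photosynthesis[1], photosynthesis[0])  # the reverse reaction

print(is_balanced(*photosynthesis))
print(is_balanced(*respiration))
```

The check only confirms that matter is conserved in the summary. It says nothing about the many intermediate steps or about the energy involved.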

The processes of making and breaking down sugar molecules illustrate two examples of metabolic pathways. A metabolic pathway is a series of chemical reactions that takes a starting molecule and modifies it, step-by-step, through a series of metabolic intermediates, eventually yielding a final product. In the example of sugar metabolism, the first metabolic pathway synthesized sugar from smaller molecules, and the other pathway broke sugar down into smaller molecules. These two opposite processes—the first requiring energy and the second producing energy—are referred to as anabolic pathways (building polymers) and catabolic pathways (breaking down polymers into their monomers), respectively. Consequently, metabolism is composed of synthesis (anabolism) and degradation (catabolism) (Figure 3).

It is important to know that the chemical reactions of metabolic pathways do not take place on their own. Each reaction step is facilitated, or catalyzed, by a protein called an enzyme. Enzymes are important for catalyzing all types of biological reactions—those that require energy as well as those that release energy.

Figure 3 Catabolic pathways are those that generate energy by breaking down larger molecules. Anabolic pathways are those that require energy to synthesize larger molecules. Both types of pathways are required for maintaining the cell’s energy balance.





The first law of thermodynamics states that the total amount of energy in the universe is constant and conserved. In other words, there has always been, and always will be, exactly the same amount of energy in the universe. Energy exists in many different forms. According to the first law of thermodynamics, energy may be transferred from place to place or transformed into different forms, but it cannot be created or destroyed. The transfers and transformations of energy take place around us all the time. Light bulbs transform electrical energy into light and heat energy. Gas stoves transform chemical energy from natural gas into heat energy. Plants perform one of the most biologically useful energy transformations on earth: that of converting the energy of sunlight to chemical energy stored within organic molecules (Figure 2). Some examples of energy transformations are shown in Figure 4.

The challenge for all living organisms is to obtain energy from their surroundings in forms that they can transfer or transform into usable energy to do work. Living cells have evolved to meet this challenge. Chemical energy stored within organic molecules such as sugars and fats is transferred and transformed through a series of cellular chemical reactions into energy within molecules of ATP (adenosine triphosphate). Energy in ATP molecules is easily accessible to do work. Examples of the types of work that cells need to do include building complex molecules, transporting materials, powering the motion of cilia or flagella, and contracting muscle fibers to create movement.


Figure 4 Shown are some examples of energy transferred and transformed from one system to another and from one form to another. (credit “ice cream”: modification of work by D. Sharon Pruitt; credit “kids”: modification of work by Max from Providence; credit “leaf”: modification of work by Cory Zanker)

A living cell’s primary tasks of obtaining, transforming, and using energy to do work may seem simple. However, the second law of thermodynamics explains why these tasks are harder than they appear. No energy transfer or transformation is completely efficient. In every energy transfer, some amount of energy is lost in a form that is unusable. In most cases, this form is heat energy. Thermodynamically, heat energy is defined as the energy transferred from one system to another that is not work. For example, when a light bulb is turned on, some of the energy being converted from electrical energy into light energy is lost as heat energy. Likewise, some energy is lost as heat energy during cellular metabolic reactions.

An important concept in physical systems is that of order and disorder. The more energy that is lost by a system to its surroundings, the less ordered and more random the system is. Scientists refer to the measure of randomness or disorder within a system as entropy. High entropy means high disorder and low energy. Molecules and chemical reactions have varying entropy as well. For example, entropy increases as molecules at a high concentration in one place diffuse and spread out. The second law of thermodynamics says that energy will always be lost as heat in energy transfers or transformations. Living things are highly ordered, requiring constant energy input to be maintained in a state of low entropy.




Potential and Kinetic Energy

When an object is in motion, there is energy associated with that object. Think of a wrecking ball. Even a slow-moving wrecking ball can do a great deal of damage to other objects. Energy associated with objects in motion is called kinetic energy (Figure 5). A speeding bullet, a walking person, and the rapid movement of molecules in the air (which produces heat) all have kinetic energy.

Now what if that same motionless wrecking ball is lifted two stories above ground with a crane? If the suspended wrecking ball is unmoving, is there energy associated with it? The answer is yes. The energy that was required to lift the wrecking ball did not disappear, but is now stored in the wrecking ball by virtue of its position and the force of gravity acting on it. This type of energy is called potential energy (Figure 5). If the ball were to fall, the potential energy would be transformed into kinetic energy until all of the potential energy was exhausted when the ball rested on the ground. Wrecking balls also swing like a pendulum; through the swing, there is a constant change of potential energy (highest at the top of the swing) to kinetic energy (highest at the bottom of the swing). Other examples of potential energy include the energy of water held behind a dam or a person about to skydive out of an airplane.
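
The wrecking-ball example can be put into numbers. The sketch below uses the standard physics formulas for gravitational potential energy (mass × g × height) and kinetic energy (½ × mass × speed²) and ignores air resistance. The mass and the height are invented for illustration.

```python
import math

G = 9.81  # gravitational acceleration near Earth's surface, m/s^2


def potential_energy(mass_kg, height_m):
    """Gravitational potential energy in joules."""
    return mass_kg * G * height_m


def kinetic_energy(mass_kg, speed_m_s):
    """Kinetic energy in joules."""
    return 0.5 * mass_kg * speed_m_s ** 2


def speed_after_falling(drop_m):
    """Speed after a free fall of drop_m metres, ignoring air resistance."""
    return math.sqrt(2 * G * drop_m)


mass = 1000.0   # assumed ball mass, kg
height = 6.0    # assumed height of roughly two stories, m

for h in (6.0, 3.0, 0.0):
    pe = potential_energy(mass, h)
    ke = kinetic_energy(mass, speed_after_falling(height - h))
    print(f"height {h:.0f} m: potential {pe:.0f} J, kinetic {ke:.0f} J")
```

At every height the two values add up to the same total, which is the first law of thermodynamics at work: the energy changes form but is not created or destroyed.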


Figure 5 Still water has potential energy; moving water, such as in a waterfall or a rapidly flowing river, has kinetic energy. (credit “dam”: modification of work by “Pascal”/Flickr; credit “waterfall”: modification of work by Frank Gualtieri)

Potential energy is not only associated with the location of matter, but also with the structure of matter. A spring on the ground has potential energy if it is compressed; so does a rubber band that is pulled taut. On a molecular level, the bonds that hold the atoms of molecules together exist in a particular structure that has potential energy. Remember that anabolic cellular pathways require energy to synthesize complex molecules from simpler ones and catabolic pathways release energy when complex molecules are broken down. The fact that energy can be released by the breakdown of certain chemical bonds implies that those bonds have potential energy. In fact, there is potential energy stored within the bonds of all the food molecules we eat, which is eventually harnessed for use. This is because these bonds can release energy when broken. The type of potential energy that exists within chemical bonds, and is released when those bonds are broken, is called chemical energy. Chemical energy is responsible for providing living cells with energy from food. The release of energy occurs when the molecular bonds within food molecules are broken.




Free and Activation Energy

After learning that chemical reactions release energy when energy-storing bonds are broken, an important next question is the following: How is the energy associated with these chemical reactions quantified and expressed? How can the energy released from one reaction be compared to that of another reaction? A measurement of free energy is used to quantify these energy transfers. Recall that according to the second law of thermodynamics, all energy transfers involve the loss of some amount of energy in an unusable form such as heat. Free energy specifically refers to the energy associated with a chemical reaction that is available after the losses are accounted for. In other words, free energy is usable energy, or energy that is available to do work. Looking at this concept in a biological sense, free energy is the energy within a molecule that can be used to perform work. Glucose has a lot of free energy because there is a lot of energy stored within the bonds of the glucose molecule. Carbon dioxide has a much lower free energy because there is much less energy stored in its bonds.

If energy is released during a chemical reaction, then the change in free energy from the conversion of the reactants to the products, signified as ΔG (delta G) will be a negative number. A negative change in free energy also means that the products of the reaction have less free energy than the reactants, because they release some free energy during the reaction. Reactions that have a negative change in free energy and consequently release free energy are called exergonic reactions. Think: exergonic means energy is exiting the system. These reactions are also referred to as spontaneous reactions, and their products have less stored energy than the reactants. An important distinction must be drawn between the term spontaneous and the idea of a chemical reaction occurring immediately. Contrary to the everyday use of the term, a spontaneous reaction is not one that suddenly or quickly occurs. The rusting of iron is an example of a spontaneous reaction that occurs slowly, little by little, over time.

Figure 1 Free energy of endergonic and exergonic reactions. In an exergonic reaction, the reactants have more free energy than the products. Therefore, energy is released as the reaction proceeds. In an endergonic reaction, the reactants have less free energy than the products. Therefore, energy must be added to make the reaction take place.

If a chemical reaction absorbs energy rather than releases energy on balance, then the ΔG for that reaction will be a positive value. In this case, the products have more free energy than the reactants. Thus, the products of these reactions can be thought of as energy-storing molecules. These chemical reactions are called endergonic reactions and they are nonspontaneous.

An endergonic reaction will not take place on its own without the addition of free energy.
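
The sign convention can be summarized in a few lines. This sketch defines ΔG as the free energy of the products minus that of the reactants, as in the text. The numerical values are placeholders in arbitrary units, not measured free energies, and the ΔG = 0 case (a reaction at equilibrium) is added here only for completeness.

```python
def delta_g(g_reactants, g_products):
    """Change in free energy: products minus reactants."""
    return g_products - g_reactants


def classify(dg):
    """Label a reaction by the sign of its free-energy change."""
    if dg < 0:
        return "exergonic (spontaneous, releases free energy)"
    if dg > 0:
        return "endergonic (nonspontaneous, requires free energy)"
    return "at equilibrium (no net change in free energy)"


# Placeholder free-energy values in arbitrary units.
print(classify(delta_g(g_reactants=100, g_products=40)))   # negative change
print(classify(delta_g(g_reactants=40, g_products=100)))   # positive change
```

Note that the label depends only on the start and end points. It says nothing about how fast the reaction runs.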

Figure 2 Shown are some examples of endergonic processes (ones that require energy) and exergonic processes (ones that release energy). (credit a: modification of work by Natalie Maynor; credit b: modification of work by USDA; credit c: modification of work by Cory Zanker; credit d: modification of work by Harry Malsch)

There is another important concept that must be considered regarding endergonic and exergonic reactions. Exergonic reactions require a small amount of energy input to get going, before they can proceed with their energy-releasing steps.

These reactions have a net release of energy, but still require some energy input in the beginning. This small amount of energy input necessary for all chemical reactions to occur is called the activation energy (Figure 3).

Figure 3 Activation energy is the small amount of energy that must be put into a system in order for the reaction to take place. Photo credit Brazosport College; Wikimedia.





A substance that helps a chemical reaction to occur is called a catalyst, and the molecules that catalyze biochemical reactions are called enzymes. Most enzymes are proteins and perform the critical task of lowering the activation energies of chemical reactions inside the cell. Most of the reactions critical to a living cell happen too slowly at normal temperatures to be of any use to the cell. Without enzymes to speed up these reactions, life could not persist. Enzymes do this by binding to the reactant molecules and holding them in such a way as to make the chemical bond-breaking and -forming processes take place more easily. It is important to remember that enzymes do not change whether a reaction is exergonic (spontaneous) or endergonic. This is because they do not change the free energy of the reactants or products. They only reduce the activation energy required for the reaction to go forward (Figure 1). In addition, an enzyme itself is unchanged by the reaction it catalyzes. Once one reaction has been catalyzed, the enzyme is able to participate in other reactions.


Figure 1 Enzymes lower the activation energy of the reaction but do not change the free energy of the reaction.
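
To get a feel for why lowering the activation energy matters so much, one can borrow the Arrhenius equation from chemistry, k = A · exp(−Ea / (R · T)). This equation is not part of this chapter and is brought in here only as an illustration. The activation energies below are invented round numbers, not values for any real enzyme, and the prefactor A is assumed to be the same with and without the catalyst.

```python
import math

R = 8.314  # gas constant, J/(mol*K)


def rate_constant(activation_energy_j_mol, temperature_k, prefactor=1.0):
    """Arrhenius rate constant k = A * exp(-Ea / (R * T))."""
    return prefactor * math.exp(-activation_energy_j_mol / (R * temperature_k))


def speedup(ea_uncatalyzed, ea_catalyzed, temperature_k=310.0):
    """How many times faster the catalyzed reaction is, all else equal."""
    return (rate_constant(ea_catalyzed, temperature_k)
            / rate_constant(ea_uncatalyzed, temperature_k))


# Assumed values: the enzyme lowers Ea from 80 to 50 kJ/mol at body
# temperature (310 K). Free energies of reactants and products are untouched.
print(f"speed-up factor: {speedup(80_000, 50_000):.3g}")
```

Because the activation energy sits inside an exponential, a moderate reduction translates into a very large increase in rate, while the free energy change of the reaction stays the same.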

The chemical reactants to which an enzyme binds are called the enzyme’s substrates. There may be one or more substrates, depending on the particular chemical reaction. In some reactions, a single reactant substrate is broken down into multiple products. In others, two substrates may come together to create one larger molecule. Two reactants might also enter a reaction and both become modified, but they leave the reaction as two products. The location within the enzyme where the substrate binds is called the enzyme’s active site. The active site is where the “action” happens. Since enzymes are proteins, there is a unique combination of amino acid side chains within the active site. Each side chain is characterized by different properties. They can be large or small, weakly acidic or basic, hydrophilic or hydrophobic, positively or negatively charged, or neutral. The unique combination of side chains creates a very specific chemical environment within the active site. This specific environment is suited to bind to one specific chemical substrate (or substrates).

Active sites are subject to influences of the local environment. Increasing the environmental temperature generally increases reaction rates, enzyme-catalyzed or otherwise. However, temperatures outside of an optimal range reduce the rate at which an enzyme catalyzes a reaction. High temperatures will eventually cause enzymes to denature, an irreversible change in the three-dimensional shape and therefore the function of the enzyme (Figure 2). Enzymes are also suited to function best within a certain pH and salt concentration range, and, as with temperature, extreme pH and salt concentrations can cause enzymes to denature.


Figure 2 Heat applied to an egg during cooking irreversibly denatures the proteins. (credit: “K-Wall”/Flickr)

For many years, scientists thought that enzyme-substrate binding took place in a simple “lock and key” fashion. This model asserted that the enzyme and substrate fit together perfectly in one instantaneous step. However, current research supports a model called induced fit (Figure 9). The induced-fit model expands on the lock-and-key model by describing a more dynamic binding between enzyme and substrate. As the enzyme and substrate come together, their interaction causes a mild shift in the enzyme’s structure that forms an ideal binding arrangement between enzyme and substrate.

When an enzyme binds its substrate, an enzyme-substrate complex is formed. This complex lowers the activation energy of the reaction and promotes its rapid progression in one of multiple possible ways.

One of the hallmark properties of enzymes is that they remain ultimately unchanged by the reactions they catalyze. After an enzyme has catalyzed a reaction, it releases its product(s) and can catalyze a new reaction.

Figure 9 The induced-fit model is an adjustment to the lock-and-key model and explains how enzymes and substrates undergo dynamic modifications during the transition state to increase the affinity of the substrate for the active site.

It would seem ideal to have a scenario in which all of an organism’s enzymes existed in abundant supply and functioned optimally under all cellular conditions, in all cells, at all times. However, a variety of mechanisms ensures that this does not happen. Cellular needs and conditions constantly vary from cell to cell, and change within individual cells over time. The required enzymes of stomach cells differ from those of fat storage cells, skin cells, blood cells, and nerve cells. Furthermore, a digestive organ cell works much harder to process and break down nutrients during the time that closely follows a meal compared with many hours after a meal. As these cellular demands and conditions vary, so must the amounts and functionality of different enzymes.

Since the rates of biochemical reactions are controlled by activation energy, and enzymes lower and determine activation energies for chemical reactions, the relative amounts and functioning of the variety of enzymes within a cell ultimately determine which reactions will proceed and at what rates. This determination is tightly controlled in cells. In certain cellular environments, enzyme activity is partly controlled by environmental factors like pH, temperature, salt concentration, and, in some cases, cofactors or coenzymes.

Enzymes can also be regulated in ways that either promote or reduce enzyme activity. There are many kinds of molecules that inhibit or promote enzyme function, and various mechanisms by which they do so. In some cases of enzyme inhibition, an inhibitor molecule is similar enough to a substrate that it can bind to the active site and simply block the substrate from binding. When this happens, the enzyme is inhibited through competitive inhibition, because an inhibitor molecule competes with the substrate for binding to the active site.

On the other hand, in noncompetitive inhibition, an inhibitor molecule binds to the enzyme in a location other than the active site, called an allosteric site, but still manages to block substrate binding to the active site. Some inhibitor molecules bind to enzymes in a location where their binding induces a conformational change that reduces the affinity of the enzyme for its substrate. This type of inhibition is called allosteric inhibition (Figure 10). Most allosterically regulated enzymes are made up of more than one polypeptide, meaning that they have more than one protein subunit. When an allosteric inhibitor binds to a region on an enzyme, all active sites on the protein subunits are changed slightly such that they bind their substrates with less efficiency. There are allosteric activators as well as inhibitors. Allosteric activators bind to locations on an enzyme away from the active site, inducing a conformational change that increases the affinity of the enzyme’s active site(s) for its substrate(s) (Figure 10).

Figure 10 Allosteric inhibition works by indirectly inducing a conformational change to the active site such that the substrate no longer fits. In contrast, in allosteric activation, the activator molecule modifies the shape of the active site to allow a better fit of the substrate.

Many enzymes do not work optimally, or even at all, unless bound to other specific non-protein helper molecules. They may bond either temporarily through ionic or hydrogen bonds, or permanently through stronger covalent bonds. Binding to these molecules promotes optimal shape and function of their respective enzymes. Two examples of these types of helper molecules are cofactors and coenzymes. Cofactors are inorganic ions such as ions of iron and magnesium. Coenzymes are organic helper molecules, those with a basic atomic structure made up of carbon and hydrogen. Like enzymes, these molecules participate in reactions without being changed themselves and are ultimately recycled and reused. Vitamins are the source of coenzymes. Some vitamins are the precursors of coenzymes and others act directly as coenzymes. Vitamin C is a direct coenzyme for multiple enzymes that take part in building the important connective tissue, collagen. Therefore, enzyme function is, in part, regulated by the abundance of various cofactors and coenzymes, which may be supplied by an organism’s diet or, in some cases, produced by the organism.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

Text adapted from: OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016


Feedback Inhibition in Metabolic Pathways

Molecules can regulate enzyme function in many ways. The major question remains, however: What are these molecules and where do they come from? Some are cofactors and coenzymes, as you have learned. What other molecules in the cell provide enzymatic regulation such as allosteric modulation, and competitive and non-competitive inhibition? Perhaps the most relevant sources of regulatory molecules, with respect to enzymatic cellular metabolism, are the products of the cellular metabolic reactions themselves. In a most efficient and elegant way, cells have evolved to use the products of their own reactions for feedback inhibition of enzyme activity. Feedback inhibition involves the use of a reaction product to regulate its own further production (Figure 11). The cell responds to an abundance of the products by slowing down production during anabolic or catabolic reactions. Such reaction products may inhibit the enzymes that catalyzed their production through the mechanisms described above.

Figure 11 Metabolic pathways are a series of reactions catalyzed by multiple enzymes. Feedback inhibition, where the end product of the pathway inhibits an upstream process, is an important regulatory mechanism in cells.

The production of both amino acids and nucleotides is controlled through feedback inhibition. Additionally, ATP is an allosteric regulator of some of the enzymes involved in the catabolic breakdown of sugar, the process that creates ATP. In this way, when ATP is in abundant supply, the cell can prevent the production of ATP. On the other hand, ADP serves as a positive allosteric regulator (an allosteric activator) for some of the same enzymes that are inhibited by ATP. Thus, when relative levels of ADP are high compared to ATP, the cell is triggered to produce more ATP through sugar catabolism.
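The effect of feedback inhibition can be illustrated with a toy simulation. The sketch below is not a model of any real pathway. The function name, the rate constants, and the inhibition formula (the enzyme's rate divided by 1 + product / k_inhibit) are all assumptions chosen only to show that production slows as the end product accumulates.

```python
def simulate_pathway(steps, max_rate=10.0, k_inhibit=5.0,
                     usage_fraction=0.2, feedback=True):
    """Toy model: an enzyme makes product each step and the cell uses some of it.

    All parameter values are illustrative assumptions, not measured quantities.
    """
    product = 0.0
    history = []
    for _ in range(steps):
        if feedback:
            # More product means lower enzyme activity (feedback inhibition).
            rate = max_rate / (1.0 + product / k_inhibit)
        else:
            rate = max_rate
        product += rate                      # the enzyme makes new product
        product -= usage_fraction * product  # the cell consumes some product
        history.append(product)
    return history


with_feedback = simulate_pathway(200, feedback=True)
without_feedback = simulate_pathway(200, feedback=False)
print(f"product level with feedback:    {with_feedback[-1]:.1f}")
print(f"product level without feedback: {without_feedback[-1]:.1f}")
```

In this sketch the product settles at a lower level when feedback is switched on, which mirrors the idea that an abundant product slows its own production.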


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

Text adapted from: OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016


How cells obtain energy

Learning Objectives

By the end of this section, you will begin to be able to:

  • Compare energy-generating processes within different types of cells.



Energy in Living Systems

All living organisms require energy to perform their life processes. Energy, as you learned earlier in the chapter about enzymes, is the ability to do work or to create some kind of change. You are familiar with or have learned about many processes that can require energy:

Just as living things must continually consume food to replenish their energy supplies, cells must continually produce more energy to replenish that used by the many energy-requiring chemical reactions that constantly take place. Together, all of the chemical reactions that take place inside cells, including those that consume or generate energy, are referred to as the cell’s metabolism.

A living cell cannot store significant amounts of free energy. Free energy is energy that is not stored in molecules. Excess free energy would result in an increase of heat in the cell, which would denature enzymes and other proteins, and destroy the cell. Instead, a cell must be able to store energy safely and release it for use only as needed. Living cells accomplish this using ATP, which can be used to fill any energy need of the cell. How? It functions like a rechargeable battery.

When ATP is broken down, energy is released. This energy is used by the cell to do work. For example, in the mechanical work of muscle contraction, ATP supplies energy to move the contractile muscle proteins.

ATP Structure and Function

ATP is a complex-looking molecule, but for our purposes you can think of it as a rechargeable battery. ATP, the fully charged form of our battery, is made up of three phosphates (the “TP” part of ATP means “tri phosphate”) attached to a sugar and an adenine (the “A” part of ATP) (Figure 1). When the last phosphate is broken off of the ATP, energy is released. The result is a single phosphate and a molecule called ADP (“D” stands for “di” which means two).

Figure 1 The structure of ATP shows the basic components of a two-ring adenine, five-carbon ribose sugar, and three phosphate groups.

A large amount of energy is required in order to recharge a molecule of ADP into ATP. This energy is stored in the bond between the second and third phosphates. When this bond is broken, the energy is released in a way that the cell can use it.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

Text adapted from: OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016


From Mouth to Molecule: Digestion

While plants can produce their own energy using the process of photosynthesis, animals (and other organisms that can’t do photosynthesis) must eat to get energy from food molecules. Just as energy can be stored in the chemical bond between the second and third phosphate of an ATP molecule, energy can also be stored in the chemical bonds that make up food molecules. Most of the energy that we use comes from molecules of glucose, a simple sugar.

Food energy is chemical energy that animals (including humans) derive from their food through the process of cellular respiration. Cellular respiration involves either joining oxygen from air with the molecules of food (aerobic respiration) or reorganizing the atoms within the molecules in the absence of oxygen (anaerobic respiration).

Humans and other animals need a minimum intake of food energy to sustain their metabolism and to drive their muscles. Foods are composed chiefly of carbohydrates, fats, proteins, water, vitamins, and minerals. Carbohydrates, fats, proteins, and water represent virtually all the weight of food, with vitamins and minerals making up only a small percentage of the weight. In fact, carbohydrates, fats, and proteins comprise ninety percent of the dry weight of foods. Organisms derive food energy mainly from carbohydrates and fats present in the diet, and to a smaller extent proteins and other organic molecules. Some diet components that provide little or no food energy, such as water, minerals, vitamins, cholesterol, and fiber, may still be necessary to health and survival for other reasons. Water, minerals, vitamins, and cholesterol are not broken down; they are used by the body in the form in which they are taken in, so they cannot be used for energy. Fiber, a type of carbohydrate, cannot be completely digested by the human body so energy is not released from fiber when it is digested. Instead, it moves mostly intact through the digestive system.

After you put food into your mouth, you begin to break it down mechanically using your teeth. Enzymes in your saliva begin breaking the food molecules down as well. After you swallow your food, it is further broken down by additional enzymes in the stomach, followed by the small intestine. In the small intestine, the fully broken-down food is absorbed into the blood. The majority of the nutrients (about 95%) are absorbed in the small intestine. Water is reabsorbed from the remaining material in the colon. Then the residual waste is eliminated during defecation.


The human digestive system. (Credit: Leysi24, from Wikimedia. Creative Commons Attribution-Share Alike 3.0 Unported)

Once in the bloodstream, nutrients enter individual cells. Glucose is too large to diffuse through the cell membrane and is typically transported inside cells by proteins. After molecules enter a cell, the breakdown process to produce energy in the form of ATP can be completed.


Wikipedia. Creative Commons Attribution-ShareAlike License.



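An organism’s metabolism is the sum total of all the chemical reactions that occur within the organism. These chemical reactions fall into two basic categories:

  • Anabolism: reactions that require energy to synthesize larger molecules
  • Catabolism: reactions that release energy by breaking down larger molecules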

This means that metabolism is composed of synthesis (anabolism) and degradation (catabolism) (Figure 1).

Figure 1 Catabolic pathways are those that generate energy by breaking down larger molecules. Anabolic pathways are those that require energy to synthesize larger molecules. Both types of pathways are required for maintaining the cell’s energy balance.

It is important to know that the chemical reactions of metabolic pathways do not take place on their own. Each reaction step is facilitated, or catalyzed, by a protein called an enzyme. Enzymes are important for catalyzing all types of biological reactions—those that require energy as well as those that release energy. Refer back to the chapter on enzymes if you need a reminder about this topic.

Consider the metabolism of sugar (a carbohydrate). This is a classic example of one of the many cellular processes that use and produce energy. Living things consume sugars as a major energy source, because sugar molecules have a great deal of energy stored within their bonds. For the most part, photosynthesizing organisms like plants produce these sugars. During photosynthesis, plants use energy (originally from sunlight) to convert carbon dioxide gas (CO2) into sugar molecules (like glucose: C6H12O6). They consume carbon dioxide and produce oxygen as a waste product. This reaction is summarized as:

6 CO2 + 6 H2O → C6H12O6 + 6 O2

Recall from chemistry that the abbreviation “CO2” means “one carbon atom covalently bonded to two oxygen atoms.” Water, “H2O” is two hydrogen atoms covalently bonded to one oxygen atom. And “C6H12O6” has 6 carbon atoms, 12 hydrogen atoms, and 6 oxygen atoms that are covalently bonded together.
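The atom counts described above can be checked mechanically. The following is a minimal sketch, assuming simple formulas without parentheses. The helper names `count_atoms` and `total_atoms` are hypothetical and exist only for this illustration. It counts the atoms in each formula and confirms that the photosynthesis summary equation has the same atoms on both sides.

```python
import re
from collections import Counter


def count_atoms(formula, coefficient=1):
    """Count atoms in a simple formula such as 'C6H12O6' (no parentheses)."""
    counts = Counter()
    # Each match is an element symbol followed by an optional number.
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += coefficient * (int(number) if number else 1)
    return counts


def total_atoms(side):
    """Sum atom counts over a list of (coefficient, formula) pairs."""
    total = Counter()
    for coefficient, formula in side:
        total += count_atoms(formula, coefficient)
    return total


# 6 CO2 + 6 H2O -> C6H12O6 + 6 O2
reactants = [(6, "CO2"), (6, "H2O")]
products = [(1, "C6H12O6"), (6, "O2")]

print(dict(count_atoms("C6H12O6")))
print("balanced:", total_atoms(reactants) == total_atoms(products))
```

Both sides contain 6 carbon, 12 hydrogen, and 18 oxygen atoms, so the comparison reports that the equation is balanced.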

Carbon dioxide (CO2) contains one carbon atom covalently bonded to two oxygen atoms. Credit: wikimedia

Glucose contains 6 carbons, 6 oxygens, and 12 hydrogen atoms. Credit: Ben, 2006. Wikimedia.  Public domain.

The process of producing glucose from carbon dioxide and water requires an energy input to proceed because glucose contains more energy in its molecular bonds than carbon dioxide does.

In contrast, energy-storage molecules such as glucose are consumed to be broken down to use their energy. The reaction that harvests the energy of a sugar molecule in cells requiring oxygen to survive can be summarized by the reverse reaction to photosynthesis. In this reaction, oxygen is consumed and carbon dioxide is released as a waste product. The reaction is summarized as:

C6H12O6 + 6 O2 → 6 H2O + 6 CO2

Both of these reactions involve many steps.

The processes of making and breaking down sugar molecules illustrate two examples of metabolic pathways. A metabolic pathway is a series of chemical reactions that takes a starting molecule and modifies it, step-by-step, through a series of metabolic intermediates, eventually yielding a final product. In the example of sugar metabolism, the first metabolic pathway synthesized sugar from smaller molecules, and the other pathway broke sugar down into smaller molecules.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

Text adapted from: OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016


An overview of Cellular Respiration

Glucose and other molecules from food are broken down to release energy in a complex series of chemical reactions that together are called cellular respiration.

Cellular respiration is a set of metabolic reactions and processes that take place in the cells of organisms to convert biochemical energy from nutrients into ATP, and then release waste products. The reactions involved in respiration are catabolic reactions, which break large molecules into smaller ones, releasing energy in the process. These processes require a large number of enzymes which each perform one specific chemical reaction.

Aerobic Respiration

Aerobic respiration requires oxygen. This is the reason why we breathe oxygen in from the air. This type of respiration releases a large amount of energy from glucose that can be stored as ATP. Aerobic respiration happens all the time in animals and plants, where most of the reactions occur in the mitochondria. Even some prokaryotes can perform aerobic respiration (although since prokaryotes don’t contain mitochondria, the reactions are slightly different). The overall chemical formula for aerobic respiration can be written as:

C6H12O6 + 6 O2 → 6 CO2 + 6 H2O + (approximately) 38 ATP

Translating that formula into English: One molecule of glucose can be broken down in the presence of oxygen gas to produce waste products of carbon dioxide (which we breathe out) and water. This process has an overall release of energy, which is captured and stored in approximately 38 molecules of ATP.

Aerobic respiration is a complex process that can be divided into three basic stages: glycolysis, the citric acid cycle, and oxidative phosphorylation. The next several sections in the textbook address the details of these stages.

Anaerobic Respiration

Anaerobic respiration occurs in the absence of oxygen. It releases a much smaller amount of energy than aerobic respiration. Anaerobic respiration does not release enough energy to power human cells for long – think about how long a person can live if they are not able to breathe. Anaerobic respiration occurs in muscle cells during hard exercise (after the oxygen has been used up). It also occurs in yeast when brewing beer. Many prokaryotes perform anaerobic respiration.

There are several different types of anaerobic respiration, which will be discussed in more detail later. All the types of anaerobic respiration involve glycolysis, and none of them go through the citric acid cycle or oxidative phosphorylation. Instead, various other methods are used to regenerate the molecules needed for glycolysis. For now, we will summarize them all using this chemical formula:

C6H12O6 + NAD+ → various waste products + NADH + 2 ATP

NAD+ and NADH are two states of a molecule that will carry energy during this process. They will be addressed further in a later section. For right now, just know that NADH carries energy (similar to ATP) and NAD+ is the form that carries less energy (similar to ADP).

Aerobic vs anaerobic respiration

Characteristic | Aerobic | Anaerobic
Requires oxygen? | Yes | No
Glucose breakdown | Complete | Incomplete
End products | CO2 and H2O | Animal cells: lactic acid. Plant cells and yeast: carbon dioxide and ethanol
ATP produced | About 38 | 2

Aerobic respiration is much more efficient than anaerobic respiration. One molecule of glucose can generate up to 38 molecules of ATP if aerobic respiration is used. In contrast, only 2 molecules of ATP are generated in anaerobic respiration.

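To put it another way, a cellular process that requires 100 molecules of ATP:

  • requires about 2.6 molecules of glucose if aerobic respiration is used (100 ÷ 38 ≈ 2.6)
  • requires 50 molecules of glucose if anaerobic respiration is used (100 ÷ 2 = 50)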
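The same comparison can be worked out as a short calculation. The sketch below uses this chapter's approximate yields (about 38 ATP per glucose aerobically, 2 anaerobically). The function name `glucose_needed` is hypothetical, and rounding up to whole glucose molecules is an assumption made for the illustration.

```python
import math

ATP_PER_GLUCOSE_AEROBIC = 38    # approximate maximum yield used in this chapter
ATP_PER_GLUCOSE_ANAEROBIC = 2


def glucose_needed(atp_required, atp_per_glucose):
    """Whole glucose molecules needed to supply a given number of ATP."""
    return math.ceil(atp_required / atp_per_glucose)


aerobic = glucose_needed(100, ATP_PER_GLUCOSE_AEROBIC)
anaerobic = glucose_needed(100, ATP_PER_GLUCOSE_ANAEROBIC)
print("glucose needed (aerobic):  ", aerobic)
print("glucose needed (anaerobic):", anaerobic)
```

Rounded up to whole molecules, the aerobic route needs 3 glucose molecules and the anaerobic route needs 50 to supply the same 100 ATP.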




Aerobic Respiration, Part 1: Glycolysis

You have read that nearly all of the energy used by living things comes to them in the bonds of the sugar, glucose. Glycolysis is the first step in the breakdown of glucose to extract energy for cell metabolism. Many living organisms carry out glycolysis as part of their metabolism. Glycolysis takes place in the cytoplasm of most prokaryotic and all eukaryotic cells.

Glycolysis begins with a molecule of glucose (C6H12O6). Various enzymes are used to break glucose down into two molecules of pyruvate (C3H4O3, basically a glucose molecule broken in half) (Figure 1). This process releases a small amount of energy.

Figure 1 An overview of glycolysis. In glycolysis, a glucose molecule is converted into two pyruvate molecules.

Glycolysis consists of two distinct phases: energy-requiring, and energy-producing.

Energy-Requiring Steps

The first part of the glycolysis pathway requires an input of energy to begin. The first step in glycolysis is catalyzed by hexokinase, an enzyme with broad specificity that catalyzes the phosphorylation of six-carbon sugars. Hexokinase phosphorylates (adds a phosphate to) glucose using ATP as the source of the phosphate (Figure 2). This produces glucose-6-phosphate, a more chemically reactive form of glucose. This phosphorylated glucose molecule can no longer leave the cell because the negatively charged phosphate will not allow it to cross the hydrophobic interior of the plasma membrane.

Several additional enzymatic reactions occur (Figure 2), one of which requires an additional ATP molecule. At the end of the energy-requiring steps, the original glucose has been split into two three-carbon molecules, and two ATPs have been used as sources of energy for this process.

Figure 2 The first half of glycolysis uses two ATP molecules in the phosphorylation of glucose, which is then split into two three-carbon molecules.

Energy-Producing Steps

So far, glycolysis has cost the cell two ATP molecules and produced two small, three-carbon sugar molecules. Both of these molecules will proceed through the second half of the pathway, and sufficient energy will be extracted to pay back the two ATP molecules used as an initial investment and produce a profit for the cell of two additional ATP molecules and two even higher-energy NADH molecules (Figure 3).

During the energy-producing steps, additional enzymes continue to catalyze the breakdown of glucose (Figure 3). The end result of these reactions is two 3-carbon molecules of pyruvate.

Figure 3 The second half of glycolysis involves phosphorylation without ATP investment (step 6) and produces two NADH and four ATP molecules per glucose.

An important rate-limiting step occurs at step 6 in glycolysis. If you look at Figure 3, you will notice that during step 6, NAD+ is converted into NADH. NADH contains more energy than NAD+, and is therefore a desired product from this reaction. However, the continuation of the reaction depends upon the availability of NAD+. Thus, NADH must be continuously converted back into NAD+ in order to keep this step going. If NAD+ is not available, the second half of glycolysis slows down or stops.

If oxygen is available in the system, the NADH will be converted readily back into NAD+ by the later processes in aerobic cellular respiration. However, if there is no oxygen available, NADH is not converted back into NAD+. Without NAD+, the reaction in step 6 cannot proceed and glycolysis slows or stops. In an environment without oxygen, an alternate pathway (fermentation) can provide the oxidation of NADH to NAD+.
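The dependence of step 6 on a steady supply of NAD+ can be illustrated with a toy counter. This is only a sketch. The function name, the pool size of 10 NAD+ molecules, and the assumption that all NADH is converted back immediately are invented for the illustration and are not measured values.

```python
def glucose_processed(glucose_available, nad_pool, regenerate_nad):
    """Toy model: count glucose molecules that get through step 6.

    Each glucose converts 2 NAD+ into 2 NADH. The pool size is an
    arbitrary illustrative number.
    """
    nad_plus, nadh = nad_pool, 0
    processed = 0
    for _ in range(glucose_available):
        if nad_plus < 2:
            break  # no NAD+ left, so step 6 (and glycolysis) stalls
        nad_plus -= 2
        nadh += 2
        processed += 1
        if regenerate_nad:
            # Aerobic respiration or fermentation converts NADH back to NAD+.
            nad_plus += nadh
            nadh = 0
    return processed


print("with NAD+ regeneration:   ", glucose_processed(100, 10, True))
print("without NAD+ regeneration:", glucose_processed(100, 10, False))
```

With regeneration, all 100 glucose molecules are processed. Without it, the pool of 10 NAD+ runs out after 5 glucose molecules and the pathway stops.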

Outcomes of Glycolysis

Glycolysis starts with glucose and ends with two pyruvate molecules, a total of four ATP molecules and two molecules of NADH. Two ATP molecules were used in the first half of the pathway to prepare the six-carbon ring for cleavage, so the cell has a net gain of two ATP molecules and 2 NADH molecules for its use. If the cell cannot catabolize (break down) the pyruvate molecules further, it will harvest only two ATP molecules from one molecule of glucose. Mature mammalian red blood cells are not capable of aerobic respiration—the process in which organisms convert energy in the presence of oxygen—and glycolysis is their sole source of ATP. If glycolysis is interrupted, these cells lose their ability to maintain their sodium-potassium pumps, and eventually, they die.

Section Summary

Glycolysis is the first pathway used in the breakdown of glucose to extract energy. It was probably one of the earliest metabolic pathways to evolve and is used by nearly all of the organisms on Earth. Glycolysis consists of two parts: The first part prepares the six-carbon ring of glucose for cleavage into two three-carbon sugars. ATP is invested in the process during this half to energize the separation. The second half of glycolysis extracts ATP and high-energy electrons from hydrogen atoms and attaches them to NAD+. Two ATP molecules are invested in the first half and four ATP molecules are formed by substrate-level phosphorylation during the second half. This produces a net gain of two ATP and two NADH molecules for the cell.

What was produced (per molecule of glucose)?

  • 2 pyruvate (3 carbon molecules), 2 NADH, net gain of 2 ATP
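The bookkeeping behind these totals can be written as a small ledger. This is a minimal sketch using the numbers stated in this section. The constant and function names are hypothetical and chosen only for the illustration.

```python
ATP_INVESTED = 2       # energy-requiring steps
ATP_PRODUCED = 4       # energy-producing steps
NADH_PRODUCED = 2
PYRUVATE_PRODUCED = 2


def glycolysis_yield(glucose_molecules):
    """Net products of glycolysis for a given number of glucose molecules."""
    return {
        "pyruvate": PYRUVATE_PRODUCED * glucose_molecules,
        "NADH": NADH_PRODUCED * glucose_molecules,
        "net ATP": (ATP_PRODUCED - ATP_INVESTED) * glucose_molecules,
    }


print(glycolysis_yield(1))
```

The net ATP entry is the four ATP made in the second half minus the two ATP invested in the first half.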


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016

OpenStax, Biology. OpenStax CNX. September 16, 2017


Aerobic Respiration, Part 2: Oxidation of Pyruvate and The Citric Acid Cycle

If oxygen is available, aerobic respiration will go forward. In eukaryotic cells, the pyruvate molecules produced at the end of glycolysis are transported into mitochondria (Figure 1), which are the sites of cellular respiration. In order for pyruvate, the product of glycolysis, to enter the next pathway, it must undergo several changes. The conversion is a three-step process.

Figure 1 Diagram of a human mitochondrion. Recall that mitochondria have two membranes: an inner and an outer membrane. Between the two membranes is a region known as the intermembrane space. The mitochondrial matrix is located inside the inner membrane. Photo credit PsChemp, Wikimedia.

Oxidation of Pyruvate

In eukaryotic cells, the pyruvate molecules produced at the end of glycolysis are transported into the mitochondrial matrix (the middle region of the mitochondria) (Figure 1). In the mitochondrial matrix, pyruvate will be transformed into a two-carbon acetyl group by removing a molecule of carbon dioxide. This also produces NADH. The acetyl group is picked up by a carrier compound called coenzyme A (CoA), which is made from vitamin B5. The resulting compound is called acetyl CoA (Figure 2). Acetyl CoA can be used in a variety of ways by the cell, but its major function is to deliver the acetyl group derived from pyruvate to the next pathway in glucose catabolism.

Figure 2 Upon entering the mitochondrial matrix, a multi-enzyme complex converts pyruvate into acetyl CoA. In the process, carbon dioxide is released and one molecule of NADH is formed.

Acetyl CoA to CO2

In the presence of oxygen, acetyl CoA delivers its acetyl group to a four-carbon molecule, oxaloacetate, to form citrate, a six-carbon molecule with three carboxyl groups; this pathway will harvest the remainder of the extractable energy from what began as a glucose molecule. This single pathway is called by different names: the citric acid cycle (for the first intermediate formed—citric acid, or citrate—when acetate joins to the oxaloacetate), the TCA cycle (since citric acid or citrate and isocitrate are tricarboxylic acids), and the Krebs cycle, after Hans Krebs, who first identified the steps in the pathway in the 1930s in pigeon flight muscles.

Like the conversion of pyruvate to acetyl CoA, the citric acid cycle in eukaryotic cells also takes place in the matrix of the mitochondria (Figure 1). Unlike glycolysis, the citric acid cycle is a closed loop: the last part of the pathway regenerates the compound used in the first step. The eight steps of the cycle are a series of chemical reactions that produce the following from each of the two molecules of pyruvate produced per molecule of glucose that originally went into glycolysis (Figure 3):

  • 2 carbon dioxide molecules
  • 1 ATP molecule (or an equivalent)
  • 3 NADH and 1 FADH2, which carry energy to the last part of the aerobic respiration pathway

Part of this is considered an aerobic pathway (oxygen-requiring) because the NADH and FADH2 produced must transfer their electrons to the next pathway in the system, which will use oxygen. If oxygen is not present, this transfer does not occur. The citric acid cycle does NOT occur in anaerobic respiration.

Two carbon atoms come into the citric acid cycle from each acetyl group. Two carbon dioxide molecules are released on each turn of the cycle; however, these do not contain the same carbon atoms contributed by the acetyl group on that turn of the pathway. The two acetyl-carbon atoms will eventually be released on later turns of the cycle; in this way, all six carbon atoms from the original glucose molecule will be eventually released as carbon dioxide. Carbon dioxide is a waste product in most animal cells and will be released outside the organism. It takes two turns of the cycle to process the equivalent of one glucose molecule. Each turn of the cycle forms three high-energy NADH molecules and one high-energy FADH2 molecule. These high-energy carriers will connect with the last portion of aerobic respiration to produce ATP molecules. One ATP (or an equivalent) is also made in each cycle. Several of the intermediate compounds in the citric acid cycle can be used in synthesizing non-essential amino acids; therefore, the cycle is both anabolic and catabolic.

Figure 3 In the citric acid cycle, the acetyl group from acetyl CoA is attached to a four-carbon oxaloacetate molecule to form a six-carbon citrate molecule. Through a series of steps, citrate is oxidized, releasing two carbon dioxide molecules for each acetyl group fed into the cycle. In the process, three NAD+ molecules are reduced to NADH, one FAD molecule is reduced to FADH2, and one ATP or GTP (depending on the cell type) is produced (by substrate-level phosphorylation). Because the final product of the citric acid cycle is also the first reactant, the cycle runs continuously in the presence of sufficient reactants. (credit: modification of work by “Yikrazuul”/Wikimedia Commons)

Section Summary

In the presence of oxygen, 3-carbon pyruvate is converted into a 2-carbon acetyl group, which is attached to a carrier molecule of coenzyme A. The resulting acetyl CoA can enter several pathways, but most often, the acetyl group is delivered to the citric acid cycle for further catabolism (breakdown). During the conversion of pyruvate into the acetyl group, a molecule of carbon dioxide and two high-energy electrons are removed. Because two pyruvate were produced from each molecule of glucose during glycolysis, the production of two carbon dioxide molecules (which are released as waste) accounts for two of the six carbons of the original glucose molecule. The other four carbons are released as carbon dioxide during two turns of the citric acid cycle. The electrons are picked up by NAD+, and the NADH carries the electrons to a later pathway for ATP production. At this point, the glucose molecule that originally entered cellular respiration has been completely broken down. Chemical potential energy stored within the glucose molecule has been transferred to electron carriers or has been used to synthesize a few ATPs.

What was produced (per molecule of glucose)?

  • Oxidation of pyruvate: 2 CO2, 2 NADH, 2 acetyl (2 carbon molecule)
  • Products of the citric acid cycle: 4 CO2, 6 NADH, 2 FADH2, 2 ATP
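These per-glucose totals follow from doubling the per-pyruvate and per-turn amounts given in this section, because each glucose yields two pyruvate. The sketch below tallies them and checks that all six glucose carbons leave as carbon dioxide. The variable names and the `scale` helper are hypothetical and used only for this illustration.

```python
from collections import Counter

# Amounts per pyruvate (oxidation step) and per turn (citric acid cycle),
# as stated in this section.
PYRUVATE_OXIDATION = Counter({"CO2": 1, "NADH": 1, "acetyl CoA": 1})
CITRIC_ACID_CYCLE_TURN = Counter({"CO2": 2, "NADH": 3, "FADH2": 1, "ATP": 1})

PYRUVATE_PER_GLUCOSE = 2  # glycolysis yields two pyruvate per glucose


def scale(products, times):
    """Multiply every count in a Counter by an integer."""
    return Counter({name: count * times for name, count in products.items()})


oxidation = scale(PYRUVATE_OXIDATION, PYRUVATE_PER_GLUCOSE)
cycle = scale(CITRIC_ACID_CYCLE_TURN, PYRUVATE_PER_GLUCOSE)

print("oxidation of pyruvate:", dict(oxidation))
print("citric acid cycle:    ", dict(cycle))
print("total CO2 released:   ", oxidation["CO2"] + cycle["CO2"])
```

Two CO2 from pyruvate oxidation plus four from two turns of the cycle account for the six carbons of the original glucose.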


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016

OpenStax, Biology. OpenStax CNX. September 16, 2017. 


Aerobic Respiration, Part 3: Oxidative Phosphorylation

You have just read about two pathways in glucose catabolism—glycolysis and the citric acid cycle—that generate ATP. Most of the ATP generated during the aerobic catabolism of glucose, however, is not generated directly from these pathways. Rather, it derives from a process that begins with passing electrons through a series of chemical reactions to a final electron acceptor, oxygen. This is the only place in aerobic respiration where O2 is actually required. These reactions take place in specialized protein complexes located in the inner membrane of the mitochondria of eukaryotic organisms and on the inner part of the cell membrane of prokaryotic organisms. The energy of the electrons is used to generate ATP. The entirety of this process is called oxidative phosphorylation.

During oxidative phosphorylation:

Electron Transport Chain

The electron transport chain (Figure 1) is the last component of aerobic respiration and is the only part of metabolism that uses atmospheric oxygen. Oxygen continuously diffuses into plants for this purpose. In animals, oxygen enters the body through the respiratory system. Electron transport is a series of chemical reactions that resembles a bucket brigade in that electrons are passed rapidly from one component to the next, to the endpoint of the chain where oxygen is the final electron acceptor and water is produced. There are four complexes composed of proteins, labeled I through IV in Figure 1, and the aggregation of these four complexes, together with associated mobile, accessory electron carriers, is called the electron transport chain. The electron transport chain is present in multiple copies in the inner mitochondrial membrane of eukaryotes and in the plasma membrane of prokaryotes. In each transfer of an electron through the electron transport chain, the electron loses energy, but with some transfers, the energy is stored as potential energy by using it to pump hydrogen ions (H+, protons) across the inner mitochondrial membrane into the intermembrane space, creating an electrochemical gradient. An electrochemical gradient consists of two parts: a difference in solute concentration across the membrane combined with a difference in charge across the membrane. Here, the electrochemical gradient is made up of a higher concentration of H+ in the intermembrane space compared to the mitochondrial matrix.

Figure 1 The electron transport chain is a series of electron transporters embedded in the inner mitochondrial membrane that shuttles electrons from NADH and FADH2 to molecular oxygen. In the process, protons are pumped from the mitochondrial matrix to the intermembrane space, and oxygen is reduced to form water.

Electrons from NADH and FADH2 are passed to protein complexes in the electron transport chain. As they are passed from one complex to another (there are a total of four), the electrons lose energy, and some of that energy is used to pump hydrogen ions from the mitochondrial matrix into the intermembrane space. In the fourth protein complex, the electrons are accepted by oxygen, the terminal acceptor. The oxygen with its extra electrons then combines with two hydrogen ions, further enhancing the electrochemical gradient, to form water. If there were no oxygen present in the mitochondrion, the electrons could not be removed from the system, and the entire electron transport chain would back up and stop. The mitochondria would be unable to generate new ATP in this way, and the cell would ultimately die from lack of energy. This is the reason we must breathe to draw in new oxygen. This is the only place where oxygen is required during the processes of aerobic respiration.

In the electron transport chain, the free energy from the series of reactions just described is used to pump hydrogen ions across the membrane. The uneven distribution of H+ ions across the membrane establishes an electrochemical gradient, owing to the H+ ions’ positive charge and their higher concentration on one side of the membrane.

Hydrogen ions diffuse from the intermembrane space through the inner membrane into the mitochondrial matrix through an integral membrane protein called ATP synthase (Figure 2). This complex protein acts as a tiny generator, turned by the force of the hydrogen ions diffusing through it down their electrochemical gradient, from the intermembrane space, where there are many mutually repelling hydrogen ions, to the matrix, where there are few. The turning of the parts of this molecular machine regenerates ATP from ADP. This flow of hydrogen ions across the membrane through ATP synthase is called chemiosmosis.

Figure 2 ATP synthase is a complex, molecular machine that uses a proton (H+) gradient to form ATP from ADP and inorganic phosphate (Pi). (Credit: modification of work by Klaus Hoffmeier)

Chemiosmosis (Figure 2) is used to generate 90 percent of the ATP made during aerobic glucose catabolism. The result of the reactions is the production of ATP from the energy of the electrons removed from hydrogen atoms. These atoms were originally part of a glucose molecule. At the end of the electron transport system, the electrons are used to reduce an oxygen molecule to oxygen ions. The extra electrons on the oxygen ions attract hydrogen ions (protons) from the surrounding medium, and water is formed. The electron transport chain and the production of ATP through chemiosmosis are collectively called oxidative phosphorylation (Figure 3).

Figure 3 In oxidative phosphorylation, the pH gradient formed by the electron transport chain is used by ATP synthase to form ATP.

ATP Yield

The number of ATP molecules generated from the catabolism of glucose varies. For example, the number of hydrogen ions that the electron transport chain complexes can pump through the membrane varies between species. Another source of variance stems from the shuttle of electrons across the membranes of the mitochondria, because the NADH generated from glycolysis cannot easily enter mitochondria. Thus, electrons are picked up on the inside of mitochondria by either NAD+ or FAD. As you have learned earlier, electrons carried by FADH2 cause fewer hydrogen ions to be pumped; consequently, fewer ATP molecules are generated when FAD acts as the carrier. NAD+ is used as the electron transporter in the liver, and FAD acts in the brain.
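
The effect of the shuttle on ATP yield can be sketched with simple arithmetic. The numbers below are assumptions for illustration and are not given in this chapter: 2 NADH from glycolysis, 8 NADH and 2 FADH2 from the later stages, and the classic textbook estimates of about 3 ATP per NADH and 2 ATP per FADH2 (more recent estimates are closer to 2.5 and 1.5). The function name is hypothetical.

```python
# Illustrative estimate of ATP made by oxidative phosphorylation per glucose.
# Assumed values (not from this chapter): classic ATP-per-carrier estimates.
ATP_PER_NADH = 3
ATP_PER_FADH2 = 2


def oxidative_atp(shuttle_carrier):
    """Estimate ATP per glucose from the electron transport chain.

    shuttle_carrier: "NAD" if electrons from glycolytic NADH are handed to
    NAD+ inside the mitochondrion, "FAD" if they are handed to FAD.
    """
    glycolytic_nadh = 2      # made in the cytoplasm, cannot enter directly
    mitochondrial_nadh = 8   # made inside the mitochondrion
    fadh2 = 2                # made in the citric acid cycle

    if shuttle_carrier == "NAD":
        nadh = mitochondrial_nadh + glycolytic_nadh
    elif shuttle_carrier == "FAD":
        nadh = mitochondrial_nadh
        fadh2 += glycolytic_nadh
    else:
        raise ValueError("shuttle_carrier must be 'NAD' or 'FAD'")

    return nadh * ATP_PER_NADH + fadh2 * ATP_PER_FADH2


for carrier in ("NAD", "FAD"):
    print(carrier, oxidative_atp(carrier))
```

Under these assumed ratios, the FAD shuttle gives a slightly lower total than the NAD+ shuttle, which is one reason the yield is usually quoted as a range.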

Another factor that affects the yield of ATP molecules generated from glucose is the fact that intermediate compounds in these pathways are used for other purposes. Glucose catabolism connects with the pathways that build or break down all other biochemical compounds in cells, and the result is somewhat messier than the ideal situations described thus far. For example, sugars other than glucose are fed into the glycolytic pathway for energy extraction. Moreover, the five-carbon sugars that form nucleic acids are made from intermediates in glycolysis. Certain nonessential amino acids can be made from intermediates of both glycolysis and the citric acid cycle. Lipids, such as cholesterol and triglycerides, are also made from intermediates in these pathways, and both amino acids and triglycerides are broken down for energy through these pathways. Overall, in living systems, these pathways of glucose catabolism extract about 34 percent of the energy contained in glucose.
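
A figure of about 34 percent can be reproduced with a rough calculation. This is a sketch under assumed values that do not appear in this chapter: about 686 kcal released per mole of glucose fully oxidized, about 7.3 kcal stored per mole of ATP, and a total of 32 ATP per glucose. The function name is hypothetical.

```python
# Rough energy-capture estimate for glucose catabolism.
# Assumed standard textbook values (not stated in this chapter):
GLUCOSE_KCAL_PER_MOL = 686.0  # energy released by complete oxidation of glucose
ATP_KCAL_PER_MOL = 7.3        # energy stored per mole of ATP formed


def capture_efficiency(atp_per_glucose):
    """Fraction of the energy in glucose that ends up stored in ATP."""
    return atp_per_glucose * ATP_KCAL_PER_MOL / GLUCOSE_KCAL_PER_MOL


for atp in (30, 32, 36):
    print(atp, "ATP:", round(100 * capture_efficiency(atp), 1), "percent")
```

Because the ATP count itself varies, the efficiency is best read as an approximate value; the remaining energy is released as heat.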

Section Summary

The electron transport chain is the portion of aerobic respiration that uses free oxygen as the final electron acceptor of the electrons removed from the intermediate compounds in glucose catabolism. The electron transport chain is composed of four large, multiprotein complexes embedded in the inner mitochondrial membrane and two small diffusible electron carriers shuttling electrons between them. The electrons are passed through a series of reactions, with a small amount of free energy used at three points to transport hydrogen ions across a membrane. This process contributes to the gradient used in chemiosmosis. The electrons passing through the electron transport chain gradually lose energy until eventually they are donated to oxygen gas which accepts two protons (H+) and is converted into water. The end products of the electron transport chain are water and roughly 30-34 molecules of ATP. A number of intermediate compounds of the citric acid cycle can be diverted into the anabolism of other biochemical molecules, such as nonessential amino acids, sugars, and lipids. These same molecules can serve as energy sources for the glucose pathways.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016

OpenStax, Biology. OpenStax CNX. September 16, 2017


Metabolism without Oxygen: Fermentation

In aerobic respiration, the final electron acceptor for the electron transport chain is an oxygen molecule, O2. If aerobic respiration occurs, then approximately 30 molecules of ATP will be produced during the electron transport chain and chemiosmosis using the energy of the high-energy electrons carried by NADH or FADH2 to the electron transport chain. When NADH or FADH2 give their high energy electrons to the electron transport chain, NAD+ and FAD are regenerated. These low energy molecules cycle back to glycolysis and/or the citric acid cycle, where they pick up more high energy electrons and allow the process to continue.

Glycolysis and the citric acid cycle cannot occur if there is no NAD+ present to pick up electrons as the reactions proceed. When oxygen is present, this isn’t a problem: all of the NADH and FADH2 produced during glycolysis and the citric acid cycle are converted back into NAD+ and FAD by the electron transport chain. When no oxygen is present, the electron transport chain can’t run because there is no oxygen to act as the final electron acceptor. This means that the ETC will not be accepting electrons from NADH as its source of power, so NAD+ will not be regenerated. Both glycolysis and the citric acid cycle require NAD+ to accept electrons during their chemical reactions. In order for the cell to continue to generate any ATP, NADH must be converted back to NAD+ for use as an electron carrier. Anaerobic processes use different mechanisms, but all function to convert NADH back into NAD+.

How is this done?

Both of these methods are called anaerobic cellular respiration. They do not require oxygen to achieve NAD+ regeneration and enable organisms to convert energy for their use in the absence of oxygen.

During anaerobic respiration, only glycolysis occurs. The 2 molecules of NADH that are generated during glycolysis are then converted back into NAD+ during anaerobic respiration so that glycolysis can continue. Since glycolysis produces a net of only 2 ATP per glucose, anaerobic respiration is much less efficient than aerobic respiration (2 ATP molecules compared to roughly 36 ATP molecules). However, 2 ATP molecules is much better for a cell than 0 ATP molecules. In anaerobic situations, the cell needs to continue performing glycolysis to generate 2 ATP per glucose because if a cell is not generating any ATP, it will die.
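
Why NAD+ regeneration matters can be shown with a toy model. This is a minimal sketch, not a biochemical simulation: it assumes each glucose consumes 2 NAD+ and yields 2 NADH and a net of 2 ATP, and the pool size and function name are hypothetical.

```python
def run_glycolysis(glucose_molecules, nad_pool, regenerate_nad):
    """Count the ATP made before glycolysis stalls for lack of NAD+.

    glucose_molecules: how many glucose molecules are available
    nad_pool: starting number of NAD+ molecules (hypothetical pool size)
    regenerate_nad: True if NADH is converted back into NAD+ after each
        round (as fermentation does), False if it is not
    """
    nad = nad_pool
    nadh = 0
    atp = 0
    for _ in range(glucose_molecules):
        if nad < 2:
            break            # no NAD+ left to accept electrons
        nad -= 2             # glycolysis reduces 2 NAD+ ...
        nadh += 2            # ... to 2 NADH
        atp += 2             # net gain of 2 ATP per glucose
        if regenerate_nad:
            nad += nadh      # NADH is converted back into NAD+
            nadh = 0
    return atp


print(run_glycolysis(10, nad_pool=4, regenerate_nad=False))
print(run_glycolysis(10, nad_pool=4, regenerate_nad=True))
```

Without regeneration, the small NAD+ pool is used up after two glucose molecules and ATP production stops; with regeneration, all ten glucose molecules can be processed.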

Note that the only part of aerobic respiration that physically uses oxygen is the electron transport chain. However, the citric acid cycle can not occur in the absence of oxygen because there is no way to regenerate the NAD+ used during this process.

Lactic Acid Fermentation

The fermentation method used by animals and some bacteria like those in yogurt is lactic acid fermentation (Figure 1). This occurs routinely in mammalian red blood cells and in skeletal muscle that does not have enough oxygen to allow aerobic respiration to continue (such as in muscles after hard exercise). The chemical reaction of lactic acid fermentation is the following:

Pyruvic acid + NADH ↔ lactic acid + NAD+

The build-up of lactic acid causes muscle stiffness and fatigue. In muscles, lactic acid produced by fermentation must be removed by the blood circulation and brought to the liver for further metabolism. Once the lactic acid has been removed from the muscle and is circulated to the liver, it can be converted back to pyruvic acid and further catabolized (broken down) for energy.

Note that the purpose of this process is not to produce lactic acid, which is a byproduct that must be cleared from the muscle. The purpose is to convert NADH back into NAD+ so that glycolysis can continue and the cell can produce 2 ATP per glucose.

Figure 1 Lactic acid fermentation is common in muscles that have become exhausted by use.

Alcohol Fermentation

Another familiar fermentation process is alcohol fermentation (Figure 2), which produces ethanol, an alcohol. The alcohol fermentation reaction is the following:

Pyruvic acid → CO2 + acetaldehyde

Acetaldehyde + NADH → ethanol + NAD+

Figure 2 The reaction resulting in alcohol fermentation is shown.

The fermentation of pyruvic acid by yeast produces the ethanol found in alcoholic beverages (Figure 3). If the carbon dioxide produced by the reaction is not vented from the fermentation chamber, for example in beer and sparkling wines, it remains dissolved in the medium until the pressure is released. Ethanol above 12 percent is toxic to yeast, so natural levels of alcohol in wine occur at a maximum of 12 percent.

Figure 3 Fermentation of grape juice to make wine produces CO2 as a byproduct. Fermentation tanks have valves so that pressure inside the tanks can be released.

Again, the purpose of this process is not to produce ethanol, but rather to convert NADH back into NAD+ so that glycolysis can continue.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

Text adapted from: OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016


Metabolism of molecules other than glucose

You have learned about the catabolism of glucose, which provides energy to living cells. But living things consume more than just glucose for food. How does a turkey sandwich, which contains various carbohydrates, lipids, and protein, provide energy to your cells?

Basically, all of these molecules from food are converted into molecules that can enter the cellular respiration pathway somewhere. Some molecules enter at glycolysis, while others enter at the citric acid cycle. This means that all of the catabolic pathways for carbohydrates, proteins, and lipids eventually connect into glycolysis and the citric acid cycle pathways. Metabolic pathways should be thought of as porous—that is, substances enter from other pathways, and other substances leave for other pathways. These pathways are not closed systems. Many of the products in a particular pathway are reactants in other pathways.


So far, we have discussed the carbohydrate from which organisms derive the majority of their energy: glucose. Many carbohydrate molecules can be broken down into glucose or otherwise processed into glucose by the body. Glycogen, a polymer of glucose, is a short-term energy storage molecule in animals (Figure 1). When there is plenty of ATP present, the extra glucose is converted into glycogen for storage. Glycogen is made and stored in the liver and muscle. Glycogen will be taken out of storage if blood sugar levels drop. The presence of glycogen in muscle cells as a source of glucose allows ATP to be produced for a longer time during exercise.


Figure 1 Glycogen is made of many molecules of glucose attached together into branching chains. Each of the balls in the bottom diagram represents one molecule of glucose. (Credit: Glycogen by BorisTM. This work has been released into the public domain)

Most other carbohydrates enter the cellular respiration pathway during glycolysis. For example, sucrose is a disaccharide made from glucose and fructose bonded together. Sucrose is broken down in the small intestine. The glucose enters the beginning of glycolysis as previously discussed, while fructose can be slightly modified and enter glycolysis at the third step. Lactose, the disaccharide sugar found in milk, can be broken down by lactase enzyme into two smaller sugars: galactose and glucose. Like fructose, galactose can be slightly modified to enter glycolysis.

Because these carbohydrates enter near the beginning of glycolysis, their catabolism (breakdown) produces the same number of ATP molecules as glucose.


Proteins are broken down by a variety of enzymes in cells. Most of the time, amino acids are recycled into new proteins and not used as a source of energy. This is because it is more energy efficient to reuse amino acids rather than making new ones from scratch. The body will use protein as a source of energy if:

When proteins are used in the cellular respiration pathway, they are first broken down into individual amino acids. The amino group from each amino acid is removed (deaminated) and is converted into ammonia. In mammals, the liver synthesizes urea from two ammonia molecules and a carbon dioxide molecule. Thus, urea is the principal waste product in mammals from the nitrogen originating in amino acids, and it leaves the body in urine.

Once the amino acid has been deaminated, its chemical properties determine which intermediate of the cellular respiration pathway it will be converted into. These intermediates enter cellular respiration at various places in the Citric Acid Cycle (Figure 2).

Figure 2 The carbon skeletons of certain amino acids (indicated in boxes) derived from proteins can feed into the citric acid cycle. (credit: modification of work by Mikael Häggström)


Triglycerides (fats) are a form of long-term energy storage in animals. Triglycerides store about twice as much energy as carbohydrates. Triglycerides are made of glycerol and three fatty acids. Glycerol can enter glycolysis. Fatty acids are broken into two-carbon units that enter the citric acid cycle (Figure 3).

Figure 3 Glycogen from the liver and muscles, together with fats, can feed into the catabolic pathways for carbohydrates.

Remember that if oxygen is not available, glycolysis can occur but not the citric acid cycle or oxidative phosphorylation. Since fatty acids enter the pathway at the citric acid cycle, they can not be broken down in the absence of oxygen. This means that if cells are not performing aerobic cellular respiration, the body can not burn fat for energy. This is why posters about the “Fat Burning Zone” in a gym specify that you need to have a lower heart rate / breathing rate to burn more fat – cells that are not doing aerobic respiration can’t burn fat for fuel!


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

Text adapted from: OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016


Anaerobic Cellular Respiration in Prokaryotes

Certain prokaryotes, including some species of bacteria and Archaea, use anaerobic respiration. For example, the group of Archaea called methanogens reduces carbon dioxide to methane to oxidize NADH. These microorganisms are found in soil and in the digestive tracts of ruminants, such as cows and sheep. Similarly, sulfate-reducing bacteria and Archaea, most of which are anaerobic (Figure 8), reduce sulfate to hydrogen sulfide to regenerate NAD+ from NADH.

Figure 8 The green color seen in these coastal waters is from an eruption of hydrogen sulfide. Anaerobic, sulfate-reducing bacteria release hydrogen sulfide gas as they decompose algae in the water. (credit: NASA image courtesy Jeff Schmaltz, MODIS Land Rapid Response Team at NASA GSFC)

Other fermentation methods occur in bacteria. Many prokaryotes are facultatively anaerobic. This means that they can switch between aerobic respiration and fermentation, depending on the availability of oxygen. Certain prokaryotes, like Clostridia bacteria, are obligate anaerobes. Obligate anaerobes live and grow in the absence of molecular oxygen. Oxygen is a poison to these microorganisms and kills them upon exposure. It should be noted that all forms of fermentation, except lactic acid fermentation, produce gas. The production of particular types of gas is used as an indicator of the fermentation of specific carbohydrates, which plays a role in the laboratory identification of the bacteria. The various methods of fermentation are used by different organisms to ensure an adequate supply of NAD+ for the sixth step in glycolysis. Without these pathways, that step would not occur, and no ATP would be harvested from the breakdown of glucose.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

Text adapted from: OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016



Learning Objectives

By the end of this section, you will be able to:

  • Compare energy-generating processes within different types of cells

All living organisms on earth consist of one or more cells. Each cell runs on the chemical energy found mainly in carbohydrate molecules (food), and the majority of these molecules are produced by one process: photosynthesis. Through photosynthesis, certain organisms convert solar energy (sunlight) into chemical energy, which is then used to build carbohydrate molecules. The energy used to hold these molecules together is released when an organism breaks down food. Cells then use this energy to perform work, such as cellular respiration.

The energy that is harnessed from photosynthesis enters the ecosystems of our planet continuously and is transferred from one organism to another. Therefore, directly or indirectly, the process of photosynthesis provides most of the energy required by living things on earth.

Photosynthesis also results in the release of oxygen into the atmosphere. In short, to eat and breathe, humans depend almost entirely on the organisms that carry out photosynthesis.



Putting photosynthesis into context

All living things require energy. Carbohydrates are storage molecules for energy. Living things access energy by breaking down carbohydrate molecules during the process of cellular respiration.  Plants produce carbohydrates during photosynthesis. So if plants make carbohydrate molecules during photosynthesis, do they also perform cellular respiration? The answer is yes, they do. Although energy can be stored in molecules like ATP, carbohydrates (and lipids, which can also enter cellular respiration as a source of energy) are much more stable and efficient reservoirs for chemical energy. Photosynthetic organisms also carry out the reactions of respiration to harvest the energy that they have stored in carbohydrates during photosynthesis. Plants have mitochondria in addition to chloroplasts.

The overall reaction for photosynthesis:

6CO2 + 6H2O → C6H12O6 + 6O2

is the reverse of the overall reaction for cellular respiration:

6O2 + C6H12O6 → 6CO2 + 6H2O
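
That these two equations are balanced, and are mirror images of each other, can be checked by counting atoms on each side. The sketch below is illustrative; the helper name and data layout are hypothetical.

```python
from collections import Counter

# Atom counts for each molecule in the two overall equations.
CO2 = Counter({"C": 1, "O": 2})
H2O = Counter({"H": 2, "O": 1})
GLUCOSE = Counter({"C": 6, "H": 12, "O": 6})
O2 = Counter({"O": 2})


def total_atoms(side):
    """Sum atoms over a list of (coefficient, molecule) pairs."""
    total = Counter()
    for coefficient, molecule in side:
        for element, count in molecule.items():
            total[element] += coefficient * count
    return total


# Photosynthesis: 6CO2 + 6H2O -> C6H12O6 + 6O2
photosynthesis_reactants = [(6, CO2), (6, H2O)]
photosynthesis_products = [(1, GLUCOSE), (6, O2)]

# Cellular respiration is the same equation read in the other direction.
respiration_reactants = photosynthesis_products
respiration_products = photosynthesis_reactants

print(dict(total_atoms(photosynthesis_reactants)))
print(dict(total_atoms(photosynthesis_products)))
```

Both sides contain 6 carbon, 12 hydrogen, and 18 oxygen atoms, so no atoms are created or lost in either direction.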

Photosynthesis produces oxygen as a byproduct, and respiration produces carbon dioxide as a byproduct. In nature, there is no such thing as waste. Every single atom of matter is conserved, recycling indefinitely. Substances change form or move from one type of molecule to another, but never disappear (Figure 1).

CO2 is no more a form of waste produced by respiration than oxygen is a waste product of photosynthesis. Both are byproducts of reactions that move on to other reactions. Photosynthesis absorbs energy from sunlight to build carbohydrates in the chloroplasts, and aerobic cellular respiration releases that stored energy in the mitochondria by using oxygen to break down carbohydrates. Both chloroplasts and mitochondria use electron transport chains to generate the energy necessary to drive other reactions. Photosynthesis and cellular respiration function in a biological cycle, allowing organisms to access life-sustaining energy that originates millions of miles away in a star.

Figure 1 In the carbon cycle, the reactions of photosynthesis and cellular respiration share reciprocal reactants and products. (credit: modification of work by Stuart Bassil)

There are two basic parts of photosynthesis: the light-dependent reactions and the light-independent reactions (also known as the Calvin cycle). During the light-dependent reactions, the energy from sunlight is stored in energy carrier molecules. These energy carrier molecules are then used to power the reactions of the Calvin cycle, where CO2 molecules are joined together to produce carbohydrates such as glucose.

Figure 2 Overview of the process of photosynthesis. During the light-dependent reactions, the energy from sunlight is used by the chloroplast to create energy molecules: ATP and NADPH. These energy molecules power the Calvin cycle, which creates carbohydrates (G3P) from CO2.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

Text adapted from: OpenStax, Biology. OpenStax CNX. November 11, 2017.


The structure of the chloroplast

In plants, photosynthesis takes place primarily in leaves, which consist of many layers of cells and have differentiated top and bottom sides. The process of photosynthesis occurs not on the surface layers of the leaf, but rather in a middle layer called the mesophyll (Figure 1).

Figure 1 Not all cells of a leaf carry out photosynthesis. Cells within the middle layer of a leaf have chloroplasts, which contain the photosynthetic apparatus. (credit Zephyris; wikimedia)

The gas exchange of carbon dioxide and oxygen occurs through small, regulated openings called stomata.

Figure 2 Tomato leaf stomate (singular of stomata). Photo credit: Photohound; Wikimedia; Public Domain.

In eukaryotes, photosynthesis takes place inside an organelle called a chloroplast. Some prokaryotes can perform photosynthesis, but they do not contain chloroplasts (or other membrane-bound organelles). In plants, chloroplast-containing cells exist in the mesophyll. Chloroplasts are surrounded by a double membrane similar to the double membrane found within a mitochondrion. Within the chloroplast is a third membrane that forms stacked, disc-shaped structures called thylakoids. Embedded in the thylakoid membrane are molecules of chlorophyll, a pigment (a molecule that absorbs light) through which the entire process of photosynthesis begins. Chlorophyll is responsible for the green color of plants. The thylakoid membrane encloses an internal space called the thylakoid lumen or space. Other types of pigments are also involved in photosynthesis, but chlorophyll is by far the most important. As shown in Figure 3, a stack of thylakoids is called a granum, and the space surrounding the granum is called stroma (not to be confused with stomata, the openings on the leaves).

Figure 3 Structure of the chloroplast. Note that the chloroplast is surrounded by a double membrane, but also contains a third set of membranes, which enclose the thylakoids.

Just as the structure of the mitochondrion is important to its ability to perform aerobic cellular respiration, the structure of the chloroplast allows the process of photosynthesis to take place. Both the light-dependent reactions and the Calvin cycle take place inside the chloroplast.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

Text adapted from: OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016


Light and Pigments

How can light be used to make food? It is easy to think of light as something that exists and allows living organisms, such as humans, to see, but light is a form of energy. Like all energy, light can travel, change form, and be harnessed to do work. In the case of photosynthesis, light energy is transformed into chemical energy, which autotrophs use to build carbohydrate molecules. However, autotrophs only use a specific component of sunlight (Figure 1).

Figure 1 Autotrophs can capture light energy from the sun, converting it into chemical energy used to build food molecules. (credit: modification of work by Gerry Atwell, U.S. Fish and Wildlife Service)

What Is Light Energy?

The sun emits an enormous amount of electromagnetic radiation (solar energy). Humans can see only a fraction of this energy, which is referred to as “visible light.” The manner in which solar energy travels can be described and measured as waves. Scientists can determine the amount of energy of a wave by measuring its wavelength, the distance between two consecutive, similar points in a series of waves, such as from crest to crest or trough to trough (Figure 2).

Figure 2 The wavelength of a single wave is the distance between two consecutive points along the wave.

Visible light constitutes only one of many types of electromagnetic radiation emitted from the sun. The electromagnetic spectrum is the range of all possible wavelengths of radiation (Figure 3). Each wavelength corresponds to a different amount of energy carried.

Figure 3 The sun emits energy in the form of electromagnetic radiation. This radiation exists in different wavelengths, each of which has its own characteristic energy. Visible light is one type of energy emitted from the sun.

Each type of electromagnetic radiation has a characteristic range of wavelengths. The longer the wavelength (or the more stretched out it appears), the less energy is carried. Short, tight waves carry the most energy. This may seem illogical, but think of it in terms of a piece of moving rope. It takes little effort by a person to move a rope in long, wide waves. To make a rope move in short, tight waves, a person would need to apply significantly more energy.
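
The inverse relationship between wavelength and energy can be made concrete with the standard physics relation E = hc/λ, which is not stated in this chapter. The sketch below computes the energy of a single photon; the function name and the two sample wavelengths (roughly violet and red light) are chosen for illustration.

```python
# Energy of one photon from its wavelength, using E = h * c / wavelength.
PLANCK_J_S = 6.62607015e-34      # Planck constant, joule-seconds
LIGHT_SPEED_M_S = 299_792_458    # speed of light, meters per second


def photon_energy_joules(wavelength_nm):
    """Return the energy in joules of one photon of the given wavelength."""
    wavelength_m = wavelength_nm * 1e-9  # convert nanometers to meters
    return PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m


for wavelength in (400, 700):  # approximate violet and red ends of visible light
    print(wavelength, "nm:", photon_energy_joules(wavelength), "J")
```

A 400 nm photon carries 1.75 times the energy of a 700 nm photon, matching the rule that shorter waves carry more energy.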

The sun emits a broad range of electromagnetic radiation, including X-rays and ultraviolet (UV) rays (Figure 3). The higher-energy waves are dangerous to living things; for example, X-rays and UV rays can be harmful to humans.

Absorption of Light

Light energy enters the process of photosynthesis when pigments absorb the light. In plants, pigment molecules absorb only visible light for photosynthesis. The visible light seen by humans as white light actually exists in a rainbow of colors. Certain objects, such as a prism or a drop of water, disperse white light to reveal these colors to the human eye. The visible light portion of the electromagnetic spectrum is perceived by the human eye as a rainbow of colors, with violet and blue having shorter wavelengths and, therefore, higher energy. At the other end of the spectrum toward red, the wavelengths are longer and have lower energy.

The wavelengths of light that are reflected from an object and bounce off are detected by our eyes. The wavelengths of light that are absorbed by an object do not make it to our eyes. This means that the color an object appears is due to the wavelengths that are reflected and not those that are absorbed. For example, the apple in Figure 4 appears red (assuming you are not color-blind). This is because the red wavelengths of light are reflected off the apple and the other wavelengths (yellow, green, blue, purple) are absorbed by the apple.

Figure 4 This apple appears red because it is reflecting the red wavelengths of light. Other wavelengths are absorbed by the apple.

Understanding Pigments

Different kinds of pigments exist, and each absorbs only certain wavelengths (colors) of visible light. Pigments reflect the color of the wavelengths that they cannot absorb. All photosynthetic organisms contain a pigment called chlorophyll a, which humans see as the common green color associated with plants. Chlorophyll a absorbs wavelengths from either end of the visible spectrum (blue and red), but not from green. Because green is reflected, chlorophyll appears green.

Other pigment types include chlorophyll b (which absorbs blue and red-orange light) and the carotenoids. Each type of pigment can be identified by the specific pattern of wavelengths it absorbs from visible light, which is its absorption spectrum.
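
The idea that a pigment's color comes from the wavelengths it does not absorb can be sketched with a toy model. The five color bands below are a deliberate simplification of the visible spectrum, the assignment of "red-orange" to the orange and red bands is an assumption, and the function name is hypothetical.

```python
# Coarse color bands standing in for the visible spectrum (a simplification).
VISIBLE_BANDS = {"blue", "green", "yellow", "orange", "red"}

# Bands absorbed by each pigment, following the description in the text.
# Treating "red-orange" as both orange and red is an assumption.
ABSORBED_BANDS = {
    "chlorophyll a": {"blue", "red"},
    "chlorophyll b": {"blue", "orange", "red"},
}


def reflected_bands(pigments):
    """Return the bands that none of the given pigments absorb."""
    absorbed = set()
    for pigment in pigments:
        absorbed |= ABSORBED_BANDS[pigment]
    return VISIBLE_BANDS - absorbed


print(sorted(reflected_bands(["chlorophyll a"])))
print(sorted(reflected_bands(["chlorophyll a", "chlorophyll b"])))
```

In this toy model, green is always among the reflected bands, and combining the two chlorophylls leaves fewer bands unabsorbed than chlorophyll a alone.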

Many photosynthetic organisms have a mixture of pigments; between them, the organism can absorb energy from a wider range of visible-light wavelengths. Not all photosynthetic organisms have full access to sunlight. Some organisms grow underwater where light intensity decreases with depth, and certain wavelengths are absorbed by the water. Other organisms grow in competition for light. Plants on the rainforest floor must be able to absorb any bit of light that comes through, because the taller trees block most of the sunlight (Figure 5).

Figure 5 Plants that commonly grow in the shade benefit from having a variety of light-absorbing pigments. Each pigment can absorb different wavelengths of light, which allows the plant to absorb any light that passes through the taller trees. (credit: Jason Hollinger)


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

Text adapted from: OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016


The Light-Dependent Reactions

Photosynthesis takes place in two stages: the light-dependent reactions and the Calvin cycle. In the light-dependent reactions, which take place at the thylakoid membrane, chlorophyll absorbs energy from sunlight and then converts it into chemical energy with the use of water. The light-dependent reactions release oxygen as a byproduct as water is broken apart. In the Calvin cycle, which takes place in the stroma, the chemical energy derived from the light-dependent reactions drives both the capture of carbon in carbon dioxide molecules and the subsequent assembly of sugar molecules.

The two reactions use carrier molecules to transport the energy from one to the other. The carriers that move energy from the light-dependent reactions to the Calvin cycle reactions can be thought of as “full” because they bring energy. After the energy is released, the “empty” energy carriers return to the light-dependent reactions to obtain more energy. You should be familiar with the energy carrier molecules used during cellular respiration: NADH and FADH2. Photosynthesis uses a different energy carrier, NADPH, but it functions in a comparable way. The lower energy form, NADP+, picks up a high energy electron and a proton and is converted to NADPH. When NADPH gives up its electron, it is converted back to NADP+.

How the Light-Dependent Reactions Work

The overall purpose of the light-dependent reactions is to convert solar energy into chemical energy in the form of NADPH and ATP. This chemical energy will be used by the Calvin cycle to fuel the assembly of sugar molecules.

The light-dependent reactions begin in a grouping of pigment molecules and proteins called a photosystem. There are two photosystems (photosystem I and II), which exist in the membranes of thylakoids. Both photosystems have the same basic structure: a number of antenna proteins, to which chlorophyll molecules are bound, surround the reaction center where the photochemistry takes place. Each photosystem is serviced by the light-harvesting complex, which passes energy from sunlight to the reaction center. The light-harvesting complex consists of multiple antenna proteins that contain a mixture of 300–400 chlorophyll a and b molecules as well as other pigments like carotenoids.

A photon of light energy travels until it reaches a molecule of chlorophyll pigment. The absorption of a single photon, a distinct quantity or “packet” of light, by any of the chlorophylls pushes that molecule into an excited state. In short, the light energy has now been captured by biological molecules but is not stored in any useful form yet. The energy is transferred from chlorophyll to chlorophyll until eventually (after about a millionth of a second) it is delivered to the reaction center. Up to this point, only energy has been transferred between molecules, not electrons. In the reaction center, the energy “excites” an electron in a chlorophyll molecule enough for it to break free and pass to a nearby primary electron acceptor. Chlorophyll is therefore said to “donate” an electron (Figure 1).

To replace the electron in the chlorophyll, a molecule of water is split. Splitting one water molecule releases two electrons, two hydrogen ions (H+), and one oxygen atom in the thylakoid space; oxygen atoms from two water molecules combine to form one molecule of oxygen gas (O2). The replacement of the electron enables chlorophyll to respond to another photon. The oxygen molecules produced as byproducts exit the leaf through the stomata and find their way to the surrounding environment. The hydrogen ions play critical roles in the remainder of the light-dependent reactions.

Figure 1 Light energy is absorbed by a chlorophyll molecule and is passed along a pathway to other chlorophyll molecules. The energy culminates in a molecule of chlorophyll found in the reaction center. The energy “excites” one of its electrons enough to leave the molecule and be transferred to a nearby primary electron acceptor. A molecule of water splits to release an electron, which is needed to replace the one donated. Oxygen and hydrogen ions are also formed from the splitting of water.

Keep in mind that the purpose of the light-dependent reactions is to convert solar energy into chemical carriers (NADPH and ATP) that will be used in the Calvin cycle. In eukaryotes and some prokaryotes, two photosystems exist. The first is called photosystem II (PSII), which was named for the order of its discovery rather than for the order of its function. After a photon hits the PSII reaction center, energy from sunlight is used to extract electrons from water. The electrons travel through the chloroplast electron transport chain to photosystem I (PSI), which reduces NADP+ to NADPH (Figure 3). As an electron passes along the electron transport chain, energy from the electron fuels proton pumps in the membrane that actively move hydrogen ions against their concentration gradient from the stroma into the thylakoid space, also called the lumen (the space inside the thylakoid disk). At the same time, splitting of water adds additional protons to the lumen, and the reduction of NADP+ to NADPH removes protons from the stroma (the space outside the thylakoids). The net result is a high concentration of protons (H+) in the thylakoid lumen and a low concentration of protons in the stroma. ATP synthase uses this electrochemical gradient to make ATP, just as it did in cellular respiration. Note that a high concentration of protons = an acidic pH, so the thylakoid lumen has a much more acidic (lower) pH than the stroma.
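The link between proton concentration and pH can be made concrete with a short calculation. The sketch below uses the standard definition pH = -log10[H+]; the two pH values are illustrative assumptions chosen for the example and are not given in the text.

```python
import math

def ph_from_concentration(h_molar: float) -> float:
    """pH is the negative base-10 logarithm of the proton concentration (mol/L)."""
    return -math.log10(h_molar)

def fold_difference(ph_low: float, ph_high: float) -> float:
    """How many times more concentrated protons are at ph_low than at ph_high."""
    return 10 ** (ph_high - ph_low)

# Assumed, illustrative values only (not taken from the text).
LUMEN_PH = 5.0
STROMA_PH = 8.0

print(fold_difference(LUMEN_PH, STROMA_PH))  # 1000.0
```

Because the pH scale is logarithmic, a difference of three pH units in this example corresponds to a thousand-fold difference in proton concentration across the thylakoid membrane.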

This whole process is quite analogous to the process that occurs during cellular respiration in the mitochondria. Recall that during cellular respiration, the energy carried by NADH and FADH2 is used to pump protons across the inner mitochondrial membrane and into the intermembrane space, creating an electrochemical proton gradient. This gradient is used to power oxidative phosphorylation by ATP synthase to create ATP.

diagram of light reactions

Figure 3 Energy from light is used by the chloroplast electron transport chain to pump protons across the thylakoid membrane into the lumen of the thylakoid. This creates a proton gradient that is used as a source of energy by ATP synthase.

Generating an Energy Molecule: ATP

In the light-dependent reactions, energy absorbed from sunlight is stored by two types of energy-carrier molecules: ATP and NADPH. The energy that these molecules carry is stored in a bond that holds a single atom or small group of atoms to the molecule. For ATP, it is a phosphate group, and for NADPH, it is a hydrogen atom. Recall that NADH was a similar molecule that carried energy in the mitochondrion from the citric acid cycle to the electron transport chain. When these molecules release energy into the Calvin cycle, they each lose that atom or group to become the lower-energy molecules ADP and NADP+.

The buildup of hydrogen ions in the thylakoid space forms an electrochemical gradient because of the difference in the concentration of protons (H+) and the difference in the charge across the membrane that they create. This potential energy is harvested and stored as chemical energy in ATP through chemiosmosis, the movement of hydrogen ions down their electrochemical gradient through the transmembrane enzyme ATP synthase, just as in the mitochondrion.

The hydrogen ions are allowed to pass through the thylakoid membrane through an embedded protein complex called ATP synthase. This same protein generated ATP from ADP in the mitochondrion. The energy generated by the hydrogen ion stream allows ATP synthase to attach a third phosphate to ADP, which forms a molecule of ATP in a process called photophosphorylation. The flow of hydrogen ions through ATP synthase is called chemiosmosis (just like in cellular respiration), because the ions move from an area of high to low concentration through a semi-permeable structure.

Generating Another Energy Carrier: NADPH

The remaining function of the light-dependent reaction is to generate the other energy-carrier molecule, NADPH. As the electron from the electron transport chain arrives at photosystem I, it is re-energized with another photon captured by chlorophyll. The energy from this electron drives the formation of NADPH from NADP+ and a hydrogen ion (H+). Now that the solar energy is stored in energy carriers, it can be used to make a sugar molecule.

Section Summary

The pigments of the first part of photosynthesis, the light-dependent reactions, absorb energy from sunlight. A photon strikes the antenna pigments of photosystem II to initiate photosynthesis. The energy travels to the reaction center, which contains chlorophyll a, and from there an excited electron passes to the electron transport chain, which pumps hydrogen ions into the thylakoid interior (the lumen). This action builds up a high concentration of hydrogen ions. The ions flow through ATP synthase via chemiosmosis to form molecules of ATP, which are used for the formation of sugar molecules in the second stage of photosynthesis. Photosystem I absorbs a second photon, which results in the formation of an NADPH molecule, another energy and reducing power carrier for the light-independent reactions.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

Text adapted from: OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016


The Light Independent Reactions (aka the Calvin Cycle)

After the energy from the sun is converted and packaged into ATP and NADPH, the cell has the fuel needed to build carbohydrate molecules. The carbohydrate molecules made will have a backbone of carbon atoms. Where does the carbon come from? The carbon atoms used to build carbohydrate molecules come from carbon dioxide, which diffuses into the leaves through the stomata. The Calvin cycle is the term used for the reactions of photosynthesis that use the energy stored by the light-dependent reactions to form glucose and other carbohydrate molecules (Figure 1).

calvin cycle

Figure 1 The light-dependent reactions harness energy from the sun to produce ATP and NADPH. These energy-carrying molecules travel into the stroma where the Calvin cycle reactions take place.

The Interworkings of the Calvin Cycle

In plants, carbon dioxide (CO2) enters the leaf through the stomata and diffuses into the stroma of the chloroplast, the site of the Calvin cycle reactions where sugar is synthesized. The reactions are named after the scientist who discovered them, and reference the fact that the reactions function as a cycle. Others call it the Calvin-Benson cycle to include the name of another scientist involved in its discovery.

photosynthesis in its entirety

Figure 2 Light reactions harness energy from the sun to produce the energy carriers ATP and NADPH. These energy-carrying molecules are released into the stroma, where carbon fixation takes place.

The Calvin cycle reactions (Figure 2) can be organized into three basic stages: fixation, reduction, and regeneration. In the stroma, in addition to CO2, two other molecules are present to initiate the Calvin cycle: an enzyme abbreviated RuBisCO (which stands for ribulose-1,5-bisphosphate carboxylase/oxygenase, in case you’re interested), and the molecule ribulose bisphosphate (RuBP). RuBP has five atoms of carbon and a phosphate group on each end.

Figure 3 The Calvin cycle has three stages. In stage 1, the enzyme RuBisCO incorporates carbon dioxide into an organic molecule, 3-PGA. In stage 2, the organic molecule is reduced using electrons supplied by NADPH. In stage 3, RuBP, the molecule that starts the cycle, is regenerated so that the cycle can continue. Only one carbon dioxide molecule is incorporated at a time, so the cycle must be completed three times to produce a single three-carbon G3P molecule, and six times to produce a six-carbon glucose molecule.

RuBisCO catalyzes a reaction between CO2 and RuBP, which forms a six-carbon compound that is immediately converted into two three-carbon compounds. This process is called carbon fixation, because CO2 is “fixed” from its inorganic form into organic molecules. You can think of this as the carbon being converted from the “broken” form in CO2 (which organisms are not able to use directly) into a “fixed” form, which organisms are able to utilize. Because of this very important role in photosynthesis, RuBisCO is probably the most abundant enzyme on Earth.

ATP and NADPH use their stored energy to convert the three-carbon compound, 3-PGA, into another three-carbon compound called G3P. This type of reaction is called a reduction reaction because it involves the gain of electrons by an atom or molecule. The molecules of ADP and NADP+ that result from the reduction reaction return to the light-dependent reactions to be re-energized.

One of the G3P molecules leaves the Calvin cycle to contribute to the formation of the carbohydrate molecule, which is commonly glucose (C6H12O6). Because the carbohydrate molecule has six carbon atoms, it takes six turns of the Calvin cycle to make one carbohydrate molecule (one for each carbon dioxide molecule fixed). The remaining G3P molecules regenerate RuBP, which enables the system to prepare for the carbon-fixation step. ATP is also used in the regeneration of RuBP.

In summary, it takes six turns of the Calvin cycle to fix six carbon atoms from CO2. These six turns require energy input from 12 ATP molecules and 12 NADPH molecules in the reduction step and 6 ATP molecules in the regeneration step.
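The bookkeeping in this summary can be checked with a few lines of arithmetic. The sketch below is a minimal illustration: the per-turn costs are derived by dividing the six-turn totals stated above by six, and the function name calvin_cycle_cost is a hypothetical helper introduced only for this example.

```python
# Per-turn costs, obtained by dividing the six-turn totals in the text by six.
ATP_REDUCTION_PER_TURN = 2      # 12 ATP / 6 turns (reduction step)
NADPH_REDUCTION_PER_TURN = 2    # 12 NADPH / 6 turns (reduction step)
ATP_REGENERATION_PER_TURN = 1   # 6 ATP / 6 turns (regeneration step)

def calvin_cycle_cost(turns: int) -> dict:
    """Total CO2 fixed and energy carriers consumed over a number of turns."""
    return {
        "co2_fixed": turns,  # one CO2 molecule is fixed per turn
        "atp": turns * (ATP_REDUCTION_PER_TURN + ATP_REGENERATION_PER_TURN),
        "nadph": turns * NADPH_REDUCTION_PER_TURN,
    }

print(calvin_cycle_cost(6))  # {'co2_fixed': 6, 'atp': 18, 'nadph': 12}
print(calvin_cycle_cost(3))  # {'co2_fixed': 3, 'atp': 9, 'nadph': 6}
```

Three turns correspond to the carbon needed for one three-carbon G3P molecule, and six turns to the carbon needed for one six-carbon glucose molecule.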

Evolution Connection

Photosynthesis in desert plants has evolved adaptations that conserve water. In the harsh dry heat, every drop of water must be used to survive. Because stomata must open to allow for the uptake of CO2, water escapes from the leaf during active photosynthesis. Desert plants have evolved processes to conserve water and deal with harsh conditions. A more efficient use of CO2 allows plants to adapt to living with less water. Some plants such as cacti can prepare materials for photosynthesis during the night by a temporary carbon fixation/storage process, because opening the stomata at this time conserves water due to cooler temperatures. In addition, cacti have evolved the ability to carry out low levels of photosynthesis without opening stomata at all, an extreme mechanism to face extremely dry periods.

Section Summary

Using the energy carriers formed in the first steps of photosynthesis, the light-independent reactions, or the Calvin cycle, take in CO2 from the environment. An enzyme, RuBisCO, catalyzes a reaction with CO2 and another molecule, RuBP. After three cycles, a three-carbon molecule of G3P leaves the cycle to become part of a carbohydrate molecule. The remaining G3P molecules stay in the cycle to be regenerated into RuBP, which is then ready to react with more CO2. Photosynthesis forms an energy cycle with the process of cellular respiration. Plants need both photosynthesis and respiration for their ability to function in both the light and dark, and to be able to interconvert essential metabolites. Therefore, plants contain both chloroplasts and mitochondria.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

Text adapted from: OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016


Photosynthesis in Prokaryotes

The two parts of photosynthesis—the light-dependent reactions and the Calvin cycle—have been described, as they take place in chloroplasts. However, prokaryotes, such as cyanobacteria, lack membrane-bound organelles (including chloroplasts). Prokaryotic photosynthetic organisms have infoldings of the plasma membrane for chlorophyll attachment and photosynthesis (Figure 1). It is here that organisms like cyanobacteria can carry out photosynthesis.

twisty green shape labeled thylakoid

Figure 1 A photosynthetic prokaryote has infolded regions of the plasma membrane that function like thylakoids. Although these are not contained in an organelle, such as a chloroplast, all of the necessary components are present to carry out photosynthesis. (credit: scale-bar data from Matt Russell)


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

Text adapted from: OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016


Cell Division - Binary Fission and Mitosis

Learning Objectives

Describe the processes used for cell division. Compare the process and consequences of binary fission, mitosis, and meiosis.

Since all living things are made up of one or more cells, all living things have to undergo some type of cell division. Cell division serves several basic purposes: reproduction, repair, and growth. Typically during cell division, the DNA of the organism is copied, then divided into the new cells using one or more divisions. We will therefore start our discussion of cell division with a brief overview of the process of DNA replication (copying DNA), followed by the process by which various types of cells divide that DNA into new cells.



How DNA is arranged in a cell

DNA is a working molecule; it must be replicated (copied) when a cell is ready to divide, and it must be “read” to produce the molecules, such as proteins, to carry out the functions of the cell. For this reason, the DNA is protected and packaged in very specific ways. Because they must carry so much information, DNA molecules can be very long. Stretched end-to-end, the DNA molecules in a single human cell would come to a length of about 2 meters (roughly 6 feet). Thus, the DNA for a cell must be packaged in a very ordered way to fit and function within a structure (the cell) that is not visible to the naked eye.

A cell’s complete complement of DNA is called its genome. In prokaryotes, such as bacteria, the genome is composed of a single, double-stranded DNA molecule in the form of a loop or circle. The region in the cell containing this genetic material is called a nucleoid. Some prokaryotes also have smaller loops of DNA called plasmids that are not essential for normal growth.

A cartoon of a bacterium. The DNA looks like spaghetti in the center of the cytoplasm.

Figure 1 An average prokaryotic cell. Note that the DNA is not surrounded by a membrane to create a nucleus. Photo credit Lady of Hats; Wikipedia.

The size of the genome in one of the most well-studied prokaryotes, Escherichia coli, is 4.6 million base pairs. This would extend a distance of about 1.6 mm if stretched out. Compare that to the length of an E. coli cell, which is approximately 1–2 μm. Since 1.6 mm = 1600 μm, the DNA is roughly a thousand times longer than the cell, so how does it all fit inside? The DNA is twisted beyond the double helix in what is known as supercoiling. Some proteins are known to be involved in the supercoiling; other proteins and enzymes help in maintaining the supercoiled structure.
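The 1.6 mm figure can be reproduced with a short calculation. The sketch below assumes a spacing of 0.34 nm per base pair, the commonly cited value for the DNA double helix; that number is not stated in the text, and the function name is a hypothetical helper for this example.

```python
# Assumption: 0.34 nm of length per base pair (standard textbook value for
# the DNA double helix; not given in the text above).
NM_PER_BASE_PAIR = 0.34

def dna_length_um(base_pairs: float) -> float:
    """Stretched-out length of a DNA molecule in micrometers."""
    return base_pairs * NM_PER_BASE_PAIR / 1000.0  # 1000 nm per micrometer

ECOLI_GENOME_BP = 4.6e6   # from the text
CELL_LENGTH_UM = 2.0      # upper end of the 1-2 micrometer range in the text

ecoli_um = dna_length_um(ECOLI_GENOME_BP)
print(round(ecoli_um))                   # 1564 micrometers, about 1.6 mm
print(round(ecoli_um / CELL_LENGTH_UM))  # 782 times the length of the cell
```

Even against the longest cell in the stated range, the genome is several hundred times longer than the cell that holds it, which is why supercoiling is needed.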

Eukaryotes, such as animals and plants, have chromosomes that consist of linear DNA molecules. Chromosomes can be seen as thread-like structures located inside the nucleus of eukaryotic cells. Each chromosome is made of protein and a single linear double-helix of DNA (Figure 2). The term chromosome comes from the Greek words for color (chroma) and body (soma). Scientists gave this name to chromosomes because they are cell structures, or bodies, that are strongly stained by some colorful dyes used in research.

Figure 2 Linear chromosomes from the salivary glands of nonbiting midge larvae. Photo credit Joseph Resichig; Wikimedia.

Eukaryotes typically have much more DNA than prokaryotes: the human genome is roughly 3 billion base pairs while the E. coli genome is roughly 4.6 million. For this reason, eukaryotes employ a different type of packing strategy to fit their DNA inside the nucleus (Figure 4). At the most basic level, DNA is wrapped around proteins known as histones. The DNA wrapped around histones wraps and stacks through several additional levels of complexity. These thicker, more compact structures are what you have seen before in pictures labeled “chromosomes”.

eukaryotic chromosomes

Figure 4: The basic structure of eukaryotic chromosomes inside the nucleus of a cell (“Chromosomes” by National Human Genome Research Institute is in the Public Domain)

To summarize:

Prokaryote vs eukaryote chromosomes

Figure 5: A eukaryote contains a well-defined nucleus, whereas in prokaryotes, the chromosome lies in the cytoplasm in an area called the nucleoid.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016

“Genetics Home Reference: Help Me Understand Genetics” by National Institutes of Health: U.S. National Library of Medicine is in the Public Domain


An Overview of DNA Replication

When a cell divides, it is important that each daughter cell receives an identical copy of the DNA. This is accomplished by the process of DNA replication. The replication of DNA occurs before the cell begins to divide into two separate cells.

The discovery and characterization of the structure of the double helix provided a hint as to how DNA is copied. Recall that adenine nucleotides pair with thymine nucleotides, and cytosine with guanine, and that DNA is double stranded. This means that the two strands are complementary to each other. For example, a strand of DNA with a nucleotide sequence of AGTCATGA will have a complementary strand with the sequence TCAGTACT (Figure 1).

Double helix

Figure 1: The two strands of DNA are complementary, meaning the sequence of bases in one strand can be used to create the correct sequence of bases in the other strand.
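The base-pairing rule lends itself to a short sketch. The code below pairs each base with its partner, position by position, exactly as in the AGTCATGA example; the function name complement is a hypothetical helper, and strand direction is ignored for simplicity, as it is in the example above.

```python
# Base-pairing rules from the text: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand: str) -> str:
    """Return the complementary strand, written position by position."""
    return "".join(PAIRS[base] for base in strand.upper())

print(complement("AGTCATGA"))  # TCAGTACT
```

Applying the function twice returns the original sequence, which is another way of saying that either strand carries enough information to rebuild the other.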

Because of the complementarity of the two strands, having one strand means that it is possible to recreate the other strand. This model for replication suggests that the two strands of the double helix separate during replication, and each strand serves as a template from which the new complementary strand is copied (Figure 2).

Semiconservative DNA replication

Figure 2: The semiconservative model of DNA replication is shown. Gray indicates the original DNA strands, and blue indicates newly synthesized DNA.

During DNA replication, each of the two strands that make up the double helix serves as a template from which new strands are copied. The new strand will be complementary to the parental or “old” strand. Each new double strand consists of one parental strand and one new daughter strand. This is known as semiconservative replication. When two DNA copies are formed, they have an identical sequence of nucleotide bases and are divided equally into two daughter cells.
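Semiconservative replication can be sketched in code by separating the two strands and building a new partner for each. This is a simplified illustration, not a model of the enzymes involved; the Duplex type and the replicate function are hypothetical names introduced for this example.

```python
from typing import NamedTuple, Tuple

PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

class Duplex(NamedTuple):
    parental: str  # "old" strand inherited from the original molecule
    daughter: str  # newly synthesized complementary strand

def complement(strand: str) -> str:
    """Build the complementary strand using the base-pairing rules."""
    return "".join(PAIRS[base] for base in strand)

def replicate(strand_a: str, strand_b: str) -> Tuple[Duplex, Duplex]:
    """Separate the two strands and use each one as a template."""
    return (
        Duplex(parental=strand_a, daughter=complement(strand_a)),
        Duplex(parental=strand_b, daughter=complement(strand_b)),
    )

first, second = replicate("AGTCATGA", "TCAGTACT")
print(first)   # Duplex(parental='AGTCATGA', daughter='TCAGTACT')
print(second)  # Duplex(parental='TCAGTACT', daughter='AGTCATGA')
```

Each resulting double strand holds one parental strand and one new strand, and both copies contain the same pair of sequences as the original molecule.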


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016


Prokaryotic Cell Division

The cell division process used by prokaryotes (such as E. coli bacteria) and some unicellular eukaryotes is called binary fission. For unicellular organisms, cell division is the only method to produce new individuals. The outcome of this type of cell reproduction is a pair of daughter cells that are genetically identical to the original parent cell. In unicellular organisms, daughter cells are whole individual organisms. This is a less complicated and much quicker process than cell division in eukaryotes. Because of the speed of bacterial cell division, populations of bacteria can grow very rapidly.

Bacteria dividing

Figure 1: An E. coli bacterium dividing into two identical daughter cells
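Because each division doubles the number of cells, repeated binary fission produces exponential growth. The sketch below illustrates this under loud assumptions that are not from the text: a division time of 20 minutes (a commonly quoted figure for E. coli under ideal laboratory conditions), unlimited nutrients, and no cell death. The function name is hypothetical.

```python
def population_after(minutes: float, start_cells: int = 1,
                     division_time_min: float = 20.0) -> int:
    """Cells present after a given time if every cell divides on schedule.

    Assumes a fixed division time, unlimited nutrients, and no cell death.
    """
    generations = int(minutes // division_time_min)  # completed divisions
    return start_cells * 2 ** generations

print(population_after(60))   # 8 cells after one hour
print(population_after(600))  # 1073741824 cells after ten hours
```

Real populations stop growing this fast once nutrients run low, but the calculation shows why bacterial numbers can rise so quickly at first.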

To achieve the outcome of identical daughter cells, there are some essential steps. The genomic DNA must be replicated (using DNA replication) to produce two identical copies of the entire genome. Then, one copy must be moved into each of the daughter cells. The cytoplasmic contents must also be divided to give both new cells the machinery to sustain life. Since bacterial cells have a genome that consists of a single, circular DNA chromosome, the process of cell division is very simple.

Binary fission

Figure 2: Prokaryotic cell division occurs via a process called binary fission.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Biology. OpenStax CNX. May 27, 2016



The Eukaryotic Cell Cycle

Eukaryotes have two major types of cell division: mitosis and meiosis. Mitosis is used to produce new body cells for growth and healing, while meiosis is used to produce sex cells (eggs and sperm). Meiosis will be discussed in a later chapter.

The cell cycle is an ordered series of events involving cell growth and cell division that produces two new daughter cells via mitosis. The length of the cell cycle is highly variable even within the cells of an individual organism. In humans, the frequency of cell turnover ranges from a few hours in early embryonic development to an average of two to five days for epithelial cells, or to an entire human lifetime spent without dividing in specialized cells such as cortical neurons or cardiac muscle cells. There is also variation in the time that a cell spends in each phase of the cell cycle. When fast-dividing mammalian cells are grown in culture (outside the body under optimal growing conditions), the length of the cycle is approximately 24 hours. The timing of events in the cell cycle is controlled by mechanisms that are both internal and external to the cell.

Cells on the path to cell division proceed through a series of precisely timed and carefully regulated stages of growth, DNA replication, and division that produce two genetically identical cells. The cell cycle has two major phases: interphase and the mitotic phase (Figure 1). During interphase, the cell grows and DNA is replicated. During the mitotic phase, the replicated DNA and cytoplasmic contents are separated and the cell divides.

Cell cycle

Figure 1: A cell moves through a series of phases in an orderly manner. During interphase, G1 involves cell growth and protein synthesis, the S phase involves DNA replication and the replication of the centrosome, and G2 involves further growth and protein synthesis. The mitotic phase follows interphase. Mitosis is nuclear division during which duplicated chromosomes are segregated and distributed into daughter nuclei. Usually the cell will divide after mitosis in a process called cytokinesis in which the cytoplasm is divided and two daughter cells are formed.


During interphase, the cell undergoes normal processes while also preparing for cell division. For a cell to move from interphase to the mitotic phase, many internal and external conditions must be met. The three stages of interphase are called G1, S, and G2.

G1 Phase (First Gap)

The first stage of interphase is called the G1 phase (first gap) because, from a microscopic aspect, little change is visible. However, during the G1 stage, the cell is quite active at the biochemical level. The cell is accumulating the building blocks of chromosomal DNA and the associated proteins as well as accumulating sufficient energy reserves to complete the task of replicating each chromosome in the nucleus.

S Phase (Synthesis of DNA)

Throughout interphase, nuclear DNA remains in a semi-condensed chromatin configuration. In the S phase, DNA replication results in the formation of identical pairs of DNA molecules, called sister chromatids, that are firmly attached to each other at the centromeric region (Figure 2).

Figure 2 DNA replication during S phase copies each linear chromosome. The two copies (sister chromatids) remain attached to each other at a region called the centromere. Photo credit: Lisa Bartee

The centrosome is also duplicated during the S phase. The two centrosomes will give rise to the mitotic spindle, the apparatus that orchestrates the movement of chromosomes during mitosis. In animal cells, each centrosome is associated with a pair of rod-like objects, the centrioles, which are at right angles to each other. Centrioles help organize cell division. Centrioles are not present in the centrosomes of other eukaryotic species, such as plants and most fungi.

Figure 3 (a) Structure of the centrioles making up the centrosome. (b) Centrioles give rise to the mitotic spindle (grey threadlike structures). Photo credit: CNX OpenStax Microbiology.

G2 Phase (Second Gap)

In the G2 phase, the cell replenishes its energy stores and synthesizes proteins necessary for chromosome manipulation. Some cell organelles are duplicated, and the cytoskeleton is dismantled to provide resources for the mitotic phase. There may be additional cell growth during G2. The final preparations for the mitotic phase must be completed before the cell is able to enter the first stage of mitosis.

The Mitotic Phase


Figure 4: Mitosis in onion root cells. The cells in this image are in various stages of mitosis. (Credit: Spike Walker. Wellcome Images)

To make two daughter cells, the contents of the nucleus and the cytoplasm must be divided. The mitotic phase is a multistep process during which the duplicated chromosomes are aligned, separated, and moved to opposite poles of the cell, and then the cell is divided into two new identical daughter cells. The first portion of the mitotic phase, mitosis, is composed of five stages, which accomplish nuclear division (Figure 5). The second portion of the mitotic phase, called cytokinesis, is the physical separation of the cytoplasmic components into two daughter cells.

Figure 5 Summary of the process of mitosis. Photo credit Oganesson007, Wikimedia.


During prophase, the “first phase,” the nuclear envelope starts to dissociate into small vesicles, and the membranous organelles (such as the Golgi apparatus and endoplasmic reticulum) fragment and disperse toward the edges of the cell. The nucleolus disappears. The centrosomes begin to move to opposite poles of the cell. Microtubules that will form the mitotic spindle extend between the centrosomes, pushing them farther apart as the microtubule fibers lengthen. The sister chromatids begin to coil more tightly with the aid of condensin proteins and become visible under a light microscope.

Figure 6 Prophase. Photo credit Kelvin13; Wikimedia.


During prometaphase, the “first change phase,” many processes that were begun in prophase continue to advance. The remnants of the nuclear envelope fragment. The mitotic spindle continues to develop as more microtubules assemble and stretch across the length of the former nuclear area. Chromosomes become more condensed and discrete. Each sister chromatid develops a protein structure called a kinetochore in the centromeric region.

Figure 7 Prometaphase. Photo credit Kelvin13; Wikimedia.

The proteins of the kinetochore attract and bind mitotic spindle microtubules. As the spindle microtubules extend from the centrosomes, some of these microtubules come into contact with and firmly bind to the kinetochores. Once a mitotic fiber attaches to a chromosome, the chromosome will be oriented until the kinetochores of sister chromatids face the opposite poles. Eventually, all the sister chromatids will be attached via their kinetochores to microtubules from opposing poles. Spindle microtubules that do not engage the chromosomes are called polar microtubules. These microtubules overlap each other midway between the two poles and contribute to cell elongation. Astral microtubules are located near the poles, aid in spindle orientation, and are required for the regulation of mitosis.

This illustration shows two sister chromatids. Each has a kinetochore at the centromere, and mitotic spindle microtubules radiate from the kinetochore.

Figure 8 During prometaphase, mitotic spindle microtubules from opposite poles attach to each sister chromatid at the kinetochore. In anaphase, the connection between the sister chromatids breaks down, and the microtubules pull the chromosomes toward opposite poles.


During metaphase, the “change phase,” all the chromosomes are aligned in a plane called the metaphase plate, or the equatorial plane, midway between the two poles of the cell. The sister chromatids are still tightly attached to each other by cohesin proteins. At this time, the chromosomes are maximally condensed.

Figure 9 Metaphase. Photo credit Kelvin13; Wikimedia.


During anaphase, the “upward phase,” the cohesin proteins degrade, and the sister chromatids separate at the centromere. Each chromatid, now called a chromosome, is pulled rapidly toward the centrosome to which its microtubule is attached. The cell becomes visibly elongated (oval shaped) as the polar microtubules slide against each other at the metaphase plate where they overlap.

Figure 10 Anaphase. Photo credit Kelvin13; Wikimedia.


During telophase, the “distance phase,” the chromosomes reach the opposite poles and begin to decondense (unravel), relaxing into a chromatin configuration. The mitotic spindles are depolymerized into tubulin monomers that will be used to assemble cytoskeletal components for each daughter cell. Nuclear envelopes form around the chromosomes, and nucleoli reappear within the nuclear area.

Figure 11 Telophase. Photo credit Kelvin13; Wikimedia.


Cytokinesis, or “cell motion,” is the second main stage of the mitotic phase during which cell division is completed via the physical separation of the cytoplasmic components into two daughter cells. Division is not complete until the cell components have been divided and completely separated into the two daughter cells. Although the stages of mitosis are similar for most eukaryotes, the process of cytokinesis is quite different for eukaryotes that have cell walls, such as plant cells.

In cells such as animal cells that lack cell walls, cytokinesis follows the onset of anaphase. A contractile ring composed of actin filaments forms just inside the plasma membrane at the former metaphase plate (Figure 12). The actin filaments pull the equator of the cell inward, forming a fissure. This fissure, or “crack,” is called the cleavage furrow. The furrow deepens as the actin ring contracts, and eventually the membrane is cleaved in two.

In plant cells, a new cell wall must form between the daughter cells. During interphase, the Golgi apparatus accumulates enzymes, structural proteins, and glucose molecules prior to breaking into vesicles and dispersing throughout the dividing cell (Figure 12). During telophase, these Golgi vesicles are transported on microtubules to form a phragmoplast (a vesicular structure) at the metaphase plate. There, the vesicles fuse and coalesce from the center toward the cell walls; this structure is called a cell plate. As more vesicles fuse, the cell plate enlarges until it merges with the cell walls at the periphery of the cell. Enzymes use the glucose that has accumulated between the membrane layers to build a new cell wall. The Golgi membranes become parts of the plasma membrane on either side of the new cell wall.

Figure 12 During cytokinesis in animal cells, a ring of actin filaments forms at the metaphase plate. The ring contracts, forming a cleavage furrow, which divides the cell in two. In plant cells, Golgi vesicles coalesce at the former metaphase plate, forming a phragmoplast. A cell plate formed by the fusion of the vesicles of the phragmoplast grows from the center toward the cell walls, and the membranes of the vesicles fuse to form a plasma membrane that divides the cell in two.

Summary of Mitosis and Cytokinesis

Figure 13 Mitosis is divided into five stages—prophase, prometaphase, metaphase, anaphase, and telophase. The pictures at the bottom were taken by fluorescence microscopy of cells artificially stained by fluorescent dyes: blue fluorescence indicates DNA (chromosomes) and green fluorescence indicates microtubules (spindle apparatus). (credit “mitosis drawings”: modification of work by Mariana Ruiz Villareal; credit “micrographs”: modification of work by Roy van Heesbeen; credit “cytokinesis micrograph”: Wadsworth Center/New York State Department of Health; scale-bar data from Matt Russell)

G0 Phase

Not all cells adhere to the classic cell-cycle pattern in which a newly formed daughter cell immediately enters interphase, closely followed by the mitotic phase. Cells in the G0 phase are not actively preparing to divide. The cell is in a quiescent (inactive) stage, having exited the cell cycle. Some cells enter G0 temporarily until an external signal triggers the onset of G1. Other cells that never or rarely divide, such as mature cardiac muscle and nerve cells, remain in G0 permanently.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Biology. OpenStax CNX. May 27, 2016


Control of the Cell Cycle

It is essential that daughter cells be exact duplicates of the parent cell. Mistakes in the duplication or distribution of the chromosomes lead to mutations that may be passed forward to every new cell produced from the abnormal cell. To prevent a compromised cell from continuing to divide, there are internal control mechanisms that operate at three main cell cycle checkpoints at which the cell cycle can be stopped until conditions are favorable.

Figure 1 The cell cycle is controlled at three checkpoints. Integrity of the DNA is assessed at the G1 checkpoint. Proper chromosome duplication is assessed at the G2 checkpoint. Attachment of each kinetochore to a spindle fiber is assessed at the M checkpoint.

The first checkpoint (G1) determines whether all conditions are favorable for cell division to proceed. This checkpoint is the point at which the cell irreversibly commits to the cell-division process. In addition to adequate reserves and cell size, there is a check for damage to the genomic DNA. A cell that does not meet all the requirements will not be released into the S phase.

The second checkpoint (G2) bars the entry to the mitotic phase if certain conditions are not met. The most important role of this checkpoint is to ensure that all of the chromosomes have been replicated and that the replicated DNA is not damaged.

The final checkpoint (M) occurs in the middle of mitosis. This checkpoint determines if all of the copied chromosomes are arranged appropriately to be separated to opposite sides of the cell. If this doesn’t happen correctly, incorrect numbers of chromosomes can be partitioned into each of the daughter cells, which would likely cause them to die.

Regulator Molecules of the Cell Cycle

In addition to the internally controlled checkpoints, there are two groups of intracellular molecules that regulate the cell cycle. These regulatory molecules either promote progress of the cell to the next phase (positive regulation) or halt the cycle (negative regulation). Regulator molecules may act individually, or they can influence the activity or production of other regulatory proteins. Therefore, it is possible that the failure of a single regulator may have almost no effect on the cell cycle, especially if more than one mechanism controls the same event. It is also possible that the effect of a deficient or non-functioning regulator can be wide-ranging and possibly fatal to the cell if multiple processes are affected.

Positive Regulation of the Cell Cycle

Two groups of proteins, called cyclins and cyclin-dependent kinases (Cdks), are responsible for the progress of the cell through the various checkpoints. The levels of the four cyclin proteins fluctuate throughout the cell cycle in a predictable pattern (Figure 2). Increases in the concentration of cyclin proteins are triggered by both external and internal signals. After the cell moves to the next stage of the cell cycle, the cyclins that were active in the previous stage are degraded.

This graph shows the concentrations of different cyclin proteins during various phases of the cell cycle. Cyclin D concentrations increase in G1 and decrease at the end of mitosis. Cyclin E levels rise during G1 and fall during S phase. Cyclin A levels rise during S phase and fall during mitosis. Cyclin B levels rise in S phase and fall during mitosis.

Figure 2 The concentrations of cyclin proteins change throughout the cell cycle. There is a direct correlation between cyclin accumulation and the three major cell cycle checkpoints. Also note the sharp decline of cyclin levels following each checkpoint (the transition between phases of the cell cycle), as cyclin is degraded by cytoplasmic enzymes. (credit: modification of work by “WikiMiMa”/Wikimedia Commons)

Cyclins regulate the cell cycle only when they are tightly bound to Cdks. To be fully active, the Cdk/cyclin complex must also be phosphorylated in specific locations. Like all kinases, Cdks are enzymes that phosphorylate other proteins. Phosphorylation activates the protein by changing its shape. The proteins phosphorylated by Cdks are involved in advancing the cell to the next phase (Figure 3). The levels of Cdk proteins are relatively stable throughout the cell cycle; however, the concentrations of cyclin fluctuate and determine when Cdk/cyclin complexes form. The different cyclins and Cdks bind at specific points in the cell cycle and thus regulate different checkpoints.

This illustration shows a cyclin protein binding to a Cdk. The cyclin/Cdk complex is activated when a kinase phosphorylates it. The cyclin/Cdk complex, in turn, phosphorylates other proteins, thus advancing the cell cycle.

Figure 3 Cyclin-dependent kinases (Cdks) are protein kinases that, when fully activated, can phosphorylate and thus activate other proteins that advance the cell cycle past a checkpoint. To become fully activated, a Cdk must bind to a cyclin protein and then be phosphorylated by another kinase.

Since the cyclic fluctuations of cyclin levels are based on the timing of the cell cycle and not on specific events, regulation of the cell cycle usually occurs by either the Cdk molecules alone or the Cdk/cyclin complexes. Without a specific concentration of fully activated cyclin/Cdk complexes, the cell cycle cannot proceed through the checkpoints.
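The activation rule described above can be sketched as a toy model. This is only an illustration, not a simulation of real kinetics: the class name, the function names, and the threshold value are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Cdk:
    """Toy cyclin-dependent kinase (illustrative names, not a real library)."""
    cyclin_bound: bool = False
    phosphorylated: bool = False

    def is_fully_active(self) -> bool:
        # Both conditions from the text must hold:
        # the Cdk is bound to a cyclin AND the complex is phosphorylated.
        return self.cyclin_bound and self.phosphorylated


def can_pass_checkpoint(active_complexes: int, threshold: int) -> bool:
    # The text only says a "specific concentration" is needed;
    # the threshold used here is a placeholder, not a measured value.
    return active_complexes >= threshold


cdks = [Cdk(True, True), Cdk(True, False), Cdk(False, False)]
active = sum(c.is_fully_active() for c in cdks)
print(active, can_pass_checkpoint(active, threshold=2))  # 1 False
```

Only the first Cdk counts as active, because the other two are missing either the cyclin or the phosphate, so this toy cell stays at the checkpoint.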

Negative Regulation of the Cell Cycle

The second group of cell cycle regulatory molecules are negative regulators. In positive regulation, active molecules such as Cdk/cyclin complexes cause the cell cycle to progress. In negative regulation, active molecules halt the cell cycle.

The best understood negative regulatory molecules are retinoblastoma protein (Rb), p53, and p21. Much of what is known about cell cycle regulation comes from research conducted with cells that have lost regulatory control. All three of these regulatory proteins were discovered to be damaged or non-functional in cells that had begun to replicate uncontrollably (became cancerous). In each case, the main cause of the unchecked progress through the cell cycle was a faulty copy of the regulatory protein. For this reason, Rb and other proteins that negatively regulate the cell cycle are sometimes called tumor suppressors.

Rb, p53, and p21 act primarily at the G1 checkpoint. p53 is a multi-functional protein that has a major impact on the commitment of a cell to division because it acts when there is damaged DNA in cells that are undergoing the preparatory processes during G1. If damaged DNA is detected, p53 halts the cell cycle and recruits enzymes to repair the DNA. If the DNA cannot be repaired, p53 can trigger apoptosis, or cell suicide, to prevent the duplication of damaged chromosomes. As p53 levels rise, the production of p21 is triggered. p21 enforces the halt in the cycle dictated by p53 by binding to and inhibiting the activity of the Cdk/cyclin complexes. As a cell is exposed to more stress, higher levels of p53 and p21 accumulate, making it less likely that the cell will move into the S phase.

Rb exerts its regulatory influence on other positive regulator proteins. Chiefly, Rb monitors cell size. In the active, dephosphorylated state, Rb binds to proteins called transcription factors (Figure 4). Transcription factors “turn on” specific genes, allowing the production of proteins encoded by that gene. When Rb is bound to transcription factors, production of proteins necessary for the G1/S transition is blocked. As the cell increases in size, Rb is slowly phosphorylated until it becomes inactivated. Rb releases the transcription factors, which can now turn on the gene that produces the transition protein, and this particular block is removed. For the cell to move past each of the checkpoints, all positive regulators must be “turned on,” and all negative regulators must be “turned off.”

Figure 4 Rb halts the cell cycle and releases its hold in response to cell growth.
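The rule that every positive regulator must be "on" and every negative regulator "off" can be written as a short logic sketch. The function names and the dictionary layout are hypothetical, chosen only to mirror the wording of this section; real regulation is graded rather than strictly on or off.

```python
def rb_blocks_transition(rb_phosphorylated: bool) -> bool:
    # Active (dephosphorylated) Rb holds transcription factors,
    # which blocks the G1/S transition.
    return not rb_phosphorylated


def may_pass_checkpoint(positive_on: dict, negative_on: dict) -> bool:
    # All positive regulators must be on and all negative regulators off.
    return all(positive_on.values()) and not any(negative_on.values())


positive = {"Cdk/cyclin": True}
negative = {
    "Rb": rb_blocks_transition(rb_phosphorylated=False),
    "p53": False,
    "p21": False,
}
print(may_pass_checkpoint(positive, negative))  # False: Rb is still active

# As the cell grows, Rb becomes phosphorylated and releases its hold.
negative["Rb"] = rb_blocks_transition(rb_phosphorylated=True)
print(may_pass_checkpoint(positive, negative))  # True
```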


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Biology. OpenStax CNX. May 27, 2016



Cancer and the cell cycle

Cancer comprises many different diseases caused by a common mechanism: uncontrolled cell growth. Despite the redundancy and overlapping levels of cell cycle control, errors do occur. One of the critical processes monitored by the cell cycle checkpoint surveillance mechanism is the proper replication of DNA during the S phase. Even when all of the cell cycle controls are fully functional, a small percentage of replication errors (mutations) will be passed on to the daughter cells. If changes to the DNA nucleotide sequence of a gene are not corrected, a gene mutation results. All cancers start when a gene mutation causes a change in the order of the amino acids that make up a protein that plays a key role in cell reproduction. Changes in the amino acid sequence can change the shape of the protein. Since the shape of the protein is changed, its function may be changed as well. The change in the cell that results from the misshaped protein may be minor: perhaps a slight delay in the binding of Cdk to cyclin or an Rb protein that detaches from its target transcription factors while still dephosphorylated. Even minor mistakes, however, may allow subsequent mistakes to occur more readily. Over and over, small uncorrected errors are passed from the parent cell to the daughter cells and amplified as each generation produces more non-functional proteins from uncorrected DNA damage. Eventually, the pace of the cell cycle speeds up as the effectiveness of the control and repair mechanisms decreases. Uncontrolled growth of the mutated cells outpaces the growth of normal cells in the area, and a tumor (“-oma”) can result.

Cancer cells

Figure 1 Cancer cells in culture from human connective tissue, illuminated by darkfield amplified contrast, at a magnification of 500x.


The genes that code for the positive cell cycle regulators are called proto-oncogenes. Proto-oncogenes are normal genes that, when mutated in certain ways, become oncogenes, genes that cause a cell to become cancerous. Consider what might happen to the cell cycle in a cell with a recently acquired oncogene. In most instances, a mutation in the DNA sequence of a gene will result in a less functional or non-functional protein. This result is detrimental to the cell and will likely prevent the cell from completing the cell cycle, which means that this cell cannot create daughter cells. In this case, the organism is not harmed because the mutation will not be carried forward and the damage is minimal.

Occasionally, however, a gene mutation causes a change that increases the activity of a positive regulator. For example, a mutation that allows Cdk to be activated without being partnered with cyclin could push the cell cycle past a checkpoint before all of the required conditions are met. If the resulting daughter cells are too damaged to undergo further cell divisions, the mutation would not be propagated and no harm would come to the organism. However, if the atypical daughter cells are able to undergo further cell divisions, subsequent generations of cells will probably accumulate even more mutations, some possibly in additional genes that regulate the cell cycle.

The Cdk gene in the above example is only one of many genes that are considered proto-oncogenes. In addition to the cell cycle regulatory proteins, any protein that influences the cycle can be altered in such a way as to override cell cycle checkpoints. An oncogene is any gene that, when altered, leads to an increase in the rate of cell cycle progression.

Tumor Suppressor Genes

Like proto-oncogenes, many of the negative cell cycle regulatory proteins were discovered in cells that had become cancerous. Tumor suppressor genes are segments of DNA that code for negative regulator proteins. Activated negative regulator proteins prevent the cell from undergoing uncontrolled division. The collective function of the best-understood tumor suppressor gene proteins, Rb, p53, and p21, is to put up a roadblock to cell cycle progression until certain events are completed. A cell that carries a mutated form of a negative regulator might not be able to halt the cell cycle if there is a problem. Tumor suppressors are similar to brakes in a vehicle: Malfunctioning brakes can contribute to a car crash.

Mutated p53 genes have been identified in more than one-half of all human tumor cells. This discovery is not surprising in light of the multiple roles that the p53 protein plays at the G1 checkpoint. A cell with a faulty p53 may fail to detect errors present in the genomic DNA (Figure 2). Even if a partially functional p53 does identify the mutations, it may no longer be able to signal the necessary DNA repair enzymes. Either way, damaged DNA will remain uncorrected. At this point, a functional p53 will deem the cell unsalvageable and trigger programmed cell death (apoptosis). The damaged version of p53 found in cancer cells, however, cannot trigger apoptosis.

Figure 2 The role of normal p53 is to monitor DNA and the supply of oxygen (hypoxia is a condition of reduced oxygen supply). If damage is detected, p53 triggers repair mechanisms. If repairs are unsuccessful, p53 signals apoptosis. A cell with an abnormal p53 protein cannot repair damaged DNA and thus cannot signal apoptosis. Cells with abnormal p53 can become cancerous. (credit: modification of work by Thierry Soussi)


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Biology. OpenStax CNX. November 11, 2017


BIOLOGY 212 - Genetics

The Principles of Biology sequence (BI 211, 212, & 213) introduces biology as a scientific discipline for students planning to major in biology and other science disciplines.  Laboratories and classroom activities introduce techniques used to study biological processes and provide opportunities for students to develop their ability to conduct research. BI212 uses genetics as a model system to understand information flow in living organisms.

Course Outcomes: Upon successful completion of this course, students should be able to

  1. Apply the scientific method to biological questions by designing experiments and using the resulting data to form a conclusion.
    1. Design a controlled experiment to answer a biological question.
    2. Predict the outcome of an experiment.
    3. Collect, manipulate, and analyze quantitative and qualitative data
    4. Answer a biological question using data.
  2. Select, evaluate and utilize discipline-specific information and literature to research a biological topic.
    1. Differentiate between questions that can and cannot be answered using science.
    2. Identify appropriate credible sources of information to research a topic.
    3. Evaluate sources of information for their strengths and weaknesses.
  3. Assess the strengths and weaknesses of the design of scientific studies and the conclusions drawn from such studies.
    1. Evaluate the strengths and weaknesses of their own as well as published experiments.
  4. Communicate information using appropriate biological terminology in multiple formats.
    1. Use an appropriate written format to present scientific information.
    2. Use appropriate biological terminology to answer written and oral questions.
    3. Orally present results from an experiment.
  5. Use evidence to develop informed opinions on contemporary biological issues while considering cultural and ethical implications.
    1. Research current ethical issues in genetics and biotechnology.
    2. Form opinions based on published scientific research.
  6. Apply biological theories and concepts to solve problems related to classical and molecular genetics
    1. Describe the molecular basis of inheritance.
    2. Determine the outcome in crosses involving various types of inheritance (e.g. simple dominance, codominance, incomplete dominance, sex-linkage).
    3. Present and decipher information about trait inheritance using a pedigree.
    4. Discuss the possible evolutionary consequences of various types of inheritance.
  7. Discuss the potential implications of mutations at cellular, organismal, and evolutionary levels
    1. Describe the structure of DNA and the process of DNA replication.
    2. Summarize the processes involved in protein synthesis.
    3. Describe how mutations affect the process of protein synthesis and its products.
    4. Discuss the possible evolutionary consequences of mutations.
  8. Describe the purpose of the regulation of gene expression and the mechanisms by which gene expression is regulated
    1. Describe processes through which gene expression can be regulated.
    2. Differentiate between gene regulation processes used by prokaryotes and eukaryotes.
    3. Discuss the possible evolutionary consequences of changes in gene expression.


DNA and Chromosome Structure

Learning Outcomes

  • Discuss the potential implications of mutations at cellular, organismal, and evolutionary levels
    • Describe the structure of DNA and the process of DNA replication.

DNA is a nucleic acid, which is one of the four biological macromolecules that you began learning about in BI211. Recall that nucleic acids are made up of monomers called nucleotides joined together by strong covalent bonds.

All living things store genetic information using nucleic acids: either DNA or a related molecule called RNA. Different organisms have different ways of packaging and storing DNA inside their cells.


DNA Structure

In the 1950s, Francis Crick and James Watson worked together to determine the structure of DNA at the University of Cambridge, England. Other scientists like Linus Pauling and Maurice Wilkins were also actively exploring this field. Pauling had discovered the secondary structure of proteins using X-ray crystallography. In Wilkins’ lab, researcher Rosalind Franklin was using X-ray diffraction methods to understand the structure of DNA. Watson and Crick were able to piece together the puzzle of the DNA molecule on the basis of Franklin’s data because Crick had also studied X-ray diffraction (Figure 1).

The photo in part A shows James Watson, Francis Crick, and Maclyn McCarty. The x-ray diffraction pattern in part b is symmetrical, with dots in an x-shape

Figure 1 The work of pioneering scientists (a) James Watson, Francis Crick, and Maclyn McCarty led to our present day understanding of DNA. Scientist Rosalind Franklin discovered (b) the X-ray diffraction pattern of DNA, which helped to elucidate its double helix structure. (credit a: modification of work by Marjorie McCarty, Public Library of Science)

Watson and Crick gained access to Franklin’s data without her knowledge or approval. In 1962, James Watson, Francis Crick, and Maurice Wilkins were awarded the Nobel Prize in Physiology or Medicine. Unfortunately, by then Franklin had died (of ovarian cancer, likely caused by exposure to X-rays), and Nobel Prizes are not awarded posthumously (after death). This is an interesting story of “sexism in the sciences”; there’s a movie called “The Secret of Photo 51” that you can find on YouTube if you’re interested.

Based on Rosalind Franklin’s X-ray diffraction photograph, and work by other scientists, Watson and Crick proposed that DNA is made up of two strands of nucleotides that are twisted around each other to form a right-handed helix. The nucleotides are joined together in a chain by covalent bonds known as phosphodiester bonds. Scientists already knew that nucleotides contain the same three important components: a nitrogenous base, a deoxyribose (5-carbon sugar), and a phosphate group (Figure 2). The nucleotide is named depending on the nitrogenous base: adenine (A), thymine (T), cytosine (C), and guanine (G). Adenine and guanine are both purines, while cytosine and thymine are pyrimidines. The purines have a double ring structure with a six-membered ring fused to a five-membered ring. Pyrimidines are smaller in size; they have a single six-membered ring structure. One good way to remember this is that cytosine, thymine, and pyrimidine all contain the letter “y”.

Watson and Crick’s model proposed that the two strands of nucleotides interact through base pairing between the nucleotides: A pairs with T and G pairs with C. Adenine and thymine are complementary base pairs, and cytosine and guanine are also complementary base pairs. The base pairs are stabilized by hydrogen bonds (a weak type of bond that forms between partially positive and partially negative atoms). Adenine and thymine form two hydrogen bonds and cytosine and guanine form three hydrogen bonds. Since a purine is “2 rings” across and a pyrimidine is “1 ring” across (Figure 2), the diameter of the DNA double helix remains constant at “3 rings” (Figure 3).
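The pairing rules above are simple enough to express as a lookup table. The short sketch below is just an illustration (the function names are made up for this example): it builds the complementary bases for a strand and counts the hydrogen bonds that would hold the two strands together.

```python
# Base-pairing rules from the text: A pairs with T, G pairs with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Hydrogen bonds per base pair: A-T has two, G-C has three.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complement(strand: str) -> str:
    """Return the base that pairs with each base, position by position."""
    return "".join(PAIR[base] for base in strand.upper())


def hydrogen_bonds(strand: str) -> int:
    """Count the hydrogen bonds between a strand and its complement."""
    return sum(H_BONDS[base] for base in strand.upper())


print(complement("ATGC"))      # TACG
print(hydrogen_bonds("ATGC"))  # 10 (2 + 2 + 3 + 3)
```

Because G-C pairs contribute three hydrogen bonds each, a stretch of DNA rich in G and C is held together by more hydrogen bonds than an A-T-rich stretch of the same length.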

Illustration depicts the structure of a nucleoside, which is made up of a pentose with a nitrogenous base attached at the 1' position. There are two kinds of nitrogenous bases: pyrimidines, which have one six-membered ring, and purines, which have a six-membered ring fused to a five-membered ring. Cytosine, thymine, and uracil are pyrimidines, and adenine and guanine are purines. A nucleoside with a phosphate attached at the 5' position is called a mononucleotide. A nucleoside with two or three phosphates attached is called a nucleotide diphosphate or nucleotide triphosphate, respectively.

Figure 2 Each nucleotide is made up of a sugar, a phosphate group, and a nitrogenous base. The sugar is deoxyribose in DNA and ribose in RNA.

The carbon atoms of the five-carbon sugar are numbered in order starting from the carbon connected to the nitrogenous base: 1′, 2′, 3′, 4′, and 5′ (1′ is read as “one prime”). We don’t particularly care about the 1′, 2′, or 4′ positions. At the 3′ position, there is always a hydroxyl (OH) group that is on the sugar. The 5′ carbon is attached to a phosphate group. When nucleotides are joined together into a chain, the 5′ phosphate of one nucleotide is attached to the 3′ hydroxyl group of the next nucleotide, thereby forming a 5′-3′ phosphodiester bond. What this means is that when nucleotides are joined together in a chain, there will always be a free 3′ OH group (from the sugar) at one end and a free 5′ phosphate at the other (Figure 3).

structure of DNA

Figure 3 Structure of DNA. Notice that adenine (a purine) and thymine (a pyrimidine) are connected together with 2 hydrogen bonds, while guanine (a purine) and cytosine (a pyrimidine) are connected by three hydrogen bonds. There is a 5′ and 3′ end to both chains of nucleotides, which are antiparallel to each other. Photo credit  Madeline Price Ball; Wikimedia

The two strands are anti-parallel in nature; that is, the 3′ end of one strand points in one direction, while the 5′ end of the other strand points in that direction (Figure 3). The sugar and phosphate of the nucleotides form the backbone of the structure, whereas the nitrogenous bases are stacked inside. Each base pair is separated from the other base pair by a distance of 0.34 nm (nanometer: 1 × 10⁻⁹ meters), and each turn of the helix measures 3.4 nm. Therefore, ten base pairs are present per turn of the helix. The diameter of the DNA double helix is 2 nm, and it is uniform throughout. Only the pairing between a purine and pyrimidine can explain the uniform diameter (3 “rings” across). The twisting of the two strands around each other results in the formation of uniformly spaced major and minor grooves (Figure 4). The major and minor grooves are very important for protein binding to DNA.

Figure 4 DNA has (a) a double helix structure and (b) phosphodiester bonds. The (c) major and minor grooves are binding sites for DNA binding proteins during processes such as transcription (the copying of RNA from DNA) and replication.
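Two ideas from this section lend themselves to a quick worked example: the strands are antiparallel, and the helix has a fixed spacing per base pair. The sketch below is illustrative only; the function names are invented here, and the constants are the 0.34 nm and 3.4 nm values given in the text.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

RISE_NM = 0.34  # distance between adjacent base pairs (from the text)
TURN_NM = 3.4   # length of one full turn of the helix (from the text)


def reverse_complement(strand_5_to_3: str) -> str:
    """Return the partner strand, also written 5' to 3'.

    Because the strands are antiparallel, the partner is read in the
    opposite direction, so the complement is reversed.
    """
    return "".join(PAIR[base] for base in reversed(strand_5_to_3.upper()))


def helix_length_nm(base_pairs: int) -> float:
    return base_pairs * RISE_NM


def helix_turns(base_pairs: int) -> float:
    return helix_length_nm(base_pairs) / TURN_NM


print(reverse_complement("ATGC"))          # GCAT
print(round(helix_length_nm(10), 2))       # 3.4
print(round(helix_turns(20), 2))           # 2.0
```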


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Biology. OpenStax CNX. December 21, 2017


DNA organization inside a cell

DNA Organization in Prokaryotes

A cell’s DNA, packaged as a double-stranded DNA molecule, is called its genome. In prokaryotes, the genome is composed of a single, double-stranded DNA molecule in the form of a loop or circle (Figure 1). The region in the cell containing this genetic material is called a nucleoid (remember that prokaryotes do not have a separate membrane-bound nucleus). Some prokaryotes also have smaller loops of DNA called plasmids that are not essential for normal growth. Bacteria can exchange these plasmids with other bacteria, sometimes receiving beneficial new genes that the recipient can add to their chromosomal DNA. Antibiotic resistance is one trait that often spreads through a bacterial colony through plasmid exchange.

oval bacteria containing loop of bacterial DNA and smaller circles of plasmid DNA.

Figure 1 Bacterial DNA and plasmids are both circular. Photo credit Spaully; Wikipedia

The size of the genome in one of the most well-studied prokaryotes, E. coli, is 4.6 million base pairs (approximately 1.6 mm, if cut and stretched out, at 0.34 nm per base pair). So how does this fit inside a small bacterial cell? The DNA is twisted by what is known as supercoiling. Supercoiling means that DNA is either under-wound (less than one turn of the helix per 10 base pairs) or over-wound (more than 1 turn per 10 base pairs) from its normal relaxed state. Some proteins are known to be involved in the supercoiling; other proteins and enzymes such as DNA gyrase help in maintaining the supercoiled structure.

DNA Organization in Eukaryotes

Eukaryotes have much more DNA than prokaryotes. For example, an E. coli bacterium contains roughly 4.6 million base pairs of DNA, while a human contains roughly 3 billion. In eukaryotes such as humans, the genome consists of several double-stranded linear DNA molecules (Figure 2), which are located inside a membrane-bound nucleus. Each species of eukaryotes has a characteristic number of chromosomes in the nuclei (plural of nucleus) of its cells. A normal human gamete (sperm or egg) contains 23 chromosomes. A normal human body cell, or somatic cell, contains 46 chromosomes (one set of 23 from the egg and one set of 23 from the sperm) (Figure 2). The letter n is used to represent a single set of chromosomes; therefore, gametes (sperm and eggs) are designated 1n, and are called haploid cells. Somatic cells (body cells) are designated 2n and are called diploid cells.

The 23 pairs of chromosomes from a human female are each dyed a different color so they can be distinguished. During most of the cell cycle, each chromosome is elongated into a thin strand that folds over on itself, like a piece of spaghetti. The chromosomes fill the entire spherical nucleus, but each one is contained in a different part, resulting in a multi-colored sphere. During mitosis, the chromosomes condense into thick, compact bars, each a different color. These bars can be arranged in numerical order to form a karyotype. There are two copies of each chromosome in the karyotype.

Figure 2 There are 23 pairs of homologous chromosomes in a female human somatic cell. The condensed chromosomes are viewed within the nucleus (top), removed from a cell in mitosis and spread out on a slide (right), and artificially arranged according to length (left); an arrangement like this is called a karyotype. In this image, the chromosomes were exposed to fluorescent stains for differentiation of the different chromosomes. A method of staining called “chromosome painting” employs fluorescent dyes that highlight chromosomes in different colors. (credit: National Human Genome Project/NIH)

Matched pairs of chromosomes in a diploid organism are called homologous (“same knowledge”) chromosomes. Of a pair of homologous chromosomes, one came from the egg and the second came from the sperm. Homologous chromosomes are the same length and have specific nucleotide segments called genes in exactly the same location, or locus.  Genes, the functional units of chromosomes, determine specific characteristics by coding for specific proteins. Traits are the variations of those characteristics. For example, hair color is a characteristic with traits that are blonde, brown, or black.

Each copy of a homologous pair of chromosomes originates from a different parent; therefore, the genes themselves are not identical. The variation of individuals within a species is due to the specific combination of the genes inherited from both parents. Even a slightly altered sequence of nucleotides within a gene can result in an alternative trait. For example, there are three possible gene sequences on the human chromosome that code for blood type: sequence A, sequence B, and sequence O. Because all diploid human cells have two copies of the chromosome that determines blood type, the blood type (the trait) is determined by which two versions of the marker gene are inherited. It is possible to have two copies of the same gene sequence on both homologous chromosomes, with one on each (for example, AA, BB, or OO), or two different sequences, such as AB.
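To see every combination that two copies of the blood-type gene can produce, the three sequences can be paired up systematically. This is a minimal sketch using Python's standard library; it lists genotypes only and makes no claim about which blood type each genotype produces.

```python
from itertools import combinations_with_replacement

# The three gene sequences named in the text.
SEQUENCES = ["A", "B", "O"]

# Each diploid cell carries two copies, one on each homologous chromosome.
# Order does not matter here (AB is the same genotype as BA).
genotypes = ["".join(pair) for pair in combinations_with_replacement(SEQUENCES, 2)]
print(genotypes)  # ['AA', 'AB', 'AO', 'BB', 'BO', 'OO']

# Genotypes with two copies of the same sequence.
same_sequence = [g for g in genotypes if g[0] == g[1]]
print(same_sequence)  # ['AA', 'BB', 'OO']
```

Three sequences give six possible genotypes: three with two copies of the same sequence and three with two different sequences.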

Minor variations of traits, such as blood type, eye color, and handedness, contribute to the natural variation found within a species. However, if the entire DNA sequence from any pair of human homologous chromosomes is compared, the difference is less than one percent. The sex chromosomes, X and Y, are the single exception to the rule of homologous chromosome uniformity: Other than a small amount of homology that is necessary to accurately produce gametes, the genes found on the X and Y chromosomes are different.

Eukaryotic Chromosomal Structure and Compaction

If the DNA from all 46 chromosomes in a human cell nucleus were laid out end to end, it would measure approximately two meters; however, its diameter would be only 2 nm. Considering that the size of a typical human cell is about 10 µm (100,000 cells lined up to equal one meter), DNA must be tightly packaged to fit in the cell’s nucleus. At the same time, it must also be readily accessible for the genes to be expressed. During some stages of the cell cycle, the long strands of DNA are condensed into compact chromosomes. There are a number of ways that chromosomes are compacted.
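To get a feel for the scale of the packaging problem, the figures quoted above can be compared directly. This is a rough sketch that uses only the approximate numbers from the paragraph; the variable names are illustrative.

```python
# Figures quoted in the text, expressed in nanometers to keep the math in whole numbers.
DNA_LENGTH_NM = 2_000_000_000   # about two meters of DNA per nucleus
DNA_DIAMETER_NM = 2             # width of the double helix
CELL_SIZE_NM = 10_000           # a typical human cell, about 10 µm
METER_NM = 1_000_000_000        # one meter

cells_per_meter = METER_NM // CELL_SIZE_NM
length_vs_cell = DNA_LENGTH_NM // CELL_SIZE_NM
length_vs_width = DNA_LENGTH_NM // DNA_DIAMETER_NM

print(cells_per_meter)   # 100000
print(length_vs_cell)    # 200000
print(length_vs_width)   # 1000000000
```

In other words, the DNA in one nucleus is roughly 200,000 times longer than the cell that contains it, and about a billion times longer than it is wide.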

Eukaryotes, whose chromosomes each consist of a linear DNA molecule, employ a complex type of packing strategy to fit their DNA inside the nucleus (Figure 3). At the most basic level, DNA is wrapped around proteins known as histones to form structures called nucleosomes. The histones are evolutionarily conserved proteins that are rich in basic amino acids and form an octamer of eight histone proteins attached together. DNA, which is negatively charged because of the phosphate groups, is wrapped tightly around the histone core. This nucleosome is linked to the next one with the help of a linker DNA. This is also known as the “beads on a string” structure. This is further compacted into a 30 nm fiber, which is the diameter of the structure. At the metaphase stage, the chromosomes are at their most compact, are approximately 700 nm in width, and are found in association with scaffold proteins.

There are five levels of chromosome organization. From top to bottom: The top panel shows a DNA double helix. The second panel shows the double helix wrapped around proteins called histones. The middle panel shows the entire DNA molecule wrapping around many histones, creating the appearance of beads on a string. The fourth panel shows that the chromatin fiber further condenses into the chromosome shown in the bottom panel.

Figure 3 Double-stranded DNA wraps around histone proteins to form nucleosomes that have the appearance of “beads on a string.” The nucleosomes are coiled into a 30-nm chromatin fiber. When a cell undergoes mitosis, the duplicated chromosomes condense even further.

DNA replicates in the S phase of interphase. After replication, the chromosomes are composed of two linked sister chromatids. This means that the only time chromosomes look like an “X” is after DNA replication has taken place and the chromosomes have condensed. During the majority of the cell’s life, each chromosome consists of only one copy of its DNA and is not tightly compacted. When fully compact, the pairs of identically packed chromosomes are bound to each other by cohesin proteins. The connection between the sister chromatids is closest in a region called the centromere. The conjoined sister chromatids, with a diameter of about 1 µm, are visible under a light microscope. The centromeric region is highly condensed and thus will appear as a constricted area.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Biology. OpenStax CNX. December 21, 2017


DNA Replication

Learning Outcomes

  1. Discuss the potential implications of mutations at cellular, organismal, and evolutionary levels
    1. Describe the structure of DNA and the process of DNA replication.

When a cell divides, it is important that each daughter cell receives an identical copy of the DNA. This is accomplished by the process of DNA replication. The replication of DNA occurs before the cell begins to divide into two separate cells.

The discovery and characterization of the structure of the double helix provided a hint as to how DNA is copied. Recall that adenine nucleotides pair with thymine nucleotides, and cytosine with guanine. This means that the two strands are complementary to each other. For example, a strand of DNA with a nucleotide sequence of AGTCATGA will have a complementary strand with the sequence TCAGTACT (Figure 1).

Double helix

Figure 1: The two strands of DNA are complementary, meaning the sequence of bases in one strand can be used to create the correct sequence of bases in the other strand.
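The base-pairing rule can be written as a small lookup table. The sketch below is only an illustration (the names PAIRS and complement are not standard terms); it pairs each base with its partner in the same left-to-right order used in the example above and ignores the fact that the two strands run in opposite directions.

```python
# Base-pairing rules: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand):
    """Return the partner base for every base in the strand, position by position."""
    return "".join(PAIRS[base] for base in strand)

print(complement("AGTCATGA"))  # TCAGTACT
```

Applying the function twice returns the original sequence, which is another way of saying that either strand contains enough information to rebuild the other.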

Because of the complementarity of the two strands, having one strand means that it is possible to recreate the other strand. This model for replication suggests that the two strands of the double helix separate during replication, and each strand serves as a template from which the new complementary strand is copied (Figure 2).

Semiconservative DNA replication

Figure 2: The semiconservative model of DNA replication is shown. Gray indicates the original DNA strands, and blue indicates newly synthesized DNA.

During DNA replication, each of the two strands that make up the double helix serves as a template from which new strands are copied. The new strand will be complementary to the parental or “old” strand. Each new double strand consists of one parental strand and one new daughter strand. This is known as semiconservative replication. When two DNA copies are formed, they have an identical sequence of nucleotide bases and are divided equally into two daughter cells.
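Semiconservative replication can be sketched in a few lines of Python. This is a simplified model, not a description of the enzymes involved: each strand is stored together with a number recording the round of replication in which it was made, and the names replicate and generation are assumptions made for the example.

```python
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(sequence):
    """Return the complementary bases, position by position."""
    return "".join(PAIRS[base] for base in sequence)

def replicate(duplex, generation):
    """Copy a double-stranded molecule semiconservatively.

    A duplex is a pair of strands. Each strand is a (sequence, generation)
    tuple, where generation records when that strand was synthesized.
    """
    top, bottom = duplex
    new_partner_for_top = (complement(top[0]), generation)
    new_partner_for_bottom = (complement(bottom[0]), generation)
    # Each daughter keeps one parental strand and gains one new strand.
    return [(top, new_partner_for_top), (new_partner_for_bottom, bottom)]

parent = (("AGTCATGA", 0), ("TCAGTACT", 0))
daughters = replicate(parent, generation=1)
for duplex in daughters:
    print(duplex)
```

Both daughter molecules have the same base sequence as the parent, and each contains exactly one strand labeled 0 (parental) and one labeled 1 (new).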


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016


DNA Replication in Prokaryotes

Recall that the prokaryotic chromosome is a circular molecule with a less extensive coiling structure than eukaryotic chromosomes. The eukaryotic chromosome is linear and highly coiled around proteins. While there are many similarities in the DNA replication process, these structural differences necessitate some differences in the DNA replication process in these two life forms.  DNA replication in prokaryotes has been extensively studied, so we will start there.

How does the replication machinery know where to begin? It turns out that there are specific nucleotide sequences called origins of replication where replication begins. E. coli has a single origin of replication on its one chromosome, as do most prokaryotes (Figure 1). The origin of replication is approximately 245 base pairs long and is rich in AT sequences. This sequence of base pairs is recognized by certain proteins that bind to this site. An enzyme called helicase unwinds the DNA by breaking the hydrogen bonds between the nitrogenous base pairs. ATP hydrolysis is required for this process because it requires energy. As the DNA opens up, Y-shaped structures called replication forks are formed. Two replication forks are formed at the origin of replication and these get extended bi-directionally as replication proceeds. Single-strand binding proteins (Figure 2) coat the single strands of DNA near the replication fork to prevent the single-stranded DNA from winding back into a double helix.

DNA replication in prokaryotes

Figure 1: DNA replication in prokaryotes, which have one circular chromosome.

The next important enzyme is DNA polymerase, also known as DNA pol, which adds nucleotides that are complementary to the template strand, one by one, to the growing DNA chain (Figure 2). The addition of nucleotides requires energy; this energy is obtained from the nucleotides themselves, which have three phosphates attached to them. ATP is actually an adenine nucleotide which has three phosphate groups attached; breaking off the third phosphate releases energy. In addition to ATP, there are also TTP, CTP, and GTP. Each of these is made up of the corresponding nucleotide with three phosphates attached (the versions used to build DNA contain the sugar deoxyribose and are collectively called dNTPs).  When the bond between the phosphates is broken, the energy released is used to form the phosphodiester bond between the incoming nucleotide and the existing chain. In prokaryotes, three main types of polymerases are known: DNA pol I, DNA pol II, and DNA pol III. DNA pol III is the enzyme required for DNA synthesis; DNA pol I and DNA pol II are primarily required for repair (this is another irritating example of naming that was done based on the order of discovery rather than the order of use).

DNA polymerase is able to add nucleotides only in the 5′ to 3′ direction (a new DNA strand can be only extended in this direction). It requires a free 3′-OH group to which it can add nucleotides by forming a phosphodiester bond between the 3′-OH end and the 5′ phosphate of the next nucleotide. This essentially means that it cannot add nucleotides if a free 3′-OH group is not available. Then how does it add the first nucleotide? The problem is solved with the help of a primer that provides the free 3′-OH end. Another enzyme, RNA primase, synthesizes an RNA primer that is about five to ten nucleotides long and complementary to the DNA. RNA primase does not require a free 3′-OH group. Because this sequence primes the DNA synthesis, it is appropriately called the primer. DNA polymerase can now extend this RNA primer, adding nucleotides one by one that are complementary to the template strand (Figure 2).

Figure 2 A replication fork is formed when helicase separates the DNA strands at the origin of replication. The DNA tends to become more highly coiled ahead of the replication fork. Topoisomerase breaks and reforms DNA’s phosphate backbone ahead of the replication fork, thereby relieving the pressure that results from this supercoiling. Single-strand binding proteins bind to the single-stranded DNA to prevent the helix from re-forming. Primase synthesizes an RNA primer. DNA polymerase III uses this primer to synthesize the daughter DNA strand. On the leading strand, DNA is synthesized continuously, whereas on the lagging strand, DNA is synthesized in short stretches called Okazaki fragments. DNA polymerase I replaces the RNA primer with DNA. DNA ligase seals the gaps between the Okazaki fragments, joining the fragments into a single DNA molecule. (credit: modification of work by Mariana Ruiz Villareal)

The replication fork moves at the rate of 1000 nucleotides per second. DNA polymerase can only extend in the 5′ to 3′ direction, which poses a slight problem at the replication fork. As we know, the DNA double helix is anti-parallel; that is, one strand is in the 5′ to 3′ direction and the other is oriented in the 3′ to 5′ direction. One strand, which is complementary to the 3′ to 5′ parental DNA strand, is synthesized continuously towards the replication fork because the polymerase can add nucleotides in this direction. This continuously synthesized strand is known as the leading strand. The other strand, complementary to the 5′ to 3′ parental DNA, is extended away from the replication fork, in small fragments known as Okazaki fragments, each requiring a primer to start the synthesis. Okazaki fragments are named after the Japanese scientist who first discovered them. The strand with the Okazaki fragments is known as the lagging strand.

The leading strand can be extended by one primer alone, whereas the lagging strand needs a new primer for each of the short Okazaki fragments. The overall direction of the lagging strand will be 3′ to 5′, and that of the leading strand 5′ to 3′. A protein called the sliding clamp holds the DNA polymerase in place as it continues to add nucleotides. The sliding clamp is a ring-shaped protein that binds to the DNA and holds the polymerase in place. Topoisomerase prevents the over-winding of the DNA double helix ahead of the replication fork as the DNA is opening up; it does so by causing temporary nicks in the DNA helix and then resealing it. As synthesis proceeds, the RNA primers are replaced by DNA. The primers are removed by the exonuclease activity of DNA pol I, and the gaps are filled in by deoxyribonucleotides. The nicks that remain between the newly synthesized DNA (that replaced the RNA primer) and the previously synthesized DNA are sealed by the enzyme DNA ligase that catalyzes the formation of phosphodiester linkage between the 3′-OH end of one nucleotide and the 5′ phosphate end of the other fragment.
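The difference in primer use between the two strands can be made concrete with a little arithmetic. The sketch below is an illustration only: the function name and both lengths are hypothetical values chosen for the example, not measurements from the text.

```python
import math

def primers_needed(template_length, okazaki_length):
    """Return (leading, lagging) primer counts for one replication fork."""
    leading = 1  # one primer is enough for continuous synthesis
    lagging = math.ceil(template_length / okazaki_length)  # one primer per fragment
    return leading, lagging

# Hypothetical numbers chosen only for illustration.
print(primers_needed(template_length=10_000, okazaki_length=1_000))  # (1, 10)
```

However long the template is, the leading strand still needs only one primer, while the number of primers on the lagging strand grows with the number of Okazaki fragments.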

(Lisa’s note: This process is almost impossible to visualize from reading text. I strongly recommend that you watch a couple of animations / videos like the one available here. I will provide links in Blackboard)

Once the chromosome has been completely replicated, the two DNA copies move into two different cells during cell division. The process of DNA replication can be summarized as follows:

  1. DNA unwinds at the origin of replication.
  2. Helicase opens up the DNA, forming replication forks; these are extended bidirectionally.
  3. Single-strand binding proteins coat the DNA around the replication fork to prevent rewinding of the DNA.
  4. Topoisomerase binds at the region ahead of the replication fork to prevent supercoiling.
  5. Primase synthesizes RNA primers complementary to the DNA strand.
  6. DNA polymerase III starts adding nucleotides to the 3′-OH end of the primer.
  7. Elongation of both the lagging and the leading strand continues.
  8. RNA primers are removed by exonuclease activity.
  9. Gaps are filled by DNA pol I by adding dNTPs.
  10. The gap between the two DNA fragments is sealed by DNA ligase, which helps in the formation of phosphodiester bonds.

Table 1: The enzymes involved in prokaryotic DNA replication and the functions of each.

Prokaryotic DNA Replication: Enzymes and Their Function
Enzyme/protein | Specific function
DNA pol I | Exonuclease activity removes the RNA primer and replaces it with newly synthesized DNA
DNA pol II | Repair function
DNA pol III | Main enzyme that adds nucleotides in the 5′-3′ direction
Helicase | Opens the DNA helix by breaking hydrogen bonds between the nitrogenous bases
Ligase | Seals the gaps between the Okazaki fragments to create one continuous DNA strand
Primase | Synthesizes RNA primers needed to start replication
Sliding clamp | Helps to hold the DNA polymerase in place when nucleotides are being added
Topoisomerase | Helps relieve the stress on DNA when unwinding by causing breaks and then resealing the DNA
Single-strand binding proteins (SSB) | Bind to single-stranded DNA to prevent the DNA from rewinding

DNA replication has been extremely well-studied in prokaryotes, primarily because of the small size of the genome and large number of variants available. Escherichia coli has 4.6 million base pairs in a single circular chromosome, and all of it gets replicated in approximately 42 minutes, starting from a single origin of replication and proceeding around the chromosome in both directions. This means that approximately 1000 nucleotides are added per second. The process is much more rapid than in eukaryotes. Table 2 summarizes the basic differences between prokaryotic and eukaryotic DNA replication. More specific differences will be discussed in the next section.
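The rate quoted above can be checked from the other two numbers in the paragraph. This sketch assumes that the two replication forks move at the same speed for the full 42 minutes; the variable names are illustrative.

```python
GENOME_BP = 4_600_000        # E. coli chromosome, from the text
REPLICATION_MINUTES = 42     # from the text
FORKS = 2                    # one origin, two forks moving in opposite directions

seconds = REPLICATION_MINUTES * 60
total_rate = GENOME_BP / seconds      # base pairs copied per second, both forks together
rate_per_fork = total_rate / FORKS    # base pairs copied per second at each fork

print(round(total_rate))     # 1825
print(round(rate_per_fork))  # 913
```

Each fork therefore copies a little over 900 base pairs every second, which rounds to the figure of approximately 1000 nucleotides per second.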

Table 2: Differences between Prokaryotic and Eukaryotic Replication

Property | Prokaryotes | Eukaryotes
Origin of replication | Single | Multiple
Rate of replication | 1000 nucleotides/s | 50 to 100 nucleotides/s
Chromosome structure | Circular | Linear
Telomerase | Not present | Present


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016


DNA Replication in Eukaryotes

Eukaryotic genomes are much more complex and larger in size than prokaryotic genomes. The human genome has three billion base pairs per haploid set of chromosomes, and 6 billion base pairs are replicated during the S phase of the cell cycle. There are multiple origins of replication on the eukaryotic chromosome; humans can have up to 100,000 origins of replication. The rate of replication is approximately 100 nucleotides per second, much slower than prokaryotic replication.
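A quick calculation shows why so many origins are needed. The sketch below is a deliberately idealized comparison, not a model of a real S phase: it assumes that every origin starts at the same moment, that the origins are evenly spaced, and that each one produces two forks. The variable names are illustrative.

```python
GENOME_BP = 6_000_000_000   # diploid human DNA replicated in S phase, from the text
RATE_PER_FORK = 100         # nucleotides per second, from the text
ORIGINS = 100_000           # upper estimate given in the text
FORKS_PER_ORIGIN = 2        # assumption: each origin is bidirectional

single_fork_seconds = GENOME_BP // RATE_PER_FORK
all_origins_seconds = GENOME_BP // (RATE_PER_FORK * ORIGINS * FORKS_PER_ORIGIN)

print(single_fork_seconds / 86_400)  # days with a single fork: about 694
print(all_origins_seconds / 60)      # minutes with every origin active: 5.0
```

Neither number is a measured replication time. The comparison only shows that a single fork would need nearly two years to copy the genome, so many origins working in parallel are essential.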

Eukaryotes have many more DNA polymerases than prokaryotes: 14 are known, of which five are known to have major roles during replication and have been well studied. They are known as pol α, pol β, pol γ, pol δ, and pol ε. I won’t ever ask you the names of these polymerases – focus on the prokaryotic ones.

The essential steps of replication are the same as in prokaryotes. Before replication can start, the DNA has to be made available as template. Eukaryotic DNA is bound to basic proteins known as histones to form structures called nucleosomes. The chromatin (the complex between DNA and proteins) may undergo some chemical modifications, so that the DNA may be able to slide off the proteins or be accessible to the enzymes of the DNA replication machinery. At the origin of replication, a pre-replication complex is made with other initiator proteins. Other proteins are then recruited to start the replication process.

Difference between Prokaryotic and Eukaryotic Replication
Property | Prokaryotes | Eukaryotes
Origin of replication | Single | Multiple
Rate of replication | 1000 nucleotides/s | 50 to 100 nucleotides/s
DNA polymerase types | 5 | 14
Telomerase | Not present | Present
RNA primer removal | DNA pol I | RNase H
Strand elongation | DNA pol III | Pol δ, pol ε
Sliding clamp | Sliding clamp | PCNA

A helicase using the energy from ATP hydrolysis opens up the DNA helix. Replication forks are formed at each replication origin as the DNA unwinds (remember, there are multiple origins of replication on each eukaryotic chromosome because they are so large). The opening of the double helix causes over-winding, or supercoiling, in the DNA ahead of the replication fork. This supercoiling is resolved by the action of topoisomerases. Primers are formed by the enzyme primase, and using the primer, DNA pol can start synthesis. While the leading strand is continuously synthesized by the enzyme pol ε, the lagging strand is synthesized by pol δ. A sliding clamp protein known as PCNA (Proliferating Cell Nuclear Antigen) holds the DNA pol in place so that it does not slide off the DNA. RNase H removes the RNA primer, which is then replaced with DNA nucleotides. The Okazaki fragments in the lagging strand are joined together after the replacement of the RNA primers with DNA. The gaps that remain are sealed by DNA ligase, which forms the phosphodiester bond.

Telomere replication

Unlike prokaryotic chromosomes, eukaryotic chromosomes are linear. As you’ve learned, the enzyme DNA pol can add nucleotides only in the 5′ to 3′ direction. In the leading strand, synthesis continues until the end of the chromosome is reached. On the lagging strand, DNA is synthesized in short stretches, each of which is initiated by a separate primer. When the replication fork reaches the end of the linear chromosome, there is no place for a primer to be made for the DNA fragment to be copied at the end of the chromosome. These ends thus remain unpaired, and over time these ends may get progressively shorter as cells continue to divide.

The ends of the linear chromosomes are known as telomeres, which have repetitive sequences that code for no particular gene. In a way, these telomeres protect the genes from getting deleted as cells continue to divide. In humans, a six base pair sequence, TTAGGG, is repeated 100 to 1000 times. The discovery of the enzyme telomerase (Figure 1) helped in the understanding of how chromosome ends are maintained. The telomerase enzyme contains a catalytic part and a built-in RNA template. It attaches to the end of the chromosome, and complementary bases to the RNA template are added on the 3′ end of the DNA strand. Once the 3′ end of the lagging strand template is sufficiently elongated, DNA polymerase can add the nucleotides complementary to the ends of the chromosomes. Thus, the ends of the chromosomes are replicated.

Telomerase has an associated RNA that complements the 3′ overhang at the end of the chromosome. The RNA template is used to synthesize the complementary strand. Telomerase then shifts, and the process is repeated. Next, primase and DNA polymerase synthesize the rest of the complementary strand.

Figure 1 The ends of linear chromosomes are maintained by the action of the telomerase enzyme.
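The repeat counts given above translate directly into telomere lengths. The following sketch simply strings copies of the human repeat together; the function names telomere and extend are illustrative, and the model leaves out the RNA template and the enzyme itself.

```python
REPEAT = "TTAGGG"  # human telomere repeat, from the text

def telomere(n_repeats):
    """Return a telomere sequence made of n_repeats copies of the repeat."""
    return REPEAT * n_repeats

def extend(chromosome_end, n_repeats):
    """Append repeats to the 3' end of a sequence, one repeat at a time."""
    return chromosome_end + telomere(n_repeats)

for n in (100, 1000):
    print(n, "repeats =", len(telomere(n)), "bases")
```

With 100 to 1000 repeats of a six-base sequence, a human telomere is roughly 600 to 6000 bases long.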

Telomerase is typically active in germ cells and adult stem cells. It is not active in adult somatic cells. For her discovery of telomerase and its action, Elizabeth Blackburn (Figure 2) received the Nobel Prize in Physiology or Medicine in 2009.

Photo of Elizabeth Blackburn.

Figure 2 Elizabeth Blackburn, 2009 Nobel Laureate, is the scientist who discovered how telomerase works. (credit: US Embassy Sweden)

Telomerase and Aging

Cells that undergo cell division continue to have their telomeres shortened because most somatic cells do not make telomerase. This essentially means that telomere shortening is associated with aging. With the advent of modern medicine, preventative health care, and healthier lifestyles, the human life span has increased, and there is an increasing demand for people to look younger and have a better quality of life as they grow older.

In 2010, scientists found that telomerase can reverse some age-related conditions in mice. This may have potential in regenerative medicine (Jaskelioff, 2011). Telomerase-deficient mice were used in these studies; these mice have tissue atrophy, stem cell depletion, organ system failure, and impaired tissue injury responses. Telomerase reactivation in these mice caused extension of telomeres, reduced DNA damage, reversed neurodegeneration, and improved the function of the testes, spleen, and intestines. Thus, telomerase reactivation may have potential for treating age-related diseases in humans.

Cancer is characterized by uncontrolled cell division of abnormal cells. The cells accumulate mutations, proliferate uncontrollably, and can migrate to different parts of the body through a process called metastasis. Scientists have observed that cancerous cells have considerably shortened telomeres and that telomerase is active in these cells. Interestingly, only after the telomeres were shortened in the cancer cells did the telomerase become active. If the action of telomerase in these cells can be inhibited by drugs during cancer therapy, then the cancerous cells could potentially be stopped from further division.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016

Jaskelioff et al., 2011. Telomerase reactivation reverses tissue degeneration in aged telomerase-deficient mice. Nature 469: 102-7.


DNA Repair

DNA replication is a highly accurate process, but mistakes can occasionally occur, such as a DNA polymerase inserting a wrong base. Uncorrected mistakes may sometimes lead to serious consequences, such as cancer. Repair mechanisms correct the mistakes. In rare cases, mistakes are not corrected, leading to mutations; in other cases, repair enzymes are themselves mutated or defective.

Most of the mistakes during DNA replication are promptly corrected by DNA polymerase by proofreading the base that has been just added (Figure 1). In proofreading, the DNA pol reads the newly added base before adding the next one, so a correction can be made. The polymerase checks whether the newly added base has paired correctly with the base in the template strand. If it is the right base, the next nucleotide is added. If an incorrect base has been added, the enzyme makes a cut at the phosphodiester bond and releases the wrong nucleotide. This is performed by the exonuclease action of DNA pol III. Once the incorrect nucleotide has been removed, a new one will be added again (Figure 1).

DNA polymerase

Figure 1 Proofreading by DNA polymerase corrects errors during replication. Photo credit Madeline Price Ball; Wikimedia.

Some errors are not corrected during replication, but are instead corrected after replication is completed; this type of repair is known as mismatch repair (Figure 2). The enzymes recognize the incorrectly added nucleotide and excise it; this is then replaced by the correct base. If this remains uncorrected, it may lead to more permanent damage. How do mismatch repair enzymes recognize which of the two bases is the incorrect one? In E. coli, after replication, the nitrogenous base adenine acquires a methyl group (CH3); the parental DNA strand will have methyl groups, whereas the newly synthesized strand lacks them. Thus, the mismatch repair enzymes are able to remove the wrongly incorporated bases from the newly synthesized, non-methylated strand. In eukaryotes, the mechanism is not very well understood, but it is believed to involve recognition of unsealed nicks in the new strand, as well as a short-term continuing association of some of the replication proteins with the new daughter strand after replication has completed.

The top illustration shows a replicated DNA strand with G-T base mismatch. The bottom illustration shows the repaired DNA, which has the correct G-C base pairing.

Figure 2 In mismatch repair, the incorrectly added base is detected after replication. The mismatch repair proteins detect this base and remove it from the newly synthesized strand by nuclease action. The gap is now filled with the correctly paired base.
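The logic of mismatch repair can be sketched in code. This is a simplified illustration, not the actual enzymatic mechanism: it assumes the cell already knows which strand is parental (in E. coli, the methylated one) and treats that strand as the reference. The function name mismatch_repair is made up for the example.

```python
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def mismatch_repair(parental, new):
    """Correct the new strand so that every base pairs with the parental strand.

    Returns the repaired new strand and the positions that had to be fixed.
    """
    repaired = []
    fixed_positions = []
    for position, (parental_base, new_base) in enumerate(zip(parental, new)):
        correct_base = PAIRS[parental_base]
        if new_base != correct_base:
            fixed_positions.append(position)  # mismatch found in the new strand
        repaired.append(correct_base)
    return "".join(repaired), fixed_positions

# The second base of the new strand is a T paired with G, as in Figure 2.
print(mismatch_repair("AGTCATGA", "TTAGTACT"))  # ('TCAGTACT', [1])
```

The parental strand is never changed; only the newly synthesized strand is edited, which is why the cell needs a way to tell the two strands apart.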

In another type of repair mechanism, nucleotide excision repair, enzymes replace incorrect bases by making a cut on both the 3′ and 5′ ends of the incorrect base (Figure 3). The segment of DNA is removed and replaced with the correctly paired nucleotides by the action of DNA pol. Once the bases are filled in, the remaining gap is sealed with a phosphodiester linkage catalyzed by DNA ligase. This repair mechanism is often employed when UV exposure causes the formation of thymine-thymine dimers (the small – connecting the two Ts in Figure 3).

Illustration shows a DNA strand in which a thymine dimer has formed. Excision repair enzymes cut out the section of DNA that contains the dimer so it can be replaced with normal base pairs.

Figure 3 Nucleotide excision repairs thymine dimers. When exposed to UV, thymines lying adjacent to each other can form thymine dimers. In normal cells, they are excised and replaced.
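Because thymine dimers form between thymines that lie next to each other, the places where they could form can be found by scanning a sequence for neighboring Ts. The sketch below is only an illustration; the function name and the example sequence are hypothetical.

```python
def thymine_dimer_sites(strand):
    """Return the starting positions of every pair of adjacent thymines."""
    return [i for i in range(len(strand) - 1) if strand[i:i + 2] == "TT"]

# Hypothetical sequence used only for illustration (positions start at 0).
print(thymine_dimer_sites("ATTGCTTTA"))  # [1, 5, 6]
```

A run of three thymines contains two overlapping pairs, so it is reported as two possible sites.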

A well-studied example of mistakes not being corrected is seen in people suffering from xeroderma pigmentosum (Figure 4). Affected individuals have skin that is highly sensitive to UV rays from the sun. When individuals are exposed to UV, pyrimidine dimers, especially those of thymine, are formed; people with xeroderma pigmentosum are not able to repair the damage. These are not repaired because of a defect in the nucleotide excision repair enzymes, whereas in normal individuals, the thymine dimers are excised and the defect is corrected. The thymine dimers distort the structure of the DNA double helix, and this may cause problems during DNA replication. People with xeroderma pigmentosum have a higher risk of developing skin cancer than those who don’t have the condition.

Photo shows a person with mottled skin lesions that result from xeroderma pigmentosum.

Figure 4 Xeroderma pigmentosum is a condition in which thymine dimerization from exposure to UV is not repaired. Exposure to sunlight results in skin lesions. (credit: James Halpern et al.)

Errors during DNA replication are not the only reason why mutations arise in DNA. Mutations, variations in the nucleotide sequence of a genome, can also occur because of damage to DNA. Such mutations may be of two types: induced or spontaneous. Induced mutations are those that result from an exposure to a mutagen: chemicals, UV rays, x-rays, or some other environmental agent. Spontaneous mutations occur without any exposure to any environmental agent; they are a result of natural reactions taking place within the body.

Mutations may have a wide range of effects. Some mutations are not expressed; these are known as silent mutations. Other mutations can have serious effects on the organism (such as the mutation that causes xeroderma pigmentosum).

Mutations in repair genes have been known to cause cancer. Many mutated repair genes have been implicated in certain forms of pancreatic cancer, colon cancer, and colorectal cancer. Mutations can affect either somatic cells or gametes. If many mutations accumulate in a somatic cell, they may lead to problems such as the uncontrolled cell division observed in cancer. If a mutation takes place in a gamete, the mutation can be passed on to the next generation.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Concepts of Biology. OpenStax CNX. December 21, 2017.


Protein Synthesis

Learning Outcomes

  • Discuss the potential implications of mutations at cellular, organismal, and evolutionary levels
    • Summarize the processes involved in protein synthesis.
    • Describe how mutations affect the process of protein synthesis and its products.

In both prokaryotes and eukaryotes, the major purpose of DNA is to provide the information needed to construct the proteins necessary for the cell to perform all of its functions. Proteins are large, complex molecules that play many critical roles in the body. They do most of the work in cells and are required for the structure, function, and regulation of the body’s tissues and organs.

Recall that proteins are made up of hundreds or thousands of smaller units called amino acids, which are attached to one another in long chains. There are 20 different types of amino acids that can be combined to make a protein. The sequence of amino acids determines each protein’s unique 3-dimensional structure and its specific function.

Table 1: Some of the functions of proteins in cells, listed in alphabetical order:

Function | Description
Antibody | Antibodies bind to specific foreign particles, such as viruses and bacteria, to help protect the body.
Enzyme | Enzymes carry out almost all of the thousands of chemical reactions that take place in cells. They also assist with the formation of new molecules by reading the genetic information stored in DNA.
Messenger | Messenger proteins, such as some types of hormones, transmit signals to coordinate biological processes between different cells, tissues, and organs.
Structural component | These proteins provide structure and support for cells. On a larger scale, they also allow the body to move.
Transport/storage | These proteins bind and carry atoms and small molecules within cells and throughout the body.

The information to make proteins is stored in an organism’s DNA. Each protein is coded for by a specific section of DNA called a gene. A gene is the section of DNA required to produce one protein. Genes are typically hundreds or thousands of base pairs in length because they code for proteins made of hundreds or thousands of amino acids.

Remember that DNA in eukaryotes is found as long linear molecules called chromosomes (Figure 1). Chromosomes are millions of base pairs in length and each contains many, many genes (Table 2). An organism’s complete set of DNA (including all its genes) is referred to as its genome.

Table 2 Size and number of genes of several human chromosomes.

Chromosome | Size (in base pairs) | Number of genes
1 | 248,956,422 | 2058
10 | 133,797,422 | 733
22 | 50,818,468 | 488


human karyotype

Figure 1: A karyotype showing the sizes of all the human chromosomes. Notice that they generally decrease in size as the chromosome number increases.

To summarize: many base pairs make up one gene, many genes are found on one chromosome, and many chromosomes can be found in one genome.
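The numbers in Table 2 can be used to estimate how densely genes are packed on each chromosome. The sketch below uses only the three rows from the table; the names CHROMOSOMES and genes_per_million_bp are illustrative.

```python
# (size in base pairs, number of genes), copied from Table 2.
CHROMOSOMES = {
    "1": (248_956_422, 2058),
    "10": (133_797_422, 733),
    "22": (50_818_468, 488),
}

def genes_per_million_bp(size_bp, gene_count):
    """Return the average number of genes found in each million base pairs."""
    return gene_count / (size_bp / 1_000_000)

density = {name: genes_per_million_bp(*row) for name, row in CHROMOSOMES.items()}
for name, value in density.items():
    print(f"chromosome {name}: {value:.1f} genes per million base pairs")
```

By this measure, chromosome 22 is the smallest of the three but has the highest gene density (about 9.6 genes per million base pairs, compared with about 8.3 for chromosome 1 and 5.5 for chromosome 10). Chromosome size alone does not tell you how many genes a chromosome carries.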


Figure 2: The arrangement of DNA into chromosomes. Photo credit: Thomas Splettstoesser


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

“What are proteins and what do they do?” by U.S. National Library of Medicine is in the Public Domain


How do genes direct the production of proteins?

The information to make proteins is stored in an organism’s DNA. Each protein is coded for by a specific section of DNA called a gene. A gene is the section of DNA required to produce one protein. Genes are typically hundreds or thousands of base pairs in length because they code for proteins made of hundreds or thousands of amino acids.

Figure 1 Genes, which are carried on (a) chromosomes, are linearly organized instructions for making the RNA and protein molecules that are necessary for all of the processes of life. The (b) interleukin-2 protein and (c) alpha-2u-globulin protein are just two examples of the array of different molecular structures that are encoded by genes. (credit “chromosome”: National Human Genome Research Institute; credit “interleukin-2”: Ramin Herati/Created from PDB 1M47 and rendered with Pymol; credit “alpha-2u-globulin”: Darren Logan/rendered with AISMIG)

The journey from gene to protein is complex and tightly controlled within each cell. It consists of two major steps: transcription and translation. Together, transcription and translation are known as gene expression.


During the process of transcription, the information stored in a gene’s DNA is used as a blueprint to produce a similar molecule called RNA (ribonucleic acid) in the cell nucleus. Both RNA and DNA are made up of a chain of nucleotide bases, but they have slightly different chemical properties (Figure 2). Both RNA and DNA contain a 5-carbon sugar, but the sugar differs: it is deoxyribose in DNA and ribose in RNA (DNA stands for deoxyribonucleic acid; RNA stands for ribonucleic acid). DNA and RNA also differ in the nitrogenous bases they contain. DNA contains A, T, C, and G. RNA contains A, C, and G, but no thymine. Instead it contains a base called uracil (U). The type of RNA that contains the information for making a protein is called messenger RNA (mRNA) because it carries the information, or message, from the DNA out of the nucleus into the cytoplasm.

Figure 2 DNA vs RNA. Photo credit Zappys Technology Solution; Flickr.

During transcription, an RNA copy is made from a DNA molecule. This is possible because of the base-pairing rules: A with T (or U) and C with G. The hydrogen bonds connecting the base pairs in a DNA molecule are broken, and an enzyme creates a chain of RNA nucleotides that correspond to the DNA sequence.
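
The base-pairing rules in this paragraph can be written as a simple lookup. The sketch below is a deliberate simplification: it pairs each base of a DNA template with its RNA partner and ignores strand direction and the enzyme that does the work. The function name `transcribe` is a hypothetical label chosen for illustration.

```python
# RNA partner for each DNA base: A pairs with U (not T) in RNA, C pairs with G.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

def transcribe(template_dna: str) -> str:
    """Return the RNA bases that pair with each base of the DNA template."""
    return "".join(DNA_TO_RNA[base] for base in template_dna.upper())

print(transcribe("TACGGT"))  # AUGCCA
```

Note that the RNA product contains U wherever the DNA template contains A.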

In eukaryotes, transcription occurs in the nucleus (because that’s where the DNA is). In prokaryotes, transcription occurs in the cytoplasm because there is no nucleus.

RNA Processing

After prokaryotes produce an mRNA, it can be immediately translated since both processes occur in the cytoplasm. In fact, transcription and translation can occur at the same time – as an mRNA is being transcribed, it can also begin to be translated.

Eukaryotes require a more complex process since the mRNA must move from the nucleus to the cytoplasm. Additionally, eukaryotic mRNAs are typically modified in several different ways: portions of the mRNA that do not code for amino acids are removed (“spliced” out), and the 5′ and 3′ ends are modified to help with recognition and mRNA stability. After these modifications are made, the mature mRNA is transported to the cytoplasm.


Translation, the second step in getting from a gene to a protein, takes place in the cytoplasm. The mRNA interacts with a specialized complex called a ribosome, which “reads” the sequence of mRNA bases. In conjunction with a type of RNA called transfer RNA (tRNA), the protein is assembled according to the instructions in the mRNA molecule. Each sequence of three bases in the mRNA, called a codon, usually codes for one particular amino acid. Remember that amino acids are the building blocks of proteins. Protein assembly continues until the ribosome encounters a “stop” codon (a sequence of three bases that does not code for an amino acid).
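
The way a ribosome “reads” an mRNA three bases at a time can be sketched in a few lines of Python. This is only an illustration of the reading process, not of the ribosome itself. It assumes the three stop codons of the standard genetic code (UAA, UAG, and UGA), and the function name `read_codons` is hypothetical.

```python
# The three stop codons of the standard genetic code.
STOP_CODONS = {"UAA", "UAG", "UGA"}

def read_codons(mrna: str) -> list[str]:
    """Split an mRNA into codons from its first base, stopping at a stop codon."""
    codons = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break  # a stop codon does not code for an amino acid
        codons.append(codon)
    return codons

print(read_codons("AUGUUUUUAUAAGGG"))  # ['AUG', 'UUU', 'UUA']
```

Bases after the stop codon (GGG in this example) are never read.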

Recall that ribosomes are located in two different places in eukaryotic cells: free-floating in the cytoplasm and attached to the rough endoplasmic reticulum. The final destination of the protein determines where it will be synthesized.

Figure 3: The Central Dogma – DNA is used to make RNA, which is used to make protein.

The flow of information from DNA to RNA to proteins is one of the fundamental principles of molecular biology. It is so important that it is sometimes called the “central dogma” (Figures 3 and 4).

Figure 4: More detail on the central dogma. (“Overview of Protein Synthesis” by Becky Boone is licensed under CC BY-SA 2.0)


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

“What are proteins and what do they do?” by U.S. National Library of Medicine is in the Public Domain


The Genetic Code

The Central Dogma: DNA Encodes RNA; RNA Encodes Protein

 To summarize what we know to this point, the cellular process of transcription generates messenger RNA (mRNA), a mobile molecular copy of one or more genes with an alphabet of A, C, G, and uracil (U). Translation of the mRNA template converts nucleotide-based genetic information into a protein product. This flow of genetic information in cells from DNA to mRNA to protein is described by the Central Dogma (Figure 1), which states that genes specify the sequence of mRNAs, which in turn specify the sequence of proteins. The decoding of one molecule to another is performed by specific proteins and RNAs. Because the information stored in DNA is so central to cellular function, it makes intuitive sense that the cell would make mRNA copies of this information for protein synthesis, while keeping the DNA itself intact and protected.

It turns out that the central dogma is not always true. We will not discuss the exceptions here, however.

Figure 1 Instructions on DNA are transcribed onto messenger RNA. Ribosomes are able to read the genetic information inscribed on a strand of messenger RNA and use this information to string amino acids together into a protein.

Amino Acid Structure

Protein sequences consist of 20 commonly occurring amino acids (Figure 1); therefore, it can be said that the protein alphabet consists of 20 letters. Different amino acids have different chemistries (such as acidic versus basic, or polar and nonpolar) and different structural constraints. Variation in amino acid sequence gives rise to enormous variation in protein structure and function.

Figure 1 Structures of the 20 amino acids found in proteins are shown. Each amino acid is composed of an amino group (NH3+), a carboxyl group (COO-), and a side chain (blue). The side chain may be nonpolar, polar, or charged, as well as large or small. It is the variety of amino acid side chains that gives rise to the incredible variation of protein structure and function.

Genetic Code

Each amino acid is defined by a three-nucleotide sequence called the triplet codon. The relationship between a nucleotide codon and its corresponding amino acid is called the genetic code. Given the different numbers of “letters” in the mRNA alphabet (4: A, U, C, G) and the protein alphabet (20 different amino acids), one nucleotide could not correspond to one amino acid. Nucleotide doublets would also not be sufficient to specify every amino acid because there are only 16 possible two-nucleotide combinations (4²). In contrast, there are 64 possible nucleotide triplets (4³), which is far more than the number of amino acids. Scientists theorized that amino acids were encoded by nucleotide triplets and that the genetic code was degenerate. In other words, a given amino acid could be encoded by more than one nucleotide triplet (Figure 2). These nucleotide triplets are called codons.
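
The counting argument above can be verified by listing every possible combination. The short Python sketch below does this for code words of one, two, and three nucleotides. The function name `count_code_words` is a hypothetical label used only here.

```python
from itertools import product

BASES = "AUCG"        # the four "letters" of the mRNA alphabet
AMINO_ACIDS = 20      # the number of commonly occurring amino acids

def count_code_words(length: int) -> int:
    """Number of distinct nucleotide sequences of the given length."""
    return len(list(product(BASES, repeat=length)))

for length in (1, 2, 3):
    n = count_code_words(length)
    verdict = "enough" if n >= AMINO_ACIDS else "not enough"
    print(f"{length} nucleotide(s): {n} combinations, {verdict} for 20 amino acids")
```

Only the triplet code offers at least 20 combinations, and because it offers 64, several codons are left over to encode the same amino acid.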

The same codon will always specify the insertion of one specific amino acid. The chart seen in Figure 2 can be used to translate an mRNA sequence into an amino acid sequence. For example, the codon UUU will always cause the insertion of the amino acid phenylalanine (Phe), while the codon UUA will cause the insertion of leucine (Leu).
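
Using the chart amounts to a table lookup. The following Python sketch shows the idea with a deliberately partial table: only five of the 64 entries of the standard genetic code are included, so any other codon raises a `KeyError`. The names `PARTIAL_CODE` and `translate` are illustrative.

```python
# A small excerpt of the standard genetic code (codon -> amino acid).
PARTIAL_CODE = {
    "UUU": "Phe",
    "UUC": "Phe",
    "UUA": "Leu",
    "UUG": "Leu",
    "AUG": "Met",
}

def translate(mrna: str) -> list[str]:
    """Look up each successive codon of the mRNA in the partial table."""
    return [PARTIAL_CODE[mrna[i:i + 3]] for i in range(0, len(mrna) - 2, 3)]

print(translate("AUGUUUUUA"))  # ['Met', 'Phe', 'Leu']
```

The table also shows degeneracy in miniature: UUU and UUC both map to Phe, while each individual codon maps to exactly one amino acid.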

Figure 2 This figure shows the genetic code for translating each nucleotide triplet in mRNA into an amino acid or a termination signal in a nascent protein. (credit: modification of work by NIH)

Each set of three bases (one codon) causes the insertion of one specific amino acid into the growing protein. This means that the insertion or deletion of one or two nucleotides can completely change the triplet “reading frame”, thereby altering the message for every subsequent amino acid (Figure 3). In contrast, the insertion of three nucleotides adds one extra amino acid during translation but leaves the rest of the protein sequence intact.
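
The effect of a shifted reading frame is easy to see by splitting a sequence into codons before and after an insertion. The sequence in this Python sketch is invented for illustration, and the helper name `codons` is hypothetical.

```python
def codons(mrna: str) -> list[str]:
    """Split an mRNA into complete triplets, starting from the first base."""
    return [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]

original = "AUGUUUUUAGGC"
one_inserted = original[:3] + "C" + original[3:]      # one extra nucleotide
three_inserted = original[:3] + "CCC" + original[3:]  # three extra nucleotides

print(codons(original))        # ['AUG', 'UUU', 'UUA', 'GGC']
print(codons(one_inserted))    # ['AUG', 'CUU', 'UUU', 'AGG']
print(codons(three_inserted))  # ['AUG', 'CCC', 'UUU', 'UUA', 'GGC']
```

After a single-nucleotide insertion, every codon downstream of the insertion changes. After a three-nucleotide insertion, one new codon appears and the downstream codons are unchanged.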

Figure 3 The deletion of two nucleotides shifts the reading frame of an mRNA and changes the entire protein message, creating a nonfunctional protein or terminating protein synthesis altogether.

Three of the 64 codons terminate protein synthesis and release the polypeptide from the translation machinery. These triplets are called stop codons. Another codon, AUG, also has a special function. In addition to specifying the amino acid methionine, it also serves as the start codon to initiate translation. The reading frame for translation is set by the AUG start codon near the 5′ end of the mRNA. The genetic code is nearly universal: with a few exceptions, virtually all species use the same genetic code for protein synthesis, which is powerful evidence that all life on Earth shares a common origin.
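
How the start codon sets the reading frame can be illustrated with a short Python sketch. It assumes, as a simplification, that translation begins at the first AUG found when scanning from the 5′ end and ends at the first in-frame stop codon (UAA, UAG, or UGA). The function name `open_reading_frame` and the example sequence are invented for illustration.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def open_reading_frame(mrna: str) -> list[str]:
    """Codons from the first AUG up to (not including) the first in-frame stop."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start codon, so nothing is translated in this sketch
    frame = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break
        frame.append(codon)
    return frame

print(open_reading_frame("CCAUGUUUUUAUGACC"))  # ['AUG', 'UUU', 'UUA']
```

The example sequence contains a second AUG that overlaps the stop codon, but it is out of frame and is therefore never read as a codon.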


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Biology. OpenStax CNX. January 2, 2017


Prokaryotic Transcription

Both prokaryotes and eukaryotes perform fundamentally the same process of transcription, with the important difference of the membrane-bound nucleus in eukaryotes. In eukaryotes, the genes are enclosed in the nucleus, so transcription occurs in the nucleus of the cell and the mRNA transcript must be transported to the cytoplasm. In prokaryotes, which lack membrane-bound nuclei and other organelles, transcription occurs in the cytoplasm of the cell.

RNA Polymerase

RNA polymerase is the enzyme that produces the mRNA molecule (just as DNA polymerase produces a new DNA molecule during DNA replication). Prokaryotes use the same RNA polymerase to transcribe all of their genes. In E. coli, the polymerase is composed of five polypeptide subunits. These subunits assemble every time a gene is transcribed, and they disassemble once transcription is complete. Each subunit has a unique role (which you do not need to memorize). The polymerase composed of all five subunits is called the holoenzyme.


Transcription in prokaryotes (and in eukaryotes) requires the DNA double helix to partially unwind in the region of mRNA synthesis. The region of unwinding is called a transcription bubble. The DNA sequence onto which the proteins and enzymes involved in transcription bind to initiate the process is called a promoter. In most cases, promoters exist upstream of the genes they regulate. The specific sequence of a promoter is very important because it determines whether the corresponding gene is transcribed all of the time, some of the time, or hardly at all. The structure and function of a prokaryotic promoter are relatively simple (Figure 1). One important sequence in the prokaryotic promoter is located 10 bases before the transcription start site (-10) and is commonly called the TATA box (this -10 sequence is also known as the Pribnow box).


Figure 1 The general structure of a prokaryotic promoter.

To begin transcription, the RNA polymerase holoenzyme assembles at the promoter. One of its five subunits, called σ (sigma), then dissociates, which allows the remaining core enzyme to proceed along the DNA template, synthesizing mRNA by adding RNA nucleotides according to the base pairing rules, similar to the way a new DNA molecule is produced during DNA replication. Only one of the two DNA strands is transcribed. The transcribed strand of DNA is called the template strand because it is the template for mRNA production. The mRNA product is complementary to the template strand and is almost identical to the other DNA strand, called the non-template strand, with the exception that RNA contains a uracil (U) in place of the thymine (T) found in DNA. Like DNA polymerase, RNA polymerase adds new nucleotides onto the 3′-OH group of the previous nucleotide. This means that the growing mRNA strand is being synthesized in the 5′ to 3′ direction. Because the two strands are antiparallel, the RNA polymerase moves in the 3′ to 5′ direction along the template strand (Figure 2).
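
The relationship between the template strand, the non-template strand, and the mRNA can be checked with a small Python sketch. It assumes the template is written in the 3′ to 5′ direction, so that the strings it produces read 5′ to 3′ from left to right. The function names and the example sequence are hypothetical.

```python
DNA_PARTNER = {"A": "T", "T": "A", "C": "G", "G": "C"}
RNA_PARTNER = {"A": "U", "T": "A", "C": "G", "G": "C"}

def non_template_strand(template_3_to_5: str) -> str:
    """The other DNA strand, read 5' to 3'."""
    return "".join(DNA_PARTNER[base] for base in template_3_to_5)

def mrna_from_template(template_3_to_5: str) -> str:
    """The mRNA built along the template, read 5' to 3'."""
    return "".join(RNA_PARTNER[base] for base in template_3_to_5)

template = "TACAAAGGC"  # written 3' to 5'
print(non_template_strand(template))  # ATGTTTCCG
print(mrna_from_template(template))   # AUGUUUCCG
```

Replacing every T in the non-template strand with U gives exactly the mRNA sequence, which is why the mRNA is described as almost identical to the non-template strand.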


As elongation proceeds, the DNA is continuously unwound ahead of the core enzyme as the hydrogen bonds that connect the complementary base pairs in the DNA double helix are broken (Figure 2). The DNA is rewound behind the core enzyme as the hydrogen bonds re-form. The base pairing between DNA and RNA is not strong enough on its own to hold the mRNA synthesis components together. Instead, the RNA polymerase acts as a stable linker between the DNA template and the newly forming RNA strand to ensure that elongation is not interrupted prematurely.


Figure 2 During elongation, RNA polymerase tracks along the DNA template, synthesizes mRNA in the 5′ to 3′ direction, and unwinds then rewinds the DNA as it is read.


Once a gene is transcribed, the RNA polymerase needs to be instructed to dissociate from the DNA template and liberate the newly made mRNA. Depending on the gene being transcribed, there are two kinds of termination signals. One is protein-based and the other is RNA-based. Both termination signals rely on specific sequences of DNA near the end of the gene that cause the polymerase to release the mRNA.

In a prokaryotic cell, by the time transcription ends, the transcript has often already been used to begin making copies of the encoded protein. This is possible because transcription and translation both occur in the cytoplasm and can therefore take place at the same time (Figure 3). In contrast, transcription and translation cannot occur simultaneously in eukaryotic cells since transcription occurs inside the nucleus and translation occurs outside in the cytoplasm.


Figure 3: Multiple polymerases can transcribe a single bacterial gene while numerous ribosomes concurrently translate the mRNA transcripts into polypeptides. In this way, a specific protein can rapidly reach a high concentration in the bacterial cell.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Biology. OpenStax CNX. December 21, 2017


Eukaryotic Transcription

Prokaryotes and eukaryotes perform fundamentally the same process of transcription, with a few key differences. The most important difference between prokaryotes and eukaryotes is the latter’s membrane-bound nucleus and organelles. With the genes enclosed in a nucleus, the eukaryotic cell must be able to transport its mRNA to the cytoplasm and must protect its mRNA from degrading before it is translated. Eukaryotes also employ three different polymerases that each transcribe a different subset of genes.


The eukaryotic promoters that we are most interested in are similar to prokaryotic promoters in that they contain a TATA box (Figure 1). However, initiation of transcription is much more complex in eukaryotes compared to prokaryotes. Unlike the prokaryotic RNA polymerase that can bind to a DNA template on its own, eukaryotes require several other proteins, called transcription factors, to first bind to the promoter region and then help recruit the appropriate polymerase.

Figure 1 The generalized structure of a eukaryotic promoter and transcription factors.

In addition, there are three different RNA polymerases in eukaryotes, each of which is made up of 10 subunits or more. Each eukaryotic RNA polymerase also requires a distinct set of transcription factors to bring it to the DNA template.

RNA polymerase I is located in the nucleolus, a specialized nuclear substructure in which ribosomal RNA (rRNA) is transcribed, processed, and assembled into ribosomes. The rRNA molecules are considered structural RNAs because they have a cellular role but are not translated into protein. The rRNAs are components of the ribosome and are essential to the process of translation. RNA polymerase I synthesizes most of the rRNAs.

RNA polymerase II is located in the nucleus and synthesizes all protein-coding nuclear pre-mRNAs. Eukaryotic pre-mRNAs undergo extensive processing after transcription but before translation. For clarity, the term “mRNA” will only be used to describe the mature, processed molecules that are ready to be translated. RNA polymerase II is responsible for transcribing the overwhelming majority of eukaryotic genes.

RNA polymerase III is also located in the nucleus. This polymerase transcribes a variety of structural RNAs including transfer pre-RNAs (pre-tRNAs), and small nuclear pre-RNAs. The tRNAs have a critical role in translation; they serve as the adaptor molecules between the mRNA template and the growing polypeptide chain. Small nuclear RNAs have a variety of functions, including “splicing” pre-mRNAs and regulating transcription factors.

Each of the types of RNA polymerase recognizes a different promoter sequence and requires different transcription factors.


Following the formation of the preinitiation complex, the polymerase is released from the other transcription factors, and elongation is allowed to proceed as it does in prokaryotes with the RNA polymerase synthesizing pre-mRNA in the 5′ to 3′ direction. As discussed previously, RNA polymerase II transcribes the major share of eukaryotic genes, so this section will focus on how this polymerase accomplishes elongation and termination.

Although the enzymatic process of elongation is essentially the same in eukaryotes and prokaryotes, the DNA template is more complex. When eukaryotic cells are not dividing, their genes exist as a diffuse mass of DNA and proteins called chromatin. The DNA is tightly packaged around charged histone proteins at repeated intervals. These DNA–histone complexes, collectively called nucleosomes, are regularly spaced and include 146 nucleotides of DNA wound around eight histones like thread around a spool.

For RNA synthesis to occur, the transcription machinery needs to move histones out of the way every time it encounters a nucleosome. This is accomplished by a special protein complex called FACT, which stands for “facilitates chromatin transcription.” This complex pulls histones away from the DNA template as the polymerase moves along it. Once the pre-mRNA is synthesized, the FACT complex replaces the histones to recreate the nucleosomes.


The termination of transcription is different for the different polymerases. Unlike in prokaryotes, elongation by RNA polymerase II in eukaryotes continues 1,000–2,000 nucleotides beyond the end of the gene being transcribed. This pre-mRNA tail is removed during mRNA processing. RNA polymerases I and III require termination signals. Genes transcribed by RNA polymerase I contain a specific 18-nucleotide sequence that is recognized by a termination protein. Termination by RNA polymerase III involves a hairpin in the newly made RNA that causes the transcript to be released.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Biology. OpenStax CNX. December 21, 2017.


Eukaryotic RNA Processing

Eukaryotic mRNAs must undergo several processing steps before they can be transferred from the nucleus to the cytoplasm and translated into a protein. The additional steps involved in eukaryotic mRNA maturation create a molecule that is much more stable than a prokaryotic mRNA. Eukaryotic mRNAs typically last for several hours, whereas the typical prokaryotic mRNA is degraded within seconds to minutes.

The mRNA transcript is coated in RNA-stabilizing proteins to prevent it from degrading while it is processed and exported out of the nucleus. The three most important steps of pre-mRNA processing are the addition of stabilizing and signaling factors at the 5′ and 3′ ends of the molecule, and the removal of intervening sequences that do not specify the appropriate amino acids. In rare cases, the mRNA transcript can be “edited” after it is transcribed.

5′ Capping

While the pre-mRNA is still being synthesized, a 7-methylguanosine cap is added to the 5′ end of the growing transcript by a phosphate linkage. This 5′ cap protects the nascent mRNA from degradation. In addition, factors involved in protein synthesis recognize the cap to help initiate translation by ribosomes. Structurally, 7-methylguanosine looks like a guanine nucleotide, but with an added methyl group (Figure 1).

Figure 1 The structure of the 5′ cap. Notice that the 7-methylguanosine is shaped like a purine nucleotide, but has an extra methyl group (CH3). It is attached to the mRNA upside-down compared to the other nucleotides. Photo credit Zephyris; Wikipedia.

3′ Poly-A Tail

Once elongation is complete, the pre-mRNA is cleaved by an endonuclease between an AAUAAA consensus sequence and a GU-rich sequence, leaving the AAUAAA sequence on the pre-mRNA. An enzyme called poly-A polymerase then adds a string of approximately 200 adenine nucleotides, called the poly-A tail. This modification further protects the pre-mRNA from degradation and signals to cellular factors that the transcript needs to be exported to the cytoplasm.
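
The cleavage and tailing steps can be sketched as string operations in Python. This is a rough model under stated assumptions: the text says only that cleavage happens between the AAUAAA sequence and the GU-rich sequence, so the `cleavage_offset` parameter (how far past AAUAAA the cut is made) is a hypothetical value chosen for illustration, as are the function name and the example sequence.

```python
SIGNAL = "AAUAAA"

def add_poly_a_tail(pre_mrna: str, cleavage_offset: int = 15, tail_length: int = 200) -> str:
    """Cut the transcript downstream of AAUAAA, then append a run of A's."""
    pos = pre_mrna.find(SIGNAL)
    if pos == -1:
        return pre_mrna  # no signal found: this sketch leaves the transcript alone
    # Assumed cut position: a fixed number of bases after the signal.
    cut = min(pos + len(SIGNAL) + cleavage_offset, len(pre_mrna))
    return pre_mrna[:cut] + "A" * tail_length

pre_mrna = "GGCC" + "AAUAAA" + "C" * 20 + "GUGUGUGU"
processed = add_poly_a_tail(pre_mrna)
print(len(processed))          # 225
print(SIGNAL in processed)     # True
print("GUGU" in processed)     # False
```

The AAUAAA sequence stays on the processed transcript, while the GU-rich region downstream of the cut is discarded.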

Pre-mRNA Splicing

Eukaryotic genes are composed of exons, which correspond to protein-coding sequences (ex-on signifies that they are expressed), and intervening sequences called introns (int-ron denotes their intervening role), which may be involved in gene regulation but are removed from the pre-mRNA during processing (Figure 2). Intron sequences in mRNA do not encode functional proteins.

Figure 2 Eukaryotic mRNA contains introns that must be spliced out. A 5′ cap and 3′ tail are also added. Photo credit Kazulanth; Wikimedia. This work has been released into the public domain.

The discovery of introns came as a surprise to researchers in the 1970s who expected that pre-mRNAs would specify protein sequences without further processing, as they had observed in prokaryotes. The genes of higher eukaryotes very often contain one or more introns. These regions may correspond to regulatory sequences; however, the biological significance of having many introns or having very long introns in a gene is unclear. It is possible that introns slow down gene expression because it takes longer to transcribe pre-mRNAs with lots of introns. Alternatively, introns may be nonfunctional sequence remnants left over from the fusion of ancient genes throughout evolution. This is supported by the fact that separate exons often encode separate protein subunits or domains. For the most part, the sequences of introns can be mutated without ultimately affecting the protein product.

All of a pre-mRNA’s introns must be completely and precisely removed before protein synthesis. If the process errs by even a single nucleotide, the reading frame of the rejoined exons would shift, and the resulting protein would be dysfunctional. The process of removing introns and reconnecting exons is called splicing (Figures 2 and 3). Introns are removed and degraded while the pre-mRNA is still in the nucleus. Splicing occurs by a sequence-specific mechanism that ensures introns will be removed and exons rejoined with the accuracy and precision of a single nucleotide. The splicing of pre-mRNAs is conducted by complexes of proteins and RNA molecules called spliceosomes (Figure 3).
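
Why splicing must be accurate to a single nucleotide can be seen in a small Python sketch. The sequence is invented, the intron is written in lowercase only to make it easy to spot, and the function names `splice` and `codons` are hypothetical. Intron positions are given as (start, end) index pairs, where `end` is the index just past the intron.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove each intron (sorted, non-overlapping index pairs) and join the exons."""
    exons = []
    position = 0
    for start, end in introns:
        exons.append(pre_mrna[position:start])
        position = end
    exons.append(pre_mrna[position:])
    return "".join(exons)

def codons(mrna: str) -> list[str]:
    return [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]

pre_mrna = "AUGUUU" + "guaaguccag" + "UUAUAA"  # exon, intron, exon

correct = splice(pre_mrna, [(6, 16)])
off_by_one = splice(pre_mrna, [(6, 15)])  # leaves one intron base behind

print(codons(correct))     # ['AUG', 'UUU', 'UUA', 'UAA']
print(codons(off_by_one))  # ['AUG', 'UUU', 'gUU', 'AUA']
```

With the error of one nucleotide, every codon after the splice junction is different, including the stop codon that ended the correctly spliced message.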

Note that a single pre-mRNA can contain more than 70 individual introns, and each must be spliced out, in addition to 5′ capping and the addition of a poly-A tail, just to generate a single, translatable mRNA molecule.

Figure 3 Pre-mRNA splicing involves the precise removal of introns from the primary RNA transcript. The splicing process is catalyzed by protein complexes called spliceosomes that are composed of proteins and RNA molecules called snRNAs. Spliceosomes recognize sequences at the 5′ and 3′ end of the intron.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Biology. OpenStax CNX. May 27, 2016



 The synthesis of proteins consumes more of a cell’s energy than any other metabolic process. In turn, proteins account for more mass than any other component of living organisms (with the exception of water), and proteins perform virtually every function of a cell. The process of translation, or protein synthesis, involves the decoding of an mRNA message into a polypeptide (protein) product. Amino acids are covalently strung together by peptide bonds in lengths ranging from approximately 50 amino acid residues to more than 1,000. Each individual amino acid has an amino group (NH2) and a carboxyl (COOH) group. Polypeptides are formed when the amino group of one amino acid forms an amide (i.e., peptide) bond with the carboxyl group of another amino acid (Figure 1). This reaction is catalyzed by ribosomes and generates one water molecule.
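
Because each peptide bond joins two neighboring amino acids and releases one water molecule, the number of water molecules produced follows directly from the length of the chain. The Python sketch below makes that count explicit. The function name is a hypothetical label.

```python
def water_molecules_released(num_amino_acids: int) -> int:
    """A chain of n amino acids has n - 1 peptide bonds, one water per bond."""
    if num_amino_acids < 1:
        raise ValueError("a polypeptide needs at least one amino acid")
    return num_amino_acids - 1

for length in (2, 50, 1000):
    print(length, "amino acids:", water_molecules_released(length), "water molecules")
```

For the chain lengths mentioned above, a 50-residue polypeptide releases 49 water molecules and a 1,000-residue polypeptide releases 999.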

Remember that there are 20 different amino acids that are commonly used. In Figure 1, R and R’ represent the part of each amino acid that differs among these 20 structures.

Figure 1 A peptide bond links the carboxyl end of one amino acid with the amino end of another, expelling one water molecule. For simplicity in this image, only the functional groups involved in the peptide bond are shown. The R and R’ designations refer to the rest of each amino acid structure.

The Protein Synthesis Machinery

In addition to the mRNA template, many other molecules contribute to the process of translation, and the general structures and functions of the protein synthesis machinery are comparable from bacteria to human cells. Translation requires the input of an mRNA template, ribosomes, tRNAs, and various enzymatic factors (Figure 2).


Even before an mRNA is translated, a cell must invest energy to build each of its ribosomes. Ribosomes are the part of the cell which reads the information in the mRNA molecule and joins amino acids together in the correct order. A ribosome is a very large, complex macromolecule composed of structural and catalytic rRNAs, and many distinct polypeptides. In eukaryotes, the nucleolus is completely specialized for the synthesis and assembly of rRNAs (the RNA component that makes up ribosomes).

Ribosomes are made up of two subunits that come together for translation, rather like a hamburger bun comes together around the meat (the mRNA). The small subunit is responsible for binding the mRNA template, whereas the large subunit sequentially binds tRNAs, a type of RNA molecule that brings amino acids to the growing chain of the polypeptide. Each mRNA molecule can be simultaneously translated by many ribosomes, all synthesizing protein in the same direction: reading the mRNA from 5′ to 3′ and synthesizing the polypeptide from the N terminus to the C terminus (refer to Figure 1: the N terminus is the end of the polypeptide with the free amino group, which contains nitrogen; the C terminus is the end with the free carboxyl group, which contains carbon).

Ribosomes exist in the cytoplasm in prokaryotes and in the cytoplasm and rough endoplasmic reticulum in eukaryotes. Mitochondria and chloroplasts also have their own ribosomes in the matrix and stroma, which look more similar to prokaryotic ribosomes (and have similar drug sensitivities) than the ribosomes just outside their outer membranes in the cytoplasm.

Figure 2 The protein synthesis machinery includes the large and small subunits of the ribosome, mRNA, and tRNA.


Depending on the species, 40 to 60 types of tRNA exist in the cytoplasm. Serving as adaptors, specific tRNAs bind to sequences on the mRNA template and add the corresponding amino acid to the polypeptide chain. Therefore, tRNAs are the molecules that actually “translate” the language of RNA into the language of proteins.

Each tRNA is made up of a linear RNA molecule that is folded into a complex shape (Figure 3). At one end of the tRNA is an anticodon, which recognizes and base pairs with one of the mRNA codons. At the other end, a specific amino acid is attached. Of the 64 possible mRNA codons (the triplet combinations of A, U, G, and C), three specify the termination of protein synthesis and 61 specify the addition of amino acids to the polypeptide chain. Of these 61, one codon (AUG) also encodes the initiation of translation. Each tRNA anticodon can base pair with one of the mRNA codons and add the corresponding amino acid, according to the genetic code; the three stop codons are recognized by release factors rather than by tRNAs. For instance, if the sequence CUA occurred on an mRNA template in the proper reading frame, it would bind a tRNA carrying the complementary anticodon, GAU, which would be linked to the amino acid leucine.
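
The codon counts and the codon-anticodon pairing in this paragraph can be reproduced with a short Python sketch. It assumes the three stop codons of the standard genetic code (UAA, UAG, and UGA). The anticodon is written base by base opposite the codon, so it reads in the reverse direction (3′ to 5′) relative to the codon. The function name `anticodon` is a hypothetical label.

```python
from itertools import product

STOP_CODONS = {"UAA", "UAG", "UGA"}
RNA_PARTNER = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Every triplet that can be built from the four RNA bases.
ALL_CODONS = ["".join(triplet) for triplet in product("AUGC", repeat=3)]
AMINO_ACID_CODONS = [codon for codon in ALL_CODONS if codon not in STOP_CODONS]

def anticodon(codon: str) -> str:
    """Bases that pair with the codon, listed in the same left-to-right order."""
    return "".join(RNA_PARTNER[base] for base in codon)

print(len(ALL_CODONS), len(AMINO_ACID_CODONS))  # 64 61
print(anticodon("CUA"))                         # GAU
```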

Figure 3 The RNA molecule that makes up a tRNA folds into the complex 3-D structure seen here. In this figure, the anticodon is the grey section at the bottom of the structure. The amino acid would be attached to the yellow part at the top right. Photo credit Yikrazuul; Wikimedia.

Aminoacyl tRNA Synthetases

For each tRNA to function, it must have its specific amino acid bonded to it. In the process of tRNA “charging,” each tRNA molecule is bonded to its correct amino acid by a group of enzymes called aminoacyl tRNA synthetases. At least one type of aminoacyl tRNA synthetase exists for each of the 20 amino acids; the exact number of aminoacyl tRNA synthetases varies by species. These enzymes utilize the energy from ATP to energize a specific amino acid, which is then transferred to the tRNA. In this way, tRNA molecules can be used over and over again, but each tRNA always carries the same amino acid because of the specificity of the aminoacyl tRNA synthetase enzymes.

The Mechanism of Protein Synthesis

As with mRNA synthesis, protein synthesis can be divided into three phases: initiation, elongation, and termination. The process of translation is similar in prokaryotes and eukaryotes. Here we’ll explore how translation occurs in E. coli, a representative prokaryote, and specify any differences between prokaryotic and eukaryotic translation.

Initiation of Translation

Protein synthesis begins with the formation of an initiation complex. In E. coli, this complex involves the small ribosomal subunit, the mRNA template, three initiation factors (IF-1, IF-2, and IF-3), and a special initiator tRNA, called tRNAfMet. The initiator tRNA interacts with the start codon AUG, links to a formylated methionine amino acid called fMet, and can also bind IF-2. Formylated methionine is inserted by fMet-tRNAfMet at the beginning of every polypeptide chain synthesized by E. coli, but it is usually clipped off after translation is complete. When an in-frame AUG is encountered during translation elongation, a non-formylated methionine is inserted by a regular Met-tRNAMet.

In E. coli mRNA, a sequence upstream of the first AUG codon, called the Shine-Dalgarno sequence (AGGAGG), interacts with the rRNA molecules that compose the ribosome. This interaction anchors the small ribosomal subunit at the correct location on the mRNA template. Guanosine triphosphate (GTP), which is a purine nucleotide triphosphate, acts as an energy source during translation—both at the start of elongation and during the ribosome’s translocation.
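
The role of the Shine-Dalgarno sequence in choosing the correct start codon can be sketched in Python. This is a simplification: it treats the exact sequence AGGAGG as a landmark and picks the first AUG after it, whereas a real ribosome relies on base pairing with rRNA and on spacing. The function name and the example mRNA are invented for illustration.

```python
SHINE_DALGARNO = "AGGAGG"

def start_codon_position(mrna: str) -> int:
    """Index of the first AUG downstream of the Shine-Dalgarno sequence, or -1."""
    sd_position = mrna.find(SHINE_DALGARNO)
    if sd_position == -1:
        return -1
    return mrna.find("AUG", sd_position + len(SHINE_DALGARNO))

mrna = "UUAUGCC" + "AGGAGG" + "UAACC" + "AUGUUUUAA"
print(mrna.find("AUG"))            # 2  (an AUG upstream of the landmark)
print(start_codon_position(mrna))  # 18 (the AUG used in this sketch)
```

The upstream AUG at position 2 is ignored because the small ribosomal subunit is anchored at the Shine-Dalgarno sequence, which lies beyond it.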

In eukaryotes, a similar initiation complex forms, comprising mRNA, the small ribosomal subunit, IFs, and nucleoside triphosphates (GTP and ATP). The charged initiator tRNA, called Met-tRNAi, does not bind fMet in eukaryotes, but is distinct from other Met-tRNAs in that it can bind IFs. As in E. coli, a “normal” methionine amino acid is inserted when the ribosome encounters in-frame AUG codons.

Instead of binding at the Shine-Dalgarno sequence, the eukaryotic initiation complex recognizes the 7-methylguanosine cap at the 5′ end of the mRNA. A cap-binding protein (CBP) and several other IFs assist the movement of the ribosome to the 5′ cap. Once at the cap, the initiation complex tracks along the mRNA in the 5′ to 3′ direction, searching for the AUG start codon. Many eukaryotic mRNAs are translated from the first AUG, but this is not always the case. The sequence of bases around each AUG helps determine if that AUG will be used as the start codon.

Once the appropriate AUG is identified, the large ribosomal subunit binds to the complex of Met-tRNAi, mRNA, and the small ribosomal subunit. This step completes the initiation of translation in eukaryotes.

Summary: In both prokaryotes and eukaryotes, the small ribosomal subunit binds to the special initiator methionine tRNA. With the help of several other factors, this complex identifies the start codon (AUG) based on the sequence of nucleotides nearby (Figure 4, top diagram). Then, the large ribosomal subunit binds (Figure 4, middle diagram).

Figure 4 Translation begins when a tRNA anticodon recognizes a codon on the mRNA. The large ribosomal subunit joins the small subunit, and a second tRNA is recruited. As the mRNA moves relative to the ribosome, the polypeptide chain is formed. Entry of a release factor into the A site terminates translation and the components dissociate.

Translation, Elongation, and Termination

In prokaryotes and eukaryotes, the basics of elongation are the same. The large ribosomal subunit consists of three compartments. The A (aminoacyl) site binds incoming charged aminoacyl tRNAs. The P (peptidyl) site binds charged tRNAs carrying amino acids that have formed peptide bonds with the growing polypeptide chain but have not yet dissociated from their corresponding tRNA. The E (exit) site releases dissociated tRNAs so that they can be recharged with free amino acids. There is one exception to this assembly line of tRNAs: in E. coli, fMet-tRNAFMet is capable of entering the P site directly without first entering the A site. Similarly, the eukaryotic Met-tRNAi, with help from other proteins of the initiation complex, binds directly to the P site. In both cases, this creates an initiation complex with a free A site ready to accept the tRNA corresponding to the first codon after the AUG.

During translation elongation, the mRNA template provides specificity. As the ribosome moves along the mRNA, each mRNA codon comes into register, and specific binding with the corresponding charged tRNA anticodon is ensured. If mRNA were not present in the elongation complex, the ribosome would bind tRNAs nonspecifically and a nonsense protein would be produced.
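Reading codons "in register" can be illustrated with a short Python sketch. It assumes the mRNA is a plain string written 5′ to 3′, and it uses only a small, partial codon table covering the codons in the hypothetical example; the function name is illustrative and the tRNA and ribosome machinery are not modeled.

```python
# Partial codon table: only the codons used in the example below,
# plus the three stop codons. The full genetic code has 64 entries.
CODON_TABLE = {
    "AUG": "Met",
    "GCU": "Ala",
    "AAA": "Lys",
    "UUU": "Phe",
    "GGC": "Gly",
    "UAA": "STOP",
    "UAG": "STOP",
    "UGA": "STOP",
}


def translate(mrna, start):
    """Read the mRNA three bases at a time from `start` and return the
    list of amino acids, stopping at the first stop codon."""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        amino_acid = CODON_TABLE.get(codon)
        if amino_acid is None:
            raise ValueError(f"codon {codon} is not in the partial table")
        if amino_acid == "STOP":
            break
        peptide.append(amino_acid)
    return peptide


# Hypothetical coding sequence beginning at the AUG start codon.
print(translate("AUGGCUAAAUUUGGCUAA", 0))
```

Each step of the loop advances by exactly three bases, which is what keeps every codon aligned with the reading frame set by the start codon.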

Elongation proceeds with charged tRNAs entering the A site and then shifting to the P site followed by the E site with each single-codon “step” of the ribosome (Figure 4, bottom diagram). Ribosomal steps are induced by conformational changes that advance the ribosome by three bases in the 3′ direction. The energy for each step of the ribosome is donated by an elongation factor that hydrolyzes GTP. Peptide bonds form between the amino group of the amino acid attached to the A-site tRNA and the carboxyl group of the amino acid attached to the P-site tRNA. The formation of each peptide bond is catalyzed by peptidyl transferase, an RNA-based enzyme that is integrated into the large ribosomal subunit. The energy for each peptide bond formation is derived from GTP hydrolysis, which is catalyzed by a separate elongation factor. The amino acid bound to the P-site tRNA is also linked to the growing polypeptide chain. As the ribosome steps across the mRNA, the former P-site tRNA enters the E site, detaches from the amino acid, and is expelled (Figure 4). Amazingly, the E. coli translation apparatus takes only 0.05 seconds to add each amino acid, meaning that a 200-amino acid protein can be translated in just 10 seconds.
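The timing estimate at the end of the paragraph above is a simple multiplication, shown here as a small Python sketch. The rate of 0.05 seconds per amino acid is the figure quoted in the text; the function name is hypothetical, and the calculation assumes a constant rate with no pauses for initiation or termination.

```python
SECONDS_PER_AMINO_ACID = 0.05  # E. coli rate quoted in the text


def translation_time(n_amino_acids, seconds_per_residue=SECONDS_PER_AMINO_ACID):
    """Estimated elongation time in seconds, assuming a constant rate."""
    return n_amino_acids * seconds_per_residue


# The 200-amino acid protein used as the example in the text.
print(f"{translation_time(200):.1f} s")
```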


Figure 5 The movement of the tRNA molecules through the ribosome during protein synthesis. Note that the ribosome is moving from 5′ to 3′ along the mRNA, and the tRNAs are coming in from the front (the 3′ direction) and exiting at the back (the 5′ direction). Photo credit Boumphreyfr; Wikimedia.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Biology. OpenStax CNX. January 2, 2017


Optional Section - Micropigs

Micropigs are tiny, genetically edited pigs that have recently been developed by a Chinese genomics institute (Li, 2014). The Chinese scientists used a technique called TALENs to edit the genome of pig cells (Figure 1). Each cell inside a pig contains two copies of the growth hormone receptor gene: one from each of its parents. The TALENs technique was used to delete one of the two copies of this gene.

Figure 1 General overview of the TALEN process. The left and right TALEN bind to a specific sequence of genomic DNA inside the nucleus of a cell. When they are correctly bound, nuclease enzymes (represented by scissors) cut the genomic DNA. The TALEN sequences can be edited by scientists to target different DNA sequences in the genome. (Photo credit: Ogletreerd, Wikimedia.)

Growth hormone (GH) stimulates the growth of essentially all tissues within the body. GH is a 191 amino acid peptide (protein) hormone which is produced from the GH gene. Cells sense the presence of GH protein hormone with the growth hormone receptor (GHR) protein on the outside of the cell. GHR protein is produced from the GHR gene and is found on the cell membrane on the outside of cells.
The GHR protein has three major parts:

  • An extracellular region that sticks out from the outside surface of the cell
  • A transmembrane region that goes through the cell membrane and anchors the receptor to the membrane
  • An intracellular region on the inside of the cell membrane that transmits signals to the interior of the cell.

The extracellular region binds (attaches) to GH, fitting together like a lock and its key. The binding of growth hormone transmits signals through the cell membrane to the intracellular region of the receptor (Figure 2). These signals “turn on” genes involved in growth and metabolism so that those genes are made into proteins. These proteins stimulate the growth and division of other cells in the organism.

If growth hormone is not present, the organism will not grow to full size. In humans, severe GH deficiency can lead to an adult height of only 4 feet. If growth hormone receptor is not present, the “grow” signal from the GH will not be transmitted inside of cells, so growth will not be stimulated.

Figure 2 Growth hormone signaling pathway. When GH (growth hormone) binds to GHR (growth hormone receptor), a signal is sent through the cell membrane and into the nucleus of the cell. This signal turns on genes involved in cell metabolism and growth. (Photo credit: Lisa Bartee, 2017)


Li F, Li Y, Liu H, Zhang H, Liu C, Zhang X, Dou H, Yang W, Du Y. 2014 Sep. Production of GHR double-allelic knockout Bama pig by TALENs and handmade cloning. Yi Chuan. 36(9):903-11.


Gene Regulation

Learning Objectives

By the end of this section, you will be able to:

  • Describe processes through which gene expression can be regulated.
  • Differentiate between gene regulation processes used by prokaryotes and eukaryotes.
  • Discuss the possible evolutionary consequences of changes in gene expression.

Each cell expresses, or turns on, only a fraction of its genes. The rest of the genes are repressed, or turned off. The process of turning genes on and off is known as gene regulation. Gene regulation is an important part of normal development. Genes are turned on and off in different patterns during development to make a brain cell look and act different from a liver cell or a muscle cell, for example. Gene regulation also allows cells to react quickly to changes in their environments. Although we know that the regulation of genes is critical for life, this complex process is not yet fully understood.

For a cell to function properly, necessary proteins must be synthesized at the proper time. All organisms and cells control or regulate the transcription and translation of their DNA into protein. The process of turning on a gene to produce RNA and protein is called gene expression. Whether in a simple unicellular organism or in a complex multicellular organism, each cell controls when and how its genes are expressed. For this to occur, there must be a mechanism to control when a gene is expressed to make RNA and protein, how much of the protein is made, and when it is time to stop making that protein because it is no longer needed.

Cells in multicellular organisms are specialized; cells in different tissues look very different and perform different functions. For example, a muscle cell is very different from a liver cell, which is very different from a skin cell. These differences are a consequence of the expression of different sets of genes in each of these cells. All cells have certain basic functions they must perform for themselves, such as converting the energy in sugar molecules into energy in ATP. Each cell also has many genes that are not expressed, and expresses many that are not expressed by other cells, such that it can carry out its specialized functions. In addition, cells will turn on or off certain genes at different times in response to changes in the environment or at different times during the development of the organism. Unicellular organisms, both eukaryotic and prokaryotic, also turn on and off genes in response to the demands of their environment so that they can respond to special conditions.


Figure 1 The unique color pattern of this cat’s fur is caused by either the orange or the black allele of a gene being randomly silenced (turned off).

The control of gene expression is extremely complex. Malfunctions in this process are detrimental to the cell and can lead to the development of many diseases, including cancer.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016


Prokaryotic versus Eukaryotic Gene Expression

To understand how gene expression is regulated, we must first understand how a gene becomes a functional protein in a cell. The process occurs in both prokaryotic and eukaryotic cells, just in slightly different fashions.

Because prokaryotic organisms lack a cell nucleus, the processes of transcription and translation occur almost simultaneously. When the protein is no longer needed, transcription stops. As a result, the primary method to control what type and how much protein is expressed in a prokaryotic cell is through the regulation of DNA transcription into RNA. All the subsequent steps happen automatically. When more protein is required, more transcription occurs. Therefore, in prokaryotic cells, the control of gene expression is almost entirely at the transcriptional level.

Eukaryotic cells, in contrast, have intracellular organelles and are much more complex. Recall that in eukaryotic cells, the DNA is contained inside the cell’s nucleus and it is transcribed into mRNA there. The newly synthesized mRNA is then transported out of the nucleus into the cytoplasm, where ribosomes translate the mRNA into protein. The processes of transcription and translation are physically separated by the nuclear membrane; transcription occurs only within the nucleus, and translation only occurs outside the nucleus in the cytoplasm. The regulation of gene expression can occur at all stages of the process (Figure 1):

  • Epigenetic level: regulates how tightly the DNA is wound around histone proteins to package it into chromosomes
  • Transcriptional level: regulates how much transcription takes place
  • Post-transcriptional level: regulates aspects of RNA processing (such as splicing) and transport out of the nucleus
  • Translational level: regulates how much of the RNA is translated into protein
  • Post-translational level: regulates how long the protein lasts after it has been made and whether the protein is processed into an active form

Figure 1 Eukaryotic gene expression is regulated during transcription and RNA processing, which take place in the nucleus, as well as during protein translation, which takes place in the cytoplasm. Further regulation may occur through post-translational modifications of proteins.

The differences in the regulation of gene expression between prokaryotes and eukaryotes are summarized in Table 1.

Table 1: Differences in the Regulation of Gene Expression of Prokaryotic and Eukaryotic Organisms

Prokaryotic organisms | Eukaryotic organisms
Lack nucleus | Contain nucleus
RNA transcription and protein translation occur almost simultaneously | RNA transcription occurs prior to protein translation, and it takes place in the nucleus. RNA translation to protein occurs in the cytoplasm. RNA post-processing includes addition of a 5′ cap and a poly-A tail, excision of introns, and splicing of exons.
Gene expression is regulated primarily at the transcriptional level | Gene expression is regulated at many levels (epigenetic, transcriptional, post-transcriptional, translational, and post-translational)


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016


Eukaryotic epigenetic regulation

DNA modifications that do not change the DNA sequence can affect gene activity. Chemical compounds that are added to single genes can regulate their activity; these modifications are known as epigenetic changes. The epigenome comprises all of the chemical compounds that have been added to the entirety of one’s DNA (genome) as a way to regulate the activity (expression) of all the genes within the genome. The chemical compounds of the epigenome are not part of the DNA sequence, but are on or attached to DNA (“epi-” means above in Greek). Epigenomic modifications remain as cells divide and in some cases can be inherited through the generations. Environmental influences, such as a person’s diet and exposure to pollutants, can also impact the epigenome.

Epigenetic changes can help determine whether genes are turned on or off and can influence the production of proteins in certain cells, ensuring that only necessary proteins are produced. For example, proteins that promote bone growth are not produced in muscle cells. Patterns of epigenome modification vary among individuals, different tissues within an individual, and even different cells.

Regulating Access to Genes

Recall that the human genome encodes over 20,000 genes; each of the 23 pairs of human chromosomes encodes thousands of genes. The DNA in the nucleus is precisely wound, folded, and compacted into chromosomes so that it will fit into the nucleus. It is also organized so that specific segments can be accessed as needed by a specific cell type.

The first level of organization, or packing, is the winding of DNA strands around histone proteins. Histones package and order DNA into structural units called nucleosome complexes, which can control the access of proteins to the DNA regions (Figure 1a). Under the electron microscope, this winding of DNA around histone proteins to form nucleosomes looks like small beads on a string (Figure 1b). These beads (histone proteins) can move along the string (DNA) and change the structure of the molecule.

Figure 1 DNA is folded around histone proteins to create (a) nucleosome complexes. These nucleosomes control the access of proteins to the underlying DNA. When viewed through an electron microscope (b), the nucleosomes look like beads on a string. (credit “micrograph”: modification of work by Chris Woodcock)

If DNA encoding a specific gene is to be transcribed into RNA, the nucleosomes surrounding that region of DNA can slide down the DNA to open that specific chromosomal region and allow for the transcriptional machinery (RNA polymerase) to initiate transcription. Nucleosomes can move to open the chromosome structure to expose a segment of DNA, but do so in a very controlled manner.

Active open regions of chromatin are called euchromatin (Figure 2). Regions of the genome that are transcriptionally active are typically euchromatic. Tightly wound regions of chromatin are called heterochromatin. Heterochromatic regions of the genome are typically silenced and transcriptionally inactive.

Figure 2 The difference in chromatin packaging between an active (euchromatic) and inactive (heterochromatic) region of DNA.

Modifications to DNA and histones

How the histone proteins move, and whether the DNA is wrapped loosely or tightly around them, is dependent on signals found on both the histone proteins and on the DNA. These signals are tags added to histone proteins and DNA that tell the histones if a chromosomal region should be open or closed. These tags are not permanent, but may be added or removed as needed. They are chemical modifications (phosphate, methyl, or acetyl groups) that are attached to specific amino acids in the protein or to the nucleotides of the DNA. The tags do not alter the DNA base sequence, but they do alter how tightly wound the DNA is around the histone proteins.

This type of gene regulation is called epigenetic regulation. Epigenetic means “around genetics.” The changes that occur to the histone proteins and DNA do not alter the nucleotide sequence and are not permanent. Instead, these changes are temporary (although they can and often do persist through multiple rounds of cell division) and alter the chromosomal structure (open euchromatin or closed heterochromatin) as needed. A gene can be turned on or off depending upon the location and modifications to the histone proteins and DNA. If a gene is to be transcribed, the histone proteins and DNA are modified surrounding the chromosomal region encoding that gene. This opens the chromosomal region (it becomes euchromatic) to allow access for RNA polymerase and other proteins, called transcription factors, to bind to the promoter region, located just upstream of the gene, and initiate transcription. If a gene is to remain turned off, or silenced, the histone proteins and DNA have different modifications that signal a closed chromosomal configuration. In this closed configuration (heterochromatin), the RNA polymerase and transcription factors do not have access to the DNA and transcription cannot occur (Figure 2).

DNA Methylation

A common type of epigenomic modification is called methylation. Methylation involves attaching small molecules called methyl groups, each consisting of one carbon atom and three hydrogen atoms, to DNA nucleotides or the amino acids that make up the histone proteins.

When DNA is methylated, the methyl group is typically added to cytosine nucleotides. This occurs within very specific regions called CpG islands. These are stretches with a high frequency of cytosine and guanine dinucleotide DNA pairs (CG) found in the promoter regions of genes. When this configuration exists, the cytosine member of the pair can be methylated (a methyl group is added). This modification changes how the DNA interacts with proteins, including the histone proteins that control access to the region. When methyl groups are added to a particular gene, that gene is turned off or silenced, and no protein is produced from that gene (Figure 3).
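To make the idea of CG dinucleotides concrete, the following Python sketch lists the positions of cytosines that sit directly before a guanine in a DNA string, since these are the cytosines the text describes as candidates for methylation. The sequence and function name are hypothetical, and the sketch does not try to decide whether a region counts as a CpG island, because the text gives no length or frequency thresholds for that.

```python
def cpg_positions(dna):
    """Return the 0-based index of every cytosine that is immediately
    followed by a guanine (a CG dinucleotide) on the given strand."""
    dna = dna.upper()
    return [i for i in range(len(dna) - 1) if dna[i:i + 2] == "CG"]


def cpg_frequency(dna):
    """Fraction of adjacent base pairs in the sequence that are CG."""
    if len(dna) < 2:
        return 0.0
    return len(cpg_positions(dna)) / (len(dna) - 1)


# Hypothetical promoter fragment, written 5' to 3'.
fragment = "ATCGCGTACGGA"
print(cpg_positions(fragment))
print(round(cpg_frequency(fragment), 3))
```

A stretch with many such positions would score a higher frequency than surrounding DNA, which is the property the paragraph above attributes to CpG islands.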

“Histone Code” Hypothesis

The histone code hypothesis proposes that transcription of a gene is regulated in part by modifications made to histone proteins, primarily on their somewhat floppy ends (their “tails”). Many of the histone tail modifications correlate very well with chromatin structure, and both histone modification state and chromatin structure correlate well with gene expression levels. The most important concept in the histone code hypothesis is that the histone modifications serve to recruit other proteins by specific recognition of the modified histone, rather than through simply stabilizing or destabilizing the interaction between histone and the underlying DNA. These recruited proteins then act to alter chromatin structure actively or to promote transcription.

The histone code has the potential to be massively complex. There are at least 20 modifications that are made to histone tails that have been relatively well characterized, and the potential for many more that we have not discovered. Each histone can be modified on multiple amino acids, with multiple different chemical modifications. The information that can be stored in the histone code dwarfs the amount that is stored in the order of the bases in the human genome.
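A rough sense of how quickly the number of possible combinations grows can be had from a short Python sketch. It rests on a strong simplifying assumption that is not in the text: each modification is treated as an independent mark that is either present or absent. Real modifications are tied to specific amino acids and are not all independent, so this only illustrates combinatorial growth and is not an estimate of the information content of the histone code.

```python
def possible_mark_patterns(n_marks):
    """Number of distinct present/absent patterns for n independent marks.

    Assumption: every mark is binary and independent of the others.
    """
    return 2 ** n_marks


# 20 is the "at least 20" figure for well-characterized modifications.
for n in (1, 5, 10, 20):
    print(n, possible_mark_patterns(n))
```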

Histone Methylation

A portion of the histone protein known as the histone tail can have methyl groups (CH3) added to it. This is the same modification that is made to cytosine nucleotides in DNA. The specific amino acid in the histone tail that gets methylated is very important for determining whether it will tighten or loosen chromatin structure. Modification to several amino acids in the tail is correlated with euchromatin and active transcription, while modification to other amino acids is correlated with heterochromatin and gene silencing.

Histone Acetylation

Histone tails can also be modified by the addition of an acetyl group (this process is known as acetylation). If you remember from cellular respiration, an acetyl group (such as that found in acetyl-CoA) is a 2-carbon molecule. When histone tails are acetylated, this typically causes the tails to loosen from around the DNA, allowing the chromatin to loosen (Figure 3).

Figure 3 Nucleosomes can slide along DNA. When nucleosomes are spaced closely together (top), transcription factors cannot bind and gene expression is turned off. When the nucleosomes are spaced far apart (bottom), the DNA is exposed. Transcription factors can bind, allowing gene expression to occur. Modifications to the histones and DNA affect nucleosome spacing.

Other modifications

There are many other modifications that can be made to histone proteins in addition to methylation and acetylation. Histone tails can be phosphorylated or ubiquitinated (where a small protein called ubiquitin is attached). Histone phosphorylation seems to be related to DNA repair. Ubiquitination has been shown to be associated with either transcriptional activation or inactivation, depending on the specific location.

Epigenetic Changes


Because errors in the epigenetic process, such as modifying the wrong gene or failing to add a compound to a gene, can lead to abnormal gene activity or inactivity, they can cause genetic disorders. Conditions including cancers, metabolic disorders, and degenerative disorders have all been found to be related to epigenetic errors.

Cancerous cells often have regions of DNA that show different levels of methylation compared to normal cells. Some genes are methylated and silenced in cancerous cells, while they are unmethylated and active in normal cells. Other genes are active in cancerous cells, but inactive in normal cells. Each specific cancer in each specific individual can show different patterns of methylation, although there are similarities between many different types of cancer.

Scientists continue to explore the relationship between the genome and the chemical compounds that modify it. In particular, they are studying what effect the modifications have on gene function, protein production, and human health.

Figure 4 Histone proteins and DNA nucleotides can be modified chemically. Modifications affect nucleosome spacing and gene expression. (credit: modification of work by NIH)


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016


Eukaryotic Transcriptional Regulation

As in prokaryotic cells, the transcription of genes in eukaryotes requires an RNA polymerase to bind to a sequence upstream of a gene to initiate transcription. However, unlike in prokaryotic cells, the eukaryotic RNA polymerase requires other proteins, or transcription factors, to facilitate transcription initiation. Transcription factors are proteins that bind to the promoter sequence and other regulatory sequences to control the transcription of the target gene. RNA polymerase by itself cannot initiate transcription in eukaryotic cells. Transcription factors must bind to the promoter region first and recruit RNA polymerase to the site for transcription to be established.

The Promoter and the Transcription Machinery

Genes are organized to make the control of gene expression easier. The promoter region is immediately upstream of the coding sequence. This region can be short (only a few nucleotides in length) or quite long (hundreds of nucleotides long). The longer the promoter, the more available space for proteins to bind. This also adds more control to the transcription process. The length of the promoter is gene-specific and can differ dramatically between genes. Consequently, the level of control of gene expression can also differ quite dramatically between genes. The purpose of the promoter is to bind transcription factors that control the initiation of transcription.

The interaction between parts of the promoter and transcription factors is quite complex (Figure 1) and differs between specific genes. In addition to the general transcription factors, there are gene-specific transcription factors. There are hundreds of transcription factors in a cell that each bind specifically to a particular DNA sequence motif. Transcription factors respond to environmental stimuli that cause the proteins to find their binding sites and initiate transcription of the gene that is needed.

Eukaryotic gene expression is controlled by a promoter immediately adjacent to the gene, and an enhancer far upstream. The DNA folds over itself, bringing the enhancer next to the promoter. Transcription factors and mediator proteins are sandwiched between the promoter and the enhancer. Short DNA sequences within the enhancer called distal control elements bind activators, which in turn bind transcription factors and mediator proteins bound to the promoter. RNA polymerase binds the complex, allowing transcription to begin. Different genes have enhancers with different distal control elements, allowing differential regulation of transcription.

Figure 1 An enhancer is a DNA sequence that promotes transcription. Each enhancer is made up of short DNA sequences called distal control elements. Activators bound to the distal control elements interact with mediator proteins and transcription factors. Two different genes may have the same promoter but different distal control elements, enabling differential gene expression.

Enhancers and Transcription

In some eukaryotic genes, there are regions that help increase or enhance transcription. These regions, called enhancers, are not necessarily close to the genes they enhance. They can be located upstream of a gene, within the coding region of the gene, downstream of a gene, or may be thousands of nucleotides away.

Enhancer regions are binding sequences, or sites, for transcription factors. When a DNA-bending protein binds, the shape of the DNA changes (Figure 1). This shape change allows for the interaction of the activators bound to the enhancers with the transcription factors bound to the promoter region and the RNA polymerase. Whereas DNA is generally depicted as a straight line in two dimensions, it is actually a three-dimensional object. Therefore, a nucleotide sequence thousands of nucleotides away can fold over and interact with a specific promoter.

Turning Genes Off: Transcriptional Repressors

Like prokaryotic cells, eukaryotic cells also have mechanisms to prevent transcription. Transcriptional repressors can bind to promoter or enhancer regions and block transcription. Like the transcriptional activators, repressors respond to external stimuli to prevent the binding of activating transcription factors.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Concepts of Biology. OpenStax CNX. January 3, 2017.


Eukaryotic Post-transcriptional Regulation

After RNA is transcribed, it must be processed into a mature form before translation can begin. This processing after an RNA molecule has been transcribed, but before it is translated into a protein, is called post-transcriptional modification. As with the epigenetic and transcriptional stages of processing, this post-transcriptional step can also be regulated to control gene expression in the cell. If the RNA is not processed, shuttled, or translated, then no protein will be synthesized.

Alternative RNA Splicing

In the 1970s, genes were first observed that exhibited alternative RNA splicing. Alternative RNA splicing is a mechanism that allows different protein products to be produced from one gene when different combinations of introns (and sometimes exons) are removed from the transcript (Figure 1). This alternative splicing can be haphazard, but more often it is controlled and acts as a mechanism of gene regulation, with the frequency of different splicing alternatives controlled by the cell as a way to control the production of different protein products in different cells, or at different stages of development. Alternative splicing is now understood to be a common mechanism of gene regulation in eukaryotes; according to one estimate, 70% of genes in humans are expressed as multiple proteins through alternative splicing.

Figure 1 Pre-mRNA can be alternatively spliced to create different proteins.
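The way one gene can yield several transcripts can be sketched in Python by enumerating the exon combinations produced by exon skipping. This is an illustrative simplification with hypothetical exon names: it assumes the first and last exons are always retained and that exon order never changes, and it covers only exon skipping, not the other modes of alternative splicing.

```python
from itertools import combinations


def exon_skipping_isoforms(exons):
    """Return every transcript that keeps the first and last exon and
    retains any subset of the internal exons, in their original order."""
    if len(exons) < 2:
        raise ValueError("need at least a first and a last exon")
    first, internal, last = exons[0], exons[1:-1], exons[-1]
    isoforms = []
    # Start with all internal exons kept, then skip progressively more.
    for n_kept in range(len(internal), -1, -1):
        for kept in combinations(internal, n_kept):
            isoforms.append([first, *kept, last])
    return isoforms


# Hypothetical four-exon gene.
for isoform in exon_skipping_isoforms(["E1", "E2", "E3", "E4"]):
    print("-".join(isoform))
```

With two internal exons there are four possible transcripts under these assumptions; in a cell, regulation determines which of them are actually produced and how often.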

How could alternative splicing evolve? Introns have a beginning and ending recognition sequence, and it is easy to imagine the failure of the splicing mechanism to identify the end of an intron and instead find the end of the next intron, thus removing two introns and the intervening exon. In fact, there are mechanisms in place to prevent such exon skipping, but mutations are likely to lead to their failure. Such “mistakes” would more than likely produce a nonfunctional protein. Indeed, the cause of many genetic diseases is abnormal splicing rather than mutations in a coding sequence. However, alternative splicing would create a protein variant without the loss of the original protein, opening up possibilities for adaptation of the new variant to new functions. Gene duplication has played an important role in the evolution of new functions in a similar way, by providing genes that may evolve without eliminating the original functional protein.

Figure 2 There are five basic modes of alternative splicing.

Control of RNA Stability

Before the mRNA leaves the nucleus, it is given two protective “caps” that prevent the end of the strand from degrading during its journey. The 5′ cap, which is placed on the 5′ end of the mRNA, is usually composed of a methylated guanosine triphosphate molecule (GTP). The poly-A tail, which is attached to the 3′ end, is usually composed of a series of adenine nucleotides. Once the RNA is transported to the cytoplasm, the length of time that the RNA remains there can be controlled. Each RNA molecule has a defined lifespan and decays at a specific rate. This rate of decay can influence how much protein is in the cell. If the RNA decays more rapidly, translation has less time to occur, so less protein will be produced. Conversely, if RNA decays less rapidly, more protein will be produced. This rate of decay is referred to as the RNA stability. If the RNA is stable, it will be detected for longer periods of time in the cytoplasm. Binding of proteins to the RNA can influence its stability (Figure 3).

In the mature RNA molecule, exons are spliced together between the 5′ and 3′ untranslated regions. A 5′ cap is attached to the 5′ untranslated region, and a poly-A tail is attached to the 3′ untranslated region. RNA-binding proteins associate with the 5′ and 3′ untranslated regions.

Figure 3 The protein-coding region of mRNA is flanked by 5′ and 3′ untranslated regions (UTRs). The presence of RNA-binding proteins at the 5′ or 3′ UTR influences the stability of the RNA molecule.
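The link between RNA stability and protein output can be sketched with a simple decay model in Python. The model is an assumption, not something stated in the text: it treats RNA decay as first-order (a fixed half-life) and translation as proceeding at a constant rate per RNA molecule. The half-lives used are hypothetical numbers chosen only for illustration.

```python
import math


def mrna_remaining(initial_copies, half_life_min, t_min):
    """Copies of an mRNA left after t_min minutes of first-order decay."""
    return initial_copies * 0.5 ** (t_min / half_life_min)


def protein_per_mrna(half_life_min, proteins_per_min=1.0):
    """Total protein made from one mRNA over its lifetime.

    Under first-order decay the mean lifetime is half_life / ln(2),
    so output scales in direct proportion to the half-life.
    """
    return proteins_per_min * half_life_min / math.log(2)


# Hypothetical unstable and stable transcripts.
for half_life in (10, 20):
    print(half_life, mrna_remaining(100, half_life, 10),
          round(protein_per_mrna(half_life), 1))
```

Doubling the half-life doubles the protein made per transcript in this model, which matches the qualitative statement that more stable RNA leads to more protein.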

RNA Stability and microRNAs

In addition to proteins that bind to and control (increase or decrease) RNA stability, other elements called microRNAs can bind to the RNA molecule. These microRNAs, or miRNAs, are short RNA molecules that are only 21–24 nucleotides in length. The miRNAs are made in the nucleus as longer pre-miRNAs. These pre-miRNAs are chopped into mature miRNAs by a protein called Dicer. Mature miRNAs bind to the target RNA and, together with a large protein complex called the RNA-induced silencing complex (RISC), rapidly destroy the RNA molecule.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Concepts of Biology. OpenStax CNX. January 3, 2017.


Eukaryotic Translational and Post-Translational Regulation

After the RNA has been transported to the cytoplasm, it is translated into protein. Control of this process is largely dependent on the RNA molecule. As previously discussed, the stability of the RNA will have a large impact on its translation into a protein. As the stability changes, the amount of time that it is available for translation also changes.

The Initiation Complex and Translation Rate

Like transcription, translation is controlled by proteins that bind and initiate the process. In translation, the complex that assembles to start the process is referred to as the initiation complex. Regulation of the formation of this complex can increase or decrease rates of translation (Figure 1).

The eIF2 protein is a translation factor that binds to the small 40S ribosome subunit. When eIF2 is phosphorylated, translation is blocked.

Figure 1 Gene expression can be controlled by factors that bind the translation initiation complex.

Chemical Modifications, Protein Activity, and Longevity

Proteins can be chemically modified with the addition of groups including methyl, phosphate, acetyl, and ubiquitin groups. The addition or removal of these groups from proteins regulates their activity or the length of time they exist in the cell. Sometimes these modifications can regulate where a protein is found in the cell—for example, in the nucleus, the cytoplasm, or attached to the plasma membrane.

Chemical modifications occur in response to external stimuli such as stress, the lack of nutrients, heat, or ultraviolet light exposure. These changes can alter epigenetic accessibility, transcription, mRNA stability, or translation, all resulting in changes in expression of various genes. This is an efficient way for the cell to rapidly change the levels of specific proteins in response to the environment. Because proteins are involved in every stage of gene regulation, the phosphorylation of a protein (depending on the protein that is modified) can alter accessibility to the chromosome, can alter transcription (by altering transcription factor binding or function), can change nuclear shuttling (by influencing modifications to the nuclear pore complex), can alter RNA stability (by binding or not binding to the RNA to regulate its stability), can modify translation (increase or decrease), or can change post-translational modifications (add or remove phosphates or other chemical modifications).

The addition of a ubiquitin group to a protein marks that protein for degradation. Ubiquitin acts like a flag indicating that the protein lifespan is complete. These proteins are moved to the proteasome, a large protein complex that functions to remove proteins, to be degraded (Figure 2). One way to control gene expression, therefore, is to alter the longevity of the protein.

Multiple ubiquitin groups bind to a protein. The tagged protein is then fed into the hollow tube of a proteasome. The proteasome degrades the protein.

Figure 2 Proteins with ubiquitin tags are marked for degradation within the proteasome.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016


Cancer and Gene Regulation

Cancer is not a single disease but includes many different diseases. In cancer cells, mutations modify cell-cycle control and cells don’t stop growing as they normally would. Mutations can also alter the growth rate or the progression of the cell through the cell cycle. As a result, cells can progress through the cell cycle unimpeded, even if mutations exist in the cell and its growth should be terminated.

Cancer: Disease of Altered Gene Expression

Cancer can be described as a disease of altered gene expression. There are many proteins that are turned on or off (gene activation or gene silencing) that dramatically alter the overall activity of the cell. A gene that is not normally expressed in that cell can be switched on and expressed at high levels. This can be the result of gene mutation or changes in gene regulation (epigenetic, transcription, post-transcription, translation, or post-translation).

Changes in epigenetic regulation, transcription, RNA stability, protein translation, and post-translational control can be detected in cancer. While these changes don’t occur simultaneously in one cancer, changes at each of these levels can be detected when observing cancer at different sites in different individuals. Therefore, changes in histone acetylation (epigenetic modification that leads to gene silencing), activation of transcription factors by phosphorylation, increased RNA stability, increased translational control, and protein modification can all be detected at some point in various cancer cells. Scientists are working to understand the common changes that give rise to certain types of cancer or how a modification might be exploited to destroy a tumor cell.

Tumor Suppressor Genes, Oncogenes, and Cancer

In normal cells, some genes function to prevent excess, inappropriate cell growth. These are tumor suppressor genes, which are active in normal cells to prevent uncontrolled cell growth. There are many tumor suppressor genes in cells. The most studied tumor suppressor gene is p53, which is mutated in over 50 percent of all cancer types. The p53 protein itself functions as a transcription factor. It can bind to sites in the promoters of genes to initiate transcription. Therefore, the mutation of p53 in cancer will dramatically alter the transcriptional activity of its target genes.

Proto-oncogenes are positive cell-cycle regulators (their normal function is to allow the cell cycle to progress through checkpoints). When mutated, proto-oncogenes can become oncogenes and cause cancer. Overexpression of the oncogene can lead to uncontrolled cell growth. This is because oncogenes can alter transcriptional activity, stability, or protein translation of another gene that directly or indirectly controls cell growth.

Cancer and Epigenetic Alterations

Silencing genes through epigenetic mechanisms is also very common in cancer cells. There are characteristic modifications to histone proteins and DNA that are associated with silenced genes. In cancer cells, the DNA in the promoter region of silenced genes is methylated on cytosine residues in CpG islands. Histone proteins that surround that region lack the acetylation modification that is present when the genes are expressed in normal cells. This combination of DNA methylation and histone deacetylation (epigenetic modifications that lead to gene silencing) is commonly found in cancer. When these modifications occur, the gene present in that chromosomal region is silenced. Increasingly, scientists understand how epigenetic marks are altered in cancer. Because these changes are temporary and can be reversed (for example, by blocking the histone deacetylase proteins that remove acetyl groups, or by blocking the DNA methyltransferase enzymes that add methyl groups to cytosines in DNA), it is possible to design new drugs and new therapies to take advantage of the reversible nature of these processes. Indeed, many researchers are testing how a silenced gene can be switched back on in a cancer cell to help re-establish normal growth patterns.

Genes involved in the development of many other illnesses, ranging from allergies to inflammation to autism, are thought to be regulated by epigenetic mechanisms. As our knowledge of how genes are controlled deepens, new ways to treat diseases like cancer will emerge.

Cancer and Transcriptional Control

Alterations in cells that give rise to cancer can affect the transcriptional control of gene expression. Mutations that activate transcription factors, such as increased phosphorylation, can increase the binding of a transcription factor to its binding site in a promoter. This could lead to increased transcriptional activation of that gene that results in modified cell growth. Alternatively, a mutation in the DNA of a promoter or enhancer region can increase the binding ability of a transcription factor. This could also lead to the increased transcription and aberrant gene expression that is seen in cancer cells.

Researchers have been investigating how to control the transcriptional activation of gene expression in cancer. Identifying how a transcription factor binds, or which pathway activates a gene so that it can be turned off, has led to new drugs and new ways to treat cancer. In breast cancer, for example, many proteins are overexpressed. This can lead to increased phosphorylation of key transcription factors that increase transcription. One such example is the overexpression of the epidermal growth factor receptor (EGFR) in a subset of breast cancers. The EGFR pathway activates many protein kinases that, in turn, activate many transcription factors that control genes involved in cell growth. New drugs that prevent the activation of EGFR have been developed and are used to treat these cancers.

Cancer and Post-transcriptional Control

Changes in the post-transcriptional control of a gene can also result in cancer. Recently, several groups of researchers have shown that specific cancers have altered expression of miRNAs. Because miRNAs bind to the 3′ UTR of RNA molecules to degrade them, overexpression of these miRNAs could be detrimental to normal cellular activity. Too many miRNAs could dramatically decrease the RNA population leading to a decrease in protein expression. Several studies have demonstrated a change in the miRNA population in specific cancer types. It appears that the subset of miRNAs expressed in breast cancer cells is quite different from the subset expressed in lung cancer cells or even from normal breast cells. This suggests that alterations in miRNA activity can contribute to the growth of breast cancer cells. These types of studies also suggest that if some miRNAs are specifically expressed only in cancer cells, they could be potential drug targets. It would, therefore, be conceivable that new drugs that turn off miRNA expression in cancer could be an effective method to treat cancer.

Cancer and Translational/Post-translational Control

There are many examples of how translational or post-translational modifications of proteins arise in cancer. The modifications found in cancer cells range from the increased translation of a protein to changes in protein phosphorylation to alternative splice variants of a protein. An example of how the expression of an alternative form of a protein can have dramatically different outcomes is seen in colon cancer cells. The c-FLIP protein, a protein involved in mediating the cell death pathway, comes in two forms: a long form, c-FLIP(L), and a short form, c-FLIP(S). Both forms appear to be involved in initiating controlled cell death mechanisms in normal cells. However, in colon cancer cells, expression of the long form results in increased cell growth instead of cell death. Clearly, the expression of the wrong protein dramatically alters cell function and contributes to the development of cancer.

New Drugs to Combat Cancer: Targeted Therapies

Scientists are using what is known about the regulation of gene expression in disease states, including cancer, to develop new ways to treat and prevent disease development. Many scientists are designing drugs on the basis of the gene expression patterns within individual tumors. This idea, that therapy and medicines can be tailored to an individual, has given rise to the field of personalized medicine. With an increased understanding of gene regulation and gene function, medicines can be designed to specifically target diseased cells without harming healthy cells. Some new medicines, called targeted therapies, exploit the overexpression of a specific protein or the mutation of a gene to treat disease. One such example is the use of anti-EGF receptor medications to treat the subset of breast cancer tumors that have very high levels of the EGF receptor protein. Undoubtedly, more targeted therapies will be developed as scientists learn more about how gene expression changes can cause cancer.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Concepts of Biology. OpenStax CNX. January 3, 2017.


Meiosis - Sexual Reproduction

Learning Objectives

Course Outcomes for this section:

Apply biological theories and concepts to solve problems related to classical and molecular genetics

  1. Describe the molecular basis of inheritance.

The ability to reproduce in kind is a basic characteristic of all living things. In kind means that the offspring of any organism closely resembles its parent or parents. Hippopotamuses give birth to hippopotamus calves; Monterey pine trees produce seeds from which Monterey pine seedlings emerge; and adult flamingos lay eggs that hatch into flamingo chicks. In kind does not generally mean exactly the same. While many single-celled organisms and a few multicellular organisms can produce genetically identical clones of themselves through mitotic cell division, many single-celled organisms and most multicellular organisms reproduce regularly using another method.


Figure 1: Each of us, like these other large multicellular organisms, begins life as a fertilized egg. After trillions of cell divisions, each of us develops into a complex, multicellular organism. (credit a: modification of work by Frank Wouters; credit b: modification of work by Ken Cole, USGS; credit c: modification of work by Martin Pettitt)

Sexual reproduction is the production by parents of sex cells and the fusion of two sex cells to form a single, unique cell. In multicellular organisms, this new cell will then undergo mitotic cell divisions to develop into an adult organism. A type of cell division called meiosis leads to the cells that are part of the sexual reproductive cycle. Sexual reproduction, specifically meiosis and fertilization, introduces variation into offspring that may account for the evolutionary success of sexual reproduction. The vast majority of eukaryotic organisms can or must employ some form of meiosis and fertilization to reproduce.

Sexual reproduction was an early evolutionary innovation after the appearance of eukaryotic cells. The fact that most eukaryotes reproduce sexually is evidence of its evolutionary success. In many animals, it is the only mode of reproduction. And yet, scientists recognize some real disadvantages to sexual reproduction. On the surface, offspring that are genetically identical to the parent may appear to be more advantageous. If the parent organism is successfully occupying a habitat, offspring with the same traits would be similarly successful. There is also the obvious benefit to an organism that can produce offspring by asexual budding, fragmentation, or asexual eggs. These methods of reproduction do not require another organism of the opposite sex. There is no need to expend energy finding or attracting a mate. That energy can be spent on producing more offspring. Indeed, some organisms that lead a solitary lifestyle have retained the ability to reproduce asexually. In addition, asexual populations only have female individuals, so every individual is capable of reproduction. In contrast, the males in sexual populations (half the population) are not producing offspring themselves. Because of this, an asexual population can, in theory, grow twice as fast as a sexual population. This means that in competition, the asexual population would have the advantage. Given all of these advantages of asexual reproduction, which are also disadvantages of sexual reproduction, we would expect asexually reproducing species to be more common.
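The "twice as fast" argument is simple arithmetic, and a short sketch makes it concrete. The model below is a deliberate simplification: generations do not overlap, every female leaves the same number of offspring, and a sexual population is exactly half female. The function name and the numbers are hypothetical.

```python
def next_generation(population, offspring_per_female, female_fraction):
    """Size of the next generation when only females produce offspring."""
    return population * female_fraction * offspring_per_female

asexual = 100.0   # every individual is a reproducing female
sexual = 100.0    # half of the individuals are males

for generation in range(1, 4):
    asexual = next_generation(asexual, offspring_per_female=2, female_fraction=1.0)
    sexual = next_generation(sexual, offspring_per_female=2, female_fraction=0.5)
    print(f"generation {generation}: asexual = {asexual:.0f}, sexual = {sexual:.0f}")
```

With two offspring per female, the asexual population doubles every generation while the sexual population only replaces itself, so the per-generation growth factor differs by a factor of two.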

However, multicellular organisms that exclusively depend on asexual reproduction are exceedingly rare. Why is sexual reproduction so common? This is one of the important questions in biology and has been the focus of much research from the latter half of the twentieth century until now. A likely explanation is that the variation that sexual reproduction creates among offspring is very important to the survival and reproduction of those offspring. The only source of variation in asexual organisms is mutation. Mutation is also the ultimate source of variation in sexual organisms. In addition, those different mutations are continually reshuffled from one generation to the next when different parents combine their unique genomes, and the genes are mixed into different combinations by the process of meiosis. Meiosis is the division of the contents of the nucleus that divides the chromosomes among gametes. Variation is introduced during meiosis, as well as when the gametes combine in fertilization.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Biology. OpenStax CNX. May 27, 2016



Overview of Meiosis

Sexual reproduction requires fertilization, a union of two cells from two individual organisms. If those two cells each contain one set of chromosomes, then the resulting cell contains two sets of chromosomes. The number of sets of chromosomes in a cell is called its ploidy level (Figure 1). Haploid cells contain one set of chromosomes. Cells containing two sets of chromosomes are called diploid. If the reproductive cycle is to continue, the diploid cell must somehow reduce its number of chromosome sets before fertilization can occur again, or there will be a continual doubling in the number of chromosome sets in every generation. So, in addition to fertilization, sexual reproduction includes a nuclear division, known as meiosis, that reduces the number of chromosome sets.

Figure 1 Number of chromosomes in a haploid and diploid cell. Note that triploid and tetraploid are not normal numbers of chromosomes in humans.
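The need for a reduction division can be checked with a few lines of arithmetic. This sketch tracks only the number of chromosome sets per cell across generations; the function name is hypothetical, and the "without meiosis" case is a thought experiment rather than something observed in a real life cycle.

```python
def sets_after_generations(generations, with_meiosis):
    """Chromosome sets per cell after repeated rounds of fertilization,
    starting from a diploid (2-set) organism."""
    sets = 2
    for _ in range(generations):
        # Meiosis halves the number of sets in each gamete;
        # without it, gametes would carry the full parental number.
        gamete_sets = sets // 2 if with_meiosis else sets
        # Fertilization fuses two gametes.
        sets = gamete_sets * 2
    return sets

print(sets_after_generations(3, with_meiosis=True))    # 2
print(sets_after_generations(3, with_meiosis=False))   # 16
```

With meiosis, the organism stays diploid generation after generation; without it, the number of sets doubles at every fertilization.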

Most animals and plants are diploid, containing two sets of chromosomes; in each somatic cell (the non-reproductive cells of a multicellular organism), the nucleus contains two copies of each chromosome that are referred to as homologous chromosomes. Somatic cells are sometimes referred to as “body” cells. Homologous chromosomes are matched pairs containing genes for the same traits in identical locations along their length (Figure 2). Diploid organisms inherit one copy of each homologous chromosome from each parent; all together, they are considered a full set of chromosomes. In animals, haploid cells containing a single copy of each homologous chromosome are found only within gametes. Gametes fuse with another haploid gamete to produce a diploid cell.

Figure 2 A karyotype displaying all of the chromosomes in the human genome. Note that there are two copies of each chromosome. These are the homologous chromosomes (one from each parent).

Nearly all animals employ a diploid-dominant life-cycle strategy in which the only haploid cells produced by the organism are the gametes. Early in the development of the embryo, specialized diploid cells, called germ cells, are produced within the gonads, such as the testes and ovaries. Germ cells are capable of mitosis to perpetuate the cell line and meiosis to produce gametes. Once the haploid gametes are formed, they lose the ability to divide again. There is no multicellular haploid life stage. Fertilization occurs with the fusion of two gametes, usually from different individuals, restoring the diploid state (Figure 3).

Figure 3 In animals, sexually reproducing adults form haploid gametes from diploid germ cells. Fusion of the gametes gives rise to a fertilized egg cell, or zygote. The zygote will undergo multiple rounds of mitosis to produce a multicellular offspring. The germ cells are generated early in the development of the zygote.

The nuclear division that forms haploid cells, which is called meiosis, is related to mitosis. As you have learned, mitosis is part of a cell reproduction cycle that results in identical daughter nuclei that are also genetically identical to the original parent nucleus. In mitosis, both the parent and the daughter nuclei contain the same number of chromosome sets—diploid for most plants and animals. Meiosis employs many of the same mechanisms as mitosis. However, the starting nucleus is always diploid and the nuclei that result at the end of a meiotic cell division are haploid. To achieve the reduction in chromosome number, meiosis consists of one round of chromosome duplication and two rounds of nuclear division.

Figure 4 An overview of meiosis. Two pairs of homologous chromosomes are shown. One pair consists of a long red and a long blue chromosome. The second pair consists of the two shorter chromosomes. During interphase, the chromosomes are duplicated so that in the second cell they look like X’s. These two connected copies are called sister chromatids. Photo credit Rdbickel; Wikimedia.

Because the events that occur during each of the division stages are analogous to the events of mitosis, the same stage names are assigned. However, because there are two rounds of division, the stages are designated with a “I” or “II.” Thus, meiosis I is the first round of meiotic division and reduces the number of chromosome sets from two to one (Figure 4). The genetic information is also mixed during this division to create unique recombinant chromosomes. Meiosis II, in which the second round of meiotic division takes place in a way that is similar to mitosis, separates the sister chromatids (the identical copies of each chromosome produced during DNA replication that are attached at the centromere).


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Biology. OpenStax CNX. May 27, 2016


Meiosis I


Meiosis is preceded by an interphase which is nearly identical to the interphase preceding mitosis. During interphase, the DNA of the chromosomes is replicated (during S phase). After DNA replication, each chromosome becomes composed of two identical copies (called sister chromatids) that are held together at the centromere until they are pulled apart during meiosis II (Figure 1).

Figure 1: Sister chromatids are identical copies of a chromosome that are held together at the centromere. They are produced during DNA replication. (Credit: User:SyntaxError55, from Wikimedia)

Meiosis is preceded by an interphase consisting of the G1, S, and G2 phases, which are nearly identical to the phases preceding mitosis. The G1 phase, which is also called the first gap phase, is the first phase of the interphase and is focused on cell growth. The S phase is the second phase of interphase, during which the DNA of the chromosomes is replicated. Finally, the G2 phase, also called the second gap phase, is the third and final phase of interphase; in this phase, the cell undergoes the final preparations for meiosis.

During DNA duplication in the S phase, each chromosome is replicated to produce two identical copies, called sister chromatids, that are held together at the centromere by cohesin proteins. Cohesin holds the chromatids together until anaphase II. The centrosomes, which are the structures that organize the microtubules of the meiotic spindle, also replicate. This prepares the cell to enter prophase I, the first meiotic phase.

Prophase I

Early in prophase I, the chromosomes can be seen clearly microscopically. As the nuclear envelope begins to break down, the proteins associated with homologous chromosomes bring the pair close to each other. The tight pairing of the homologous chromosomes is called synapsis (Figure 2). In synapsis, the genes on the chromatids of the homologous chromosomes are precisely aligned with each other. Recall that synapsis does NOT occur during mitosis.

Figure 2 Early in prophase I, homologous chromosomes come together in a process called synapsis. The chromosomes are bound tightly together and in perfect alignment by a protein lattice called a synaptonemal complex and by cohesin proteins at the centromere.

During synapsis, an exchange of chromosome segments between non-sister homologous chromatids occurs and is called crossing over (Figure 3). The crossover events are the first source of genetic variation produced by meiosis. A single crossover event between homologous non-sister chromatids leads to a reciprocal exchange of equivalent DNA between a maternal chromosome and a paternal chromosome. Now, when a recombinant chromatid is moved into a gamete, it will carry some DNA from one parent of the individual and some DNA from the other parent. The recombinant chromatid has a combination of maternal and paternal genes that did not exist before the crossover.

Figure 3: In this illustration of the effects of crossing over, the blue chromosome came from the individual’s father and the red chromosome came from the individual’s mother. Crossover occurs between non-sister chromatids of homologous chromosomes. The result is an exchange of genetic material between homologous chromosomes. The chromosomes that have a mixture of maternal and paternal sequence are called recombinant and the chromosomes that are completely paternal or maternal are called non-recombinant.
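A single crossover can be modeled as swapping the tails of two sequences at the same position. In the sketch below, each chromatid is a string in which every character marks the parental origin of one gene locus (M for maternal, P for paternal). This representation and the function name are assumptions made for illustration; real crossovers involve DNA breakage and repair that the model ignores.

```python
def crossover(maternal, paternal, point):
    """Reciprocal exchange between two non-sister chromatids.
    Everything from index `point` onward is swapped."""
    if len(maternal) != len(paternal):
        raise ValueError("homologous chromatids must have the same number of loci")
    if not 0 <= point <= len(maternal):
        raise ValueError("crossover point lies outside the chromatid")
    recombinant_1 = maternal[:point] + paternal[point:]
    recombinant_2 = paternal[:point] + maternal[point:]
    return recombinant_1, recombinant_2

print(crossover("MMMMMM", "PPPPPP", 4))   # ('MMMMPP', 'PPPPMM')
```

Both products are recombinant: each carries a combination of maternal and paternal loci that neither chromatid had before the exchange, and no genetic material is gained or lost.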

Prometaphase I

The key event in prometaphase I is the attachment of the spindle fiber microtubules to the kinetochore proteins at the centromeres. Kinetochores are multiprotein complexes that bind the centromere of a chromosome to the microtubules of the meiotic spindle. Microtubules grow from centrosomes placed at opposite poles of the cell. The microtubules move toward the middle of the cell and attach to one of the two fused homologous chromosomes. The microtubules attach at each chromosome’s kinetochore. With each member of the homologous pair attached to opposite poles of the cell, in the next phase, the microtubules can pull the homologous pair apart. A spindle fiber that has attached to a kinetochore is called a kinetochore microtubule. At the end of prometaphase I, each tetrad (a pair of homologous chromosomes, made up of four chromatids) is attached to microtubules from both poles, with one homologous chromosome facing each pole. The homologous chromosomes are still held together at chiasmata, the points where crossing over took place. In addition, the nuclear membrane has broken down entirely.

Figure 4 In prometaphase I, microtubules attach to the fused kinetochores of homologous chromosomes, and the homologous chromosomes are arranged at the midpoint of the cell in metaphase I. In anaphase I, the homologous chromosomes are separated.

Metaphase I

During metaphase I, the homologous chromosomes are arranged in the center of the cell with the kinetochores facing opposite poles. The orientation of each pair of homologous chromosomes at the center of the cell is random. This randomness, called independent assortment, is the physical basis for the generation of the second form of genetic variation in offspring (Figure 5). Consider that the homologous chromosomes of a sexually reproducing organism are originally inherited as two separate sets, one from each parent in the egg and the sperm. Using humans as an example, one set of 23 chromosomes is present in the egg donated by the mother. The father provides the other set of 23 chromosomes in the sperm that fertilizes the egg. In metaphase I, these pairs line up at the midway point between the two poles of the cell. Because there is an equal chance that a microtubule fiber will encounter a maternally or paternally inherited chromosome, the arrangement of the tetrads at the metaphase plate is random. Any maternally inherited chromosome may face either pole. Any paternally inherited chromosome may also face either pole. The orientation of each tetrad is independent of the orientation of the other 22 tetrads.

Figure 5 Random, independent assortment during metaphase I can be demonstrated by considering a cell with a set of two chromosomes (n = 2). In this case, there are two possible arrangements at the equatorial plane in metaphase I. The total possible number of different gametes is 2^n, where n equals the number of chromosomes in a set. In this example, there are four possible genetic combinations for the gametes. With n = 23 in human cells, there are over 8 million possible combinations of paternal and maternal chromosomes.

In each cell that undergoes meiosis, the arrangement of the tetrads is different. The number of variations depends on the number of chromosomes making up a set. There are two possibilities for orientation (for each tetrad); thus, the possible number of alignments equals 2^n, where n is the number of chromosomes per set. Humans have 23 chromosome pairs, which results in over eight million (2^23) possibilities. This number does not include the variability previously created in the sister chromatids by crossover. Given these two mechanisms, it is highly unlikely that any two haploid cells resulting from meiosis will have the same genetic composition (Figure 5).
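The counting argument can be verified directly. The sketch below lists every combination of maternal (M) and paternal (P) chromosomes a gamete could receive for a small number of homologous pairs, and then computes the count for humans. It ignores crossing over, and the function names are hypothetical.

```python
from itertools import product

def chromosome_combinations(n):
    """Every maternal/paternal combination a gamete can receive
    for n homologous pairs, ignoring crossing over."""
    return ["".join(combo) for combo in product("MP", repeat=n)]

def count_combinations(n):
    """Two possible orientations per pair gives 2 to the power n."""
    return 2 ** n

print(chromosome_combinations(2))   # ['MM', 'MP', 'PM', 'PP']
print(count_combinations(23))       # 8388608
```

For n = 2 the four combinations match the example in Figure 5, and for n = 23 the result is 8,388,608, the "over eight million" possibilities quoted for humans.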

To summarize the genetic consequences of meiosis I: the maternal and paternal genes are recombined by crossover events occurring on each homologous pair during prophase I; in addition, the random assortment of tetrads at metaphase produces a unique combination of maternal and paternal chromosomes that will make their way into the gametes.

Anaphase I

In anaphase I, the microtubules pull the linked chromosomes apart. The sister chromatids remain tightly bound together at the centromere. The chiasmata are broken in anaphase I as the microtubules attached to the fused kinetochores pull the homologous chromosomes apart (Figure 4).

Telophase I and Cytokinesis I

In telophase I, the separated chromosomes arrive at opposite poles. The remainder of the typical telophase events may or may not occur, depending on the species. In some organisms, the chromosomes decondense and nuclear envelopes form around the chromatids in telophase I. In other organisms, cytokinesis (the physical separation of the cytoplasmic components into two daughter cells) occurs without reformation of the nuclei. In nearly all species of animals and some fungi, cytokinesis separates the cell contents via a cleavage furrow (constriction of the actin ring that leads to cytoplasmic division). In plants, a cell plate is formed during cytokinesis by Golgi vesicles fusing at the metaphase plate. This cell plate will ultimately lead to the formation of cell walls that separate the two daughter cells.

Two haploid cells are the end result of the first meiotic division. The cells are haploid because at each pole, there is just one of each pair of the homologous chromosomes. Therefore, only one full set of the chromosomes is present. This is why the cells are considered haploid—there is only one chromosome set, even though each homolog still consists of two sister chromatids. Recall that sister chromatids are merely duplicates of one of the two homologous chromosomes (except for changes that occurred during crossing over). In meiosis II, these two sister chromatids will separate, creating four haploid daughter cells.

Summary of Meiosis I

The chromosomes are copied during interphase (prior to meiosis I). This forms two identical sister chromatids that are attached together at the centromere. During prophase I, crossing over introduces genetic variation by swapping pieces of homologous chromosomes. Additional genetic variation is introduced by independent assortment, which results from the random orientation of the homologous chromosome pairs when they line up during metaphase I. At the end of meiosis I, two haploid cells (where each chromosome still consists of two sister chromatids) are produced.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Biology. OpenStax CNX. January 2, 2017


Meiosis II

In some species, cells enter a brief interphase, or interkinesis, before entering meiosis II. Interkinesis lacks an S phase, so chromosomes are not duplicated. The two cells produced in meiosis I go through the events of meiosis II at the same time. During meiosis II, the sister chromatids within the two daughter cells separate, forming four new haploid gametes. The mechanics of meiosis II are similar to those of mitosis, except that each dividing cell has only one set of homologous chromosomes. Therefore, each cell has half as many sister chromatids to separate as a diploid cell undergoing mitosis.

Prophase II

If the chromosomes decondensed in telophase I, they condense again. If nuclear envelopes were formed, they fragment into vesicles. The centrosomes that were duplicated during interkinesis move away from each other toward opposite poles, and new spindles are formed.

Prometaphase II

The nuclear envelopes are completely broken down, and the spindle is fully formed. Each sister chromatid forms an individual kinetochore that attaches to microtubules from opposite poles.

Metaphase II

The sister chromatids are maximally condensed and aligned at the equator of the cell.

Anaphase II

The sister chromatids are pulled apart by the kinetochore microtubules and move toward opposite poles (Figure 1). Non-kinetochore microtubules elongate the cell.

In meiosis II, the connected sister chromatids remaining in the haploid cells from meiosis I will be split to form four haploid cells. The two cells produced in meiosis I go through the events of meiosis II in synchrony. Overall, meiosis II resembles the mitotic division of a haploid cell. During meiosis II, the sister chromatids are pulled apart by the spindle fibers and move toward opposite poles.

Figure 1 In prometaphase I, microtubules attach to the fused kinetochores of homologous chromosomes. In anaphase I, the homologous chromosomes are separated. In prometaphase II, microtubules attach to individual kinetochores of sister chromatids. In anaphase II, the sister chromatids are separated.

Telophase II and Cytokinesis

The chromosomes arrive at opposite poles and begin to decondense. Nuclear envelopes form around the chromosomes. Cytokinesis separates the two cells into four unique haploid cells. At this point, the newly formed nuclei are both haploid and have only one copy of the single set of chromosomes. The cells produced are genetically unique because of the random assortment of paternal and maternal homologs and because of the recombining of maternal and paternal segments of chromosomes (with their sets of genes) that occurs during crossover.

The entire process of meiosis is outlined in Figure 2.

Figure 2 An animal cell with a diploid number of four (2n = 4) proceeds through the stages of meiosis to form four haploid daughter cells.

Summary of Meiosis II

Meiosis II begins with the 2 haploid cells where each chromosome is made up of two connected sister chromatids. DNA replication does NOT occur at the beginning of meiosis II. The sister chromatids are separated, producing 4 genetically different haploid cells.
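
The bookkeeping behind the two divisions can be sketched in a few lines. This is only an illustrative tally, assuming a cell with a known diploid number and normal divisions; the function name and the stage labels are hypothetical.

```python
def meiosis_counts(diploid_number):
    """Return (stage, number of cells, chromosomes per cell, chromatids per cell)
    for a normal meiosis starting from one diploid cell."""
    n = diploid_number // 2  # haploid number
    return [
        ("before DNA replication", 1, diploid_number, diploid_number),
        ("after DNA replication", 1, diploid_number, 2 * diploid_number),
        # Meiosis I separates homologs: each cell keeps n replicated chromosomes.
        ("after meiosis I", 2, n, 2 * n),
        # Meiosis II separates sister chromatids: no replication happens first.
        ("after meiosis II", 4, n, n),
    ]

for stage, cells, chromosomes, chromatids in meiosis_counts(4):
    print(f"{stage}: {cells} cell(s), {chromosomes} chromosomes, {chromatids} chromatids each")
```

For a cell with 2n = 4, the tally ends with four cells that each hold two unreplicated chromosomes, which matches the summary above.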


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Biology. OpenStax CNX. May 27, 2016


Comparing Meiosis and Mitosis

Mitosis and meiosis, which are both forms of division of the nucleus in eukaryotic cells, share some similarities, but also exhibit distinct differences that lead to their very different outcomes. Mitosis is a single nuclear division that results in two nuclei, usually partitioned into two new cells. The nuclei resulting from a mitotic division are genetically identical to the original. They have the same number of sets of chromosomes: one in the case of haploid cells, and two in the case of diploid cells. On the other hand, meiosis is two nuclear divisions that result in four nuclei, usually partitioned into four new cells. The nuclei resulting from meiosis are never genetically identical, and they contain one chromosome set only—this is half the number of the original cell, which was diploid.

The differences in the outcomes of meiosis and mitosis occur because of differences in the behavior of the chromosomes during each process. Most of these differences in the processes occur in meiosis I, which is a very different nuclear division than mitosis. In meiosis I, the homologous chromosome pairs become associated with each other, are bound together, experience chiasmata and crossover between nonsister chromatids, and line up along the metaphase plate in tetrads with spindle fibers from opposite spindle poles attached to each kinetochore of a homolog in a tetrad. All of these events occur only in meiosis I, never in mitosis.

Homologous chromosomes move to opposite poles during meiosis I so the number of sets of chromosomes in each nucleus-to-be is reduced from two to one. For this reason, meiosis I is referred to as a reduction division. There is no such reduction in ploidy level in mitosis.

Meiosis II is much more analogous to a mitotic division. In this case, duplicated chromosomes (only one set of them) line up at the center of the cell with divided kinetochores attached to spindle fibers from opposite poles. During anaphase II, as in mitotic anaphase, the kinetochores divide and one sister chromatid is pulled to one pole and the other sister chromatid is pulled to the other pole. If it were not for the fact that there had been crossovers, the two products of each meiosis II division would be identical as in mitosis; instead, they are different because there has always been at least one crossover per chromosome. Meiosis II is not a reduction division because, although there are fewer copies of the genome in the resulting cells, there is still one set of chromosomes, as there was at the end of meiosis I.

Cells produced by mitosis will function in different parts of the body as a part of growth or replacing dead or damaged cells. They may even be involved in asexual reproduction in some organisms. Cells produced by meiosis in a diploid-dominant organism such as an animal will only participate in sexual reproduction.

Figure 1 Meiosis and mitosis are both preceded by one round of DNA replication; however, meiosis includes two nuclear divisions. The four daughter cells resulting from meiosis are haploid and genetically distinct. The daughter cells resulting from mitosis are diploid and identical to the parent cell.


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Biology. OpenStax CNX. May 27, 2016


Errors in Meiosis

Inherited disorders can arise when chromosomes behave abnormally during meiosis. Chromosome disorders can be divided into two categories: abnormalities in chromosome number and chromosome structural rearrangements. Because even small segments of chromosomes can span many genes, chromosomal disorders are characteristically dramatic and often fatal.

Disorders in Chromosome Number

The isolation and microscopic observation of chromosomes forms the basis of cytogenetics and is the primary method by which clinicians detect chromosomal abnormalities in humans. A karyotype is the number and appearance of chromosomes, including their length, banding pattern, and centromere position. To obtain a view of an individual’s karyotype, cytologists photograph the chromosomes and then cut and paste each chromosome into a chart, or karyogram (Figure 1).


Figure 1 This karyogram shows the chromosomes of a normal female human immune cell during mitosis. (Credit: Andreas Bolzer, et al)

By observing a karyogram, geneticists can actually visualize the chromosomal composition of an individual to confirm or predict genetic abnormalities in offspring even before birth.

Geneticists Use Karyograms to Identify Chromosomal Aberrations

Although Mendel is referred to as the “father of modern genetics,” he performed his experiments with none of the tools that the geneticists of today routinely employ. One such powerful cytological technique is karyotyping, a method in which traits characterized by chromosomal abnormalities can be identified from a single cell. To observe an individual’s karyotype, a person’s cells (like white blood cells) are first collected from a blood sample or other tissue. In the laboratory, the isolated cells are stimulated to begin actively dividing. A chemical called colchicine is then applied to cells to arrest condensed chromosomes in metaphase. Cells are then made to swell using a hypotonic solution so the chromosomes spread apart. Finally, the sample is preserved in a fixative and applied to a slide.

The geneticist then stains chromosomes with one of several dyes to better visualize the distinct and reproducible banding patterns of each chromosome pair. Following staining, the chromosomes are viewed using bright-field microscopy. A common stain choice is the Giemsa stain. Giemsa staining results in approximately 400–800 bands (of tightly coiled DNA and condensed proteins) arranged along all of the 23 chromosome pairs; an experienced geneticist can identify each band. In addition to the banding patterns, chromosomes are further identified on the basis of size and centromere location. To obtain the classic depiction of the karyotype in which homologous pairs of chromosomes are aligned in numerical order from longest to shortest, the geneticist obtains a digital image, identifies each chromosome, and manually arranges the chromosomes into this pattern (Figure 1).

At its most basic, the karyogram may reveal genetic abnormalities in which an individual has too many or too few chromosomes per cell. Examples of this are Down Syndrome, which is identified by a third copy of chromosome 21, and Turner Syndrome, which is characterized by the presence of only one X chromosome in women instead of the normal two. Geneticists can also identify large deletions or insertions of DNA. For instance, Jacobsen Syndrome—which involves distinctive facial features as well as heart and bleeding defects—is identified by a deletion on chromosome 11. Finally, the karyotype can pinpoint translocations, which occur when a segment of genetic material breaks from one chromosome and reattaches to another chromosome or to a different part of the same chromosome. Translocations are implicated in certain cancers, including chronic myelogenous leukemia.

During Mendel’s lifetime, inheritance was an abstract concept that could only be inferred by performing crosses and observing the traits expressed by offspring. By observing a karyogram, today’s geneticists can actually visualize the chromosomal composition of an individual to confirm or predict genetic abnormalities in offspring, even before birth.

Of all the chromosomal disorders, abnormalities in chromosome number are the most easily identifiable from a karyogram. Disorders of chromosome number include the duplication or loss of entire chromosomes, as well as changes in the number of complete sets of chromosomes. They are caused by nondisjunction, which occurs when pairs of homologous chromosomes or sister chromatids fail to separate during meiosis. The risk of nondisjunction increases with the age of the parents.

Nondisjunction can occur during either meiosis I or II, with different results (Figure 2). If homologous chromosomes fail to separate during meiosis I, the result is two gametes that lack that chromosome and two gametes with two copies of the chromosome. If sister chromatids fail to separate during meiosis II, the result is one gamete that lacks that chromosome, two normal gametes with one copy of the chromosome, and one gamete with two copies of the chromosome.

Figure 2 Nondisjunction occurs when homologous chromosomes or sister chromatids fail to separate during meiosis, resulting in an abnormal chromosome number. Nondisjunction may occur during meiosis I or meiosis II.
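
The two outcomes of nondisjunction can be traced with a small sketch that follows a single homologous pair through both divisions. It is a simplified model under stated assumptions: only one chromosome is tracked, and a meiosis II error is assumed to affect just one of the two cells. The function name and its argument values are invented for this example.

```python
def meiosis_one_pair(nondisjunction=None):
    """Return the number of copies of one chromosome in each of the four gametes.

    nondisjunction: None (normal), "I" (homologs fail to separate), or
    "II" (sister chromatids fail to separate in one of the two cells).
    """
    # Meiosis I: how many replicated chromosomes each of the two cells receives.
    cells = [2, 0] if nondisjunction == "I" else [1, 1]

    gametes = []
    for index, replicated in enumerate(cells):
        if nondisjunction == "II" and index == 0:
            # Both sister chromatids travel to the same pole.
            gametes += [2 * replicated, 0]
        else:
            # Each replicated chromosome gives one chromatid to each daughter cell.
            gametes += [replicated, replicated]
    return gametes

print(meiosis_one_pair())      # [1, 1, 1, 1]
print(meiosis_one_pair("I"))   # [2, 2, 0, 0]
print(meiosis_one_pair("II"))  # [2, 0, 1, 1]
```

The printed lists reproduce the pattern described above: an error in meiosis I leaves no normal gametes, while an error in meiosis II leaves two normal gametes.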

An individual with the appropriate number of chromosomes for their species is called euploid; in humans, euploidy corresponds to 22 pairs of autosomes and one pair of sex chromosomes (such as is seen in the karyotype in Figure 1). An individual with an error in chromosome number is described as aneuploid, a term that includes monosomy (loss of one chromosome) or trisomy (gain of an extraneous chromosome). Monosomic human zygotes missing any one copy of an autosome invariably fail to develop to birth because they have only one copy of essential genes. Most autosomal trisomies also fail to develop to birth; however, duplications of some of the smaller chromosomes (13, 15, 18, 21, or 22) can result in offspring that survive for several weeks to many years. Trisomic individuals suffer from a different type of genetic imbalance: an excess in gene dose. Cell functions are calibrated to the amount of gene product produced by two copies (doses) of each gene; adding a third copy (dose) disrupts this balance. The most common trisomy is that of chromosome 21, which leads to Down syndrome. Individuals with this inherited disorder have characteristic physical features and developmental delays in growth and cognition.

Figure 3 Karyotype of an individual with Down Syndrome. Photo credit U.S. Department of Energy Human Genome Program. Wikimedia.

The incidence of Down syndrome is correlated with maternal age, such that older women are more likely to give birth to children with Down syndrome (Figure 4).

Figure 4: The incidence of having a fetus with trisomy 21 increases dramatically with maternal age.

An individual with more than the correct number of chromosome sets (two for diploid species) is called polyploid. For instance, fertilization of an abnormal diploid egg with a normal haploid sperm would yield a triploid zygote. Polyploid animals are extremely rare, with only a few examples among the flatworms, crustaceans, amphibians, fish, and lizards. Triploid animals are sterile because meiosis cannot proceed normally with an odd number of chromosome sets. In contrast, polyploidy is very common in the plant kingdom, and polyploid plants tend to be larger and more robust than euploids of their species (Figure 5).

Figure 5 As with many polyploid plants, this triploid orange daylily (Hemerocallis fulva) is particularly large and robust, and grows flowers with triple the number of petals of its diploid counterparts. (credit: Steve Karg)

Sex Chromosome Nondisjunction

Humans display dramatic deleterious effects with autosomal trisomies and monosomies. Therefore, it may seem counterintuitive that human females and males can function normally, despite carrying different numbers of the X chromosome. In part, this occurs because of a process called X inactivation. Early in development, when female mammalian embryos consist of just a few thousand cells, one X chromosome in each cell inactivates by condensing into a structure called a Barr body. The genes on the inactive X chromosome are not expressed. The particular X chromosome (maternally or paternally derived) that is inactivated in each cell is random, but once the inactivation occurs, all cells descended from that cell will have the same inactive X chromosome. By this process, females compensate for their double genetic dose of X chromosome.

In so-called “tortoiseshell” cats, X inactivation is observed as coat-color variegation (Figure 6). Females heterozygous for an X-linked coat color gene will express one of two different coat colors over different regions of their body, corresponding to whichever X chromosome is inactivated in the embryonic cell progenitor of that region. When you see a tortoiseshell cat, you can be almost certain that it is genetically female, because two X chromosomes are required; the rare exceptions are males that carry an extra X chromosome (XXY).


Figure 6 Embryonic inactivation of one of two different X chromosomes encoding different coat colors gives rise to the tortoiseshell phenotype in cats. (credit: Michael Bodega)

In an individual carrying an abnormal number of X chromosomes, cellular mechanisms will inactivate all but one X in each of her cells. As a result, X-chromosomal abnormalities are typically associated with mild mental and physical defects, as well as sterility. If the X chromosome is absent altogether, the individual will not develop.

Several errors in sex chromosome number have been characterized. Individuals with three X chromosomes, called triplo-X, appear female but express developmental delays and reduced fertility. The XXY chromosome complement, corresponding to one type of Klinefelter syndrome, corresponds to male individuals with small testes, enlarged breasts, and reduced body hair. The extra X chromosome undergoes inactivation to compensate for the excess genetic dosage. Turner syndrome, characterized as an X0 chromosome complement (i.e., only a single sex chromosome), corresponds to a female individual with short stature, webbed skin in the neck region, hearing and cardiac impairments, and sterility.

Chromosome Structural Rearrangements

Cytologists have characterized numerous structural rearrangements in chromosomes, including partial duplications, deletions, inversions, and translocations. Duplications and deletions often produce offspring that survive but exhibit physical and mental abnormalities. Cri-du-chat (from the French for “cry of the cat”) is a syndrome associated with nervous system abnormalities and identifiable physical features that results from a deletion of most of the short arm of chromosome 5 (Figure 7). Infants with this genotype emit a characteristic high-pitched cry, from which the disorder takes its name.


Figure 7 This individual with cri-du-chat syndrome is shown at various ages: (A) age two, (B) age four, (C) age nine, and (D) age 12. (credit: Paola Cerruti Mainardi)

Chromosome inversions and translocations can be identified by observing cells during meiosis because homologous chromosomes with a rearrangement in one of the pair must contort to maintain appropriate gene alignment and pair effectively during prophase I.

A chromosome inversion is the detachment, 180° rotation, and reinsertion of part of a chromosome (Figure 8). Unless they disrupt a gene sequence, inversions only change the orientation of genes and are likely to have milder effects than aneuploid errors.

Figure 8 An inversion occurs when a chromosome segment breaks from the chromosome, reverses its orientation, and then reattaches in the original position.

A translocation occurs when a segment of a chromosome dissociates and reattaches to a different, nonhomologous chromosome. Translocations can be benign or have devastating effects, depending on how the positions of genes are altered with respect to regulatory sequences. Notably, specific translocations have been associated with several cancers and with schizophrenia. Reciprocal translocations result from the exchange of chromosome segments between two nonhomologous chromosomes such that there is no gain or loss of genetic information (Figure 9).

Figure 9 A reciprocal translocation occurs when a segment of DNA is transferred from one chromosome to another, nonhomologous chromosome. (credit: modification of work by National Human Genome Research/USA)

We discussed one specific example of a chromosomal translocation in BI211: the “Philadelphia chromosome” that is found in people who suffer from chronic myeloid leukemia (CML). In this translocation, a piece of chromosome 9 is swapped with a section of chromosome 22. This connects two genes on chromosome 22: one that was originally from chromosome 9 and one that was from chromosome 22. This translocation produces the BCR-ABL fusion protein, which causes white blood cells to divide out of control. BCR-ABL positive cancers can be treated with the drug Gleevec.

Figure 10 “Philadelphia chromosome” showing the location of the BCR-ABL fusion protein. Photo credit A Obeidat; Wikimedia


Unless otherwise noted, images on this page are licensed under CC-BY 4.0 by OpenStax.

OpenStax, Biology. OpenStax CNX. May 27, 2016


Genetics: Dog Coat Color

Learning Objectives

By the end of this section, you will be able to:

  • Describe the molecular basis of inheritance.
  • Determine the outcome in crosses involving complete dominance.
  • Present and decipher information about inheritance using a pedigree.

Figure 1: Experimenting with thousands of garden peas, Johann Gregor Mendel uncovered the fundamentals of genetics. (credit: modification of work by Jerry Kirkhart)

Remember that a trait is an aspect of the physical appearance of an organism that can vary. Organisms get their traits from proteins; proteins are produced using the information found in the organism’s DNA. Variation in the DNA between different organisms causes the production of proteins that contain differing orders of amino acids. These proteins can have different shapes and therefore different functions. When proteins function differently, this leads to differences in traits.

Recall that diploid organisms have two copies of each chromosome: a pair of homologous chromosomes. The reason that they have two copies is because they inherited one copy of each chromosome from each parent. Each parent donates one haploid gamete (egg or sperm) to the reproductive process. A haploid gamete contains one copy of each chromosome because during meiosis the number of chromosomes is cut in half: the DNA is copied once and then divided twice. This separation of the homologous chromosomes means that only one of the copies of the gene gets moved into a gamete. The offspring are formed when that gamete unites with one from another parent and the two copies of each gene (and chromosome) are restored.

Figure 2: A karyogram is a picture of all the chromosomes in a cell, organized into homologous pairs. This is a human karyogram which shows the 46 chromosomes present in diploid human somatic cells.

A diploid organism has two copies of a given gene. The two copies may or may not encode the same version of that characteristic. For example, one individual pea plant (such as those studied by Mendel) would have two copies of the gene that controls flower color. That individual could carry one version of the gene that leads to white flower color and a second different version of that same gene that leads to violet flower color. The interaction between these two different versions of the same gene will lead to the visible flower color in the pea plant. Gene variations that arise by mutation and exist at the same relative locations on homologous chromosomes are called alleles. Mendel examined the inheritance of genes with just two allele forms, but it is common to encounter more than two alleles for many genes in a natural population.

Each individual (assuming it is a diploid organism) will have two alleles for a specific gene: one from each of its two parents. These two alleles are expressed and interact to produce physical characteristics. The observable traits expressed by an organism are referred to as its phenotype. An organism’s underlying genetic makeup, consisting of both the physically visible and the non-expressed alleles, is called its genotype.

Diploid organisms that are homozygous for a gene have two identical alleles, one on each of their homologous chromosomes. If the organism has two different alleles, this is referred to as heterozygous. 

This chapter will address a simple type of inheritance: complete dominance. In this type of inheritance, there are two alleles: dominant and recessive. A dominant allele will completely cover up a recessive allele. This means that if one dominant allele is present, the organism will have the trait conferred by that allele. In order for the recessive phenotype to be seen, the organism must have two recessive alleles. Just because an allele is dominant does not automatically make it better than a recessive allele. It also does not make it more common than the recessive allele. All it means for an allele to be dominant is that it is able to cover up the recessive allele.

We typically abbreviate the genotype of an organism by using single letters. The letter chosen is often the first letter of the dominant trait. A homozygous dominant genotype would be written AA, a heterozygous genotype as Aa, and a homozygous recessive genotype as aa.
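
The notation above lends itself to a small sketch of complete dominance. It assumes the convention just described (an uppercase letter for the dominant allele, the same letter in lowercase for the recessive allele); the function names are hypothetical.

```python
def phenotype(genotype):
    """Return 'dominant' or 'recessive' for a one-gene genotype such as 'Aa'."""
    if len(genotype) != 2 or genotype[0].lower() != genotype[1].lower():
        raise ValueError("expected two alleles of the same gene, e.g. 'Aa'")
    # Under complete dominance, one dominant (uppercase) allele is enough.
    return "dominant" if any(allele.isupper() for allele in genotype) else "recessive"

def zygosity(genotype):
    """Return 'homozygous' if both alleles are identical, else 'heterozygous'."""
    return "homozygous" if genotype[0] == genotype[1] else "heterozygous"

for genotype in ("AA", "Aa", "aa"):
    print(genotype, zygosity(genotype), phenotype(genotype))
```

Note that AA and Aa give the same phenotype even though their genotypes differ, which is exactly what "covering up" the recessive allele means.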


Introduction to Genetics

“Genetics” is the study of how traits are inherited: it seeks to understand how traits are passed from generation to generation. A trait is defined as a variation in the physical appearance of a heritable characteristic. Before you start learning about the details of inheritance, let’s review some topics that are important in order to understand genetics.

Recall that genes are segments of DNA that are typically several hundred or thousand bases long. Each gene directs the production of a protein through the process of protein synthesis: DNA gets transcribed to produce an mRNA; the mRNA provides the code that a ribosome uses to produce a chain of amino acids. Read this section of the book if you need to review this topic: How do genes direct the production of proteins?

The Central Dogma – DNA is used to make RNA is used to make protein. Photo credit: ?

Recall that eukaryotic genes are found on chromosomes and that each eukaryotic chromosome typically contains hundreds or thousands of genes. In most eukaryotes, including humans and other animals, each cell contains two copies of each chromosome. The reason we have two copies of each gene is that we inherit one from each parent.

In contrast to eukaryotes, prokaryotes have one circular chromosome. This means they have one copy of each gene.

Read this section of the book if you need to review this topic: How DNA is arranged in the cell

Figure 3: There are 23 pairs of chromosomes in a female human body cell. These chromosomes are viewed within the nucleus (top), removed from a cell during cell division (right), and arranged according to length (left) in an arrangement called a karyotype. In this image, the chromosomes were exposed to fluorescent stains to distinguish them. (credit: “718 Bot”/Wikimedia Commons, National Human Genome Research)

Chromosomes are inherited by the offspring from the parents via the egg or sperm. Inside one egg or one sperm is one copy of each chromosome (so 23 total in humans). When an egg is fertilized by a sperm, the resulting zygote (fertilized egg) will contain two copies of each chromosome, just like each of its parents.

Meiosis is the process that produces eggs and sperm. Eggs and sperm are also known as gametes. During meiosis, one copy of each paired chromosome is moved into the gamete. Cells with one copy of each chromosome are known as “haploid”. This separation, or segregation, of the homologous (paired) chromosomes also means that only one of the copies of each gene gets moved into a gamete.

The offspring are formed when that gamete unites with one from another parent and the two copies of each gene (and chromosome) are restored. Read this section of the book if you need more information on this topic: Overview of Meiosis

During meiosis, the DNA is copied once, then the cell divides twice. This produces cells with half as much genetic information as the original cell (1 copy of each chromosome). These cells become the sex cells (eggs or sperm). When two sex cells unite during fertilization, the original number of chromosomes (2 copies of each one) is restored. Photo credit: ?

The offspring will receive two copies of each gene (one from each parent), but the copies are not necessarily identical. You already knew this: you don’t get identical information from your mother and your father because they have different DNA (which gives them different traits). The different versions of one specific gene are known as alleles. As you learn about genetics, you will learn about how the information from both alleles of a specific gene interacts to give an individual their trait. The genetic information that an individual has is called their genotype. The genotype of an individual produces the individual’s phenotype, or physical traits.


Unless otherwise noted, text and images by Lisa Bartee, 2016.

OpenStax, Concepts of Biology. OpenStax CNX. May 18, 2016


Pedigrees and Punnett Squares


Inheritance of a trait through generations can be shown visually using a pedigree, such as is pictured in Figure 1. Square shapes represent males; circles represent females. Filled-in shapes are individuals that have whatever trait is being shown in the pedigree. Two individuals connected together with a horizontal line between them are the parents of the individuals that are connected by vertical lines below them. Siblings are typically shown in birth order with the oldest sibling to the left.


Figure 1 A simple pedigree. In this pedigree, the parents (at the top) have produced three children: a male and two females. The first female has the condition being shown in the pedigree.

Punnett Squares

As discussed above, diploid individuals have two copies of each chromosome: one from their male parent, one from their female parent. This means they have two copies of each gene. They can have two of the same alleles (homozygous) or two different alleles (heterozygous). Regardless of their genotype, they will randomly pass only one copy of each chromosome to their offspring. This is because meiosis produces haploid gametes that contain one copy of each chromosome, and those chromosomes are assorted into gametes randomly. Since genes are present on chromosomes, this means they will pass one copy of each gene to their offspring. That means that an offspring inherits one allele of each gene from each of its two parents. This is illustrated in Figure 2. This concept is called Mendel’s Law of Segregation.


Figure 2 Two parents who are heterozygous each pass one chromosome / gene / allele to each offspring. Each resulting offspring has two of each chromosome / gene. The individual can have two of the same or two different alleles.

An easy, organized way of illustrating the offspring that can result from two specific parents is to use a Punnett square. The gametes that can be generated by each parent are represented above the columns and next to the rows of the square. Each gamete is haploid for the “A gene”, meaning it only contains one copy of that gene. In the Punnett square seen in Figure 3, haploid eggs are above each column and haploid sperm are next to each row. When a haploid sperm and a haploid egg (each with 1 copy of the “A gene”) combine during the process of fertilization, a diploid offspring (with 2 copies of the A gene) is the result.


Figure 3: A Punnett square showing a cross between two individuals who are both heterozygous for A.

A Punnett square shows the probability of an offspring with a given genotype resulting from a cross. It does not show actual offspring. For example, the Punnett square in Figure 3 shows that there is a 25% chance that a homozygous recessive offspring will result from the cross Aa x Aa. It does not mean that these parents must have 4 offspring and that they will have the ratio 1 AA : 2 Aa : 1 aa. It’s just like flipping a coin: you expect 50% heads, but you wouldn’t be too surprised to see 7 heads out of 10 coin flips. Additionally, the probability does not change for successive offspring. The probability that the first offspring will have the genotype “aa” is 25% and the probability of the second offspring having the genotype “aa” is still 25%. Again, it’s just like flipping a coin: if you flip heads the first time, that doesn’t change the probability of getting heads on the next flip.
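
A Punnett square is small enough to build by hand, but the same reasoning can be sketched in code. The example below assumes a single gene with two alleles and that each parent passes on either allele with equal probability; the function name is made up for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def punnett(parent1, parent2):
    """Genotype probabilities for a one-gene cross, e.g. punnett('Aa', 'Aa')."""
    # Every egg allele is paired with every sperm allele, one box per pairing.
    boxes = Counter(
        "".join(sorted(egg + sperm))  # sorted() writes 'aA' as 'Aa'
        for egg, sperm in product(parent1, parent2)
    )
    total = sum(boxes.values())
    return {genotype: Fraction(count, total) for genotype, count in boxes.items()}

for genotype, probability in punnett("Aa", "Aa").items():
    print(genotype, probability)
# AA 1/4
# Aa 1/2
# aa 1/4
```

These values are probabilities for each individual offspring, not a guarantee about any particular family of four.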

Organisms don’t just inherit one trait at a time, though. They inherit all their traits at once. Sometimes, we want to determine the probability of an individual inheriting two different traits. The easiest way to do this is to determine the probability of the individual inheriting each trait separately, then multiply those probabilities together. An example of this can be seen in Figure 4. In order for this to work, we must assume that genes do not influence each other with regard to the sorting of alleles into gametes, and every possible combination of alleles for every gene is equally likely to occur. This is called Mendel’s Law of Independent Assortment.

Figure 4: These two Punnett squares show the cross between two individuals who are both heterozygous for two different genes: BbAa x BbAa. We can determine the probability of an offspring having the recessive trait for “B” and the dominant trait for “A”. The probability of the offspring having the recessive phenotype for “B” is 1/4. The probability of the offspring having the dominant phenotype for “A” is 3/4. 1/4 x 3/4 = 3/16.
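
The multiplication described in Figure 4 can be written out as a small calculation. This is a minimal sketch in Python; the variable names are chosen for this example, and exact fractions are used so that the result is not rounded.

```python
from fractions import Fraction

# From a Bb x Bb square: 1 of 4 boxes is bb (recessive phenotype for B).
p_recessive_b = Fraction(1, 4)
# From an Aa x Aa square: 3 of 4 boxes (AA, Aa, Aa) show the dominant phenotype.
p_dominant_a = Fraction(3, 4)

# Independent assortment lets us multiply the two probabilities.
p_both = p_recessive_b * p_dominant_a
print(p_both)  # 3/16
```

The multiplication is only valid if the two genes assort independently, as stated above.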

Another way of determining the probability of getting two different traits is to use a dihybrid Punnett square. Figure 5 shows three generations of the inheritance of pea seed color and shape. Peas can be either yellow or green, and they can be either round or wrinkled. These are two of the traits that Mendel studied in his work with peas. In the first generation (the “P” generation), two true-breeding (homozygous) individuals are crossed. Their offspring will get one allele of the Y gene and one allele of the R gene from each parent. This means that all their offspring (the “F1” generation) will be heterozygous for both genes. The results (the “F2” generation) from crossing two heterozygous individuals can be seen in the 4×4 Punnett square in Figure 5.

Figure 5: This dihybrid cross shows the expected offspring from the F2 generation after crossing YYRR x yyrr. Compare the results from this Punnett square to the results seen in the previous figure. They match! Photo Credit: OpenStax Biology.

The gametes produced by the F1 individuals must have one allele from each of the two genes. For example, a gamete could get an R allele for the seed shape gene and either a Y or a y allele for the seed color gene. It cannot get both an R and an r allele; each gamete can have only one allele per gene. The law of independent assortment states that a gamete into which an r allele is sorted would be equally likely to contain either a Y or a y allele. Thus, there are four equally likely gametes that can be formed when the RrYy heterozygote is self-crossed, as follows: RY, rY, Ry, and ry. Arranging these gametes along the top and left of a 4 × 4 Punnett square (Figure 5) gives us 16 equally likely genotypic combinations. From these genotypes, we find a phenotypic ratio of 9 round–yellow:3 round–green:3 wrinkled–yellow:1 wrinkled–green (Figure 5). These are the offspring ratios we would expect, assuming we performed the crosses with a large enough sample size.

We can look for individuals who have the recessive phenotype for Y and the dominant phenotype for R. These individuals must have two little y’s and at least one big R. The possible genotypes are yyRR or yyRr. Examining the Punnett square in Figure 5, we can find 3 individuals with these genotypes (they are round and green). If you compare the results from Figure 4 and Figure 5, you’ll see that we have arrived at the same value: 3/16!
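
The same count can be checked by listing all 16 boxes of the dihybrid square. The Python sketch below is illustrative only: the helper names gametes and phenotype are assumptions made for this example, and the genotype is written shape gene first (RrYy).

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def gametes(genotype):
    """List the equally likely gametes of a two-gene genotype such as "RrYy"."""
    shape_alleles, color_alleles = genotype[:2], genotype[2:]
    return [s + c for s, c in product(shape_alleles, color_alleles)]


def phenotype(egg, sperm):
    """Round (R) and yellow (Y) are dominant; one copy is enough."""
    shape = "round" if "R" in egg[0] + sperm[0] else "wrinkled"
    color = "yellow" if "Y" in egg[1] + sperm[1] else "green"
    return f"{shape}-{color}"


f1 = "RrYy"
# Pair every possible egg with every possible sperm: 4 x 4 = 16 boxes.
boxes = [phenotype(egg, sperm) for egg, sperm in product(gametes(f1), repeat=2)]
counts = Counter(boxes)
print(counts)

p_round_green = Fraction(counts["round-green"], len(boxes))
print(p_round_green)  # 3/16
```

The counts come out as 9 round-yellow, 3 round-green, 3 wrinkled-yellow, and 1 wrinkled-green, matching the 9:3:3:1 ratio in Figure 5.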


Unless otherwise noted, text and images by Lisa Bartee, 2016.


Black fur color: a dominant trait

Black fur color is dominant over brown

Figure 1 This chocolate lab has two recessive alleles of the TYRP1 gene. (Credit: Rob Hanson; photo from Wikimedia.)

Most of us are familiar with the labrador retriever dog breed, such as the chocolate lab seen in Figure 1. But have you ever thought about what makes this dog brown? The difference between brown and black coat color in dogs is caused by a mutation in the TYRP1 gene. The TYRP1 gene provides instructions for making an enzyme called tyrosinase-related protein 1. This enzyme is required to produce a pigment called eumelanin. Eumelanin is a dark-colored pigment. The TYRP1 gene is located on chromosome 11 in dogs (Parker, 2001).

A group of scientists who were interested in determining what caused the difference between black and brown coats sequenced the DNA within the protein-coding region of the TYRP1 gene (Schmutz, 2002). They identified three variations in the DNA making up the TYRP1 gene between brown dogs and black dogs. These variations in DNA sequence are examples of different alleles of the TYRP1 gene.

Table 1: Variations in the TYRP1 allele that lead to brown color in dogs. Data from Schmutz, 2002.

Location | Black DNA sequence | Brown DNA sequence | Effect on protein
exon 2 | TGT | CGT | changes a cysteine amino acid to a serine
exon 5 | CAG | TAG | introduces a premature stop codon which results in 330 amino acids instead of 512 amino acids in the protein
exon 5 | CCT | (deleted) | deletion of a proline amino acid

All of these variations in the DNA sequence are predicted to cause a change in the amino acid sequence of the TYRP1 protein. These changes affect the production of eumelanin pigment, which is black in color. When eumelanin is not being produced correctly, the dog appears brown instead of black.

Like other diploid organisms, dogs all have two copies of the TYRP1 gene (one from their male parent, one from their female parent). Dogs that are homozygous for the black allele (dogs that have two copies of the black allele) are obviously going to be black in color. Dogs that are homozygous for the brown allele are obviously going to be brown. Dogs that are heterozygous (dogs that have one black allele and one brown allele) appear black. The black and brown colors do not blend together: the black allele covers up the brown allele. This means that the black allele is dominant over the brown allele. Remember that dominant alleles cover up recessive alleles. If there is one dominant allele present, the dog will appear black. The brown allele is recessive to the black allele. There must be two copies of the recessive brown allele present in order for the dog to appear brown.


Figure 2: Black and brown phenotypes in labrador retrievers. (Credit: demealiffe; from Wikimedia)

Remember that genotypes can be abbreviated with a single letter and that the letter which is chosen is typically the first letter of the dominant trait. In this case, the letter “B” is used to represent the dominant black allele, while “b” represents a recessive brown allele.

The black allele is dominant over the brown allele in this specific situation because the black allele produces functional TYRP1 protein, while the brown allele does not. One functional allele produces enough TYRP1 protein to allow the cells to produce eumelanin, so the dog appears black.

Remember: dominant does not mean “better” or “more normal”. Black color does not confer any special advantages on dogs compared to brown color. It’s just a difference.


Figure 3: What alleles of TYRP1 does this black lab puppy have? We can’t tell by looking at it. The puppy could be homozygous (BB) or heterozygous (Bb). Since black is completely dominant over brown, both options would be black. (Credit: Alice Birkin)

Let’s visualize the inheritance of black and brown using a pedigree. The pedigree in Figure 4 shows a litter of puppies. The shaded symbol shows a brown puppy, while open symbols are black individuals.


Figure 4: An example litter of puppies. The filled-in symbol shows a brown individual.

To interpret this pedigree, let’s start with information that we already know:

  • Brown is recessive, which means brown individuals must have the genotype bb. In this pedigree, brown individuals are filled in.
  • Black is dominant, which means black individuals must have at least one B allele. Their genotype could be either BB or Bb. In this pedigree, black individuals are not filled in.

Figure 5 shows the same pedigree, but with information about the individuals’ genotypes filled in.

  1. The shaded individual, who is a brown female puppy, must have the genotype bb. If she had any B alleles, she would be black because the black allele is dominant over the brown allele.
  2. In order for the brown puppy to have the genotype bb, she must have gotten two “b” alleles: one from each of her parents. We know that her parents are both black (because they are unshaded), which means they must have at least one “B” allele. This means that both parents must be heterozygous: Bb.
  3. The three black puppies must have at least one “B” allele in order for them to be black in color. However, we can’t tell whether they are homozygous dominant (BB) or heterozygous (Bb) since both of those genotypes would result in black color. One way to represent this on a pedigree is B-, meaning that the second allele could be either B or b.

Figure 5: Genotypes of the individuals in this pedigree.

We can also show the cross between these parents as a Punnett square (Figure 6). We would expect 1/4 of the offspring to have the genotype bb, and that is what we see in the pedigree above.


Figure 6: The information from the pedigree shown in Figure 5 can also be shown as a Punnett square.

Human Connection

A small number of mutations in the TYRP1 gene have been found to cause oculocutaneous albinism type 3. This condition includes a form of albinism called rufous oculocutaneous albinism, which has been described primarily in dark-skinned people from southern Africa. Affected individuals have reddish-brown skin, ginger or red hair, and hazel or brown irises. Two TYRP1 mutations are known to cause this form of albinism in individuals from Africa. One mutation replaces a protein building block (amino acid) in tyrosinase-related protein 1 with a signal that prematurely stops protein production. This mutation, written as Ser166Ter or S166X, affects the amino acid serine at protein position 166. The other mutation, written as 368delA, deletes a single DNA building block from the TYRP1 gene. Other alterations in this gene have been reported in a few affected people of non-African heritage. Most TYRP1 mutations lead to the production of an abnormally short, nonfunctional version of tyrosinase-related protein 1. Because this enzyme plays a role in normal pigmentation, its loss leads to the changes in skin, hair, and eye coloration that are characteristic of oculocutaneous albinism.


Photo credit: Muntuwandi; from Wikipedia.


Unless otherwise noted, text and images by Lisa Bartee, 2016.

Parker HG, Yuhua X, Mellersh CS, Khan S, Shibuya H, Johnson GS, Ostrander EA. Sept 2001. Meiotic linkage mapping of 52 genes onto the canine map does not identify significant levels of microrearrangement. Mamm Genome. 12(9):713-8.

Schmutz SM, Berryere TG, Goldfinch AD. 2002. TYRP1 and MC1R genotypes and their effects on coat color in dogs. Mammalian Genome 13, 380-387.

OpenStax, Biology. OpenStax CNX. May 27, 2016

Information about TYRP1 and oculocutaneous albinism type 3: “Tyrp1” by Genetics Home Reference: Your Guide to Understanding Genetic Conditions, National Institutes of Health: U.S. National Library of Medicine, is in the Public Domain


Yellow fur color: a recessive trait

Yellow color in dogs

Labrador retrievers don’t only come in brown and black; they also come in yellow. Yellow color in labs is caused by variations in a different gene: MC1R. This gene controls the production of the melanocortin 1 receptor protein. MC1R is located on chromosome 5 in dogs (Schmutz, 2001).


Figure 1: This yellow lab is producing light-colored pheomelanin instead of dark-colored eumelanin. (Credit: Djmirko; from Wikimedia)

Melanocytes make two forms of melanin, eumelanin and pheomelanin. The relative amounts of these two pigments help determine the color of an individual’s hair and skin. Individuals who produce mostly eumelanin tend to have brown or black hair and dark skin that tans easily (in humans). Eumelanin also protects skin from damage caused by ultraviolet (UV) radiation in sunlight. Individuals who produce mostly pheomelanin tend to have red or blond hair, freckles, and light-colored skin that tans poorly. Because pheomelanin does not protect skin from UV radiation, people with more pheomelanin have an increased risk of skin damage caused by sun exposure.

The melanocortin 1 receptor controls which type of melanin is produced by melanocytes. When the receptor is activated, it triggers a series of chemical reactions inside melanocytes that stimulate these cells to make eumelanin. If the receptor is not activated or is blocked, melanocytes make pheomelanin instead of eumelanin. This means that if the receptor is working correctly and is turned on, dark pigment will be produced. If the receptor is not functional or is not turned on, light pigment will be produced.


Figure 2: The three recognized colors of labs are due to black eumelanin, brown eumelanin, or pheomelanin. (Credit: Erikeltic, from Wikimedia)

Schmutz et al. (2002) determined the DNA sequence for the MC1R gene from dogs of various colors. They determined that black and brown dogs all have one allele of MC1R, while yellow and red dogs have a different allele. The allele that leads to yellow or red color has a premature stop codon which results in a shorter-than-normal protein. This protein would be predicted to not function correctly. Remember that when the melanocortin 1 receptor is not functioning correctly, light pheomelanin pigment is produced and not dark eumelanin.

Dogs that are homozygous for the functioning allele of MC1R (which would cause eumelanin to be produced) are dark in color. Dogs that are homozygous for the non-functioning allele (which would cause pheomelanin to be produced) are light in color. Dogs that are heterozygous are dark in color. What does this tell you about which allele is dominant? If you said “the dark allele is dominant because it covers up the light allele”, you’re correct. We will use “E” to represent the genotype at MC1R because the dominant phenotype in this case is the production of eumelanin. Dogs that have the genotype EE or Ee will produce eumelanin and be dark. Dogs that have the genotype “ee” will produce pheomelanin and be light.


Figure 3: In this pedigree, the shaded individual is yellow. She therefore has the genotype ee and produces pheomelanin. We can’t tell the genotype of her mate by looking (he could be Ee or EE), but since all of their puppies were dark in color, we would predict that his genotype was EE. In this cross: EE x ee, 100% of the puppies would have the genotype Ee, so 100% of the puppies would produce eumelanin instead of pheomelanin.

The cross shown in Figure 3 can also be shown as a Punnett square (Figure 4). Since we are unsure whether the male dog has the genotype “EE” or “Ee”, we have to make two Punnett squares. Since all of the puppies resulting from this cross were dark, we would predict that the first Punnett square shows the cross. However, it is possible that the second Punnett square is correct. There are only 4 puppies, so it’s not hard to imagine that they could all be dark even though the Punnett square predicts only 50% dark. It would be comparable to flipping a coin 4 times and getting 4 heads in a row. Getting 4 heads in a row is less likely, but definitely possible.


Figure 4: Cross from Figure 3 shown as Punnett squares
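
The coin-flip comparison can be made concrete with a short calculation. This is a minimal Python sketch; the variable names are chosen for this example, and it assumes each puppy's genotype is independent of its littermates.

```python
from fractions import Fraction

litter_size = 4

# If the male is EE (EE x ee), every puppy is Ee and therefore dark.
p_all_dark_if_EE = Fraction(1, 1) ** litter_size
# If the male is Ee (Ee x ee), each puppy is dark with probability 1/2,
# independently of its littermates.
p_all_dark_if_Ee = Fraction(1, 2) ** litter_size

print(p_all_dark_if_EE, p_all_dark_if_Ee)  # 1 1/16
```

A litter of 4 dark puppies is certain if the male is EE and has a 1 in 16 chance if he is Ee, so the litter makes EE the better prediction without ruling out Ee.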

It is very important to note here that yellow dogs still have the TYRP1 gene, even though they are not black or brown!

Human Connection

Common variations (polymorphisms) in the MC1R gene are associated with normal differences in skin and hair color. Certain genetic variations are most common in people with red hair, fair skin, freckles, and an increased sensitivity to sun exposure. These MC1R polymorphisms reduce the ability of the melanocortin 1 receptor to stimulate eumelanin production, causing melanocytes to make mostly pheomelanin. Although MC1R is a key gene in normal human pigmentation, researchers believe that the effects of other genes also contribute to a person’s hair and skin coloring.

The melanocortin 1 receptor is also active in cells other than melanocytes, including cells involved in the body’s immune and inflammatory responses. The receptor’s function in these cells is unknown.


Photo credit: dusdin on flickr; from Wikipedia.


Unless otherwise noted, text and images by Lisa Bartee, 2016.

OpenStax, Biology. OpenStax CNX. May 27, 2016

Human Connection – information about MC1R: “MC1R” by Genetics Home Reference: Your Guide to Understanding Genetic Conditions, National Institutes of Health: U.S. National Library of Medicine, is in the Public Domain

Schmutz SM, Moker JS, Berryere TG, Christison KM, Dolf G. 2001. An SNP is used to map MC1R to dog chromosome 5. Anim Genet. 32(1):43-4.

Schmutz SM, Berryere TG, Goldfinch AD. 2002. TYRP1 and MC1R genotypes and their effects on coat color in dogs. Mammalian Genome 13, 380-387.


Epistasis: the relationship between black, brown, and yellow fur


Dogs don’t have either the TYRP1 gene or the MC1R gene – they have both. In fact, every dog will have two copies of the TYRP1 gene and two copies of the MC1R gene. Since both genes control aspects of coat color, it makes sense that they interact. In fact, TYRP1 and MC1R have what is called an epistatic relationship: the action of one gene controls the expression of a second gene. Another way to phrase this relationship is that the effect of one gene is dependent on another gene.

Remember that TYRP1 is required for the production of eumelanin. The dominant allele of TYRP1 (B) produces black eumelanin, while the recessive allele (b) produces brown eumelanin. However, if a dog is homozygous recessive for MC1R (ee), it lacks the ability to produce eumelanin at all. If no eumelanin is being produced, it doesn’t matter whether it would have been black or brown: there is none. This means that any dog that is homozygous recessive for MC1R will appear yellow regardless of its genotype at TYRP1. These two genes are epistatic: the action of MC1R controls the expression of TYRP1. The effect of TYRP1 is dependent on MC1R.

If a dog has at least one dominant functioning allele of MC1R, then its genotype at TYRP1 can be seen. If the dog has at least one dominant allele of TYRP1, it will appear black. If it has two recessive alleles, it will appear brown.


Figure 1: Genotypes for TYRP1 (B) and MC1R (E) that lead to the three recognized colors of labs. (Credit EArellano, from Wikimedia)
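
The decision rule described above can be written as a small function. This Python sketch is only an illustration; the function name lab_coat_color and the genotype strings are assumptions made for this example.

```python
def lab_coat_color(tyrp1, mc1r):
    """Return the coat color for a TYRP1 genotype ("BB", "Bb", "bb")
    and an MC1R genotype ("EE", "Ee", "ee")."""
    if "E" not in mc1r:
        # ee: no eumelanin is made, so the TYRP1 genotype is hidden.
        return "yellow"
    # At least one E: eumelanin is made, and TYRP1 sets its color.
    return "black" if "B" in tyrp1 else "brown"


for tyrp1 in ("BB", "Bb", "bb"):
    for mc1r in ("EE", "Ee", "ee"):
        print(tyrp1 + mc1r, lab_coat_color(tyrp1, mc1r))
```

MC1R is checked first because it is the epistatic gene: the TYRP1 genotype only matters once at least one E allele is present.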

A pedigree can be used to show the inheritance of two different genes such as TYRP1 and MC1R.


Figure 2: In this pedigree, a cross between an individual who is heterozygous for both MC1R and TYRP1 and an individual who has the genotype “Bbee” is shown. Black individuals are shaded black, yellow individuals are shaded yellow, and brown individuals are shaded grey. The 6 different possible genotypes are each shown as one offspring. This does not give you any information about the probability of getting a certain genotype of offspring – it gives you the actual number of offspring observed and their traits.

Punnett squares can also be used to show this cross. If the probability of inheriting one trait is multiplied by the probability of inheriting the second trait, the overall probability of getting any given offspring can be determined.


Figure 3: These two Punnett squares can be used to determine the results of a cross between these individuals: Bbee x BbEe. If you wanted to determine the probability of getting a brown dog, you would multiply the probability of getting bb by the probability of having at least one dominant E. That would equal 1/4 x 1/2 = 1/8. This gives you the probability of getting a brown dog, but doesn’t tell you anything about the number of brown dogs actually observed.
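
The probabilities for all three coat colors from this cross can be found by listing every combination of alleles. The Python sketch below is illustrative; the helper coat_color and the tuple layout of the parents are assumptions made for this example, and the two genes are assumed to assort independently.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def coat_color(tyrp1, mc1r):
    """ee is yellow; otherwise at least one B is black and bb is brown."""
    if "E" not in mc1r:
        return "yellow"
    return "black" if "B" in tyrp1 else "brown"


# Each parent is written as (TYRP1 genotype, MC1R genotype).
parent1 = ("Bb", "ee")
parent2 = ("Bb", "Ee")

colors = Counter()
# Choose one TYRP1 allele and one MC1R allele from each parent.
for b1, b2, e1, e2 in product(parent1[0], parent2[0], parent1[1], parent2[1]):
    colors[coat_color(b1 + b2, e1 + e2)] += 1

total = sum(colors.values())
probabilities = {color: Fraction(n, total) for color, n in colors.items()}
print(probabilities)
```

This gives 3/8 black, 1/8 brown, and 1/2 yellow. The 1/8 for brown agrees with the 1/4 x 1/2 calculation in Figure 3.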

Human Connection

Individuals who have albinism lack the ability to produce any pigment. If no pigment is being produced, the color that the pigment would have been is unimportant. The effect of the pigment genes is controlled by the gene that allows pigment to be produced. This is an example of epistasis.

Albinism can occur in humans (see the section on TYRP1) as well as other animals, such as the squirrel seen below.


Photo credit: Stephenkniatt from Wikipedia.


Unless otherwise noted, text and images by Lisa Bartee, 2016.

OpenStax, Biology. OpenStax CNX. May 27, 2016

Schmutz SM, Berryere TG, Goldfinch AD. 2002. TYRP1 and MC1R genotypes and their effects on coat color in dogs. Mammalian Genome 13, 380-387.


Brindle color: partial dominance and epistasis

Brindle coloration is a black and brown striping pattern that is caused by different alleles at the “K locus”, which is probably a gene called ASIP that controls pigment switching (Figure 1; Ciampolini, 2013). There are three alleles of the K locus: KB, kbr, and ky (Kerns, 2007). The KB allele is dominant over the other two alleles and produces solid black color. kbr produces the brindle color pattern and is dominant over the ky allele. This means that dogs with the genotype kbrkbr or kbrky will have the brindle color pattern. Dogs with the genotype kyky are yellow in color.


Figure 1 This boxer shows the brindle color pattern, which looks sort of like tiger stripes. (Credit:  Steve Henderson Location: Memphis, TN)

The K locus and MC1R (which controls the difference between dark eumelanin and light pheomelanin production) have an epistatic relationship. If a dog has two recessive alleles for MC1R and is therefore unable to make eumelanin, the dog will appear yellow regardless of its genotype at the K locus.
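
The dominance order at the K locus, together with its epistatic relationship to MC1R, can be sketched as a lookup. This Python example is an illustration only; the names K_DOMINANCE, K_PATTERN, and k_locus_pattern are assumptions made for this example, and the default MC1R genotype of "EE" is chosen for convenience.

```python
# Dominance order at the K locus, from most to least dominant.
K_DOMINANCE = ["KB", "kbr", "ky"]
K_PATTERN = {"KB": "solid black", "kbr": "brindle", "ky": "yellow"}


def k_locus_pattern(allele1, allele2, mc1r="EE"):
    """Return the coat pattern for two K-locus alleles and an MC1R genotype."""
    if "E" not in mc1r:
        # ee dogs cannot make eumelanin, so the K locus is hidden.
        return "yellow"
    # The allele that comes first in the dominance order is the one seen.
    visible = min((allele1, allele2), key=K_DOMINANCE.index)
    return K_PATTERN[visible]


print(k_locus_pattern("kbr", "ky"))            # brindle
print(k_locus_pattern("KB", "kbr"))            # solid black
print(k_locus_pattern("ky", "ky"))             # yellow
print(k_locus_pattern("KB", "kbr", mc1r="ee")) # yellow
```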


Unless otherwise noted, text and images by Lisa Bartee, 2016.

Ciampolini R, Cecchi F, Spaterna A, Bramante A, Bardet SM, Oulmouden A. 2013. Characterization of different 5′-untranslated exons of the ASIP gene in black-and-tan Doberman Pinscher and brindle Boxer dogs. Anim Genet. 44(1):114-7.

Kerns JA, Cargill EJ, Clark LA, Candille SI, Berryere TG, Olivier M, Lust G, Todhunter RJ, Schmutz SM, Murphy KE, Barsh GS. 2007. Linkage and segregation analysis of black and brindle coat color in domestic dogs. Genetics. 176(3):1679-89.


Incomplete dominance: when traits blend

Flower color in snapdragons

Mendel’s results in crossing peas, black vs brown fur color, and eumelanin production vs pheomelanin production all demonstrate that traits are inherited as dominant and recessive. This contradicts the historical view that offspring always exhibited a blend of their parents’ traits. However, sometimes the heterozygote phenotype is intermediate between the phenotypes of the two parents. For example, in the snapdragon, Antirrhinum majus (Figure 1), a cross between a homozygous parent with white flowers (CWCW) and a homozygous parent with red flowers (CRCR) will produce offspring with pink flowers (CRCW) (Figure 2).


Figure 1: These pink flowers of a heterozygote snapdragon result from incomplete dominance. (credit: “storebukkebruse”/Flickr)

Note that different genotypic abbreviations are used to distinguish these patterns from simple dominance and recessiveness. The abbreviation CW can be read as “at the flower color gene (C), the white allele is present.”


Figure 2: A cross between a red and white snapdragon will yield 100% pink offspring.

This pattern of inheritance is described as incomplete dominance, meaning that neither of the alleles is completely dominant over the other: both alleles can be seen at the same time. The allele for red flowers is incompletely dominant over the allele for white flowers. Red + white = pink. The results of a cross where the alleles are incompletely dominant can still be predicted, just as with crosses involving completely dominant and recessive alleles. Figure 3 shows the results from a cross between two heterozygous individuals: CRCW x CRCW. The expected offspring would have the genotypic ratio 1 CRCR:2 CRCW:1 CWCW, and the phenotypic ratio would be 1:2:1 for red:pink:white. The basis for the intermediate color in the heterozygote is simply that the pigment produced by the red allele (anthocyanin) is diluted in the heterozygote and therefore appears pink because of the white background of the flower petals.


Figure 3: The results of crossing two pink snapdragons.
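
With incomplete dominance, each genotype has its own phenotype, so the phenotypic ratio matches the genotypic ratio. The Python sketch below illustrates this; the single letters "R" and "W" are shorthand chosen for this example to stand for the CR and CW alleles, and the function name cross is an assumption.

```python
from collections import Counter
from itertools import product

# "R" stands for the CR (red) allele and "W" for the CW (white) allele.
FLOWER_COLOR = {"RR": "red", "RW": "pink", "WW": "white"}


def cross(parent1, parent2):
    """Count the flower colors in the four boxes of the Punnett square."""
    boxes = ["".join(sorted(a + b)) for a, b in product(parent1, parent2)]
    return Counter(FLOWER_COLOR[genotype] for genotype in boxes)


pink_x_pink = cross("RW", "RW")
red_x_white = cross("RR", "WW")
print(pink_x_pink)
print(red_x_white)
```

Crossing two pink plants gives 1 red : 2 pink : 1 white, while crossing red with white gives only pink offspring, as in Figures 2 and 3.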

Straight, curly, and wavy hair in dogs


Figure 4: The wavy hair on this labradoodle is caused by incomplete dominance. (Credit: Localpups, Flickr)

Another example of incomplete dominance is the inheritance of straight, wavy, and curly hair in dogs. The KRT71 gene is used to synthesize the keratin 71 protein. Genes in the KRT family provide instructions for making proteins called keratins. Keratins are a group of tough, fibrous proteins that form the structural framework of epithelial cells, which are cells that line the surfaces and cavities of the body. Epithelial cells make up tissues such as the hair, skin, and nails. These cells also line the internal organs and are an important part of many glands.

Keratins are best known for providing strength and resilience to cells that form the hair, skin, and nails. These proteins allow tissues to resist damage from friction and minor trauma, such as rubbing and scratching. Keratins are also involved in several other critical cell functions, including cell movement (migration), regulation of cell size, cell growth and division (proliferation), wound healing, and transport of materials within cells. Different combinations of keratin proteins are found in different tissues.

The mutation which causes curly hair in dogs, such as the labradoodle seen in Figure 4, is in exon 2 of the gene and is predicted to substantially disrupt the structure of the keratin 71 protein (Cadieu, 2009). This change in protein shape prevents the keratin proteins from interacting together correctly within the hair, altering the structure of the hair and resulting in a curly coat (Runkel, 2006).

When a dog has two curly alleles (KCKC), it has a very curly coat, such as on the poodle in Figure 5. A dog with two straight alleles (K+K+) has a straight coat. Dogs that are heterozygous (K+KC) have an intermediate or wavy coat like the labradoodle in Figure 4.


Figure 5: This poodle has two copies of the curly allele of the KRT71 gene (KCKC). Compare his curly hair to the wavy hair of the labradoodle in Figure 4. The labradoodle is heterozygous (K+KC). (Credit B. Schoener; From Wikimedia)

Human Connection – Blood Type

Blood is classified into different groups according to the presence or absence of molecules called antigens on the surface of every red blood cell in a person’s body. Antigens determine blood type and can either be proteins or complexes of sugar molecules (polysaccharides). The genes in the blood group antigen family provide instructions for making antigen proteins. Blood group antigen proteins serve a variety of functions within the cell membrane of red blood cells. These protein functions include transporting other proteins and molecules into and out of the cell, maintaining cell structure, attaching to other cells and molecules, and participating in chemical reactions.

There are 29 recognized blood groups, most involving only one gene. Variations (polymorphisms) within the genes that determine blood group give rise to the different antigens for a particular blood group protein. For example, changes in a few DNA building blocks (nucleotides) in the ABO gene give rise to the A, B, and O blood types of the ABO blood group. The changes that occur in the genes that determine blood group typically affect only blood type and are not associated with adverse health conditions, although exceptions do occur.

The A and B alleles are codominant. Codominance is similar to incomplete dominance in that the heterozygote has a phenotype different from that of either homozygote. In codominance, however, the two alleles do not blend: if both the A and B alleles are present, both will be seen in the phenotype. The O allele is recessive to both A and B.

Photo credit: InvictaHOG, from Wikipedia.


Unless otherwise noted, text and images by Lisa Bartee, 2016.

Cadieu E, Neff MW, Quignon P, Walsh K, Chase K, Parker HG, Vonholdt BM, Rhue A, Boyko A, Byers A, Wong A, Mosher DS, Elkahloun AG, Spady TC, André C, Lark KG, Cargill M, Bustamante CD, Wayne RK, Ostrander EA. 2009. Coat variation in the domestic dog is governed by variants in three genes. Science. 326(5949):150-3.

Runkel F, Klaften M, Koch K, Böhnert V, Büssow H, Fuchs H, Franz T, Hrabé de Angelis M. 2006. Morphologic and molecular characterization of two novel Krt71 (Krt2-6g) mutations: Krt71rco12 and Krt71rco13. Mamm Genome. 17(12):1172-82.

OpenStax, Biology. OpenStax CNX. May 27, 2016

“Blood Group Antigens” by Genetics Home Reference: Your Guide to Understanding Genetic Conditions, National Institutes of Health: U.S. National Library of Medicine, is in the Public Domain

“Keratins” by Genetics Home Reference: Your Guide to Understanding Genetic Conditions, National Institutes of Health: U.S. National Library of Medicine, is in the Public Domain


White spotting: When there are more than two alleles

So far, we have discussed genes which have only two alleles. However, that is not always the case: there can be more than two alleles for a given gene. One example is the MITF gene, which is the major gene that controls white spotting in dogs. The protein made from this gene is required for the migration of melanocytes into the skin and for their survival during development. If it is not functional, it impairs the ability of the skin to make pigment, thus “covering up” the effect of other color genes. There are thought to be at least four alleles that can contribute (Karlsson, 2007). Depending on which alleles are present in a dog, the amount of white can vary from none (a solid-colored dog) to mostly white (Table 2 and Figure 1).

Table 2: Combinations of different alleles for MITF result in different amounts of white present in the coat.

Alleles | Amount of white
SS | None (solid colored)
Ssi | Small amounts of white possible on chin, chest, feet, and tail tip
Ssp | Pied markings where the coat is more than 50% colored, with white on the face, chest, feet, collar, underbelly, and tail tip
sisp | Approximately even amounts of color and white
sise | More than 50% white with irregular splashes of color
sese | Mostly white with only minimal areas of color, perhaps on one or both ears, an eye patch, or a spot near the tail

Figure 1: These dogs have different combinations of alleles of the MITF gene. The first dog probably has the genotype “SS”; the dog in the center is likely “Ssp”; the dog on the right is likely “sese”. (Credits: Funny black dog by X posid from Publicdomainpictures. A black and white dog by Petr Kratochvil from Free stock photos. White dog with black ears by RetyiRetyi from Pixabay.)

Human Connection – Blood Type

Human blood type was discussed in the previous section. You may remember that there are three alleles for the ABO gene: A, B, and O. A and B are codominant, meaning that if both alleles are present, both will be seen in the phenotype. A person with type AB blood has one A allele and one B allele.

O is recessive to A and B. A person with the genotype AO will have Type A blood. A person with the genotype BO will have type B blood. Type O blood results from two O alleles.
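
The rules for ABO blood type can be summarized as a short function. This is a minimal Python sketch; the function name abo_blood_type is an assumption made for this example, and each allele is written as the single letter "A", "B", or "O".

```python
def abo_blood_type(allele1, allele2):
    """Return the ABO blood type for a pair of alleles ("A", "B", or "O")."""
    alleles = {allele1, allele2}
    if alleles == {"A", "B"}:
        # A and B are codominant: both are seen in the phenotype.
        return "AB"
    if "A" in alleles:
        # AA or AO: O is recessive to A.
        return "A"
    if "B" in alleles:
        # BB or BO: O is recessive to B.
        return "B"
    # Only two O alleles give type O.
    return "O"


for genotype in ("AA", "AO", "BB", "BO", "AB", "OO"):
    print(genotype, abo_blood_type(genotype[0], genotype[1]))
```

Six genotypes map onto four blood types because AA and AO look the same, as do BB and BO.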


Photo credit: Kalaiarasy, from Wikipedia.


Unless otherwise noted, text and images by Lisa Bartee, 2016.

Karlsson EK, Baranowska I, Wade CM, Salmon Hillbertz NH, Zody MC, Anderson N, Biagi TM, Patterson N, Pielberg GR, Kulbokas EJ 3rd, Comstock KE, Keller ET, Mesirov JP, von Euler H, Kämpe O, Hedhammar A, Lander ES, Andersson G, Andersson L, Lindblad-Toh K. 2007. Efficient mapping of mendelian traits in dogs through genome-wide association. Nat Genet. 39(11):1321-8.


Hemophilia: a sex-linked disorder

So far, all the genes we have discussed have had two copies present in all individuals. This is because the individual inherited one from the male parent’s haploid gamete and one from the female parent’s haploid gamete. The two gametes came together during fertilization to produce a diploid individual. There is, however, one exception to this: genes which are present on the sex chromosomes.

In humans, as well as in many other animals and some plants, the sex of the individual is determined by sex chromosomes – one pair of non-homologous chromosomes. Until now, we have only considered inheritance patterns among non-sex chromosomes, or autosomes. In addition to 22 homologous pairs of autosomes, human females have a homologous pair of X chromosomes, whereas human males have an XY chromosome pair. Although the Y chromosome contains a small region of similarity to the X chromosome so that they can pair during meiosis, the Y chromosome is much shorter and contains fewer genes. When a gene being examined is present on the X, but not the Y, chromosome, it is X-linked.

The X chromosome is one of two sex chromosomes. Humans and most mammals have two sex chromosomes, the X and Y. Females have two X chromosomes in their cells, while males have X and Y chromosomes in their cells. Egg cells all contain an X chromosome, while sperm cells contain an X or a Y chromosome. This arrangement means that during fertilization, it is the male that determines the sex of the offspring since the female can only give an X chromosome to the offspring.

Figure 1: A diagram showing the autosomal and sex chromosomes. Remember that in a diploid cell, there would be two copies of each autosomal chromosome present. (Credit: Darryl Leja, NHGRI)

Most sex-linked genes are present on the X chromosome simply because it is much larger than the Y chromosome. The X chromosome spans about 155 million DNA base pairs and represents approximately 5 percent of the total DNA in cells. The X chromosome likely contains 800 to 900 genes. In contrast, the Y chromosome has approximately 59 million base pairs and only 50-60 genes. Sex is determined by the SRY gene, which is located on the Y chromosome and is responsible for the development of a fetus into a male. This means that the presence of a Y chromosome is what causes a fetus to develop as male. Other genes on the Y chromosome are important for male fertility.

Hemophilia is a bleeding disorder that slows the blood clotting process. People with this condition experience prolonged bleeding or oozing following an injury, surgery, or having a tooth pulled. In severe cases of hemophilia, continuous bleeding occurs after minor trauma or even in the absence of injury (spontaneous bleeding). Serious complications can result from bleeding into the joints, muscles, brain, or other internal organs. Milder forms of hemophilia do not necessarily involve spontaneous bleeding, and the condition may not become apparent until abnormal bleeding occurs following surgery or a serious injury.

The major types of this condition are hemophilia A (also known as classic hemophilia or factor VIII deficiency) and hemophilia B (also known as Christmas disease or factor IX deficiency). Although the two types have very similar signs and symptoms, they are caused by mutations in different genes.

Hemophilia A and hemophilia B are inherited in an X-linked recessive pattern. The genes associated with these conditions are located on the X chromosome, which is one of the two sex chromosomes. In males (who have only one X chromosome), one altered copy of the gene in each cell is sufficient to cause the condition. In females (who have two X chromosomes), a mutation would have to occur in both copies of the gene to cause the disorder. Because it is unlikely that females will have two altered copies of this gene, it is very rare for females to have hemophilia. A characteristic of X-linked inheritance is that fathers cannot pass X-linked traits to their sons (Figures 2 and 3).

Figure 2 X-linked recessive inheritance. Photo credit OpenStax College; OpenStax Anatomy and Physiology.


Figure 3: If a carrier female and a male without hemophilia produce offspring, there is a 25% total chance that they will have a child with hemophilia. None of their daughters will have the disease (although half, on average, will be carriers). Half their sons will be hemophiliacs.

In X-linked recessive inheritance, a female with one altered copy of the gene in each cell is called a carrier. Carrier females have about half the usual amount of coagulation factor VIII or coagulation factor IX, which is generally enough for normal blood clotting. However, about 10 percent of carrier females have less than half the normal amount of one of these coagulation factors; these individuals are at risk for abnormal bleeding, particularly after an injury, surgery, or tooth extraction.
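The expected outcomes of an X-linked cross can be worked out by pairing each possible egg with each possible sperm, as in a Punnett square. The sketch below assumes the allele labels XH (functional) and Xh (hemophilia) and assumes that all four egg and sperm combinations are equally likely; the function names are illustrative.

```python
from collections import Counter
from itertools import product

def describe(egg, sperm):
    """Classify one child from the allele carried by the egg and the sperm."""
    if sperm == "Y":
        # Sons have a single X chromosome, inherited from the mother.
        return "son with hemophilia" if egg == "Xh" else "unaffected son"
    altered = [egg, sperm].count("Xh")
    return ["non-carrier daughter", "carrier daughter",
            "daughter with hemophilia"][altered]

def cross(mother, father):
    """Return the expected fraction of each kind of child."""
    outcomes = Counter(describe(e, s) for e, s in product(mother, father))
    total = sum(outcomes.values())
    return {kind: count / total for kind, count in outcomes.items()}

# Carrier mother and a father without hemophilia.
print(cross(("XH", "Xh"), ("XH", "Y")))

# Non-carrier mother and a father with hemophilia.
print(cross(("XH", "XH"), ("Xh", "Y")))
```

In the first cross, each of the four outcomes has a probability of 0.25, so half the sons are affected and half the daughters are carriers. In the second cross, every daughter is a carrier and no son is affected, because a father passes his Y chromosome, not his X, to his sons.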

Colorblindness is another example of a sex-linked trait in humans. The genes that produce the photopigments necessary for color vision are located on the X chromosome. If one of these genes is not functional because it contains a harmful mutation, the individual will be colorblind. Men are much more likely than women to be colorblind: up to 100 times more men than women have various types of colorblindness.


Figure 4: A test image for color-blindness as seen by someone with normal color vision and several types of colorblindness. (Credit: Sakurambo)


Unless otherwise noted, text adapted by and images by Lisa Bartee, 2016.

OpenStax, Biology. OpenStax CNX. May 27, 2016

“X chromosome” by Genetics Home Reference: Your Guide to Understanding Genetic Conditions, National Institutes of Health: U.S. National Library of Medicine, is in the Public Domain

“Y chromosome” by Genetics Home Reference: Your Guide to Understanding Genetic Conditions, National Institutes of Health: U.S. National Library of Medicine, is in the Public Domain

“Hemophilia” by Genetics Home Reference: Your Guide to Understanding Genetic Conditions, National Institutes of Health: U.S. National Library of Medicine, is in the Public Domain


Overall phenotypes: putting it all together

None of the genes discussed in these sections occur in isolation: one individual dog would have all the genes for color, hair structure, and blood clotting (dogs can get hemophilia too). Genes interact together to produce the overall phenotype of the individual.

Example 1: Sugar

For example, look at Sugar in Figure 1. She has short hair that is mostly white. The colored portion of her hair is the tiger-striped pattern termed “brindle.”


Figure 1 Sugar has short hair with brindle colored spots. (Credit: Lisa Bartee)

The difference between short and long hair in dogs is caused by different alleles of a gene called FGF5. This gene produces a protein that is important in regulating the hair growth cycle. When the protein doesn’t function correctly, the growth phase of the hair cycle is longer, resulting in long hair. Short hair is the dominant trait. Since Sugar has short hair, we know she has at least one dominant allele of FGF5. We can use the letter “S” for short hair. Sugar’s genotype for FGF5 is therefore “S-”, meaning she has one dominant allele and we can’t tell by looking at her what her second allele is.

Sugar’s hair is also straight, which means that she has two straight alleles of KRT71. Her genotype would be K+K+.

Sugar is more than 50% white with irregular splashes of color, which means that her genotype for MITF (the gene that controls white spotting) is sise.

The brindle pattern is caused by the kbr allele at the K locus. Sugar can’t have the KB allele or she would have solid color instead of the brindle pattern because KB is dominant over kbr and ky. She could have either the genotype kbrkbr or kbrky, since the kbr allele is dominant over the yellow allele (ky).
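The dominance order at the K locus (KB over kbr, and kbr over ky) can be sketched as an ordered list in which the allele that appears earliest is the one expressed. The names below are illustrative, and the phenotype labels simply follow the wording used in this section.

```python
# Dominance order described above: KB is dominant over kbr, which is dominant over ky.
DOMINANCE = ["KB", "kbr", "ky"]
PHENOTYPE = {"KB": "solid color", "kbr": "brindle", "ky": "yellow"}

def k_locus_phenotype(allele1, allele2):
    """Return the phenotype produced by the more dominant of two K locus alleles."""
    expressed = min((allele1, allele2), key=DOMINANCE.index)
    return PHENOTYPE[expressed]

print(k_locus_phenotype("kbr", "ky"))   # one of Sugar's two possible genotypes
print(k_locus_phenotype("KB", "kbr"))
```

Both of Sugar’s possible genotypes, kbrkbr and kbrky, give the brindle pattern, while any genotype that includes KB gives a solid color.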

Sugar has black eumelanin pigment in her hair and nose. This means she has the dominant phenotype for TYRP1, so her genotype would be “B-”. Because she has eumelanin and not pheomelanin in her coat, she has the dominant phenotype for MC1R, so her genotype would be “E-”.

Sugar is a female dog who does not have hemophilia. This means that her genotype would be either XHXH or XHXh.

Putting all these together, we could say that Sugar’s overall genotype is S- K+K+ sise kbr- B- E- XHX-, where each dash marks an allele that we can’t determine by looking at her.

We could potentially determine some of the unknown alleles in her genotype if we knew anything about her parents, but Sugar was adopted from the Multnomah County Animal Shelter after being picked up as a stray, so her parents are unknown. However, when her ancestry was later determined using DNA sequencing, she turned out to be 100% American Staffordshire Terrier.

Example 2: Rags


Figure 2: Rags is similar in color to Sugar, but has a very different fur type. (Credit: Lisa Bartee)

Rags has “furnishings”, a term used to describe his beard and mustache. Furnishings are caused by a mutation in the RSPO2 gene. This gene produces a protein that is involved in establishing hair follicles. The allele that leads to furnishings is dominant over the allele for no furnishings. Rags must therefore have the genotype “F-” at RSPO2. This allele also causes the long-ish hair on his legs and tail.

Gene | Genotype | Phenotype
RSPO2 | FF or Ff | Has furnishings
FGF5 | SS or Ss | Short fur (his longer fur is caused by the furnishings allele)
KRT71 | K+K+ | Straight fur
MITF | sise | More than 50% white
K locus | KBKB, KBkbr, or KBky | Solid color, not brindle or yellow
TYRP1 | BB or Bb | Produces black eumelanin, not brown
MC1R | EE or Ee | Produces eumelanin instead of pheomelanin
F8 | XHY | Male, no hemophilia

Example 3: Black poodle


Figure 3: Black poodle. (Credit: B. Schoener from Wikimedia)

Gene | Genotype | Phenotype
RSPO2 | ff | No furnishings
FGF5 | ss | Long fur
KRT71 | KcKc | Curly fur
MITF | SS | Entirely solid color
K locus | KBKB, KBkbr, or KBky | Solid color, not brindle or yellow
TYRP1 | BB or Bb | Produces black eumelanin, not brown
MC1R | EE or Ee | Produces eumelanin instead of pheomelanin

Example 4: Golden Retriever


Figure 4: Golden Retriever. (Credit: Dirk Vorderstraße)

Gene | Genotype | Phenotype
RSPO2 | ff | No furnishings
FGF5 | ss | Long fur
KRT71 | K+K+ | Straight fur
MITF | SS | Entirely solid color
K locus | KBKB, KBkbr, or KBky | Solid color, not brindle or yellow
TYRP1 | BB or Bb | Produces black eumelanin, not brown (seen in the nose)
MC1R | ee | Produces pheomelanin instead of eumelanin, so appears yellow

But wait, there’s more!


Figure 5: An English Cocker Spaniel. (Credit eNil)

We haven’t exhaustively discussed all the genes that can affect dog appearance. For example, what gene (or genes) causes the spaniel in Figure 5 to be red? What gene(s) cause it to be speckled on its back? Or lead to its freckles? There are estimated to be about 19,000 genes in the dog genome (Ostrander, 2005). The interactions of all these genes together lead to the overall phenotype of one individual dog.

If you’re interested in learning more about the genes that are involved in the appearance of dogs, check out the Dog Coat Color Genetics website at


Unless otherwise noted, text and images by Lisa Bartee, 2016.

Ostrander EA, Wayne RK. 2005. The Canine Genome. Genome Res. 15: 1706-1716.


KIT - embryonic lethality

Although MITF is the major gene impacting white spotting in dogs, a second gene known as KIT has also been shown to have an impact in a subpopulation of German Shepherd dogs. This mutation in KIT arose very recently (it appeared spontaneously in a female dog born in 2000) and causes a phenotype called “panda spotting.” The KIT gene produces a tyrosine kinase receptor protein that functions in the same melanogenesis pathway as MITF. The KIT receptor controls many important processes within cells, including growth and division, survival, and migration. Its signaling function is important in the development of many different types of cells, including melanocytes, which produce the pigment melanin.

Interestingly, no dogs have been identified that are homozygous for the mutation in KIT. This is probably because the gene has so many important functions that having two mutated copies is lethal. An allele like this is called a homozygous lethal allele, and it results in embryonic lethality. There are other examples of embryonic lethal genes:

Figure 1 Lethality of homozygous agouti mutation. Photo credit Jcfidy; Wikimedia.

Figure 2 Jason Acuña outside of the Waterfront Marriott in Portland, OR, on August 15, 2009. Photo credit Sakibomb222; Wikimedia.

Figure 3 A “rumpy riser” Manx kitten. Photo credit Michelle Wiegold; Wikimedia.

Figure 4 Michelob (Michael) and Irish Mist (Misty), a hairless and a coated Xoloitzcuintli. Christopher A. and Amanda L. Dellario, Nottingham, NH, USA. Photo taken by Amanda L. Dellario, August 2006. Wikimedia.

In humans, the KIT gene is located on chromosome 4 (Figure 5), while it is located on chromosome 13 in dogs (Wong, 2012).

Figure 5 Chromosomal location of the KIT gene in humans. Photo credit Genetics Home Reference; Public Domain.

While KIT has only been shown to have a phenotypic effect in one family of German Shepherd dogs (at least so far), at least 69 mutations in the human KIT gene have been identified (GHR, 2018). Mutations in the human KIT gene lead to piebaldism, where melanocytes are absent from certain areas of the hair and skin (Figure 6). These mutations are inherited in an autosomal dominant fashion. I couldn’t find any information about whether homozygous piebald mutations are lethal in humans. This is likely because generating a homozygous piebald human would require mating two heterozygous piebald humans, and this disorder is rare so this mating would be unlikely.

Figure 6 Piebaldism in a 5 year old boy. Neither of his parents nor any of his five siblings showed any white spotting. Photo credit Wellcome Images; Wikimedia.


Wilson J. 2009. Cat World [Internet]. Feline Genetic Loci Table. [accessed on January 2, 2017]. Available from:

Genetics Home Reference (GHR). 2018. National Institutes of Health. Available from

Wong AK, Ruhe AL, Roberson KR, Lowe ER, Williams DC, Neff MW. 2012. A de novo mutation in KIT causes white spotting in a subpopulation of German Shepherd dogs. Animal Genetics: 44:305-310.


It's not all in the genes - the effect of environment

Not all traits are directly caused by DNA alone. The environment also plays a large role in shaping an individual’s traits. Some examples can be seen below.

  • Height and weight: A number of genes interact to determine the general height and weight that a person will have. But the environment has a major influence as well. If an individual is malnourished, their growth may be slowed and they may be smaller than they would have been if they had gotten enough food. In contrast, if a person consumes more calories than they need, their weight will likely increase regardless of their genetics.
  • Fingerprints: The general characteristics of a person’s fingerprints are determined by genetics, but the specific pattern is generated randomly during development. Identical twins typically have fingerprints that are similar, but not identical.
  • Intelligence: Like most aspects of human behavior and cognition, intelligence is a complex trait that is influenced by both genetic and environmental factors. Roughly 50% of the variation in IQ between individuals appears to be attributable to genetic factors. Factors related to a child’s home environment and parenting, education and availability of learning resources, and nutrition, among others, also contribute to intelligence. A person’s environment and genes influence each other, and it can be challenging to tease apart the effects of the environment from those of genetics. For example, if a child’s IQ is similar to that of his or her parents, is that similarity due to genetic factors passed down from parent to child, to shared environmental factors, or (most likely) to a combination of both? It is clear that both environmental and genetic factors play a part in determining intelligence.
  • Cancer risk: A person could inherit a mutation in the BRCA1 gene, which increases the risk of developing breast or ovarian cancer. Researchers have identified more than 1,800 mutations in the BRCA1 gene. Most BRCA1 gene mutations lead to the production of an abnormally short version of the BRCA1 protein or prevent any protein from being made from one copy of the gene. As a result, less of this protein is available to help repair damaged DNA or fix mutations that occur in other genes. As these defects accumulate, they can trigger cells to grow and divide uncontrollably to form a tumor. These mutations are present in every cell in the body and can be passed from one generation to the next. As a result, they are associated with cancers that cluster in families. However, not everyone who inherits a mutation in the BRCA1 gene will develop cancer. Other genetic, environmental, and lifestyle factors also contribute to a person’s cancer risk.
  • In contrast, cancer can be caused by purely environmental factors. According to the CDC, cigarette smoking is the number one risk factor for lung cancer. In the United States, cigarette smoking is linked to about 90% of lung cancers and people who smoke are 15 to 30 times more likely to get lung cancer or die from lung cancer than people who do not smoke. Radon exposure also increases the likelihood that a person will develop lung cancer.

Figure: The colors on the poodle seen in this figure have no relationship to his DNA: he was dyed for a parade. (Credit: skeeze)


“BRCA1” by Genetics Home Reference: Your Guide to Understanding Genetic Conditions, National Institutes of Health: U.S. National Library of Medicine, is in the Public Domain

“Is intelligence determined by genetics?” by Genetics Home Reference: Your Guide to Understanding Genetic Conditions, National Institutes of Health: U.S. National Library of Medicine, is in the Public Domain


Pleiotropy - one gene affects more than one trait

So far, we have discussed examples where changing the DNA sequence of one gene affects one protein, which affects one specific trait (for example, the change from brown to black fur). However, there are examples where one mutation can affect more than one trait. This is called pleiotropy.


MC1R, the gene which leads to the difference between yellow and dark colored dogs, is also found in humans. Recall that MC1R helps determine whether mostly eumelanin (dark pigment) or pheomelanin (lighter reddish pigment) will be produced. Humans who produce mostly eumelanin have the active allele of MC1R and have darker skin that tans easily and darker hair. Humans who produce mostly pheomelanin have inactive MC1R proteins and have red or blonde hair and light skin with freckles that does not tan easily.

One relatively obvious secondary phenotype of having lighter skin is that rates of skin cancer are higher in those individuals compared to individuals who have darker skin that tans easily. Since MC1R affects skin pigmentation, it also has an effect on skin cancer rates. In addition, MC1R has an effect on cancer rates that is unrelated to skin pigmentation due to its interactions with other genes that regulate inflammatory responses, DNA repair, and apoptosis (Feller, 2016).

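Interestingly, red-headed individuals also respond differently to pain and anesthesia because of their MC1R alleles: studies have reported increased sensitivity to thermal pain and a higher anesthetic requirement in redheads (Liem, 2004; Liem, 2006). So far, the reason for this is unknown.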

Mutations in MC1R have been shown to decrease knee cartilage in mice (Lorenz, 2014). Again, the mechanism for this is not understood, but it likely relates to MC1R’s signaling role.

MC1R polymorphisms have been associated with a decrease in the development of sepsis (blood poisoning) in humans (Seaton, 2017). This is probably due to the role that MC1R plays in inflammation (again, due to its role in signal transduction).

Activation of MC1R by agonists (chemicals that activate receptors) was shown to reduce several harmful symptoms in the kidneys of rats (Lindskog, 2010). The specific reason for this is not understood.

“Frizzled” chickens

The dominant “frizzle” allele causes feathers to turn upwards rather than remain flat against the chicken’s body (Figure 1). However, along with producing defective feathers, the frizzle allele also leads to abnormal body temperatures, higher metabolic and blood flow rates, and greater digestive capacity. Furthermore, chickens with this allele lay fewer eggs than their wild-type counterparts.

Figure 1 A “frizzled” chicken. Photo credit Joe Goldberg; Flickr.


Phenylketonuria (PKU) is a disorder that affects the levels of the amino acid phenylalanine in the body. We get phenylalanine from food, then process it within our cells. Individuals with PKU have a mutation in the gene for the enzyme required to break down phenylalanine, so levels of this amino acid build up in their bodies. This buildup can lead to a variety of different symptoms including intellectual disability, seizures, poor bone strength, skin rashes, behavioral and mental disorders, and an unusually small head. If you’ve ever seen a warning on a package that says “Phenylketonurics – contains phenylalanine”, this is why.


Feller L, Khammissa RAG, Kramer B, Altini M, Lemmer J. 2016. Basal cell carcinoma, squamous cell carcinoma and melanoma of the head and face. Head Face Med. 12:11.

Liem EB, et al. 2004. Anesthetic Requirement is Increased in Redheads. Anesthesiology. 101(2): 279-283.

Liem EB, Joiner TV, Tsueda K, Sessler DI. 2006. Increased Sensitivity to Thermal Pain and Reduced Subcutaneous Lidocaine Efficacy in Redheads. Anesthesiology. 102(3): 509-514.

Lindskog A, et al. 2010. Melanocortin 1 Receptor Agonists Reduce Proteinuria. J Am Soc Nephrol. 21(8): 1290-1298.

Lorenz J, et al. 2014. Melanocortin 1 Receptor-Signaling Deficiency Results in an Articular Cartilage Phenotype and Accelerates Pathogenesis of Surgically Induced Murine Osteoarthritis. PLoS One. 9(9): e105858.




Learning Objectives

By the end of this section, you will be able to:

  • Describe how mutations affect protein synthesis and its products.


How Gene Mutations Occur

A gene mutation is a permanent alteration in the DNA sequence that makes up a gene, such that the sequence differs from what is found in most people. Mutations range in size; they can affect anywhere from a single DNA building block (base pair) to a large segment of a chromosome that includes multiple genes.

Recall that the DNA sequence found within a gene controls protein synthesis. If the DNA sequence is altered, this can alter the amino acid sequence within a protein.

Figure 1 The process of protein synthesis first creates an mRNA copy of a DNA sequence during the process of transcription. This mRNA is translated into a sequence of amino acids by the ribosome. In this way, the information encoded in the sequence of bases in the DNA making up a gene is used to produce a protein.
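The two steps in Figure 1 can be sketched in a few lines of Python. This is a simplified illustration: the example sequence is made up, the codon dictionary holds only the four codons that the example uses, and the function names are hypothetical.

```python
# Partial codon table: only the codons needed for this example (an assumption
# made to keep the sketch short; the full genetic code has 64 codons).
CODONS = {"AUG": "Met", "GAA": "Glu", "UGG": "Trp", "UAA": "Stop"}

def transcribe(coding_strand):
    """The mRNA matches the DNA coding strand, with U in place of T."""
    return coding_strand.upper().replace("T", "U")

def translate(mrna):
    """Read the mRNA three bases at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODONS[mrna[i:i + 3]]
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return "-".join(protein)

mrna = transcribe("ATGGAATGGTAA")
print(mrna)
print(translate(mrna))
```

Changing a base in the DNA string changes the mRNA, which is how an altered DNA sequence can alter the amino acid sequence of the protein.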

Gene mutations can be classified in two major ways:

  • Hereditary mutations are inherited from a parent and are present throughout a person’s life in virtually every cell in the body. These mutations are also called germline mutations because they are present in the parent’s egg or sperm cells, which are also called germ cells. When an egg and a sperm cell unite, the resulting fertilized egg cell receives DNA from both parents. If this DNA has a mutation, the child that grows from the fertilized egg will have the mutation in each of his or her cells.
  • Acquired (or somatic) mutations occur at some time during a person’s life and are present only in certain cells, not in every cell in the body. These changes can be caused by environmental factors such as ultraviolet radiation from the sun, or can occur if a mistake is made as DNA copies itself during cell division. Acquired mutations in somatic cells (cells other than sperm and egg cells) cannot be passed on to the next generation.

Figure 2 The red individual has inherited two mutated alleles of a gene from their parents. This is an example of a hereditary mutation.

Figure: The color variation seen in this tulip is caused by a somatic mutation – one which occurred early in the development of this individual flower.

Genetic changes that are described as de novo (new) mutations can be either hereditary or somatic. In some cases, the mutation occurs in a person’s egg or sperm cell but is not present in any of the person’s other cells. In other cases, the mutation occurs in the fertilized egg shortly after the egg and sperm cells unite. It is often impossible to tell exactly when a de novo mutation happened. As the fertilized egg divides, each resulting cell in the growing embryo will have the mutation. De novo mutations may explain genetic disorders in which an affected child has a mutation in every cell in the body but the parents do not, and there is no family history of the disorder.

Somatic mutations that happen in a single cell early in embryonic development can lead to a situation called mosaicism. These genetic changes are not present in a parent’s egg or sperm cells, or in the fertilized egg, but happen a bit later when the embryo includes several cells. As all the cells divide during growth and development, cells that arise from the cell with the altered gene will have the mutation, while other cells will not. Depending on the mutation and how many cells are affected, mosaicism may or may not cause health problems.

Most disease-causing gene mutations are uncommon in the general population. However, other genetic changes occur more frequently. Genetic alterations that occur in more than 1 percent of the population are called polymorphisms. They are common enough to be considered a normal variation in the DNA. Polymorphisms are responsible for many of the normal differences between people such as eye color, hair color, and blood type. Although many polymorphisms have no negative effects on a person’s health, some of these variations may influence the risk of developing certain disorders.


“Mutations and Health” by U.S. National Library of Medicine is in the Public Domain


Introduction to Genetic Disorders

 To function correctly, each cell depends on thousands of proteins to do their jobs in the right places at the right times. Sometimes, gene mutations prevent one or more of these proteins from working properly. By changing a gene’s instructions for making a protein, a mutation can cause the protein to malfunction or to be missing entirely. When a mutation alters a protein that plays a critical role in the body, it can disrupt normal development or cause a medical condition. A condition caused by mutations in one or more genes is called a genetic disorder.

In some cases, gene mutations are so severe that they prevent an embryo from surviving until birth. These changes occur in genes that are essential for development, and often disrupt the development of an embryo in its earliest stages. Because these mutations have very serious effects, they are incompatible with life.

It is important to note that genes themselves do not cause disease—genetic disorders are caused by mutations that make a gene function improperly. For example, when people say that someone has “the cystic fibrosis gene,” they are usually referring to a mutated version of the CFTR gene, which causes the disease. All people, including those without cystic fibrosis, have a version of the CFTR gene.


“Mutations and Health” by U.S. National Library of Medicine is in the Public Domain


Do all gene mutations affect health and development?

No; only a small percentage of mutations cause genetic disorders—most have no impact on health or development. For example, some mutations alter a gene’s DNA sequence but do not change the function of the protein made by the gene.

Often, gene mutations that could cause a genetic disorder are repaired by certain enzymes before the gene is expressed and an altered protein is produced. Each cell has a number of pathways through which enzymes recognize and repair mistakes in DNA. Because DNA can be damaged or mutated in many ways, DNA repair is an important process by which the body protects itself from disease.

A very small percentage of all mutations actually have a positive effect. These mutations lead to new versions of proteins that help an individual better adapt to changes in his or her environment. For example, a beneficial mutation could result in a protein that protects an individual and future generations from a new strain of bacteria.

Because a person’s genetic code can have a large number of mutations with no effect on health, diagnosing genetic conditions can be difficult. Sometimes, genes thought to be related to a particular genetic condition have mutations, but whether these changes are involved in development of the condition has not been determined; these genetic changes are known as variants of unknown significance (VOUS). Sometimes, no mutations are found in suspected disease-related genes, but mutations are found in other genes whose relationship to a particular genetic condition is unknown. It is difficult to know whether these variants are involved in the disease.


Figure: This lobster contains a mutation that causes it to be blue. This is estimated to occur in roughly one in two million lobsters.


“Mutations and Health” by U.S. National Library of Medicine is in the Public Domain


Types of Mutations

The DNA sequence of a gene can be altered in a number of ways. Gene mutations have varying effects on health, depending on where they occur and whether they alter the function of essential proteins. The types of mutations include:

  • Silent mutation: Silent mutations cause a change in the sequence of bases in a DNA molecule, but do not result in a change in the amino acid sequence of a protein (Figure 1).
  • Missense mutation: This type of mutation is a change in one DNA base pair that results in the substitution of one amino acid for another in the protein made by a gene (Figure 1).
  • Nonsense mutation: A nonsense mutation is also a change in one DNA base pair. Instead of substituting one amino acid for another, however, the altered DNA sequence prematurely signals the cell to stop building a protein (Figure 1). This type of mutation results in a shortened protein that may function improperly or not at all.

Figure 1 Some mutations do not change the sequence of amino acids in a protein. Some swap one amino acid for another. Others introduce an early stop codon into the sequence, causing the protein to be truncated.

  • Insertion or Deletion: An insertion changes the number of DNA bases in a gene by adding a piece of DNA. A deletion removes a piece of DNA. Insertions or deletions may be small (one or a few base pairs within a gene) or large (an entire gene, several genes, or a large section of a chromosome). In any of these cases, the protein made by the gene may not function properly.
  • Duplication: A duplication consists of a piece of DNA that is abnormally copied one or more times. This type of mutation may alter the function of the resulting protein.
  • Frameshift mutation: This type of mutation occurs when the addition or loss of DNA bases changes a gene’s reading frame. A reading frame consists of groups of 3 bases that each code for one amino acid. A frameshift mutation shifts the grouping of these bases and changes the code for amino acids. The resulting protein is usually nonfunctional. Insertions, deletions, and duplications can all be frameshift mutations.

Figure 2 A frameshift mutation adds or deletes a number of bases that is not a multiple of 3 (for example, 1 or 2 bases). This results in a shift of the “reading frame” for the ribosome, causing a drastic change in amino acid sequence. Photo Credit: OpenStax Biology.

  • Repeat expansion: Nucleotide repeats are short DNA sequences that are repeated a number of times in a row. For example, a trinucleotide repeat is made up of 3-base-pair sequences, and a tetranucleotide repeat is made up of 4-base-pair sequences. A repeat expansion is a mutation that increases the number of times that the short DNA sequence is repeated. This type of mutation can cause the resulting protein to function in a completely different way than it would have originally.
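Several of the mutation types listed above can be told apart by translating the DNA before and after the change. The sketch below uses the standard genetic code and a short made-up gene. The function names and sequences are illustrative, and the classifier is deliberately simple: it only handles single-base substitutions and treats any shortened protein as the result of a nonsense mutation.

```python
from itertools import product

# Standard genetic code in compact form: codons in TCAG order, "*" marks stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(dna):
    """Translate a coding-strand DNA sequence until the first stop codon."""
    protein = ""
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein += amino_acid
    return protein

def classify_substitution(reference, mutant):
    """Label a single-base substitution as silent, missense, or nonsense."""
    ref_protein, mut_protein = translate(reference), translate(mutant)
    if mut_protein == ref_protein:
        return "silent"
    if len(mut_protein) < len(ref_protein):
        return "nonsense"
    return "missense"

reference = "ATGGAATGGCTTTAA"                        # ATG GAA TGG CTT TAA
print(translate(reference))
print(classify_substitution(reference, "ATGGAGTGGCTTTAA"))  # GAA changed to GAG
print(classify_substitution(reference, "ATGGTATGGCTTTAA"))  # GAA changed to GTA
print(classify_substitution(reference, "ATGGAATGACTTTAA"))  # TGG changed to TGA

# Frameshift: inserting one base regroups every codon after the insertion.
frameshift = reference[:4] + "C" + reference[4:]
print(translate(frameshift))
```

The single inserted base in the last example changes every amino acid after the first one, which is why frameshift mutations usually produce a nonfunctional protein.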


“Mutations and Health” by U.S. National Library of Medicine is in the Public Domain


Multifactorial Disorders and Genetic Predispositions

Multifactorial Disorders

Researchers are learning that nearly all conditions and diseases have a genetic component. Some disorders, such as sickle cell disease and cystic fibrosis, are caused by mutations in a single gene. The causes of many other disorders, however, are much more complex. Common medical problems such as heart disease, diabetes, and obesity do not have a single genetic cause—they are likely associated with the effects of multiple genes in combination with lifestyle and environmental factors. Conditions caused by many contributing factors are called complex or multifactorial disorders.

Figure 1 The main symptoms of diabetes, a multifactorial disorder

Although complex disorders often cluster in families, they do not have a clear-cut pattern of inheritance. This makes it difficult to determine a person’s risk of inheriting or passing on these disorders. Complex disorders are also difficult to study and treat because the specific factors that cause most of these disorders have not yet been identified. Researchers continue to look for major contributing genes for many common complex disorders.

Genetic Predispositions

A genetic predisposition (sometimes also called genetic susceptibility) is an increased likelihood of developing a particular disease based on a person’s genetic makeup. A genetic predisposition results from specific genetic variations that are often inherited from a parent. These genetic changes contribute to the development of a disease but do not directly cause it. Some people with a predisposing genetic variation will never get the disease while others will, even within the same family.

Genetic variations can have large or small effects on the likelihood of developing a particular disease. For example, certain mutations in the BRCA1 or BRCA2 genes greatly increase a person’s risk of developing breast cancer and ovarian cancer. Variations in other genes, such as BARD1 and BRIP1, also increase breast cancer risk, but the contribution of these genetic changes to a person’s overall risk appears to be much smaller.

Current research is focused on identifying genetic changes that have a small effect on disease risk but are common in the general population. Although each of these variations only slightly increases a person’s risk, having changes in several different genes may combine to increase disease risk significantly. Changes in many genes, each with a small effect, may underlie susceptibility to many common diseases, including cancer, obesity, diabetes, heart disease, and mental illness.
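
The idea that several small-effect variations can add up to a large change in risk can be shown with simple arithmetic. The Python sketch below is a toy model only: the baseline risk and the relative risk of each variation are hypothetical numbers, and multiplying the relative risks together is a simplifying assumption, not a description of how risk is estimated for any real disease.

```python
def combined_relative_risk(relative_risks):
    """Multiply individual relative risks (assumes the effects are independent)."""
    combined = 1.0
    for relative_risk in relative_risks:
        combined *= relative_risk
    return combined


# Hypothetical values chosen only for illustration.
baseline_risk = 0.05                 # risk for a person with none of the variations
small_effect_variants = [1.2] * 5    # five variations, each raising risk by 20 percent

combined = combined_relative_risk(small_effect_variants)
# A probability cannot exceed 1, so the result is capped.
risk_with_variants = min(1.0, baseline_risk * combined)

print(f"combined relative risk: {combined:.2f}")
print(f"risk with all five variations: {risk_with_variants:.3f}")
```

Under these assumptions, no single variation raises the risk by much, but a person carrying all five has roughly two and a half times the baseline risk.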

In people with a genetic predisposition, the risk of disease can depend on multiple factors in addition to an identified genetic change. These include other genetic factors (sometimes called modifiers) as well as lifestyle and environmental factors. Although a person’s genetic makeup cannot be altered, some lifestyle and environmental modifications (such as having more frequent disease screenings and maintaining a healthy weight) may be able to reduce disease risk in people with a genetic predisposition.


“Mutations and Health” by U.S. National Library of Medicine is in the Public Domain


Changes in Numbers of Genes or Chromosomes

Changes in Numbers of Genes

People have two copies of most genes, one copy inherited from each parent. In some cases, however, the number of copies varies—meaning that a person can be born with one, three, or more copies of particular genes. Less commonly, one or more genes may be entirely missing. This type of genetic difference is known as copy number variation (CNV).

Copy number variation results from insertions, deletions, and duplications of large segments of DNA. These segments are big enough to include whole genes. Variation in gene copy number can influence the activity of genes and ultimately affect many body functions.
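
As a minimal illustration, copy number variation can be described by comparing the number of copies of each gene with the usual two. In the Python sketch below, the gene names and copy counts are hypothetical placeholders rather than real data.

```python
EXPECTED_COPIES = 2  # most genes: one copy inherited from each parent


def describe_copy_number(copies):
    """Label a gene's copy number relative to the usual two copies."""
    if copies < 0:
        raise ValueError("copy number cannot be negative")
    if copies == EXPECTED_COPIES:
        return "typical (two copies)"
    if copies == 0:
        return "gene missing"
    if copies < EXPECTED_COPIES:
        return "copy number loss"
    return "copy number gain"


# Hypothetical person: gene names and counts are made up for illustration.
copy_counts = {"GENE_A": 2, "GENE_B": 1, "GENE_C": 3, "GENE_D": 0}

for gene, copies in copy_counts.items():
    print(gene, copies, describe_copy_number(copies))
```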

Researchers were surprised to learn that copy number variation accounts for a significant amount of genetic difference between people. More than 10 percent of human DNA appears to contain these differences in gene copy number. While much of this variation does not affect health or development, some differences likely influence a person’s risk of disease and response to certain drugs. Future research will focus on the consequences of copy number variation in different parts of the genome and study the contribution of these variations to many types of disease.

Changes in Numbers of Chromosomes

Human cells normally contain 23 pairs of chromosomes, for a total of 46 chromosomes in each cell. A change in the number of chromosomes can cause problems with growth, development, and function of the body’s systems. These changes can occur during the formation of reproductive cells (eggs and sperm), in early fetal development, or in any cell after birth. A gain or loss of chromosomes from the normal 46 is called aneuploidy.

A common form of aneuploidy is trisomy, or the presence of an extra chromosome in cells. “Tri-” is Greek for “three”; people with trisomy have three copies of a particular chromosome in cells instead of the normal two copies. Down syndrome is an example of a condition caused by trisomy. People with Down syndrome typically have three copies of chromosome 21 in each cell, for a total of 47 chromosomes per cell.


Figure 1 This karyotype, which is a picture of all the chromosomes from one individual, is from a person who has Trisomy 13.

Monosomy, or the loss of one chromosome in cells, is another kind of aneuploidy. “Mono-” is Greek for “one”; people with monosomy have one copy of a particular chromosome in cells instead of the normal two copies. Turner syndrome is a condition caused by monosomy. Women with Turner syndrome usually have only one copy of the X chromosome in every cell, for a total of 45 chromosomes per cell.
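
The chromosome totals given for trisomy and monosomy follow from simple counting, which the Python sketch below makes explicit. It is a simplified model: each of the 23 chromosome pairs is given a copy count, and the label "X" is used as an assumed stand-in for the sex-chromosome pair (XX or XY).

```python
# Chromosome pairs 1-22 plus the sex-chromosome pair, labeled "X" here.
PAIRS = [str(number) for number in range(1, 23)] + ["X"]


def make_karyotype(changes=None):
    """Start from two copies of every chromosome, then apply any changes."""
    karyotype = {pair: 2 for pair in PAIRS}
    karyotype.update(changes or {})
    return karyotype


def describe(karyotype):
    """Return the total chromosome count and any trisomy or monosomy found."""
    total = sum(karyotype.values())
    findings = []
    for pair, copies in karyotype.items():
        if copies == 3:
            findings.append(f"trisomy {pair}")
        elif copies == 1:
            findings.append(f"monosomy {pair}")
    return total, findings


print(describe(make_karyotype()))             # typical karyotype
print(describe(make_karyotype({"21": 3})))    # Down syndrome
print(describe(make_karyotype({"X": 1})))     # Turner syndrome
```

The totals match the counts described above: 46 chromosomes in the typical case, 47 with an extra copy of chromosome 21, and 45 with a single X chromosome.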

Rarely, some cells end up with complete extra sets of chromosomes. Cells with one additional set of chromosomes, for a tota