2.2 Naming Organic Compounds
Systematic Names
Chemists communicating about organic chemistry can use drawings in many situations, but in others it is preferable to describe substances by name. However, there are millions of known, described organic substances, so giving each one an independent name, and learning a collection of such names, would be impossible. The solution is a systematic naming scheme in which a name is assembled from components, so that the name contains enough information to translate the words back into a complete, unique structure.
Such a system has been devised by the International Union of Pure and Applied Chemistry (IUPAC, usually pronounced eye-you-pack). While the IUPAC system is used frequently in conversation for smaller, simpler molecules, large and complex molecules, such as many of those made by biological organisms, end up with long and complicated names. Even so, these names are often used in official communication about such molecules: they can be written into documentation and are often easier to share than drawings. Unfortunately, public misunderstanding of how IUPAC names work has led some people to associate a complicated name with a health risk from exposure to the substance. This association has no basis. Many natural substances, including many that are healthy to consume, have long and complicated systematic names.
The substance (Z)-3-hexenyl ethanoate sounds pretty unfriendly, but it is actually a principal component of the smell of freshly cut grass. The primary odor chemical in the fragrance of a rose is (2E)-3,7-dimethyl-2,6-octadien-1-ol.
These names obviously would not be commonly used even by chemists discussing them at a cocktail party, but they are information-dense and in official communication provide an unequivocal link to a single chemical structure.
How to get a systematic name from a structure
To assign a name to a compound, begin by determining the ‘parent chain’, which is the longest continuous chain of carbon atoms. On paper, you should be able to put a finger down on one end of the parent chain and trace through all of its carbons until you get to the other end, without needing to lift your finger. We’ll start with the simplest straight-chain alkane structures.
If the parent chain is just one carbon long, the name is based on CH4 which is called methane. For a two-carbon parent chain the name will be based on C2H6, which is ethane. The table below continues with the names of longer straight-chain alkanes. While rote memorization is generally not the best way to learn organic chemistry, it may be worth committing these to memory, as they are the basis for the rest of the IUPAC nomenclature system. With some practice they will become part of your functional vocabulary.
Names for straight-chain alkanes:
1 carbon: methane
2 carbons: ethane
3 carbons: propane
4 carbons: butane
5 carbons: pentane
6 carbons: hexane
7 carbons: heptane
8 carbons: octane
9 carbons: nonane
10 carbons: decane
While many of these names share a Greek root with more familiar geometric shape names, some do not. For the first four, chemistry students often learn their order with the aid of the mnemonic (memory device) “Mice Eat Peanut Butter.”
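The table above can also be written as a small lookup, which is a convenient way to quiz yourself. The following is only a sketch: the function names are made up for this illustration, and the formula helper assumes the general alkane formula CnH2n+2, which matches the CH4 and C2H6 examples given earlier.

```python
# Names for straight-chain alkanes of 1 to 10 carbons, copied from the table.
ALKANE_NAMES = {
    1: "methane", 2: "ethane", 3: "propane", 4: "butane", 5: "pentane",
    6: "hexane", 7: "heptane", 8: "octane", 9: "nonane", 10: "decane",
}


def alkane_name(n_carbons):
    """Return the straight-chain alkane name for 1 to 10 carbons."""
    if n_carbons not in ALKANE_NAMES:
        raise ValueError("this sketch only covers 1 to 10 carbons")
    return ALKANE_NAMES[n_carbons]


def alkane_formula(n_carbons):
    """Return the molecular formula, assuming the pattern CnH(2n+2)."""
    hydrogens = 2 * n_carbons + 2
    # By convention a count of 1 is not written, so CH4 rather than C1H4.
    carbon_part = "C" if n_carbons == 1 else f"C{n_carbons}"
    return f"{carbon_part}H{hydrogens}"


print(alkane_name(4), alkane_formula(4))  # butane C4H10
```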
Substituents branching from the parent chain are given a location signifier: the number of the parent-chain carbon where the branch is attached. The chain is numbered from whichever end gives the lowest possible numbers. For example, notice below how the compound on the left is named 1-chlorobutane, not 4-chlorobutane. The “1” designates that the chlorine is attached at the first carbon in the parent chain. When the substituents are small groups made of carbon and hydrogen, called alkyl groups, the terms methyl, ethyl, and propyl are used to identify them.
Other common names for more complex carbon-containing substituents are isopropyl, tert-butyl and phenyl. You may recognize how complicated the names could become, with multiple branches and non-carbon substituent groups all possible on large chains. In some situations this has led to a preference for common names in casual talk and even among scientists, such as the names given for the amino acids shown below. Some common names, such as phenylalanine, include components of systematic names within them. No one can learn all the common names, and no one can learn all the rules for systematic names in a short period of time. For now we are learning bits and pieces, and learning how the system for nomenclature works.
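The lowest-number rule can be illustrated with a little arithmetic: a carbon counted as position p from one end of an n-carbon chain is position n + 1 - p from the other end, and the smaller of the two is used. The sketch below handles only the simplest case of a single substituent on a straight chain, and its function names are hypothetical, chosen just for this example.

```python
def lowest_locant(chain_length, position):
    """Return the smaller of the two possible numbers for one substituent.

    Counting from the opposite end turns position p into chain_length + 1 - p.
    """
    if not 1 <= position <= chain_length:
        raise ValueError("position must lie on the parent chain")
    return min(position, chain_length + 1 - position)


def name_monosubstituted(chain_length, position, substituent, parent):
    """Build a name such as 1-chlorobutane for a single substituent."""
    locant = lowest_locant(chain_length, position)
    return f"{locant}-{substituent}{parent}"


# A chlorine on the last carbon of butane is still 1-chlorobutane.
print(name_monosubstituted(4, 4, "chloro", "butane"))  # 1-chlorobutane
```

With several substituents the full IUPAC rules compare whole sets of numbers, which this sketch does not attempt.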
The structure shown below is laid out on the page so that the longest continuous carbon chain is oriented vertically. Structures that are presented this way can be confusing, leading to misinterpretation. In this case the structure could be accidentally named 2-ethylpropane (incorrect) instead of 2-methylbutane (correct).
Keep in mind that the IUPAC name for a hydrocarbon is always based on the longest possible parent chain, which in this case is four carbons, not three. Especially if you are looking at large and complicated structures, it can get tricky to identify the parent chain, but it is the foundation of the name.
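The "trace with your finger" test can be mimicked by treating the carbon skeleton as a set of bonds and searching for the longest path that never revisits a carbon. The sketch below is a brute-force illustration for small, ring-free skeletons with at least one bond; the carbon labels and the function name are invented for this example.

```python
def longest_chain(bonds):
    """Return the longest continuous path of carbons in a ring-free skeleton.

    bonds is a list of (a, b) pairs, one pair per carbon-carbon bond.
    """
    # Record which carbons are bonded to which.
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    def longest_from(start):
        # Depth-first walk that keeps the longest path seen so far.
        best = [start]
        stack = [(start, [start])]
        while stack:
            atom, path = stack.pop()
            if len(path) > len(best):
                best = path
            for nxt in neighbors[atom]:
                if nxt not in path:  # never step back onto a visited carbon
                    stack.append((nxt, path + [nxt]))
        return best

    # Try every carbon as a starting point and keep the longest result.
    return max((longest_from(atom) for atom in neighbors), key=len)


# The skeleton that looks like "2-ethylpropane": a three-carbon row
# with a two-carbon branch on the middle carbon.
skeleton = [("left", "center"), ("center", "right"),
            ("center", "up1"), ("up1", "up2")]
print(len(longest_chain(skeleton)))  # 4, so the parent chain is butane
```

The search finds a four-carbon chain that runs through the "branch", which is why the correct name is 2-methylbutane.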
When carbons bond to form rings, the resulting cyclic alkanes are called cyclopropane, cyclobutane, cyclopentane, cyclohexane, and so on:
In cases where multiple copies of the same substituent are on a structure, the prefixes di, tri, and tetra are used.
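As a rough illustration of how the multiplying prefixes combine with location numbers, the sketch below builds the substituent part of a name from a list of positions. It assumes every position is listed (repeating a number when two groups share a carbon) and covers only up to four copies; the function name is hypothetical.

```python
# Multiplying prefixes from the text; one copy takes no prefix.
MULTIPLIERS = {1: "", 2: "di", 3: "tri", 4: "tetra"}


def substituent_prefix(locants, substituent):
    """Return text such as '2,3-dimethyl' for repeated substituents."""
    locants = sorted(locants)
    count = len(locants)
    if count not in MULTIPLIERS:
        raise ValueError("this sketch covers one to four copies only")
    numbers = ",".join(str(n) for n in locants)
    return f"{numbers}-{MULTIPLIERS[count]}{substituent}"


print(substituent_prefix([2, 3], "methyl") + "butane")      # 2,3-dimethylbutane
print(substituent_prefix([2, 2, 4], "methyl") + "pentane")  # 2,2,4-trimethylpentane
```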
We will learn more about functional groups soon. For now, note that these characteristic groups of atoms show up in names as suffixes. Alcohols, for example, have ‘ol’ appended to the parent chain name, along with a number designating the location of the alcohol group. Ketones, in which a carbon within the chain is double-bonded to an oxygen, are designated in names by the suffix ‘one’.
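The suffix idea can be sketched in a few lines: drop the final ‘e’ of the parent alkane name, append the suffix, and put the location number in front. This is a simplified illustration that uses the same number-first style as names like 5-methyl-1-hexanol; the dictionary and function name are assumptions made for the example, and it ignores the many additional rules that apply to real molecules.

```python
# Suffixes mentioned in the text for two functional groups.
SUFFIXES = {"alcohol": "ol", "ketone": "one"}


def functional_name(parent_alkane, group, locant):
    """Build a name such as 2-butanol from a parent alkane name."""
    if not parent_alkane.endswith("ane"):
        raise ValueError("expected an alkane name ending in 'ane'")
    stem = parent_alkane[:-1]  # drop the final 'e': butane -> butan
    return f"{locant}-{stem}{SUFFIXES[group]}"


print(functional_name("butane", "alcohol", 2))  # 2-butanol
print(functional_name("pentane", "ketone", 2))  # 2-pentanone
```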
All of the examples we have seen so far have been simple in the sense that only one functional group was present on each molecule. There are of course many more rules in the IUPAC system, and as you can imagine, the IUPAC naming of larger molecules with multiple functional groups, ring structures, and substituents can get very unwieldy very quickly. The drug cocaine, for example, has the IUPAC name ‘methyl (1R,2R,3S,5S)-3-(benzoyloxy)-8-methyl-8-azabicyclo[3.2.1]octane-2-carboxylate’.
You can see why the IUPAC system is not used very much in biological organic chemistry: the molecules are just too big and complex. A further complication is that, even outside of a biological context, many simple organic molecules are known almost universally by their ‘common’ names rather than their IUPAC names. The compounds acetic acid, chloroform, and acetone are only a few examples.
In biochemistry, nonsystematic names (like ‘cocaine’, ‘capsaicin’, ‘pyruvate’ or ‘ascorbic acid’) are usually used, and when systematic nomenclature is employed it is often specific to the class of molecule in question: different systems have evolved, for example, for fats and for carbohydrates. We will not focus very intensively in this text on IUPAC nomenclature or any other nomenclature system, but if you undertake a more advanced study in organic or biological chemistry you may be expected to learn one or more naming systems in some detail. If you are familiar with how naming systems work, you will be able to apply that general understanding to any specific system you need to learn.
Exercise 2.2.1
Look up the IUPAC names for acetic acid, chloroform, and acetone. One place you can find these is on Wikipedia, in the box of chemical information that is on the right side of the page. Wikipedia is quite reliable for this type of technical information. Other reliable sources for this kind of information include ChemSpider and PubChem.
Exercise 2.2.2
Attempt to draw line-bond structures of the following compounds, based on what you have learned about the IUPAC nomenclature system. If you can’t do these now you probably will be able to do them in a few weeks. If you are unable to draw the line-bond structure, describe it in words.
a) methylcyclopentane
b) 5-methyl-1-hexanol (“-ol” indicates an alcohol, an -OH group bonded to carbon)
c) 2-methyl-2-butene (“-ene” indicates a carbon to carbon double bond)
Exercise 2.2.3
Exercise 2.2.4
Exercise 2.2.5
Exercise 2.2.6
Parent chain: The longest continuous chain of carbons in a chemical structure.
Substituent: An atom or group of atoms connected to the parent chain, but not part of it.
Alkyl group: A substituent group containing only carbon and hydrogen.
Isopropyl: A group of 3 carbons with an attachment at the 2nd carbon.
tert-Butyl: A group of 4 carbons with an attachment at the tertiary carbon in position 2.
Phenyl: A group containing 6 carbons and 5 hydrogens arranged in a ring, attached as a substituent to a parent chain.