The human microbiome


Although microbial ecologists usually focus on other environments, one microbial environment of special importance to us is the human body: the human "microbiome". Microbiologists know a lot more about human microbiology than about any other system, particularly about organisms that cause serious acute disease. But this is the tip of the iceberg - by a commonly cited estimate, there are approximately 10 times as many bacterial cells in and on the human body as there are human cells! Given that all the human cells are of the same species, and the microbial populations are of countless species, the genetic information content of a human is dominated by the microbial genes. Very basic questions remain, such as:

  • How much microbial diversity is there in the human body?
  • How much do the microbial communities differ in different parts of the body?
  • How much variation is there in "normal" microbial communities?
  • How much do our microbial communities change over time?
  • How do these communities contribute to our overall health?
  • ... and so on...

Think of it this way; picture an old-growth tree:

It would be futile to try to understand the health and function of this tree by studying only its leaves, limbs, trunk, roots and wood, or even its genome; sure, you'd get some information, but to really understand this tree, you have to think of it as an ecosystem, and include the animals, plants, fungi, and microbes that cover its surface (both above and below ground) and fill its nooks and crannies. If the tree were unhealthy, a good place to start might be to examine all of these symbionts. But in the absence of an obvious cause, such as kudzu or tuberculosis, what do you look for? What is "normal"? How can you distinguish normal from problematic? In general, in the case of human microbiology, we don't have a clue. This, then, is the purpose of today's paper:

Bacterial community variation in human body habitats across space and time

Costello EK, Lauber CL, Hamady M, Fierer N, Gordon JI & Knight R 2009 Bacterial community variation in human body habitats across space and time. Science 326:1694-1697

Online supplement to this paper

Purpose: To begin to determine how human-associated microbial communities vary from place to place on the body, from person to person, and over time.

In this series of experiments, the authors took samples from 3 female and 6 male healthy humans at 27 locations on 4 separate days over 3 months:

DNA was isolated from each sample and a region of the SSU rRNA gene was amplified by PCR. These PCRs each had a unique "barcode" sequence added to the forward primer; these barcodes allow the exact source of each sequence to be identified directly from the sequence. This allows them to mix as many samples as they have into a single 454 sequencing run. They ended up with 815 different samples, and an average of 1315 classifiable sequences per sample, totaling just over a million usable sequences.
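To make the barcode idea concrete, here is a minimal sketch of sorting pooled reads back into their samples. The barcodes, their length, and the sample names are all made up for illustration; the real ones would come from the study's own sample mapping, and a real pipeline would also tolerate sequencing errors in the barcode.

```python
from collections import defaultdict

# Hypothetical barcode-to-sample map (not the barcodes used in the paper).
BARCODES = {
    "ACGTACGT": "F1_palm_day1",
    "TGCATGCA": "M3_tongue_day1",
}
BARCODE_LEN = 8  # assumed fixed barcode length


def demultiplex(reads, barcodes=BARCODES, barcode_len=BARCODE_LEN):
    """Group reads by sample using the barcode at the start of each read."""
    by_sample = defaultdict(list)
    for read in reads:
        tag, insert = read[:barcode_len], read[barcode_len:]
        sample = barcodes.get(tag)
        if sample is None:
            # Unknown barcode: keep the whole read aside rather than guess.
            by_sample["unassigned"].append(read)
        else:
            # Known barcode: store the read with the barcode trimmed off.
            by_sample[sample].append(insert)
    return dict(by_sample)


reads = ["ACGTACGTGGATTC", "TGCATGCAGGCTTA", "ACGTACGTGGATTA", "NNNNNNNNGGATTA"]
binned = demultiplex(reads)
```

Because the sample identity travels inside each read, the pooled run can be split apart afterwards with nothing more than a lookup table.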

As you might predict, nearly all (92.3%) of the sequences were from 4 bacterial phyla: the Actinobacteria, Firmicutes, Proteobacteria, and Bacteroidetes. Most of the rest are cyanobacteria (from pollen chloroplasts) and some Fusobacteria (primarily in the mouth):

The data can be sorted out by body site in the following graph:

In this graph, the central pie charts represent the "core" groups that were found in all individuals at that site. The outer circles are groups found in some individuals but not others. Only the 40 most common groups are shown; the grey zones represent all other groups combined.

Another way to represent most of this data is this graph:

In order to compare samples against each other, the authors use a metric called "UniFrac". UniFrac stands for "UNIque FRACtion". It starts with a reference phylogenetic tree, and then for each population determines which branches in this tree lead to sequences in that sample. Two populations are compared by determining the fraction of branches (actually branch length) that are shared between the populations and the fraction of branch length that is unique (i.e. the UniFrac) to each population:

Population similarities can be visualized graphically by calculating all pair-wise UniFrac distances, then either generating a tree from these distances or running a principal coordinates analysis (PCoA). Note that in the case of trees, each leaf of the tree doesn't represent any specific sequence, but the entire population of sequences, and distance in the tree represents how similar or distant those populations are.
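The UniFrac comparison and the table of pair-wise distances can be sketched roughly as follows. This is the unweighted form (a branch either leads to a sample or it doesn't), run on a toy tree with invented branch lengths and invented sample contents; it illustrates the idea and is not the authors' software.

```python
from itertools import combinations

# Toy reference tree: each node maps to (parent, length of the branch above it).
# Leaves A-D stand in for sequences; all lengths are made up.
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "C": ("n2", 2.0),
    "D": ("n2", 1.0),
    "n1": ("root", 1.0),
    "n2": ("root", 1.0),
}


def branches_to(leaves, tree=TREE):
    """Return every branch on the path from the root to any of the leaves."""
    covered = set()
    for node in leaves:
        while node in tree:  # stops at the root, which has no branch above it
            covered.add(node)
            node = tree[node][0]
    return covered


def branch_length(nodes, tree=TREE):
    return sum(tree[n][1] for n in nodes)


def unifrac(sample1, sample2, tree=TREE):
    """Unique branch length divided by total branch length for the two samples."""
    b1, b2 = branches_to(sample1, tree), branches_to(sample2, tree)
    total = branch_length(b1 | b2, tree)
    unique = branch_length(b1 ^ b2, tree)  # branches leading to only one sample
    return unique / total if total else 0.0


# Hypothetical samples, each listed as the set of leaves observed in it.
samples = {"mouth": {"A", "B"}, "gut": {"C", "D"}, "skin": {"A", "C"}}
distances = {
    (a, b): unifrac(samples[a], samples[b])
    for a, b in combinations(sorted(samples), 2)
}
```

A distance of 0 means the two populations cover exactly the same branches, and 1 means they share none; the `distances` table is the kind of input a tree-building or ordination method would take.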

Their metric for "diversity" is much simpler - for any one community, they just add up all of the branch lengths leading to sequences that appear in that sample, and divide by the number of sequences obtained.
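As a small illustration of this diversity measure as described here, the sketch below sums the branch lengths leading to the sequences seen in one sample and divides by the number of sequences. The tree and the sample are toy values chosen only to make the arithmetic easy to follow.

```python
# Toy reference tree: each node maps to (parent, length of the branch above it).
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "C": ("n2", 2.0),
    "D": ("n2", 1.0),
    "n1": ("root", 1.0),
    "n2": ("root", 1.0),
}


def phylogenetic_diversity(observed, tree=TREE):
    """observed: one leaf name per sequence obtained (repeats allowed).

    Returns (summed branch length, summed branch length per sequence).
    """
    if not observed:
        raise ValueError("need at least one sequence")
    covered = set()
    for node in set(observed):
        while node in tree:  # walk up to the root, counting each branch once
            covered.add(node)
            node = tree[node][0]
    total_length = sum(tree[n][1] for n in covered)
    return total_length, total_length / len(observed)


# Four sequences: two copies of A, one B, one C.
total, per_sequence = phylogenetic_diversity(["A", "A", "B", "C"])
```

Here the branches above A, B, C, n1 and n2 add up to 6.0, so the per-sequence value is 6.0 / 4 = 1.5. Seeing A twice adds nothing to the total, which is why repeated sequences lower the per-sequence figure.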

Note that in both the UniFrac and diversity metrics, sequences greater than or equal to 97% identical are considered to be the same thing; this is sometimes called a "molecular species", or OTU (operational taxonomic unit).
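One simple way to picture the 97% rule is a greedy grouping, sketched below. It assumes the sequences are already aligned to the same length and compares each new sequence only against the first member of each existing group; this is a simplification for illustration, not necessarily the clustering procedure the authors used.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs, threshold=0.97):
    """Assign each sequence to the first OTU whose representative it matches."""
    otus = []  # list of (representative, members)
    for seq in seqs:
        for rep, members in otus:
            if identity(seq, rep) >= threshold:
                members.append(seq)
                break
        else:
            # No existing representative is close enough: start a new OTU.
            otus.append((seq, [seq]))
    return otus


base = "ACGT" * 25            # 100 positions
near = "TT" + base[2:]        # 2 mismatches -> 98% identical to base
far = "TTTTT" + base[5:]      # 4 mismatches -> 96% identical to base
otus = greedy_otus([base, near, far])
```

In this toy case `base` and `near` end up in one OTU and `far` starts a second one, since 96% falls just under the cutoff.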

So, they sorted out the fractions of each kind of sequence in each body habitat, per individual, across the time points. This data is summarized graphically in figure S4:

(This is not readable on the web page - look at Figure S4 in the supplementary data PDF, which you can scale as you see fit.) Just by way of example, let's look at the mouth and gut:

In the graphs at the top, each bar represents the fraction of sequences from any particular phylogenetic group from a specific person. The lower graphs are by date. Notice that these two body sites are completely different in composition: a mixture of Streptococcus, Veillonella, Prevotella, Pasteurellaceae, and Neisseria (and so on) in the mouth, but predominantly Bacteroides in the gut. Notice also the big differences in the fraction of organisms in different people (upper graphs); e.g. Male #4 has a mouth full of Veillonella and Prevotella, whereas Male #3 has a mouth full of Neisseria (which I'm sure he can explain). In contrast, the populations are much less variable over time (lower graphs).

So human microbial populations are primarily defined by body site. This can also be visualized in the PC plots:

As you can see (Panels B-D), sorting by person, time or even sex (although there is some discrimination here) isn't helpful, but if you sort by body location, the mouth, gut, and EAC (external auditory canal) form nice, distinct clusters. In panels E and F, you can see that there is much more variation between body sites than there is between people or over time. Panel G shows this in a tree by location, and there are a number of graphs, trees and plots in the supplementary data going through this in detail.

The authors also spend some time showing that some sites are more diverse than others:

In panel A, you can see that some parts of the skin (generally arms, hands, and legs) had the most diverse populations. The gut and mouth were also very diverse. In contrast, other parts of the skin were not very diverse, nor were the EAC or labia minora. (A more detailed version of this graph can be found as figures S14-S16.) As you can see in panel B, the tongue was very consistent, whereas the forehead was more variable and the forearm was highly variable.

How diversity changes over time is shown in panel C. These are graphed as rarefaction curves - a statistical representation in which you plot from your data the average amount of diversity (remember: summed tree branch length) with increasing number of sequences added. For example, in the first graph, for "F1 palm", with 500 randomly chosen sequences from the data, the average sum length of the tree of these sequences is about 17. If you double the number of sequences used to 1000, the total tree length increases to 26. With high enough sequence numbers, these should flatten out to horizontal - this would mean that you've collected enough sequences to get everything. Notice, however, that the curves don't approach an asymptote - they flatten out but not toward a slope of zero. This means that you should expect to continue to get new sequences no matter how much you increase your data size. In other words, microbial diversity is a bottomless pit!
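The subsampling behind a rarefaction curve can be sketched as below. To keep the example short, diversity here is just the number of distinct OTU labels in the subsample, swapped in for the summed tree branch length used in the paper; the community itself is invented (50 OTUs with 20 sequences each).

```python
import random


def count_distinct(subsample):
    """Stand-in diversity measure: number of distinct OTU labels."""
    return len(set(subsample))


def rarefaction_curve(labels, depths, trials=100, diversity=count_distinct, seed=0):
    """Average diversity of random subsamples at each sequencing depth."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    curve = []
    for depth in depths:
        # Draw `depth` sequences without replacement, `trials` times over.
        values = [diversity(rng.sample(labels, depth)) for _ in range(trials)]
        curve.append((depth, sum(values) / trials))
    return curve


# Hypothetical sample: 1000 sequences spread evenly over 50 OTUs.
labels = ["otu%d" % (i % 50) for i in range(1000)]
curve = rarefaction_curve(labels, depths=[10, 100, 1000])
```

This toy community is small and even, so its curve does level off once all 50 OTUs have been seen. The curves in the paper behave differently because real communities carry a long tail of rare organisms that keep turning up as more sequences are added.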

One interesting finding was that gut communities varied a lot between individuals, but were relatively constant over time. This is a huge community - there are certainly more microbes in the gut than everywhere else on the body put together. The authors believe that these stable differences in microbial gut communities might have a big impact on health, including nutrition, immunity, and especially obesity.

On the other hand, the mouth generally harbored a very diverse microbial population, but was much more consistent from person to person or from time to time than were other sites of the body. This means that disease related to problematic oral microbiology might be easier to identify than at other sites of the body; "normal" is more readily defined.

Skin populations also vary with site, and could be clustered into 3 types: the head (predominated by Propionibacteria), the arms (also lots of Propionibacteria, but less predominantly so), and the trunk and legs (Staphylococci and Corynebacteria).

A second interesting implication of the observation that there is so much variation in human microbial populations, and that this variation is stable, is that your microbiome is unique to you, and you leave traces of this everywhere you go - doorknobs, cell phones, keyboards, &c. In a recent paper, these same authors showed that they could match keyboards and mice to their users just from the microbial populations left behind - and they could do so weeks after their last use!

In the supplementary material, they demonstrate that they could readily distinguish individuals from their composite microbiomes:

Note that in all cases, days 1 and 2 are related, as are days 3 and 4; remember that 2 is one day after 1, and 4 is one day after 3, but these pairs are 3 months apart.

The authors then go on to ask whether these differences in skin populations are the result of differences in environmental conditions, or just historical contingency (who got there first). Much of the rest of the paper, which we will not review, deals with "transplants", in which areas of skin are disinfected and populated artificially with the populations from other sites or other people.