The Size-Luminosity Relationship in Extra-Galactic Astronomy — a paper by Alex Drozd

Here we offer you an article on the curious case of the density range shared by all modern-day elliptical galaxies.

Some physical process we don’t understand is driving all elliptical galaxies to evolve towards a certain common density. Is this an attribute of galaxy structures themselves, or a consequence of an undiscovered physical law? The size-luminosity relationship is a mystery – evidence of a major trend that astronomers never would have expected.

Alex Drozd explores the phenomenon.




When I started undergraduate research at the University of Alabama, I met Dr. Nair, an assistant professor in the Department of Physics & Astronomy. New to the University, she was looking to build a team of undergraduates to help streamline the basic chores involved in her research. Dr. Nair is an extra-galactic astronomer; she studies galaxy evolution and the structural differences between galaxies in the local universe and galaxies at high redshifts. Redshift works like this: the light of objects moving away from us is in effect stretched to longer wavelengths and lower frequencies, and the farther away the object, the greater the shift. When I approached her about joining the team, she directed me to the whiteboard in her office, brimming with numbers, astrophysics equations, and hand-drawn graphs of galaxy brightness profiles.

“I’m looking for more undergraduates,” she said. “I have more data to analyze than I have time for. Are you interested in studying galaxies?”

“Sure,” I replied. “I live in one after all.”

The local universe extends from the Milky Way to a redshift of z<0.1, about one billion light years away. For nearby galaxies, z is approximately the ratio of the galaxy’s recession speed to the speed of light, from which a distance can be calculated. For Dr. Nair’s research, galaxies anywhere from z=0 up to z≈3 (zero to five billion light years away) were studied. More distant galaxies have been discovered, but too few to constitute a statistically significant sample. She focuses on elliptical galaxies within this distance range because they are numerous and visible enough to be analyzed and compared with their neighbors. She directed me to download a collection of galaxy cluster images within the z=0 to z≈3 range.
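To make the link between redshift and distance concrete, here is a minimal Python sketch of the low-redshift approximation described above. The Hubble constant of 70 km/s per megaparsec and the function names are assumptions for illustration, not values from Dr. Nair’s work, and the formula only holds for z much smaller than 1; at higher redshifts a full cosmological model is needed.

```python
# Low-redshift approximations only. H0 is an assumed value, not taken from the article.
C_KM_S = 299_792.458       # speed of light in km/s
H0_KM_S_MPC = 70.0         # assumed Hubble constant, km/s per megaparsec
LY_PER_MPC = 3.2616e6      # light years in one megaparsec


def redshift(observed_wavelength, emitted_wavelength):
    """Fractional stretch of the light: z = (observed - emitted) / emitted."""
    return (observed_wavelength - emitted_wavelength) / emitted_wavelength


def approx_distance_ly(z):
    """Hubble-law distance d = c * z / H0, reasonable only for z much less than 1."""
    velocity_km_s = C_KM_S * z                  # recession speed, v ~ c * z
    distance_mpc = velocity_km_s / H0_KM_S_MPC
    return distance_mpc * LY_PER_MPC


# Illustrative case: a hydrogen line emitted at 656.3 nm and observed at 721.93 nm.
z = redshift(721.93, 656.3)
print(f"z = {z:.3f}")
print(f"distance = {approx_distance_ly(z) / 1e9:.2f} billion light years")
```

With these assumed numbers, a redshift of 0.1 comes out on the order of a billion light years, in line with the edge of the local universe quoted above.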

Figure 1: Middle-aged elliptical galaxies between z=0 and z≈3. This image includes the distant galaxy cluster Abell 370, one of the very first galaxy clusters in which astronomers observed gravitational lensing, where gravity warps spacetime and distorts the light we receive from galaxies lying beyond the gravitational lens. The arcs and streaks are the gravitationally distorted images of more distant galaxies. Source: Image by NASA, ESA, HST Frontier Fields


In the following months, to prepare for my image and brightness analyses of elliptical galaxies, I read academic papers on extra-galactic astronomy and filled my hard drive with high-resolution images of galaxy clusters, leaving little room for anything else. It turns out that most scientific data, at least in astronomy, is available online. If you have the drive space, you can download Hubble Space Telescope images and process them yourself with free software. There are also top-of-the-line, high-priced astronomical image processing programs, like MIRA and IRIS. However, even free programs like DS9, named after Star Trek’s Deep Space Nine, are used by both students and professional astronomers for image processing and scientific analysis. I downloaded it and loaded up all the Hubble Space Telescope images I’d been storing on my computer.

Elliptical galaxies look like giant glowing spheres of stars (Figures 2 and 3). Our own Milky Way is a spiral galaxy 100,000 light years in diameter, with long, star-filled arms curving out from the galactic core. The stars of a spiral galaxy orbit about the center in a flat galactic plane. The Milky Way won’t always be a spiral. In about four billion years, when we collide with our closest galactic neighbor, the Andromeda Galaxy, gravitational effects will cause the two galaxies to restructure. They might become a single elliptical galaxy, in which the stars chaotically orbit about the galactic center without a plane or pattern.


Figure 2: The classification of galaxy shapes, based on Edwin Hubble. (Source: Image by NASA & ESA.)
“Interactive Hubble Tuning Fork,” released 19/11/2012 10:00 am; © C. North, M. Galametz & the Kingfish Team


For those of you who might be depressed by such a future for our beloved Milky Way, fear not. When galaxies merge, their stars and planets rarely collide, because there is so much empty space between them.1 The average distance between stars is about 30 trillion miles.

So, humankind is unlikely to be extinguished by a galactic collision, or by the death of our Sun in five billion years. But in two billion years the Sun’s energy output will have increased to the point where temperatures on Earth will be too high for liquid water.1 Unless, of course, we’ve developed technology to move our planet to a safer range.

Galactic merging events are exactly what extra-galactic astronomers like Dr. Nair study to understand the evolution of galaxies. Looking at elliptical galaxies is the best way to do this, considering they are the products of mergers. Since the observable universe is billions of light years across, with its edge always growing as the universe continues to expand, the light we receive from the most distant ellipticals is already billions of years old, meaning we’re seeing them as they were billions of years ago. Looking at progressively closer ellipticals, we can study the entire history of their evolution, from the time they were first formed all the way up to the present.


Figure 3: The image compares an average present-day spiral galaxy (left) with its counterpart in the primordial past (right), when galaxies likely had more hot, bright stars.
Image credit: NASA/JPL-Caltech/STScI

Extra-galactic astronomers studying the evolution of elliptical galaxies have found a curious anomaly, referred to as the size-luminosity relationship in early-type galaxies. ‘Early-type galaxy’ is the name originally used for elliptical galaxies under the galaxy classification scheme created by Edwin Hubble, the early-20th-century astronomer who discovered that the universe contains galaxies besides our own. The size-luminosity relationship is one of the most fascinating topics in the field of extra-galactic astronomy, and relates to the physical concept of density.

Density here refers to the amount of mass within a given volume of space. A cup of water is denser than a cup of air, and a cup of iron is denser than a cup of water. Though mass and weight aren’t the same thing, they’re directly related: of two equal volumes, the one that weighs more is the denser.

Galaxies have size and mass just like all other matter in the universe. Galaxies are collections of stars and gas clouds bound together by gravity. The ones with more stars are more massive than the ones with fewer, but a small galaxy with a hundred billion stars is denser than a larger galaxy with the same number of stars.

The most distant elliptical galaxies, and therefore the ones we see at the earliest times, are extremely compact. In this context, luminosity directly relates to mass because more mass in a galaxy means more stars, and more stars means more brightness.3

When extra-galactic astronomers like Dr. Nair plot the size-luminosity relationship of elliptical galaxies, they observe something quite unexpected: Billions of years ago these galaxies were quite dense, but the degree of density varied greatly. Yet local, present-day ellipticals vary little in density. This means that over time, regardless of how compact an elliptical galaxy was to begin with, it expands, or ‘puffs out,’ to the density range common to all elliptical galaxies today.4 Some physical process we don’t understand is driving all elliptical galaxies to evolve towards a certain common density. Is this an attribute of galaxy structures themselves, or a consequence of an undiscovered physical law? The size-luminosity relationship is a mystery: evidence of a major trend that astronomers never would have expected.

How do elliptical galaxies “know” when to stop growing? Is there a physical process that keeps track of and adjusts their density? How do the ultra-compact ones “know” to grow by a lot, and the less compact ones by only a little? What’s causing them to puff out?

Extra-galactic astronomers have some hypotheses about what mechanisms may be contributing to this phenomenon, the most likely one being mergers. Elliptical galaxies don’t stop merging after just one collision. In fact, they’re expected to be even more likely to collide again after a merger because they now have more mass and attract other objects nearby more intensely.

You might ask: if a merger adds both mass and size to the galaxy, why doesn’t the density stay the same? The mass of a galaxy increases, but proportionally its size increases much more.5 Galaxies spin, and when mass is added to them it disturbs the angular momentum of the system, causing the orbiting bodies to spread out. So most likely, small galaxies merge with these ellipticals in events we call minor mergers and cause the size evolution over time, the ‘puffing out.’ Each such merger adds much more size than mass, meaning the density goes down overall. Minor mergers are also more frequent than mergers between galaxies of similar size.
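The arithmetic behind that argument can be sketched in a few lines of Python. The function name and the merger numbers below are hypothetical, chosen only to illustrate the scaling: mean density is mass divided by volume, and volume grows with the cube of the radius, so even a modest increase in size outweighs a similar increase in mass.

```python
def density_change(mass_factor, size_factor):
    """Ratio of new to old mean density when mass and radius are both scaled.

    Mean density is mass / volume, and volume scales with radius cubed,
    so density scales as mass_factor / size_factor**3.
    """
    return mass_factor / size_factor ** 3


# Hypothetical minor merger: 10% more mass, 20% larger radius.
ratio = density_change(mass_factor=1.10, size_factor=1.20)
print(f"density after merger = {ratio:.2f} x density before")
```

In this made-up case the galaxy ends up with a bit less than two thirds of its original density, even though it gained mass.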

Merging events come in two forms: dry mergers and wet mergers. The former are merging events between galaxies lacking appreciable interstellar medium (the gas clouds in the space between the stars), so not much interaction happens between the different pockets of gas inside the two galaxies. Because of this, dry mergers don’t have much of an effect on the overall behavior of the resulting galaxy, whereas wet mergers, which involve galaxies with appreciable interstellar medium, induce star formation due to gravitational instabilities in the merging gas clouds. These processes can change the distribution of the angular momentum in a galaxy as the new stars drift into their orbital positions about the galactic center. It is currently thought that the Milky Way-Andromeda collision will be a dry merging event, as not much gas will be available during the collision with which to trigger star formation.6

Other hypotheses have been proposed for the existence of the size-luminosity relationship. They range from the spreading out and dissipation of the gas clouds in elliptical galaxies — not related to merging events — to the astronomer’s favorite go-to when something about a galaxy’s mass doesn’t add up: dark matter,3 which makes up most of a galaxy’s mass but can only be indirectly detected. It’s possible that the restructuring of a galaxy’s dark matter during merging events could be causing the size evolution — assuming it interacts with itself at all.

Another hypothesis, gas dissipation, notes that not all distant galaxies are compact; a small minority are not very dense at all. Yet even these anomalous galaxies eventually change; otherwise, we’d still see such outliers in the local universe today. Over time, they lose gas, and this affects the galactic structure: they shrink and therefore gain density, reaching the same density that most compact early galaxies evolve to. So, whether elliptical galaxies begin with high or low density, they evolve over time to reach the same density as every other elliptical.7

But the merger-driven model of early-type galaxy evolution is the hypothesis that’s gained the most traction. It best fits the computer simulation models that theoretical astrophysicists have, even though it possesses numerous inconsistencies, and researchers are continually finding problems with it. Until a better hypothesis comes along, it’s the one most extra-galactic astronomers are sticking with.

Except for Dr. Nair and her colleagues.

She wrote a paper demonstrating that mergers shouldn’t be able to account for just how small the scatter on the size-luminosity plot is.5 The merger model predicts a narrow range of densities that nearby early-type galaxies can fall into, but Dr. Nair and her colleagues showed, using more recently collected data and different methods of analyzing brightness, that the range of densities is much smaller than predicted by merger-driven models (Figure 4).

Figure 4: The top row shows Dr. Nair’s data, where the range of densities is smaller. The bottom row shows data collected and measured by an older method of measuring a galaxy’s brightness. (Source: Nair, et al. 2011)
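For readers curious what “scatter on the size-luminosity plot” means in practice, the following Python sketch fits a straight line to size versus luminosity on logarithmic axes and measures how far the galaxies sit from it. The data are made up for illustration, and the function is a generic least-squares fit, not the method used in Nair et al. 2011.

```python
import math


def fit_power_law(luminosities, radii):
    """Least-squares fit of log10(R) = intercept + slope * log10(L).

    Returns (slope, intercept, scatter), where scatter is the RMS residual
    in log10(R): how far, on average, galaxies sit from the fitted line.
    """
    xs = [math.log10(lum) for lum in luminosities]
    ys = [math.log10(rad) for rad in radii]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    scatter = math.sqrt(sum(r * r for r in residuals) / n)
    return slope, intercept, scatter


# Made-up sample: luminosities in solar units, radii in kiloparsecs.
luminosities = [1e9, 1e10, 1e11, 1e12]
tight_radii = [1.0, 2.0, 4.0, 8.0]   # every galaxy exactly on one line
loose_radii = [1.5, 1.4, 5.5, 6.0]   # similar trend, more scatter

for label, radii in [("tight", tight_radii), ("loose", loose_radii)]:
    slope, intercept, scatter = fit_power_law(luminosities, radii)
    print(f"{label}: slope = {slope:.2f}, scatter = {scatter:.3f} dex")
```

A small scatter means galaxies of a given luminosity all have nearly the same size, and therefore nearly the same density, which is the tightness the merger models struggle to reproduce.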


And there’s much more to the picture. Before the anomalous size-luminosity relationship in early-type galaxies was discovered, astrophysical computer models predicted that environmental factors were a key influence in the evolution of ellipticals. Early-type galaxies in low density environments, locations without many nearby galaxies or bodies, also called isolated environments, were thought to grow less than those in high density environments like galaxy clusters, where many neighboring galaxies and bodies are in closer proximity. High density environments have a higher number of collisions and merging events and early hypotheses suggested that environment would play a huge role in galaxy evolution.8

Yet, contrary to what our current astrophysical models and simulations of galaxy evolution are still predicting, it’s been observed that environment plays no role in early-type evolution.5 Regardless of whether an elliptical is in a galaxy cluster or an isolated environment, it undergoes an evolution with the same end-point, becoming about as dense as every other early-type galaxy in the nearby universe. If mergers are the explanation, how is this possible? How is it that galaxies in clusters evolve exactly as do isolated galaxies? The former undergo extensive collisions and merging events, while the latter might only experience a few merging events over their entire history. The environmental independence of galaxy evolution may be the most perplexing characteristic of the size-luminosity relationship. If the mechanism by which these processes occur were to be discovered, it could provide valuable insight into how galaxies evolve, how matter was distributed in the early universe, and what galaxies might look like in the distant future.

Even if minor mergers prove to be the mechanism by which ellipticals evolve, this doesn’t answer the question as to why compact galaxies in the early universe grew at unique rates to the same final density we observe today. Why, when we look around at the nearby universe, have elliptical galaxies stopped growing? We know of no law that states an elliptical galaxy must fall into this specific and narrow range of densities. But the empirical fact remains that ellipticals evolve to have about the same ratio of mass to volume, i.e., the same slope on the size-luminosity graph. Could this lead to the discovery of a new law of physics? Perhaps one that could describe the behavior of dark matter?

And remember those images Dr. Nair had me download? The ones of galaxy clusters in between the local and the distant universe, between z of 0 and 3? They show what you would expect: mid-distance, middle-aged elliptical galaxies aren’t as compact as the most distant ones; they’re larger in size, suggesting they’re getting closer to the density exhibited by early-types in the local universe. We can see them in mid-approach (Figure 1). It can be frustrating to look at. What’s causing it? I wished the answer could leap out of the graph at me. But the size-luminosity relationship has remained a mystery even to veteran extra-galactic astronomers who’ve been working on it for years.

The James Webb Space Telescope is set to launch in October of 2018. With its new capabilities, for example its infrared imaging camera,9 which can see through obscuring gas and dust (see Figure 5), astronomers will be able to make extra-galactic observations at even higher redshifts. Astronomers will be able to gather data about elliptical galaxies even further away, further back in time, and perhaps get closer to solving the size-luminosity mystery, gaining insights into how the universe we live in evolves.

Figure 5: Visible light and infrared views of the Monkey Head Nebula. Using infrared we can see through more dust and gas than with visible light. Credit: NASA and ESA. Acknowledgment: the Hubble Heritage Team (STScI/AURA), and J. Hester.



Works Cited:

1 van der Marel, R; Besla, G; Cox, T.J.; Sohn, S; Anderson, J. The Astrophysical Journal. The M31 Velocity Vector. III. Future Milky Way-M31-M33 Orbital Evolution, Merging, and Fate of the Sun, 2012; Vol. 753, 1

2 Cain, F. Universe Today. You Could Fit All the Planets Between the Earth and the Moon, 2015

3 Nipoti, C; Treu, T; Leauthaud, A; Bundy, K; Newman, B; Auger, M. Monthly Notices of the Royal Astronomical Society. Size and velocity-dispersion evolution of early-type galaxies in a Λ cold dark matter universe, 2012; Vol. 422, 2, pg. 1714-1731

4 Shankar, F; Marulli, F; Bernardi, M; Boylan-Kolchin, M; Dai, X; Khochfar, S. Monthly Notices of the Royal Astronomical Society. Further constraining galaxy evolution models through the size function of SDSS early-type galaxies, 2010; Vol. 405, 2, pg. 948-960

5 Nair, P; van den Bergh, S; Abraham, R. The Astrophysical Journal Letters. A Fundamental Line for Elliptical Galaxies, 2011; Vol. 734, 2, 10.1088/2041-8205/734/2/L31

6 Cox, T.J.; Loeb, A. Monthly Notices of the Royal Astronomical Society. The Collision between The Milky Way and Andromeda, 2008; Vol. 386, 1, pg. 461-474

7 Mancini, C; Daddi, E; Renzini, A; Salmi, F; McCracken, H.J.; Cimatti, A; Onodera, M; Salvato, M; Koekemoer, A.M.; Aussel, H; Le Floc’h, E; Willott, C; Capak, P. Monthly Notices of the Royal Astronomical Society. High-redshift elliptical galaxies: are they (all) really compact?, 2010; Vol. 401, 10.1111/j.1365-2966.2009.15728.x, pg. 933-940

8 Shankar, F; Marulli, F; Bernardi, M; Mei, S; Meert, A; Vikram, V. Monthly Notices of the Royal Astronomical Society. Size Evolution of Spheroids in a Hierarchical Universe, 2013; Vol. 428, 1, pg. 109-128

9 Gardner, J. Space Science Reviews. The James Webb Space Telescope, 2006; Vol. 123, 4, 10.1007/s11214-006-8315-7, pg. 485-606





“The Size-Luminosity Relationship in Extra-Galactic Astronomy”  © Alex Drozd

Alex Drozd is a graduate of the University of Alabama. He studied astrophysics and is now working as a programmer. He is also a science fiction writer, and has previously been published by Daily Science Fiction.


The 2017 Nobel Prizes in Science


The winners of the 2017 Nobel Prizes were announced this week, much to the delight of scientists, readers, and enthusiasts around the world. I’ll briefly discuss the science-related awards for this year. The Nobel Prize in Economic Sciences has yet to be awarded.

The Nobel Prize in Chemistry was awarded to Joachim Frank, Richard Henderson, and Jacques Dubochet for “developing cryo-electron microscopy for the high-resolution structure determination of biomolecules in solution.” Their pioneering work has allowed researchers and clinicians to visualize the structure of drugs, compounds, and proteins at some of the highest resolutions ever achieved. By understanding how these molecules look and behave in solution, better applications can be developed for their use in health and technology.

The Nobel Prize in Physiology or Medicine was awarded to Jeffrey Hall, Michael Rosbash, and Michael Young for “their discoveries of molecular mechanisms controlling the circadian rhythm.” The circadian clock is the regulatory system that keeps time for the human body and its tissues. The centers in the brain that control the circadian clock regulate human and animal sleep cycles and are themselves influenced by light and hormones.

The Nobel Prize in Physics was awarded to Rainer Weiss, Kip Thorne, and Barry Barish for “decisive contributions to the LIGO detector and the observation of gravitational waves.” Gravitational waves are ripples in the curvature of spacetime produced when massive objects collide or accelerate. The waves propagate out from each disturbance like the ripples in a pond after a stone has been thrown. Albert Einstein and other physicists famously predicted their existence but did not have the technology to detect them. Until now, that is!

It is always exciting to see who gets these prestigious awards. Congrats to all the winners!

Meet the Scientist Q & A: Benjamin C. Kinney, Ph.D.


From time to time I’ll be conducting interviews and/or Q & A’s with scientists from around the world in the blog’s new Meet the Scientist series. We’ll discuss current research, the state of science in general, and anything else of interest that might pop up. First up: Dr. Benjamin Kinney!

Dr. Benjamin C. Kinney has a Ph.D. in Neuroscience and is a neuroscientist at Washington University in St. Louis. He is also the Assistant Editor for the science fiction podcast Escape Pod. He writes science fiction and fantasy, and his short stories have been published in Strange Horizons, Flash Fiction Online, Cosmic Roots & Eldritch Shores, and more. You can find more of his writing at or follow him on Twitter @BenCKinney, where he explains neuroscience concepts on his weekly #NeuroThursday feature.

Doug: Thanks for your time, Dr. Kinney! So, what got you into science?

Ben: I started like so many scientists did: with science fiction. This goes back into the mists of childhood memories for me. For as long as I can remember, I’ve always been driven by that sense of wonder and discovery. I ended up in neuroscience because it’s the biggest mystery of all: both impossibly vast, and impossibly personal.

Doug: What does your research focus on and what have you found?

Ben: I study how the brain and body change after injury to the hand and arm. In the past, I’ve worked with amputees and hand transplant patients (and cyborg monkeys), but now I work with people who’ve suffered nerve injuries. I’m particularly interested in handedness: how can it change and what can we do for patients whose dominant hands get injured. I’m just starting up a big study to compare laboratory measurements of hand function with how patients use their hands in their daily life. Hopefully we’ll be able to figure out which of those lab measurements have a real impact on patients’ quality of life. Once we do, we’ll be able to use therapy, training, and neuro-stimulation to improve the kinds of movement that matter most.

Doug: Are there any misconceptions in your field of work or in neuroscience at large?

Ben: Neuroscience is very complex, which means it can get oversimplified. I could talk all day about public misconceptions of neuroscience, but here’s one that ties directly into my own work: the idea that a person can be “left-brained” or “right-brained” is complete bunk. The two sides of the brain are specialized for different skills, but you’re not “good at left-side things” or “good at right-side things.” You’re good at some things and not others, and it makes no difference whether those things are on the same side or opposite sides.

Doug: What’s one big question in the field that you’d like to see answered in your lifetime?

Ben: Every human being’s brain is different. Different folds and valleys, different networks of cells. What I want to know is: How much does that matter? There are so many problems, both scientific and medical, that we can’t address right now because of how much the brain varies from person to person. If we could predict or interpret that variation – for example, if we knew that certain neuroanatomical patterns affected an individual’s response to a psychiatric drug – we could understand and accomplish so much more.

Doug: How might we get more of the public to engage in science discussions?

Ben: I think the trick will be to get people thinking about science and scientists as part of everyday life, not just something that strange weirdos do in a mysterious basement laboratory. When I go to parties and people say, “I’ve never met a neuroscientist!”, I say, “Why not? Neuroscientists are everywhere. I’m surrounded by them every day!”

We’re living in different worlds – an inevitable part of how we as Americans so often structure our lives around work. But I think we need to pierce some holes in that to make science feel less like a mystery cult and more like something anyone can access.

Doug: You were recently brought on board as an editor at Escape Pod, congratulations! Any advice on how to approach incorporating hard science into science fiction or genre writing?

Ben: Remember that the science is there to support and inspire the story, not the other way around. If you want to write about a new piece of science or technology, I recommend focusing less on what it does, and more on what it means to people and their lives.

Doug: Do you incorporate your research interests into your writing?

Ben: Usually indirectly. I write across a broad spectrum of science fiction and fantasy, and a fair amount of it draws indirectly from my neuroscience training. I have strong opinions about human decision-making, artificial intelligence, and alien minds – whether scientific or fantastic! But every now and then I do produce a story that draws directly from my work. The most neuroscientific thing I’ve published is a silly little story called “Cyborg Shark Battle (Season 4, O’ahu Frenzy)” in the Cat’s Breakfast Anthology from Third Flatiron Press. In graduate school I used brain-machine interfaces to study how the monkey brain controls movement and Cyborg Shark Battle applies that technology for entertainment and profit in the realm of reality TV.

Doug: Running a lab can require a lot of funding and I imagine you spend a lot of time grant writing. I know that writing scientific grants and writing fiction can be different processes. As a writer of science fiction, how do you balance the two?

Ben: Sadly, the answer is “triage.” There are only so many hours in the day, so I try to use them for productive things. Thankfully “reading” is productive to a writing career, so I have ways to relax, but I’ve probably watched only 3-4 television shows in the last five years. I also sent my wife to Mars for a year, that gave me a ton of extra writing time! I recommend it for everyone.

Doug:  Can you recommend any good books about neuroscience?

Ben: Fiction or nonfiction? I’ll go with fiction, because unless you count peer-reviewed research publications, I don’t read non-fiction in my own field – I’m the wrong audience. My favorite neuroscience-focused science fiction books are Ancillary Justice by Ann Leckie, and Blindsight by Peter Watts. Ancillary Justice’s science is subtle, but the novel is permeated by a deep understanding of neurological disorders and cognitive science. Blindsight is more explicit about its neuroscience and it wraps a fascinating argument into an excellent (and terrifying) story, so I always recommend it even though I wildly disagree with it.

Doug: Thank you so much for taking the time to answer questions about your science and writing!

Returning to the March for Science: Where are we now?


It’s been almost two months since supporters around the world marched in solidarity to increase public awareness of science and speak up for informed decision-making in the government. This feels like a good time to step back and assess the impact of the March and discuss what’s percolating as we move forward.

Overall, the reception for the March was mixed depending on who you asked. The Pew Research Center surveyed 1,012 people about their reaction to the March for Science and a summary of their findings can be read here. Pew reported that it was primarily Democrats and younger generations that supported the March for Science and thought it would help science in the future. Although this is a small sample size, it really is unfortunate that the March appears to have only enjoyed partisan support. The whole point of the movement was to encourage advocacy from individuals of all backgrounds and create a new public discourse about informed policy. In this respect, the March for Science had questionable impact on the collective view of science in the entire community.
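For a rough sense of what a sample of 1,012 people can and cannot tell us, here is a short Python sketch of the textbook margin-of-error formula. It assumes a simple random sample and a 95% confidence level, which is a simplification and not a description of Pew’s actual methodology; the function name is mine.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z_score=1.96):
    """Approximate 95% margin of error for a proportion from a simple random sample."""
    return z_score * math.sqrt(proportion * (1 - proportion) / sample_size)


moe = margin_of_error(1012)
print(f"margin of error = +/- {moe * 100:.1f} percentage points")
```

Under those assumptions the uncertainty for the full sample is around three percentage points, and it grows for any subgroup, such as a single party or age bracket, because each subgroup contains fewer respondents.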

Perhaps telling is this graphic that summarizes the viewpoints by political leaning:

Source: Pew Research Center

Add to the mix that President Trump announced earlier in June that the United States would withdraw from the Paris Climate Agreement, and that he is still committed to reinvesting in coal and fossil fuel energies, and it seems the March was not successful in reaching the ears of the White House. This is very disappointing, especially considering that clean energy jobs are now a larger portion of the US economy than coal jobs, and that global warming is going to have a major impact on our health and our economy in the future.

Thankfully, the message did reach ears on Capitol Hill, right where the March for Science ended in D.C. The White House’s budget proposal for fiscal year 2017 included drastic cuts to the National Institutes of Health (NIH) and many other science funding agencies. But the deal struck by Congress to fund the government through the end of September ultimately saw an increase in funding (from 2016 levels) for the NIH and other organizations, including the United States Department of Agriculture (USDA), the National Oceanic and Atmospheric Administration (NOAA), NASA, the US Geological Survey (USGS), the National Science Foundation (NSF), and parts of the Department of Energy (DOE). Unfortunately, the EPA is still in the crosshairs of the new administration and lost funding compared with last year.

Overall, this is good news, considering that the people directly responsible for negotiating and enacting the federal budget appear to be supportive of a positive role for science in society and within our government. However, the battle will be taken up again later this year with the 2018 budget proposal. Science is again being threatened with devastating cuts to research: 22% at the NIH, 19% at the Department of Health and Human Services (HHS), 15% at the DOE, 13% at the NSF, and horrific cuts to research at the Department of Homeland Security (DHS). NASA would see a slight budget increase, but not for climate-related research.

For me, this is where any short-term gains stemming from the March for Science will be measured. How will Congress respond to cuts in science moving forward in the Trump Administration? It is incredibly unfortunate that there is still a partisan divide when it comes to support for science. We need to work together as a nation to cross those barriers and tear them down. Science impacts us all, and a unified front for science advocacy makes it that much more powerful. Below is a picture of my friend with his sign during the march, which I think highlights this important issue:

It’s also up to the scientists and doctors and researchers and all those in the scientific community to continue engaging and speaking up about these issues. We have to work together on this. The momentum of the March will only continue if there is a sustained level of input and idea-sharing that politicians and the community find accessible. The long-term payoff of this continued discussion will be with the next generation and the development of our world’s future research and STEM community, investment in clean energy, education, and the development of new technologies to better our world. Hopefully, in the budget battles to come, the important gains science has made this year will be highlighted and used to protect funding and inspire others to invest wisely in our future.

What Can Open Access Data Do for You?


Data collection and usage is an essential component of scientific research. It’s arguably the most important. Without data, we can’t make observations about the world and deduce truths about how it works.

I wrote in my previous post that research articles are the primary source of data and dissemination of hypothesis-driven research. But while research articles concisely present some data, there is almost always a larger dataset that can still be of use to the authors of the article, and others.

There is a growing movement within science for open access to all data generated for a research project or collected by government agencies. Open access data means that the raw and processed data should be available, free of charge, to anyone. Access to data is a core tenet of the scientific process, and most scientific projects funded by the United States government, often through grants from the National Institutes of Health (NIH) or National Science Foundation (NSF), will ultimately be published and released in an accessible format to the public.

However, depending on where the manuscript is published, there may be a long embargo on the release of the text or data and only those with subscriptions can access it right away. Open access data means that all the data and text is released the day the manuscript is published. This also allows collaborators, and even research competitors, to further peruse this data and use it again for different questions (if applicable).

Many journals, including those under the Public Library of Science (PLOS) umbrella, have pushed this effort as far as they can, providing access to as much of the data used in a research article as possible. PLOS also created guidelines to determine how open a journal is and, by extension, the articles published within it. PLOS has worked hard to streamline the terminology of open access data using the ‘HowOpenIsIt?’ Open Access Spectrum and its handy tool for evaluating and rating scientific journals.

But what does open access really mean? How does one even access that data? I’ll take you through an example using the PLOS website.

I went to the PLOS website and entered the first key term that came to mind. I started working on this post on a beautiful weekday morning in Baltimore and I heard some birds chirping at each other through my window, so I did a search using the term: bird sounds. I know very little about birds, so I clicked on the first link, which directed me to a research paper entitled: Automated Sound Recognition Provides Insights into Behavioral Ecology of a Tropical Bird.

Now, there are a few things to note about this article and others like it on PLOS. First, you can download the entire article as a PDF. For many journals, even those found in Nature and Science, you may already hit your first obstacle: a paywall. This means you’ll need a subscription to the journal or publisher to gain access, and this can get very costly for institutes and individuals.

Next, this article’s supplemental data is found near the bottom of the page, which is the case for many research articles like it. Here you often find raw datasets, metadata analytics, additional graphs, and/or tables cited in the article but not necessarily featured as essential figures in the main text. You should be able to download each of these files individually. In fact, this article on tropical birds has a set of supplementary files that include the actual recordings of the bird calls used in the analysis. (This one is my favorite. It sounds like a monkey.)

You’ll also notice there is an entire section labeled ‘Data Availability’, which is located just below the article’s abstract. Here you can find all the databases that the raw and processed data were uploaded to during the publication process. These databases, like Gene Expression Omnibus and Zenodo, offer datasets that are free to download and explore on your own after each manuscript is accepted and published online. Forbes created a list of 33 databases that feature open source file sharing and storage, and each has its own unique sets of data that are free to explore.

So, what should we do with this data? Why is open access data important?

In theory, open data should be provided with every manuscript that used public funding to support that research. This isn’t always the case in practice, however. I mentioned the restricted access by paywalls and embargoes, where data is often hidden from public view.

Open data is a check on accountability and reproducibility, and it can counter the pseudoscience that’s often in the news. For instance, climate change deniers like to argue that the Earth isn’t really warming and that global temperatures don’t change. Their arguments are supported by data provided by Berkeley’s Earth Laboratory. While this specific dataset does support the claim that the Earth isn’t warming, it only provides air temperature recordings taken above land masses. Considering seventy percent of Earth is covered by water, this dataset is incomplete. Additional datasets, on the same website no less, provide land and sea temperature data that more accurately depict what is occurring in our climate.

So open access data can go both ways, and the appropriate types of data need to be considered when applying these free documents to your own work or arguments.
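To make the land-versus-ocean point concrete, here is a minimal sketch of an area-weighted average in Python. It is only an illustration: the two input values are made-up placeholders, not numbers from any real temperature dataset, and the function name is hypothetical. The only figure taken from the text above is the rough 70/30 split between water and land.

```python
# Hypothetical illustration: the two "average" values below are made-up
# placeholders, not figures from any real temperature dataset.
LAND_FRACTION = 0.30   # roughly what is left if ~70% of Earth is water
OCEAN_FRACTION = 0.70


def area_weighted_mean(land_value, ocean_value,
                       land_fraction=LAND_FRACTION,
                       ocean_fraction=OCEAN_FRACTION):
    """Combine a land-only and an ocean-only average into one global average."""
    total = land_fraction + ocean_fraction
    return (land_value * land_fraction + ocean_value * ocean_fraction) / total


land_only = 1.0    # placeholder value
ocean_only = 0.5   # placeholder value
global_estimate = area_weighted_mean(land_only, ocean_only)

print(f"land-only: {land_only:.2f}, land + ocean: {global_estimate:.2f}")
```

With these placeholder inputs the combined figure sits much closer to the ocean value than to the land value, which is the whole point: whatever a land-only record shows, it covers less than a third of the surface being described.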

Other sources of open data will even take the liberty of building analysis pipelines for you to use right away. ExAtlas was designed at the National Institute on Aging, NIH, to provide a one-stop shop for gene expression analysis. Taking data analysis to the next level, Swedish statistician Hans Rosling built GapMinder, an intuitive and interactive web-based tool that you can use to visualize raw datasets in specific contexts of health disparities. GapMinder highlights the many disparities in our world, from age and income levels in the developed and developing world all the way to the relationship between country GDP and gender-specific health span and longevity.

GapMinder is an amazing program to become familiar with, and it uses publicly funded datasets cataloged from around the world to generate meaningful results. It’s fairly intuitive to use and provides an additional dimension to analyze demographic outcomes by country and year across a variety of variables, including health, vaccination rates, income, GDP, education, gender, geopolitical region, and many others.

For example, below is a picture of the average life expectancy for every country on Earth as it relates to the total health spending each year (as a % of GDP). I’ve highlighted the United States as an example. In 1995, the US was spending almost 14% of its GDP on healthcare, with a return of about 76 years in life expectancy.

Source: GapMinder

Now, compare that with 2010, which is seen below. You can see that the US spent about 18% of its GDP on healthcare in 2010, with a very small uptick in life expectancy. This points to the rising cost of healthcare in the US. Each of the other circles on the graph represents a specific country, and some of them have made great gains in healthcare for little additional expenditure in the same time frame. The size of each circle is also directly proportional to the population of the country, and the track of yellow dots indicates the year-by-year changes within the U.S.

Source: GapMinder
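The jump in US health spending between the two charts can be put into numbers with a few lines of Python. This is a rough sketch, assuming the rounded shares quoted above ("almost 14%" in 1995 and "about 18%" in 2010) rather than exact values downloaded from GapMinder; the function name is just for illustration.

```python
def change_in_share(start_pct, end_pct):
    """Return the change in percentage points and the relative change in percent."""
    points = end_pct - start_pct
    relative = points / start_pct * 100
    return points, relative


# Rounded shares of GDP read off the two charts (assumed, not exact data).
spend_1995 = 14.0
spend_2010 = 18.0

points, relative = change_in_share(spend_1995, spend_2010)
print(f"{points:.0f} percentage points, a {relative:.0f}% relative increase")
```

Separating percentage points from relative change matters here: a four-point rise sounds modest, but it means the share of GDP going to healthcare grew by more than a quarter over those fifteen years.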

All the raw data on GapMinder is freely available to download and you can track any country of interest over the time frame that data is available. If you want more information on the power of this program, and why it’s free for all to use, check out the two TED Talks on GapMinder: Talk 1 and Talk 2.

The U.S. government also hosts several websites that maintain public databases. is a good place to start to see what is available for a topic of interest, from global warming to waterfowl lead exposure in Alaska and Russia. The Centers for Disease Control and Prevention (CDC) keeps a comprehensive database of health statistics and information, as does the World Health Organization.

The U.S. Environmental Protection Agency (EPA) also has its own open data website. This is not without controversy, as EPA employees are still grappling with how to respond to cultural and administrative changes due to the new Presidency. For a time, the website was even shut down but now it appears to be up and an archive of older data from previous Presidential administrations will be provided. Fearing for the loss of publicly-funded climate data, scientists around the world have banded together to download and archive climate data stored on the EPA and NOAA websites in case data were removed and/or destroyed. There is even a data repository for these datasets called Data Refuge, where open data can be cataloged, deposited, and accessed. Considering many of the scientists who advise the EPA on environmental policy were just sacked this week, this is an important endeavor.

Moving forward, it’s critical that raw and processed data be curated and provided to the public. I hope you can get a sense of how critical this is for informed policy and that this data is readily usable by anyone willing to take a few moments to explore it.

Next time, we’re going to dive into some of the recent space discoveries, including planets, space biology, and the latest NASA initiatives!