Gal Science: On How to Interpret a Study -The Toast


As The Toast searches for its one true Gal Scientist, we will be running a ton of wonderful one-off pieces by female scientists of all shapes and sizes and fields and education levels, which we are sure you will enjoy. They’ll live here, so you can always find them. Most recently: The Little Blind Bookworm in Your Brain.

Interpreting a research study can be daunting, and no matter how involved in the sciences we are, it’s easy to take mental shortcuts to decide whether to accept research findings or not. Research in my field, social science, is especially prone to the “hastily read abstract which tangentially mentions women, draw possibly correct conclusion and then blame penises” treatment mentioned in the call for this series, so let’s tackle one of the most common mental shortcuts that people take when assessing a research study: dismissing a study outright because it was conducted or funded by someone with an agenda.

I mean, it’s true that one should always consider the source of research findings, and it’s true that this strategy works most of the time: Cigarettes don’t cause cancer? Yeah right, Philip Morris! The world is about 6,000 years old? Shut up, Creation Museum!

There are two main problems with this strategy, though. First, the reality is that researchers are people, and all of us, whether we work alongside advocates or in a university, experience some kind of draw to the work we’re doing. It drives me absolutely bananas when I hear a smarmy, dismissive comment about a researcher “with an agenda.” Everybody has an agenda. How else would you pick a research topic? Can you imagine devoting your career to something you find boring in the interest of being fair? (I won’t rehash a previous Gal Scientist on Why We Study, but if you somehow missed Lauren Sherman’s brilliant thoughts, click that right now!)

There are all sorts of motives that can lead to picking a research topic: to test political assertions, work through family trauma, get laid, justify a variety of awful habits, and in the end, it doesn’t have to matter what a researcher’s individual motives may be. Researchers with explicit personal biases can set up a research design that minimizes them, like a double blind study, or having multiple people verify interpretations of responses. The part of my job that is the most fun is hashing out interpretations of survey responses with fellow researchers with a different perspective than mine.

Similarly, researchers who think they don’t care either way how their study turns out might still project their own biases without even realizing it. This can include really basic assumptions, like assuming that respondents interpret your question the same way you do in the first place. My favorite example ever of a wonky survey item is this one question in the General Social Survey, which is this massive, long-running survey that asks about just about anything you can think of and has been used to test social theories in thousands of published articles. There’s this one question, “From time to time, most people discuss important matters with other people. Looking back over the last 6 months, who are the people with whom you discussed matters important to you?” That question has commonly been used as a measure of social connectedness and the number of close relationships that someone has. In particular, lots of researchers have used this measure to publish findings about how people are reporting fewer close ties in their social networks than they did in years past, and often to affirm stereotypes like the idea that women have more close relationships than macho, stoic dudes. I won’t link examples because there are so many that it wouldn’t be fair, but it’s intuitive, right?

But what the hell does “important things” mean to people answering that question? Someone thought to find that out, and you can check out this PDF of a version of the article without tables for the hilarious (to me) findings, but as it turns out, “important things” can mean basically anything. People said that their important conversations were about everything from “caring for one’s aged parents,” to “the new traffic lights installed in town,” to the article’s eponymous example, “cloning headless frogs.” So, even benign assumptions that have nothing to do with a political agenda can steer a study (or multiple studies, or a subfield) in the wrong direction. If a good chunk of people answering that question were thinking of topics that are of import to their town or world, and researchers have been thinking people were saying something about the inner workings of personal relationships, then a single incorrect assumption has driven a not-insignificant chunk of scholarship on human relationships.

Second, it’s a double-edged sword. Say I was a climate scientist at an environmental think tank. If I trashed a colleague’s study asserting that global warming isn’t real only because the research was funded by an oil company without saying anything about their methods, then why shouldn’t the oil company researchers attempt to trash my findings on climate change only because I work for an organization that accepts that human actions have contributed to climate change?

The point of doing science is that there are standards by which we can assess and verify information presented to us. The scientific method means just that: there’s a method that we follow, and we should be able to see if someone else did. No matter what I think of another researcher personally, taking someone’s interpretation at their word (or not) is not what’s best for anybody involved. (Although, goodness knows, evidence is not exactly an antidote to ego. We try, though. We try.)

This point applies to publications as well as people. Specifically, sometimes it’s fine for a study to not be published in a peer-reviewed journal. The reality of academic journals is that they only publish brand-new stuff, usually have word count limits, and can take months to get your work out there (and even then, it’s behind a pretty significant paywall). If the only worthy research were work that could be summarized in less than 10 pages, completely brand new, and not at all time-sensitive, then what’s the point? A think tank or NGO needs accurate data for decision-making, and they may need the same 100 pages of data updated every year. No journal is going to be interested in that, so oftentimes self-publishing is the best option.

So, what are some better steps for assessing whether a research study is blaming penises enough? Lucky for us, that’s an eminently Google-able topic, but here’s an overview of how I do it that might be helpful to you:

1. Consider context. I know that I just ranted against this, but what you really want is moderation. Was it published by a respectable journal? If not, is there a good reason why? Is it addressing a controversial topic? Are potential conflicts of interest disclosed? Are there any immediate red flags based on what you know about who produced it? Keep that stuff in mind, but keep going.

2. Consider methods. Is there transparency about how they collected this information? Do they let you know who was in the sample and how they were selected? Does the method match what they were trying to find out? For example, lots of people think that a large-scale, quantitative study that is representative of the population is the gold standard for social science research, but it’s not. Say you needed to know about the challenges that new immigrants face when they arrive in New York City. Fielding a survey to the whole population of the U.S. (or even just to NYC) with questions about those topics would be expensive, would not include enough brand-new immigrants to tell you anything, and probably wouldn’t ask all the right questions anyway. Rather, that situation would call for interviewing actual recent immigrants, letting them paint the full picture.

3. Consider interpretation. Think about what, factually, was found. Do you think it matches the conclusions that the researchers drew? Did they generalize their assertions to everybody on earth even though they only surveyed a handful of people? This is a big question to ask when the study population is composed of college students. Psychology or Sociology 101 course credit is often contingent on participating in a certain number of studies or experiments by faculty and graduate students, which is a great way to get a large sample, but if you’re talking about a research university, you’re probably talking about 101 classes that are not particularly diverse in terms of age, race, socioeconomic status, or most other demographic descriptors that matter for predicting patterns in human behavior. Finally, did the researchers miss any social or contextual reasons for a finding, like attributing behavior to genetic differences between men and women without discussing socialization? (I’m looking at you, studies about how WOMEN BE SHOPPIN’. A stereotype specific to a certain time and place should never be mistaken for a theoretical foundation for inquiry, and it’s gross to basically make up stories about how much hunting and gathering our cave-dwelling ancestors did or did not do and then publish on it.)

4. Consider limitations. I often skip to this step first, because I’m inclined to judge researchers by how honest and open they are about the weaknesses of their research design. No study is flawless, and I admire honesty about the inevitable flaws. In survey research, the biggest thing to watch for is assertion of a causal relationship when really it’s just that two things are related for other reasons. The classic example that stats instructors bring up is the extremely strong relationship between ice cream sales and deaths from drowning. The rates of each of these rise together at exactly the same time every year without fail, and a hasty researcher may conclude that this implies that ice cream is somehow causing people to drown. Of course, there’s a third variable at play in this scenario: temperature. Hotter temperatures mean more ice cream consumption and more swimming, so it’s really summer weather driving both ice cream sales and drowning, with no additional relationship between the two. One can minimize the possibility that a third, unknown variable is actually what’s led to findings, but humans aren’t genetically identical lab mice experiencing the exact same stimuli, so it’s best to be honest that we can’t entirely eliminate unknowns and researcher biases.
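If you like seeing this in action, the ice-cream-and-drowning situation can be sketched in a few lines of Python. This is a toy simulation with made-up numbers, not real data: temperature secretly drives both variables, so they correlate strongly overall, but the relationship fades once you only look at days with similar temperatures.

```python
import random

random.seed(42)

# Daily temperature (°C) is the hidden third variable.
temps = [random.uniform(0, 35) for _ in range(1000)]

# Ice cream sales and drownings each depend on temperature plus noise,
# but not on each other.
ice_cream = [2 * t + random.gauss(0, 5) for t in temps]
drownings = [0.1 * t + random.gauss(0, 0.5) for t in temps]

def corr(xs, ys):
    """Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Raw correlation between ice cream and drowning looks alarming...
print("overall:", round(corr(ice_cream, drownings), 2))

# ...but holding temperature roughly fixed (only 20-25°C days),
# the relationship mostly evaporates.
band = [(i, d) for t, i, d in zip(temps, ice_cream, drownings) if 20 <= t <= 25]
print("20-25°C only:", round(corr([i for i, _ in band], [d for _, d in band]), 2))
```

The overall correlation comes out strong while the within-band correlation hovers near zero, which is exactly the confounding pattern described above: controlling for the third variable makes the “relationship” disappear.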

Obviously, I’m speaking from my tiny little sliver of the social sciences universe. What did I miss? Which of my advice doesn’t translate to the hard sciences? What do you think I got wrong?

Maddy Boesen works at the nonprofit GLSEN, where she does research on LGBTQ issues in K-12 education and occasionally blogs about it. She does not own a lab coat.
