Friday, April 27, 2012

Facts and Fuss about the p-value


The p-value is reported extensively throughout the medical literature, although its use is highly debated and not always well understood. It appears far less often in science media articles, but statistical significance testing plays an important role in interpreting research results, so this post is dedicated to a (hopefully) simple explanation of what the p-value is, and of its uses and limitations.

So what is the p-value and why do we use it?
The p-value is a statistical measure that assesses whether a particular observed value is statistically significantly different from a reference value indicating no change. That’s a textbook-style definition, but like most statistics it is much easier to set the definition aside and focus on an example of how it is used.

In our example let’s say we are interested in looking at whether there is a link between exposure to asbestos (a fibre commonly used as insulation in old (and not-so-old) buildings) and the development of lung cancer.  In science experiments you start by outlining two scenarios: a ‘null’ hypothesis (remember this simply as the null hypothesis usually means no change) and an ‘alternative’ hypothesis (the alternative hypothesis is the idea or question that you are interested in testing). For our example the null hypothesis is that there is no association between asbestos and lung cancer (no change), and the alternative hypothesis is that there is an association between asbestos and cancer. 

Imagine that in our hypothetical study we found that people exposed to asbestos were two times more likely to develop lung cancer than those people who were not exposed to asbestos. How can we then be sure that this is a real finding (even if other possible contributing factors, or ‘confounders’ such as smoking have been taken into account)? How can we be sure that this result didn’t happen purely by chance? To explain the role of chance, I’ll employ the familiar example of the results of a coin toss. We know that there is a 50:50 chance (or 0.5 probability) that when you toss a coin, the head-side will land face up. In practice however, if you toss a coin ten times, the head-side may land face up, by chance, seven out of ten times, but if you toss a coin 100 times, the number of times the head-side lands face up will come closer to 50 (or half).
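This settling down of chance towards the true 50:50 split can be sketched in a few lines of code. The snippet below is a minimal illustration of my own (the function name and the fixed random seed are my choices, not anything from a real study): it simulates fair coin tosses and shows that the fraction of heads can wander widely over ten tosses but hugs 0.5 over ten thousand.

```python
import random

def heads_fraction(n_tosses, seed=0):
    """Toss a fair coin n_tosses times and return the fraction of heads."""
    rng = random.Random(seed)
    return sum(rng.random() < 0.5 for _ in range(n_tosses)) / n_tosses

few = heads_fraction(10)       # can easily stray well away from 0.5
many = heads_fraction(10_000)  # sits very close to 0.5
```

This is the familiar law of large numbers at work: the more tosses, the closer the observed proportion creeps towards the true probability.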

Assessing whether a finding is ‘real’ or whether it could have occurred by chance is where the p-value comes into play. The p-value is the probability of obtaining a result at least as extreme as this finding given that the null hypothesis is true (i.e. there is no association between asbestos exposure and lung cancer). A small p-value suggests that the result is unlikely to be due to chance alone, which provides some evidence against the null hypothesis. Conversely, a large p-value indicates that chance could well be playing a role and that there may really be no association between asbestos exposure and lung cancer. This may seem black and white: a small p-value equals a real finding; a large p-value equals no real finding. So what is all the fuss about?
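To make the definition concrete, here is a small sketch of my own (not anything from the hypothetical asbestos study) that computes an exact p-value for the coin example above: if a coin gives 7 heads in 10 tosses, how surprising is that under the null hypothesis of a fair coin?

```python
from math import comb

def two_sided_binomial_p(heads, tosses, p=0.5):
    """Exact two-sided p-value: the total probability, under the null
    hypothesis, of every outcome no more likely than the one observed."""
    probs = [comb(tosses, k) * p**k * (1 - p)**(tosses - k)
             for k in range(tosses + 1)]
    observed = probs[heads]
    return sum(q for q in probs if q <= observed + 1e-12)

p_val = two_sided_binomial_p(7, 10)  # about 0.34
```

A p-value of about 0.34 means a result this lopsided, or more so, would turn up in roughly a third of all runs of ten tosses with a perfectly fair coin — so 7 heads in 10 is no evidence at all against the null hypothesis.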

The grey zone
There are three main limitations or criticisms of the p-value:
  • It is common practice in medical research to use a cut-off significance level of 0.05. Again using our example, this means that if our hypothetical study described above had a p-value of 0.049 the finding would be ‘statistically significant’ and we might conclude that exposure to asbestos is associated with lung cancer. However, if this same study had a p-value just 0.002 higher (0.051) it would then fall into the category of being ‘statistically insignificant’ and we might therefore falsely claim that the evidence does not indicate that exposure to asbestos is linked to lung cancer.
  • The p-value depends on sample size. In our example let us assume that exposure to asbestos really is linked to lung cancer (current evidence suggests that this very much is the case); if we did an experiment to test this and used only a small number of people (a small sample size), the p-value may be larger than 0.05. This means that even though the alternative hypothesis is true, due to the small sample size the p-value falls above the cut-off for statistical significance and we may falsely conclude that the null hypothesis is true (that there is no association between exposure to asbestos and the development of lung cancer). If we were then to re-run exactly the same experiment, but this time with a larger number of people, the p-value would be smaller, possibly indicating that the results are now significant.
  • Another criticism of the p-value is again related to the cut-off significance level of 0.05. This cut-off means that when the null hypothesis is true, the test will still come out ‘significant’ about 5% of the time. In other words, in up to 1 in 20 tests the p-value could indicate that a result is statistically significant even when it is not.
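Both of the numerical criticisms above can be checked with a short simulation. The sketch below is my own illustration (the sample sizes, proportions, and the normal-approximation test are assumptions, not figures from any real study): the first function shows the same doubled risk flipping from ‘not significant’ to ‘significant’ purely because the sample grows, and the second shows that when the null hypothesis really is true, the p-value still dips below 0.05 in roughly 1 in 20 experiments.

```python
import random
from math import comb, erfc, sqrt

def two_prop_p(x1, n1, x2, n2):
    """Two-sided p-value for comparing two proportions (normal approximation)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return erfc(abs(p1 - p2) / se / sqrt(2))

# Same doubled risk (10% vs 5% disease rates), different sample sizes:
p_small = two_prop_p(10, 100, 5, 100)      # ~0.18: 'not significant'
p_large = two_prop_p(100, 1000, 50, 1000)  # far below 0.05: 'significant'

def false_positive_rate(n_experiments=2000, tosses=100, alpha=0.05, seed=1):
    """Simulate experiments where the null is true (a fair coin) and count
    how often the p-value still dips below the cut-off."""
    # One-sided exact p-value for each possible head count, computed once.
    pvals = [sum(comb(tosses, k) for k in range(h, tosses + 1)) / 2**tosses
             for h in range(tosses + 1)]
    rng = random.Random(seed)
    hits = sum(pvals[sum(rng.random() < 0.5 for _ in range(tosses))] < alpha
               for _ in range(n_experiments))
    return hits / n_experiments  # close to (at most) 0.05
```

The false-positive simulation is exactly the 1-in-20 point: every simulated ‘study’ here uses a genuinely fair coin, yet about 5% of them would still be declared statistically significant at the 0.05 cut-off.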

Why use it at all?
The p-value, along with other statistical tools (such as confidence intervals – more about those later), can be useful in helping researchers and others interpret research results. Medical research plays an important role in health policy, doctors’ advice and practices, and general public health advice. Before research is translated into policy or practice changes, it is important to understand whether a study result might be due to chance or whether it could be a real finding. It is also important to keep in mind the limitations of the p-value and to assess these in terms of the study design.

Post wrap-up
I drafted this post during attendance at a weekly writing group. When I briefly outlined what the post was about, two of my fellow writing group members immediately pointed my attention to an amusingly but aptly titled article on the p-value and its limitations. For those who are interested in reading more, the article is called ‘The Earth Is Round (p < .05)’ and is by Jacob Cohen (PDF).

Tuesday, April 17, 2012

Introduction


Welcome to Undressing Science.

We're bringing sexy back (Timberlake - please don't sue).  We are about to have science give you a long, fulfilling striptease.  No, it won’t be like the quick and easy flash that most mass media gives you.  This is going to be slow and methodical, and not just skin-deep.  Most importantly, this is going to be an honest and accurate undressing.

Like my co-blogger AFour, I will be using a pseudonym (Fenix) in order to maintain relative anonymity.  However, unlike AFour, I will be revealing my address, credit card details and phone number (ladies…), thus giving you a good reason to keep visiting the site.  But for now you’ll have to settle for a few of my other, more basic details.

I am a PhD candidate at the Australian National University in Canberra, Australia.  My research is focused on the design and reform of international environmental institutions.  However, my interests range far and wide, from political philosophy and international relations through to the realm of nutrition (particularly intermittent fasting).  Perhaps the most interesting and frustrating issue in regard to my research interests is the serious amount of disinformation and misunderstanding that is all too common in public perception.  Just take a look at the media and climate change for a case in point.

Consequently, I am passionate about communicating research to others and, more importantly, showing them how to critically assess research and its (often manipulated) portrayal in the media.   The majority of this will be through research reviews (coming soon will be a review of the connections between climate change and obesity - say what?).  Why put the time and effort into these reviews?  Because in a modern democracy it is every citizen's responsibility to understand the latest developments in policy and research.  Critical thinking is really just learning to be an informed citizen.  And because science really is sexy, once you get to know it…

As you can probably tell already, I am overly reliant upon sexual innuendo - get used to it and stay tuned...

Wednesday, April 4, 2012

Introduction


Welcome to Undressing Science! 

I’m AFour, one of two bloggers who will be contributing posts to Undressing Science. My co-blogger will also be posting an introduction here soon but for now I’d like to tell you a bit about myself and what I intend to blog.

As you will undoubtedly notice, I have decided to stay (mostly) anonymous by using the blogger name AFour. My reasons for this are not to be deceitful but rather to allow myself some freedom to express and communicate my thoughts. This is my first time blogging and I’ve found myself in a kind of paradox: the idea of putting down words and thoughts for everyone to see is both scary and, at the same time, liberating. Don’t let my withholding my name prevent you from reading on; short of providing my address or credit card number, most other relevant details about me will be shared.

I am a student studying at the Australian National University in Canberra, Australia. I have recently commenced a PhD in population health, specifically epidemiology, and I am particularly interested in mental health and diseases.  Aside from my direct research interests, I am also passionate about communication and believe that research results should be available to, and openly discussed by, any interested individuals and communities.

There are many fantastic sites and blogs dedicated to the discussion of all things science and research, aimed at those with a more advanced understanding of science concepts (I recently became addicted to Bad Science – an endlessly thought-provoking, rip-the-blindfold-from-your-eyes blog written by Ben Goldacre). I will try to avoid duplicating these; instead I will be writing posts, focusing on recent research publications and books, that untangle the web of jargon and technical spiel to reveal the crux of what is being told. I hope these will be readable and enjoyable for those interested in, or wanting to learn more about, science and what the latest research is revealing, but who do not necessarily want to spend three years getting a science degree.

While I will be taking a much needed break from my computer over Easter, keep an eye out for my upcoming posts on critically analysing science reports (a layman’s guide) and understanding statistical significance tests.

Oh, and please bear with us as we try to get this blog up and running. It may need a few make-overs before the month is out....