In the eye of the beholder?
Hey guys and girls,
So, as you probably know, the last month or so of Galaxy Zoo has been dedicated to testing whether we have any bias in our classifications (and if you want to know why we are interested in looking at the rotation of galaxies then please have a read here). By ‘bias’ we basically mean some systematic error in the way people classify (you can get a good explanation in Jordan’s post), and this is different from just random scatter in the results. For example, we know that when a galaxy is faint or small people are more likely to think it is an elliptical galaxy – and this particular morphology bias is something that Steven must compensate for in his work.
It has been really exciting to work on the rotation classifications of Galaxy Zoo, and as many of you know early on in the project we realised that people were classifying more galaxies as anti-clockwise (see the Telegraph article for example). Specifically, if we take those galaxies that are well classified (i.e. more than 80% of people agree) then we find we have an anti-clockwise:clockwise ratio of about 52:48. This may not sound particularly significant, but as you increase the number of galaxies in your sample (as more of you lovely people classify for us) the same ratio becomes more significant, and a split this far from 50:50 is highly unlikely to arise by chance for the ~35,000 galaxies that we have. [For those of you who like probability, the number of anti-clockwise galaxies that we expect is distributed according to a Binomial probability distribution. And if we assume that the ratio is really 50:50, then out of a total of N galaxies we expect N/2 to be anti-clockwise, with a standard deviation of sqrt(N/4).]
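For the curious, here is a minimal Python sketch of that back-of-the-envelope calculation. It assumes exactly N = 35,000 galaxies and an anti-clockwise fraction of exactly 0.52, both of which are rounded figures from the text above, and the function name is just something we made up for illustration.

```python
import math

def binomial_excess_sigma(n_total, frac_anti):
    """How many standard deviations the anti-clockwise count lies above
    the 50:50 expectation, for n_total galaxies."""
    expected = n_total / 2            # mean of a Binomial(n_total, 0.5)
    sigma = math.sqrt(n_total / 4)    # its standard deviation, sqrt(N/4)
    observed = frac_anti * n_total    # anti-clockwise count implied by the ratio
    return (observed - expected) / sigma

# Rounded illustrative inputs: ~35,000 galaxies with a 52:48 split.
n_sigma = binomial_excess_sigma(35_000, 0.52)
print(f"{n_sigma:.1f} standard deviations away from 50:50")
```

The same 52:48 split in a sample of only 100 galaxies would be well under one standard deviation from 50:50, which is why the sample size matters so much here.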
In the plot below we show the relative excess of clockwise votes (for users that classified more than about 300 galaxies) – this is the number of clockwise votes minus the number of anti-clockwise votes, divided by the sum of the two. For example, this number would be 1 if a user always clicks clockwise, -1 if they always click anti-clockwise, and zero if they click clockwise and anti-clockwise equally often.
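Written out as a tiny Python function, the quantity on the vertical axis looks something like this. It is only a sketch of the definition above; the function and argument names are ours, chosen for illustration.

```python
def relative_excess(n_clock, n_anti):
    """Relative excess of clockwise votes for one user:
    (clockwise - anti-clockwise) / (clockwise + anti-clockwise)."""
    total = n_clock + n_anti
    if total == 0:
        raise ValueError("user has no clockwise or anti-clockwise votes")
    return (n_clock - n_anti) / total

print(relative_excess(300, 0))    # a user who always clicks clockwise
print(relative_excess(150, 150))  # a user with no preference
print(relative_excess(144, 156))  # a user with a slight anti-clockwise preference
```

A negative value means the user clicked anti-clockwise more often than clockwise.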
This graph confirms that everyone is generally clicking anti-clockwise more often, because we see that the mean tends to lie below the zero line. But this plot cannot distinguish between an intrinsic excess of anti-clockwise galaxies on the sky and human bias, and it is obviously very important for our rotation results that we get a handle on this, as we could not announce our possible anti-clockwise excess result to the scientific community without doing these bias checks. So the basic idea is to look at the votes for a galaxy before and after its image is flipped. For example, if 6 out of 10 people thought it was originally clockwise, then after flipping we expect about 6 out of 10 people to now think it is rotating anti-clockwise (if there is no rotation bias).
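Here is a small Python sketch of that flip test for a single galaxy. It is a simplification: it only counts clockwise and anti-clockwise votes (ignoring every other button), and the function names are hypothetical rather than taken from our actual analysis code.

```python
def clockwise_fraction(n_clock, n_anti):
    """Fraction of the rotation votes that were clockwise."""
    return n_clock / (n_clock + n_anti)

def flip_discrepancy(before, after):
    """Compare votes for the same galaxy before and after mirroring.

    before, after: (n_clock, n_anti) vote counts.
    With no rotation bias, the clockwise fraction after the flip should be
    one minus the clockwise fraction before it. Returns observed minus
    predicted, so a value near zero is consistent with no bias.
    """
    predicted = 1 - clockwise_fraction(*before)
    observed = clockwise_fraction(*after)
    return observed - predicted

# The example from the text: 6 of 10 clockwise before, 6 of 10 anti-clockwise after.
print(flip_discrepancy(before=(6, 4), after=(4, 6)))
# A biased case: the votes do not swap at all after the flip.
print(flip_discrepancy(before=(4, 6), after=(4, 6)))
```

For a single galaxy with ten votes the scatter is far too large to say anything, which is why the test only becomes meaningful when averaged over a very large number of galaxies.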
Since the end of November many of the images in Galaxy Zoo have been flipped for this purpose (and we’ve been monitoring the status here), and we now think that we have enough data to measure the levels of bias. This week Anze has flown over from Berkeley (in California) especially to crunch the numbers with Kate (in Oxford); it is quite a job – with over 7 million classifications to go through! And during our analysis some rather subtle points arose… as with most science, things don’t go exactly to plan!
So we basically wanted to compare the classifications for a galaxy before and after flipping, but we quickly realised that people’s behaviour in the last month or so is very different from that in the earlier datasets (see Anze’s post for an explanation of how we reduce the data). For example, recently people have been more likely to click the ‘Star/Don’t know’ button. This might be because we have lots of new users, recruited through our latest publicity drive. Or maybe lots of old members have come back after receiving the newsletters. Either way it meant we couldn’t simply compare before and after votes. Also, annoyingly, the original unflipped images are no longer on the site and so getting a handle on this behaviour change was a bit tricky (note that one of the first rules of scientific experiments is to have a control test, but a miscommunication amongst team members meant that in this case our control sample got left out!). Fortunately though, we are able to use the monochrome images that are currently on the site as a comparison (as we observe that being in black and white does not change how people choose between anti-clockwise and clockwise).
So we want to know what the average votes per button are, for the average galaxy in Galaxy Zoo. This is where we encountered our second problem – our bias sample does not cover all of the Galaxy Zoo galaxies, but just 10% of them, and this 10% was not selected at random. In particular we know that we have more anti-clockwise galaxies in the bias sample (on the site at the moment). Therefore we needed to carefully undo what we did when we selected this subsample, so as to construct an effectively random subsample of our full database. Then we could look at the average weights.
In the figure we show the average fraction of votes that a galaxy gets for clockwise (class=2) and anti-clockwise (class=3). We show the result for the original classifications in black (before December), for the monochrome images in red, and for the flipped images in green. We also show the 1 standard deviation error bars from sampling.
So what we see is that the class=3 points are always higher than the class=2 points, and crucially this is true even after we flip a galaxy image! Looking at the red points, we find that before flipping there is a 6.0% chance of hitting anti-clock and a 5.5% chance of hitting clock for our sample. Then after flipping (green) there is a 5.9% chance of hitting anti-clock and a 5.6% chance of hitting clock. So the point is that those numbers stay the same (within 1 standard deviation) when they should actually swap over if there is no bias. It is easier to think in terms of the ratio of fractions:
anti/(anti+clock)=0.522 before flipping
anti/(anti+clock)=0.512 after flipping.
And if we had:
a) no bias and no excess then these should both be 0.5.
b) no bias and a real excess then one should be the mirror image of the other about 0.5 (i.e. 0.52 and then 0.48)
c) a bias and no excess then we would expect them to stay the same and not equal 0.5.
But what we actually find is that 0.522 is 5 standard deviations away from 0.5, 0.512 is 3 standard deviations away from 0.5, and 0.522 & 0.512 are within 1 standard deviation of each other. So you see we appear to be convincingly in situation (c).
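If you want to play with these numbers yourself, here is a rough Python sketch. It uses the rounded percentages quoted above (6.0%, 5.5%, 5.9%, 5.6%), so it only approximately reproduces the ratios we quote, which were computed from the unrounded data. The function name is just for illustration.

```python
def anti_ratio(frac_anti, frac_clock):
    """anti / (anti + clock), from the average vote fractions per button."""
    return frac_anti / (frac_anti + frac_clock)

# Rounded vote fractions (in percent) quoted in the text.
before = anti_ratio(6.0, 5.5)   # monochrome, unflipped images
after = anti_ratio(5.9, 5.6)    # flipped images

# With no bias and a real excess on the sky (situation b), flipping every
# image should mirror the ratio about 0.5.
no_bias_prediction = 1 - before

print(f"before flipping:       {before:.3f}")
print(f"after flipping:        {after:.3f}")
print(f"situation (b) predicts {no_bias_prediction:.3f} after flipping")
```

The after-flipping ratio stays above 0.5 instead of dropping to the mirrored value, which is the signature of situation (c).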
So what next? Well – it is fantastic that we have been able to get a handle on the bias even if it did turn out to be affecting our results. Only with Galaxy Zoo, which has so many contributors, were we able to detect the bias (and it may turn out to be an inherent bias in the way people see galaxies, which is an interesting psychology result). Without so many classifications the excess result would have always remained uncertain. And while we no longer think we have an overall excess of anti-clockwise galaxies (which we never expected in the first place!) we can still do a lot of interesting work and pursue our original scientific aims, as explained here and here.
Thanks guys! And keep up the good work. Current classifications remain useful, and we hope to give you some more images next week (possibly returning to the full catalogue!).
Cheers, Kate & Anze