Do galaxies care where they live?
Does where we live make a difference to the kind of person we are? This is a question that has been addressed many times by social scientists, and certainly with more refined thought than the following example, but it will serve our purposes.
Consider one person, Victor, living in a small countryside village, and another, Claire, who lives in the centre of a city. The nearest shops to Victor are many miles away. When he has a sudden biscuit craving and opens the cupboard to find, to his horror, that his wife finished off the last packet the previous evening, it is a great effort for him to travel to the shops to get another. Claire, on the other hand, has merely to stroll to the corner of her road to satisfy her craving for something crunchy. However, while Claire often finds herself nipping out for a packet of biscuits, Victor rarely has the need. He always makes sure he buys plenty of biscuits on his regular weekly shopping trip, and there is always the packet hidden at the back of the other cupboard that his wife hasn’t noticed. Victor is very organised, while Claire clearly isn’t, at least when it comes to biscuits. Does this have anything to do with where they live?
Of course, biscuit buying habits, although important, aren’t the only thing one can say about an individual. Each person is complex and unique, imperfectly describable even by a very large number of personality traits. However, there are simple and obvious ways of crudely dividing up the population. Although we have so far confined ourselves to biscuits, the chances are that Victor is generally more organised than Claire. Perhaps there is a way of dividing people into groups by how organised they are. I’ve no idea, but there are a small number of general personality traits, like introvert and extrovert, that describe how many specific personality traits tend to group together, such that you can give a reasonably good description of someone in just a few words.
By now you are sure to be wondering what the hell this has got to do with galaxies. Well, to date there has been very little research into the biscuit hoarding characteristics of different galaxies, but like people, galaxies are extremely complex objects. There are so many processes simultaneously going on inside them that we just can’t fully describe each one, never mind understand how those processes go towards forming the properties of the individual as a whole. However, one thing about galaxies that you can’t help noticing when you’ve looked at enough of them is how cleanly they can be split into two different types: spirals and ellipticals. Spirals are, at least in some respects, very organised. Most of their stars are travelling in circles around the galaxy centre in an ordered manner. Ellipticals, on the other hand, are in disarray. Their stars move around on many different, random orbits. (It is interesting how the appearance of order, a nice smooth elliptical galaxy, emerges when many unorganised things happen at once, but that is a whole other topic.)

We’ve made the distinction between spirals and ellipticals completely obvious in Galaxy Zoo by only giving you those two options, along with “star/don’t know”. Even so, if we’d just sat each of you at a table with a pile of galaxy pictures to sort, without giving you any instructions about how to do it, most of you would probably have arrived at the same way of dividing them up. Those of you who value simplicity would have formed two or three piles. The pickier ones amongst you would probably be surrounded by lots of neat little stacks, containing galaxies with two sprawling spiral arms, with many tightly wound arms, big blobs, small blobs, red, blue, and so on. Nevertheless, the main distinction, the difference between all the galaxies on your left and those on your right, would probably be whether they possess a disk, often containing spiral arms, or whether they are just a big, smooth elliptical.
Of course, as many of you will have noticed, not all galaxies do fit into a nice category. So, as well as your stacks of spirals and ellipticals, you would be likely to have a collection of weird objects. However, these only form a small fraction of the whole population of galaxies. Whether you choose to hide your pile of odd galaxies away to one side, or display it smack right in front of you, again depends on your character. The projects examining blue ellipticals or Hanny’s Voorwerp belong to the latter class – confronting the occasional odd object to see what secrets it can tell us. The analysis I have been working on has more of the former character: as most objects are elliptical or spiral, let’s ignore the few weird ones and study how the majority behaves. One problem with working with the majority is that it contains very many objects: hundreds of thousands of galaxies. To analyse a dataset this large we have to use statistics. For example, we consider the fraction of objects that are elliptical, and how that fraction changes when we only look at galaxies with certain properties.
If you did the galaxy sorting exercise described above you would be reproducing work performed by many astronomers over the past ninety years, including Hubble, de Vaucouleurs and Sandage. This subject is called morphology, literally the study of the ‘forms’ that galaxies take. Strictly, morphology doesn’t include a description of the colours of galaxies, but rather their shape or appearance in greyscale.
The distinction between spirals and ellipticals was noted even before it was fully accepted that these objects reside outside our own galaxy. It was also noticed, almost immediately, that spirals and ellipticals are distributed differently on the sky. They all tend to cluster together in groups, rather than being evenly or randomly arranged, but ellipticals cluster much more strongly than spirals. Ellipticals live in galaxy cities, alongside many others, whereas spirals prefer the villages and isolation of the cosmos’ countryside.
To use more scientific language, ellipticals are concentrated in high density regions, where many galaxies are located in a small volume of space. Spirals, on the other hand, are usually found in low density environments, where galaxies are separated from others by large distances. As mentioned earlier, the dependence of galaxy morphology on the density of surrounding galaxies was noticed early in the 20th century. However, it wasn’t until the 1980s that it was well quantified in two landmark papers: Dressler (1980), looking specifically at large galaxy clusters, and Postman and Geller (1984), who extended the relationships to lower density environments around clusters and smaller groups. These studies tried to classify galaxies as ellipticals, spirals, or lenticulars. This last type is a galaxy morphology somewhere between a spiral and an elliptical: with a disk, but with no spiral arms. Lenticulars are tricky to classify, and so in Galaxy Zoo so far we haven’t asked the classifiers to try to identify them. Galaxy Zoo “ellipticals” will contain normal ellipticals, and most of the lenticulars. This issue will be discussed more in future posts.
This figure shows the morphology-density relation from Postman and Geller (1984) and Dressler (1980), based on around 9000 galaxies. The lines show the fraction of ellipticals (red), lenticulars (orange), and ellipticals + lenticulars (purple) versus a measurement of local density. The different lines of the same colour just indicate three different sources. You can see that as local density increases, going from left to right in the figure, the fraction of ellipticals and lenticulars increases.
With the latest Galaxy Zoo data provided to me by Anze, I set to work analysing how a galaxy’s morphology depends on the environment it lives in. The initial thing I had to do was carefully measure and correct for any biases in the morphological classifications. This in itself is interesting, although it tells us more about people and the telescope than about galaxies, so I won’t discuss it further here. The next thing to do was to find out about the environments of the galaxies – specifically the local galaxy density. These were kindly provided by Ivan Baldry, an astronomer at Liverpool John Moores University who has done lots of work on the variation of galaxy colours with environment.
When I had my corrected dataset, with measurements of environment added in, the first thing I looked at was the relationship between the fraction of galaxies that are elliptical and local galaxy density.
This figure shows the morphology-density relation for nearby galaxies from Galaxy Zoo, based on 100733 objects. The light shading indicates the very small uncertainties on the relation.
It is difficult to directly compare the Galaxy Zoo morphology-density relation with that by Postman & Geller (1984) shown further above. This is because the local density was measured in a different way, and they include lenticular galaxies separately. However, it is easy to see that the overall behaviour is the same. In regions of high density the fraction of elliptical galaxies increases. The Galaxy Zoo relation is much more accurate, as it is based on more than ten times the number of galaxies, and very clearly defined, which will enable future studies and models to easily compare with it. It shows clearly that morphology depends smoothly on local galaxy density over all environments. Even in the lowest density regions there is some dependence.
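To get a feel for why more galaxies give a tighter relation, here is a minimal Python sketch. It is not the actual Galaxy Zoo analysis code. It assumes the uncertainty on a measured fraction is the simple binomial standard error, and the 30% elliptical fraction is a made-up value chosen only for illustration; the two sample sizes (around 9000 and 100733) are the ones quoted for the figures above.

```python
import math


def binomial_fraction_error(fraction, n_total):
    """Standard error on a fraction measured from n_total objects.

    Assumes each galaxy is an independent draw (simple binomial error);
    a real analysis may well use something more careful.
    """
    return math.sqrt(fraction * (1.0 - fraction) / n_total)


# Hypothetical elliptical fraction, used only to compare the two sample sizes.
ILLUSTRATIVE_FRACTION = 0.3

err_older = binomial_fraction_error(ILLUSTRATIVE_FRACTION, 9000)
err_zoo = binomial_fraction_error(ILLUSTRATIVE_FRACTION, 100733)

print(f"error with ~9000 galaxies:  {err_older:.4f}")
print(f"error with 100733 galaxies: {err_zoo:.4f}")
print(f"improvement factor:         {err_older / err_zoo:.2f}")
```

The error shrinks as one over the square root of the number of galaxies, so roughly eleven times more objects buys a bit more than a factor of three in precision. In practice the sample is split into density bins, so each bin has fewer galaxies and a larger error than this whole-sample estimate.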
Now is a good time to think back to Victor and Claire. Like Victor, organised spiral galaxies tend to live in areas of low density. Disorganised ellipticals are found where many galaxies cluster together, somewhat comparable to the city Claire lives in. But is Victor organised because he lives in such an isolated place, and is forced to be; or is he just an intrinsically organised person, and so living in the countryside didn’t seem such a problem? Likewise, is Claire disorganised because of where she lives? Does the plethora of nearby shops make biscuit hoarding unnecessary? Or is she simply a disorganised person, and so chose to live in the city to avoid having to be organised? If Victor moved to the city, would he become more disorganised? Would the place he lives change his personality?
Obviously galaxies don’t choose where they live, in the sense that Victor and Claire can, but the analogy is still strong. Are there more ellipticals in clusters because that’s where ellipticals happen to be, or because something about where they live has turned them into ellipticals? If otherwise identical galaxies form in areas of different densities, would they be the same, or is there something happening in dense regions that changes galaxies into ellipticals? Maybe something about dense regions turns organised galaxies into disorganised ones.
One of the powers of Galaxy Zoo is the staggering number of galaxies we have data for. It is possible to divide up our sample by a variety of galaxy properties, such as their mass and colour, and still have enough galaxies in each slice to see how environment affects that particular subsample. Each of these different properties tells us something different about a galaxy, and enables us to go some way to disentangling their intrinsic properties from recent changes. I’ll discuss the things we’ve learned by doing this in future posts.
More on the Voorwerp
Kevin asked whether I could provide an entry on our efforts to figure out just what Hanny’s Voorwerp might be. This is definitely a guest blog – I am not a ZooKeeper, but they have been gracious enough to let me feed some of the less delicate animals on occasion.
As a reminder, Hanny posted this object on galaxyzooforum.org back on August 13 (I can’t believe it was that long ago now, but apparently the topic scrolled way down in the forum until early December). It showed up on the SDSS color rendition as a deep blue, irregular cloud, just south of the spiral galaxy IC 2497. Pulling out the brightness measurements from the Sloan data in all five filters gave a very unusual result. This thing looked so blue in the color images (made from the gri images) because it puts out almost ten times as much light coming into the g filter as any of the others, and isn’t even detected in the very-far-red z band. That suggests that there is a very strong emission line somewhere in the wavelength range of that filter, about 4200-5500 Angstroms. The SDSS images do show a small object at the north tip of the blob with a more continuous distribution of light; the location is suspicious, but we don’t have direct evidence yet whether it belongs to the Voorwerp. The blob does show structure in the g image, like shells or loops.
Archive searches turned up a radio source in IC 2497, and nothing else helpful (the object just appears on the old Palomar Sky Survey blue-light photographs). A single emission line in that wavelength range could be almost anything, although it was a bit odd that no other emission line was bright enough to produce much light in the other filters. It might be some kind of small nebula in our own galaxy (really small so its dust didn’t block our view of IC 2497 just to the north), some kind of ionized gas cloud associated with IC 2497 itself (redshift z=0.05), or something like the “Lyman α blobs”, gigantic glowing gas clouds seen only in the early Universe (z=3 or so). A spectrum would tell which (if any) of these was correct. So I started emailing friends who use appropriate telescopes pretty regularly, and mostly ended up grumbling about the shortage of spectrographs on 1-3 meter telescopes these days. Meanwhile, I was able to do some measurements with the SARA 0.9-meter telescope, which our university operates remotely as part of a consortium. (In fact, I did these measurements sitting at home, assisted by one of our cats who finds a logbook in front of a monitor the most comfortable place in the house). It takes a pretty long time for a telescope that size to surpass the quick-look Sloan image, but these data were able to narrow down where that strong emission line could be. I used a different set of filters, the classic BVRI set, which was designed to be optimized for certain measurements of stars (rather than galaxies), but is helpful here because it’s different. The bright peak made it into the V filter but not the others. The V band runs more or less from 5000-5900 Angstroms, so the wavelength we seek is in the overlap between V and g, at 5000-5500 Angstroms. Alas, that didn’t help us much, since the strong [O III] emission line at 5007 Angstroms would land in that range for something very nearby or at the redshift of IC 2497.
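For anyone who wants to check the filter arithmetic, here is a small Python sketch of the reasoning. It treats each filter as a simple wavelength interval using the rough edges quoted above (g about 4200-5500 Angstroms, V about 5000-5900 Angstroms), which is a simplification, since real filters have sloping edges. The function names are just for illustration.

```python
# Approximate filter edges in Angstroms, as quoted in the text.
G_BAND = (4200.0, 5500.0)
V_BAND = (5000.0, 5900.0)

# Rest wavelength of the strong [O III] line, in Angstroms.
OIII_REST = 5007.0


def overlap(band_a, band_b):
    """Return the wavelength range common to both bands, or None."""
    low = max(band_a[0], band_b[0])
    high = min(band_a[1], band_b[1])
    return (low, high) if low < high else None


def observed_wavelength(rest_wavelength, redshift):
    """Wavelength at which a line is observed for a given redshift."""
    return rest_wavelength * (1.0 + redshift)


window = overlap(G_BAND, V_BAND)
print(f"allowed window: {window[0]:.0f}-{window[1]:.0f} Angstroms")

for label, z in [("very nearby (z = 0)", 0.0), ("IC 2497 (z = 0.05)", 0.05)]:
    lam = observed_wavelength(OIII_REST, z)
    inside = window[0] <= lam <= window[1]
    print(f"[O III] for {label}: {lam:.1f} Angstroms, in window: {inside}")
```

Both cases land inside the 5000-5500 Angstrom window, which is why the filter measurements alone could not separate a nearby cloud from one at the distance of IC 2497.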

Finally, some of the UK zookeepers were able to find a colleague working at the 4.2-m William Herschel Telescope on the island of La Palma who was able to get a spectrum, while some of us were at the big meeting of the American Astronomical Society in Texas (just last week). The WHT is very well equipped for spectroscopy, and La Palma is a superb site (from which I’ve seen the sharpest images from any telescope I could put my hands on). We’ve got a quick-look screensnap of the spectrum, and it answers a couple of questions right away. The Voorwerp is at almost exactly the same redshift as IC 2497, and almost certainly associated with it. The strong and narrow emission lines are what one would see from a star-forming region. But there are some things about it that are strange, and need more work.
I’ve labelled some of the emission lines in the spectrum here. The spectrograph slit was oriented roughly north-south, running through IC 2497 as well, and is shown left-to-right. Wavelength increases from bottom to top; this is a slice of the violet-to-green region, from about 3400-5100 Angstroms in the reference frame of the object itself. We see the hydrogen series (labelled as H+Greek letters), produced when electrons join with free protons to make hydrogen atoms. There are also lines from heavier elements; the brackets denote so-called forbidden lines, radiation which arises from decay of energy levels excited by collisions between ions and electrons. Looking at what we can tell so far about the relative strengths of these features, there is funny business afoot. First, the gas is hot (even by the standards of ionized nebulae). The ratio of [O III] lines between 4363 and 4959+5007 is sensitive to temperature (for those who really want to know why, here is an online lecture including details, with abundant thanks to the late Don Osterbrock for pounding this stuff into my thick head). To have the 4363 line even detectable, the gas has to be unusually hot, more like 15-20,000 K (exact numbers are pending getting the final calibrated spectrum from the observers). Even odder are some of the other lines. He II is produced when an electron joins a bare helium nucleus, and requires high enough temperature or radiation with enough energy to tear both electrons from helium (four times harder than for hydrogen). We don’t see this in star forming regions. The only stars hot enough to produce He II in surrounding nebulae are the central stars of planetary nebulae (which are the hottest stars known, but only for a few thousand years) and a handful of X-ray-bright stars usually associated with accretion onto black holes or neutron stars. On top of that, at the blue end of the spectrum is [Ne V].
If it’s hard to rip two electrons from helium, it’s that much harder to pull four from neon lights. This requires 97 electron volts (eV), compared to 54 to make He II and 13.6 to ionize hydrogen. [Ne V] does sometimes show up in planetary nebulae, but even there calculations suggest that it’s not the UV starlight that’s responsible, but that high-speed shock waves may be the culprit. This line is also common in the spectra of active galaxies – Seyfert nuclei and their kin, where we know that there are abundant X-rays interacting with the gas.
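To put those energies in perspective, this short Python sketch converts each one into the longest photon wavelength that can do the job. It uses the standard relation between photon energy and wavelength (hc is about 12398.42 eV Angstroms); the energies are the rounded values quoted above, and the function name is just for illustration.

```python
# Planck constant times speed of light, in eV * Angstrom.
HC_EV_ANGSTROM = 12398.42

# Rounded ionization energies quoted in the text, in eV.
IONIZATION_ENERGIES = {
    "ionize hydrogen": 13.6,
    "make He II": 54.0,
    "make [Ne V]": 97.0,
}


def threshold_wavelength(energy_ev):
    """Longest photon wavelength (Angstroms) carrying at least energy_ev."""
    return HC_EV_ANGSTROM / energy_ev


for label, energy in IONIZATION_ENERGIES.items():
    print(f"{label}: {energy} eV -> photons shorter than "
          f"{threshold_wavelength(energy):.0f} Angstroms")
```

The hydrogen limit comes out near 912 Angstroms, already in the far ultraviolet, while the neon case needs photons shorter than about 130 Angstroms, which is heading toward the soft X-ray regime.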

So the spectrum tells us where the Voorwerp is, and leaves us with a fascinating conundrum. (To quote an email from a ZooKeeper, “Hmm..that doesn’t make any sense! Excellent…”) Not only do we see these high-ionization lines, but we can already see that they come from the whole cloud, not some small bright region. Are we dealing with shocks, or perhaps with radiation from an active nucleus in IC 2497 which is obscured from our point of view but shining full force toward the blob? Or something we haven’t thought of? All good questions. We’ll know more when we have the calibrated spectrum so we can do detailed numerical comparisons.
There are obviously a lot more observations we’d like to have. The gas is shining so brightly that it’s hard to tell what the stars are doing. We’re putting together a request to have the Swift orbiting observatory take a look with its UV camera and perhaps in X-rays as well. Swift was designed to follow up gamma-ray bursts, but they also take requests for where to point while sitting there waiting for a random burst to go off. And not too long from now, it will be the season to propose for Hubble imaging and spectroscopy with the Gemini telescope’s integral-field unit (which gives the spectrum not just along a line, but at every point within a small area of sky).
Whatever this is, it’s rare. After I mentioned wanting to improve my SQL fu to check for more things in the SDSS with its odd colors, Chris Lintott did just that. There are no more things in the survey database which are not imaging artifacts and have colors within 15% of what we see here. There’s more work ahead to make sure that we include the possibility of, say, having H-alpha not fall between the r and i filters as it did here, but there can’t be many more of these. Rare objects suggest rare events, just the kind of thing that it takes a deep sky survey and careful winnowing to find. Dank U wel, Hanny!
AAS Talk
I suspect this is finally the last post relating to the AAS meeting, but I wanted to share the slides from my talk last Friday. Please note that these results are officially provisional! Talks at the AAS are just 5 minutes long (with so many astronomers it’s hard to find space) and I was definitely pushing my luck cramming this much in. As you’ll see, I’m not really one for lots of words on slides so I’ll write a brief commentary between them.
All slides are copyright the Galaxy Zoo team and shouldn’t be used without permission.
Interview with Chris and Jordan
While we were at the AAS, Jordan and I were interviewed by Pamela Gay of Astronomy Cast. You can hear the results over at Star Stryder. (The good news is it’s audio, so you don’t have to worry about my shirt…)
Galaxy Zoo: the poster
The reason that Chris and I were at the meeting last week was to present results from Galaxy Zoo. On Thursday, I gave a scientific poster session about the public outreach results from Galaxy Zoo – how thousands of people have helped us classify galaxies, and how we hope we have helped you understand the process of science. On Friday, Chris gave a talk about Galaxy Zoo’s science results.
Today, I’ll write about the public outreach poster, and on Thursday, Chris or I will write about the science talk. At this point, two questions might be occurring to you:
1) What the heck is a “poster”?
2) What do we mean by “public outreach”?
There are two main ways of presenting scientific results at meetings. One is to give a talk. At AAS, these talks are 10 minutes, including time for questions – and it goes by quickly! The other way is to present a “poster” at a scientific “poster session.” In a poster session, authors write about their research and tack it up on a bulletin board 4 feet (120 cm) square. They leave the poster up all day, and stand in front of it at designated times, answering questions. Thus, posters are a good way to present “work in progress,” and get feedback from colleagues.
Here is a copy of our poster (it’s the entire poster as a 5 MB JPG image):
Galaxy Zoo public outreach poster
[Note: there is a section on how we are planning a social science study of Galaxy Zoo volunteers. Some of you may be worried about being a part of this experiment. The short answer is, don’t worry. We will not use any classifications in the study unless you explicitly give us permission to include yours, and no classification will be identifiable as coming from a specific person. For a more detailed answer, read the poster or the Galaxy Zoo Meets Social Science topic in the forum. You’re welcome to ask questions as (anonymous) comments here or by private message in the forum to zookeeperJordan.]
Here are three photos of Chris and me standing in front of the Galaxy Zoo poster (I’m the one in the hat).
Chris and I proudly posing in front of the poster:
A long shot of the poster hall, with us in front of our poster. You can see several other posters as well:
Chris and I answer questions from an unidentified astronomer:
The content of the poster was about how Galaxy Zoo has supported public outreach in science. Public outreach means many things to many people – it’s everything from creating formal lesson plans for use in schools (what I do with SDSS data) to developing museum exhibits to giving public talks to writing blogs (like Chris’s) and podcasts.
What we are doing with Galaxy Zoo is a new and innovative way of working with the public. Our inspiration was Stardust@Home, where volunteers searched through aerogels to find interstellar dust grains. That took some training and careful examination; Galaxy Zoo requires only a quick glance to classify a galaxy as spiral or elliptical. We’ve also tried to use the forum and this blog to give you some insight into the day-to-day process by which scientists work – an insight that scientists often aren’t able to give because of schedule constraints.
We were just one of maybe 100 posters presented on Thursday, but we got an excellent response from the people who stopped by. The astronomy community is excited about what all of us are doing here at Galaxy Zoo. On Thursday, we’ll let you know what we told them about the new science that we are discovering.
AAS: index of zookeeper experiences
I’m back from the AAS meeting in Austin. Last week, Chris and I reported on some of our experiences at the meeting; I know many of you said you wished you could be there, so we wanted to give you a peek at what a scientific meeting was like. There were a lot of posts flying furiously, so this is an index of what we reported on, both here and on Chris’s blog. For more perspectives on the goings-on at the meeting, see the Astronomy Cast LIVE blog. The meeting went from Tuesday to Friday, so this is indexed by day.
Tuesday:
Chris wrote about the search for extrasolar planets, both the progress being made and the postponement of some other missions.
Then, Chris posted some highlights of research presented on Tuesday, including three results that have implications for Galaxy Zoo: a study of the importance of classifying galaxies by eye, the discovery of a spiral galaxy that appears to rotate backwards, and the discovery of a voorwerp-like blue blob.
Wednesday:
Chris posted some beautiful images of the infrared sky from the UKIRT.
Then, Jordan posted about his experiences at the Sloan Digital Sky Survey booth, answering questions about the survey while wearing a chef’s hat. The purpose of the hat was to advertise a session called “Cooking with Sloan,” which served up hot and fresh galaxy images like the ones you see on Galaxy Zoo.
Chris posted again, about how observers and theorists are both making important contributions to the study of extrasolar planets.
At the end of the day, Chris posted about a talk he went to with the intriguing title of How Astronomers Die.
Thursday:
Thursday was the day of the Galaxy Zoo poster presentation – much more about this tomorrow. We were busy in the morning, but several posts appeared in the afternoon.
First, Jordan posted about pub conversations with a researcher at the University of Alaska Anchorage about the role of scientific research in science education.
Next, Kate and Anze posted about the initial results of the Galaxy Zoo bias study, finding that the apparent excess of anticlockwise galaxies has something to do with human perception and not the universe. They also share some ideas about what we’re doing next.
Then, Chris posted twice in a row, about a study of a galaxy supercluster, then about an interview with him posted on Youtube.
Then, Jordan proudly noted that his chef’s hat had been complimented by a Nobel prize recipient, and later added a slightly embarrassing picture.
Friday:
Friday was Chris’s talk about the science results from Galaxy Zoo – more on that tomorrow as well.
Then, Kevin posted about one of the fascinating and unexpected results of Galaxy Zoo – Hanny’s voorwerp.
We hope you enjoyed our coverage of AAS. The next meeting is in St. Louis in early June; whichever of us is going to that will try to provide you with coverage of that meeting too.
What's the blue stuff below?
‘Anyone?’ asked Hanny from the Galaxy Zoo Forum. She came across a weird blue blob that none of us could really make any sense of. It’s right next to a rather massive galaxy that might be a spiral or a somewhat disturbed galaxy.

A highly scientific illustration.
At first, we had no clue. The mystery blob didn’t have a spectrum, so we couldn’t tell much about it at all. It could be in our Milky Way, it could be as distant as that big galaxy, or it could even be at the edge of the universe. Bill Keel enhanced the SDSS image a bit (see below) to reveal the intricate structure of what became known as ‘Hanny’s Voorwerp’ (object).

The five different SDSS bands (g,u,r,i,z), note the intricate structure in the g-band image.
The object seems to be very bright in the g-band image and virtually absent in the others. This led us to think that it must be an emission line object, i.e. an object which emits most of its light only in very specific atomic transitions. This usually means that what we are seeing is ionised gas, rather than stars. Still, it could be anything. Bill Keel kindly also obtained a multi-colour image with the 0.9m SARA telescope at Kitt Peak. The three colours here are much closer to what human eyes would see, so as Bill pointed out, it’s actually much more appropriate to call it the mystery *green* blob.
BVR image from the SARA telescope.
We’ve managed to contact a friend of ours who is currently observing at the 4.2m William Herschel Telescope on La Palma and convinced him to take a spectrum of the Voorwerp for us. It shows us that the Voorwerp is…. *drumroll* at the same distance as the big galaxy. This implies that it’s really rather huge and luminous.
What does all this mean? What is the Voorwerp? That’s not too clear yet. We have to properly analyse the spectrum to understand what exactly is going on. It’s likely forming stars at a huge rate, ionising lots of gas and making it shine. We’re also trying to get a deeper image to see if there’s evidence of an interaction between the big galaxy and the Voorwerp.
So what’s next? We’ll have to do a lot of work to understand this mystery blob better. Right now, the Voorwerp is only slightly less mysterious than when we started, but I have a feeling that it’s going to be really good fun figuring out what is really happening here. It also shows the power of Galaxy Zoo and of having you guys go through the images by eye. If Hanny hadn’t spotted it and asked, we’d never have known about it!
Interview with Chris on cosmology
The Bad Astronomer has an interview with me on his site, talking about the cosmology results.
w00t!
I just got a compliment on my chef hat from a Nobel Prize recipient.
UPDATE: By popular demand, here is the slightly embarrassing photo:
In the eye of the beholder?
Hey guys and girls,
So, as you probably know, the last month or so of Galaxy Zoo has been dedicated to testing whether we have any bias in our classifications (and if you want to know why we are interested in looking at the rotation of galaxies then please have a read here). By ‘bias’ we basically mean some systematic error in the way people classify (you can get a good explanation in Jordan’s post), and this is different from just random general scatter of results. For example, we know that when a galaxy is faint or small then people are more likely to think it is an elliptical galaxy – and this particular morphology bias is something that Steven must compensate for in his work.
It has been really exciting to work on the rotation classifications of Galaxy Zoo, and as many of you know, early on in the project we realised that people were classifying more galaxies as anti-clockwise (see the Telegraph article for example). Specifically, if we take those galaxies that are well classified (i.e. more than 80% of people agree) then we find we have an anti-clockwise:clockwise ratio of about 52:48. This may not sound particularly significant, but as you increase the number of galaxies that you have in your sample (as more of you lovely people classify for us) then this ratio becomes more significant, and is highly unlikely to arise by chance for the ~35,000 galaxies that we have. [For those of you who like probability, the number of anti-clockwise galaxies that we expect is distributed according to a Binomial probability distribution. And if we assume that the ratio is really 50:50, then out of a total of N galaxies we expect N/2 to be anti-clockwise, with a standard deviation of sqrt(N/4).]
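If you’d like to try the numbers yourself, here is a minimal Python sketch of that calculation. It takes the rounded values quoted above at face value (exactly 52% anti-clockwise out of 35,000 galaxies), so the result is only indicative, and the function name is just for illustration.

```python
import math


def excess_in_sigma(n_galaxies, fraction_anticlockwise):
    """How many standard deviations the anti-clockwise count lies from N/2.

    Assumes a true 50:50 split, so the expected count is N/2 with a
    standard deviation of sqrt(N/4), as described in the text.
    """
    expected = n_galaxies / 2
    sigma = math.sqrt(n_galaxies / 4)
    observed = fraction_anticlockwise * n_galaxies
    return (observed - expected) / sigma


# A 52:48 split is unremarkable in a small sample...
print(f"100 galaxies:    {excess_in_sigma(100, 0.52):.1f} sigma")
# ...but very hard to get by chance in a large one.
print(f"35,000 galaxies: {excess_in_sigma(35_000, 0.52):.1f} sigma")
```

With 100 galaxies the excess is well under one standard deviation, but with 35,000 it is more than seven, which is why the same ratio becomes so much harder to dismiss as a fluke.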
In the plot below we show the relative excess of clockwise votes (for users that classified more than about 300 galaxies) – this is the number of clockwise votes minus the number of anti-clockwise votes, divided by the sum of the two. For example, this number would be 1 if a user always clicks clockwise, and zero if they click both clockwise and anti-clockwise equally.
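Written out as a small Python function, the quantity being plotted looks like this. It is only a sketch of the definition given above; the function name and the vote counts in the examples are made up.

```python
def clockwise_excess(n_clockwise, n_anticlockwise):
    """Relative excess of clockwise votes for one user.

    Returns +1 if every vote is clockwise, -1 if every vote is
    anti-clockwise, and 0 if the two are chosen equally often.
    """
    total = n_clockwise + n_anticlockwise
    if total == 0:
        raise ValueError("user has no clockwise or anti-clockwise votes")
    return (n_clockwise - n_anticlockwise) / total


print(clockwise_excess(300, 0))    # always clockwise
print(clockwise_excess(150, 150))  # no preference
print(clockwise_excess(144, 156))  # a 48:52 split, slightly negative
```

A user with a 48:52 clockwise to anti-clockwise split comes out at -0.04, just below the zero line.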
This graph confirms that everyone is generally clicking anti-clockwise more often, because we see that the mean tends to lie below the zero line. But this plot cannot distinguish between an intrinsic excess of anti-clockwise galaxies on the sky and human bias, and it is obviously very important for our rotation results that we get a handle on this, as we could not announce our possible anti-clockwise excess result to the scientific community without doing these bias checks. So the basic idea is to look at the votes for a galaxy before and after a galaxy image is flipped. For example, if 6 out of 10 people thought it was originally clockwise, then after flipping we expect about 6 out of 10 people to now think it is rotating anti-clockwise (if there is no rotation bias).
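The idea of the flip test can be sketched in a few lines of Python. This is a toy version for a single galaxy with made-up vote counts, not the analysis we actually ran; the function name is hypothetical.

```python
def flip_test_shift(cw_before, n_before, acw_after, n_after):
    """Compare votes for one galaxy before and after mirroring its image.

    With no rotation bias, the fraction voting clockwise before the flip
    should match the fraction voting anti-clockwise after it, so the
    returned difference should be close to zero. A positive value means
    anti-clockwise got more votes after the flip than expected.
    """
    frac_cw_before = cw_before / n_before
    frac_acw_after = acw_after / n_after
    return frac_acw_after - frac_cw_before


# The example from the text: 6 of 10 clockwise, then 6 of 10 anti-clockwise.
print(flip_test_shift(6, 10, 6, 10))

# A made-up biased case: 7 of 10 say anti-clockwise after the flip.
print(flip_test_shift(6, 10, 7, 10))
```

For a single galaxy with ten votes the scatter is large, so in practice the difference only becomes meaningful once it is averaged over many galaxies.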
Since the end of November many of the images in Galaxy Zoo have been flipped for this purpose (and we’ve been monitoring the status here), and we now think that we have enough data to measure the levels of bias. This week Anze has flown over from Berkeley (in California) especially to crunch the numbers with Kate (in Oxford); it is quite a job – with over 7 million classifications to go through! And during our analysis some rather subtle points arose… as with most science, things don’t go exactly to plan!
So we basically wanted to compare the classifications for a galaxy before and after flipping, but we quickly realised that people's behaviour in the last month or so is very different to the earlier datasets (see Anze's post for an explanation of how we reduce the data). For example, recently people have been more likely to click the 'Star/Don't know' button. This might be because we have lots of new users, recruited through our latest publicity drive. Or maybe lots of old members have come back after receiving the newsletters. Either way it meant we couldn't simply compare before and after votes. Also, annoyingly, the original unflipped images are no longer on the site, so getting a handle on this behaviour change was a bit tricky (note that one of the first rules of scientific experiments is to have a control test, but a miscommunication amongst team members meant that in this case our control sample accidentally got left out!). Fortunately though, we are able to use the monochrome images that are currently on the site as a comparison (as we observe that being in black and white does not change how people choose between anti-clockwise and clockwise).
So we want to know what the average votes per button are, for the average galaxy in Galaxy Zoo. This is where we encountered our second problem: our bias sample does not cover all of the Galaxy Zoo galaxies, but just 10% of them, and this 10% was not selected at random. In particular we know that we have more anti-clockwise galaxies in the bias sample (on the site at the moment). Therefore we needed to carefully undo what we did when we selected this subsample, so as to construct an effectively random subsample of our full database. Then we could look at the average weights.
In the figure we show the average fraction of votes that a galaxy gets for clockwise (class=2) and anti-clockwise (class=3). We show the result for the original classifications in black (before December), for the monochrome images in red, and for the flipped images in green. We also show the 1 standard deviation error bars from sampling.
So what we see is that the class=3 points are always higher than the class=2 points, and crucially this is true even after we flip a galaxy image! Looking at the red points, we find that before flipping there is a 6.0% chance of hitting anti-clock and a 5.5% chance of hitting clock for our sample. Then after flipping (green) there is a 5.9% chance of hitting anti-clock and a 5.6% chance of hitting clock. So the point is that those numbers stay the same (within 1 standard deviation) when they should actually swap over if there is no bias. It is easier to think in terms of the ratio of fractions:
anti/(anti+clock)=0.522 before flipping
anti/(anti+clock)=0.512 after flipping.
And if we had:
a) no bias and no excess then these should both be 0.5.
b) no bias and a real excess then one should be the mirror image of the other about 0.5 (i.e. 0.52 and then 0.48)
c) a bias and no excess then we would expect them to stay the same and not equal 0.5.
But what we actually find is that 0.522 is 5 standard deviations away from 0.5, 0.512 is 3 standard deviations away from 0.5, and 0.522 & 0.512 are within 1 standard deviation of each other. So you see we appear to be convincingly in situation (c).
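As a rough cross-check of the three scenarios, the sketch below recomputes the ratios from the rounded vote fractions quoted above and asks which scenario's prediction for the after-flip ratio lies nearest the measured one. This nearest-prediction comparison is only an illustration, not the standard-deviation test described above: it ignores the error bars, and the function and variable names are made up for this example. Because it starts from the rounded percentages, it gives an after-flip ratio of about 0.513 rather than the quoted 0.512.

```python
def anti_ratio(anti_fraction, clock_fraction):
    """anti / (anti + clock), using the average vote fractions per galaxy."""
    return anti_fraction / (anti_fraction + clock_fraction)

# Rounded vote fractions (in percent) quoted in the text.
before = anti_ratio(6.0, 5.5)   # monochrome, unflipped images
after = anti_ratio(5.9, 5.6)    # flipped images

# What each scenario predicts for the after-flip ratio, given the before ratio.
predictions = {
    "a": 0.5,            # no bias, no excess
    "b": 1.0 - before,   # no bias, real excess: ratio mirrors about 0.5
    "c": before,         # bias, no excess: ratio stays put
}

closest = min(predictions, key=lambda name: abs(predictions[name] - after))
print(round(before, 3), round(after, 3), closest)
```

The measured after-flip ratio is nearest to the "stays the same" prediction, which is the pattern expected for scenario (c).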
So what next? Well, it is fantastic that we have been able to get a handle on the bias, even if it did turn out to be affecting our results. Only with Galaxy Zoo, which has so many contributors, were we able to detect the bias (and it may turn out to be an inherent bias in the way people see galaxies, which would be an interesting psychology result). Without so many classifications the excess result would have always remained uncertain. And while we no longer think we have an overall excess of anti-clockwise galaxies (which we never expected in the first place!) we can still do a lot of interesting work and pursue our original scientific aims, as explained here and here.
Thanks guys! And keep up the good work. Current classifications remain useful, and we hope to give you some more images next week (possibly returning to the full catalogue!).
Cheers, Kate & Anze






