
#GZoo10 Day 2 : Happy Birthday to us

Ten years ago today, I was trying to work out how to deal with the sudden flood of volunteers heading to our site to help explore the Universe. Lots of that traffic came from a BBC News article, so it seems appropriate they’re marking the day with a new piece reflecting on recent results.

(As Bill says, you can find out more about those results here on the Galaxy Zoo blog).

We’re ready for day 2 of the Galaxy Zoo 10 workshop at St Catherine’s College in Oxford; it was great to have so many people following along yesterday morning on the Livestream – yesterday’s talks are still up, and today’s schedule is:

9.40am: Karen Masters (Portsmouth)
10.00am: Lucy Fortson (Minnesota)
10.20am: Hugh Dickinson (Minnesota)
11.00am: Sam Penny (Portsmouth)
11.20am: Becky Smethurst (Nottingham)
11.40am: Ross Hart (Nottingham)
12 noon: Seb Turner (Liverpool John Moores)
12.20pm: Peter McGill (Oxford).

We’ll blog these talks as they happen here too.

From here on in, folks, it’s Becky Smethurst taking the reins – once more unto the breach, my friends!

We’re kicking off the day with an original science team member: Karen Masters! She’s going to be asking the question: After 10 years of Galaxy Zoo: What Now?

She kicks us off by thinking about how Galaxy Zoo classifications are just as quantitative as an automated computer classification of a galaxy’s morphology – but what can we do to combine these measurements? The problem is that computers searching for a best-fit model can get stuck in what’s called a “local minimum” in parameter space, which is not actually the best fit. Combining human interaction with this computer fitting process will help to find the “global minimum”, or the true best fit for the model. Karen is showing a new project, still in the early development stages, where volunteers help to fit a model light profile for a galaxy.
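To illustrate the local-minimum problem, here is a minimal Python sketch. The one-parameter "misfit" curve is made up for illustration and is not a real galaxy light-profile model; it simply has a shallow dip and a deeper dip. A simple downhill search started in the wrong place settles in the shallow dip, while trying several starting points (a crude stand-in for a human nudging the fit) finds the deeper one.

```python
def misfit(x):
    # Hypothetical misfit curve with two dips: a shallow one near x = 1.7
    # and a deeper one near x = -2.2 (the 0.5 * x term tilts the curve).
    return (x * x - 4.0) ** 2 / 16.0 + 0.5 * x


def downhill(x, step=0.01, max_iter=100000):
    # Walk downhill in small steps until neither neighbour is lower.
    for _ in range(max_iter):
        here = misfit(x)
        left, right = misfit(x - step), misfit(x + step)
        if left < here and left <= right:
            x -= step
        elif right < here:
            x += step
        else:
            break
    return x


# A single search started at x = 3 gets stuck in the shallow dip.
stuck = downhill(3.0)

# Several starting points, keeping whichever result has the lowest misfit.
best = min((downhill(start) for start in (-3.0, -1.0, 1.0, 3.0)), key=misfit)

print(round(stuck, 1), round(best, 1))
```

The starting points here are picked by hand; in the project Karen describes, the hope is that volunteers can supply that kind of sensible starting guess.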

Karen now segues into the idea that colour ≠ morphology, which was one of the first results from Galaxy Zoo.

She’s now talking about how we need to go one step beyond the visual morphologies we have, to classify how the stars are actually rotating. New galaxy surveys, such as MaNGA that Karen is working on, are taking many spectra across the whole galaxy to get rotation maps. We can then classify these rotation maps to understand a galaxy’s history – less ordered rotation suggests that a galaxy’s disk has been disturbed by something like a merger or interaction. Being able to figure this out is really helpful to astronomers, especially if there are no clues in the visual morphology that an interaction or merger is happening.

Karen ends her wonderful talk by discussing how we can all get more students involved with Galaxy Zoo and astronomy as well. Galaxy Zoo is often used in Astronomy 101 classes at universities, as well as in schools – so how do we engage more with this side of the Galaxy Zoo community?

Next up this morning is Lucy Fortson talking about “The evolution of a Galaxy Zoo team member.” Ten years ago Lucy was working at the Adler Planetarium in Chicago and was aware that the best research model for museums was to get the public to participate in it – so jumped at the chance to join the Galaxy Zoo research team! She’s now going to be giving us a round up of everything going on in the group at Minnesota – one of the hubs for the Galaxy Zoo research team.

First up is Mel Galloway’s work comparing the UKIDSS (infrared images) and SDSS (optical images) Galaxy Zoo classifications – how does the morphology of a galaxy change with wavelength? Mel has also worked with the Galaxy Zoo Hubble classifications looking at how disk galaxies evolve and become passive (i.e. non star forming). The problem was that the number of disks drops off at the high redshifts that the Hubble Space Telescope can target, which makes the sample of disks incomplete. This ended up with a new offshoot project classifying disk galaxies lovingly titled Save Mel’s Thesis! With these classifications providing a more complete sample, Mel managed to show how the number of passive disks increased from 6 billion years ago to the present day. Not only that, she also showed that more massive galaxies are more likely to keep their disks after they stop forming stars.

Now we’re all getting emotional because Lucy’s brought up Kyle Willett, who recently left astronomy to work in the data science industry. But where would Galaxy Zoo be without Kyle?! He not only published a lot of the data release papers for Galaxy Zoo but also kept up the site maintenance and even ran the Kaggle competition we held for machine classifications of galaxies.

Speaking of machine classifications of galaxies – did someone say Melanie Beck?! Melanie is working on using machine learning in conjunction with the user classifications to complete Galaxy Zoo much more quickly by retiring the easy subjects early, leaving the users with the interesting and difficult things to classify.

This is going to be really important in the future when new telescopes like the LSST start observing and we have almost 1 billion images of galaxies! If the computers are working on the easy stuff, say 90% of the images, users will still be left with around 100 million more interesting things to classify!

The next Minnesota person is Hugh Dickinson – but he’s in the room, so now we’re hearing from him personally. Hugh is working on the classifications from Galaxy Zoo Illustris – the first time users have been asked to classify galaxies from a different universe – albeit a universe that only exists inside a computer! Illustris is one of the most realistic simulations of galaxies to date; it’s high resolution and includes all the ingredients for a Universe such as dark matter, stellar processes, feedback and gas cooling.

But why is it important to classify what are essentially “fake” galaxies? Well, with a simulation we know exactly what the history of a galaxy is throughout the entire simulated life of the Universe, so if we’re getting the same morphologies as we see in the Universe then we know that the Physics we’ve assumed to make this simulated universe is right. It can also help highlight the links between a galaxy’s visual appearance and its dynamical history. So although it might seem weird classifying something that doesn’t really exist, those classifications are extremely helpful to astronomers to help pin down the Physics that is occurring in the Universe.

So the first thing that Hugh has looked at is the first question in the Galaxy Zoo tree: smooth or featured? And the bad news is that the classifications of the simulation images don’t match the real classifications of galaxies from Galaxy Zoo 2.

This means that there’s an issue with the simulations – investigating it further, they found that there is actually very good agreement between the classifications for higher mass galaxies (> 10^11 solar masses), where the same number of smooth galaxies are seen. For lower mass galaxies, though, there are a lot more featured things in the simulations than in the real Universe. So, from this preliminary study Hugh’s figured out that we’ll only be able to compare the detailed structures of simulated and real galaxies of high mass – so these are now back being classified on the Galaxy Zoo site to get answers to all the further questions about spiral arms, bars etc. So please keep classifying!

Hugh is now telling us all about one of our newest projects: Galaxy Nurseries! Users look at the spectrum of a galaxy and are asked whether the emission they’re seeing in the plot is a real feature or not. The data is a bit messy, so this is not an ideal task for a computer but perfect for a human! There’s a lot to be done with this project but Hugh is excited to see the classifications come through.

Time for coffee for us now – we need caffeine to keep up this level of science discussion!

Next up for our listening pleasure is Sam Penny from the University of Portsmouth.

She’s going to be giving us an overview of the MaNGA survey, which is a new survey looking at the kinematics of galaxies by taking lots of spectra for each galaxy in a bundle. How are we going to integrate this then with Galaxy Zoo? Currently Sam works on very low mass non star forming galaxies which tend to be quite low brightness so aren’t always the easiest to visually classify. So she’s using the kinematic information that MaNGA provides to reveal how these galaxies have evolved. But she wants to know is there a link between the kinematics and the morphology of the galaxies? That way, if we don’t have kinematics for some galaxies will we still be able to pick these galaxies out?

Sam’s other interest is void galaxies; these are found isolated from other galaxies in extremely low-density environments. Surprisingly for galaxies that have been isolated from other galaxies their entire lives, some of them are very massive, very red (i.e. no longer star forming) and have disks. So how did they get so massive on their own? The most massive galaxies in the Universe are thought to build up through mergers – but if these galaxies are isolated then this doesn’t seem to be a possibility for these objects!

This is Chris Lintott taking back over to blog Becky’s talk.

She’s discussing our elephant in the room – the fact that we don’t deal with kinematic morphologies; in other words, we classify galaxies by how they look, not how the stars and gas move within them.

Before that, though, she’s pointing out that the clean samples we assemble from our data – which require a threshold of, say, 80% of people to agree on a classification before a galaxy makes it in – throw out a huge number of galaxies. That works, says Becky, if you’re trying to assemble a sample of discrete bins, like the categories on the Hubble diagram, and not a continuous range of types of galaxies.
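As a rough illustration of what a clean-sample cut does, here is a minimal Python sketch. The galaxy names and vote fractions are invented, the 0.8 threshold simply mirrors the 80% example above, and the sketch assumes only two possible answers (smooth or featured), so the featured fraction is one minus the smooth fraction. The real Galaxy Zoo catalogues have their own columns and cuts.

```python
# Hypothetical vote fractions: the share of volunteers who chose "smooth".
votes = {
    "galaxy_a": 0.95,
    "galaxy_b": 0.55,
    "galaxy_c": 0.10,
    "galaxy_d": 0.82,
    "galaxy_e": 0.40,
}


def clean_sample(vote_fractions, threshold=0.8):
    """Sort galaxies into clean 'smooth', clean 'featured' and 'uncertain' bins."""
    bins = {"smooth": [], "featured": [], "uncertain": []}
    for name, p_smooth in sorted(vote_fractions.items()):
        if p_smooth >= threshold:
            bins["smooth"].append(name)
        elif 1.0 - p_smooth >= threshold:
            bins["featured"].append(name)
        else:
            bins["uncertain"].append(name)
    return bins


bins = clean_sample(votes)
for label, names in bins.items():
    print(label, names)
```

With these made-up numbers, two of the five galaxies land in the "uncertain" bin and would be thrown out of a clean sample.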

But! There’s a catch. If you care about kinematic morphology, there really is a true binary. Things are rotating as a disk or they are not. (The audience seems not necessarily to agree on this point; we’ll see what happens when we get to questions). To study kinematics we can use instruments like MaNGA, which I hope Becky described above during Sam’s talk – Becky tested us and though the majority could distinguish a rotating smooth galaxy from a non-rotating smooth galaxy, it certainly isn’t easy just by looking. (And we probably shouldn’t draw conclusions from a sample of one).

Becky reckons that only 20% of Galaxy Zoo smooth galaxies are ‘true’ ellipticals – those that don’t have a hidden disk-like rotation inside. How will this affect Becky’s work on fitting star formation histories? She does this using a code called StarPy, which uses statistics to decide when a galaxy started to stop star-formation and how fast that ‘quenching’ is happening. (There’s a nice description of StarPy on the blog here).

Using MaNGA to divide galaxies not by visual morphology, but by rotation, Becky has run Starpy and finds differences. Non-regular rotators – what we might call ‘true ellipticals’ – quench either quite fast or very fast; there were a bunch of smooth galaxies that only quenched slowly, but these now seem to be the regular rotators; disks hiding amongst the Galaxy Zoo smooth sample.

Becky finishes with this diagram, from a recent review of kinematic morphologies. She’s added the location of the elephant – she feels we’re good at distinguishing spirals but need to talk about the smooth ones. In questions, Karen reckons we can tell the difference visually, Jean Tate wanted confirmation that lenticulars rotate (they do!), and I reckon we need to think hard about statistics.

Becky is back!

Next up is Ross Hart, who’s telling us all about his work on correcting for biases in Galaxy Zoo 2 data to get out a nice complete sample of spiral galaxies. The problem is that at greater distances it becomes harder to spot spiral arms as things get fainter.

Ross’s code does a really nice job of recovering lots of spiral galaxies that would’ve been missed otherwise in the catalogue. Now we’ve got this catalogue we can use it to study the properties of spirals with arm number – turns out many armed spirals look a lot bluer than typical 2 armed spirals. Despite the fact that they appear blue (i.e. forming lots of hot, young, blue stars!), when Ross measures the star formation rate of these galaxies there is no dependence on arm number. But he does find that many armed spirals have more hydrogen gas than 2 armed spirals, so might have more fuel for star formation in the future.

So what is actually happening in the spiral arms? Ross is now using an automated method called SpArcFiRe to identify where the spiral arms in galaxies are. Once the spiral arm locations have been identified, we can separate their light from the rest of the disk and study what’s going on in the spiral arms and disk separately. Ross is currently working on this but has some interesting preliminary results – so watch this space!

Now for something completely different! We’ve got Sebastian Turner from Liverpool John Moores University telling us all about automating galaxy classifications.

He’s asking what are we going to do moving forward? How will we merge the efforts of computers and humans? Seb is working on using statistical clustering methods to pick out information from the data we already have about galaxies. Clustering methods can pick out groupings of features in a sample – Seb feeds in information about the mass, colour and shape of a galaxy and the machine returns how many groupings it thinks there are in this multi-dimensional parameter space. This can tell us something more about galaxy evolution because as humans we could never visualise this multi-dimensional space. Seb is showing us how the clustering algorithm picks up the areas of the colour magnitude diagram that we’re used to, including the blue cloud, red sequence and green valley. The groups it picks out also correlate nicely with Hubble type morphology, which is encouraging! There’s a lot more work we can do with this, including using the Galaxy Zoo classifications as an input to the algorithm.
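For anyone curious what a clustering method looks like in practice, here is a minimal Python sketch of k-means, one of the simplest such algorithms. This is an assumption for illustration only: the post does not say which clustering method Seb uses, and unlike the approach described above, plain k-means has to be told how many groups to look for. The (colour, mass) values are made up.

```python
def kmeans(points, centres, n_iter=20):
    """Plain k-means: assign each point to its nearest centre, move each
    centre to the mean of its points, and repeat."""
    labels = []
    for _ in range(n_iter):
        labels = [
            min(
                range(len(centres)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centres[k])),
            )
            for point in points
        ]
        new_centres = []
        for k, old in enumerate(centres):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:
                new_centres.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centres.append(old)  # keep an empty group's centre where it was
        centres = new_centres
    return labels, centres


# Made-up (colour, log stellar mass) pairs: three blue, lower-mass galaxies
# followed by three red, higher-mass ones.
galaxies = [(0.3, 9.5), (0.4, 9.8), (0.35, 9.6), (0.9, 11.0), (1.0, 11.2), (0.95, 10.9)]

# Start the two centres on the first and last galaxy so the run is repeatable.
labels, centres = kmeans(galaxies, centres=[galaxies[0], galaxies[-1]])
print(labels)
```

In this toy case the two groups loosely play the roles of the blue cloud and the red sequence. Note that the two axes have very different ranges here; a real analysis would usually rescale the inputs first so that no single property dominates the distances.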

Ross wanted to know if we could input kinematics into this as well? Seb definitely thinks it’s possible. Brooke then asked about how it was weird the algorithm picked out two groups in the blue cloud before it picked out a group for the green valley. Seb reckons a big issue is that the algorithm tries to equalise the numbers in the groups and there’s just so many galaxies in the blue cloud.

Next up is a summer student at the University of Oxford: Peter McGill. He’s been looking at star formation histories of galaxies (like me!) but in galaxies at greater distance (high redshift) in the COSMOS survey with images taken by the Hubble Space Telescope.

In particular he’s focussing on figuring out what is stopping star formation (or quenching it) in the high redshift Universe. He’s using Starpy again (see Becky’s talk above!) to model the star formation history but has changed Starpy to take colours from the Hubble telescope rather than the SDSS. He’s looking at how these star formation histories change at different redshifts in the COSMOS survey and in comparison to the results in the local Universe with SDSS. He’s showing plots where you can see how the rate at which the star formation quenches gets quicker for galaxies at higher redshift. His future work includes figuring out how to include galaxy environment in these studies and changing the method to use a better algorithm to explore the parameter space!

Now we’re off to lunch and afterwards we’ll be having “un-conference” sessions where we’ll have lots of discussions. I’ll be live blogging later on when we report back – see you then internet!

We’re back! We’ve had a very productive afternoon discussing all the weird and wonderful things.

First up, Chris is telling us about a discussion half the team had about a paper draft from four years ago that we forgot existed. It got left by the wayside when the team ran into some problems with completeness of a spiral sample – however Ross’s current work has solved that for us! So we’re going to try completing it as a team again so watch this space.

Steven is now telling us about a session they had about integrating machine learning algorithms with Galaxy Zoo classifications (and Zooniverse classifications as a whole). So how will this affect the interaction of the volunteers with the site and the quality of the data we get out? The aim of this is to speed up classifications (for GZ2 from 6 months to 1 month) since we’re moving into an era of even larger data sets. It will also hopefully mean the machine will do the boring stuff and the interesting stuff gets left in for the users (although that will vary with project). We have to be careful, though, not to put in too many good images, because research has shown there is a sweet spot for the amount of boring classifications to keep people classifying on the site. Also we want to make sure we’re still showing a representative sample of objects so that the public logging onto the site don’t get a biased view of what galaxies look like in the Universe. Could we have somewhere to put retired images (retired either by the machine or by humans) so that users can still explore them? The discussion then came back to what the main science goal of the Galaxy Zoo project is – is the final goal science or to make an interesting data set?

There was also a hack session to get a master data table for the MaNGA Galaxy Zoo classifications. We have them all, they’re just a bit scattered all over the place and need concatenating into one giant table.

Then the science team had a discussion about fast and slow rotators – can we actually do kinematic classifications by eye? We think we’re going to challenge the astronomical community to test this.

Then the science team had a talk about Talk. The discussion was focussed on the importance of needing to engage with the Galaxy Zoo community.

There was also another hack taking place this afternoon playing around with the Galaxy Zoo:3D data. Some of us got to grips with the data and tried to make some nice plots. One thing we did realise is that it might be worthwhile getting about a third of the sample classified a bit further for better statistics.

We had a discussion about one of the tools that we use as a team to infer star formation histories; Starpy. We want to update this code to use a more robust statistical algorithm to get results.

Whilst that was going on, there was also a discussion about how to engage more with undergraduate students. One idea was to have a summer camp with undergraduates who will be engaging in research in the future – teaching them about citizen science, interacting with data, coding skills with a focus on Python – similar to the .Astronomy summer camp or the LSST data camp. Perhaps this could be tied in each year with a science team meeting? Users may enjoy this type of summer camp as well!

And that’s it for science for the day! We have our formal conference dinner tonight though so we’re all looking forward to socialising over a delicious dinner. Until next time internet!

Welcome to #GZoo10 : Day 1

It’s the day before Galaxy Zoo’s tenth birthday, and the team have gathered in Oxford for three days of discussing science and our plans for the future. Because it’s Galaxy Zoo, we’re inviting any of you who are interested to follow along online.

Members of the Galaxy Zoo team relax before the start of their meeting in an Oxford pub.

The mornings will be taken up with talks from team members. Today’s schedule is :

10am : Chris Lintott (Oxford)
10.20am: Lee Kelvin (Liverpool John Moores)
11am: Steven Bamford (Nottingham)
11.20am: Lucy Newnham (Portsmouth)
11.40am: Sandor Kruk (Oxford)
12 noon: Bill Keel (Alabama)

All the talks will be available via Oxford’s LiveStream account here. You can ask us questions using the #GZoo10 hashtag on Twitter – we will make sure someone in the audience at each session is watching so comments online make it into the room.

The afternoon will be an unconference and hack session, with the team debating the issues raised during the day and getting to work together. These sessions won’t be streamed, but we will blog about what’s going on.

It’s Becky Smethurst blogging from here on in folks… 

So we’ve kicked off the day with our fearless leader of the Zooniverse, Chris Lintott, reminding us that on this day 10 years ago the team were having conversations about how it would be amazing if they could get 10,000 people to help classify. Chris is still amazed that we’re here 10 years later with over 400,000 of you.

Chris is running through some of the modes in which we work with the Galaxy Zoo data. The first is looking at traditional morphologies, which the project was designed to do, like bars and spirals. The second is “distraction mode” where we’re all distracted by the serendipitous discoveries that the users make which we weren’t expecting, like the Voorwerpjes and the green peas. The final mode is the modelling mode, where we’re fitting models to the Galaxy Zoo data to explain something about the Universe. This mode also includes the amazing work with classifications of simulated galaxy images that are ongoing on the Galaxy Zoo site right now!

One of the questions from the audience for Chris is: “Why have the serendipitous discoveries dried up on Galaxy Zoo?” Chris thinks that one issue is that it takes so long to follow up on these discoveries – we’re still working on the Voorwerpjes! – but one thing we don’t have with the current images on the site (GAMA and KiDS etc.) is a link to the science survey site where the images come from. We had that with the original Sloan Digital Sky Survey (SDSS) images in Galaxy Zoo 1 & 2, which allowed the users to explore the data themselves and flag up something interesting.

Up next is one of the newest members to the Galaxy Zoo team: Lee Kelvin! He’s telling us about his work with the Galaxy Zoo classifications of the GAMA and KiDS survey images which have just been classified by users on the site. The special thing about GAMA is that it’s multi-wavelength; it takes images in various bands across the spectrum, from the ultra-violet to the infra-red. This is important because, as Lee points out, the morphology of a galaxy changes a lot across different wavelengths.


GAMA also has cross-over with the KiDS survey (the main role for which is to map the locations of gravitational lenses in the Universe, like those users hunted for in Space Warps!) which has much higher resolution than the SDSS images originally in GZ1 & GZ2. This means they’re perfect for classifying morphologies because more detailed features are resolved. These images are on the site right now – which means lots of pretty pictures for us to classify! These classifications give the team a wealth of information on the galaxies in these surveys – especially when users flag the interesting cases on Talk.

The early results from these classifications with the images from KiDS look very promising but Lee says there’s lots more work to be done! Including setting up a follow-up Zooniverse project trying to distinguish between true smooth elliptical shaped galaxies and disk galaxies that look smooth – so look out for that project going live in the next couple of months!

We’re back and caffeinated after a refreshing coffee break! Now Steven Bamford has taken the stand and is talking to us about the next steps for morphology studies with Galaxy Zoo.

He starts us off by reminding us that we can’t just split galaxies into spiral and elliptical galaxies anymore – it’s a lot more complicated than that with a whole evolutionary sequence of smooth disk galaxies between the pure elliptical and pure spiral galaxy sequences. It’s therefore really important to get both visual classifications from Galaxy Zoo but also quantitative morphologies. A quantitative approach is where you analyse an image to reduce the description of a galaxy down to a number – for example, how disturbed or asymmetric a galaxy is. Steven is explaining how you can do this by making a model of a galaxy’s light and subtracting off the original image and analysing what you’re left with. The problem is that the models are tidy but the galaxies are messy! Deciding which model to use is very difficult but that’s where the Galaxy Zoo classifications come in – they can be used as prior information to decide which model to use.
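To make "reducing a galaxy down to a number" concrete, here is a minimal Python sketch of one simple asymmetry measure. It is an illustrative assumption rather than the method from the talk: instead of a fitted light model, the 180-degree rotated image stands in as the "model", and the tiny pixel grids are made up.

```python
def asymmetry(image):
    """Rotate the image by 180 degrees, subtract it from the original, and
    return the summed absolute difference divided by the total light."""
    rotated = [row[::-1] for row in image[::-1]]
    diff = sum(
        abs(a - b)
        for row, rotated_row in zip(image, rotated)
        for a, b in zip(row, rotated_row)
    )
    total = sum(abs(a) for row in image for a in row)
    return diff / total


# Two made-up 3x3 "images": one perfectly symmetric, one with extra light
# on one side of the centre.
symmetric = [
    [0, 1, 0],
    [1, 4, 1],
    [0, 1, 0],
]
lopsided = [
    [0, 1, 0],
    [1, 4, 3],
    [0, 1, 0],
]

print(asymmetry(symmetric), asymmetry(lopsided))
```

The symmetric image scores zero and the lopsided one scores higher, so a single number captures how disturbed each one looks. Real measurements also have to deal with noise, background light and choosing the galaxy's centre, none of which is handled here.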

Steven explains the reason why we actually want to do all this model fitting is because we care about population statistics. Sometimes we don’t care about individual objects and we want to look at the big picture – to do that we need to reduce all that information down as much as we can.

Next up is one of the newest additions to the Galaxy Zoo team, Lucy Newnham a PhD Student at Portsmouth! She’s giving us a nice introduction to the big picture of galaxy evolution and how galaxies stop star forming as they evolve. She’s particularly focussing on barred galaxies and whether the bar can cause this shut down of star formation.

She’s done some follow up observations of some barred galaxies picked out by Galaxy Zoo using radio telescopes! Neutral hydrogen gas emits a very specific wavelength of light in the radio part of the spectrum (21cm) – so if you can detect emission with radio telescopes at this wavelength it means there is hydrogen gas there to fuel star formation. It took 115 hours total observing time with the VLA and GMRT to get data for just 7 galaxies! The first one she’s reduced the data for is UGC9362 and she’s found that there is a hole in the gas in the centre of the galaxy where the bar is. She thinks that means that since the bar is rotating with the galaxy, it has carved out a hole in the gas as it does so and used up all the gas needed for star formation.

The next question Lucy is trying to answer is if the strength of the spiral arms is affecting the star formation in a galaxy? To quantify the strength of the spiral arms, Lucy is using the Galaxy Zoo classifications – where more people agree that a galaxy has spiral arms the stronger the spiral arms will be! Lucy has now looked at trends in galaxy properties with the strength of the spiral arms showing us a plot that she even made this morning! LIVE SCIENCE EVERYBODY!

Taking the stand now is another PhD student, Sandor Kruk, who will be continuing this barred galaxy theme: “Dealing with bars… and other mess”. He clarifies that when he refers to “mess” he means other morphological features!

Again, he’s focussing on this problem of what makes galaxies stop forming stars. Earlier results from Galaxy Zoo that Karen Masters worked on back in 2012 suggested that bars were a likely culprit. Sandor is now following up on this work to split the galaxy light into the separate components: bar, disk and bulge. Looking at the colour of this light will let us know if that part is star forming: red things are old, with little star formation and blue things are young, with recent star formation. To split this light he had to model the light of over 3500 galaxies! That’s a mammoth effort, but it’s paid off because he’s found that there is a difference between the colours of disks in galaxies with and without bars!

Whilst doing all this modelling, he also made a serendipitous discovery along the way: that some of the bars were offset from the centre of the disks. This is weird – it means that perhaps these galaxies have had an interaction with another galaxy which has shifted everything around. Turns out though that some of these objects had already been flagged in Talk by the users! Makes us wonder what else is hiding in there that the team hasn’t yet seen!

Well Sandor reckons we should start with some of the questions of the Galaxy Zoo decision tree that the team haven’t yet had chance to look at. For example, what shape is the bulge of the galaxy – boxy or round? Does the galaxy have a ring? While Sandor has been fitting all of his 3500 galaxies (some barred and some unbarred as a control sample) for his bar study, he’s been getting some ideas for how we can tackle these questions – so watch this space!

So next up is one of the original science team members, Bill Keel! He’s sort of become the curator of the objects in Galaxy Zoo which don’t fit into any of the classifications we ask about on the site. He’ll be telling us specifically about the Voorwerpjes (i.e. ionization echoes). The first one was flagged on August 13th 2007 (another 10 year anniversary coming up, mark it in your calendars!) by one of the volunteers who brought an unusual blue smudge below a galaxy to the team’s attention. Bill is now telling us how they figured out that the weird blue smudge near the galaxy turned out to be a gas cloud which had been ionised by emission from the active supermassive black hole in the centre of the nearby galaxy. We can tell this by looking at the spectrum of these objects – where we split the light into its component wavelengths to spot specific elements and molecules.

After identifying what this first object was, the users then found more! Bill ended up doing follow up observations on 20 of these objects – including 8 followed up with the Hubble Space Telescope. Turns out NGC7252, a galaxy that astronomers have been studying for 30 years, even has one of these ionised clouds!

The search continues for more of these objects – including another one flagged by a user in February 2017 in the current data being classified on Galaxy Zoo. So keep a weather eye out people!

We’re now going to open up the conference to discussion – between the team that are here and you following along online! If you’d like to ask a question or make a comment for discussion – either post it here on the blog or on Twitter with #GZoo10.

The discussion so far has covered how we consider more detailed features of a galaxy and how galaxy simulations will tie in with what we do in the future. We’re also starting the discussion of how the Galaxy Zoo site will be restructured in the future as we move to the new Zooniverse web platform – exciting!

Now we’re all off to lunch to fuel ourselves for a long afternoon of discussion and unconferencing! See you all in an hour – until then, keep tweeting!

We are back! After an afternoon of “un-conferencing” where we all suggest sessions for discussion and schedule them on the fly.

We first talked about what science we’re going to do with your classifications on the infrared images from the UKIDSS sample. We want to compare how the shape of galaxies changes from the optical to the infrared but it gets difficult because galaxies tend to be fainter and smaller in the infrared. A lot of us are keen to study how the number of bars changes from optical wavelengths to infrared wavelengths. There are some studies showing that bars disappear in the infrared, but there are also some that show that bars appear in the infrared where there are none in the optical. One of the Galaxy Zoo PhD students, Mel Galloway, has already had a quick look at this and we discussed where to take this work next! First thing first though – releasing the classifications as a data table to the public.

Our next discussion session was about the future of the Galaxy Zoo classification site. How are we going to ask the users to classify the galaxies? The current mode is the classification tree that we get users to walk through and answer each question for every galaxy. This is very difficult to analyse at the end of the project though. So we discussed changing the interface to either (i) single binary questions about each galaxy, e.g. Bar or no bar? Smooth or featured? (ii) A survey project similar to the interface for Snapshot Serengeti which presents all the options for a galaxy at once, (iii) Lots of mini projects which are all offshoots of Galaxy Zoo focussing on one specific science question, or (iv) pairwise classification where we show two images of galaxies and ask which is more featured etc. There were many opinions about the best way of doing this but we’d also love to hear your thoughts!

Later on we had an “alpha” test of a revamped Galaxy Zoo project which is survey style – it took people a while to get used to but people did seem to like it! There was also a lot of feedback but it was good to get the discussion flowing about what classifiers would like and what researchers would need.

There was also a discussion about how to study bars with the classifications from Galaxy Zoo. It’s a little difficult to pick stuff out, especially the weaker bars. One of the ways astronomers tend to find bars (e.g. when Galaxy Zoo classifications don’t exist for their sample!) is to fit light profiles to the disk of galaxies and take that model light off the original image. What you’re left with is called a “residual” – light that you didn’t account for, i.e. light from a bar. So there was a discussion about making an offshoot Zooniverse project classifying the residual light images to find weak bars.
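For anyone curious what “taking the model light off the image” looks like in practice, here is a toy sketch. It assumes a face-on exponential disk whose parameters have already been fitted, and the function names (`exponential_disk`, `add_bar`, `residual`) and all the numbers are made up for illustration; real light-profile fitting is far more involved.

```python
import math

def exponential_disk(size, i0, scale):
    """Model image of a face-on exponential disk: I(r) = i0 * exp(-r / scale)."""
    c = size // 2
    return [[i0 * math.exp(-math.hypot(x - c, y - c) / scale)
             for x in range(size)] for y in range(size)]

def add_bar(image, half_length, brightness):
    """Return a copy of the image with a thin horizontal bar through the centre."""
    c = len(image) // 2
    out = [row[:] for row in image]
    for x in range(c - half_length, c + half_length + 1):
        out[c][x] += brightness
    return out

def residual(image, model):
    """Pixel-by-pixel difference: the light the model did not account for."""
    return [[p - m for p, m in zip(img_row, mod_row)]
            for img_row, mod_row in zip(image, model)]

# Fake "observed" galaxy = disk + weak bar (hypothetical numbers).
disk = exponential_disk(size=21, i0=100.0, scale=4.0)
galaxy = add_bar(disk, half_length=5, brightness=3.0)

# Subtract the disk model; only the bar should be left behind.
resid = residual(galaxy, disk)
bar_pixels = [(x, y) for y, row in enumerate(resid)
              for x, v in enumerate(row) if v > 1.0]
```

The bar adds only a few percent to the light at the centre of the fake galaxy, which is why it is hard to spot in the original image but stands out in the residual.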

Ross Hart then led a discussion about his new way of debiasing the Galaxy Zoo classifications to take into account the distance to galaxies and the fact that features get lost. He can recover lots more spirals with his new method. The Galaxy Zoo data page now links to his debiased data table first.

We also had a discussion session about the outreach project Tactile Universe – which is a project engaging the blind community with astronomy. They’ve been 3D printing images of galaxies – the brightness being the third axis! We’d love to be able to make a tactile Galaxy Zoo but we have to wait for the tactile screen technology that we’d need to be able to do it! Looks like we’ve got our first session for our Galaxy Zoo Twentieth Anniversary Conference – watch this space #GZoo20.

Now we’ve finished up with the discussion all about the science, we get a treat at the end of the day! Our reward is that our very own Grant Miller has come to tell us all Tales From the Zooniverse! He’s telling us all about his first day on the job in the Zooniverse and how he realised it was going to be a great job when he went into his first meeting all about penguins with the Zooniverse’s Tom Hart! He is now showcasing how amazing the Zooniverse project builder is and is currently trying to build the original Galaxy Zoo project with it in under 3 minutes! And I can tell you: Reader, he managed it! He’s now telling us about his top picks for the Top 10 Zooniverse projects you’ve never heard of:

10) Monopole Quest
9) Expert Smooth/Not
8) Letters to Ryan
7) Bash The Bug
6) Faces of the World
5) The Planetary Response Network
4) Beluga Bits
3) Supernova Hunters
2) Family Certificates
1) Grant can’t name the top one! There’s so many on there now that Grant doesn’t know all of the projects on there (he used to know all the researchers of the projects but not anymore!) – 4700 new projects created since the project builder was launched. 47 of these have been fully launched as new projects, with 31 awaiting launch now.

His take home point: a LOT can happen in ten years!

Introducing the 100th Zooniverse Project: Galaxy Nurseries

It is my pleasure to announce the launch of a brand new Zooniverse project: Galaxy Nurseries. By taking part in this project, volunteers will help us measure the distances of thousands of galaxies, using their spectra. Before I tell you more about the new project and the fascinating science that you will be helping with, I have an announcement to make. Galaxy Nurseries is actually the 100th Zooniverse project, and we’re launching it in the year that Galaxy Zoo (the project that started the Zooniverse phenomenon) celebrates its 10 year anniversary. We can’t think of a better birthday present than a brand new galaxy project!

To celebrate these watersheds in the histories of the Zooniverse and Galaxy Zoo, we’re issuing a special challenge. Can you complete Galaxy Nurseries – the 100th Zooniverse project – in just 100 hours? We think you can do it. Prove us right!

Back to the science! What is Galaxy Nurseries? The main goal of this new project is to discover thousands of new baby galaxies in the distant Universe, using the light they emitted when the Universe was only half of its current age. Accurately measuring the distances to these galaxies is crucial, but this is not an easy task! To measure distances, images are not sufficient, and we need to analyze galaxy spectra. A spectrum is produced by decomposing the light that enters a telescope camera into its many different colors (or wavelengths). This is similar to the way that water droplets split white light into the beautiful colors of a rainbow after a storm.

The data that we use in this project come from the WISP survey. The “WISP” part stands for WFC3 IR Spectroscopic Parallel. This project uses the Wide Field Camera 3 carried by the Hubble Space Telescope to capture both images and spectra of hundreds of regions in the sky. These data allow us to find new galaxies (from the images) and simultaneously measure their distances (using the spectra).


This animation shows how a galaxy’s white light going through a prism gets decomposed into all its colors. Like the rainbow! The figure shows how the different colors end up in different positions. In this example, violet/blue is toward the bottom and orange/red toward the top. At each color, we have an image of the galaxy. When we sum the intensity at any given color, we obtain the spectrum to the right.

How do we do that? We need to identify features called “emission lines” in galaxy spectra. Emission lines appear as peaks in the spectrum and are produced when the presence of certain atomic elements in a galaxy (for example oxygen or hydrogen) causes it to emit light much more strongly at a specific wavelength. The laws of physics tell us the exact wavelengths at which specific elements produce emission lines. We can use that information to tell how fast the galaxy is moving away from us by comparing the color of the emission line we actually measure with the color we know it had when it was produced. In the same way that the Doppler effect changes the apparent pitch of an ambulance’s siren as it approaches or recedes, the apparent color of an emission line depends on the speed of the galaxy that produced it. Then, we can relate the speed of the receding galaxy to how far it is from us through Edwin Hubble’s famous law.
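As a rough illustration of that chain of reasoning, here is a minimal sketch that turns an observed emission-line wavelength into a redshift and then into a distance. The function names and the value of the Hubble constant (70 km/s/Mpc) are assumptions made for this example, and the simple form of Hubble’s law used here is only a guide for nearby galaxies; for galaxies as distant as the ones in this project, astronomers use a full cosmological model instead.

```python
C_KM_S = 299_792.458   # speed of light in km/s
H0 = 70.0              # assumed Hubble constant in km/s/Mpc
H_ALPHA_NM = 656.28    # rest wavelength of the hydrogen H-alpha line (in air), in nm

def redshift(observed_nm, rest_nm):
    """Fractional shift of the line: z = (observed - rest) / rest."""
    return (observed_nm - rest_nm) / rest_nm

def low_z_distance_mpc(z, h0=H0):
    """Hubble's law d = v / H0 with v ~ c * z. Only reasonable for small z."""
    return C_KM_S * z / h0

# A nearby galaxy whose H-alpha line is shifted by 1 percent.
z_near = redshift(662.8428, H_ALPHA_NM)
d_near = low_z_distance_mpc(z_near)    # a few tens of megaparsecs

# A distant galaxy: H-alpha observed at twice its rest wavelength gives z = 1,
# far beyond where the simple d = c * z / H0 estimate can be trusted.
z_far = redshift(2 * H_ALPHA_NM, H_ALPHA_NM)
```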

The real trick is finding the emission line features in the galaxy spectra. As in many modern scientific experiments, we have written computer code that tries to identify these lines for us, but because our automatic line finder is only a machine, the code produces many bogus detections. It turns out that the visual processing power and critical thinking that human beings bring to bear is ideally suited for filtering out these bogus detections. By helping us to spot and eliminate the false positives, you will help us find galaxies that are some of the youngest and smallest that have ever been discovered. In addition, we can use your classifications to create a next-generation galaxy and line detection algorithm that is much less susceptible to being fooled and generating spurious detections. All your work will also be very valuable for the new NASA WFIRST telescope and for the ESA/NASA Euclid mission, both of which will be launched in the coming decade.

Emission lines in a galaxy’s spectrum can tell us about much more than “just” its distance. For example, the presence of hydrogen and oxygen lines tells us that the galaxy contains very young, newborn stars. Only these stars are hot enough to warm the surrounding gas to sufficiently high temperatures that some of these lines appear. By examining emission lines we can also learn what kind of elements were already present and in what relative proportions. We too are “star-stuff”, and by looking at these young galaxies we are following the earliest formation of the elements that make all of us.


The horizontal rainbows show the spectra for the three objects on the left. The bottom, very compact object is a star in our own Milky Way. The other two objects are an interacting pair of young galaxies, observed as they were 7 billion years ago! We can say this because we see an emission line from hydrogen in both galaxies (indicated with arrows). This emission line allows us to measure the galaxies’ distances. 

Galaxy Zoo relatives at AAS meeting – Hubble does overlapping galaxies

Among the results being presented at this week’s meeting of the American Astronomical Society in Texas (near Dallas) is this poster presentation on the status of the STARSMOG project. This program, a “snapshot” survey using the Hubble Space Telescope, selected targets from a list of overlapping galaxy pairs with spiral members and very different redshifts, so they are not interacting with each other and are likely to be more symmetric. The source list includes pairs from Galaxy Zoo (about 60%) and the GAMA (Galaxy And Mass Assembly) survey. These data will allow very extensive analysis; this presentation reads more like a movie trailer in comparison, highlighting only a few results (primarily from the master’s thesis work by Sarah Bradford).

Among the highlights are:

Sharp outer edges to the location of dust lanes in spiral disks.

Distinct dust lanes disappearing for galaxies “late” in the Hubble sequence (Scd-Sd-Sdm-Sm, for those keeping track), maybe happening earlier in the sequence when there is a bar.

The dust web – in the outer disks of some spirals, we see not only dust lanes following the spiral pattern, but additional lanes cutting almost perpendicular to them. This is not completely new, but we can measure the dust more accurately with backlighting where the galaxy’s own light does not dilute its effects.

A first look at the fraction of area in the backlit regions with various levels of transmitted light. This goes beyond our earlier arm/interarm distinction to provide a more rigorous description of the dust distributions.

Bars and rings sweeping adjacent disk regions nearly free of dust (didn’t have room for a separate image on that, although the whole sample is shown in tiny versions across the bottom)

Here is a PNG of the poster. It doesn’t do the images justice, but the text is (just) legible.


New Hubble+Gemini results – history of fading AGN

Just in time to brighten our holiday season, we got word that the Astrophysical Journal has accepted our next paper on the Voorwerpje clouds around fading active galactic nuclei (AGN). The full paper is now linked on the arXiv preprint server.

This time, we concentrated on the clouds and what they can tell us about the history of these AGN. To do this, we worked pixel-by-pixel with the Hubble images of the clouds in the H-alpha and [O III] emission lines, augmented by a new (and very rich) set of integral-field spectroscopy measurements from the 8-meter Gemini North telescope, velocity maps from the Russian 6-meter telescope, and long-slit spectra from the 3-meter Shane telescope at Lick Observatory.

To examine the history of each AGN, our approach was to require that the AGN be at least bright enough to ionize the hydrogen we see glowing at each point at the time the light reaching that point was given off. Certainly we can’t expect each piece of the cloud to absorb all the deep-UV radiation, so this is a lower limit. Two external checks, on quasars unlikely to have faded greatly and on the Teacup AGN which has had detailed modeling done from spectra, suggest that the very brightest pixels at each radius absorb comparable fractions of the ionizing radiation. This gives confidence that we can track at least the behavior of a single object, underestimating its brightness by a single factor, if we look at the upper envelope of all pixels in the H-alpha images. We hoped this would be feasible all the way back to the original Hubble proposal to look at Hanny’s Voorwerp. Here is a graphic from the new paper comparing our AGN in this way. The distance in light-years at each point corresponds to the time delay between the AGN and cloud, and the curve labelled “Projection” shows how much one of these points would change if we view that location not perpendicular to the light but at angles up to 30 degrees each way. To be conservative, the plot shows the data corresponding to the bottom of this curve (minimum AGN luminosity at each point).

The common feature is the rapid brightness drop in the last 20,000 years for each (measured from the light now reaching us from the nuclei). Before that, most of them would not have stood out as having enough of an energy shortfall to enter our sample. Because of smearing due to the large size of the clouds, and the long time it takes for electrons to recombine with protons at such low densities, we would not necessarily see the signature of similar low states more than about 40,000 years back.

We could also improve another measure of the AGN history – the WISE satellite’s mid-infrared sky survey gave us a more accurate measure of these objects’ infrared output. That way, we can tell whether it is at least possible for the AGN to be bright enough to light up the gas, but so dust-blocked in our direction that we underestimate their brightness. The answer in most cases is “not at all”.

New data brought additional surprises (these objects have been gifts that just keep on giving). The Gemini data were taken with fiber-optic arrays giving us a spectrum for each tiny area 0.2 arcseconds on a side (although limited to 3.5×5 arcsecond fields), taken under extraordinarily steady atmospheric conditions so we can resolve structures as small as 0.5 arcsecond. We use these results to see how the gas is ionized and moves; some loops of gas that earlier looked as if they were being blown out from the nuclei are mostly rotating instead. Unlike some well-studied, powerful AGN with giant emission clouds, the Voorwerpje clouds are mostly just orbiting the galaxies (generally as part of tidal tails), being ionized by the AGN radiation but not shoved around by AGN winds. This montage shows the core of NGC 5972 seen by these various instruments, hinting at the level of mapping allowed by the Gemini spectra (and helping explain why it took so long to finish the latest paper).

Work on the Voorwerpjes continues in many ways. Galaxy Zoo participants still find possible clouds (and the moderators have been excellent about making sure we see them). There is more to be learned from the Gemini data, while X-ray observatories are gradually bringing the current status of the AGN into sharper focus. A narrowband imaging survey from the ground can pick out fainter (and sometimes older) clouds. Colleagues with expertise in radio interferometry are addressing questions posed by the unexpected misalignments of optical and radio structures in some of our galaxies. Finally, the new DECaLS and Pan-STARRS survey data will eventually bring nearly the whole sky into our examination (for a huge range of projects, not just AGN history).

Once again, thanks to all who have helped us find and unravel these fascinating objects!

Galaxy Zoo CANDELS

We submitted the Galaxy Zoo CANDELS paper in May. Now, after some discussion with a very helpful referee, the paper is accepted! I hope our volunteers are as thrilled as I was to get the news. It happened within days of the Galaxy Zoo: Hubble paper acceptance. Hurray!


Spot the typo! (No, just kidding.) (Well, sort of. There is one, but it’s not easy to find and it’s pretty inconsequential.) This is not quite the longest paper I’ve ever written, but it is the longest author list I’ve ever been at the top of. It includes both Galaxy Zoo and CANDELS scientists. And the volunteers are acknowledged too, in that first footnote. A lot of people did a lot of work to bring this together.

If you’d like to read the paper, it’s publicly available as a pre-print now and will be published at some point soon in the Monthly Notices of the Royal Astronomical Society. The pre-print version is the accepted version, so it should only differ from the eventual published paper by a tiny bit (I’m sure the proof editor will catch some typos and so on).

The paper may be a little long for a casual read, so here’s an overview:

  • We collected 2,149,206 classifications of 52,073 subjects, from 41,552 registered volunteers and 53,714 web browser sessions where the classifier didn’t log in. In the analysis we assumed each of those unique browser sessions was a separate volunteer.

Most subjects have 40 classifications apiece, although some were retired early from active classification and others were classified further, until about 80 volunteers per galaxy had told us what they thought.

  • The raw consensus classifications are definitely useful, but we also weighted the classifications using a combination of “gold standard” data and consensus-based weighting. That is, classifiers were up- or down-weighted according to whether they could tell a galaxy apart from a star most of the time, and then the rest of the weighting proceeded in the same way it has for every other GZ dataset. No surprise: the majority of volunteers are excellent classifiers.
  • 6% of the raw classifications were from 86 classifiers who both classified a lot and gave the same answer (usually “star or artifact”) at least 98% of the time, no matter what images they saw. We have some bots, but they’re quite easy to spot.
  • Even with a pretty generous definition of what counts as “featured”, less than 15% of galaxies in the relatively young Universe that this data examines have clear signs of features. Most galaxies in the data set are relatively smooth and featureless.
  • Galaxy Zoo compares well with visual classifications of the same galaxies done by members of the CANDELS team, despite the fact that the comparison is sometimes hard because the questions they asked weren’t the same as what we did. This is, of course, a classic problem when comparing data sets of any kind: to some extent it’s always apples-vs-oranges, and the devil is in the details.
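To give a flavour of how the “classified a lot and gave the same answer at least 98% of the time” check in the list above could be expressed, here is a small sketch. It is not the code used in the paper: the function name, the data layout (a dictionary of answers per classifier) and the `min_count` threshold for “a lot” are all assumptions made for illustration.

```python
from collections import Counter

def flag_repetitive_classifiers(answers_by_user, min_count=200, min_fraction=0.98):
    """Flag prolific classifiers who almost always give the same answer.

    min_count is a hypothetical stand-in for "classified a lot";
    min_fraction is the 98% figure quoted in the list above.
    """
    flagged = []
    for user, answers in answers_by_user.items():
        if len(answers) < min_count:
            continue  # too few classifications to judge
        top_answer, top_count = Counter(answers).most_common(1)[0]
        if top_count / len(answers) >= min_fraction:
            flagged.append((user, top_answer))
    return flagged

# Made-up example data: one bot-like account, one regular, one newcomer.
example = {
    "bot_like": ["star or artifact"] * 300,
    "regular": ["smooth", "featured"] * 150,
    "newcomer": ["smooth"] * 10,
}
suspects = flag_repetitive_classifiers(example)
```

Note that the newcomer is not flagged even though all ten of their answers agree: with so few classifications, giving the same answer every time is not suspicious.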

We devote an entire section of the paper to comparing with the CANDELS-team classifications (from Kartaltepe et al. 2015, which we abbreviate to K15 in the paper). The bottom line: the classifications generally agree, and where they don’t we understand why. Sometimes it’s because there’s interesting science there, like mergers versus overlaps. The greyscale shading is a 2-D histogram; the difference in the blue versus red points is in which axis was used to separate the galaxy into bins so that the average trends could be computed.

  • By combining Galaxy Zoo classifications with multi-wavelength light profile fitting — where we fit a 2D equation to the distribution of light in a galaxy, the properties of which correlate pretty well with whether a galaxy has a strong disk component — we’ve identified a population of likely disk-dominated galaxies that also completely lack the features that are common in disk galaxies in the nearby, more evolved Universe. These disks don’t have spiral arms, they don’t have bars, they don’t have clumps. They’re smooth, but they are disks, not ellipticals. They tend to be a bit more compact than disk galaxies that do have features, even though they’re at the same luminosities. They’re also hard to identify using color alone (which echoes what we’ve seen in past Galaxy Zoo studies of various different kinds of galaxies). You really need both kinds of morphological information to reliably find these.
  • The data is available for download for those who would like to study it:

With the data releases of Galaxy Zoo: Hubble and Galaxy Zoo CANDELS added to the existing Galaxy Zoo releases, your combined classifications of over a million galaxies near and far are now public. We’ve already done some science together with these classifications, but there’s so much more to do. Thanks again for enabling us to learn about the Universe. This wouldn’t have been possible without you.

Galaxy Zoo: Hubble – data release and paper accepted!

I’m incredibly happy to report that the main paper for the Galaxy Zoo: Hubble project has just been accepted to the Monthly Notices of the Royal Astronomical Society! It’s been a long road for the project, but we’ve finally reached a major milestone. It’s due to the efforts of many, including the scientists who designed the interface and processed the initial images, the web developers who managed our technology and databases, more than 80,000 volunteers who spent time classifying galaxies and discussing them on the message boards, and the distributed GZ science team who have been steadily working on analyzing images, calibrating data, and writing the paper.

The preprint for the Galaxy Zoo: Hubble paper is available here. The release of GZH also syncs up with the publication of the Galaxy Zoo: CANDELS catalog, led by Brooke Simmons; she’ll have a blog post up later today, and the GZC paper is also available as a preprint.


The first page of the project description and data release paper for Galaxy Zoo: Hubble (Willett et al. 2016).

Galaxy Zoo: Hubble began in 2010; it was the first work of GZ to move beyond the images taken with the Sloan Digital Sky Survey (SDSS). We were motivated by the need to study the evolution and formation of galaxies billions of years ago, in the early days of the Universe. While SDSS is an amazing telescope, it doesn’t have the sensitivity or resolution to make a quality image of a typical galaxy beyond a redshift of about z=0.4 (distances of a few billion parsecs). Instead, we used images from the Hubble Space Telescope, the flagship and workhorse telescope of NASA for the past two decades, and asked volunteers to help us classify the shapes of galaxies in several of Hubble’s largest and deepest surveys. After more than two years of work, the initial set of GZH classifications were finished in 2012 and the site moved on to other datasets, including CANDELS, UKIDSS, and Illustris.

So why has it taken several years to finish the analysis and publication of the data? The reduction of the GZH data ended up being more complicated and difficult than we’d originally anticipated. One key difference lies in our approach to a technique we call debiasing; this refers to a set of corrections made to the raw data supplied by the volunteers. There’s a known effect where galaxies that are less bright and/or further away will appear dimmer and/or smaller in the images which are being classified. This skews the data, making it appear that there are more elliptical/smooth galaxies than truly exist in the Universe. With SDSS images, we dealt with this by assuming that the nearest galaxies were reliably measured, and then deriving corrections which we applied to the rest of the sample.

In Galaxy Zoo: Hubble, we didn’t have that option available. The problem is that there are two separate effects in the data that affect morphological classification. The first is the debiasing issue just mentioned above; however, there’s also a genuine change in the populations of galaxies between, say, 6 billion years ago and the present day. Galaxies in the earlier epochs of the Universe were more likely to have clumpy substructures and less likely to have very well-settled spiral disks with features like bars. So if we just tried to correct for the debiasing effect based on local galaxies, we would have explicitly removed any of the real changes in the population over cosmic time. Since those trends are exactly what we want to study, we needed another approach.

Our solution ended up bringing in another set of data to serve as the calibration. Volunteers who have classified on the current version of the site may remember classifying the “FERENGI” sample. These were images of real galaxies that we processed with computer codes to make them look like they were at a variety of distances. The classifications for these images, which were completed in late 2013, gave us the solution to the first effect; we were able to model the relationship between distance to the galaxy and the likelihood of detecting features, and then applied a correction based on that relationship to the real GZH data.


Top: Example of a galaxy image processed with FERENGI to make it appear at a variety of distances. Bottom: Calibration curves based on FERENGI data that measure the effect of distance on morphological classification. From Willett et al. (2016).

The new GZH data is similar in format and structure to the data release from GZ2. The main product is a very large data table (113,705 rows by 172 columns) that researchers can slice and dice to study specific groups of galaxies with morphological measurements. We’re also releasing data from several related image sets, including experiments on fading and swapping colors in images, the effect of bright active galactic nuclei (AGN), different exposure depths, and even a low-redshift set of SDSS Stripe 82 galaxies classified with the new decision tree. All of the data will be published in electronic tables along with the paper, and are also available for download. Our reduction and analysis code is available as a public GitHub repository.

The science team has already published two papers based on preliminary Galaxy Zoo: Hubble data. This included a paper led by Edmond Cheung (UCSC/Kavli IPMU) that concluded that there is no evidence connecting galactic bars and AGN over a range of redshifts out to z = 1.0. Tom Melvin (U. Portsmouth) carefully examined the overall bar fraction in disks using COSMOS data, measuring a strong decrease in bar fraction going back to galaxies 7.8 billion years ago. We’re now excited to continue new research areas, including a project led by Melanie Galloway (U. Minnesota) on the evolution of red disk galaxies over cosmic time. We hope GZH will enable a lot more science very soon from both our team and external researchers, now that the data are publicly released.

A massive “thank you” again to everyone who’s helped with this project. Galaxy Zoo has made some amazing discoveries with your help in the past eight years, and now that two new unique sets of data are openly available, we’re looking forward to many more.

Bayesian View of Galaxy Evolution

The Universe is pretty huge, and to understand it we need to collect vast amounts of data. The Hubble Telescope is just one of many telescopes collecting data from the Universe. Hubble alone produces 17.5 GB of raw science data each week. That means since its launch to low earth orbit in April 1990, it’s collected roughly a block of data equivalent in size to 6 million mp3 songs! With the launch of NASA’s James Webb Telescope (a tennis court sized space telescope!) just around the corner, the amount of raw data we can collect from the Universe is going to escalate dramatically. In order to decipher what this data is telling us about the Universe we need to use sophisticated statistical techniques. In this post I want to talk a bit about a particular technique I’ve been using called a Markov-Chain-Monte-Carlo (MCMC) simulation to learn about galaxy evolution.

Before we dive into the statistics let me try and explain what I’m trying to figure out. We can model galaxy evolution by looking at a galaxy’s star formation rate (SFR) over time. Basically we want to know how fast a particular galaxy is making stars at any given time. Typically, a galaxy has an initial constant high SFR, then at a time called t quench (tq) its SFR decreases exponentially, at a rate characterised by a number called tau. Small tau means the galaxy stops forming stars, or is quenched, more rapidly. So overall for each galaxy we need to determine two numbers, tq and tau, to figure out how it evolved. Figure 1 shows what this model looks like.


Figure 1: Model of a single galaxy’s SFR over time, showing an initial high constant SFR followed by an exponential quench at tq.
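Written out as code, the model described above might look like the following minimal sketch. The function name `sfr`, the `sfr0` normalisation and the example numbers are my own choices for illustration; times are in whatever units tq and tau are measured in (for example billions of years).

```python
import math

def sfr(t, tq, tau, sfr0=1.0):
    """Star formation rate at time t for a galaxy that quenches at tq.

    Before tq the rate is constant at sfr0; afterwards it decays
    exponentially, sfr0 * exp(-(t - tq) / tau).
    """
    if t < tq:
        return sfr0
    return sfr0 * math.exp(-(t - tq) / tau)

# Two hypothetical galaxies quenching at the same time, one fast and one slow.
fast = [sfr(t, tq=5.0, tau=0.5) for t in range(0, 11)]
slow = [sfr(t, tq=5.0, tau=2.0) for t in range(0, 11)]
```

Both lists are identical up to t = 5, after which the small-tau galaxy shuts down its star formation much more quickly.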

To calculate these two numbers, tq and tau, we look at the colour of the galaxy, specifically the UVJ colour I mentioned in my last post. We then compare this to a predicted colour of a galaxy for a specific value of tq and tau. The problem is that there are many different combinations of tq and tau, so how do we find the best match for a galaxy? We use an MCMC simulation to do this.

The first MC – Markov-Chain – just means an efficient random walk. We send “walkers” to have a look around for a good tq and tau, but the direction we send them to walk at each step depends on how good the tq and tau they are currently at is. The upshot of this is we quickly home in on a good value of tq and tau. The second MC – Monte Carlo – just picks out random values of tq and tau and tests how good they are by comparing the UVJ colours and our SFR model. Figure 2 shows a gif of an MCMC simulation of a single galaxy. The histograms show the positions of the walkers searching the tq and tau space, and the blue crosshair shows the best fit value of tq and tau at every step. You can see the walkers homing in and settling down on the best value of tq and tau. I ran this simulation by running a modified version of the starpy code.


Figure 2: MCMC simulation for a single galaxy, pictured in the top right corner. Main plot shows density of walkers. Marginal histograms show 1D projections of walker densities. Blue crosshair shows best fit values of tq and tau at each step.
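If you’d like to see the idea in action, here is a toy version with a single walker. To be clear about what is assumed: this is a plain Metropolis sampler written from scratch, not the starpy code, and `log_prob` is a made-up stand-in for “how well do the predicted colours match the observed ones?” with an invented best answer of tq = 8 and tau = 1.5. The parameter bounds and step size are also just illustrative choices.

```python
import math
import random

TRUE_TQ, TRUE_TAU = 8.0, 1.5   # made-up "right answer" for this toy problem

def log_prob(tq, tau):
    """Toy goodness-of-fit: highest at (TRUE_TQ, TRUE_TAU), -inf outside the bounds."""
    if not (0.0 < tq < 13.8 and 0.0 < tau < 4.0):   # assumed allowed ranges
        return -math.inf
    return -0.5 * (((tq - TRUE_TQ) / 0.5) ** 2 + ((tau - TRUE_TAU) / 0.2) ** 2)

def metropolis(n_steps, start, step_size, rng):
    """Random walk that prefers better-fitting values of (tq, tau)."""
    tq, tau = start
    current = log_prob(tq, tau)
    chain = []
    for _ in range(n_steps):
        # Propose a small random step away from the current position.
        new_tq = tq + rng.gauss(0.0, step_size)
        new_tau = tau + rng.gauss(0.0, step_size)
        proposed = log_prob(new_tq, new_tau)
        # Always accept a better fit; accept a worse one only some of the time.
        # (1 - random() lies in (0, 1], so the log is always defined.)
        if math.log(1.0 - rng.random()) < proposed - current:
            tq, tau, current = new_tq, new_tau, proposed
        chain.append((tq, tau))
    return chain

rng = random.Random(42)                      # fixed seed so the run is repeatable
chain = metropolis(20000, start=(3.0, 3.0), step_size=0.2, rng=rng)
settled = chain[5000:]                       # drop the early "homing in" steps
mean_tq = sum(p[0] for p in settled) / len(settled)
mean_tau = sum(p[1] for p in settled) / len(settled)
```

The walker starts far from the right answer, wanders toward it, and then jitters around it; the spread of the settled positions is what tells you how well tq and tau are pinned down.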

The maths that underpins this simulation is called Bayesian Statistics, and it’s quite a novel way of thinking about parameters and data. The main difference is that instead of treating unknown parameters as fixed quantities with associated error, they are treated as random variables described by probability distributions. It’s quite a powerful way of looking at the Universe! I’ve left all of the gory maths detail about MCMC out but if you’re interested an article by a DPhil student here at Oxford does a really good job of explaining it here.

So how does this all relate to galaxy morphology, and Galaxy Zoo classifications? I’m currently running the MCMC simulation shown in Figure 2 over all the galaxies in the COSMOS survey. This is really cool because, apart from getting to play with the University of Oxford’s supercomputer (544 cores!), I can use Galaxy Zoo morphology to see if the SFR of a galaxy over time depends on the galaxy’s shape, and overall learn what the vast amount of data I have says about galaxy evolution.


Upcoming Galaxy Zoo: Hubble and CANDELS papers

It’s been a good amount of time since the Galaxy Zoo: Hubble and Galaxy Zoo: CANDELS projects were finished, tackling more than 200,000 combined galaxies thanks to the efforts of our volunteers. While we’ve had a couple of science papers based on the early results (Melvin et al. 2014, Simmons et al. 2014, Cheung et al. 2015), a full release of the data and catalog has taken slightly longer. However, we’ve been working hard, testing the data, and developing some new analysis methods on both image sets. This month has been really exciting, and we now have drafts for both papers that are just about finished. Once they’ve been accepted to the journals (and revised, if necessary), we’ll have some much longer posts discussing the results, and of course attaching the papers themselves. Hopefully that’ll be quite soon.

As a small teaser, here’s a little movie I just made of the Galaxy Zoo: Hubble paper as it went through the various drafts by different members of the science team. If only all paper writing were this easy … 😉

How to write a Galaxy Zoo paper in 15 seconds or less … (Image: K. Willett)

Models of Merging

Once upon a time, there was an experimental project called Galaxy Zoo: Mergers. It used ancient, mystical technology to allow volunteers to run simulations of merging galaxies on their computers, and to compare the results of many such simulations. Their mission: to find matches to more than fifty nearby mergers selected from Galaxy Zoo data.

The wonderful Penguin Galaxy, studied in the project.


Amongst the chosen galaxies were not just run-of-the-mill, everyday mergers, but also the various oddities that the volunteers found, such as the Penguin galaxy. The team led volunteers through a series of tournaments designed to pit potential solutions for a particular galaxy against each other. In total, more than 3 million simulations were reviewed, producing the results described in the paper, now accepted by the journal MNRAS, and in the dataset visible at the main Galaxy Zoo data repository. This represents a huge amount of effort, and a considerable speeding up of the process: in the paper, we note that previous fits to mergers have taken months of effort to complete.

Which is not to say the analysis, led by Anthony Holincheck and John Wallin, has been easy. In a recent email to the Galaxy Zoo team, John commented:

This is by far the most complex project I have ever worked on. Most papers that model interacting galaxies contain one or two systems where the author uses a few dozen simulations. We just published a paper that modeled 62 different systems using a brand new modeling technique where the 3 million simulation results were reviewed by citizen scientists. Best of all, the 62 models were done using the same code and the same coordinate system so others can reproduce them. Doing this with other published simulations is nearly impossible.

I know an immense amount of effort went into making sure that the results weren’t wasted, and the paper thus represents a happy ending to a tale that’s been running a long time. But it is not really an end; we are already planning to observe some of these galaxies as part of surveys like MaNGA, which can measure the way that the galaxies’ components are moving today, allowing us to test these models. We also hope a library of models might be useful for other astronomers, and we will be looking to revive this kind of project.

Read more about Galaxy Zoo: Mergers in this old blog post