It’s the day before Galaxy Zoo’s tenth birthday, and the team have gathered in Oxford for three days of discussing science and our plans for the future. Because it’s Galaxy Zoo, we’re inviting any of you who are interested to follow along online.
The mornings will be taken up with talks from team members. Today’s schedule is:
10am: Chris Lintott (Oxford)
10.20am: Lee Kelvin (Liverpool John Moores)
11am: Steven Bamford (Nottingham)
11.20am: Lucy Newnham (Portsmouth)
11.40am: Sandor Kruk (Oxford)
12 noon: Bill Keel (Alabama)
All the talks will be available via Oxford’s LiveStream account here. You can ask us questions using the #GZoo10 hashtag on Twitter – we will make sure someone in the audience at each session is watching so comments online make it into the room.
The afternoon will be an unconference and hack session, with the team debating the issues raised during the day and getting to work together. These sessions won’t be streamed, but we will blog about what’s going on.
It’s Becky Smethurst blogging from here on in folks…
So we’ve kicked off the day with our fearless leader of the Zooniverse, Chris Lintott, reminding us that on this day 10 years ago the team were having conversations about how it would be amazing if they could get 10,000 people to help classify. Chris is still amazed that we’re here 10 years later with over 400,000 of you.
Chris is running through some of the modes in which we work with the Galaxy Zoo data. The first is looking at traditional morphologies, which the project was designed to do, like bars and spirals. The second is “distraction mode” where we’re all distracted by the serendipitous discoveries that the users make which we weren’t expecting, like the Voorwerpjes and the green peas. The final mode is the modelling mode, where we’re fitting models to the Galaxy Zoo data to explain something about the Universe. This mode also includes the amazing work with classifications of simulated galaxy images that are ongoing on the Galaxy Zoo site right now!
One of the questions from the audience for Chris is: “Why have the serendipitous discoveries dried up on Galaxy Zoo?” Chris thinks that one issue is that it takes so long to follow up on these discoveries – we’re still working on the Voorwerpjes! – but another is that the current images on the site (GAMA and KiDS etc.) don’t come with a link to the science survey site where the images come from. We had that with the original Sloan Digital Sky Survey (SDSS) images in Galaxy Zoo 1 & 2, which allowed the users to explore the data themselves and flag up something interesting.
Up next is one of the newest members to the Galaxy Zoo team: Lee Kelvin! He’s telling us about his work with the Galaxy Zoo classifications of the GAMA and KiDS survey images which have just been classified by users on the site. The special thing about GAMA is that it’s multi-wavelength; it takes images in various bands across the spectrum, from the ultra-violet to the infra-red. This is important because, as Lee points out, the morphology of a galaxy changes a lot across different wavelengths.
GAMA also has cross-over with the KiDS survey (the main role for which is to map the locations of gravitational lenses in the Universe, like those users hunted for in Space Warps!) which has much higher resolution than the SDSS images originally in GZ1 & GZ2. This means they’re perfect for classifying morphologies because more detailed features are resolved. These images are on the site right now – which means lots of pretty pictures for us to classify! These classifications give the team a wealth of information on the galaxies in these surveys – especially when users flag the interesting cases on Talk.
The early results from these classifications with the images from KiDS look very promising but Lee says there’s lots more work to be done! Including setting up a follow-up Zooniverse project trying to distinguish between true smooth elliptical shaped galaxies and disk galaxies that look smooth – so look out for that project going live in the next couple of months!
We’re back and caffeinated after a refreshing coffee break! Now Steven Bamford has taken the stand and is talking to us about the next steps for morphology studies with Galaxy Zoo.
He starts us off by reminding us that we can’t just split galaxies into spiral and elliptical galaxies anymore – it’s a lot more complicated than that with a whole evolutionary sequence of smooth disk galaxies between the pure elliptical and pure spiral galaxy sequences. It’s therefore really important to get both visual classifications from Galaxy Zoo and quantitative morphologies. A quantitative approach is where you analyse an image to reduce the description of a galaxy down to a number – for example, how disturbed or asymmetric a galaxy is. Steven is explaining how you can do this by making a model of a galaxy’s light, subtracting it from the original image and analysing what you’re left with. The problem is that the models are tidy but the galaxies are messy! Deciding which model to use is very difficult but that’s where the Galaxy Zoo classifications come in – they can be used as prior information to decide which model to use.
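For anyone wondering what “reducing a galaxy down to a number” might look like in practice, here is a minimal Python sketch of one common choice, a rotational asymmetry: rotate the image by 180 degrees, subtract it from the original and compare the leftover light to the total. This is an illustration only, not necessarily the measure Steven uses; the toy galaxies and the function names are made up for the example.

```python
import numpy as np

def rotational_asymmetry(image):
    """Sum of |I - I_rotated| over sum of |I|; 0 for a perfectly symmetric image."""
    image = np.asarray(image, dtype=float)
    rotated = np.rot90(image, 2)  # 180 degree rotation about the array centre
    return np.abs(image - rotated).sum() / np.abs(image).sum()

def gaussian_blob(size, x0, y0, sigma, amplitude=1.0):
    """Toy light distribution: a round Gaussian blob (hypothetical stand-in for a galaxy)."""
    y, x = np.mgrid[0:size, 0:size]
    return amplitude * np.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2 * sigma ** 2))

size = 65
centre = size // 2

# A smooth, symmetric toy galaxy, and the same galaxy with an off-centre clump added.
smooth_galaxy = gaussian_blob(size, centre, centre, sigma=8.0)
clump = gaussian_blob(size, centre + 12, centre + 5, sigma=3.0, amplitude=0.5)
disturbed_galaxy = smooth_galaxy + clump

a_smooth = rotational_asymmetry(smooth_galaxy)
a_disturbed = rotational_asymmetry(disturbed_galaxy)
print(f"smooth: {a_smooth:.3f}, disturbed: {a_disturbed:.3f}")
```

The symmetric blob scores essentially zero while the clumpy one scores noticeably higher. Real images also need a carefully chosen centre and a correction for background noise, both of which are skipped here.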
Steven explains the reason why we actually want to do all this model fitting is because we care about population statistics. Sometimes we don’t care about individual objects and we want to look at the big picture – to do that we need to reduce all that information down as much as we can.
Next up is one of the newest additions to the Galaxy Zoo team, Lucy Newnham, a PhD student at Portsmouth! She’s giving us a nice introduction to the big picture of galaxy evolution and how galaxies stop forming stars as they evolve. She’s particularly focussing on barred galaxies and whether the bar can cause this shutdown of star formation.
She’s done some follow-up observations of some barred galaxies picked out by Galaxy Zoo using radio telescopes! Neutral hydrogen gas emits a very specific wavelength of light in the radio part of the spectrum (21cm) – so if you can detect emission with radio telescopes at this wavelength it means there is hydrogen gas there to fuel star formation. It took 115 hours total observing time with the VLA and GMRT to get data for just 7 galaxies! The first one she’s reduced the data for is UGC9362 and she’s found that there is a hole in the gas in the centre of the galaxy where the bar is. She thinks that means that since the bar is rotating with the galaxy, it has carved out a hole in the gas as it does so and used up all the gas needed for star formation.
The next question Lucy is trying to answer is whether the strength of the spiral arms affects the star formation in a galaxy. To quantify the strength of the spiral arms, Lucy is using the Galaxy Zoo classifications – the more people agree that a galaxy has spiral arms, the stronger the spiral arms will be! Lucy has now looked at trends in galaxy properties with the strength of the spiral arms, showing us a plot that she made only this morning! LIVE SCIENCE EVERYBODY!
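As a rough illustration of the idea, the “agreement” can be written as a vote fraction: the share of volunteers who gave a particular answer for a galaxy. The sketch below is in Python, with made-up votes and answer labels; it is not Lucy’s actual pipeline, which would also involve the weighting and debiasing applied to Galaxy Zoo data.

```python
from collections import Counter

def vote_fraction(answers, target):
    """Fraction of answers equal to `target`; 0.0 if there are no answers at all."""
    counts = Counter(answers)
    total = sum(counts.values())
    return counts[target] / total if total else 0.0

# Hypothetical votes for two galaxies (the labels are invented for this example).
votes_galaxy_a = ["spiral", "spiral", "spiral", "no spiral"]
votes_galaxy_b = ["spiral", "no spiral", "no spiral", "no spiral"]

strength_a = vote_fraction(votes_galaxy_a, "spiral")
strength_b = vote_fraction(votes_galaxy_b, "spiral")
print(strength_a, strength_b)
```

Galaxy A, where three of four volunteers saw arms, would count as having the stronger spiral arms under this proxy.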
Taking the stand now is another PhD student, Sandor Kruk, who will be continuing this barred galaxy theme: “Dealing with bars… and other mess”. He clarifies that when he refers to “mess” he means other morphological features!
Again, he’s focussing on this problem of what makes galaxies stop forming stars. Earlier results from Galaxy Zoo that Karen Masters worked on back in 2012 suggested that bars were a likely culprit. Sandor is now following up on this work to split the galaxy light into the separate components: bar, disk and bulge. Looking at the colour of this light will let us know if that part is star forming: red things are old, with little star formation and blue things are young, with recent star formation. To split this light he had to model the light of over 3500 galaxies! That’s a mammoth effort, but it’s paid off because he’s found that there is a difference between the colours of disks in galaxies with and without bars!
Whilst doing all this modelling, along the way he also made a serendipitous discovery: that some of the bars were offset from the centre of the disks. This is weird – it means that perhaps these galaxies have had an interaction with another galaxy which has shifted everything around. Turns out though that some of these objects had already been flagged in Talk by the users! Makes us wonder what else is hiding in there that the team hasn’t yet seen!
Well Sandor reckons we should start with some of the questions of the Galaxy Zoo decision tree that the team haven’t yet had chance to look at. For example, what shape is the bulge of the galaxy – boxy or round? Does the galaxy have a ring? While Sandor has been fitting all of his 3500 galaxies (some barred and some unbarred as a control sample) for his bar study, he’s been getting some ideas for how we can tackle these questions – so watch this space!
So next up is one of the original science team members, Bill Keel! He’s sort of become the curator of the objects in Galaxy Zoo which don’t fit into any of the classifications we ask about on the site. He’ll be telling us specifically about the Voorwerpjes (i.e. ionization echoes). The first one was flagged on August 13th 2007 (another 10 year anniversary coming up, mark it in your calendars!) by one of the volunteers who brought an unusual blue smudge below a galaxy to the team’s attention. Bill is now telling us how they figured out that the weird blue smudge near the galaxy was a gas cloud which had been ionised by emission from the active supermassive black hole in the centre of the nearby galaxy. We can tell this by looking at the spectrum of these objects – where we split the light into its component wavelengths to spot specific elements and molecules.
After identifying what this first object was, the users then found more! Bill ended up doing follow up observations on 20 of these objects – including 8 followed up with the Hubble Space Telescope. Turns out NGC7252, a galaxy that astronomers have been studying for 30 years, even has one of these ionised clouds!
The search continues for more of these objects – including another one flagged by a user in February 2017 in the current data being classified on Galaxy Zoo. So keep a weather eye out people!
We’re now going to open up the conference to discussion – between the team that are here and you following along online! If you’d like to ask a question or make a comment for discussion – either post it here on the blog or on Twitter with #GZoo10.
The discussion so far has covered how we consider more detailed features of a galaxy and how galaxy simulations will tie in with what we do in the future. We’re also starting the discussion of how the Galaxy Zoo site will be restructured in the future as we move to the new Zooniverse web platform – exciting!
Now we’re all off to lunch to fuel ourselves for a long afternoon of discussion and unconferencing! See you all in an hour – until then, keep tweeting!
We are back! After an afternoon of “un-conferencing” where we all suggest sessions for discussion and schedule them on the fly.
We first talked about what science we’re going to do with your classifications on the infrared images from the UKIDSS sample. We want to compare how the shape of galaxies changes from the optical to the infrared but it gets difficult because galaxies tend to be fainter and smaller in the infrared. A lot of us are keen to study how the number of bars changes from optical wavelengths to infrared wavelengths. There are some studies showing that bars disappear in the infrared, but there are also some that show that bars appear in the infrared where there are none in the optical. One of the Galaxy Zoo PhD students, Mel Galloway, has already had a quick look at this and we discussed where to take this work next! First things first though – releasing the classifications as a data table to the public.
Our next discussion session was about the future of the Galaxy Zoo classification site. How are we going to ask the users to classify the galaxies? The current mode is the classification tree that we get users to walk through, answering each question for every galaxy. This is very difficult to analyse at the end of the project though. So we discussed changing the interface to either (i) single binary questions about each galaxy, e.g. Bar or no bar? Smooth or featured? (ii) A survey project similar to the interface for Snapshot Serengeti which presents all the options for a galaxy at once, (iii) Lots of mini projects which are all offshoots of Galaxy Zoo focussing on one specific science question, or (iv) pairwise classification where we show two images of galaxies and ask which is more featured etc. There were many opinions about the best way of doing this but we’d also love to hear your thoughts!
Later on we had an “alpha” test of a revamped Galaxy Zoo project which is survey style – it took people a while to get used to but people did seem to like it! There was also a lot of feedback but it was good to get the discussion flowing about what classifiers would like and what researchers would need.
There was also a discussion about how to study bars with the classifications from Galaxy Zoo. It’s a little difficult to pick stuff out, especially the weaker bars. One of the ways astronomers tend to find bars (e.g. when Galaxy Zoo classifications don’t exist for their sample!) is to fit light profiles to the disk of galaxies and take that model light off the original image. What you’re left with is called a “residual” – light that you didn’t account for, i.e. light from a bar. So there was a discussion about making an offshoot Zooniverse project classifying the residual light images to find weak bars.
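To make the “residual” idea concrete, here is a small Python sketch with a toy galaxy made of a round disk plus an elongated bar. As a loudly labelled simplification, it does not fit a light profile at all: it uses the average brightness at each radius as a stand-in for the fitted disk model. All the numbers, names and the toy image are invented for illustration.

```python
import numpy as np

def azimuthal_average_model(image, x0, y0):
    """Axisymmetric model: each pixel gets the mean brightness of its radius bin."""
    y, x = np.indices(image.shape)
    radius_bin = np.rint(np.hypot(x - x0, y - y0)).astype(int)
    flux_per_bin = np.bincount(radius_bin.ravel(), weights=image.ravel())
    pixels_per_bin = np.bincount(radius_bin.ravel())
    profile = flux_per_bin / np.maximum(pixels_per_bin, 1)
    return profile[radius_bin]

# Toy galaxy: exponential disk plus a bar elongated along the x axis.
size, x0, y0 = 65, 32, 32
y, x = np.indices((size, size))
dx, dy = x - x0, y - y0
disk = np.exp(-np.hypot(dx, dy) / 10.0)
bar = 0.5 * np.exp(-(dx ** 2 / (2 * 8.0 ** 2) + dy ** 2 / (2 * 2.0 ** 2)))
image = disk + bar

# Residual = original image minus the smooth model.
model = azimuthal_average_model(image, x0, y0)
residual = image - model

on_bar = residual[y0, x0 + 6]    # six pixels from the centre, along the bar
off_bar = residual[y0 + 6, x0]   # six pixels from the centre, perpendicular to it
print(f"residual on bar: {on_bar:.3f}, off bar: {off_bar:.3f}")
```

The residual is positive along the bar and negative at the same radius away from it, which is the kind of pattern a residual-image project would ask classifiers to spot.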
Ross Hart then led a discussion about his new way of debiasing the Galaxy Zoo classifications to take into account the distance to galaxies and the fact that features get lost. He can recover lots more spirals with his new method. The table we link to on the Galaxy Zoo data page now has his debiased data table linked first.
We also had a discussion session about the outreach project Tactile Universe – which is a project engaging the blind community with astronomy. They’ve been 3D printing images of galaxies – the brightness being the third axis! We’d love to be able to make a tactile Galaxy Zoo but we have to wait for the tactile screen technology that we’d need to be able to do it! Looks like we’ve got our first session for our Galaxy Zoo Twentieth Anniversary Conference – watch this space #GZoo20.
Now we’ve finished up with the discussion all about the science, we get a treat at the end of the day! Our reward is that our very own Grant Miller has come to tell us all Tales From the Zooniverse! He’s telling us all about his first day on the job in the Zooniverse and how he realised it was going to be a great job when he went into his first meeting all about penguins with the Zooniverse’s Tom Hart! He is now showcasing how amazing the Zooniverse project builder is and is currently trying to build the original Galaxy Zoo project with it in under 3 minutes! And I can tell you: Reader, he managed it! He’s now telling us about his top picks for the Top 10 Zooniverse projects you’ve never heard of:
10) Monopole Quest
9) Expert Smooth/Not
8) Letters to Ryan
7) Bash The Bug
6) Faces of the World
5) The Planetary Response Network
4) Beluga Bits
3) Supernova Hunters
2) Family Certificates
1) Grant can’t name the top one! There’s so many on there now that Grant doesn’t know all of the projects on there (he used to know all the researchers of the projects but not anymore!) – 4700 new projects created since the project builder was launched. 47 of these have been fully launched as new projects, with 31 awaiting launch now.
His take home point: a LOT can happen in ten years!
It is my pleasure to announce the launch of a brand new Zooniverse project: Galaxy Nurseries. By taking part in this project, volunteers will help us measure the distances of thousands of galaxies, using their spectra. Before I tell you more about the new project and the fascinating science that you will be helping with, I have an announcement to make. Galaxy Nurseries is actually the 100th Zooniverse project, and we’re launching it in the year that Galaxy Zoo (the project that started the Zooniverse phenomenon) celebrates its 10 year anniversary. We can’t think of a better birthday present than a brand new galaxy project!
To celebrate these watersheds in the histories of the Zooniverse and Galaxy Zoo, we’re issuing a special challenge. Can you complete Galaxy Nurseries – the 100th Zooniverse project – in just 100 hours? We think you can do it. Prove us right!
Back to the science! What is Galaxy Nurseries? The main goal of this new project is to discover thousands of new baby galaxies in the distant Universe, using the light they emitted when the Universe was only half of its current age. Accurately measuring the distances to these galaxies is crucial, but this is not an easy task! To measure distances, images are not sufficient, and we need to analyze galaxy spectra. A spectrum is produced by decomposing the light that enters a telescope camera into its many different colors (or wavelengths). This is similar to the way that water droplets split white light into the beautiful colors of a rainbow after a storm.
The data that we use in this project come from the WISP survey. The “WISP” part stands for WFC3 IR Spectroscopic Parallel. This project uses the Wide Field Camera 3 carried by the Hubble Space Telescope to capture both images and spectra of hundreds of regions in the sky. These data allow us to find new galaxies (from the images) and simultaneously measure their distances (using the spectra).
How do we do that? We need to identify features called “emission lines” in galaxy spectra. Emission lines appear as peaks in the spectrum and are produced when the presence of certain atomic elements in a galaxy (for example oxygen, or hydrogen), cause it to emit light much more strongly at a specific wavelength. The laws of physics tell us the exact wavelengths at which specific elements produce emission lines. We can use that information to tell how fast the galaxy is moving away from us by comparing the color of the emission line we actually measure with the color we know it had when it was produced. In the same way that the Doppler effect changes the apparent pitch of an ambulance’s siren as it approaches or recedes, the apparent color of an emission line depends on the speed of the galaxy that produced it. Then, we can relate the speed of the receding galaxy to how far it is from us through Edwin Hubble’s famous law.
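The chain of reasoning above (shifted line, then speed, then distance) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the observed wavelength is made up, the Hubble constant of 70 km/s/Mpc is an assumed round value, and the simple relations used here only hold for nearby galaxies. For galaxies as distant as the ones in Galaxy Nurseries, the distance has to come from a full cosmological model rather than this linear version of Hubble’s law.

```python
C_KM_S = 299_792.458        # speed of light in km/s
H0_KM_S_MPC = 70.0          # assumed Hubble constant in km/s/Mpc
H_ALPHA_REST_NM = 656.28    # rest wavelength of the hydrogen H-alpha line in nm

def redshift(observed_nm, rest_nm):
    """Fractional shift of the line: z = (observed - rest) / rest."""
    return (observed_nm - rest_nm) / rest_nm

def recession_velocity_km_s(z):
    """Low-redshift approximation v = c * z (not valid for distant galaxies)."""
    return C_KM_S * z

def hubble_distance_mpc(velocity_km_s):
    """Hubble's law d = v / H0, giving a distance in megaparsecs."""
    return velocity_km_s / H0_KM_S_MPC

# Hypothetical measurement: H-alpha observed at 662.84 nm instead of 656.28 nm.
z = redshift(662.84, H_ALPHA_REST_NM)
velocity = recession_velocity_km_s(z)
distance = hubble_distance_mpc(velocity)
print(f"z = {z:.4f}, v = {velocity:.0f} km/s, d = {distance:.1f} Mpc")
```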
The real trick is finding the emission line features in the galaxy spectra. As in many modern scientific experiments, we have written computer code that tries to identify these lines for us, but because our automatic line finder is only a machine, the code produces many bogus detections. It turns out that the visual processing power and critical thinking that human beings bring to bear is ideally suited for filtering out these bogus detections. By helping us to spot and eliminate the false positives, you will help us find galaxies that are some of the youngest and smallest that have ever been discovered. In addition, we can use your classifications to create a next-generation galaxy and line detection algorithm that is much less susceptible to being fooled and generating spurious detections. All your work will also be very valuable for the new NASA WFIRST telescope and for the ESA/NASA Euclid mission, which will both be launched in the coming decade.
Emission lines in a galaxy’s spectrum can tell us about much more than “just” its distance. For example, the presence of hydrogen and oxygen lines tells us that the galaxy contains very young, newborn stars. Only these stars are hot enough to warm the surrounding gas to sufficiently high temperatures that some of these lines appear. By examining emission lines we can also learn what kind of elements were already present and in what relative proportions. We too are “star-stuff”, and by looking at these young galaxies we are following the earliest formation of the elements that make all of us.
Among the results being presented at this week’s meeting of the American Astronomical Society in Texas (near Dallas) is this poster presentation on the status of the STARSMOG project. This program, a “snapshot” survey using the Hubble Space Telescope, selected targets from a list of overlapping galaxy pairs with spiral members and very different redshifts, so they are not interacting with each other and are likely to be more symmetric. The source list includes pairs from Galaxy Zoo (about 60%) and the GAMA (Galaxy And Mass Assembly) survey. These data will allow very extensive analysis; this presentation reads more like a movie trailer in comparison, highlighting only a few results (primarily from the master’s thesis work by Sarah Bradford).
Among the highlights are:
Sharp outer edges to the location of dust lanes in spiral disks.
Distinct dust lanes disappearing for galaxies “late” in the Hubble sequence (Scd-Sd-Sdm-Sm, for those keeping track), maybe happening earlier in the sequence when there is a bar.
The dust web – in the outer disks of some spirals, we see not only dust lanes following the spiral pattern, but additional lanes cutting almost perpendicular to them. This is not completely new, but we can measure the dust more accurately with backlighting where the galaxy’s own light does not dilute its effects.
A first look at the fraction of area in the backlit regions with various levels of transmitted light. This goes beyond our earlier arm/interarm distinction to provide a more rigorous description of the dust distributions.
Bars and rings sweeping adjacent disk regions nearly free of dust (didn’t have room for a separate image on that, although the whole sample is shown in tiny versions across the bottom)
Here is a PNG of the poster. It doesn’t do the images justice, but the text is (just) legible.
Just in time to brighten our holiday season, we got word that the Astrophysical Journal has accepted our next paper on the Voorwerpje clouds around fading active galactic nuclei (AGN). The full paper is now linked on the arXiv preprint server.
This time, we concentrated on the clouds and what they can tell us about the history of these AGN. To do this, we worked pixel-by-pixel with the Hubble images of the clouds in the H-alpha and [O III] emission lines, augmented by a new (and very rich) set of integral-field spectroscopy measurements from the 8-meter Gemini North telescope, velocity maps from the Russian 6-meter telescope, and long-slit spectra from the 3-meter Shane telescope at Lick Observatory.
To examine the history of each AGN, our approach was that the AGN had to be at least bright enough to ionize the hydrogen we see glowing at each point at the time the light reaching that point was given off. Certainly we can’t expect each piece of the cloud to absorb all the deep-UV radiation, so this is a lower limit. Two external checks, on quasars unlikely to have faded greatly and on the Teacup AGN which has had detailed modeling done from spectra, suggest that the very brightest pixels at each radius absorb comparable fractions of the ionizing radiation. This gives confidence that we can track at least the behavior of a single object, underestimating its brightness by a single factor, if we look at the upper envelope of all pixels in the H-alpha images. We hoped this would be feasible all the way back to the original Hubble proposal to look at Hanny’s Voorwerp. Here is a graphic from the new paper comparing our AGN in this way. The distance in light-years at each point corresponds to the time delay between the AGN and cloud, and the curve labelled “Projection” shows how much one of these points would change if we view that location not perpendicular to the light but at angles up to 30 degrees each way. To be conservative, the plot shows the data corresponding to the bottom of this curve (minimum AGN luminosity at each point).
The common feature is the rapid brightness drop in the last 20,000 years for each (measured from the light now reaching us from the nuclei). Before that, most of them would not have stood out as having enough of an energy shortfall to enter our sample. Because of smearing due to the large size of the clouds, and the long time it takes for electrons to recombine with protons at such low densities, we would not necessarily see the signature of similar low states more than about 40,000 years back.
We could also improve another measure of the AGN history – the WISE satellite’s mid-infrared sky survey gave us a more accurate measure of these objects’ infrared output. That way, we can tell whether it is at least possible for the AGN to be bright enough to light up the gas, but so dust-blocked in our direction that we underestimate their brightness. The answer in most cases is “not at all”.
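For readers who like to see the geometry, here is a short Python sketch of how viewing angle changes the time delay for a cloud at a given projected distance from the nucleus. It uses the standard light-echo path difference (light goes from the nucleus to the cloud and then to us, versus straight from the nucleus to us) and is only an illustration; it is not taken from the paper, and the function name and the 20,000 light-year example are assumptions made for the sketch.

```python
import math

def light_echo_delay_years(projected_distance_ly, tilt_deg=0.0):
    """
    Extra light-travel time (years) for light that reaches us via the cloud.

    projected_distance_ly: separation of cloud and nucleus on the sky, in light-years.
    tilt_deg: angle of the cloud out of the plane of the sky, positive when the
              cloud is tilted towards us (0 means exactly in the plane of the sky).
    """
    tilt = math.radians(tilt_deg)
    true_distance = projected_distance_ly / math.cos(tilt)
    # Path difference: distance to the cloud minus the head start
    # the cloud has along our line of sight.
    return true_distance * (1.0 - math.sin(tilt))

in_plane = light_echo_delay_years(20_000, 0.0)
towards_us = light_echo_delay_years(20_000, 30.0)
away_from_us = light_echo_delay_years(20_000, -30.0)
print(f"{in_plane:.0f}, {towards_us:.0f}, {away_from_us:.0f} years")
```

In the plane of the sky the delay in years equals the projected distance in light-years; tilting the cloud 30 degrees towards or away from us shortens or lengthens the delay by a factor of about 1.7.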
New data brought additional surprises (these objects have been gifts that just keep on giving). The Gemini data were taken with fiber-optic arrays giving us a spectrum for each tiny area 0.2 arcseconds on a side (although limited to 3.5×5 arcsecond fields), taken under extraordinarily steady atmospheric conditions so we can resolve structures as small as 0.5 arcseconds. We use these results to see how the gas is ionized and moves; some loops of gas that earlier looked as if they were being blown out from the nuclei are mostly rotating instead. Unlike some well-studied, powerful AGN with giant emission clouds, the Voorwerpje clouds are mostly just orbiting the galaxies (generally as part of tidal tails), being ionized by the AGN radiation but not shoved around by AGN winds. This montage shows the core of NGC 5972 seen by these various instruments, hinting at the level of mapping allowed by the Gemini spectra (and helping explain why it took so long to finish the latest paper).
Work on the Voorwerpjes continues in many ways. Galaxy Zoo participants still find possible clouds (and the moderators have been excellent about making sure we see them). There is more to be learned from the Gemini data, while X-ray observatories are gradually bringing the current status of the AGN into sharper focus. A narrowband imaging survey from the ground can pick out fainter (and sometimes older) clouds. Colleagues with expertise in radio interferometry are addressing questions posed by the unexpected misalignments of optical and radio structures in some of our galaxies. Finally, the new DECaLS and Pan-STARRS survey data will eventually bring nearly the whole sky into our examination (for a huge range of projects, not just AGN history).
Once again, thanks to all who have helped us find and unravel these fascinating objects!
We submitted the Galaxy Zoo CANDELS paper in May. Now, after some discussion with a very helpful referee, the paper is accepted! I hope our volunteers are as thrilled as I was to get the news. It happened within days of the Galaxy Zoo: Hubble paper acceptance. Hurray!
If you’d like to read the paper, it’s publicly available as a pre-print now and will be published at some point soon in the Monthly Notices of the Royal Astronomical Society. The pre-print version is the accepted version, so it should only differ from the eventual published paper by a tiny bit (I’m sure the proof editor will catch some typos and so on).
The paper may be a little long for a casual read, so here’s an overview:
- We collected 2,149,206 classifications of 52,073 subjects, from 41,552 registered volunteers and 53,714 web browser sessions where the classifier didn’t log in. In the analysis we assumed each of those unique browser sessions was a separate volunteer.
- The raw consensus classifications are definitely useful, but we also weighted the classifications using a combination of “gold standard” data and consensus-based weighting. That is, classifiers were up- or down-weighted according to whether they could tell a galaxy apart from a star most of the time, and then the rest of the weighting proceeded in the same way it has for every other GZ dataset. No surprise: the majority of volunteers are excellent classifiers.
- 6% of the raw classifications were from 86 classifiers who both classified a lot and gave the same answer (usually “star or artifact”) at least 98% of the time, no matter what images they saw. We have some bots, but they’re quite easy to spot.
- Even with a pretty generous definition of what counts as “featured”, less than 15% of galaxies in the relatively young Universe that this data examines have clear signs of features. Most galaxies in the data set are relatively smooth and featureless.
- Galaxy Zoo compares well with visual classifications of the same galaxies done by members of the CANDELS team, despite the fact that the comparison is sometimes hard because the questions they asked weren’t the same as what we did. This is, of course, a classic problem when comparing data sets of any kind: to some extent it’s always apples-vs-oranges, and the devil is in the details.
- By combining Galaxy Zoo classifications with multi-wavelength light profile fitting — where we fit a 2D equation to the distribution of light in a galaxy, the properties of which correlate pretty well with whether a galaxy has a strong disk component — we’ve identified a population of likely disk-dominated galaxies that also completely lack the features that are common in disk galaxies in the nearby, more evolved Universe. These disks don’t have spiral arms, they don’t have bars, they don’t have clumps. They’re smooth, but they are disks, not ellipticals. They tend to be a bit more compact than disk galaxies that do have features, even though they’re at the same luminosities. They’re also hard to identify using color alone (which echoes what we’ve seen in past Galaxy Zoo studies of various different kinds of galaxies). You really need both kinds of morphological information to reliably find these.
- The data is available for download for those who would like to study it: data.galaxyzoo.org.
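The bot check mentioned in the overview (classifiers who classified a lot and gave the same answer at least 98% of the time) can be sketched in Python as below. This is an illustration rather than the code used for the paper: the paper summary does not say how many classifications count as “a lot”, so the `min_count` threshold is an assumption, and the sample data and names are invented.

```python
from collections import Counter, defaultdict

def flag_repetitive_classifiers(classifications, min_count=100, min_same_fraction=0.98):
    """
    classifications: iterable of (classifier_id, answer) pairs.
    Returns {classifier_id: (most_common_answer, fraction)} for classifiers with at
    least `min_count` classifications whose most common answer makes up at least
    `min_same_fraction` of them. `min_count` is an assumed threshold.
    """
    answers_by_classifier = defaultdict(list)
    for classifier_id, answer in classifications:
        answers_by_classifier[classifier_id].append(answer)

    flagged = {}
    for classifier_id, answers in answers_by_classifier.items():
        if len(answers) < min_count:
            continue
        top_answer, top_count = Counter(answers).most_common(1)[0]
        fraction = top_count / len(answers)
        if fraction >= min_same_fraction:
            flagged[classifier_id] = (top_answer, fraction)
    return flagged

# Invented example: one repetitive classifier, one varied one, one with few classifications.
sample = (
    [("bot_like", "star or artifact")] * 199 + [("bot_like", "smooth")]
    + [("careful", "smooth")] * 120 + [("careful", "features")] * 80
    + [("newcomer", "star or artifact")] * 5
)
flagged = flag_repetitive_classifiers(sample)
print(flagged)
```

Only the first classifier is flagged here: the second gives varied answers and the third has too few classifications to judge.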
With the data releases of Galaxy Zoo: Hubble and Galaxy Zoo CANDELS added to the existing Galaxy Zoo releases, your combined classifications of over a million galaxies near and far are now public. We’ve already done some science together with these classifications, but there’s so much more to do. Thanks again for enabling us to learn about the Universe. This wouldn’t have been possible without you.
I’m incredibly happy to report that the main paper for the Galaxy Zoo: Hubble project has just been accepted to the Monthly Notices of the Royal Astronomical Society! It’s been a long road for the project, but we’ve finally reached a major milestone. It’s due to the efforts of many, including the scientists who designed the interface and processed the initial images, the web developers who managed our technology and databases, more than 80,000 volunteers who spent time classifying galaxies and discussing them on the message boards, and the distributed GZ science team who have been steadily working on analyzing images, calibrating data, and writing the paper.
The preprint for the Galaxy Zoo: Hubble paper is available here. The release of GZH also syncs up with the publication of the Galaxy Zoo: CANDELS catalog, led by Brooke Simmons; she’ll have a blog post up later today, and the GZC paper is also available as a preprint.
Galaxy Zoo: Hubble began in 2010; it was the first work of GZ to move beyond the images taken with the Sloan Digital Sky Survey (SDSS). We were motivated by the need to study the evolution and formation of galaxies billions of years ago, in the early days of the Universe. While SDSS is an amazing telescope, it doesn’t have the sensitivity or resolution to make a quality image of a typical galaxy beyond a redshift of about z=0.4 (distances of a few billion parsecs). Instead, we used images from the Hubble Space Telescope, the flagship and workhorse telescope of NASA for the past two decades, and asked volunteers to help us classify the shapes of galaxies in several of Hubble’s largest and deepest surveys. After more than two years of work, the initial set of GZH classifications were finished in 2012 and the site moved on to other datasets, including CANDELS, UKIDSS, and Illustris.
So why has it taken several years to finish the analysis and publication of the data? The reduction of the GZH data ended up being more complicated and difficult than we’d originally anticipated. One key difference lies in our approach to a technique we call debiasing; this refers to a set of corrections made to the raw data supplied by the volunteers. There’s a known effect where galaxies that are less bright and/or further away will appear dimmer and/or smaller in the images which are being classified. This skews the data, making it appear that there are more elliptical/smooth galaxies than truly exist in the Universe. With SDSS images, we dealt with this by assuming that the nearest galaxies were reliably measured, and then deriving corrections which we applied to the rest of the sample.
In Galaxy Zoo: Hubble, we didn’t have that option available. The problem is that there are two separate effects in the data that affect morphological classification. The first is the debiasing issue just mentioned above; however, there’s also a genuine change in the populations of galaxies between, say, 6 billion years ago and the present day. Galaxies in the earlier epochs of the Universe were more likely to have clumpy substructures and less likely to have very well-settled spiral disks with features like bars. So if we just tried to correct for the debiasing effect based on local galaxies, we would have explicitly removed any of the real changes in the population over cosmic time. Since those trends are exactly what we want to study, we needed another approach.
Our solution ended up bringing in another set of data to serve as the calibration. Volunteers who have classified on the current version of the site may remember classifying the “FERENGI” sample. These were images of real galaxies that we processed with computer codes to make them look like they were at a variety of distances. The classifications for these images, which were completed in late 2013, gave us the solution to the first effect; we were able to model the relationship between distance to the galaxy and the likelihood of detecting features, and then applied a correction based on that relationship to the real GZH data.
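To make the calibration idea concrete, here is a minimal toy sketch. It assumes a straight-line relationship between redshift and the fraction of volunteers who report features, fitted to made-up calibration numbers. The real analysis in the paper is more involved; the function names, the reference redshift of 0.3 and all the values below are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def debias(f_observed, z, slope, z_ref=0.3):
    """Shift an observed vote fraction back to what the fitted trend
    suggests it would have been at the reference redshift z_ref."""
    corrected = f_observed - slope * (z - z_ref)
    # A vote fraction must stay between 0 and 1.
    return min(1.0, max(0.0, corrected))


# Made-up calibration data: the same galaxy artificially moved to larger
# redshifts, with the "features" vote fraction dropping as it gets fainter.
calib_z = [0.3, 0.5, 0.7, 0.9]
calib_f = [0.80, 0.70, 0.60, 0.50]

slope, intercept = fit_line(calib_z, calib_f)

# A hypothetical real galaxy at z = 0.9 with an observed fraction of 0.45.
corrected = debias(0.45, 0.9, slope)
print(f"slope = {slope:.2f}, corrected fraction = {corrected:.2f}")
```

Because the trend is measured on images where the galaxy itself has not changed, the correction only removes the effect of distance and leaves any genuine evolution in the population untouched.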
The new GZH data is similar in format and structure to the data release from GZ2. The main product is a very large data table (113,705 rows by 172 columns) that researchers can slice and dice to study specific groups of galaxies with morphological measurements. We’re also releasing data from several related image sets, including experiments on fading and swapping colors in images, the effect of bright active galactic nuclei (AGN), different exposure depths, and even a low-redshift set of SDSS Stripe 82 galaxies classified with the new decision tree. All of the data will be published in electronic tables along with the paper, and are also downloadable from data.galaxyzoo.org. Our reduction and analysis code is available as a public Github repository.
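As an illustration of what slicing the table might look like, here is a small sketch using only Python’s standard library. The column names (`galaxy_id`, `redshift`, `f_features`) and the four rows are invented for this example; the real table has 172 columns, and its documentation should be checked for the actual names.

```python
import csv
import io

# A tiny made-up stand-in for the real catalogue file.
SAMPLE = """galaxy_id,redshift,f_features
A,0.25,0.82
B,0.60,0.15
C,0.75,0.91
D,0.95,0.55
"""


def select_featured(rows, z_min, z_max, threshold):
    """Return IDs of galaxies in a redshift range whose 'features'
    vote fraction is at least the given threshold."""
    selected = []
    for row in rows:
        z = float(row["redshift"])
        f = float(row["f_features"])
        if z_min <= z < z_max and f >= threshold:
            selected.append(row["galaxy_id"])
    return selected


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
featured = select_featured(rows, z_min=0.5, z_max=1.0, threshold=0.5)
print(featured)
```

With a downloaded file, the `io.StringIO(SAMPLE)` call would be replaced by an open file handle; the selection logic stays the same.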
The science team has already published two papers based on preliminary Galaxy Zoo: Hubble data. This included a paper led by Edmond Cheung (UCSC/Kavli IPMU) that concluded that there is no evidence connecting galactic bars and AGN over a range of redshifts out to z = 1.0. Tom Melvin (U. Portsmouth) carefully examined the overall bar fraction in disks using COSMOS data, measuring a strong decrease in bar fraction going back to galaxies 7.8 billion years ago. We’re now excited to continue new research areas, including a project led by Melanie Galloway (U. Minnesota) on the evolution of red disk galaxies over cosmic time. We hope GZH will enable a lot more science very soon from both our team and external researchers, now that the data are publicly released.
A massive “thank you” again to everyone who’s helped with this project. Galaxy Zoo has made some amazing discoveries with your help in the past eight years, and now that two new unique sets of data are openly available, we’re looking forward to many more.
The Universe is pretty huge, and to understand it we need to collect vast amounts of data. The Hubble Telescope is just one of many telescopes collecting data from the Universe. Hubble alone produces 17.5 GB of raw science data each week. That means that since its launch into low Earth orbit in April 1990, it has collected a block of data roughly equivalent in size to 6 million mp3 songs! With the launch of NASA’s James Webb Space Telescope (a tennis-court-sized space telescope!) just around the corner, the amount of raw data we can collect from the Universe is going to escalate dramatically. In order to decipher what this data is telling us about the Universe, we need to use sophisticated statistical techniques. In this post I want to talk a bit about a particular technique I’ve been using, called a Markov Chain Monte Carlo (MCMC) simulation, to learn about galaxy evolution.
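As a rough check on the data-volume comparison above, the arithmetic can be sketched as follows. Two numbers here are assumptions rather than figures from the post: about 25 years of operation, and about 4 MB for a typical mp3 song.

```python
GB_PER_WEEK = 17.5      # quoted in the post
WEEKS_PER_YEAR = 52
YEARS_IN_ORBIT = 25     # assumption: launch in 1990 plus roughly 25 years
MB_PER_SONG = 4.0       # assumption: size of a typical mp3 song


def songs_equivalent(gb_per_week, years, mb_per_song):
    """Return (total data in GB, equivalent number of songs)."""
    total_gb = gb_per_week * WEEKS_PER_YEAR * years
    total_mb = total_gb * 1000  # decimal units: 1 GB = 1000 MB
    return total_gb, total_mb / mb_per_song


total_gb, n_songs = songs_equivalent(GB_PER_WEEK, YEARS_IN_ORBIT, MB_PER_SONG)
print(f"{total_gb:,.0f} GB is about {n_songs / 1e6:.1f} million songs")
```

Under these assumptions the total comes to a little under 23 TB, or somewhat below 6 million songs, which is consistent with the rounded figure quoted above.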
Before we dive into the statistics, let me try to explain what I’m trying to figure out. We can model galaxy evolution by looking at a galaxy’s star formation rate (SFR) over time. Basically, we want to know how fast a particular galaxy is making stars at any given time. Typically, a galaxy has an initial constant high SFR; then, at a time called t quench (tq), its SFR starts to decrease exponentially, a decline characterised by a number called tau. A small tau means the galaxy stops forming stars, or is quenched, more rapidly. So overall, for each galaxy we need to determine two numbers, tq and tau, to figure out how it evolved. Figure 1 shows what this model looks like.
Figure 1: Model of a single galaxy’s SFR over time, showing an initial high constant SFR, followed by an exponential quench at tq.
To calculate these two numbers, tq and tau, we look at the colour of the galaxy, specifically the UVJ colour I mentioned in my last post. We then compare this to a predicted colour of a galaxy for a specific value of tq and tau. The problem is that there are many different combinations of tq and tau, so how do we find the best match for a galaxy? We use an MCMC simulation to do this.
The first MC – Markov-Chain – just means an efficient random walk. We send “walkers” to have a look around for a good tq and tau, but the direction we send them to walk at each step depends on how good their current tq and tau are. The upshot of this is that we quickly home in on a good value of tq and tau. The second MC – Monte Carlo – just picks out random values of tq and tau and tests how good they are by comparing the UVJ colours and our SFR model. Figure 2 shows a gif of an MCMC simulation of a single galaxy. The histograms show the positions of the walkers searching the tq and tau space, and the blue crosshair shows the best fit value of tq and tau at every step. You can see the walkers homing in and settling down on the best value of tq and tau. I ran this simulation using a modified version of the starpy code.
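For readers who prefer code to plots, the model in Figure 1 can be written as a short function. This is a minimal sketch: the times are assumed to be in billions of years, the SFR is in arbitrary units, and the function and parameter names are my own for this illustration.

```python
import math


def star_formation_rate(t, tq, tau, initial_sfr=1.0):
    """Constant SFR until the quenching time tq, then exponential decline.

    t, tq and tau are assumed to share the same time unit (e.g. Gyr).
    """
    if t < tq:
        return initial_sfr
    return initial_sfr * math.exp(-(t - tq) / tau)


# Compare a fast quench (small tau) with a slow one (large tau),
# both starting at tq = 5.
for t in [2.0, 5.0, 6.0, 8.0]:
    fast = star_formation_rate(t, tq=5.0, tau=0.5)
    slow = star_formation_rate(t, tq=5.0, tau=2.0)
    print(f"t = {t:4.1f}  fast quench: {fast:.3f}  slow quench: {slow:.3f}")
```

Before tq both galaxies form stars at the same rate; afterwards the galaxy with the smaller tau shuts down much more quickly.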
Figure 2: MCMC simulation for a single galaxy, pictured in the top right corner. Main plot shows density of walkers. Marginal histograms show 1D projections of walker densities. Blue crosshair shows best fit values of tq and tau at each step.
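The random walk described above can be sketched in a few lines of Python. This is a toy version using the simplest MCMC recipe (the Metropolis algorithm) with a single walker; it is not the code used for Figure 2. The `toy_colours` function is an invented stand-in for a real colour prediction, and the “observed” colours, the uncertainty and the allowed ranges for tq and tau are all made up for this example.

```python
import math
import random


def toy_colours(tq, tau):
    """Invented stand-in for predicted colours; NOT a real stellar model."""
    return (0.2 * tq + 0.1 * tau, 0.3 * tq - 0.2 * tau)


def log_posterior(tq, tau, observed, sigma):
    """Flat prior inside an assumed allowed range, Gaussian likelihood."""
    if not (0.0 < tq < 13.8 and 0.0 < tau < 4.0):
        return -math.inf
    predicted = toy_colours(tq, tau)
    chi2 = sum(((p - o) / sigma) ** 2 for p, o in zip(predicted, observed))
    return -0.5 * chi2


def run_walker(observed, sigma, start, n_steps, step_size, rng):
    """Metropolis random walk: propose a nearby point, always accept it if
    it is better, and sometimes accept it even if it is worse."""
    tq, tau = start
    current = log_posterior(tq, tau, observed, sigma)
    chain = []
    for _ in range(n_steps):
        new_tq = tq + rng.gauss(0.0, step_size)
        new_tau = tau + rng.gauss(0.0, step_size)
        proposed = log_posterior(new_tq, new_tau, observed, sigma)
        if proposed >= current or rng.random() < math.exp(proposed - current):
            tq, tau, current = new_tq, new_tau, proposed
        chain.append((tq, tau))
    return chain


# Pretend the galaxy truly has tq = 8 and tau = 1.5, then try to recover them.
observed = toy_colours(8.0, 1.5)
rng = random.Random(42)
chain = run_walker(observed, sigma=0.05, start=(5.0, 2.0),
                   n_steps=20000, step_size=0.1, rng=rng)

# Discard the early steps, where the walker was still finding its way.
kept = chain[5000:]
mean_tq = sum(p[0] for p in kept) / len(kept)
mean_tau = sum(p[1] for p in kept) / len(kept)
print(f"tq is about {mean_tq:.2f}, tau is about {mean_tau:.2f}")
```

The walker starts far from the right answer, drifts towards it, and then wanders around it; the spread of the kept positions plays the role of the histograms in Figure 2.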
The maths that underpins this simulation is called Bayesian Statistics, and it’s quite a novel way of thinking about parameters and data. The main difference is that instead of treating unknown parameters as fixed quantities with associated error, they are treated as random variables described by probability distributions. It’s quite a powerful way of looking at the Universe! I’ve left all of the gory maths detail about MCMC out, but if you’re interested, an article by a DPhil student here at Oxford does a really good job of explaining it here.
So how does this all relate to galaxy morphology and Galaxy Zoo classifications? I’m currently running the MCMC simulation shown in Figure 2 over all the galaxies in the COSMOS survey. This is really cool because, apart from getting to play with the University of Oxford’s supercomputer (544 cores!), I can use Galaxy Zoo morphology to see if the SFR of a galaxy over time is dependent on the galaxy’s shape, and overall learn what the vast amount of data I have says about galaxy evolution.
It’s been a good amount of time since the Galaxy Zoo: Hubble and Galaxy Zoo: CANDELS projects were finished, tackling more than 200,000 combined galaxies thanks to the efforts of our volunteers. While we’ve had a couple of science papers based on the early results (Melvin et al. 2014, Simmons et al. 2014, Cheung et al. 2015), a full release of the data and catalog has taken slightly longer. However, we’ve been working hard, testing the data, and developing some new analysis methods on both image sets. This month has been really exciting, and we now have drafts for both papers that are just about finished. Once they’ve been accepted to the journals (and revised, if necessary), we’ll have some much longer posts discussing the results, and of course attaching the papers themselves. Hopefully that’ll be quite soon.
As a small teaser, here’s a little movie I just made of the Galaxy Zoo: Hubble paper as it went through the various drafts by different members of the science team. If only all paper writing were this easy … 😉
Once upon a time, there was an experimental project called Galaxy Zoo: Mergers. It used ancient, mystical technology to allow volunteers to run simulations of merging galaxies on their computers, and to compare the results of many such simulations. Their mission: to find matches to more than fifty nearby mergers selected from Galaxy Zoo data.
Amongst the chosen galaxies were not just run-of-the-mill, everyday mergers, but also the various oddities that the volunteers found, such as the Penguin galaxy. The team led volunteers through a series of tournaments designed to pit potential solutions for a particular galaxy against each other. In total, more than 3 million simulations were reviewed, producing the results described in the paper, now accepted by the journal MNRAS, and in the dataset visible at the main Galaxy Zoo data repository. This represents a huge amount of effort, and a speeding up of the process – in the paper, we note that previous fits to mergers have taken months of effort to complete.
Which is not to say the analysis, led by Anthony Holincheck and John Wallin, has been easy. In a recent email to the Galaxy Zoo team, John commented:
This is by far the most complex project I have ever worked on. Most papers that model interacting galaxies contain one or two systems where the author uses a few dozen simulations. We just published a paper that modeled 62 different systems using a brand new modeling technique where the 3 million simulation results were reviewed by citizen scientists. Best of all, the 62 models were done using the same code and the same coordinate system so others can reproduce them. Doing this with other published simulations is nearly impossible.
I know an immense amount of effort went into making sure that the results weren’t wasted, and the paper thus represents a happy ending to a tale that’s been running a long time. But it is not really an end; we are already planning to observe some of these galaxies as part of surveys like MaNGA that can measure the way that the galaxies’ components are moving today, allowing us to test these models. We also hope a library of models might be useful for other astronomers, and will be looking to try and revive this kind of project.
Read more about Galaxy Zoo: Mergers in this old blog post blog.galaxyzoo.org/2012/03/27/the-finale-of-merger-zoo.
This post was written as a contribution by Timothy Friel, an undergraduate Australian National University student studying Theoretical Physics and Science Communication. Tim is conducting research into citizen science projects and their social media communication strategies.
Meet two of our fantastic Zooniverse members who have been recognised as co-authors on a submitted RGZ paper.
In March 2016, the Radio Galaxy Zoo (RGZ) team submitted a paper which is co-authored by two of our SuperRGZooites. Thanks to the help of citizens around the world, over 1.6 million classifications have been made. However, a very special thanks must go to two citizens who have been greatly involved in our most recently submitted paper.
Meet Ivan Terentev and Tim Matorny, our Citizen Science co-authors.
How did you discover Radio Galaxy Zoo and become involved?
Tim: I had a passion for research and to be involved with generating new knowledge. So I began to look and met [the world of] citizen science and tried many different projects. I was already familiar with the Zooniverse, when I got email about new project – RGZ.
Ivan: I became involved in RGZ from its beginning, more or less, in December 2013, and at that time I was part of the Zooniverse for two years. I was mostly contributing to the Planet Hunters project back then, but occasionally I switched to different projects just to look for what they have to offer. And it was during one of these “Let’s try something different” moments that I discovered RGZ through the announcement post in the Galaxy Zoo blog.
What parts kept you interested and motivated to stay a part of this project?
Tim: The team of scientists and their active participation is an important part. Their blog posts, comments and links have helped me to learn about the project and my involvement with the goals.
Looking for host radio lobes which are separated by a 10′ [minutes] or looking at the behaviour of jets in galaxies clusters is really exciting for me. I like that RGZ covers a wide range of data: radio, optics, IR, X-ray.
Ivan: If we are talking specifically about RGZ, it would be the RGZ Talk community and the fact that RGZ Science team is eager to communicate with simple volunteers and involve them in the research process. But a large portion of my motivation [for RGZ] is the same as for the rest of the Zooniverse projects. You see, I am sci-fi fan and it made me interested in space exploration. I like to watch documentaries about the astronomers, their work and all the amazing stuff in the universe around us and through the Zooniverse I can actually be involved in the process of science and help to shape the future, even if it just by a very tiny fraction. I never thought that something like this would be possible before I discovered Zooniverse.
How do you feel about being a co-author of a scientific research paper?
Tim: I am still amazed and feel more motivated to look for stunning new radio galaxies.
Ivan: This isn’t the first time actually, I am also a co-author for three papers from the Planet Hunters, BUT it is always awesome, like every single time! Although, I keep my head cool over that since most of the work was done by the professional scientists. A huge thanks to them for the acknowledgment of my small contribution in the form of inviting me to be a co-author in their paper. With this RGZ paper, I got a chance to see the whole process of science starting from the simple question “What is that?” and then people trying to figure out what is going on, schedule observations, discussing things and I have been a part of it! All the way through the process, ending with the actual published science article. It was an amazing experience!
Without the contributions made by our volunteers all over the world, we would not have been so successful in our endeavours.
However, we have only reached 57% of our classification target. Head to www.bit.ly/RadioGalaxyZoo1 to become involved and you could be co-authoring another great discovery with us!