Explore Galaxy Zoo Classifications

This post (and visualization) is by Coleman Krawczyk, a Zooniverse Data Scientist at the ICG at the University of Portsmouth

Today we’ve added another new tool for visualizing Galaxy Zoo, this time showing the full vote path of all users for each galaxy from GZ2 onward.  The first node of the visualization shows an image of the galaxy, and each of the other nodes represents an answer to a question from the Galaxy Zoo decision tree; the size of each node is proportional to the number of votes for that answer.  The maximal vote path is highlighted and also shown in words across the top of the tree, and the results of the “Is there anything odd?” question are shown across the bottom.
The full Galaxy Zoo catalog can be searched via Zooniverse ID (the same one used for Talk), RA and Dec, or randomly.  After picking a galaxy, the nodes can be moved around by clicking and dragging, and the links can be collapsed or expanded by clicking the attached nodes; both functions are useful for untangling complex trees.  Various properties of the visualization can also be controlled with the sliders below the tree.  For a guided tour of this tool click the “Take a tour” button, and for a full list of features click the “Help” button.
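
For readers curious how a maximal vote path like the one described above could be computed, here is a minimal sketch. It is not the tool's actual code: the nested-dictionary layout, the answer labels and the vote counts are all made up for illustration.

```python
# Hypothetical vote tree for one galaxy: every node is an answer with its
# vote count, and its children are the answers to the follow-up question.
tree = {
    "answer": "galaxy image", "votes": 40,
    "children": [
        {"answer": "smooth", "votes": 12, "children": [
            {"answer": "completely round", "votes": 5, "children": []},
            {"answer": "in between", "votes": 7, "children": []},
        ]},
        {"answer": "features or disk", "votes": 26, "children": [
            {"answer": "edge-on", "votes": 4, "children": []},
            {"answer": "not edge-on", "votes": 22, "children": []},
        ]},
        {"answer": "star or artifact", "votes": 2, "children": []},
    ],
}


def max_vote_path(node):
    """Follow the most-voted answer at each level and return the labels."""
    path = []
    while node["children"]:
        node = max(node["children"], key=lambda child: child["votes"])
        path.append(node["answer"])
    return path


print(" -> ".join(max_vote_path(tree)))
```

The same vote counts could also drive the node sizes, since the post says each node is drawn in proportion to its votes.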
Screenshot of the Visualisation Tool

Visualizing the decision trees for Galaxy Zoo

This post (and visualization) is by Coleman Krawczyk, a Zooniverse Data Scientist at the ICG at the University of Portsmouth

Today we’ve added a new tool that visualizes the full decision tree for every Galaxy Zoo project from GZ2 onward (GZ1 only asked users one question, and would make for a boring visualization).  Each tree shows all the possible paths Galaxy Zoo users can take when classifying a galaxy.  Each “task” is color-coded by the minimum number of branches in the tree a classifier needs to take in order to reach that question.  In other words, it indicates how deeply buried in the tree a particular question is, a property that is helpful when scientists are analyzing the classifications.
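
The colour coding described above (the minimum number of branches needed to reach a question) amounts to a breadth-first search from the first question. The sketch below assumes a heavily simplified, hypothetical tree; the task names and links are illustrative and are not the real GZ2 tree.

```python
from collections import deque

# Hypothetical, simplified decision tree: each task maps to the tasks that
# at least one of its answers can lead to.
tree = {
    "smooth_or_features": ["roundness", "edge_on"],
    "roundness": ["odd"],
    "edge_on": ["bulge_shape", "bar"],
    "bulge_shape": ["odd"],
    "bar": ["spiral"],
    "spiral": ["odd"],
    "odd": [],
}


def task_depths(tree, root):
    """Return the minimum number of branches needed to reach each task."""
    depths = {root: 0}
    queue = deque([root])
    while queue:
        task = queue.popleft()
        for next_task in tree[task]:
            # Breadth-first order means the first visit is the shortest route.
            if next_task not in depths:
                depths[next_task] = depths[task] + 1
                queue.append(next_task)
    return depths


depths = task_depths(tree, "smooth_or_features")
for task, depth in depths.items():
    print(f"{task}: depth {depth}")
```

A task that can be reached by several routes, such as the hypothetical "odd" task here, gets the depth of its shortest route.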

Galaxy Zoo has used two basic templates for its decision trees.  The first template allows users to classify galaxies into smooth, edge-on disks, or face-on disks (with bars and/or spiral arms); it was used for Galaxy Zoo 2 and the infrared UKIDSS images, and is currently being used for the SDSS data that is live on the site. The second template was designed for high-redshift galaxies, and allows users to classify galaxies into smooth, clumpy, edge-on disks, or face-on disks. This template was used for Galaxy Zoo: Hubble (GZ3) and FERENGI (artificially redshifted images of galaxies), and is currently being used for the CANDELS and GOODS images in GZ4.  Although these final three projects ask the same basic questions, there are some subtle differences between them in the questions we ask about the bulge dominance, “odd” features, mergers, spiral arms, and/or clumps.

Visualization of the decision tree for Galaxy Zoo 2 (GZ2), by C. Krawczyk. Colors indicate the depth of a particular question within the decision tree.

If you ever wanted to know all the questions Galaxy Zoo could possibly ask you, head on over to the new visualization and have a look!

New paper: Galaxy Zoo and machine learning

I’m really happy to announce that a new paper based on Galaxy Zoo data has just been accepted for publication. This one is different from many of our previous works; it focuses on the science of machine learning, and how we’re improving the ability of computers to identify galaxy morphologies after being trained on the classifications you’ve provided in Galaxy Zoo. This paper was led by Sander Dieleman, a PhD student at Ghent University in Belgium.

This work was begun in early 2014, when we ran an online competition through the Kaggle data platform called “The Galaxy Challenge”. The premise was fairly simple – we used the classifications provided by citizen scientists for the Galaxy Zoo 2 project and challenged computer scientists to write an algorithm to match those classifications as closely as possible. We provided about 75,000 anonymized images + classifications as a training set for participants, and kept the same amount of data secret; solutions submitted by competitors were tested on this set. More than 300 teams participated, and we awarded prizes to the top three scores. You can see more details on the competition site.

Since completing the competition, Sander has been working on writing up his solution as an academic paper, which has just been accepted to Monthly Notices of the Royal Astronomical Society (MNRAS). The method he’s developed relies on a technique known as a neural network; these are sets of algorithms (or statistical models) in which the parameters being fit can change as they learn, and can model “non-linear” relationships between the inputs. The name and design of many neural networks are inspired by similarities to the way that neurons function in the brain.

One of the innovative techniques in Sander’s work has been to use a model that makes use of the symmetry in the galaxy images. Consider the pictures of the same galaxy below:

A galaxy from GZ2, shown both with no rotation (left) and rotated by 45 degrees (right).

From the classifications in GZ, we’d expect the answers for these two images to be identical; it’s the same galaxy, after all, no matter which way we look at it. For a computer program, however, these images would need to be separately analyzed and classified. Sander’s work exploits this in two ways:

  1. The size of the training data can be dramatically increased by including multiple, rotated versions of the different images. More training data typically results in a better-performing algorithm.
  2. Since the morphological classification for the two galaxies should be the same, we can apply the same feature detectors to the rotated images and thus share parameters in the model. This makes the model more general and improves the overall performance.
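
The two ideas in the list above can be sketched with NumPy. This is only an illustration and not the paper's implementation: it uses 90-degree rotations (`np.rot90`) because they need no interpolation, the function names are made up, and `feature_fn` stands in for whatever shared feature detector a real model would learn.

```python
import numpy as np


def augment_with_rotations(images, labels):
    """Return every image in four orientations with its labels repeated.

    images: array of shape (n, height, width); labels: array of shape (n, k).
    """
    rotated = [np.rot90(images, k=k, axes=(1, 2)) for k in range(4)]
    return np.concatenate(rotated), np.tile(labels, (4, 1))


def rotation_averaged_features(image, feature_fn):
    """Apply the same feature function to four rotated views and average."""
    views = [np.rot90(image, k=k) for k in range(4)]
    return np.mean([feature_fn(view) for view in views], axis=0)


# Toy data: two 3x3 "images" with two vote fractions each (made-up values).
images = np.arange(18, dtype=float).reshape(2, 3, 3)
labels = np.array([[0.9, 0.1], [0.2, 0.8]])
aug_images, aug_labels = augment_with_rotations(images, labels)
print(aug_images.shape, aug_labels.shape)
```

Because the four views of an image and of its rotated copy are the same set, the averaged features do not change when the input is rotated by a multiple of 90 degrees.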

Once all of the training data is in, Sander’s model takes images and can provide very precise classifications of morphology. I think one of the neatest visualizations is this one: galaxies along the top vs bottom rows are considered “most dis-similar” by the maps in the model. You can see that it’s doing well by, for example, grouping all the loose spiral galaxies together and predicting that these are a distinct class from edge-on spirals.

From Figure 13 in Dieleman et al. (2015). Example sets of images that are maximally distinct in the prediction model. The top row consists of loose winding spirals, while the bottom row are edge-on disks.

For more details on Sander’s work, he has an excellent blog post on his own site that goes into many of the details, a lot of which is accessible even to a non-expert.

While there are a lot of applications for these sorts of algorithms, we’re particularly interested in how this will help us select future datasets for Galaxy Zoo and similar projects. For future surveys like LSST, which will contain many millions of images, we want to efficiently select the images where citizen scientists can contribute the most – either for their unusualness or for the possibility of more serendipitous discoveries. Your data are what make innovations like this possible, and we’re looking forward to seeing how these can be applied to new scientific problems.

Paper: Dieleman, Willett, & Dambre (2015). “Rotation-invariant convolutional neural networks for galaxy morphology prediction”, MNRAS, accepted.

New Images on Galaxy Zoo, Part 1

We’re delighted to announce that we have some new images on Galaxy Zoo for you to classify! There are two sets of new images:

1. Galaxies from the CANDELS survey

2. Galaxies from the GOODS survey

The general look of these images should be quite familiar to our regular classifiers, and we’ve already described them in many previous posts (examples: here, here, and here), so they may not need too much explanation. The only difference for these new images is their sensitivity: the GOODS images are made from more HST orbits and are deeper, so you should be able to better see details in a larger number of galaxies compared to the earlier, shallower HST images.

Comparison of the different sets of images from the GOODS survey taken with the Hubble Space Telescope. The left shows shallower images from GZH with only 2 sets of exposures; the right shows the new, deeper images with 5 sets of exposures now being classified.

The new CANDELS images, however, are slightly shallower than before. The main reason that these are being included is to help us get data measuring the effect of brightness and imaging depth for your crowdsourced classifications. While they aren’t always as visually stunning as nearby SDSS or HST images, getting accurate data is really crucial for the science we want to do on high-redshift objects, and so we hope you’ll give the new images your best efforts.

Images from the CANDELS survey with the Hubble Space Telescope. Left: deeper 5-epoch images already classified in GZ. Right: the shallower 2-epoch images now being classified.

Both of these datasets are relatively small compared to the full Sloan Digital Sky Survey (SDSS) and Hubble Space Telescope (HST) sets that users have helped us with over the last several years. With about 13,000 total images, we hope that they can be finished by the Galaxy Zoo community within a couple of months. We already have more sets of data prepared for as soon as these finish – stay tuned for Part 2 coming up shortly!

As always, thanks to everyone for their help – please ask the scientists or moderators here or on Talk if you have any questions!

What’s all the fuss about bars in galaxies?

Since our discovery in 2010 that the red spirals identified by your classifications in the first phase of Galaxy Zoo were twice as likely to host galactic scale bars as normal blue spirals, a lot of our research time has focused on understanding which types of galaxies host bars, and why that might be.

Barred spiral, NGC 1300, observed with the Hubble Space Telescope.

Our research with the bars identified by you in the second phase of Galaxy Zoo continues to give us hints that these structures might be involved in the process which quenches star formation in spiral galaxies, and through that could be part of the process behind the reduction of star formation in the universe as a whole.

We’ve also used your classifications as part of Galaxy Zoo Hubble and Galaxy Zoo CANDELS to identify the epoch in the universe when disc galaxies were first stable enough to host a significant number of bars, finding them possibly even earlier in the Universe than was previously thought.

Last Friday I spoke at the monthly “Ordinary Meeting” of the Royal Astronomical Society, giving a summary of the evidence we’re collecting on the impact bars have on galaxies thanks to your classifications (a video of my talk will be available at some point). This was the second time I’ve spoken at this meeting about results from Galaxy Zoo, and it’s a delightful mix of professional colleagues and enthusiastic amateurs – including some Galaxy Zoo volunteers.

Prompted by that I thought it was timely to write on this blog about what these bars really are, what they do to galaxies, and why I think they’re so interesting. I wrote the below some time ago when I had a spare few minutes, and was just looking for the right time to post it.

The thing about galaxies, which is sometimes hard to remember, is that they are simply vast collections of stars, and that those stars are all constantly in motion, orbiting their common centre of mass. The structures that we see in galaxies are just a snapshot of the locations of those stars right now (on a cosmic timescale), and the patterns we see in the positions of the stars reveal patterns in their orbital motions. A stellar bar, for example, reveals a set of very elongated orbits of stars in the disc of a galaxy.

Another extraordinary thing about a disc galaxy is how thin it is. To put this in perspective, I’ll give you a real-world example. In the Haus der Astronomie in Heidelberg you can walk around inside a scale model of the Whirlpool galaxy. The whole building was laid out in a design which reflects the spiral arms of this galaxy. However, it’s not an exact scale model – to properly represent the thickness of the disc of the Whirlpool galaxy, the building (which in actual fact has 3 stories and hosts a fairly large planetarium in its centre) would have to be only 90 cm tall…

The Haus der Astronomie, a building laid out like a spiral galaxy.

Such an incredibly thin disc of stars floating independently in space would be quite unstable dynamically (meaning its own gravity should cause it to buckle and collapse on itself). This instability would immediately manifest in elongated orbits of stars, which would make a stellar bar (as part of this process of collapse). Simple computer models of disks of stars immediately form bars. Of course we now know that galaxy discs are submerged in massive halos of dark matter. So my first favourite little fact about bars is

(1) the fact that not all disc galaxies have bars was put forward as evidence that the discs must be embedded in massive halos before the existence of dark matter was widely accepted.

Now that we can model dark matter halos better, we find that even with a dark matter halo, as long as that halo can absorb angular momentum (i.e. rotate a bit), all discs will eventually make a bar. So my second favourite little fact is that

(2) we still don’t understand why not all disc galaxies have bars.

M101 - an unbarred spiral galaxy (Credit: ESA/NASA HST).

What this second fact means is that perhaps what I should really be doing is studying the galaxies you have identified as not having bars, to figure out why it is they haven’t been able to form a bar yet. It should really be the properties of these which are unexpected… We find that this is more likely to happen in blue, intermediate-mass spirals with a significant reservoir of atomic hydrogen (the raw material for future star formation). In fact this last thing may be the most significant. Including realistic interstellar gas in computer simulations of galaxies is very difficult, but people do run what are called “smoothed particle hydrodynamics” simulations (basically making “particles” of gas and inserting the appropriate properties). If they add too much gas into these simulations they find that bar formation is either very delayed, or doesn’t happen in the time of the simulation…

Anyway, I hope this has given you a flavour of what I find interesting about bars in galaxies. I think it’s fascinating that they give us a morphological way to identify a process which is so dynamical in nature. And it’s a very complex process, even though the basic physics (just orbits of stars) is very simple and well understood. Finally, I have become convinced, through tests of the bars identified by you in Galaxy Zoo compared to bars identified by other methods, that if you want a clean sample of very large bars in galaxies, multiple independent human eyes will give you the best result. You are much less easy to trick than automated methods for finding galactic bars.

So thanks again for the classifications, and keep clicking. :)

Here’s a link to all blog posts tagged with “bars”.

Hubble science results on Voorwerpjes – episode 1

After two rounds of comments and questions from the journal referee, the first paper discussing the detailed results of the Hubble observations of the giant ionized clouds we’ve come to call Voorwerpjes has been accepted for publication in the Astronomical Journal. (In the meantime, the final accepted version is freely accessible at http://arxiv.org/abs/1408.5159.) We pretty much always complain about the refereeing process, but this time the referee did prod us into putting a couple of broad statements on much more quantitatively supported bases. Trying to be complete on the properties of the host galaxies of these nuclei and on the origin of the ionized gas, the paper runs to about 35 pages, so I’ll just hit some main points here.

Montage of Hubble images of Voorwerpjes

These are all in interacting galaxies, including merger remnants. This holds as well for possibly all the “parent” sample including AGN which are clearly powerful enough to light up the surrounding gas. Signs include tidal tails of stars as well as gas, and dust lanes which are chaotic and twisted. These twists can be modeled on the assumption that they started in the orbital plane of a former (now assimilated) companion galaxy, which gives merger ages around 1.5 billion years for the two galaxies where there are large enough dust lanes to use this approach. In 6 of 8 galaxies we studied, the central bulge is dominant – one is an S0 with large bulge, and only one is a mostly normal barred spiral (with a tidal tail).

Numerical model of precessing disk of gas from a disrupted companion of NGC 5972

Incorporating spectroscopic information on both internal Doppler shifts and chemical makeup of the gas, we can start to distinguish smaller areas affected by outflow from the active nuclei and the larger surrounding regions where the gas is in orderly orbits around the galaxies (as in tidal tails). We have especially powerful synergy by adding complete velocity maps made by Alexei Moiseev using the 6-meter Russian telescope (BTA). In undisturbed tidal tails, the abundances of heavy elements are typically half or less of what we see in the Sun, while in material transported outward from the nuclei, these fractions may be above the solar reference level. There is a broad match between disturbed motions indicating outward flows and heavy-element fractions. (By “transported” above, I meant “blasted outwards at hundreds of kilometers per second”.)

Seeing only a minor role for these outflows puts our sample in contrast to the extended gas around some quasars with strong radio sources, which is dominated by gas blasted out at thousands of kilometers per second. We’re seeing either a different process or a different stage in its development (one which we pretty much didn’t know about before following up this set of Galaxy Zoo finds).

We looked for evidence of recent star formation in these galaxies, using both the emission-line data to look for H-alpha emission from such regions and seeking bright star clusters. Unlike Hanny’s Voorwerp, we see only the most marginal evidence that these galaxies in general trigger starbirth with their outflows.

Sometimes the Universe plays tricks. One detail we learned from our new spectra and the mid-infrared data from NASA’s WISE survey satellite is that giant Voorwerpje UGC 7342 has been photobombed. A galaxy that originally looked as if it might be an interacting companion is in fact a background starburst galaxy, whose infrared emission was blended with that from the AGN in longer-wavelength IR data. So that means the “real” second galaxy has already merged, and the AGN luminosity has dropped more than we first thought. (The background galaxy has in the meantime also been observed by SDSS, and can be found in DR12.)

BTA Doppler maps of Voorwerpjes

Now we’re on to polishing the next paper analyzing this rich data set, moving on to what some colleagues find more interesting – what the gas properties are telling us about the last 100,000 years of history of these nuclei, and how their radiation correlates (or indeed anti-correlates) with material being blasted outward into the galaxy from the nucleus. Once again, stay tuned!

Radio Galaxy Zoo searches for Hybrid Morphology Radio galaxies (HyMoRS): #hybrid

The first science paper on hybrid morphology radio galaxies found through the Radio Galaxy Zoo project has now been submitted!

In the paper we have revised the definition of the hybrid morphology radio galaxy (HyMoRS or hybrids) class. In general, HyMoRS show different Fanaroff-Riley radio morphology on either side of the active nucleus, that is FRI type on one side and FRII on the other side of their infrared host galaxy. But we found that this wasn’t very precise, and set up a clear definition of these sources, which is:

“To minimise the misclassification of HyMoRS, we attempt to tighten the original morphological classification of radio galaxies in the scope of detailed observational and analytical/numerical studies undertaken in the past 30 years. We consider a radio source to be a HyMoRS only if

(i) it has a well-defined hotspot on one side and a clear FR I type jet on the other, though we note the hotspots may ‘flicker’, that is their brightness may be rapidly variable (Saxton et al. 2002), and, in the case the radio source has a very prominent core or is highly asymmetric,

(ii) its core prominence does not suggest strong relativistic beaming nor its asymmetric radio structure can be explained by differential light travel time effects.”

Based on this we revised hybrids reported in the scientific literature and found 18 objects that satisfy our criteria. With Radio Galaxy Zoo during the first year of its operation, through our fantastic RadioTalk, you guys have now nearly doubled this number by finding another 14 hybrids, which we now confirm! Two examples from the paper are below:
hybrid_blogfig2

We also looked at the mid-infrared colours of hybrids’ hosts. As explained by Ivy in our last RGZ blog post (http://blog.galaxyzoo.org/2015/03/02/first-radio-galaxy-zoo-paper-has-been-submitted/), the mid-infrared colour space is defined by the WISE filter bands: W1, W2 and W3, corresponding to 3.4, 4.6 and 12 microns, respectively.

The results are below:

hybrid_blogfig3

For those of you interested in seeing the full paper, we will post a link to a freely accessible copy once the paper is accepted by the journal and is in press! :)

Fantastic job everyone!
Anna & the RGZ science team

First Radio Galaxy Zoo paper has been submitted!

The project description and early science paper (results from Year 1) for the Radio Galaxy Zoo project has been submitted!

We find that the RGZ citizen scientists are as effective as the science experts at identifying the radio sources and their host galaxies.

Based upon our results from 1 year of operation, we find the RGZ host galaxies reside in 3 primary loci of mid-infrared colour space.  The mid-infrared colour space is defined by the WISE filter bands: W1, W2 and W3, corresponding to 3.4, 4.6 and 12 microns, respectively.
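
As a small illustration of what the colour space above means in practice: a WISE colour is simply the difference between the magnitudes measured in two bands. The sketch below uses made-up magnitudes and a hypothetical function name; it is not taken from the paper's analysis.

```python
import numpy as np


def wise_colours(w1, w2, w3):
    """Return the (W1 - W2, W2 - W3) colours from WISE magnitudes."""
    w1, w2, w3 = (np.asarray(band, dtype=float) for band in (w1, w2, w3))
    return w1 - w2, w2 - w3


# Made-up magnitudes for three hypothetical host galaxies.
w1 = [15.0, 14.2, 13.8]
w2 = [14.5, 14.1, 12.9]
w3 = [11.0, 12.6, 9.9]

w1_w2, w2_w3 = wise_colours(w1, w2, w3)
for a, b in zip(w1_w2, w2_w3):
    print(f"W1-W2 = {a:.2f}, W2-W3 = {b:.2f}")
```

Because magnitudes get larger as sources get fainter, a larger (W2-W3) value means the source is relatively brighter at 12 microns, i.e. redder.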

Approximately 10% of the RGZ sample resides in the mid-IR colour space dominated by elliptical galaxies, which have older stellar populations and are less dusty, hence resulting in bluer (W2-W3) colours. The 2nd locus (where ~15% of RGZ sources are found) lies in the colour space known as the ‘AGN wedge’, typically associated with X-ray-bright QSOs and Seyferts. And lastly, the largest concentration of RGZ host galaxies (~30%) can be found in the 3rd locus usually associated with luminous infrared galaxies (LIRGs).  It should be noted that only a small fraction of LIRGs are associated with late-stage mergers.  The remainder of the RGZ host population is distributed along the loci of both star-forming and active galaxies, indicative of radio emission from star-forming galaxies and/or dusty elliptical (non-star-forming) galaxies. See the figure below for a plot of these results.

WISE colour-colour diagram, showing sources from the WISE all-sky catalog (colourmap), 33,127 sources from the 75% RGZ catalog (black contours), and powerful radio galaxies (green points) from Gürkan et al. (2014). The wedge used to identify IR colours of X-ray-bright AGN from Lacy et al. (2004) & Mateos et al. (2012) is overplotted (red dashes). Only 10% of the WISE all-sky sources have colours in the X-ray bright AGN wedge; this is contrasted with 40% of RGZ and 49% of the Gürkan et al. (2014) radio galaxies. The remaining RGZ sources have WISE colours consistent with distinct populations of elliptical galaxies and LIRGs, with smaller numbers of spiral galaxies and starbursts.

In addition, we will also be submitting our paper on Hybrid Morphology Radio Sources (HyMoRS) in the next few days so stay tuned!

As always, thank you all very much for all your help and support and keep up the awesome work!

Cheers,
Julie, Ivy & the RGZ science team

Zooniverse at Mauna Kea, Day 6: This is the End

Ed, Chris, Sandor, and Becky in front of the telescope

Part 1, Part 2, Part 3, Part 4, Part 5

I’m not sure if we’ve been especially unlucky or if this is the norm for observing trips, but once again the weather is curtailing our telescope time. After a few hours of normal observing, clouds started to blow across the top of Mauna Kea, and now it’s raining outside the dome.

Tonight's Weather

The dip in the humidity (2nd from the top) represents when we were able to observe.

In the meantime, you can check out a short video tour of the dome that Becky and I shot a couple of days ago:

Tomorrow, we check out of Hale Pohaku and head down to Hilo for a night. Then I’m off to Chicago and Becky and Sandor are back to Oxford. Even with the bad weather, sleep deprivation, and static electricity, this trip has been a really great experience for me. I now know infinitely more about radio astronomy than I did before! I hope the people doing the real work were able to get all the data they needed.

A Few Notes:

Sad Becky

This sums up the general mood

  • Sandor and Becky took some sick photos around sunset, you should check out all of them.
  • When everything is terrible, you just have to let it go.
  • Thanks again to all the Galaxy Zoo volunteers, whose work made this observing trip possible for us. You are the best.

Zooniverse at Mauna Kea, Day 5: The Wind Strikes Back

windy

Part 1, Part 2, Part 3, Part 4

After a few good days of observations the wind has returned to ruin our fun. The CSO telescope is supposed to be closed when the wind is above 35mph. Curiously the telescope itself doesn’t have its own anemometer, so we have to rely on readings from the other telescopes on the mountain to decide if it is safe to open the telescope building.

Feeling this entire situation was quite unsatisfactory, I decided to build my own anemometer using a clipboard with a ruler and Becky’s boot, giving you the answer to Chris’s question from earlier tonight:

Graph of Wind speed vs Deflection Angle

Shout out to Mrs Beck’s AP Physics for me understanding this

Using the above chart we tried to work out the wind speed. We had to do a bit of fudging. We decided the boot was a perfect cylinder (drag coefficient 0.82), and that it weighed about 300g. We also decided not to take into account lower air pressure. Finally, when Sandor and I calculated it independently, we got wildly different results, so it was a futile exercise in the end. (Also CSO buy an anemometer)
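
For anyone who wants to repeat the exercise, here is one way the calculation could go. It is a sketch and not necessarily what went into the chart: it assumes the boot hangs freely like a pendulum, so that the drag force balances gravity at the deflection angle (tan θ = F_drag / (m g), with F_drag = ½ ρ C_d A v²). The mass and drag coefficient are the ones quoted above; the frontal area is a guess, and sea-level air density is used because the lower pressure was ignored.

```python
import math

G = 9.81              # gravitational acceleration, m/s^2
MPS_TO_MPH = 2.23694  # metres per second to miles per hour


def wind_speed(theta_deg, mass_kg=0.3, drag_coeff=0.82,
               area_m2=0.03, air_density=1.225):
    """Estimate wind speed (m/s) from the deflection angle of a hanging boot.

    area_m2 is a guessed frontal area; air_density is the sea-level value.
    """
    theta = math.radians(theta_deg)
    drag_force = mass_kg * G * math.tan(theta)  # horizontal force balance
    return math.sqrt(2 * drag_force / (air_density * drag_coeff * area_m2))


for angle in (10, 30, 45, 60):
    v = wind_speed(angle)
    print(f"{angle:2d} deg -> {v:5.1f} m/s ({v * MPS_TO_MPH:5.1f} mph)")
```

The estimate is sensitive to the guessed area and to the air density (both sit under the square root), so small differences in those choices change the answer, which may be part of why two independent calculations disagreed.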

Sandor doing the hard work!

Since then, we’ve been playing chicken with the wind. Sometimes having to close the dome. Sometimes thinking we can be open, only to have the telescope struggle to stay on target. Sometimes we hear Meg Schwamb’s wind tracker say “Warning High Winds”.  The conditions made us miss out on a second night of observing Comet Lovejoy, and everyone seemed pretty down for most of the night.

Around 1 or 2am the wind finally let up and we were able to start observing, so the night wasn’t a complete loss. Hopefully the weather tomorrow is better.

A Few Notes:

  • It’s really hard to get enough sleep. Sleeping at altitude is hard anyway, and adding in trying to sleep during the day gives us all points for degree of difficulty. Everyone has lovely bags around their eyes.
  • This is the last day Chris is with us. We’ll be all alone tomorrow night.
  • Sandor is succumbing to the static curse now too.
  • @GeertHub on Twitter wanted me to post a screen shot of the telescope software: snapshot1
  • All the Sex & Drugs & Rock & Roll is helping us touch the sky.