Visualizing the decision trees for Galaxy Zoo

This post (and visualization) is by Coleman Krawczyk, a Zooniverse Data Scientist at the ICG at the University of Portsmouth.

Today we’ve added a new tool that visualizes the full decision tree for every Galaxy Zoo project from GZ2 onward (GZ1 only asked users one question, and would make for a boring visualization).  Each tree shows all the possible paths Galaxy Zoo users can take when classifying a galaxy.  Each “task” is color-coded by the minimum number of branches in the tree a classifier needs to take in order to reach that question.  In other words, it indicates how deeply buried in the tree a particular question is, a property that is helpful when scientists are analyzing the classifications.
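For readers curious how such a depth could be computed, here is a minimal sketch using a breadth-first search. The tree below is a made-up, heavily simplified example (the task names and links are assumptions for illustration, not the real Galaxy Zoo trees), and the function name `min_depths` is hypothetical rather than taken from the visualization's code.

```python
from collections import deque

# Hypothetical toy tree: each task maps to the tasks its answers can lead to.
# This is NOT an actual Galaxy Zoo decision tree, just an illustration.
TOY_TREE = {
    "shape": ["roundness", "edge_on"],
    "roundness": ["odd"],
    "edge_on": ["bulge_shape", "bar"],
    "bulge_shape": ["odd"],
    "bar": ["spiral"],
    "spiral": ["odd"],
    "odd": [],
}


def min_depths(tree, root):
    """Return the minimum number of branches needed to reach each task."""
    depths = {root: 0}
    queue = deque([root])
    while queue:
        task = queue.popleft()
        for next_task in tree.get(task, []):
            # Breadth-first order guarantees the first visit is the shortest path.
            if next_task not in depths:
                depths[next_task] = depths[task] + 1
                queue.append(next_task)
    return depths


depths = min_depths(TOY_TREE, "shape")
for task, depth in sorted(depths.items(), key=lambda item: item[1]):
    print(f"{task}: depth {depth}")
```

In this toy tree the "odd" task can be reached by several paths of different lengths, and the search records only the shortest one, which is the quantity the color coding is described as showing.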

Galaxy Zoo has used two basic templates for its decision trees.  The first template allows users to classify galaxies as smooth, edge-on disks, or face-on disks (with bars and/or spiral arms). It was used for Galaxy Zoo 2 and the infrared UKIDSS images, and is currently being used for the SDSS data that is live on the site. The second template was designed for high-redshift galaxies, and allows users to classify galaxies as smooth, clumpy, edge-on disks, or face-on disks. This template was used for Galaxy Zoo: Hubble (GZ3) and FERENGI (artificially redshifted images of galaxies), and is currently being used for the CANDELS and GOODS images in GZ4.  Although these final three projects ask the same basic questions, there are some subtle differences between them in the questions we ask about bulge dominance, “odd” features, mergers, spiral arms, and/or clumps.

Visualization of the decision tree for Galaxy Zoo 2 (GZ2), by C. Krawczyk. Colors indicate the depth of a particular question within the tree.

If you ever wanted to know all the questions Galaxy Zoo could possibly ask you, head on over to the new visualization and have a look!

About Kyle Willett

Kyle Willett is a postdoc and astronomer at the University of Minnesota. He works as a member of the Galaxy Zoo team, and gets to study galaxy morphology and evolution, AGN, blazars, megamasers, citizen science engagement, and many other cool things.

6 responses to “Visualizing the decision trees for Galaxy Zoo”

  1. Robert Maher says :

    I think that the type “Irregular” needs to be added when asking about the clumping pattern: straight, chain, cluster, and spiral do not cover the majority of items put up for classification.

  2. murraycu says :

    So is there some canonical source for these decision trees other than the CoffeeScript * files here?:

    • Kyle Willett says :

      The CoffeeScript files should be canonical for the current versions, since those are what control the order in which questions are asked.

  3. cniharral says :

    Here, in this decision tree I don’t see any decision related to the colour of the galaxy. Is it that there’s no relation at all between the colour and the shape and/or the age of the galaxies?


    • Kyle Willett says :

      There’s an extremely strong relation – but separating the relationships between color and shape is exactly the aim of Galaxy Zoo. We want to measure the shape _independently_ of color, age, or other variables that are often used as proxies, and then use that data to work out the physical relationships between these variables.
