#GZoo10 Day 2 : Happy Birthday to us

Ten years ago today, I was trying to work out how to deal with the sudden flood of volunteers heading to our site to help explore the Universe. Lots of that traffic came from a BBC News article, so it seems appropriate they’re marking the day with a new piece reflecting on recent results.

(As Bill says, you can find out more about those results here on the Galaxy Zoo blog).

We’re ready for day 2 of the Galaxy Zoo 10 workshop at St Catherine’s College in Oxford; it was great to have so many people following along yesterday morning on the Livestream – yesterday’s talks are still up, and today’s schedule is:

9.40am: Karen Masters (Portsmouth)
10.00am: Lucy Fortson (Minnesota)
10.20am: Hugh Dickinson (Minnesota)
11.00am: Sam Penny (Portsmouth)
11.20am: Becky Smethurst (Nottingham)
11.40am: Ross Hart (Nottingham)
12 noon: Seb Turner (Liverpool John Moores)
12.20pm: Peter McGill (Oxford).

We’ll blog these talks as they happen here too.


From here on in, folks, it’s Becky Smethurst taking the reins – once more unto the breach, my friends!

We’re kicking off the day with an original science team member: Karen Masters! She’s going to be asking the question: After 10 years of Galaxy Zoo: What Now?

She kicks us off by thinking about how Galaxy Zoo classifications are just as quantitative as an automated computer classification of a galaxy’s morphology – but what can we do to combine these measurements? The problem is that computers searching for a best-fit model can get stuck in what’s called a “local minimum” in parameter space, which is not actually the best fit. Combining human interaction with this computer fitting process will help to find the “global minimum”, or the true best fit for the model. Karen is showing a new project, still in the early development stages, where volunteers help to fit a model light profile for a galaxy.
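To make the local-versus-global minimum point concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the toy `misfit` curve, the simple gradient-descent fitter and the starting guesses are invented, and none of it comes from the project Karen showed. The idea is just that a fitter started from one place settles in the nearest dip, while an extra starting guess (the sort of thing a human might suggest by eye) can lead it to the deeper one.

```python
def misfit(x):
    """Toy 'badness of fit' curve: shallow dip near x = 1.13, deeper dip near x = -1.30."""
    return x**4 - 3 * x**2 + x


def slope(x):
    """Derivative of the toy misfit curve."""
    return 4 * x**3 - 6 * x + 1


def descend(start, step=0.01, n_steps=5000):
    """Plain gradient descent: keep walking downhill from the starting guess."""
    x = start
    for _ in range(n_steps):
        x -= step * slope(x)
    return x


# The computer alone, started at x = 2.0, rolls into the nearest (local) dip.
machine_only = descend(2.0)

# Adding a second, hypothetical human-suggested start lets us keep the better fit.
with_human_guess = min((descend(s) for s in (2.0, -1.0)), key=misfit)

print(f"machine only:     x = {machine_only:.2f}, misfit = {misfit(machine_only):.2f}")
print(f"with human guess: x = {with_human_guess:.2f}, misfit = {misfit(with_human_guess):.2f}")
```

Real galaxy model fitting has many more parameters than this one-dimensional toy, which is exactly why a good starting guess is so valuable.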

Karen now segues into the idea that colour ≠ morphology, which was one of the first results from Galaxy Zoo.

She’s now talking about how we need to go one step beyond the visual morphologies we have, to classify how the stars are actually rotating. New galaxy surveys, such as MaNGA that Karen is working on, are taking many spectra across the whole galaxy to get rotation maps. We can then classify these rotation maps to understand a galaxy’s history – less ordered rotation suggests that a galaxy’s disk has been disturbed by something like a merger or interaction. It’s really helpful for astronomers to be able to figure this out, especially if there are no clues in the visual morphology that an interaction or merger is happening.

Karen ends her wonderful talk by discussing how we can all get more students involved with Galaxy Zoo and astronomy as well. Galaxy Zoo is often used in Astronomy 101 classes at universities, as well as in schools – so how do we engage more with this side of the Galaxy Zoo community?

Next up this morning is Lucy Fortson talking about “The evolution of a Galaxy Zoo team member.” Ten years ago Lucy was working at the Adler Planetarium in Chicago and was aware that the best research model for museums was to get the public to participate in the research – so she jumped at the chance to join the Galaxy Zoo research team! She’s now going to be giving us a round-up of everything going on in the group at Minnesota – one of the hubs for the Galaxy Zoo research team.

First up is Mel Galloway’s work comparing the UKIDSS (infrared images) and SDSS (optical images) Galaxy Zoo classifications – how does the morphology of a galaxy change with wavelength? Mel has also worked with the Galaxy Zoo Hubble classifications looking at how disk galaxies evolve and become passive (i.e. non star forming). The problem was that the number of disks drops off at the high redshifts that the Hubble Space Telescope can target, which makes the sample of disks incomplete. This ended up with a new offshoot project classifying disk galaxies, lovingly titled Save Mel’s Thesis! With these classifications providing a more complete sample, Mel managed to show how the number of passive disks increased from 6 billion years ago to the present day. Not only that, she also showed that more massive galaxies are more likely to keep their disks after they stop forming stars.

Now we’re all getting emotional because Lucy’s brought up Kyle Willett, who recently left astronomy to work in the data science industry. But where would Galaxy Zoo be without Kyle?! He not only published a lot of the data release papers for Galaxy Zoo but also kept up the site maintenance and even ran the Kaggle competition we held for machine classifications of galaxies.

Speaking of machine classifications of galaxies – did someone say Melanie Beck?! Melanie is working on using machine learning in conjunction with the user classifications to complete Galaxy Zoo much more quickly by retiring the easy subjects early, leaving the users with the interesting and difficult things to classify.

This is going to be really important in the future when new telescopes like the LSST start observing and we’ll have almost 1 billion images of galaxies! If the computers are working on the easy stuff, say 90% of the images, users will still be left with around 100 million more interesting things to classify!
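As a rough picture of what “retiring the easy subjects” could look like, here is a small Python sketch. It is only an assumption-laden toy: the `split_subjects` function, the 0.9 confidence threshold and the made-up scores are all hypothetical, and this is not Melanie’s actual pipeline.

```python
def split_subjects(machine_confidence, threshold=0.9):
    """Split subjects into those the machine retires and those left for volunteers.

    machine_confidence maps a subject id to the machine's confidence (0 to 1)
    in its own classification. Both the ids and the threshold are invented.
    """
    retired, remaining = [], []
    for subject_id, confidence in machine_confidence.items():
        if confidence >= threshold:
            retired.append(subject_id)    # easy: the machine is confident enough
        else:
            remaining.append(subject_id)  # interesting or difficult: ask the humans
    return retired, remaining


# Hypothetical confidences for four galaxies.
scores = {"gal_a": 0.99, "gal_b": 0.55, "gal_c": 0.93, "gal_d": 0.71}
retired, remaining = split_subjects(scores)

print("retired by machine:", retired)
print("left for volunteers:", remaining)
```

Where the threshold sits is a design choice: set it lower and the project finishes faster, but more of the machine’s mistakes slip through without a human ever seeing them.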

The next Minnesota person is Hugh Dickinson – but he’s in the room, so now we’re hearing from him personally. Hugh is working on the classifications from Galaxy Zoo Illustris – the first time users have been asked to classify galaxies from a different universe – albeit a universe that only exists inside a computer! Illustris is a simulation of galaxies and one of the most realistic simulations to date; it’s high resolution and includes all the ingredients for a Universe such as dark matter, stellar processes, feedback and gas cooling.

But why is it important to classify what are essentially “fake” galaxies? Well, with a simulation we know exactly what the history of a galaxy is throughout the entire simulated life of the Universe, so if we’re getting the same morphologies as we see in the Universe then we know that the Physics we’ve assumed to make this simulated universe is right. It can also help highlight the links between a galaxy’s visual appearance and its dynamical history. So although it might seem weird classifying something that doesn’t really exist, those classifications are extremely helpful to astronomers in pinning down the Physics that is occurring in the Universe.

So the first thing that Hugh has looked at is the first question in the Galaxy Zoo tree: smooth or featured? And the bad news is that the classifications of the simulation images don’t match the real classifications of galaxies from Galaxy Zoo 2.

This means that there’s an issue with the simulations – investigating it further they found that there is actually very good agreement between the classifications for higher mass galaxies (> 10^11 solar masses), where the same number of smooth galaxies are seen. For lower mass galaxies, though, there are a lot more featured things in the simulations than in the real Universe. So, from this preliminary study Hugh’s figured out that we’ll only be able to compare the detailed structures of simulated and real galaxies of high mass – so these are now back on the Galaxy Zoo site being classified, to get answers to all the further questions about spiral arms, bars etc. So please keep classifying!

Hugh is now telling us all about one of our newest projects: Galaxy Nurseries! Users look at the spectrum of a galaxy and are asked whether the emission they’re seeing in the plot is a real feature or not. The data is a bit messy, so this is not an ideal task for a computer but perfect for a human! There’s a lot to be done with this project but Hugh is excited to see the classifications come through.


Time for coffee for us now – we need caffeine to keep up this level of science discussion!


Next up for our listening pleasure is Sam Penny from the University of Portsmouth.

She’s going to be giving us an overview of the MaNGA survey, which is a new survey looking at the kinematics of galaxies by taking lots of spectra for each galaxy in a bundle. How are we going to integrate this with Galaxy Zoo? Currently Sam works on very low mass, non star forming galaxies, which tend to be quite low brightness so aren’t always the easiest to visually classify. So she’s using the kinematic information that MaNGA provides to reveal how these galaxies have evolved. But she wants to know: is there a link between the kinematics and the morphology of the galaxies? That way, if we don’t have kinematics for some galaxies, will we still be able to pick these galaxies out?

Sam’s other interest is void galaxies; these are found isolated from other galaxies in extremely low-density environments. Surprisingly for galaxies that have been isolated from other galaxies their entire lives, some of them are very massive, very red (i.e. no longer star forming) and have disks. So how did they get so massive on their own? The most massive galaxies in the Universe are thought to build up through mergers – but if these galaxies are isolated then this doesn’t seem to be a possibility for these objects!


This is Chris Lintott taking back over to blog Becky’s talk.


She’s discussing our elephant in the room – the fact that we don’t deal with kinematic morphologies; in other words, we classify galaxies by how they look, not how the stars and gas move within them.

Before that, though, she’s pointing out that the clean samples we assemble from our data – which require a threshold of, say, 80% of people to agree on a classification before a galaxy makes it in – throw out a huge number of galaxies. That works, says Becky, if you’re trying to assemble a sample of discrete bins, like the categories on the Hubble diagram, and not a continuous range of types of galaxies.
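For anyone who hasn’t built one, a “clean sample” cut can be sketched in a few lines of Python. The vote fractions below are invented, and `clean_sample` is a hypothetical helper rather than anything from the Galaxy Zoo codebase; only the 80% threshold comes from the talk.

```python
def clean_sample(vote_fractions, threshold=0.8):
    """Keep galaxies where at least `threshold` of volunteers agree on one answer."""
    kept = {}
    for galaxy, fractions in vote_fractions.items():
        # Find the most popular answer and the fraction of votes it received.
        answer, fraction = max(fractions.items(), key=lambda item: item[1])
        if fraction >= threshold:
            kept[galaxy] = answer
    return kept


# Invented vote fractions for four galaxies.
votes = {
    "gal_1": {"smooth": 0.92, "featured": 0.08},
    "gal_2": {"smooth": 0.55, "featured": 0.45},
    "gal_3": {"smooth": 0.15, "featured": 0.85},
    "gal_4": {"smooth": 0.70, "featured": 0.30},
}

clean = clean_sample(votes)
discarded = sorted(set(votes) - set(clean))

print("clean sample:", clean)
print("thrown out:", discarded)
```

In this toy case half the galaxies are thrown out, even though their vote fractions still carry information about where they sit between “smooth” and “featured”.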

But! There’s a catch. If you care about kinematic morphology, there really is a true binary. Things are rotating as a disk or they are not. (The audience seems not necessarily to agree on this point; we’ll see what happens when we get to questions). To study kinematics we can use instruments like MaNGA, which I hope Becky described above during Sam’s talk – Becky tested us and though the majority could distinguish a rotating smooth galaxy from a non-rotating smooth galaxy, it certainly isn’t easy just by looking. (And we probably shouldn’t draw conclusions from a sample of one).

Becky reckons that only 20% of Galaxy Zoo smooth galaxies are ‘true’ ellipticals – those that don’t have a hidden disk-like rotation inside. How will this affect Becky’s work on fitting star formation histories? She does this using a code called Starpy, which uses statistics to decide when a galaxy started to stop star-formation and how fast that ‘quenching’ is happening. (There’s a nice description of Starpy on the blog here).
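To picture what “when quenching started” and “how fast it is happening” mean as two numbers, here is a toy star formation history in Python. The functional form (a constant rate followed by an exponential decline), the parameter names `t_quench` and `tau`, and the values used are all assumptions for illustration; this is not code from the fitting tool described above, and it skips the statistics entirely.

```python
import math


def star_formation_rate(t, t_quench, tau, initial_rate=1.0):
    """Toy star formation history: constant, then declining after t_quench.

    t, t_quench and tau are in the same time units (say billions of years).
    A small tau means fast quenching; a large tau means slow quenching.
    """
    if t < t_quench:
        return initial_rate
    return initial_rate * math.exp(-(t - t_quench) / tau)


# Two hypothetical galaxies that both begin quenching at t = 8, compared at t = 10.
fast = star_formation_rate(10.0, t_quench=8.0, tau=0.5)
slow = star_formation_rate(10.0, t_quench=8.0, tau=3.0)

print(f"fast quencher: {fast:.3f} of its original star formation rate")
print(f"slow quencher: {slow:.3f} of its original star formation rate")
```

A fitting code then has the harder job of working backwards: given a galaxy’s observed colours, which combinations of start time and rate are most likely?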

Using MaNGA to divide galaxies not by visual morphology, but by rotation, Becky has run Starpy and finds differences. Non-regular rotators – what we might call ‘true ellipticals’ – quench either quite fast or very fast; there were a bunch of smooth galaxies that only quenched slowly, but these now seem to be the regular rotators; disks hiding amongst the Galaxy Zoo smooth sample.

Becky finishes with this diagram, from a recent review of kinematic morphologies. She’s added the location of the elephant – she feels we’re good at distinguishing spirals but need to talk about the smooth ones. In questions, Karen reckons we can tell the difference visually, Jean Tate wanted confirmation that lenticulars rotate (they do!), and I reckon we need to think hard about statistics.


Becky is back!


Next up is Ross Hart, who’s telling us all about his work on correcting for biases in Galaxy Zoo 2 data to get out a nice complete sample of spiral galaxies. The problem is that at greater distances it becomes harder to spot spiral arms as things get fainter.

Ross’s code does a really nice job of recovering lots of spiral galaxies that would otherwise have been missed in the catalogue. Now we’ve got this catalogue we can use it to study how the properties of spirals vary with arm number – turns out many-armed spirals look a lot bluer than typical 2-armed spirals. Despite the fact that they appear blue (i.e. forming lots of hot, young, blue stars!), when Ross measures the star formation rate of these galaxies there is no dependence on arm number. But he does find that many-armed spirals have more hydrogen gas than 2-armed spirals, so might have more fuel for star formation in the future.

So what is actually happening in the spiral arms? Ross is now using an automated method called SpArcFiRe to identify where the spiral arms in galaxies are. Once the spiral arm locations have been identified, we can separate their light from the rest of the disk and study what’s going on in the spiral arms and the disk separately. Ross is currently working on this but has some interesting preliminary results – so watch this space!

Now for something completely different! We’ve got Sebastian Turner from Liverpool John Moores University telling us all about automating galaxy classifications.

He’s asking what are we going to do moving forward? How will we merge the efforts of computers and humans? Seb is working on using statistical clustering methods to pick out information from the data we already have about galaxies. Clustering methods can pick out groupings of features in a sample – Seb feeds in information about the mass, colour and shape of a galaxy and the machine returns how many groupings it thinks there are in this multi-dimension parameter space. This can tell us something more about galaxy evolution because as humans we could never visualise this multi-dimensional space. Seb is showing us how the clustering algorithm picks up the areas of the colour magnitude diagram that we’re used to including the blue cloud, red sequence and green valley. The groups it picks out also correlate nicely with Hubble type morphology as well which is encouraging! There’s a lot more work we can do with this including using the Galaxy Zoo classifications as an input to the algorithm.
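Here is a minimal sketch of what a clustering method does, in plain Python. A big caveat: this is k-means, swapped in because it is the simplest clustering algorithm to write down, and it needs to be told how many groups to look for, unlike the method described in the talk. The galaxies (a colour and a log stellar mass each) and the starting centres are invented.

```python
def kmeans(points, centres, n_iter=20):
    """Very small k-means: assign each point to its nearest centre, then re-centre."""
    for _ in range(n_iter):
        groups = [[] for _ in centres]
        for p in points:
            # Squared distance from this galaxy to every current centre.
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centres]
            groups[distances.index(min(distances))].append(p)
        # Move each centre to the mean of its group (leave it alone if the group is empty).
        centres = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
            for g, c in zip(groups, centres)
        ]
    return centres, groups


# Invented galaxies as (colour, log10 stellar mass): three bluish, three reddish.
galaxies = [
    (1.2, 9.5), (1.4, 9.8), (1.3, 10.0),
    (2.5, 10.8), (2.6, 11.0), (2.4, 11.2),
]

# Start with one centre on the first galaxy and one on the last.
centres, groups = kmeans(galaxies, centres=[galaxies[0], galaxies[-1]])

for centre, group in zip(centres, groups):
    print(f"centre at colour {centre[0]:.2f}, log mass {centre[1]:.2f}: {len(group)} galaxies")
```

With only two features you could spot these groups by eye on a plot; the appeal of clustering is that the same procedure works when there are far too many features to plot.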

Ross wanted to know if we could input kinematics into this as well? Seb definitely thinks it’s possible. Brooke then asked about how it was weird the algorithm picked out two groups in the blue cloud before it picked out a group for the green valley. Seb reckons a big issue is that the algorithm tries to equalise the numbers in the groups and there’s just so many galaxies in the blue cloud.

Next up is a summer student at the University of Oxford: Peter McGill. He’s been looking at star formation histories of galaxies (like me!) but in galaxies at greater distance (high redshift) in the COSMOS survey with images taken by the Hubble Space Telescope.

In particular he’s focussing on figuring out what is stopping star formation (or quenching it) in the high redshift Universe. He’s using Starpy again (see Becky’s talk above!) to model the star formation history but has changed Starpy to take colours from the Hubble telescope rather than the SDSS. He’s looking at how these star formation histories change at different redshifts in the COSMOS survey and in comparison with the results in the local Universe from SDSS. He’s showing plots where you can see how the rate at which star formation quenches gets quicker for galaxies at higher redshift. His future work includes figuring out how to include galaxy environment in these studies and changing the method to use a better algorithm to explore the parameter space!


Now we’re off to lunch and afterwards we’ll be having “un-conference” sessions where we’ll have lots of discussions. I’ll be live blogging later on when we report back – see you then internet!


We’re back! We’ve had a very productive afternoon discussing all the weird and wonderful things.

First up, Chris is telling us about a discussion half the team had about a paper draft from four years ago that we forgot existed. It got left by the wayside when the team ran into some problems with completeness of a spiral sample – however, Ross’s current work has solved that for us! So we’re going to try completing it as a team again. Watch this space.

Steven is now telling us about a session they had about integrating machine learning algorithms with Galaxy Zoo classifications (and Zooniverse classifications as a whole). So how will this affect the interaction of the volunteers with the site and the quality of the data we get out? The aim of this is to speed up classifications (for GZ2, from 6 months to 1 month) since we’re moving into an era of even larger data sets. It will also hopefully mean the machine will do the boring stuff and the interesting stuff gets left in for the users (although that will vary with project). We have to be careful, though, not to put in too many good images, because research has shown there is a sweet spot for the amount of boring classifications that keeps people classifying on the site. Also we want to make sure we’re still showing a representative sample of objects so that the public logging onto the site don’t get a biased view of what galaxies look like in the Universe. Could we have somewhere to put retired images (retired either by the machine or by humans) so that users can still explore them? The discussion then came back to what the main science goal of the Galaxy Zoo project is – is the final goal science or to make an interesting data set?

There was also a hack session to get a master data table for the MaNGA Galaxy Zoo classifications. We have them all, they’re just a bit scattered all over the place and need concatenating into one giant table.

Then the science team had a discussion about fast and slow rotators – can we actually do kinematic classifications by eye? We think we’re going to challenge the astronomical community to test this.

Then the science team had a talk about Talk. The discussion was focussed on the importance of needing to engage with the Galaxy Zoo community.

There was also another hack taking place this afternoon playing around with the Galaxy Zoo:3D data. Some of us got to grips with the data and tried to make some nice plots. One thing we did realise is that it might be worthwhile getting about a third of the sample classified a bit further for better statistics.

We had a discussion about one of the tools that we use as a team to infer star formation histories: Starpy. We want to update this code to use a more robust statistical algorithm to get results.

Whilst that was going on, there was also a discussion about how to engage more with undergraduate students. One idea was to have a summer camp with undergraduates who will be engaging in research in the future – teaching them about citizen science, interacting with data, coding skills with a focus on Python – similar to the .Astronomy summer camp or the LSST data camp. Perhaps this could be tied in each year with a science team meeting? Users may enjoy this type of summer camp idea as well!

And that’s it for science for the day! We have our formal conference dinner tonight though so we’re all looking forward to socialising over a delicious dinner. Until next time internet!

16 responses to “#GZoo10 Day 2 : Happy Birthday to us”

  1. Jean Tate says :

    Questions for Hugh Dickinson: Why did you decide to effectively hide the existence of ‘fake data’ (sims) in the Galaxy Nurseries project? How much did you know about the ‘fake AGN’ furor in GZ (many years’ ago now) when you were designing Galaxy Nurseries?

  2. Jean Tate says :

    Question for Becky: do lenticulars rotate?

  3. Jean Tate says :

    Missed my chance! Wanted to ask Ross what sense SpArcFiRe makes of flocculent spirals … The question behind my Becky question has to do with the distinction between rotation (rotators) and disks: we know many “ellipticals” are fast rotators and that they do not have disks, but are there galaxies which have disks but do not seem to rotate (other than by being face-on)?

    • R Smethurst says :

      The consensus at the minute seems to be disk = rotation. We haven’t found any examples (except as you say where something is directly face on) where a disk isn’t rotating.

      • Jean Tate says :

        Thanks Bekky. Is it true to say that the number of galaxies with well-defined kinematics (2D) is very small, esp cf the 900k classified in the original GZ? This also goes to the question of whether there is more than one kinematic structure in many galaxies (other than in obvious mergers), beyond the obvious polar rings (say)?

      • Brooke Simmons says :

        Jean, yes, I think that’s basically true compared to the GZ1 sample size. We’re only just now really entering the IFU age where we can get this kind of 2D kinematic information for large (1000+) samples of galaxies.

  4. Jean Tate says :

    Seb is the final speaker? The agenda has “12.20pm: Peter McGill (Oxford).”, what happened to Peter?

  5. Jean Tate says :

    Q for Seb: when nearby individual galaxies are studied in detail, is there any evidence that quenching happens per your model?

  6. Christine Macmillan says :

    Attracting people to Galaxy Zoo Talk: Moderators post many many replies to people who ask one question and seem to never come back. But we need classifiers who can recognize when a galaxy is unusual. We definitely need a more attractive and informative Talk, It takes volunteers a long time to get oriented and find useful posts. eg Galaxy Redshift Chart https://talk.galaxyzoo.org/#/boards/BGZ0000007/discussions/DGZ0000ulp?page=2

  7. Christine Macmillan says :

    I started in Galaxy Zoo in 2008, I thought that the small starforming ellipticals very hard to classify. I found the forum huge and confusing but with lots of information, and I lurked for 3 days before I posted anything. I shudder to think that my first posts were about a “supernova” (foreground star) and cosmic ray hit, but I suppose the next were reasonably scientific.

  8. Christine Macmillan says :

    I like the idea of classifying without the decision tree of questions. It will be faster, with less clicking.

  9. Christine Macmillan says :

    Homework for the team? Here are objects that we can’t explain. We would welcome input.
    Objects that need more research https://talk.galaxyzoo.org/#/boards/BGZ0000004/discussions/DGZ0000ycq
