VV191: from Galaxy Zoo to JWST

Almost 15 years ago, what first attracted me to be involved with Galaxy Zoo was the ability of participants to pick out rare galaxy types, especially silhouetted or overlapping galaxy systems. These highlight the effects of dust in the foreground galaxy on passing light, and offer ways to study the dust which are complementary to, for example, observations in the deep infrared where the dust itself shines, giving off the energy it absorbs from starlight. Visible-light measurements of backlit galaxies show us the dust no matter how cold it might be, where it can hide from IR detection, and at the high resolution available to optical telescopes (including the Hubble Space Telescope) rather than the more modest, wavelength-limited resolution we can achieve at longer wavelengths. Better measurements of dust in galaxies affect our understanding of their energy output, stellar content, and even our view of the more distant Universe. Galaxy Zoo volunteers contributed to a catalog of nearly 2000 suitable galaxy pairs from the first iteration of the project, since expanded from Galaxy Zoo 2, GZ Hubble, and the most recent examinations using the Legacy Survey data. We have used this list for a number of follow-up studies – although, truth be told, I have also been distracted by other rare systems found by volunteers (cough, Hanny’s Voorwerp and the Voorwerpjes, for example).

The backlit-galaxy system VV191 was first reported in the Galaxy Zoo forum as a possible galaxy merger, by user Goniners on November 2, 2007. Despite its near-perfect geometry for study of foreground dust, VV191 had eluded our earlier searches because the inner regions, where one can see that this is a superposition of undisturbed galaxies rather than a merging galaxy pair, are saturated in prints of the Palomar Sky Survey, which was the best visible-light survey before the Sloan Digital Sky Survey. At the time VV191 was selected for further study, catalogs showed a substantial redshift difference between the two galaxies, which is desirable so the two galaxies are unlikely to be physically interacting with each other, and light from the background galaxy scattered by the dust becomes much fainter. That has been revised by later data which put the redshifts closer; we can’t win them all, though the two galaxies are very symmetric and undisturbed in all our later data.

(Hubble red-light image of VV191, showing silhouetted dust in the foreground spiral arms)

We got a closer look with the STARSMOG project led by colleague Benne Holwerda, which was a Hubble snapshot program – one where short exposures are inserted into gaps in the telescope schedule, much like the Zoo Gems gap-filler project. STARSMOG drew promising overlapping-galaxy pairs from Galaxy Zoo forum posts and the GAMA (Galaxy And Mass Assembly) project. Over several years, it acquired images of 55 galaxy pairs of interest. Among those was VV191, generating a very detailed map of the dust silhouette of the spiral galaxy. This was one of the galaxy pairs analyzed in a project based on the master’s thesis work by Sarah Bradford at the  University of Alabama which went into a poster presentation at the January 2017 meeting of the American Astronomical Society in Texas. In fact, I used a low-contrast version of the VV191 image as the poster background. (The poster should still just be legible in this compressed  PNG version):

The data quality for VV191 stood out, because the background elliptical galaxy has its brightest region right behind the edge of the dust in the spiral. We then had a 2-dimensional map of how much light gets through the dust in the spiral at the wavelengths included in that single observation. The poster was viewed by my longtime collaborator Rogier Windhorst, who is one of the interdisciplinary scientists with the James Webb Space Telescope (JWST) project. In this capacity, he had an allocation of so-called GTO (guaranteed-time) observations. Rogier was struck by these images, and wondered what we could add to the science output with a little bit of JWST observing time.
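
For readers who like to see the arithmetic, here is a minimal sketch of how such a map of transmitted light can be built. It assumes the usual backlit-galaxy picture, in which the observed brightness in each pixel is the spiral's own light plus the elliptical's light dimmed by the dust. The function name, the toy numbers and the way the foreground and background models are supplied are all illustrative assumptions, not the actual analysis pipeline.

```python
import math

def transmission_map(observed, foreground, background):
    """Return per-pixel transmission T and optical depth tau.

    Assumes observed = foreground + background * T in every pixel, where
    foreground is a model of the spiral's own light and background is a
    model of the undimmed elliptical (both hypothetical inputs here).
    """
    t_map, tau_map = [], []
    for obs_row, fg_row, bg_row in zip(observed, foreground, background):
        t_row, tau_row = [], []
        for obs, fg, bg in zip(obs_row, fg_row, bg_row):
            t = (obs - fg) / bg
            # Noise can push T outside (0, 1]; clip so the log is defined.
            t = min(max(t, 1e-6), 1.0)
            t_row.append(t)
            tau_row.append(-math.log(t))
        t_map.append(t_row)
        tau_map.append(tau_row)
    return t_map, tau_map

# Toy 2x2 image in arbitrary flux units (made-up numbers).
observed = [[110.0, 60.0], [35.0, 20.0]]
foreground = [[10.0, 10.0], [10.0, 10.0]]
background = [[100.0, 100.0], [50.0, 40.0]]
t_map, tau_map = transmission_map(observed, foreground, background)
```

A transmission of 1 means the dust blocks nothing in that pixel, while smaller values (larger optical depth) mark the dustier parts of the spiral arms.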

This led to a plan of tracking the dust signature from ultraviolet to infrared in a single galaxy with a single technique. First Hubble had to do its part with more data, using not only its high resolution but UV sensitivity. We got Hubble images in filters around 2250 and 3360 Angstroms (0.22 and 0.34 microns), with the short end limited mostly by the elliptical galaxy being so faint in the deeper UV that we couldn’t detect its light well enough in reasonable exposure times. These data have been processed, so we are ready for the next step – JWST. Its near-infrared camera (NIRCAM) will observe this system in four filters from 0.9 to 4.0 microns wavelength (two at a time since the camera can use short- and long-wavelength channels simultaneously). The wavelengths are chosen to trace the way the dust effects fall off toward longer wavelengths, which is affected both by the sizes of the interstellar dust grains and how strongly they are clumped together. One filter matches one of the wavelengths at which small grains (or indeed large molecules, so-called PAH particles) emit, so we might be able to tell how they correlate with the larger particles blocking most of the light.

Because of the enormous sensitivity of JWST and NIRCAM, each filter is exposed for only 15 minutes to get very high measurement accuracy. (The telescope will probably take longer than that to point to VV191, depending on what it’s doing beforehand). Based on when JWST can view this part of the sky, these observations are most likely to be made between December 2022 and March 2023, or between May and July of 2023 (we should know more in a couple of weeks when the first year’s observation schedule is released). Watch this space…

Zoo Gems – Hubble does Galaxy Zoo(s)

Since mid-2018, the Hubble Space Telescope has taken occasional short-exposure images, filling what would otherwise be gaps in its schedule, of galaxies in the list from “Gems of the Galaxy Zoos” (otherwise known as Zoo Gems). The Zoo Gems project just passed a milestone, with acceptance of a journal paper describing the project, including how votes from Galaxy Zoo and Radio Galaxy Zoo participants were used to select some of the targeted galaxies, and acting as a sort of theatrical “teaser trailer” for the variety of science results coming  from these data. (The preprint of the accepted version is here; once it is in “print”, the Astronomical Journal itself is now open-access as of  last month). The journal reviewer really liked the whole project: “The use of the Galaxy Zoo project’s unique ability to spot outliers in galaxy morphology and use this  input list for a HST gap filler program is a great use of both the citizen science project and the Hubble Space Telescope” and “I think it is a wonderful program with a clever, useful, and engaging use of both SDSS and Hubble.” (We seldom read statements that glowing in journal reviews).

Zoo Gems got its start in late 2017, when the Space Telescope Science Institute (STScI) asked for potential “gap-filler” projects. Even with  what are known as snapshot projects, there remained gaps in Hubble’s schedule long enough to set up and take 10-15 minutes’ worth of high-quality data. We put together a shockingly brief proposal (STScI wanted 2 pages, originally to gauge interest) and were very pleased to find it one of 3 selected (the other two also deal with galaxies. Makes sense to me). We had long thought that the ideal proposal for further observations of some of the rare objects identified in Galaxy Zoo ran along the lines of “Our volunteers have found all these weird galaxies. We need a closer look”. That was essentially what the gap-filler project offered.

We estimated that we could identify 1100 particularly interesting galaxies (where short-exposure Hubble images would teach us something we could foresee) from Galaxy Zoo and Radio Galaxy Zoo.  We were allocated 300 by STScI, so some decisions had to be made. A key feature of our project was the wide range of galaxy science goals it could address, so we wanted to keep a broad mix of object types. Some types were rare and had fewer than 10 examples even from Galaxy Zoo, so we started by keeping those. When there were many to choose from, we did what Galaxy Zoo history (and STScI reviewers) suggested – asked for people to vote on which merging galaxies, overlapping galaxies, and so on should go into the final list. This happened in parallel for Galaxy Zoo and Radio Galaxy Zoo objects (the latter largely managed by the late Jean Tate, not the last time we are sadly missing Jean’s contributions as one of the most assiduous volunteers). Even being on that observing list was no guarantee – gap-filler observations are selected more or less at random, taking whichever one (from whichever project’s list) fits in a gap in time and location in the sky. The STScI pilot project suggested that we could eventually expect close to half to be observed; we are now quite close to that, with 146 observations of 299 (one became unworkable due to a change in how guide stars are selected by Hubble). These include a fascinating range of galaxies. From Galaxy Zoo, the list includes Green Pea starburst galaxies,  blue elliptical and red spiral galaxies, ongoing mergers, backlit spiral galaxies, galaxies with unusual central bars or rings, galaxy mergers with evidence for the spiral disks surviving the merger or reappearing shortly thereafter, and even a few gravitational lenses. 
From Radio Galaxy Zoo, we selected sets of emission-line galaxies (“RGZ Green”) and possibly spiral host galaxies of double radio sources (SDRAGNs, in the jargon, and so rare that we’ve more than doubled the known set already). Both kinds of RGZ selection  were largely managed by Jean Tate, who we are missing once again. By now, of 300 possible objects,  146 have been successfully observed. One can no longer be observed due to changes in Hubble’s guide-star requirements, and two failed for onboard technical reasons  (it was during one of those, a few months ago, that a computer failure sent the telescope into “safe mode”; I have been assured that it was not our fault).

Hubble images from Zoo Gems program.
Some favorite Zoo Gems images of Galaxy Zoo objects – a merger with surrounding spiral pattern, overlapping galaxy system with backlit dust, wheel-within-a-wheel bar and ring, two mergers, and a 3-armed spiral.

Zoo Gems images show that every blue elliptical galaxy observed shows a tightly wound spiral pattern near the core, so small that it was blurred together in the Sloan Survey images used by Galaxy Zoo, and broadly fitting with the idea that these galaxies result from at least minor mergers bringing gas and dust into a formerly quiet elliptical system.

There is much more to come as harvesting the knowledge from these data continues. Already, a project led by Leonardo Clarke at the University of Minnesota used Zoo Gems images to demonstrate that Green Peas are embedded in redder surroundings, possibly the older stars in the galaxies that host these starbursts. Beyond these results, the data can be used to examine the histories of poststarburst galaxies, dynamics and star-formation properties of 3-armed spirals, and nuclear disks and bars – some of these show galaxies-within-galaxies patterns where the central region nearly echoes the structure of the whole galaxy.

While going through some of the Zoo Gems images to see which should go in various montages in this paper, I considered the multilayer overlapping galaxy system including UGC 12281. It didn’t go into the paper, but the visual sense of deep space in this image is so profound that it became the 2nd most-retweeted thing I’ve sent out in more than 10 years.

From a Hubble Zoo Gems image: overlapping layers of galaxies behind the nearby edge-on spiral UGC 12281. Galaxies beyond galaxies, stretching away through space and time.

In presenting these data, we wanted to make the case for the value of wide-ranging, even short, programs such as this. These gap-filler projects are continuing with Hubble, until STScI starts to have trouble filling the gaps and needs to call for more projects. Premature as it seems, I can’t help musing that someone may eventually work out a low-impact way for the James Webb Space Telescope to make brief stopovers as it slews between long-exposure targets – we have suggestions…

Data from the Zoo Gems project (like the other gap-filler programs, Julianne Dalcanton’s program on Arp peculiar galaxies and the one on SWIFT active galaxies led by Aaron Barth) are immediately public, accessible in the  MAST archive under HST program number 15445 (the others are 15444 and 15446). Claude Cornen maintains image galleries for the Zoo Gems, Arp and SWIFT projects in Zoo Gems Talk. Our thanks go to everyone who helped draw attention to these galaxies, or voted in the Zoo Gems object selection.

Announcing: Jan 13 press conference on Galaxy Zoo: Clump Scout results

I’m Nico, a PhD student with the Galaxy Zoo team, and I have an exciting announcement. About a year ago I wrote that classifications on the Galaxy Zoo: Clump Scout project had just finished. Now, with the first results nearing publication, the American Astronomical Society (AAS) has chosen Clump Scout to present its findings at an official press conference on Thursday, January 13 from 4:15-5:15pm Eastern Time (or 9:15-10:15pm GMT for our UK visitors). We’re very excited to finally share these results with our volunteers!

The press conference is free and open to all, so if you took part in the project, we encourage you to tune in to learn more about where your efforts have gone. (Or, if you’ve never heard of the Clump Scout project before, now is a great chance to learn!) I’ll spend a few minutes explaining why we created the project, and describe a few clues we’ve found as to the last 10 billion years of galaxy evolution. There will also be 4 other speakers presenting about their own citizen science work, so it will be a thorough tour of what’s going on in people-powered astronomy today.

We hope you can join us!

How to join:

You can watch via the live stream on AAS’s YouTube channel: https://www.youtube.com/c/AASPressOffice

PS. For more galaxies at the AAS (although not Galaxy Zoo directly), also see our PI Karen Masters talking about the completion of the MaNGA Galaxy Survey, Tue 11th Jan in the 2.15pm ET Press Conference. MaNGA is the survey Galaxy Zoo: 3D was designed to help analyse; look out for more crowd-sourcing projects to come from this complex data now it’s all publicly available, as well as much more use of the Galaxy Zoo: 3D classifications.

New Paper – Practical Galaxy Morphology Tools

Last year, we published the GZ DECaLS catalog: detailed morphology classifications for 314,000 galaxies. We classified so many galaxies by training AI models to learn from volunteers and work alongside them. This raises the question – what else can we do with those models?

It turns out that we can use them to make three new practical tools that will help both professional researchers and volunteers. You can read all about them in our new paper out today: https://arxiv.org/abs/2110.12735.

The first practical tool is a similarity search. You can type in the coordinates of a galaxy, and it will try to show you the most similar galaxies. Try it out on your favourite DECaLS galaxy. For now, it’s a simple demo website, but we hope to eventually integrate this into Galaxy Zoo.
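
To give a flavour of how a similarity search can work, here is a small sketch. It assumes each galaxy has already been summarised as a short list of numbers (a "feature vector") by a trained model, and it ranks galaxies by cosine similarity to the one you ask about. The galaxy names, the three-number vectors and the choice of cosine similarity are illustrative assumptions, not necessarily what the demo website does.

```python
import math

def most_similar(query_name, features, top_k=3):
    """Rank other galaxies by cosine similarity to the query's feature vector."""
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm

    query = features[query_name]
    scores = [
        (cosine(query, vec), name)
        for name, vec in features.items()
        if name != query_name
    ]
    scores.sort(reverse=True)  # highest similarity first
    return [name for _, name in scores[:top_k]]

# Made-up feature vectors for four hypothetical galaxies.
features = {
    "ring_a": [0.9, 0.1, 0.0],
    "ring_b": [0.8, 0.2, 0.1],
    "edge_on": [0.0, 0.1, 0.9],
    "merger": [0.1, 0.9, 0.2],
}
neighbours = most_similar("ring_a", features, top_k=2)
```

In this toy example the second ringed galaxy comes out as the closest match, because its vector points in nearly the same direction as the query's.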

The second is a new method for finding the galaxies most interesting to you personally. Imagine a website where you can rate galaxies by how interesting you find them. As you rate galaxies, the website shows you new ones based on your previous ratings – just like how Netflix suggests new series (I’m a big Bojack fan myself). The system is too complicated to create a simple demo to show you, but you can see some examples in the new paper. Thanks to funding from the Sloan Foundation, we’re making this even better and adding it as an official Zooniverse feature.

The third is about adapting the AI models to classify new kinds of galaxies. If a researcher wants a model that can find ringed galaxies, for example, they would usually have to start by gathering tens of thousands of examples of ringed galaxies with which to teach their new model. This takes a long time and a lot of effort, especially for rarer galaxies. However, a model already trained on Galaxy Zoo classifications needs just hundreds of example galaxies to learn to find rings as well. This will let researchers “fine-tune” models to help solve their own specific science problems. That includes me! I’m running a Galaxy Zoo Mobile project to make a new ring catalogue with this approach.

All these tools work because of your classifications. As well as using them directly in science catalogues, we need them to train better AI models. Thank you for your contribution.

If you have any spare time – maybe on the bus, or just sitting around scrolling – I would really appreciate your help finding ring galaxies by swiping left and right on Galaxy Zoo Mobile, part of our Zooniverse app (Apple, Android). I’m hoping to build the biggest catalogue of rings ever assembled so we can understand how they form. Please join in if you can.

Cheers,

Mike

P.S. You can find a few more technical details on my personal blog.

New Galaxy Zoo Mobile challenge – Ringed Galaxies

My name is Mike – I’m a researcher helping run the Zooniverse project Galaxy Zoo.

I’m launching a new challenge within Galaxy Zoo Mobile, the version of GZ that runs on our mobile app (iOS, Android, scroll down to “Space” projects).

The challenge is to find galaxies with rings. I’ve picked out the 25,000 galaxies where some* volunteers voted for “Ring” on the final GZ question – “Does this galaxy have any rare features?”. Now it’s time to do a targeted search through these promising galaxies. Swipe left and right on GZ Mobile to tell us which ones you think have rings.
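
The selection rule (spelled out in the first footnote at the end of this post) can be sketched in a few lines. The galaxy ids and vote fractions below are made up, and the function name is a hypothetical stand-in rather than the code actually used to build the list.

```python
def top_third_by_ring_fraction(vote_fractions):
    """Return galaxy ids whose "ring" vote fraction is in the top third."""
    ranked = sorted(vote_fractions, key=vote_fractions.get, reverse=True)
    keep = len(ranked) // 3  # size of the top third, rounded down
    return ranked[:keep]

# Fraction of volunteers who answered "Ring" for six hypothetical galaxies.
vote_fractions = {
    "g1": 0.05, "g2": 0.40, "g3": 0.10,
    "g4": 0.00, "g5": 0.25, "g6": 0.02,
}
selected = top_third_by_ring_fraction(vote_fractions)
```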

This is what galaxies with rings look like. I think these are easily the most beautiful galaxies we’ve ever shown on Galaxy Zoo, with glittering spiral arms and intricate structures. We’ve zoomed in each picture about 25% more than in Galaxy Zoo itself, so you’ll see all that fine detail.

We want to find galaxies with rings because they’re a mystery. Astronomers aren’t sure what causes rings. 

One leading theory is that they form from disk galaxies left undisturbed for hundreds of millions of years. Theoretical calculations and computer simulations suggest that the gravity of stars in the galaxy’s bar or bulge can cause the orbits of nearby stars to change, first making spiral arms and eventually a ring shape. Another theory is that rings are caused by head-on collisions where a small galaxy punches through the middle of a large disk galaxy, like a rock dropped into a pond.

The truth is that there are probably different kinds of ring, formed by different processes. Working out which processes form which rings will require many examples of each – and that’s where you come in. 

This targeted project is all about finding as many rings as possible. Once we know which galaxies have rings, we can follow up with future projects to divide them into different categories, and compare those categories to find out what creates each type of ring. 

As always with Galaxy Zoo, your classifications will be publicly shared with all researchers to help everyone investigate rings. We will also use your classifications to teach a new version of Zoobot, our galaxy-classifying AI, to find rings. Zoobot can then help find more rings in the million-or-so galaxies recently released by the DECaLS survey** that we haven’t yet uploaded to Galaxy Zoo. 

If you have any questions, come chat to our community and me on the Galaxy Zoo Talk forum.

Cheers,

Mike

* Specifically, galaxies where the fraction of volunteers answering “ring” is in the top third (typically about two or more volunteers).

** The published catalog from Galaxy Zoo DECaLS used images from Dark Energy Camera Legacy Survey data release 5 and earlier. The survey has since released more galaxy images, some of which have already been uploaded to Galaxy Zoo.

Stronger bars help shut down star formation

Hi everyone!

I’m Tobias Géron, a PhD student at Oxford. I have been using the classifications of the Galaxy Zoo DECaLS (GZD) project to study differences between weak and strong bars in the context of galaxy evolution. We have made a significant amount of progress and I was able to present some results a couple of weeks ago at a (virtual) conference in the form of a poster, which I would love to share with you here as well.

To summarise: I have been using the classifications from GZD to identify many weakly and strongly barred galaxies. Some example galaxies can be found in the first figure on the poster. As the name already implies, strong bars tend to be longer and more obvious than weak bars. But what exactly does this mean for the galaxy in which they appear?

One of the major properties of a galaxy is whether it is still forming stars. Interestingly, in Figure 2 we observe that strong bars appear much more frequently in galaxies that are not forming stars (called “quiescent galaxies”). This is not observed for the weak bars. This suggests one of two things: either the strong bar helps to shut down star formation in galaxies or it is easier to form a strong bar in a quiescent galaxy.

In an attempt to answer this chicken or egg problem, we turn to Figure 3. Here, we show that the rate of star formation in the centre of the galaxy is highest for the strongly barred galaxies that are still star forming. This suggests that those galaxies will more quickly empty their reservoir of gas, which is needed to make stars, and are on a fast track to quiescence.

I’m also incredibly happy to say that we’ve written a paper on this as well, which has recently been accepted for publication! You can currently find it here. Apart from the results described above, we also delve more deeply into whether weak and strong bars are fundamentally different physical phenomena. Feel free to check it out if you’re interested!

It’s amazing to see all this coming to fruition, but it couldn’t have been possible without the amazing efforts of our citizen scientists, so I want to thank every single volunteer for all their time and dedication. We have mentioned this in the paper too, but your efforts are individually acknowledged here. Thank you!

Cheers,

Tobias

Clump Scout wrap-up: What are we doing with your 2.7 million clicks?

Hi all. My name is Nico Adams from the Galaxy Zoo science team.

Writing my first scientific paper has been equal parts exhausting and exhilarating. On Thursday, February 11, I got to put a tally in the “exhilarating” column. The paper is on the first scientific results covering the Galaxy Zoo: Clump Scout project, and I was putting the final touches on my first draft when I saw that you all had submitted the project’s final classifications. The Clump Scout project had a lofty goal — to search for large star-forming regions in over 50,000 galaxies from the Sloan Digital Sky Survey — and the fact that the Clump Scout volunteers have managed to finish it is an incredible achievement.

We’re looking forward to sharing our results over the next few months. Clump Scout is not only the first citizen science project to search for giant clumps in galaxies, but it’s the first large-scale project of any kind to look for clumps in the “local” universe (out to redshift ~0.1, or within a billion-or-so light-years of us). The data set presented by this project is unique, and we are nearly finished with our first round of analysis on it.
We’re currently preparing two papers that will cover the results directly. One is focused on the algorithm that turned volunteers’ clicks into “clump locations”, while the other — my first paper — is focused on the clump catalog and scientific results we derived from it. While these papers go through a few months of revision and review, we wanted to publish a few blog posts previewing the results. This blog post will focus on the first one: We’ll explain what happened to your clicks after you sent them to us. Clump Scout could not have happened without our volunteers, and we thank you immensely for your support.


When we designed Clump Scout, we knew from the outset that we wanted classifications to be as simple as possible. The original plan was to have volunteers click on any clumps they saw, then immediately move on. While the final design was a bit more complex (a few different types of marks were available) that basic design — mark the clumps, then move on — was still present.

The classification interface after a volunteer submits their clump locations usually looks something like this:

By comparison, the “science dataset” — which consists of 20 volunteers’ classifications all laid on top of each other — looks more like this:

Just by glancing at this image, it’s clear that there are a few “hot spots” where clumps have been identified. However, correctly identifying these hot spots in every image can be EXTREMELY tricky. The software that deals with this problem is called the “aggregator”, and it has to strike a balance between identifying as many clumps as possible and filtering out the isolated marks in the image.

The standard way of solving this problem in computer science is to use a “clustering algorithm”. Clustering algorithms are a very broad class of techniques used to identify clusters of points in space, and most of them are very simple to implement and run. Below, you can see the results of one clustering algorithm — called the “mean shift” algorithm — in practice.

Most clumps have been spotted correctly, and the results look good! However, it took quite a bit of fine-tuning and filtering to get the results to look like this. In the image above, the “bandwidth” parameter — the approximate “size” of each cluster — is about equal to the resolution of the image. Increasing the bandwidth can make the algorithm identify more clumps by grouping together clusters of points that are more diffuse. Unfortunately, the larger bandwidth also increases the likelihood that two or more “real” clumps will mistakenly be grouped into one. Here are the clusters we get when the bandwidth is twice as large:

Now that we’ve allowed clusters to be more spread-out, we’ve picked up on the cluster in the upper left. But, the three distinct clumps at the bottom edge of this galaxy have melded into just two, which is not what we want! This is just one of the parameters that we needed to tune. Another is the number of marks required to call a cluster a “clump”. Require too many, and you ignore valuable objects that we’re interested in. Require too few, and the algorithm picks up on objects that are really just noise.
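
If you’d like to play with this trade-off yourself, here is a bare-bones sketch of a mean shift clusterer with the two knobs discussed above: the bandwidth and the minimum number of marks per cluster. It uses a simple "flat" kernel and made-up click positions, so it is a toy under those assumptions and not the code we ran on the real data.

```python
import math

def mean_shift(points, bandwidth, min_marks=1, n_iter=50):
    """Flat-kernel mean shift: returns (centre, n_marks) for each cluster."""
    modes = []
    for start in points:
        x, y = start
        for _ in range(n_iter):
            # Move to the average position of all marks within the bandwidth.
            near = [p for p in points if math.dist((x, y), p) <= bandwidth]
            new_x = sum(p[0] for p in near) / len(near)
            new_y = sum(p[1] for p in near) / len(near)
            if math.dist((x, y), (new_x, new_y)) < 1e-6:
                break
            x, y = new_x, new_y
        modes.append((x, y))

    # Merge marks whose searches ended up at (nearly) the same place.
    clusters = []  # each entry: [centre, number of marks]
    for mode in modes:
        for cluster in clusters:
            if math.dist(mode, cluster[0]) <= bandwidth / 2:
                cluster[1] += 1
                break
        else:
            clusters.append([mode, 1])
    return [(centre, n) for centre, n in clusters if n >= min_marks]

# Made-up marks (in pixels): two nearby clumps plus one stray click.
marks = [
    (10.0, 10.0), (10.5, 9.5), (9.5, 10.5), (10.0, 10.2),
    (14.0, 10.0), (14.3, 10.1), (13.8, 9.9),
    (30.0, 30.0),
]
tight = mean_shift(marks, bandwidth=1.5, min_marks=3)
wide = mean_shift(marks, bandwidth=5.0, min_marks=3)
```

With the small bandwidth the two clumps are kept apart, while the large bandwidth merges them into a single cluster. In both cases the stray click is dropped by the minimum-marks cut.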

How do we solve this problem? One thing that we tried was to have three members of the science team classify 1,000 galaxies, so that we could see how their classifications agreed with each other and with volunteers’ marks. We found that when 2 out of 3 members of the science team identified a clump, a majority of volunteers identified it as well. This was a good sign, and it told us about how many volunteer marks to expect per clump. In general, if 60% of volunteers leave a mark within a few pixels of the same spot, we consider that spot to be a clump.
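
As a rough illustration of that 60% rule, the sketch below counts how many volunteers left at least one mark close to a given spot. The 3-pixel radius stands in for "a few pixels" and is an assumption, as are the volunteer names and mark positions.

```python
import math

def is_clump(spot, marks_by_volunteer, radius=3.0, min_fraction=0.6):
    """True if enough volunteers marked within `radius` pixels of `spot`."""
    agreeing = sum(
        1
        for marks in marks_by_volunteer.values()
        if any(math.dist(spot, mark) <= radius for mark in marks)
    )
    return agreeing / len(marks_by_volunteer) >= min_fraction

# Made-up marks from five hypothetical volunteers on one image.
marks_by_volunteer = {
    "v1": [(10.0, 10.0)],
    "v2": [(11.0, 9.0)],
    "v3": [(9.0, 11.5)],
    "v4": [(10.5, 10.5), (40.0, 12.0)],
    "v5": [(25.0, 30.0)],
}
popular_spot = is_clump((10.0, 10.0), marks_by_volunteer)
lonely_spot = is_clump((40.0, 12.0), marks_by_volunteer)
```

Each volunteer counts once per spot, however many times they clicked near it, so a single enthusiastic clicker cannot create a clump on their own.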

Another technique that we used was more radical. While we started out using the simple clustering algorithm we’ve described so far, we found that it was much more effective to account for who was leaving each mark. Every volunteer is an individual person, with their own clump-classifying habits. Some volunteers are very conservative and only click on a clump when they’re completely certain; others are optimists who want to make sure that no faint clumps get missed. Sometimes volunteers make genuine mistakes and believe it or not we even get a few spammers who just click all over the image! We wanted to design an aggregation system that would make best use of all volunteers’ skills and talents (and if possible even the spammers!) to help us find as many real clumps as possible, without accidentally including any other objects that can masquerade as clumps. 

To build our aggregation system, we started with an idea that was first proposed by Branson et al (2017). At its core, our system still uses a type of clustering algorithm, called a facility location algorithm. The facility location algorithm builds clusters of volunteer clicks that have a very specific connectivity pattern, which looks like this.

An example of the “facility location” algorithm. The blue “F”s mark proposed facilities, which are connected to red “C”s (cities). In practice, the facilities represent the true locations of clumps while the cities represent your marks identifying them.

Each cluster contains a central node, referred to as a “facility”, which is connected to one or more other nodes, referred to as “cities”. Facility location algorithms get their name because they are often used to minimise the cost of distributing some essential commodity like electricity or water from a small number of producers (the facilities) to a larger number of consumers (the cities). Building a facility incurs a cost and so does connecting a city to a facility. When we use the algorithm in our aggregator, the volunteer clicks that we want to group into clusters become the facilities and cities. The trick to finding the right clusters is how we choose to define the costs for facility creation and facility-city connection. 
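
To make the cost idea concrete, here is a toy sketch of facility location using a simple greedy strategy: keep opening new facilities as long as doing so lowers the total bill. It assumes one fixed price for opening any facility and a connection cost equal to the distance in pixels. Both choices, and the greedy strategy itself, are simplifications for illustration; the costs in our aggregator come from the statistical model described next.

```python
import math

def total_cost(points, facilities, open_cost):
    """Cost of opening the facilities plus connecting each point to its nearest one."""
    connect = sum(min(math.dist(p, points[f]) for f in facilities) for p in points)
    return open_cost * len(facilities) + connect

def greedy_facility_location(points, open_cost):
    """Greedily open facilities (chosen among the points) while the total cost drops."""
    indices = range(len(points))
    # Start with the single facility that serves everyone most cheaply.
    facilities = [min(indices, key=lambda i: total_cost(points, [i], open_cost))]
    best = total_cost(points, facilities, open_cost)
    while True:
        candidates = [i for i in indices if i not in facilities]
        if not candidates:
            break
        choice = min(
            candidates,
            key=lambda i: total_cost(points, facilities + [i], open_cost),
        )
        cost = total_cost(points, facilities + [choice], open_cost)
        if cost >= best:
            break  # opening another facility no longer pays for itself
        facilities.append(choice)
        best = cost
    # Connect every click (city) to its nearest open facility.
    assignment = [min(facilities, key=lambda f: math.dist(p, points[f])) for p in points]
    return facilities, assignment

# Made-up clicks around two clumps (pixel coordinates).
clicks = [
    (10.0, 10.0), (10.5, 9.5), (9.5, 10.5),
    (14.0, 10.0), (14.3, 10.1), (13.8, 9.9),
]
facilities, assignment = greedy_facility_location(clicks, open_cost=2.0)
```

Here the algorithm opens two facilities, one per clump, because a third would cost more to open than it would save in connections. Raising the opening cost makes it more reluctant to declare new clumps.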

The costs we use are based on a statistical model that tries to understand how different volunteers behave when they classify clumpy galaxies. For each volunteer, the model learns how likely that volunteer is to miss real clumps or accidentally click on other features in the subject images. The exact location of real clumps in an image can be ambiguous, so when the model thinks that a volunteer has clicked on a real clump, it also tries to predict how accurate their annotation is. But it isn’t just the volunteers that are unique – different subjects have different characteristics too, and it may be much more difficult to spot clumps in some galaxies than it is in others. For example, spotting bright, well separated clumps on a faint background is likely to be much easier than spotting faint closely packed clumps in a noisy image. Our aggregator model takes this into account as well by trying to understand just how difficult finding clumps is in different images.

How does the aggregator model work out how volunteers are behaving? Do we tell it the right answer for a handful of subjects and check the volunteers’ annotations against them? Actually no, because we don’t know exactly what the right answer is! One of the goals of Galaxy Zoo: Clump Scout was to let the volunteers decide together exactly what it takes for a feature to be a clump. So we don’t give our model any information except the clicks that the volunteers provide. Just by comparing how different volunteers respond to different images as the classifications arrive, and comparing their annotations with the clusters found by the facility location algorithm, our model slowly learns the combination of all volunteer behavioural traits and image difficulties that best explain the classification data it has seen.

Once our model provides its best description of the volunteers and images, we define the costs for the facility location algorithm. We specify that turning a volunteer’s click into a facility is more expensive for very optimistic volunteers, who might click on slightly more features that aren’t really clumps. This reduces the chance of accidentally contaminating the clump detections. Connecting clicks to an existing facility costs more if the volunteers that provided them seem optimistic. On the other hand, if it seems like a volunteer is more pessimistic or their clicks are slightly less accurate, then it becomes cheaper to connect their clicks into an existing cluster. This ensures that we don’t miss those hard-to-spot clumps with fewer clicks or more widely spread clicks.

But wait a minute! Were you reading carefully? Our model’s understanding of the volunteers and images is partly based on the clusters that were found, but the cost of creating the clusters depends on the volunteers’ behaviour! How does that work?! Good question. Whenever a new volunteer joins the project, we don’t know anything about them, so we make some reasonable assumptions about how they will behave. In a similar way, we assume that all subjects have roughly similar characteristics. We call these assumptions the “priors” of our model. These priors let us get started with a really rough set of clusters that our model can use to make an initial guess about the volunteers and subjects. Then we can use that guess to set some new costs and find some new, more refined clusters. With these clusters, our model can make another, better-informed prediction. Our algorithm keeps refining its guess and click-to-cluster assignments over and over again until the model predictions and the corresponding clusters don’t change any more. 
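The back-and-forth described above can be sketched as a small loop. This is a toy under stated assumptions, not our actual pipeline: clicks are one-dimensional positions, a simple greedy grouping stands in for the facility location algorithm, and each volunteer is described by a single made-up `trust` number instead of the full statistical model. The names `cluster_clicks`, `aggregate`, `radius` and `min_support` are all hypothetical.

```python
def cluster_clicks(clicks, trust, radius=2.0, min_support=1.0):
    """Group nearby clicks, then keep groups with enough summed trust.

    `clicks` is a list of (volunteer, position) pairs. This greedy grouping is
    a stand-in for facility location, used only to keep the sketch short.
    """
    order = sorted(range(len(clicks)), key=lambda i: clicks[i][1])
    groups = []
    for i in order:
        pos = clicks[i][1]
        if groups and pos - clicks[groups[-1][-1]][1] <= radius:
            groups[-1].append(i)  # close to the previous click: same group
        else:
            groups.append([i])    # too far away: start a new group
    return [g for g in groups if sum(trust[clicks[i][0]] for i in g) >= min_support]


def aggregate(clicks, prior_trust=0.5, max_iter=50):
    """Alternate between clustering and re-estimating volunteers until stable."""
    volunteers = {v for v, _ in clicks}
    trust = {v: prior_trust for v in volunteers}  # the "prior": everyone starts equal
    clusters = None
    for _ in range(max_iter):
        new_clusters = cluster_clicks(clicks, trust)
        if new_clusters == clusters:
            break  # clusters stopped changing, so the estimates have settled
        clusters = new_clusters
        kept = {i for c in clusters for i in c}
        for v in volunteers:
            mine = [i for i, (u, _) in enumerate(clicks) if u == v]
            hits = sum(i in kept for i in mine)
            # Smoothed fraction of this volunteer's clicks that joined a cluster.
            trust[v] = (hits + prior_trust) / (len(mine) + 1.0)
    return clusters, trust


clicks = [("a", 10.0), ("b", 10.5), ("c", 11.0), ("a", 30.0), ("b", 30.4), ("c", 50.0)]
clusters, trust = aggregate(clicks)
```

In this example the lone click at position 50 never gathers enough support to become a cluster, so volunteer `c` ends up with a lower trust value than `a` or `b`, and the loop stops as soon as a second pass returns the same clusters.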

Compared to our simplest aggregator, our more advanced method is better at picking up faint clumps and filtering out noise. It’s also the first time this sort of method has been used in the pipeline of a major citizen science project like this one. This aggregator will be the subject of one of our upcoming papers on Clump Scout, and we are very excited to share the results.


A special thanks on this post goes out to the other members of the Clump Scout team, who helped ensure that the details of our aggregation process were as accurate and simply explained as possible. In the next week or two we’ll publish a second post detailing some of the scientific findings we’ve gotten from our results. Thank you, and stay tuned!

Happy Data Release Day: DECaLS goes live

I’m delighted to say that – with the release of the accompanying paper on the arXiv – the first data release from our Galaxy Zoo classifications of galaxies from the DECaLS survey is now live! The paper is still under review at the journal, but as lead author Mike Walmsley is handing in his thesis (congratulations!) it seemed like a good time to release the data.

As the title suggests, this data relies on classifications submitted by our wonderful Galaxy Zoo volunteers from 2015 to 2020, particularly via the ‘Enhanced’ workflow where classifications are used to educate a friendly robot assistant, speeding up the process dramatically. As a result, we have detailed classifications for 314,000 galaxies based on deeper imaging than we’ve ever had before.

The results are dramatic! In the figure above you can see a comparison between the fraction of votes a galaxy received for being ‘featured’ in our previous data release and the fraction it received with the new DECaLS imaging. If the new imaging made no difference, the galaxies would all lie on the dotted line, but they’re mostly above it – volunteers are seeing more features in galaxies in deeper imaging. All of which makes sense, but it’s still gratifying.

We’re all looking forward to getting stuck into this dataset – and Mike has built a tool for you to explore with. Using this interface, you can sort through the data and look at the results – below is a quick sample of double rings Sandor cooked up in no time at all.

We’re not done by a long shot – unlike these systems, the galaxies currently awaiting your inspection over at GalaxyZoo.org have not been previously classified – with your help, hopefully it won’t be too long before we can add them to the catalogue. In the meantime – thanks for all your help!

Chris

P.S. Pulling this paper together was a real team effort so I want to thank each and every one of the team for their hard work getting this over the line. We haven’t forgotten the volunteers either – the final, published version will have an author list online with the names of everyone who contributed and we’ll email you all a link.

Press Release on Results from Galaxy Zoo: 3D

Many of you helped out with the Galaxy Zoo spinoff project, Galaxy Zoo: 3D. I am happy to let you know that I am presenting results from this project today at the 237th Meeting of the American Astronomical Society. You can view the iPoster I made about it at this link.

This spin-off project was aimed at supporting the MaNGA (Mapping Nearby Galaxies at Apache Point Observatory) survey, which is part of the Sloan Digital Sky Surveys (SDSS). Thanks to your input we have been able to crowdsource maps which show where the spiral arms, bars and any foreground stars are present in every galaxy observed by MaNGA. This, combined with the MaNGA data, is helping to reveal how these internal structures impact galaxies.

The results will be part of a Press Conference about this and other SDSS results, live streamed at 4.30pm ET (9.30pm GMT) on the AAS Press Office YouTube Channel. The press release about them will go live on the SDSS Press Page at the same time. Direct link to press release (will only work after 4.30pm ET).

Thanks again for your contributions to understanding how galaxies work.

A sad farewell

I recently received word from his wife of the death of Jean Tate on November 6. Jean had been a very active participant in several astronomical Zooniverse projects for a decade, beginning with Galaxy Zoo. It does no disservice to other participants to note that he was one of the people who could be called super-volunteers, carrying his participation in both organized programs and personal research to the level associated with professional scientists. He identified a set of supergiant spiral galaxies, in work which was, while in progress, only partially scooped by a professional team elsewhere, and was a noted participant in the Andromeda project census of star clusters in that galaxy. In Radio Galaxy Zoo, he was a major factor in the identification of galaxies with strong emission lines and likely giant ionized clouds (“RGZ Green”), and took the lead in finding and characterizing the very rare active galactic nuclei with giant double radio sources from a spiral galaxy (“SDRAGNs”). He did a third of the work collecting public input and selecting targets to be observed in the Gems of the Galaxy Zoos Hubble program. Several of us hope to make sure that as much as possible of his research results from these programs is published in full.

Jean consistently pushed the science team to do our best and most rigorous work. He taught himself to use some of the software tools normally employed by professional astronomers, and was a full colleague in some of the Galaxy Zoo research projects. His interests had been honed by over two decades of participation in online forum discussions in the Bad Astronomy Bulletin Board (later BAUT, then Cosmoquest forum), where his clarity of logic and range of knowledge were the bane of posters defending poorly conceived ideas.

Perhaps as a result of previous experiences as a forum moderator, Jean was unusually dedicated to as much privacy as one can preserve while being active in online fora and projects (to the point that many colleagues were unaware of his gender until now). This led to subterfuges such as being listed in NASA proposals as part of the Oxford astronomy department, on the theory that it was the nominal home of Galaxy Zoo. Jean was married for 27 years, and had family scattered in both hemispheres with whom he enjoyed fairly recent visits. Mentions in email over the years had made me aware that he had a protracted struggle with cancer, to the extent that his case may someday be identifiable in medical research. He tracked his mental processes, knowing how to time research tasks in the chemotherapy cycle to use his best days for various kinds of thinking.

This last month, emails had gone unanswered long enough that some of us were beginning to worry, and the worst was eventually confirmed. I felt this again two days ago, which was the first time I did not forward notice of an upcoming Zoo Gems observation by Hubble to Jean to be sure our records matched.

Ad astra, Jean.