Galaxy Zoo and Zooniverse review article posted today on ArXiv

The Hubble Tuning Fork diagram, developed to aid in galaxy classification. Galaxy Zoo showed that humans working together are better than machine algorithms at classifying galaxies.

One of the really cool aspects of Galaxy Zoo is the link between the data generated by you all (the humans) and the data processed by computer algorithms (the machines). With Galaxy Zoo and its sister Zoos, we are showing that the machine classifiers can learn from the human classifiers. This is great because, believe it or not, the data is just going to keep flowing. And flowing: more and more, faster and faster. By the end of this decade, when the Large Synoptic Survey Telescope (LSST) is online, the data will be coming in at tens of terabytes a night. By comparison, all the data that you classified in Galaxy Zoo 1 from the Sloan Digital Sky Survey took up only a few terabytes in total. So those machines have to get much better at classifying if we don’t want to drown in the data, and you all are showing the way.

This whole area of work, training computer algorithms, is called Machine Learning. A related endeavor, called Data Mining, applies these algorithms to large quantities of data to extract patterns or knowledge. A book called “Advances in Machine Learning and Data Mining for Astronomy” (edited by Michael Way, Jeff Scargle, Ashok Srivastava, and Kamal Ali) is going to be published soon. The Galaxy Zoo team is really excited because we were asked to contribute a chapter to this book. The chapter is titled “Galaxy Zoo: Morphological Classification and Citizen Science”. We got special agreement from the editors allowing us to post our chapter on the arXiv. Here’s the link to the article [http://arxiv.org/abs/1104.5513] so you don’t have to wait for the book to come out! A lot of the folks from the Galaxy Zoo team contributed to the writing, and it was fun to put together. The article gives a great overview of “how it all began”, the birth of the Zooniverse and, of course, several of the discoveries you all have made. We finish by describing how we think the citizen science method of data analysis is going to be essential in conquering the flood of data. So take a look; we hope you have as much fun reading it as we had writing it.

Lucy (on behalf of all the chapter authors)

About The Zooniverse

Online citizen science projects. The Zooniverse is doing real science online.

2 responses to “Galaxy Zoo and Zooniverse review article posted today on ArXiv”

  1. rick nowell says :

    Nice summary, especially the paragraphs about Galaxy Zoo ‘Peas’! Discoveries such as the Peas and the Voorwerp, I venture, will only be possible through human study and ingenuity. There is only so much a machine or algorithm can do! Good luck though with your future findings. On a practical level, having a computer analyse differing images will certainly be necessary, but will the computers have a forum such as Galaxy Zoo’s? Probably not…

  2. Lucy says :

    Hi Rick
    Thanks for your comments. Indeed, I think you are right about discoveries such as the Peas and the Voorwerp coming through human interaction with the data. That’s the most brilliant and exciting aspect of the future with all the Zoo projects. While we would be able to train algorithms to look for Peas in LSST data, the point is that there are likely so many other types of objects out there we just don’t know about that we need humans to find them. But it isn’t just about having Zooites twig to something different; it’s about providing tools so that when a user sees something weird or wonderful, they have a place to talk about it and study it. Fortunately, we have some funding from the NSF to help us improve the tools for further investigations by the users. More about that later!
