Galaxy Zoo and Zooniverse review article posted today on arXiv
One of the really cool aspects of Galaxy Zoo is the link between the data generated by you all (the humans) and the data processed by computer algorithms (the machines). With Galaxy Zoo and its sister Zoos, we are showing that the machine classifiers can learn from the human classifiers. This is great because, believe it or not, the data is just going to keep flowing, more and more, faster and faster. By the end of this decade, when the Large Synoptic Survey Telescope (LSST) is online, the data will be coming in at tens of terabytes a night. By comparison, all the data that you classified in Galaxy Zoo 1 from the Sloan Digital Sky Survey took up only a few terabytes in total. So those machines have to get much better at classifying if we don’t want to drown in the data, and you all are showing the way.
This whole area of work, training computer algorithms, is called Machine Learning. A related endeavor, called Data Mining, applies these algorithms to large quantities of data to extract patterns or knowledge. A book called “Advances in Machine Learning and Data Mining for Astronomy” (edited by Michael Way, Jeff Scargle, Ashok Srivastava, and Kamal Ali) is going to be published soon, and the Galaxy Zoo team is really excited because we were asked to contribute a chapter. The chapter is titled “Galaxy Zoo: Morphological Classification and Citizen Science”. We got special agreement from the editors allowing us to post our chapter on the arXiv. Here’s the link to the article [http://arxiv.org/abs/1104.5513] so you don’t have to wait for the book to come out! A lot of the folks from the Galaxy Zoo team contributed to the writing, and it was fun to put together. The article gives a great overview of “how it all began” and the birth of the Zooniverse, and, of course, we describe several of the discoveries you all have made. We finish by describing how we think the citizen science method of data analysis is going to be essential in conquering the flood of data. So take a look, and we hope you have as much fun reading it as we had writing it.
Lucy (on behalf of all the chapter authors)