Man vs Machine?
Manda’s paper on improving automatic galaxy classification seems to have caused quite a bit of concern and comment. Is the future of the Zoo portrayed above, with an out-of-control machine wrecking all we’ve come to hold dear? After all, we’ve always believed it important that we don’t waste your time by having you do tasks that computers are perfectly capable of completing. Are the Zookeepers putting the Zoo out of business?
You’ll be pleased to hear the short answer is obviously ‘no’. The long answer is more interesting; it turns out that we absolutely have to work together with machines in order to keep Galaxy Zoo alive for the next five years or more.
Looking at the history of galaxy classification is interesting – by the early 1990s, astronomers were aware that surveys like the Sloan Digital Sky Survey, which provides the Galaxy Zoo data, were going to be too big for astronomers themselves to classify. They therefore threw themselves into developing automatic classification routines, but as we all know human classifications were still the gold standard. That’s why Galaxy Zoo was needed, and why we’ve made the impact we have in just a couple of short years.
Looking forward, though, it’s clear that the advent of the Zoo has only bought us a little more time in the race against machine. New surveys, larger, deeper and more ambitious than the Sloan, are being planned; one of the largest, the Large Synoptic Survey Telescope, is estimated to produce 30 TB of data per night. 30 TB is a lot: the equivalent of 20 months’ worth of high quality video, for example.
That amount of data will overwhelm even the largest Zoo. We’ll need to automatically classify most of it – and more importantly use machines to decide which objects are interesting enough (or confusing enough!) to be passed to humans for more careful attention. We’ve already done this once, with Galaxy Zoo: Supernova taking the output of an automated classification routine and sending the most likely supernovae to the Zoo for further analysis.
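For the curious, the "machine first, humans for the hard cases" idea can be sketched in a few lines of Python. This is only an illustration, not the routine used by Galaxy Zoo or in Manda’s paper: it assumes a hypothetical classifier that gives each galaxy a probability of being a spiral (`p_spiral`), and the 0.9 confidence threshold, the `route` function and the object names are all made up for the example.

```python
def route(p_spiral, confident=0.9):
    """Decide what to do with one galaxy, given a hypothetical
    machine-estimated probability that it is a spiral.

    The threshold `confident` is an assumed value, not a real setting.
    """
    if p_spiral >= confident:
        return "spiral"        # machine is sure: accept its label
    if p_spiral <= 1 - confident:
        return "elliptical"    # machine is sure the other way
    return "ask_humans"        # confusing case: pass it to the Zoo


# Made-up scores for three made-up objects.
scores = {"galaxy_a": 0.97, "galaxy_b": 0.03, "galaxy_c": 0.55}

for name, p in scores.items():
    print(name, route(p))
```

Raising the threshold sends more galaxies to humans; lowering it trusts the machine more. Where to draw that line is exactly the kind of thing Zoo classifications could help decide.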
This is a huge challenge for machine learning academics and researchers, and I suspect you’ll be hearing a lot more about our efforts in this direction. Crucially, what Manda’s paper shows for the first time is that the automatic routines can be improved by the use of Zoo data. The neural network ‘learns’ how to think like the Zoo and does a pretty good (but not yet good enough) job – and that’s good for both humans and computers.
Heh, yep, I can see the resemblance! 😉
So, it seems to be a case of co-operation between humans and machines, with us teaching the machine what to look for and then providing more human input to analyze the unusual objects selected through auto-screening by machines based on criteria set by our Zoo data. Very interesting! Thanks, Chris, for the update.
So the fleeing crowd should really be turning round and wagging their collective fingers at the machine telling it to make sure it learns well! Otherwise……. :0)
It’s always going to be a case of working in tandem with machines. After all, that’s why our forefathers invented them.
Look also at the progress of the GalaxyZoo to date: the zooites themselves have initiated several projects. There is not likely to be a case when the computers themselves would decide on new projects based on sets and spats of data they have encountered!
So, long live man!
It helps someone like me help the Zoo if I can understand more clearly what to look for.
For example, the supernova instructions left me looking at two photos that to me looked identical yet I was told one was ‘obviously’ a supernova and the other ‘obviously’ wasn’t. Sorry; there was nothing obvious to me about them either way.
The only place where I could clearly help was in whether the event was centered on the host galaxy or not. I finally made all my decisions on that basis. It speeded up the clicking, but I have no idea if my work was of any value to you.
In Galaxyzoo2 I am seeing much more clearly that some galaxies I used to classify as ellipticals are actually spirals in early or late development. [I’d love to know which.] The photos are clearer, and so are my classifications. The negative look at them continues to be awesomely helpful.
But I’d love some evidence from the professionals that this galaxy really is a sneaky spiral and that one only looks like it but is still really an elliptical. A page of spirals is pretty, but a page of galaxies that helps me see which are elliptical and which are early spirals would be more useful. A page of ‘arc’ results or a page of ‘disturbed’ results would be really helpful.
In short, you have to keep training us if we are to deal with more detailed or obscure classification. We need more feedback from you about how we’re doing.
You’ll get better results from us that way.
Thanks for this clarification, Chris.
I felt a twinge of anxiety when I read Manda’s essay, but the way the Zoo is branching out, I realized the necessity of having computers help out.
Jules and Weezerd really nailed it with their comments. The collective creative consciousness of the Zoo will never be matched by any amount of programming.
The War of the Worlds. Only this time the extraterrestrials are replaced by machines.
I love that picture. 🙂
So dramatic 🙂