Howdy from Texas!
Howdy, y’all! I’m at the American Astronomical Society meeting in Austin, Texas. Many of you had said you wished you could come to this meeting. Zookeeper Chris and I are here, and we’ll try to give you a vicarious experience of what it’s like here. We’ll be posting more often than usual this week, as well as later – Austin is 6 hours behind the U.K. Next week, we’ll resume our normal Monday-Thursday schedule.
We’re also coordinating with a couple of other astronomy blogs to better cover the meeting. We’re working with Fraser Cain and Pamela Gay from Astronomy Cast, and Phil Plait from Bad Astronomy.
You can follow all our blogging at the Astronomy Cast Liveblogging page:
http://www.astronomycast.com/LIVE/
Whenever any of us posts, we’ll mirror it on that site for everyone to see. If this works, we’ll try to do it again at future meetings. Hope you enjoy it!
Blue ellipticals – lots of them!
Hey all, as some of you know, I’m working on the blue ellipticals in Galaxy Zoo. I’ve been working on the formation and evolution of elliptical galaxies ever since I started my PhD. In many ways, they’re the most interesting galaxy type out there because they never really want to come out “right” in simulations. They are rather enigmatic objects and we’re not sure how they form.
They *appear* to be completely quiescent – i.e. not forming any stars* – but more recent work by our group has shown that that’s not entirely true. We used the GALEX ultraviolet space telescope to look for small amounts of hot, blue young stars in elliptical galaxies and to our surprise found them to be very common!
It turns out that it’s just too hard to really disentangle such small populations of young stars against the background of really old stars. Unfortunately, this idea that elliptical galaxies are all old and have no young stars in them led some people to specifically *exclude* any galaxies that might have young stars in them from the elliptical class. So our discovery from the ultraviolet data led us to search for more of these elliptical galaxies with young stars in them.
We knew that the only way to find them was by making our own catalogue of ellipticals where we would *not* throw out things that looked like ellipticals but had the blue colours or spectra indicating young stars.
I classified 50 000 galaxies from the SDSS by eye in a week, dividing galaxies into ellipticals vs. everything else; once I was done, I checked to see what was left. And lo and behold, there was a small but significant population of *very* blue elliptical galaxies! This project of course led to Galaxy Zoo, since 50 000 may sound like a lot, but it’s only a tiny fraction of the 1 million in SDSS. So now with all the classifications from all you guys (thanks so much!), I’ve been able to study blue ellipticals in much more detail.
Here’s what I did: I selected a redshift (distance) range and a limit in absolute magnitude (luminosity – how bright the galaxies actually are) to create a “volume-limited” sample. That’s a sample where I know that I’ve got all galaxies down to a certain luminosity limit in a certain volume. That’s important when you want to compare numbers (e.g. blue vs. red ellipticals), because blue and red galaxies can have different luminosities, and we must compare apples to apples.
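For readers who like to see the idea in code, here is a minimal sketch of a volume-limited selection. The `Galaxy` class, the `volume_limited` function, and all the numbers are hypothetical and chosen only for illustration; they are not the cuts used in the paper.

```python
from dataclasses import dataclass


@dataclass
class Galaxy:
    name: str
    redshift: float
    abs_mag: float  # absolute magnitude; more negative means more luminous


def volume_limited(galaxies, z_min, z_max, mag_limit):
    """Keep galaxies inside the redshift range that are at least as luminous as mag_limit."""
    return [
        g for g in galaxies
        if z_min <= g.redshift <= z_max and g.abs_mag <= mag_limit
    ]


# Made-up catalogue entries, purely for illustration.
catalogue = [
    Galaxy("A", 0.03, -21.5),
    Galaxy("B", 0.03, -19.0),  # inside the volume but fainter than the limit
    Galaxy("C", 0.08, -22.0),  # luminous enough but outside the redshift range
    Galaxy("D", 0.04, -20.8),
]

sample = volume_limited(catalogue, z_min=0.02, z_max=0.05, mag_limit=-20.5)
sample_names = [g.name for g in sample]
print(sample_names)
```

Because every galaxy brighter than the limit would be detected anywhere in the chosen volume, counts of blue and red ellipticals in `sample` can be compared directly.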
I already knew from earlier tests that your classifications are absolutely awesome, but when I pulled up the images of those galaxies that you classified as “elliptical” that also had very blue colours, I was amazed. Here they were! Blue ellipticals, lots of them!
So with this incredible sample in my hands, I started work on a paper. In Chris’s words, it’s a “classic astronomy paper” because we’re doing nothing fancy, but simply report what we find. The most important finding (I think) is that blue ellipticals exist (i.e. they aren’t misclassified spirals) and that they aren’t super-rare, but make up ~5% of the elliptical population.
We’ve also measured their star formation rates in a variety of ways and the measured rates make them by far the highest ever reported for ellipticals. We have some ellipticals with star formation rates of over 50 solar masses per year. To compare, our own Milky Way only manages about 3 per year!
Here are some example images:
What’s next? I am still polishing the text and we’re doing some comparisons to a simulation. After that, I will circulate the draft with the other team members again for a final round of comments and then it’s probably good to submit to a journal.
Speak your weight
Following Anze’s post we’ve had a number of requests from people to see their weighting. Indeed we did think about this for a bit, but alas I reckon it’s a bad idea to publish users’ weights for a number of reasons – but mainly because it is very unclear what the weights really say about a user. For example, a user might be really upset to see that they have a low weighting, indicating that they often disagree with the majority of people (if we use a democratic weighting scheme). They might then try to alter their classifying behaviour so as to ‘improve’ their weighting. This would be disastrous, because perhaps they are actually excellent at classifying, and much more meticulous than the majority of other users. In this case a low weighting would be good! The majority ain’t always right, right?!
If providing this kind of feedback were to have any effect then this would be bad – because it would mean our data becomes correlated in complicated ways that we can’t trace, i.e. the results from one week can affect the next week. Ultimately everyone will want to up their weighting, and I can imagine a horrible situation where everyone just clicks elliptical all the time, because this would give everyone great weightings – but completely ruin the project!
Therefore, you see that knowing your weight can only have a bad effect (if it isn’t going to have an effect then there’s no need to know 😉 ). Ideally we don’t want anything to have an effect on you – we want everything to be as unbiased and open and transparent as possible when it comes to analysing the results.
Plus the weights change all the time as more classifications are made, and it is computationally very intensive to compute them. Further, there’s an infinite number of ways of working out the weightings! We could see how well you agree with each other for just the bright galaxies, or how well you agree with an ‘expert’. But the point of all this is that we do not know the true morphology of these galaxies – and therefore we cannot give you a true weight (i.e. how well you are classifying).
I hope that helps to explain our situation a bit! I appreciate that it must be a bit frustrating not to get more feedback on your classifications. Perhaps when this phase of the project is wrapped up we can give more feedback… and ultimately all this data is probably going to be made public! Cheers, Kate
Final pre-bias data download
Today I delivered the final pre-bias testing data to the collaboration. In other words, for some time now, the website has been serving only images for the purpose of bias testing – the mirrored, black-and-white, and similar images. Therefore, the standard data are as complete as they will ever be. The process of getting the data into a form suitable for processing for individual science projects is beautifully inefficient and convoluted! Below is a somewhat technical description of what I do.
First I log in to a database server at the Johns Hopkins University and perform an SQL query that dumps the entire live database into a text file, which I then compress and FTP over to my computer workstation at the Lawrence Berkeley Laboratory. This is to bridge the gap between the computer science world (pretty ASP.NET code and SQL backend) and the science world (spaghetti FORTRAN code on UNIX and binary files).
The data is then reduced in a series of steps. First, the data is organized and sorted by galaxies, and usernames are converted into consecutive numbers (so that the usernames are anonymous in the final database). Second, the data from various downloads are combined into one big dataset. Third, bad data are weeded out (misconfigured browsers, bots and similar). Finally, the reduced “histograms” for each galaxy are produced. These correspond to our final state of knowledge about each galaxy.
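The reduction steps above can be sketched in a few lines of Python. This is only an illustration under assumptions: the real pipeline is FORTRAN working on a database dump, and the record layout `(username, galaxy, vote)`, the function name `reduce_clicks`, and the `bad_users` set are all hypothetical. For brevity the sketch also folds the steps into a single pass rather than running them one after another.

```python
from collections import Counter, defaultdict


def reduce_clicks(downloads, bad_users=frozenset()):
    """Combine downloads, drop bad users, anonymise usernames, build per-galaxy histograms."""
    user_ids = {}                      # username -> consecutive anonymous number
    anonymous_clicks = []
    histograms = defaultdict(Counter)  # galaxy -> vote counts
    for download in downloads:         # combine the various downloads
        for username, galaxy, vote in download:
            if username in bad_users:  # weed out bots, misconfigured browsers, etc.
                continue
            uid = user_ids.setdefault(username, len(user_ids))
            anonymous_clicks.append((uid, galaxy, vote))
            histograms[galaxy][vote] += 1
    return anonymous_clicks, {g: dict(h) for g, h in histograms.items()}


# Made-up clicks from two hypothetical downloads.
first = [("alice", "G1", "elliptical"), ("bob", "G1", "clockwise")]
second = [
    ("alice", "G2", "edge-on"),
    ("bot42", "G1", "elliptical"),
    ("carol", "G1", "elliptical"),
]

clicks, hist = reduce_clicks([first, second], bad_users={"bot42"})
print(clicks)
print(hist)
```

The histograms are what the science projects actually use; the anonymised click list is kept so that user weights can be computed later without exposing usernames.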
There are four ways of doing these: spirals can be combined or separate and users can be reweighted or not (and two times two makes four). In the combined spirals sample, we combine all three spiral subsamples (clockwise, anti-clockwise, and edge-on) into a single spiral category: science projects that are interested purely in galaxy evolution don’t care about the orientation of a given galaxy. In the reweighted sample, we try to improve the sample by essentially comparing the agreement between users: the idea is that if ten users claim that a certain galaxy is a spiral and the eleventh user says it is an elliptical, it is likely that the eleventh user got it wrong. Users who commonly disagree with everyone else get down-weighted and those who always agree get up-weighted.
It is a purely statistical exercise meant to remove pranksters that click randomly and up-weight careful users. In practice, we can check how well it works. We do this (well, Steven does it) by looking at galaxies that have the same absolute luminosity and size and shouldn’t evolve over the small redshift range probed by the SDSS. The upshot is that it doesn’t work as well as initially anticipated. As an old English proverb goes, if one million French believe in something, it doesn’t make it right. And so we also produce the unweighted sample in which all users are given the same weight. It is up to individual science projects to decide which combination to use.
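To make the combined-spirals and reweighting ideas concrete, here is a small sketch. The weighting formula is an assumption made for illustration (a user's weight is simply the fraction of their votes that match the most common vote for each galaxy); it is not necessarily the scheme used in the real reduction, and the names `combine_spirals`, `agreement_weights`, and `weighted_histograms` are hypothetical.

```python
from collections import Counter, defaultdict

SPIRAL_VOTES = {"clockwise", "anticlockwise", "edge-on"}


def combine_spirals(vote):
    """Fold the three spiral sub-classes into a single 'spiral' category."""
    return "spiral" if vote in SPIRAL_VOTES else vote


def agreement_weights(clicks):
    """Assumed scheme: weight = fraction of a user's votes matching the most common vote."""
    tallies = defaultdict(Counter)
    for _, galaxy, vote in clicks:
        tallies[galaxy][combine_spirals(vote)] += 1
    majority = {g: t.most_common(1)[0][0] for g, t in tallies.items()}
    agree, total = Counter(), Counter()
    for user, galaxy, vote in clicks:
        total[user] += 1
        agree[user] += combine_spirals(vote) == majority[galaxy]
    return {u: agree[u] / total[u] for u in total}


def weighted_histograms(clicks, weights):
    """Per-galaxy histograms where each vote counts with its user's weight."""
    hist = defaultdict(lambda: defaultdict(float))
    for user, galaxy, vote in clicks:
        hist[galaxy][combine_spirals(vote)] += weights[user]
    return {g: dict(h) for g, h in hist.items()}


# Made-up votes from three hypothetical users on two galaxies.
votes = [
    ("u1", "G1", "clockwise"), ("u2", "G1", "edge-on"), ("u3", "G1", "elliptical"),
    ("u1", "G2", "elliptical"), ("u2", "G2", "elliptical"), ("u3", "G2", "clockwise"),
]

weights = agreement_weights(votes)
reweighted = weighted_histograms(votes, weights)
print(weights)
print(reweighted)
```

Setting every weight to 1.0 in `weighted_histograms` gives the unweighted sample, and skipping `combine_spirals` gives the separate-spirals variants, which is how the four combinations arise.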
Finally, the reduced data is uploaded to a super-secret web server where other collaborators can download it.
The final datasets contain 34,617,406 clicks done by 82,931 users. Hooray for all of you! However, the previous downloads already went over 30 million, and hence this will make only small improvements to our science results. Now, the important task is to gather enough information about biases in our datasets, so keep clicking, please!
The Bias Study
Happy New Year from Galaxy Zoo! 2007 has already left for some of you, and for others, it will be leaving soon. 2007 has been a great year for the study of galaxies, thanks to the galactic classifications that you have completed. I thought that in today’s post, I’d tell you a little more about what we are doing right now, as 2007 turns to 2008. Chris originally posted about this in the forum, and Kate has been providing a status update.
In last Thursday’s blog post, Chris gave an introduction to projects that the team is now working on. When he talked about Kate and Anze’s cosmology study (finding the rotation directions of spirals), he mentioned that a key to this project was completing the bias study. That “bias study” includes the rotated, mirrored, and black-and-white galaxy images that we’re asking you to classify now.
For those of you that aren’t familiar with what scientists mean by “bias,” it’s one of those funny words that has a specific meaning in science that is different from its meaning in everyday life. When we say “bias” in daily life, we often mean that someone has an agenda that means they can’t be trusted. For example, we might say that a news organization has an [insert political viewpoint] bias that means you just don’t trust that they’re giving you an accurate picture of the news.
That’s not what we mean here, though! There are several related meanings of the word “bias” in science, but they all come down to the question of whether results were influenced by some factor that the scientists didn’t think about. The scientist might ask himself or herself: Did I really study everything I could have studied? Did I do the right type of analysis on the data? Did I have some preconceived idea that kept me from interpreting the data with an open mind?
The question that we are asking is about how people see galaxies. When you see a fuzzy galaxy, is it easier to see it as an anticlockwise spiral than a clockwise spiral?
We’ve talked with some colleagues from psychology, and there is no reason to think that should be the case – but we want to test the idea just the same. That’s why we’ve introduced the rotated and mirrored galaxy images. The mirror image of a clockwise spiral should appear anticlockwise, and vice versa. So, all the clockwise galaxies in our original data should now appear anticlockwise in the “bias study data,” and vice versa.
The specific type of bias we’re looking for is called sample bias. Scientists often take a sample of a thing by measuring properties of some number of that thing. We’ve asked you to classify many galaxies, and the sample consists of your classifications. If, as mentioned above, humans classify anticlockwise galaxies more easily, then the classification sample will have more anticlockwise galaxies than the real universe. If our sample of galaxy classifications is not the same as the classifications of galaxies in the real universe, then any conclusions we draw about the universe from our sample will be wrong.
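Here is a rough sketch of how the mirrored images let us separate a human preference from a real excess in the sky. It assumes a deliberately simple additive model, which is an illustration rather than the team's actual analysis: the measured anticlockwise fraction is the true fraction plus a constant human bias, and mirroring swaps the true directions but leaves the bias alone. The function names and the vote counts are made up.

```python
def anticlockwise_fraction(clockwise, anticlockwise):
    """Fraction of directional spiral votes that say 'anticlockwise'."""
    return anticlockwise / (clockwise + anticlockwise)


def split_bias(f_original, f_mirrored):
    """Assumed model: f_original = p + b and f_mirrored = (1 - p) + b.

    p is the true anticlockwise fraction and b is the human bias towards
    clicking 'anticlockwise'. Solving the two equations gives both.
    """
    bias = (f_original + f_mirrored - 1.0) / 2.0
    true_fraction = (f_original - f_mirrored + 1.0) / 2.0
    return true_fraction, bias


# Made-up vote counts, purely for illustration.
f_orig = anticlockwise_fraction(clockwise=480, anticlockwise=520)
f_mirr = anticlockwise_fraction(clockwise=490, anticlockwise=510)

true_fraction, bias = split_bias(f_orig, f_mirr)
print(f"true anticlockwise fraction ~ {true_fraction:.3f}, human bias ~ {bias:.3f}")
```

If people had no preference, the two measured fractions would add up to exactly one and `bias` would come out as zero; any leftover is what the bias study is designed to measure.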
And we don’t want to be wrong. Scientists are, by nature, very careful. We want to think of every possible effect, and every possible interpretation of our data, before deciding on the right one. By being as careful as we can, we hope to increase our understanding of the real universe in which we live.
As always, we need your help to do this. See Kate’s status update on the forum, and keep classifying those galaxies!
Inside the first results
As Jordan’s already said, this blog is supposed to give you, our users and collaborators, a window into the research we’re doing. As I suspect most of you know, we’re working on a set of papers which will hopefully be submitted to a journal in the next month or so. The first set will probably contain four papers, and I thought I’d give you a run down of what each of these four is designed to do.
Although the whole team will be listed as authors (and we’ll include a link to the site which gives credit to those of you who have chosen to enter their names on the special page on the site), each is being led by a different team member. My paper is a general overview of the project, including a discussion of the process by which we’ve gone from clicks on the webpage to a catalogue of galaxies. The idea is to provide all the information that others working with our data might need in one place, and to avoid having to duplicate information in each of the individual papers.
Steven (in Portsmouth) is writing a paper that focusses on comparing the spirals and the ellipticals; he has to be more careful than most to account for the tendency of faint fuzzy things to be classified as elliptical galaxies, and has developed a whole set of tools to keep an eye on this. The results are excellent; we’ve always known that ellipticals tend to live in denser environments than their spiral counterparts, but with Galaxy Zoo we can really look at the details of this relation.
Kevin’s paper discards most of the galaxies to focus on some of the oddballs: the infamous blue ellipticals. Most elliptical galaxies are supposed to have finished star formation long ago, but these are still going strong. We’re planning to publish a list of these in the paper so hopefully other people will be able to follow them up alongside us.
Finally, Kate and Anze are leading the cosmology study, looking at the rotation direction of spirals. They’re desperate to get the bias study that’s now underway done so that their paper can be finished off – that’s the most critical thing at the moment, so every classification you make gets us closer to being able to release the first science results.
You’ll hear more about each of these projects over the next week or two on this blog, but do comment in the meantime either here or on the forum.
I’ll finish with a couple of mea culpas – when I sent out the email announcing our Christmas gift to you, I should have said that the link was on the left of the analysis page, and realised that for some of you it was the Summer Solstice. I won’t make either mistake again (just different ones). Chris
What this blog is all about…
Greetings, and happy holidays from Galaxy Zoo!
We’ve really appreciated all the work you have done in classifying all the galaxies in the Zoo. If you haven’t been around for a while, we’d love it if you returned to the Galaxy Zoo site to classify a few more galaxies. If you haven’t already, please take a look at the Galaxy Zoo forum, where you can talk with fellow classifiers about the Zoo, astronomy, or anything else that strikes your fancy.
This blog is the latest project from the “Zookeepers”- the small but dedicated team that operates the site. Thanks to all of you, we now have a lovely sample of galaxies marked as “elliptical,” “clockwise spiral,” and so on. What we want to do now is to see what your classifications can tell us about the universe we live in. And oh my, are they telling us a lot about our universe. More than we ever imagined – and it seems like every week, we think of a new project we can do with your wonderful classifications.
Since you’re the ones who have done these classifications, it’s only fair that we keep you up to date on what we are doing.
Here’s what’s happening now. We are busy analyzing the classifications in various ways: counting, sorting, measuring, and comparing our measurements to other scientists’. We are working on a number of projects right now. All of us are contributing to every project, but each project has one (sometimes two) people primarily responsible for it.
Soon, we will start communicating our results to other scientists. There are two main ways that scientists communicate with each other: meetings and papers. Meetings are the place to present work in progress and get feedback from other people, and papers are the written records of finished projects.* We will be giving two presentations at the American Astronomical Society (AAS) meeting in early January in Austin, Texas, USA. Chris will give a talk focusing on our research results, and I will give a poster presentation on how the public has helped us create these projects. (More on this meeting as it gets closer.)
Along with preparing presentations for this meeting, we are also starting to write the scientific papers on our results.
We created this blog to give you a window into the process by which we are conducting our research and writing our papers. We’ll be doing a new post every Monday and Thursday, in the afternoon GMT (so, afternoons in the U.K. and mornings in the U.S.). We encourage you to leave comments here, and also to head over to the Forum to talk with other people.
Coming up this Thursday – Chris will give an orientation of what we are working on, and who is doing what.
*No research project is really ever finished, because it always must be repeated and expanded upon, but papers are the place where results of a single project are recorded.
Welcome!
Welcome to the brand new Galaxy Zoo blog!
To start with, here’s a page with a bit of info about the project and what we hope to do with this blog.
Stay tuned for more!
