How to Translate Galaxy Zoo
Not too long ago we announced that Galaxy Zoo has gone open source – along with several other Zooniverse projects. Part of that announcement was that it is now possible for anyone to translate the Galaxy Zoo website into their own language and have that pulled back into the main site. We love translation at the Zooniverse! Using GitHub (our code repository) means we can open up the translation process to everyone.
I’ve been answering a lot of emails about how this process works so I thought I would outline a tutorial here on the blog. To get started go to: https://github.com/zooniverse/Galaxy-Zoo/tree/master/public/locales and download the .json file corresponding to your language. If there is not yet one there you have two options:
- Clone the app locally from GitHub and run the translate.rb file in root
- If the first option doesn’t make any sense then contact email@example.com and we can create the file for you.
These JSON files are tree structures of strings in "key": "value" pairs that contain all the translatable text on Galaxy Zoo. You need to translate just the values, which are the parts after the colon (:) in the example chunk of the file below.
"wont_work": "This site probably won’t work until you update your browser.",
"recommended": "We recommend using <a href=\"http://www.mozilla.org/firefox/\" target=\"_blank\">Mozilla Firefox</a> or <a href=\"http://www.google.com/chrome\" target=\"_blank\">Google Chrome</a>.",
"ie": "If you use <a href=\"http://www.microsoft.com/windows/internet-explorer/\" target=\"_blank\">Microsoft Internet Explorer</a>, make sure you’re running the latest version.",
"chrome_frame": "If you can’t install the latest Internet Explorer, try <a href=\"http://google.com/chromeframe\" target=\"_blank\">Chrome Frame</a>!",
You do not translate the parts before the colon, as these are the keys that are used to identify each string. So in the example you do not translate “zooniverse”, “browser_check”, “wont_work”, “recommended”, “ie”, “chrome_frame” or “dismiss”. Here’s the Spanish version of the above segment of the file:
"wont_work": "Es probable que este sitio no funcione hasta que actualices tu navegador.",
"recommended": "Te recomendamos usar <a href=\"http://www.mozilla.org/firefox/\" target=\"_blank\">Mozilla Firefox</a> o <a href=\"http://www.google.com/chrome\" target=\"_blank\">Google Chrome</a>.",
"ie": "Si utilizas <a href=\"http://www.microsoft.com/windows/internet-explorer/\" target=\"_blank\">Microsoft Internet Explorer</a>, asegúrate que estés usando la última versión.",
"chrome_frame": "Si no puedes instalar la última versión de Internet Explorer, intenta usar <a href=\"http://google.com/chromeframe\" target=\"_blank\">Chrome Frame</a>!",
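If you want to confirm that you translated only the values and left every key alone, a small script can compare the keys of the two files. The following is a minimal sketch in Python, not part of the Galaxy Zoo repository. The function names are made up for illustration, and the nesting of the sample data (with "browser_check" as a parent key) is an assumption about how the tree is laid out.

```python
def flatten_keys(tree, prefix=""):
    """Return the set of dotted key paths for every leaf in a nested dict."""
    keys = set()
    for key, value in tree.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            keys |= flatten_keys(value, path)
        else:
            keys.add(path)
    return keys


def compare_keys(source, translation):
    """Return (missing, extra): keys absent from, or added to, the translation."""
    src = flatten_keys(source)
    dst = flatten_keys(translation)
    return sorted(src - dst), sorted(dst - src)


# Hypothetical sample data; real files would be read with json.load().
english = {"browser_check": {"wont_work": "...", "recommended": "...", "ie": "..."}}
# Here one key was translated by mistake and another was dropped.
spanish = {"browser_check": {"wont_work": "...", "recomendado": "..."}}

missing, extra = compare_keys(english, spanish)
print("Missing from translation:", missing)
print("Not in the English file:", extra)
```

Both lists should be empty for a finished translation. A key that shows up under "not in the English file" is usually one that was translated by accident.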
Note that any quotation marks inside a value need to be escaped, i.e. " becomes \". These files have to be valid JSON, and there is a handy online tool for validating this at http://jsonlint.com/. You can paste in the whole file and it will tell you where any formatting errors are.
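If you would rather check the file on your own machine, Python's standard json module reports the position of the first syntax error. This is a minimal sketch rather than an official Zooniverse tool. The function name find_json_error and the example.org strings are invented for illustration.

```python
import json


def find_json_error(text):
    """Return None if text is valid JSON, else a message with the error position."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


# Raw strings keep the backslashes exactly as they would appear in the file.
good = r'{"ie": "Use <a href=\"http://example.org/\">this</a>."}'
bad = r'{"ie": "Use <a href="http://example.org/">this</a>."}'  # quotes not escaped

print(find_json_error(good))
print(find_json_error(bad))
```

To check a whole file, read it with open(path, encoding="utf-8").read() and pass the text to the function. Only the first error is reported, so fix it and run the check again.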
There is very little scope for doing language-specific formatting on the website. This means that if text is too long once it has been translated, it may run off the page or be cut off on the screen. Because of this, you need to keep the translated strings to approximately the same length as the originals. If this causes issues, let us know. To test out the translation and see how it looks, which you’re welcome to do at any time, you can either email your current file to firstname.lastname@example.org or run the Galaxy Zoo app locally by cloning it from GitHub (https://github.com/zooniverse/Galaxy-Zoo/).
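One rough way to spot strings that may overflow before you see them on the page is to compare each translated string's length with the original. The sketch below is only an illustration. The 1.3 ratio is an arbitrary assumption (the site does not define a limit), the function names are made up, and the "dismiss" values are invented sample text.

```python
def flatten(tree, prefix=""):
    """Map dotted key paths to string values in a nested dict."""
    flat = {}
    for key, value in tree.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def overlong_strings(source, translation, max_ratio=1.3):
    """List (key, original length, translated length) for strings that grew too much."""
    src = flatten(source)
    flagged = []
    for path, text in flatten(translation).items():
        original = src.get(path)
        if original and len(text) > max_ratio * len(original):
            flagged.append((path, len(original), len(text)))
    return sorted(flagged)


# Hypothetical sample data; real files would be read with json.load().
english = {
    "wont_work": "This site probably won't work until you update your browser.",
    "dismiss": "Dismiss",
}
spanish = {
    "wont_work": "Es probable que este sitio no funcione hasta que actualices tu navegador.",
    "dismiss": "Cerrar este mensaje",
}

flagged = overlong_strings(english, spanish)
for path, before, after in flagged:
    print(f"{path}: {before} -> {after} characters")
```

Character counts include any HTML tags in the string, so treat the result as a hint about where to look rather than a hard rule. Seeing the text in the running site is still the real test.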
We also have an email list for Zooniverse Translators. If you’d like to join it in order to ask questions of other translators and hear about other projects you might want to translate then email email@example.com. If you are planning on doing a translation it would be worth joining the list to coordinate with other translators in your language.
NOTE: If you’re familiar with GitHub, you can clone the Galaxy Zoo repo, create a local JSON file for your language and just submit a Pull Request when you’re ready. You can find the translation-creator script here.
When your translation is complete, we will find an astronomer somewhere in the world who speaks your language, in order to double-check (peer-review!) the new text and give feedback. This is done to ensure that the site is still conveying the original meaning, and it acts as a good error-checking mechanism.
Good luck with your translation, and thank you! Hopefully we can open up Galaxy Zoo to many more people around the world.
7 responses to “How to Translate Galaxy Zoo”
This is awesome!
If GZ is successfully translated to another language, can a zooite classify galaxies on more than one site? Or would the new, translated, GZ classify page(s) be ‘just’ the same engine dressed up with different clothes?
For the Science Team, how likely is it that they’d have to start figuring out a new/different source of bias?
What I mean is that, for many words (and many language pairs), there is not an exact mapping; ‘hong’ (in Chinese, sorry don’t know how to paste the character in here) and ‘red’ (in English) for example are similar, but if you present a range of colors to speakers of both languages (assuming no color blindness!), you’ll find they are not in perfect alignment (caveat: maybe they are for ‘red’, but not for ‘purple’). So if you had asked about ‘red’ (in English), the answers would have a different distribution than if you had asked about ‘hong’ (in Chinese), even though they are, in fact, the same colors.
Sure, GZ has little icons to go with the words (and these help), but if the words do not match very well, across languages, won’t there be an ‘offset’ between the two sets of classifications?
Also, what about Talk? When a translation of GZ is done, does that result in a translated version of GZ Talk too?
ttfnrob’s interesting blog duly recognizes the normal requirement for a good translator, which is to be a native speaker of the language being translated into. However, another requirement is an adequate appreciation of context, and JeanTate naturally homes in on the classification routine.
For example, in French, according to context, the adjective ‘smooth’ might conceivably be translated by any one of the following: lisse, soyeux, doux, glabre, onctueux, homogène, moelleux, régulier, calme, suave, mielleux, beau. What’s more, a French-speaking translator has to treat the question (‘Is the galaxy simply smooth and rounded, with no sign of a disk?’) and the permitted answers (‘Smooth’, ‘Features or disk’, ‘Star or artifact’) together as a whole, and this might take a lot of rethinking to suit French idiom. You can’t rely on Google’s translation machine to do something as delicate as this.
Of course, as JeanTate observes, the translation would necessarily conform to the answer icons, with the choice facing the classifier made as clear as possible by the phrasing of the question. All the same, there does seem to be a danger that translators might not be sufficiently conversant with the classification exercise to do a properly considered job of it, so I think JeanTate is right to be concerned by the possible implications for the calculation of bias.
As for translating Talk (or the Forum), it seems to me that if the English posts were translated into all the other languages, then all the non-English posts would have to be translated into all the other languages too, including English. Surely the content of Talk and the Forum would be generated in the relevant language, and not supplemented by translations from other languages. Indeed I assume that is the case with the already existing non-English versions of GZ, such as the Polish and German ones.
Translation of the site in this way simply means that people using the site are able to read the text in multiple languages. So no, it is not possible for galaxies to be classified twice between languages.
We record the language in which classifications were recorded (as well as things like the browser and operating system) and so we can check for biases if required. There may be differences between users in different translations and those would be extremely interesting to explore.
We intend to have bilingual astronomers peer-review the translations wherever possible. We need to do what we can to make sure that someone using the site in German is doing the same task as someone using it in Finnish, for example.
We are exploring ways of translating Talk as well – but that is further down the road.
Very interesting, thanks ttfnrob/RobSimpson!
So, in a translated GZ Classify – in Finnish, say – will there be the (Finnish) choice of “Would you like to discuss this object?”
And if they click “Yes”, they’d get a (new) Talk page/thread, right? But it’d be in English, not Finnish … There might be some interesting new hashtags, especially if we have an Arabic GZ, a Japanese GZ, a Thai GZ, … 😉
Not surprisingly there are languages which have no corresponding term in another for some quite common concepts. I am thinking in this case specifically of Russian, which has some concepts of “blue” for which we have no equivalent in English.
I took a look into the translation file. There are also some administrative terms that may have no counterparts in other languages. For example “Grammar school”, “Middle school” and “High school”. In Germany we usually only have two school levels, “Grundschule” and “Oberschule”. Within “Oberschule” there are still age levels “Sekundarstufe I” and “Sekundarstufe II”. But if the corresponding question is aimed at kids they may be unable to answer because usually they don’t have to care about this difference.