Friday, March 21, 2014

Citizen Science Tackles Big Data Astronomy

CosmoQuest, an organization dedicated to engaging the public in astronomy research and in mapping the universe, recently unveiled Moon Mappers, a project designed to crowdsource crater counting on the Moon by enlisting interested amateurs. Simply put, crowdsourcing lets people or organizations obtain information by outsourcing work to a large group of contributors, usually via the Internet. By harnessing thousands of Internet users to count and sort craters, CosmoQuest hopes to expedite and refine its mapping efforts. CosmoQuest’s endeavor is by no means the first attempt to use novices for astronomical observation; in recent years, astronomers have increasingly turned to the masses for help classifying steadily growing volumes of new data.

Over the past 40 years, astronomy has been transformed by a flood of data collected from ever more advanced telescopes and light detectors. The Hubble Space Telescope, the Sloan Digital Sky Survey (SDSS), and other programs for observing and mapping the Universe offer unprecedented amounts of new and unsorted material. Most automated classification algorithms still lack the precision needed to categorize galaxies accurately; yet the sheer volume of data makes any individual effort at classification essentially futile. Crowdsourcing has provided an apt solution. In 2006, the University of California, Berkeley, in conjunction with NASA, began posting pictures from an interstellar dust collector on the web as part of the Stardust@Home project, inviting curious Internet users to search for particles. With the growth of both the Internet and astronomical data, crowdsourced observation has become an increasingly appealing option for astronomers and amateurs alike.

Today, astronomy-related citizen science has grown to include projects such as Galaxy Zoo, which lets anyone with an Internet connection classify galaxies using SDSS data. Upon entering the site, users are asked a series of identifying questions about a particular galaxy and are guided to categorize it by shape, spiral structure, and any distinguishing odd features. From these classifications, astronomers can often infer information about a galaxy’s environment, age, and more. Galaxy Zoo co-founder Kevin Schawinski notes, "we had succeeded in creating the world's most powerful pattern-recognizing super-computer, and it existed in the linked intelligence of all the people who had logged on to our website: and this global brain was processing this stuff incredibly fast and incredibly accurately.” The site, which has now grown to 800,000 unique users, has already provided the basis for 42 formal scientific papers. Powered by interest and curiosity rather than prizes or financial incentives, amateurs continue to shape the field of astronomy through crowdsourcing.
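The branching question flow described above can be sketched as a tiny decision tree. This is a hypothetical, greatly simplified illustration (the function name, question keys, and labels are invented here; the real Galaxy Zoo tree asks many more questions):

```python
def classify(answers):
    """Walk a toy version of a volunteer's question tree.

    `answers` maps a question key to the volunteer's reply.
    Each question narrows the galaxy's category, mirroring how
    citizen-science sites guide users step by step.
    """
    # First question: is the galaxy smooth, or does it show features?
    if answers["smooth_or_features"] == "smooth":
        return "elliptical"
    # Follow-up: does it have spiral arms?
    if answers["spiral_arms"] == "yes":
        return f"spiral ({answers['arm_count']} arms)"
    return "irregular/other"

print(classify({"smooth_or_features": "smooth"}))  # elliptical
```

Because each answer simply selects the next question, no astronomy training is needed — the tree encodes the expertise, and the volunteer supplies only pattern recognition.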

However, detractors have voiced concerns about the scientific accuracy of such novice evaluations. Crowdsourced research projects often offer little preparatory instruction and require no formal astronomy background. The University of Colorado Boulder recently tested the scientific rigor of such projects; its study presented both novices and astrophysicists with lunar craters to count, a task similar to CosmoQuest’s Moon Mappers project. Surprisingly, the amateurs produced results of the same quality as the professionals. Crowdsourcing projects, including Galaxy Zoo, take deliberate measures to keep their data accurate; Chris Lintott, a Zooniverse leader, explains that because the same galaxy is classified repeatedly by many users, accidental mistakes are generally averaged out. Similarly, “the system insists that every classification is independent, and as…several people look at each classification finding any deliberate attack would be easy – in any case, we’ve never seen any evidence of such a thing.” Nevertheless, Zooniverse and other crowdsourcing projects are still working to filter out unintentional errors and biases; humans tend to identify anticlockwise spirals more readily, which introduces a sampling bias. To combat such systematic error, in which many users misclassify the same objects in the same way, Zooniverse researchers methodically flip galaxy images and apply other methods of reducing partiality.
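The two safeguards described above — redundant independent classifications and randomized image flipping — can be sketched in a few lines. This is a minimal illustration under assumed parameters (the function names and the 80% agreement threshold are invented here; Zooniverse's actual aggregation pipeline is more sophisticated):

```python
import random
from collections import Counter

def consensus(classifications, threshold=0.8):
    """Aggregate independent votes for one object.

    Returns the majority label when agreement meets the threshold;
    otherwise flags the object for expert review. Redundancy is what
    averages out any single volunteer's accidental mistake.
    """
    counts = Counter(classifications)
    label, votes = counts.most_common(1)[0]
    if votes / len(classifications) >= threshold:
        return label
    return "needs_review"

def maybe_flip(image_rows):
    """Mirror a galaxy image left-right half the time, so any human
    preference for anticlockwise spirals averages out across the sample."""
    if random.random() < 0.5:
        return [row[::-1] for row in image_rows]
    return image_rows

votes = ["spiral", "spiral", "spiral", "elliptical", "spiral"]
print(consensus(votes))  # spiral (4/5 agreement meets the 0.8 threshold)
```

The design point is that no individual vote needs to be trusted: accuracy emerges statistically from many independent, possibly noisy classifications.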

With new data collection projects in development, including the Euclid spacecraft and the Large Synoptic Survey Telescope, the need for citizen scientists is unlikely to disappear soon. While only computer classification algorithms will offer a truly scalable resolution to astronomy’s big data predicament, crowdsourcing provides an interim solution that advances the field while engaging and inspiring amateurs. In a field where data volumes double annually, manpower drawn from the masses is changing data classification for the better.
Laura Gunsalus