Taking citizen cyberscience a step further

Citizen Science

Recently I’ve stumbled upon the terms citizen science and citizen cyberscience. The latter term was apparently invented by Shuttleworth Fellow Francois Grey as a label for volunteer distributed computing projects like SETI@home, Folding@home and his own LHC@home, many of which are built on BOINC. Grey is also behind the Citizen Cyberscience Centre in Geneva, whose web pages have more information about his motivation and ideas.

All fascinating stuff IMO. Not so much because of the direct scientific impact these distributed computing projects may or may not have, but because of the potential of getting more people interested in science and “academic” knowledge. Like the Khan Academy, an example of how the Internet really is improving the world.

Now, what does this have to do with GridFactory?

A lot actually! I’ll argue that GridFactory is conceptually a logical evolution of the ideas behind distributed computing software like BOINC (there is also the older distributed.net). If you visit the BOINC web site or the Wikipedia list of distributed computing projects, you’ll notice that, despite what Grey and others say, distributed computing is not really enabling “ordinary” citizens to do science… yet.

With BOINC-based systems, citizens are passively witnessing their computer crunching away on some professional scientist’s problem. Yes, that might stimulate interest in the scientific problem at hand and in science in general, and that seems to be the hope and ambition of the enthusiasts behind these projects. A very commendable ambition IMO.

But wouldn’t it be nice if the citizen could actually take a look at the script/code he’s executing, try to understand it, change some parameters, run the modified code himself on his own PC… heck, create his own computing grid and run his modified code on all participating computers?

With GridFactory, all this and more is possible: collaborations are formed and destroyed on the fly; initiators of a collaboration can set up a software catalog (probably starting with a copy of someone else’s), a shared storage area and a compute cloud using their own, someone else’s, or even a central common server.

Of course, if GridFactory were used for large-scale, voluntary distributed computing like BOINC, with legions of workers carrying out the computations of a few (i.e. to create a large grid of untrusted workers), it would probably not be a good idea to make it too easy for the legionnaires to mess with the code. In such cases, jobs can simply consist of booting up a locked-down virtual machine from a trusted software catalog.

That said, one well-known problem of some BOINC projects is precisely their scale: contributors are so eager to contribute more than their peers that some cheat and produce fake results. Therefore, smaller collaborations are not necessarily bad. The world may need global collaborations to solve global problems, but some level of fragmentation or compartmentalization may have its merits too. In smaller groups, people know each other, so there’s less incentive to fake results and more opportunity for real collaboration and involvement. Sure, you’ll not be running on 500,000 CPUs like SETI@home, but there are many “smaller” problems out there that deserve attention and that may not need millions of CPU-hours to yield useful results.

The GridFactory equivalent of the BOINC client is the GridWorker. Like BOINC, GridFactory uses Apache and MySQL on the server side, but where BOINC implements the actual web service as CGI scripts, GridFactory’s web services are implemented as Apache modules and designed to “talk” to each other, i.e. pull jobs from each other, which allows horizontal scale-out. But what really sets GridFactory apart from BOINC is the integrated software catalog, the group functionality and the GridPilot GUI for creating and managing compute jobs.
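
To make the “pull jobs from each other” idea a bit more concrete, here is a minimal toy sketch in Python. It is not GridFactory code and does not use GridFactory’s actual interfaces: the `JobServer` class, its method names and the in-memory queues are all made up for illustration. It only shows the general pattern of an idle server pulling queued work from a busy peer.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class JobServer:
    """Toy stand-in for a job server (hypothetical, not GridFactory's API)."""
    name: str
    queue: deque = field(default_factory=deque)
    peers: list = field(default_factory=list)

    def submit(self, job):
        """Queue a job on this server."""
        self.queue.append(job)

    def hand_out(self, max_jobs):
        """Give up to max_jobs queued jobs to whoever asks (peer or worker)."""
        handed = []
        while self.queue and len(handed) < max_jobs:
            handed.append(self.queue.popleft())
        return handed

    def pull_from_peers(self, max_jobs):
        """Pull jobs from peers until max_jobs are collected or peers run dry."""
        pulled = []
        for peer in self.peers:
            if len(pulled) >= max_jobs:
                break
            pulled.extend(peer.hand_out(max_jobs - len(pulled)))
        self.queue.extend(pulled)
        return len(pulled)


# One busy server with five queued jobs, one idle server that knows about it.
busy = JobServer("busy")
idle = JobServer("idle", peers=[busy])
for i in range(5):
    busy.submit(f"job-{i}")

moved = idle.pull_from_peers(max_jobs=3)
print(moved, list(idle.queue), list(busy.queue))
```

In this pattern the busy server never needs to know who its peers are; adding capacity just means starting another server that pulls, which is roughly what makes a pull-based design easy to scale out.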

Finally, I’ll add that I’m not saying GridFactory could or should replace BOINC in large-scale voluntary computing projects; it is a far less mature software product. But I am hoping that the ideas and concepts behind GridFactory may serve as a source of inspiration for future developments of citizen cyberscience and help the overall democratization of the enterprise of science.

The vision of GridFactory encompasses precisely this: democratization of citizen science – allowing citizens to not only passively contribute to science, but to engage and actually do science.
