Category Archives: GridFactory

Taking citizen cyberscience a step further

Citizen Science

Recently I’ve stumbled upon the terms citizen science and citizen cyberscience. The latter term was apparently coined by Shuttleworth Fellow Francois Grey as a label for BOINC-based distributed computing projects like SETI@home, Folding@home and his own LHC@home. Grey is also behind the Citizen Cyberscience Centre in Geneva – on the web pages of which there’s more information to be … Read the rest

CERN School of Computing 2011 was… awesome


As the pictures above show (from the photo gallery of one of the students), the CERN School of Computing includes a good deal of extracurricular activities, which probably goes a long way towards explaining the good, humorous atmosphere in the classroom.

Having spent a lot of energy preparing the exercise with the great CSC team, it really was great … Read the rest

The vision of GridFactory


In this post I’ll try to explain the vision behind a software suite I wrote at the Niels Bohr Institute, University of Copenhagen, 2009-11.

The ideas for the software arose from my dual involvement with the high energy physics and the e-science groups. Some of the code originated in previous projects, including distributed computing efforts of the ATLAS experiment at … Read the rest

CERN School of Computing 2011 – Exercise 2

In this exercise we’ll solve the same assignment as in exercise 1, but using a prepackaged GridPilot application.

  • Download and install GridPilot
  • Start GridPilot and answer the initial questions – enable only the computing system “GridFactory” and set the submission host to
  • Import the app “ttbar_exercise”
  • Select the application/dataset “ttbar_exercise-100k-events” and click “Run”
  • Select the application/dataset “ttbar_exercise-merge” and … Read the rest

CERN School of Computing 2011 – Exercise 1

pT diboson distribution.


This exercise was created for the 2011 CERN School of Computing, hosted by the Niels Bohr Institute, University of Copenhagen.

Credits: physics content of this exercise by Jørgen Beck Hansen from the Niels Bohr Institute.

LHC Monte Carlo event generation and analysis

Read the rest


The original purpose of GridPilot was to make it easy for researchers to run, preserve, rerun and share computations on the emerging national and international grid infrastructures. The word infrastructure is used in the plural here, indicating our failure to create the grid. Thus GridPilot was born with pluggable backend support.

Eventually, I created my own take on a grid … Read the rest

GridFactory software suite – overview and documentation

In this post I’ll give an overview of the GridFactory software suite (including GridPilot) and provide the minimum information to get started as well as pointers to more thorough documentation.



GridFactory is a software suite composed of the following programs:

GridFactory server
The GridFactory server is the software running on the server to which jobs are submitted. It

Read the rest

GridFactory server installation instructions

Notice that currently the software has only been tested on Ubuntu-9.10, 10.04 and Fedora-12, 13 – i386*

Notice also that you must first install Sun/Oracle’s Java (=1.6), either from a distribution repository or directly from Oracle.

Download and installation on Ubuntu


Download mod_gacl, mod_gridfactory, gridfactory_server: either use your browser and get the files from

Read the rest

Public beta!


Dear grid warriors: new tools are now available to assist you in your battles. The GridFactory software suite, including GridPilot, is now available for download.

Supported platforms

GridPilot and GridWorker have been tested on the following platforms:

  1. Ubuntu 9.04, 9.10, 10.04 – i386
  2. Windows XP i386, Windows Vista i386 and Windows 7 i386 and x86_64
  3. Mac OS X 10.6
  4. … Read the rest

CERN/ATLAS data processing on grids and GridFactory

In this post I’ll report on running the application “mc09_7TeV.107691….” from the GridPilot app store. In the case of NorduGrid and WLCG, the ATLAS software is preinstalled on the resources. In the case of GridFactory, the jobs run inside a CernVM appliance with ATLAS software loaded through the AFS network file system. The input dataset consisted of 26 files totaling … Read the rest

CERN/ATLAS Monte Carlo simulation on grids and clouds

In previous posts we saw that I/O-bound jobs ran ~3 times faster on standard SATA disks than on network file systems and block devices (GPFS, NFS, EBS). This post reports on CPU-bound jobs. I ran standard ATLAS Monte Carlo simulation on both grid and cloud resources: imported the ATLAS simulation app and ran the default 100 small jobs, each … Read the rest

CERN/ATLAS boildown on clouds

In this post, I’ll take a look at some more runs of the “atlas_d3pd_boildown” application available in the GridPilot app store. The difference w.r.t. the runs described in a previous post is that this time I ran on cloud as opposed to grid resources. On dedicated hardware and on two public clouds, Amazon’s EC2 and Cabo’s Irigo cloud, I … Read the rest

POV-Ray II – no free lunch on EC2

In this post, I’ll take a look at some runs of the POV-Ray application available in the GridPilot app store: To import this app, just choose “File → Import application”, navigate to the relevant folder and click “OK”.

This application is a bit more sophisticated than the one used for the simple benchmarking described in a previous post. Now, … Read the rest

Feature extraction of medical images on grids and clouds

This example is special in that it does not depend on any preinstalled software package (runtime environment), but includes a precompiled binary. This binary is of course only certain to run on the system it was compiled on. We compiled on Debian Sarge and Scientific Linux 5 and ran on all back-ends: a local virtual machine, GridFactory without virtualization and … Read the rest

MP3 encoding on GridFactory

Here is a video I put together to demo how to use GridPilot to run computations on a GridFactory cluster:

The demo uses the default input files – which are 12 royalty free music files found on This can be changed – by right-clicking on the input dataset, “music_files”, and choosing “Import file(s)”. If you’ve already imported the … Read the rest

MP4 transcoding on EC2 with GridPilot and GridFactory

Things to Come  The Last Man on Earth  Sintel  Elephant’s Dream

Given the popularity of the iPhone, an interesting use of a batch system is conversion of movie files from the AVI to the MP4 format. In this post I’ll explain 3 ways of doing this with GridPilot. Which way you prefer will likely depend on the number and size of files you want to convert and the power … Read the rest

POV-Ray rendering

To gauge the performance of both GridFactory and virtualization layers in a high-CPU/low-throughput setting, we chose the standard ray-tracing program POV-Ray and a standard benchmarking image, shipped with the program.

The standard image that was rendered.

This example is a fairly naive benchmarking exercise consisting simply of rendering the same image with POV-Ray 20 times. Each POV-Ray job used a … Read the rest

CERN/ATLAS n-tuple boildown on NorduGrid, WLCG and GridFactory

Plot of ATLAS data created with GridPilot from official datasets.

This example demonstrates the use of GridPilot in data processing in high energy physics (HEP). It makes extensive use of some HEP-specific technologies that are encapsulated in GridPilot in the form of plugins: the ATLAS DB plugin and the NG and GLite computing system plugins. The jobs chosen are so-called … Read the rest