[Air-l] Re: e-science, the grid, and supercomputers
Denise N. Rall
denrall at yahoo.com
Wed Sep 24 10:52:16 PDT 2003
Dear Jeremy and Air'ers -
Having attended the e-science & e-social science
session at the recent Oxford Internet Institute iCS
symposium, here's a bit of a survey of what grid
computing might mean in the context of a series of
challenges that face e-Science. (The references are a
bit sketchy at this point.)
Cheers! Denise
24. September. 2003
Denise N. Rall, Ph.D. candidate,
School of Environmental Science & Management
Southern Cross University, Lismore, NSW 2480 Australia
and
visiting scholar, Learning Technology and Distance
Education (LTDE),
Division of Information Technology
University of Wisconsin-Madison
Madison, Wisconsin 53706 USA
Why high-speed (grid) computing infrastructures are
required to support the enterprise of e-science
High speed (grid) computing infrastructures designed
to support e-science can be divided into the following
categories:
1) high speed connectivity to large computing
resources
2) interactivity among virtual research teams &
projects
3) remote access to scientific technologies and
tele-medicine
4) information services (messaging and databases).
For examples of present and envisioned use of grid
computing for purposes of e-science, see below.
1) high speed connectivity to large computing
resources
Grid computing can look to the history of
super-computer access for scientists who had large
data processing requirements (e.g. physics). The
super-computer sites were remotely accessed, and
service was provided by a time-shared operation that
was very expensive. The code for the project was
beamed over a high-speed satellite link that was not
ftp-compatible. Therefore, small subsets of computing
problems had to be organized and represented in code
that could be tested locally. Then, the code to
address the large data set could be uploaded to the
local computer, where it would sit in a queue for
beaming to the super-computer site (MACC, n.d.). Note
that all of these procedures were expensive. It is
particularly costly to code up small data sets or
simulations in a small computing environment for
delivery onto a larger computing platform.
In theory, access to grid computing will remove the
necessity for working in an 'offline-online' fashion
to process and analyze large data sets. Likewise,
grid computing should provide real-time analysis for
ongoing large computations in the fields of physics,
astronomy, computer science, and economics, as well as
biological areas such as the Human Genome Project.
2) interactivity among virtual research teams &
projects
The scientific community lives in a fully interactive
environment, with globally distributed research
projects. For one recent example, the zebra mussel
invasion of the coastal waters of the United States
has now reached the Great Lakes (Lee, 2002). Due to
practices in international shipping, it was difficult
to trace the genealogy of this invasive pest, although
current thinking points to the Black Sea in Turkey.
The research effort (at this point) includes data
collection from over 100 research sites, including
the Black Sea, the Caspian Sea, European seas and
lakes, the St. Lawrence Seaway in Canada, and the
Great Lakes in the US. Clearly, high-speed computer
access is required to input data into this vast
database and to analyze it in real time. For another
timely example, the SARS epidemic prompted an
unprecedented international response. Data analysis
and broadcast via web site alerted the worldwide
community. The concerted interactivity and frequent
updates via the WWW allowed public health officers,
epidemiologists, genetic analysis laboratories,
medical staff, and the World Health Organization (WHO)
to work together, resulting in the quickest
identification of a life-threatening virus to date
(see www.who.org).
In short, natural phenomena are not demarcated along
political boundaries. Scientists will increasingly
need their colleagues based at research centers,
universities, and medical laboratories around the
world to assist with consultation, data collection and
analysis, as well as real-time worldwide
dissemination of results.
3) remote access to scientific technologies and
tele-medicine
Real-time (grid) computing will be essential to
promote a variety of remote services that include, at
their core, the application of scientific work. As
above, political boundaries cannot contain scientific
technologies, particularly during military conflicts
and environmental or health-based disasters. While
real-time computing is used to support strategic
weapon systems, the Web also provides access for
real-time reportage on military conflicts and a quite
visible channel for political resistance. Beyond
posting on the web, however, there is an urgent need
to work via the web, which the current internet does
not allow. Note that there is a long history of remote
diagnostic work by medical pioneers in remote
locations, including the African subcontinent and the
Australian outback. Heart surgeons in Germany have
already begun to work under computing 'hoods' that
allow them to operate on patients physically located
in the next room via robotic medical instrumentation
(see report from Catalyst, abc.net.au/science). More
generally, many more remote patients need assistance
than is currently available, but working in areas of
conflict and epidemic is fraught with risk to the
medical staff.
Tele-medicine, however, could diagnose via x-ray and
other instrumentation, perhaps located on floating
hospitals not directly in the zone of conflict or
epidemic.
Working remotely in tele-medicine and other remote
technologies will clearly require a high-speed
computing grid, which will supplement the internet's
ability to publicize international conflicts (and
solutions).
4) information services (messaging and databases).
The requirement for high-speed computing to deliver
increasingly large data resources and information
access to facilitate scientific work has already been
well detailed by other authors (see Palmer, 2001).
However, one theme carries on from above. Remote
high-speed access to large-scale information and
database infrastructures will greatly assist the
world's information-poor, but it will also assist the
world's information-rich. Remote links to
developing-world technicians and scientists are
obviously required to facilitate scientific data
collection and analyses (above). But there is more.
Today's computing infrastructures cannot be built and
maintained without high-speed access to a worldwide
community of programmers and database managers. The
internet, in its current topology, is a global
enterprise that cannot develop within the confines of
the industrialized world community. Likewise, the
implementation and management of a high-speed grid to
support the enterprise of e-science, and all of its
supporting technologies, will be a world enterprise.
In conclusion, locating the e-science enterprise means
building a truly international framework alongside the
infrastructural requirements for grid-based computing.
References: [sketchy]
Catalyst, science television show,
www.abc.net.au/science.
Lee, C. E. 2002. Evolutionary genetics of invasive
species. Trends Ecol. Evol. 17: 386-391.
MACC [Madison Academic Computing Center], n.d.
"Procedures for Super-computer access"
printed brochure, 3 pp.
Palmer, Carole L. (2001). Work at the Boundaries of
Science: Information and the Interdisciplinary
Research Process. Dordrecht: Kluwer.
SARS epidemic reports, collected on www.who.org.
=====
"The distance between here and there is growing; and getting even larger as we speak" (S. S. Hall)
Denise N. Rall, PhD student, School of EnvironSciMgmt, Southern Cross Uni, Lismore, NSW, 2480 Australia Phone +61-2-6624-8627 Fax +61-2-6624-8637 Office (Tuesdays) (02) 6620 3577 Mob 0438 233 344
http://www.scu.edu.au/schools/rsm/staff/pages/drall/index.html