Wednesday, November 24, 2010

HPCC Program Accomplishments and Plans

1. Networking 

The HPCC Program provides network connectivity among advanced computing resources, scientific instruments, and members of the research and education communities. The Program has successfully accommodated the phenomenal growth in the number of network users and their demands for significantly higher and ever increasing speeds while maintaining operational stability. R&D in advanced networking technologies is guiding the development of a commercial communications infrastructure for the Nation. The development and deployment of this new technology is jointly funded and conducted by the HPCC Program, state and local governments, the computer and telecommunications industries, and academia.

1.1. The Internet

One illustration of the global reach of HPCC technologies is that the Internet now extends across the country and around much of the world. Initially the domain of government scientists and U.S. academics, the Internet had grown to the following scale by the beginning of FY 1994:
    • Almost two million computers were accessible over the Internet.
    • More than 15,000 regional, state, and local U.S. networks and 6,300 foreign networks in approximately 100 countries were part of the Internet.
    • Nearly 1,000 4-year colleges and universities, 100 community colleges, 1,000 U.S. high schools, and 300 academic libraries in the U.S. were connected.
The HPCC Program's Internet investment primarily supports the high speed "backbone" networks linking Federally-funded high performance computing centers.

1.2. The Interagency Internet

The Interagency Internet, that portion of the Internet funded by HPCC, is a system of value-added services carried on the Nation's existing telecommunications infrastructure for use in federally-funded research and education. Its three-level architecture consists of high speed backbone networks (such as NSFNET) that link mid-level or regional networks, which in turn connect networks at individual institutions. At the beginning of the HPCC Program in FY 1992, most of the backbones were running at T1 speeds (1.5 Mb/s; megabits per second, or millions of bits per second), and international connections had been established. Peak monthly traffic on NSFNET had reached 10 billion packets (of widely varying size). In FY 1992, NSFNET speed was upgraded to T3 (45 Mb/s), and NSF made awards to industry for network registration, information, and database services and for a Clearinghouse for Networked Information Discovery and Retrieval. A short sketch of what these line rates mean for data transfer follows the list below. By the beginning of FY 1994:
    • Peak monthly NSFNET traffic reached 30 billion packets.
    • DOE established six ATM (asynchronous transfer mode) testbeds to evaluate different approaches for integrating this technology between wide area and local area networks.
    • NASA provided T3 services to two of its Grand Challenge Centers (Ames Research Center and Goddard Space Flight Center) through direct connection to NSFNET. In addition, service to several remote investigators was upgraded to T1 data rates.
    • NASA launched its Advanced Communications Technology Satellite (ACTS).

Advanced Communications Technology Satellite (ACTS) deployed by the Space Shuttle.

    • NIH and NSF funded 15 Medical Connections grants for academic medical centers and consortia to connect to the Internet.
    • Five "gigabit testbeds" established by NSF and ARPA (described below) became operational. In addition, a DOD-oriented testbed funded by ARPA focuses on terrain visualization applications.
    • ARPA established a gigabit testbed in the Washington, DC area in cooperation with more than six other agencies in the area.
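To give a concrete feel for these line rates, the short sketch below estimates idealized transfer times for a hypothetical 1-gigabyte dataset at T1, T3, and gigabit speeds. The T1 and T3 rates come from the figures quoted above; the dataset size and the Python code are illustrative only and ignore protocol overhead.

```
# Back-of-the-envelope transfer times at the line rates named above.
# The T1 and T3 rates are from the text; the 1 GB dataset is hypothetical.

RATES_MBPS = {
    "T1": 1.5,          # 1.5 megabits per second
    "T3": 45.0,         # 45 megabits per second
    "gigabit": 1000.0,  # nominal 1 Gb/s, ignoring protocol overhead
}

def transfer_seconds(size_bytes: int, rate_mbps: float) -> float:
    """Idealized transfer time: payload bits divided by raw line rate."""
    bits = size_bytes * 8
    return bits / (rate_mbps * 1_000_000)

if __name__ == "__main__":
    dataset = 1_000_000_000  # a 1 GB dataset (hypothetical example)
    for name, rate in RATES_MBPS.items():
        secs = transfer_seconds(dataset, rate)
        print(f"{name:>8}: {secs / 60:8.1f} minutes")
```

Under these assumptions the same dataset takes roughly 89 minutes at T1, about 3 minutes at T3, and seconds at gigabit speeds, which is why backbone upgrades matter so much to data-intensive research.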
Projected FY 1994 Accomplishments
    • Awards will be made to implement a new NSFNET network architecture (including network access points (NAPs), a routing arbiter, a very high speed backbone, and regional networks).
    • Additional very high speed backbones will link HPCRCs (High Performance Computing Research Centers, described below).
    • Connectivity for DOE's ESnet will grow to 27 sites.
    • Internet connections for the research and education community will become faster and more widespread.
    • Network information services and tools will be improved.
Proposed FY 1995 Activities 
    • NSFNET -- Implement new architecture (awards were made in FY 1994); implement very high speed backbone to NSF Supercomputer Centers; establish some additional high speed links for demanding applications
    • DOE -- Upgrade ESnet services to T3 and selected sites to 155 Mb/s; upgrade connectivity to Germany and Italy to T1 and to Russia to 128 Kb/s (kilobits per second or thousands of bits per second)
    • NASA's AEROnet and NSI -- Establish internal T3 and higher speed network backbone to five NASA centers
    • Expand connectivity to schools (K-12 through university) -- connectivity funded by NSF and NIH will reach a total of 1,500 schools, 50 libraries, and 30 medical centers; NASA's Spacelink computer information system for educators will be made available via the Internet; toll-free dial-up access will be provided to teachers without Internet access.
    • NIH -- Acquire gigabit (billions of bits per second) local networks for use with multiple parallel computers and as a backbone to enable development of the Xconf image conferencing system
    • Integrate NOAA's more than 30 environmental data centers into the Internet through high speed connectivity and new data management tools
    • Expand EPA connectivity to reach a substantial percentage of Federal, state, and industrial environmental problem-solving groups and test distributed computing approaches to complex cross-media environmental modeling
    • Continue to support and improve information services such as the NSFNET Internet Network Information Center (InterNIC)

1.3. Gigabit Speed Networking R&D

New technologies are needed for the new breed of applications that require high performance computers and that are demanded by users across the U.S. These technologies must move more information faster in a shared environment of perhaps tens of thousands of networks with millions of users. Huge files of data, images, and videos must be broken into small pieces and moved to their destinations without error, on time, and in order. These technologies must manage a user's interaction with applications. For example, a researcher needs to continuously display on a local workstation output from a simulation model running on a remote high performance system in order to use that information to modify simulation parameters.
As these gigabit speed networks are deployed, the current barriers to more widespread use of high performance computers will be surmounted. At the same time, high speed workstations and small and mid-size scalable parallel systems will gain wider use.
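A minimal sketch of this segment-and-reassemble idea follows, written in Python for illustration. The packet layout (sequence number, payload, checksum) is a hypothetical simplification rather than any specific HPCC protocol; it only shows how ordering and error detection can be restored at the receiving end.

```
import hashlib
from dataclasses import dataclass

# Illustrative segmentation and in-order reassembly. The packet format is a
# hypothetical simplification, not any particular HPCC network protocol.

@dataclass
class Packet:
    seq: int          # sequence number used to restore ordering
    payload: bytes    # fixed-size slice of the original data
    checksum: str     # digest used to detect corruption in transit

def segment(data: bytes, size: int = 512) -> list[Packet]:
    """Split data into fixed-size, sequence-numbered packets."""
    chunks = (data[off:off + size] for off in range(0, len(data), size))
    return [Packet(i, c, hashlib.md5(c).hexdigest()) for i, c in enumerate(chunks)]

def reassemble(packets: list[Packet]) -> bytes:
    """Verify each packet and restore the original byte order."""
    for p in packets:
        if hashlib.md5(p.payload).hexdigest() != p.checksum:
            raise ValueError(f"packet {p.seq} corrupted")
    return b"".join(p.payload for p in sorted(packets, key=lambda p: p.seq))

if __name__ == "__main__":
    original = b"simulation output " * 200
    scrambled = list(reversed(segment(original)))   # packets arrive out of order
    assert reassemble(scrambled) == original
```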

A teraflops (a trillion floating point operations per second) computing technology base needs gigabit speed networking technologies.


HPCC-supported gigabit testbeds funded jointly by NSF and ARPA test high speed networking technologies and their application in the real world.

The HPCC Program is developing a suite of complementary networking technologies to take fullest advantage of this increased computational power. R&D focuses on increasing the speed of the underlying network technologies as well as developing innovative ways of delivering bits to end users and systems. These include satellite, broadcast, optical, and affordable local area designs.
The HPCC Program's gigabit testbeds are putting these technologies to the test in resource-demanding applications in the real world. These testbeds provide working models of the emerging commercial communications infrastructure, and accelerate the development and deployment of gigabit speed networks.
In FY 1994-1995, HPCC-funded research is addressing the following:
    • ATM/SONET (Asynchronous Transfer Mode/Synchronous Optical Network) technology -- "fast packet switched" cell relay technology (in which small packets of fixed size can be rapidly routed over the network) that may scale to gigabit speeds; a minimal cell-segmentation sketch follows this list
    • Interfacing ATM to HiPPI (High Performance Parallel Interface) and HiPPI switches and cross connects to make heterogeneous distributed high performance computing systems available at high network speeds
    • All-optical networking
    • High speed LANs (Local Area Networks)
    • Packetized video and voice and collaborative workspaces (such as virtual reality applications that use remote instruments)
    • Telecommuting
    • Intelligent user interfaces to access the network
    • Network management (for example, reserving network resources, routing information over the networks, and addressing information not only to fixed geographical locations but also to people wherever they may be)
    • Network performance measurement technology (to identify bottlenecks, for example)
    • Networking standards (such as for interoperability) and protocols (including networks that handle multiple protocols such as TCP/IP, GOSIP/OSI, and popular proprietary protocols)
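The cell-segmentation sketch referenced in the ATM/SONET bullet above is given here. It assumes the standard ATM cell size (53 bytes: a 5-byte header plus a 48-byte payload) but uses a simplified placeholder header rather than real VPI/VCI encoding; the circuit identifier and message are hypothetical.

```
# Sketch of ATM-style cell segmentation: fixed 53-byte cells, 5 bytes of
# header plus 48 bytes of payload. The header here is a simplified stand-in;
# a real ATM header carries VPI/VCI, payload type, and a header checksum.

CELL_PAYLOAD = 48  # bytes of user data per cell
HEADER_SIZE = 5    # bytes of routing/control information per cell

def to_cells(circuit_id: int, data: bytes) -> list[bytes]:
    """Chop a message into fixed-size cells, padding the final cell."""
    cells = []
    for off in range(0, len(data), CELL_PAYLOAD):
        chunk = data[off:off + CELL_PAYLOAD].ljust(CELL_PAYLOAD, b"\x00")
        header = circuit_id.to_bytes(HEADER_SIZE, "big")  # simplified header
        cells.append(header + chunk)
    return cells

if __name__ == "__main__":
    cells = to_cells(circuit_id=7, data=b"x" * 1000)
    assert all(len(c) == HEADER_SIZE + CELL_PAYLOAD for c in cells)
    print(len(cells), "cells")   # 21 cells for a 1,000-byte message
```

Because every cell has the same size, switches can route them in hardware at very high rates, which is the property that may allow this approach to scale to gigabit speeds.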
Additional FY 1995 plans include completing DOE's high speed LAN pilot projects and providing select levels of production-quality video/voice teleconferencing capability. NASA and ARPA plan experiments in mitigating transmission delay to the ACTS satellite. NASA plans to extend terrestrial ATM networks to remote locations via satellite and to demonstrate distributed airframe/propulsion simulation via satellite.

1.4. Network Security

Network data security is vital to HPCC agencies and to many other users such as the medical and financial communities. FY 1994-1995 research is directed at incorporating security in the management of current and future networks by protecting network trunks and individual systems. Examples include:
    • Joint ARPA/NSA projects in gigabit encryption systems for use with ATM
    • Use of the ARPA-developed KERBEROS authentication system by DOE for distributed environment authentication and secure information search and retrieval
    • Methods for certifying and accrediting information sent over the network (a minimal illustrative sketch follows this list)
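As a minimal illustration of the certification idea in the last bullet, the sketch below attaches a keyed integrity tag to a message so the receiver can verify its origin and contents. The use of HMAC, the shared secret, and the message are illustrative stand-ins; the Program documents do not specify this particular mechanism.

```
import hmac
import hashlib

# Illustrative only: a keyed message-authentication tag as a stand-in for
# "certifying information sent over the network." The shared secret and the
# message are hypothetical.

SECRET = b"shared-key-established-out-of-band"   # hypothetical shared secret

def certify(message: bytes) -> bytes:
    """Attach a keyed digest so the receiver can check origin and integrity."""
    return hmac.new(SECRET, message, hashlib.sha256).digest()

def verify(message: bytes, tag: bytes) -> bool:
    """Recompute the digest and compare in constant time."""
    return hmac.compare_digest(certify(message), tag)

if __name__ == "__main__":
    msg = b"experiment results, run 42"
    tag = certify(msg)
    assert verify(msg, tag)
    assert not verify(msg + b" (altered)", tag)   # tampering is detected
```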
NSA is addressing the compatibility of DOD private networks with commercial public networks.
The rapid growth of networks and of the number of computers connected to those networks has prompted the establishment of incident response teams that monitor and react to unauthorized activities, potential network intrusions, and potential system vulnerabilities. Each team serves a specific constituency such as an organization or a network. One of the first such teams was CERT, the Computer Emergency Response Team, based at the Software Engineering Institute in Pittsburgh, PA. CERT was established in 1989 by ARPA in response to increasing Internet security concerns, and serves as a response team for much of the Internet. FIRST, the Forum of Incident Response and Security Teams, was formed under DOD, DOE, NASA, NIST, and CERT leadership. FIRST is a growing global coalition of response teams that alert each other about actual or potential security problems, coordinate responses to such problems, and share information and develop tools in order to improve the overall level of network security.


2. High Performance Computing Systems

At the beginning of the HPCC Program in 1991, few computer hardware vendors were developing scalable parallel computing systems, even though they acknowledged that traditional vector computers were approaching their physical limits. By 1993, all major U.S. vendors had adopted scalable parallel technology. Today, a wide range of new computing technologies is being introduced into commercial systems that are now being deployed at the HPCRCs, in industry, and in academia. These include the whole range of scalable parallel and traditional systems such as fine- and coarse-grained parallel architectures, vector and vector/parallel systems, networked workstations with high speed interfaces and switches, and heterogeneous platforms connected by high speed networks. Some of these systems now scale to hundreds of gigaflops (billions of floating point operations per second). The HPCC Program is well on track toward meeting its FY 1996 goal of demonstrating the feasibility of affordable multipurpose systems scalable to teraflops (trillions of floating point operations per second) speeds.
The architectures of scalable systems -- how the processors connect to each other and to memory, and how the memory is configured (shared or distributed) -- vary widely. How these architectures communicate with storage systems such as disks or mass storage and how they network with other systems also differ.
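A toy sketch of the shared-versus-distributed distinction follows, using a hypothetical workload (summing a list) in Python: in the shared-memory version two threads write into one address space, while in the distributed-memory version two processes own separate data and exchange results as explicit messages. Real scalable systems use hardware interconnects and message-passing libraries rather than threads and pipes; this only illustrates the programming-model difference.

```
import threading
from multiprocessing import Process, Pipe

# Toy contrast of the two memory configurations described above, on a
# hypothetical workload (summing a list of numbers).

DATA = list(range(1000))
HALF = len(DATA) // 2

def _shared_worker(partial, idx, chunk):
    partial[idx] = sum(chunk)        # shared memory: write into a common list

def _distributed_worker(conn, chunk):
    conn.send(sum(chunk))            # distributed memory: send an explicit message
    conn.close()

def shared_memory_sum() -> int:
    """Two threads read and write one shared address space directly."""
    partial = [0, 0]
    threads = [threading.Thread(target=_shared_worker, args=(partial, 0, DATA[:HALF])),
               threading.Thread(target=_shared_worker, args=(partial, 1, DATA[HALF:]))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partial)

def distributed_sum() -> int:
    """Two processes own separate data and exchange results as messages."""
    results = []
    for chunk in (DATA[:HALF], DATA[HALF:]):
        parent, child = Pipe()
        proc = Process(target=_distributed_worker, args=(child, chunk))
        proc.start()
        results.append(parent.recv())
        proc.join()
    return sum(results)

if __name__ == "__main__":
    assert shared_memory_sum() == distributed_sum() == sum(DATA)
```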


Simulation of the behavior of materials at the fundamental atomic scale (adsorption and diffusion of germanium on a reconstructed Si(100) surface). Simulated using the iPSC/860 hypercube and the Paragon XP/S-5 supercomputers.

In past years, the HPCC Program concentrated on the design and manufacture of high performance systems, including fundamental underlying components, packaging, design tools, simulations, advanced prototype systems, and scalability. ARPA is the primary HPCC agency involved in developing this underlying scalable systems technology, often cost-shared with vendors, for the high performance computing systems placed at HPCC centers across the country. Efforts are still devoted to developing the foundation for the next generation of high performance systems, including new system components that overcome speed and power limitations, scalable techniques to exploit mass storage systems, sophisticated design technology, and ways to evaluate system performance. Additional effort is now being devoted to developing systems software, compilers, and environments to enable a wide range of applications.


Scaling of memory chip technology is essential to increase both speed and capacity of computer-based systems. This figure shows details of 16 memory cells in a high density 1 megabit Static Random Access Memory (SRAM) that were created and visualized using advanced modeling tools for integrated circuit and technology design developed at Stanford University. Each color represents a physical layer of material (grey = silicon, yellow = silicon oxide, pink = polysilicon, teal = local interconnect, and blue = metal) which has been patterned using advanced lithography and etching techniques. The details of geometries and spacings between the various layers are critical in determining both the performance of the SRAM and its manufacturability. Solid geometry models and three-dimensional simulations of both materials interactions and electrical performance are invaluable in optimizing such high density chips.
(Figure Courtesy of Cypress Semiconductor)

The applications now running on the new systems handle substantially more data -- both input and output -- than on traditional systems. Graphical display is critical to analyzing these data quickly and effectively. For example, output from three-dimensional weather models must be displayed and overlaid with real-time data collected from networked instruments at remote observation stations. Hardware to handle this task well, such as workstations for scientific visualization, is part of a high performance computing environment.

Comparison of Josephson Junction technology with gallium arsenide and CMOS (complementary metal oxide semiconductor) technologies showing potential for dramatic improvements in performance with low power.
The HPCC Program develops and evaluates a variety of innovative technologies that have potential for future use beyond the next generation of systems. Included in these are superconductive devices. These devices have demonstrated blinding speed and exceptionally low power consumption at the chip level, but need to be scaled up to more complex components to be useful. If transferable to the system level, these devices would have major impact on computing and communications switching systems. NSA is developing a technology demonstration of a multigigabit per second 128x128 crossbar switch that is potentially expandable to 1000x1000 at very low power. If successful, the technology will be evaluated in a system level computing application.
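A minimal model of the crossbar idea is sketched below: an N x N crossbar can connect any input to any output, provided no output is claimed by two inputs in the same cycle. The sizes and requests shown are illustrative; the NSA demonstration hardware itself is not modeled.

```
# Minimal model of an N x N crossbar switch: any input may connect to any
# output, as long as no output is claimed twice in the same cycle. Sizes and
# traffic are illustrative only.

class Crossbar:
    def __init__(self, n: int):
        self.n = n
        self.output_owner = {}           # output port -> input port, this cycle

    def connect(self, inp: int, out: int) -> bool:
        """Claim output `out` for input `inp`; fail if the output is busy."""
        if out in self.output_owner:
            return False                 # output conflict: request blocked
        self.output_owner[out] = inp
        return True

    def new_cycle(self):
        self.output_owner.clear()        # connections are torn down each cycle

if __name__ == "__main__":
    xbar = Crossbar(128)                 # the 128x128 size mentioned in the text
    assert xbar.connect(0, 5)
    assert not xbar.connect(1, 5)        # second request for output 5 is blocked
    assert xbar.connect(1, 6)
    xbar.new_cycle()
    assert xbar.connect(1, 5)            # output 5 is available again next cycle
```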

3. High Performance Computing Research Centers (HPCRCs)

HPCRCs are a cornerstone of the HPCC Program. HPCRCs are the home of a variety of systems: early prototypes, small versions of mature systems, full-scale production systems, and advanced visualization systems. The current production systems, capable of hundreds of gigaflops of sustained throughput, will be succeeded by teraflops systems. These systems are being used on Grand Challenge class applications that cannot be scaled down without modifying the problem being addressed or requiring unacceptably long execution times. The largest of these applications are being run on multiple high performance computers located around the country and networked in the gigabit testbeds.
An interdisciplinary group of experts meets at these centers to address common problems. These include staff from the HPCRCs themselves, hardware and software vendors, Grand Challenge applications researchers, industrial affiliates that want to develop industry-specific software, and academic researchers interested in advancing the frontier of high performance computing. Funding is heavily leveraged, with HPCC agencies often contributing discretionary funds, hardware vendors providing equipment and personnel, and affiliate industries paying their fair share. Industrial affiliation offers a low risk environment for exploring and ultimately exploiting HPCC technology. Two of these industrial affiliations are:
    • The Oil Reservoir Modeling Grand Challenge in which more than 20 companies and several universities participate
    • The High Performance Storage System Project in which more than 12 companies and national laboratories participate
Production-quality operating systems and software tools are developed at these centers, thereby removing barriers to efficient hardware use. Applications software tailored to high performance systems is developed by early users, many of whom access these systems over the Internet, and increasingly over the gigabit testbeds, from their workstations. Production-quality applications software is often first run on HPCRC hardware. The wide range of hardware at HPCRCs makes them ideal sites for developing the conventions and standards that enable and test interoperability, and for benchmarking systems and applications software.

Production-quality applications software often is run first on computing systems at HPCRCs.

The major HPCRCs are:
NSF Supercomputer Centers --
    • Cornell Theory Center, Ithaca, NY
    • National Center for Supercomputing Applications, Urbana-Champaign, IL
    • Pittsburgh Supercomputing Center, Pittsburgh, PA
    • San Diego Supercomputer Center, San Diego, CA
Tens of thousands of users from more than 800 institutions in 49 states, along with 111 industrial partners, have computed on systems at the NSF centers. Currently there are 8,000 users and 78 partners. The centers are developing a National Metacenter Environment in which a user will view multiple centers as one. The National Center for Atmospheric Research (NCAR) in Boulder, CO, also receives HPCC funds.
NSF Science and Technology Centers
    • Center for Computer Graphics and Scientific Visualization -- Brown University, Providence, RI; Caltech, Pasadena, CA; Cornell University, Ithaca, NY; University of North Carolina, Chapel Hill, NC; University of Utah, Salt Lake City, UT
    • Center for Research on Parallel Computation, Rice University, Houston, TX
NASA Centers --
    • Ames Research Center, Mountain View, CA
    • Goddard Space Flight Center, Greenbelt, MD
DOE Centers --
    • Los Alamos National Laboratory, Los Alamos, NM
    • National Energy Research Supercomputer Center, Lawrence Livermore National Laboratory, Livermore, CA
    • Oak Ridge National Laboratory, Oak Ridge, TN
The DOE centers accommodate more than 4,000 users from national laboratories, industry, and academia.
Major systems at HPCRCs include one or more of each of the following (the number of processors in the largest machine at an HPCRC is shown in parentheses):
    • Convex
- C3880 (8 vector processors)
    • Cray Research
- C90 (16 vector processors)
- T3D (512 processors)
- YMP (8 vector processors)
    • Digital Equipment Corp.
- Workstation Cluster
    • Hewlett-Packard
- H-P Workstation Cluster
    • IBM
- ES9000/900 (6 vector processors)
- PVS
- SP1 (512 processors)
- Workstation Cluster
    • Intel
- iPSC/860 (64 processors)
- Paragon (512 processors)
    • Kendall Square Research
- KSR 1 (160 processors)
    • MasPar
- MasPar MP-2 (16,000 processors)
- MasPar MP-1 (16,000 processors)
    • nCube
- nCUBE2
    • Thinking Machines
- CM-2 (32,000 processors)
- CM-5 (1,024 processors)
Smaller versions of some of these scalable high performance systems have been installed at more than a dozen universities. The HPCRCs also use a variety of scientific workstations, such as those from Silicon Graphics and Sun Microsystems, for numerous tasks.
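As a rough illustration of how the processor counts in the list above relate to aggregate speeds, the sketch below multiplies a processor count by an assumed per-processor peak rate. The per-processor figures are hypothetical placeholders chosen only to show the arithmetic; the list gives processor counts but not per-processor speeds.

```
# Rough relationship between processor count and aggregate peak speed:
# peak flops ~= processors x per-processor peak. The per-processor Mflops
# figures below are hypothetical placeholders for illustration only.

HYPOTHETICAL_MACHINES = {
    # name: (processor count, assumed Mflops per processor -- hypothetical)
    "512-processor MPP": (512, 100),
    "1,024-processor MPP": (1024, 120),
    "16-processor vector system": (16, 1000),
}

def aggregate_gflops(processors: int, mflops_each: float) -> float:
    """Peak gigaflops under the stated (hypothetical) per-processor rate."""
    return processors * mflops_each / 1000.0

if __name__ == "__main__":
    for name, (procs, mflops) in HYPOTHETICAL_MACHINES.items():
        print(f"{name:28s} ~ {aggregate_gflops(procs, mflops):6.1f} Gflops peak")
```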
FY 1995 Plans
    • NSF will install new scalable parallel hardware and hardware upgrades, enhance Metacenter resources, and establish several more regional alliances.
    • DOE will install two different 150 gigaflop machines at two sites.
    • NASA will establish a prototype high performance computing facility comparable in nature but not in performance to the ultimate teraflops facility. It will be configured with advanced high performance machines, early systems or advanced prototypes of important storage hierarchy subsystems, and sufficient advanced visualization facilities to enable system scaling experiments. NASA Grand Challenge researchers in Federal laboratories, industry, and academia will access these advanced systems using the Internet and gigabit speed networks. These researchers will provide a spectrum of experiments for scalability studies. Prototype systems and subsystem interfaces and protocol standards will be established and evaluated, accelerating the understanding of the character of future teraflops computing systems.
    • NOAA will acquire a high performance computing system for its Geophysical Fluid Dynamics Laboratory at Princeton, NJ to develop new scalable parallel models for improved weather forecasting and for improved accuracy and dependability of global change models.
    • EPA will acquire a scalable parallel system to support more complex multipollutant and cross-media (combined air and water) impact and control assessments and test distributed computing approaches to complex cross-media environmental modeling.
