MICROELECTRONIC SYSTEMS NEWS
FILENUMBER: 491
BEGIN_KEYWORDS
FPGAS RECONFIGURABLE PROCESSING ELEMENTS XILINX
END_KEYWORDS
DATE: january 1996
TITLE: FPGAs as Reconfigurable Processing Elements
FPGAs as Reconfigurable Processing Elements
(Contributed by Brad Fawcett of Xilinx Corp.)
The following information is taken from an article that appeared
last year in XCell, a quarterly journal published by Xilinx. The
journal may be accessed via WWW.
Alternatively, a hardcopy subscription may be obtained at no charge
by sending email to brad.fawcett@xilinx.com.
In most applications, FPGAs are used to implement "glue logic",
providing the advantages of high integration levels without the
expense and risk of custom ASIC development. However, as SRAM-
based FPGA devices have increased in capability, their use as
in-system-configurable computing elements is receiving consider-
able attention. Indeed, reconfigurable FPGA technology holds the
potential for reshaping the future of computing by providing the
capability to dynamically alter a computer's hardware resources
to optimally service the immediate computational needs.
Computing circuits built from SRAM-based FPGAs can meet the true
goal of parallel processing - executing algorithms in circuitry
with the inherent parallelism of hardware, while avoiding the in-
struction fetch and load/store bottlenecks of traditional von
Neumann architectures. There are many computationally-intensive
algorithms that can benefit from being partially or wholly imple-
mented in hardware. Typically, these algorithms are too special-
ized to justify the expense of manufacturing custom IC devices.
Just as often, the "algorithm space" is very large, and it may be
impractical to perform enough simulations to find the optimal ap-
proach before committing to custom hardware.
FPGA-based coprocessors address all these issues. With an FPGA-
based "configurable coprocessor" the user can design (via FPGA
configuration) exactly the special hardware required for a given
task without having to construct new hardware for each applica-
tion. Different tasks can be time-multiplexed into the same sil-
icon. Errors can be corrected and different algorithmic ap-
proaches explored, with no further hardware expense.
Several universities and research laboratories have explored the
use of SRAM-based FPGAs to implement multi-purpose, high-speed
coprocessors for accelerating operations in computer systems.
Using these systems, desktop workstations have delivered the
performance of supercomputers for specific applications. In
particular, two projects have gained considerable prominence - the
PeRLe systems from DEC's Paris Research Lab, and the SPLASH
machines from the Supercomputing Research Center (Bowie,
Maryland). These systems consist of FPGA-based attached processors
in engineering workstations, complete with programming tools and
run-time environments, and have been the target for a variety of
"real-world" applications.
DEC's Paris Research Lab has designed and implemented four
generations of FPGA-based reconfigurable coprocessors called
Programmable Active Memories (PAMs). The most widely used version -
the PeRLe-1 - is based on a 5 x 5 array of XC3090 FPGAs.
Developed applications include long multiplication, RSA
cryptography, heat and Laplace equations, a Viterbi decoder, a sound
synthesizer, and a stereoscopic vision system. The C++ language
(coupled with a specialized library) is used for programming the
algorithms.
Two generations of the SPLASH processor, based on a linear array
of FPGAs, have been designed at the Supercomputing Research
Center (SRC). The SPLASH-1 includes a 32-stage linear logic ar-
ray with a VME interface to a Sun workstation. Each stage con-
sists of an XC3090 FPGA and a 128 Kbyte static memory buffer.
SPLASH-1's first application was to implement a systolic algo-
rithm for one-dimensional pattern matching during DNA research,
where it outperformed a Cray-2 by a factor of 325 and a custom-
built nMOS device by a factor of 45. The SPLASH-2 system is based
on XC4010 FPGA devices, and has been used to implement a number
of applications, including string searches and image processing.
Officials at the SRC indicate that they have attracted about a
dozen licensees for the SPLASH technology to date.
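To illustrate why a linear array suits this kind of one-dimensional
pattern matching, the Python sketch below evaluates an edit-distance
comparison one anti-diagonal at a time. It is only a software model
under stated assumptions, not the SPLASH-1 design: the function name,
the unit edit costs, and the choice of edit distance as the matching
measure are all illustrative. Every cell on an anti-diagonal depends
only on the two preceding anti-diagonals, so a hardware array could in
principle update all of them in the same clock tick.

```python
def wavefront_edit_distance(source, target):
    """Return (edit distance, number of wavefront steps).

    Illustrative model only: each anti-diagonal of the comparison
    table is treated as one step, because its cells are mutually
    independent and could be computed by parallel stages.
    """
    m, n = len(source), len(target)
    # dist[i][j] = edit distance between source[:i] and target[:j]
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    steps = 0
    for d in range(m + n + 1):              # one step per anti-diagonal
        steps += 1
        for i in range(max(0, d - n), min(m, d) + 1):
            j = d - i
            if i == 0:
                dist[i][j] = j              # insert j symbols
            elif j == 0:
                dist[i][j] = i              # delete i symbols
            else:
                mismatch = 1 if source[i - 1] != target[j - 1] else 0
                dist[i][j] = min(
                    dist[i - 1][j - 1] + mismatch,  # match or substitute
                    dist[i - 1][j] + 1,             # delete
                    dist[i][j - 1] + 1,             # insert
                )
    return dist[m][n], steps
```

For two sequences of lengths m and n, a sequential program updates
(m + 1)(n + 1) cells, while the wavefront needs only m + n + 1 steps
if enough stages are available. For "kitten" and "sitting" the sketch
returns a distance of 3 after 14 steps.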
Some corporations have built their own FPGA-based reconfigurable
processors for inclusion in their products. For example, the
Configurable Hardware Algorithm Mappable Preprocessor (CHAMP) was
designed by Lockheed Sanders (Nashua, NH) to perform spatial
filtering, spectral filtering, and background normalization in an
Advanced Missile Warning System. Six processing elements reside
on a VME board. Each processing element consists of two XC4013
FPGAs and a 16K x 32 dual-port RAM; the processing elements are
connected by a crossbar switch. The Bioccelerator from Compugen
Ltd. (Petah-Tikva, Israel) uses sixteen processing elements based
on XC4000 family FPGAs to accelerate "profile searches" in pro-
tein and DNA databases; the system provides two to three orders
of magnitude acceleration compared to high-end workstations.
Several emerging companies are bringing general-purpose
FPGA-based processors to the commercial marketplace, including
Annapolis Micro Systems, Inc. (Annapolis, MD), Giga Operations Corp.
(Berkeley, CA), Metalithic Systems, Inc. (Sandy, Utah), and
Virtual Computer Corp. (Reseda, CA). Overviews of these companies
and their products are provided below.
The WILDFIRE Custom Configurable Computer is based on SPLASH 2
technology transferred from the National Security Agency and the
Institute for Defense Analyses' Supercomputing Research Center.
Annapolis Micro Systems is the first licensee to offer a commer-
cial product based on the SPLASH technology. A thirty-person en-
gineering design company, Annapolis Micro Systems accepted the
challenge to bring SPLASH 2 reconfigurable computing out of the
prototype/research environment.
Annapolis Micro Systems has enhanced the SPLASH 2 design to in-
crease its commercial appeal and performance by expanding the I/O
capabilities and interoperability of the system. The most signi-
ficant enhancement is the move from a custom Futurebus-like back-
plane to a VME64 standard backplane. WILDFIRE has an open archi-
tecture that allows it to interface with commercial or custom-
built VME cards, such as high-speed data acquisition devices
(e.g., cameras and communication lines) and standard storage dev-
ices (e.g., disks and tapes).
A WILDFIRE computer consists of one to sixteen WILDFIRE Array
Cards residing in a rack-mountable VME chassis and connected with
a Sun SPARC or PCI host processor through a VMEbus to SBus or PCI
Bus interface card set. Each WILDFIRE Array Card has an array of
sixteen programmable Processing Elements (PEs) with crossbar con-
nections to each PE. Each Processing Element consists of an
XC4010 FPGA device with 512 Kbytes of high-speed memory. An ad-
ditional XC4025 device with one megabyte of memory acts as the
Control PE. A Motorola 68EC030 microprocessor on each Array Card
is used for configuration, readback, diagnostics, and high-level
data transfer and control. Three bidirectional 36-bit wide FIFOs
provide data buffering: one allocated for SIMD (single
instruction, multiple data) operations, and one each for the left
and right systolic I/O. The reconfigurable 18-port, 36-bit
crossbar is built of XC4010 FPGAs. Crossbar connectivity is
user-programmable; up to sixteen configurations can be stored,
allowing the current configuration to be changed on any clock tick.
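The crossbar behaviour described above can be pictured with a small
Python model. This is a hedged sketch, not WILDFIRE software: the
class and method names are hypothetical, and only the port count and
the limit of sixteen stored configurations come from the text. Each
stored configuration maps an output port to the input port that
drives it, and switching between stored configurations is just a
change of index between one tick and the next.

```python
class Crossbar:
    """Toy model of a crossbar with preloaded routing configurations."""

    PORTS = 18          # from the text: 18-port crossbar
    MAX_CONFIGS = 16    # from the text: up to sixteen stored configurations

    def __init__(self):
        self.configs = []    # each entry maps output port -> input port
        self.active = None   # index of the configuration in use

    def store(self, routing):
        """Store a routing table and return its slot number."""
        if len(self.configs) >= self.MAX_CONFIGS:
            raise ValueError("all configuration slots are in use")
        for out_port, in_port in routing.items():
            if not (0 <= out_port < self.PORTS and 0 <= in_port < self.PORTS):
                raise ValueError("port number out of range")
        self.configs.append(dict(routing))
        return len(self.configs) - 1

    def select(self, slot):
        """Make a stored configuration current (modelled as instantaneous)."""
        if not 0 <= slot < len(self.configs):
            raise IndexError("no configuration stored in that slot")
        self.active = slot

    def tick(self, inputs):
        """Route one word per connected output for a single clock tick."""
        if self.active is None:
            raise RuntimeError("no configuration selected")
        routing = self.configs[self.active]
        return {out_port: inputs[in_port] for out_port, in_port in routing.items()}
```

Storing a "swap" table such as {0: 1, 1: 0} and a "straight" table
such as {0: 0, 1: 1}, then calling select() between ticks, shows the
same input words arriving at different outputs on consecutive ticks.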
Standard WILDFIRE software includes a VHDL model of WILDFIRE, a C
Runtime Library, a Host Interface Driver, and a debugger.
WILDFIRE supports classic SIMD, MIMD (multiple instruction,
multiple data), and systolic processing.
The mission of Annapolis Micro Systems is to provide a fully-
supported, commercial product to address the high-speed needs of
telecommunications, real-time image processing, encryption, and
pattern matching. The company provides full technical support
for its products and offers special classes and tutorials as
well as application development services.
Established in 1982, Annapolis Micro Systems, Inc. provides cus-
tom electronic product design, including expert Xilinx design
services, to commercial and government customers. The company
has completed over 400 Xilinx FPGA designs. "Our background and
expertise in Xilinx and product design place us in a unique posi-
tion to successfully bring this Xilinx-based computer architec-
ture to market," states Jane Donaldson, President and founder of
Annapolis Micro Systems.
Annapolis Micro Systems, Inc. 190 Admiral Cochrane Drive, Suite
130 Annapolis, Maryland 21401 Phone: 410-841-2514 Fax: 410-
841-2518 E-mail: AnnapMicro@aol.com
Giga Operations Corp. was founded in 1991 to develop FPGA-based
reconfigurable computing products that deliver supercomputer per-
formance at microcomputer prices. The company is working with
OEMs and developers to create a standard Reconfigurable Computer
architecture.
Giga Operations has designed a modular, scalable reconfigurable
computing platform and development software that can be applied
to many computationally-intensive tasks. The architecture is op-
timized for processing tileable data bases, e.g., video fields or
frames. Data flow structures and computing architectures are im-
plemented in FPGAs.
One developed application, the Spectrum(tm) video computing en-
gine, is intended for applications such as machine vision sys-
tems, video editing, image processing, and 3-D volumetric image
rendering and visualization. This video processor represents the
first use of FPGAs in a reconfigurable computing product
developed specifically for general-purpose visual computing ap-
plications.
Hardware products are based on plug-and-play XMOD(tm) computing
modules, small boards with FPGAs and memory that can connect to
third-party hardware or be embedded in peripheral devices. For
example, the X210MOD features two XC4010 FPGAs, 8 Mbytes of DRAM,
256 Kbytes of SRAM, and three configurable data bus interfaces on
a 2.4" by 3.65" card. The use of isolation buffers between
FPGAs and memories allows local systolic connections to be imple-
mented in local stacks of XMODs. For video processing applica-
tions, the VIDMOD/SC module implements S-video and composite
video connections for a camera, tuner, television, or VCR.
Giga Operations' G800 VESA VL-Bus PC board delivers
high-bandwidth I/O and is a carrier for XMODs. Four stacks of
XMODs, each up to four deep, give a total of sixteen per G800 board.
Intended for high-bandwidth applications, the G800 includes a 133
MB/s VL-Bus interface to a host PC, a 100 MB/s VMC (VESA Media
Channel) bus, and a 100 MB/s interface to an external connector.
All busses are programmable and connect to FPGA pins, allowing
the development of other bus interfaces. The G800 provides three
virtual busses to each XMOD processor: one 64-bit data bus, one
32-bit data bus, and one 16-bit data bus.
At the core of Giga Operations' modular and expandable architec-
ture is the XLINK(tm) operating system, a linker and algorithm
packaging program that maps variables and FPGA functions into the
user's C program. A compiler allows the use of C syntax to build
hardware descriptions output as .XNF files. These enable C pro-
grammers to integrate host-based C programs with reconfigurable
computers for execution at supercomputer speeds. Giga Operations
also provides a Viewlogic interface for designers working with
standard Xilinx tools. A library of video computing routines and
example applications is included in the development system.
Giga Operations' modular, expandable, high-bandwidth engine is
available in an open architecture with C-based software tools and
libraries. Giga Operations, OEMs, and third-party developers are
developing hardware and software standards for reconfigurable
computing. The company and its partners are working to establish
the XLINK operating system as the basis for commercial architec-
tures in reconfigurable computing.
Giga Operations Corp. 2374 Eunice Street Berkeley, CA 94708
Phone: 510-528-8438 Fax: 510-526-6688 Email: goteam@gigaops.com
Metalithic Systems Inc. (MSI) provides quality computing plat-
forms and tools for the emerging reconfigurable computing market.
MSI has established significant expertise in reprogrammable ar-
chitectures, reconfigurable instruction set processors,
computationally-intensive macros, state machine synthesis, and
various other tools for reconfigurable logic - a technology that
will redefine the future of computing. To assist development,
MSI has developed a tool suite that augments vendor tools and in-
cludes an assembler and compiler for virtual processors, and an
integrated Windows-based development system.
Gateware, a high-performance technology developed by MSI
President and CEO Kent Gilson, is the foundation for all MSI pro-
ducts. By utilizing FPGA technology, Gateware combines the
reprogrammability of conventional microprocessors with the high-
speed processing obtained using ASICs to deliver supercomputer-
class processing for a fraction of the cost.
MSI's ACE-12 Reconfigurable Compute Engine utilizes 12 FPGAs to
perform high-speed processing; the board can be populated with
XC3090, XC3195, XC4005, or XC4010 devices. These processing ele-
ments are arranged in a parallel fashion, each with its own
high-speed SRAM, thereby allowing for the efficient implementa-
tion of multiple processors for general-purpose reconfigurable
computing. The ACE-12 system includes software and predefined
configurations that support the read/write of SRAM, downloading
of configurations to any subset of FPGAs, and concurrent confi-
guration and execution.
One application currently running on the ACE-12 is a 3 x 3
transform engine used in various computations like machine vi-
sion, time-to-frequency conversion, video filtering, video ef-
fects processing and texture mapping. This engine currently runs
at a rate of over 500 million multiply and accumulate operations
per second. A swap/sort algorithm implemented on the ACE-12 sys-
tem executes over 360 times faster than an Intel 486 processor
running at 33 MHz.
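To put the quoted rate in perspective, the Python sketch below counts
the multiply-accumulate (MAC) operations in a 3 x 3 kernel applied to
a small image. It assumes that the "3 x 3 transform" costs nine MACs
per output value, which the text does not state; the function names
are illustrative and the code says nothing about how the ACE-12
engine is actually built.

```python
MACS_PER_SECOND = 500_000_000   # rate quoted in the text


def apply_3x3(image, kernel):
    """Apply a 3 x 3 kernel to the interior of a 2-D list.

    Returns the filtered values and the number of MACs performed.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * (cols - 2) for _ in range(rows - 2)]
    macs = 0
    for r in range(rows - 2):
        for c in range(cols - 2):
            acc = 0
            for kr in range(3):
                for kc in range(3):
                    acc += kernel[kr][kc] * image[r + kr][c + kc]
                    macs += 1
            out[r][c] = acc
    return out, macs


def outputs_per_second(macs_per_output=9):
    """Output values per second at the quoted MAC rate (assumed cost per output)."""
    return MACS_PER_SECOND // macs_per_output
```

Under that nine-MAC assumption, 500 million MACs per second
corresponds to roughly 55 million output values per second.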
For multimedia and general-purpose applications where "moderate"
computing power is needed, MSI created the ACE-2 card, based on
two FPGA devices. It can stand alone or operate in concert with
the ACE-12. Fully configured, the ACE-2 can operate as a person-
al audio/video/MIDI recording studio, video teleconferencing sta-
tion, high-speed voice and data modem, video on demand CODEC, or
other I/O and computationally-intensive applications.
The SonicACE is a complete recording studio for the PC. The
SonicACE system, consisting of an ACE-2 card and SonicACE software,
can perform operations normally handled only by similar stand-
alone equipment costing as much as 4 to 10 times more. SonicACE
allows the user to mix up to 128 digital tracks and includes
group mute/solo capabilities, nondestructive pan/fade/echo/EQ, a
24-voice synthesizer, WAV sampling, master/slave MIDI
synchronization, and simultaneous Play and Record.
Metalithic Systems Inc. 9500 South 500 West, Suite 104 Sandy,
Utah 84070 Phone: 801-561-0114 Fax: 801-561-4702
In 1988, Virtual Computer Corp. (VCC) was awarded a grant through
the Small Business Innovative Research Program to design a recon-
figurable computer for the Naval Surface Weapons Department of
the U.S. Navy. The resulting system, the P-Series Virtual Com-
puter, is now available commercially, and provides over 600,000
gates of reconfigurable logic. The system includes fifty-two
XC4010 or XC4013 FPGAs interconnected using twenty-four I-Cube
IQ160 programmable interconnect devices, 8 Mbytes of high-speed
SRAM, 256 Kbytes of dual-ported RAM, and three 64-bit I/O ports.
An SBus interface is available, and interfaces to other busses
can be developed with relative ease.
"We define transformable computing systems as those machines that
use the reconfigurable aspects of FPGA technology to implement an
algorithm," states Steve Casselman, President and founder of Vir-
tual Computer Corp. "Transformable computers will play a lead
role in the evolution of a new generation of programmers,
researchers, students, and users of computers."
The recently released EVC-1 system is the first of a series of
SBus-based reconfigurable computers from VCC. The EVC-1 board
includes the bus interface, a single XC4010, XC4013, or XC4025
FPGA device, and 256K bytes of memory, and is intended for use as
a coprocessor in a Sun SPARCstation. The EVC-1 also supports a
daughterboard with 96 I/O lines for additional hardware
prototyping. The EVC-1 package includes the EVC SBus transformable
computer board, schematics, manual, SBus interface driver, source
code, and example programs. An optional 2-Mbyte SRAM module
is currently available.
The EVC-1 can be used as a logic emulator for rapid product
development, as an evaluation platform for new chips and designs,
or as an accelerator for computationally-intensive algorithms.
Used as a rapid prototyping system, the EVC-1 can accelerate
time-to-market at a relatively low cost. To evaluate the use of
a new device (such as an MPEG, DSP, or ATM chip), the IC can be
placed on an add-on module, using the EVC-1 to implement the in-
terface protocols and glue logic; this provides a cost-effective
means of testing the functionality of a variety of new chips
under consideration for designs. The EVC-1 also can be used to
"hardwire" software algorithms, dramatically accelerating slow,
repetitive software processes.
VCC's software tools are developed around the concept of
"hardware objects" - FPGA-based computing engines that are linked
to the application via C subroutine calls. VCC provides pre-
developed hardware objects obtained from third-party experts as
well as the development software for creating the objects. For
example, the Virtual Random Number Generator (VRNG) hardware
object is a true, non-deterministic random number generator
producing double-precision floating-point numbers for use in
simulations. The
VRNG hardware object performs fifteen times faster than running
the same algorithm on a SPARC2 workstation.
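The idea of a hardware object reached through an ordinary subroutine
call can be sketched in Python as follows. Everything here is
hypothetical: SimulatedBoard and HardwareObject are illustrative
names, not VCC software, and the "circuits" are plain Python
functions standing in for FPGA configurations. The sketch only shows
the calling pattern, in which the application calls a function and
the board is reconfigured whenever a different object is requested.

```python
class SimulatedBoard:
    """Stand-in for a reconfigurable board (hypothetical, not a VCC API)."""

    def __init__(self):
        self.loaded = None          # name of the configuration now loaded
        self.reconfigurations = 0   # how many times the board was reloaded
        self.engines = {}           # name -> function modelling the circuit

    def register(self, name, engine):
        """Make a configuration available under the given name."""
        self.engines[name] = engine

    def run(self, name, args):
        """Load the named configuration if needed, then execute it."""
        if self.loaded != name:     # reconfigure only when the task changes
            self.loaded = name
            self.reconfigurations += 1
        return self.engines[name](*args)


class HardwareObject:
    """Callable wrapper so the application sees an ordinary subroutine."""

    def __init__(self, name, board):
        self.name = name
        self.board = board

    def __call__(self, *args):
        return self.board.run(self.name, args)


def count_ones(value):
    """Software model of a bit-counting circuit."""
    return bin(value).count("1")


def parity_bit(value):
    """Software model of a parity circuit."""
    return count_ones(value) % 2
```

After registering both functions on one board and wrapping them as
HardwareObject instances, alternating calls between the two objects
time-multiplex the same simulated silicon, and the board's
reconfigurations counter records each switch.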
The EVC-1 transformable computer is being used by over a dozen
universities in Canada, Europe, and the United States, as well as
a number of industrial users. Military and Aerospace magazine
named it the "Editor's Choice" in the October 1994 issue. In De-
cember, VCC was selected over 60 other nominees for the first
"Small Business Innovative Research Technology of the Year" award
at a conference sponsored by NASA and the Technology Utilization
Foundation.
Virtual Computer Corp. 6925 Canby Ave., #103 Reseda, CA 91335
Phone: 818-342-8294 Fax: 818-342-0240 E-mail: info@vcc.com
dbouldin@utk.edu