MICROELECTRONIC SYSTEMS NEWS


FILENUMBER: 521
BEGIN_KEYWORDS
ASSESSMENT FPL-95
END_KEYWORDS
DATE: april 1996
TITLE: An Assessment of FPL-95



An Assessment of FPL-95

(Contributed by Alan Hunsberger, National Security Agency)

The following is an excerpt from a trip report which represents Alan's
personal views and not those of his employer.  To obtain the full
report, please send email to dbouldin@utk.edu

The Fifth International Workshop on Field-Programmable Logic and
Applications (FPL-95) was hosted by the University of Oxford and was
organized by the
University's Department for Continuing Education in conjunction with
the Department of Engineering Science at Oxford and the Department of
Computing at the Imperial College of Science, Technology and Medicine.
FPL'95 was held from 29 August to 1 September 1995 (Tuesday through
Friday) at Jesus College (part of the University of Oxford) in Oxford,
United Kingdom. Accommodations were in student rooms at the college and
at local hotels, bed and breakfasts, etc. A conference banquet was held
in the medieval dining hall of the college.
England was very hot and dry this summer, but during the week of the
conference, the temperature was in the 60's and 70's and we had a few
days of rain. Jesus College is located near several museums and
Oxford's central shopping district. During one lunch break, I got a
chance to briefly visit the science and technology museum. The museum
contains the largest collection of sundials in Europe. One pocket
sundial was rather unique in that it contained a miniature cannon that
could be set up to automatically fire a blank charge at twelve noon
when the sun's rays would be focused through a glass lens onto the
cannon's fuse.

The conference began on Tuesday with an all-day tutorial session and
ended on Friday right after lunch. Paper sessions were roughly 90
minutes (three papers, once four) and were separated by a 90-minute
lunch or a 50-minute tea break/poster session (usually four posters,
once three). Four paper sessions were held on Wednesday, three on
Thursday, and two on Friday. Two poster sessions were held on each of
Wednesday and Thursday, and one on Friday. The conference banquet was
held on Thursday night in the college's dining hall.

The list of attendees contained 79 names; 36 people were from the U.K.,
20 from the U.S., 5 each from Germany and Austria, and 1 or 2 from
twelve other countries (including Australia, Japan and South Africa).
There were 46 people from academia, 31 from industry, and 2 from
government agencies. The largest academic contingents were from the
University of Oxford (5), the University of Edinburgh (3) and the
Technical University of Vienna (3). The largest contingents from
industry were Hewlett-Packard, Pilkington and Xilinx with 4 each. There
were 73 male attendees and 6 female.

There were 28 formal papers and 19 poster papers presented at the
conference. All of these papers (except one which was handed out later)
were included in the printed proceedings available at the start of the
conference. 36 of the 47 papers were from academic institutions. Only a
couple of vendors participated with small exhibits.

Some of the more interesting papers are noted below:

*"A Discussion of Trade-Offs Made in the Development of a New FPGA
Architecture" - Anthony Stansfield and Ian Page - Describes what logic
is needed in FPGAs to provide good synthesis, what logic was in FPGAs
circa 1993, and what logic is in a new architecture designed to meet
the need for more on-chip memory and a better combination of large and
small gates. The chip is based on a content-addressable memory (CAM)
cell that can be used both as a functional cell and a routing cell. For
more information on this and other FPGA projects at the University of
Oxford (e.g., HARP1 and Ruby), see the web site
      http://www.comlab.ox.ac.uk/oucl/hwcomp.html  
  
*"Some Notes on Power Management on FPGA-Based Systems" - Eduardo Boemo
- Explores the usefulness of low-power design methods based on
architectural and implementation modifications (e.g., pipelining,
better partitioning and path delay equalization) and presents a
methodology for FPGA power analysis. Higher power usage causes (among
other concerns) lower speeds since delays increase by 0.3% for each 1
degree C increase in temperature in CMOS devices.
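
As a rough illustration of the 0.3% figure (a sketch, not taken from
the paper), the Python below assumes the delay increase is linear in
the temperature rise and does not compound. The function names, the
20 degree C rise and the 50 MHz clock are hypothetical values chosen
only for the example.

```python
# Illustrative sketch only: a linear derating model built on the 0.3% per
# degree C figure quoted above. The function names and the linear
# (non-compounding) assumption are not from the paper.

PCT_PER_DEG_C = 0.3  # delay increase per 1 degree C rise, from the text


def delay_multiplier(temp_rise_c: float) -> float:
    """Factor by which path delays grow for a given temperature rise."""
    return 1.0 + (PCT_PER_DEG_C / 100.0) * temp_rise_c


def derated_clock_mhz(nominal_mhz: float, temp_rise_c: float) -> float:
    """Maximum clock rate once delays have grown with temperature."""
    return nominal_mhz / delay_multiplier(temp_rise_c)


if __name__ == "__main__":
    rise = 20.0  # hypothetical self-heating, in degrees C
    print(f"delay multiplier: {delay_multiplier(rise):.3f}")
    print(f"50 MHz derates to {derated_clock_mhz(50.0, rise):.2f} MHz")
```

Under these assumptions a 20 degree C rise lengthens delays by about
6%, so a design that just meets timing when cool loses roughly the
same fraction of its clock rate.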
 
*"The Proper Use of Hierarchy in HDL-Based High Density FPGA Design" -
Carol Fields - Greater system complexity, coupled with time-to-market
pressures (short design cycle), is causing a move to top-down design
methodologies (i.e., where the designer works with high-level logical
or functional abstractions rather than gates) and logic synthesis
tools. Hierarchical design structure is critical in getting high
utilization in FPGA devices and provides other benefits (e.g., easier
debugging, mixed mode design entry, work distribution among designers).
With the current state of the art in design tools, the most effective
combination of FPGA device utilization and performance is achieved when
large designs are partitioned into modules of 150 to 250 CLBs (for
Xilinx FPGAs) or 3,000 to 5,000 gates, and floor planning techniques
are used to place these modules on the FPGA. The paper gives examples
using Synopsys FPGA Compiler and Xilinx Floorplanner.

*"Compiling RUBY onto FPGAs" - Shaori Guo - Describes a prototype
hardware compiler which compiles a design written in the Ruby language
to a netlist that can be mapped to an FPGA by vendor software. Ruby is
a relational language for capturing block diagrams parametrically. One
focus is on constructing provably correct designs.

*"Use of Reconfigurability in Variable-Length Code Detection at Video
Rates" - Gordon Brebner - Considers the implementation of a fax
compression scheme decoder in a Xilinx XC6215. Makes use of the partial
reconfigurability of the XC6215 to provide logic paging of tree search
structures; the idea being that a couple of tree structures stay
resident in the FPGA to detect the most common codewords while other
tree structures are swapped in as needed. Makes good use of the
capability of the XC6200 family to support registers that can be
read/written from a processor controlling the FPGA. Testing of the
design is awaiting the availability of the XC6215 chip.

*"Classification and Performance of Reconfigurable Architectures" -
Steven Guccione - Over 40 computation systems based on reconfigurable
logic have been constructed to date. A list of these is available at
the web page:
      http://www.utexas.edu/~guccione/HW_list.html
Traditional machine architectures require a trade-off between
performance and flexibility: an instruction set architecture (e.g.,
a microprocessor chip) provides flexibility but low performance,
whereas a hardware architecture (e.g., a VLSI chip) provides
high performance but is inflexible. Machines based on reconfigurable
logic can provide flexibility and performance. {Note that an
instruction set architecture can be viewed as containing a
reconfigurable processing unit (RPU). The arithmetic logic unit is an
RPU with a dedicated port for reconfiguration data (the opcode), with
only a few possible configurations (determined by the number of bits in
the opcode), and with rapid and frequent reconfiguration.} The paper
presents a classification system for RPU machines based on the RPU size
(small or large) and the presence or absence of dedicated local memory
with the RPU. All of the RPU machines contain a host processor along
with the RPU. The four categories are:
    Reconfigurable Supercomputer (RS) - large RPU with (large) local
memory (and a high bandwidth link to the host processor); e.g., Splash,
Teramac, and Virtual Computer.
    Application Specific Architecture (ASA) - large RPU with no local
memory; e.g., GANGLION and logical simulators for rapid prototyping.
    Reconfigurable Logic Coprocessor (RLC) - small RPU with (small)
local memory; e.g., BORG and Xputer or an accelerator board processor.
    Custom Instruction Set Architecture (CISA) - small RPU with no local
memory; e.g., PRISM and Spyder or a setup where the host views the RPU
as an "instruction" that it can execute.
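
The four categories above amount to a two-by-two table on RPU size and
local memory. A minimal Python sketch of that table follows; the class
and function names are made up for illustration, and the example
machines are the ones named in the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RpuMachine:
    """Hypothetical record for one machine in the classification."""
    name: str
    large_rpu: bool      # RPU size: True = large, False = small
    local_memory: bool   # dedicated local memory attached to the RPU?


def classify(machine: RpuMachine) -> str:
    """Return the category abbreviation used in the list above."""
    if machine.large_rpu:
        return "RS" if machine.local_memory else "ASA"
    return "RLC" if machine.local_memory else "CISA"


EXAMPLES = [
    RpuMachine("Splash", large_rpu=True, local_memory=True),
    RpuMachine("GANGLION", large_rpu=True, local_memory=False),
    RpuMachine("BORG", large_rpu=False, local_memory=True),
    RpuMachine("PRISM", large_rpu=False, local_memory=False),
]

if __name__ == "__main__":
    for m in EXAMPLES:
        print(f"{m.name}: {classify(m)}")
```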

*"Automatic Synthesis of Parallel Programs Targeted to Dynamically
Reconfigurable Logic Arrays" - Maya Gokhale - Developing a high level
programming language to use instead of an HDL, especially to achieve
temporal partitioning on partially reconfigurable FPGAs. Temporal
partitioning enables the running of larger problem sets and allows a
separate debug subroutine to be run only when needed. The paper
discusses a data-parallel bit C (dbC) compiler for partitioning a
parallel program. dbC is a procedural language using a computational
model of a SIMD processor array where a host processor issues a single
instruction stream to an array of execution units (EUs) which perform
each instruction on local instances of data in lock step (the lock step
can be modified based on, e.g., local conditions in the data). Each EU
is a virtual processor. The dbC compiler generates an intermediate
format that is further compiled by subroutine into a VHDL source module
that can be run through standard CAD tools to produce an FPGA
configuration. Variables have registers that are placed in fixed
locations. The current FPGA
target is National's Adaptive Processor Architecture (NAPA) which
contains up to four ring-connected Adaptive Processing Units each of
which contains a multichip module with four SRAMs and four National
Semiconductor CLAy FPGAs. A compilation example is given using the
genome match problem. Further testing is awaiting a NAPA board.
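
The SIMD computational model described above can be pictured with a
small Python sketch. This is not dbC syntax and not the compiler's
output; it is only a software model, with hypothetical names, of a
host broadcasting one instruction stream to execution units that act
in lock step unless a local condition disables them.

```python
from typing import Callable, List


class ExecutionUnit:
    """One virtual processor holding its own local instance of the data."""

    def __init__(self, value: int) -> None:
        self.value = value
        self.active = True  # inactive EUs sit out broadcast instructions


def broadcast(eus: List[ExecutionUnit], op: Callable[[int], int]) -> None:
    """Host issues one instruction; every active EU applies it."""
    for eu in eus:
        if eu.active:
            eu.value = op(eu.value)


def set_activity(eus: List[ExecutionUnit],
                 predicate: Callable[[int], bool]) -> None:
    """Enable or disable each EU based on a condition on its local data."""
    for eu in eus:
        eu.active = predicate(eu.value)


def run_demo() -> List[int]:
    eus = [ExecutionUnit(v) for v in (1, 2, 3, 4)]
    broadcast(eus, lambda v: v + 1)           # every EU adds one
    set_activity(eus, lambda v: v % 2 == 0)   # keep only even-valued EUs
    broadcast(eus, lambda v: v * 10)          # applied by active EUs only
    return [eu.value for eu in eus]


if __name__ == "__main__":
    print(run_demo())
```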

*"Implementation Approaches for Run Time Reconfiguration" - Brad
Hutchings - A good overview of compile-time reconfiguration (CTR) and
run-time reconfiguration (RTR) and the design issues involved. With
CTR, the FPGA system is configured once for a particular application
which runs until completion. Designing for CTR is very similar to
designing for an ASIC implementation and current CAD tools work quite
well. With RTR, multiple cooperating configurations of the FPGA system
are loaded as needed while the application runs. Designing for RTR
requires dividing the application into time-exclusive segments (a.k.a.
temporal partitioning) that do not run concurrently and coordinating
the transfer of data between configuration segments (i.e., providing
for inter-configuration communications). Current CAD tools don't do
either of these tasks, but research on tool development is underway
(e.g., see Maya Gokhale's paper above). Two different RTR strategies
are possible: global and local. With global RTR, all of the FPGA chips
are reconfigured at each time segment, and the system goes through a
configure-run-configure-run sequence. The design task with global RTR
is to partition the application into roughly equally-sized pieces (for
efficient use of the FPGAs); this requires an iterative process because
you don't really know the physical size of the partitions until
placement and routing have been done. Inter-configuration communication
can be accomplished by storing intermediate results in a fixed system
memory or register resource.  Conventional CAD tools can be used once
the partitioning has been done.  With local RTR, only a piece of the
system (e.g., pieces of some FPGA chips and the whole of other FPGA
chips) is configured at various times as the rest of the system
continues to operate. The design task with local RTR is to functionally
partition the application into various fine-grained operations that are
not necessarily temporally exclusive.  Local RTR design is highly
complex; current CAD tools are a poor match.  Both forms of RTR are
still in the early stages of investigation.
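
The global RTR sequence can be sketched in a few lines of Python. This
is a software stand-in under stated assumptions, not the paper's
method: each "configuration" is an ordinary function, reconfiguring
the FPGAs is reduced to a trace entry, and a dictionary plays the role
of the fixed system memory used for inter-configuration communication.
All names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Memory = Dict[str, int]                        # stands in for fixed memory
Segment = Tuple[str, Callable[[Memory], None]]  # (name, work to perform)


def run_global_rtr(segments: List[Segment], memory: Memory) -> List[str]:
    """Step through time-exclusive segments: configure, run, repeat."""
    trace = []
    for name, compute in segments:
        trace.append(f"configure {name}")  # all FPGAs reloaded here
        compute(memory)                    # segment reads/writes memory
        trace.append(f"run {name}")
    return trace


def stage_sum(memory: Memory) -> None:
    # First configuration leaves an intermediate result in memory.
    memory["partial"] = sum(range(1, 5))


def stage_scale(memory: Memory) -> None:
    # Second configuration picks up the result left by the first.
    memory["result"] = memory["partial"] * 2


if __name__ == "__main__":
    mem: Memory = {}
    steps = run_global_rtr([("sum", stage_sum), ("scale", stage_scale)], mem)
    print(steps, mem)
```

Only the memory survives from one segment to the next, which is why
the partitioning has to settle what each configuration leaves behind.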

*"Using Reconfigurable Hardware to Speed Up Product Development and
Performance" - Describes the HARP1 circuit board which was designed to
demonstrate the use of mathematical methods to help overcome problems
in the design of correct circuits. The board contains a transputer, a
commercial FPGA, RAM, etc. For more information, see the Oxford web
site.
  
A separate fee was charged for the tutorial session on "Field
Programmable Devices (Technologies, Applications and Tools)" conducted
by Stephen Brown of the University of Toronto. The tutorial was
attended by nine people including myself. Dr. Brown gave a full
overview of programmable devices from simple and complex PLDs to the
wide range of FPGAs now available. He also gave an overview of the CAD
design process and described the operation of a typical CAD tool at
each step. The tutorial included a nice text written by Dr. Brown and
published by Stan Baker Associates.

FPL'96 will be held on 23-25 September 1996 in Darmstadt, Germany and
will be chaired by Manfred Glesner of the Darmstadt University of
Technology. For more information, you can check the WWW or email the
FPL'96 secretary at

   fpl96@microelectronic.e-technik.th-darmstadt.de

dbouldin@utk.edu