MICROELECTRONIC SYSTEMS NEWS

FILENUMBER: 9823 BEGIN_KEYWORDS TRIP_REPORT FPGA-98 END_KEYWORDS DATE: July 1998 TITLE: TRIP REPORT ON FPGA-98
=======================================================================

FPGA'98 - ACM 6th International Symposium on  Field  Programmable
Gate Arrays

Trip Report from Alan Hunsberger

26 March 1998

[Editor:

1. The Program for FPGA-98 may be accessed at: 
   http://www.eecs.nwu.edu/~hauck/fpga98

2. Information on FPGA-99 may be accessed at:
   http://www.ece.nwu.edu/~hauck/fpga99 
]

Sponsorship

As is the custom, the symposium was sponsored by the ACM  Special
Interest  Group  on  Design Automation (SIGDA). Financial support
(e.g., for the opening reception and other food) was provided  by
the  three previous industrial sponsors of the conference, Actel,
Altera, and Xilinx, with the addition this year  of  Lucent.  The
ACM  furnishes  the  administrative support to set up and run the
conference  (and,  as  usual,  the  conference  was   very   well
organized).  Jason  Cong  from  UCLA was the General Chair of the
conference this year, and Sinan Kaptanoglu  from  Actel  was  the
Program Chair.


Dates/Venue

FPGA'98 was held from 22  through  24  February  (Sunday  through
Tuesday)   at  the  Doubletree  Hotel  in  Monterey,  California.
Actually, the conference  sessions  were  held  in  the  Monterey
Conference Center which is a separate organization from the hotel
but physically connected to it. The paper sessions were  held  in
an  auditorium  setting  which  is  different  from previous FPGA
conferences where the presentations were given  in  a  "ballroom"
setting  with tables. The conference facilities and the food were
very good. The weather at the  FPGA  conference  has  been  going
downhill  for  the  past few years (sunny and 70's in 1996, sunny
and 60's in 1997, and  partly  rainy  and  50's  this  year)  but
hopefully  next  year  will  be better. The hotel is a very short
walk from Monterey's Fisherman's  Wharf,  so  there's  good  food
available  before  and  after the conference and you can have fun
watching the seals swimming in the harbor.


Schedule

The conference runs much the same each year and begins on  Sunday
evening  with  registration  and  a reception. This year, the two
working days of the conference were broken into  10  presentation
sessions  with  two  or  three  presenters  speaking for about 20
minutes each. The presentation sessions were separated by either a
roughly hour-long poster session/coffee break or a longer lunch break.
During the breaks, posters on  the  presented  papers  and  other
papers  were available for review and discussion. There were also
a few displays by FPGA, software, and system vendors.  A  banquet
followed  by  a  panel  session  entitled "Impossible Demands and
Constraints from Hell; How to Tell What Makes a  Good  FPGA"  was
held on Monday night.


Attendees

A list of attendees was available at the start of the second day.
I  like  getting  the  list  during  the conference, but for some
reason this  year's  list  didn't  include  email  addresses  and
telephone  numbers. Here are some attendance comparisons with the
previous two years based on the attendee lists.

               FPGA'98        FPGA'97        FPGA'96

Total           208            205            173

Industry        152 (73%)      131 (64%)      130 (75%)
Academia         56 (27%)       74 (36%)       43 (25%)

U.S.            178 (86%)      173 (84%)      138 (80%)
Non U.S.         30 (14%)       32 (16%)       35 (20%)

Attendance this year was about the same as in 1997. Almost all of
the  attendees  were male; only about a dozen women attended. The
largest industrial contingents were  from  Xilinx  (15),  Cypress
(13),  Vantis  (13),  Actel  (12)  and  Altera  (10). The largest
academic  contingents  were  from  University  of  California  at
Berkeley   (10),   University   of  Toronto  (6),  University  of
California at Los Angeles (5) and Stanford (5). The largest
attendance  from outside of the U.S. came from Canada (11), Japan
(5) and Germany (4).
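
The percentages in the attendance table above can be rechecked
from the raw counts. The short sketch below does that; the counts
are copied from the table, while the dictionary layout and the
function name are only illustrative.

```python
# Attendance counts copied from the table above.
attendance = {
    "FPGA'98": {"industry": 152, "academia": 56, "us": 178, "non_us": 30},
    "FPGA'97": {"industry": 131, "academia": 74, "us": 173, "non_us": 32},
    "FPGA'96": {"industry": 130, "academia": 43, "us": 138, "non_us": 35},
}


def shares(counts):
    """Return (total, {category: percent of total, rounded})."""
    # Industry plus academia gives the total head count for the year.
    total = counts["industry"] + counts["academia"]
    return total, {k: round(100 * v / total) for k, v in counts.items()}


for year, counts in attendance.items():
    total, pct = shares(counts)
    print(year, total, pct)
```

The rounded shares agree with the figures printed in the table
for all three years.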


Presenters

This year, 25 papers were presented  in  the  regular  conference
sessions.   Sinan  Kaptanoglu,  the Program Chair, noted that the
quality of the papers is continuing to go up and that  about  30%
more  papers were submitted this year. (I estimate that 80 papers
were submitted this year given that 62 were submitted last year.)
All but two of the presenters were male. Some demographics on the
presented papers are:

Industry       5 (20%)               U.S.       18 (72%)
Academia      20 (80%)               Non U.S.    7 (28%)

Three of the 7 foreign papers came  from  Canada.  Eight  of  the
papers  were presented in PowerPoint style, and I'm sure that the
trend will be toward fewer viewgraph style presentations  in  the
coming  years.  Seven  of  the  papers  acknowledged  some  DARPA
funding, which I expect comes from the DARPA  Adaptive  Computing
Systems Program.


Presentation Sessions

Below for each paper, I give the title, the  authors'  names  and
organizations,  and  a  short summary of what they presented. The
presenter is indicated by an " * ". The papers that  acknowledged
some  DARPA  funding are noted. I don't include the poster papers
because they were not included in the proceedings  and  were  not
usually available as handouts.


Session 1: "New FPGA Architectures"

"A Novel Predictable Segmented FPGA Routing  Architecture"  -  E.
Ochotta*,  P. Crotty, C. Erickson, C. Huang, R. Jayaraman, R. Li,
J.  Linoff, L. Ngo, H. Nguyen, K. Pierce, D. Wieland,  J.  Zhuang
and  S.   Nance, Xilinx Inc. - Describes research at Xilinx on an
architecture known  as  "Alexander".  The  idea  was  to  try  to
overcome   the  tileability  and  area  problems  of  a  crossbar
interconnect approach while keeping the ability  to  connect  any
two  CLBs through a single PIP (programmable interconnect point).
Some of the ideas were used in the XC4000 EX, XL and XV  families
and Virtex.

"More Wires and Fewer LUTs: A Design Methodology for FPGAs" -  A.
Takahara*,  T. Miyazaki, T. Murooka, M. Katayama, T. Ichimori, K.
Hayashi, A. Tsutsui and K.  Fukami,  NTT  -  Describes  a  design
methodology that can be used to develop FPGA architectures with a
good balance of logic blocks and routing resources (wires).  They
used this methodology to develop PROTEUS-Lite, an enhanced
version of their previous PROTEUS chip. The area taken up by the
wires  in  the  new chip is much larger than the area taken up by
the logic (1120 LUTs). Routability was, of course, much better in
PROTEUS-Lite  which  tested at about 100% routability with 80% of
the LUTs in use.

"Optimizations for a  Highly  Cost-Efficient  Programmable  Logic
Architecture"  -  K. Veenstra*, B. Pedersen, J. Schleicher and C.
Sung,  Altera  Corp.  -  Describes  Altera's  research   on   the
"Botticelli"  FPGA  architecture  which resulted in the FLEX 6000
family. The family is designed for low manufacturing  cost  while
keeping  good  performance  and  usability. Manufacturing cost is
becoming more and more dominated by testing and  packaging  costs
rather than die costs. So for the configuration memory, they used
a small sequential memory cell rather than a standard RAM cell to
allow  faster  configuration times (and so allow faster testing).
They  also  focused  on   their   customers'   future   packaging
requirements  and  eliminated the costs of supporting out-of-date
packages.


Session 2: "Technology Mapping for FPGAs"

"Boolean Matching  for  Complex  PLBs  in  LUT-based  FPGAs  with
Application  to Architecture Evaluation" - J. Cong and Y. Hwang*,
UCLA - Describes a way to  determine  if  any  particular  "wide"
boolean  function  can  be  implemented  in  a  given logic block
architecture  containing  multiple,  small  LUTs.  (Note:  if   a
particular  logic  block structure is guaranteed to implement any
function of up to n variables, a "wide" function is a function on
more  than  n variables.) They tested their methods on the XC4000
CLB and several other structures.

"A New Retiming-based Technology Mapping Algorithm for  LUT-based
FPGAs"  -  P.  Pan*,  Clarkson Univ. and C. Lin, Verplex Systems,
Inc. - Retiming is a method of relocating the FFs  in  a  circuit
without  changing  its  functionality  or  structure.  The  paper
describes their iterative retiming procedure  to  find  the  best
mapping  (smallest  clock period) of any sequential circuit to an
architecture based on k-LUTs. They experimented with  4-LUTs  and
showed the potential of their approach.
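
To make the retiming idea concrete, here is a minimal sketch of
the classic Leiserson-Saxe formulation, in which each edge weight
is the number of FFs on that connection and a per-node lag r moves
FFs across the node. This is not the authors' mapping algorithm;
the three-node circuit, the unit delays, and all names below are
made up for illustration.

```python
def retime(edges, lag):
    """Apply w_r(u, v) = w(u, v) + r(v) - r(u) to every edge.

    edges maps (u, v) to the number of FFs on that edge;
    lag maps a node name to its integer retiming value r.
    """
    retimed = {}
    for (u, v), ffs in edges.items():
        new_ffs = ffs + lag.get(v, 0) - lag.get(u, 0)
        if new_ffs < 0:
            raise ValueError("illegal retiming: negative FF count")
        retimed[(u, v)] = new_ffs
    return retimed


def clock_period(edges, delay):
    """Longest combinational path, i.e. along edges with no FFs.

    Assumes the circuit has no FF-free cycle (no combinational loop).
    """
    succ = {node: [] for node in delay}
    for (u, v), ffs in edges.items():
        if ffs == 0:
            succ[u].append(v)
    memo = {}

    def longest(node):
        if node not in memo:
            tail = max((longest(s) for s in succ[node]), default=0)
            memo[node] = delay[node] + tail
        return memo[node]

    return max(longest(node) for node in delay)


# Hypothetical loop a -> b -> c -> a with two FFs on the feedback edge.
delay = {"a": 1, "b": 1, "c": 1}
edges = {("a", "b"): 0, ("b", "c"): 0, ("c", "a"): 2}
moved = retime(edges, {"c": 1})  # pull one FF back across node c

print(clock_period(edges, delay), clock_period(moved, delay))
```

In this toy case the longest FF-free path shrinks from three node
delays to two while the loop still holds two FFs.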


Session   3:   "Multi-FPGA   Systems   &   Other   Reprogrammable
Architectures"

"A Hybrid Complete-Graph  Partial-Crossbar  Routing  Architecture
for  Multi-FPGA  Systems"  -  M.  Khalid*  and  J. Rose, Univ. of
Toronto - Describes their research on using a  mix  of  hardwired
and programmable interconnects in a system with some given number
of FPGAs. Each FPGA has n hardwired interconnects to  each  other
FPGA   and  has  m  interconnects  to  each  of  some  number  of
programmable interconnect devices. They have a routing tool which
understands the architecture of the system being considered. They
experimented with various size systems based on the  Xilinx  4013
(1152 4-LUTs, 1152 FFs and 192 usable I/O pins) and looked at
results for a set of circuits (e.g., with 5366 LUTs, 1040 FFs and
60  I/Os).  Their  architecture typically used fewer pins and had
lower circuit delay than the partial crossbar approach.
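
The pin budget implied by this description is easy to tabulate.
The sketch below counts the pins one FPGA spends on inter-chip
wiring, given n hardwired connections to each other FPGA and m
connections to each programmable interconnect device. Only the
192 usable I/O pins come from the report; the system size, n, m,
and the function names are hypothetical.

```python
def pins_used_per_fpga(num_fpgas, num_pids, n, m):
    """Pins one FPGA spends on inter-chip wiring in the hybrid scheme."""
    hardwired = n * (num_fpgas - 1)  # n wires to each of the other FPGAs
    programmable = m * num_pids      # m wires to each interconnect device
    return hardwired + programmable


def fits(num_fpgas, num_pids, n, m, usable_io=192):
    """True if the wiring fits in the usable I/O pins of one FPGA."""
    return pins_used_per_fpga(num_fpgas, num_pids, n, m) <= usable_io


# Hypothetical system: 4 FPGAs, 4 interconnect devices, n = 16, m = 36.
print(pins_used_per_fpga(4, 4, 16, 36), fits(4, 4, 16, 36))
```

Setting n to zero gives a pure partial crossbar, and setting m to
zero gives a pure complete graph, so the same function covers both
extremes of the mix.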

"Managing  Pipeline  Reconfigurable  FPGAs"  -  S.  Cadambi*,  J.
Weener,  S.   Goldstein, H. Schmit and D. Thomas, Carnegie Mellon
Univ. - DARPA funding - Describes the FPGA  architecture,  called
PipeRench,  they  developed  to  allow hardware virtualization of
pipelined circuits. Their approach uses physical  hardware  logic
"stripes"  through  which  the configurations for the stages of a
pipelined circuit can be run. They have a prototype chip and  CAD
tools. They believe a chip with 0.35 micron technology could hold
enough configuration  memory  for  512  virtual  stripes  and  28
physical stripes each with 32 4-bit ALUs.

"Configuration  Prefetch  for   Single   Context   Reconfigurable
Coprocessors"  -  S. Hauck*, Northwestern Univ. - DARPA funding -
Describes configuration prefetch experiments he  is  doing  on  a
simulation  of  a  system  consisting of a microprocessor coupled
with a reconfigurable coprocessor. As the microprocessor executes
some  application  code,  it  calls  the  coprocessor  to execute
particular instructions. As it's  running  the  application,  the
microprocessor  can preload configuration code to the coprocessor
so that the coprocessor is ready to execute its next  instruction
call.  The  problem  is  to  determine the proper time and proper
instruction for the microprocessor to preload so as  to  minimize
the overall delay in executing the application code. He describes
the algorithm and tools he has  developed  to  do  this  and  his
experimental results.


Session 4: "Partitioning and Floor Planning for FPGAs"

"Circuit Partitioning with Complex Resource Constraints in FPGAs"
- H. Liu* and D. Wong, Univ. of Texas at Austin, and K. Zhu, Actel
Corp. - A given FPGA increasingly offers several kinds of computing
resources, for example, different kinds
of logic  modules,  on-board  memory,  and  support  for  special
functions.  To do a good job of partitioning/placing a circuit on
a single FPGA or a set of several FPGAs, there  needs  to  be  an
accurate  metric  of  available  resources  and  the partitioning
algorithm needs to take this into account. This  paper  describes
one  such metric and partitioning algorithm and gives the results
of experiments using their approach  with  parameters  consistent
with Actel's ES6500 family.

"Timing  Driven  Floor  Planning  on  Programmable   Hierarchical
Targets"  -  S.  Senouci, A. Amoura*, H. Krupnova and G. Saucier,
Institut National Polytechnique de Grenoble - Describes a boolean
network  model  of  a  circuit and a way to use timing "cones" of
nodes in  the  circuit  to  guide  floorplanning.  Locations  are
assigned  to  the  critical  parts  of  the  circuit,  and  these
assignments are then stored as constraint files for  input  to  a
standard  vendor  place and route tool. Experiments targeting the
Vantis MACH5 CPLD  family  were  conducted  using  a  variety  of
circuits with gains of 15% to 20% in the critical path delay.


Session 5: "Fault Detection and Fault Tolerance for FPGAs"

"Bridging Fault Detection in FPGA Interconnects Using IDDQ" -  L.
Zhao,  D.  Walker and F. Lombardi*, Texas A&M Univ. - Describes a
static current testing  method  to  check  for  shorts  (bridging
faults)  in  the routing resources (both the connection nets that
run between the logic and the control nets that set the  switches
between  the  connection  nets). Experiments with the Xilinx 4000
family showed that the method is very efficient  and  gives  high
fault  coverage.  The  method  didn't work well when used to test
Actel devices; the reason isn't known yet.

"Efficiently Supporting Fault Tolerance in FPGAs" - J. Lach*,  W.
Mangione-Smith and M. Potkonjak, UCLA - DARPA funding - Describes
a way to take advantage of reconfigurability  to  provide  system
reliability   without   resorting  to  component  replication.  A
physical design (after initial place and  route)  is  partitioned
into  "tiles"  with  fixed  functionality and fixed interfaces to
other tiles. Spare resources are then introduced into  the  tiles
so  that  if  a fault is detected, a new configuration of the bad
tile can be loaded into the same physical space occupied  by  the
bad  tile. They have developed the CAD tools to do this and their
experiments have indicated the technique  is  effective  and  the
overhead  is  low.  This  particular  effort assumes that a fault
detection and location system is in place.


Session 6: "Fast CAD Tools for FPGAs"

"Fast Module Mapping and Placement for Datapaths in FPGAs"  -  T.
Callahan*, P. Chong, A. DeHon and J. Wawrzynek, Univ. of
California at Berkeley - DARPA funding -  Traditional  CAD  tools
don't  do  well with datapath designs involving multi-bit logical
and arithmetic operations.  This paper describes  a  new  mapping
and  placement  tool  called  GAMA  that  preserves  the datapath
structure rather than flattening the design to gates. GAMA uses a
tree-covering  algorithm that does not provide optimal solutions.
But experiments mapping datapath designs to the Xilinx 4000  have
shown that GAMA gives fast results that are as good as or better
than those from a flattened or hard macro approach.

"Fast Integrated Tools  for  Circuit  Design  with  FPGAs"  -  S.
Gehring  and  S. Ludwig*, Swiss Federal Institute of Technology -
Describes a design system for the Xilinx 6200 family using a Lola
HDL front-end and a new 6200 back-end. The compile/place cycle is
very  fast  and  deterministic  so  the  process   can   be   run
interactively    with   the   designer   iteratively   specifying
placements. The router  lets  individual  nets  be  routed  in  a
specified  order.  Their system is one or two orders of magnitude
faster than the XACT Step 6000 V1.1.2 tool but only supports  one
global clock and does not support "Magic" routing. The system can
be downloaded free, and the tools are part of the H.O.T. Works
development system from Virtual Computer Corporation (VCC).

"A Fast Routability-Driven Router for FPGAs"  -  J.  Swartz*,  V.
Betz,  and  J.  Rose,  Univ.  of  Toronto  -  Describes a routing
algorithm/tool (based on the PathFinder algorithm)  that  quickly
determines  whether a particular circuit routing problem is "low-
stress" (there are at least 10%  more  routing  tracks  available
than   needed),   difficult   and  time-consuming  to  route,  or
impossible to route. Using parameters for a  target  FPGA  and  a
given  placed circuit, the algorithm estimates the minimum number
of tracks per routing channel needed and  compares  this  to  the
number  of  tracks  available. Benchmark tests on an island style
FPGA architecture with unit-length track  segments  gave  results
comparable to a simulated annealing process and at routing speeds
of roughly 55,000 LUT/FF pairs per minute on low-stress problems.
The next step is to do the same for segmented FPGA architectures.
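
The three-way classification can be sketched as a simple
comparison of tracks needed against tracks available. Only the
10% margin for "low-stress" comes from the report. Treating
"fewer tracks than needed" as impossible and everything in between
as difficult is an assumption made here for illustration, and the
function name is hypothetical.

```python
def classify_routing(tracks_needed, tracks_available):
    """Classify a routing problem from per-channel track counts."""
    if tracks_available < tracks_needed:
        # Assumed boundary: not enough tracks to route at all.
        return "impossible"
    # Integer form of: available >= 1.10 * needed.
    if 10 * tracks_available >= 11 * tracks_needed:
        return "low-stress"
    # Assumed: a margin under 10% means slow, difficult routing.
    return "difficult"


for needed, available in [(20, 22), (20, 21), (20, 19)]:
    print(needed, available, classify_routing(needed, available))
```

The integer comparison avoids floating-point rounding right at the
10% boundary.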


Session 7: "Time Multiplexed FPGAs"

"Scheduling Designs into a Time-Multiplexed FPGA" - S.
Trimberger*,  Xilinx,  Inc. - Describes a tool that takes a large
design targeted to a Xilinx 4000 type architecture and divides it
into  multiple  configurations  that  can  be  run  on  a  multi-
configuration, time-multiplexed version of  the  architecture.  A
paper  discussing the 8-configuration device being considered was
presented at FCCM'97. The 8-layer device was triple the  size  of
the  one-layer version, consumed double digit watts of power, and
ran slower than the one-layer version.

"Partitioning Sequential Circuits on  Dynamically  Reconfigurable
FPGAs"  - D. Chang* and M. Marek-Sadowska, Univ. of California at
Santa  Barbara  -  Describes  investigations  into   partitioning
sequential  circuits  for multi-configuration architectures. They
present a gate-level model to describe computations  in  such  an
architecture.   Experiments   showed   that   you  need  to  have
proportionately more communication resources than logic resources
as you add more configuration layers.


Session 8: "Technology Mapping for FPGAs with Embedded Memory"

"SMAP: Heterogeneous Technology Mapping  for  Area  Reduction  in
FPGAs with Embedded Memory Arrays" - S. Wilton*, Univ. of British
Columbia - Describes an algorithm to use embedded  memory  arrays
to  implement  logic.  The  algorithm identifies the parts of the
circuit being implemented that can be efficiently mapped  to  the
available  memory arrays and then maps the rest of the circuit to
the LUTs. Tests of the algorithm using 2K memory  arrays  and  4-
LUTs showed area savings over the simple 4-LUT implementation.
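
The underlying idea, that a memory array is just a large lookup
table, can be shown in a few lines. The sketch below assumes the
2K-bit array is configured as 2048 words of 1 bit, so it can hold
any single-output function of 11 inputs. That configuration, the
majority function used as the example, and all names are
assumptions for illustration; this is not the SMAP algorithm
itself.

```python
ADDR_BITS = 11
ARRAY_BITS = 1 << ADDR_BITS  # 2048 one-bit words, i.e. a 2K array


def program_array(func):
    """Fill the array with the truth table of an 11-input function."""
    contents = []
    for addr in range(ARRAY_BITS):
        inputs = tuple((addr >> b) & 1 for b in range(ADDR_BITS))
        contents.append(func(inputs) & 1)
    return contents


def read_array(contents, inputs):
    """Evaluate the stored function by using the inputs as the address."""
    addr = sum(bit << b for b, bit in enumerate(inputs))
    return contents[addr]


def majority(bits):
    """Hypothetical example logic: 1 if more than half the inputs are 1."""
    return int(sum(bits) > len(bits) // 2)


contents = program_array(majority)
print(len(contents), read_array(contents, (1,) * 6 + (0,) * 5))
```

An 11-input function like this would otherwise need a tree of
several 4-LUTs, which is where the area savings come from.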

"Technology Mapping for FPGAs with Embedded Memory Blocks"  -  J.
Cong  and  S. Xu*, UCLA - Describes a tool targeted to the Altera
FLEX10K family for using the embedded memory arrays to  implement
logic.  The  algorithm  uses the concepts of "Maximum Fanout Free
Cones"  and  "Maximum  Fanout  Free  Subgraphs"  and   could   be
generalized  to other FPGA families.  Experiments showed that the
technique reduces LUT use while keeping roughly the same delays.


Session 9: "Novel FPGA Applications"

"A Survey of CORDIC Algorithms for FPGA  Based  Computers"  -  R.
Andraka*,  Andraka  Consulting Group, Inc. - CORDIC is an acronym
for COordinate Rotation DIgital Computer and  denotes  a  set  of
hardware  efficient  iterative  techniques  using only shifts and
adds to compute a wide range of trigonometric, hyperbolic, linear
and  logarithmic  functions.  CORDIC algorithms generally produce
one bit of accuracy  for  each  iteration.   CORDIC  methods  are
powerful  DSP  tools  and  have  been proposed for solving linear
systems and  for  computing  DFTs,  DCTs,  Discrete  Hartley  and
Chirp-Z  transforms. This paper surveys the CORDIC algorithms and
gives  some  ideas  for  implementing  them   on   FPGAs   either
iteratively or as pipelines.
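
As an illustration of the shift-and-add idea, here is a minimal
fixed-point sketch of CORDIC in rotation mode, computing sine and
cosine. It is a generic textbook version, not code from the paper.
The word length, iteration count, and names are assumptions, and
the input angle is assumed to lie between -pi/2 and pi/2.

```python
import math

FRAC_BITS = 16   # assumed fixed-point fraction width
ITERATIONS = 16  # roughly one bit of accuracy per iteration
ONE = 1 << FRAC_BITS

# Precomputed table of atan(2**-i) in fixed point.
ATAN_TABLE = [round(math.atan(2.0 ** -i) * ONE) for i in range(ITERATIONS)]

# The rotations stretch the vector; start from 1/gain to compensate.
gain = 1.0
for i in range(ITERATIONS):
    gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
START_X = round(ONE / gain)


def cordic_sin_cos(angle):
    """Return (sin, cos) of angle in radians, for |angle| <= pi/2."""
    x, y, z = START_X, 0, round(angle * ONE)
    for i in range(ITERATIONS):
        # Each step uses only shifts, adds, and one table lookup.
        if z >= 0:
            x, y, z = x - (y >> i), y + (x >> i), z - ATAN_TABLE[i]
        else:
            x, y, z = x + (y >> i), y - (x >> i), z + ATAN_TABLE[i]
    return y / ONE, x / ONE


s, c = cordic_sin_cos(math.pi / 6)
print(s, c)
```

The loop body maps directly onto either an iterative datapath (one
stage reused ITERATIONS times) or a pipeline with one stage per
iteration, the two implementation styles the paper discusses.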

"FPGA-Based Sonar Processing" - P. Graham* and B. Nelson, Brigham
Young  Univ.  -  DARPA  funding  -  Describes  a multi-board FPGA
approach   to   conventional   time-delay   sonar    beamforming.
Compared with the SHARC ADSP-21060 DSP, implementations based on
the Xilinx 4028XL FPGA show roughly a factor-of-4
cost/performance advantage for the FPGA system, including memory
costs.
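
Conventional time-delay beamforming amounts to delaying each
sensor's samples so that a wavefront from the chosen direction
lines up, then summing. The sketch below shows that delay-and-sum
step with whole-sample delays. The three-sensor data and the
names are invented for illustration and say nothing about the
authors' actual design.

```python
def delay_and_sum(channels, delays):
    """Sum sensor channels after delaying each by a whole sample count.

    channels: list of equal-length sample lists, one per sensor.
    delays:   integer delay (in samples) applied to each channel.
    """
    length = len(channels[0])
    output = []
    for t in range(length):
        acc = 0
        for samples, d in zip(channels, delays):
            idx = t - d
            if 0 <= idx < length:  # samples outside the record count as 0
                acc += samples[idx]
        output.append(acc)
    return output


# Hypothetical pulse that reaches sensor 2 first and sensor 0 last.
channels = [
    [0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
]
print(delay_and_sum(channels, [0, 1, 2]))  # steered at the source
print(delay_and_sum(channels, [0, 0, 0]))  # unsteered
```

With the matching delays the three pulses add up in a single
sample, and without them the energy stays spread out.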

"Evolving Computer Programs using  Rapidly  Reconfigurable  FPGAs
and  Genetic Programming" - J. Koza and F. Bennett III*, Stanford
Univ.; J.  Hutchings and S. Bade, Convergent Design,  L.L.C.;  M.
Keane,  Martin  Keane, Inc.; and D. Andre, Univ. of California at
Berkeley - Gives a good set of references on genetic programming,
describes  the  types of genetic programming problems where FPGAs
can be of use, and describes how the Xilinx 6200  family  can  be
used   to   accelerate   the  fitness  measurement  part  of  the
processing. Gives a very good,  detailed  example  of  using  the
genetic programming approach to design sorting networks.
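
A fitness measure for a candidate sorting network can be sketched
using the zero-one principle: a comparator network sorts every
input if and only if it sorts every input made of 0s and 1s. The
code below counts how many binary inputs a network sorts. Whether
the authors used exactly this measure is not stated in the report,
so treat it as an assumption; the names are hypothetical.

```python
from itertools import product


def apply_network(network, values):
    """Run a list of (i, j) compare-exchange steps over the values."""
    v = list(values)
    for i, j in network:
        if v[i] > v[j]:
            v[i], v[j] = v[j], v[i]
    return v


def fitness(network, width):
    """Number of 0/1 input vectors the network sorts correctly."""
    return sum(
        apply_network(network, bits) == sorted(bits)
        for bits in product((0, 1), repeat=width)
    )


# A well-known five-comparator network for four inputs.
four_sorter = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]
print(fitness(four_sorter, 4), "of", 2 ** 4)
```

A network is a complete sorter when its fitness equals 2**width.
Each binary test vector is independent of the others, which is the
kind of work that can be spread across reconfigurable hardware.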


Session 10: "Programmable Architectures with Special Features"

"High-Performance Carry Chains for FPGAs" - S. Hauck*, M.  Hosler
and  T.  Fry, Northwestern Univ. - DARPA funding - Describes some
new techniques to speed up the carry structure found  in  current
FPGAs. They have gotten up to a factor of 3.8 faster in the carry
chain.

"A Coarse-Grained  FPGA  Architecture  for  High-Performance  FIR
Filtering"  -  J. Anderson* and S. Sheth, Intel Corp, and K. Roy,
Purdue  Univ.  -  DARPA  funding  -  Describes  a  coarse-grained
architecture  specialized  for  high-performance  Finite  Impulse
Response filtering. Their cell is basically a 4-tap FIR  with  8-
bit  precision  that can also be grouped in twos or threes for 16
or 24-bit precision. Performance and area efficiency are close to
that  of  an  ASIC.  They  plan  to  look at DSP functions in the
future.
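
For reference, the function such a cell computes is
y[n] = h[0]x[n] + h[1]x[n-1] + h[2]x[n-2] + h[3]x[n-3].
The sketch below is a plain software model of a 4-tap FIR with
signed 8-bit samples and coefficients. It models the arithmetic
only, not the authors' cell; the range check and names are
assumptions.

```python
def fir4(samples, coeffs):
    """4-tap FIR filter over signed 8-bit samples and coefficients."""
    if len(coeffs) != 4:
        raise ValueError("exactly four coefficients are required")
    for value in list(samples) + list(coeffs):
        if not -128 <= value <= 127:
            raise ValueError("value does not fit in signed 8 bits")
    delay_line = [0, 0, 0, 0]  # x[n], x[n-1], x[n-2], x[n-3]
    output = []
    for sample in samples:
        delay_line = [sample] + delay_line[:3]
        output.append(sum(h * x for h, x in zip(coeffs, delay_line)))
    return output


# An impulse input reads the coefficients back out, one per sample.
print(fir4([1, 0, 0, 0, 0], [1, 2, 3, 4]))
```

The products of two 8-bit values need up to 16 bits, and the sum
of four of them a little more, so the accumulator in a hardware
cell has to be wider than the 8-bit inputs.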

"An LPGA with Foldable PLA-Style Logic Blocks" - J. Anderson* and
S.  Brown, Univ. of Toronto - Describes a laser programmable gate
array architecture where the granularity of the  logic  block  (a
CPLD-like AND/OR structure) can be varied. In their architecture,
the unused part of a large block can be separated from  the  used
part and additional logic implemented on it. They have a CAD tool
and have done theoretical and empirical work to estimate the area
and speed improvements possible.


Panel Session

This year's panel session addressed the topic "Impossible Demands
and  Constraints  from Hell; How to Tell What Makes a Good FPGA".
The panel was organized by Jonathan Rose (Univ. of  Toronto)  and
moderated  by  Sinan  Kaptanoglu (Actel). The panel addressed the
question of what makes a good FPGA from the  viewpoint  of:  "The
Customer",  Steve Taylor (Nortel); "The Academic", Jonathan Rose;
"The  Architect",  Clive   McCarthy   (Altera);   "The   Software
Developer",    Rob    Smith    (Actel);    and   "The   Marketing
Representative", Sandip Vij (Xilinx). Nortel expects to spend $70
million on FPGA/CPLD devices in 1998 to go in products for public
carrier nets, etc. Customers want predictable, constraint driven,
incremental  design  with  a close coupling between synthesis and
place and route (e.g., don't throw away the information
specified in the HDL). They see FPGA product obsolescence as a big
challenge. From the academic viewpoint, there are
many  possible measures of FPGA goodness (for a particular set of
benchmarks), e.g., area, delay, and pin count, depending  on  how
much work you want to do to take the measurement. Speed and area
together are the best measure, but measuring them takes a lot of
work. You want to pick
a  measure that fits your specific need so that you don't have to
do more work than is necessary. A good architectural approach  is
to  follow Van der Poel's (sp. ?) rule that says you want to have
zero, one or an "infinite" amount of each resource  so  that  the
users don't have to worry about what happens if they run out of a
resource. You should develop the architecture to be at the center
of the market's revenue. The architecture should always have many
more wires than gates, e.g., the human brain has 10**14  synapses
(wires) but only 10**10 neurons (gates). Software developers want
FPGAs that are software, layout  and  synthesis  friendly,  e.g.,
uniform   features   and  a  concise  meta-model  representation.
Marketing reps want all  the  developers  to  think  "a  customer
ahead"  and  "a  trend  ahead". They want the software cores done
well.


FPGA'99

The general chair for FPGA'99 is Sinan Kaptanoglu of  Actel.  The
conference  will  again  be  held  at  the  Doubletree  Hotel  in
Monterey.  There was talk that the conference date might be moved
to  March  in  1999  to avoid conflict with a European conference
that I understood drew away 30 normal FPGA attendees this year.
=======================================================================


dbouldin@utk.edu