MICROELECTRONIC SYSTEMS NEWS

FILENUMBER: 521
BEGIN_KEYWORDS ASSESSMENT FPL-95 END_KEYWORDS
DATE: April 1996
TITLE: An Assessment of FPL-95

(Contributed by Alan Hunsberger, National Security Agency)

The following is an excerpt from a trip report which represents Alan's personal views and not those of his employer. To obtain the full report, please send email to dbouldin@utk.edu

The Fifth International Workshop on Field-Programmable Logic and Applications (FPL-95) was hosted by the University of Oxford and was organized by the University's Department for Continuing Education in conjunction with the Department of Engineering Science at Oxford and the Department of Computing at the Imperial College of Science, Technology and Medicine. FPL'95 was held from 29 August to 1 September 1995 (Tuesday through Friday) at Jesus College (part of the University of Oxford) in Oxford, United Kingdom. Accommodations were in student rooms at the college and at local hotels, bed and breakfasts, etc. A conference banquet was held in the medieval dining hall of the college.

England was very hot and dry this summer, but during the week of the conference, the temperature was in the 60s and 70s Fahrenheit and we had a few days of rain. Jesus College is located near several museums and Oxford's central shopping district. During one lunch break, I got a chance to briefly visit the science and technology museum. The museum contains the largest collection of sundials in Europe. One pocket sundial was unusual in that it contained a miniature cannon that could be set up to fire a blank charge automatically at twelve noon, when the sun's rays would be focused through a glass lens onto the cannon's fuse.

The conference began on Tuesday with an all-day tutorial session and ended on Friday right after lunch. Paper sessions were roughly 90 minutes (three papers, once four) and were separated by a 90-minute lunch or a 50-minute tea break/poster session (usually four posters, once three).
Four paper sessions were held on Wednesday, three on Thursday, and two on Friday. Two poster sessions were held on each of Wednesday and Thursday, and one on Friday. The conference banquet was held on Thursday night in the college's dining hall.

The list of attendees contained 79 names; 36 people were from the U.K., 20 from the U.S., 5 each from Germany and Austria, and 1 or 2 from each of twelve other countries (including Australia, Japan and South Africa). There were 46 people from academia, 31 from industry, and 2 from government agencies. The largest academic contingents were from the University of Oxford (5), the University of Edinburgh (3) and the Technical University of Vienna (3). The largest contingents from industry were Hewlett-Packard, Pilkington and Xilinx with 4 each. There were 73 male attendees and 6 female.

There were 28 formal papers and 19 poster papers presented at the conference. All of these papers (except one which was handed out later) were included in the printed proceedings available at the start of the conference. 36 of the 47 papers were from academic institutions. Only a couple of vendors participated with small exhibits.

Some of the more interesting papers are noted below:

*"A Discussion of Trade-Offs Made in the Development of a New FPGA Architecture" - Anthony Stansfield and Ian Page - Describes what logic is needed in FPGAs to provide good synthesis, what logic was in FPGAs circa 1993, and what logic is in a new architecture designed to meet the need for more on-chip memory and a better combination of large and small gates. The chip is based on a content-addressable memory (CAM) cell that can be used both as a functional cell and a routing cell.
For more information on this and other FPGA projects at the University of Oxford (e.g., HARP1 and Ruby), see the web site http://www.comlab.ox.ac.uk/oucl/hwcomp.html

*"Some Notes on Power Management on FPGA-Based Systems" - Eduardo Boemo - Explores the usefulness of low-power design methods based on architectural and implementation modifications (e.g., pipelining, better partitioning and path delay equalization) and presents a methodology for FPGA power analysis. Higher power usage causes (among other concerns) lower speeds, since delays in CMOS devices increase by 0.3% for each 1 degree C increase in temperature.

*"The Proper Use of Hierarchy in HDL-Based High Density FPGA Design" - Carol Fields - Greater system complexity, coupled with time-to-market pressures (a short design cycle), is causing a move to top-down design methodologies (i.e., where the designer works with high-level logical or functional abstractions rather than gates) and logic synthesis tools. Hierarchical design structure is critical in getting high utilization in FPGA devices and provides other benefits (e.g., easier debugging, mixed mode design entry, work distribution among designers). With the current state of the art in design tools, the most effective combination of FPGA device utilization and performance is achieved when large designs are partitioned into modules of 150 to 250 CLBs (for Xilinx FPGAs) or 3,000 to 5,000 gates, and floor planning techniques are used to place these modules on the FPGA. The paper gives examples using Synopsys FPGA Compiler and Xilinx Floorplanner.

*"Compiling RUBY onto FPGAs" - Shaori Guo - Describes a prototype hardware compiler which compiles a design written in the Ruby language to a netlist that can be mapped to an FPGA by vendor software. Ruby is a relational language for capturing block diagrams parametrically. One focus is on constructing provably correct designs.
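
The temperature figure quoted in the power-management paper above lends itself to a quick back-of-the-envelope calculation. The following Python sketch is not from the paper; it assumes the 0.3% per degree C increase applies linearly to a path delay, and the function names and sample values (20 ns, 50 MHz, a 30 degree C rise) are made up for illustration.

```python
def derated_delay_ns(delay_ns, temp_rise_c, pct_per_deg_c=0.3):
    """Scale a path delay for a temperature rise.

    Assumption: the delay grows linearly, by pct_per_deg_c percent
    of the original delay for each degree C of temperature rise.
    """
    return delay_ns * (1.0 + pct_per_deg_c / 100.0 * temp_rise_c)


def derated_fmax_mhz(fmax_mhz, temp_rise_c, pct_per_deg_c=0.3):
    """Maximum clock rate is inversely proportional to the critical-path delay."""
    return fmax_mhz / (1.0 + pct_per_deg_c / 100.0 * temp_rise_c)


if __name__ == "__main__":
    # Hypothetical design: 20 ns critical path (50 MHz) running 30 degrees C hotter.
    print(f"delay: {derated_delay_ns(20.0, 30.0):.2f} ns")
    print(f"fmax:  {derated_fmax_mhz(50.0, 30.0):.2f} MHz")
```

Under these assumptions a 30 degree C rise adds 9% to the delay, so the 20 ns path slows to about 21.8 ns and the 50 MHz clock limit drops to roughly 45.9 MHz.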

*"Use of Reconfigurability in Variable-Length Code Detection at Video Rates" - Gordon Brebner - Considers the implementation of a fax compression scheme decoder in a Xilinx XC6215. Makes use of the partial reconfigurability of the XC6215 to provide logic paging of tree search structures; the idea is that a couple of tree structures stay resident in the FPGA to detect the most common codewords while other tree structures are swapped in as needed. Makes good use of the capability of the XC6200 family to support registers that can be read/written from a processor controlling the FPGA. Testing of the design is awaiting the availability of the XC6215 chip.

*"Classification and Performance of Reconfigurable Architectures" - Steven Guccione - Over 40 computation systems based on reconfigurable logic have been constructed to date. A list of these is available at the web page: http://www.utexas.edu/~guccione/HW_list.html

Traditional machine architectures require a trade-off between performance and flexibility: an instruction set architecture (e.g., a microprocessor chip) provides flexibility but low performance, whereas a hardware architecture (e.g., a VLSI chip) provides high performance but is inflexible. Machines based on reconfigurable logic can provide both flexibility and performance. {Note that an instruction set architecture can be viewed as containing a reconfigurable processing unit (RPU). The arithmetic logic unit is an RPU with a dedicated port for reconfiguration data (the opcode), with only a few possible configurations (determined by the number of bits in the opcode), and with rapid and frequent reconfiguration.}

The paper presents a classification system for RPU machines based on the RPU size (small or large) and the presence or absence of dedicated local memory with the RPU. All of the RPU machines contain a host processor along with the RPU.
The four categories are:

  Reconfigurable Supercomputer (RS) - large RPU with (large) local memory (and a high bandwidth link to the host processor); e.g., Splash, Teramac, and Virtual Computer.
  Application Specific Architecture (ASA) - large RPU with no local memory; e.g., GANGLION and logical simulators for rapid prototyping.
  Reconfigurable Logic Coprocessor (RLC) - small RPU with (small) local memory; e.g., BORG and Xputer or an accelerator board processor.
  Custom Instruction Set Architecture (CISA) - small RPU with no local memory; e.g., PRISM and Spyder or a setup where the host views the RPU as an "instruction" that it can execute.

*"Automatic Synthesis of Parallel Programs Targeted to Dynamically Reconfigurable Logic Arrays" - Maya Gokhale - Describes the development of a high-level programming language to use instead of an HDL, especially to achieve temporal partitioning on partially reconfigurable FPGAs. Temporal partitioning enables the running of larger problem sets and allows a separate debug subroutine to be run only when needed. The paper discusses a data-parallel bit C (dbC) compiler for partitioning a parallel program. dbC is a procedural language using a computational model of a SIMD processor array, where a host processor issues a single instruction stream to an array of execution units (EUs) which perform each instruction on local instances of data in lock step (the lock step can be modified based on, e.g., local conditions in the data). Each EU is a virtual processor. The dbC compiler generates an intermediate format that is further compiled by subroutine into a VHDL source module that can be run through standard CAD tools to produce an FPGA variables have registers that are placed in fixed locations. The current FPGA target is National's Adaptive Processor Architecture (NAPA), which contains up to four ring-connected Adaptive Processing Units, each of which contains a multichip module with four SRAMs and four National Semiconductor CLAy FPGAs.
A compilation example is given using the genome match problem. Further testing is awaiting a NAPA board.

*"Implementation Approaches for Run Time Reconfiguration" - Brad Hutchings - A good overview of compile-time reconfiguration (CTR) and run-time reconfiguration (RTR) and the design issues involved. With CTR, the FPGA system is configured once for a particular application, which runs until completion. Designing for CTR is very similar to designing for an ASIC implementation, and current CAD tools work quite well. With RTR, multiple cooperating configurations of the FPGA system are loaded as needed while the application runs. Designing for RTR requires dividing the application into time-exclusive segments that do not run concurrently (a.k.a. temporal partitioning) and coordinating the transfer of data between configuration segments (i.e., providing for inter-configuration communications). Current CAD tools don't do either of these tasks, but research on tool development is underway (e.g., see Maya Gokhale's paper above).

Two different RTR strategies are possible: global and local. With global RTR, all of the FPGA chips are reconfigured at each time-segment, and the system goes through a configure-run-configure-run sequence. The design task with global RTR is to partition the application into roughly equally-sized pieces (for efficient use of the FPGAs); this requires an iterative process because you don't really know the physical size of the partitions until placement and routing have been done. Inter-configuration communication can be accomplished by storing intermediate results in a fixed system memory or register resource. Conventional CAD tools can be used once the partitioning has been done.

With local RTR, only a piece of the system (e.g., pieces of some FPGA chips and the whole of other FPGA chips) is configured at various times as the rest of the system continues to operate.
The design task with local RTR is to functionally partition the application into various fine-grained operations that are not necessarily temporally exclusive. Local RTR design is highly complex; current CAD tools are a poor match. Both forms of RTR are still in the early stages of investigation.

*"Using Reconfigurable Hardware to Speed Up Product Development and Performance" - Describes the HARP1 circuit board, which was designed to demonstrate the use of mathematical methods to help overcome problems in the design of correct circuits. The board contains a transputer, a commercial FPGA, RAM, etc. For more information, see: Oxford web site.

A separate fee was charged for the tutorial session on "Field Programmable Devices (Technologies, Applications and Tools)" conducted by Stephen Brown of the University of Toronto. The tutorial was attended by nine people including myself. Dr. Brown gave a full overview of programmable devices, from simple and complex PLDs to the wide range of FPGAs now available. He also gave an overview of the CAD design process and described the operation of a typical CAD tool at each step. The tutorial included a nice text written by Dr. Brown and published by Stan Baker Associates.

FPL'96 will be held on 23-25 September 1996 in Darmstadt, Germany and will be chaired by Manfred Glesner of the Darmstadt University of Technology. For more information, you can check the WWW or email the FPL'96 secretary at fpl96@microelectronic.e-technik.th-darmstadt.de

dbouldin@utk.edu