Computer Aided Design of VLSI Systems II



ECE 652 PROJECT REPORT

Team - Advanced Encryption Standard (AES) IP Block



Rishi Raj Srivastava, Nitin Tiwari

April, 2002

Electrical & Computer Engineering
University of Tennessee
Knoxville, TN 37996


Contents

  1. Abstract
  2. Introduction
  3. AES IP Core
  4. Task Requirements
  5. Implementation Steps and Methodology
  6. Top & State Machine implemented in the TOP design
  7. DesignWare Memory
  8. Tri-State Buffer and Test bench
  9. Results
    1. StandAlone Core Implementation
    2. AES core with DW RAM
  10. Summary and Conclusion
  11. Reference and Acknowledgement



1.0 Abstract

This project is part of a System on Chip (SoC) design being implemented as one large team class project. Our part involved testing the Advanced Encryption Standard (AES) Cipher and Inverse Cipher core obtained from www.opencores.org, targeting the core at the Xilinx Virtex 1000e FPGA and the TSMC 0.18 µm six-metal ASIC process. Once that was done, we added Synopsys DesignWare RAM for reading the Key and Data for the cipher and for writing out the ciphered text. The whole RAM-Core-RAM system was then simulated, synthesized, and placed and routed, targeting the FPGA and ASIC mentioned above. The IP block with the RAM was then enclosed in a wrapper to connect it to the AMBA bus, and then had to be placed in the SoC.


2.0 Introduction to Advanced Encryption Standard (AES)

AES is a Federal Information Processing Standard (FIPS 197). AES is implemented using the Rijndael (pronounced roughly "Rhine-dahl") algorithm, designed by Joan Daemen and Vincent Rijmen of Belgium. Rijndael is a block cipher that accepts the key and the input text in variable block lengths: it supports 128-, 192- or 256-bit keys to cipher data with block lengths of 128, 192 or 256 bits, with all nine combinations possible (the AES standard itself fixes the block length at 128 bits). Rijndael's combination of security, performance, efficiency, ease of implementation and flexibility made it the best choice for AES. It has a Round Permutation Module that is looped 10, 12 or 14 times, depending on the key and block lengths.
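The relation between key length, block length and the number of rounds can be written down compactly. The sketch below uses the rule from the Rijndael specification, rounds = max(Nk, Nb) + 6, where Nk and Nb are the key and block lengths in 32-bit words. The function name rijndael_rounds is only illustrative and is not part of the core.

```python
def rijndael_rounds(key_bits, block_bits):
    """Return the number of Rijndael rounds for a key/block length pair."""
    allowed = (128, 192, 256)
    if key_bits not in allowed or block_bits not in allowed:
        raise ValueError("key and block lengths must be 128, 192 or 256 bits")
    nk = key_bits // 32     # key length in 32-bit words
    nb = block_bits // 32   # block length in 32-bit words
    return max(nk, nb) + 6


# Print the round count for all nine key/block combinations.
for key_bits in (128, 192, 256):
    for block_bits in (128, 192, 256):
        rounds = rijndael_rounds(key_bits, block_bits)
        print(f"key={key_bits:3d} block={block_bits:3d} rounds={rounds}")
```

With the 128-bit block used by this project, the three key lengths give 10, 12 and 14 rounds respectively.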

3.0 AES Cipher


The AES cipher core consists of a key expansion module, an initial permutation module, a round permutation module and a final permutation module. The round permutation module loops internally to perform 10 iterations (for 128-bit keys).
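To make the structure concrete, here is a minimal behavioral sketch of AES-128 encryption in Python, following FIPS 197. It is not the core's HDL, and the mapping of its steps onto the core's modules is an assumption: the initial step is taken to be the first round-key addition, and the sketch counts the final round (which omits MixColumns) separately, giving nine loop iterations plus one final round for 10 rounds in total. All function names are illustrative.

```python
def xtime(a):
    """Multiply by 2 in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    a <<= 1
    return (a ^ 0x11B) & 0xFF if a & 0x100 else a

def gmul(a, b):
    """Multiply two bytes in GF(2^8)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result

def build_sbox():
    """Build the S-box: multiplicative inverse followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):          # x^254 is the inverse of x
                inv = gmul(inv, x)
        s = inv
        for shift in range(1, 5):         # XOR with four left rotations
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox

SBOX = build_sbox()

def expand_key(key):
    """Key expansion: 16-byte key -> 11 round keys of 16 bytes each."""
    words = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    rcon = 1
    for i in range(4, 44):
        temp = list(words[i - 1])
        if i % 4 == 0:
            temp = [SBOX[b] for b in temp[1:] + temp[:1]]  # RotWord, SubWord
            temp[0] ^= rcon
            rcon = xtime(rcon)
        words.append([a ^ b for a, b in zip(words[i - 4], temp)])
    return [sum(words[4 * r:4 * r + 4], []) for r in range(11)]

def add_round_key(state, round_key):
    return [s ^ k for s, k in zip(state, round_key)]

def sub_bytes(state):
    return [SBOX[b] for b in state]

def shift_rows(state):
    # state[4*c + r] holds row r, column c; row r rotates left by r columns.
    return [state[4 * ((c + r) % 4) + r] for c in range(4) for r in range(4)]

def mix_columns(state):
    out = []
    for c in range(4):
        col = state[4 * c:4 * c + 4]
        for r in range(4):
            out.append(gmul(col[r], 2) ^ gmul(col[(r + 1) % 4], 3)
                       ^ col[(r + 2) % 4] ^ col[(r + 3) % 4])
    return out

def encrypt_block(plaintext, key):
    """Encrypt one 16-byte block with a 16-byte key."""
    round_keys = expand_key(key)
    state = add_round_key(list(plaintext), round_keys[0])      # initial step
    for r in range(1, 10):                                     # full rounds
        state = sub_bytes(state)
        state = shift_rows(state)
        state = mix_columns(state)
        state = add_round_key(state, round_keys[r])
    state = add_round_key(shift_rows(sub_bytes(state)), round_keys[10])  # final
    return bytes(state)
```

A sketch like this can be checked against the example vectors in FIPS 197 Appendix C, in the same way the core's test bench cross-checks its output against known ciphered data.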

4.0 Task Requirements

Our two-member sub-team was required to complete these tasks:
1. Simulate the AES Cipher core before synthesis.
2. Synthesize the core targeting the Xilinx Virtex 1000e FPGA and the ASIC TSMC 0.18 technology using FPGA Compiler and Design Compiler.
3. Place and Route the synthesized design using XVMake (Xilinx Virtex FPGA) and Silicon Ensemble (ASIC) to get the SDF files for the design.
4. Perform post-layout back-annotated simulation using the SDF file for both technologies.
5. Add DesignWare RAM to the front and back of the design to read the Key and Data required by the AES Cipher and write the ciphered text back into the RAM.
6. Perform pre-synthesis simulation on the RAM-IPCore-RAM system.
7. Synthesize this system as in Step 2.
8. Place and Route this system as in Step 3 and get the SDF files for both technologies.
9. Perform post-layout back-annotated simulation using the SDF file for both technologies.
10. Add the AMBA bus wrapper around the TOP design.
11. Connect the RAM-IPCore-RAM system to the SoC design via the AMBA bus.
12. Check it for correct functioning.


Figure 1 shows the flow of design and verification tasks described above.


Figure 1. Process Flow

5.0 Implementation Steps and Methodology

1. The AES core was obtained from www.opencores.org. The core consists of two parts: the AES Cipher top and the AES Inverse Cipher top. It comes with a Verilog test bench, which supplies the Key, Plain Text and expected Ciphered data (to cross-check simulation results) in blocks of 128 bits.

2. The first step was a pre-synthesis simulation of the Cipher-Inverse Cipher core using the test bench provided with it.

3. The core was then synthesized targeting the ASIC TSMC 0.18 technology with Design Compiler, and Place and Route (P&R) was performed on the synthesized netlist using Silicon Ensemble. The SDF (Standard Delay Format) file was obtained after P&R. This file, along with the synthesized netlist, was used to perform back-annotated (post-layout) simulation on the standalone core. In addition, gate-level simulation was done using the synthesized netlist without the SDF.

4. The standalone core was also synthesized targeting the Virtex 1000e FPGA part using the FPGA Compiler fc2. Here we faced a resource constraint on the input/output pins. We overcame this limitation by writing a Top over the AES Cipher top that takes in data (Key and Plain Text) 32 bits at a time and collects it to form 128-bit Key and Plain Text blocks. The Top has a synchronous state machine that fetches data in blocks of 32 bits and appends these blocks to form a 128-bit block of data. The Top also proved helpful when RAM was attached to the Cipher core, as the RAM has a word width of only 32 bits.
Once synthesis was completed, the netlist was used for Place and Route with Xilinx tools such as XVmake and ngdbuild. A post-layout simulation was also performed.

5. After the standalone tests were done, DesignWare RAMs were added around the Top made for 32-bit input. There are three RAMs: one for storing the Key, one for the input Plain Text and a third for storing the output Ciphered text. Each RAM stores one 128-bit block of Key/Plain Text in 4 locations of 32 bits.

6. To test this setup (RAM-Core-RAM), the test bench had to write to the RAM, and the Top then had to read 32-bit data from the RAM and build up the 128-bit block. This first required modifying the test bench to feed in 32 bits of data at a time instead of 128; a mechanism was then required to pass control of the bus from the test bench to the main Cipher core. The mechanism used was a set of tri-state buffers (described below).

7. Once the test bench, tri-state buffers and the DW RAM were all integrated, pre-synthesis simulation was carried out.

8. Synthesis was performed targeting the ASIC TSMC 0.18 technology. Due to its large size, the RAM-Core-RAM design could not be targeted at the Virtex 1000e FPGA. Place and Route was performed on the netlist using SE Ultra. The SDF file that was generated was used for back-annotated simulation.
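The storage layout from steps 4 and 5, one 128-bit block held in four 32-bit RAM locations, can be illustrated with a short Python sketch. The word order is an assumption: here the most significant 32 bits go to the lowest address. The report does not state which order the Top actually uses, and the function names are illustrative.

```python
WORD_MASK = 0xFFFFFFFF  # one 32-bit RAM word


def split_block(block):
    """Split a 128-bit integer into four 32-bit words, most significant first."""
    return [(block >> (96 - 32 * i)) & WORD_MASK for i in range(4)]


def join_words(words):
    """Rebuild a 128-bit integer from four 32-bit words, most significant first."""
    block = 0
    for word in words:
        block = (block << 32) | (word & WORD_MASK)
    return block


key = 0x000102030405060708090A0B0C0D0E0F
ram = split_block(key)                 # what the test bench would write
print([f"{w:08x}" for w in ram])
print(f"{join_words(ram):032x}")       # what the Top would reassemble
```

Splitting and rejoining are exact inverses, so the cipher sees the same 128-bit Key and Plain Text that the test bench wrote, provided both sides agree on the word order.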



6.0 Top & The State Machine implemented in the TOP design

The AES Top that we wrote in VHDL is important in several ways:

  • First, the Top module acts as a wrapper to integrate the AES Cipher with the DesignWare memory. Both the AES cipher and the DW RAMs are components in this module.
  • Second, the test bench sits on top of this Top and is used to write sample Key and Plain Text data into the RAM using the data and address buses of the DW RAM, and then hand control of these buses over to the Top, which reads from the RAMs and sends the data for ciphering.
  • Third, the Top has tri-state buffers to switch access to the RAM between the test bench and the Cipher.
  • Fourth, the Top has buffering functions to form 128-bit Key and Plain Text blocks. The DW RAMs have a word length of 32 bits (max), whereas the cipher takes in 128-bit wide Key and Text inputs. The Top therefore has a state machine that pulls 32 bits of data (Key and Text simultaneously) from 4 consecutive memory locations on the DW RAMs and combines them into one 128-bit Key and one 128-bit Text. This is described in more detail later.
  • Lastly, it acts as a link between the AES cipher and inverse cipher cores, which were obtained as separate Verilog modules with hierarchical Verilog components under them. This helps in simulating the whole AES core together with the input and output RAMs.

    Figure 2 shows the AES Top State Machine. This state machine is synchronous and thus requires at least one clock tick to change to the next state. Additionally, its synchronous nature helps in verifying that data has been successfully read from and/or written to the RAM.

    The State Machine
    As explained above, the state machine in the Top is used to group 32 bits of data from each of 4 consecutive memory locations on the RAMs to get the 128-bit Key and Text inputs.
    To keep the state machine working smoothly we use two state variables, State1 and State2. State1 determines whether the Top is polling for reset (to initialize all signals) or for the load signal from the test bench (to start loading Key and Text) (State1 = 0), reading data from the DW RAMs (State1 = 1), or writing back to the memory once a ciphered output has been received. State2 changes within each State1 state to perform a variety of functions, such as waiting for one clock cycle, assigning an address to the address bus, and reading data from or writing data to the data bus of the DW RAMs.
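The two-level state machine described above can be modeled behaviorally in Python. This is a sketch under stated assumptions, not the VHDL we wrote: the State1 value 2 for write-back, the three State2 sub-steps and their encoding, the most-significant-word-first order, and all names (TopFsm, run_block) are hypothetical. The cipher is passed in as a function and is assumed to finish in a single step, which abstracts away the core's own latency.

```python
class TopFsm:
    # State1 values. The report gives 0 and 1; using 2 for write-back is an assumption.
    IDLE, READ, WRITE = 0, 1, 2
    # Hypothetical State2 sub-steps within each State1 state.
    SET_ADDR, WAIT, TRANSFER = 0, 1, 2

    def __init__(self, key_ram, text_ram, out_ram, cipher):
        self.key_ram, self.text_ram, self.out_ram = key_ram, text_ram, out_ram
        self.cipher = cipher
        self.reset()

    def reset(self):
        self.state1, self.state2 = self.IDLE, self.SET_ADDR
        self.word = self.addr = 0
        self.key = self.text = self.result = 0
        self.done = False

    def tick(self, load=False, reset=False):
        """Advance the model by one clock edge."""
        if reset:
            self.reset()
        elif self.state1 == self.IDLE:
            if load:                      # load signal from the test bench
                self.reset()
                self.state1 = self.READ
        elif self.state2 == self.SET_ADDR:
            self.addr = self.word         # drive the RAM address bus
            self.state2 = self.WAIT
        elif self.state2 == self.WAIT:    # give the RAM one clock cycle
            self.state2 = self.TRANSFER
        else:
            self._transfer()

    def _transfer(self):
        if self.state1 == self.READ:      # Key and Text are read simultaneously
            self.key = (self.key << 32) | self.key_ram[self.addr]
            self.text = (self.text << 32) | self.text_ram[self.addr]
        else:                             # write one 32-bit slice of the result
            shift = 96 - 32 * self.addr
            self.out_ram[self.addr] = (self.result >> shift) & 0xFFFFFFFF
        self.word += 1
        self.state2 = self.SET_ADDR
        if self.word == 4:                # all four locations handled
            self.word = 0
            if self.state1 == self.READ:
                self.result = self.cipher(self.key, self.text)
                self.state1 = self.WRITE
            else:
                self.state1, self.done = self.IDLE, True


def run_block(key_words, text_words, cipher):
    """Clock the model until one block has been ciphered and written back."""
    out_ram = [0] * 4
    fsm = TopFsm(key_words, text_words, out_ram, cipher)
    fsm.tick(load=True)
    ticks = 1
    while not fsm.done:
        fsm.tick()
        ticks += 1
    return out_ram, ticks
```

For a quick check, a stand-in such as `lambda k, t: k ^ t` can be passed as the cipher; the model then needs three ticks per RAM word in both the read and write phases.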


    7.0 DesignWare Memory

    We are using the Synchronous Single-Port, Read/Write RAM (Flip-Flop-Based) DW_ram_rw_s_dff from Synopsys. The RAM-Core-RAM implementation for the AES Cipher uses 3 instances of this RAM. Two of the RAMs are used as input RAMs for storing the Key and the Plain Text, while the third RAM is used to store the output ciphered data. Since the RAMs are synchronous, data is read on the rising edge of the clock.

    Note: Since the RAM group has included the documentation for this RAM in their report, we are not including it in ours.



    8.0 Tri-State Buffers and Testbench

    Figure 3 shows the test bench block interaction, in which tri-state buffers are placed on the address, data (Key and Plain Text), and read/write lines controlling the memory. During the first stage, the tri-state buffers give control to the test bench. This allows the test bench to read in the Key and the Plain Text to be ciphered and write them into the DesignWare RAM. Once the Key and Text have been written into the DW RAM, control is switched to the AES Cipher side so that it can read the Key and the Plain Text and cipher them into encrypted text.

    The tri-state buffers are necessary in order to drive the DesignWare RAM from two separate sources. They also prevent the cipher block from accidentally reading from the memory while data is being fed into the RAMs.


    Figure 3: Tri-State Buffers and TestBench interactions
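The bus hand-over can be sketched in Python by modeling high impedance as an undriven value. This is a simplified illustration, not the VHDL in our Top; the names Z, tristate, resolve and ram_address, and the single tb_owns_bus control signal, are assumptions made for the example.

```python
Z = None  # models a high-impedance (undriven) output


def tristate(value, enable):
    """A tri-state buffer: pass the value when enabled, otherwise float."""
    return value if enable else Z


def resolve(*drivers):
    """Resolve a shared bus line; more than one active driver is contention."""
    active = [d for d in drivers if d is not Z]
    if len(active) > 1:
        raise RuntimeError("bus contention: more than one driver enabled")
    return active[0] if active else Z


def ram_address(tb_addr, top_addr, tb_owns_bus):
    """Address seen by the RAM; exactly one side drives it at any time."""
    return resolve(tristate(tb_addr, tb_owns_bus),
                   tristate(top_addr, not tb_owns_bus))


print(ram_address(3, 0, tb_owns_bus=True))    # test bench is loading the RAM
print(ram_address(3, 0, tb_owns_bus=False))   # Top is reading for the cipher
```

Driving both enables from one control signal and its complement means the two sources can never drive the RAM at the same time, which is the property the tri-state buffers are there to guarantee.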

    The test bench is written in Verilog and was provided to us along with the standalone core. We used it to test the AES Cipher standalone module. We had to modify this test bench to meet the 32-bit input requirement: for the standalone core, 128-bit data was read in at one time, but for the RAM-Core-RAM arrangement the same data was read in as 32-bit data in four read cycles. The test bench also supplies the clock and reset signals to the whole system.


    9.0 Results


    9.1 Standalone AES Core Implementation

    StandAlone PreSynthesis Simulations: performed on the core using the test bench originally supplied with the core
    Pre-Synthesis Simulation (Waveform; Modelsim): simulation of the cipher and inverse cipher together
    Pre-Synthesis Simulation (Waveform; Modelsim): the text and key being loaded
    Pre-Synthesis Simulation (Waveform; Modelsim): the ciphered text being output

    StandAlone AES Simulations and layout

    Xilinx Virtex 1000e
    PostLayout simulation of the AES cipher core only
    Layout of the AES Cipher core
    Layout of the AES Inverse core

    ASIC TSMC 0.18
    Cipher Core
    PostSynthesis simulation data being input into the Cipher module
    PostSynthesis simulation ciphered data coming out with text out same as the given ciphered text

    PostLayout simulation Data in
    PostLayout simulation data out
    PostLayout simulation showing delay due to SDF file
    Layout Silicon Ensemble

    Inverse Cipher
    PostSynthesis Simulation Data Load
    PostSynthesis Simulation Encrypted data out
    PostSynthesis Simulation Decrypted data out

    9.2 AES Cipher Simulations with Designware RAM and Top

    PreSynthesis Simulation carried out after integrating the AES Cipher Top with DW Memory and TOP
    PreSynthesis Simulation with data being fed into the DesignWare memory
    PreSynthesis Simulation with data being input in four steps through TOP
    PreSynthesis Simulation Encryption
    PreSynthesis Simulation with data being output in four steps by breaking 128 bit data into 32 bit pieces


    Xilinx Virtex 1000e: design too large to be synthesized

    ASIC TSMC 0.18
    PostSynthesis Simulation carried out by synthesizing the core and using the verilog netlist to simulate in Modelsim
    PostSynthesis Simulation
    PostSynthesis Simulation
    PostSynthesis Simulation



    10.0 Summary and Conclusions

    The project involved taking an AES core, testing it standalone, and then integrating it with DesignWare RAM and testing it again, targeting an FPGA (Virtex 1000e) and an ASIC (TSMC 0.18) in both cases. The standalone design passed easily through the flow for both technologies. The RAM-Core-RAM system would not go through synthesis and P&R for the FPGA due to its large size. For the ASIC, the RAM-CipherCore-RAM system went through synthesis and we obtained simulations, but when the whole system was placed and routed in Silicon Ensemble and a post-layout simulation was run using the SDF file, there were problems with the simulation. Despite repeated attempts, we have not been able to get a fully correct output.



    11.0 References and Acknowledgement

    References
  • AES Proposal : Rijndael – Daemen & Rijmen
  • AES home : http://csrc.nist.gov/CryptoToolkit/aes/rijndael/ . This web-page has links to developers home pages and FAQs and related documentation on AES
  • DesignWare RAM documentation
  • AES core downloaded from www.opencores.org
  • AES Core documentation (available with the core at opencores.org)

    We would like to thank Dr. Don Bouldin for his guidance, as well as the students who worked with us on this project. We would also like to acknowledge the help we received from Dr. Chandra Tan.