Tuesday, October 30, 2018

UART Communication Link Implementation with Verilog HDL on FPGA

This post is about an HDL implementation of a UART (Universal Asynchronous Receiver-Transmitter), built for one of our fourth-semester university projects. It was a group project with four members; my group members were Chirath Diyagama, Isuru Nuwanthilaka and Dileepa Sandaruwan. We were asked to implement a UART link on an FPGA development board using Verilog as the HDL, and to send some data to another FPGA development board that also has a UART implementation. We used a Digilent Atlys FPGA development board with a Xilinx FPGA. That board is expensive to buy, so we borrowed one from our laboratory for this project. You can implement the same design on another board without any problem.




Here we implement bi-directional asynchronous communication between two FPGA boards. The Atlys board has a number of on-board switches, and we used each of them to represent a binary value. Using 8 switches we sent 8-bit data words to the other board. Eight LEDs are assigned to the output of the receiver, and they display the received data. After the communication between two boards, we will talk about communication between an FPGA board and a computer through a USB connection. The only thing you need for that is a USB-to-UART bridge for your board. There are many cheaper boards with Xilinx and Altera chips that you can find on online sites.

UART data frame

The structure of a UART frame is illustrated in the figure above. The data field normally varies from 5 to 9 bits, and a parity bit can optionally be added.
A UART data frame can have several fields. Here we are using only these fields:
       Start bit
       Data bits
       Stop bit(s)
For our project we are using one start bit, eight data bits and two stop bits.
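
To make the frame layout concrete, here is a minimal Python sketch that lists the line levels of one frame in the order they are sent. It assumes the usual UART conventions (the line idles high, the start bit is low, the stop bits are high, and the data is sent least significant bit first). The function name build_uart_frame is ours for illustration and is not part of the project code.

```python
def build_uart_frame(byte, stop_bits=2):
    """Return the line levels of one frame, in transmission order."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must fit in 8 bits")
    start = [0]                                  # start bit pulls the idle-high line low
    data = [(byte >> i) & 1 for i in range(8)]   # eight data bits, LSB first
    stop = [1] * stop_bits                       # stop bits return the line to idle (high)
    return start + data + stop


frame = build_uart_frame(0x55)
print(frame)        # 1 start + 8 data + 2 stop levels
print(len(frame))   # number of bit times one frame occupies
```

With the project's settings every byte therefore occupies 11 bit times on the line.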

Baud Rate Generator

Every microprocessor and microcontroller requires a clock signal because it contains sequential circuits. In our case too, the FPGA development board has an on-board clock generator that provides a 100 MHz clock signal.

The UART communication process also requires a timing signal in order to generate and send each bit, but we cannot use the 100 MHz clock directly for this. There are standard serial baud rates that both the transmitter and the receiver must agree on, and we decided to use 115200 bits per second. We therefore need a periodic pulse at that rate in order to maintain successful data communication.
As a solution, we designed a baud tick generator module in Verilog HDL; its schematic is shown in the figure above. Its inputs are the 100 MHz clock signal and an enable signal, and its output is the required baud tick signal.
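
The arithmetic behind this can be sketched in a few lines of Python. The script only uses the two numbers given above (100 MHz and 115200 baud); the plain integer divider it computes is an assumption made for illustration, and the module in this post uses an accumulator instead.

```python
CLK_HZ = 100_000_000   # board clock from the post
BAUD = 115_200         # chosen baud rate from the post

bit_time_us = 1e6 / BAUD            # duration of one bit in microseconds
clocks_per_bit = CLK_HZ / BAUD      # 100 MHz cycles that fit into one bit
divider = round(clocks_per_bit)     # nearest whole number of cycles
actual_baud = CLK_HZ / divider      # rate a plain integer divider would give
error_percent = 100 * (actual_baud - BAUD) / BAUD

print(f"bit time       : {bit_time_us:.2f} us")
print(f"clocks per bit : {clocks_per_bit:.3f}")
print(f"integer divider: {divider}")
print(f"rate error     : {error_percent:.4f} %")
```

One bit lasts several hundred clock cycles, which is why the 100 MHz clock has to be scaled down before it can time the bits.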

Verilog Code for the Baud Tick Generator


module BaudTickGen(
                input clk, enable,
                output tick  // generate a tick at the specified baud rate * oversampling
);

parameter ClkFrequency = 100000000;  // 100MHz
parameter Baud = 9600;               // default; the transmitter and receiver override it with 115200
parameter Oversampling = 1;

// number of bits needed to represent v
function     integer log2(input integer v);
                begin
                                log2 = 0;
                                while(v>>log2)
                                                log2 = log2+1;
                end
endfunction

localparam   AccWidth = log2(ClkFrequency/Baud)+8;  // +/- 2% max timing error over a byte
reg          [AccWidth:0] Acc = 0;
localparam   ShiftLimiter = log2(Baud*Oversampling >> (31-AccWidth));  // this makes sure Inc calculation doesn't overflow
localparam   Inc = ((Baud*Oversampling << (AccWidth-ShiftLimiter))+(ClkFrequency>>(ShiftLimiter+1)))/(ClkFrequency>>ShiftLimiter);

// phase accumulator: the top bit of Acc is the carry of the addition, and it is the tick
always @(posedge clk)
                if(enable) Acc <= Acc[AccWidth-1:0] + Inc[AccWidth:0];
                else Acc <= Inc[AccWidth:0];

assign tick = Acc[AccWidth];
endmodule

This module takes two inputs: the input clock signal (100 MHz) and an enable input that turns the tick generation on or off. It has one output, the baud tick signal.


parameter ClkFrequency = 100000000;                 //100MHz
parameter Baud = 9600;

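Here we define the input clock frequency in hertz and the desired baud rate in ticks per second. The value 9600 is only the module's default; the transmitter and receiver override it with 115200 when they instantiate the module.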
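
To see what the accumulator in BaudTickGen does with these parameters, the following Python model mirrors the localparam calculations and the per-clock update. It is only a sketch of our reading of the Verilog: the Python names are ours, and Python integers do not overflow at 32 bits the way Verilog integers can.

```python
def log2_bits(v):
    """Mirror of the Verilog log2 function: number of bits needed to hold v."""
    n = 0
    while v >> n:
        n += 1
    return n


def baud_tick_params(clk_hz, baud, oversampling=1):
    """Return (AccWidth, Inc) computed the same way as the localparams."""
    acc_width = log2_bits(clk_hz // baud) + 8
    shift_limiter = log2_bits((baud * oversampling) >> (31 - acc_width))
    inc = (((baud * oversampling) << (acc_width - shift_limiter))
           + (clk_hz >> (shift_limiter + 1))) // (clk_hz >> shift_limiter)
    return acc_width, inc


def count_ticks(clk_hz, baud, cycles, oversampling=1):
    """Run the accumulator for a number of clock cycles and count the ticks."""
    acc_width, inc = baud_tick_params(clk_hz, baud, oversampling)
    mask = (1 << acc_width) - 1
    acc = 0
    ticks = 0
    for _ in range(cycles):
        acc = (acc & mask) + inc    # Acc <= Acc[AccWidth-1:0] + Inc
        ticks += acc >> acc_width   # tick = Acc[AccWidth], the carry bit
    return ticks


acc_width, inc = baud_tick_params(100_000_000, 115_200)
print("AccWidth:", acc_width, "Inc:", inc)
print("average tick rate:", inc * 100_000_000 / 2 ** acc_width, "per second")
print("ticks in 100000 clock cycles:", count_ticks(100_000_000, 115_200, 100_000))
```

The accumulator adds Inc on every clock and emits a tick whenever the sum overflows AccWidth bits, so the average tick rate is Inc * ClkFrequency / 2^AccWidth. This lets the generator approximate baud rates that do not divide the clock frequency evenly.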

Integration of Baud Generator

We integrate this baud tick generator module into both the UART receiver and the UART transmitter modules. Both take the FPGA board's 100 MHz clock as their input clock, and inside each module the baud tick generator derives the baud timing from it: 115200 ticks per second in the transmitter, and 8 times that rate in the receiver, which oversamples the incoming line.

UART Transmitter


The transmitter takes three inputs: the 8 parallel input lines that carry the data we need to send, the start input that triggers a transmission, and the clock signal from the FPGA development board.

Inside this module we have the baud tick generator discussed above and a parallel-to-serial converter. The parallel data arriving on the 8 lines is shifted out one bit per baud tick and appears at the output as a serial data stream on a single line. There is also another output, 'TxD_done', which indicates the end of a packet.

Verilog Code for Transmitter

module UART_transmitter(
            input clk,
            input TxD_start,
            input [7:0] TxD_data,
            output wire TxD,
            output reg TxD_done = 0  // set when the whole frame has been sent, cleared when a new one starts
);

// Assert TxD_start for (at least) one clock cycle to start transmission of TxD_data
// TxD_data is latched so that it doesn't have to stay valid while it is being sent

parameter ClkFrequency = 100000000;         // 100MHz
parameter Baud = 115200;
wire TxD_busy;

///////////////////////////// Baud tick generator /////////////////////////////
wire BitTick;
BaudTickGen #(ClkFrequency, Baud) tickgen(.clk(clk), .enable(TxD_busy), .tick(BitTick));
///////////////////////////////////////////////////////////////////////////////
reg [3:0] TxD_state = 0;
wire TxD_ready = (TxD_state==0);  // idle, ready to accept a new byte
assign TxD_busy = ~TxD_ready;
reg [7:0] TxD_shift = 0;

always @(posedge clk)
begin
            if(TxD_ready & TxD_start)
                        begin TxD_shift <= TxD_data; end
            else
            if(TxD_state[3] & BitTick)                                  // data bits are being sent
                        begin TxD_shift <= (TxD_shift >> 1); end        // move the next data bit into position on every baud tick
            case(TxD_state)                                             // one state per bit of the frame
                        4'b0000 : if(TxD_start)
                                                            begin
                                                            TxD_state <= 4'b0100;
                                                            TxD_done<=0;
                                                            end                                                                                         
                        4'b0100: if(BitTick) TxD_state <= 4'b1000;  // start bit
                        4'b1000: if(BitTick) TxD_state <= 4'b1001;  // bit 0
                        4'b1001: if(BitTick) TxD_state <= 4'b1010;  // bit 1
                        4'b1010: if(BitTick) TxD_state <= 4'b1011;  // bit 2
                        4'b1011: if(BitTick) TxD_state <= 4'b1100;  // bit 3
                        4'b1100: if(BitTick) TxD_state <= 4'b1101;  // bit 4
                        4'b1101: if(BitTick) TxD_state <= 4'b1110;  // bit 5
                        4'b1110: if(BitTick) TxD_state <= 4'b1111;  // bit 6
                        4'b1111: if(BitTick) TxD_state <= 4'b0010;  // bit 7
                        4'b0010: if(BitTick) TxD_state <= 4'b0011;  // stop1
                        4'b0011: if(BitTick)                                                                                          // stop2
                                                            begin
                                                                        TxD_state <= 4'b0000;
                                                                        TxD_done<=1; 
                                                            end
                        default: if(BitTick) TxD_state <= 4'b0000;
            endcase
end

assign TxD = (TxD_state<4) | (TxD_state[3] & TxD_shift[0]);  // put together the start, data and stop bits
endmodule

Here we instantiate the baud tick generator inside the transmitter to time each bit of the outgoing frame.
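
The last assign statement of the transmitter is compact, so here is a small Python model of it. It walks through the state codes in the order the state machine visits them and evaluates the same expression, (TxD_state<4) | (TxD_state[3] & TxD_shift[0]), once per baud tick. This is a sketch of our reading of the Verilog at bit-time resolution, not a cycle-accurate simulation, and the Python names are ours.

```python
# State codes in the order the transmitter visits them after TxD_start:
# start bit, data bits 0..7, stop bit 1, stop bit 2.
STATES = [0b0100] + list(range(0b1000, 0b10000)) + [0b0010, 0b0011]


def txd_levels(byte):
    """Return the TxD level during each bit time of one frame."""
    shift = byte & 0xFF                  # TxD_shift after the byte is latched
    levels = []
    for state in STATES:
        sending_data = (state >> 3) & 1  # TxD_state[3]
        levels.append(int(state < 4) | (sending_data & shift & 1))
        if sending_data:
            shift >>= 1                  # TxD_shift <= TxD_shift >> 1
    return levels


print(txd_levels(0xA5))
```

State 4'b0100 is the only state that is neither below 4 nor has bit 3 set, which is why it produces the low start bit, while the idle and stop states are below 4 and keep the line high.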

UART Receiver


The UART receiver takes two inputs: the input clock signal and the serial data input, which carries a serial data stream sent at a standard baud rate. After each packet arrives at the receiver, the serial data is converted back into a set of 8 parallel data lines.

Verilog Code for Receiver

module UART_receiver(
            input clk,
            input RxD,
            output reg Rx_done=0,                // set when a complete byte has been received, cleared at the next start bit
            output reg [7:0] RxD_data = 8'd0     // data received, valid while Rx_done is high
);

parameter ClkFrequency = 100000000; // 100MHz
parameter Baud = 115200;
parameter Oversampling = 8;  // needs to be a power of 2
// we oversample the RxD line at a fixed rate to capture each RxD data bit at the "right" time
// 8 times oversampling by default, use 16 for higher quality reception
wire RxD_idle;


////////////////////////////////
reg [3:0] RxD_state = 0;
//reg RxD_data_ready = 0;

`ifdef SIMULATION
wire RxD_bit = RxD;
wire sampleNow = 1'b1;  // receive one bit per clock cycle

`else
wire OversamplingTick;
BaudTickGen #(ClkFrequency, Baud, Oversampling) tickgen(.clk(clk), .enable(1'b1), .tick(OversamplingTick));

// synchronize RxD to our clk domain
reg [1:0] RxD_sync = 2'b11;
always @(posedge clk) if(OversamplingTick) RxD_sync <= {RxD_sync[0], RxD};

// and filter it
reg [1:0] Filter_cnt = 2'b11;
reg RxD_bit = 1'b1;

always @(posedge clk)
if(OversamplingTick)
begin
            if(RxD_sync[1]==1'b1 && Filter_cnt!=2'b11) Filter_cnt <= Filter_cnt + 1'd1;
            else
            if(RxD_sync[1]==1'b0 && Filter_cnt!=2'b00) Filter_cnt <= Filter_cnt - 1'd1;

            if(Filter_cnt==2'b11) RxD_bit <= 1'b1;
            else
            if(Filter_cnt==2'b00) RxD_bit <= 1'b0;
end

// and decide when is the good time to sample the RxD line

function integer log2(input integer v); begin log2=0; while(v>>log2) log2=log2+1; end endfunction

localparam l2o = log2(Oversampling);
reg [l2o-2:0] OversamplingCnt = 0;
always @(posedge clk) if(OversamplingTick) OversamplingCnt <= (RxD_state==0) ? 1'd0 : OversamplingCnt + 1'd1;
wire sampleNow = OversamplingTick && (OversamplingCnt==Oversampling/2-1);
`endif

// now we can accumulate the RxD bits in a shift-register
always @(posedge clk) begin
//if(Rx_done) Rx_done =0;
case(RxD_state)
            4'b0000: if(~RxD_bit)
                                                begin
                                                RxD_state <= `ifdef SIMULATION 4'b1000 `else 4'b0001 `endif;  // start bit found?
                                                Rx_done<=1'b0;
                                                //Rxdone<=0;
                                                end
            4'b0001: if(sampleNow) RxD_state <= 4'b1000;  // sync start bit to sampleNow
            4'b1000: if(sampleNow) RxD_state <= 4'b1001;  // bit 0
            4'b1001: if(sampleNow) RxD_state <= 4'b1010;  // bit 1
            4'b1010: if(sampleNow) RxD_state <= 4'b1011;  // bit 2
            4'b1011: if(sampleNow) RxD_state <= 4'b1100;  // bit 3
            4'b1100: if(sampleNow) RxD_state <= 4'b1101;  // bit 4
            4'b1101: if(sampleNow) RxD_state <= 4'b1110;  // bit 5
            4'b1110: if(sampleNow) RxD_state <= 4'b1111;  // bit 6
            4'b1111: if(sampleNow) RxD_state <= 4'b0010;  // bit 7
            4'b0010: if(sampleNow)                                       // stop bit
                                                begin
                                                            RxD_state <= 4'b0000;
                                                            Rx_done <= 1'b1;
                                                            //Rxdone<=1;
                                                end
            default: RxD_state <= 4'b0000;
endcase
end

always @(posedge clk)
if(sampleNow && RxD_state[3]) RxD_data <= {RxD_bit, RxD_data[7:1]};

`ifdef SIMULATION
assign RxD_idle = 0;
`else
reg [l2o+1:0] GapCnt = 0;
always @(posedge clk) if (RxD_state!=0) GapCnt<=0; else if(OversamplingTick & ~GapCnt[log2(Oversampling)+1]) GapCnt <= GapCnt + 1'h1;
assign RxD_idle = GapCnt[l2o+1];

`endif

endmodule
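
The idea behind the oversampling in the receiver can be modelled in Python. The sketch below expands each bit into 8 line samples, looks for the falling edge of the start bit, and then reads every following bit near its middle. It is a simplification under stated assumptions: it leaves out the synchronizer and the glitch filter of the Verilog module, it does not check the stop bits, and the function names are ours.

```python
OVERSAMPLING = 8   # line samples per bit, as in the receiver's parameter


def oversample(frame_bits, idle=4):
    """Expand each frame bit into OVERSAMPLING samples, padded with idle-high samples."""
    samples = [1] * idle
    for bit in frame_bits:
        samples += [bit] * OVERSAMPLING
    return samples + [1] * idle


def receive_byte(samples):
    """Decode one byte from oversampled line levels, or return None if no start bit is found."""
    edge = 0
    while edge < len(samples) and samples[edge] == 1:
        edge += 1                         # wait for the line to go low (start bit)
    if edge == len(samples):
        return None
    centre = edge + OVERSAMPLING // 2     # roughly the middle of the start bit
    value = 0
    for bit in range(8):
        centre += OVERSAMPLING            # middle of the next data bit
        value |= samples[centre] << bit   # data arrives least significant bit first
    return value


byte = 0xA5
frame = [0] + [(byte >> i) & 1 for i in range(8)] + [1, 1]   # start, data, two stop bits
print(hex(receive_byte(oversample(frame))))
```

Sampling near the middle of each bit keeps the receiver away from the bit edges, where a small timing difference between the two boards would otherwise produce wrong values.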



Output from the Simulator



Tuesday, April 17, 2018

Serial Communication

Most electronic designs based on integrated circuits exchange data with external peripherals connected to the same design or with another device. Any embedded system that has a microcontroller or a microprocessor needs, and is capable of, data communication. Such designs also often consist of several circuits combined together using a communication protocol.

In order to perform successful data communication between any two devices, they must share a common communication protocol. Communication protocols can be divided into several categories based on their properties. The two most common and widely used categories are:
  • Serial Communication Protocols
  • Parallel Communication Protocols

Parallel Communication
In parallel communication protocols, one entity (the transmitter) can send more than one bit at a time. In other words, several data bits are transmitted in parallel over several physical lines. In parallel communication devices we usually see a bunch of wires serving as data buses. The following figure shows a typical 8-bit parallel communication port.

According to the figure, there are 8 data lines for communication and another wire for a clock signal. The clock signal is sent to keep the two entities synchronized. The width of the data bus can vary, for example 4 bits, 8 bits, 16 bits or 32 bits.

Serial Communication 

Here we send a stream of data one bit at a time. This communication method needs only a pair of data lines to transmit the data bits, but sometimes we need an additional clock signal line as well.
The figure above shows a one-directional serial communication model between two entities, with a clock signal on the bottom line. It sends a set of 8 bits (one byte) serially over a single line, using several clock cycles. That means it takes more time than sending the bits in parallel.

Parallel vs. Serial

When we compare these two categories of protocols, each has its own pros and cons. At the same clock rate, parallel communication is faster than serial communication because it sends multiple bits in a single clock cycle, while a serial protocol uses multiple clock cycles to send the same amount of data.
Serial communication, however, saves physical wires and reduces the hardware complexity of a design. On the other hand, the upper layers of the communication model have to handle the serial data, which may increase the software complexity.
In some projects the number of input/output pins of a microcontroller or an ASIC is more critical than the complexity of the code. In those cases a serial communication protocol is a great relief.
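
The trade-off between wires and clock cycles can be put into numbers with a short Python sketch. It assumes an idealised link that moves one bit per data line per clock cycle and ignores any framing or clock wires; the payload size and the function name are arbitrary choices for illustration.

```python
import math


def transfer_cycles(n_bits, bus_width):
    """Clock cycles needed to move n_bits when bus_width bits move per cycle."""
    return math.ceil(n_bits / bus_width)


payload_bits = 8 * 64   # 64 bytes, an arbitrary example size
for width in (1, 8, 32):
    cycles = transfer_cycles(payload_bits, width)
    print(f"{width:2d} data line(s): {cycles} clock cycles")
```

Every extra data line reduces the number of clock cycles, but it also costs one more pin on each device.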

Synchronous and Asynchronous Communication
What is synchronization?
When there is digital data communication between any two electronic devices or circuits, they need to have some things in common, in particular a common rhythm. That means a common clock rate that determines where the data bits or frames start and end. Maintaining a common clock rate, and common rising and falling edges of the clock signal, between two devices is called synchronization.
There are two synchronization methods for data transmission, which differ in the clock source. In the synchronous method an external clock signal is supplied over the medium (for example, an extra line for the clock signal). The asynchronous method needs no external clock; instead, the receiver recovers the timing and synchronizes using specific variations in the data signal, such as start and stop bits.

Comparison between Synchronous and Asynchronous

Asynchronous communication

As described above, here we transmit data without using an external clock signal. With this method the data is sent in frames, with synchronization symbols within each frame, instead of as a continuous data stream.
Normally data is sent one byte at a time, with a start bit added at the beginning of the byte and a stop bit at the end.
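
Because the receiver lines itself up again on every start bit, the two clocks only have to agree for the length of one frame. The Python sketch below estimates how much mismatch a frame can tolerate. It assumes an idealised receiver that samples each bit in the middle of what it believes is the bit, and it ignores every other error source such as noise or edge detection delay; the function names are ours.

```python
def sample_lands_in_bit(k, rate_error):
    """True if the receiver's k-th sample falls inside the transmitter's k-th bit.

    Time is measured in transmitter bit times, starting at the start-bit edge.
    rate_error is the relative error of the receiver's bit period (0.02 = 2 % slow).
    """
    t = (k + 0.5) * (1 + rate_error)
    return k <= t < k + 1


def frame_ok(n_bits, rate_error):
    """True if every bit of an n_bits frame is sampled inside the right bit."""
    return all(sample_lands_in_bit(k, rate_error) for k in range(n_bits))


# 10 bit times: start bit, 8 data bits and one stop bit
for err in (0.02, 0.05, 0.06):
    print(f"{err:.0%} mismatch:", "ok" if frame_ok(10, err) else "fails")
```

The error accumulates over the frame, so the last bit is the first one to be sampled in the wrong place. That is why short frames with a fresh start bit make it possible to work without a shared clock line.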

Asynchronous Serial

Here data is sent through a serial interface, and the asynchronous method is used for synchronization. Among serial interfaces, USB (Universal Serial Bus), Ethernet, I2C and SPI (Serial Peripheral Interface) are widely used in many applications. Of these, I2C and SPI are synchronous serial protocols, which use an extra line for the clock signal in addition to the data lines.
An asynchronous serial interface communicates without the support of an external clock line, which minimizes the number of wires or I/O pins a device needs. On the other hand, we have to put some extra effort into synchronization on the software side. Bluetooth modules, GPS modules, Wi-Fi modules and most of the other modules we have used in our projects are based on this kind of serial communication.

Structure of an Asynchronous Serial data frame

In an asynchronous serial protocol, there are several things to consider other than the set of data bits.


Data bits: Each packet contains a set of data bits whose length can vary (for example, one byte). Most of the time we use 8 data bits, and sometimes 7 bits if we are transferring 7-bit ASCII characters. Both devices must agree on the length of the frames and on the endianness of the data bits.

Synchronization bits: Two or three bits in every data frame are specially assigned as synchronization bits. Their job is to mark the starting point and ending point of a data frame, so they are called the start bit and the stop bits. Normally the start is a single bit, while the stop is one or two bits.

Parity bits: Many data frames carry an error detection bit based on a parity mechanism agreed on by the two entities. The parity check is based on whether the sum of the data bits (the number of ‘1’s in the data bits) is even or odd. There are two parity methods, called odd parity and even parity. For example, if the sum is odd and even parity is used, the parity bit is set to “1”.
The parity bit is set to ‘1’ or ‘0’ as follows:
Even parity: “sum of data bits + parity bit” is an even number
Odd parity: “sum of data bits + parity bit” is an odd number
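
The two rules can be written as a short Python function. This is a minimal sketch; the function name and the example value are ours.

```python
def parity_bit(data, mode="even"):
    """Return the parity bit that makes the total number of 1s even or odd."""
    ones = bin(data).count("1")   # the "sum of data bits"
    if mode == "even":
        return ones % 2           # add a 1 only if the count is currently odd
    if mode == "odd":
        return 1 - ones % 2       # add a 1 only if the count is currently even
    raise ValueError("mode must be 'even' or 'odd'")


data = 0b01101000                 # three 1s, so the sum is odd
print(parity_bit(data, "even"))   # 1
print(parity_bit(data, "odd"))    # 0
```

The receiver runs the same calculation on the data bits it received and compares the result with the parity bit in the frame. A mismatch means that an odd number of bits were flipped on the way.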

Apart from the data frame, there is another important thing to consider in asynchronous serial communication: the baud rate.

Baud Rate

In telecommunication theory, baud rate is the number of symbols sent along a channel per second. Here it describes the same thing; since each symbol on a UART line carries one bit, the baud rate is measured as the number of bits per second the serial interface can communicate. It gives us a sense of how fast a serial link can send data. Commonly used standard baud rates are 1200, 2400, 4800, 9600, 19200, 38400, 57600 and 115200 bits per second. Most common and simple applications use 9600 bps, while speed-critical applications use 115200 bps.
Whatever the data rate, the two communicating entities must use the same data rate in order to communicate successfully.
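
Since every frame also carries start and stop bits, the number of data bytes per second is lower than the baud rate divided by 8. The Python sketch below works this out; it assumes 8 data bits per frame and a continuous stream with no idle time between frames, and the function name is ours.

```python
def bytes_per_second(baud, data_bits=8, parity_bits=0, stop_bits=1):
    """Data bytes per second when frames are sent back to back."""
    frame_bits = 1 + data_bits + parity_bits + stop_bits   # 1 start bit plus the rest
    return baud / frame_bits


print(bytes_per_second(9600))                  # 960.0
print(bytes_per_second(115200, stop_bits=2))   # one start, eight data, two stop bits
```

With one start bit and one stop bit, 10 bit times are spent on every 8 data bits, so only 80 % of the line rate carries data.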

Serial Asynchronous Interface in Hardware

In most applications we expect duplex communication between two devices. That means both devices are capable of receiving and transmitting data. There are two kinds of duplex communication. One is full duplex, which allows data to be received and transmitted at the same time. The other is half duplex, in which only one entity transmits at a time, and the other transmits after it has finished.

So, these kinds of interfaces must have two lines to receive and transmit data, and another line for a common ground.

There is an important thing to note about wiring the two entities. The RX and TX pins are named relative to the device itself, so we must wire them with the whole system in mind. The TX pin of one device should be connected to the RX pin of the other device, and vice versa.
Bluetooth modules, GPS modules, RTC modules and communication between two microcontrollers are some common applications of duplex communication.
Sometimes we use one-directional data transmission. That means we only need to transmit data to someone, or only to receive data from someone. Then we use a single line for the communication. This method is called simplex communication.


References:
https://en.wikipedia.org/wiki/Serial_communication
https://learn.sparkfun.com/tutorials/serial-communication/all



