A novel design and implementation of FPGA based 3D-CORDIC processor

By

Al-Homosy, G.M. *  Abass, Y.M. **  Al-Kholy, S.A. *  A.M. Rashed***

Abstract:

A new complete design and implementation of an FPGA-based three-dimensional CORDIC processor (3D-CORDIC) is introduced. Efficient mappings onto the FPGA have been performed, leading to a fast implementation. The proposed 3D-CORDIC processor has been simulated using the Mentor Graphics ModelSim SE tools and MATLAB, and good agreement between the two sets of results has been achieved. The 3D-CORDIC processor architecture has been implemented with a 12-bit word length in a Xilinx Spartan-II series field programmable gate array (FPGA). The 3D-CORDIC processor uses only 37 % of the SLICEs and 52 % of the IOBs, with a maximum clock frequency of 116 MHz, which is suitable for many CORDIC processor applications.

Keywords:

FPGA, CORDIC, VHDL, 3D-CORDIC PROCESSOR

* Physics & Mathematics Dept, Suez Canal University
** Physics Dept, Suez Canal University
*** Systems & Computers Dept, Al-Azhar University
1. Introduction:

The well known CORDIC algorithm, which has been applied with a great success to the hardware implementations of many signal processing tasks, e.g. sine and cosine generation, vector rotation, coordinate transformation, and linear system solving, is suitable for the implementation of 3-D vector interpolation [1]. In CORDIC, only simple shifters and adders are needed, which can be realized by the use of reconfigurable hardware platforms, especially by FPGA [2].

While hardware-efficient solutions often exist, the dominance of the software systems has kept those solutions out of the spotlight. Among these hardware-efficient algorithms is a class of iterative solutions for trigonometric and other transcendental functions. The trigonometric functions are based on vector rotations, while other functions such as square root are implemented using an incremental expression of the desired function. The trigonometric algorithm is called CORDIC, an acronym for (COordinate Rotation DIgital Computer). The incremental functions are performed with a very simple extension to the hardware architecture, and while not CORDIC in the strict sense, are often included because of the close similarity. The CORDIC algorithms generally produce one additional bit of accuracy for each iteration.

Volder [3] developed the underlying method of computing the rotation of a vector in a Cartesian coordinate system and evaluating the length and angle of a vector. Volder's algorithm is derived from the general equations for step-by-step vector rotation. All the evaluation procedures in CORDIC are computed as a rotation of a vector in three different coordinate systems (circular, linear and hyperbolic) with an iterative unified formulation.

The CORDIC method was later expanded to multiplication, division, logarithm, exponential and hyperbolic functions. The various function computations were summarized into a unified technique by Walther [4]. The multiplications in CORDIC calculations are replaced by calculations based on shift registers and adders, which saves considerable hardware resources. CORDIC is used for polar to rectangular and rectangular to polar conversions, for the calculation of trigonometric functions and vector magnitude, and in some transformations such as the discrete Fourier transform (DFT) or the discrete cosine transform (DCT). A broad field of digital signal processing applications finds a valuable kernel in the generalized CORDIC algorithm proposed by Walther.

The CORDIC algorithm provides an iterative method of performing vector rotations by arbitrary angles using only shift and add operations. The CORDIC arithmetic technique makes it possible to perform two-dimensional rotations using simple hardware components. Its use was originally intended for solving navigation problems by calculating some elementary functions [5], and as a useful kernel for many digital signal processing tasks [6] [7].

Much research has addressed the overall quantization error (OQE) and the optimization of CORDIC processors [8] [9]. Most of today's efforts at direct compilation from a high-level language to FPGAs target very simple arithmetic units such as adders, multipliers, shifters, etc. Instead, more complex arithmetic units such as CORDICs, coupled with various alternative number representations, should be targeted by higher-level compilers to exploit the full potential of reconfigurable computing [8] [10].

More specifically, there are a number of digital signal processing (DSP) applications using the CORDIC-based hardware such as modulation [11], digital filtering [12], and Fast Fourier Transforms (FFTs) [13] [14]. Modified CORDIC algorithms have also been proposed in order to overcome the disadvantages of the conventional algorithm such as low computational speed and high complexity [15] [16] [17].

In this paper, the architecture of a 3D-CORDIC processor based on the CORDIC algorithm is proposed. It is suitable for Field Programmable Gate Array (FPGA) implementation in terms of computational complexity. One objective of this paper is to show a possible direction for high-level compilation to a 3D-CORDIC processor. The remainder of the paper is organized as follows. The basics of the CORDIC algorithm are discussed in Section 2. The new 3D-CORDIC processor technique is introduced in Section 3. The design, simulation and implementation of the 3D-CORDIC processor are introduced in Section 4. The MATLAB Ver. 7 simulation of the 3D-CORDIC processor is given in Section 5. Finally, the discussion and conclusion are given in Section 6.

2. THE BASICS OF CORDIC ALGORITHM:

The CORDIC is an algorithm performing a sequence of iteration computations by the use of coordinate rotation [3] [4] [18]. It can be used to generate important elementary functions by using only simple adders and shifters. The basic CORDIC iteration equations are given by:

\[ X_{i+1} = X_i - m\, d_i \left(2^{-s(m,i)}\right) Y_i \tag{2.1} \]

\[ Y_{i+1} = Y_i + d_i \left(2^{-s(m,i)}\right) X_i \tag{2.2} \]

\[ Z_{i+1} = Z_i - d_i \alpha_{m,i} \tag{2.3} \]

where \( m \) denotes the circular ( \( m = 1 \) ), linear ( \( m = 0 \) ) or hyperbolic ( \( m = -1 \) ) coordinate system [19] [20], \( i = 0, 1, 2, \ldots, n-1 \),

\[
s(m,i)=\begin{cases} 
0, 1, 2, 3, 4, 5, \ldots, & m = 1 \\
1, 2, 3, 4, 5, 6, \ldots, & m = 0 \\
1, 2, 3, 4, 5, 6, \ldots, & m = -1 
\end{cases}
\]

and

\[ \alpha_{m,i} = m^{-1/2} \tan^{-1}\left[\sqrt{m}\, 2^{-s(m,i)}\right] \tag{2.4} \]

The rotation direction is \( d_i = \text{sign}(Z_i) \) for the rotation mode \( (Z_n \rightarrow 0) \) and \( d_i = -\text{sign}(X_i) \cdot \text{sign}(Y_i) \) for the vectoring mode \( (Y_n \rightarrow 0) \).
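The iteration can be illustrated with a short floating-point sketch for the circular case (\( m = 1 \), \( s(1,i) = i \)). This is a reference model under stated assumptions, not the authors' VHDL: the function name `cordic_circular` is hypothetical, angles are in radians, a zero operand is treated as positive, and the angle accumulator is updated as \( Z_i - d_i \alpha_{1,i} \) so that \( Z \) is driven to zero in rotation mode and accumulates the vector angle in vectoring mode.

```python
import math

def cordic_circular(x, y, z, n=12, mode="rotation"):
    """Run n circular-mode (m = 1) CORDIC micro-rotations; angles in radians."""
    for i in range(n):
        t = 2.0 ** -i                     # shift amount s(1, i) = i
        alpha = math.atan(t)              # elementary angle alpha_{1,i}
        if mode == "rotation":
            d = 1 if z >= 0 else -1       # d_i = sign(Z_i): drive Z toward 0
        else:
            # d_i = -sign(X_i) * sign(Y_i): drive Y toward 0
            d = -1 if (x >= 0) == (y >= 0) else 1
        x, y = x - d * y * t, y + d * x * t
        z = z - d * alpha
    return x, y, z

# Vectoring mode: y is driven toward 0 and z accumulates the angle of (x, y).
x_n, y_n, z_n = cordic_circular(64.0, 71.0, 0.0, mode="vectoring")
```

The returned `x_n` is still multiplied by the accumulated CORDIC gain, so it has to be rescaled before it can be read as a vector length.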

To explain the basic concept of CORDIC, consider the two-dimensional Euclidean space in figure (2.1). Let \( X_i \) and \( Y_i \) be the X and Y coordinates of the vector \( OP \) with magnitude \( R_i \). This vector is rotated through an angle \( \varphi \) to form the new vector \( OQ \). The rotation is not a pure vector rotation but a motion of vector \( OP \) along the tangent of the circle formed by \( OP \) as radius at the point P. The resultant vector therefore has a magnitude of \( R_i \sqrt{1 + \tan^2 \varphi} \), which is \( R_i \sec \varphi \). The coordinates of \( Q (X_{i+1}, Y_{i+1}) \) can be expressed as follows [21]:
\[
\begin{aligned}
X_{i+1} &= X_i - Y_i \tan(\varphi) \\
Y_{i+1} &= Y_i + X_i \tan(\varphi)
\end{aligned}
\tag{2.5}
\]
Taking into consideration the direction of rotation \( s_i \), with \( s_i = +1 \) for anticlockwise rotation and \( s_i = -1 \) for clockwise rotation, Equation (2.5) can be expressed as
\[
\begin{aligned}
X_{i+1} &= X_i - s_i Y_i \tan(\varphi) \\
Y_{i+1} &= Y_i + s_i X_i \tan(\varphi)
\end{aligned}
\]

In the CORDIC algorithm the angle of rotation is achieved by a series of micro-rotations. In simple words, the input angle is decomposed into small micro-angles that take the values \( \tan^{-1}(2^{-i}) \), where \( i \) takes values from 0 to \( n-1 \).

Figure (2.1) Iterative vector rotation, initialized with \( V_0 \)

The scale factor in the \( i \)-th iteration, as shown in figure (2.1), is \( k_{m,i} = 1 / (1 + m d_i^2 2^{-2s(m,i)})^{1/2} \), which for the circular case (\( m = 1 \)) equals \( \cos(\varphi_i) \) with \( \varphi_i = \tan^{-1}(2^{-s(m,i)}) \). Since the rotation direction is \( d_i \in \{-1, +1\} \), \( d_i^2 = 1 \), and after \( n \) iterations the product of all the scale factors is [22]:

\[
K_n = \prod_{i=0}^{n-1} k_{m,i} = \prod_{i=0}^{n-1} \frac{1}{(1 + m d_i^2 2^{-2s(m,i)})^{1/2}} = \prod_{i=0}^{n-1} \frac{1}{(1 + m 2^{-2s(m,i)})^{1/2}}
\]

For data of \( B \)-bit word length, no more than \( B \) iterations need be performed, i.e., \( n \leq B \). In addition, the final values, \( X(n) \) and \( Y(n) \), need to be scaled by the accumulated scaling factor \( K_n \). The resulting vector \( V'(X(n)/K, Y(n)/K) \), where \( K = 1/K_n \) is the accumulated gain, is the unit vector as shown in figure (2.1).
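The accumulated gain can be evaluated numerically. The sketch below is an illustration only: the helper name `cordic_gain` is hypothetical, and it uses the shift sequences listed above (\( s = i \) for \( m = 1 \), \( s = i + 1 \) otherwise) without the repeated iterations that some hyperbolic implementations add. It returns the gain \( K = \prod (1 + m 2^{-2s(m,i)})^{1/2} \), and the scale factor \( K_n \) is its reciprocal.

```python
import math

def cordic_gain(n, m=1):
    """Accumulated gain K = prod sqrt(1 + m * 2^(-2 s(m, i))) over n iterations."""
    gain = 1.0
    for i in range(n):
        s = i if m == 1 else i + 1        # shift sequences as listed in the text
        gain *= math.sqrt(1.0 + m * 2.0 ** (-2 * s))
    return gain

K = cordic_gain(12)      # circular mode, about 1.6468 for 12 iterations
K_n = 1.0 / K            # scale factor, about 0.6073
```

For the linear case (\( m = 0 \)) every factor is 1, so no scaling is needed.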

3. A New 3D-CORDIC Processor Based on 2D-CORDIC Design:

There are different techniques to implement a CORDIC processor [8]. The ideal architecture depends on the speed versus area tradeoffs in the intended application. In the new design of the CORDIC algorithm, the 3D Cartesian coordinates \((X_0, Y_0, Z_0)\) are converted to polar coordinates \((R_{final}, \alpha, \gamma)\), as shown in figure (3.1).

The CORDIC structure described in equations (2.1), (2.2) and (2.3) is represented by the schematic shown in figure (3.2). An iterative CORDIC architecture can be obtained simply by duplicating each of the three difference equations in hardware, as shown in figure (3.2). The decision function, \( \sigma_i \) (the rotation direction \( d_i \) of Section 2), is driven by the sign of the \( z \) or \( y \) register depending on whether the unit is operated in rotation mode or vectoring mode, respectively. In operation, the initial values are loaded via multiplexers into the \( x \), \( y \) and \( z \) registers. Then, on each of the next \( n \) clock cycles, the values from the registers are passed through the shifters and adder-subtractors and the results are placed back in the registers. The shifters are modified on each iteration to produce the shift required for that iteration [19] [23].

The ROM address is incremented on each iteration so that the appropriate elementary angle value is presented to the \( z \) adder. On the last iteration, the results are obtained directly from the adders. A simple state machine is required to keep track of the current iteration and to select the degree of shift and the ROM address for each iteration [24].

Figure (3.2): Block Diagram of a Parallel CORDIC Architecture.
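The register-level behaviour described above can be modelled in a few lines. The following is a minimal integer sketch of a vectoring-mode datapath, not the authors' VHDL. The names `FRAC_BITS`, `ANGLE_ROM` and `iterate_vectoring` are hypothetical, the internal precision of 8 fractional bits is an assumption made for the illustration, the angle ROM is stored in degrees, and the input is assumed to satisfy \( x_0 \geq 0 \).

```python
import math

FRAC_BITS = 8   # assumed internal precision for this sketch, not taken from the paper
N_ITER = 12     # number of clock cycles / micro-rotations

# ROM of elementary angles atan(2^-i) in degrees, stored as fixed-point integers
ANGLE_ROM = [round(math.degrees(math.atan(2.0 ** -i)) * (1 << FRAC_BITS))
             for i in range(N_ITER)]

def iterate_vectoring(x0, y0):
    """Model the x, y and z registers over N_ITER iterations (x0 >= 0)."""
    x = x0 << FRAC_BITS               # load the registers through the multiplexers
    y = y0 << FRAC_BITS
    z = 0
    for i in range(N_ITER):           # i plays the role of the state-machine counter
        dx, dy = x >> i, y >> i       # shifters (arithmetic right shift)
        if y >= 0:                    # decision taken from the sign of the y register
            x, y, z = x + dy, y - dx, z + ANGLE_ROM[i]
        else:
            x, y, z = x - dy, y + dx, z - ANGLE_ROM[i]
    return x, y, z                    # raw register contents, still in fixed point

x_n, y_n, z_n = iterate_vectoring(64, 71)
angle_deg = z_n / (1 << FRAC_BITS)    # accumulated angle in degrees
```

Only shifts, additions, subtractions and a table lookup appear inside the loop, which is what makes the structure attractive for an FPGA.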

To locate a point \((x_o, y_o)\) in the XY-plane, the signs of \(x_o\) and \(y_o\) are read and recorded in a 2-bit register called the Select register \((SEL(2\text{bit}))\). Table (3.1) shows the Select register value for each combination of sign\((x_o)\) and sign\((y_o)\).

The Vector Mode CORDIC unit (VMC), shown in figure (3.3), is used to calculate \(x_{n(\text{x-y plane})}\) and \(z_{n(\text{x-y plane})}\) from the positive values of \(x_o\) and \(y_o\), which are loaded into the registers \(x_o(\text{in})\) and \(y_o(\text{in})\) respectively. In the new design the number of iterations is \(n = 12\), and the CORE shown in figure (3.4) contains 11 iterations of the VMC unit.

Table (3.1): The Select register for the different combinations of sign\((x_o)\) and sign\((y_o)\)

| sign\((x_o)\) | sign\((y_o)\) | SEL(2 bit) |
|---------------|---------------|------------|
| +ve | +ve | 00 |
| -ve | +ve | 01 |
| -ve | -ve | 11 |
| +ve | -ve | 10 |

The output values \((X_{n(x-y-plane)}, Z_{n(angle)})\) in the new design are corrected as follows [19]:

1- The value of \(x_{n(x-y-plane)}\) produced by the CORE is corrected using the equation:

\[
R_{(x-y-plane)} = \frac{X_{n(x-y-plane)}}{K}
\]

where \(R_{(x-y-plane)}\) is the radius in the XY-plane, as shown in figure (3.1). The factor K is a constant \((K \approx 1.6468)\), and the value of \(1/K\) is decomposed into a sum of signed powers of two as shown in equation (3.1):

\[
\frac{1}{K} = \frac{1}{1.6468} \approx \frac{1}{2^1} + \frac{1}{2^3} - \frac{1}{2^6} - \frac{1}{2^9} \tag{3.1}
\]

\[
\therefore R_{(x-y-plane)} = \frac{X_{n(x-y-plane)}}{K} \approx X_{n(x-y-plane)} \left( \frac{1}{2^1} + \frac{1}{2^3} - \frac{1}{2^6} - \frac{1}{2^9} \right) \tag{3.2}
\]

The value of \(R_{(x-y-plane)}\) is therefore obtained by right-shifting \(x_{n(x-y-plane)}\) by 1, 3, 6 and 9 bits and then adding and subtracting the shifted values according to equation (3.2).

2- The angle value (Alfa) is corrected according to the position of the point in the XY-plane, using the Select register of Table (3.1). If \(x\) and \(y\) are both positive \((SEL = 00)\), the angle is equal to \(z_{n(out)}\); if \(SEL = 01\), the angle is equal to \((\pi - z_{n(out)})\); if \(SEL = 11\), the angle is equal to \((\pi + z_{n(out)})\); and finally, if \(SEL = 10\), the angle is equal to \((2\pi - z_{n(out)})\), as shown in figure (3.5).

The correction is carried out by the Correct Radius and Angle unit (C-Rad&Ang), whose block diagram is shown in figure (3.6). Figure (3.7) shows the block diagram of the Two-Dimensional Cartesian to Polar coordinate converter (2D-KtoP), or 2D CORDIC processor, which converts two-dimensional Cartesian coordinates to polar coordinates. From the 2D-KtoP block diagram, the radius $R_{(x-y-plane)}$ and the angle $\alpha$ can be calculated [19].
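The two correction steps can be sketched as follows. This is an illustrative model of what a C-Rad&Ang-style unit computes, not the authors' implementation: the function names are hypothetical, the radius is handled as a raw fixed-point integer, angles are expressed in degrees (so \(\pi\) corresponds to 180), and the quadrant assignment follows Table (3.1), in which SEL = 01 denotes \(x < 0, y \geq 0\) and SEL = 10 denotes \(x \geq 0, y < 0\).

```python
def scale_by_inv_k(x_n):
    """Approximate x_n / K with the shift-add pattern 2^-1 + 2^-3 - 2^-6 - 2^-9."""
    return (x_n >> 1) + (x_n >> 3) - (x_n >> 6) - (x_n >> 9)

def select_bits(x0, y0):
    """2-bit SEL code from the signs of x0 and y0, following Table (3.1)."""
    if x0 >= 0 and y0 >= 0:
        return 0b00
    if x0 < 0 and y0 >= 0:
        return 0b01
    if x0 < 0 and y0 < 0:
        return 0b11
    return 0b10

def correct_angle(sel, z_out):
    """Map the first-quadrant angle z_out (degrees) into the range 0..360."""
    if sel == 0b00:
        return z_out
    if sel == 0b01:
        return 180.0 - z_out      # second quadrant
    if sel == 0b11:
        return 180.0 + z_out      # third quadrant
    return 360.0 - z_out          # fourth quadrant

# A raw register value of 1259 with 3 fractional bits (about 157.4) scales to
# 765, i.e. about 95.6 after dividing by 8.
radius_raw = scale_by_inv_k(1259)
```

The shift-add form replaces a multiplier by four shifts and three adder-subtractors, at the cost of a small truncation error in each shifted term.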

The radius $R_{(x-y-plane)}$ and the angle Alfa ($\alpha$) are calculated in the XY-plane by the 2D CORDIC processor (2D-KtoP) with $X_{in}=X_0$ and $Y_{in}=Y_0$. The radius $R_{final}$ and the angle Gamma ($\gamma$) are then calculated in the ZR$_{(x-y-plane)}$-plane with $X_{in}=Z_0$ and $Y_{in}=R_{(x-y-plane)}$, as shown in figure (3.2).
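The cascade of two 2D stages can be sketched in floating point. This is a behavioural illustration under stated assumptions, not the authors' design: the names `vectoring` and `cart_to_polar_3d` are hypothetical, all three inputs are taken to be non-negative (so no quadrant correction is needed), angles are in degrees, and the gain is removed by an exact division instead of the shift-add approximation.

```python
import math

def vectoring(x, y, n=12):
    """Circular vectoring CORDIC for x >= 0; returns (radius, angle in degrees)."""
    z, gain = 0.0, 1.0
    for i in range(n):
        t = 2.0 ** -i
        d = -1 if y >= 0 else 1               # drive y toward 0
        x, y = x - d * y * t, y + d * x * t
        z -= d * math.degrees(math.atan(t))   # accumulate the rotation angle
        gain *= math.sqrt(1.0 + t * t)
    return x / gain, z

def cart_to_polar_3d(x0, y0, z0):
    """Two cascaded 2D conversions: XY-plane first, then the Z-R_xy plane."""
    r_xy, alpha = vectoring(x0, y0)           # X_in = X0, Y_in = Y0
    r_final, gamma = vectoring(z0, r_xy)      # X_in = Z0, Y_in = R_xy
    return r_final, alpha, gamma

r_final, alpha, gamma = cart_to_polar_3d(64.0, 71.0, 64.0)
```

With this input ordering, \(\gamma\) is the angle measured from the Z axis, since the second stage computes \(\tan^{-1}(R_{(x-y-plane)} / Z_0)\).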

Figure (3.8) shows the block diagram of the Three Dimensions Cartesian to Polar coordinate (3D-KtoP) or (3D CORDIC processor).
4. FPGA IMPLEMENTATION AND RESULTS:

The VHDL (VHSIC Hardware Description Language) is used to implement the algorithm and map it to the FPGA [25]. The probability values are floating point numbers, so the IEEE 754 standard is used to represent them.

The IEEE 754 standard represents floating point numbers by three fields (sign (s), exponent (e), and mantissa (F)). Figure (4.1) shows the bit width alignment of the three fields for the precision representation.

Sometimes much smaller bit widths than those specified in the IEEE 754 standard are sufficient to provide the desired precision and occupy fewer resources than the full standard bit width implementation [26][27].

The Cartesian coordinates of a point in 3-dimensional space (Xk, Yk, Zk) are 12-bit signed words. The 3D-CORDIC processor unit converts this point to the equivalent polar coordinates (R, Alfa (\( \alpha \)), Gamma (\( \gamma \))), where red_p(out) is the radius (R), an_alfat_p (out) is the angle Alfa (\( \alpha \)), and an_gamma_p (out) is the angle Gamma (\( \gamma \)).

As illustrated in figure (4.1), the sign field (S) is bit D11 and specifies the sign of the number. Bits D10 down to D3 are the exponent field; this 8-bit quantity is represented as an integer value. Bits D2 down to D0 are the mantissa field (F), which stores the fractional part of the number. For the angle values, as illustrated in figure (4.2), there is no sign field: bits D11 down to D3 are the exponent field, and this 9-bit quantity is represented as an integer value.

In building the 3D-CORDIC processor, a 12-bit word length is used. For the numerical values the last bit (D11) is the sign bit: if it equals 0 the number is positive, and if it equals 1 the number is negative. The inputs and the output radius \((x_{in}, y_{in}, z_{in}, \text{rad}_p(out))\) use the following format, as shown in figure (4.1) [19]:

- The last bit is the sign bit (s)
- The upper 8 bits represent the integer value (e)
- The lower 3 bits represent the fractional value (f)

The output angles (Alfa and Gamma) are represented in the following format, as shown in figure (4.2):

- The upper 9 bits represent the integer value (e)
- The lower 3 bits represent the fractional value (f)

The angles (Alfa and Gamma) have no sign bit, because every angle takes a positive value. In this angle format, the binary value of the angle \(\pi\) is \((010110100.000)\) and that of the angle \(2\pi\) is \((101101000.000)\).
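The word format can be checked with a small encoder and decoder. The sketch assumes that angles are stored in degrees, which is what the binary values quoted for \(\pi\) (180) and \(2\pi\) (360) imply, and that the three fractional bits give a resolution of 0.125. The helper names are hypothetical.

```python
def encode_angle(deg):
    """Unsigned 12-bit word: 9 integer bits and 3 fractional bits (steps of 0.125)."""
    raw = int(round(deg * 8)) & 0xFFF
    return format(raw >> 3, "09b") + "." + format(raw & 0b111, "03b")

def decode_fixed(bits):
    """Decode a string of the form 'iiiiiiiii.fff' as a non-negative value."""
    whole, frac = bits.split(".")
    return int(whole, 2) + int(frac, 2) / 8.0

print(encode_angle(180.0))            # 010110100.000
print(encode_angle(360.0))            # 101101000.000
print(decode_fixed("000111000.100"))  # 56.5
```

The same decoder also applies to the signed radius format whenever the sign bit is 0, because the leading bit then contributes nothing to the integer part.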

The 2D-CORDIC processor was implemented on a Xilinx Spartan-II 2.5V FPGA [XC2S200E-5-PQ208C]. The design was coded in VHDL and successfully implemented in the Xilinx Spartan-II FPGA.

The VHDL code of the proposed 3D-CORDIC is compiled by XILINX Foundation Series Express (ISE 4.2i) and simulated using the ModelSim SE tools of Mentor Graphics [28], [29]. This model needs 1700 ns (1.7 \( \mu \)s) to process the input values and present a result, which makes it a fast implementation.

The simulation results of the 3D-CORDIC processor for the inputs of table (4.1) are shown in figure (4.3).

Figure (4.3): Simulation and results for the 3D-CORDIC processor

For the input operands

\[ Xin = (001000000.000)_B = (+64.00)_{D}, \quad Yin = (001000111.000)_B = (+71.00)_{D}, \]
\[ Zin = (001000000.000)_B = (+64.00)_{D} \]

applied at time 0 s, the outputs are

\[ R = (001110011.101)_B = (+115.625)_{D} \]
\[ \alpha = (000110000.000)_B = (48.00)_{D} \]
\[ \gamma = (000111000.100)_B = (56.50)_{D} \]

at time 1.7 \( \mu \)s. For the input operands

\[ Xin = (001100000.000)_B = (+96.00)_{D}, \quad Yin = (001100111.000)_B = (+103.00)_{D}, \]
\[ Zin = (001000000.000)_B = (+64.00)_{D} \]

applied at time 2 \( \mu \)s, the outputs are

\[ R = (010011011.011)_B = (+155.375)_{D} \]
\[ \alpha = (000101111.010)_B = (47.25)_{D} \]
\[ \gamma = (001000001.110)_B = (65.75)_{D} \]

at time 3.6 \( \mu \)s.

<table>
<thead>
<tr>
<th>Table (4.1): Results for the 3D-CORDIC Processor</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Input values</strong></td>
</tr>
<tr>
<td>Xin = (001000000.000)<sub>B</sub> = (+64.00)<sub>D</sub></td>
</tr>
<tr>
<td>Yin = (001000111.000)<sub>B</sub> = (+71.00)<sub>D</sub></td>
</tr>
<tr>
<td>Zin = (001000000.000)<sub>B</sub> = (+64.00)<sub>D</sub></td>
</tr>
<tr>
<td>wait after 2 &mu;s</td>
</tr>
</tbody>
</table>

The experimental results and simulation are given in figure (4.3) and figure (4.4), which represent the simulation of the 3D-CORDIC processor. Table (4.2) summarizes the ports of the 3D-CORDIC processor, where the inputs are the clock (clk) and the three 12-bit signed inputs (Xk, Yk, Zk), and the outputs are (rad_p, an_alfa_p, an_gama_p).

Table (4.3) and Table (4.4) summarize the device utilization for the Spartan-II (XC2S200E-5-PQ208C) FPGA with a speed grade of -5. The synthesis results show that the 3D-CORDIC processor runs at a frequency of 116.618 MHz with a total equivalent gate count of 21,626 gates.

**Timing Summary:**

- Minimum period: 8.575ns (Maximum Frequency: 116.618MHz)
- Minimum input arrival time before clock: 13.193ns
- Maximum output required time after clock: 7.999ns
- Maximum combinational path delay: No path found
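
As a consistency check on the timing summary, the maximum clock frequency is the reciprocal of the reported minimum clock period:

\[
f_{\max} = \frac{1}{T_{\min}} = \frac{1}{8.575\ \text{ns}} \approx 116.618\ \text{MHz}
\]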

Downloading and implementing our design gives the following device utilization summary:

<table>
<thead>
<tr>
<th>Number of Resource</th>
<th>Used</th>
<th>Avail</th>
<th>Utilization</th>
</tr>
</thead>
<tbody>
<tr>
<td>External GCLKIOBs</td>
<td>1</td>
<td>4</td>
<td>25%</td>
</tr>
<tr>
<td>External IOBs</td>
<td>73</td>
<td>140</td>
<td>52%</td>
</tr>
<tr>
<td>LOCed External IOBs</td>
<td>73</td>
<td>73</td>
<td>100%</td>
</tr>
<tr>
<td>SLICEs</td>
<td>872</td>
<td>2352</td>
<td>37%</td>
</tr>
<tr>
<td>GCLKs</td>
<td>1</td>
<td>4</td>
<td>25%</td>
</tr>
</tbody>
</table>
### Table 4.2: List of IO Ports for 3D-cordic processor

<table>
<thead>
<tr>
<th>Port</th>
<th>Width</th>
<th>Direction</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>CLK</td>
<td>1</td>
<td>Input</td>
<td>System Clock</td>
</tr>
<tr>
<td>Xk</td>
<td>12</td>
<td>Input</td>
<td>X-coordinate input. Signed value</td>
</tr>
<tr>
<td>Yk</td>
<td>12</td>
<td>Input</td>
<td>Y-coordinate input. Signed value</td>
</tr>
<tr>
<td>Zk</td>
<td>12</td>
<td>Input</td>
<td>Z-coordinate input. Signed value</td>
</tr>
<tr>
<td>red_p</td>
<td>12</td>
<td>Output</td>
<td>Radius output. Unsigned value.</td>
</tr>
<tr>
<td>an_alfat_p</td>
<td>12</td>
<td>Output</td>
<td>Angle (α) output. Unsigned value.</td>
</tr>
<tr>
<td>an_gama_p</td>
<td>12</td>
<td>Output</td>
<td>Angle (γ) output. Unsigned value.</td>
</tr>
</tbody>
</table>

### Table 4.3: Synthesis Results for 3D-CORDIC Processor

<table>
<thead>
<tr>
<th>Vendor</th>
<th>Family</th>
<th>Device</th>
<th>Resource usage</th>
<th>Max. Clock speed</th>
</tr>
</thead>
<tbody>
<tr>
<td>Xilinx</td>
<td>Spartan-II</td>
<td>XC2S200E-5</td>
<td>872 slices</td>
<td>116.618 MHz</td>
</tr>
</tbody>
</table>

**Design Summary**

- Number of Slices: 872 out of 2,352, 37%
- Number of Slices containing unrelated logic: 0 out of 872, 0%
- Number of Slice Flip Flops: 1,549 out of 4,704, 32%
- Total Number 4 input LUTs: 851 out of 4,704, 18%
  - Number used as LUTs: 782
  - Number used as a route-thru: 69
- Number of bonded IOBs: 72 out of 140, 51%
- Number of GCLKs: 1 out of 4, 25%
- Number of GCLKIOBs: 1 out of 4, 25%
- Total equivalent gate count for design: 21,626
- Additional JTAG gate count for IOBs: 3,504

### Table 4.4: Synthesis Results of Xilinx XC2S200E-5-PQ208C FPGA

<table>
<thead>
<tr>
<th>Resource</th>
<th>Used</th>
<th>Available</th>
<th>Utilization</th>
</tr>
</thead>
<tbody>
<tr>
<td>IOs</td>
<td>73</td>
<td>140-208</td>
<td>52.21%</td>
</tr>
<tr>
<td>Function generator 4 input LUTs</td>
<td>851</td>
<td>4704</td>
<td>18.09%</td>
</tr>
<tr>
<td>CLB Slice</td>
<td>872</td>
<td>2352</td>
<td>37.07%</td>
</tr>
<tr>
<td>Dffs or Latches (Flip Flops)</td>
<td>1549</td>
<td>4704</td>
<td>32.92%</td>
</tr>
</tbody>
</table>


5. MATLAB SIMULATION:

To compare the FPGA simulation results with another simulation method, MATLAB V.7 is used to simulate the 3D-CORDIC processor [30].

The VHDL code is converted to MATLAB code. The MATLAB simulation results of the 3D-CORDIC processor, with the same inputs as the FPGA Foundation simulation, are shown in table (5.1).

Figure (5.1) and figure (5.2) demonstrate the MATLAB simulation of the X, Y, Z convergence after 12 iterations for the first row of table (5.1). The initial conditions in the XY-plane are \(x_1 = 64.000\), \(y_1 = 71.000\) and \(\alpha_1 = 0.000\). After 12 iterations, the registers contain \(x_{12} = 157.409\), \(y_{12} = -0.092\) and \(\alpha_{12} = 48.002\), where the computed adjustment is \(K \approx 1.646\). Thus, the final solutions are \(\alpha = 48.002\) and radius \(R_{XY\text{-plane}} = x_{12} / K = 95.631\). The initial conditions in the ZR\(_{XY\text{-plane}}\)-plane are \(x_1 = z_1 = 64.000\), \(y_1 = R_{XY\text{-plane}} = 95.631\) and \(\gamma_1 = 0.000\). After 12 iterations, the registers contain \(x_{12} = 189.495\), \(y_{12} = 0.154\) and \(\gamma_{12} = 56.161\), so the final solutions are \(\gamma = 56.161\) and radius \(R_{final} = x_{12} / K = 115.125\).

The MATLAB v.7 simulation results of the 3D-CORDIC processor for the different input values are shown in table (5.1).

Table (5.1): MATLAB simulation results of the 3D-CORDIC processor

<table>
<thead>
<tr>
<th colspan="3">Input values</th>
<th colspan="3">Output values</th>
</tr>
<tr>
<th>Xin</th>
<th>Yin</th>
<th>Zin</th>
<th>Radius (R)</th>
<th>Angle (&alpha;)</th>
<th>Gamma (&gamma;)</th>
</tr>
</thead>
<tbody>
<tr>
<td>64.000</td>
<td>71.000</td>
<td>64.000</td>
<td>115.125</td>
<td>48.002</td>
<td>56.161</td>
</tr>
<tr>
<td>96.000</td>
<td>103.000</td>
<td>64.000</td>
<td>154.795</td>
<td>46.994</td>
<td>65.601</td>
</tr>
</tbody>
</table>

Figure (5.1): CORDIC inverse tangent convergence in the XY-plane
\[ X(1) = X_{in}, \quad Y(1) = Y_{in}, \quad X(12) = K \times R_{xy}, \quad Y(12) \approx 0, \quad \text{Angle}(12) = \text{Alfa} \]

Figure (5.2): CORDIC inverse tangent convergence in the ZR\(_{XY\text{-plane}}\)-plane
\[ X(1) = Z_{in}, \quad Y(1) = R_{xy}, \quad X(12) = K \times R_{final}, \quad Y(12) \approx 0, \quad \text{Angle}(12) = \text{Gamma} \]
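
The agreement between the two simulations can be quantified against a double-precision reference. The sketch below is an illustration only: the dictionary `reported` copies the FPGA outputs of Section 4 and the MATLAB outputs of table (5.1), under the assumption that both rows of the MATLAB table use the same inputs as the FPGA simulation. The function names are hypothetical, and angles are taken to be in degrees.

```python
import math

def exact_polar(x, y, z):
    """Double-precision reference: (R, alpha, gamma) with angles in degrees."""
    r_xy = math.hypot(x, y)
    return (math.sqrt(x * x + y * y + z * z),
            math.degrees(math.atan2(y, x)),
            math.degrees(math.atan2(r_xy, z)))

# (R, alpha, gamma) as reported in the text for each input (Xin, Yin, Zin)
reported = {
    (64.0, 71.0, 64.0): {"fpga": (115.625, 48.00, 56.50),
                         "matlab": (115.125, 48.002, 56.161)},
    (96.0, 103.0, 64.0): {"fpga": (155.375, 47.25, 65.75),
                          "matlab": (154.795, 46.994, 65.601)},
}

def worst_errors(source):
    """Largest absolute radius error and largest absolute angle error (degrees)."""
    r_err = a_err = 0.0
    for inputs, results in reported.items():
        r, alpha, gamma = exact_polar(*inputs)
        r_rep, alpha_rep, gamma_rep = results[source]
        r_err = max(r_err, abs(r_rep - r))
        a_err = max(a_err, abs(alpha_rep - alpha), abs(gamma_rep - gamma))
    return r_err, a_err

for source in ("fpga", "matlab"):
    print(source, worst_errors(source))
```

The FPGA figures are quantized to the 0.125 step of the 12-bit format, so a somewhat larger deviation than in the MATLAB run is to be expected.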

6. DISCUSSION AND CONCLUSION:

The 3D-CORDIC processor and reconfigurable hardware implementation on FPGA are two important and promising technologies for scientific research. In this paper we propose a 3D-CORDIC processor implemented using an FPGA. The 3D-CORDIC processor is implemented on a Xilinx Spartan-II chip (XC2S200E-5-PQ208C). Good agreement has been observed between the FPGA and MATLAB simulations. The 3D-CORDIC processor can be used in applications such as accurate position sensing of a robot arm, efficient control in satellite and guidance techniques, and angle calculation in a wireless LAN receiver. The results show that the 3D-CORDIC processor uses only 37% of the SLICEs and 52% of the IOBs, which means that the system can be upgraded to fit more user requirements without changing the technology or even the FPGA chip. The maximum clock frequency is 116 MHz, which is sufficiently high for many applications.

**REFERENCE:**


