In May 2020, we presented a compact heterogeneous computing embedded platform.
The platform is based on a Coldfire microcontroller and a Spartan-6 LX9 device, all on a two-layer board. Simple and effective. We also demonstrated two of the prototype systems live, showing the configuration of the FPGA from the microcontroller and then the seamless mapping of the FPGA registers into the microcontroller memory space.
Some time ago I wanted to test the capabilities of the PerseusCLE board. I created an expansion card that supports motor drivers for brushed DC or stepper motors, analog front ends, etc.
I always wanted to try to output a DVI/HDMI signal using TMDS, and I knew that my Spartan-6 device was capable of doing this. However, when I initially designed PerseusCLE, I did not think of trying this at all; I just wanted to strip down my bulky PerseusCFE into a more cost-effective solution.
What do CLE and CFE stand for anyway? Well, I started with CFE: Coldfire Full Edition.
This board had all the bells and whistles I wanted at the time. Dual switching power supplies (logic and motor power), a second crystal for the FPGA clock, SDRAM on the FPGA, Ethernet connectivity, USB connectivity, SD card, CAN bus, model servo PWM outputs and lots of Olimex UEXT connectors for UEXT modules. All on just a 2-layer PCB.
The board is large and I wanted something smaller and cheaper. Hence I decided to strip down many of the features of the Full Edition, creating the CLE: Coldfire Light Edition.
Features were reduced to a minimum: SD card, native USB only, no separate FPGA clock (it uses the same clock as the MCU), still many connectors, and a single switching power supply.
So, while designing the expansion board, I thought I would give it a try and add an HDMI connector with a crystal oscillator to provide the missing external clock to my FPGA. I tried to match the lengths of the TMDS signals from the FPGA to the expansion board, as I had not initially planned for equal signal lengths up to the PerseusCLE connectors. It wasn’t my intention to drive such high-speed signals back then. I had to use Excel: measure the length of each signal on the main board, calculate its actual total length, and add the corresponding missing length on the I/O board. Pretty challenging.
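The spreadsheet work boils down to a few lines of arithmetic. Here is a minimal Python sketch of the idea; the net names and lengths are made up for illustration and are not the real PerseusCLE measurements.

```python
# Hypothetical trace lengths in millimetres on the main board (FPGA pin to
# board connector). The real values came from measuring the PCB layout.
main_board_mm = {
    "TMDS_D0_P": 41.2, "TMDS_D0_N": 43.0,
    "TMDS_D1_P": 38.7, "TMDS_D1_N": 39.1,
    "TMDS_D2_P": 45.5, "TMDS_D2_N": 44.9,
    "TMDS_CLK_P": 40.0, "TMDS_CLK_N": 40.6,
}


def compensation_lengths(measured_mm, margin_mm=0.0):
    """Return the extra length each trace needs on the I/O board so that
    every trace ends up with the same total length (longest trace + margin)."""
    target = max(measured_mm.values()) + margin_mm
    return {name: round(target - length, 3)
            for name, length in measured_mm.items()}


extra = compensation_lengths(main_board_mm)
for name in sorted(extra):
    print(f"{name}: add {extra[name]:.1f} mm on the I/O board")
```

The longest trace gets no extra length, and every other trace is padded up to it. This only equalizes total length; it does not account for the flat cable or connector differences.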
You can find how DVI/HDMI works as a concept, along with a Verilog implementation, at FPGA4FUN. However, I am using VHDL, and searching the net I found various implementations, some from Xilinx and some derived from the work of Mike Field. I used a mix of the available sources. I liked this repo from drxzc. I also simulated Xilinx IP, such as the PLL and SERDES modules, with GHDL.
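For a feel of what those implementations do per pixel, here is a small Python model of the TMDS video data encoding as described in the DVI 1.0 specification. It is only a behavioural sketch for experimenting on a PC, not the VHDL used on the board, and it leaves out the control-period symbols that carry the sync signals.

```python
def tmds_encode(byte, disparity):
    """Encode one 8-bit value; return (10-bit symbol, new running disparity)."""
    d = [(byte >> i) & 1 for i in range(8)]
    ones = sum(d)
    # Stage 1: XOR or XNOR chain, chosen to minimise transitions.
    use_xnor = ones > 4 or (ones == 4 and d[0] == 0)
    q = [d[0]]
    for i in range(1, 8):
        bit = q[i - 1] ^ d[i]
        q.append(bit ^ 1 if use_xnor else bit)
    q8 = 0 if use_xnor else 1
    # Stage 2: optionally invert the data bits to keep the line DC balanced.
    n1 = sum(q)
    n0 = 8 - n1
    if disparity == 0 or n1 == n0:
        invert = q8 == 0
        disparity += (n0 - n1) if q8 == 0 else (n1 - n0)
    elif (disparity > 0 and n1 > n0) or (disparity < 0 and n0 > n1):
        invert = True
        disparity += 2 * q8 + (n0 - n1)
    else:
        invert = False
        disparity += -2 * (1 - q8) + (n1 - n0)
    data = [b ^ 1 for b in q] if invert else q
    bits = data + [q8, 1 if invert else 0]   # bit 8: XOR flag, bit 9: invert flag
    return sum(b << i for i, b in enumerate(bits)), disparity


def tmds_decode(symbol):
    """Inverse of tmds_encode for video data symbols."""
    bits = [(symbol >> i) & 1 for i in range(10)]
    q = [b ^ bits[9] for b in bits[:8]]      # undo the optional inversion
    d = [q[0]]
    for i in range(1, 8):
        bit = q[i] ^ q[i - 1]
        d.append(bit if bits[8] else bit ^ 1)
    return sum(b << i for i, b in enumerate(d))


disparity = 0
for value in range(256):
    symbol, disparity = tmds_encode(value, disparity)
    assert tmds_decode(symbol) == value
```

Each 8-bit colour value becomes a 10-bit symbol, which is why the serial bit rate is ten times the pixel clock.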
I was so anxious that I procrastinated on checking the actual hardware. After making the interconnections and verifying that the setup was probably good, I decided to give it a try.
Although I expected to fail, I hoped for the best. Everything was wrong. The TMDS signals had to pass through a simple flat cable that interconnects the boards. My 25MHz reference clock had to travel over wires back to the main board. To reduce the signal integrity effects, I used a low resolution of 640×480. For simplicity I added a simple pattern generator. The idea, if this worked, was to replace it with video memory that the microcontroller would write. The bit rate in the data lanes would be 10 times my 25MHz clock, giving 250Mbps per lane. This is where the TV shows say: “Don’t do this at home, experiment executed by experts”. Well, I would stick to the first part: “Don’t do this at home”; I see no expert around….
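The numbers are easy to check. The sketch below assumes the standard 640×480 at 60Hz timing totals (800 pixel clocks per line, 525 lines per frame), whose nominal pixel clock is 25.175MHz; with a 25MHz clock the refresh rate ends up slightly below 60Hz.

```python
# Standard 640x480@60 totals: active + front porch + sync + back porch.
H_TOTAL = 640 + 16 + 96 + 48   # 800 pixel clocks per line
V_TOTAL = 480 + 10 + 2 + 33    # 525 lines per frame
TMDS_BITS_PER_PIXEL = 10       # each 8-bit colour value is sent as 10 bits


def link_numbers(pixel_clock_hz):
    """Return (refresh rate in Hz, bit rate per TMDS data lane in bit/s)."""
    refresh_hz = pixel_clock_hz / (H_TOTAL * V_TOTAL)
    lane_bitrate = pixel_clock_hz * TMDS_BITS_PER_PIXEL
    return refresh_hz, lane_bitrate


refresh, bitrate = link_numbers(25_000_000)
print(f"refresh = {refresh:.2f} Hz, lane rate = {bitrate / 1e6:.0f} Mbps")
# prints "refresh = 59.52 Hz, lane rate = 250 Mbps"
```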
I put my FPGA configuration on my SD card and modified the COFILOS code to load this DVI configuration. I checked that my reference clock was running. My poor 100MHz DPO did not stand a good chance of capturing the high-speed data lanes of the serializer outputs.
When my full setup was up and running, I connected the HDMI cable… Silence. Excitement. Fear. Waiting to see the result. Nope, I needed to select the correct HDMI input on the television. OK. Let’s see. Oh!
It worked! Well, not as it should, but given the circumstances and the implementation I had to follow, I am more than happy. The next boards will be tailored to provide proper signal integrity and produce a clean signal.
I did a small redesign in my VHDL to make sure that the issue I was seeing was not related to internal FPGA timings: instead of driving the output with my test pattern generator, I tried driving a constant RGB value. Retrying this on another monitor, I got very similar results. I need more specialized hardware to drive it with proper signal integrity and clock signals. No surprise.
At a later time, I also tried to use the internal PLL to generate my clock frequencies. I was not happy with my external 25MHz clock running around. I also made some modifications to my VHDL code, as follows.
First I created generic inputs for the various VESA timings, so the design is now parametric. I also changed the color values to be zero during sync. To reduce timing issues in place and route, I also used registered outputs from the Test Pattern Generator.
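To illustrate what “parametric” means here, the following Python model mimics a timing generator whose timings are passed in as parameters, much like VHDL generics. The names are invented for this sketch, sync polarity is not modelled, and it forces the colour to zero over the whole blanking interval (which includes the sync pulses); that is an assumption about how the VHDL behaves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoTiming:
    """Plays the role of the VHDL generics: one instance per video mode."""
    h_active: int
    h_front: int
    h_sync: int
    h_back: int
    v_active: int
    v_front: int
    v_sync: int
    v_back: int

    @property
    def h_total(self):
        return self.h_active + self.h_front + self.h_sync + self.h_back

    @property
    def v_total(self):
        return self.v_active + self.v_front + self.v_sync + self.v_back


# Standard 640x480@60 timing values.
VGA_640x480 = VideoTiming(640, 16, 96, 48, 480, 10, 2, 33)


def pattern_rgb(x, y):
    """Simple test pattern: gradients derived from the pixel position."""
    return (x & 0xFF, y & 0xFF, (x ^ y) & 0xFF)


def generate(t, x, y):
    """Return (hsync, vsync, active, rgb) for counter position (x, y)."""
    h_sync_start = t.h_active + t.h_front
    v_sync_start = t.v_active + t.v_front
    hsync = h_sync_start <= x < h_sync_start + t.h_sync
    vsync = v_sync_start <= y < v_sync_start + t.v_sync
    active = x < t.h_active and y < t.v_active
    rgb = pattern_rgb(x, y) if active else (0, 0, 0)   # black while blanking
    return hsync, vsync, active, rgb
```

Switching to another video mode then only means creating another `VideoTiming` instance; the generator logic stays untouched.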
I started the experiments again, this time taking the clock from my MCU and creating the clock frequencies with the PLL, but I still got the same results.
As this setup behaved the same as the original configuration, I reverted to the external 25MHz clock. It seems that this worked after the last changes! I had my DVI output on my monitor. Sometimes fiddling with the HDMI cable would upset the stability of my signal, or maybe the clock signal travelling over loose wires was not stable enough for a good output; nevertheless, the proof of concept was complete.
It was really fun to work with SERDES and proprietary vendor IPs and see how they actually work. Getting into these details provides a good background for other applications.
Last Friday (2020-05-15), at the 7th ICT handshake organized by the University of Nicosia, we presented technologies that will be used for the MARI-Sense project. In this presentation we explained the function and design process of embedded systems and how these will be used to enhance processing at the edge (in Greek).
Hardware and firmware development is essential in the age of the Internet of Things or, to use the more traditional term, embedded systems. Recently, more and more processing must be performed closer to the physical locations where the sensors or IoT devices are, which is called edge processing. The traditional way of developing such systems is to use application processor systems running Linux.
Development of such products is fast thanks to the ecosystem around commercially available platforms, and proof-of-concept projects are easy to achieve. However, when someone tries to make the modifications needed to create a custom product, comply with certifications and turn it into a viable product, they may soon fall short, as:
There is not much control for customizing the core boards; Design from scratch is the only option if a single board is needed
The base hardware is complicated for the majority of applications
Highly skilled hardware engineers and sophisticated tools are needed
Cost for production of a custom featured PCB usually is much higher for individual production
Critical parts are hard to source in small quantities
Designs may not be efficient from power or performance perspective
So many times we are obliged to swap our mainstream microcontroller for a higher-end one because of its limitations, or to add external logic and circuits to accommodate a richer input-output architecture.
Wouldn’t it be great to have a uniform
polymorphic platform that can scale easily to work with for the majority of
Another aspect that is considered is
design verification. Embedded systems usually need to have real-time
performance, thus classic debugging (step-through) under real-time conditions
is not always possible or is an additional challenge. Stack checking on RTOS or
timings are not easily observed with accuracy without an impact on performance.
Wouldn’t it be great to have a much easier time debugging embedded systems?
Microcontrollers offer a small-footprint system with a high level of integration (memories, peripherals, etc.), but sometimes the internal peripherals or the processing capacity are not adequate to tackle more demanding applications. FPGAs, on the other hand, are more flexible and capable, but they are usually not the best fit for control flows and require more knowledge to develop for. Edge processing sometimes requires higher processing capacity at lower power consumption.
We created an embedded platform, with its firmware ecosystem, that allows fast application development without compromising the later steps toward final production. The PerseusCLE was built to combine the benefits of both microcontrollers and FPGAs.
This platform provides the following key features:
Simple 2 or 4 layer PCB, which a medium-skilled engineer can modify
Programmable Hardware to create custom peripherals and interfaces
A firmware framework that allows fast development in C language
A compact and extensible platform
Support of External Hardware parts for specific interfaces (motors, servos etc)
Wide range 9-36V AC/DC supply voltage
MCF52258 Coldfire @48MHz, 512KB Flash, 64KB RAM
Spartan-6 XC6SLX9 FPGA @48MHz
24MB/sec Link between MCU-FPGA, memory mapped
RTOS based design framework
Developed with and supporting TDD or Unit Testing
Olimex UEXT Connectors for external modules
The platform allows companies or individuals to develop prototypes fast while providing easier migration to a final product, with parts and designs that a medium-skilled engineer can work with.
The programmable hardware provides one more level of expansion, and thus a richer peripheral set than the ones included in off-the-shelf microcontrollers.
This is a simple universal design that integrates a microcontroller
and an FPGA in order to perform hardware acceleration, functional and I/O
expansion, or design verification of an embedded system, using a minimum amount
The two major parts are interlinked with a high-speed connection that maps the FPGA into the microcontroller’s memory space. This gives programming simplicity for the firmware, achieves high-speed transfers, and allows use of the internal MCU DMA. The result is a two-chip solution on a simple two-layer PCB, which keeps cost low at low production quantities.
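The effect of memory mapping can be shown with a toy model: the firmware issues ordinary reads and writes, and an address decoder routes some of them to the FPGA instead of RAM. The base address, window size and register offset below are hypothetical and do not describe the real PerseusCLE memory map.

```python
import struct

FPGA_BASE = 0x8000_0000   # hypothetical base address of the FPGA window
FPGA_SIZE = 0x100         # hypothetical window size in bytes
REG_PWM_DUTY = 0x10       # hypothetical 16-bit register offset


class Bus:
    """Toy address decoder: one address space, two targets."""

    def __init__(self):
        self.ram = bytearray(64 * 1024)    # RAM assumed to start at address 0
        self.fpga = bytearray(FPGA_SIZE)   # stands in for the FPGA registers

    def _target(self, addr):
        if FPGA_BASE <= addr < FPGA_BASE + FPGA_SIZE:
            return self.fpga, addr - FPGA_BASE
        return self.ram, addr

    def write16(self, addr, value):
        mem, offset = self._target(addr)
        # Big-endian byte order, as on the Coldfire.
        struct.pack_into(">H", mem, offset, value & 0xFFFF)

    def read16(self, addr):
        mem, offset = self._target(addr)
        return struct.unpack_from(">H", mem, offset)[0]


bus = Bus()
bus.write16(FPGA_BASE + REG_PWM_DUTY, 0x1234)   # lands in the FPGA window
bus.write16(0x0100, 0xBEEF)                     # lands in RAM
print(hex(bus.read16(FPGA_BASE + REG_PWM_DUTY)))
```

From the firmware's point of view both writes look the same, which is what makes an FPGA register as easy to use as a variable and lets a DMA controller move data to it.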
To support the potential uses of the hardware a firmware stack is also provided, that enables programming the FPGA configuration from the SD card, and provides the access methods to the FPGA side along with drivers for Serial, USB, SD, FATFS etc. The stack is based on a modified version of FunkOS RTOS, and the extensions provided are developed using TDD to ensure high quality and reliability of the system.
Using FPGAs moves the programmable barrier lower in the
layer stack of a product.
On the left side we see the traditional CPU stack. If we upgrade the CPU, we need to change the Driver/OS layer to fit the new CPU/Hardware.
On the right side the FPGA device is replaced. We need to “recompile” the FPGA-Logic (Program) to the new device. Driver/OS do not need to change!
The FPGA VHDL simulation flow uses UVVM and OSVVM to support a solid verification strategy. Along with the hardware, we provide source code with examples for the microcontroller and an example VHDL interface.
This platform can be used very effectively for the following
As the hardware is flexible, this platform is ideal for controlling multiple motors and
acquiring data from multiple sensors. The MCU
can be off-loaded from low-level motor driving and concentrate on the main
control system. The FPGA can handle the low-level functions along with the
sensor fusion for multiple sources (e.g. a camera).
Having a platform that can handle more motors can create a more
capable 3D printer or even a 3D printer in combination with a lathe. Again the
high level functions can run on the controller while the FPGA keeps track of
the precision in time.
Video signals are no longer analog; they have moved to high-speed
digital interfaces. The Spartan-6 series can handle these and implement video input or
output generators (or a combination thereof), while the microcontroller can
handle the content (e.g. transfer it through USB). No more complex high-frequency
CPU arrangements are required.
As the FPGA can offer a high degree of parallelization, applications that require a high number of parallel units or hardware acceleration are good candidates for this platform. For example, this platform is going to be used in the MARI-Sense project for signal classification at the edge.
Controlling a diverse set of actuators and sensors has never been
easier. Controlling stepper motors, LED arrays or BLDC motors, or gathering sensor data
from different sources, can be done easily. The FPGA can include the low-level
control and pre-processing logic, while the MCU handles the control and
connectivity side of the application.
Embedded Design Verification
For many applications an FPGA is an overkill device to have. However, if you use the FPGA to capture processor data, you can test the real-time embedded firmware without any performance impact. For example, stack checking in hardware is very efficient and accurate. So you can use the combined system to trace events, check stacks and examine any other aspect of your embedded system before you deploy it, and gain more confidence in the quality of your product.
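As a rough illustration of the stack-checking idea, the following Python model plays the part of a hardware monitor that watches stack pointer values and records the high-water mark. How the FPGA actually observes the processor is not described here, so the interface, addresses and sizes are purely hypothetical.

```python
class StackMonitor:
    """Tracks the deepest point reached by a downward-growing stack."""

    def __init__(self, base, size):
        self.limit = base            # lowest legal stack address
        self.top = base + size       # initial stack pointer value
        self.lowest_sp = self.top
        self.overflow = False

    def observe(self, sp):
        """Called for every stack pointer value the monitor captures."""
        if sp < self.lowest_sp:
            self.lowest_sp = sp
        if sp < self.limit:
            self.overflow = True     # sticky flag, like a latched status bit

    @property
    def peak_usage(self):
        return self.top - self.lowest_sp


# Hypothetical task stack: 512 bytes starting at 0x2000_1000.
monitor = StackMonitor(base=0x2000_1000, size=512)
for sp in (0x2000_11F0, 0x2000_1180, 0x2000_11C0, 0x2000_10A0):
    monitor.observe(sp)
print(monitor.peak_usage, monitor.overflow)
```

Because the monitor only listens, the firmware runs at full speed while the worst-case stack depth is recorded; the usual software alternative of filling the stack with a pattern and scanning it costs processor time.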
Well, why should I use this platform
when I can get similar setups from the FPGA vendors? I can get single or dual
ARM cores along with a larger amount of available logic.
This is true; however, these solutions are microprocessor based and not microcontroller based. The devices are huge BGAs, and they still need a lot of peripherals to make them work (SDRAM, Flash, etc.). Ours is a two-chip solution (MCU and FPGA) which is more compact, less power hungry, and within design reach of a small or medium-sized company.
What is the advantage over a
microcontroller-based solution that contains logic?
With an external FPGA device, your solution is not bound to a specific microcontroller or FPGA device. The split architecture allows more flexibility. For example, you can scale up capacity using the same footprint (just replace the FPGA with a higher-capacity one), or you may decide you need another processor (e.g. Coldfire or Kinetis) that supports the same inter-chip interface.
If you are interested in learning more, send us a message.
Hardware design starts from the requirements document. Then a detailed specification is created, which drives the schematic drawing phase. An often neglected part is the selection of parts to accomplish an electronic design. Selecting parts with high availability and short lead times is an important task. Another aspect is selecting the component packages, as these affect the board (PCB) area and the assembly process. A denser PCB may require a higher manufacturing class and thus increase production costs. A good understanding of the PCBA production phases is required to have an efficient design that can be produced smoothly in quantity without disruptions.
Producing is not the end of the line though. Testing is another major factor which should be considered. There are many test strategies employed depending on technology used, production quantities and yield.
Understanding the above issues explains why there are many failures in kick-start type projects when it comes to hardware. Make the wrong decisions and you may end up well over your estimated cost and time budget.
We offer many years of expertise in embedded systems hardware design, from specifications down to product support. Our portfolio of work includes DC motors, sensing, microcontrollers and FPGAs.