JeVois: Open-Source Quad-Core Smart Machine Vision Camera


About this project

Open-source machine vision finally ready for prime-time in all your projects!
JeVois = video sensor + quad-core CPU + USB video + serial port, all in a tiny, self-contained package (28 cc or 1.7 cubic inches). Insert a microSD card loaded with the provided open-source machine vision algorithms, connect to your desktop, laptop, and/or Arduino, and give your projects the sense of sight immediately.
JeVois started as an educational project, to encourage the study of machine vision, computational neuroscience, and machine learning as part of introductory programming and robotics courses at all levels (from K-12 to Ph.D.). At present, these courses often lack a machine vision component. We believe this is mainly because there is no simple machine vision device that can be used together with the Raspberry Pi, Arduino, or similar devices featured in these courses. JeVois aims to fill this gap by providing a self-contained, configurable machine vision engine that can deliver both visual outputs showing how it is analyzing what it sees (useful for understanding the algorithms) and text outputs over a serial link describing what it has found (useful for driving a micro-controller that controls a robot).

Overview

The JeVois framework operates as follows: video is captured from the camera sensor, processed on the fly through some machine vision algorithm directly on the camera's own processor, and the results are streamed over USB to a host computer and/or over serial to a micro-controller.
To the host computer, the JeVois smart camera is just another USB camera. Different vision algorithms are selected by changing the USB camera resolution and frame rate. Users or machines can also interact with the JeVois smart camera, change its settings, or listen for text-based vision outputs over a serial link (both hardware serial and serial-over-USB are supported).
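For example, a host program can select a vision mode simply by requesting the matching resolution and frame rate. Here is a minimal sketch using OpenCV on a Linux host; the device index and the assumption that 640x300@60fps maps to the visual attention demo (see Host Computer Requirements below) are illustrative:

    #include <opencv2/opencv.hpp>

    int main()
    {
        // JeVois enumerates as a standard UVC webcam (device index 0 assumed)
        cv::VideoCapture cap(0);

        // Requesting 640x300 @ 60 fps selects the vision module mapped to that mode
        cap.set(cv::CAP_PROP_FRAME_WIDTH, 640);
        cap.set(cv::CAP_PROP_FRAME_HEIGHT, 300);
        cap.set(cv::CAP_PROP_FPS, 60);

        cv::Mat frame;
        while (cap.read(frame))               // grab the processed frames streamed by JeVois
        {
            cv::imshow("JeVois output", frame);
            if (cv::waitKey(1) == 27) break;  // press Esc to quit
        }
        return 0;
    }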
Three major modes of operation:
  • Demo/development mode: the smart camera outputs a demo display over USB that shows the results of its analysis, possibly along with simple results communicated over serial port (e.g., coordinates and content of any QRcode that has been identified).
  • Text-only mode: the smart camera provides no USB output, only text strings, for example commands for a pan/tilt controller (see the Arduino sketch after this list).
  • Pre-processing mode: the smart camera outputs video that is intended for machine consumption, for example an edge map computed over the video frames captured by the camera sensor, or a set of image crops around the three most interesting objects in the scene. This video can then be further processed by the host computer, for example using a massive deep neural network running on a cluster of high-power GPUs to recognize the three objects that the smart camera has detected. Text outputs over serial are of course also possible in this mode.
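To make the text-only mode concrete, here is a rough Arduino sketch for the pan/tilt example. The message format is an assumption for illustration (a hypothetical line of the form "T <x> <y>" with coordinates in -1000..1000, and an assumed 115200 baud link; consult the JeVois documentation for the actual serial messages):

    #include <Servo.h>

    Servo panServo, tiltServo;

    void setup() {
      Serial.begin(115200);    // serial link to JeVois (baud rate assumed)
      panServo.attach(9);      // pan servo on pin 9
      tiltServo.attach(10);    // tilt servo on pin 10
    }

    void loop() {
      if (Serial.available()) {
        // Hypothetical target message: "T <x> <y>" with x, y in -1000..1000
        String line = Serial.readStringUntil('\n');
        if (line.startsWith("T ")) {
          int sep = line.indexOf(' ', 2);
          int x = line.substring(2, sep).toInt();
          int y = line.substring(sep + 1).toInt();
          panServo.write(map(x, -1000, 1000, 0, 180));   // map to servo angles
          tiltServo.write(map(y, -1000, 1000, 0, 180));
        }
      }
    }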
But, really, anything is possible, since the whole JeVois software framework is open-source.

Hardware Specs

At a glance: [specifications table]
A brief tour: [annotated hardware photos]
The JeVois smart camera is a complete Linux computer. It can run on its own, without a host PC. Thanks to the cooling fan, it can run under full processing load without overheating and while maintaining a constant 1.34 GHz CPU speed.

Software Specs

A few examples (watch the video for more). All results shown below are computed on the JeVois camera itself. The smart camera does all the work, including image capture, vision processing, and creating the demo displays. The host computer only runs standard video camera software (e.g., guvcview on a Linux host) to display the smart camera's results.
Visual Attention detects interesting things in the world
In the above, the algorithm runs at 73.1 fps on the smart camera's processor, i.e., it takes 13.68 ms to process one video frame. The camera sensor captures frames at 60 fps (16.67 ms per frame), which sets the video streaming speed and leaves roughly 3 ms of idle time per frame (16.67 - 13.68 ms) in which additional algorithms could run. Also note that this algorithm does not fully load the CPU (148.7% load, where 400% would correspond to fully loading all 4 CPU cores), so additional algorithms could also run in parallel with visual attention.
Augmented reality markers (ArUco)

Object detection and recognition

Road detection for autonomous driving

120 Hz eye tracking (camera streams at 120 fps, processing runs at 200+ fps)
See many more examples in the campaign video.
This is an open-source project. Anyone can contribute. Which vision algorithms will you create?

Host Computer Requirements

The JeVois smart camera can work as a standalone computer, with no USB video streaming. In that case, one typically streams commands to an Arduino or similar device over the serial port, and all you need is to provide power to the JeVois camera's mini-USB connector.
For video streaming over USB:
Linux (including Raspberry Pi): Works out of the box, no drivers needed, all functionality is available. You can switch between different vision processing modes on the fly at runtime, by selecting different camera resolutions and frame rates on your host computer. For example, 640x300@60fps may run the visual attention algorithm, while 320x256@60fps may run the QRcode detection algorithm. A configuration file on the microSD card establishes the mapping between USB video resolution and frame rate, camera sensor resolution and frame rate, and vision algorithm to run.
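To illustrate, each line of that configuration file pairs one USB output mode with a camera sensor mode and the vision module to run. A schematic excerpt consistent with the modes above (treat the exact syntax, sensor modes, and module names as illustrative; the authoritative format ships with the JeVois software):

    # USB output mode      camera sensor mode     vision module to run
    YUYV 640 300 60.0      YUYV 320 240 60.0      JeVois DemoSaliency
    YUYV 320 256 60.0      YUYV 320 240 60.0      JeVois DemoQRcode

Commenting out all lines but one is also the single-algorithm workaround mentioned next for Windows and Mac OS X hosts.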
Windows, Mac OS X: Streaming works, but selecting among the available video resolutions has become increasingly difficult with newer versions of Windows and OS X: these operating systems seem not to let you choose a camera resolution or frame rate. That choice, however, is how one switches on the fly between different vision algorithms on the JeVois camera. One workaround is to configure your JeVois smart camera with only one vision algorithm (by commenting out the entries for other resolutions in the configuration file on the microSD card); the host computer then has no choice but to use that one.
Android: The camera is detected but streaming video does not work yet. We are working on this and we suspect this has to do with deviations from the USB video class (UVC) standard on the Android side. For example, a USB packet sniffer has revealed that the Android device insists on querying camera controls that were not declared as being supported by the JeVois hardware.
iOS: The JeVois smart camera, like every other USB camera we have tried, is currently reported as an unsupported device. This may change in the future.

Developing for JeVois

Everything is cross-compiled on a host computer; at present, only Linux is supported for compiling JeVois software. JeVois software is written in C++17, with a few low-level algorithms in optimized C. We use CMake to manage the build process, which lets you build for your host computer and for the JeVois hardware at the same time. This is very useful during development, since you can test your algorithms using any USB webcam and observe the results in a window on your screen. You can also use your host computer to train a deep neural network quickly, and then just load the weights onto your microSD card. The JeVois operating system is Linux, built and configured using buildroot.
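To give a feel for module development, here is a rough sketch of a minimal pass-through vision module. Class and function names follow the structure of the JeVois framework's documentation, but treat the details as an approximation rather than the exact API:

    #include <jevois/Core/Module.H>
    #include <jevois/Image/RawImageOps.H>

    // A minimal module: copies camera frames to the USB output and
    // reports the frame size over the serial port(s).
    class MiniDemo : public jevois::Module
    {
      public:
        using jevois::Module::Module;   // inherit the standard constructor
        virtual ~MiniDemo() { }

        virtual void process(jevois::InputFrame && inframe,
                             jevois::OutputFrame && outframe) override
        {
          jevois::RawImage inimg = inframe.get();    // frame from the camera sensor
          jevois::RawImage outimg = outframe.get();  // buffer to be streamed over USB
          outimg.require("output", inimg.width, inimg.height, inimg.fmt);
          jevois::rawimage::paste(inimg, outimg, 0, 0);  // copy input to output
          inframe.done();                                // release the camera buffer

          // Text output over serial:
          sendSerial("frame " + std::to_string(outimg.width) + 'x'
                     + std::to_string(outimg.height));
          outframe.send();                               // stream the result over USB
        }
    };

    JEVOIS_REGISTER_MODULE(MiniDemo);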
Scripts are provided to automate the process:
  • rebuild-host.sh - compiles the entire JeVois software suite natively on a host computer; you can then use any USB webcam and it displays outputs in a window on your screen.
  • rebuild-platform.sh - cross-compiles the entire JeVois software suite for the smart camera hardware.
  • jevois-build.sh - flashes the cross-compiled results from rebuild-platform.sh to a microSD card.

Contributions & novelty

In principle, you could assemble a working machine vision system from existing embedded computer boards, adding a camera and programming a USB output. In fact, this is how we started... er, two years ago!
The hurdles we encountered and solved in creating JeVois are as follows:
Linux support for camera chips is often very limited. We wrote our own highly efficient kernel driver and figured out how to configure undocumented camera chip registers to support all the controls available on the sensor (exposure, frame rate, etc.). Most camera drivers in the Linux kernel support only one or a few resolutions and frame rates, and many do not support controls such as manual exposure. You could use a USB camera, for which support for the controls has been standardized, but few support 60 or 120 frames/s, and they would increase the overall size, weight, power consumption, and cost of the system. Our kernel camera driver exploits hardware that is integrated into our CPU chip for direct image capture from a camera sensor. This is faster, provides lower latency (time between image capture and start of processing), and uses fewer CPU resources than attaching a USB camera sensor.
Linux support for the device-side of video streaming over USB is virtually non-existent (in open source). This is software that you need to run on the smart camera's CPU to make it appear as if it was a USB camera to a connected host computer. A webcam gadget module has been present in the Linux kernel source tree for several years, but its functionality is very limited, and, as we discovered, its very core logic is broken. We developed a fully working device-side kernel USB Video Class driver, with highly efficient data streaming and pass-through support for video resolutions, pixel formats, frame rates, and camera controls (when users change those on the host computer, the changes are relayed to the camera sensor chip).
Many small embedded computer boards exist, but software support is often limited and quality is sometimes low. We had to fix kernel USB drivers, kernel GPU drivers, kernel camera drivers, and many other elements in the Linux kernel to make everything work flawlessly while supporting all the features listed above. Small embedded boards are also no match for our solution in terms of total system size (including camera, connectors, and fan), power consumption, speed and video latency (delay between capturing an image and presenting the processing results to the host computer), and out-of-the-box enjoyment and reliability. For example, one has to face the reality that a quad-core chip will simply overheat under load if it does not have a fan or a very, very large heatsink.
We employed special design and manufacturing techniques that enabled us to deliver a complete, self-contained system in a tiny form factor. For example, we run our DDR3 memory at its full rated clock speed (DDR3-1600), while other ARM boards we have surveyed sometimes run theirs at just half that speed. We achieved this through extremely detailed circuit board design and optimization, and through the use of custom design, simulation, and fabrication tools. It makes a big difference for many vision processing algorithms (about 15% faster).
In summary, because we are developing the hardware, software, and mechanical design jointly, we are able to deliver a highly optimized, plug-and-play, deeply satisfying solution that just does not exist anywhere else today.

Long-Term Vision

With JeVois we aim to enable everyone to use machine vision in embedded, IoT, robotics, and educational projects. While this campaign is to help us bootstrap the mass manufacturing of the hardware, we continue to work on the following open-source software aspects:
  • We are gearing up to provide a repository of machine vision modules shared by the community: the equivalent of an app store for JeVois. It is expected to come online by February, before the devices start shipping.
  • We are developing a curriculum of activities around the JeVois camera for all levels (from kindergarten to Ph.D.). Activities will range from the entry level (just point the camera at something and see whether it can identify it), to learning how the underlying machine vision algorithms work, to modifying existing algorithms, and finally to learning vision theory and developing your own algorithms. This is expected to become available in April. We will leverage significant material we have developed by teaching robotics, vision, and artificial intelligence courses at the undergraduate and graduate level over the past 15 years.
  • Our overall mission is educational and outreach-oriented. We hope that JeVois will help us and others translate the latest vision research results into working machines. JeVois will be integrated into our outreach activities, which include working with neighboring K-12 schools, our Robotics Open House program, and more. We already have a small budget available for outreach, which includes loaning and donating some JeVois hardware to interested students. Details will be available in February.

Comparison to the Raspberry Pi 3 Model B

[comparison table]

Colors

Choose your colors in the backer survey after you place your pledge. Note that with the lighter colors, the LED will shine through the plastic case and create a small halo around the LED hole in the case. You may want this for better visibility, or you may find it distracting. The color that makes the most sense from a machine vision standpoint is black.

Thanks

This campaign is essentially to fund a production batch large enough to achieve low mass-production costs. Please help us spread the word if you like this project. Thanks!
The science behind JeVois was made possible in part by research grants from the National Science Foundation (NSF) and the Defense Advanced Research Projects Agency (DARPA).
The views, opinions, and findings contained in this presentation are those of the authors and should not be interpreted as representing the official views or policies, either expressed or implied, of the United States government or any agency thereof.
Raspberry Pi is a trademark of the Raspberry Pi Foundation. Arduino is a trademark of Arduino LLC. Windows is a trademark of Microsoft Corporation in the United States and/or other countries. Mac, OS X and iOS are trademarks of Apple Inc., registered in the U.S. and other countries. Android is a trademark of Google Inc.


Risks and challenges

We just received CE, FCC, and RoHS certification. We also received our first samples from our injection molding partner and they are great (shown in the video).
The main remaining challenges are mass sourcing of the components (i.e., avoiding shortages of some parts) and possible export or customs issues when shipping to foreign countries. We are addressing these by securing multiple component sources and by working with international shipping experts.
