.. _expertise:

Expertise and services
========================

Refer to the homepage for my :ref:`work experiences and business verticals/horizontals `.

High-throughput imaging system POCs and MVPs
----------------------------------------------

I build end-to-end **giga-pixel, high-throughput imaging systems** for research
labs and early-stage startups. Conventional R&D pipelines advocate "make it
work, make it right, then make it fast". End-to-end design projects instead
require a concurrent R&D workflow, in which the entire technical stack is
developed jointly to ensure timely project delivery.

.. figure:: attached/96eyes-data-flow.png

   An overview of a typical high-throughput, giga-pixel imaging system as a
   **proof of concept (POC)**. (a) Plate loading and image acquisition steps
   overlap in time. (b-c) High-speed data links stream pixels from 96 cameras
   to five GPU cards. (d) Front view of the 96 Eyes hardware, version 1.

.. figure:: attached/96eyes-mvp.jpg
   :width: 50%

   **Minimum viable product (MVP)** for the client. Compared to version 1, the
   entire instrument is rebuilt with aluminum enclosures. It also comes with
   more robust motion and thermal control electronics, mechanical interlocks,
   laser safety housing, and remote control interfaces. It is also equipped
   with auxiliary nano-positioning stages for self-calibration of consumables.

Design and prototyping capabilities, assuming a cost-plus contract:

- **Physical layer:** widefield fluorescence, structured-illumination, or light-sheet imaging setups, chosen according to the use case.
- **Analog layer:** sourcing CMOS cameras, avalanche photodiodes, and/or illumination modules; thermal control via Peltier modules and custom heatsinks; motion PID control with custom PCBs.
- **Digital layer:** AVR microcontroller units for low-latency instrument control; Nvidia Jetson system-on-board (SoB) for real-time embedded computer vision.
  Multi-GPU compute workstations for hardware-accelerated image signal processing.
- **Tooling and DevOps:** Ansible for continuous deployment (CD); Buildbot for continuous integration (CI).
- **Storage layer:** ZFS for near-instant file garbage collection; EXT4 for peak read/write throughput; HDF5 as the high-throughput data container; OME-TIFF for long-term storage.
- **Compute layer:** instrument control software/firmware in either Go/C++ or Python/C++, depending on the pixel throughput; multi-threaded CPU or single/multi-GPU accelerated algorithms for image reconstruction.
- **Presentation layer:** Terminal UI for factory-floor QC testing; WebUI for z-stack image dashboards; or OpenGL/GLFW UI for real-time volume rendering.

.. note::

   Learn about the Amgen-Caltech project from:
   **A.C.S. Chan**, J Kim, A Pan, H Xu, D Nojima, C Hale, S Wang, C Yang,
   `"Parallel Fourier ptychographic microscopy for high-throughput screening with 96 cameras (96 Eyes)" `_
   *Scientific Reports* **9**, 11114 (2019).

.. note::

   Also, read the whitepaper on the multidisciplinary collaboration aspects
   that facilitate a timely MVP launch: https://arxiv.org/html/2508.18512

System-level design and architecture review
---------------------------------------------

.. figure:: attached/design-thinking.png
   :width: 100%

I offer services for system requirement capture and design, tracing the
hardware/software design decisions back to the original design requirements
and constraints. The deliverable can be a design-input tracking database hosted
on the premises, or a static requirement analysis report with a simplified
schema.

Read more about my work on :ref:`end-to-end design input capture and analysis ` for early-stage startups (i.e. 5 to 30 stakeholders).

Cross-compilation and multi-platform support, software only, C++17
--------------------------------------------------------------------

Linux high-performance computing cluster/workstations
#########################################################

.. figure:: attached/gpu-cooling.jpg
   :width: 70%

   Custom-designed Linux workstation. (Left) Custom GPU liquid cooling
   heatsinks from OEMs. (Right) Fully assembled multi-GPU workstation for
   high-throughput image processing. Photo courtesy of Bitspower and Caltech.

I primarily write multi-threaded, GPU-accelerated scientific code with the
Clang/LLVM toolchain. I also offer a material sourcing service for
high-performance computing (HPC) workstations with server-grade dual-CPU
motherboards, USB/Ethernet hubs, and multiple GPU cards.

Read my code for `multi-GPU workstations `_.

Nvidia GPU Jetson system-on-board (SoB)
#########################################

I offer Ansible-assisted, over-the-air (OTA) application deployment on top of
the Nvidia JetPack Linux base image. Compared to Yocto's
*build-everything-from-scratch* approach, the Ansible approach ensures minimal
wear and tear of the boot sector in the UFS/eMMC chips. Camera MIPI/USBVision
interface integration in userspace is also supported, as long as the
kernelspace drivers are already licensed from the OEMs to the clients. Example
camera OEMs include ArduCAM and Allied Vision.

Windows 10/WSL native C++/Golang applications, with WebGUI
############################################################

I provide cross-compilation support and DevOps for testing C++/Golang
applications on virtual machines hosted on Windows (i.e. Windows Subsystem for
Linux, WSL), and for continuous deployment of end-user applications to Windows
10/11. I specialize in C++ function name mangling to call MSVC-style symbols
from the GNU/MinGW64 runtime, getting the best of both worlds.

Xilinx/AMD FPGA system-on-chip
###################################

.. figure:: attached/ebaz4205.png

I customize Yocto build system directory structures to integrate user-provided
Vivado hardware designs (e.g. as a black box). I also offer Meson build system
integration to facilitate userspace application testing, both off-target (i.e.
no QEMU) and on-target.
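As a minimal sketch of what off-target testing can look like (an illustration only, not code from any client project): the application logic talks to the FPGA registers through an interface, so the same logic can be exercised on the build host with an in-memory fake. The names ``RegisterBus``, ``FakeBus``, ``start_exposure``, ``off_target_check``, and the register offsets are all hypothetical.

```cpp
// Sketch only: the interface, the fake, and the register map below are
// hypothetical and do not correspond to any real Vivado hardware design.
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Abstract access to the memory-mapped registers of the FPGA fabric.
struct RegisterBus {
    virtual ~RegisterBus() = default;
    virtual std::uint32_t read(std::size_t offset) const = 0;
    virtual void write(std::size_t offset, std::uint32_t value) = 0;
};

// Off-target stand-in: registers live in a hash map on the build host.
// Registers that were never written read back as zero.
class FakeBus final : public RegisterBus {
  public:
    std::uint32_t read(std::size_t offset) const override {
        const auto it = regs_.find(offset);
        return it == regs_.end() ? 0u : it->second;
    }
    void write(std::size_t offset, std::uint32_t value) override {
        regs_[offset] = value;
    }

  private:
    std::unordered_map<std::size_t, std::uint32_t> regs_;
};

// Hypothetical register map.
constexpr std::size_t kCtrlReg = 0x00;
constexpr std::size_t kExposureReg = 0x04;
constexpr std::uint32_t kStartBit = 1u << 0;

// Application logic under test: it depends only on the interface.
inline void start_exposure(RegisterBus& bus, std::uint32_t exposure_ticks) {
    bus.write(kExposureReg, exposure_ticks);
    // Read-modify-write so that unrelated control bits are preserved.
    bus.write(kCtrlReg, bus.read(kCtrlReg) | kStartBit);
}

// Off-target check: runs on the build host, without QEMU or hardware.
inline bool off_target_check() {
    FakeBus bus;
    bus.write(kCtrlReg, 0x10u);  // an unrelated control bit set beforehand
    start_exposure(bus, 1000u);
    return bus.read(kExposureReg) == 1000u && bus.read(kCtrlReg) == 0x11u;
}
```

On-target, a second implementation of the same interface would wrap the registers exposed by the actual hardware design; that part depends on the specific design and is omitted here.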
Android support, native boot sector
#########################################

I no longer provide this service because of the cost-prohibitive EULAs and
licenses that come with Snapdragon/Tensor hardware development kits (HDKs).
Exceptions can be made if the customer has already obtained licenses from the
SoC vendors.

Learn about my contributions to `Build systems `_

GPU acceleration and domain-specific language design for scientists
----------------------------------------------------------------------

.. figure:: attached/proximal-banner.png

The hardware-accelerated compute landscape is more fragmented than ever,
resulting in vendor lock-in of user algorithms. We are now witnessing CPUs
(equipped with 512-bit SIMD units) out-competing mainstream GPUs on specific
algorithm pipelines. Even for GPUs from the same vendor, the micro-architecture
changes drastically from one generation (e.g. Nvidia Maxwell) to another (e.g.
Volta).

Therefore, instead of writing multiple versions of static, hand-optimized (CUDA
or C++) code targeting specific system-on-chips (SoCs), it makes more sense to
design high-throughput image processing algorithms in a portable language with
zero-cost abstractions, similar to the decoupling between computer-aided design
(CAD) and computer-aided manufacturing (CAM).

Read more about my work on :ref:`Imaging problem formulation language `.

Low-latency computing, C++17/C++20
-----------------------------------

.. figure:: attached/cpp20-demo.jpg

   (Left) Procedural programming style, versus (Right) declarative programming
   style enabled by C++17/C++20 features.

I specialize in bare-metal embedded system programming in C++17/C++20, with
stack-space optimization, on microcontroller unit (MCU) and CPU architectures
ranging from AVR to ARMv8/NEON. I also offer tutorials that equip founding
scientists/engineers for static-type-driven development (i.e.
parse, don't validate) and behavior-driven development (BDD) for early
algorithms with a low-latency mindset.

Read my contribution of the security-hardened smartphone camera to :ref:`Adobe Content Authenticity Initiative (CAI) `.

Illumination/imaging optics design
------------------------------------

.. figure:: attached/plastic-molded-lens.png

Optical system design, as a business, has become very costly to run over the
past few years for freelance CDM (contract development and manufacturing)
providers, especially without purchasing authority and a host lab to support
the R&D. Nowadays, I mostly compose/evaluate optical schematics, and verify
them against OEM-provided specifications.

If you know anyone who wishes to ship an instrument on loan to my office, I am
more than willing to take on the work. Please feel free to contact me on
`LinkedIn `_.

Read more about my projects :ref:`here `.

Instrument control PCB design
--------------------------------

.. figure:: https://github.com/antonysigma/piezo-stage-pid-board/raw/master/preview.jpg

I used to build mixed-signal control PCBs for a living. Recently, I stopped
offering this service amid rising BOM and PCBA fabrication costs. However, I am
willing to take it up again if my clients offer a cost-plus contract. My
specialties:

- Single-axis motion control with PID feedback, implemented via biquads tuned from Z-transformed designs.
- Biomedical signal preconditioning/amplification with second-order op-amp filters.
- Analog PLL circuit design, GHz input bandwidth, ~10 MHz clock output.
- Peltier thermal control.
- Adapter boards for Zynq-7000 and/or Nvidia Jetson system-on-boards (SoBs).

Read my :ref:`portfolio here `.

Technical illustration
-------------------------

.. figure:: https://media.springernature.com/full/springer-static/image/art%3A10.1038%2Fs41598-019-47146-z/MediaObjects/41598_2019_47146_Fig1_HTML.png?as=webp
   :width: 80%

Nowadays, I rarely offer technical illustration services except to friends at
work; it is no longer a profitable service sector given the low hourly wage and
the emergence of text-to-figure generative AI, e.g. DALL-E. I still keep my own
palette and pre-fab icons, in case I need to create and present novel ideas to
customers and/or investors.

Read my :ref:`portfolio here `.