Tech

Corbin Foucart
Members Public

Cross-compiling a C++ Library Using the Android NDK (Part I)

A common theme in the world of MLOps is the development of machine learning models on state-of-the-art cloud computing environments or desktop Linux computers with GPU capabilities and x86-64 hardware. However, we often do so with the intention of eventually using these models on edge devices such as smartphones, tablets,

Bar Rozenman
Members Public

Fully Sharded Data Parallelism (FSDP)

In this blog we will explore Fully Sharded Data Parallelism (FSDP), which is a technique that allows for the training of large Neural Network models in a distributed manner efficiently. We’ll examine FSDP from a bird’s eye view and shed light on the underlying mechanisms. Why FSDP? When

Corbin Foucart
Members Public

Understanding the TFLite Quantization Approach (Part I)

Running machine learning models can be computationally expensive, and these models don’t typically port well to hardware or embedded devices. The TensorFlow Lite (TFL) library is, according to the documentation, “a mobile library for deploying models on mobile, microcontrollers and other edge devices.” TFL allows the conversion of native