LiteRT is a Linux app whose latest release can be downloaded as litert_npu_runtime_libraries.zip. It can be run online through OnWorks, a free hosting provider for workstations.
Download LiteRT and run it online with OnWorks for free.
Follow these instructions to run this app:
- 1. Download this application to your PC.
- 2. Go to our file manager at https://www.onworks.net/myfiles.php?username=XXXXX, using the username that you want.
- 3. Upload this application to that file manager.
- 4. Start the OnWorks Linux online, Windows online, or macOS online emulator from this website.
- 5. From the OnWorks Linux OS you have just started, go to our file manager at https://www.onworks.net/myfiles.php?username=XXXXX, using the username that you want.
- 6. Download the application, install it, and run it.
DESCRIPTION:
LiteRT is Google's next-generation on-device machine learning framework and the successor to TensorFlow Lite, designed for high-performance AI and generative AI deployment across edge devices. It provides efficient model conversion, optimization, and runtime execution while leveraging hardware acceleration from CPUs, GPUs, and NPUs. LiteRT supports a wide range of platforms, including Android, iOS, Linux, macOS, Windows, web environments, and IoT devices. The framework simplifies on-device AI development through automated accelerator selection, asynchronous execution, and optimized memory handling. It also includes specialized support for large language models and generative AI workloads through LiteRT-LM and related tooling. With broad hardware compatibility and advanced performance optimizations, LiteRT enables developers to build fast, scalable, and efficient AI applications that run directly on user devices.
Features
- Cross-Platform AI Deployment – Supports Android, iOS, Linux, macOS, Windows, web, and IoT environments from a unified framework.
- Advanced Hardware Acceleration – Optimizes inference using CPUs, GPUs, and NPUs from leading chipset providers, including Google Tensor, Qualcomm, MediaTek, and Intel.
- Compiled Model API – Automates accelerator selection, enables asynchronous execution, and improves I/O buffer management for streamlined development.
- Generative AI Optimization – Provides dedicated tools and runtimes for deploying large language models, diffusion models, and other GenAI workloads on-device.
- Model Conversion & Quantization – Converts and optimizes PyTorch and other machine learning models for efficient edge deployment and reduced resource usage.
- High-Performance Runtime Engine – Delivers low-latency inference with zero-copy buffer interoperability, advanced GPU acceleration, and efficient model execution.
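To make the "reduced resource usage" claim in the quantization feature concrete, here is a minimal C++ sketch of affine int8 quantization, the general technique of mapping float values onto 8-bit integers with a scale and a zero point. It illustrates the idea only and is not LiteRT's converter or API: `QuantParams`, `ChooseQuantParams`, `Quantize`, and `Dequantize` are hypothetical names introduced for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper type for illustration; not part of the LiteRT API.
struct QuantParams {
    float scale;              // size of one integer step in real units
    std::int32_t zero_point;  // integer value that represents real 0.0
};

// Derive parameters so that [min_val, max_val] maps onto [-128, 127].
QuantParams ChooseQuantParams(float min_val, float max_val) {
    // Widen the range to include zero so that 0.0f is exactly representable.
    min_val = std::min(min_val, 0.0f);
    max_val = std::max(max_val, 0.0f);
    assert(max_val > min_val);  // a degenerate range has no usable scale
    const float scale = (max_val - min_val) / 255.0f;
    const std::int32_t zero_point =
        static_cast<std::int32_t>(std::lround(-128.0f - min_val / scale));
    return {scale, zero_point};
}

// Map a float to int8, saturating values that fall outside the range.
std::int8_t Quantize(float x, const QuantParams& p) {
    const long q = std::lround(x / p.scale) + p.zero_point;
    return static_cast<std::int8_t>(std::max(-128L, std::min(127L, q)));
}

// Recover an approximate float from its int8 representation.
float Dequantize(std::int8_t q, const QuantParams& p) {
    return p.scale * static_cast<float>(static_cast<std::int32_t>(q) - p.zero_point);
}
```

Each quantized value occupies one byte instead of the four bytes of a 32-bit float, at the cost of a rounding error of at most half a step (`scale / 2`) for values inside the chosen range.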
Programming Language
C++
This application can also be fetched from https://sourceforge.net/projects/litert.mirror/. It is hosted on OnWorks so that it can be run online as easily as possible from one of our free operating systems.