OCRBase is a Linux app whose latest release can be downloaded as ocrbasesourcecode.zip. It can be run online through OnWorks, a free hosting provider for workstations.
Download OCRBase and run it online with OnWorks for free.
Follow these instructions to run this app:
- 1. Download this application to your PC.
- 2. Open our file manager at https://www.onworks.net/myfiles.php?username=XXXXX, using the username that you want.
- 3. Upload this application to that file manager.
- 4. Start the OnWorks Linux online, Windows online, or macOS online emulator from this website.
- 5. From the OnWorks Linux OS you have just started, go to our file manager at https://www.onworks.net/myfiles.php?username=XXXXX, using the username that you want.
- 6. Download the application, install it, and run it.
OCRBase
DESCRIPTION
OCRBase is a self-hostable document OCR and structured extraction system built to turn PDFs into machine-usable outputs at scale, aiming to bridge the gap between raw text extraction and production-ready pipelines. Instead of treating OCR as a one-off script, it presents an API-driven workflow where documents are submitted as jobs and processed through a queue-based architecture that can handle high throughput. The core output is designed for downstream automation, producing structured results like JSON according to user-defined schemas while also providing readable formats like Markdown for human review or indexing. It includes real-time job progress updates via WebSockets, which makes it easier to integrate into UIs, dashboards, or ingestion systems where users need feedback on long-running document processing.
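The job-and-queue workflow described above can be illustrated with a small in-memory sketch. This is not OCRBase's actual API: `InMemoryJobQueue`, `JobProgressEvent`, `submit`, `run`, and `onProgress` are hypothetical names chosen for illustration, the OCR step is replaced by a plain synchronous function, and the listener callback stands in for what a real deployment would presumably push over a WebSocket.

```typescript
// Hypothetical sketch: none of these names come from OCRBase itself.
type JobStatus = "queued" | "processing" | "completed" | "failed";

interface JobProgressEvent {
  jobId: string;
  status: JobStatus;
  pagesDone: number;
  pagesTotal: number;
}

interface QueuedJob {
  id: string;
  pages: string[];
}

class InMemoryJobQueue {
  private pending: QueuedJob[] = [];
  private listeners: Array<(event: JobProgressEvent) => void> = [];
  private nextId = 1;

  // Register a progress callback (a stand-in for a WebSocket subscription).
  onProgress(listener: (event: JobProgressEvent) => void): void {
    this.listeners.push(listener);
  }

  // Enqueue a document and return its job id immediately.
  submit(pages: string[]): string {
    const id = `job-${this.nextId++}`;
    this.pending.push({ id, pages });
    this.emit({ jobId: id, status: "queued", pagesDone: 0, pagesTotal: pages.length });
    return id;
  }

  // Drain the queue in FIFO order, emitting an event after every page.
  run(processPage: (page: string) => string): Map<string, string[]> {
    const results = new Map<string, string[]>();
    let job = this.pending.shift();
    while (job !== undefined) {
      const output: string[] = [];
      const total = job.pages.length;
      try {
        for (const page of job.pages) {
          output.push(processPage(page));
          this.emit({ jobId: job.id, status: "processing", pagesDone: output.length, pagesTotal: total });
        }
        results.set(job.id, output);
        this.emit({ jobId: job.id, status: "completed", pagesDone: total, pagesTotal: total });
      } catch {
        // A failing page marks the whole job as failed; later jobs still run.
        this.emit({ jobId: job.id, status: "failed", pagesDone: output.length, pagesTotal: total });
      }
      job = this.pending.shift();
    }
    return results;
  }

  private emit(event: JobProgressEvent): void {
    for (const listener of this.listeners) {
      listener(event);
    }
  }
}

// Usage: submit a two-page "document" and log progress as it is processed.
const demoQueue = new InMemoryJobQueue();
demoQueue.onProgress((event) => {
  console.log(`${event.jobId}: ${event.status} ${event.pagesDone}/${event.pagesTotal}`);
});
demoQueue.submit(["page one", "page two"]);
demoQueue.run((page) => page.toUpperCase());
```

Separating `submit` from `run` mirrors the idea that the caller gets a job id back right away and learns about completion through events rather than by blocking on the request.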
Features
- OCR pipeline using PaddleOCR-VL-0.9B for text extraction
- Schema-driven structured extraction that returns JSON outputs
- Queue-based processing designed for high-volume document workloads
- Type-safe TypeScript SDK including React hooks for integration
- Real-time WebSocket updates for job progress and completion
- Self-hostable deployment model built around Docker and Bun
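To make the schema-driven extraction feature more concrete, here is a minimal sketch of checking an extracted JSON object against a user-defined schema. The schema shape (`ExtractionSchema`, a flat map from field name to primitive type) and the `validateAgainstSchema` helper are assumptions made for illustration; the source does not specify how OCRBase represents or enforces schemas.

```typescript
// Hypothetical schema format: a flat map of field names to primitive types.
type FieldType = "string" | "number" | "boolean";
type ExtractionSchema = Record<string, FieldType>;

// Returns a list of problems; an empty list means the candidate conforms.
function validateAgainstSchema(schema: ExtractionSchema, candidate: unknown): string[] {
  if (typeof candidate !== "object" || candidate === null || Array.isArray(candidate)) {
    return ["expected a JSON object"];
  }
  const record = candidate as Record<string, unknown>;
  const errors: string[] = [];
  for (const field of Object.keys(schema)) {
    const expected = schema[field];
    if (!(field in record)) {
      errors.push(`missing field: ${field}`);
    } else if (typeof record[field] !== expected) {
      errors.push(`field ${field}: expected ${expected}, got ${typeof record[field]}`);
    }
  }
  return errors;
}

// Usage: an invoice schema and a candidate result with one wrong type and one missing field.
const invoiceSchema: ExtractionSchema = {
  invoiceNumber: "string",
  total: "number",
  paid: "boolean",
};
const extracted: unknown = JSON.parse('{"invoiceNumber":"INV-7","total":"41.50"}');
console.log(validateAgainstSchema(invoiceSchema, extracted));
```

Returning a list of errors instead of throwing on the first mismatch lets a caller report every problem with a document in a single pass.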
Programming Language
TypeScript
This application can also be fetched from https://sourceforge.net/projects/ocrbase.mirror/. It is hosted on OnWorks so that it can be run online in the easiest way from one of our free operating systems.
