PRM800K is a Linux app whose latest release can be downloaded as prm800ksourcecode.tar.gz. It can be run online on OnWorks, a free hosting provider for workstations.
Download PRM800K and run it online with OnWorks for free.
Follow these instructions to run this app:
- 1. Download this application to your PC.
- 2. Open our file manager https://www.onworks.net/myfiles.php?username=XXXXX with the username that you want.
- 3. Upload this application to that file manager.
- 4. Start the OnWorks Linux online, Windows online, or macOS online emulator from this website.
- 5. From the OnWorks Linux OS you have just started, go to our file manager https://www.onworks.net/myfiles.php?username=XXXXX with the username that you want.
- 6. Download the application, install it, and run it.
PRM800K
DESCRIPTION
PRM800K is a process supervision dataset accompanying the paper "Let's Verify Step by Step", providing 800,000 step-level correctness labels on model-generated solutions to problems from the MATH dataset. The repository releases the raw labels and the labeler instructions used in the project's two phases, enabling researchers to study how human raters graded intermediate reasoning. Data are stored as newline-delimited JSON (JSONL) files tracked with Git LFS, where each line is a full solution sample that can contain many step-level labels and rich metadata such as labeler UUIDs, timestamps, generation identifiers, and quality-control flags. Each labeled step can include multiple candidate completions with ratings of -1, 0, or +1, optional human-written corrections (phase 1 only), and a chosen completion index. Each sample also records a final finish reason such as found_error, solution, bad_problem, or give_up.
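The record layout described above can be explored with a few lines of Python. The following is a minimal sketch, not the repository's own loader: the field names (`label`, `steps`, `completions`, `rating`, `finish_reason`) are assumptions inferred from the description, and the sample record is invented purely for illustration. It tallies step ratings and finish reasons across JSONL lines.

```python
import json
from collections import Counter


def tally_ratings(lines):
    """Count completion ratings and finish reasons over JSONL lines.

    Field names are assumptions based on the dataset description,
    not a confirmed schema.
    """
    ratings = Counter()
    finish_reasons = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        sample = json.loads(line)  # one full solution sample per line
        label = sample.get("label") or {}
        finish_reasons[label.get("finish_reason")] += 1
        for step in label.get("steps") or []:
            # a step may carry several candidate completions
            for completion in step.get("completions") or []:
                ratings[completion.get("rating")] += 1
    return ratings, finish_reasons


# Hypothetical record, invented for illustration only.
example_line = json.dumps({
    "labeler": "00000000-0000-0000-0000-000000000000",
    "label": {
        "steps": [
            {"completions": [{"text": "Step A", "rating": 1}],
             "chosen_completion": 0},
            {"completions": [{"text": "Step B", "rating": -1},
                             {"text": "Step C", "rating": 0}],
             "chosen_completion": None},
        ],
        "finish_reason": "found_error",
    },
})

ratings, finish_reasons = tally_ratings([example_line])
print(dict(ratings), dict(finish_reasons))
```

To run this over a real file, pass an open file handle (for example `open(path, encoding="utf-8")`) as `lines`. Because the data are tracked with Git LFS, the files need to be fetched with LFS first.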
Features
- 800,000 step-level correctness labels for MATH problems, stored as JSONL
- Detailed schema with labeler IDs, timestamps, generations, QC flags, and finish reasons
- Multi-candidate step ratings of -1, 0, +1 with optional human-completion entries
- Labeler instruction docs for both phase 1 and phase 2
- Python grading logic using math normalization and sympy equivalence checks
- Nonstandard MATH train/test split and large-scale scored samples with PRM/ORM eval scripts
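The grading approach named in the feature list (normalize the answer, then check equivalence with sympy) can be illustrated as follows. This is a simplified sketch under stated assumptions, not the repository's grader: `normalize_answer` and `answers_equivalent` are hypothetical helper names, the normalization rules are deliberately minimal, and answers are assumed to be plain expressions that `sympy.sympify` can parse (not LaTeX).

```python
import sympy


def normalize_answer(answer: str) -> str:
    """Minimal normalization: trim, drop a trailing period, dollar signs, spaces."""
    answer = answer.strip()
    if answer.endswith("."):
        answer = answer[:-1]
    answer = answer.strip("$")
    return answer.replace(" ", "")


def answers_equivalent(given: str, reference: str) -> bool:
    """Return True if the two answers match as strings or symbolically."""
    given_n = normalize_answer(given)
    reference_n = normalize_answer(reference)
    if given_n == reference_n:
        return True  # cheap exact match after normalization
    try:
        # sympify evaluates its input, so only use it on trusted strings
        difference = sympy.sympify(given_n) - sympy.sympify(reference_n)
        return sympy.simplify(difference) == 0
    except (sympy.SympifyError, TypeError):
        return False  # unparseable answers are treated as not equivalent


print(answers_equivalent("1/2", "0.5"))
print(answers_equivalent("$2*(x+1)$", "2*x+2"))
print(answers_equivalent("3", "4"))
```

Trying the exact string match first avoids symbolic work for the common case. The symbolic check then catches answers that differ only in form, such as `1/2` versus `0.5`.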
Programming Language
Python
This application can also be fetched from https://sourceforge.net/projects/prm800k.mirror/. It is hosted on OnWorks so that it can be run online in the easiest way from one of our free operating systems.