This is the Linux app named Perception Models, whose latest release can be downloaded as perception_modelssourcecode.tar.gz. It can be run online on OnWorks, a free hosting provider for workstations.
Download and run the Perception Models app online with OnWorks for free.
Follow these instructions to run this app:
- 1. Download this application to your PC.
- 2. Open our file manager at https://www.onworks.net/myfiles.php?username=XXXXX and enter the username you want.
- 3. Upload this application in that file manager.
- 4. Start the OnWorks Linux online, Windows online, or macOS online emulator from this website.
- 5. From the OnWorks Linux OS you have just started, go to our file manager at https://www.onworks.net/myfiles.php?username=XXXXX with the username you want.
- 6. Download the application, install it, and run it.
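The release is distributed as a gzip-compressed tar archive, so the "install" part of the last step starts with unpacking it. The following is a minimal sketch using only the Python standard library, assuming Python 3 is available in the session. The function name `extract_archive` and the target directory `perception_models_src` are hypothetical names chosen for illustration; only the archive file name comes from this page.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Unpack a .tar.gz archive into dest_dir and return the sorted member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        names = sorted(tar.getnames())
        if hasattr(tarfile, "data_filter"):
            # Newer Python releases: reject members that would land outside dest_dir.
            tar.extractall(dest, filter="data")
        else:
            # Older Python releases have no extraction filters.
            tar.extractall(dest)
    return names


if __name__ == "__main__":
    # Archive name as given on this page; the target directory is an assumption.
    archive = Path("perception_modelssourcecode.tar.gz")
    if archive.exists():
        for name in extract_archive(archive, "perception_models_src")[:10]:
            print(name)
    else:
        print(f"{archive} not found; download it first")
```

After unpacking, the project's own README inside the archive is the place to look for the actual installation steps; this sketch does not attempt them.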
Screenshots:
Perception Models
Description:
Perception Models is a state-of-the-art framework developed by Facebook Research for advanced image and video perception tasks. It introduces two primary components: the Perception Encoder (PE) for visual feature extraction and the Perception Language Model (PLM) for multimodal decoding and reasoning. The PE module is a family of vision encoders designed to excel in image and video understanding, surpassing models like SigLIP2, InternVideo2, and DINOv2 across multiple benchmarks. Meanwhile, PLM integrates with PE to power vision-language modeling, achieving results competitive with leading multimodal systems such as QwenVL2.5 and InternVL3, all while being fully reproducible with open data. The project supports a wide range of research applications, from visual recognition and dense prediction to fine-grained multimodal understanding. Additionally, it includes several large-scale open datasets for both image and video perception.
Features
- Combines Perception Encoder (PE) for vision encoding and Perception Language Model (PLM) for multimodal decoding
- State-of-the-art performance in image, video, and vision-language benchmarks
- Open, reproducible models using freely available datasets for transparency
- Multiple PE variants specialized for core, language-aligned, and spatial tasks
- PLM available in 1B, 3B, and 8B parameter sizes for flexible research needs
- Integrated with popular tools such as Hugging Face Transformers, timm, and lmms-eval
Programming Language
Python
Categories
This application can also be fetched from https://sourceforge.net/projects/perception-models.mirror/. It is hosted on OnWorks so that it can be run online in the easiest way from one of our free operating systems.