AI Document Tools: Tongyi Zhiwen
An AI reading assistant from Alibaba designed to streamline the consumption of long-form content such as academic papers and other lengthy documents.
AI Programming Tools: CodeFuse
CodeFuse is an enterprise-grade AI programming assistant developed by Ant Group to streamline the software development lifecycle through intelligent automation.
AI Model Benchmarks: H2O EvalGPT
An evaluation system from H2O.ai that uses an Elo rating methodology to benchmark and rank large language models (LLMs).
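The Elo approach treats each head-to-head model comparison like a chess match: the winner takes rating points from the loser, scaled by how surprising the result was. Below is a minimal sketch of the standard Elo update in Python; it illustrates the general method, not H2O EvalGPT's actual implementation, and the K-factor of 32 is an assumed default.

def elo_update(rating_a: float, rating_b: float, score_a: float,
               k: float = 32.0) -> tuple[float, float]:
    # One Elo update after a single head-to-head comparison.
    # score_a is 1.0 if model A's answer was judged better,
    # 0.0 if model B's was, and 0.5 for a tie.
    expected_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta

# Example: a 1200-rated model beating a 1300-rated one gains about 20 points.
print(elo_update(1200.0, 1300.0, 1.0))

An upset win moves ratings further than an expected one, so rankings converge toward each model's true strength as comparisons accumulate.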
AI Model Benchmarks: LLMEval3
An evaluation benchmark from Fudan University’s NLP Lab designed to measure the performance and reliability of large language models.
AI Model Benchmarks: MMBench
MMBench is a comprehensive evaluation framework designed to measure the capabilities of multimodal large language models across a wide array of visual and textual tasks.
AI Model Benchmarks: HELM
HELM (Holistic Evaluation of Language Models) is a standardized evaluation framework from Stanford University designed to measure the performance and safety of large language models across a broad set of scenarios and metrics.
AI Model Benchmarks: OpenCompass
OpenCompass is an open-source evaluation framework developed by the Shanghai AI Lab to provide standardized, comprehensive benchmarking for large language models.
AI Model Benchmarks: FlagEval
An open-source evaluation framework developed by the Beijing Academy of Artificial Intelligence (BAAI) to standardize and scale LLM benchmarking.
AI Model Benchmarks: LMArena
A crowdsourced benchmarking platform where users battle-test large language models through blind side-by-side comparisons and vote for the better response; the votes are aggregated into a public leaderboard.
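Leaderboards built from such blind votes are fit with a pairwise rating model; LMArena has described a Bradley-Terry approach, though the sketch below is a generic illustration rather than the platform's actual pipeline, and the model names and vote counts are invented.

import numpy as np

# Hypothetical vote counts: wins[i, j] = number of blind battles
# in which model i's response was preferred over model j's.
models = ["model-a", "model-b", "model-c"]
wins = np.array([
    [0, 30, 45],
    [20, 0, 40],
    [15, 10, 0],
], dtype=float)

games = wins + wins.T              # total battles for each pair
strength = np.ones(len(models))    # Bradley-Terry strengths p_i

# Zermelo's iterative MLE update: p_i <- W_i / sum_j (n_ij / (p_i + p_j))
for _ in range(200):
    denom = games / (strength[:, None] + strength[None, :])
    np.fill_diagonal(denom, 0.0)
    strength = wins.sum(axis=1) / denom.sum(axis=1)
    strength /= strength.sum()     # only relative strength is identifiable

# Map onto an Elo-like 400-point logistic scale for readability.
ratings = 400 * np.log10(strength / strength.mean()) + 1000
for name, rating in sorted(zip(models, ratings), key=lambda kv: -kv[1]):
    print(f"{name}: {rating:.0f}")

Unlike an online Elo update, this batch fit is order-independent: the ranking depends only on the aggregate win counts, not on the sequence in which votes arrived.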