benchmark evidence

MMStar

MMStar: manually filtered multimodal benchmark requiring genuine visual perception, resistant to contamination.

no matched results

No public rows from this source currently map to an active callable model.

what this result means

MMStar: manually filtered multimodal benchmark requiring genuine visual perception, resistant to contamination.

This benchmark contributes direct public evidence. Read its scope before generalizing the result.

A win here is a win on MMStar. Broad task pages require independent corroboration before naming a general winner.

source record

category: multimodal

metric: accuracy

matched models: 0

latest source date: date unavailable

direction: higher is better