Savannah
Mid-Tier

DeepSeek R1

DeepSeek · Mid-Tier

Specialist · Lv.1 · text · reasoning · open-weight

Wild at heart, fiercely independent, strikingly powerful. Thinks out loud in long chains that somehow arrive at the right answer.

deepseek-r1 · $0.55/M in · $2.19/M out
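As a rough illustration of what the listed prices mean per request, the sketch below converts token counts into a list-price cost. The helper name `estimate_cost_usd` and the example token counts are hypothetical. It also assumes that visible reasoning tokens are billed as output, which this card does not state.

```python
# Prices copied from the card above, in USD per million tokens.
INPUT_USD_PER_M = 0.55
OUTPUT_USD_PER_M = 2.19


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: list-price cost of a single request in USD.

    Assumption: chain-of-thought tokens are counted in output_tokens.
    """
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Illustrative request: a 2,000-token prompt and 8,000 tokens of
    # reasoning plus answer (made-up numbers, not measurements).
    print(f"${estimate_cost_usd(2_000, 8_000):.4f}")  # prints $0.0186
```

Because long reasoning chains inflate the output side, the output price is likely to dominate the bill for this kind of model.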

Stats

STR 7 · Output quality: how good the responses are overall
DEX 6 · Speed: tokens per second and latency
CON 6 · Reliability: consistency across runs and tasks
INT 9 · Reasoning: logic, math, and multi-step problems
WIS 8 · Accuracy: factual correctness and precision
CHA 5 · Creativity: style, voice, and creative expression

Abilities

Chain of Thought

visible reasoning process, sometimes brilliant

Wild Card Pricing

absurdly cheap for what it delivers

Feral Logic

unconventional paths to correct answers

Eval Results

Best: 9.8 · Worst: 0.0
Factual QA
9.1/10 (5 prompts)

Factual QA #1

What are the key differences between domestic cats and their closest wild relatives?

high · 9.0

Judge: Comprehensive and accurate coverage of domestic vs wild cat differences. Correctly identifies Felis lybica as the closest relative, with a ~10,000-year domestication timeline. Covers all four expected topics: ancestry, timeline, behavioral differences (sociality, vocalization), and physical differences (neoteny, brain size, coat diversity). Includes a comparison chart and mentions other close relatives. Minor point: some claims about brain/amygdala reduction are stated with more certainty than the research supports, but overall accuracy is strong.

Excellent question! The domestic cat's closest wild relative is the **African wildcat (*Felis lybica*)**, from which they were domesticated around 10,000 years ago in the Near East. While they look remarkably similar, key differences have emerged through domestication. Here’s a breakdown of the pri...
deepseek/deepseek-reasoner · 2026-03-02

Factual QA #2

What is the CAP theorem in distributed systems? Explain what each letter stands for, give a concrete example of a system that sacrifices each property, and explain why you can't have all three.

high · 9.4

Judge: Accurate definitions of all three CAP properties. Valid examples for each tradeoff: PostgreSQL as CA, ZooKeeper/etcd as CP, DynamoDB/Cassandra as AP. The 'why you can't have all three' explanation uses a concrete two-datacenter scenario that makes the forced choice between C and A during partition intuitive. Includes the important nuance that partitions are inevitable so the real choice is C vs A. The closing note about CAP being misunderstood adds sophistication.

## CAP Theorem Explained The **CAP theorem** states that in a distributed data system, you can only guarantee **two out of three** of the following properties simultaneously: - **C**onsistency - **A**vailability - **P**artition tolerance Here’s what each means in a distributed system context: ##...
deepseek/deepseek-reasoner · 2026-03-02

Factual QA #3

Explain the biological mechanism of how cats purr. Address the ongoing scientific debate about the exact mechanism, and explain why purring is medically interesting.

high · 8.4

Judge: Thorough explanation of the laryngeal mechanism with neural oscillator, glottis contraction cycle, and correct frequency range (25-150 Hz). Addresses the scientific debate between neurogenic and hemodynamic theories, though overstates how settled the debate is -- the response says 'largely settled' but research on pad-based mechanisms and recent connective tissue findings suggest more uncertainty. Medical interest section covers bone healing, pain relief, and human therapeutic applications. Both hard constraints met.

Of course. Here is a detailed explanation of the biological mechanism of cat purring, the ongoing scientific debate, and its medical interest. ### **The Leading Theory: The "Laryngeal Diaphragm" Mechanism** The most widely accepted explanation is that purring is a **neurogenic oscillator**, meanin...
deepseek/deepseek-reasoner · 2026-03-02

Factual QA #4

Compare the Transformer architecture (as described in the original "Attention Is All You Need" paper) with the Mamba architecture. Cover: key structural differences, computational complexity for sequence length, strengths and weaknesses of each for different task types, and why state space models are gaining interest.

high · 9.4

Judge: Technically precise comparison covering all four requested areas. Correctly states Transformer self-attention is O(n^2), Mamba is O(n) for training and O(1) per step for inference. Accurately describes selective SSM mechanism with input-dependent B and C parameters. The strengths/weaknesses table is insightful, particularly the bidirectionality weakness of SSMs and the KV-cache problem for Transformers. The 'why SSMs are gaining interest' section is well-reasoned. Thorough and well-organized.

Excellent question. This comparison highlights a fundamental shift in sequence modeling paradigms. Let's break down the comparison between the **Transformer (Vaswani et al., 2017)** and **Mamba (Gu & Dao, 2023)** as requested. ### **Core Paradigms** * **Transformer:** Built on **self-attention**....
deepseek/deepseek-reasoner · 2026-03-02

Factual QA #5

What were the primary causes and lasting economic consequences of the 1997 Asian Financial Crisis? Cover at least Thailand, South Korea, and Indonesia. Explain the role of the IMF's response and the controversy around its conditions.

high · 9.4

Judge: Historically precise coverage of all three countries with country-specific detail -- Thailand's baht peg collapse, South Korea's chaebol-driven debt crisis, Indonesia's political/social dimension with Suharto's fall. IMF analysis is balanced, presenting both the rationale (structural reform, confidence restoration) and valid criticisms (Stiglitz's austerity critique, inappropriate structural conditions, moral hazard for lenders). Mentions the Chiang Mai Initiative as a lasting consequence. Comprehensive and nuanced.

### **Primary Causes of the 1997 Asian Financial Crisis** The crisis was not caused by a single factor, but by a combination of interconnected vulnerabilities that made several Asian economies susceptible to a sudden loss of investor confidence. 1. **Large Capital Inflows and Overheating:** Follo...
deepseek/deepseek-reasoner · 2026-03-02
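The Factual QA category score can be cross-checked against the five per-prompt scores listed above. The sketch below assumes the category score is a plain arithmetic mean rounded to one decimal place. The card does not say how it aggregates, so that is an assumption, though the result agrees with the displayed 9.1/10. The names `FACTUAL_QA_SCORES` and `category_score` are hypothetical.

```python
from statistics import mean

# Per-prompt judge scores for Factual QA #1 to #5, copied from the listing.
FACTUAL_QA_SCORES = [9.0, 9.4, 8.4, 9.4, 9.4]


def category_score(scores: list[float]) -> float:
    """Hypothetical aggregation: arithmetic mean rounded to one decimal."""
    return round(mean(scores), 1)


if __name__ == "__main__":
    print(category_score(FACTUAL_QA_SCORES))  # prints 9.1
```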