Phi-4

Devon Rex

Microsoft · Nano

Rogue · Lv.1 · text · local · open-weight · local

Elfin, mischievous, impossibly light. Weighs nothing, eats everything, and somehow keeps up with cats three times its size.

phi-4 · Free (local)

Stats

STR 4 · Output quality: how good the responses are overall
DEX 8 · Speed: tokens per second and latency
CON 5 · Reliability: consistency across runs and tasks
INT 6 · Reasoning: logic, math, and multi-step problems
WIS 5 · Accuracy: factual correctness and precision
CHA 5 · Creativity: style, voice, and creative expression

Abilities

Pixie Dust

tiny footprint, runs anywhere

Acrobat

surprisingly agile for its size

Warm Lap

fast enough to feel interactive

Eval Results

Best: 9.0 · Worst: 2.4

Reasoning: 7.2/10 (5 prompts)

Reasoning #1

A cat knocks 3 things off a table every hour. The owner puts 2 things back every hour. There are 12 things on the table. When does the table become empty?

high · 8.6

Judge: Correct answer (12 hours) with clean, step-by-step reasoning. Immediately identifies the net rate of -1 per hour and applies it to the starting count. The math is simple but airtight. Clear presentation with no unnecessary complexity.

To solve this problem, we need to determine how many hours it takes for the number of items on the table to reach zero. Initially, there are 12 things on the table. Every hour, the cat knocks 3 things off, and the owner puts 2 things back. Therefore, the net change in the number of things on the ta...
ollama/phi4 · 2026-03-01
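
The net-rate argument the judge describes can be checked with a short simulation. This is a minimal sketch, assuming (as the judged answer does) that each hour's knock-offs and put-backs are applied as a single net change; `hours_until_empty` is a hypothetical helper name, not part of the eval harness.

```python
def hours_until_empty(start: int, knocked_per_hour: int, restored_per_hour: int) -> int:
    """Hours until the count reaches zero, applying the net change once per hour."""
    net = restored_per_hour - knocked_per_hour
    if net >= 0:
        raise ValueError("the table never empties unless the net change is negative")
    items, hours = start, 0
    while items > 0:
        items = max(items + net, 0)  # the count cannot drop below zero
        hours += 1
    return hours


print(hours_until_empty(12, 3, 2))  # 12
```

Under a different reading of the prompt, where the cat acts before the owner within each hour, the table could be momentarily empty earlier; the sketch follows the net-rate reading only.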

Reasoning #2

You have 8 identical-looking balls. One is slightly heavier than the rest. Using a balance scale, what is the minimum number of weighings needed to guarantee you find the heavy ball? Explain your strategy.

high · 8.4

Judge: Correct answer (2 weighings) with a complete strategy covering all branches. The 3-3-2 split is the standard optimal approach. All cases are covered: balanced first weighing leads to comparing the 2-ball group, unbalanced leads to comparing 2 of 3 from the heavy group. Clear formatting with step-by-step analysis of each branch.

To find the single heavier ball among the eight using a balance scale with the minimum number of weighings, you can follow this strategy:

### Step-by-Step Strategy

1. **First Weighing:**
   - Divide the 8 balls into three groups: two groups of 3 balls each (Group A and Group B), and one group of 2...
ollama/phi4 · 2026-03-01
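
The 3-3-2 strategy the judge summarizes can be exercised against every possible position of the heavy ball. The sketch below is illustrative only: `min_weighings` and `find_heavy` are hypothetical names, and the lower bound relies on the fact that each weighing has three outcomes (left heavier, right heavier, balanced).

```python
def min_weighings(n_balls: int) -> int:
    """Smallest k with 3**k >= n_balls, since each weighing has three outcomes."""
    k, distinguishable = 0, 1
    while distinguishable < n_balls:
        distinguishable *= 3
        k += 1
    return k


def find_heavy(weights: list[int]) -> int:
    """Locate the heavy ball among 8 using the 3-3-2 split (always two weighings)."""

    def compare(left: list[int], right: list[int]) -> int:
        # Returns 1 if the left pan is heavier, -1 if the right pan is, 0 if balanced.
        l = sum(weights[i] for i in left)
        r = sum(weights[i] for i in right)
        return (l > r) - (l < r)

    group_a, group_b, group_c = [0, 1, 2], [3, 4, 5], [6, 7]
    first = compare(group_a, group_b)
    if first == 0:
        # Balanced: the heavy ball is one of the two set aside.
        return group_c[0] if compare([group_c[0]], [group_c[1]]) > 0 else group_c[1]
    heavy_group = group_a if first > 0 else group_b
    second = compare([heavy_group[0]], [heavy_group[1]])
    if second == 0:
        return heavy_group[2]
    return heavy_group[0] if second > 0 else heavy_group[1]


print(min_weighings(8))  # 2
for position in range(8):
    weights = [10] * 8
    weights[position] = 11
    assert find_heavy(weights) == position
```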

Reasoning #3

A farmer needs to cross a river with a cat, a mouse, and a bag of catnip. The boat can carry the farmer and one item at a time. If left alone together: the cat will eat the mouse, and the cat will destroy the catnip (cats gonna cat). The mouse ignores the catnip. How does the farmer get everything across? Show all steps.

high · 2.4
Hard constraint failed: Cat and catnip must never be left alone together — Step 1 leaves cat and catnip unsupervised on the left bank.

Judge: The solution violates hard constraints: Step 1 takes the mouse across, leaving cat and catnip alone on the left bank — but the cat destroys catnip when left unsupervised. The correct approach is to take the cat first. The presentation is clear and well-formatted, but the core logic is wrong. All items do end up across, but the path is invalid.

To solve this problem, we need to ensure that the cat is never left alone with the mouse or the catnip without supervision from the farmer. Here’s a step-by-step solution:

1. **First Trip:** The farmer takes the mouse across the river and leaves it on the other side.
   - Left Bank: Cat, Catnip ...
ollama/phi4 · 2026-03-01
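
The hard constraint that failed here can be checked mechanically. The following is a minimal breadth-first search sketch over bank states, assuming only the rules stated in the prompt; the state encoding and the names `is_safe` and `solve` are illustrative choices, not the judge's actual checker.

```python
from collections import deque

ITEMS = ("cat", "mouse", "catnip")
# Pairs that must never share a bank without the farmer.
FORBIDDEN = [{"cat", "mouse"}, {"cat", "catnip"}]


def is_safe(left: frozenset, farmer_on_left: bool) -> bool:
    """True if the bank without the farmer holds no forbidden pair."""
    unattended = (set(ITEMS) - left) if farmer_on_left else set(left)
    return not any(pair <= unattended for pair in FORBIDDEN)


def solve():
    """Return a shortest list of cargo per crossing, or None if no plan exists."""
    start = (frozenset(ITEMS), True)   # everything, farmer included, on the left bank
    goal = (frozenset(), False)        # left bank empty, farmer on the right
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (left, farmer_left), path = queue.popleft()
        if (left, farmer_left) == goal:
            return path
        here = left if farmer_left else frozenset(ITEMS) - left
        for cargo in [None, *sorted(here)]:  # None means crossing empty-handed
            new_left = set(left)
            if cargo is not None:
                if farmer_left:
                    new_left.discard(cargo)
                else:
                    new_left.add(cargo)
            state = (frozenset(new_left), not farmer_left)
            if state in seen or not is_safe(state[0], state[1]):
                continue
            seen.add(state)
            queue.append((state, path + [cargo or "nothing"]))
    return None


path = solve()
print(path)
# Taking the mouse first strands the cat with the catnip, which the check rejects:
print(is_safe(frozenset({"cat", "catnip"}), False))  # False
```

The search confirms the judge's point: the only safe opening move is to carry the cat, and the shortest valid plan takes seven crossings.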

Reasoning #4

In a round-robin tournament with 6 players, each player plays every other player exactly once. A win is worth 2 points, a draw is 1 point each, and a loss is 0 points. After the tournament: Player A has 10 points, Player B has 8 points, and Player C has 6 points. Players D, E, and F each have the same score. What score do D, E, and F each have? Show your reasoning.

high · 9.0

Judge: Correct answer (2 points each) with elegant reasoning. Cleanly calculates total games (C(6,2)=15), total points (15x2=30), subtracts known scores (24), and divides remainder (6) by 3 players. The mathematical reasoning is airtight and clearly presented with proper notation.

In a round-robin tournament with 6 players, each player plays every other player exactly once. This means there are a total of \(\binom{6}{2} = 15\) games played in the tournament.

Each game results in either:
- A win and a loss (2 points distributed: 2 for the winner, 0 for the loser), or
- A draw...
ollama/phi4 · 2026-03-01
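
The point accounting the judge walks through reduces to a few lines of arithmetic. This sketch only reproduces that accounting, with variable names chosen for illustration; it does not verify that a schedule of results producing these scores exists.

```python
from math import comb

players = 6
points_per_game = 2  # a win (2 + 0) and a draw (1 + 1) both hand out 2 points
known_scores = {"A": 10, "B": 8, "C": 6}

games = comb(players, 2)                       # every pair meets exactly once
total_points = games * points_per_game
remaining = total_points - sum(known_scores.values())
each, leftover = divmod(remaining, players - len(known_scores))

print(games, total_points, remaining, each)  # 15 30 6 2
```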

Reasoning #5

There are 100 closed lockers in a hallway. 100 cats walk by. The first cat toggles every locker. The second cat toggles every 2nd locker. The third cat toggles every 3rd locker. And so on — the nth cat toggles every nth locker. After all 100 cats have passed: (a) which lockers are open? (b) Why? Prove your answer mathematically.

high · 7.8

Judge: Correctly identifies perfect squares as the answer with a sound mathematical proof. Explains that divisors come in pairs except for square roots, giving perfect squares an odd number of divisors. The response is truncated before listing all 10 perfect squares, but the reasoning and proof are solid. Covers both the 'what' and 'why' as requested.

To solve this problem, we need to determine which lockers remain open after all 100 cats have toggled them according to the given pattern.

### Analysis:

Each locker is toggled by a cat if the locker number is a multiple of the cat's number. For example, locker 12 is toggled by cats 1, 2, 3, 4, 6,...
ollama/phi4 · 2026-03-01
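
Since the response was truncated before listing the open lockers, a direct simulation can fill in the list and confirm the perfect-square claim. This is a brute-force sketch with a hypothetical function name, `open_lockers`; it complements the divisor-pairing proof rather than replacing it.

```python
def open_lockers(n: int = 100) -> list[int]:
    """Simulate n cats toggling n lockers; return the numbers of the open ones."""
    is_open = [False] * (n + 1)  # index 0 is unused so indices match locker numbers
    for cat in range(1, n + 1):
        for locker in range(cat, n + 1, cat):  # every cat-th locker
            is_open[locker] = not is_open[locker]
    return [i for i in range(1, n + 1) if is_open[i]]


result = open_lockers(100)
print(result)  # [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
```

Each locker is toggled once per divisor of its number, so only numbers with an odd divisor count, the perfect squares, end up open.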