Bengal

Llama 3 · Meta · Small

Rogue · Lv.1 · text · local · open-weight

Unpredictable energy, occasionally brilliant. Might write you a masterpiece or knock your coffee off the desk.

llama-3.1-8b · Free (local)

Stats

STR 6: Output quality (how good the responses are overall)
DEX 9: Speed (tokens per second and latency)
CON 5: Reliability (consistency across runs and tasks)
INT 6: Reasoning (logic, math, and multi-step problems)
WIS 6: Accuracy (factual correctness and precision)
CHA 7: Creativity (style, voice, and creative expression)

Abilities

Feral Speed

blazing fast on local hardware

Zero Cost

runs free on your own metal

Wild Card

output quality varies run to run

Eval Results

Best: 8.0 · Worst: 0.6

Summarization
5.8/10 (5 prompts)

Summarization #1

Summarize the following in exactly 3 bullet points: The domestic cat (Felis catus) is a small, typically furry, carnivorous mammal. They are often called house cats when kept as indoor pets. Cats have been associated with humans for at least 9,500 years and are currently the second most popular pet in the world. Cats are skilled predators and are known to hunt over 1,000 species for food. They are also intelligent and can be trained to obey simple commands. Despite being solitary hunters, cats are social species and use a variety of vocalizations, pheromones, and body language for communication. Cats have a rapid breeding rate. Under controlled breeding, they can be bred and shown as registered pedigreed pets. Failure to control their breeding has contributed to large numbers of feral cats worldwide, contributing to the extinction of bird, mammal, and reptile species.

high · 7.0

Judge: Exactly 3 bullet points capturing key information accurately. Good coverage: domestication history, predator/intelligence traits, and feral cat problem. Minor imprecision: says 'one of the most popular' instead of 'second most popular.' Concise and well-structured.

Here are 3 bullet points summarizing the text:
* Cats have been associated with humans for at least 9,500 years and are currently one of the most popular pets in the world.
* Domestic cats are skilled predators that can hunt over 1,000 species and are also intelligent, trainable animals.
* Uncontro...
ollama/llama3.1:8b · 2026-03-01
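
The "exactly 3 bullet points" constraint in this prompt is the kind of format rule that can be checked mechanically before any judging of content. A minimal sketch, assuming bullets are lines that start with `*`, `-`, or `•`; the helper name `count_bullets` and the sample response are hypothetical and are not part of the eval harness shown on this page:

```python
import re

def count_bullets(text: str) -> int:
    """Count lines that begin with a bullet marker (*, -, or •)."""
    return sum(
        1 for line in text.splitlines()
        if re.match(r"\s*[*\-•]\s+", line)
    )

# Hypothetical response used only for illustration (not the model's output).
sample = """Here are 3 bullet points summarizing the text:
* First point.
* Second point.
* Third point."""

has_exactly_three = count_bullets(sample) == 3
print(count_bullets(sample), has_exactly_three)
```

A check like this only covers the format; it says nothing about accuracy issues such as the "one of the most popular" imprecision the judge noted.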

Summarization #2

Read the following technical proposal and produce a one-paragraph executive summary suitable for a non-technical VP. No jargon, no acronyms. "We propose migrating our monolithic Django application to a microservices architecture using Kubernetes orchestration. The current system handles approximately 50,000 requests per second at peak, with p99 latency of 2.3 seconds — well above our 500ms SLA. Root cause analysis indicates that the user authentication module, the recommendation engine, and the payment processing pipeline are tightly coupled, causing cascading failures when any single component is under load. Our proposal: decompose into 7 bounded contexts, each deployed as independent services communicating via gRPC with an async event bus (Kafka) for non-critical paths. Expected outcomes: independent scaling of hot paths (auth and recommendations), isolation of payment processing failures, and reduction of p99 latency to under 400ms. Estimated timeline: 6 months with a team of 8. Estimated infrastructure cost increase: 30% during migration, settling to 15% above current after optimization. Risk: data consistency during the transition period requires a careful strangler fig migration pattern."

high · 6.4

Judge: Good jargon-free translation -- no acronyms, no technical terms. Captures the core problem (performance issues from tightly coupled components), the solution (breaking apart into independent services), and the timeline (6 months). However, loses key numbers (doesn't mention the 30%/15% cost increase, the specific latency improvement, or the risk of data consistency during migration). Says 'long-term cost savings' which actually misrepresents the proposal (costs go UP 15%).

Here's a one-paragraph executive summary: Our proposal is to restructure our main application into smaller, independent services that can work together more efficiently. Right now, when many users are interacting with our system at the same time, it slows down significantly and doesn't meet our per...
ollama/llama3.1:8b · 2026-03-01

Summarization #3

Summarize this research abstract, then list 3 strengths and 3 weaknesses of the study design: "We conducted a double-blind, placebo-controlled study of 2,847 participants across 12 sites to evaluate whether daily exposure to cat purring (25Hz frequency, 30 minutes per session) accelerates bone fracture healing. The treatment group (n=1,424) used calibrated vibration devices mimicking cat purr frequencies; the control group (n=1,423) used identical devices producing imperceptible vibrations. After 12 weeks, the treatment group showed a 12% reduction in healing time (95% CI: 8-16%, p<0.001) as measured by radiographic union. Secondary outcomes included a 23% reduction in self-reported pain scores (p<0.01) and no significant difference in adverse events. Limitations include: participant awareness of vibration may have compromised blinding for the treatment group, the study population skewed younger (mean age 34), and the 12-week endpoint may not capture long-term outcomes."

high · 5.4

Judge: Correct format (summary + 3 strengths + 3 weaknesses). Summary is accurate. However, the strengths and weaknesses are mostly just restating what the abstract itself says -- the limitations section is literally echoed back as 'weaknesses.' No non-obvious observations. The strengths are textbook obvious (large sample, RCT design, well-defined measures).

**Summary:** This double-blind, placebo-controlled study examined whether daily exposure to cat purring frequencies could accelerate bone fracture healing. The results showed that participants who received vibrations mimicking cat purrs had a 12% reduction in healing time and reported less pain comp...
ollama/llama3.1:8b · 2026-03-01

Summarization #4

Create a layered summary of the following — provide three versions: (1) a tweet-length summary (under 280 characters), (2) a one-paragraph summary for a general audience, (3) a detailed summary preserving all key numbers and caveats. "The global AI chip market reached $53.4 billion in 2024, growing 42% YoY from $37.6 billion in 2023. NVIDIA dominates with approximately 80% market share in data center GPUs, though AMD has grown from 5% to 12% share with its MI300X accelerator. Custom silicon is the emerging disruptor: Google's TPU v5p, Amazon's Trainium2, and Microsoft's Maia 100 collectively represent $4.2 billion in avoided NVIDIA purchases. The supply chain remains concentrated — TSMC fabricates over 90% of advanced AI chips (sub-7nm), and ASML is the sole supplier of EUV lithography machines. Geopolitical risks are significant: US export controls have cut China's access to cutting-edge chips, pushing Chinese firms toward mature nodes (14nm+) and domestic alternatives like Huawei's Ascend 910B, which benchmarks at roughly 70% of an A100. Analysts project the market will reach $140 billion by 2027, though this assumes continued data center buildout and no major regulatory disruption. Key uncertainties: inference vs training chip mix shift, potential overcapacity in 2026, and whether sovereign AI initiatives (EU Chips Act, India's semiconductor mission) will meaningfully diversify manufacturing."

high · 4.4

Judge: All three versions are present, and the tweet is 237 characters, which is within the 280-character limit. However, the layering is weak: the tweet and paragraph versions cover very similar ground, with too little compression in the tweet and too little expansion in the detailed version. The detailed version is truncated at the end. Key numbers are mostly preserved, but AMD's growth from 5% to 12% is missing from all versions.

### Tweet-length summary
The global AI chip market has grown to $53.4 billion in 2024, with NVIDIA holding an 80% share. However, custom silicon from Google, Amazon, and Microsoft is emerging as a disruptor, representing $4.2 billion in avoided NVIDIA purchases.

### One-paragraph summary for genera...
ollama/llama3.1:8b · 2026-03-01
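
The judge's character count for the tweet-length version can be reproduced directly. A minimal sketch, assuming the tweet is the text shown under the "Tweet-length summary" heading and that "under 280 characters" means strictly fewer than 280; the function name `is_tweet_length` is hypothetical:

```python
TWEET_LIMIT = 280

def is_tweet_length(text: str, limit: int = TWEET_LIMIT) -> bool:
    """True when the text is strictly under the limit, as the prompt asks."""
    return len(text) < limit

# Tweet-length summary as it appears in the model output above.
tweet = (
    "The global AI chip market has grown to $53.4 billion in 2024, "
    "with NVIDIA holding an 80% share. However, custom silicon from "
    "Google, Amazon, and Microsoft is emerging as a disruptor, "
    "representing $4.2 billion in avoided NVIDIA purchases."
)

print(len(tweet), is_tweet_length(tweet))
```

This counts Python string characters, which may differ from how a given platform counts URLs or some Unicode characters; for plain ASCII text like this the two agree.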

Summarization #5

Explain the main argument of this passage in one sentence, then rate how convincing it is on a scale of 1-10 with justification: "Remote work is fundamentally inferior to office work for creative collaboration. Studies show that chance encounters — the hallway conversation, the overheard brainstorm — account for up to 30% of innovative ideas at major tech companies. Slack messages and Zoom calls are poor substitutes for the bandwidth of in-person interaction. Companies that went fully remote in 2020 have seen patent filings drop by 15% compared to hybrid peers. The data is clear: if you want innovation, you need butts in seats."

high · 5.6

Judge: Main argument correctly captured in one sentence. Rating of 6/10 is reasonable. The justification identifies some real issues (anecdotal tone, confounding factors) but misses the strongest critiques: correlation vs causation in the patent filing claim, no control for company size/industry, and the cherry-picked nature of the 30% statistic. Adequate but not incisive.

The main argument of this passage is that in-person office work is superior to remote work for fostering creative collaboration and innovation. I would rate the convincingness of this argument a 6 out of 10 with justification. While the passage cites some data points about patent filings and the im...
ollama/llama3.1:8b · 2026-03-01
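
The Summarization category score of 5.8/10 is consistent with a plain average of the five per-prompt scores listed above. The page does not state how the category score is computed, so the unweighted mean below is an assumption, offered as a quick sanity check rather than a description of the actual scoring pipeline:

```python
from statistics import mean

# Per-prompt judge scores as listed in the Summarization section.
scores = {
    "Summarization #1": 7.0,
    "Summarization #2": 6.4,
    "Summarization #3": 5.4,
    "Summarization #4": 4.4,
    "Summarization #5": 5.6,
}

# Assumption: the category score is the unweighted mean, rounded to one decimal.
category_score = round(mean(scores.values()), 1)
print(f"{category_score}/10 ({len(scores)} prompts)")
```

The unrounded mean is 5.76, which rounds to the 5.8 shown in the category header. The "Best: 8.0" and "Worst: 0.6" figures fall outside this set of five scores, so they presumably span other eval categories.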