Code Generation #3

Write a Python class for a simple LRU (Least Recently Used) cache with a configurable max size. It should support `get(key)` and `put(key, value)`. Don't use `functools.lru_cache` or any other built-in LRU implementation. Include docstrings and at least 3 test cases that demonstrate eviction behavior.

9 models responded

o4-mini · OpenAI
high · 9.0

Judge: Textbook LRU cache implementation using doubly-linked list + hashmap for O(1) get/put. Code is well-structured with clear docstrings, helper methods, and proper encapsulation. Three test cases cover basic eviction, update-without-eviction, and access-order-affecting-eviction. No use of functools.lru_cache. Clean, correct, and efficient.

Below is a self-contained Python implementation of a simple LRU cache using a doubly linked list + hash map. It supports `get(key)` and `put(key, value)`, and evicts the least-recently-used item when ...
openai/o4-mini · 2026-03-02
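
The doubly linked list + hash map design the judge describes can be sketched as follows. This is a minimal illustration, not o4-mini's actual response (which is truncated above); the names `_Node`, `_unlink`, `_push_front`, and `_nodes` are assumptions chosen for this example, and the sentinel head/tail nodes are one common way to avoid `None` checks.

```python
class _Node:
    """Doubly linked list node holding one cache entry (hypothetical helper)."""

    __slots__ = ("key", "value", "prev", "next")

    def __init__(self, key=None, value=None):
        self.key = key
        self.value = value
        self.prev = None
        self.next = None


class LRUCache:
    """LRU cache backed by a dict (key -> node) and a doubly linked list.

    The list runs from most recently used (right after the head sentinel)
    to least recently used (right before the tail sentinel).
    """

    def __init__(self, max_size):
        if max_size <= 0:
            raise ValueError("max_size must be a positive integer")
        self.max_size = max_size
        self._nodes = {}
        self._head = _Node()  # sentinel: never holds data
        self._tail = _Node()  # sentinel: never holds data
        self._head.next = self._tail
        self._tail.prev = self._head

    def _unlink(self, node):
        """Detach node from the list in O(1)."""
        node.prev.next = node.next
        node.next.prev = node.prev

    def _push_front(self, node):
        """Insert node right after the head sentinel (most recent position)."""
        node.prev = self._head
        node.next = self._head.next
        self._head.next.prev = node
        self._head.next = node

    def get(self, key):
        """Return the value for key, or None if absent; a hit refreshes recency."""
        node = self._nodes.get(key)
        if node is None:
            return None
        self._unlink(node)
        self._push_front(node)
        return node.value

    def put(self, key, value):
        """Insert or update key; evict the least recently used entry when full."""
        node = self._nodes.get(key)
        if node is not None:
            # Updating an existing key never evicts; it only refreshes recency.
            node.value = value
            self._unlink(node)
            self._push_front(node)
            return
        if len(self._nodes) >= self.max_size:
            lru = self._tail.prev  # node just before the tail sentinel
            self._unlink(lru)
            del self._nodes[lru.key]
        node = _Node(key, value)
        self._nodes[key] = node
        self._push_front(node)


# Access order affects eviction: reading "a" makes "b" the eviction victim.
cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")
cache.put("c", 3)
assert cache.get("b") is None
assert cache.get("a") == 1
```

Both `get` and `put` touch only a constant number of pointers plus one dict operation, which is where the O(1) claim comes from.
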
DeepSeek R1 · DeepSeek
high · 9.0

Judge: Excellent implementation using OrderedDict for O(1) get/put operations. Correct eviction behavior with move_to_end() and popitem(last=False). Comprehensive docstrings, proper class structure, dunder methods (__len__, __contains__). Four test cases all demonstrating eviction behavior including edge case capacity=1. Does not use functools.lru_cache. Clean, idiomatic Python. The implementation is textbook quality.

Here's a Python implementation of an LRU cache using a dictionary for O(1) lookups and an OrderedDict to maintain access order: ```python from collections import OrderedDict class LRUCache: """ ...
deepseek/deepseek-reasoner · 2026-03-02
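
For comparison, the `OrderedDict` approach the judge credits here can be sketched in a few lines. This is an illustrative reconstruction, not DeepSeek R1's actual code (the preview above is truncated); the class name `OrderedDictLRUCache` and the `_data` attribute are assumptions. It relies only on `OrderedDict.move_to_end()` and `OrderedDict.popitem(last=False)` from the standard library.

```python
from collections import OrderedDict


class OrderedDictLRUCache:
    """LRU cache that keeps entries in an OrderedDict, oldest first."""

    def __init__(self, max_size):
        if max_size <= 0:
            raise ValueError("max_size must be a positive integer")
        self.max_size = max_size
        self._data = OrderedDict()

    def get(self, key, default=None):
        """Return the value for key (refreshing its recency) or default."""
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        """Insert or update key, evicting the oldest entry if over capacity."""
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.max_size:
            self._data.popitem(last=False)  # drop the least recently used entry

    def __len__(self):
        return len(self._data)

    def __contains__(self, key):
        # Design choice in this sketch: membership tests do not refresh recency.
        return key in self._data


# Edge case mentioned by the judge: with capacity 1, every new key evicts the old one.
tiny = OrderedDictLRUCache(1)
tiny.put("a", 1)
tiny.put("b", 2)
assert "a" not in tiny
assert tiny.get("b") == 2
assert len(tiny) == 1
```
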
Claude Opus · Anthropic
medium · 7.6

Judge: Correct O(1) approach using doubly-linked list + hashmap, which is the optimal implementation. Good docstrings, proper class structure, handles capacity validation. Response truncated but the visible architecture is textbook-correct LRU cache design. The sentinel node pattern for head/tail is clean. Truncation prevents verifying test cases fully.

Here's a Python implementation of an LRU cache with a doubly linked list and hash map: ```python class Node: """Node for doubly linked list used in LRU cache.""" def __init__(self, key=None, ...
anthropic/claude-opus-4-20250514 · 2026-03-01
Claude Sonnet · Anthropic
medium · 7.0

Judge: Uses the optimal doubly-linked-list + hashmap approach for O(1) operations. Clean class structure with nested _Node class, sentinel nodes, good docstrings. Response is truncated mid-implementation so we can't verify the full code or test cases, but the architecture is correct and the approach demonstrates strong CS fundamentals. Deducting for incompleteness due to truncation.

Here's a Python implementation of an LRU cache using a doubly linked list and hash map for O(1) operations: ```python class LRUCache: """ A Least Recently Used (LRU) cache implementation with...
anthropic/claude-sonnet-4-20250514 · 2026-03-01
Claude Haiku · Anthropic
medium · 5.8

Judge: Uses OrderedDict for O(1) get/put — the right algorithm choice. The get() method with move_to_end is correct. Constructor validates max_size. Truncated during put() method, so we can't see eviction logic or test cases. The visible code quality (docstrings, type hints, structure) is strong, but incomplete delivery hurts significantly.

Hard constraint failed: Code is truncated — put() method is incomplete and no test cases are visible.
```python from collections import OrderedDict from typing import Any, Optional class LRUCache: """ A Least Recently Used (LRU) Cache implementation. When the cache reaches its maxim...
anthropic/claude-haiku-4-5-20251001 · 2026-03-01
GPT-4o · OpenAI
medium · 5.7

Judge: Uses the correct O(1) doubly-linked list + hashmap approach with good docstrings and class structure. However, the response is truncated mid-method (put) and the required 3+ test cases demonstrating eviction behavior are missing from the delivered output.

Hard constraint failed: Response truncated before completion -- put() method incomplete and required 3+ test cases are not present in the output.
Certainly! Below is a Python class implementing a simple LRU (Least Recently Used) cache with a configurable maximum size. The class provides `get(key)` and `put(key, value)` methods to interact with ...
openai/gpt-4o · 2026-03-01
Phi-4 · Microsoft
medium · 5.2

Judge: Valid Python, doesn't use functools.lru_cache, has get/put methods. The implementation is functionally correct but uses a list for key ordering, making get() O(n) due to list.remove(), a known inefficiency the rubric penalizes. Docstrings are thorough. Three test cases were begun, but the response is cut off partway through them. The core implementation is solid if suboptimal.

```python class LRUCache: """ A simple implementation of an LRU (Least Recently Used) cache. Attributes: max_size (int): The maximum number of items the cache can hold. ca...
ollama/phi4 · 2026-03-01
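
The inefficiency the judge points to is easy to see in a small sketch of the list-ordered design. This is not Phi-4's actual code (the preview above is truncated); `max_size` and `cache` follow the attribute names visible in the excerpt, while the class name `ListOrderLRUCache` and the `order` list are assumptions made for illustration.

```python
class ListOrderLRUCache:
    """LRU cache that tracks recency in a plain list (least recent first).

    Behaviorally correct, but list.remove() and list.pop(0) both scan or
    shift the list, so get() and put() cost O(n) rather than O(1).
    """

    def __init__(self, max_size):
        if max_size <= 0:
            raise ValueError("max_size must be a positive integer")
        self.max_size = max_size
        self.cache = {}
        self.order = []  # hypothetical name: keys from least to most recent

    def get(self, key):
        """Return the value for key, or None if absent."""
        if key not in self.cache:
            return None
        self.order.remove(key)  # O(n): linear search for the key
        self.order.append(key)  # O(1): mark as most recently used
        return self.cache[key]

    def put(self, key, value):
        """Insert or update key, evicting the least recently used entry if full."""
        if key in self.cache:
            self.order.remove(key)  # O(n)
        elif len(self.cache) >= self.max_size:
            oldest = self.order.pop(0)  # O(n): shifts every remaining element
            del self.cache[oldest]
        self.cache[key] = value
        self.order.append(key)


# Same eviction behavior as the O(1) designs, only slower for large caches.
cache = ListOrderLRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")
cache.put("c", 3)
assert cache.get("b") is None
assert cache.order == ["a", "c"]
```

Swapping the list for an `OrderedDict` or a doubly linked list removes the linear scans without changing the eviction order.
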
medium · 3.8

Judge: Good LRU implementation using OrderedDict with O(1) operations. The core get/put logic is correct. However, the response is truncated mid-test-case (cuts off at 'value3'), so only 2 assert statements are visible rather than the required 3. The output also begins with a stray <|python_tag|> token. Docstrings are present and clear.

Hard constraint failed: Must include at least 3 test cases -- response is truncated and only shows 2 incomplete test assertions.
<|python_tag|>from collections import OrderedDict class LRUCache: """ A simple Least Recently Used (LRU) cache. This cache supports `get(key)` and `put(key, value)` operations. The m...
ollama/llama3.1:8b · 2026-03-01
medium · 2.6

Judge: Response is truncated — describes the correct approach (doubly linked list + hashmap for O(1) operations) and starts defining a Node class with docstring, but the code is cut off. No complete implementation, no test cases. The algorithm choice is correct but nothing executable is provided.

Hard constraint failed: Code is truncated and incomplete. No get()/put() methods visible. No test cases included.
The Least Recently Used (LRU) cache is a cache replacement algorithm that discards the least recently used items first. This implementation uses a combination of a dictionary for O(1) average-time com...
gemini/gemini-2.5-flash · 2026-03-02