Instruction Following #3

Write a professional email declining a meeting invitation. Constraints:
- Exactly 3 paragraphs
- Total length between 75 and 100 words (inclusive)
- Must include a specific alternative time suggestion
- Tone: polite but firm
- Must NOT use the words "sorry", "unfortunately", or "regret"
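The mechanical constraints in this prompt (paragraph count, word range, forbidden words, presence of a clock time) can be checked automatically. The following is a minimal sketch, not the judge's actual method: `check_email` and its rules are hypothetical, paragraphs are assumed to be separated by blank lines, words are assumed to be whitespace-separated tokens, and the time check is a rough regex heuristic. Tone is not checked at all.

```python
import re

# Words the prompt bans. Matching is whole-word and case-insensitive, so
# "regrets" or "regrettably" would NOT be flagged (an assumption of this sketch).
FORBIDDEN = ("sorry", "unfortunately", "regret")


def check_email(text: str) -> dict:
    """Check the mechanical constraints of the prompt on a plain-text email."""
    # Assumption: a paragraph is any non-empty block separated by blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", text.strip()) if p.strip()]
    # Assumption: a word is any whitespace-separated token.
    words = text.split()
    lowered = text.lower()
    forbidden_found = [w for w in FORBIDDEN if re.search(rf"\b{w}\b", lowered)]
    # Heuristic: a clock time such as "2 PM" or "2:00 pm" counts as a
    # specific alternative time.
    has_time = bool(re.search(r"\b\d{1,2}(:\d{2})?\s*(am|pm)\b", lowered))
    return {
        "paragraphs": len(paragraphs),
        "paragraphs_ok": len(paragraphs) == 3,
        "words": len(words),
        "words_ok": 75 <= len(words) <= 100,
        "forbidden_found": forbidden_found,
        "has_time": has_time,
    }
```

Because the sketch counts every blank-line-separated block, a subject line, greeting, or sign-off would each count as a paragraph, which is the same ambiguity several judge comments on this page point out.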

9 models responded

o4-mini · OpenAI
high · 8.4

Judge: Exactly 3 paragraphs, 84 words (within 75-100 range), no forbidden words ('sorry', 'unfortunately', 'regret'), and includes a specific alternative time (Thursday March 12 at 2:00 PM). Tone is polite but firm -- doesn't grovel, states the conflict clearly, offers an alternative. All constraints met.

Thank you for inviting me to the project kickoff meeting on March 10 at 10:00 AM. I appreciate the opportunity to discuss the upcoming milestones and strategies with the team. I have a conflicting co...
openai/o4-mini · 2026-03-02
Claude Haiku · Anthropic
high · 8.2

Judge: Exactly 3 paragraphs, 82 words (within 75-100 range), no forbidden words (sorry/unfortunately/regret), includes specific alternative time (Thursday 2:00 PM). Tone is polite but firm without groveling. All hard constraints met. Minor deduction: the subject line wasn't requested and could be seen as extra text.

**Subject: Re: Meeting Invitation – [Date]** Thank you for inviting me to the meeting scheduled for [original date/time]. I appreciate the opportunity to participate and the time you've taken to orga...
anthropic/claude-haiku-4-5-20251001 · 2026-03-01
GPT-4o · OpenAI
medium · 7.5

Judge: Meets the core constraints: 3 body paragraphs, word count near 75-100 range, no forbidden words, specific alternative time (Thursday at 2 PM). Tone is polite and firm but slightly stiff. The sign-off line and subject header are extras that don't violate constraints.

Subject: Re: Meeting Invitation Dear [Name], Thank you for considering me for the meeting. I am unable to attend at the proposed time due to prior commitments. Your understanding is appreciated as I...
openai/gpt-4o · 2026-03-01
high · 5.6

Judge: Includes a specific alternative time (Tuesday, October 24th, 2:00 PM PST) and avoids 'sorry', 'unfortunately', and 'regret'. Professional tone, and at 90 words it is within the 75-100 range. However, the email has 5 paragraphs (salutation, body P1, body P2, body P3, sign-off) rather than exactly 3, so the paragraph count constraint is clearly violated.

Hard constraint failed: Email has 5 paragraphs (including salutation and sign-off lines) rather than exactly 3.
Subject: Regarding your meeting invitation Dear [Invitee Name], Thank you for the meeting invitation. I appreciate you thinking of me and the opportunity to connect. However, I will be unable to att...
gemini/gemini-2.5-flash · 2026-03-02
Claude Sonnet · Anthropic
high · 5.2

Judge: 3 paragraphs as required, no forbidden words, includes alternative times. However, the body is approximately 72 words -- under the 75-word minimum. Including 'Best regards, [Your name]' might bring it to 74-75, but this is borderline. The tone is well-calibrated: polite, firm, not apologetic. The constraint violation on word count is the main issue.

Hard constraint failed: Email body is approximately 72 words, below the required 75-100 word range.
Subject: Unable to Attend Meeting - Alternative Time Proposed Thank you for the meeting invitation regarding the quarterly review. I must decline the scheduled time due to a prior commitment that can...
anthropic/claude-sonnet-4-20250514 · 2026-03-01
Claude Opus · Anthropic
high · 4.4

Judge: Fails the word count hard constraint: the body is approximately 69 words, below the 75-word minimum. Also arguably has 4 paragraphs if counting the sign-off block. No forbidden words used. Alternative time suggested (Thursday 2 PM). Tone is appropriately polite and firm. The content quality is good but the constraint violations are clear and significant for an instruction-following task.

Hard constraint failed: Body word count is approximately 69, below the required minimum of 75 words. Sign-off block may constitute a 4th paragraph, violating the 3-paragraph constraint.
Subject: Re: Meeting Invitation - Alternative Time Proposed Thank you for the meeting invitation. I have a prior commitment during the proposed time slot that cannot be rescheduled. I remain interes...
anthropic/claude-opus-4-20250514 · 2026-03-01
high · 4.2

Judge: Has 3 body paragraphs, includes specific alternative time (Wednesday at 2 PM), avoids forbidden words. However, word count is borderline: the body (from greeting to sign-off) is approximately 92 words. The subject line and signature push total to ~103 words. Whether the word count is measured on just the body or the whole email affects compliance. Has 5 paragraph-like blocks total (subject, greeting, 3 body, signature) which muddies the '3 paragraphs' constraint. Tone is polite but firm, which is good.

Subject: Declining Meeting Invitation for [Meeting Date] Dear [Host's Name], I appreciate your invitation to meet on [Meeting Date]. However, I must decline as my schedule is fully committed until t...
ollama/llama3.1:8b · 2026-03-01
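The body-versus-whole-email ambiguity the judge raises for this response can be made concrete by counting both ways. This is a rough sketch under stated assumptions: `count_both`, `is_body`, and the `GREETINGS` and `CLOSINGS` lists are hypothetical, blocks are assumed to be separated by blank lines, and the classification of subject lines, greetings, and sign-offs is a simple prefix heuristic rather than anything the judge is known to use.

```python
import re

# Hypothetical prefix lists used to recognise non-body blocks.
GREETINGS = ("dear ", "hi ", "hello ")
CLOSINGS = ("best regards", "kind regards", "regards", "sincerely", "best")


def split_blocks(email: str) -> list[str]:
    """Split an email into non-empty blocks separated by blank lines."""
    return [b.strip() for b in re.split(r"\n\s*\n", email.strip()) if b.strip()]


def is_body(block: str) -> bool:
    """Heuristic: subject lines, short greetings and short sign-offs are not body."""
    low = block.lower()
    n_words = len(block.split())
    if low.startswith("subject:"):
        return False
    if low.startswith(GREETINGS) and n_words <= 5:
        return False
    if low.startswith(CLOSINGS) and n_words <= 8:
        return False
    return True


def count_both(email: str) -> dict:
    """Return paragraph and word counts for the whole email and for the body only."""
    blocks = split_blocks(email)
    body = [b for b in blocks if is_body(b)]

    def tally(parts: list[str]) -> dict:
        return {"paragraphs": len(parts), "words": sum(len(p.split()) for p in parts)}

    return {"whole": tally(blocks), "body": tally(body)}
```

An email can pass the 3-paragraph and 75-100 word checks under the `body` counts and fail them under the `whole` counts, so a scoring rubric needs to state which scope it applies.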
Phi-4 · Microsoft
high · 4.0

Judge: The email body has 2 substantive paragraphs plus a sign-off block and a meta-note, not exactly 3 paragraphs of email content (hard constraint issue). Word count is approximately 98 including the sign-off, just inside the 75-100 range. No forbidden words used. Includes a specific alternative time (April 5th at 10:00 AM). The extra '(Note: ...)' line and placeholder brackets reduce professionalism. Tone is polite and firm.

Hard constraint failed: Exactly 3 paragraphs. The email has 2 content paragraphs plus sign-off and meta-note, not 3 clean paragraphs.
Subject: Re: Meeting Invitation on March 15th Dear [Recipient's Name], Thank you for your invitation to the meeting scheduled for March 15th. After reviewing my commitments, I find it necessary to a...
ollama/phi4 · 2026-03-01
DeepSeek R1 · DeepSeek
high · 0.0

Judge: Empty response. No email, no paragraphs, no content provided.

Hard constraint failed: Empty response fails all hard constraints -- no paragraphs, no word count, no alternative time suggestion.
deepseek/deepseek-reasoner · 2026-03-02