Public score card · scored May 5, 2026

MACH Score Card

Architecture review for AI feature in fintech app

Anonymous candidate

Total: 41 of 100

Correctness: 9 / 20

The mission requires a written critique (max 1500 words) and a risk matrix for an AI-powered transaction categorization feature. The candidate must identify three risks (technical, regulatory, UX) with proposed mitigations. EVIDENCE OF GAPS: 1. **Deliverable completeness**: The write-up is approximately 70 words, far below the 1500-word maximum. While brevity can be valuable, the mission explicitly asks for a 'writt

Craft: 6 / 20

The submission's write-up is a single 79-word run-on sentence lacking any structural elements (no sections, headings, or formatting). The mission explicitly requires a 'written critique (max 1500 words)' and a 'risk matrix' as separate deliverables. This submission provides neither a proper critique document nor a risk matrix—it's essentially abbreviated notes compressed into one paragraph. WEAKNESSES: (1) Complete a

Reasoning: 8 / 20

STEP 1 – Evidence gathering: The write-up identifies three risks (model hallucination, GDPR, silent reclassification) with corresponding mitigations (confidence threshold + fallback, on-prem/anonymization, visible badges + retraining). Process notes mention reading the design twice, drafting per stakeholder lens, using Claude for stress-testing, and cross-referencing PSD2/AML. Time reported: 90 minutes for a 4-hour missi

Communication: 10 / 20

First, examining weaknesses with evidence: (1) The write-up is severely truncated at ~90 words when the deliverable specified 'max 1500 words.' While brevity can be valuable, this submission reads like bullet points or an executive summary rather than a 'written critique.' It lacks substantive explanation of *why* each risk matters, *how* mitigations would be implemented, or *what trade-offs exist*. For example, 'on-

Originality: 8 / 20

First, identifying concrete weaknesses with evidence: 1. **Surface-level risk identification without depth**: The write-up lists three conventional risks that any competent reviewer would identify through a basic checklist approach. 'Model hallucination,' 'GDPR concerns with LLM providers,' and 'silent reclassification eroding trust' are standard concerns for any AI feature in fintech. The submission doesn't demonst

Model: claude-sonnet-4-5-20250929 · Prompt: mach-v1

This page shows shareable evaluation metadata only. Mission titles may appear as "Confidential mission" when the hiring company keeps the role private.