Quality Assurance

14 skills with this tag

affaan-m
Passed
Eval Harness
Eval Harness provides a formal evaluation framework for Claude Code sessions, implementing eval-driven development (EDD) principles. It helps define expected behaviors before implementation, run evals continuously during development, and track pass/fail metrics for both capability and regression tests.
Testing, Evaluation, TDD +3
6332.2k
anthropics
Passed
cookbook-audit
Audit an Anthropic Cookbook notebook based on a rubric. Use whenever a notebook review or audit is requested.
Documentation, Code Review, Jupyter +3
94630.4k
wshobson
Passed
code-review-excellence
Code Review Excellence is a comprehensive guide for conducting effective code reviews. It provides detailed methodologies for reviewing pull requests including checklists for security, performance, and testing, along with templates for feedback and techniques for giving constructive criticism while maintaining team morale.
Code Review, Pull Requests, Best Practices +3
50327.0k
wshobson
Passed
e2e-testing-patterns
A comprehensive guide to end-to-end testing with Playwright and Cypress frameworks. It teaches patterns like Page Object Model, test fixtures, network mocking, visual regression testing, accessibility testing, and debugging strategies for building reliable and maintainable test suites.
E2E Testing, Playwright, Cypress +3
68527.0k
wshobson
Passed
llm-evaluation
This skill teaches comprehensive evaluation strategies for LLM applications, covering automated metrics (BLEU, ROUGE, BERTScore), human evaluation frameworks, LLM-as-Judge patterns using Claude, A/B testing with statistical analysis, and regression detection. It includes ready-to-use Python code examples and integrates with tools like LangSmith.
A/B Testing, Quality Assurance, LLM Evaluation +3
53727.0k
obra
Passed
Verification Before Completion
Use when about to claim work is complete, fixed, or passing, before committing or creating PRs. Requires running verification commands and confirming their output before making any success claims: evidence before assertions, always.
Workflow, Testing, Verification +3
66813.2k
obra
Passed
Requesting Code Review
Use when completing tasks, implementing major features, or before merging, to verify that work meets requirements.
Code Review, Workflow, Git +3
69613.2k
anthropics
Passed
Code Review
A code review skill that launches multiple AI agents to audit pull requests for bugs and guideline compliance, filtering results by confidence score to reduce false positives.
Code Review, Pull Requests, GitHub +3
3592.1k
alinaqi
Passed
Code Review
A comprehensive code review skill that enforces automated code reviews before commits and deployments. It supports multiple AI engines (Claude, OpenAI Codex, Google Gemini) and provides integration patterns for pre-commit hooks and GitHub Actions CI/CD pipelines.
Code Review, CI/CD, GitHub Actions +3
429453
rsmdt
Passed
Specification Validation
A specification validation skill that ensures quality of PRDs, SDDs, and implementation plans using the 3Cs framework (Completeness, Consistency, Correctness). It can validate individual files, compare implementations against specifications, check cross-document alignment, and validate understanding of design decisions.
Specification, Validation, Documentation +3
546168
rsmdt
Passed
Implementation Verification
This skill ensures code implementations match documented specifications (PRD, SDD, implementation plans). It checks interface contracts, data structures, business logic, and architecture decisions against requirements, then provides structured compliance reports with deviation classification (critical, notable, acceptable).
Specification, Compliance, Validation +3
467168
NeoLabHQ
Passed
Agent Evaluation
A comprehensive evaluation framework for assessing Claude Code agents, commands, and skills. Provides LLM-as-Judge implementation patterns, multi-dimensional rubrics, bias mitigation techniques, and metrics for measuring agent quality across instruction following, completeness, tool efficiency, reasoning, and coherence.
Evaluation, Quality Assurance, LLM-as-Judge +3
529160
rsmdt
Passed
Code Review
Coordinate multi-agent code review with specialized perspectives. Use when conducting code reviews, analyzing PRs, evaluating staged changes, or reviewing specific files. Handles security, performance, quality, and test coverage analysis with confidence scoring and actionable recommendations.
Code Review, Security Analysis, Multi-Agent +3
420135
frmoretto
Passed
Clarity Gate
Clarity Gate is a document verification system that checks whether claims are properly marked as uncertain or validated before documents enter RAG knowledge bases. It helps prevent LLMs from mistaking assumptions for facts by enforcing epistemic markers and requiring human-in-the-loop verification for unverified claims.
Documentation, RAG, Verification +3
7715