A tool for evaluating the quality of web code generated by Large Language Models (LLMs) using configurable checks and automated repair.
Web Codegen Scorer is a specialized evaluation tool designed to assess the quality of web code produced by Large Language Models. It enables developers to make evidence-based decisions by providing consistent, repeatable measurements across different models, prompts, and frameworks, moving beyond trial-and-error approaches.
Developers and teams using LLMs to generate web application code, particularly those who need to systematically compare models, optimize prompts, or monitor code quality over time.
It focuses specifically on web code and uses well-established quality metrics like build success, runtime errors, accessibility, and security, rather than relying on generic benchmarks. It also offers automated repair attempts and a visual report viewer for comparison.
Allows setting up evaluations with different LLM models, web frameworks, and tooling, as detailed in the command-line flags and environment config reference (see the sketch after this list).
Assesses generated code for build success, runtime errors, accessibility, security, and coding best practices, providing comprehensive, empirical metrics beyond generic benchmarks.
Can automatically fix issues detected during code generation, with configurable repair attempts via the --max-build-repair-attempts flag, reducing manual intervention.
Focuses specifically on web code with established quality metrics, as emphasized in the project's philosophy, making it more relevant to web development work than broad coding benchmarks.
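To make the configuration model concrete, here is a minimal sketch of what an environment config and eval invocation might look like. Only the --max-build-repair-attempts flag comes from this listing; the config field names, file path, and command shape are assumptions for illustration, so consult the project's environment config reference for the actual schema.

```ts
// my-env.config.ts -- illustrative only; these field names are assumptions,
// not the tool's documented schema.
export default {
  displayName: "Angular + Gemini baseline",          // label shown in reports
  model: "gemini-2.5-pro",                           // LLM used for generation
  framework: "angular",                              // target web framework
  prompts: ["prompts/todo-app.md"],                  // prompts to evaluate
  checks: ["build", "runtime", "a11y", "security"],  // quality checks to run
};

// Hypothetical invocation (the command shape is an assumption; the repair
// flag is mentioned in this listing):
//   web-codegen-scorer eval --env=my-env.config.ts --max-build-repair-attempts=2
```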
Requires setting up multiple API keys as environment variables and configuring evaluations through files (see the sketch after this list), which can be a barrier for quick adoption or casual use.
The README notes that more checks are coming soon, and key features such as interaction testing are still on the roadmap, so there are gaps in what it can currently assess.
Relies on external LLM APIs for both code generation and rating, leading to potential costs and vendor lock-in, with no built-in cost controls or offline alternatives.
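To illustrate the setup burden mentioned above, the sketch below shows the kind of pre-flight check a team might script before running an eval. GEMINI_API_KEY appears in the project's documentation; the other key names are assumptions based on the providers the tool targets, so adjust them to whatever your environment actually requires.

```ts
// Hypothetical pre-flight check run before invoking the CLI (e.g. from an npm script).
// Key names other than GEMINI_API_KEY are assumptions, not the tool's documented list.
const requiredKeys = ["GEMINI_API_KEY", "OPENAI_API_KEY", "ANTHROPIC_API_KEY"];

// Collect any provider keys that are missing from the environment.
const missing = requiredKeys.filter((key) => !process.env[key]);

if (missing.length > 0) {
  console.error(`Missing API keys: ${missing.join(", ")}`);
  process.exit(1);
}

console.log("All provider keys present; safe to run web-codegen-scorer.");
```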