Quality
Testing and CI
Understand which Origin checks run locally, which run in GitHub Actions, and which evals stay manual.
At a glance
01
Origin splits checks by cost and signal: fast local checks gate development; heavy evals stay manual.
02
PR CI must prove daemon correctness, but retrieval-quality claims need separate eval discipline.
01
Why checks are layered
Origin ships as a local daemon, a CLI, an MCP server, a core library, and a shared types crate. Running one slow mega-check on every change would slow down routine contribution work.
The repo separates correctness checks from quality measurement. Tests and clippy gate normal changes; coverage and evals inform decisions without pretending to be cheap smoke tests.
Check layers
Local iteration: targeted cargo test / cargo check
Pre-commit: cargo fmt --all + clippy on changed crates
Pre-push: workspace clippy + workspace library tests
PR CI: fmt, lint, tests for daemon crates
Coverage: informational on PR, not a local gate
Manual eval: GPU/API-backed benchmarks, run on demand
02
Local verification
Use targeted crate tests while iterating, then run full formatting, clippy, and tests before opening or merging a PR.
The public contributor path expects evidence. If a change affects behavior, include the smallest relevant test rather than relying on manual inspection.
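As a rough illustration of the full pre-PR pass, the three workspace-wide commands listed under Contributor checks can be chained so the run stops at the first failure. The function name run_checks and its optional runner argument are hypothetical conveniences, not something the repo is known to provide.

```shell
# Hypothetical wrapper: runs the full pre-PR checks in order and stops
# at the first failing command. Pass "echo" as the first argument to
# print the commands instead of running them (dry run).
run_checks() {
  local runner="${1:-}"
  $runner cargo fmt --check --all &&
  $runner cargo clippy --workspace --all-targets -- -D warnings &&
  $runner cargo test --workspace
}
```

Calling run_checks with no argument executes the commands; run_checks echo only prints them, which is handy for pasting the exact invocations into a PR description.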
Contributor checks
cargo fmt --check --all
cargo clippy --workspace --all-targets -- -D warnings
cargo test --workspace
# faster iteration examples
cargo test -p origin-core --lib
cargo test -p origin-server
03
Git hooks
The repo includes hooks for routine local guardrails. Pre-commit handles formatting and changed-crate clippy. Pre-push runs workspace clippy plus library tests.
Hooks reduce CI churn, but they do not replace the final PR checks. Treat them as early feedback.
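The changed-crate clippy step might look roughly like the sketch below. This is not the repo's actual hook: it assumes crates live under a crates/<name>/ directory, and the function names changed_crates and clippy_changed are made up for illustration.

```shell
# Hypothetical helper: reads file paths on stdin and prints the unique
# crate names they belong to. Assumes a crates/<name>/... layout, which
# is an assumption about the repo, not a documented fact.
changed_crates() {
  awk -F/ '$1 == "crates" && NF >= 3 { print $2 }' | sort -u
}

# Runs clippy once per crate that has staged changes and stops at the
# first crate that fails. Paths outside crates/ are ignored.
clippy_changed() {
  local crate
  git diff --cached --name-only | changed_crates | while read -r crate; do
    cargo clippy -p "$crate" --all-targets -- -D warnings </dev/null || exit 1
  done
}
```

Limiting clippy to crates with staged files keeps the commit-time check short, while the workspace-wide run is left to pre-push and CI.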
Hook setup
bash scripts/setup-hooks.sh
# hooks then run focused checks before commit/push
git commit
git push
04
CI and coverage
GitHub Actions runs the required PR gate: formatting, linting, and tests across the daemon workspace. Coverage runs separately as an informational signal.
Coverage is not a pre-push percentage gate. The project intentionally avoids local coverage gates that are slow, brittle, and not mirrored by the required CI lane.
05
Manual evals
LoCoMo, LongMemEval, KG faithfulness, page faithfulness, and API-backed judge runs have different cost and hardware requirements. Some run only as ignored tests or manual eval workflows.
Do not cite new retrieval or quality numbers from a casual run. Public benchmark claims should follow the eval docs and state the fixture, model, run count, and limits.
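A minimal sketch of that discipline, under stated assumptions: eval_command prints a cargo invocation that selects only ignored tests (the -- --ignored flag is standard test-harness behavior), and eval_claim_header refuses to print a claim header unless fixture, model, run count, and limits are all supplied. Both function names, and any crate or test-filter names passed to them, are placeholders rather than names taken from the eval docs.

```shell
# Hypothetical helper: prints (does not run) the cargo command for an
# eval that is marked as an ignored test. Crate and filter are placeholders.
eval_command() {
  local crate="$1" filter="$2"
  printf 'cargo test -p %s %s -- --ignored --nocapture\n' "$crate" "$filter"
}

# Hypothetical helper: prints a claim header only when all four fields
# named in the text are present and the run count is a whole number.
eval_claim_header() {
  if [ "$#" -ne 4 ]; then
    echo "usage: eval_claim_header FIXTURE MODEL RUNS LIMITS" >&2
    return 2
  fi
  local field
  for field in "$@"; do
    if [ -z "$field" ]; then
      echo "error: every field must be non-empty" >&2
      return 1
    fi
  done
  case "$3" in
    *[!0-9]*)
      echo "error: run count must be a whole number" >&2
      return 1
      ;;
  esac
  printf 'fixture: %s\nmodel: %s\nruns: %s\nlimits: %s\n' "$1" "$2" "$3" "$4"
}
```

Printing the command rather than executing it keeps the exact invocation available for the eval log, and the header check makes a missing field fail loudly before a number is quoted anywhere.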
06
Before asking for review
A good PR says what changed, why it matters, and how it was tested. Include the commands that prove the change instead of saying it should work.
Docs-only changes still need a build. Code changes should include relevant tests or a clear explanation of why the behavior is covered elsewhere.