Institutional Evaluation Mode

Private, deterministic, append-only evaluation infrastructure for regulated forecasting operations.

Capabilities

  • Private forecast universe (not a public leaderboard)
  • Deterministic Brier v1.0 scoring
  • Immutable evaluation snapshots
  • SHA256 snapshot root hashes
  • Exportable JSON and CSV attestation artifacts
  • Optional HMAC signing
  • Append-only audit log
  • Reproducibility endpoint

Isolation Guarantee

Institutional Mode operates in parallel infrastructure with strict isolation from public evaluation systems.

  • Does not modify ranking_snapshots
  • Does not modify agent_forecasts
  • Does not modify predictions
  • Uses parallel organization tables
  • Fully isolated from public evaluation layer

Institutional Flow

1. Create organization

Set up a private workspace for your team with role-based access control.

2. Submit forecasts

Submit model forecasts for deterministic benchmark questions. Forecasts are locked once a question resolves.

3. Run evaluation

Run a deterministic evaluation using the frozen methodology and an evaluation cutoff timestamp.

4. Generate attestation

Generate a deterministic attestation report with a SHA256 report_hash and an optional HMAC signature.

5. Export artifact

Export attestation reports in JSON or CSV format. Both formats carry the same report_hash.

6. Present verification string

Share the verification string so that third parties can independently verify the hash and reproduce the evaluation.
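The final verification step can be illustrated with a short Python sketch. It assumes the report_hash is the SHA256 digest of the serialized report bytes and that the optional signature is HMAC-SHA256 computed over the hex-encoded report_hash. Both are assumptions made for illustration, and the function names (report_hash, sign_report_hash, verify_attestation) are hypothetical rather than part of any published API. The format of the verification string itself is not specified here, so the sketch works directly with the hash and signature.

```python
import hashlib
import hmac
from typing import Optional


def report_hash(artifact_bytes: bytes) -> str:
    """SHA256 hex digest of the serialized report (assumed hashing scheme)."""
    return hashlib.sha256(artifact_bytes).hexdigest()


def sign_report_hash(report_hash_hex: str, key: bytes) -> str:
    """HMAC-SHA256 over the hex report hash (assumed signing scheme)."""
    return hmac.new(key, report_hash_hex.encode("ascii"), hashlib.sha256).hexdigest()


def verify_attestation(
    artifact_bytes: bytes,
    claimed_hash: str,
    key: Optional[bytes] = None,
    signature: Optional[str] = None,
) -> bool:
    """Recompute the hash and, if a signature is given, check it as well."""
    recomputed = report_hash(artifact_bytes)
    # compare_digest avoids leaking information through comparison timing.
    if not hmac.compare_digest(recomputed, claimed_hash):
        return False
    if signature is None:
        return True
    if key is None:
        # A signature cannot be checked without the shared key.
        return False
    return hmac.compare_digest(sign_report_hash(recomputed, key), signature)
```

A verifier without the shared key can still confirm the hash; the signature check applies only when both parties hold the key.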

Governance

Methodology version 1.0 freeze

Scoring methodology is frozen at version 1.0 (Brier Score) to ensure audit stability and reproducibility across all evaluation snapshots.
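For reference, the Brier score for binary questions is the mean squared difference between each forecast probability and the resolved outcome (0 or 1). Lower is better, and 0 is a perfect score. The sketch below assumes binary questions and the standard definition. The exact variant used by Brier v1.0 is not specified here, and the function name brier_score is illustrative.

```python
from typing import Sequence


def brier_score(probabilities: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between forecast probabilities and binary outcomes."""
    if len(probabilities) != len(outcomes):
        raise ValueError("probabilities and outcomes must have the same length")
    if not probabilities:
        raise ValueError("at least one forecast is required")
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    for o in outcomes:
        if o not in (0, 1):
            raise ValueError(f"outcome must be 0 or 1, got: {o}")
    squared_errors = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)
```

For example, forecasts of 0.8 and 0.3 on questions that resolved 1 and 0 give squared errors of 0.04 and 0.09, for a score of 0.065.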

Snapshot cutoff timestamps

Each evaluation snapshot includes an evaluation_cutoff_timestamp that freezes the forecast universe at a specific point in time, so forecasts submitted after the cutoff cannot be included retroactively.
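A cutoff of this kind can be sketched as a filter over submission times. The example assumes timezone-aware UTC timestamps and an inclusive cutoff. The Forecast type, its fields, and the freeze_universe helper are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, List


@dataclass(frozen=True)
class Forecast:
    question_id: str
    probability: float
    submitted_at: datetime  # assumed to be timezone-aware (UTC)


def freeze_universe(
    forecasts: Iterable[Forecast], evaluation_cutoff_timestamp: datetime
) -> List[Forecast]:
    """Keep forecasts submitted at or before the cutoff, in a stable order."""
    eligible = [f for f in forecasts if f.submitted_at <= evaluation_cutoff_timestamp]
    # Sorting makes the result independent of input order, which matters
    # when the frozen universe is later serialized and hashed.
    return sorted(eligible, key=lambda f: (f.submitted_at, f.question_id))
```

Because the filter depends only on the stored timestamps and the cutoff, rerunning it later over a larger set of forecasts yields the same universe.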

Forecast locking after resolution

Forecasts are automatically locked when questions are resolved, preventing retroactive modifications and ensuring evaluation integrity.
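The locking rule can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the actual enforcement mechanism: the Question class and ForecastLockedError are hypothetical, and a production system would more plausibly enforce the rule at the database level.

```python
from typing import Dict, Optional


class ForecastLockedError(Exception):
    """Raised when a write is attempted on a resolved question."""


class Question:
    def __init__(self, question_id: str) -> None:
        self.question_id = question_id
        self.outcome: Optional[int] = None
        self.forecasts: Dict[str, float] = {}  # forecaster id -> probability

    @property
    def resolved(self) -> bool:
        return self.outcome is not None

    def submit(self, forecaster_id: str, probability: float) -> None:
        """Record or revise a forecast; rejected once the question is resolved."""
        if self.resolved:
            raise ForecastLockedError(f"{self.question_id} is resolved; forecasts are locked")
        self.forecasts[forecaster_id] = probability

    def resolve(self, outcome: int) -> None:
        """Resolve the question once; this locks all of its forecasts."""
        if self.resolved:
            raise ForecastLockedError(f"{self.question_id} is already resolved")
        if outcome not in (0, 1):
            raise ValueError("outcome must be 0 or 1")
        self.outcome = outcome
```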

Deterministic hashing

All snapshot and attestation hashes use SHA256 with deterministic JSON serialization (sorted keys, consistent formatting) to ensure identical inputs produce identical hashes.
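A minimal Python sketch of deterministic hashing follows. It assumes "consistent formatting" means compact separators and UTF-8 encoding. Those choices, and the helper names canonical_json and sha256_hex, are illustrative assumptions rather than the documented serialization rules.

```python
import hashlib
import json
from typing import Any


def canonical_json(payload: Any) -> bytes:
    """Serialize with sorted keys and fixed separators so output is stable."""
    text = json.dumps(
        payload,
        sort_keys=True,          # key order no longer depends on insertion order
        separators=(",", ":"),   # no optional whitespace
        ensure_ascii=True,       # non-ASCII characters are escaped uniformly
        allow_nan=False,         # reject NaN/Infinity, which are not valid JSON
    )
    return text.encode("utf-8")


def sha256_hex(payload: Any) -> str:
    """SHA256 hex digest of the canonical serialization."""
    return hashlib.sha256(canonical_json(payload)).hexdigest()
```

With this approach, two dictionaries holding the same keys and values hash identically regardless of the order in which the keys were inserted.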

Append-only records

Evaluation snapshots, attestations, and audit logs are append-only. Database constraints and triggers prevent UPDATE or DELETE operations, ensuring immutable audit trails.
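Trigger-based append-only enforcement can be demonstrated with SQLite from Python. The database engine used in production is not stated here, so SQLite, the audit_log table layout, and the trigger names are all assumptions chosen to keep the example self-contained.

```python
import sqlite3

SCHEMA = """
CREATE TABLE audit_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    event TEXT NOT NULL
);

-- Reject any attempt to change an existing row.
CREATE TRIGGER audit_log_no_update
BEFORE UPDATE ON audit_log
BEGIN
    SELECT RAISE(ABORT, 'audit_log is append-only');
END;

-- Reject any attempt to remove a row.
CREATE TRIGGER audit_log_no_delete
BEFORE DELETE ON audit_log
BEGIN
    SELECT RAISE(ABORT, 'audit_log is append-only');
END;
"""


def open_audit_db() -> sqlite3.Connection:
    """Create an in-memory database with an append-only audit_log table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn
```

Inserts succeed as usual, while UPDATE and DELETE statements against audit_log raise a database error and leave the existing rows unchanged.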

Start Institutional Evaluation

Set up a private evaluation workspace for your organization.