Index formulas (v1)
Versioned formulas for derived indexes: DQI, CCI, AFuI, CRI, OVI, AFI, ACI (indexes/v1)
Derived index formulas (v1)
This document defines version 1 formulas for OpenDecisions derived indexes ("decision labels"). All derived outputs must include:
- value: 0–100 or structured (e.g. band + interval for ACI)
- uncertainty: at least CI90, or a defined confidence rule
- explanation: drivers, missingness, warnings
- citations: Source references (computedFrom)
Formula version: indexes/v1
Schema version: odg/v1
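As an illustration of the four required parts, a derived output could be carried in a small record like the one below. This is a minimal sketch, not the odg/v1 schema: the class name `DerivedOutput` and every field name except `computed_from` (mirroring computedFrom) are hypothetical, and rejecting outputs without citations is an assumed reading of "must include".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Union


@dataclass
class DerivedOutput:
    """Hypothetical container for one derived index (not the odg/v1 schema)."""
    index: str                                   # e.g. "CCI"
    value: Union[float, Dict[str, object]]       # 0-100, or structured (ACI)
    ci90: Tuple[float, float]                    # uncertainty: 90% interval
    computed_from: List[str]                     # citations: source references
    drivers: List[str] = field(default_factory=list)         # explanation
    missing_inputs: List[str] = field(default_factory=list)  # explanation
    warnings: List[str] = field(default_factory=list)        # explanation
    formula_version: str = "indexes/v1"
    schema_version: str = "odg/v1"

    def __post_init__(self):
        # Assumption: an output without any citation is treated as invalid.
        if not self.computed_from:
            raise ValueError("derived outputs must cite at least one source")


example = DerivedOutput(
    index="CRI",
    value=15.0,
    ci90=(12.0, 18.0),
    computed_from=["source:example-graduation-rate"],  # placeholder reference
)
```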
Versioning
- Schema: odg/v1
- Formula version: indexes/v1
- (Optional later) Model version: e.g. aci_model/v1
Any change to formulas must:
- Bump formula_version
- Update release notes / changelog
- Include migration notes if the semantic meaning of an index changes
Notation
- clamp(x, a, b): clamp x into the interval [a, b]
- sigmoid(z) = 1 / (1 + exp(-z))
- zscore(x): computed within an appropriate peer group (e.g. same level and Carnegie band)
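The three helpers can be sketched in Python as follows. Two details are assumptions rather than part of the notation: `zscore` takes the peer group as an explicit list and uses the population standard deviation, and it returns 0.0 when the peer group has no spread.

```python
import math
from statistics import mean, pstdev


def clamp(x, a, b):
    """Clamp x into the closed interval [a, b]."""
    return max(a, min(b, x))


def sigmoid(z):
    """1 / (1 + exp(-z)), written so large negative z does not overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def zscore(x, peers):
    """Standardize x against a peer group (assumed: population std dev)."""
    mu = mean(peers)
    sd = pstdev(peers)
    # Assumption: a peer group with zero spread yields a neutral z of 0.
    return 0.0 if sd == 0 else (x - mu) / sd
```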
1. CCI: Competitiveness / Crowd Index (0–100)
Question: "How competitive is this cycle independent of a given applicant?"
Inputs (cycle-level when available):
- applications, admitted, enrolled
- Optional: year-over-year application growth
Reference formula:
acc = admitted / applications
acc_score = 1 - acc
yield = enrolled / admitted
growth_z = zscore(log(applications) - log(prev_year_applications)) // clipped
yield_z = zscore(yield)
raw = 0.55*acc_score + 0.25*sigmoid(growth_z) + 0.20*sigmoid(yield_z)
CCI = round(100 * raw)
Uncertainty: Binomial CI for acc and yield, then propagate through the formula (e.g. bootstrap).
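A minimal Python sketch of the reference formula, with a simplified interval. It assumes `growth_z` and `yield_z` have already been computed against a peer group (0.0 is used as a neutral default, which is an assumption). For uncertainty it swaps the bootstrap for something simpler: a 90% Wilson interval on the acceptance rate, mapped through the formula, which is valid because CCI decreases monotonically in acc when the other terms are held fixed. It ignores yield uncertainty, so it will understate the full CI.

```python
import math


def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def cci_from_acc(acc, growth_z=0.0, yield_z=0.0):
    """Reference formula, taking the acceptance rate directly."""
    acc_score = 1 - acc
    raw = 0.55 * acc_score + 0.25 * sigmoid(growth_z) + 0.20 * sigmoid(yield_z)
    return round(100 * raw)


def wilson_interval(successes, n, z=1.645):
    """Wilson score interval for a binomial proportion (z=1.645 ~ 90%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def cci_with_ci90(applications, admitted, growth_z=0.0, yield_z=0.0):
    """Return (low, point, high); only acceptance-rate uncertainty is propagated."""
    if applications <= 0 or not 0 <= admitted <= applications:
        raise ValueError("need 0 <= admitted <= applications and applications > 0")
    acc_lo, acc_hi = wilson_interval(admitted, applications)
    point = cci_from_acc(admitted / applications, growth_z, yield_z)
    # Lower acceptance rate means higher competitiveness, so the bounds swap.
    return (cci_from_acc(acc_hi, growth_z, yield_z),
            point,
            cci_from_acc(acc_lo, growth_z, yield_z))


low, point, high = cci_with_ci90(applications=10_000, admitted=2_500)
```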
2. DQI: Data Quality Index (0–100)
Question: "How much should you trust what you see?"
Components (each in [0, 1]):
- staleness: 0 if current; approaches 1 when beyond policy threshold
- missingness: fraction of required inputs missing for the requested label view
- auditability: 1 for official reported; lower for scraped-only or unclear
- inconsistency: cross-source conflicts / unreconciled mismatches
Reference formula:
DQI = 100 - (25*staleness + 45*missingness + 20*(1 - auditability) + 10*inconsistency)
Rules:
- DQI MUST be returned whenever any other index is returned.
- DQI MUST influence presentation (warnings) and uncertainty widening.
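The reference formula translates directly into Python. In this sketch the input validation is an addition (an assumption about how out-of-range components should be handled); the weights are those given above. Because the weights sum to 100 and each component lies in [0, 1], the result stays within 0–100 without clamping.

```python
def dqi(staleness, missingness, auditability, inconsistency):
    """Data Quality Index per indexes/v1; all inputs must lie in [0, 1]."""
    components = {
        "staleness": staleness,
        "missingness": missingness,
        "auditability": auditability,
        "inconsistency": inconsistency,
    }
    for name, v in components.items():
        # Assumption: out-of-range components are rejected, not clamped.
        if not 0.0 <= v <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {v}")
    penalty = (25 * staleness
               + 45 * missingness
               + 20 * (1 - auditability)
               + 10 * inconsistency)
    return 100 - penalty


# Slightly stale, 10% of inputs missing, mostly official sources, no conflicts.
score = dqi(staleness=0.2, missingness=0.1, auditability=0.8, inconsistency=0.0)
```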
3. AFI: Academic Fit Index (0–100)
Question: "How aligned are my academics with the reported admitted range?"
Inputs:
- Applicant academics (GPA, tests where used)
- Admitted distributions (often from institution documents; "submitted-only" caveats in test-optional years)
Reference (simple, explainable):
- Convert applicant value to percentile against admitted distribution:
  - p_gpa in [0,1], p_test in [0,1]
- prereq_match in [0,1] (optional; requirements checklist)
AFI = round(100 * mean(p_gpa, p_test, prereq_match))
Note: AFI is an alignment signal with explicit data caveats, not a "worth" score.
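A rough Python sketch of the two steps. It assumes the admitted distribution is available as a list of values, which is often not the case (many sources publish only quartiles), and it computes the percentile as the share of admitted values at or below the applicant's. Averaging only the components that are present, for example when tests are not used, is also an assumption; the formula above does not say how to treat an omitted optional component.

```python
from statistics import mean


def percentile_rank(value, admitted_values):
    """Share of admitted values <= the applicant's value, in [0, 1]."""
    if not admitted_values:
        raise ValueError("admitted distribution is empty")
    at_or_below = sum(1 for v in admitted_values if v <= value)
    return at_or_below / len(admitted_values)


def afi(p_gpa, p_test=None, prereq_match=None):
    """Academic Fit Index; omitted components are skipped (assumption)."""
    parts = [p for p in (p_gpa, p_test, prereq_match) if p is not None]
    return round(100 * mean(parts))


# Illustrative numbers only, not real admissions data.
p_gpa = percentile_rank(3.8, [3.5, 3.6, 3.7, 3.8, 3.9, 4.0])
fit = afi(p_gpa=0.5, p_test=0.7)
```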
4. AFuI: Affordability & Funding Index (0–100)
Question: "Can I afford this, given my budget?"
Inputs:
- COA components and/or net price distributions
- User's stated ability to pay
Reference formula:
expected_net_cost = median(net_cost_distribution) // or another robust estimator
AFuI = 100 * (1 - min(1, expected_net_cost / ability_to_pay))
Applicability:
- If net price is defined for domestic aid regimes only, publish with an applicability warning.
- Do not claim "international net cost" unless the source supports it.
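A small Python sketch of the AFuI reference formula. Two edge cases are handled by assumption, since the formula does not cover them: a non-positive `ability_to_pay` yields 0, and a negative net cost (aid exceeding cost) is capped so the index cannot exceed 100.

```python
from statistics import median


def afui(net_cost_distribution, ability_to_pay):
    """Affordability & Funding Index in [0, 100]."""
    if not net_cost_distribution:
        raise ValueError("net cost distribution is empty")
    if ability_to_pay <= 0:
        # Assumption: no budget means nothing is affordable.
        return 0.0
    expected_net_cost = median(net_cost_distribution)
    ratio = expected_net_cost / ability_to_pay
    # min(1, ...) is from the reference formula; max(0, ...) is an added guard.
    return 100 * (1 - min(1, max(0, ratio)))


# Illustrative numbers only: median net cost 20,000 against a 40,000 budget.
score = afui([10_000, 20_000, 30_000], ability_to_pay=40_000)
```

Note that under this formula a net cost equal to the budget already scores 0, so the index measures headroom below the budget rather than a pass/fail threshold.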
5. CRI: Completion Risk Index (0–100)
Question: "What's the downside risk of not completing?"
Reference formula:
CRI = 100 * (1 - graduation_rate_T)
- T = 6 years for US undergrad benchmark.
Note: Cohort-level signal, not an individual prediction.
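For completeness, the formula as a one-function Python sketch. The range check is an addition, and `graduation_rate_t` is assumed to be a fraction in [0, 1] rather than a percentage.

```python
def cri(graduation_rate_t):
    """Completion Risk Index: 100 * (1 - cohort graduation rate at T years)."""
    if not 0.0 <= graduation_rate_t <= 1.0:
        raise ValueError("graduation rate must be a fraction in [0, 1]")
    return 100 * (1 - graduation_rate_t)


# Illustrative: an 85% six-year graduation rate gives a CRI near 15.
risk = cri(0.85)
```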
6. OVI: Outcome Value Index (0–100)
Question: "What is the expected upside, conservatively?"
Reference (v1-safe):
OVI = scale_0_100(0.6*grad_rate + 0.4*earnings_score)
Rules:
- If earnings/outcomes are not auditable or comparable, reduce weight and lower DQI.
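One possible reading of the OVI formula in Python. Several things here are assumptions, because the text does not define them: `grad_rate` and `earnings_score` are taken to be fractions in [0, 1], `scale_0_100` is interpreted as multiplying by 100 and clamping, and "reduce weight" is modeled by an `earnings_weight` parameter whose remainder is shifted onto the graduation rate.

```python
def clamp(x, a, b):
    return max(a, min(b, x))


def ovi(grad_rate, earnings_score, earnings_weight=0.4):
    """Outcome Value Index; default weights are 0.6 / 0.4 as in indexes/v1."""
    if not 0.0 <= earnings_weight <= 1.0:
        raise ValueError("earnings_weight must be in [0, 1]")
    # Assumption: weight removed from earnings moves to the graduation rate.
    raw = (1 - earnings_weight) * grad_rate + earnings_weight * earnings_score
    # Assumption: scale_0_100 means "multiply by 100 and clamp to [0, 100]".
    return clamp(100 * raw, 0, 100)


default_score = ovi(grad_rate=0.8, earnings_score=0.5)
# Earnings data judged not comparable: drop its weight entirely.
reduced_score = ovi(grad_rate=0.8, earnings_score=0.5, earnings_weight=0.0)
```

Lowering the earnings weight changes only the OVI value; the matching DQI reduction required by the rule above would be applied separately.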
7. ACI: Admission Chance Index (band + interval; v1-safe)
Question: "Given my profile and this cycle's competitiveness, is this Reach / Target / Likely?"
v1-safe posture: Default to bands and wide uncertainty unless labeled outcomes exist and governance permits probabilistic outputs.
Inputs:
- AFI rescaled to [0, 1] (i.e. AFI/100)
- CCI in [0, 100]
- missingness in [0, 1]
- acc: cycle acceptance rate when available
Reference: composite score and bands:
s = 0.6*AFI + 0.4*(1 - CCI/100)
Band thresholds:
- likely if s >= 0.75
- target if 0.55 <= s < 0.75
- reach if s < 0.55
Conservative probability interval:
p_mid = clamp(acc * (0.5 + 1.0*AFI), 0.01, 0.95)
width = 0.25 + 0.35*missingness
p_low = clamp(p_mid - width/2, 0, 1)
p_high = clamp(p_mid + width/2, 0, 1)
Publish band (reach | target | likely) and the interval (p_low, p_mid, p_high) with clear uncertainty messaging.
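The band and interval rules can be sketched together in Python. The function name and the dictionary keys of the return value are illustrative choices, not part of the schema; `afi` is expected already rescaled to [0, 1], and `cci` on its native 0–100 scale.

```python
def clamp(x, a, b):
    return max(a, min(b, x))


def aci(afi, cci, missingness, acc):
    """Admission Chance Index: band plus a conservative probability interval."""
    # Composite score: academic fit up, cycle competitiveness down.
    s = 0.6 * afi + 0.4 * (1 - cci / 100)
    if s >= 0.75:
        band = "likely"
    elif s >= 0.55:
        band = "target"
    else:
        band = "reach"

    # Midpoint scales the acceptance rate by fit; width grows with missing data.
    p_mid = clamp(acc * (0.5 + 1.0 * afi), 0.01, 0.95)
    width = 0.25 + 0.35 * missingness
    p_low = clamp(p_mid - width / 2, 0.0, 1.0)
    p_high = clamp(p_mid + width / 2, 0.0, 1.0)
    return {"band": band, "score": s,
            "p_low": p_low, "p_mid": p_mid, "p_high": p_high}


# Illustrative profile: strong fit (0.8), CCI of 64, complete data, 25% acceptance.
result = aci(afi=0.8, cci=64, missingness=0.0, acc=0.25)
```

With these inputs the score is 0.624, which falls in the target band, and the interval is roughly (0.20, 0.325, 0.45). Even with no missing inputs the interval is 25 points wide, which matches the v1-safe posture of defaulting to wide uncertainty.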
8. International mobility / work pathway (flags; v1)
This is not a numeric index in v1. Publish structured flags and citations only:
- OPT/CPT policy notes (general; not individualized)
- STEM designation flag (where auditable)
- "Policy can change" warnings
- Visa issuance stats as country-level context, not school-level odds
Summary
| Index | Name | Question | Scale |
|---|---|---|---|
| DQI | Data Quality Index | How much to trust what you see? | 0–100 |
| CCI | Competitiveness / Crowd Index | How competitive is this cycle? | 0–100 |
| AFuI | Affordability & Funding Index | Can I afford this? | 0–100 |
| CRI | Completion Risk Index | What's the downside risk? | 0–100 |
| OVI | Outcome Value Index | What's the expected upside? | 0–100 |
| AFI | Academic Fit Index | How aligned are my academics? | 0–100 (profiled) |
| ACI | Admission Chance Index | Reach / Target / Likely? | Band + interval (profiled) |
See also: DerivedIndex type (schema), Schema reference.