Versioned formulas for derived indexes: DQI, CCI, AFuI, CRI, OVI, AFI, ACI (indexes/v1)

Derived index formulas (v1)

This document defines version 1 formulas for OpenDecisions derived indexes ("decision labels"). All derived outputs must include:

value: 0–100 or structured (e.g. band + interval for ACI)
uncertainty: at least a 90% confidence interval (CI90), or a defined confidence rule
explanation: drivers, missingness, warnings
citations: Source references (computedFrom)
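As an illustration of the four required parts, the sketch below bundles them into one record for the scalar 0–100 case. The class name, field names, and validation rules are hypothetical and are not taken from the odg/v1 schema; the "source-ref-1" citation is a placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class DerivedIndexOutput:
    """Hypothetical container for one scalar derived index (not the odg/v1 schema)."""
    name: str                  # e.g. "CCI"
    value: float               # on the 0-100 scale
    ci90: tuple                # (low, high) 90% interval on the same scale
    computed_from: list        # source references backing the value
    drivers: list = field(default_factory=list)
    missing_inputs: list = field(default_factory=list)
    warnings: list = field(default_factory=list)
    formula_version: str = "indexes/v1"
    schema_version: str = "odg/v1"

    def __post_init__(self):
        low, high = self.ci90
        if not 0 <= self.value <= 100:
            raise ValueError("value must be on the 0-100 scale")
        if not low <= self.value <= high:
            raise ValueError("ci90 must bracket the value")
        if not self.computed_from:
            raise ValueError("at least one source citation is required")


# Placeholder citation; a real output would reference actual Source records.
example = DerivedIndexOutput("CCI", 64, (58, 70), ["source-ref-1"])
```

Rejecting an output with no citations at construction time is one way to make the "must include" rule hard to bypass; a structured ACI output (band plus interval) would need a different value type.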

Formula version: indexes/v1
Schema version: odg/v1

Versioning

Schema: odg/v1
Formula version: indexes/v1
Model version (optional, planned for later): e.g. aci_model/v1

Any change to formulas must:

Bump formula_version
Update release notes / changelog
Include migration notes if the semantic meaning of an index changes

Notation

clamp(x, a, b): clamp x into the interval [a, b]
sigmoid(z) = 1 / (1 + exp(-z))
zscore(x): computed within an appropriate peer group (e.g. same level and Carnegie band)
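The three helpers can be sketched in Python as follows. The choice of population standard deviation and the zero-variance fallback in zscore are assumptions; this document does not specify either, and the peer group is passed in explicitly as a list.

```python
import math
from statistics import mean, pstdev


def clamp(x, a, b):
    """Clamp x into the closed interval [a, b]."""
    return max(a, min(b, x))


def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def zscore(x, peers):
    """Standardize x against a peer group.

    Assumption: population standard deviation, and 0.0 when the peers do not vary.
    """
    mu = mean(peers)
    sd = pstdev(peers)
    return 0.0 if sd == 0 else (x - mu) / sd
```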

1. CCI: Competitiveness / Crowd Index (0–100)

Question: "How competitive is this cycle independent of a given applicant?"

Inputs (cycle-level when available):

applications, admitted, enrolled
Optional: year-over-year application growth

Reference formula:

acc = admitted / applications
acc_score = 1 - acc
yield = enrolled / admitted
growth_z = zscore(log(applications) - log(prev_year_applications))  // clipped
yield_z = zscore(yield)
raw = 0.55*acc_score + 0.25*sigmoid(growth_z) + 0.20*sigmoid(yield_z)
CCI = round(100 * raw)
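A minimal Python sketch of the reference formula follows. It assumes the two z-scores have already been computed (and clipped) against the peer group, so enrolled counts do not appear directly; the default of 0.0 for each z-score stands for "at the peer average" and is an assumption for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cci(applications, admitted, growth_z=0.0, yield_z=0.0):
    """CCI per the v1 reference formula, with peer-group z-scores supplied by the caller."""
    if applications <= 0 or not 0 <= admitted <= applications:
        raise ValueError("need 0 <= admitted <= applications and applications > 0")
    acc = admitted / applications
    acc_score = 1.0 - acc
    raw = 0.55 * acc_score + 0.25 * sigmoid(growth_z) + 0.20 * sigmoid(yield_z)
    return round(100 * raw)


print(cci(10000, 2500))                # 64
print(cci(10000, 2500, growth_z=1.0))  # 70
```

With a 25% acceptance rate and peer-average growth and yield, raw = 0.55*0.75 + 0.25*0.5 + 0.20*0.5 = 0.6375, which rounds to 64.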

Uncertainty: compute binomial confidence intervals for acc and yield, then propagate them through the formula (e.g. by bootstrap).
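One way to propagate the binomial uncertainty is a parametric bootstrap: redraw admitted and enrolled counts from binomial distributions, recompute the formula, and read off the 5th and 95th percentiles. The sketch below is one possible approach, not the prescribed method. It assumes growth_z is held fixed and that the peer-group mean and standard deviation of yield (yield_mean, yield_sd, both hypothetical parameters) are known.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cci_raw(acc, growth_z, yield_z):
    return 0.55 * (1.0 - acc) + 0.25 * sigmoid(growth_z) + 0.20 * sigmoid(yield_z)


def draw_binomial(rng, n, p):
    """Draw a Binomial(n, p) count as a sum of Bernoulli trials (stdlib only)."""
    return sum(1 for _ in range(n) if rng.random() < p)


def cci_ci90(applications, admitted, enrolled, growth_z,
             yield_mean, yield_sd, n_boot=500, seed=0):
    """Approximate 90% interval for CCI (0-100 scale) by parametric bootstrap."""
    if yield_sd <= 0:
        raise ValueError("yield_sd must be positive")
    rng = random.Random(seed)
    acc = admitted / applications
    yld = enrolled / admitted
    samples = []
    for _ in range(n_boot):
        adm = max(1, draw_binomial(rng, applications, acc))  # avoid dividing by zero
        enr = draw_binomial(rng, adm, yld)
        yield_z = (enr / adm - yield_mean) / yield_sd
        samples.append(100.0 * cci_raw(adm / applications, growth_z, yield_z))
    samples.sort()
    return samples[int(0.05 * (n_boot - 1))], samples[int(0.95 * (n_boot - 1))]


low, high = cci_ci90(2000, 500, 200, growth_z=0.0, yield_mean=0.4, yield_sd=0.1)
```

Drawing counts one Bernoulli trial at a time keeps the sketch dependency-free but is slow for large cycles; a vectorized binomial sampler would be the natural replacement.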

2. DQI: Data Quality Index (0–100)

Question: "How much should you trust what you see?"

Components (each in [0, 1]):

staleness: 0 if current; approaches 1 when beyond policy threshold
missingness: fraction of required inputs missing for the requested label view
auditability: 1 for officially reported data; lower for scraped-only or unclear provenance
inconsistency: cross-source conflicts / unreconciled mismatches

Reference formula:

DQI = 100 - (25*staleness + 45*missingness + 20*(1 - auditability) + 10*inconsistency)
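The penalty weights sum to 100, so DQI stays within 0–100 whenever each component is in [0, 1]. A minimal Python sketch, with the range check added as an assumption about how invalid inputs should be handled:

```python
def dqi(staleness, missingness, auditability, inconsistency):
    """DQI per the v1 reference formula; every component must lie in [0, 1]."""
    components = {
        "staleness": staleness,
        "missingness": missingness,
        "auditability": auditability,
        "inconsistency": inconsistency,
    }
    for name, value in components.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    penalty = (25 * staleness + 45 * missingness
               + 20 * (1 - auditability) + 10 * inconsistency)
    return 100 - penalty


# Slightly stale, 10% of inputs missing, fully auditable, no conflicts.
score = dqi(staleness=0.2, missingness=0.1, auditability=1.0, inconsistency=0.0)
```

In that example the penalty is 25*0.2 + 45*0.1 = 9.5, giving a DQI of about 90.5.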

Rules:

DQI MUST be returned whenever any other index is returned.
DQI MUST influence presentation (warnings) and uncertainty widening.

3. AFI: Academic Fit Index (0–100)

Question: "How aligned are my academics with the reported admitted range?"

Inputs:

Applicant academics (GPA, tests where used)
Admitted distributions (often from institution documents; "submitted-only" caveats in test-optional years)

Reference (simple, explainable):

Convert applicant value to percentile against admitted distribution: p_gpa in [0,1], p_test in [0,1]
prereq_match in [0,1] (optional; requirements checklist)

AFI = round(100 * mean(p_gpa, p_test, prereq_match))
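A minimal Python sketch of this calculation follows. It makes two assumptions that this document does not state: the percentile is an empirical rank against a sample of admitted values (institutions often publish only quartiles), and components that are unavailable are dropped from the mean rather than treated as zero. The GPA values are made up for illustration.

```python
from bisect import bisect_right
from statistics import mean


def percentile_in(value, admitted_values):
    """Fraction of admitted values at or below `value` (empirical percentile in [0, 1])."""
    ordered = sorted(admitted_values)
    return bisect_right(ordered, value) / len(ordered)


def afi(p_gpa=None, p_test=None, prereq_match=None):
    """AFI as the rounded mean of whichever components are available (assumption)."""
    parts = [p for p in (p_gpa, p_test, prereq_match) if p is not None]
    if not parts:
        raise ValueError("at least one component is required")
    return round(100 * mean(parts))


admitted_gpas = [3.2, 3.5, 3.6, 3.8, 3.9]   # illustrative sample, not real data
p_gpa = percentile_in(3.7, admitted_gpas)   # 3 of 5 values are <= 3.7
print(afi(p_gpa=p_gpa, p_test=0.5))         # 55
```

Dropping missing components keeps a test-optional applicant from being penalized, but it also means two AFI values may rest on different inputs, so the explanation field should list which components were used.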

Note: AFI is an alignment signal with explicit data caveats, not a "worth" score.

4. AFuI: Affordability & Funding Index (0–100)

Question: "Can I afford this, given my budget?"

Inputs:

COA components and/or net price distributions
User's stated ability to pay

Reference formula:

expected_net_cost = median(net_cost_distribution)   // or another robust estimator
AFuI = 100 * (1 - min(1, expected_net_cost / ability_to_pay))
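A minimal Python sketch of the reference formula follows. Two guards are assumptions added here, not part of the formula: a non-positive ability_to_pay returns 0 instead of dividing by zero, and a negative net cost (aid exceeding cost) is treated as zero so the result cannot exceed 100. The cost figures are illustrative.

```python
from statistics import median


def afui(net_cost_distribution, ability_to_pay):
    """AFuI per the v1 reference formula, using the median as the robust estimator."""
    if ability_to_pay <= 0:
        return 0.0  # assumption: nothing is affordable with no stated budget
    expected_net_cost = median(net_cost_distribution)
    ratio = max(0.0, expected_net_cost / ability_to_pay)  # assumption: floor at 0
    return 100 * (1 - min(1, ratio))


print(afui([18000, 20000, 26000], ability_to_pay=40000))  # 50.0
```

Note that under this formula AFuI reaches 0 as soon as the expected net cost equals the stated ability to pay, and reaches 100 only when the expected net cost is zero.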

Applicability:

If net price is defined for domestic aid regimes only, publish with an applicability warning.
Do not claim "international net cost" unless the source supports it.

5. CRI: Completion Risk Index (0–100)

Question: "What's the downside risk of not completing?"

Reference formula:

CRI = 100 * (1 - graduation_rate_T)

T = 6 years for the US undergraduate benchmark.

Note: Cohort-level signal, not an individual prediction.

6. OVI: Outcome Value Index (0–100)

Question: "What is the expected upside, conservatively?"

Reference (v1-safe):

OVI = scale_0_100(0.6*grad_rate + 0.4*earnings_score)
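The formula leaves scale_0_100 and the input ranges undefined. The Python sketch below assumes that grad_rate and earnings_score are both already normalized to [0, 1] and that scale_0_100 is a plain multiplication by 100; either assumption may differ from the intended definition.

```python
def scale_0_100(x):
    """Assumed scaling: map a value in [0, 1] onto 0-100."""
    return 100.0 * x


def ovi(grad_rate, earnings_score):
    """OVI under the assumption that both inputs are normalized to [0, 1]."""
    for name, value in (("grad_rate", grad_rate), ("earnings_score", earnings_score)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return scale_0_100(0.6 * grad_rate + 0.4 * earnings_score)


# 80% graduation rate with a mid-range earnings score gives roughly 68.
value = ovi(grad_rate=0.8, earnings_score=0.5)
```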

Rules:

If earnings/outcomes are not auditable or comparable, reduce weight and lower DQI.

7. ACI: Admission Chance Index (band + interval; v1-safe)

Question: "Given my profile and this cycle's competitiveness, is this Reach / Target / Likely?"

v1-safe posture: Default to bands and wide uncertainty unless labeled outcomes exist and governance permits probabilistic outputs.

Inputs:

AFI rescaled to [0, 1] (i.e. the 0–100 AFI divided by 100)
CCI in [0, 100]
missingness in [0, 1]
acc: cycle acceptance rate when available

Reference (composite score and bands):

s = 0.6*AFI + 0.4*(1 - CCI/100)

Band thresholds:

likely if s >= 0.75
target if 0.55 <= s < 0.75
reach if s < 0.55

Conservative probability interval:

p_mid = clamp(acc * (0.5 + 1.0*AFI), 0.01, 0.95)
width = 0.25 + 0.35*missingness
p_low = clamp(p_mid - width/2, 0, 1)
p_high = clamp(p_mid + width/2, 0, 1)

Publish band (reach | target | likely) and the interval (p_low, p_mid, p_high) with clear uncertainty messaging.
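Putting the composite score, band thresholds, and interval together, a minimal Python sketch looks like the following. The function name and return shape are hypothetical; afi is expected on the [0, 1] scale and cci on the 0–100 scale, as in the inputs list.

```python
def clamp(x, a, b):
    return max(a, min(b, x))


def aci(afi, cci, missingness, acc):
    """Return (band, (p_low, p_mid, p_high)) per the v1 reference formulas.

    afi: Academic Fit Index rescaled to [0, 1]
    cci: Competitiveness Index on 0-100
    missingness: fraction of required inputs missing, in [0, 1]
    acc: cycle acceptance rate, in [0, 1]
    """
    s = 0.6 * afi + 0.4 * (1 - cci / 100)
    if s >= 0.75:
        band = "likely"
    elif s >= 0.55:
        band = "target"
    else:
        band = "reach"

    p_mid = clamp(acc * (0.5 + 1.0 * afi), 0.01, 0.95)
    width = 0.25 + 0.35 * missingness   # more missing data widens the interval
    p_low = clamp(p_mid - width / 2, 0, 1)
    p_high = clamp(p_mid + width / 2, 0, 1)
    return band, (p_low, p_mid, p_high)


band, (p_low, p_mid, p_high) = aci(afi=0.7, cci=64, missingness=0.2, acc=0.25)
```

For these inputs s = 0.42 + 0.144 = 0.564, which falls in the target band, and the interval is roughly (0.14, 0.30, 0.46). Even with no missing data the interval is 0.25 wide, which matches the v1-safe posture of wide uncertainty by default.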

8. International mobility / work pathway (flags; v1)

This is not a numeric index in v1. Publish structured flags and citations only:

OPT/CPT policy notes (general; not individualized)
STEM designation flag (where auditable)
"Policy can change" warnings
Visa issuance stats as country-level context, not school-level odds

Summary

Index	Name	Question	Scale
DQI	Data Quality Index	How much to trust what you see?	0–100
CCI	Competitiveness / Crowd Index	How competitive is this cycle?	0–100
AFuI	Affordability & Funding Index	Can I afford this?	0–100
CRI	Completion Risk Index	What's the downside risk?	0–100
OVI	Outcome Value Index	What's the expected upside?	0–100
AFI	Academic Fit Index	How aligned are my academics?	0–100 (profiled)
ACI	Admission Chance Index	Reach / Target / Likely?	Band + interval (profiled)

See also: DerivedIndex type (schema), Schema reference.
