Most AI risk frameworks score the whole role. RoleOS scores at the task level. The reason is simple: AI does not replace jobs. It changes which tasks inside a job stay, change, or disappear.
The RoleOS framework sorts every task into one of three buckets: automate, augment, keep human. The Automation Risk Score (ARS) is the sub-layer that feeds the automate bucket. The Strategic Moat Score is a second layer that weights the defensibility of the same task.
ARS answers a specific question: how vulnerable is this task to being done by current AI tools, with current quality, at current cost?
What ARS actually measures
ARS is not a prediction of where AI will be in five years. It is a measurement of where AI is right now, applied to the task in question, in the context of the role. That distinction matters. Future-state speculation is interesting. Current-state automatability is actionable.
A task with a high ARS today is a task you can hand to AI this quarter. A task with a low ARS today is a task that still needs the human, for reasons grounded in the actual capability of the tools you can buy.
The three inputs into the score
Three dimensions feed into the Automation Risk Score for any task.
Task structure. Is the task rule-based or judgment-based? Rule-based tasks (data lookups, status reports, formatting work) score high. Judgment-based tasks (prioritization calls, stakeholder management, ambiguity resolution) score low.
Data availability. Does the task have clean, structured inputs? Tasks with well-defined data flows score high. Tasks that require context, history, or relationship knowledge score low.
Human oversight required. Can the task run without supervision? Tasks with low oversight requirements score high. Tasks where every output carries judgment risk score low.
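One way to picture how the three inputs could combine is a small weighted sum. This is a minimal sketch, not the RoleOS formula: the 0 to 1 rating scale, the equal weights, the 0 to 100 output range, and the names TaskInputs, WEIGHTS, and automation_risk_score are all assumptions made for illustration. The article does not publish the real weighting or the benchmark calibration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskInputs:
    """Hypothetical 0-1 ratings for the three inputs. Higher = more automatable."""
    structure: float  # 1.0 = fully rule-based, 0.0 = pure judgment
    data: float       # 1.0 = clean structured inputs, 0.0 = context or relationship knowledge
    autonomy: float   # 1.0 = runs without supervision, 0.0 = every output needs review


# Assumed equal weights. The real weighting is not stated in the article.
WEIGHTS = {"structure": 1 / 3, "data": 1 / 3, "autonomy": 1 / 3}


def automation_risk_score(task: TaskInputs) -> float:
    """Combine the three ratings into an illustrative 0-100 score."""
    for name in WEIGHTS:
        value = getattr(task, name)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    weighted = sum(WEIGHTS[name] * getattr(task, name) for name in WEIGHTS)
    return round(100 * weighted, 1)


# Example ratings are invented for illustration only.
status_report = TaskInputs(structure=0.9, data=0.8, autonomy=0.9)
stakeholder_call = TaskInputs(structure=0.2, data=0.3, autonomy=0.1)
print(automation_risk_score(status_report) > automation_risk_score(stakeholder_call))
```

Under these assumptions a rule-based status report scores well above a judgment-heavy stakeholder call, which matches the direction described for each input.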
The four-quadrant model
Underneath the three-way framework, each task also plots against two deeper scoring layers: Automation Risk on one axis, Strategic Moat on the other. The result is a four-quadrant grid that informs which of the three buckets a task lands in, with a recommended action for each quadrant.
High risk, low moat. Automate now. The most leverage with the least loss.
High risk, high moat. Augment carefully. AI does the heavy lifting. The human owns the call.
Low risk, low moat. Monitor. AI is not ready yet, and nothing about the task protects it once the tools improve.
Low risk, high moat. Keep human. This is the value the role exists to deliver.
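The four quadrants above reduce to a two-threshold lookup. The sketch below assumes both scores sit on a 0 to 100 scale and that 50 is the cutoff between low and high. Neither the scale nor the cutoff comes from the article, and the names Action and quadrant_action are hypothetical.

```python
from enum import Enum


class Action(Enum):
    AUTOMATE_NOW = "automate now"
    AUGMENT_CAREFULLY = "augment carefully"
    MONITOR = "monitor"
    KEEP_HUMAN = "keep human"


# Assumed cutoff between "low" and "high" on an assumed 0-100 scale.
THRESHOLD = 50.0


def quadrant_action(risk: float, moat: float, threshold: float = THRESHOLD) -> Action:
    """Map an automation risk score and a strategic moat score to a quadrant action."""
    high_risk = risk >= threshold
    high_moat = moat >= threshold
    if high_risk and not high_moat:
        return Action.AUTOMATE_NOW
    if high_risk and high_moat:
        return Action.AUGMENT_CAREFULLY
    if not high_risk and not high_moat:
        return Action.MONITOR
    return Action.KEEP_HUMAN  # low risk, high moat


print(quadrant_action(risk=85, moat=20).value)  # automate now
```

A real grid would probably treat scores near the cutoff more carefully than a hard threshold does. The point here is only that the recommended action follows mechanically once both scores exist.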
Most leaders score a role by asking "can AI do this job?" That is the wrong question. AI does not do jobs. It does tasks.
How RoleOS applies the score
Every task in the decomposition gets an ARS. The score is not subjective. It derives from the three inputs above, weighted against benchmark data from prior engagements. When two analysts score the same task independently, the numbers should land within a tight band. That is the test for whether the framework is doing its job.
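The two-analyst test can be run as a simple comparison. This sketch flags every task where two independent scores differ by more than a chosen band. The 10-point band, the 0 to 100 scale, the function name out_of_band, and the sample scores are assumptions for illustration, since the article does not say how tight the band should be.

```python
from __future__ import annotations


def out_of_band(
    scores_a: dict[str, float],
    scores_b: dict[str, float],
    band: float = 10.0,  # assumed tolerance in score points
) -> list[str]:
    """Return the tasks both analysts scored whose scores differ by more than `band`."""
    shared_tasks = scores_a.keys() & scores_b.keys()
    return sorted(
        task for task in shared_tasks
        if abs(scores_a[task] - scores_b[task]) > band
    )


# Invented example scores from two hypothetical analysts.
analyst_a = {"status report": 85, "stakeholder call": 20, "invoice matching": 70}
analyst_b = {"status report": 80, "stakeholder call": 45, "invoice matching": 72}

print(out_of_band(analyst_a, analyst_b))  # ['stakeholder call']
```

Tasks that land on the flagged list are the ones worth discussing, because a wide gap suggests the inputs were read differently rather than that the task is hard to score.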
The ARS also lets the same task be re-scored over time. As tools improve, scores rise. As edge cases emerge, scores fall. The framework is built to evolve with the AI landscape, not to lock in a one-time judgment.
ARS is calibrated against real engagement data, not theory. Each new engagement adds calibration weight to future scores.
Common questions about the Automation Risk Score
How is ARS different from a generic AI risk assessment?
Whole-role assessments score a job. ARS scores every task inside that job. The two outputs are not interchangeable, and only the second one tells you what to redesign.
Is the score predictive or descriptive?
Descriptive of current capability. The score is meant to be re-run as the AI landscape changes, so the recommendations stay grounded in what AI can actually do today.
Can a high-ARS task still need humans?
Yes, if oversight requirements are high. Oversight is only one of the three inputs, so a rule-based task with clean data can still score high overall while its outputs need review. A task can be technically automatable and still require a named human accountable for the output.
What if the task changes after we automate it?
Re-score it. The task profile is meant to evolve with the role redesign. Once a task shifts shape, its risk and moat scores both move.
RoleOS scoring is calibrated against real engagement data and grounded in research-backed task analysis.