
AI Fluency Assessment.
Measured Through Performance
Most AI fluency tools rely on self-report. Candidates and employees rate themselves 1 to 5, and the data is unreliable. Bryq measures AI fluency through scenario performance, whether you are hiring AI-fluent talent or identifying the genuine experts on your existing team. Real tasks. Real AI tools. Scored outputs.

Why self-reported AI fluency does not work
Self-rated skill correlates weakly with measured skill, especially in novel domains. The original Kruger & Dunning study (Journal of Personality and Social Psychology, 1999) has been replicated across domains since, and AI is the textbook case. People who have used ChatGPT three times often rate themselves higher than people who have built and deployed production AI workflows for two years. This matters for hiring decisions. It matters even more inside your existing team, where the loudest self-rated experts are often not the ones doing the strongest work.
For hiring decisions, this introduces unacceptable variance. A self-report score does not distinguish the confident-but-incompetent candidate from the genuinely fluent one. The same problem shows up inside an organisation when you are mapping internal AI capability. Worse, self-rating introduces bias. Research consistently shows demographic patterns in self-assessment confidence that have nothing to do with actual capability. Performance-based measurement addresses both problems, variance and bias, whether you are hiring or baselining your existing team.
Most "AI fluency assessment" content on the market today is self-administered. Candidates check boxes about which AI tools they have used, rate their confidence, and the system aggregates the scores. It is fast. It is also nearly worthless for hiring.
What "performance-based AI fluency" means
Anthropic's AI Fluency Framework defines fluency as a set of trainable behaviours. Bryq complements that with the measurement layer. Where Anthropic's framework describes what fluency looks like, Bryq's assessment tells you who has it. The candidate or employee is given a scenario, a set of AI tools, and clear constraints. They perform the task within a time window and submit the output. Bryq scores the output against multiple dimensions: quality, completeness, error detection, ethical handling, time efficiency.
What the candidate cannot do: bluff. The output is the data. If they say they are fluent and produce a hallucinated mess, they are not fluent. If they say they are a beginner and produce strong, evaluated work in 12 minutes, they are fluent. The measurement matches the reality.
The five dimensions of AI fluency Bryq measures
Dimension | What it measures
AI Task Strategy | Picks the right approach quickly. Knows when to use AI vs. when not to. Sets clear hand-off boundaries.
Prompting & Interaction | Designs effective prompts on first or second attempt. Iterates productively. Gets useful output efficiently.
Critical Evaluation | Spots hallucinations, bias, and weak output. Verifies before submitting. Knows what "good" looks like.
Ethical & Responsible Use | Handles sensitive data correctly. Maintains transparency. Escalates appropriately. No corners cut.
Workflow Integration | Embeds AI naturally into the work. No friction. Quality checks built in. Time-efficient.
Fluency is the manifestation of competency across all five dimensions. A person can be competent in any single dimension and not fluent overall; fluency requires the dimensions to operate together as a natural extension of the work.
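One way to picture "competent in one dimension but not fluent overall" is a weakest-link rule rather than an average. The sketch below is illustrative only: the five dimension names come from the list above, but the 0 to 100 scale, the threshold of 60, and the function names are assumptions made for this example, not Bryq's actual scoring method.

```python
# Dimension names are taken from the page; the scale, threshold, and
# function names are hypothetical and exist only for illustration.
DIMENSIONS = (
    "AI Task Strategy",
    "Prompting & Interaction",
    "Critical Evaluation",
    "Ethical & Responsible Use",
    "Workflow Integration",
)


def weakest_dimension(scores: dict[str, float]) -> tuple[str, float]:
    """Return the lowest-scoring dimension and its score."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing dimension scores: {missing}")
    name = min(DIMENSIONS, key=lambda d: scores[d])
    return name, scores[name]


def is_fluent(scores: dict[str, float], threshold: float = 60.0) -> bool:
    """Treat a person as fluent only if every dimension clears the threshold."""
    _, lowest = weakest_dimension(scores)
    return lowest >= threshold


# A strong prompter who rarely verifies output: high average, not fluent.
specialist = {
    "AI Task Strategy": 80.0,
    "Prompting & Interaction": 95.0,
    "Critical Evaluation": 35.0,
    "Ethical & Responsible Use": 75.0,
    "Workflow Integration": 85.0,
}
print(weakest_dimension(specialist))
print(is_fluent(specialist))
```

Under these assumed numbers the specialist averages 74 but fails the check, because a single weak dimension (here, Critical Evaluation) blocks overall fluency. An averaging rule would hide exactly the gap the paragraph above describes.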
Sample tasks: what candidates actually do
Three anonymised examples of the kinds of scenarios candidates and employees work through. The same tasks run for hiring and for internal baselining, with role-relative scoring. Specific items rotate to maintain assessment integrity; the structure stays the same.
Task 1: Customer reply with constraints
Given an inbound customer complaint with mixed factual and emotional content, the candidate uses an AI assistant to draft a reply. Constraints: must address the complaint fully, must not commit to refunds beyond policy, must maintain professional tone, must complete in under 8 minutes.
Scoring captures: prompt quality, output quality, constraint adherence, error detection (was the AI's first draft factually correct?), final-output suitability.
Task 2: Noisy data extraction
Given a messy dataset (CSV with formatting errors, missing values, ambiguous labels), the candidate uses an AI tool to extract structured insights. Constraints: must identify at least three places where the AI output is likely wrong; must rank the reliability of each insight; must complete in under 10 minutes.
Scoring captures: tool-use efficiency, accuracy of error identification, ranking quality, judgement about what the AI got right vs. wrong.
Task 3: Ethical-use review
Given an AI-generated business plan that contains a subtle ethical issue (e.g., a marketing approach that crosses GDPR consent boundaries), the candidate must identify the issue and rewrite the section. Constraints: must explain the specific ethical concern; must propose a defensible alternative; must complete in under 12 minutes.
Scoring captures: ethical-issue detection, depth of reasoning, quality of the alternative, defensibility of the proposed solution.
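The three sample tasks share one shape: a scenario, a time limit, a list of constraints, and a list of things the scoring captures. A minimal sketch of that shape follows, with the time limits and scoring criteria copied from the task descriptions above. The `TaskSpec` class, its field names, and the `within_time_limit` helper are hypothetical, and the constraint lists are omitted for brevity; this is not a representation of how Bryq stores or runs its tasks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskSpec:
    """Hypothetical container for one scenario; field names are assumptions."""
    name: str
    time_limit_minutes: int
    scoring_criteria: tuple[str, ...]


# Time limits and criteria are copied from the three sample tasks.
TASKS = (
    TaskSpec(
        "Customer reply with constraints",
        8,
        ("prompt quality", "output quality", "constraint adherence",
         "error detection", "final-output suitability"),
    ),
    TaskSpec(
        "Noisy data extraction",
        10,
        ("tool-use efficiency", "accuracy of error identification",
         "ranking quality", "judgement about what the AI got right vs. wrong"),
    ),
    TaskSpec(
        "Ethical-use review",
        12,
        ("ethical-issue detection", "depth of reasoning",
         "quality of the alternative", "defensibility of the proposed solution"),
    ),
)


def within_time_limit(task: TaskSpec, elapsed_minutes: float) -> bool:
    """Each task says 'under N minutes', so the comparison is strict."""
    return elapsed_minutes < task.time_limit_minutes


total_minutes = sum(task.time_limit_minutes for task in TASKS)
print(total_minutes)
print(within_time_limit(TASKS[0], 7.5))
```

Keeping the structure fixed while swapping the scenario content is one plausible reading of "specific items rotate; the structure stays the same".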
AI fluency vs AI proficiency vs AI competency
Three terms that overlap heavily. The disambiguation:
AI competency: the framework. Structured model of skills, knowledge, and behaviours. Use this term when designing an L&D programme or capability map.
AI proficiency: the practical performance. What a person can do with AI in their actual work. Use this term in hiring decisions.
AI fluency: the natural-extension quality. How smoothly the person operates AI as part of how they work. Use this term for senior roles and leadership pipeline.
Bryq's framework measures the same five dimensions for all three. The output framing changes; the measurement does not. For the full mapping see the disambiguation guide.
Use cases for hiring and for your current team
For hiring: senior and leadership-pipeline roles
Fluency is the right term when the question is depth, not basic competence. Senior individual contributors and managers who lead AI-augmented teams need to operate AI as an extension of how they work. The assessment differentiates fluent candidates from competent ones at the offer stage.
For hiring: AI-intensive technical roles
Engineers, ML specialists, and AI-product roles benefit from performance-based fluency measurement that goes beyond technical interviews. The framework captures the ethical, evaluation, and workflow-integration dimensions that pure technical interviews often miss.
For your current team: internal capability differentiation
Where a workforce has uneven AI capability and you need to identify the high-fluency individuals (for mentoring, internal advocacy, or programme design), the performance-based assessment surfaces who is genuinely fluent versus who self-reports fluency they do not have. The same assessment that screens candidates also baselines current employees, with role-relative scoring.
For your current team: re-baselining as tools change
AI tooling evolves quickly. The fluency that mattered in mid-2025 is not the fluency that matters in 2027. Run the assessment annually across the workforce to track real capability change over time. The longitudinal data is far more useful than year-on-year self-report surveys, which mostly track confidence inflation.
Customer evidence
Some customers use Bryq for technical-role hiring where AI fluency matters in real time. Others, AI-native customers, run it to hire genuinely AI-fluent talent. Across the 140+ teams using Bryq globally: 3x improvement in quality of hire, 47% lower attrition, 2x faster hiring.
Results measured across Bryq customer engagements. Individual outcomes vary by role, industry, and baseline hiring maturity.
Bryq measures AI fluency the way it should be measured
Through the work, not through the self-assessment. The data matches the reality. The hiring decisions improve. The internal capability baseline becomes something you can act on.
Ready to Measure AI Proficiency?
Book a 30-minute demo. We’ll build your first AI Proficiency profile on the call, for a role you're hiring or a team you want to assess.
Start hiring based on real data.
GDPR compliant. AICPA SOC certified.


