Exam statistics

Learn how to use the Exam statistics page to make more data-driven decisions and gain deeper insight into your questions.

The Exam statistics page provides a comprehensive view of how your exam is performing across all attempts. It brings together high-level metrics, detailed question analysis and tag-based insights to help you evaluate performance, improve question quality and ensure your assessments are reliable.

Exam statistics is split into three tabs:

  1. Exam

  2. Questions

  3. Tags

All statistics can be filtered by one or multiple exam versions, allowing you to:

  • Compare different versions of the same exam

  • Track performance over time

  • Measure the impact of changes to questions or scoring

Question statistics

The Questions tab provides a detailed breakdown of performance at the individual question level.

For each question, you can analyse:

  • Average score or credits achieved

  • Distribution of responses

  • Correct, incorrect and partial answer rates

  • How candidates interact with answer options

Use the Manage columns drop-down to choose which columns are shown or hidden. Different question types have different statistics and information available to view.

Managing columns

The table below summarises the different columns and what they display. As a quick reminder on credits and points:

  • Points: what is awarded for a correct or partially correct response. If negative marking is used, points are also what is deducted for an incorrect answer.

  • Credits: the maximum number of points available for answering a question correctly.

Question stats example with statistics
Column
Description

Question stem

A preview of the question as it appeared in the exam (open the toggle to view)

Average credits

The average number of credits achieved by users in this exam on this question

Max credits

The maximum number of points that can be awarded for a correct answer

Partial credits

The number of points awarded for a partially correct answer

Negative scoring

The number of credits deducted for an incorrect response

Average score

The average percentage score of the question in this exam

EMQ

The EMQ group the question belonged to (if any)

Section

The section ID the question belongs to (if any)

Subject, Topic, Subtopic, Skill and Difficulty

The faceted tag/s set on the question (if any)

Total

The total number of times the question appeared in completed exam attempts

Responses

The number of times the question was answered correctly, incorrectly or partially correctly, or was skipped

Option breakdown

The number of times each answer option was selected for this question during this exam

Statistics: Chance

The probability of guessing the correct answer. This is based on the number of possible answers for a question with a single correct answer.

Statistics: P-Value

The item difficulty index ranges from 0 to 1. The higher the P-value, the easier the question.

Statistics: Type

The discrimination type is either Pearson or Rpb (point biserial). Rpb is a special case of Pearson's correlation used when the question has only two possible outcomes (correct or incorrect). Pearson is displayed when there are 15+ responses and candidates received correct, incorrect, or partial marks. Rpb is displayed when there are 15+ responses and candidates received only correct or incorrect marks.

Statistics: Value

The discrimination index measures the correlation between success on a particular question and success on the entire exam, and ranges from -1 to 1. The higher the value, the better the question discriminates between high- and low-performing candidates. A negative value suggests that low-performing candidates are more likely to answer the question correctly than high-performing candidates. These values are displayed when there are 15+ responses.

Search by ID or External ID using the search bar, and use the toggle to expand a question for more detail. Questions that require manual marking don't have an aggregated response to show.

Expanded question stats

Remember that when you create an exam, its questions are cloned from the source quiz or quizzes at the time of creation, so any edits you make to questions after the exam has been created won't be pulled through. This means your question stats page could look different from your actual questions if they have since been updated.

Understanding question statistics

Synap provides two key psychometric measures to help evaluate question quality: the Item Difficulty Index (P-Value) and Item Discrimination (Rpb or Pearson Correlation).

1. Item Difficulty Index (P-Value)

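Formula: P-value = number of candidates who answered the question correctly ÷ total number of candidates who answered the question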

Range:

  • 0.00 = No candidates answered correctly (very difficult)

  • 1.00 = All candidates answered correctly (very easy)

Important Notes:

  • P-values are calculated from actual response data, not from question settings.

  • If a P-value is unexpectedly 0.00 or 1.00, check:

    • That the correct answer is set in the question editor

    • That the question has been attempted and marked correctly

  • Partial credit does not affect the P-value — a candidate must receive full marks to count as a correct response.
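
The notes above can be illustrated with a minimal Python sketch. It is not Synap's implementation: the function name `item_difficulty`, its parameters and the sample data are hypothetical, and it assumes each response is recorded as the credits earned on the question.

```python
def item_difficulty(scores, max_credits):
    """Return the share of responses that earned full marks (the P-value).

    scores: credits each candidate earned on one question (hypothetical input).
    max_credits: the maximum credits available for that question.
    """
    if not scores:
        raise ValueError("at least one response is required")
    # Only full marks count as correct; partial credit is not counted.
    full_marks = sum(1 for s in scores if s == max_credits)
    return full_marks / len(scores)


# Hypothetical 2-credit question answered by five candidates.
# Three earned full marks, one earned partial credit, one earned nothing.
print(item_difficulty([2, 2, 1, 0, 2], max_credits=2))  # 0.6
```

In this example the candidate with 1 credit does not raise the P-value, which matches the partial-credit note above.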

2. Item Discrimination (Rpb or Pearson Correlation)

Definition: Item discrimination measures how well a question differentiates between high- and low-performing candidates. In other words: do stronger candidates tend to get this question right?

How it's Calculated in Synap:

  • Point Biserial (Rpb): Used when a question has only correct or incorrect outcomes (i.e. full marks or no marks).

  • Pearson Correlation: Used when a question allows for partial marks (e.g. multi-mark, multi-step).

Requirements:

  • Discrimination values are only shown when a question has received at least 15 valid responses to ensure statistical reliability.
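
As a rough illustration of the rules above, the following Python sketch correlates per-candidate question scores with overall exam scores. The function names, the data, and the way an "only correct or incorrect" question is detected (every score is either 0 or the maximum) are assumptions made for this example, not Synap's actual code. Point biserial is computed with the same Pearson formula, since it is the special case where the question score takes only two values.

```python
import math

MIN_RESPONSES = 15  # threshold stated in this article


def pearson(xs, ys):
    """Plain Pearson correlation; returns None if either input never varies."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # correlation is undefined without variation
    return cov / math.sqrt(var_x * var_y)


def discrimination(item_scores, exam_scores, max_credits):
    """Return (type, value), or (None, None) below the response threshold."""
    if len(item_scores) != len(exam_scores):
        raise ValueError("one exam score is needed per question score")
    if len(item_scores) < MIN_RESPONSES:
        return None, None
    # Assumption: a question is dichotomous if every score is 0 or full marks.
    dichotomous = all(s in (0, max_credits) for s in item_scores)
    kind = "Rpb" if dichotomous else "Pearson"
    return kind, pearson(item_scores, exam_scores)


# Hypothetical data: 15 candidates, 1-credit question, exam scores in percent.
item = [1] * 8 + [0] * 7
exam = [90, 85, 80, 78, 75, 72, 70, 68, 60, 55, 50, 48, 45, 40, 35]
kind, value = discrimination(item, exam, max_credits=1)
print(kind, round(value, 2))
```

Here the candidates who answered correctly also scored higher overall, so the value is positive and labelled Rpb. With fewer than 15 responses the function returns no value at all.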

Interpretation Guide:

Discrimination Value
Interpretation

≥ 0.40

Very good discrimination

0.30 – 0.39

Good

0.20 – 0.29

Fair

0.00 – 0.19

Low — may need review

< 0.00

Negative — likely an issue (e.g. miskeyed)
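
The interpretation guide can be expressed as a small lookup, sketched here in Python. The function name `interpret_discrimination` is hypothetical. The sketch assumes the negative band takes priority over the low band, so a value below zero is reported as negative rather than low.

```python
def interpret_discrimination(value):
    """Map a discrimination value (-1 to 1) to the bands in the guide above."""
    if value < 0.0:
        return "Negative: likely an issue (e.g. miskeyed)"
    if value < 0.20:
        return "Low: may need review"
    if value < 0.30:
        return "Fair"
    if value < 0.40:
        return "Good"
    return "Very good discrimination"


# Hypothetical values for three questions.
for v in (0.45, 0.12, -0.08):
    print(v, "->", interpret_discrimination(v))
```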

Why 15+ Responses? Correlation-based stats (Rpb or Pearson) can be unstable with small sample sizes. Fewer than 15 responses can result in misleading values — for example, a single outlier could skew the correlation significantly.

Synap uses a 15-response threshold to ensure that item discrimination values are statistically meaningful and actionable.

Exam statistics

The Exam tab gives an interactive overview of your exam's key metrics. It includes:

  • Attempt status flow

  • Score distributions

  • Version filtering

  • Highest and lowest average scoring questions

  • Outcome breakdowns (correct, incorrect, partial, skipped)

  • Grade breakdowns

  • Average score by sections

Use the expand (4 arrows) and toggle options (bar or stacked) for different views and more detailed breakdowns of your exams.

Expanding attempt status flow

Exam reliability (KR-20)

The KR-20 statistic measures the internal consistency of the exam — in other words, how reliably the exam assesses the same underlying ability.

  • A higher KR-20 score indicates a more reliable exam

  • A lower score may suggest inconsistent or poorly aligned questions

General guidance:

  • ≥ 0.90 — Excellent

  • 0.80–0.89 — Good

  • 0.70–0.79 — Acceptable

  • 0.60–0.69 — Weak

  • < 0.60 — Poor

Reliability becomes more accurate as more attempts are submitted. This metric is particularly useful when reviewing new exams or validating updates to existing ones.
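
For readers who want to see how a KR-20 value arises, here is a minimal Python sketch of the standard Kuder-Richardson 20 formula, KR-20 = (k / (k - 1)) × (1 - Σ p·q / variance of total scores), where k is the number of questions, p is the proportion answering a question correctly and q = 1 - p. The function name and data are hypothetical. The sketch assumes every question is scored 0 or 1 and uses the population variance of total scores; this article does not state how Synap handles these details.

```python
def kr20(responses):
    """KR-20 for a list of candidates, each a list of 0/1 question results."""
    n = len(responses)
    if n < 2:
        raise ValueError("at least two candidates are required")
    k = len(responses[0])
    if k < 2:
        raise ValueError("at least two questions are required")
    # Proportion of candidates answering each question correctly.
    p = [sum(row[i] for row in responses) / n for i in range(k)]
    pq_sum = sum(pi * (1 - pi) for pi in p)
    # Population variance of candidates' total scores (an assumption).
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    if var_total == 0:
        return None  # undefined when every candidate has the same total
    return (k / (k - 1)) * (1 - pq_sum / var_total)


# Hypothetical results: four candidates, three questions.
print(kr20([[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]))  # 0.75
```

With only four candidates this value would move a lot if one more attempt were added, which is why reliability becomes more trustworthy as attempts accumulate.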

Tag statistics

The Tags tab groups performance data based on tags and facets applied to questions. This allows you to analyse results by:

  • Subject

  • Topic or subtopic

  • Skill or competency

  • Difficulty level

Use tag statistics to:

  • Identify strengths and weaknesses across cohorts

  • Compare performance across different areas of your content

  • Ensure balanced coverage across your exam

Exam tag statistics

Learn more about Synap's tag and facet system below

Facet best practice
