DigiBot: A personalized generative AI tool for assessing digital competences through self-assessment


Two students at Bern University of Applied Sciences created DigiBot, an AI chatbot converting the EU’s standardized digital skills test (DigComp) into personalized questions. Their thesis explored whether AI-tailored tests improved engagement and understanding, revealing key insights into both opportunities and challenges of AI personalization.

The Challenge: Making standardized self-assessment tests more accessible

Standardized tests present a fundamental paradox: they must be generic enough to evaluate everyone consistently, yet specific enough to feel relevant to each test-taker. When questions seem disconnected from real-life experiences, engagement drops and results become less meaningful. The challenge facing educators and assessment designers is maintaining measurement rigor while accommodating diverse backgrounds and contexts.

For their research platform, the students selected the EU DigComp framework – a comprehensive digital skills assessment containing over 80 questions available in multiple languages. This framework assesses five critical digital competencies modern citizens need (Information and data literacy, Communication and collaboration, Digital content creation, Safety and security, Problem-solving).

The research objectives extended beyond merely generating alternative questions. The students structured their investigation around three key research questions:

  1. How can an AI-powered chatbot effectively gather relevant personal context from users?
  2. What selection criteria ensure AI-generated questions maintain assessment validity while increasing relevance?
  3. How do different user groups (varying by age, profession, and digital proficiency) respond to personalized versus standard questions?

This structured approach allowed the researchers to explore not just whether personalization works but also how and for whom it works best.

Innovative Approach: AI-powered personalization

DigiBot reimagines standardized testing through a two-phase approach: profile-building followed by question adaptation. Unlike traditional assessments that start with generic questions, DigiBot begins with a brief, structured conversation to build a contextual profile — gathering details about professional role, responsibilities, self-assessed digital skills, daily tech interactions, and frequently used tools or platforms.
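
As a rough illustration of what such a contextual profile might contain, the sketch below models it as a small Python data structure. The field names and the serialization helper are assumptions for illustration only, not the schema used in the thesis.

```python
# Illustrative sketch of the kind of contextual profile DigiBot builds in phase one.
# Field names are assumptions for illustration; the thesis does not publish its schema.
from dataclasses import dataclass, field

@dataclass
class UserProfile:
    role: str                                        # e.g. "marketing manager"
    responsibilities: list[str] = field(default_factory=list)
    self_assessed_skill: str = "intermediate"        # user's own estimate of digital proficiency
    daily_tech_interactions: list[str] = field(default_factory=list)
    tools: list[str] = field(default_factory=list)   # frequently used tools and platforms

    def as_prompt_context(self) -> str:
        """Serialize the profile into a compact text block for the adaptation prompt."""
        return (
            f"Role: {self.role}\n"
            f"Responsibilities: {', '.join(self.responsibilities) or 'n/a'}\n"
            f"Self-assessed digital skill: {self.self_assessed_skill}\n"
            f"Daily tech interactions: {', '.join(self.daily_tech_interactions) or 'n/a'}\n"
            f"Frequently used tools: {', '.join(self.tools) or 'n/a'}"
        )
```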

Once the profile is built, DigiBot’s LLM-based prompts transform standard DigComp questions into personalized versions while maintaining assessment objectives. For instance, instead of “Can you identify potential phishing attempts in emails?”, a marketing professional might see: “When reviewing customer inquiries after your latest email campaign, how would you identify potentially fraudulent messages?” An IT administrator might be asked: “What indicators would you teach non-technical staff to spot phishing attempts targeting company credentials?”
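
As a minimal sketch of how such a transformation step could be wired up with the OpenAI Python SDK (version 1.x), the snippet below combines a profile summary with one original question and asks for several rewrites. The prompt wording, model name, and function name are assumptions for illustration rather than the exact prompts used by DigiBot.

```python
# Minimal sketch of the question-adaptation step, assuming the OpenAI Python SDK (>= 1.0).
# Prompt wording and model choice are illustrative, not the thesis's exact configuration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def personalize_question(original_question: str, profile_context: str, n_variants: int = 3) -> list[str]:
    """Ask the model for several personalized rewrites of one DigComp question."""
    system = (
        "You adapt digital-competence assessment questions to a user's professional context. "
        "Keep the competence being tested and the difficulty unchanged; only change the scenario."
    )
    user = (
        f"User profile:\n{profile_context}\n\n"
        f"Original question:\n{original_question}\n\n"
        f"Produce {n_variants} personalized versions, one per line."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption; the thesis used an OpenAI model via its API
        messages=[{"role": "system", "content": system},
                  {"role": "user", "content": user}],
        temperature=0.7,
    )
    text = response.choices[0].message.content or ""
    return [line.strip() for line in text.splitlines() if line.strip()]
```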

To evaluate effectiveness, the researchers ran a comparative study with 50 English-speaking participants from the U.S., ensuring language proficiency and cultural consistency. In a single-blind A/B test, users viewed both original and AI-personalized versions of the same question (randomized in order) and rated them on clarity, relevance, and perceived difficulty.

(1) Creating an Effective Digital Profile: The Foundation of Personalization

The quality of personalized questions hinges on how well the system understands each user, making profile generation critical to DigiBot’s success. The research team tested two technical approaches: a no-code solution using Langflow (which integrates LangChain) and a custom chatbot with direct OpenAI integration. Both aimed to gather rich contextual data through carefully designed questions about users’ professions and experiences.

They found that merely collecting information wasn’t enough — its quality and usability were key. To evaluate responses, they developed a framework assessing clarity (specificity), completeness (coverage), and credibility (consistency). Vague answers like “I use computers for work” triggered follow-ups such as “What specific applications do you use, and for what tasks?”
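
A purely illustrative heuristic for triggering such follow-ups might look like the sketch below; the vague-phrase list and word-count threshold are invented for the example, and the actual clarity, completeness, and credibility checks developed in the thesis are not reproduced here.

```python
# Hypothetical heuristic for deciding whether a profile answer needs a follow-up question.
# Thresholds and phrases are placeholders for illustration.
VAGUE_PHRASES = {"stuff", "things", "computers for work", "various", "some apps"}

def needs_follow_up(answer: str, min_words: int = 6) -> bool:
    """Flag answers that are too short or contain vague filler phrases."""
    lowered = answer.lower()
    if len(lowered.split()) < min_words:
        return True
    return any(phrase in lowered for phrase in VAGUE_PHRASES)

# Example: "I use computers for work" is flagged and triggers a clarifying question.
if needs_follow_up("I use computers for work"):
    print("What specific applications do you use, and for what tasks?")
```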

This refinement process required balance. Too many questions risked abandonment; too few led to weak personalization. After iterative testing, the team settled on a 3–5 minute interaction — enough to build meaningful profiles without user fatigue. This foundation proved essential, as high-quality personalization depends entirely on the quality of the input.

(2) From Generic to Personal: The Question Transformation Process

After establishing a comprehensive user profile, DigiBot faced its core challenge: transforming standardized assessment questions into personalized versions without compromising their evaluative intent. The team developed a sophisticated prompt engineering method that combined the original DigComp questions with user profiles, instructing OpenAI’s language model to generate contextually relevant adaptations.

The research team established four critical evaluation metrics for each generated question: clarity (was the question easily understood?), contextual relevance (did it reflect the user’s background?), fidelity to original meaning (did it test the same skill?), and appropriate difficulty (did it preserve the intended challenge?). Since the model typically produced multiple variations per prompt, a scoring algorithm ranked them based on these metrics.
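
One simple way to realize such a ranking is a weighted score over the four metrics, as in the sketch below. The weights and per-metric scores are placeholders, and how each metric is actually measured (human rating, heuristics, or a language model acting as judge) is left open here.

```python
# Sketch of a weighted scoring pass over candidate questions, following the four metrics
# named above. Weights and metric scores in [0, 1] are illustrative placeholders.
WEIGHTS = {"clarity": 0.3, "relevance": 0.3, "fidelity": 0.25, "difficulty": 0.15}

def score(metrics: dict[str, float]) -> float:
    """Weighted sum of per-metric scores."""
    return sum(WEIGHTS[name] * metrics.get(name, 0.0) for name in WEIGHTS)

def rank_candidates(candidates: list[tuple[str, dict[str, float]]]) -> list[str]:
    """Return candidate questions sorted from best to worst overall score."""
    return [q for q, m in sorted(candidates, key=lambda pair: score(pair[1]), reverse=True)]

# Example: pick the best of two generated variants.
best = rank_candidates([
    ("Variant A ...", {"clarity": 0.9, "relevance": 0.7, "fidelity": 0.95, "difficulty": 0.8}),
    ("Variant B ...", {"clarity": 0.6, "relevance": 0.9, "fidelity": 0.85, "difficulty": 0.8}),
])[0]
```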

This transformation process required careful tuning. Overly generic questions lost the benefits of personalization, while overly specific ones could confuse users. Systematic testing revealed that the most effective adaptations referenced tools from the user's profile and realistic workplace scenarios – while maintaining clear, accessible language aligned with the originals.

Key Results

The user study provided valuable insights into the effectiveness of AI-generated personalized assessments. The findings highlight key strengths and challenges:

  • Clarity and Readability. Users found the original DigComp statements easier to understand and read, regardless of their digital proficiency level. AI-generated statements were often more complex, sometimes making them harder to interpret.
  • Detail and Professional Relevance. The AI-generated statements were rated more detailed and profession-specific, making them feel more relevant – particularly for users whose job roles closely matched the assessment topics.
  • User Preferences by Skill Level. Beginners preferred the original statements for their simplicity. Advanced users appreciated the contextual depth of the AI-generated versions, even at the cost of increased complexity.
  • Effectiveness of Personalization. Users valued questions tailored to their personal background, especially when the AI successfully integrated relevant details from their user profile.
  • Challenges with Personalization. Some AI-generated statements were overly complex or inconsistent, particularly when the user’s job role was not directly related to the assessment topic. Improved AI validation and simplification techniques are needed to ensure generated statements remain clear and accessible.

Outlook and Next Steps

Although no clear preference emerged between original and personalized statements, the study highlights the promise of AI-driven personalization. The main challenge is striking a balance between contextual relevance and simplicity to ensure assessments remain engaging and accessible.

Moving forward, the focus is on refining user profiling, improving prompt design, and enhancing evaluation criteria. Longer-term efforts include model fine-tuning, feedback integration, and cost-efficient delivery strategies.

This project was a transdisciplinary collaboration between the BFH School of Business and BFH School of Engineering and Computer Science — bridging human-centered design and technical innovation. The work will continue in future research and applied contexts, building on this foundation to push personalized assessment even further.


AUTHOR: Christian Schmidhalter

Christian Schmidhalter is currently working as a research assistant at the Institute for Data Applications and Security (IDAS) within the Department of Technology and Computer Science at Bern University of Applied Sciences.

AUTHOR: Roman Schneiter

Roman Schneiter is currently working as a research assistant at the Institute for Data Applications and Security (IDAS) within the Department of Technology and Computer Science at Bern University of Applied Sciences.

AUTHOR: Roman Rietsche

Roman Rietsche is a professor of information systems and AI and co-head of the Human-Centered AI-based Learning Systems Lab (HAIS) at the Institute for Digital Technology Management within the Department of Business at Bern University of Applied Sciences.

AUTHOR: Kenneth Ritley

Kenneth Ritley is Professor of Computer Science at the Institute for Data Applications and Security (IDAS) at BFH Technik & Informatik. Born in the USA, Ken Ritley has had an international career in IT. He has held senior leadership roles at several Swiss companies, including Swiss Post Solutions and Sulzer, and has built up offshore teams in India and nearshore teams in Bulgaria, among others.
