Chen, Shu-Yuan

CyCraft Technology / Data Scientist, Data Science

Shu-Yuan Chen is a data scientist at CyCraft specializing in computational mathematics, algorithms, and NLP. A graduate of the Department of Applied Mathematics at NYCU, Chen has spoken at CYBERSEC and CraftCon.

SPEECH
4/15 (Tue.) 16:15 - 17:00 7F 703 AI Security & Safety Forum
Unveiling the Bias in Language Models: A Path to Stability in Security Assessments

Large Language Models (LLMs) have shown great potential in cybersecurity applications. However, to fully harness their value, inherent biases and stability issues in LLM-driven security assessments must be effectively addressed. This talk will focus on these challenges and present our latest research on improving evaluation frameworks.

Our study analyzes how the order in which options are presented can bias an LLM's assessments. We propose ranking strategies and probabilistic weighting techniques that significantly improve scoring accuracy and consistency. Key topics include experimental design and observations on LLM biases, probability-based weighting adjustments, and methodologies for integrating results from multiple ranking permutations. Notably, through validation with the G-EVAL dataset, we demonstrate measurable improvements in model evaluation performance.
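
To make the two ideas named above concrete, here is a minimal Python sketch of probability-weighted scoring combined with averaging over every presentation order. It is an illustration under stated assumptions, not the method presented in the talk: `toy_judge` is a hypothetical stand-in for an LLM that returns a probability distribution over scores and favors whichever option is shown first, and the names `expected_score` and `debiased_scores` are invented for this example.

```python
from itertools import permutations


def expected_score(score_probs):
    """Probability-weighted score: sum(score * p), normalized by total mass."""
    total = sum(score_probs.values())
    return sum(score * p for score, p in score_probs.items()) / total


def score_in_order(order, judge):
    """Score every option once, for one fixed presentation order."""
    return {
        option: expected_score(judge(option, position))
        for position, option in enumerate(order)
    }


def debiased_scores(options, judge):
    """Average each option's expected score over all presentation orders."""
    orders = list(permutations(options))
    totals = {option: 0.0 for option in options}
    for order in orders:
        for option, score in score_in_order(order, judge).items():
            totals[option] += score
    return {option: total / len(orders) for option, total in totals.items()}


# Hypothetical judge (assumption): both options have the same true quality (3),
# but the option shown first gets probability mass shifted toward a higher score.
def toy_judge(option, position):
    quality = 3
    if position == 0:
        return {quality: 0.6, quality + 1: 0.4}
    return {quality: 0.6, quality - 1: 0.4}


single = score_in_order(("A", "B"), toy_judge)
averaged = debiased_scores(["A", "B"], toy_judge)
print("single order:", single)
print("averaged over orders:", averaged)
```

In this toy setup a single ordering rates the first option higher even though both are equally good, while averaging over both orderings gives them the same score. Enumerating all permutations grows factorially with the number of options, so a real evaluation would likely sample a subset of orderings instead.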

Whether you are conducting research on language models or working in cybersecurity technology and decision-making, this talk will provide valuable technical insights and practical takeaways.