Publications

Fast Free-text Authentication via Instance-based Keystroke Dynamics

Published in IEEE Transactions on Biometrics, Behavior, and Identity Science, 2020

Abstract: Keystroke dynamics study the way in which users input text via their keyboards. Because typing behaviors can differentiate users, they can unobtrusively form a component of a behavioral biometric recognition system to improve existing account security. Keystroke dynamics systems on free-text data have previously required 500 or more characters to achieve reasonable performance. In this paper, we propose a novel instance-based graph comparison algorithm called the instance-based tail area density (ITAD) metric to reduce the number of keystrokes required to authenticate users. Additionally, commonly used features in the keystroke dynamics literature, such as monographs and digraphs, are all found to be useful in informing who is typing. The usefulness of these features for authentication is determined using a random forest classifier and validated across two publicly available datasets. Scores from the individual features are fused to form a single matching score. With the fused matching score and our ITAD metric, we achieve equal error rates (EERs) for 100 and 200 testing digraphs of 9.7% and 7.8% for the Clarkson II dataset, improving upon the state-of-the-art EERs of 35.3% and 15.3%.
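The abstract above says that per-feature scores are fused into a single matching score but does not state the fusion rule. As a minimal sketch only, assuming a weighted average (the function name, feature names, and weights below are hypothetical and are not taken from the paper), the idea could look like this:

```python
def fuse_scores(scores, weights):
    """Combine per-feature matching scores into one score.

    scores  -- dict mapping a feature name to its matching score
    weights -- dict mapping a feature name to a non-negative weight,
               e.g. an importance value from a classifier (assumed here)
    Returns the weighted average of the scores.
    """
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights for the scored features must sum to a positive value")
    weighted_sum = sum(weights[name] * score for name, score in scores.items())
    return weighted_sum / total_weight


# Hypothetical scores and weights, for illustration only.
example_scores = {"monograph": 0.6, "digraph": 0.9}
example_weights = {"monograph": 1.0, "digraph": 3.0}
fused = fuse_scores(example_scores, example_weights)
```

A weighted average keeps the fused score on the same scale as the inputs, which makes a single acceptance threshold easier to reason about.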

Download here

Introducing machine learning concepts using hands-on Android-based exercises

Published in 2019 IEEE Frontiers in Education Conference (FIE), 2019

Abstract: In this innovative practice work-in-progress paper, we discuss novel methods to teach machine learning concepts to undergraduate students. Teaching machine learning involves introducing students to complex concepts in statistics, linear algebra, and optimization. In order for students to better grasp concepts in machine learning, we provide them with hands-on exercises. These types of immersive experiences will expose students to the different stages of the practical uses of machine learning. The data collection apparatus is based on applications (apps) developed for the Android platform. Due to the accessible nature of the app and the exercises based on the app, this approach is useful for students across all majors. We provide the students with three different sets of activities, the first of which will introduce the basics of machine learning with specially designed artificial datasets. The second and third activities involve data collection, modeling, training, and testing, as applied to machine learning algorithms. The second activity will involve collecting touch/swipe data on mobile devices from students as they use a touch logger app. The third activity uses the Reflections app to collect cross-correlation data from rooms with different purposes. These hands-on activities guide the students through every step of the machine learning process. Student learning is assessed for each activity by holding workshops for undergraduate students. A workshop with the first activity outlining the basics of machine learning was given in the fall of 2018 and significant student learning was demonstrated. Workshops for the second and third activities are planned for the fall semester of 2019. Results from these workshops will be presented at the conference.

Download here

Fast and Accurate Continuous User Authentication by Fusion of Instance-based, Free-text Keystroke Dynamics

Published in 2019 International Conference of the Biometrics Special Interest Group (BIOSIG), 2019

Abstract: Keystroke dynamics study the way in which users input text via their keyboards, which is unique to each individual, and can form a component of a behavioral biometric system to improve existing account security. Keystroke dynamics systems on free-text data use n-graphs that measure the timing between consecutive keystrokes to distinguish between users. Many algorithms require 500, 1,000, or more keystrokes to achieve EERs of below 10%. In this paper, we propose an instance-based graph comparison algorithm to reduce the number of keystrokes required to authenticate users. Commonly used features such as monographs and digraphs are investigated. Feature importance is determined and used to construct a fused classifier. Detection error tradeoff (DET) curves are produced with different numbers of keystrokes. The fused classifier outperforms the state-of-the-art with EERs of 7.9%, 5.7%, 3.4%, and 2.7% for test samples of 50, 100, 200, and 500 keystrokes.
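To illustrate the monograph and digraph features mentioned above, here is a minimal sketch. It assumes each keystroke is logged as a (key, press time, release time) tuple in milliseconds, that a monograph is the hold time of one key, and that a digraph is the press-to-press latency between two consecutive keys. The event format and these timing definitions are assumptions for illustration; the paper may define its features differently.

```python
from collections import defaultdict


def extract_features(events):
    """Group timing samples by key (monographs) and key pair (digraphs).

    events -- list of (key, press_ms, release_ms) tuples ordered by press time
    Returns (monographs, digraphs): dicts mapping a key, or a pair of keys,
    to the list of observed durations in milliseconds.
    """
    monographs = defaultdict(list)
    digraphs = defaultdict(list)

    # Monograph: how long each key was held down.
    for key, press, release in events:
        monographs[key].append(release - press)

    # Digraph: time from pressing one key to pressing the next one.
    for (first_key, first_press, _), (second_key, second_press, _) in zip(events, events[1:]):
        digraphs[(first_key, second_key)].append(second_press - first_press)

    return dict(monographs), dict(digraphs)


# Hypothetical log of a user typing "the".
sample_events = [("t", 0, 80), ("h", 120, 190), ("e", 250, 310)]
monographs, digraphs = extract_features(sample_events)
```

Keeping every observed duration per graph, rather than only a mean, is what allows an instance-based comparison against the individual samples in a user's profile.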

Download here

Fast Continuous User Authentication Using Distance Metric Fusion of Free-Text Keystroke Data

Published in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2019

Abstract: Keystroke dynamics are a powerful behavioral biometric capable of determining user identity and supporting continuous authentication. It is an unobtrusive method that can complement an existing security system such as a password scheme and provides continuous user authentication. Existing methods record all keystrokes and use n-graphs that measure the timing between consecutive keystrokes to distinguish between users. Current state-of-the-art algorithms report EERs of 7.5% or higher with 1,000 characters. With 1,000 characters, it takes longer to detect an imposter, and significant damage could be done in the meantime. In this paper, we investigate how quickly a user is authenticated or how many digraphs are required to accurately detect an imposter in an uncontrolled free-text environment. We present and evaluate the effectiveness of three distance metrics individually and fused with each other. We show that with just 100 digraphs, about the length of a single sentence, we achieve an EER of 35.3%. At 200 digraphs the EER drops to 15.3%. With more digraphs, the performance continues to steadily improve. With 1,000 digraphs the EER drops to 3.6%, which is an improvement over the state-of-the-art.
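The equal error rate (EER) quoted above is the operating point where the false accept rate and the false reject rate are equal. The following is a minimal sketch of how an EER could be estimated from distance scores, assuming lower distances mean a closer match to the claimed user. The function name and the sample scores are hypothetical, and this simple threshold sweep is not necessarily the evaluation procedure used in the paper.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER from two lists of distance scores.

    genuine  -- distances for samples from the true user
    impostor -- distances for samples from other users
    A sample is accepted when its distance is <= the threshold.
    Returns the mean of FAR and FRR at the threshold where they are closest.
    """
    thresholds = sorted(set(genuine) | set(impostor))
    smallest_gap = float("inf")
    eer = None
    for threshold in thresholds:
        # False accept rate: impostor samples that pass the threshold.
        far = sum(d <= threshold for d in impostor) / len(impostor)
        # False reject rate: genuine samples that fail the threshold.
        frr = sum(d > threshold for d in genuine) / len(genuine)
        gap = abs(far - frr)
        if gap < smallest_gap:
            smallest_gap = gap
            eer = (far + frr) / 2
    return eer


# Hypothetical distance scores, for illustration only.
genuine_scores = [0.1, 0.2, 0.3, 0.6]
impostor_scores = [0.4, 0.5, 0.7, 0.8]
print(equal_error_rate(genuine_scores, impostor_scores))  # prints 0.25
```

With only a handful of scores the two error rates rarely cross exactly, so the sketch reports their average at the closest threshold; larger score sets give a finer estimate.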

Download here