CHAPTER 1: INTRODUCTION

In the last few decades, smartphones have become an essential part of daily life. Their high computational capability, easy access to the internet, storage capacity and portability have led to a surge in the use of these devices for important transactions, and smartphone usage in day-to-day activities has increased (Jeroen, Elke den, Yuan, 2008).

Some of these important transactions involve entering valuable, sensitive information such as social security numbers, passwords, PINs and credit card information by touching the soft keyboard of the smartphone. Smartphones are equipped with various hi-tech tools, such as the camera, on-board sensors and navigational tools, that help enhance the user experience. These tools are sensitive to any action performed on the phone by the user and are able to record data readings associated with specific actions. For example, when the user types on the soft touchscreen keyboard, the sensors can detect the action and capture the variations in readings that it causes; similarly, the device's camera can record the location coordinates at which a picture was taken.
The ability of smartphones to capture such sensitive readings and information has raised the question of a privacy and security gap in the devices' capturing abilities.

Problem statement

In this era of smartphones, the privacy and security of the user's information stored on the phone have become paramount, and smartphone companies have offered mechanisms that protect the user's valuable information where some on-board sensors are concerned. The set of cheap embedded sensors on today's smartphones, such as the accelerometer, digital compass, gyroscope, GPS, microphone and camera, keeps growing and is becoming smarter and more sensitive; these tools are enabling the emergence of personal, group and community-scale sensing applications (Nicholas, et al., 2010). The readings from these on-board sensors are left unsecured: even though third-party applications are restricted from accessing some sensitive tools, like the camera and microphone, no such restrictions exist while the third-party application itself is in use. An attacker can therefore develop an application that appears innocent but, in the background, transfers these sensor readings to the attacker without the knowledge of the user, as demonstrated by (Chen & Cai, 2011) in their research work.
They developed a third-party application, TouchLogger, which captures variations in the smartphone's orientation sensor readings to infer the user's keystrokes. Their attack works while the TouchLogger application is in use. The possibility of using smartphone sensor readings to infer keystrokes or a user's location is a serious privacy and security concern, because it could allow government officials to spy on citizens, loved ones to spy on one another, or the theft of a user's personal information such as a PIN.

Research questions

Many research questions arise around smartphone sensors and tools. For this research work, the analysis is conducted on a full QWERTY keypad and focuses on the motion sensors (accelerometer and gyroscope). This research seeks to answer the following questions:

Is it feasible to carry out a side-channel attack that infers user keystrokes on a full QWERTY keypad from the accelerometer and gyroscope readings?
Which statistical model best infers the keystrokes?
Which of the motion sensors is most effective?
How can the sensor readings be used to distinguish tap events from non-tap events?
Do keystroke patterns differ between individuals?
Are keystroke patterns similar among people from the same nation?

Objectives

The objectives of this research work are to expose the security weakness of smartphones with respect to variations in motion sensor readings caused by different gestures, and to analyse keystroke patterns. Specifically:

Infer keystrokes on a QWERTY keypad on a smartphone.
Infer tap events.
Find the best model for keystroke inference.
Analyse keystroke patterns by individual and by nationality.

CHAPTER 2: LITERATURE REVIEW

The on-board motion sensors of smartphones are enabling new applications across a wide variety of domains, such as healthcare, social networks, safety, environmental monitoring and transportation (Nicholas, et al., 2010). With this technological advancement, however, come new security and privacy challenges. Like computers and other personal electronics, smartphones are not immune to the data theft ravaging the digital world.
Because these sensor readings are assumed not to be sensitive, third-party applications have access to the data without any security restrictions or privileges on the major smartphone platforms, Android (specification, 2018) and iOS (Library:, 2018).

Attacks on smartphones

Several attacks have been proposed that obtain a user's information from a smartphone through its tools and motion sensors. One of the earliest attacks was 'Touch Event hijacking' (Hijacking, 2010).
TapJacking occurs when an attacker creates a fake user interface that appears interactive but actually passes interaction events, such as finger taps, to a hidden user interface behind it. Using this technique, an attacker could trick a user into making purchases, clicking on ads, installing an application, granting permissions, or even wiping all the data from their phone. This type of attack is easy to detect and, with improved awareness and operating system security, is no longer feasible.

Another attack was proposed by (Aviv, et al., 2010), who explored the feasibility of recovering a lock pattern from the smudges left by fingers on the touchscreen. They photographed the smartphone screen from different angles and under a variety of lighting conditions and, by changing the contrast of the images, were able to show that partial recovery of the pattern password was possible. However, smartphone users tend to favour passwords made of characters and numbers, the smudges may be wiped off the screen when the victim puts the smartphone in a pocket, and the attacker must be in possession of the victim's device. These difficulties have rendered this type of attack impractical, if not totally infeasible.

Video-based attacks

More sophisticated attacks have since been proposed. (Federico, et al., 2011; Krumm, 2007) and (Rahul, et al., 2011) proposed a more feasible type of attack, the 'shoulder surfing attack', in which an attacker monitors the user's actions with a camera (taking pictures or videos) or through a reflection (e.g. from sunglasses), focusing on the key magnification of the touchscreen to infer the user's inputs. Smartphone soft keyboards have a feature that magnifies each tapped key; after capturing the pictures or videos, the authors analysed each frame and were able to infer the user's keystrokes. Although this type of attack is still feasible today, it has some limitations: (1) smartphones from different companies have different keyboard layouts, so the attacker has to buy the same type of smartphone as the victim; (2) the attacker has to follow the victim, which will make the victim suspicious; (3) due to movement and light intensity, the captured videos or pictures may be distorted.
Another video-based attack for keystroke inference was proposed by (Yi, et al., 2013), who used extended computer vision processing to overcome the challenges posed by low-resolution images in the previous attack. They also took the movement of the fingertip into consideration, which counters the low image quality faced by (Federico, et al., 2011) and (Rahul, et al., 2011). A similar video attack was proposed by (Qinggang, et al., 2014), who showed the feasibility of keystroke inference by observing the shadow of the fingertips on a touchscreen and using a machine-learning algorithm to predict the touch pattern. Planar homography is then applied to the touch pattern to infer the keystrokes. Planar homography is a transformation that maps the points of a planar surface seen in one image or video frame onto the corresponding points in a reference image, so that the recorded view can be matched precisely, or nearly precisely, to the reference.

Another video attack was proposed by (Yimin, et al., 2018), whose work was based on the movement of the victim's eyes while typing on the touchscreen. Their theory was that the eyes focus on and follow the keys the user intends to tap, so the eye movement forms a unique pattern. Using this movement pattern, captured with a video recorder, they were able to deduce the keystrokes by analysing and reconstructing each frame of the video.

A further video attack was suggested by (Jingchao, et al., February 2016), who moved away from eye movement, reflections and finger shadows and focused instead on the motion of the back of a tablet while the user is typing. They noticed that each typed key causes the back of the tablet to move in a unique pattern. Using a video recording from VISIBLE, an application they developed, they applied complex steerable pyramid decomposition to detect and quantify the subtle motion patterns. They analysed the frames, using an SVM to classify them, then differentiated the motion patterns and used dictionary and linguistic relationships to increase the inference accuracy.

Cons of video-based attacks

Many feasible video-based keystroke inference attacks on smartphones have been proposed. Even though they are feasible, they face one or more of the challenges below:

Smartphones from different companies have different keyboard layouts.
Inference is not feasible when the video recording is unavailable or of low quality.
The attacker has to follow the victim, which will make the victim suspicious.
The attacker and the victim being in the same place all the time is highly unlikely.
The environment may obstruct the attacker's video recording.

Attacks based on sensitive tools/sensors

The privacy and security concerns regarding the data from mobile tools and sensors have been in the spotlight for some time. Research by (Nicholas, et al., 2010) and (Liang, et al., 2009) has shown how sensitive this data is. (Liang, et al., 2009) studied the vulnerability of mobile tools such as the microphone, GPS and camera and showed that current mobile phone platforms inadequately protect their users from this threat. (Nicholas, et al., 2010) highlighted the importance of sensor-equipped smartphones and how the sensors will revolutionize many sectors. They also explored the open issues and challenges emerging in the new area of mobile phone sensing research. Both works highlighted the privacy and security issues relating to smartphone sensors, ranging from keystroke inference and hijacking of the camera to exploiting the GPS sensor for location and tapping the microphone to eavesdrop.
An earlier attack related to these tools was carried out by (Krumm, 2007). His work was based on the GPS location data of volunteers; he was able to infer each subject's location using the last known GPS coordinates, the amount of time spent in a particular place (subjects spent more time at home) and the clusters of GPS coordinate readings. The attack proposed by (Nan, et al., 2009) was based on the video camera of a 3G smartphone. They created spyware that turns on the camera to stealthily record information without the knowledge of the victim, which makes it well suited to spying on the victim. This type of attack can also be used to stealthily record environments or buildings where cameras are restricted.
The smartphone microphone was exploited by (Schlegel, et al., 2011), who created a trojan called 'Soundcomber' that can record a victim's sensitive information, such as credit card numbers and PINs, from both tone-based and speech-based interaction with phone menu systems. Their theory was that each key tap makes a distinct sound or tone; by recording the tones, they were able to predict the keystrokes. A further attack involving the microphone was proposed by (Laurent & Ross, 2013), who explored the feasibility of an attack based on the microphone and the camera's orientation sensor. They used a malicious application, 'PIN Skimmer', that has access to the microphone and the camera's orientation sensor to collect the variations in the sensor readings. The microphone is used to detect touch events, while the orientation sensor readings are used to deduce the keystrokes. When the application is in use, it stealthily records video of the user's activities; each time the user taps the touchscreen, the microphone detects the tap and the camera captures a frame together with the orientation sensor readings. Another work focusing on the microphone was done by (Sashank, et al., 2014), based on the microphone, the gyroscope sensor and the combination of the two. The microphone measures the sound level in decibels and the gyroscope measures the variations in its readings. Their work shows that the microphone data gives higher accuracy than the gyroscope data, but that the combination of both gives higher accuracy still. Using the Weka tool, they tested different machine learning algorithms to find the best-performing one.

Cons of sensitive-sensor-based attacks

All these proposed attacks are feasible, but they depend on sensitive sensors of the smartphone, such as the GPS, camera and microphone. Accessing these sensors requires permission from the user; without it, these attacks are infeasible.

Attacks based on motion sensors

Research has shown that motion sensors (e.g. accelerometer, gyroscope), which are considered insensitive, are just as vulnerable as their sensitive counterparts (camera, microphone). (Jun Han, 2012) demonstrated the feasibility of location inference from the accelerometer readings of a smartphone: since the accelerometer measures the acceleration of the device, they used the trajectory implied by the accelerometer readings to narrow down the possible movements of the user carrying the smartphone. One of the first research works to focus on variations in smartphone motion sensor readings was by (Philip, et al., 2011), who developed an application, (sp)iPhone, which uses the motion sensors of an iPhone 4 placed on a table to infer the keystrokes typed on a nearby laptop. As the user types, the tapping causes vibrations, which rattle the accelerometer and cause variations in its readings.
Another attack was proposed by (Chen & Cai, 2011). They observed that typing on a touchscreen changes the readings of the orientation sensor because of the force of the finger on the screen, and developed an application, TouchLogger, that captures the orientation sensor readings to infer keystrokes. They used a number-only keyboard in landscape mode on a smartphone.
Using machine learning, they were able to successfully infer more than 70% of the keys typed. Similarly, (Emmanuel, et al., 2012), in their work 'ACCessory: Password Inference using Accelerometer on smartphone', show that the accelerometer readings of a smartphone are a powerful side channel for inferring a user's password. They divided the smartphone screen into a grid of 60 regions and studied the feasibility of inferring the touched zones. Their work shows that, with the accelerometer readings, they were able to predict a 6-character password in a median of 4.5 trials and to break 59 out of 99 passwords.

Further work by (Zhi, et al., 2012) took a different approach to the problem. They proposed another type of attack that uses the accelerometer and orientation sensor readings, creating an application, TapLogger, which, like the other applications, captures the variations in the accelerometer and orientation sensor readings. The application has two features: (1) the tap feature collects training data while the user is interacting with the application, and (2) the logging feature stealthily sends the sensor readings when the user is entering important information in other third-party applications. The logging feature detects when the user is entering information in a third-party application, such as the browser, and stealthily sends the data over the network to the attacker. Lastly, their work also shows that the feasibility of inference depends on how the user uses the device and on the type of device.
The research of (Adam, et al., 2012) was based on accelerometer readings collected in two scenarios: controlled (while sitting) and uncontrolled (while walking). They achieved 43% accuracy in the controlled scenario and 20% accuracy in the uncontrolled scenario.
They showed that keystroke inference in an uncontrolled scenario is quite challenging because of movement. (Liang & Hao, 2012) addressed the question of whether keystroke patterns are universal across users and types of phone. Their work showed that keystroke inference depends on the type of device, the screen dimensions, the keyboard layout and the user. The research of (Ahmed, et al., 2013) addressed the question of which of the available sensors performs best in the context of the inference attack. They considered all the available sensors, as well as the integration of all the sensor data in a single dataset, and compared the performance of each sensor in the keystroke inference attack. They found that the gyroscope has the highest accuracy and the accelerometer the lowest. That research was conducted on a number keypad layout, while (Emiliano, et al., 2012) went further in addressing the question of which sensor has better accuracy by focusing on the QWERTY keyboard layout rather than the number keypad layout. Their results showed that the combination of accelerometer and gyroscope readings has higher accuracy than the readings of either sensor alone.

(Xiangyu, et al., 2015) took a different approach. They explored the possibility of keystroke inference based on the accelerometer of the user's smartwatch. Their theory was that the smartwatch accelerometer could detect the movement of the user's hand when tapping the touchscreen. They focused on the number and QWERTY keyboard layouts and had a success rate of 65%. Irregular hand movement was one of the challenges they faced. Even though this attack is feasible, it is unlikely in practice because the attacker has to be in possession of the victim's smartwatch.

This experiment setup

Most of the research above was based on the number keypad rather than the QWERTY keyboard layout. Furthermore, as smartphones are getting lighter, it is now easy to interact with them with one hand, even while standing.
This research focuses on a controlled scenario in which each user is standing and using the smartphone with one hand (the right hand). The project relies solely on motion sensors, which are easily accessible through APIs on the device. Previous works are mostly based on a single user; this research explores the feasibility of keystroke inference with data from a number of users.

CHAPTER 3: RESEARCH METHODS

This dissertation began by introducing the smartphone and its on-board sensors and the privacy and security concerns they raise, followed by an extensive literature review of the feasibility of different possible security and privacy attacks, which helps answer the research questions.

This research project involves designing a malicious application that captures motion sensor readings during keystroke events on a QWERTY keyboard layout, extracting the variations in the sensor readings along with other features from the data, and developing a machine learning model to classify the data into different classes for analysis.

Tools and models

The tools used for this project include:

An HTC One A9s Android smartphone with on-board motion sensors.
For this project, I used the accelerometer and gyroscope sensors.

An application: an Android application was created with interfaces that allow users to type inputs. The application captures the variations in the accelerometer and gyroscope readings in the background and saves the data for transfer to a workstation.

Figure 3-1: User interaction with the application.

Weka machine learning suite: a suite of machine learning software written in Java and developed at the University of Waikato, New Zealand. It is free software licensed under the GNU General Public License. It contains many machine learning algorithms and visualisation tools and is used for data mining, classification, regression and feature selection (Weka, 2018).
Statistical models: the models used for this project include the K-means clustering model, Naïve Bayes, Random Forest and the Multilayer Perceptron classifier.

K-means clustering model: an algorithm that aims to partition n data readings into k clusters, where each reading is assigned to the cluster with the nearest mean, which serves as the centre of the cluster (Hartigan & Wong, 1979).

Naïve Bayes classification: a Naïve Bayes classifier is based on a (naïve) assumption of independence between features; that is, the presence of a particular feature in a class is assumed to be unrelated to the presence of any other feature. It is derived from Bayes' theorem (Tina & Sherekar, 2013). Due to the naïve (independence) property of the classifier, it gives better accuracy for each keystroke inference without compromising the accuracy of the other characters. From Bayes' theorem:

$$P(H \mid D) = \frac{P(H)\,P(D \mid H)}{P(D)}$$

where P(H | D) is the probability of H given that D has already occurred, P(H) is the probability of H, P(D | H) is the probability of D given that H occurred, and P(D) is the probability of D. Applied to keystrokes:

$$P(\text{character} \mid \text{all\_characters}) = \frac{P(\text{character})\,P(\text{all\_characters} \mid \text{character})}{P(\text{all\_characters})}$$

The Naïve Bayes classifier applies Bayes' theorem to compute the probability of each class, and the class with the highest probability is chosen as the likely classification.

Random forest: a supervised ensemble machine learning algorithm that builds multiple decision trees to get better accuracy (Ho, 1995). The features used at the nodes of the decision trees are selected at random; in the end, the outcome predicted by the largest number of decision trees in the forest is voted the most likely class.
Figure 3-2: Random Forest algorithm with multiple decision trees. Source: (Niklas, 2018)

The algorithm proceeds as follows:

A. Randomly select "K" features from among all the features of the 26 characters.
B. From the "K" features, calculate the node using the "best separation" of the features.
C. Split the node into further nodes using the best separation.
D. Repeat steps A to C until all the remaining feature nodes have been created.
E. Build the forest by repeating steps A to D "n" times to create "n" trees.
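Two ingredients of the procedure above, the random selection of features and the final vote across trees, can be sketched as follows. This is only an illustration under stated assumptions: the function names, the feature names and the list of tree outputs are hypothetical, and the construction of each tree (finding the best separation) is deliberately omitted.

```python
import random
from collections import Counter


def random_feature_subset(feature_names, k, rng):
    """Pick k features at random, without replacement."""
    return rng.sample(feature_names, k)


def majority_vote(predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
feature_names = ["f1", "f2", "f3", "f4", "f5"]  # hypothetical feature names
subset = random_feature_subset(feature_names, k=2, rng=rng)

# Hypothetical outputs of five trees for a single tap event.
tree_predictions = ["q", "w", "q", "a", "q"]
predicted = majority_vote(tree_predictions)
```

In a full implementation each tree would be grown on its own random feature subsets, and the vote would be taken over the trees' actual predictions.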
Multilayer perceptron: a feedforward type of artificial neural network. It is a classification model that consists of at least an input layer, a hidden layer and an output layer of nodes. Its building block, the perceptron, is a simple algorithm intended to perform binary classification; i.e. it predicts whether an input belongs to a certain category of interest or not. (Rohit & Suman, 2012), in their research on the Weka machine learning suite, showed that the multilayer perceptron works well in most cases. Each node is activated by an activation function, and backpropagation is used to fine-tune the classifier to get the best accuracy. The model gives importance to instances in terms of weights: any character instance that is closer to a specific class is given more importance than other instances, which aids classification accuracy. The disadvantage of this algorithm is its very high computational cost.

Figure 3-3: Multilayer perceptron.
Source: (Vikas, 2017).

Type of data

The types of data used for this dissertation are:

Accelerometer sensor: monitors the acceleration of the device along three axes: left-right (x-axis), forward-backward (y-axis) and up-down (z-axis). For example, the reading on an axis is positive when the device is accelerating in the positive direction of that axis.

Gyroscope sensor: monitors the rotation or twisting of the smartphone with respect to gravity. It measures angular velocity and is calibrated in three axes: when the device rotates about the Z-axis (perpendicular to the screen plane), the azimuth angle changes in [0, 360); when the device rotates about the X-axis, the pitch angle changes in [-180, 180); and when the device rotates about the Y-axis, the roll angle changes in [-90, 90) (Liang & Hao, 2012).

Figure 3-4: Accelerometer & gyroscope sensors. Source: (Liang & Hao, 2012)

The readings of both sensors change when the user interacts with the smartphone, because of the force of the tap. The sensor reader application captures the sensor readings and stores them for further analysis.

Data collection

Data collection is done in a controlled manner, that is, with the user standing and holding the smartphone in the right hand.
For this project, the focus is on right-handed people.

Tap-event data collection

For tap-event data collection, the user first holds the smartphone without tapping for 20 seconds to obtain sensor readings for the non-tap event; after that, the user taps a few letters to obtain sensor readings for tapping.

Data collection for keystroke inference

The sensor reader application allows the user to input each of the 26 letters of the alphabet 20 times, making a total of 520 inputs. As the user types, the application captures the sensor readings for each tap event.
Data collection from other users

Data was collected from six (6) users for further analysis. Each user was asked to type a set of random words into the application in the controlled manner. The set contains 114 words made up of 626 characters. The volunteers are of different nationalities: three are from China and the rest are from other nations.
Figure 3-5: Word counts.

The frequency of each character is taken into consideration.

Figure 3-6: Character frequencies.

Data analysis

Tap event: since the accelerometer measures the acceleration of the mobile, I used the sum square of the accelerometer's three (x, y, z) axis readings to predict whether or not the user is tapping.

Hierarchical method: I divided the letters of the keyboard into two sides, the right side and the left side of the keyboard layout, and related the characters on each side to the sensor readings. Then I tested the K-means clustering model to infer which side of the hierarchy a keystroke belongs to.

Classification model generation: I used the Weka machine learning suite to try the machine learning algorithms mentioned above and check their level of accuracy. Three algorithms were tested for keystroke inference and to address the other objectives of this project.

Performance evaluation

The classification models are evaluated on the following:

The accuracy of keystroke inference.
The most effective motion sensor, and the combination of readings from both sensors.
The accuracy of each classification model, where a higher classification accuracy indicates a better model for inference.
The difference in keystroke patterns between users.
The exact characters that differentiate people from the same nation from people from other nations with the highest classification accuracy.
CHAPTER 4: FINDINGS

After data collection, the raw sensor readings need to be processed before the machine learning algorithms are applied. To proceed with the inference, the specific sections relating to keystrokes/tap events have to be extracted from the data set. Taking the sum square of the three (x, y, z) coordinates shows a distinction between the patterns of the different actions.

Figure 4-1: The square of each coordinate of the accelerometer for different actions while holding the phone. (a) Walking (b) Running (c) Standing/no tap (d) Tapping.

Pre-processing

Individual raw readings are of limited use in keystroke inference because a tap spans multiple readings; to isolate the sections corresponding to tap events, some features have to be extracted from the sensor data set. The data set consists of the three output values (x, y, z) along the three axes.
In the course of the research, each tap event consisted on average of three different readings. That is, at my tapping rate, one tap event is separated from the next by three sensor readings. Because a group of sensor readings spans each keystroke, features are extracted from the group. Previous works have focused on different sets of features.
For this project, I focused on six statistical features: the minimum, maximum, median, average, standard deviation and skewness.

Minimum (m): the minimum value of each of the three outputs (x, y and z) along the three axes.

Maximum (mx): the maximum value of each of the three outputs (x, y and z) along the three axes.
Average (μ): the average of each of the three sensor outputs (x, y and z) along the three axes:

$$\mu = \frac{\sum_{i} x_i}{n}$$

For each tap, this is $(x_1 + x_2 + x_3)/3$ for the x coordinate, where $x_1$ is the value before the tap, $x_2$ the value at the tap and $x_3$ the value after the tap.

Median (M): the middle value of the three outputs along the corresponding coordinate (x, y or z) when sorted in ascending order, i.e. the value at position $(N+1)/2$.

Standard deviation (SD): the amount of variation of the outputs along each coordinate from the corresponding average:

$$SD = \sqrt{\frac{\sum_{i} (x_i - \mu)^2}{n}}$$

Skewness: a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean:

$$\text{skewness} = \frac{3(\mu - M)}{SD}$$

Since the raw data consists of three values along the three (x, y and z) coordinates, each tap event has 18 features. These features are the final input used for the machine learning analysis. For example, for the x coordinate, the six (6) features are xminA, xmaxA, xmeanA, xmedianA, xSDA and xSkewA. The leading 'x' is the coordinate, the middle part names the feature and the trailing 'A' is the sensor (accelerometer) from which the data was derived. Each feature is used to help distinguish between keystrokes.
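The feature extraction described above can be sketched as follows. This is a minimal illustration rather than the code used in the project: the function names and the sample readings are hypothetical, the standard deviation is assumed to divide by n, and the skewness is set to zero when the standard deviation is zero (an assumption made only to avoid a division by zero).

```python
import math
from statistics import mean, median


def axis_features(values):
    """Return the six statistics for one axis of one tap window."""
    avg = mean(values)
    med = median(values)
    # Standard deviation dividing by n (assumed population form).
    sd = math.sqrt(sum((v - avg) ** 2 for v in values) / len(values))
    # Skewness as 3 * (mean - median) / SD; 0.0 when SD is zero (assumption).
    skew = 3 * (avg - med) / sd if sd > 0 else 0.0
    return {"min": min(values), "max": max(values), "mean": avg,
            "median": med, "SD": sd, "Skew": skew}


def tap_features(window, sensor="A"):
    """window: list of (x, y, z) readings spanning one tap.

    Returns 18 named features such as xminA, ymeanA or zSkewA.
    """
    features = {}
    for axis_name, axis_values in zip("xyz", zip(*window)):
        for stat_name, value in axis_features(list(axis_values)).items():
            features[f"{axis_name}{stat_name}{sensor}"] = value
    return features


# Illustrative readings only: before the tap, at the tap, after the tap.
window = [(0.1, 9.7, 0.3), (0.4, 9.9, 0.2), (0.2, 9.8, 0.6)]
features = tap_features(window)
```

Passing sensor="G" would give the gyroscope features the same naming pattern with a different suffix; that suffix is a hypothetical choice, since only the accelerometer suffix appears in the text.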
Evaluation

In this section, I evaluate the keystroke inference accuracy of the different classification models used, the tap-event and keystroke inference accuracy, and the differences and similarities among the users. The sensor reader application is configured on an HTC One A9s Android smartphone; the device can generate sensor readings at the default frequency of 100 Hz. I used ten-fold cross-validation for the analysis. In this method, the dataset is divided into 10 folds; one fold is held out as the testing set and the other nine folds are used as the training set. The process is repeated until every fold has served as the testing set, so that every instance of the dataset is used for both training and testing. This method of analysis makes sure each part of the dataset is taken into consideration.
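The splitting step of ten-fold cross-validation can be sketched as follows. In the usual formulation each fold is held out once for testing while the remaining folds are used for training. The function name and the seed are hypothetical, and the sketch does not stratify the folds by class, which a machine learning suite may do.

```python
import random


def k_fold_indices(n_instances, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (nearly) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 520 instances, as in the keystroke data set (20 inputs of each of 26 letters).
splits = list(k_fold_indices(520, k=10))
```

With 520 instances, each of the ten rounds trains on 468 instances and tests on the remaining 52.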
Tap event

To infer keystrokes, tap events first need to be distinguished from non-tap events. I based this analysis on the accelerometer readings, using the sum square of the three coordinates together with its mean and standard deviation:

$$\text{sumsquare} = x^2 + y^2 + z^2$$

As the figure below shows, there is a clear distinction between tap events and non-tap events. I held the phone without tapping for a while to obtain the mean and standard deviation for the non-tap event, for comparison with the tap event.
I obtained a mean of 98.88 and a standard deviation of 7.10. The mean minus the standard deviation is 91.77, and the mean plus the standard deviation is 105.98. Using the 68.2% band of a normal distribution curve (one standard deviation either side of the mean), I applied the non-tap mean and its lower and upper limits to the tap sensor readings to distinguish tap events from non-tap events.
Figure 4-2: (a) Non-tap event: most of the sensor readings fall within the 68.2% band of a normal distribution curve. (b) Any value outside the 68.2% band is considered a tap event.
Using the sum of squares of the accelerometer's x, y, and z coordinates, I built a model that extracts every sum-of-squares value outside the threshold band (68.2%) of the normal distribution, together with the corresponding x, y, and z values, which are labelled as tap events. Where the sum-of-squares value falls within the threshold band, the corresponding x, y, and z values are labelled as non-tap. I then trained a random forest classifier to separate tap events from non-tap events.
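The labelling rule above can be sketched as follows. The band limits use the mean (98.88) and standard deviation (7.10) reported in the text. The function names and the three sample readings are hypothetical, and the sketch covers only the threshold labelling, not the random forest trained afterwards.

```python
import statistics


def sum_square(x, y, z):
    """Sum of squares of one accelerometer reading."""
    return x * x + y * y + z * z


def fit_band(idle_samples):
    """Estimate the non-tap band (mean - SD, mean + SD) from idle readings."""
    values = [sum_square(*s) for s in idle_samples]
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return mean - sd, mean + sd


def label_samples(samples, band):
    """Label a reading 'tap' when its sum of squares falls outside the band."""
    low, high = band
    return ["non-tap" if low <= sum_square(*s) <= high else "tap"
            for s in samples]


# Band built from the non-tap statistics reported in the text.
band = (98.88 - 7.10, 98.88 + 7.10)
# Hypothetical (x, y, z) readings, not data from the study.
readings = [(0.2, 9.9, 0.5), (1.5, 11.0, 2.0), (0.1, 9.0, 0.3)]
print(label_samples(readings, band))
```

The labelled readings would then serve as training data for the tap versus non-tap classifier.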
The classifier's accuracy is nearly 90%, with almost no false positives or false negatives. A close look at figure 8 shows a clear distinction between the sums of squares of tap and non-tap events.
Keystroke inference
The sensor reader application captures data from both the accelerometer and the gyroscope. I analyzed keystroke inference for each sensor separately and for the two sensors combined.
The probability of inferring any one of the 26 letters by chance is 1/26 ≈ 0.038 (3.8%). After collecting sensor readings for 520 instances (20 of each letter) and applying the machine learning algorithms mentioned above, I now present the results.
Figure 4-3: The process blueprint. (Diagram labels: No tap; Data discarded; Classification result, shown three times.)
After testing the correlations of all the features, I found that the skewness feature has no correlation with the inference accuracy. I therefore removed it and considered only the remaining features (max, min, mean, median, and standard deviation), for a total of 15 features across the three axes.
Keystroke inference based on the accelerometer
Using the accelerometer readings, I obtained an accuracy of 31.92% for the Random Forest algorithm, 22.11% for the Multilayer Perceptron algorithm, and 28.46% for the Naïve Bayes algorithm. The confusion matrices for the three models are as follows:
Figure 4-4: Keystroke inference accuracy of the three machine learning algorithms based on the accelerometer sensor data.
Table 4-1: Confusion matrix for the random forest algorithm on the accelerometer sensor data. The first column is the accuracy for the specific key; the 2nd, 3rd, 4th, and 5th columns show how often the key was classified as each of its immediate neighbors; the last column is the total.
Key acc % | 1st conf % | 2nd conf % | 3rd conf % | 4th conf % | Total acc %
A:80 | D:5 | S:10 | Z:0 | Q:0 | 95
B:20 | N:5 | H:0 | G:5 | F:0 | 30
C:20 | V:5 | F:0 | V:5 | X:0 | 30
D:35 | F:15 | W:10 | S:0 | X:5 | 65
E:30 | F:5 | D:0 | W:5 | R:0 | 40
F:30 | G:5 | D:5 | S:10 | R:0 | 50
G:15 | H:10 | F:10 | B:5 | C:5 | 45
H:5 | G:10 | J:25 | U:0 | Y:0 | 40
I:45 | K:5 | O:10 | U:5 | M:0 | 65
J:30 | K:15 | U:10 | H:10 | I:0 | 65
K:35 | L:5 | J:10 | O:5 | M:0 | 55
L:20 | M:15 | U:5 | P:10 | O:0 | 50
M:70 | N:5 | L:0 | J:0 | K:0 | 75
N:5 | M:15 | H:5 | J:0 | B:10 | 35
O:20 | I:5 | P:10 | J:5 | M:0 | 40
P:60 | O:5 | L:0 | M:10 | K:5 | 80
Q:15 | Z:10 | S:5 | D:10 | A:5 | 45
R:35 | T:20 | E:0 | D:0 | F:5 | 60
S:25 | W:10 | Z:0 | A:10 | D:10 | 55
T:60 | R:30 | Y:0 | G:5 | F:0 | 95
U:15 | I:10 | J:10 | K:0 | L:5 | 40
V:55 | B:0 | C:0 | F:0 | G:0 | 55
W:30 | A:10 | D:5 | Z:15 | Q:5 | 65
X:35 | S:5 | D:10 | Z:0 | C:0 | 50
Y:25 | U:15 | T:5 | J:5 | H:0 | 50
Z:15 | X:0 | A:5 | S:0 | D:5 | 25
From the table above, the inference accuracy based on the accelerometer sensor data varies from 80% for letter ‘A’ to 5% for letters ‘N’ and ‘H’.
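As a cross-check, the overall random forest accuracy can be recovered from the per-key accuracies in the first column of the accelerometer table. The sketch below assumes every key contributes the same number of instances (20 each, as stated earlier), so the overall figure is the plain average. The variable names are illustrative.

```python
# Per-key accuracy (%) copied from the first column of the accelerometer table.
per_key = {
    "A": 80, "B": 20, "C": 20, "D": 35, "E": 30, "F": 30, "G": 15, "H": 5,
    "I": 45, "J": 30, "K": 35, "L": 20, "M": 70, "N": 5, "O": 20, "P": 60,
    "Q": 15, "R": 35, "S": 25, "T": 60, "U": 15, "V": 55, "W": 30, "X": 35,
    "Y": 25, "Z": 15,
}

# With equal instances per key, overall accuracy is the mean of the column.
overall = sum(per_key.values()) / len(per_key)
best = max(per_key, key=per_key.get)
lowest = min(per_key.values())
worst = [key for key, acc in per_key.items() if acc == lowest]

print(round(overall, 2))  # 31.92, matching the reported random forest accuracy
print(best, worst)
```

The same check applies to the gyroscope and combined-sensor tables, whose first columns average to the accuracies reported for those scenarios.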
Figure 4-5: Keystroke inference accuracy from 20 instances of each key on the keyboard, based on the accelerometer sensor.
Figure 4-6: Confusion probabilities of ‘S’, ‘G’ and ‘K’ based on the accelerometer sensor. The keys lie on the left, middle, and right of the keyboard. The main keys are shown in dark color with their accuracies; the keys in light color show the misclassification rate for each neighboring character.
Keystroke inference based on the gyroscope sensor
Using the gyroscope readings, I obtained an accuracy of 39.23% for the Random Forest algorithm, 24.42% for the Multilayer Perceptron algorithm, and 27.69% for the Naïve Bayes algorithm. The confusion matrices for the three models are as follows:
Figure 4-7: Keystroke inference accuracy of the three machine learning algorithms based on the gyroscope sensor data.
Table 4-2: Confusion matrix for the random forest algorithm on the gyroscope sensor data. The first column is the accuracy for the specific key; the 2nd, 3rd, 4th, and 5th columns show how often the key was classified as each of its immediate neighbors; the last column is the total.
Key acc % | 1st conf % | 2nd conf % | 3rd conf % | 4th conf % | Total acc %
A:60 | Q:5 | S:25 | W:10 | Q:0 | 100
B:25 | N:10 | V:15 | G:5 | F:0 | 55
C:50 | V:10 | F:5 | V:5 | X:0 | 70
D:30 | F:20 | W:10 | S:10 | X:5 | 75
E:15 | F:0 | D:0 | W:10 | Z:10 | 35
F:50 | G:0 | D:10 | S:5 | T:5 | 70
G:40 | H:10 | F:10 | V:0 | T:5 | 65
H:15 | G:5 | J:10 | U:0 | Y:0 | 30
I:60 | K:5 | O:0 | U:0 | M:0 | 65
J:20 | K:10 | U:5 | H:15 | N:10 | 60
K:30 | L:10 | J:10 | O:5 | N:5 | 60
L:35 | M:25 | U:0 | P:10 | N:10 | 70
M:45 | N:0 | L:20 | J:0 | U:15 | 80
N:25 | M:0 | Y:5 | J:5 | B:5 | 40
O:45 | I:0 | P:5 | J:0 | M:0 | 50
P:60 | O:10 | L:0 | M:5 | K:0 | 75
Q:30 | Z:10 | S:0 | D:5 | A:15 | 60
R:75 | T:10 | E:0 | D:0 | F:0 | 85
S:35 | X:10 | Z:0 | A:25 | D:15 | 85
T:70 | R:20 | Y:0 | G:0 | F:0 | 90
U:40 | I:0 | J:0 | K:0 | M:20 | 60
V:45 | B:15 | C:0 | F:0 | G:0 | 60
W:30 | A:10 | D:5 | Z:15 | Q:5 | 75
X:30 | S:0 | D:20 | Z:5 | C:10 | 65
Y:20 | U:5 | T:10 | G:5 | H:5 | 45
Z:40 | W:5 | A:5 | S:5 | D:10 | 65
From the table above, the inference accuracy based on the gyroscope sensor data varies from 75% for letter ‘R’ to 15% for letters ‘E’ and ‘H’.
Figure 4-8: Keystroke inference accuracy from 20 instances of each key on the keyboard, based on the gyroscope sensor.
Figure 4-9: Confusion probabilities of ‘S’, ‘G’ and ‘K’ based on the gyroscope sensor. The keys lie on the left, middle, and right of the keyboard. The main keys are shown in dark color with their accuracies; the keys in light color show the misclassification rate for each neighboring character.
Keystroke inference based on the amalgamation of the two sensors
Using the combined gyroscope and accelerometer readings, the models gave an accuracy of 42.69% for the Random Forest algorithm, 31.73% for the Multilayer Perceptron algorithm, and 33.07% for the Naïve Bayes algorithm. The confusion matrices for the three models are as follows:
Figure 4-10: Keystroke inference accuracy of the three machine learning algorithms based on the amalgamation of the two motion sensors' data, with random forest more accurate than the other two algorithms.
Table 4-3: Confusion matrix for the random forest algorithm on the combined accelerometer and gyroscope sensor data. The first column is the accuracy for the specific key; the 2nd, 3rd, 4th, and 5th columns show how often the key was classified as each of its immediate neighbors; the last column is the total.
Key acc % | 1st conf % | 2nd conf % | 3rd conf % | 4th conf % | Total acc %
A:75 | E:5 | S:15 | W:0 | D:5 | 100
B:25 | N:10 | V:10 | G:0 | F:0 | 45
C:30 | V:10 | F:0 | E:5 | X:5 | 50
D:35 | F:15 | W:10 | S:5 | W:10 | 75
E:20 | F:5 | D:5 | W:0 | Z:10 | 40
F:50 | G:0 | D:15 | S:5 | T:5 | 75
G:40 | H:5 | F:10 | V:0 | T:5 | 60
H:20 | G:10 | J:5 | U:5 | Y:0 | 40
I:65 | K:10 | O:0 | U:0 | L:5 | 80
J:35 | K:15 | U:5 | H:10 | N:5 | 70
K:50 | L:5 | J:5 | O:5 | N:5 | 70
L:25 | M:25 | U:0 | P:5 | N:15 | 75
M:80 | N:10 | L:10 | J:0 | U:0 | 100
N:15 | M:10 | Y:0 | J:5 | B:5 | 35
O:35 | I:15 | P:5 | J:0 | M:0 | 55
P:60 | O:5 | L:0 | M:10 | K:5 | 80
Q:25 | Z:5 | S:5 | D:5 | A:5 | 45
R:70 | T:20 | E:0 | D:0 | F:0 | 90
S:40 | X:5 | Z:0 | A:10 | W:15 | 70
T:70 | R:20 | Y:0 | G:0 | F:0 | 90
U:35 | N:5 | J:5 | Y:10 | M:15 | 70
V:45 | B:10 | C:5 | F:0 | G:0 | 60
W:40 | A:0 | D:25 | Z:5 | Q:5 | 75
X:35 | S:5 | D:15 | Z:5 | C:5 | 65
Y:45 | U:0 | T:5 | G:0 | H:5 | 55
Z:45 | W:5 | A:5 | S:0 | D:0 | 55
From the table above, the inference accuracy based on the combined sensor data varies from 80% for letter ‘M’ to 15% for letter ‘N’.
Figure 4-11: Keystroke inference accuracy from 20 instances of each key on the keyboard, based on the gyroscope + accelerometer sensors.
Figure 4-12: Confusion probabilities of ‘S’, ‘G’ and ‘K’ based on the gyroscope + accelerometer sensors. The keys lie on the left, middle, and right of the keyboard. The main keys are shown in dark color with their accuracies; the keys in light color show the misclassification rate for each neighboring character.
The inference accuracy for character ‘A’ is high in all three scenarios because of its extreme position on the keyboard: tapping ‘A’ requires a larger gesture, which causes more variation in the motion sensor readings. Letters ‘N’ and ‘H’ have very low accuracy in all three scenarios because of their positions on the keyboard and the position of the hand on the back of the phone. Holding the phone with one hand places these two keys in the middle of the hand grip, which supports the phone on both sides, so a tap on either key does not produce enough movement to cause much variation in the motion sensor readings.
The most effective sensor and algorithms
The analysis above shows that, for the features extracted, the amalgamated readings from the accelerometer and gyroscope have the highest inference accuracy, followed by the gyroscope alone, with the accelerometer alone having the lowest inference accuracy.
Adding the accelerometer improves on the gyroscope-only accuracy by just 3.46 percentage points for the best-performing algorithm (Random Forest). For the other two algorithms, Naïve Bayes and Multilayer Perceptron, it adds 5.38 and 7.31 percentage points respectively. The gyroscope achieves a higher inference accuracy than the accelerometer because of its angular calibration along the three axes.
Figure 4-13: (a) Multilayer perceptron inference accuracies. (b) Naïve Bayes inference accuracies. (c) Random forest inference accuracies.
Conclusion: The gyroscope is the more effective of the two sensors, and the most effective machine learning algorithm is the random forest, with the highest accuracy of 42.69%. This is attributed to the random selection made at each decision tree node.
This helps give equal importance to each feature of the dataset.
Keystroke inference per-user
Six (6) users of different nationalities, four (4) male and two (2) female, were asked to enter 114 words consisting of 626 characters in a controlled setting. This was done to check whether typing patterns differ between users, and whether keystroke patterns show similarities or differences with regard to nationality. The best-performing algorithm and the most effective sensor had already been established, so for this exercise I focused on the Random Forest algorithm and the combined accelerometer + gyroscope readings. Because the number of instances differs from key to key, only the keys with the highest frequencies were taken into consideration.
Table 4-4: Per-user confusion matrix of inference accuracies for the highest-frequency characters.
User | Key inference accuracy | Two neighboring keys accuracy | %
User 1 | A (11/56) | E,S (19/56) | 53.57
User 1 | E (23/62) | R,S (9/62) | 51.61
User 1 | O (13/40) | I,P (4/40) | 42.50
User 2 | A (16/56) | E,S (14/56) | 53.57
User 2 | E (27/62) | R,S (13/62) | 64.51
User 2 | O (10/40) | I,P (8/40) | 45
User 3 | A (11/56) | E,S (18/56) | 51.78
User 3 | E (24/62) | R,S (9/62) | 53.22
User 3 | O (13/40) | I,P (5/40) | 45
User 4 | A (8/56) | E,S (14/56) | 39.28
User 4 | E (20/62) | R,S (17/62) | 59.67
User 4 | O (19/40) | I,P (11/40) | 75
User 5 | A (15/56) | E,S (13/56) | 50
User 5 | E (24/62) | R,S (17/62) | 66.12
User 5 | O (6/40) | I,L (9/40) | 37.5
User 6 | A (14/56) | E,S (22/56) | 64.28
User 6 | E (22/62) | R,S (12/62) | 54.83
User 6 | O (16/40) | I,P (14/40) | 75
Figure 4-14: Per-user inference accuracy for the letters ‘A’, ‘E’, and ‘O’ and their closest neighboring keys on the keyboard.
Classification of users based on each user
In this section I check whether users differ in how they type, using the dataset from the six (6) volunteers, four (4) male and two (2) female.
I used the Random Forest algorithm and the dataset from the amalgamation of the two sensors (accelerometer and gyroscope). Using 10-fold cross-validation on the dataset, the confusion matrix is as follows:
Figure 4-15: Confusion matrix using the random forest algorithm, with an accuracy of 96%.
The confusion matrix, figure 21, shows that the keystroke pattern depends on the user. Each user has a unique way of typing on the smartphone, owing to the individual's way of holding the phone and the force of the keystrokes.
Classification of users based on nationality
There is some misclassification among the last three volunteers, where some of their data overlap. The last three volunteers are from the same country. This misclassification among three users of the same nationality suggests that people of the same nationality type on their smartphones in a similar way.
Given that 50% of my data comes from people from China, I divided the dataset into two groups, China and non-China. I used 66% of the dataset, consisting of data from two Chinese and two non-Chinese users, to create a model, and the remaining 34% for testing. I then trained the random forest algorithm on the training data and obtained the confusion matrix below:
Figure 4-16: Confusion matrix for the classification model based on nationality.
After testing the model on the remaining 34% of the dataset, I obtained the confusion matrix below:
Figure 4-17: Confusion matrix after testing on the remaining 34% of the dataset, with 71% accuracy.
The confusion matrix shows 100% accuracy for the user from China and 45% for the non-China user. This suggests that some specific characters among the 26 distinguish how people from one nation type from how people from another nation type, in this case people from China and people from elsewhere.
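A split of this kind, in which whole users are held out so that the model is tested on people it has not seen, can be sketched as below. The records, the choice of held-out users, and the function name are hypothetical. Users 4 to 6 are taken to be the Chinese volunteers, following the earlier remark that the last three volunteers share a country.

```python
# Hypothetical records: (user_id, nationality_label, key). Users 4-6 are taken
# to be the Chinese volunteers; which users are held out is an assumption.
records = [(user, "China" if user >= 4 else "non-China", key)
           for user in range(1, 7) for key in "AEO"]


def split_by_user(records, train_users):
    """Keep every record of a given user on one side of the split."""
    train = [r for r in records if r[0] in train_users]
    test = [r for r in records if r[0] not in train_users]
    return train, test


# Two non-Chinese users (1, 2) and two Chinese users (4, 5) train the model;
# one user from each group (3 and 6) is held out for testing.
train, test = split_by_user(records, train_users={1, 2, 4, 5})
print(len(train) / len(records))
```

With equal amounts of data per user, four of six users correspond to roughly two thirds of the dataset for training and one third for testing.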
I then looked for the higher-frequency characters that the confusion matrix above showed to be classified into the two classes with reasonable accuracy. The characters that successfully separated the two classes are the letters ‘I’, ‘N’, ‘O’, ‘P’, ‘T’, ‘U’ and ‘M’. I created another model using only those characters, again from 66% of the main data. The model has an accuracy of 93.76%.
Figure 4-18: The confusion matrix for the new model using the unique characters.
I extracted the unique characters from the remaining 34% of the dataset and used them as a test set, obtaining an accuracy of 86.61%. This indicates that there is a distinction in how people from the two groups of countries type on the smartphone.
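Restricting the dataset to the distinguishing letters can be sketched as follows. The record layout of (key, feature vector) pairs and the sample instances are hypothetical placeholders. Only the set of seven letters comes from the text.

```python
# The seven distinguishing letters named in the text.
UNIQUE_KEYS = frozenset("INOPTUM")


def keep_unique_keys(instances):
    """Keep only (key, features) pairs whose key is a distinguishing letter."""
    return [(key, feats) for key, feats in instances
            if key.upper() in UNIQUE_KEYS]


# Hypothetical instances; the feature vectors are placeholders.
instances = [("A", [0.1]), ("I", [0.2]), ("n", [0.3]), ("Z", [0.4]), ("M", [0.5])]
subset = keep_unique_keys(instances)
print([key for key, _ in subset])
```

The same filter would be applied to both the training portion and the held-out portion, so that the model is trained and tested on the same set of letters.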
Figure 4-19: The confusion matrix for the test set, with an accuracy of 86.62%.
To test the model further, I collected another set of sensor readings from one non-Chinese person, using only the unique characters. I asked the person to enter each of the unique characters 20 times, for a total of 140 instances. I combined the new data with data from a Chinese person to form two classes, China and non-China.
After testing the model on the new dataset, I obtained an accuracy of 99.69%.
Figure 4-20: The confusion matrix for the new dataset, with an accuracy of 99.69% after testing the model.
This diversity analysis suggests that people can be grouped by how they type on smartphone screens, owing to factors such as finger length and how consistently different people use certain characters; in English, for example, the most commonly used characters are the vowels (A E I O U). This is open for further research.