Forensic analysis of China's cloud storage services

Long Chen, Qing Zhang

Institute of Computer Forensics, Chongqing University of Posts and Telecommunications, Chongqing 400065, China

Abstract

Many users now rely on cloud storage services to store or share their data. At the same time, a growing number of cases involve keeping illegal information in a cloud storage service or stealing a company's confidential data through one. A study of digital forensic investigation of cloud storage services is therefore necessary. This paper discusses the types of terrestrial artifacts that are likely to remain on a client's machine and analyzes the patterns those artifacts follow after the cloud storage has been accessed. Finally, the paper proposes a method for investigating and analyzing the artifacts in order to reconstruct the events of a user's activities.

Key words: cloud computing; cloud storage; digital forensics; user's activities

1. Introduction

In cloud environments, data and processing power can be shared and distributed across one or more datacenters spread over a specific geographical area or even the entire globe. This complex structure poses huge challenges for computer forensics. Adapting to these changes has made computer forensics in cloud computing an important topic of both theoretical and practical value. Current research on cloud forensics, in China and abroad, concentrates on two approaches: 1) The cloud server is designed to record user information, and customers can obtain network, process, and access logs through a read-only API on the server. The investigator uses this information to analyze the user's activities. 2) Suspicious data is collected from the client's machine and then analyzed to determine the user's activities.

Shams Zawoad et al. [1] introduce Secure-Logging-as-a-Service, which stores virtual machines' logs and provides access to forensic investigators while ensuring the confidentiality of the cloud users. Shams Zawoad et al. [2] also introduce the idea of building proofs of past data possession in the context of a cloud storage service and discuss how such proofs can be used effectively in cloud forensics. Li-ping Ding et al. [3] have proposed a forensics framework under an infrastructure-as-a-service cloud model.
Experiments show that the framework can obtain evidence data from the cloud platform effectively and efficiently. Ting Sang et al. [4] propose an approach that uses a log model to build a forensic-friendly system.
Using this model, information can be gathered quickly from cloud computing for certain forensic purposes. Darren Quick et al. [5] used Microsoft SkyDrive as a case study and identified the types of terrestrial artifacts that are likely to remain on a client's machine. Fabio Marturana et al. [6] discussed technical aspects of digital forensics in cloud computing environments and presented the results of a case study on user-cloud interaction, aimed at assessing whether existing digital forensics techniques are still applicable to cloud investigations. Jason S. Hale [7] discusses the digital artifacts left behind after an Amazon Cloud Drive has been accessed or manipulated from a computer. Kim-Kwang Raymond Choo et al. [8] used three popular public cloud storage providers (Dropbox, Google Drive, and Microsoft SkyDrive) as case studies to explore the process of collecting data from a cloud storage account using a browser and also downloading files using client software.
Darren Quick et al. [9] used Dropbox as a case study to determine the data remnants on a Windows 7 computer and an Apple iPhone when a user employs a variety of methods to store, upload, and access data in the cloud.

Of the studies above, the first approach records user information and stores the logs in the cloud storage service, which requires the cloud storage provider to change the current design framework of its service. The second approach obtains information from the client's machine, but the foreign research focuses only on specific cloud storage services and offers only a simple analysis of the data generated by using them. It does not propose a process for collecting and analyzing these data, so its general applicability is limited. In addition, using a cloud storage service always generates a huge amount of suspicious data, and during a cloud forensic investigation the investigators have to spend a lot of time analyzing these data manually.
In this paper, we use the 360 and Baidu cloud storage services as case studies to discuss the types of terrestrial artifacts that are likely to remain on a client's machine and analyze the patterns those artifacts follow after the cloud storage has been accessed. We then propose a method to reconstruct the events of a user's activities by combining logs and historical data remnants. Finally, we develop an autopsy tool that helps forensic investigators complete some tasks automatically.
This tool saves forensic analysis time and greatly improves the efficiency of forensics.

2. Important factors in an investigation

Currently, users usually access cloud storage services through browsers or client applications. In either case, a lot of evidence is left on the user's device.
This section outlines the elements that are prioritized for investigation among the data collected from browsers and clients, and gives the rationale for choosing them.

2.1 Log files of web browsers

Although browsers differ in kernel structure and in how they store traces of online activity, they all record information in forms such as history, cookies, and cache.
The browser's history record is an important consideration. The browser generates a large number of URL records. Analyzing the history can show that the user has used the browser to access the cloud service, but it is not enough to reveal the details of the user's operations. By analyzing the user's browsing cookies, the investigator can obtain much useful case-related information, such as access time, login name, access frequency, operation events, and the content of the files operated on. The browser cache is crucial to the forensic investigation. When a browser accesses a cloud storage service, it is essentially calling network APIs: the client sends a request to the cloud server, and the cloud server sends the corresponding reply to the client. The cache files store these responses, including pictures, Flash, JS scripts, CSS files, and some HTML files from the sites visited. Analyzing the cache files can reveal the detailed operations the user performed when accessing the cloud storage service through the browser.
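As an illustration of what a cached request URL can yield, the following minimal Python sketch splits one URL into its key-value pairs. The URL is the 360 example reproduced in Fig.1; the function name `parse_cached_url` and the `time_utc` key are hypothetical, and reading the trailing 13 digits of `callback` as milliseconds since the Unix epoch is an assumption rather than something the service documents.

```python
import re
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit

# Example URL taken from Fig.1 (cache entry created when a file is opened).
FIG1_URL = (
    "http://p53-3.yunpan.360.cn/intf.php?method=Preview.getHtmlInfo"
    "&fhash=d6e31e02121a01fdb47107a8c05e86e7d470fc62"
    "&fname=%E6%B5%8B%E8%AF%95%E6%96%87%E4%BB%B6.docx&pub=0"
    "&ck=609ada7600e39a68d8e612126cdfde0c&ofmt=jsonp"
    "&callback=QWJsonp1406769490989"
)

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_cached_url(url):
    """Return the query parameters of a cached request URL as a flat dict."""
    query = parse_qs(urlsplit(url).query)  # also percent-decodes the values
    fields = {key: values[0] for key, values in query.items()}
    # Assumption: the trailing 13 digits of `callback` are milliseconds
    # since the Unix epoch (UTC).
    match = re.search(r"(\d{13})$", fields.get("callback", ""))
    if match:
        fields["time_utc"] = EPOCH + timedelta(milliseconds=int(match.group(1)))
    return fields


record = parse_cached_url(FIG1_URL)
print(record["method"], record["fhash"], record["time_utc"].isoformat())
```

The `fname` value is percent-decoded by `parse_qs`, so non-ASCII file names are recovered as text rather than as escape sequences.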
On Windows, Internet Explorer (IE) is the best-known web browser. Therefore, this paper focuses only on the log files of Internet Explorer.

2.2 Artifacts of client application in PC

Most cloud service providers offer a client application for accessing the cloud service. After the client application is installed, many files are generated on the disk of the user's machine, such as log files, database files, and configuration files. These files may contain a lot of suspicious data. Analyzing and mining these data can yield a lot of valuable information for reconstructing the events of the user's activities, determining the possible event sequences, and reconstructing the activity scene. This helps the investigator understand what took place and how, and provides a foundation for auditing user behavior.
The log files contain much key information about the requests the user has made to the cloud service provider. When the user uploads, renames, downloads, or deletes a file, some information is recorded in the log files. From this we can reconstruct the timeline of a user's activities in cloud storage.
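As a sketch of how such a timeline might be extracted, the Python fragment below pulls upload events out of log text. The two sample lines are copied from the 360 sync.log excerpt in Fig.3; the regular expression is an assumed layout inferred from that single "upload begin" line, and the names `UPLOAD_BEGIN` and `parse_uploads` are hypothetical.

```python
import re
from datetime import datetime

# Two lines copied from the sync.log excerpt in Fig.3.
SAMPLE_LOG = (
    "2014-11-01 10:16:38.558 db Transaction Begin\n"
    "2014-11-01 10:16:38.862 upload192810392 begin: est.docx, size:10258, "
    "fhash 6e553062ce4565dc230e6c598288a8036e42658e\n"
)

# Assumed layout of an "upload begin" entry, inferred from the line above.
UPLOAD_BEGIN = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"upload(?P<task>\d+) begin: (?P<name>.+), size:(?P<size>\d+), "
    r"fhash (?P<fhash>[0-9a-f]{40})$"
)


def parse_uploads(text):
    """Return upload events found in the log text, ordered by time."""
    events = []
    for line in text.splitlines():
        match = UPLOAD_BEGIN.match(line.strip())
        if match is None:
            continue  # not an "upload begin" entry
        events.append({
            "time": datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S.%f"),
            "action": "upload",
            "name": match["name"],
            "size": int(match["size"]),
            "fhash": match["fhash"],
        })
    return sorted(events, key=lambda event: event["time"])


events = parse_uploads(SAMPLE_LOG)
print(events)
```

Other operations (rename, download, delete) would need their own patterns, which would have to be derived from real log samples.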
The database files usually store information about folders and files on a PC. This information includes the file path, the filename, the file hash, the file size, the creation and modification times, etc. The configuration files usually contain the account ID, the username, the email, etc.

2.3 Procedure for digital investigation of cloud storage

The investigator collects and analyzes data from the device that a user has used to access a cloud storage service. The forensic investigation of cloud storage consists of five steps:

1. Analyze the registry on the user's device to obtain information about the installed browsers and cloud storage clients and their installation directories;
2. Collect the suspicious data (related to the evidence of targets) that each browser and cloud client has stored on the user's device. These data include the browser cache, history records, download history, the web log of the cloud client, the synchronization log, database files, and configuration files;
3. Analyze and mine the suspicious data to extract the user's activities, then standardize them in the following format: user's activities = ;
4. Analyze and process the standardized user activity data. First, store these data in a dataset, group similar data, and delete repeated data. Second, sort the data in time order. Last, iterate over the dataset, complete the missing information by reasoning forward, and rebuild the events of the user's activities;
5. Obtain the events of the user's activities according to the requirements, and analyze the correlations and rules of the user's activities across different times, targets, and behavioral intentions. Then determine the possible event sequences and reconstruct the activity scene. This helps the investigator understand what took place and how, and provides a foundation for auditing user behavior.

3. Artifacts of cloud storage services

In general, the client applications of cloud storage services use one of two methods to record the user's operations: a database or a log file. The 360 cloud and Baidu cloud are typical representatives of these two approaches. Here we use two popular public cloud storage services in China (360 and Baidu) as case studies to describe the artifacts left on Windows after a customer has used a cloud storage service.

3.1 360 cloud

Among the various domestic cloud storage services, 360 cloud storage is one of the best known. It provides a large free storage space as well as full functionality and a good user experience. The 360 cloud storage service records the user's activities in logs.
3.1.1 Web browser

When the user opens a file, a cache file named intfn.js is created. The URL attribute of the cache file begins with “http://pXX-X. php?method=” and also contains extra information. This extra information records the user's activities as key-value pairs, as shown in Fig.1. The method field is the type of the user's action. The fhash field is the hash of the file on which the user acted.
The fname field is the name of the file on which the user acted. The callback field is the time at which the user performed the action.

http://p53-3.yunpan.360.cn/intf.php?method=Preview.getHtmlInfo&fhash=d6e31e02121a01fdb47107a8c05e86e7d470fc62&fname=%E6%B5%8B%E8%AF%95%E6%96%87%E4%BB%B6.docx&pub=0&ck=609ada7600e39a68d8e612126cdfde0c&ofmt=jsonp&callback=QWJsonp1406769490989

Fig.1: The URL attribute of the cache file after the user opens a file

When the user uploads, renames, downloads, or deletes a file, a cache file named webclickn.htm is created. The URL attribute of the cache file begins with http://s.360.cn/yunpan/webclick.html?u=http%3A%2F%2Fyunpan.360.cn%2Fmy. Fig.2 shows the URL attribute of the cache file after the user uploads a file to the cloud storage service with the IE browser.

http://s. 1414724501831.406&buttonid=Upload&t=1414724877888

Fig.2: The URL attribute of the cache file after the user uploads a file to 360

3.1.2 client software

Some folders and files are created when the client software is used on a Windows system. The observed folder structure is listed in Table1.
Among these folders and files, history.dat, filecache.db, and sync.log contain important information.

Table1 Important files and paths

Path                                                  details
%profile%Roaming360CloudUIsync.log                    the client log
%profile%Roaming360CloudUIuser IDfilecache.db         the local cache file information
%profile%Roaming360CloudW in2user ID history.dat      the history of upload
%profile%Roaming360CloudW in2sync.log                 synchronous log
%profile%Roaming360CloudW in2user ID config.ini       the user information
%profile%Roaming360CloudW in2user ID filecache.db     the local cache file information
%profile%Roaming360CloudW in2user ID history.dat      the history of upload

First, history.dat and filecache.db contain the same information; they record the history of the user's file uploads in different ways. Second, config.ini contains the user name, the account ID, and the user email. Third, sync.log contains key information about what the user has most recently uploaded, edited, opened, downloaded, and deleted. This file also contains authentication information, the account ID, the IP address, and the times at which the application started and ended. When the user uploads a file, some information is recorded in sync.log, as shown in Fig.3.
The information includes the operation type, the filename, the file hash, the operation time, the client's IP, etc.

2015-01-18 16:22:39.103 DLL3.0.0.1500 DEVUI 18.104.22.1680 os6.1 ie9 206cca6a7afe7048f4666fbda7646a3d
2015-01-18 16:22:39.103 SetUser 262965246, type 0
2015-01-18 16:22:39.107 SetDiskRoot D:360CloudUICache262965246
2015-01-18 16:22:39.710 resp user detail. ver 18683, node_count 11930, last_login_ip:113.250.159.87
2014-11-01 10:16:38.558 status 6(ok) -> 5(monitor)
2014-11-01 10:16:38.558 db Transaction Begin
2014-11-01 10:16:38.558 out_upload Queue:1 new est.docx
2014-11-01 10:16:38.862 upload192810392 begin: est.docx, size:10258, fhash 6e553062ce4565dc230e6c598288a8036e42658e
2014-11-01 10:16:38.862 req upload filesize=10258, est.docx
2014-11-01 10:16:39.016 upload192810392 have, new_ver:1, name: est.docx
2014-11-01 10:16:39.016 status 5(monitor) -> 6(ok)

Fig.3: sync.log

3.2 Baidu cloud

Baidu cloud is another frequently used storage service. It also provides both browser and client access to the cloud storage service. Unlike 360 cloud storage, it mainly uses a database to store information.

3.2.1 Web browser

When the user opens a file, a cache file named An.html is created.
The URL attribute of the cache file begins with “http://www.baidupcs.com/” and also contains some extra information. This extra information likewise records the user's activities as key-value pairs, as shown in Fig.4. The method field is the type of the user's action. The md5 field is the MD5 hash of the file on which the user acted.
The time field is the time at which the user performed the action. The name of the file on which the user acted cannot be obtained from the URL, but the file information can be queried from the cache_file table of the client application by the md5 value.

http://www.baidupcs.com/doc/d0770031ef57bacc4d312dced0256952?fid=621326181-250528-1069966796137668&time=1417955739&rt=pr&sign=FDTAER-DCb740ccc5511e5e8fedcff06b081203-kZhuu5nQxLYVbCgEBpox8JMZOcE%3d&expires=8h&chkbd=0&chkv=0&method=view&md5=d0770031ef57bacc4d312dced0256952&type=swf&pn=0&rn=1

Fig.4: The URL attribute of the cache file after the user opens a file in Baidu cloud

3.2.2 client software

Whenever a user adds, edits, or deletes a file, some information is stored in database files. The database file structure is shown in Fig.5. The SQLite database BaiduYunGuanjia.db includes six important tables.
The backup_file table records information about files backed up using the client. The cache_file table records information about all files on the server. The download_file table records information about files currently being downloaded. The download_history_file table records information about files that have been downloaded. The upload_file table records information about files currently being uploaded. The upload_history_file table records information about files that have been uploaded.
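Before querying any of these tables, an investigator would typically confirm which tables and columns a given database file actually contains. The following minimal Python sketch does this with the standard sqlite3 module. The function name `describe_database` is hypothetical, and because the column names of BaiduYunGuanjia.db are not listed here, the sketch only enumerates what it finds and assumes nothing about the schema.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def describe_database(db_path):
    """Map each table name in a SQLite file to its list of column names."""
    # mode=ro asks SQLite to open the file read-only.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )]
        schema = {}
        for table in tables:
            escaped = table.replace('"', '""')  # quote the identifier safely
            columns = conn.execute(f'PRAGMA table_info("{escaped}")').fetchall()
            schema[table] = [column[1] for column in columns]  # index 1 = name
    return schema
```

Running `describe_database` on a working copy of the database (never the original evidence file) would show whether a table such as cache_file is present and which of its columns could be matched against an md5 value recovered from the browser cache.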
These tables contain key information such as the server_path, the filename, the MD5 hash of the file, and the creation and modification times. We can reconstruct the user's activities from this information.

Fig.5: BaiduYunGuanjia.db

4. Case study of a cloud storage service

4.1 Case overview

Suppose an employee of an enterprise has disclosed the company's important design documents. According to the investigation, it is likely that the employee used the 360 cloud storage service to copy and steal the design documents. In addition, the employee may have changed the original file names and deleted some documents to hide the traces after the crime.

4.2 Method

The investigator first found that the 360 cloud storage client was installed on the employee's PC. Second, the investigator collected the suspicious data (related to the evidence of a crime) that each browser and cloud client had stored on the user's device.
A record of access to the 360 cloud storage service was also found in the history of the IE browser. The investigator obtained the user's activity information and the copied files by analyzing the cache files, and then obtained further activity information from the sync.log of the client software. Finally, the investigator standardized the user's activities and used the developed automated tool to rebuild the events of the user's activities, analyzed the correlations and rules of the user's activities across different times, targets, and behavioral intentions, and extracted the relationships among the user's activities.
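The standardize, deduplicate, sort, and complete steps can be sketched in Python as follows. This is not the automated tool itself: the `Activity` record and its fields are hypothetical stand-ins for the standardized format, and "reasoning forward" is interpreted here, as an assumption, as carrying a file name forward to later records that share the same hash but lack a name.

```python
from dataclasses import dataclass, replace
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Activity:
    """Hypothetical standardized record of one user action."""
    time: datetime
    action: str
    file_hash: str
    file_name: Optional[str] = None
    source: str = ""


def rebuild_events(records):
    # Delete repeated data: the same action on the same file at the same
    # time counts once; keep the copy that carries a file name if any does.
    merged = {}
    for rec in records:
        key = (rec.time, rec.action, rec.file_hash)
        kept = merged.get(key)
        if kept is None or (kept.file_name is None and rec.file_name is not None):
            merged[key] = rec
    # Sort by time sequence.
    ordered = sorted(merged.values(), key=lambda rec: rec.time)
    # Forward pass: fill a missing name from an earlier record with that hash.
    known_names = {}
    events = []
    for rec in ordered:
        if rec.file_name is None and rec.file_hash in known_names:
            rec = replace(rec, file_name=known_names[rec.file_hash])
        elif rec.file_name is not None:
            known_names[rec.file_hash] = rec.file_name
        events.append(rec)
    return events


# Illustrative records only; the values are made up for the example.
records = [
    Activity(datetime(2014, 11, 1, 10, 20, 0), "download", "abc123"),
    Activity(datetime(2014, 11, 1, 10, 16, 38), "upload", "abc123", None, "cache"),
    Activity(datetime(2014, 11, 1, 10, 16, 38), "upload", "abc123", "design.docx", "sync.log"),
]
events = rebuild_events(records)
for event in events:
    print(event.time, event.action, event.file_name)
```

In this example the two upload records collapse into one, and the later download inherits the file name through the shared hash.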
During a forensic investigation, our autopsy tool can be used to reconstruct the events of the user's activities. The result is output to a TXT file.

4.3 Result

For this case, the forensic investigator can analyze the user's data operations according to the events of the user's activities, and then determine whether the user disclosed the company's confidential information. The investigator determined the possible event sequences from the obtained events, traced every step of the processing of each file, and then reproduced the crime scene.
This work helps the investigator understand what took place and how, so that the investigator can judge whether the user has disclosed the file.

5. Conclusion

This paper analyzes the traces left on the user's device, together with their storage methods and rules, after the user accesses a cloud storage service through a client application or browser on the Windows operating system. These traces and storage rules help the investigator extract complete and reliable evidence quickly.
This paper then presents a method to reconstruct the user's activities. The method can associate the different traces left by the user's activities, rebuild the user's access to the cloud storage service, analyze the user's data operations, and provide clues for further investigation and analysis. This paper uses the Baidu and 360 cloud storage services as case studies, but the method can also be applied to other cloud storage services. This paper focuses mainly on the PC client of cloud storage services; similar applications on mobile devices will be the subject of our future research.

Acknowledgements

The research work was supported by the National Social Science Foundation of China under Grant No. 14BFX156 and the Natural Science Foundation of CQ CSTC of P. R. China (No. cstc2011jjA40031, cstc2011jjA1350).

References

[1] Shams Zawoad, Amit Kumar Dutta, Ragib Hasan. SecLaaS: secure logging-as-a-service for cloud forensics[C]// ASIA CCS'13 Proceedings of the 8th ACM SIGSAC symposium on Information, computer and communications security. New York: ACM, 2013: 219-230.
[2] Shams Zawoad, Ragib Hasan. I have the proof: providing proofs of past data possession in cloud forensics[C]// Cyber Security. Washington, DC: IEEE, 2012: 75-82.
[3] XIE Yalong, DING Liping, LIN Yuqi, et al. ICFF: a cloud forensics framework under the IaaS model[J]. Journal on Communications, 2013, 34(5): 200-206.
[4] Ting Sang. A log based approach to make digital forensics easier on cloud computing[C]// Intelligent System Design and Engineering Applications (ISDEA), 2013 Third International Conference on. Hong Kong: IEEE, 2013: 91-94.
[5] Darren Quick, Kim-Kwang Raymond Choo. Digital droplets: Microsoft SkyDrive forensic data remnants[J]. Future Generation Computer Systems, 2013, 29(6): 1378-1394.
[6] Fabio Marturana, Gianluigi Me, Simone Tacconi. A case study on digital forensics in the cloud[C]// Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), 2012 International Conference on. Sanya: IEEE, 2012: 111-116.
[7] Jason S. Hale. Amazon cloud drive forensic analysis[J]. Digital Investigation, 2013, 10(3): 259-265.
[8] Darren Quick, Kim-Kwang Raymond Choo. Forensic collection of cloud storage data: Does the act of collection result in changes to the data or its metadata[J]. Digital Investigation, 2013, 10(3): 266-277.
[9] Darren Quick, Kim-Kwang Raymond Choo. Dropbox analysis: data remnants on user machines[J]. Digital Investigation.