Tamper Resistant Data Management

All actions performed in this research direction have in common the use of tamper-resistant devices. From this hypothesis various data protection models have been derived depending on whether data storage, access control, query processing or global computation are delegated to these trusted devices.

Secure combination of public and private data.
This study focuses on the management of databases that mix public and sensitive (private) data. People talk about privacy but give it up very easily, especially when faced with complex security procedures that offer only conditional guarantees. For people’s sensitive data to be protected in practice, the protection mechanism must therefore demand little effort from the user and must perform well. We proposed a system in which people carry their hidden sensitive data on a tamper-resistant token and plug it into a personal computer whenever they need to link their hidden data with visible public data, with the assurance that no hidden data will ever be exposed. The principal novelties follow directly from the challenges of implementing this mode of operation: (1) how to declare simply which data is visible and which is hidden, and how to query it, (2) how to index the data, and (3) which query processing strategies to use to link public and private data hosted on extremely unequal devices (a standard computer and a secure token) without any leakage of private data. Our philosophy is to make the user’s life as easy as possible while efficiently supporting SQL queries on arbitrarily large databases. Efficiency considerations on the secure token, which has very little RAM, led us to design generalized join indexes, Bloom filters for approximate filtering, the postponement of selections until after joins in certain cases, and algorithms that reflect the difference between read and write performance on the secure token.
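The role of Bloom filters for approximate filtering can be illustrated with a minimal Python sketch. It assumes that the token builds a filter over the join keys of its hidden tuples and probes it, inside the token, as public tuples stream in from the computer, so that tuples that cannot join are discarded early. The class name, the filter size, the hashing scheme and the function `filter_then_join` are illustrative assumptions, not the design of the system described above.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter; all parameters are illustrative assumptions."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key)
        )


def filter_then_join(hidden_rows, public_rows):
    """Join hidden and public rows on their key (hypothetical helper).

    hidden_rows: iterable of (key, hidden_value) stored on the token.
    public_rows: iterable of (key, public_value) streamed from the computer.
    """
    hidden_rows = list(hidden_rows)
    bloom = BloomFilter()
    for key, _ in hidden_rows:
        bloom.add(key)

    # Approximate filtering: drop public rows that certainly do not join.
    candidates = [row for row in public_rows if bloom.might_contain(row[0])]

    # Exact join on the survivors removes any false positives.
    # A dict stands in for the token's index in this sketch.
    hidden_by_key = dict(hidden_rows)
    return [(k, hidden_by_key[k], v) for k, v in candidates if k in hidden_by_key]
```

The filter occupies a fixed number of bits regardless of key length, which is what makes it attractive when RAM is scarce; the price is a tunable rate of false positives, which the exact join step eliminates.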

Personal Data Servers.
An increasing amount of personal data is automatically gathered and stored on servers by administrations, hospitals, insurance companies, etc. Citizens themselves often count on Internet companies to store their data (salary forms, invoices, banking statements, etc.) and to make them reliable and highly available through the Internet. Unfortunately, there are many examples of privacy violations arising from negligence, abusive use, internal attacks and external attacks, and even the most secure servers are not spared. In this study, we outlined a radically different way of managing personal data. We built upon the emergence of new portable and secure devices combining the security of smart cards with the storage capacity of NAND Flash chips. By embedding a full-fledged Personal Data Server in such a device, the user's control over how her sensitive data is shared with others (by whom, for how long, according to which rule, for which purpose) can be fully reestablished and convincingly enforced. For this vision to be meaningful, Personal Data Servers must be able to interoperate with external servers and must provide traditional database services such as durability, availability, query facilities and transactions. We proposed an initial design for the Personal Data Server approach, identified the main technical challenges associated with it and sketched preliminary solutions. This initial work paves the way for the definition of a general framework for managing personal data securely.

Privacy Preserving Data Publishing.
While most Privacy Preserving Data Publishing (PPDP) works assume a trusted central publisher, this study advocates a decentralized way of publishing anonymized datasets. More precisely, our work concerns proving the feasibility of adapting traditional PPDP schemes, such as k-anonymity, l-diversity or differential privacy, to the use of secure portable devices. In the applications we consider, each secure device is a data provider with weak computing capacity and weak connectivity: the frequency and duration of connections are unpredictable. In the e-health context, for example, patients may have their medical folder embedded in a secure device and connect it only sporadically, when they visit their physician or when they want to consult it at home. Weak connectivity precludes any P2P solution to the problem. A server allowing asynchronous communication between the devices therefore becomes necessary to implement a distributed PPDP mechanism, but this server is not as trustworthy as the participating devices. Our work aims to provide a generic method for adapting an important subclass of PPDP algorithms to this context, using both the limited secure computation capacity of each device (while taking advantage of their number) and the powerful computation abilities of an untrusted server available 24/7. Our proposal is based on a meta-algorithm divided into three phases: (1) a collection phase, in which encrypted data is collected by the untrusted server, (2) a construction phase, in which the untrusted server performs a sound computation of a given privacy mechanism to generate sanitization rules, and (3) a sanitization phase, in which the encrypted data is decrypted and then sanitized by the devices to produce a final clear-text result. The last phase can be distributed over many different devices for better efficiency. We showed how existing anonymity mechanisms can be transformed into decentralized ones using secure devices, while maintaining equivalent security guarantees against honest-but-curious and weakly malicious adversaries. We also studied the (unlikely) event that some secure devices are compromised and collude with the untrusted server. We provided schemes that detect the compromised devices with a probability that can be set as close to 1 as desired (the trade-off being the latency of the protocol).
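The three phases can be sketched in Python as follows. This is a simplified illustration under several assumptions that are not stated in the text above: each device exposes a quasi-identifier (here an age) in the clear so that the server can build rules, the sensitive value is encrypted under a key shared by the devices only, and the privacy mechanism is a naive k-anonymous grouping of ages into ranges. The cipher is a toy stand-in for real authenticated encryption, and all function names are hypothetical. The sketch ignores the integrity checks needed against weakly malicious adversaries and the detection of compromised devices.

```python
import hashlib
import os

SHARED_KEY = b"key known to secure devices only"  # hypothetical shared secret


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy keystream (SHA-256 in counter mode); NOT production cryptography.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def device_encrypt(plaintext: str) -> bytes:
    data = plaintext.encode()
    nonce = os.urandom(8)
    stream = _keystream(SHARED_KEY, nonce, len(data))
    return nonce + bytes(a ^ b for a, b in zip(data, stream))


def device_decrypt(blob: bytes) -> str:
    nonce, body = blob[:8], blob[8:]
    stream = _keystream(SHARED_KEY, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream)).decode()


def collect(records):
    """Phase 1: devices send (age, encrypted sensitive value) to the server."""
    return [(age, device_encrypt(sensitive)) for age, sensitive in records]


def construct_rules(collected, k):
    """Phase 2: the untrusted server assigns each tuple an age range
    shared by at least k tuples (assumes len(collected) >= k)."""
    order = sorted(range(len(collected)), key=lambda i: collected[i][0])
    chunks = [order[i:i + k] for i in range(0, len(order), k)]
    if len(chunks) > 1 and len(chunks[-1]) < k:
        last = chunks.pop()
        chunks[-1].extend(last)  # merge an undersized last group
    rules = {}
    for chunk in chunks:
        low, high = collected[chunk[0]][0], collected[chunk[-1]][0]
        for i in chunk:
            rules[i] = f"{low}-{high}"
    return rules


def sanitize(collected, rules):
    """Phase 3: devices decrypt and release generalized clear-text tuples."""
    return [(rules[i], device_decrypt(blob)) for i, (_, blob) in enumerate(collected)]


records = [(23, "flu"), (25, "asthma"), (31, "flu"), (38, "diabetes"), (52, "flu")]
collected = collect(records)
rules = construct_rules(collected, k=2)
published = sanitize(collected, rules)
```

In this sketch the server never sees a sensitive value in the clear: it manipulates only ages and ciphertexts, and the decryption in the last phase happens on the devices.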

Miscellaneous.
Other isolated works related to tamper-resistant data management have been carried out. The first concerns a tamper-resistant, client-based XML access right controller supporting dynamic access control policies. Streaming evaluation of XML access control policies and streaming integrity control of an encrypted XML data flow have been designed for this purpose. A second work has been initiated in cooperation with members of the SECRET INRIA project-team; this work focuses on the efficient use of cryptography to ensure database confidentiality and integrity. The objective is to find cryptographic techniques that minimize the negative impact of cryptography on database size (e.g., a 20-byte MAC is added to each encrypted attribute value in Oracle 11g TDE to ensure data authenticity) and on performance.
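The storage cost of a per-attribute MAC can be made concrete with a small Python sketch. HMAC-SHA1 is used here only because it produces a 20-byte tag; the sketch makes no claim about the construction actually used by any product, and the key, the sample values and the helper name `mac_overhead` are illustrative assumptions.

```python
import hashlib
import hmac

MAC_KEY = b"illustrative key"  # hypothetical key, for demonstration only


def mac_overhead(values):
    """Return (payload_bytes, mac_bytes) when each value carries its own tag."""
    payload = sum(len(v) for v in values)
    tags = [hmac.new(MAC_KEY, v, hashlib.sha1).digest() for v in values]
    return payload, sum(len(t) for t in tags)


# Three short attribute values: 2 + 5 + 10 = 17 bytes of payload.
values = [b"42", b"Paris", b"2009-06-30"]
payload, mac_bytes = mac_overhead(values)
print(payload, mac_bytes)  # prints "17 60"
```

For short attribute values, the tags alone are several times larger than the data they protect, which is why reducing this overhead matters for database size.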