Tamper Resistant Data Management
All work carried out in this research direction shares the use of
tamper-resistant devices. From this assumption, various data protection
models have been derived, depending on whether data storage, access
control, query processing, or global computation is delegated to these
trusted devices.
Secure combination of public and private data. This study focuses
on the management of databases mixing public and sensitive (private)
data. People
talk about privacy, but give it up very easily, especially when faced
with
complex security procedures that offer only conditional guarantees.
This implies that, for people’s sensitive data to be protected, the
protection mechanism must demand little effort from the user and must
perform well. We proposed a system in which people carry their hidden
sensitive data on a tamper-resistant token and plug it into a personal
computer whenever they need to link their hidden data with visible
public data, with the assurance that no hidden data will ever be
exposed. The principal novelties follow directly from
the
challenges of implementing this mode of operation: (1) how to declare
simply which data should be visible and which hidden, and how to query
it, (2) how to index the data, and (3) which query processing
strategies to use to link public and private data hosted on extremely
unequal devices (a standard computer and a secure token) without
leaking any private data. Our philosophy is to make the user’s life
as easy
as possible while efficiently supporting SQL queries on arbitrarily
large
databases. Efficiency considerations on the secure token, which has
very little RAM, led us to design generalized join indexes, Bloom
filters for approximate filtering, the postponement of selections until
after joins in certain cases, and algorithms that account for the
asymmetry between read and write performance on the secure token.
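To make the role of Bloom filters in approximate filtering concrete, here is a minimal Python sketch. It is not the implementation used in this study: the class `BloomFilter`, the function `filter_then_join`, and the parameters `num_bits` and `num_hashes` are hypothetical names chosen for illustration, and we assume that the filter is built over the join keys of the hidden data and that visible rows are tested against it. The filter may let through false positives but never drops a true match, so an exact check must follow.

```python
import hashlib


class BloomFilter:
    """Compact approximate set: no false negatives, some false positives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        data = str(key).encode()
        for i in range(self.num_hashes):
            digest = hashlib.sha256(i.to_bytes(2, "big") + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


def filter_then_join(hidden_keys, visible_rows, num_bits=1024, num_hashes=3):
    """Pre-filter visible rows with a Bloom filter, then join exactly.

    visible_rows is a list of tuples whose first field is the join key.
    Returns (candidates, matches): the rows that pass the filter and the
    rows that really join. The exact check below stands in for the later
    join step (assumption made for this illustration).
    """
    bloom = BloomFilter(num_bits, num_hashes)
    for key in hidden_keys:
        bloom.add(key)
    candidates = [row for row in visible_rows if bloom.might_contain(row[0])]
    hidden = set(hidden_keys)
    matches = [row for row in candidates if row[0] in hidden]
    return candidates, matches
```

The filter occupies `num_bits / 8` bytes regardless of how many keys are inserted, which is why such a structure suits a device with very little RAM; enlarging it lowers the false-positive rate at the price of memory.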
Personal Data Servers.
An increasing amount of personal data is automatically gathered and
stored on
servers by administrations, hospitals, insurance companies, etc.
Citizens themselves often count on Internet companies to store their
data (salary forms, invoices, banking statements, etc.) and make them
reliable and highly
available
through the Internet. Unfortunately, there are many examples of privacy
violations arising from negligence, abusive use, internal attacks, and
external attacks, and even the most secure servers are not spared. In
this study, we outlined a radically different way of considering the
management of personal data.
We built upon the emergence of new portable and secure devices
combining
the security of smart cards and the storage capacity of NAND Flash
chips. By
embedding a full-fledged Personal Data Server in such devices, the
user's control over how her sensitive data is shared with others (by
whom, for how long, according to which rule, for which purpose) can be
fully reestablished and convincingly enforced. To give substance to
this vision, Personal Data Servers must be
able to
interoperate with external servers and must provide traditional
database
services such as durability, availability, query facilities, and transactions.
We
proposed an initial design for the Personal Data Server approach,
identified the
main technical challenges associated with it and sketched preliminary
solutions.
This initial work paves the way for the definition of a general
framework for securely managing personal data.
Privacy Preserving Data Publishing (PPDP). While most PPDP work
assumes a trusted central publisher, this study advocates a
decentralized way of publishing anonymized datasets. More precisely,
our work concerns establishing the feasibility of adapting traditional
PPDP schemes, such as k-anonymity, l-diversity, or differential
privacy, to the use of secure portable devices. In the applications we
consider, each secure
device is
a data provider with weak computing capacity and weak connectivity: the
frequency and duration of connections are unpredictable. For example,
in the e-health context, patients may have their medical folder
embedded in a secure device and connect it only sporadically, when they
visit their physician or want to consult it at home. Weak connectivity
precludes any P2P solution to
the
problem. A server allowing asynchronous communications between the
devices
becomes necessary to implement a distributed PPDP mechanism, but this
server cannot be trusted to the same degree as the participating
devices. Our work aims to provide a generic method to adapt an
important subclass of PPDP
algorithms to this context, using both the limited secure computation
capacities
of each device (but taking advantage of their number) and the powerful
computation abilities of an untrusted server available 24/7. Our
proposal is based on a meta-algorithm divided into three phases: (1) a
collection phase, in which encrypted data is collected by the untrusted
server; (2) a construction phase, in which the untrusted server
performs a sound computation of a given privacy mechanism to generate
sanitization rules; and (3) a sanitization phase, in which the
encrypted data is decrypted and then sanitized by the devices to
produce a final clear-text result. The last phase can be distributed
across many devices for better efficiency. We showed how it is possible
to transform
existing
anonymity mechanisms into decentralized ones using secure devices,
while
maintaining equivalent security guarantees against honest-but-curious
and weakly
malicious adversaries. We also studied the (unlikely) event that some
secure devices are compromised and collude with the untrusted server.
We provided schemes that detect compromised devices with a probability
that can be set as close to 1 as desired (the trade-off being increased
protocol latency).
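The three phases of the meta-algorithm described above can be illustrated with the following Python sketch. It rests on several assumptions that do not come from this study: the devices share a secret key unknown to the server; each device uploads a quasi-identifier (an age) in clear together with an encrypted sensitive value; the privacy mechanism is a naive k-anonymous grouping by sorted age; and `toy_encrypt` is a stand-in cipher built from SHA-256 for self-containment only, not a secure or recommended construction. All function names are hypothetical.

```python
import hashlib
import os


def _keystream(key, nonce, length):
    # Toy keystream from SHA-256 in counter mode (illustration only).
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def toy_encrypt(key, plaintext):
    nonce = os.urandom(8)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def toy_decrypt(key, blob):
    nonce, body = blob[:8], blob[8:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


def collect(device_records, key):
    """Phase 1: each device sends (age, encrypted diagnosis) to the server."""
    return [(age, toy_encrypt(key, diag.encode())) for age, diag in device_records]


def build_rules(collected, k):
    """Phase 2: the server, without the key, groups records into classes
    of at least k and attaches a generalized age range to each ciphertext."""
    if len(collected) < k:
        raise ValueError("not enough records to form one class of size k")
    ordered = sorted(collected, key=lambda rec: rec[0])
    groups = [ordered[i:i + k] for i in range(0, len(ordered), k)]
    if len(groups) > 1 and len(groups[-1]) < k:
        groups[-2].extend(groups.pop())  # merge an undersized last class
    tasks = []
    for group in groups:
        age_range = (group[0][0], group[-1][0])
        tasks.extend((age_range, blob) for _, blob in group)
    return tasks


def sanitize(tasks, key):
    """Phase 3: devices decrypt each value and publish it with its range."""
    return [
        ("%d-%d" % (low, high), toy_decrypt(key, blob).decode())
        for (low, high), blob in tasks
    ]
```

In this sketch the server only ever handles ciphertexts and ages, while decryption happens inside `sanitize`, which models work done on the devices; since each task is independent, that last phase could be spread over whichever devices happen to connect.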
Miscellaneous. Other isolated studies related to tamper-resistant data
management have been carried out. The first concerns a
tamper-resistant, client-based XML access rights controller supporting
dynamic access control policies. Techniques for the streaming
evaluation of XML access control policies and for the streaming
integrity control of an encrypted XML data flow have been designed.
A second study, initiated in cooperation with members of the SECRET
INRIA project-team, focuses on the efficient use of cryptography to
ensure database confidentiality and integrity. The objective is to find
cryptographic techniques that minimize the negative impact of
cryptography on the database size (e.g., a 20-byte MAC is added to each
encrypted attribute value in Oracle 11g TDE to ensure data
authenticity) and on performance.
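The size overhead mentioned above is easy to quantify with a short Python sketch. HMAC-SHA1 is used here only because it produces a 20-byte tag; this is an illustrative choice and not a statement about the internals of any database product. The function name `mac_overhead` and the sample values are hypothetical, and the values are treated as already-encrypted byte strings.

```python
import hashlib
import hmac


def mac_overhead(values, key):
    """Return (payload_bytes, mac_bytes) when one MAC is stored per value."""
    tags = [hmac.new(key, value, hashlib.sha1).digest() for value in values]
    payload_bytes = sum(len(value) for value in values)
    mac_bytes = sum(len(tag) for tag in tags)  # 20 bytes per HMAC-SHA1 tag
    return payload_bytes, mac_bytes
```

For three short attribute values such as `b"42"`, `b"Paris"`, and `b"2009-01-01"` (17 bytes in total), the per-value tags add 60 bytes, more than three times the payload. This shows why per-attribute authentication is costly for databases dominated by small values.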