database anonymization and protections of sensitive attributes

hsu, shih-ying peter

dc.contributor.advisor	wei, ruizhong
dc.contributor.author	hsu, shih-ying peter
dc.date.accessioned	2017-06-08t13:27:20z
dc.date.available	2017-06-08t13:27:20z
dc.date.created	2008
dc.date.issued	2008
dc.identifier.uri	http://knowledgecommons.lakeheadu.ca/handle/2453/3901
dc.description.abstract	database anonymization has become increasingly critical for organizations that publish their databases to the public. current security measures for anonymization each have drawbacks. k-anonymity is prone to many varieties of attack; ℓ-diversity does not work well with categorical or numerical attributes; t-closeness erases too much information in the database. moreover, some measures of information loss are designed for anonymization measures, such as k-anonymity, in which sensitive attributes play no part in measuring the database's security. failing to measure the redistribution of sensitive attributes leads to an underestimate of information loss for measures such as ℓ-diversity or t-closeness, which intentionally remove the association between non-sensitive attributes and sensitive attributes to better protect individuals from being identified. this thesis provides a more generalized version of ℓ-diversity that better protects categorical and numerical attributes, and analyzes the effectiveness and complexity of the new security scheme. another focus of this thesis is to design a better approach to measuring information loss and to lay down a new standard for evaluating information loss under security measures such as ℓ-diversity and t-closeness, quantifying the actual information loss caused by deliberately hiding relations between non-sensitive attributes and sensitive attributes. this new standard of information loss measure should provide a better estimate of the data mining potential remaining in a generalized database. this thesis also proves that, unlike k-anonymity, which can be solved in polynomial time when k=2, ℓ-diversity in fact remains np-hard in the special case where ℓ=2, even when there are only 2 possible sensitive attributes in the alphabet.
dc.language.iso	en_us
dc.subject	database security
dc.title	database anonymization and protections of sensitive attributes
dc.type	thesis
etd.degree.name	master of science
etd.degree.level	master
etd.degree.discipline	computer science
etd.degree.grantor	lakehead university

files in this item

name: hsus2008m-1b.pdf
size: 3.258mb
format: pdf

this item appears in the following collection(s)

retrospective theses [1604]