Professor Jiang Bei from the University of Alberta was invited to give a lecture at the School of Statistics

Published: 2022-01-10 16:05:46

On the morning of January 5, at the invitation of the School of Statistics, Professor Jiang Bei from the University of Alberta gave an online lecture entitled "Synthetic Data Generation: Balance between Data Utility and Privacy Protection" to teachers and students of the School of Statistics. The lecture was chaired by Liu Xiaohui, Deputy Dean of the School of Statistics, and was attended by a number of the school's teachers and graduate students.

At the beginning of the lecture, Professor Jiang Bei noted the growing expectation that data collected in government-funded research be made publicly available to ensure reproducibility, an expectation that has also heightened concerns about data privacy. Rubin (1993) proposed using Multiple Imputation (MI) to generate synthetic datasets. This approach has two drawbacks: the synthetic datasets lose part of the information in the original data, and they must be generated from a specified model. If that model is misspecified, the synthetic datasets no longer preserve the statistical properties of the original data. To address this, Professor Jiang Bei proposed a new DA-MI method, which adds a data augmentation step to Rubin's (1993) approach and significantly improves data utility. The DA-MI method also introduces a tuning parameter that lets users balance the degree of data perturbation against the degree to which statistical properties are retained.

Subsequently, Professor Jiang Bei applied the method to data from the Canadian Scleroderma Research Group (CSRG) and compared it with other privacy protection methods. Data perturbed by the DA-MI method yielded statistical conclusions closest to those obtained from the original data: its 95% confidence intervals had an average overlap rate of 98.5% with the confidence intervals constructed from the original data, whereas the overlap rates of the other methods ranged from only 73.9% to 91.9%.
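The report does not state how the overlap rate was computed. As a rough illustration only, the sketch below uses one commonly used definition of confidence interval overlap: the length of the intersection of the two intervals, divided by the length of each interval in turn, and averaged. The function name `ci_overlap`, the clamping of disjoint intervals to zero, and the example intervals are assumptions made for this sketch and are not taken from the lecture.

```python
def ci_overlap(original, synthetic):
    """Average relative overlap of two confidence intervals.

    Each argument is a (lower, upper) pair. This follows one common
    definition of interval overlap; it is an assumption that the
    lecture used the same measure.
    """
    lo_o, up_o = original
    lo_s, up_s = synthetic
    if up_o <= lo_o or up_s <= lo_s:
        raise ValueError("each interval needs lower < upper")
    # Length of the intersection; clamped to 0 for disjoint intervals
    # (a choice made in this sketch).
    inter = max(0.0, min(up_o, up_s) - max(lo_o, lo_s))
    # Average the share of each interval covered by the intersection.
    return 0.5 * (inter / (up_o - lo_o) + inter / (up_s - lo_s))


if __name__ == "__main__":
    # Hypothetical intervals, for illustration only.
    print(ci_overlap((0.0, 10.0), (5.0, 15.0)))
```

Under this definition, identical intervals score 1 and disjoint intervals score 0, so an average of 98.5% would mean that the intervals from the perturbed data nearly coincide with those from the original data.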

At the end of the lecture, Professor Jiang Bei concluded that these findings show that the DA-MI framework, by integrating noise addition while preserving the ease of use of Rubin's original MI method, can achieve effective sharing of research data while also protecting privacy.

Professor Jiang Bei's lecture offered a reference for balancing data utility and privacy protection when generating synthetic data, and provided a useful example for teachers and students of the School of Statistics engaged in related research.

[ Extended reading ]

The University of Alberta is one of the largest research universities in Canada, and its research environment and facilities are well known in Canada and across North America. Its alumni include the 16th Prime Minister of Canada, three Nobel Prize winners, 75 Rhodes Scholars, and 111 Canada Research Chairs. Its artificial intelligence program is among the world's leaders. Rich Sutton, often called the father of reinforcement learning, and the lead authors of AlphaGo, David Silver and Aja Huang, are from the University of Alberta.

Jiang Bei is an associate professor and doctoral supervisor in the Department of Mathematical and Statistical Sciences, University of Alberta, Canada, and holds a Ph.D. from the Department of Biostatistics, University of Michigan. Professor Jiang's research areas include privacy-preserving data analysis, Bayesian hierarchical modeling, and joint modeling for multi-view data integration. The research results have been widely applied in women's health, mental health, neurology, ecology, and other fields. Professor Jiang has published more than 30 papers in JASA, JRSSC, NeurIPS, and other journals and conferences.