# Faculty Corner – Dr. Yeil Kwon

Last fall Dr. Yeil Kwon joined the mathematics department as an assistant professor in data science. He received his Ph.D. in statistics from Temple University.

My academic background is in statistics. I earned a BS and an MS in Statistics from Korea University, an MS in Operations Research from Columbia University, and a PhD in Statistics from Temple University. I worked for nine years as a quantitative analyst at a credit bureau, an insurance company, and a hedge fund in South Korea and the United States. Most of my work involved developing mathematical and statistical models for predicting specific events from financial data.

(2) Tell us about the courses that you would like to teach at UCA.

I am teaching three courses in Fall 2018. Introduction to Probability Theory covers the core theory of probability, random variables, and probability distribution functions. The highlight of this course is the limit theorems for random variables, which are used in a number of applications in data analysis. Another course is Statistical Methods, which covers basic topics in probability and statistical inference with R. This course gives students a general understanding of statistics and some applications of data analysis.

My main research interest is developing methods for simultaneous estimation under the empirical Bayes framework. In particular, I am working on the variance estimation problem for high-dimensional data under an arbitrary prior assumption. High-dimensional data analysis is a popular topic in modern data science, but it is well known to be quite challenging. Empirical Bayes methods offer one way to approach it, and I have proposed a new method for simultaneous variance estimation that is based on the empirical distribution function of the marginal distribution of data drawn from a number of populations.

(4) Can you give us an example of an application of this research?

One representative example of high-dimensional data is microarray gene expression data. Finding differences in gene expression is one of the crucial problems in bioinformatics and genetics. While such data involve a very large number of parameters to estimate, the sample size for each population is extremely small, which means we do not have sufficient information about the parameter in each population. Using empirical Bayes methods, we can obtain a much better estimator for each parameter by using information across the populations.