About Me

My name is Wang, Kaiyuan (王开元). I’m currently a master’s student at the University of Texas at Austin. My major is software engineering, and I’m working under the supervision of Professor Sarfraz Khurshid.

One of my interests is data mining. I took the data mining and machine learning courses at UT Austin and did very well in both. I learned how to pre-process raw data, how to choose suitable models, and how to optimize them. Our final project in the data mining class addressed a sentiment analysis problem. We took a data set from Kaggle, composed of movie reviews from the Rotten Tomatoes website, and pre-processed it before applying classifiers. We parsed the data into a bag-of-words format in which each row denotes a movie review and each column represents the frequency of a word. We then applied a range of classifiers to the data set, including naive Bayes, KNN, random forest, adaptive boosting (AdaBoost), SVM, and logistic regression. Since the bag-of-words representation does not take advantage of the structural information in movie reviews, we also applied a recursive neural tensor network to the data and obtained the highest accuracy. In this project, I learned how to pre-process raw data and how to analyze a problem to identify the models most likely to work for it. I also learned to be enthusiastic and friendly in a team so that all team members can work together smoothly and efficiently.
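The bag-of-words format described above can be illustrated with a minimal Python sketch. This is not the project's code: the `bag_of_words` helper, the whitespace tokenization, and the two toy reviews are assumptions made only for illustration.

```python
from collections import Counter


def bag_of_words(reviews):
    """Return (vocabulary, matrix); matrix[i][j] counts vocabulary[j] in review i."""
    # Assumption: lowercase and split on whitespace; a real pipeline would
    # likely also strip punctuation and stop words.
    tokenized = [review.lower().split() for review in reviews]
    # One column per distinct word, in sorted order for reproducibility.
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # One row per review, holding the frequency of each vocabulary word.
        matrix.append([counts[word] for word in vocabulary])
    return vocabulary, matrix


# Hypothetical toy reviews, not taken from the Kaggle data set.
reviews = ["a great great movie", "a dull movie"]
vocabulary, matrix = bag_of_words(reviews)
print(vocabulary)
for row in matrix:
    print(row)
```

Each row of `matrix` could then be passed to a classifier as a feature vector. Word order is lost in this representation, which is the limitation that motivated trying a model that uses the structure of the review.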

Taken in Santa Barbara

I’m also interested in software testing. I received an A in both the software testing and the software verification & validation courses. I learned basic software testing concepts such as node coverage, edge coverage, and path coverage. In the software testing class, we were trained to use Java PathFinder to generate test cases automatically, and we also used its backtracking feature to build a small SAT solver for BNF formulae. In the software verification & validation class, we read many papers and became aware of many cutting-edge techniques in software testing, including recent versions of symbolic execution, exhaustive data structure generation, isomorphism breaking, semantic synthesis, and fault localization. Our final project in that class was related to model-driven engineering (MDE). We aimed to help MDE developers validate the models they design and to help MDE users check the satisfiability of model constraints. The tools we used were Violet and Alloy. Violet is a UML tool that produces a clearly structured XML version of each UML diagram. Alloy is a modeling language developed at MIT that is used to model structures. We designed the meta-model of UML diagrams in Alloy and implemented transformations between XML and the Alloy language. This allowed us to take advantage of Alloy’s instance generator to help validate the meta-model. We also used Alloy to check constraint satisfiability for each model. In this project, I learned how to define meta-models and how to get familiar quickly with tools I had never seen before.
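The backtracking idea behind a small SAT solver can be sketched in plain Python. This is only an illustration and not the class project, which relied on Java PathFinder: it assumes the formula is given in conjunctive normal form as lists of non-zero integers (a positive number for a variable, a negative number for its negation), and the `solve` function is a hypothetical name.

```python
def solve(clauses, assignment=None):
    """Backtracking search over truth assignments.

    clauses: list of clauses, each a list of non-zero integer literals.
    Returns a dict mapping variable -> bool that satisfies every clause,
    or None if no such assignment exists. Variables that do not matter
    may be left out of the returned dict.
    """
    if assignment is None:
        assignment = {}
    remaining = []
    for clause in clauses:
        satisfied = False
        unassigned = []
        for literal in clause:
            variable = abs(literal)
            if variable in assignment:
                if assignment[variable] == (literal > 0):
                    satisfied = True
                    break
            else:
                unassigned.append(literal)
        if satisfied:
            continue
        if not unassigned:
            # Every literal in this clause is false: backtrack.
            return None
        remaining.append(unassigned)
    if not remaining:
        return assignment
    # Choose the first unassigned variable and try both truth values.
    variable = abs(remaining[0][0])
    for value in (True, False):
        result = solve(remaining, {**assignment, variable: value})
        if result is not None:
            return result
    return None


# (x1 or x2) and (not x1 or x2) and (not x2 or x3)
formula = [[1, 2], [-1, 2], [-2, 3]]
model = solve(formula)
print(model)
```

When a choice leads to a falsified clause, the function returns `None` and the caller tries the other truth value, which is the same explore-and-undo pattern that a backtracking search engine provides.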

Here is my current resume. I’ll try to keep this website updated so that you can easily see what I’m working on.

Thank you for your interest.