I am a computer programmer and computational linguist living in Las Cruces, New Mexico. This academic year I will be teaching at the University of Mary Washington. My research centers on machine translation. My recent programming projects have been in Flex and PHP.
Ahmed Abdelali, Steve Helmreich, and I just submitted a paper to CAASL3: Computational Approaches to Arabic Script-based Languages, to be held in Ottawa on August 26th. It reports on work we have done on geographical classification of Arabic text. We presented a paper on this topic at the Chicago Colloquium on Digital Humanities and Computer Science back in November 2008 (Linguistic Dumpster Diving: Geographical Classification of Arabic Text – pdf). At that colloquium a number of people gave us good suggestions and criticisms, and our work since then has included investigating those suggestions and addressing the criticisms. For example, one individual suggested we look at non-linear methods of classification. One thing we did was compare learning algorithms on this task. In our original work we used a support vector machine approach. We compared that approach to C4.5 decision trees, Bagging C4.5, Hyperpipes, nearest neighbor, K-nearest neighbors, Naive Bayes, Neural Network classifiers, SMO with a polynomial kernel, and SMO with an RBF kernel. Of these, SMO with a polynomial kernel, neural nets, and Bagging C4.5 appear to perform the best. In addition, we investigated the performance improvement from adding data from new sources. We are continuing work in this area. If you have any questions or suggestions, please let us know.
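For readers who want a feel for what "comparing learning algorithms on this task" looks like in practice, here is a minimal Python sketch. It is not our experimental setup: the two classifiers are simplified stand-ins for two of the families named above (Naive Bayes and nearest neighbor), the tokens `w1`...`w8` and the labels `region_a` / `region_b` are made-up placeholders rather than real Arabic data, and leave-one-out evaluation is just one convenient way to score a tiny toy set.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens, with add-one smoothing."""

    def fit(self, docs, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = doc.split()
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, doc):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.label_counts):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # log prior + sum of smoothed log likelihoods
            score = math.log(self.label_counts[label] / total_docs)
            for token in doc.split():
                score += math.log((counts[token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


class NearestNeighbor:
    """1-nearest neighbor using Jaccard overlap between token sets."""

    def fit(self, docs, labels):
        self.examples = [(set(d.split()), l) for d, l in zip(docs, labels)]
        return self

    def predict(self, doc):
        tokens = set(doc.split())

        def jaccard(other):
            union = tokens | other
            return len(tokens & other) / len(union) if union else 0.0

        return max(self.examples, key=lambda ex: jaccard(ex[0]))[1]


def leave_one_out_accuracy(make_classifier, docs, labels):
    """Train on all documents but one, test on the held-out one, and repeat."""
    correct = 0
    for i in range(len(docs)):
        train_docs = docs[:i] + docs[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        clf = make_classifier().fit(train_docs, train_labels)
        correct += clf.predict(docs[i]) == labels[i]
    return correct / len(docs)


# Placeholder toy corpus (invented for illustration only).
DOCS = ["w1 w2 w3", "w1 w2 w4", "w2 w3 w4", "w5 w6 w7", "w5 w6 w8", "w6 w7 w8"]
LABELS = ["region_a"] * 3 + ["region_b"] * 3

if __name__ == "__main__":
    for name, factory in [("naive bayes", NaiveBayes), ("nearest neighbor", NearestNeighbor)]:
        print(name, leave_one_out_accuracy(factory, DOCS, LABELS))
```

The point of the harness is that every classifier exposes the same `fit` / `predict` pair, so adding another algorithm to the comparison only means adding another entry to the list.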
As I mentioned in previous posts, I developed (with tremendous help from Adam Zacharski) a cross-language instant messaging system using Adobe Flex. This system provides concurrent real-time translation for instant messaging using multiple machine translation engines. During this last academic year, Bill Ogden, my colleague in New Mexico, and several people in his lab (Sieun An and Yuki Ishikawa) used this system to evaluate the performance of machine translation systems based on how effective they were in helping people accomplish shared tasks. They used paid participants who worked in pairs (one Japanese speaker paired with a native English speaker) to accomplish a photo identification task using this instant messaging system. We just submitted a paper describing the results of this work to the Machine Translation Summit in Ottawa in August.
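The idea of sending each chat message to several machine translation engines at once can be sketched roughly as follows. This is a Python illustration, not the Flex code from the actual system, and `engine_a`, `engine_b`, and `translate_with_all` are hypothetical names; the two "engines" are local stubs that only tag the text, standing in for calls to real translation services.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for real machine translation engines.
# A real engine would call out to a translation service here.
def engine_a(text, source, target):
    return f"[A {source}->{target}] {text}"


def engine_b(text, source, target):
    return f"[B {source}->{target}] {text}"


ENGINES = {"engine_a": engine_a, "engine_b": engine_b}


def translate_with_all(text, source, target, engines=ENGINES):
    """Send one message to every engine concurrently.

    Returns a dict mapping engine name to that engine's translation.
    """
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        futures = {
            name: pool.submit(fn, text, source, target)
            for name, fn in engines.items()
        }
        # result() blocks until that engine has answered.
        return {name: future.result() for name, future in futures.items()}


if __name__ == "__main__":
    for name, translation in translate_with_all("hello", "en", "ja").items():
        print(name, translation)
```

Running the engines in parallel means the chat waits only as long as the slowest engine, rather than the sum of all of them. A production version would also want timeouts and per-engine error handling so that one failing engine does not hold up the conversation.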
Okay. This is my first YouTube post. What this guy did was take individual performers on YouTube (many of the clips were instructional videos) and remix them into a band. Truly amazing!