Click here or on one of the photos above to see photo albums with more pictures.
Click here to see fun we had with computational photography for infinite depth-of-field.
Click on the thumbnails below to see a video clip of a large pod of dolphins that Dayne, Britt, Alex, Greta, Diane and I ran across while sailing to Catalina Island, and also to see clips of jellyfish at the Monterey Bay Aquarium:
Rich Caruana, Assistant Professor, Computer Science
Ph.D. Carnegie Mellon University, 1997
Office Hours
Fall 2007: Tue 4:30-5:00, Wed 11:00-11:30
4157 Upson Hall | Phone: 607-255-1164 |
Computer Science | Fax: 607-255-4428 |
Cornell University | Email: caruana at cs.cornell.edu |
Ithaca, NY 14853 | Admin: Melissa Totman 5-5331 |
I joined Cornell's Department of Computer Science in Fall 2001. Here's a recent CV.
Most of my research is in data mining and machine learning, and the application of these to problems in medicine, ecology, and microprocessor design. I do work on inductive transfer (a.k.a. multitask learning), ensemble learning, probabilistic prediction, model compression, and regression. In general, I like to work on real problems, and develop new learning methods by abstracting what is required to achieve good performance on those problems.
We're doing new work on what we call Model Compression where we take a large, slow, but accurate model and compress it into a much smaller, faster, yet still accurate model. This allows us to separate the models used for learning from the models used to deliver the learned function so that we can train large, complex models such as ensembles, but later make them small enough to fit on a PDA, hearing aid, or satellite. With model compression we can make models 1000 times smaller and faster with little or no loss in accuracy. Here's our first paper on model compression. Google has been funding this work.
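The compression idea can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the "large" model is a k-nearest-neighbor vote over a stored training set, the "small" model is a single threshold, and the small model is trained on randomly sampled points that the large model has labeled. The names `make_teacher` and `compress` are hypothetical.

```python
import random


def make_teacher(train, k=5):
    """Build a deliberately slow 'large' model: a k-nearest-neighbor vote
    over every stored (x, label) training point. Illustrative only."""
    def predict(x):
        nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
        votes = sum(label for _, label in nearest)
        return int(2 * votes > k)
    return predict


def compress(teacher, lo, hi, n_unlabeled=2000, seed=0):
    """Fit a one-parameter student (predict 1 when x >= threshold) that
    mimics the teacher on points the teacher itself has labeled."""
    rng = random.Random(seed)
    xs = sorted(rng.uniform(lo, hi) for _ in range(n_unlabeled))
    ys = [teacher(x) for x in xs]
    # Start with the threshold below every point (student predicts all 1s),
    # then slide it past one point at a time, tracking disagreements.
    errors = ys.count(0)
    best_index, best_errors = 0, errors
    for i, y in enumerate(ys, start=1):
        errors += 1 if y == 1 else -1
        if errors < best_errors:
            best_index, best_errors = i, errors
    return xs[best_index] if best_index < len(xs) else float("inf")


# Toy data: the label is 1 when x is above 5.
train = [(i / 10, int(i > 50)) for i in range(101)]
teacher = make_teacher(train)
threshold = compress(teacher, 0.0, 10.0)
student = lambda x: int(x >= threshold)
```

The student stores one number instead of the whole training set, yet agrees with the teacher wherever the teacher's decision really is a single cut. Real problems need a more expressive student than a threshold.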
One of my students, Alex Niculescu-Mizil, has developed a method for multitask learning of Bayes Net structures. The first paper on this topic was presented at AIStats07. Here's a preprint.
We developed a new ensemble learning method called Ensemble Selection. In ensemble selection we train thousands of different models on the same train set (no sampling or weighting), then carefully select from this library of models a small set that yields the best performance when combined in an ensemble. Noteworthy features of ensemble selection are that we train many different kinds of models (e.g. SVMs, neural nets, bagged, boosted, and vanilla decision trees, kNN, boosted stumps), the performance of the ensemble can be optimized to nearly any performance measure, and the method outperforms bagging, boosting, Bayesian model averaging, and all other learning methods we've compared it to. Here's a paper on ensemble selection that was presented at ICML 2004: (caruana.icml04.crc.ps) For an updated draft look at: caruana.icml04.revised.rev2.ps. For a bundle that contains both the revised ICML 2004 paper and a long version of an ICDM 2006 paper that describes how to get even better performance from ensemble selection, get: caruana.icml04.icdm06long.pdf.
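As a rough sketch of the selection step, the code below assumes a greedy forward search that averages predictions and allows the same model to be picked more than once. The function names, the stopping rule, and the toy library are illustrative assumptions, not the code used in the papers.

```python
from typing import Callable, Dict, List, Sequence


def accuracy(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction of examples classified correctly at a 0.5 threshold."""
    hits = sum((p >= 0.5) == bool(y) for p, y in zip(probs, labels))
    return hits / len(labels)


def ensemble_selection(
    library: Dict[str, List[float]],
    labels: Sequence[int],
    metric: Callable[[Sequence[float], Sequence[int]], float] = accuracy,
    max_rounds: int = 20,
) -> List[str]:
    """Greedily add the model whose inclusion most improves the metric of
    the averaged predictions; stop when no model helps (assumed rule).

    `library` maps a model name to its predictions on a held-out
    selection set. Higher metric values are assumed to be better.
    """
    chosen: List[str] = []
    running_sum = [0.0] * len(labels)
    best_score = float("-inf")
    for _ in range(max_rounds):
        best_name, best_candidate = None, best_score
        size = len(chosen) + 1
        for name, preds in library.items():
            averaged = [(s + p) / size for s, p in zip(running_sum, preds)]
            score = metric(averaged, labels)
            if score > best_candidate:
                best_name, best_candidate = name, score
        if best_name is None:  # no model improves the ensemble
            break
        chosen.append(best_name)
        running_sum = [s + p for s, p in zip(running_sum, library[best_name])]
        best_score = best_candidate
    return chosen


labels = [1, 1, 1, 0, 0, 0]
library = {
    "a": [0.9, 0.9, 0.4, 0.1, 0.1, 0.1],  # one mistake
    "b": [0.9, 0.4, 0.9, 0.1, 0.1, 0.1],  # a different mistake
    "c": [0.4, 0.9, 0.9, 0.6, 0.1, 0.1],  # two mistakes
}
chosen = ensemble_selection(library, labels)
```

Because `metric` is a parameter, the same loop can target a different performance measure by passing another function with the same signature.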
Along with the ensemble selection work, we have been performing a comprehensive empirical evaluation of machine learning methods. So far we have looked at SVMs, neural nets, logistic regression, naive bayes, many flavors of decision trees, bagged and boosted decision trees, random forests, boosted stumps, and many k-nearest neighbor methods. We are evaluating the performance of these learning methods on a variety of performance metrics: accuracy, ROC area, precision/recall break-even point, Lift, squared error, cross-entropy, probability calibration, ... An ICML 2006 paper with the latest results is at: modelsperf.icml.2006.pdf.
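For readers unfamiliar with some of these metrics, here is a minimal sketch of four of them for binary labels and predicted probabilities. The definitions are the textbook ones and are assumptions about the general form only. They are not taken from the PERF code and may differ from it in details such as normalization or tie handling.

```python
import math
from typing import Sequence


def accuracy(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction correct when predicting class 1 for probabilities >= 0.5."""
    return sum((p >= 0.5) == bool(y) for p, y in zip(probs, labels)) / len(labels)


def roc_area(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random positive is scored above a random
    negative, counting ties as half. Quadratic time; fine for a sketch."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def squared_error(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Mean squared difference between predicted probability and label."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)


def cross_entropy(probs: Sequence[float], labels: Sequence[int], eps: float = 1e-12) -> float:
    """Mean negative log-likelihood, with probabilities clipped away from 0 and 1."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.6, 0.7]
report = {
    "accuracy": accuracy(probs, labels),
    "roc_area": roc_area(probs, labels),
    "squared_error": squared_error(probs, labels),
    "cross_entropy": cross_entropy(probs, labels),
}
```

Accuracy depends on a threshold, ROC area only on the ordering of the predictions, and squared error and cross-entropy on the predicted values themselves, which is why a model can do well on some of these metrics and poorly on others.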
While doing these experiments we discovered that boosted decision trees had excellent performance on metrics such as accuracy, AUC, Lift, and precision/recall, but predicted poorly calibrated probabilities and thus had very bad squared error and cross entropy. By applying calibration to the predictions made by boosting, we are able to get well-calibrated probabilities from boosting, and boosted trees now outperform all other learning methods we have tested on squared error and cross entropy. Here's our AI Stats 2005 paper on this.
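The paragraph above does not say which calibration method is applied. As one illustration, the sketch below assumes isotonic regression fitted with the pool-adjacent-violators procedure, which maps raw scores to probabilities through a non-decreasing step function. The function names are hypothetical, and tied scores get no special treatment.

```python
from bisect import bisect_left
from typing import List, Sequence, Tuple


def fit_isotonic(scores: Sequence[float], labels: Sequence[int]) -> Tuple[List[float], List[float]]:
    """Pool adjacent violators: returns (upper score bound of each block,
    calibrated value of each block), with values strictly increasing."""
    blocks: List[List[float]] = []  # each block: [label sum, count, largest score]
    for score, label in sorted(zip(scores, labels)):
        blocks.append([float(label), 1.0, score])
        # Merge backwards while the previous block's mean is not below the last one's.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] >= blocks[-1][0] / blocks[-1][1]:
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    uppers = [b[2] for b in blocks]
    values = [b[0] / b[1] for b in blocks]
    return uppers, values


def calibrate(score: float, uppers: List[float], values: List[float]) -> float:
    """Look up the block a raw score falls in; scores above the last
    bound are assigned the last block's value."""
    index = bisect_left(uppers, score)
    return values[min(index, len(values) - 1)]


# Raw scores from some classifier on a held-out calibration set.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
labels = [0, 1, 0, 0, 1, 1]
uppers, values = fit_isotonic(scores, labels)
calibrated = calibrate(0.25, uppers, values)
```

The mapping should be fitted on data that was not used to train the model being calibrated, otherwise the calibrated probabilities will tend to be overconfident.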
We have also begun analyzing how the different performance metrics relate to each other, and presented a paper at KDD2004 that uses multidimensional scaling and correlation analysis to study ten metrics: perfs.kdd04.revised.rev1.pdf. This paper compares Accuracy, F-score, Lift, AUC (Area under the ROC), Average Precision, Precision/Recall Break-Even Point, Squared Error, Cross-Entropy, and Probability Calibration.
Thorsten Joachims and I chaired the KDD-Cup in 2004. If you are interested in our code for evaluating performance metrics (PERF), the best place to get it is from the KDD-Cup 2004 web site: http://kodiak.cs.cornell.edu/kddcup/. PERF calculates more than 20 different performance metrics, and can also generate plots for AUC, precision/recall, Lift, accuracy vs. threshold, weighted cost vs. threshold, ...
We presented a paper titled "Evaluating the C-Section Rate of Different Physician Practices: Using Machine Learning to Model Standard Practice" at the AMIA'2003 (American Medical Informatics Association) Conference. In this work, bagged smoothed decision trees turned out to be the model of choice (because they yielded probabilities with excellent calibration) for modeling the risk of c-section for 22,157 expectant mothers. (This paper was nominated for a best paper award.)
With colleagues at CMU and the University of Pittsburgh, we've been clustering proteins. Based on this work we've developed a new approach to clustering called Meta Clustering. Instead of laboriously defining a clustering distance metric and then tuning the distance metric and clustering algorithm until you get a useful clustering, Meta Clustering automatically generates many qualitatively different, yet good, alternate clusterings of the data for you. These alternate clusterings are then themselves clustered at a meta level (yielding a clustering of clusterings) so that the user can efficiently navigate to the clustering most useful for their purposes. This work is supported by NSF CAREER Award #0347318. Here's the Meta Clustering web page: http://www.cs.cornell.edu/~nhnguyen/metaclustering.htm. Here's our first paper on MetaClustering: ICDM06.metaclust.caruana.pdf
David Cohn, Andrew McCallum and I did some of the first work in semi-supervised clustering back in 1999. See http://techreports.library.cornell.edu:8081/Dienst/UI/1.0/Display/cul.cis/TR2003-1892 for a tech report we published years later. By modern standards it's somewhat passé, but it was cool stuff that was ahead of its time back when we did it. For a better, more recent paper related to this topic see the MetaClustering paper in the paragraph above.
CS-678 Advanced Topics in Machine Learning Spring 2006 Spring 2007
CS-778 Seminar in Machine Learning: Empirical Evaluation of Learning Methods Spring 2004
CS-678 Advanced Topics in Machine Learning: Theory and Practice of Clustering (co-taught with John Hopcroft) Spring 2003
CS-678 Advanced Topics in Machine Learning (co-taught with Thorsten Joachims) Spring 2002
CS-211 Introduction to Data Structures and Algorithms (co-taught with Dexter Kozen) Spring 2005
Machine Learning Summer School 2005 at TTI-Chicago and University of Chicago. Topics included: Boosting, Decision trees, Empirical Comparisons and Case Studies, Energy Models, Evidence Integration in Bioinformatics, Generalization Bounds, Information Geometry, Manifold Methods, Object Recognition, Online Learning, Reductions, Regularization, Semisupervised Learning, Structured Learning, SVMs. Speakers included: Yasemin Altun, Misha Belkin, Rich Caruana, Sanjoy Dasgupta, Zoubin Ghahramani, Mark Johnson, Adam Kalai, John Langford, Yann LeCun, Phil Long, David McAllester, Partha Niyogi, Robert Nowak, Robert Schapire, Yoram Singer, Steve Smale.
Jordan Erenrich and I spent an afternoon taking pictures of a chess board with the camera set to different focus distances to create one combined image with infinite depth-of-focus. See the results at: http://www.cs.cornell.edu/~erenrich/dof/
Cooper, G. F., Abraham, V., Aliferis, C. F., Aronis, J., Buchanan, B. G., Caruana, R., Fine, M. J., Janosky, J. E., Livingston, G., Mitchell, T., Monti, S., Spirtes, P., "Predicting Dire Outcomes of Patients with Community Acquired Pneumonia." The Journal of Biomedical Informatics, Vol 38, #5 (October 2005), pp. 347-366.
Caruana, Rich and de Sa, Virginia R., "Benefitting from the Variables that Variable Selection Discards," Journal of Machine Learning Research (JMLR), Vol. 3, March 2003, pp.1245-1264.
Goldenberg, A., Shmueli, G., Caruana, R., Fienberg, S., "Early Statistical Detection of Anthrax Outbreaks by Tracking Over-the-counter Medication Sales," Proceedings of the National Academy of Sciences, 99, 5237-5240, 2002.
Simms, Cynthia J., Meyn, Leslie, Caruana, Rich, Rao, R. Bharat, Mitchell, Tom, Krohn, Marijane, "Predicting Cesarean Delivery with Decision Tree Models," The American Journal of Obstetrics and Gynecology, Vol. 183, No. 5, November 2000, pp. 1198-1206.
Cooper, G. F., Aliferis, C. F., Ambrosino, R., Aronis, J., Buchanan, B. G., Caruana, R., Fine, M. J., Glymour, C., Gordon, G., Hanusa, B. H., Janosky, J. E., Meek, C., Mitchell, T., Richardson, T., Spirtes, P., "An Evaluation of Machine Learning Methods for Predicting Pneumonia Mortality." Artificial Intelligence in Medicine 9, 1997, pp. 107-138.
Caruana, Rich, "Multitask Learning." Machine Learning, Vol. 28, pp. 41-75, Kluwer Academic Publishers, 1997. (download .ps here)(download .pdf here)
Mitchell, Tom, Caruana, Rich, Freitag, Dayne, McDermott, John, Zabowski, David, "Experience with a Learning Personal Assistant." Communications of the ACM, 1994.
Caruana, Rich, Searle, Roger B., Shupack, Saul I., "Additional Capabilities for a Fast Algorithm for the Resolution of Spectra." Journal of Analytical Chemistry, 1988.
Caruana, Rich, Searle, Roger B., Heller, Thomas, Shupack, Saul I., "Fast Algorithm for the Resolution of Spectra." Journal of Analytical Chemistry, 1986.
Caruana, Rich, "15 Useful Tricks with Extra Outputs." Neural Networks: Tricks of the Trade, G. B. Orr and K.-R. Muller (Eds.), Springer-Verlag, 1998.
Caruana, Rich, Freitag, Dayne, "How Useful Is Relevance?" Intelligent Relevance: Papers from the 1994 Fall Symposium, AAAI Report FS-94-02, ISBN 0-929280-76-8, 1994.
Caruana, Rich, "The Automatic Training of Rule Bases that Use Numerical Uncertainty Representations." Uncertainty in Artificial Intelligence, Vol. 3, North-Holland, 1988.
Rich Caruana, Art Munson, and Alexandru Niculescu-Mizil, "Getting the Most Out of Ensemble Selection," to appear in the Proceedings of the Sixth International Conference on Data Mining (ICDM'06), December 2006.
Rich Caruana, Mohamed Elhawary, Nam Nguyen, and Casey Smith, "Meta Clustering," to appear in the Proceedings of the Sixth International Conference on Data Mining (ICDM'06), December 2006.
Engin Ipek, Sally McKee, Bronis de Supinski, M. Schulz, and Rich Caruana, "Efficiently Exploring Architectural Design Spaces via Predictive Modeling," to appear in The Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), October 2006.
Cristian Bucila, Rich Caruana, and Alexandru Niculescu-Mizil, "Model Compression," The Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2006), August 2006, pp. 535-541.
Rich Caruana, Mohamed Elhawary, Daniel Fink, Wesley Hochachka, Steve Kelling, Art Munson, Mirek Riedewald, and Daria Sorokina, "Mining Citizen Science Data to Predict Prevalence of Wild Bird Species," The Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2006), August 2006, pp. 909-915.
Lars Backstrom and Rich Caruana, "C2FS: An Algorithm for Feature Selection in Cascade Neural Nets," The Proceedings of the IEEE World Congress on Computational Intelligence (IJCNN 2006), July, 2006.
Rich Caruana and Alexandru Niculescu-Mizil, "An Empirical Comparison of Supervised Learning Algorithms," The Proceedings of the 23rd International Conference on Machine Learning (ICML2006), June 2006, pp. 161-168.
Alexandru Niculescu-Mizil and Rich Caruana, "Predicting Good Probabilities," The Proceedings of the 22nd International Conference on Machine Learning (ICML'05), pp. 625-632. (Received best student paper award at ICML. Also an oral presentation at the 2005 Snowbird Workshop on Machine Learning.)
Alexandru Niculescu-Mizil and Rich Caruana, "Obtaining Calibrated Probabilities from Boosting," International Conference on Uncertainty in Artificial Intelligence, pp. 413-420 (2005).
Caruana, Rich and Niculescu-Mizil, Alex, "Predicting Good Probabilities with Supervised Learning," Proceedings of the American Meteorology Conference (AMS2005), San Diego, 2005. (http://ams.confex.com/ams/pdfpapers/88928.pdf)
Caruana, Rich, Niculescu, Alex, Crew, Geoff, and Ksikes, Alex, "Ensemble Selection from Libraries of Models," The International Conference on Machine Learning (ICML'04), 2004. (caruana.icml04.crc.ps) For an updated draft, get the following: caruana.icml04.revised.rev2.ps
Caruana, Rich, Niculescu, Stefan, Rao, Bharat, and Simms, Cynthia, "Evaluating the C-section Rate of Different Physician Practices: Using Machine Learning to Model Standard Practice," The American Medical Informatics Conference (AMIA), November 2003, pp. 135-139. (This paper was nominated for a best paper award.) (To minimize confusion, Stefan Niculescu and Alex Niculescu-Mizil are different people.)
Langford, John, and Caruana, Rich, "(Not)Bounding the True Error," Neural and Information Processing Systems, Vol. 14 (Proceedings of NIPS*2001), MIT Press, 2002.
Caruana, Rich, Niculescu, Stefan, Rao, Bharat, and Simms, Cynthia, "Machine Learning for Sub-population Assessment: Evaluating the C-section Rate of Different Physician Practices," The American Medical Informatics Conference (AMIA), November 2002, pp. 126-130. (If you are interested in this paper, you might want to look at our AMIA 2003 paper on the same topic that uses bagged decision trees to predict well-calibrated probabilities.) (To minimize confusion, Stefan Niculescu and Alex Niculescu-Mizil are different people.)
Caruana, Rich, "A Non-Parametric EM-Style Algorithm for Imputing Missing Values," Artificial Intelligence and Statistics, January 2001.
Caruana, Rich, Lawrence, Steve, and Giles, Lee, "Overfitting in Artificial Neural Nets Trained with Backpropagation, Conjugate Gradient, and Early Stopping," Neural and Information Processing Systems, Vol. 13 (Proceedings of NIPS*2000), MIT Press, 2001. (download .ps here)
Berger, Adam, Caruana, Rich, Cohn, David, Freitag, Dayne, Mittal, Vibhu, "Bridging the Lexical Chasm: Automatic FAQ Answer Finding." Special Interest Group on Information Retrieval (SIGIR), Athens, Greece, July 2000.
O'Sullivan, Joseph, Langford, John, Caruana, Rich, Blum, Avrim, "Unabridged Learning." International Conference on Machine Learning (ICML), Stanford, California, July 2000.
Caruana, Rich, Cohn, David, McCallum, Andrew, "Semi-Supervised Clustering with User Feedback." Machines that Learn, Snowbird, Utah, April 2000.
Simms, Cynthia, M.D., Caruana, Rich, Krohn, M. J., Meyn, Leslie, Mitchell, Tom, Rao, R. Bharat, and Schmeuking, Ingo, "Predicting Caesarean Section with Decision Trees." Annual Meeting of the Society of Fetal and Maternal Medicine, February 2000.
Caruana, Rich, "Case-Based Explanation of Artificial Neural Nets." Artificial Neural Nets in Medicine and Biology (ANNIMAB), Goteborg, Sweden, May 2000.
Caruana, R., Kangarloo, H., David, J., Dionisio, N., Sinha, U., Johnson, D. "Case-Based Explanation of Non-Case-Based Learning Methods." Proceedings of the 1999 American Medical Informatics Association (AMIA) Symposium, 1999, pp.212-215.
Caruana, Rich, Dionisio, John, Johnson, David, Taira, Ricky, Kangarloo, Hooshang, "Automatic Imaging Protocol Selection." American Radiology Conference (IRAS), 1999.
Caruana, Rich, "Applying Case-Based Explanation to Non-Case-Based Methods such as Artificial Neural Nets or Decision Trees." American Radiology Conference (IRAS), 1999.
Caruana, Rich, O'Sullivan, Joseph, "Multitask Pattern Recognition for Autonomous Robots." IEEE International Conference on Intelligent Robotic Systems (IROS), Victoria, B.C., Canada, October 1998.
Caruana, Rich, de Sa, Virginia, "Using Feature Selection to Find Inputs that Work Better as Extra Outputs." The International Conference on Neural Nets (ICANN), Skövde, Sweden, September 1998.
Caruana, Rich, O'Sullivan, Joseph, "Multitask Pattern Recognition for Vision-Based Autonomous Robots." The International Conference on Neural Nets (ICANN), Skövde, Sweden, September 1998.
Caruana, Rich, de Sa, Virginia, "Promoting Poor Features to Supervisor: Some Inputs Work Better as Outputs." Neural and Information Processing Systems, Vol. 9 (Proceedings of NIPS*96), MIT Press, 1997, pp. 389-395. (download .ps here)
Caruana, Rich, "Algorithms and Applications for Multitask Learning." Machine Learning, Proceedings of the 13th International Conference on Machine Learning (ICML 1996, Bari, Italy), Morgan Kaufmann, 1996, pp. 87-95. (download .ps here)
Caruana, Rich, Baluja, Shumeet, Mitchell, Tom, "Using the Future to 'Sort Out' the Present: Rankprop and Multitask Learning for Medical Risk Evaluation." Advances in Neural Information Processing Systems, Vol. 8 (Proceedings of NIPS*95), MIT Press, 1996, pp. 959-965. (download .ps here)
Baluja, Shumeet, Caruana, Rich, "Removing the Genetics from the Standard Genetic Algorithm." Proceedings of the 12th Annual Conference on Machine Learning, 1995, pp. 38-46. (download .ps here)
Caruana, Rich, "Learning Many Related Tasks at the Same Time with Backpropagation." Advances in Neural Information Processing Systems 7 (Proceedings of NIPS*94), MIT Press, 1995 pp. 657-664. (download .ps here)
Caruana, Rich, Freitag, Dayne, "Greedy Attribute Selection." Machine Learning, Proceedings of the Eleventh International Conference on Machine Learning (ICML 1994, New Brunswick, New Jersey), Morgan Kaufmann, 1994, pp. 28-36. (download .ps here)
Caruana, Rich, "Multitask Connectionist Learning." Proceedings of the 1993 Connectionist Models Summer School, 1993, pp. 372-379. (download .ps here)
Caruana, Rich, "Multitask Learning: A Knowledge-Based Source of Inductive Bias." Proceedings of the 10th International Conference on Machine Learning, 1993, pp. 41-48. (download .ps here)
Caruana, Rich, Schaffer, J. David, "Using Multiple Representations to Control Inductive Bias: Gray and Binary Codes for Genetic Algorithms." Proceedings of the Sixth International Workshop on Machine Learning (ML 1989), Morgan Kaufmann, 1989, pp. 375-378.
Schaffer, J. David, Caruana, Rich, Eshelman, Larry J., "Designing Neural Nets that Generalize Optimally with Genetic Algorithms." Los Alamos Conference on Emergent Computation, May 1989.
Eshelman, Larry J., Caruana, Rich, Schaffer, J. David, "Biases in the Crossover Landscape." The 1989 International Conference on Genetic Algorithms, June 1989.
Schaffer, J. David, Caruana, Rich, Eshelman, Larry J., Das, Raj, "A Study of Control Parameters Affecting Online Performance of Genetic Algorithms for Function Optimization." The 1989 International Conference on Genetic Algorithms, June 1989.
Caruana, Rich, Eshelman, Larry J., Schaffer, J. David, "Representation and Hidden Bias II: Eliminating Defining Length Bias in Genetic Search with Shuffle Crossover." International Joint Conference on Artificial Intelligence (IJCAI), August 1989.
Caruana, Rich, Schaffer, J. David, "Representation and Hidden Bias: Gray vs. Binary Coding for Genetic Algorithms." Fifth International Conference on Machine Learning, June 1988.
Caruana, Rich, Hodor, Paul, "A High-Precision Workbench for Extracting Information from the Protein Data Bank (PDB)." Knowledge and Data Discovery (KDD) Workshop on Text and Information Extraction, Boston, Massachusetts, 2000.
Caruana, Rich, Mullin, Matt, "Estimating the Number of Local Minima in Complex Search Spaces." International Joint Conference on Artificial Intelligence Workshop on Optimization, (IJCAI), Stockholm, Sweden, 1999.
Caruana, Rich, "15 Useful Tricks with Extra Outputs." Neural and Information Processing Systems (NIPS) Workshop on Tricks of the Trade, 1996.
Caruana, Rich, Freitag, Dayne, "How Useful Is Relevance?" AAAI Fall Symposium on Relevance, New Orleans, Louisiana, 1994. (download .ps here)
Caruana, Rich, "Generalization vs. Network Size." Neural and Information Processing Systems (NIPS) Workshop on Generalization, 1993.
Chrisman, Lonnie, Caruana, Rich, Carriker, Wayne, "Intelligent Agent Design Issues: Internal Agent State and Incomplete Perception." AAAI Fall Symposium on Sensory Aspects of Robotic Intelligence, Asilomar, California, 1991. (download .ps here)
Caruana, Rich, "The Automatic Training of Rule Bases that Use Numerical Uncertainty Representations." AAAI-87 Workshop on Uncertainty in Artificial Intelligence, Seattle, Washington, 1987.
Cohn, David, Caruana, Rich, and McCallum, Andrew, "Semi-Supervised Clustering with User Feedback." Cornell University Technical Report, TR2003-1892.
Caruana, Rich, Artigas, Pedro, Goldenberg, Anna, and Likhodedov, Anton, "Meta Clustering." Cornell University Technical Report, TR2002-1884.
Caruana, Rich, "Multitask Learning." Ph.D. Dissertation, School of Computer Science, Carnegie Mellon University, CMU-CS-97-203, 1997.
Buntine, Wray, Caruana, Rich, "Introduction to IND and Recursive Partitioning." NASA Ames Research Center, TR# FIA-91-28, 1991.
Caruana, Rich, "BANDIT: A Fast Algorithm for the Resolution of Spectra." Master's Thesis, Departments of Computer Science and Chemistry, Villanova University, 1989.
Caruana, Rich, "Estimating the Number of Minima in Complex Search Spaces." Philips Labs, TR-88-159, 1988.
Caruana, Rich, Schaffer, J. David, "Optimizing Digital Filters with Simulated Annealing and Genetic Algorithms." Philips Labs, TR-88-123, 1988.
Caruana, Rich, Schaffer, J. David, "An Investigation of Parameter Sets for Genetic Algorithms." Philips Labs, TR-88-045, 1988.
Caruana, Rich, Coffey, Brian J., "Searching for Optimal FIR Multiplierless Digital Filters with Simulated Annealing." Philips Labs, TR-88-031, 1988.
Pelavin, Rich N., Coffey, Brian J., Caruana, Rich, "Research in Diagnosis Using Design Knowledge." Philips Labs, TR-88-001, 1988.
Caruana, Rich, Schaffer, J. David, "Gray vs. Binary Coding for Genetic Algorithm Function Optimizers." Philips Labs, TR-87-080, 1987.
Benjamin, D. Paul, Caruana, Rich, "Partial-Matching as Search." Philips Labs, TR-87-019, 1987.
Caruana, Rich, "Experiments in Rule-Based Learning in Systems Using Numerical Uncertainty Representations." GTE Western Division Technical Report, 1986.
Sukthankar, Rahul, Caruana, Rich, Hasegawa, Keiko, Mullin, Matt, "Using Active Monitor Illumination for 3-D Active Imaging." United States Patent 6,704,447. Assignee: Justsystem Pittsburgh Research Center, Pittsburgh. Filed April 2000, granted March 9, 2004.
Caruana, Rich, "Iterated K-Nearest Neighbor Method and Article of Manufacture for Filling in Missing Values." United States Patent 6,047,287. Assignee: Justsystem Pittsburgh Research Center, Pittsburgh, Pennsylvania. Filed May 5, 1998, granted April 4, 2000.