AI is A-OK at discovering patterns in data
Jan. 17, 2023 - Susan T. Burtch
Paul Brooks is the School of Business Information Systems department chair. He is also a two-time presenter and finalist in the Data Mining Society Best Paper Award competition at the INFORMS (Institute for Operations Research and the Management Sciences) annual conference on business analytics and operations research. But most importantly, Brooks is an innovator.
Brooks and his collaborators at VCU and Ghent University developed a new method of conjecturing to help people understand why artificial intelligence (AI) systems make the recommendations they do. “It fills an important gap in data analysis methods,” says Brooks. “It recovers patterns others do not see. It’s the first method we know of to apply conjecturing to learning from data.”
Brooks was inspired in part by a classroom experience where he conducted a predictive analytics competition for his students using real estate data. The students were able to apply sophisticated machine learning and AI methods to obtain accurate predictions but were unable to recover a simple rule that existed in the data.
“Our method is designed to discover exactly these kinds of rules,” says Brooks. “First, it finds bounds of variables in terms of other variables, and then it finds sufficient conditions for a logical target. Our method applied to this real estate data produces the successful rule.”
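To make the two steps Brooks describes more concrete, here is a minimal sketch in Python. It is not the researchers' implementation: the toy data, the column names and the helper functions upper_bounds and sufficient_conditions are all hypothetical, and the search is a simple brute-force check over a handful of candidate expressions. It only illustrates the general idea of finding a bound on one variable in terms of others and then finding conditions that are sufficient for a true/false target.

```python
from itertools import combinations

# Hypothetical toy data; this is NOT the real estate data from the article.
ROWS = [
    {"beds": 2, "baths": 1, "rooms": 3, "garage": False, "pool": False, "luxury": False},
    {"beds": 3, "baths": 2, "rooms": 5, "garage": True, "pool": False, "luxury": False},
    {"beds": 4, "baths": 3, "rooms": 6, "garage": True, "pool": True, "luxury": True},
    {"beds": 3, "baths": 1, "rooms": 4, "garage": False, "pool": True, "luxury": False},
    {"beds": 5, "baths": 3, "rooms": 8, "garage": True, "pool": True, "luxury": True},
]


def upper_bounds(rows, target, features):
    """Return names of candidate expressions e such that target <= e holds on
    every row and target == e on at least one row (so the bound is tight)."""
    # Candidate expressions: each single feature and each pairwise sum.
    candidates = {}
    for f in features:
        candidates[f] = lambda r, f=f: r[f]
    for f, g in combinations(features, 2):
        candidates[f"{f} + {g}"] = lambda r, f=f, g=g: r[f] + r[g]

    found = []
    for name, expr in candidates.items():
        holds = all(r[target] <= expr(r) for r in rows)
        tight = any(r[target] == expr(r) for r in rows)
        if holds and tight:
            found.append(name)
    return found


def sufficient_conditions(rows, target, properties):
    """Return tuples of boolean properties that, whenever all are true on a
    row, imply the boolean target is true on that row."""
    found = []
    for size in (1, 2):
        for combo in combinations(properties, size):
            matching = [r for r in rows if all(r[p] for p in combo)]
            # Require at least one supporting row so the rule is not vacuous.
            if not matching or not all(r[target] for r in matching):
                continue
            # Skip combinations that only restate a smaller rule already found.
            if any(set(prev) <= set(combo) for prev in found):
                continue
            found.append(combo)
    return found
```

On this toy table, upper_bounds(ROWS, "rooms", ["beds", "baths"]) returns ["beds + baths"], and sufficient_conditions(ROWS, "luxury", ["garage", "pool"]) returns [("garage", "pool")]: neither a garage nor a pool alone guarantees the "luxury" label, but the two together do in every row.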
Meanwhile, Brooks and his co-authors, David Edwards (Statistical Sciences and Operations Research, VCU), Craig Larson (Mathematics and Applied Mathematics, VCU) and Nico Van Cleemput (Applied Mathematics, Computer Science and Statistics, Ghent), are working to apply the conjecturing method in other scientific domains and commercial applications. For instance, they are using it to discover risk factors for poor COVID outcomes and to predict injuries in military basic training programs. Yet Brooks is especially focused on alerting his fellow academicians to the possibility of “using our method for generating data-mined-yet-interpretable features with new methods of automatically specifying instrumental variables for establishing causal relationships.”