Columbia’s Institute for Data Sciences and Engineering is about to open, with research beginning in January, a data seminar series planned for the spring, and several academic programs starting next fall.
The institute, part of the School of Engineering and Applied Science, will occupy space in existing Columbia buildings on the Morningside Heights and Medical Center campuses. It is being funded by a $15 million grant the University received from the city last July, part of which is going toward hiring 75 new faculty members over the next 15 years.
“It’s been an opportunity to really grow the engineering school and many areas at Columbia,” said Kathleen McKeown, who was named chair of the institute in July. “It’s very interdisciplinary, and I think right now ... many people have this large amount of data and have the need to be able to use it, to draw inferences from it, and that’s an exciting area to work in.”
The institute represents a significant downsizing from the initial plan put forth last October as an entry in the city’s Applied Science NYC competition. That proposal covered 1.1 million square feet instead of 44,000, and was slated to occupy space on the Manhattanville campus. It was downsized after the lion’s share of city funds in the competition went to Cornell’s proposed campus.
In its new iteration, the institute will include a theory-based foundation for data sciences and five major research centers: new media, smart cities, health analytics, cybersecurity, and financial analytics. Each will focus on projects that synthesize vast amounts of data into meaningful statistics, work that belongs to the emerging field known as data mining.
“If we had GPS on taxis all over the city, we’d start to understand traffic patterns and where people are going, then we can direct traffic in real time,” said Patricia Culligan, associate director of the institute, citing an example of a potential Smart Cities project. “If we had sensors on bridges all over the city, we could start to understand which bridges needed maintenance and when they needed maintenance.”
Culligan said that sensing technology and data would allow civil engineers to target problems in “very smart ways,” noting that many researchers lack the funding to monitor and maintain current infrastructure. She said that data mining is the way of the future.
“There’s a lot of outreach going on to companies and to the high-tech community in New York City,” said David Madigan, chair of the statistics department and member of the institute’s executive committee.
The executive committee has also started planning the academic programs for next year.
“The first will probably be a certification, and the second will be a master’s program,” McKeown said, noting that organizing these programs has been challenging because of the interdisciplinary nature of the project, which includes faculty from eight different departments.
According to Madigan, despite the logistical challenges that the institute faces, expanding the network of data scientists on Columbia’s campus remains at the top of the institute’s priority list.
“At the end of the day, this is all about the people. It’s all about bringing world-class researchers here to Columbia, which can then make all sorts of great things happen,” he said. “There’s no reason that we can’t build a truly world-class group in this area.”
Culligan said that data mining is a rapidly growing field because of its ability “to solve problems that were previously unsolvable.”
Madigan said that the data institute is “exactly what Columbia needs right now” because data now affects everyone.
“Harnessing the power of the data deluge is something that researchers, educators, industry, commerce around the world is going to see as an exciting challenge and opportunity for the next decade and beyond,” she said. “I think for us to be in a position to be leading in this area is just great.”
“It’s like data is the new black,” she said.