
Professor of Applied Mathematics Jan Hesthaven is spearheading an educational initiative to increase student fluency with massive data sets, the huge sets of data being generated in fields as disparate as biology, psychology and linguistics.

Set to roll out next fall, the massive data initiative will focus on increasing awareness and may include a first-year seminar on the topic.

"We're currently focused on getting a sense of what's already there on the research side and on the education side," said Hesthaven, who also holds the position of deputy director of the Institute for Computational and Experimental Research in Mathematics. "But after some time, we will start seeing more focused activities, like graduate research and maybe an undergraduate-certification program."

The possibilities for massive data are limitless, Hesthaven said. It has the potential to integrate the humanities and the sciences, often seen as relatively incompatible subjects. "The way we deal with data is becoming ubiquitous in all disciplines," Hesthaven said. "For the first time, disciplines that have no common language are able to talk to each other in the medium of computation."

Computers, for instance, can detect commonalities between languages that humans may pass over. "A computer is incredibly patient," Hesthaven said.

Biology is already at the forefront of this process. This summer, Casey Dunn, assistant professor of biology, published a very large set of data that examined gene function of siphonophores, a group of deep-sea organisms. Dunn, who conducted much of his research aboard ships and submarines, believes massive data can finally take biological inquiry out of the lab. "Before, we only had data sets for about 20 model species that we were able to study in the lab, when there are about 10 million species in the world," he said. "With massive data, we can take functional genomics out of the lab and into the field."

But researchers handling massive data must also maintain the standards of good science despite changing methods, said Alexandre Fournier-Level, postdoctoral research associate in the ecology and evolutionary biology department, who has experience handling massive data. "With massive data, we can capture everything, run thousands of automated tests and get a probabilistic answer, but we need to know the difference between a direct and indirect link," he said. "We used to gather data after we asked the question. Now we gather data, and then ask the questions."

"Massive data poses very interesting challenges and opportunities," Hesthaven said. Researchers, he said, can "use this data to answer questions that we care about, big questions."




All Content © 2024 The Brown Daily Herald, Inc.