University News

Nascent club hosts first DataFest

Students tasked with analyzing data, creating presentation to win inaugural competition

By Staff Writer
Monday, April 25, 2016

After being presented with four gigabytes’ worth of data (the equivalent of roughly 1,000 songs or 1,740 pictures, according to the Huffington Post), competitors at Brown’s first DataFest were afforded just a day and a half to extract something meaningful from the data.

DataFest was organized by Brown Data Science (BDS), a newly formed club that offers workshops, supports projects and promotes the learning of data science, said Tanay Padhi ’17, the club’s co-founder and co-president. The event was also organized with support from the Department of Computer Science and the Data Science Initiative, which the club’s website describes as a “cross-departmental program that seeks to integrate data science into the current curriculum at Brown.”

Five teams of up to three students each worked primarily in the Center for Information Technology to analyze the data and create a final presentation of their findings, said Sachin Pendse ’17, vice president of events and education for Brown Data Science.

The data set used in the competition was provided by the American Statistical Association, which helps organize DataFests at other universities, Padhi said. Because the data set will be used in upcoming DataFests, organizers are not allowed to disclose the specific information it contains, he said.

Presentations were judged by Director of the Center for Biomedical Informatics Neil Sarkar, Assistant Professor of Computer Science Tim Kraska and Dmitri Lemmerman ’05, a representative from sponsor Two Sigma, a hedge fund that uses digital technology to manage investments.

After some deliberation, judges awarded “best visualization” to Ian Pan ’16, who created an interface that allows users to comb through the data.

“We really looked for something that was beyond the standard bar graph,” Sarkar said. “The winner really was able to allow for hypothesis generating and exploring the data in a way we thought was interactive and visually stunning.”

“Best insight” went to the team of Nikolas Baya ’18, Anthony Cruz ’18 and Jason Wang ’18, who used the programming language Python to sort through the data set, Baya said. They then created graphs and maps examining when advertisements are most effective in a sports season and how advertisements are targeted in the industry, Cruz said.

“For ‘best insight,’ we were looking for completeness in terms of considering (all variables),” Sarkar said. “(We wanted to know) how they went through the data-cleaning process, … if thought was given to the types of statistics they should use and why they chose to use them. The winner went through all those things really methodically.”

The winners received prizes that included an Amazon Kindle Fire, a Garmin Vivofit and a Tile — a Bluetooth tracking device.

Throughout the weekend, upperclassmen hosted workshops on data visualization and other data and computer science topics, such as “Exploring Genomic Data Science” and “Deep Learning: The Future of Machine Intelligence.” DataFest participants who chose not to compete were strongly encouraged to attend these workshops, Pendse said.

Edwin Hidalgo ’17, who attended the workshop “Sports Analytics,” said he admired the speaker’s passion and enjoyed learning how data can be applied beyond an academic setting.

Hidalgo also co-hosted the workshop “Discussion on Health and Wellness in CS” with Martin Zhu ’17, vice president of operations of BDS. The group hopes to partner with Women in Computer Science and Brown Active Minds to host a larger discussion on the same topic during reading period, Pendse said.

In his keynote address during the closing ceremony, Chief Information Officer and Vice President for Computing and Information Services Ravi Pendse P’17 highlighted the proliferation of technology over the past five years and how data and data science have impacted the daily lives of millions of people. Ravi Pendse challenged and encouraged students to use data science to face problems, including some created by data itself.

“With your human ingenuity and the power of data science, I have no doubt that the world that I’ll see in the next 10 years or so will be a much more enhanced world and a better world for all of us,” Ravi Pendse said.

The BDS executive board hopes to grow DataFest next year to reach more participants and competitors, Sachin Pendse said. But the goal is “not so much quantitative,” Padhi said. “It is to ensure that students who are interested in (data science) but lack confidence do have the confidence to do this type of work.”

BDS was founded because, despite great interest in data science on campus, Brown lacked spaces for students to share and develop that interest besides “a few high-level classes,” Padhi said.

The group believes that data science can be applied to every field, whether that application means conducting topical analyses of texts in literature or predicting outcomes in international relations, Sachin Pendse said. “Not only do we have people who want to become data scientists, we also have people who could use data science as part of their work,” Padhi said.