Workshops & Tutorials
The workshops and tutorials day will be held on Wednesday, June 29th, 2016. The schedule will be as follows:
09:00 - 10:30 Session 1
10:30 - 11:00 Coffee break
11:00 - 12:30 Session 2
12:30 - 14:00 Lunch
14:00 - 15:30 Session 3
15:30 - 16:00 Coffee break
16:00 - 17:30 Session 4
Jonathan Rowe and Erica Snow (eds.) Proceedings of the EDM 2016 Workshops and Tutorials co-located with the 9th International Conference on Educational Data Mining.
WS-1: Computer-Supported Peer Review in Education (CSPRED-2016) - Full day
Edward Gehringer, North Carolina State University
Ferry Pramudianto, North Carolina State University
Yang Song, North Carolina State University
Computer-supported peer review is drawing increasing attention from educators and researchers. It produces more copious feedback than the instructor or course staff could provide, and delivers it more quickly. It provides authors with multiple perspectives on their work, rather than the singular voice of a teacher. For the instructor, it generates multiple performance measures that can be used to judge the class’s progress.
As an inherently interdisciplinary topic, peer review stands to benefit from the perspectives of learning scientists, technologists, and instructors, as well as psychologists, anthropologists, statisticians, designers, and other interested parties. The workshop calls for presentation of both early and mature research. Technology demonstrations are also welcome.
WS-2: Writing Analytics, Data Mining, and Writing Studies - Full day
Val Ross, University of Pennsylvania
Alex Rudniy, Fairleigh Dickinson University
Joe Moxley, University of South Florida
David Eubanks, Furman University
The primary goal of this workshop is to facilitate a research community around the topic of large-scale data analysis with a particular focus on writing studies, data mining, and analytics. The workshop hopes to generate cross-disciplinary research among writing program directors and faculty, computational linguists, computer scientists, and educational measurement specialists.
Presenters will address:
- How can data mining and analytics be leveraged to better meet the needs of students and educational institutions?
- What are the best practices for adapting the state-of-the-art data mining approaches to the educational domain, with specific attention to teaching and assessing writing?
- How can researchers detect and assess students’ affective and emotional states while engaging the writing construct?
- For assessing writing, automated grading, automated commenting, natural language or textual data processing:
  - What are applications of massive parallel computations?
  - What are current advances and future directions in the artificial intelligence field?
  - What methods, tools or big data platforms are more efficient?
  - What are effective pre-processing techniques, e.g. for the Extract/Transform/Load phase?
  - What are successful evaluation and validation methods?
Digital tools such as My Reviewers
WS-3: Educational Data Analysis using LearnSphere - Full day
John Stamper, Carnegie Mellon University
Kenneth Koedinger, Carnegie Mellon University
Philip Pavlik, University of Memphis
Carolyn Rose, Carnegie Mellon University
Ran Liu, Carnegie Mellon University
Michael Eagle, Carnegie Mellon University
Michael Yudelson, Carnegie Mellon University
Kalyan Veeramachaneni, Massachusetts Institute of Technology
LearnSphere’s goal is to support any custom analysis workflow that can be applied to educational datasets (such as those in DataShop, DiscourseDB, and MOOCdb) and to produce standardized workflow outputs that facilitate quantitative and qualitative model comparisons. We invite researchers to submit 2-4 page analysis descriptions. Strong submissions will have high-level descriptions of the analysis workflow and detailed information on the format of input data and resulting outputs. Accepted participants will be eligible for a travel scholarship and have the opportunity to publish outcomes in the EDM workshop proceedings.
This workshop will explore the application and refinement of novel educational data mining workflows using LearnSphere, a new $5 million NSF-funded data sharing and analysis portal that extends the existing DataShop infrastructure and includes teams from Carnegie Mellon, Stanford, Memphis, and MIT. Increased flexibility to accommodate custom educational analysis workflows is one of the core ways in which LearnSphere expands upon DataShop.
Tutorial 1: SAS Tools for Educational Data Mining - Full day
Jennifer Sabourin, SAS Institute
Scott McQuiggan, SAS Institute
Andre De Waal, SAS Institute
Researchers in the EDM community have always relied on sophisticated tools to analyze data and build models. As the amount of data that can be collected and stored grows, the need for tools capable of handling "big data" becomes ever more pressing. SAS® Analytics U is a new initiative for making SAS data analysis and mining tools available for free to educational researchers and instructors. These tools are designed for handling very large data sets and can be run in the cloud, saving researchers valuable time and resources. Furthermore, SAS Analytics U provides a community in which SAS educators and learners can share resources and information about SAS tools and techniques. This tutorial aims to introduce researchers to the tools available through SAS Analytics U and show how they can be applied to the field of Educational Data Mining. We will provide an overview of the SAS architecture and provide instruction on the key features of each tool in the suite.
If you intend to participate in the hands-on activities, please bring a laptop with SAS University Edition already installed. Installation can take up to an hour, so there will not be time for it on the day of the tutorial. The free download is available at http://www.sas.com/en_us/software/university-edition.html
Tutorial 2: Massively Scalable EDM with Spark - Full day
Tristan Nixon, Institute for Intelligent Systems, University of Memphis
The creation and availability of ever-larger datasets is motivating the development of new distributed technologies to store and process data across clusters of servers. Apache Spark has emerged as the new standard platform for developing highly scalable cluster computing applications. It offers a wide range of connectors to numerous databases and enterprise data management systems, an ever-growing library of machine-learning algorithms, and the ability to process streaming data in near real time. Developers can write their applications in Java, Scala, Python and R. Applications can be run locally (for easy development and testing), and deployed to dedicated clusters or on clusters leased from cloud-computing providers.
This will be a full-day tutorial at EDM 2016 on developing massively scalable machine learning and data mining applications with Spark. Participants will be expected to follow along with all examples on their own laptops throughout the tutorial. All code used in the tutorial will either be taken from publicly available examples, or be available for download from the IEDMS GitHub repository under a very liberal open source license. All examples will be designed to process a modestly sized sample of the KDD Cup dataset available from the DataShop.