University of Minnesota
Dispersed Data-driven Computing

Course Work

The course work consists of a combination of class presentations (30% of the grade), paper reviews (10%), class participation (15%), scribing (10%), and course project (35%).

Class Presentations

Each student will present two full papers from the reading list, with each presentation worth 10% of the course grade. In addition, for the remaining 10% of the grade, each student will pick one of the following options for an additional class presentation:

  • Software demo
  • 2 short paper presentations (5% of the grade each)
  • An additional full paper presentation

Paper Presentations

You will be required to do at least two (and optionally a third) regular paper presentations. We will typically cover 3 regular papers each week. Some of the papers are listed as Background; these are essential background papers. Prof. Chandra will present the initial papers; however, everyone is required to read them in order to understand the subsequent material in the class.

Please send me a list of your top three or four paper choices by email. Although I'll try to assign everyone their preferred papers, conflicts will be resolved on a first-come, first-served basis. If there is great demand for some of the papers, you may not get any of your preferences; in that case, I'll try to assign you a paper closest to your area of interest. I'll post a schedule of class presentations as soon as everyone has signed up.

Suggestions for presentations

For a regular paper, plan to spend a total of about 30-35 minutes on the presentation. As part of this, plan to spend about 20 minutes on the talk (about 5 minutes of motivation and problem statement, 10-12 minutes explaining the main idea and results, and 3-5 minutes summarizing the paper and providing your perspective), followed by about 10-15 minutes of discussion. The emphasis should NOT be on presenting every detail of the paper, but on conveying the main ideas. Most importantly, you should try to identify some interesting questions and issues that can spark discussion in the class. Some tips about your presentation slides:

  • A picture is worth a thousand words! Use as many figures as possible and avoid too much text on each slide.

  • Mainly use short bullets and explain the matter in detail verbally. The bullets are for reference, not for reading out.

  • Avoid using font size smaller than 22 pt (20 pt should be the absolute minimum).

  • Avoid using sub-sub-bullets.

  • Add 1-2 slides at the end (after the conclusion slide) with your comments, questions, and discussion points about the paper. These slides could be used as a basis for class discussion. Alternately, you can incorporate discussion points throughout your slides.

You owe it to your colleagues to do a good job on the presentation. If you need feedback from me, you must turn in a rough draft of your presentation materials (slides, questions that you would like to raise during the discussion, etc.) at least 2-3 days before the class. You can also come by and talk to me about your presentation. In addition, students will be grouped into Red Teams (groups of 2-3 students). Your Red Team could be your project team or the other presenters in the same week. You should practice your talk with the members of your Red Team prior to the class presentation. You must submit a copy of your final presentation after the class.

The presentations will be evaluated based on a combination of content, format, delivery, and the amount of discussion generated in the class. Here's a Presentation Feedback Form that you can look at to get a sense of the main points to consider in preparing your presentation.

Short Paper Presentations

You can do two short paper presentations (from the reading list) instead of a software demo or an additional full paper. The guidelines for a short paper presentation are similar to those for regular papers above, except each presentation should be about 20 minutes (12-15 minutes for the talk followed by 5-8 minutes of discussion). Each short paper presentation will be worth 5% of the grade.

Software Demos

A software demo would involve demonstrating a popular software project related to a topic of interest in the class. This would involve describing the high-level features, components, and programming interfaces of the software, and potentially showing a simple "Hello World" like usage of the software. For instance, a demo of Hadoop would involve a short description of Hadoop MapReduce and HDFS, and showing how to write and execute a simple MapReduce program (such as WordCount). Plan to spend about 15-20 minutes on your demo in the class. A software demo would be worth 10% of the grade.
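To give a sense of the scale of a "Hello World" example, here is a minimal sketch of the WordCount logic written in plain Python. It is only an illustration of the map, shuffle, and reduce steps running in memory on a single machine; it does not use the Hadoop API, and the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical names chosen for this sketch. An actual Hadoop demo would express the same map and reduce logic through Hadoop's own interfaces and run it on data stored in HDFS.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every whitespace-separated word."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group emitted values by key (done by the framework in a real system)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Reduce: sum all counts collected for a single word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of text lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    sample = ["the quick brown fox", "the lazy dog", "The fox"]
    for word, count in sorted(word_count(sample).items()):
        print(word, count)
```

A demo at roughly this level of detail, paired with a short description of the system's components, should fit comfortably within the 15-20 minute slot.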

Scribing

You will be asked to scribe for two papers on a related topic (likely to be covered in the same class). You need to submit a single scribe report for both papers. This report will be worth 10% of your grade. Submit your final scribe report via the Canvas site.

As part of your scribing duties, you will be required to take notes on the main points of the presentation, as well as the discussion and Q&A during the class. Ideally, organize these notes after the class so that they follow a logical order rather than the order in which things were presented and discussed (e.g., group similar issues together, add suitable headings, itemize points, and reword or expand some points for clarity). Note that the scribe notes you submit are not meant to be minutes of the class meeting; rather, they should be a coherent collection of the main ideas and issues that were raised during the class. At the same time, they are not meant to be a detailed reiteration of the paper contents, so avoid expanding on details already in the paper, and mention them only briefly as needed in relation to the presentation and discussion. In addition, add a brief paper summary (similar to that in a review) at the top of the scribe notes. The summary should be a short paragraph describing the main ideas of the paper. Overall, one should be able to read the scribe notes and get a quick idea of what the paper is about, as well as some of the interesting issues that were raised or left unanswered by the paper.

In addition to organizing the class notes, you should collect additional information related to the topic discussed in the papers from other sources, such as external blogs, technical articles, or websites. This material could include discussion of interesting technologies that have appeared in the marketplace or as software projects (e.g., open-source projects), as well as new ideas that have appeared in non-academic blogs or articles. The goal is to find related ideas that may not have been published in an academic venue but are still relevant to the topic at hand. As an example, if we discuss the MapReduce paper in class, a discussion of the open-source Hadoop project or other similar projects, or of interesting use cases of MapReduce in industry, would be relevant. I do not expect you to carry out an extensive survey, but you should try to find about 2-3 interesting sources if possible. Add a section to your scribe notes summarizing this additional information at a high level. Make sure to add references and include a comparison to the research papers if applicable.

Template scribe report.

Paper reviews

You are required to review papers before they are presented in the class. You must submit a text review (plain ASCII text; no PDF, Word, etc.) of the paper via the Canvas site prior to the class. The goal is to ensure that everyone has read the paper in advance, so that we can have an informed discussion in the class. The reviews will constitute 10% of the total course grade. Each paper review carries 1% of the grade, so you need to submit 10 reviews to earn the full 10%. You must not submit a review for a paper that you are presenting (if you do, it will not count towards your grade). While the reviews will not be graded, Prof. Chandra will pick interesting points from the submitted reviews to further the discussion.

The review should NOT be the paper abstract. We will follow a conference-style review of the papers. The review should:

  1. summarize the main ideas and conclusions of the paper (2-3 sentences in your own words), and

  2. list the strengths and weaknesses of the paper.

The review should be no more than 2-3 paragraphs long, and the focus should be on the key ideas of the paper. Avoid quoting or copying from the paper text. The review should be in your own words, and should reflect your thoughts about the paper.

Template review form.

Each class presentation will be followed by a discussion of the topic. The entire class will be involved in the discussion. By the end of the discussion, we should have answered some of the following questions:

  1. Why is this a significant problem?

  2. Are there alternate approaches to solve this problem?

  3. Can we improve the techniques proposed in the paper?

  4. What are the main strengths and weaknesses of the paper?

Class participation

Class participation is 15% of the grade, so you are encouraged to participate in a meaningful way in the discussion. This involves making insightful comments, posing pertinent questions, sharing your views and relevant knowledge, and engaging with other students during class. The class participation grade will be based on active, consistent, and meaningful participation throughout the semester, so please maintain regular attendance and come prepared to class. You might be asked to comment individually or to participate in small group discussions.

Course Project

The goal of the course project is to do some implementation and/or research that advances the state of the art in the field. Projects must be done in teams of 2-3 students. The project should be moderately sized so that it can be completed in about 2.5 months. There will be deadlines for project milestones (project proposal, mid-term review, final submission) to help keep the projects on track. There will be final project presentations and a written final report at the end of the semester. I encourage students to define their own projects, and I will help you in defining your projects as well.

Ideally, each project should have a combination of research, implementation, and evaluation. However, the weight of each of these components can vary based on the specific project chosen. Each team must discuss their project ideas and plan with me before the proposal date. Examples of relevant project topics from previous offerings of 8980 classes include:

  • MapReduce straggler mitigation in virtualized clusters
  • Weakling detection and mitigation in the Storm stream computing system
  • Load balancing in a graph processing system
  • Implementation of a hypergraph processing layer on top of GraphX
  • Implementation of a new job scheduler in the UMN Nebula geo-distributed system

Each of these projects involved a combination of research, implementation, and experimentation, and your project is likely to have a similar flavor, though the topics are likely to be different. I will also provide some suggestions on possible project topics. A good way to start thinking about project ideas is to read some of the papers on the reading list along with their related work and key citations. This can provide ideas or at least help you narrow down the topics of interest. You must look for opportunities to extend the state of the art in some way.

Here are some of the key project milestones:

Proposal: The project proposal should be about 2-3 pages long, and it should contain the following components:

  1. Problem statement and Motivation for the problem.

  2. Existing related work.

  3. Proposed solution/design.

  4. Plan of work: How would you implement your solution, timeline/milestones, availability of resources (machines, equipment, etc.).

  5. Evaluation methodology and expected results (if any).

  6. (Important) Division of labor: Which team member will be responsible for what component of the project?

Besides submitting the proposal document, each team must make a short (10-15 minute) proposal presentation in class (including 2-3 mins for Q&A) emphasizing the above points.

Mid-Term Review: There will be a mid-term project review, for which each team must submit a short (1-2 page) progress report specifying which milestones have been achieved so far, what remains to be done, whether any major roadblocks are being faced, and how the team plans to address them. Each team will also give a 5-10 minute update on their progress in class.

Final Report/Presentation: The final project reports will be due at the end of the semester, and each team will also give a 20-25 minute final project presentation. The report should be a document of about 8-12 pages with many of the same elements as the project proposal, but with greater detail on the problem, related work, solution and implementation, and actual results and analysis, as well as the division of labor among the team members.

Project Cluster: A shared virtual cluster will be set up for use in the class projects. More details will be available soon.

Academic Conduct and Dishonesty

Collaboration and discussion are highly encouraged in this class. However, all submitted paper reviews must be original and written individually. Presentation slides must be prepared on your own, though you can borrow some material available on the web or from the paper (e.g., images, pictures, graphs, etc.). In such cases, you must clearly acknowledge and attribute the sources. Topic survey reports and presentations can cite and attribute external sources, but the report should be written on your own. Class projects are to be done within your teams, and your reports and presentations must be original. You may use external code (such as open-source project code) as part of or the basis of your project, but you must clearly attribute the source(s). You may discuss ideas and ask for clarifications freely with others on or off the class forum, and with Professor Chandra. However, the work must be done on your own. Plagiarism or copying material (including code) from the Web or other sources without attribution is considered cheating.

All instances of academic dishonesty will be dealt with in accordance with University policies. Note that in all instances of cheating, the student(s) providing as well as receiving unauthorized help will be considered to be equally culpable. If you are ever doubtful about what may or may not be considered academic dishonesty, please do not hesitate to ask the instructor.