University of Minnesota
CSci5980: Data Storage Systems to Support Big Data

Announcements for CSci5980: Special Topic on Data Storage Systems to Support Big Data

  • 01/14: Welcome to CSci5980: Special Topic on Data Storage Systems to Support Big Data

  • 01/18: The syllabus has been uploaded. Please check the following link for detailed information.

  • 01/19: Slides for syllabus and overview have been uploaded.

  • 01/27: Three more slides for overview have been uploaded.

  • 01/31: Papers for various topics have been uploaded.

  • 02/15: Please check the reading assignment for each lecture under reference.

  • 03/02: The first project report is due on March 18th. In this report, please describe the project you intend to do and provide some motivation and background on the subject. The report should be about 3 pages long, excluding the references listed at the end.

  • 03/14: We will test Zoom for online instruction. Please log in for 5 minutes on March 16th at 1:00 p.m.

  • 03/14: New slides have been uploaded for the presentations.

  • 03/23: The following is the schedule for the next few classes:
    • 03/30: We will discuss "99 Data Deduplication Problems".
    • 04/01: We will discuss the BigTable paper.
    • 04/06: We will discuss LevelDB vs. RocksDB. Since we do not have a paper specifically on either LevelDB or RocksDB, you can refer to the following page for RocksDB basics: RocksDB basics
    • 04/08: We will discuss the paper "Characterizing, Modeling and Benchmarking RocksDB".


  • 03/24: The slides for tomorrow's discussion on deduplication have been uploaded.

  • 03/29: The slides for tomorrow's discussion have been uploaded.

  • 03/30: The slides for the discussion on BigTable have been uploaded.

  • 04/03: The slides for the discussion on KVS have been uploaded.

  • 04/06: The following is the schedule for the next few classes. The first part covers the subjects to be discussed in the next two weeks, and the second part lists the individual presentations (15 minutes each):
    • 04/13: DNA-Storage: Scaling Up DNA Data Storage and Random Access Retrieval

    • 04/15: Big Graph Processing: One Trillion Edges: Graph Processing at Facebook

    • 04/20: VM and Container: K8sES: Kubernetes with Enhanced Storage Service Level Objectives

    • 04/22: SDN + SDS: TurboKV: Scaling Up the Performance of Key-Values Store with In-Switch Coordination

    • 04/27:
      Wenlong Wang: Potential Better Data Structures for NVRAM
      Haoyu Gong: Integrating Non-Volatile Memory into Database Systems
      Suresh Siddharth: Indexing Schemes for Key-Value Store
      Wei-Yu Chen and Chai-Wen Hsieh: A Study of Data Deduplication

    • 04/29:
      William Batu and Yuanli Wang: Auto-Tuning RocksDB
      Troy Dey: Data Placement Algorithms for Heterogeneous Storage Systems
      Shannav Nath and Trevor Sloan: DNA-Storage
      Yixun Wei: Deduplication to Enable Efficient Read and Write in DNA-Storage

    • 05/04:
      Huibing Dong: Big Graph Processing
      Sami Frank: Container Orchestration Using Kubernetes and Docker
      Karl Witthuhn: Software Defined Network + Software Defined Storage
      Nic Hamlin:



  • 04/10: The slides for the discussion on DNA storage have been uploaded.

  • 04/13: The slides for the discussion on graph processing have been uploaded.

  • 04/17: The slides for the discussion on VM and Container and SDN + SDS have been uploaded.

  • 04/22: The final project report is due on May 13th.

  • 04/27: The slides for the final presentations have been uploaded.