University of Minnesota
CSci 8363 - Linear Algebra in Data Exploration

THE FOLLOWING SOURCES WERE FOUND VIA A GOOGLE SEARCH "data mining sample data" on Jul 12 2019.

THE FOLLOWING SOURCES WERE FOUND IN 2017.


SNAP (Stanford Network Analysis Project)
Stanford Network Analysis Platform (SNAP) is a general-purpose network analysis and graph mining library.
http://snap.stanford.edu/

Bureau of Transportation Statistics
https://www.bts.gov/

The SuiteSparse Matrix Collection (formerly known as the University of Florida Sparse Matrix Collection) 
https://www.cise.ufl.edu/research/sparse/matrices/


MovieLens + GroupLens
https://grouplens.org/datasets/

Kaggle - Data Science & Machine Learning competitions and open data sets
https://www.kaggle.com
https://www.kaggle.com/datasets

https://www.kaggle.com/c/word2vec-nlp-tutorial/data
Use Google's Word2Vec for movie reviews


BSDS (The Berkeley Segmentation Dataset and Benchmark)
https://www.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/
Testbed images for image segmentation and edge detection in natural images.

Network Data
http://openflights.org/data.html#route 
Airport-to-airport route data: 539 airlines, 2939 airport pairs.

Word count data
http://www.ngrams.info/intro.asp
Source for 3-grams of English text.

https://books.google.com/ngrams
sample data
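
For orientation, the kind of 3-gram count table these sources provide can be reproduced on a small text. The following is only a minimal sketch: the function name count_trigrams and the sample sentence are made up for illustration, and the tokenization (lowercase, split on whitespace) is an assumption that will not match how the listed sources tokenize.

```python
from collections import Counter


def count_trigrams(text):
    """Count word-level 3-grams in a string.

    Assumption: words are obtained by lowercasing and splitting on
    whitespace; punctuation is not handled.
    """
    words = text.lower().split()
    # Zip three staggered views of the word list to get consecutive triples.
    return Counter(zip(words, words[1:], words[2:]))


# Hypothetical sample text, not taken from any of the listed data sets.
sample = "the cat sat on the mat and the cat sat down"
counts = count_trigrams(sample)

for trigram, n in counts.most_common(3):
    print(" ".join(trigram), n)
```

Each distinct 3-gram can then serve as one column of a document-by-3-gram count matrix.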

MSR GPS Privacy Dataset 2009, March 2017
http://research.microsoft.com/en-us/um/people/jckrumm/wallflower/testimages.htm
See the "Downloads" tab:
Seattle region GPS tracking data.
https://www.microsoft.com/en-us/download/details.aspx?id=54965

http://www-personal.umich.edu/~mejn/netdata/
Links to network data sets, compiled over the years.

http://socialcomputing.asu.edu/datasets/YouTube
Co-occurrence data among users' access to YouTube videos.

http://socialcomputing.asu.edu/datasets/BlogCatalog
friend information among bloggers.


10,000 Facebook status updates of 250 users + personality + Facebook social network properties, including network size, betweenness centrality, density and transitivity. 
http://mypersonality.org/wiki/lib/exe/fetch.php?media=wiki:mypersonality_final.zip
Cite: Celli F., Pianesi F., Stillwell D., Kosinski M. (2013) Workshop on Computational Personality Recognition (Shared Task). In Proceedings of WCPR13, in conjunction with ICWSM-13.
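
Two of the network properties named above, density and transitivity, can be computed directly from an edge list. This is a minimal sketch using the standard definitions for a simple undirected graph (density = 2m / (n(n-1)), transitivity = closed connected triples / all connected triples). The function names and the tiny example graph are hypothetical and do not come from the data set.

```python
from itertools import combinations


def density(n_nodes, edges):
    """Fraction of possible undirected edges that are present.

    Assumption: edges are unique, undirected, and contain no self-loops.
    """
    return 2 * len(edges) / (n_nodes * (n_nodes - 1))


def transitivity(edges):
    """Closed connected triples divided by all connected triples."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    closed = 0   # triples whose two end nodes are also linked
    triples = 0  # all paths of length two, counted once per center node
    for nbrs in adj.values():
        for a, b in combinations(sorted(nbrs), 2):
            triples += 1
            if b in adj[a]:
                closed += 1
    return closed / triples if triples else 0.0


# Hypothetical 4-node graph: a triangle 0-1-2 plus a pendant node 3.
example_edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
d = density(4, example_edges)
t = transitivity(example_edges)
print(d, t)
```

Each triangle is counted three times in the closed total, once per center node, which matches the usual 3 x triangles / triples form of the definition.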

Image - Video Data
Detection of Moving Objects
http://limu.ait.kyushu-u.ac.jp/dataset/en/

http://wordpress-jodoin.dmi.usherb.ca/dataset2012/
Identification of changing or moving areas in the field of view of a camera.
This dataset contains 6 video categories with 4 to 6 video sequences in each category.

Test Images for Wallflower Paper (background subtraction) February 2017
http://research.microsoft.com/en-us/um/people/jckrumm/wallflower/testimages.htm
See the "Downloads" tab:
