Matlab Code for Generating Random Datasets
 An example `.m' file that creates a 2D dataset with 3 clusters. It can also be modified to generate other artificial
data (with different numbers of clusters, dimensions, and underlying distributions).
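
For readers without the file at hand, here is a minimal sketch of what such a script might look like. It is not the linked `.m' file; the cluster centers, noise level, and all variable names are illustrative assumptions.

```matlab
% Sketch: 2D dataset with 3 Gaussian clusters (all values are illustrative).
nPerCluster = 100;                    % points per cluster
centers = [0 0; 5 0; 2.5 4];          % one row per cluster center (K x D)
noiseStd = 0.5;                       % isotropic standard deviation

[K, D] = size(centers);
X = zeros(K * nPerCluster, D);        % data, one point per row
labels = zeros(K * nPerCluster, 1);   % ground-truth cluster index
for k = 1:K
    idx = (k - 1) * nPerCluster + (1:nPerCluster);
    X(idx, :) = repmat(centers(k, :), nPerCluster, 1) ...
                + noiseStd * randn(nPerCluster, D);
    labels(idx) = k;
end
```

Type "scatter(X(:,1),X(:,2),10,labels,'filled');" to view the result. Adding rows or columns to centers changes the number of clusters or dimensions, and replacing randn with another generator (for example rand) changes the underlying distribution.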
 The following Matlab package contains a file called
"generate_samples.m" for generating data from hybrid linear models. It is part of the larger GPCA package.
To keep the subspaces from intersecting (so that standard clustering can be applied), set the parameter avoidIntersection = TRUE
and use affine rather than linear subspaces.
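
To illustrate the idea of affine subspaces, the sketch below samples points from a few randomly placed lines in 3D. It is not generate_samples.m and does not use the GPCA package; every name and value is a hypothetical choice. The random offsets make intersections among the sampled segments unlikely, but nothing here enforces it.

```matlab
% Sketch: sample from K affine subspaces of dimension d in R^D.
% NOT generate_samples.m; all names and values here are hypothetical.
D = 3; d = 1; K = 3;                 % ambient dim, subspace dim, number of subspaces
nPerSubspace = 100;
noiseStd = 0.02;                     % noise around each subspace
offsetScale = 2;                     % spread of the subspace offsets

X = zeros(D, K * nPerSubspace);      % one point per column
labels = zeros(1, K * nPerSubspace);
for k = 1:K
    [B, ~] = qr(randn(D, d), 0);               % D x d orthonormal basis
    c = offsetScale * randn(D, 1);             % nonzero offset: affine, not linear
    coeffs = 2 * rand(d, nPerSubspace) - 1;    % coordinates uniform in [-1, 1]^d
    idx = (k - 1) * nPerSubspace + (1:nPerSubspace);
    X(:, idx) = B * coeffs + repmat(c, 1, nPerSubspace) ...
                + noiseStd * randn(D, nPerSubspace);
    labels(idx) = k;
end
```

Setting c to zero gives linear subspaces, which all pass through the origin and therefore always intersect.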
Other Data and Data repositories
 Clustering datasets at UCI Repository
 Complete UCI Machine Learning Repository
 Yale Face Database B
 Some processed face datasets saved as Matlab data can be found here. Two matrices, X and Y, are included. If you plot Y(1:3,:) you will see three clearly separated clusters:
the first 64 points form one cluster, the next 64 points another, and so on. The original files are on the Yale Face Database B webpage (above), in the folders
yaleB5_P00, yaleB8_P00, and yaleB10_P00. They have been processed following the steps described in Section 4.2.2 of the
following paper. The Matlab code used for processing them is here.
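
A small sketch of how the cluster labels could be rebuilt from this ordering, assuming the columns of Y are the points and come in consecutive blocks of 64 as described. The placeholder matrix only keeps the sketch runnable without the data file and is not the face data.

```matlab
% Assumes the .mat file has been loaded, so that Y already exists.
if ~exist('Y', 'var')
    Y = randn(3, 192);       % placeholder only, NOT the real face data
end
n = size(Y, 2);              % number of points (one per column, by assumption)
labels = ceil((1:n) / 64);   % points 1-64 -> 1, 65-128 -> 2, and so on
```

Type "scatter3(Y(1,:),Y(2,:),Y(3,:),10,labels,'filled');" to color the points by cluster.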
 Here is an example of spectral clustering data. It contains points from 2 noisy circles: after loading the `.mat' file, type
"plot(X(:,1),X(:,2),'.');" to see them. You can embed them into 2D space for clustering with EmbedCircles.m. Note that changing sigma
in this file leads to different clustering problems.
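
For orientation, here is a minimal sketch of one common way to build such an embedding: Gaussian affinities, symmetric normalization, and the top two eigenvectors with rows normalized. It is not EmbedCircles.m, which may work differently. The synthetic circles, the radii, the noise level, and the value of sigma are all assumptions made for illustration.

```matlab
% Sketch: two noisy concentric circles and a 2D spectral embedding.
% NOT EmbedCircles.m; radii, noise level, and sigma are illustrative.
n = 200;                                   % points per circle
radii = [1 3];
noiseStd = 0.05;
sigma = 0.3;                               % width of the Gaussian kernel
N = 2 * n;

theta = 2 * pi * rand(N, 1);
r = [radii(1) * ones(n, 1); radii(2) * ones(n, 1)];
X = [r .* cos(theta), r .* sin(theta)] + noiseStd * randn(N, 2);

% Pairwise squared distances and Gaussian affinities
sq = sum(X .^ 2, 2);
D2 = max(repmat(sq, 1, N) + repmat(sq', N, 1) - 2 * (X * X'), 0);
W = exp(-D2 / (2 * sigma ^ 2));
W(1:(N + 1):end) = 0;                      % zero the diagonal

% Symmetric normalization: S = D^(-1/2) * W * D^(-1/2)
dinv = 1 ./ sqrt(sum(W, 2));
S = W .* (dinv * dinv');
S = (S + S') / 2;                          % remove round-off asymmetry

% Top two eigenvectors, then normalize each row to unit length
[V, E] = eig(S);
[~, order] = sort(diag(E), 'descend');
Y = V(:, order(1:2));
Y = Y ./ repmat(sqrt(sum(Y .^ 2, 2)), 1, 2);
```

Type "plot(Y(:,1),Y(:,2),'.');" to see the embedded points. In this sketch sigma is small compared with the gap between the circles, so the two circles map to two well-separated groups; raising sigma toward the size of the gap couples the circles and makes the clustering problem harder.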
