Agenda
Problem Statement
Pre-process
Algorithm Selection
Experiment
Conclusions
QING Pei edwardtoday@gmail.com
March 6, 2013 @ PolyU BRC
Problem Statement
slow regression / learning
huge storage space
noisy due to correlation
Pre-process
Calculate Correlation Coefficient
Remove Most Correlated Column
max column sum in the correlation matrix -> most correlated column
snapshot the surviving feature set after every 10 removals
93 feature sets for the experiment (see the sketch below)
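The pruning loop could look like the following minimal NumPy sketch; the function and parameter names are hypothetical, since the deck does not show its actual code:

```python
import numpy as np

def prune_correlated_columns(X, n_remove, snapshot_every=10):
    """Iteratively drop the most correlated column, snapshotting periodically."""
    keep = list(range(X.shape[1]))      # indices of surviving columns
    snapshots = []
    for i in range(1, n_remove + 1):
        # correlation matrix of the surviving columns (columns = variables)
        corr = np.abs(np.corrcoef(X[:, keep], rowvar=False))
        np.fill_diagonal(corr, 0.0)     # ignore self-correlation
        worst = int(np.argmax(corr.sum(axis=0)))  # max column sum -> most correlated
        del keep[worst]
        if i % snapshot_every == 0:     # snapshot after every 10 removals
            snapshots.append(list(keep))
    return snapshots
```

For example, `prune_correlated_columns(X, 930)` would yield 93 snapshots, consistent with the 93 feature sets above, assuming the original feature vector is long enough.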
Algorithm Selection
Regression
Machine Learning
Neural Network
Probabilistic Graphical Model
Boosting
Meta Algorithms
Regression
Logistic Regression
Classification and Regression Tree
ADTree
SimpleCART
Machine Learning
Naive Bayes
Support Vector Machine (SVM)
SMO (Sequential Minimal Optimization)
Kernel Estimation
Neural Network
Multilayer Perceptron
Radial Basis Function Network
Probabilistic Graphical Model
Bayesian Network
Hidden Markov Model
Boosting
AdaBoostM1
little accuracy gain (96.07% to 96.42%)
considerably slower than the unboosted base classifier
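For illustration only: a hedged scikit-learn sketch of the same trade-off, with `AdaBoostClassifier` standing in for Weka's AdaBoostM1 and synthetic data standing in for the real feature sets (both are assumptions, not the original setup):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# hypothetical synthetic data standing in for the deck's real feature sets
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.1, random_state=0)

stump = DecisionTreeClassifier(max_depth=1)
boosted = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1), n_estimators=50)
for clf in (stump, boosted):
    clf.fit(X_tr, y_tr)  # fitting 50 boosted stumps is considerably slower
    print(type(clf).__name__, clf.score(X_te, y_te))
```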
Meta Algorithms
Bagging
Experiment
Ten labeled train/test set pairs are generated.
Train a classifier on a training set containing 90% of the labeled input data. Test the classifier on the remaining 10%, which forms the test set.
Repeat step 2 with the 2nd through 10th train/test set pairs.
Take the average performance as the output.
Repeating the evaluation over ten different splits reduces variance and makes the comparison statistically more reliable.
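This protocol is ordinary 10-fold cross-validation. A minimal sketch, assuming scikit-learn-style classifiers (the original experiments appear to use Weka, so this is illustrative only):

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.naive_bayes import GaussianNB

def cross_validate_accuracy(X, y, make_clf=GaussianNB, n_folds=10):
    """Average accuracy over n_folds 90%/10% train/test splits."""
    scores = []
    for train_idx, test_idx in StratifiedKFold(n_splits=n_folds).split(X, y):
        clf = make_clf().fit(X[train_idx], y[train_idx])    # train on 90%
        scores.append(clf.score(X[test_idx], y[test_idx]))  # test on held-out 10%
    return float(np.mean(scores))                           # average as the output
```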
Accuracy
Training Time
Testing Time
Model Size
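One plausible way to record these four metrics per run is sketched below; timing with `time.perf_counter` and measuring model size via `pickle` serialization are assumptions, not the deck's stated methodology:

```python
import pickle
import time

def measure(clf, X_tr, y_tr, X_te, y_te):
    """Return (accuracy, training time, testing time, model size in bytes)."""
    t0 = time.perf_counter()
    clf.fit(X_tr, y_tr)
    train_time = time.perf_counter() - t0      # Training Time (s)
    t0 = time.perf_counter()
    accuracy = clf.score(X_te, y_te)
    test_time = time.perf_counter() - t0       # Testing Time (s)
    model_size = len(pickle.dumps(clf))        # Model Size (B), via serialization
    return accuracy, train_time, test_time, model_size
```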
Bayesian algorithms are vulnerable to noise: they lose a little accuracy once the feature vector length exceeds 400.
Some machine learning algorithms, such as SVM, are largely insensitive to the added noise: accuracy does not drop when many noisy features are included, but it does not go any higher either.
The highest accuracy (97%) is achieved by BayesNet with a very short feature vector length of around 70.
Most algorithms peak at a feature vector length in [250, 300]. Several tree-based algorithms improve sharply from 260 to 270.
Conclusions
BayesNet provides the highest accuracy.
Tree-based algorithms are fast; they are optimal with a 270-dimensional feature vector.
AI algorithms (abandoned during the algorithm selection phase) are an order of magnitude slower; they do not, however, provide lower error.
Wrongly classified cases will be inspected to find outliers or to improve the current models.
| Algorithm | Feature Vector Length | Accuracy (%) | Training Time (s) | Testing Time (s) | Model Size (B) |
|---|---|---|---|---|---|
| BayesNet | 70 | 97 | 0.03 | 0 | 5.2e5 |
| Bagging | 270 | 96.07 | 1.02 | 0 | 4.7e4 |
| SMO w/ PolyKernel | 440 | 96.9 | 0.67 | 0 | 2.8e6 |
Slideshow created using remark.