Machine Learning: Classification (Part D)
Note: the math formulas render incorrectly in this page's renderer; this has not been fixed yet.
Dataset: [face.zip](https://github.com/hhgw/hhgw.github.io/tree/main/zip)
This week you will train a classifier to detect whether there is a face in a small image patch. This type of face detector is used in your phone and camera whenever you take a picture!
First we need to initialize Python. Run the below cell.
```python
%matplotlib inline
```
Loading Data and Pre-processing
Next we need to load the images. Download faces.zip and put it in the same directory as this ipynb file. Do not unzip the file. Then run the following cell to load the images.
```python
imgdata = {'train':[], 'test':[]}
```
1745
944
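For reference, the loading cell is only partially shown above; a minimal sketch of reading the 19x19 grayscale patches directly from the zip archive might look like the following (the internal folder layout of faces.zip is an assumption):

```python
import zipfile
from io import BytesIO

import numpy as np
from PIL import Image

imgdata = {'train': [], 'test': []}
with zipfile.ZipFile('faces.zip') as zf:
    for name in zf.namelist():
        # hypothetical layout: file paths inside the archive contain 'train' or 'test'
        for split in ('train', 'test'):
            if split in name and name.endswith('.png'):
                with zf.open(name) as f:
                    img = np.asarray(Image.open(BytesIO(f.read())).convert('L'))
                imgdata[split].append(img)

print(len(imgdata['train']))
print(len(imgdata['test']))
```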
Each image is a 19x19 array of pixel values. Run the below code to show an example:
```python
print(img.shape)
```
(19, 19)
Run the below code to show more images!
```python
# function to make an image montage
```
Each image is a 2d array, but the classifier algorithms work on 1d vectors. Run the following code to convert all the images into 1d vectors by flattening. The result should be a matrix where each row is a flattened image.
```python
trainX = empty((len(imgdata['train']), prod(imgsize)))
```
(1745, 361)
(1745,)
(944, 361)
(944,)
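The flattening can be written with a simple reshape; a minimal sketch (assuming imgdata and the 19x19 image size from above; the trainY/testY label vectors are built elsewhere in the notebook):

```python
import numpy as np

imgsize = (19, 19)
# flatten each 19x19 image into a 361-dimensional row vector
trainX = np.stack([img.ravel() for img in imgdata['train']])   # shape (1745, 361)
testX  = np.stack([img.ravel() for img in imgdata['test']])    # shape (944, 361)
print(trainX.shape)
print(testX.shape)
```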
Detection using pixel values
Train AdaBoost and GradientBoosting classifiers to classify an image patch as face or non-face. Also train a kernel SVM classifier using either an RBF or a polynomial kernel, and a Random Forest classifier. Evaluate all of your classifiers on the test set.
First we will normalize the features.
```python
from sklearn import preprocessing
```
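A minimal sketch of the normalization step, assuming a StandardScaler fit on the training set only (the variable names trainXn and testXn are placeholders):

```python
from sklearn import preprocessing

# fit the scaler on the training features and reuse its statistics for the test set
scaler = preprocessing.StandardScaler().fit(trainX)
trainXn = scaler.transform(trainX)
testXn = scaler.transform(testX)
```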
```python
# -------------------- AdaBoost classifier (grid search over learning_rate and n_estimators)
```
Fitting 5 folds for each of 88 candidates, totalling 440 fits
(verbose per-fold cross-validation log truncated)
best params: {'learning_rate': 1.0, 'n_estimators': 1000}
best score: 0.954727793696275
test accuracy = 0.6260593220338984
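For reference, a hedged sketch of the kind of grid search that produces the log above; the exact grid values are assumptions (the log only shows that 88 combinations of learning_rate and n_estimators were searched with 5-fold cross-validation), and trainXn/trainY/testXn/testY follow the placeholder names from the sketches above:

```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import GridSearchCV

# assumed search grid; the actual notebook may use different values
param_grid = {
    'learning_rate': [1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1.0],
    'n_estimators': [1, 2, 5, 10, 50, 100, 500, 1000, 2000],
}
ada_cv = GridSearchCV(AdaBoostClassifier(), param_grid, cv=5, n_jobs=-1, verbose=10)
ada_cv.fit(trainXn, trainY)
print('best params:', ada_cv.best_params_)
print('best score:', ada_cv.best_score_)
print('test accuracy =', ada_cv.score(testXn, testY))
```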
```python
# -------------------- XGBoost classifier (grid search over learning_rate and n_estimators)
```
Fitting 3 folds for each of 96 candidates, totalling 288 fits
/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/xgboost/sklearn.py:1395: UserWarning: `use_label_encoder` is deprecated in 1.7.0.
(verbose per-fold cross-validation log truncated)
best params: {'learning_rate': 0.2682695795279725, 'n_estimators': 100}
best score: 0.9392858621525869
test accuracy = 0.6610169491525424
```python
# -------------------- kernel SVM classifier (grid search over C and gamma)
```
Fitting 5 folds for each of 400 candidates, totalling 2000 fits
(verbose per-fold cross-validation log truncated)
best params: {'C': 1.1288378916846888, 'gamma': 0.018329807108324356}
best score: 0.9679083094555875
test accuracy = 0.6610169491525424
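A similar sketch for the RBF-kernel SVM search; the 20x20 logarithmic grid over C and gamma is inferred from the 400 candidates and the parameter values that appear in the log, so treat it as an assumption:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

param_grid = {
    'C': np.logspace(-1, 3, 20),      # roughly 0.1 ... 1000
    'gamma': np.logspace(-3, 3, 20),  # roughly 0.001 ... 1000
}
svm_cv = GridSearchCV(SVC(kernel='rbf'), param_grid, cv=5, n_jobs=-1, verbose=10)
svm_cv.fit(trainXn, trainY)
print('best params:', svm_cv.best_params_)
print('best score:', svm_cv.best_score_)
print('test accuracy =', svm_cv.score(testXn, testY))
```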
```python
# -------------------- Random Forest classifier (randomized search over tree hyperparameters)
```
Fitting 5 folds for each of 1000 candidates, totalling 5000 fits
(verbose per-fold cross-validation log truncated)
best params: {'max_depth': 4, 'min_samples_leaf': 0.0020517537799524255, 'min_samples_split': 0.0172701749254065}
best score: 0.9300859598853869
test accuracy = 0.673728813559322
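The random-forest search draws min_samples_leaf and min_samples_split from continuous ranges, which suggests a randomized search; a hedged sketch (the distributions and n_iter=1000 are assumptions chosen to be consistent with the log):

```python
from scipy.stats import randint, uniform
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# assumed sampling distributions for the hyperparameters
param_dist = {
    'max_depth': randint(1, 10),
    'min_samples_leaf': uniform(0.001, 0.499),   # fractions of the training set in (0, 0.5]
    'min_samples_split': uniform(0.001, 0.499),
}
rf_cv = RandomizedSearchCV(RandomForestClassifier(), param_dist, n_iter=1000,
                           cv=5, n_jobs=-1, verbose=10, random_state=0)
rf_cv.fit(trainXn, trainY)
print('best params:', rf_cv.best_params_)
print('best score:', rf_cv.best_score_)
print('test accuracy =', rf_cv.score(testXn, testY))
```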
Which classifier was best?
According to the test accuracy and F1 score (below), the Random Forest classifier works best.
Some of the advantages of the Random Forest classifier include:
- Averaging many decorrelated trees reduces variance, making it less prone to overfitting than a single decision tree.
- It handles high-dimensional data with many features.
- It handles both categorical and continuous features.
- It is relatively robust to outliers and requires little data pre-processing.
However, in this case the advantage of the Random Forest classifier is not decisive: its test accuracy is only slightly higher than the other classifiers', and all of them generalize poorly from the training set to the test set.
Error analysis
Accuracy tells only part of the story about the classifier's performance. We can also look at the different types of errors that the classifier makes:
- True Positive (TP): classifier correctly said face
- True Negative (TN): classifier correctly said non-face
- False Positive (FP): classifier said face, but it was not a face
- False Negative (FN): classifier said non-face, but it was a face
This is summarized in the following table:
|  | Actual: Face | Actual: Non-face |
|---|---|---|
| Prediction: Face | True Positive (TP) | False Positive (FP) |
| Prediction: Non-face | False Negative (FN) | True Negative (TN) |
We can then look at the true positive rate and the false positive rate.
- true positive rate (TPR): proportion of true faces that were correctly detected
- false positive rate (FPR): proportion of non-faces that were mis-classified as faces.
Use the below code to calculate the TPR and FPR of your classifiers.
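The contents of this code cell are not preserved in the export; a minimal sketch of computing the confusion-matrix entries, TPR, and FPR for each trained model might look like this (the dictionary of fitted search objects and the label convention face=1 are assumptions):

```python
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score

# hypothetical collection of the fitted GridSearchCV / RandomizedSearchCV objects
models = {'AdaBoost': ada_cv, 'XGBoost': xgb_cv, 'SVM': svm_cv, 'RandomForest': rf_cv}

for name, clf in models.items():
    predY = clf.predict(testXn)
    tn, fp, fn, tp = confusion_matrix(testY, predY).ravel()   # assumes labels {0: non-face, 1: face}
    print('-' * 36)
    print('model :', name)
    print('Accuracy:', accuracy_score(testY, predY))
    print('F1 score:', f1_score(testY, predY))
    print('Confusion matrix:')
    print('TP=', tp, '| FP=', fp)
    print('TN=', tn, '| FN=', fn)
    print('TPR=', tp / (tp + fn))   # proportion of true faces that were detected
    print('FPR=', fp / (fp + tn))   # proportion of non-faces mis-classified as faces
```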
------------------------------------
model : AdaBoost
------------------------------------
Accuracy: 0.6260593220338984
F1 score: 0.4087102177554438
Confusion matrix:
TP= 122 | FP= 3
TN= 469 | FN= 350
TPR= 0.2584745762711864
FPR= 0.006355932203389831
------------------------------------
model : XGBoost
------------------------------------
Accuracy: 0.6610169491525424
F1 score: 0.4952681388012618
Confusion matrix:
TP= 157 | FP= 5
TN= 467 | FN= 315
TPR= 0.3326271186440678
FPR= 0.01059322033898305
------------------------------------
model : SVM
------------------------------------
Accuracy: 0.6610169491525424
F1 score: 0.48881789137380194
Confusion matrix:
TP= 153 | FP= 1
TN= 471 | FN= 319
TPR= 0.3241525423728814
FPR= 0.00211864406779661
------------------------------------
model : RandomForest
------------------------------------
Accuracy: 0.673728813559322
F1 score: 0.5333333333333334
Confusion matrix:
TP= 176 | FP= 12
TN= 460 | FN= 296
TPR= 0.3728813559322034
FPR= 0.025423728813559324
How does the classifier make errors?
The main source of error is that the classifiers label face images as non-face (many false negatives), while they almost never label non-face images as faces (very few false positives). In short, the classifiers are poor at recognizing faces.
A model with a low TP count and a high TN count is not identifying positive cases well. There are several potential reasons for this:
- Imbalanced data: the training set may contain significantly more negative cases than positive cases, which biases the model towards predicting "non-face", giving a high TN count but a low TP count.
- Inadequate features: the raw pixel features may not be informative enough to distinguish faces from non-faces, leading to poor performance on the positive class.
- Overfitting: the model may fit the training data well but generalize poorly to new, unseen data.
- Model complexity: the model may be too simple or too complex for the problem, again hurting its ability to identify positive cases.
To improve performance, we may need to rebalance the dataset, improve the features, or adjust the model complexity.
Classifier analysis
For the AdaBoost classifier, we can interpret what it is doing by looking at which features it uses most in the weak learners. Use the below code to visualize the pixel features used.
Note: if you used GridSearchCV to train the classifier, then you need to use the best_estimator_ field to access the fitted classifier.
```python
# adaboost classifier
```
<matplotlib.colorbar.Colorbar at 0x7ff1fc72e9d0>
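A sketch of this kind of visualization, assuming the AdaBoost model was selected with GridSearchCV (so the fitted classifier is in best_estimator_) and that feature_importances_ is reshaped back onto the 19x19 pixel grid:

```python
import matplotlib.pyplot as plt

# per-pixel importance of the features used by the boosted weak learners
imp = ada_cv.best_estimator_.feature_importances_
plt.imshow(imp.reshape(imgsize), interpolation='nearest')
plt.colorbar()
plt.title('AdaBoost pixel importances')
```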
Similarly, we can also look at the important features for xgboost.
```python
# xgboost classifier
```
<matplotlib.colorbar.Colorbar at 0x7ff1fe059520>
Similarly for Random Forests, we can look at the important features.
```python
# random forest classifier
```
<matplotlib.colorbar.Colorbar at 0x7ff1fcdcb100>
Comment on which features (pixels) AdaBoost and Random Forests are using
- AdaBoost uses the pixel points around the corners of the image and part of the face contour for classification, while Random Forests uses the nose, eyes and cheeks for classification.
- Random Forests uses different features than AdaBoost because the two algorithms select features differently: AdaBoost greedily picks the most discriminative pixels, here concentrated around the image outline, while Random Forests spreads importance across many informative pixels, which tends to generalize better.
For kernel SVM, we can look at the support vectors to see what the classifier finds difficult.
```python
# svm classifier
```
num support vectors: 664
<matplotlib.image.AxesImage at 0x7ff1fce6cb50>
Comment on anything you notice about what the SVM finds difficult (i.e., on the decision boundary or within the margin)
- High dimensionality: when the number of dimensions is very high, the SVM may find it challenging to locate a hyperplane that separates the classes, because the data becomes sparse and the search space more complex.
- Overfitting: the SVM may fit the training data too closely, producing a decision boundary that does not generalize to new data, so it performs well on the training set but poorly on the test set.
- Noise: if the data contains a significant amount of noise, the SVM may struggle to find a hyperplane that separates the classes accurately, because the noise leads to misclassifications and blurs the separation.
In addition to the above challenges, there are some specific cases where SVM may find it difficult to classify certain images accurately. For example, SVM may struggle to recognize faces with glasses, as the glasses can obscure important facial features. Similarly, images of faces with deep eye sockets may be challenging to classify accurately, as these features can alter the appearance of the face and make it difficult to find a clear separation between the classes.
Custom kernel SVM
Now we will try to use a custom kernel with the SVM. We will consider the following RBF-like kernel based on L1 distance (i.e., cityblock or Manhattan distance),
$$ k(\mathbf{x},\mathbf{y}) = \exp \left(-\alpha \sum_{i=1}^d |x_i-y_i|\right)$$
where $x_i,y_i$ are the elements of the vectors $\mathbf{x},\mathbf{y}$, and $\alpha$ is the hyperparameter. The difference with the RBF kernel is that the new kernel uses the absolute difference rather than the squared difference. Thus, the new kernel does not “drop off” as fast as the RBF kernel using squared distance.
- Implement the new kernel as a custom kernel function. The scipy.spatial.distance.cdist function will be helpful (see the sketch below).
- Train the SVM with the new kernel. To select the hyperparameter $\alpha$, you need to run cross-validation "manually" by: 1) trying different values of $\alpha$ and running cross-validation to select $C$; 2) selecting the $\alpha$ with the highest cross-validation score best_score_ in GridSearchCV.
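A minimal sketch of the custom kernel and the manual cross-validation loop over $\alpha$; the $\alpha$ and $C$ grids are assumptions chosen to be consistent with the output below:

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

def make_l1_rbf_kernel(alpha):
    # k(x, y) = exp(-alpha * sum_i |x_i - y_i|), computed for all pairs of rows
    def kernel(X, Y):
        return np.exp(-alpha * cdist(X, Y, metric='cityblock'))
    return kernel

best = {'score': -np.inf, 'alpha': None, 'model': None}
for alpha in np.logspace(-3, 0, 10):
    svc = SVC(kernel=make_l1_rbf_kernel(alpha))
    cv = GridSearchCV(svc, {'C': np.logspace(-2, 3, 20)}, cv=5)
    cv.fit(trainXn, trainY)
    print('---------------------------')
    print('alpha :', alpha)
    print('best params:', cv.best_params_)
    print('best score:', cv.best_score_)
    if cv.best_score_ > best['score']:
        best = {'score': cv.best_score_, 'alpha': alpha, 'model': cv.best_estimator_}

print('best alpha:', best['alpha'])
print('test accuracy =', best['model'].score(testXn, testY))
```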
```python
from scipy import spatial
```
---------------------------
alpha : 0.001
best params: {'C': 26.366508987303583}
best score: 0.952457843154651
---------------------------
alpha : 0.0021544346900318843
best params: {'C': 7.847599703514607}
best score: 0.9558942692714895
---------------------------
alpha : 0.004641588833612777
best params: {'C': 2.3357214690901213}
best score: 0.9587589434813383
---------------------------
alpha : 0.01
best params: {'C': 2.3357214690901213}
best score: 0.9581871915743484
---------------------------
alpha : 0.021544346900318832
best params: {'C': 1.2742749857031335}
best score: 0.9352766983496085
---------------------------
alpha : 0.046415888336127774
best params: {'C': 1.2742749857031335}
best score: 0.6991549901126352
---------------------------
alpha : 0.1
best params: {'C': 0.01}
best score: 0.6515763594387368
---------------------------
alpha : 0.21544346900318823
best params: {'C': 0.01}
best score: 0.6515763594387368
---------------------------
alpha : 0.46415888336127775
best params: {'C': 0.01}
best score: 0.6515763594387368
---------------------------
alpha : 1.0
best params: {'C': 0.01}
best score: 0.6515763594387368
===============================
best alpha: 0.004641588833612777
test accuracy = 0.6641949152542372
F1 score: 0.49602543720190784
Confusion matrix:
TP= 156 | FP= 1
TN= 471 | FN= 316
TPR= 0.3305084745762712
FPR= 0.00211864406779661
Does using the new kernel improve the results?
Yes, the new kernel improved the results, but only to a very limited extent. However, the program does run faster.
When using a kernel SVM classifier for face detection, the choice of kernel can have a significant impact on the performance of the classifier. A custom kernel based on cityblock distance can have advantages over an RBF-like kernel based on squared difference, depending on the specific characteristics of the data.
Cityblock distance is a metric that measures the distance between two points by summing the absolute differences of their coordinates. This type of distance metric can be useful in face detection because it is robust to differences in lighting and contrast, which can cause pixel values to vary significantly. In contrast, an RBF-like kernel based on squared difference is sensitive to these differences, which can lead to overfitting and poor generalization performance.
The advantage of using a custom kernel based on cityblock distance is that it can better capture the intrinsic structure of the face data, which can lead to improved classification performance. This is particularly true when the face data has significant variations in lighting, contrast, or other factors that can affect pixel values.
Image Feature Extraction
The detection performance is not that good. The problem is that we are using the raw pixel values as features, so it is difficult for the classifier to interpret larger structures of the face that might be important. To fix the problem, we will extract features from the image using a set of filters.
Run the below code to look at the filter output. The filters are a set of black and white boxes that respond to similar structures in the image. After applying the filters to the image, the filter response map is aggregated over a 4x4 window, so each filter produces a 5x5 feature response. Since there are 4 filters, the feature vector has 100 dimensions.
```python
def extract_features(imgs, doplot=False):
```

```python
# new features
```
Now let's extract image features on the training and test sets. It may take a few seconds.
```python
trainXf = extract_features(imgdata['train'])
```
(1745, 100)
(944, 100)
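The extract_features cell is only partially shown above; a hedged sketch of box-filter feature extraction consistent with the description (4 filters, responses pooled over 4x4 windows into a 5x5 map, 100 features per image) could look like this. The specific filters are assumptions, not necessarily the ones used in the assignment:

```python
import numpy as np
from scipy.signal import convolve2d

# four hypothetical black/white box filters (edge- and line-like patterns)
filters = [
    np.vstack([np.ones((2, 4)), -np.ones((2, 4))]),                   # horizontal edge
    np.hstack([np.ones((4, 2)), -np.ones((4, 2))]),                   # vertical edge
    np.vstack([np.ones((1, 4)), -np.ones((2, 4)), np.ones((1, 4))]),  # horizontal line
    np.hstack([np.ones((4, 1)), -np.ones((4, 2)), np.ones((4, 1))]),  # vertical line
]

def extract_features_sketch(imgs):
    feats = []
    for img in imgs:
        fvec = []
        for filt in filters:
            resp = np.abs(convolve2d(img, filt, mode='same'))   # 19x19 response map
            resp = np.pad(resp, ((0, 1), (0, 1)))               # pad to 20x20 for 4x4 pooling
            pooled = resp.reshape(5, 4, 5, 4).sum(axis=(1, 3))  # aggregate over 4x4 windows -> 5x5
            fvec.append(pooled.ravel())
        feats.append(np.concatenate(fvec))                      # 4 filters * 25 = 100 dimensions
    return np.vstack(feats)

print(extract_features_sketch(imgdata['train']).shape)          # (1745, 100)
```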
Detection using Image Features
Now train AdaBoost and SVM classifiers on the image feature data. Evaluate on the test set.
```python
# first scale the features
```

```python
# AdaBoost
```
Fitting 5 folds for each of 88 candidates, totalling 440 fits
(verbose per-fold cross-validation logs for the AdaBoost and SVM searches truncated)
best params: {'C': 1000.0, 'gamma': 0.008858667904100823}
best score: 0.9570200573065903
test accuracy = 0.7372881355932204
Error Analysis
Repeat the error analysis for the new classifiers.
```python
predY_list = []
```
------------------------------------
model : AdaBoost
------------------------------------
Accuracy: 0.715042372881356
F1 score: 0.6184397163120567
Confusion matrix:
TP= 218 | FP= 15
TN= 457 | FN= 254
TPR= 0.461864406779661
FPR= 0.03177966101694915
------------------------------------
model : SVM
------------------------------------
Accuracy: 0.7372881355932204
F1 score: 0.6545961002785515
Confusion matrix:
TP= 235 | FP= 11
TN= 461 | FN= 237
TPR= 0.4978813559322034
FPR= 0.023305084745762712
How has the classifier using image features improved?
Using image features considerably increases the TP count (and correspondingly reduces the FN count) while only slightly reducing the TN count, which means the classifiers now recognize faces much more readily, so both accuracy and F1 score improve.
Training machine learning models on image feature data rather than the original image data can improve the performance of the models for a few reasons:
- Dimensionality reduction: image feature data typically has fewer dimensions than the original image data. This reduces the number of features that the models have to learn from, making them more efficient and less prone to overfitting.
- Noise reduction: image feature data is often pre-processed to remove noise and enhance relevant features. This can make the models more robust to noisy input data.
- Increased generalization: image feature data can capture higher-level information about the images, such as edges, textures, and shapes, which can be more useful for classification than pixel-level information. This can improve the generalization of the models to new, unseen data.
Test image
Now let's try your face detector on a real image. Download the "nasa-small.png" image and put it in the same directory as your ipynb file. The below code will load the image, crop out image patches, and then extract features. (This may take a few minutes.)
```python
fname = "nasa-small.png"
```

```python
# load image
```
(210, 480)
<matplotlib.image.AxesImage at 0x7ff202448580>
```python
# step size for the sliding window
```
(5568, 19, 19)
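A minimal sketch of the sliding-window patch extraction; the step of 4 pixels is inferred from the 5568 patches produced from the 210x480 image, and the image variable name nasaimg is a placeholder:

```python
import numpy as np

step = 4             # sliding-window step in pixels (inferred from the output above)
h, w = 19, 19        # patch size
patches, locs = [], []
for i in range(0, nasaimg.shape[0] - h + 1, step):
    for j in range(0, nasaimg.shape[1] - w + 1, step):
        patches.append(nasaimg[i:i + h, j:j + w])
        locs.append((i, j))
patches = np.asarray(patches)
print(patches.shape)   # (5568, 19, 19) for a 210x480 image
```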
Now predict using your classifier. The extracted features are in newXf, and the scaled features are in newXfn.
```python
newXfn = scalerf.transform(newXf)  # apply scaling to test data
```

```python
# use the SVM model to predict
```
Now we will view the results on the image. Use the below code; prednewY is the vector of predictions.
```python
# reshape prediction to an image
```
(-0.5, 479.5, 209.5, -0.5)
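A sketch of how the detections might be overlaid on the image, assuming prednewY holds one prediction per window of the 48x116 grid implied by the step size above (nasaimg is again a placeholder name):

```python
import numpy as np
import matplotlib.pyplot as plt

predimg = np.asarray(prednewY).reshape(48, 116)   # reshape predictions onto the window grid

plt.imshow(nasaimg, cmap='gray')
# stretch the coarse detection map over the full image and blend it on top
plt.imshow(predimg, alpha=0.4, extent=(-0.5, 479.5, 209.5, -0.5))
plt.axis((-0.5, 479.5, 209.5, -0.5))
```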
How did your face detector do?
- Among the 23 faces, 15 were successfully detected. Badges on clothing and regions with complex background texture were easily mistaken for faces, while faces with less prominent eyebrows were missed.
You can try it on your own images. The faces should all be around 19x19 pixels though. We only used 1/4 of the training data. Try using more data to train it!