In my last computer vision and learning course, I worked on a project to distinguish between two styles of paintings. I decided to use the Adaboost algorithm [1]. Below I describe the steps and the code needed to make the algorithm run.

Step 0. The binary classification

This is not really a step, but keep in mind that this algorithm only separates two classes: for example, ones from zeros, faces from non-faces or, in my case, baroque from renaissance paintings.

Step 1. Prepare the files.

There are several ways of feeding the samples to the algorithm. I found that the easiest way was to use simple csv files. Also, you DO NOT have to worry about dividing the samples into training and testing sets. Just put them all in the same file; OpenCV will divide the set, picking the training/testing samples automatically. It is still a good idea to put all the samples of the first class at the beginning and those of the second class at the end.

The format is very simple. The first column is the category (you can specify a different column if your file does not follow this format). The rest of the columns are the features of your problem. For example, I could have used three features, each of them representing the average of red, blue and green per pixel in the image. My csv file would then look like this. Note that I am using a character in the first column. I recommend doing that so OpenCV recognizes it as a category (again, you could specify explicitly that this is a category and not a number).

```
B,124.34,45.4,12.4
B,64.14,45.23,3.23
B,42.32,125.41,23.8
R,224.4,35.34,163.87
R,14.55,12.423,89.67
...
```

NOTE: For some strange reason the OpenCV implementation does not work with fewer than 11 samples, so this file should have at least 11 rows. Add some more to be safe, and because you will need a testing set as well.
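If your features come out of your own extraction code, writing rows in this layout takes only a few lines. The following is a minimal sketch in plain C++; `formatRow` is a hypothetical helper of mine, not part of OpenCV, and it assumes a single-character category followed by numeric features.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of OpenCV): builds one csv row with the
// category letter in the first column followed by the feature values.
std::string formatRow(char category, const std::vector<double>& features) {
    std::ostringstream row;
    row << category;
    for (double f : features) {
        row << ',' << f;  // default stream formatting, 6 significant digits
    }
    return row.str();
}
```

Calling `formatRow('B', features)` once per image and writing each result on its own line gives a file in the format shown above.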

Step 2. Opening the file

Let's suppose that the file is called "samples.csv". This would be the code:

```
//1. Declare a structure to keep the data
CvMLData cvml;
//2. Read the file
cvml.read_csv("samples.csv");
//3. Indicate which column is the response
cvml.set_response_idx(0);
```

Step 3. Splitting the samples

Let's suppose that our file has 100 rows. This code would select 40 for the training.

```
//1. Select 40 for the training
CvTrainTestSplit cvtts(40, true);
//2. Assign the division to the data
cvml.set_train_test_split(&cvtts);
```

Step 4. The training process

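Let's suppose that I have 1000 features (the columns in the csv after the response) and that I want the algorithm to use just 100 of them. The second parameter of CvBoostParams in the next code is the number of weak classifiers (weak_count); with a maximum depth of 1 (the fourth parameter), each weak classifier is a stump that tests a single feature, so at most 100 features end up being used.

```
//1. Declare the classifier
CvBoost boost;
//2. Train it with 100 weak classifiers
boost.train(&cvml, CvBoostParams(CvBoost::REAL, 100, 0, 1, false, 0), false);
```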

The description of each of the arguments can be found in the OpenCV documentation of CvBoostParams.
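If you wonder what happens inside the training, the sketch below shows the textbook discrete AdaBoost with decision stumps in plain C++. Take it as an assumption-laden illustration: it is not OpenCV's code, the call above uses the REAL variant while this one is the simpler discrete variant, and `Stump`, `trainAdaBoost` and `predict` are names I made up. Labels are assumed to be -1 and +1.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// One weak classifier: a threshold on a single feature.
struct Stump {
    std::size_t feature;  // column used by this stump
    double threshold;     // split value
    int polarity;         // +1: predict +1 when value > threshold; -1: flipped
    double alpha;         // vote weight of this stump in the final classifier
};

int stumpPredict(const Stump& s, const std::vector<double>& x) {
    return (x[s.feature] > s.threshold) ? s.polarity : -s.polarity;
}

// Discrete AdaBoost sketch. X holds one feature vector per sample, y holds
// labels in {-1, +1}. Returns up to weakCount stumps.
std::vector<Stump> trainAdaBoost(const std::vector<std::vector<double>>& X,
                                 const std::vector<int>& y, int weakCount) {
    const std::size_t n = X.size(), d = X[0].size();
    std::vector<double> w(n, 1.0 / n);  // sample weights, uniform at start
    std::vector<Stump> model;
    for (int round = 0; round < weakCount; ++round) {
        Stump best{0, 0.0, 1, 0.0};
        double bestErr = std::numeric_limits<double>::max();
        // Try every feature, every observed value as threshold, both polarities.
        for (std::size_t f = 0; f < d; ++f) {
            for (std::size_t t = 0; t < n; ++t) {
                for (int pol : {1, -1}) {
                    Stump s{f, X[t][f], pol, 0.0};
                    double err = 0.0;  // weighted error of this candidate
                    for (std::size_t i = 0; i < n; ++i)
                        if (stumpPredict(s, X[i]) != y[i]) err += w[i];
                    if (err < bestErr) { bestErr = err; best = s; }
                }
            }
        }
        if (bestErr >= 0.5) break;  // no stump beats chance, stop early
        const double eps = 1e-10;   // keeps alpha finite for a perfect stump
        best.alpha = 0.5 * std::log((1.0 - bestErr + eps) / (bestErr + eps));
        model.push_back(best);
        // Re-weight: misclassified samples gain weight, then normalise.
        double sum = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            w[i] *= std::exp(-best.alpha * y[i] * stumpPredict(best, X[i]));
            sum += w[i];
        }
        for (double& wi : w) wi /= sum;
    }
    return model;
}

// Strong classifier: sign of the alpha-weighted vote of all stumps.
int predict(const std::vector<Stump>& model, const std::vector<double>& x) {
    double score = 0.0;
    for (const Stump& s : model) score += s.alpha * stumpPredict(s, x);
    return score >= 0.0 ? 1 : -1;
}
```

Each round picks the single feature and threshold with the lowest weighted error, which is why the number of weak classifiers limits how many features the final classifier can look at. The exhaustive search here is deliberately naive and only suitable for toy data.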

Step 5. Calculating the testing and training error

The error measures the misclassified samples. It can be computed on two sets: the training set and the testing set.

```
// 1. Declare a couple of vectors to save the predictions of each sample
std::vector<float> train_responses, test_responses;
// 2. Calculate the training error
float fl1 = boost.calc_error(&cvml,CV_TRAIN_ERROR,&train_responses);
// 3. Calculate the test error
float fl2 = boost.calc_error(&cvml,CV_TEST_ERROR,&test_responses);
```

Note that the responses for each sample are saved in the train_responses and test_responses vectors. This is very useful to calculate the confusion matrix (true positives, false positives, true negatives and false negatives) and ROC curves. I'll be posting how to build them with R.
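In the meantime, counting the four cells of a confusion matrix from a vector of predictions is straightforward. The sketch below is plain C++ with hypothetical names (`Confusion`, `confusionMatrix`). It assumes you have collected the true labels yourself, in the same order and with the same numeric encoding as the predictions; how OpenCV encodes the category letters is not covered here.

```cpp
#include <cstddef>
#include <vector>

// Counts for a two-class problem.
struct Confusion {
    int tp = 0, fp = 0, tn = 0, fn = 0;
};

// Hypothetical helper: compares predicted labels against the true ones.
// `positive` is the label treated as the positive class. Both vectors must
// have the same length and the same sample order.
Confusion confusionMatrix(const std::vector<float>& predicted,
                          const std::vector<float>& actual,
                          float positive) {
    Confusion c;
    for (std::size_t i = 0; i < predicted.size(); ++i) {
        const bool predPos = predicted[i] == positive;
        const bool realPos = actual[i] == positive;
        if (predPos && realPos) ++c.tp;        // correctly flagged
        else if (predPos && !realPos) ++c.fp;  // false alarm
        else if (!predPos && realPos) ++c.fn;  // missed
        else ++c.tn;                           // correctly rejected
    }
    return c;
}
```

The misclassified samples are then `fp + fn`, and a ROC curve needs these counts repeated for several decision thresholds.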

Step 6. Save your classifier

You probably won't mind at the beginning, when it takes a few seconds to train something, but you definitely don't want to lose the classifier after waiting a couple of hours or days for the results:

```
// Save the trained classifier
boost.save("./trained_boost.xml", "boost");
```

Step 7. Compiling the whole code

The whole code is pasted at the end. To compile it, use this:

``g++ -ggdb `pkg-config --cflags opencv` -o `basename main` main.cpp `pkg-config --libs opencv`; ``

Here is the file with my code.

```
// main.cpp
#include <cstdio>
#include <cstdlib>
#include "opencv/cv.h"
#include "opencv/ml.h"
#include <vector>

using namespace std;
using namespace cv;
int main(int argc, char** argv) {

/* STEP 2. Opening the file */
//1. Declare a structure to keep the data
CvMLData cvml;
//2. Read the file
cvml.read_csv("samples.csv");
//3. Indicate which column is the response
cvml.set_response_idx(0);

/* STEP 3. Splitting the samples */
//1. Select 40 for the training
CvTrainTestSplit cvtts(40, true);
//2. Assign the division to the data
cvml.set_train_test_split(&cvtts);
printf("Training ... ");

/* STEP 4. The training */
//1. Declare the classifier
CvBoost boost;
//2. Train it with 100 weak classifiers
boost.train(&cvml, CvBoostParams(CvBoost::REAL, 100, 0, 1, false, 0), false);

/* STEP 5. Calculating the testing and training error */
// 1. Declare a couple of vectors to save the predictions of each sample
std::vector<float> train_responses, test_responses;
// 2. Calculate the training error
float fl1 = boost.calc_error(&cvml,CV_TRAIN_ERROR,&train_responses);
// 3. Calculate the test error
float fl2 = boost.calc_error(&cvml,CV_TEST_ERROR,&test_responses);
printf("Error train %f \n", fl1);
printf("Error test %f \n", fl2);

/* STEP 6. Save your classifier */
// Save the trained classifier
boost.save("./trained_boost.xml", "boost");

return EXIT_SUCCESS;
}
```

## 33 thoughts on “Adaboost on OpenCV 2.3”

1. Hi,

I came across your blog and saw your post on adaboost and opencv. I cannot seem to find any documentation on how adaboost works on opencv. For example, how are the classifiers trained?

Does cvboost take each feature and train a classifier? So, for example, if you have 10 features, does cvboost train 10 classifiers based on each single feature? And from there, it adds the classifier with the lowest error into a “strong classifier”?

Or does it say take all the 10 features, train 100 classifiers (using all of the 10 features) and boost from the pool of 100 classifiers? Any idea?

Thanks!

• Sorry for the super late reply. I struggled for a year with spam in my blog and finally got the time to fix it.

As far as I know, the algorithm is based on the implementation of Viola and Jones, “Rapid Object Detection using a Boosted Cascade of Simple Features”.

You should have as many features as you can. Let’s say a lot of features (thousands or more). A classifier is as simple as feature23>rand_value().

If you just have 10 features, adaboost is not your option. If you are working with images, for example, you can generate all sorts of features: histograms of colors, edges, etc… Also split the image in nxn subregions and generate histograms on the subregions.
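To make the subregion idea concrete, here is a rough sketch in plain C++. It assumes a grayscale image stored row by row in a vector with values from 0 to 255; `gridHistogram` and its parameters are hypothetical names and not an OpenCV function.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: splits a grayscale image (row-major, values 0..255)
// into grid x grid cells and concatenates one `bins`-bin histogram per cell.
std::vector<float> gridHistogram(const std::vector<unsigned char>& pixels,
                                 int width, int height, int grid, int bins) {
    std::vector<float> features(static_cast<std::size_t>(grid) * grid * bins, 0.0f);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const int cellX = x * grid / width;                  // 0..grid-1
            const int cellY = y * grid / height;                 // 0..grid-1
            const int bin = pixels[y * width + x] * bins / 256;  // 0..bins-1
            ++features[(cellY * grid + cellX) * bins + bin];
        }
    }
    return features;
}
```

A 4x4 grid with 16 bins already gives 256 features per image from intensity alone, and the same loop can be repeated per color channel or on an edge map.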

2. Hi Roberto,

first of all thanks for your great guide!
Secondly I would ask you: if you want more robust feature to extract (for example you want extract SIFT keypoint) how could you use Adaboost with this new feature?

The problem is that SIFT keypoint (or almost any other keypoints) need to be matched with some sort of distance ( keypoint descriptors aren’t equal from training and the image of testing ). How would you solve this problem?

• Sorry for the very late reply. I had problems with spam and it was not until today that I realized people were asking about this. I would have loved to help at the right moment.

As long as the features can be represented as an array, you should be able to use adaboost. I barely remember SIFT, but I recall that it detects important local features.

The answer depends pretty much on which problem you are trying to solve, and there are many ways of organizing the keypoints. To give you a few examples:
1. Assuming that you are detecting objects that are more or less in the same orientation and same scale, grab the top 10 keypoints
2. Count the number of keypoints detected
3. There is usually a value associated with the keypoint. This value is compared against a threshold to decide whether it is a keypoint or not. You can use the values of the n most important keypoints.
4. You can include the distances between the n most important keypoints

So, there are 2 fundamental rules. First, keep the vector the same size, using dummy values in case you don’t have all of them. Second, each position of the vector has to hold the same kind of information.
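The first rule could be sketched like this in plain C++. `fixedLength` is a hypothetical helper, and the dummy value of 0 is just an assumption; pick whatever value cannot be confused with real data in your problem.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns exactly `length` values, dropping extra
// entries and filling missing ones with `dummy`.
std::vector<float> fixedLength(const std::vector<float>& values,
                               std::size_t length, float dummy = 0.0f) {
    std::vector<float> out(values);
    out.resize(length, dummy);  // truncates or pads as needed
    return out;
}
```

For example, an image with only 3 detected keypoints still produces a 10-value row when called with `length` set to 10, so every row of the csv keeps the same number of columns.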

But, much more important:

Theoretically you don’t need robust features with Adaboost. That’s why we like it, and that’s why it is so fast and able to work in real time (for example, face detection in video cameras).

So, unless you are solving a very hard detection problem such as my case (styles of paintings), you don’t really need robust features that are expensive to calculate.

I hope I helped a bit. It has been quite a while since I stopped working with this kind of thing.

Sorry again for the late reply,
Roberto

3. Hey!

I’m in the process of creating an application to detect humans in images of urban setting. I am implementing an algorithm that computes various channels of an image to extract features from them. ( http://www.loni.ucla.edu/~ztu/publication/dollarBMVC09ChnFtrs_0.pdf )

I have yet to train a classifier, but your tutorial made me curious. I’m computing 10 channels and then using a sliding window where I calculate the local sum of pixels over that region. Since I’m computing 10 channels, I get 10 features per detection window. Is that not enough? Should I be extracting more features? If so, what other features?

Other question: you explain how to train an adaboost classifier, but how do you then use it to actually classify something?

Regards!

• PS: I tried with a dataset and I’m getting

Error train 19.799999
Error test 19.873817

What is the meaning of these values? Are they percentages?

• It is definitely not enough. Adaboost is made to work with many, many simple features. I did a summary of all the posts related to the features I used:

Use as many as your computer is able to process. That is the wonderful thing about Adaboost. At some point I was using 10000+ features with 4500+ samples on my humble laptop.

I'll try to help you later on how to actually use the trained data, because I have no idea right now. That was last year :S..

Cheers and good luck

4. I actually tried a stupid basic example so that I could make sense of those errors.

My csv file looks like this

P,1
N,0
N,0
P,1
N,0
N,0
P,1
N,0
P,1
P,1
N,0
P,0

And I train this for 11 lines for only 1 feature (obviously), leaving just the last (12) for testing. Note P always has 1 and N is always 0 in the first eleven lines, but in the 12th P has 0 associated, so I am expecting to get 100% error.

Here is the output:

Error train 9.090909
Error test 0.000000

Can you explain this to me?

• The “error train” is the error on the training process: it reflects the share of training samples that are classified incorrectly (false positives + false negatives – http://en.wikipedia.org/wiki/Confusion_matrix).

I am pretty sure that the train method randomizes the selection of the training set and the testing set. This means that the last row, which you expect to be in the test set, is actually in the training set.

As far as I remember, the random seed (http://en.wikipedia.org/wiki/Random_seed) is always the same unless you specify it differently (I don’t remember how). Since the random seed is always the same, the result of training/testing should always be the same.

I'll continue replying to your questions later, and my apologies because I had the blog a bit abandoned.

• Thanks for the help! I was suspecting that the algorithm was randomizing the data, so the result actually makes sense.

Even with only 10 features I get an error of 19% on the test data, which means that I am certainly going in the right direction. I’ll have to get a large pool of features and do some more tests

• The paper you share actually gives you ideas of the numbers.

“Training a boosted classifier with 1000 weak classifiers given 20,000 training windows and 5,000 features”

You should read the original article that started the Adaboost fever (if u haven’t done so). A major breakthrough in computer vision.
Rapid Object Detection using a Boosted Cascade of Simple Features

Very fun to read and gives you an idea of the weak classifiers Adaboost is supposed to work with.

• I’ll read it , thank you for the reference.

In the paper they generate features randomly and thats what I just implemented, now I need to see if I get better results.

One of the things I don’t understand is what is the difference between a feature and a weak classifier.

• Features are simply the values that you extract from the image. Given a feature f and a generated threshold t, a classifier would be something very similar to (assuming face recognition):

if f < t {
continue_to_next_weak_classifier()
} else {
return "this is not a face"
}

5. Sorry for bothering you again.

I managed to train the classifier and learned how to load it and use it, but the results I am getting just don’t make any sense.

When I train the classifier with 4000 samples and 100 features, the test error is around 5%, which tells me I am in a good direction.

To use it on new data I do the following:

CvBoost boost;

Mat Test;

for(uint n=0;n<WindowFtrs.size();n++)
{

Test=Mat::zeros(1,NRFEATURE,CV_32FC1);

for(uint i=0; i<NRFEATURE; i++)
{

Test.at<float>(0,i)=WindowFtrs[n][i]; //Putting data in cv::Mat

}
float x = boost.predict(Test,Mat(),Range::all(),false,false);
cout<<x;

}

x always outputs the number 2. No matter what image I use, it outputs 2 100% of the times which is extremely weird.

One thing that is bothering me is that Test has to have the same number of columns as the samples I used to train. The thing is, the data used to train has 100 columns + 1 response, and if I try to run the classifier with only 100 features it throws an exception saying that the sizes don't match. If I run the classifier with 101 features (which is absolutely arbitrary) it works, but the results don't make any sense.

Can you help me with this? Thanks in advance!

Regards

• I was trying to see what was the problem but I couldn’t figure it out. I am curious. What was the problem?

• I just changed the class label from letters to numbers (1 and 2) and it started working.

Do you, by any chance, know how to build a decision cascade given a large labeled data set ? I can train a boosted classifier using your tutorial, but what I really wanted was a set of boosted classifiers with growing complexity in order to make efficient classification.

I cant use openCV’s implementation because I have to test my own algorithm, and not the Viola and Jones implementation

• Thanks for the info.

That really sounds complicated. I mean, if you have to program the whole thing. I guess it is part of a project. Sorry, I cannot help you there. I don’t have that deep understanding.

I can tell you though, that the 4th parameter of CvBoostParams(…) is called max_depth, which controls the depth of the decision tree… according to the opencv documentation:

max_depth – The maximum possible depth of the tree. That is the training algorithms attempts to split a node while its depth is less than max_depth. The actual depth may be smaller if the other termination criteria are met (see the outline of the training procedure in the beginning of the section), and/or if the tree is pruned.

That might mean that there is already something done. When I tried this parameter I had an unstable behaviour in the sense that the quality of the training depended on the number of iterations.

I hope I helped a bit.

• Hi Median and Roberto,
I encounter the same problem as Median ‘s.
But when I change the class label from letters to numbers (1 and 2), then train the model again. It did not work and throws an exception.
Could you have any suggestions? Thanks a lot!!

• Hi,I also encounter the same problem as Median ‘s.
And I followed what Median did, but I received “OpenCV Error:The function/feature is not implemented (Boosted trees can only be used for 2-class classification.)”
I run my classifier with 11 features and 100 samples.
Can you help me with this? Thanks!

6. Pingback: Adaboost on OpenCV 2.3

7. Incredibly useful post!! This description should be adopted in OpenCV documentation.
I once tried to implement an SVM classifier using OpenCV but failed. The reason was the many tiny uncertainties and unclear formulations in the OpenCV specs. Your example instantly made everything clear.
In particular, loading, training and storing a classifier seems to be much easier than I had assumed.
Many thanks!

• You are welcome. I agree that it was confusing when I went to the documentation. I remember starting to use an XOR training example to understand the parameters. I am very glad this post helped, but it is a shame that the documentation hasn’t been improved by then. On the other hand, OpenCV is a huge project.

8. Dear Roberto
I have a csv file which represents an image (the csv contains numbers which represent the image – data from a Mat object). I must read that file in my program and make a Mat object from it. For that, I use CvMLData and the read_csv function. When I have a small csv file (so a small image) everything works fine. But if I try to load a big csv (for example 100-200MB) I get: OpenCV Error: One of arguments’ values is out of range (Storage block size is too small to fit the sequence elements) in cvSetSeqBlockSize. Do you have any idea how to solve this problem?

Thanks!
Fabiano.

9. It seems you are having a memory problem, but first rule out a mistake in your spreadsheet by inputting a very small sample of your csv.

I also suggest trying to find the max file size you can input to the read_csv function. First test if that is the case by cutting down the file to, say, 10%. If that works, gradually increase the file size until you cannot increase it anymore.

I have never dealt with this issue before, but I found this:

What kind of data are you working with? That spreadsheet seems huge. What features are you using?

First I tried to reduce the csv file to 57M, but the problem is the same. The message is the same. I tried changing the memory block size as in the post ” Made changes to allow ml module to work with big data. #395 ” on the site https://github.com/Itseez/opencv/pull/395 , but I had no success. I am working on object detection with a window size of 160×120 pixels. Do you know of any other open source C++ adaboost implementation?

Thanks.

Fabiano.

10. What is the smallest file size that it supports? I am starting to wonder if it is the columns (features) that are too many. In the end, the algorithm just requires one row in memory at a time.

Which features are you working with? Usually it is possible to select features that provide more information, and therefore reduce the number of features.

I have a few posts explaining how to use histograms to extract significant features.

Using edges is usually the best
http://robertour.com/2012/01/26/lots-of-features-from-edge-orientation-histogram-on-a-directory-of-images/

How many rows do you have in the file of 57M? Depending on the problem you might not need so many examples for the training phase.

I don’t know any other implementation but I worked with this 2 years ago. I know you can use opencv from python (and I think java), since they allocate memory dynamically it is possible that you could have no troubles with big files.

• Hi Khder, I am not sure if Adaboost is the right algorithm for this task. As I said in Step 0, Adaboost performs well as a binary classifier; it can distinguish between 2 different categories. It seems to me that sign numbers are a multi-category problem, so unless you are trying to distinguish between sign and not sign, Adaboost may not be your best option.

To be honest, I don’t remember if the OpenCV implementation supports multiple categories, but if it does, apparently its performance is not very good, although there have been some improvements. For example,
http://article.sapub.org/10.5923.j.ajis.20130302.02.html

If you are still willing to try, the most time-consuming part for you is deciding on which feature you are going to use. These features are in any case mostly-independent of the classification algorithm you want to try (e.g. svm instead of adaboost). You can follow my entire tutorial which starts here: http://robertour.com/2012/01/27/from-adaboost-to-features-a-kind-of-tutorial.

11. Hi,

Great post, thanks for the effort.
I’m using opencv 3 and I’ve been getting errors while compiling your code, below are the errors:

main.cpp:14:1: error: unknown type name ‘CvMLData’
CvMLData cvml;

main.cpp:35:30: error: use of undeclared identifier ‘test_responses’
std::vector train_responses, test_responses;
^
main.cpp:37:36: error: use of undeclared identifier ‘CV_TRAIN_ERROR’
float fl1 = boost.calc_error(&cvml,CV_TRAIN_ERROR,&train_responses);
^
main.cpp:37:52: error: use of undeclared identifier ‘train_responses’
float fl1 = boost.calc_error(&cvml,CV_TRAIN_ERROR,&train_responses);
^
main.cpp:37:13: error: use of undeclared identifier ‘boost’
float fl1 = boost.calc_error(&cvml,CV_TRAIN_ERROR,&train_responses);
^
main.cpp:39:36: error: use of undeclared identifier ‘CV_TEST_ERROR’
float fl2 = boost.calc_error(&cvml,CV_TEST_ERROR,&test_responses);
^
main.cpp:39:51: error: use of undeclared identifier ‘test_responses’
float fl2 = boost.calc_error(&cvml,CV_TEST_ERROR,&test_responses);
^
main.cpp:39:13: error: use of undeclared identifier ‘boost’
float fl2 = boost.calc_error(&cvml,CV_TEST_ERROR,&test_responses);
^
main.cpp:45:1: error: use of undeclared identifier ‘boost’
boost.save(“./trained_boost.xml”, “boost”);

Do you think it’s a version difference issue?