I trained an Adaboost classifier to distinguish between two artistic styles. A technical report of my results can be found on my ResearchGate.net account. This tutorial, or more precisely this collection of blog posts, explains the steps and provides the code to create an image classifier from histograms of oriented edges, colors and intensities, so you can apply the same methodology to other problems.

There are two main steps: (1) produce the features of the images, and (2) train and use the classifier. I started the blog sequence with the classifier that I used (Adaboost) and then continued by explaining how to produce features for big collections. This is probably a weird way of viewing the problem because I am starting from the last step; however, I found that most of the decisions I took in the process were justified by the input I wanted to reach. I also recommend checking the comments, where I have answered multiple questions since these posts were published.


 

This post follows the same idea as "Lots of features from color histograms on a directory of images", but uses the histograms from "Edge orientation histograms in global and local features".

Basically, I wanted to build a collection of edge orientation histograms for a set of images saved in a directory. The histograms were calculated over different regions so I could get a lot of features. The images were numbered, so each file name is just a number. The first thing I noticed was that I shouldn't use the code from "Edge orientation histograms in global and local features" directly because it was very inefficient: there is no need to calculate the gradients every time. It is better to do it just once and then extract the histograms of the regions from that initial result. For this reason I split that function into two functions: extract_edges and histo_edges. Here is the code for both:

# parameters
# - the image
function [data] = extract_edges(im)

# define the filters for the 5 types of edges
f2 = zeros(3,3,5);
f2(:,:,1) = [1 2 1;0 0 0;-1 -2 -1];
f2(:,:,2) = [-1 0 1;-2 0 2;-1 0 1];
f2(:,:,3) = [2 2 -1;2 -1 -1; -1 -1 -1];
f2(:,:,4) = [-1 2 2; -1 -1 2; -1 -1 -1];
f2(:,:,5) = [-1 0 1;0 0 0;1 0 -1];

# the size of the image
ys = size(im,1);
xs = size(im,2);

# The image has to be in gray scale (intensities)
if (isrgb(im))
    im = rgb2gray(im);
endif

# Build a new matrix of the same size of the image
# and 5 dimensions to save the gradients
im2 = zeros(ys,xs,5);

# iterate over the possible directions
for i = 1:5
    # apply the sobel mask
    im2(:,:,i) = filter2(f2(:,:,i), im);
end

# calculate the max sobel gradient
[mmax, maxp] = max(im2,[],3);
# save just the index (type) of the orientation
# and ignore the value of the gradient
im2 = maxp;

# detect the edges using the default Octave parameters
ime = edge(im, 'canny');

# multiply against the types of orientations detected
# by the Sobel masks
data = im2.*ime;
endfunction

# parameters
# - the image (only its size is used)
# - the edge orientations returned by extract_edges
# - the number of vertical and horizontal divisions
function [data] = histo_edges(im, edges, r)

# size of the image
ys = size(im,1);
xs = size(im,2);

# produce a structure to save all the bins of the
# histogram of each region
eoh = zeros(r,r,6);
# for each region
for j = 1:r
    for i = 1:r
        # extract the subimage
        clip = edges(round((j-1)*ys/r+1):round(j*ys/r),round((i-1)*xs/r+1):round(i*xs/r));
        # calculate the histogram for the region (as percentages)
        eoh(j,i,:) = (hist(makelinear(clip), 0:5)*100)/numel(clip);
    end
end

# take out the zeros (pixels that are not edges)
eoh = eoh(:,:,2:6);

# represent all the histograms on one vector
data = zeros(1,numel(eoh));
data(:) = eoh(:);
endfunction
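The index arithmetic used for clip can be checked on its own. The following is a minimal sketch, assuming a hypothetical image height of 10 rows and r = 3; the names first and last are only illustrative and do not appear in the original code.

```octave
% Sketch: row ranges produced by the region formula for a hypothetical
% image height ys = 10 split into r = 3 bands.
ys = 10;
r = 3;
first = zeros(1, r);
last = zeros(1, r);
for j = 1:r
    first(j) = round((j-1)*ys/r + 1);   % first row of band j
    last(j) = round(j*ys/r);            % last row of band j
    printf("band %d: rows %d to %d\n", j, first(j), last(j));
end
```

In this case the bands have 3, 4 and 3 rows: every row is covered exactly once, but the regions are only approximately the same size, which is why the histogram is divided by numel(clip) rather than by a fixed region size.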

Now, with these two functions and following the idea of "Lots of features from color histograms on a directory of images", it is possible to generate all the features of each image in a single vector:

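# parameters
# - directory with the images
# - number of images in the directory
# - name of the file where the features are saved
function [t_set] = extract_eohs(dir, samples, filename)

# number of divisions per side for each region division
fibs = [1,2,3,5,8,13];
# counter of the number of features added
total = 0;
# first and last column of each region division in the feature vector
ranges = zeros(numel(fibs), 2);
for fib = 1:numel(fibs)
    ranges(fib,1) = total + 1;
    # 5 orientation bins for each of the fibs(fib)^2 regions
    total += 5*fibs(fib)*fibs(fib);
    ranges(fib,2) = total;
endfor

# one row of features per image
histo = zeros(samples, total);

for ind = 1:samples
    # the images are named 1, 2, 3, ... inside dir
    im = imread(strcat(dir, int2str(ind)));
    # calculate the edge orientations only once per image
    edges = extract_edges(im);
    for fib = 1:numel(fibs)
        histo(ind,ranges(fib,1):ranges(fib,2)) = histo_edges(im, edges, fibs(fib));
    endfor
endfor

# save the features and the ranges
save("-text", filename, "histo");
save("-text", "ranges.dat", "ranges");

t_set = ranges;
endfunction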
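As a quick check of the bookkeeping in extract_eohs, the following sketch recomputes the column ranges without reading any image. It uses cumsum instead of the loop, which is my own shortcut and not part of the original code; the names counts, first and last are illustrative.

```octave
% Sketch: column ranges of the EOH descriptor, one row per region division.
fibs = [1, 2, 3, 5, 8, 13];      % divisions per side, as in extract_eohs
counts = 5 * fibs .^ 2;          % 5 orientation bins for each of the r*r regions
last = cumsum(counts);           % last column of each block
first = last - counts + 1;       % first column of each block
ranges = [first', last'];        % 6x2 matrix, same layout as in extract_eohs
total = last(end);               % length of the full descriptor
printf("%d features per image\n", total);
```

With these six region divisions each image is described by 5*(1+4+9+25+64+169) = 1360 values.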

 

This is the last type of histogram I used in my project of training an Adaboost classifier to distinguish two artistic styles.

The basic idea in this step is to build a histogram of the directions of the gradients at the edges (borders or contours). It is possible to detect edges in an image, but here we are interested in detecting their angles. This is possible through Sobel operators. The following five operators give an idea of the strength of the gradient in five particular directions (Fig. 1).

Fig. 1 The Sobel masks for 5 orientations: vertical, horizontal, diagonals and non-directional

The convolution with each of these masks produces a matrix of the same size as the original image, indicating the gradient (strength) of the edge in that particular direction. For each pixel, the mask with the maximum gradient among the 5 matrices is counted, and these counts fill the histogram (Fig. 2).

Fig 2. Edge Orientation Histogram
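The per-pixel labelling described above can be tried on a tiny synthetic image. This is only a sketch under my own assumptions: the 8x8 test image, the names resp, label and counts, and the plain count without any edge detection are all illustrative.

```octave
% Sketch: label every pixel with the mask that gives the strongest response.
f2 = zeros(3, 3, 5);
f2(:,:,1) = [1 2 1; 0 0 0; -1 -2 -1];
f2(:,:,2) = [-1 0 1; -2 0 2; -1 0 1];
f2(:,:,3) = [2 2 -1; 2 -1 -1; -1 -1 -1];
f2(:,:,4) = [-1 2 2; -1 -1 2; -1 -1 -1];
f2(:,:,5) = [-1 0 1; 0 0 0; 1 0 -1];

im = zeros(8, 8);                % hypothetical test image
im(5:8, :) = 255;                % dark upper half, bright lower half

resp = zeros(8, 8, 5);           % one response matrix per mask
for k = 1:5
    resp(:,:,k) = filter2(f2(:,:,k), im);
end

[~, label] = max(resp, [], 3);   % index (1..5) of the strongest mask per pixel
counts = hist(label(:), 1:5);    % number of pixels per orientation
disp(counts);
```

Note that in a perfectly flat area every response is zero and max returns the first index, so those pixels end up labelled 1 even though they contain no edge.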

To avoid the many unimportant gradients that this methodology could introduce, one option is to take into account only the edges detected by a very robust method such as the Canny edge detector. This detector returns a matrix of the same size as the image with a 1 where there is an edge and a 0 where there is not. Basically, it returns the contours of the objects inside the image. If we consider only the 1's, we count only the most pronounced gradients.
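The masking step can be illustrated with two small hand-written matrices. This is a sketch, not the real pipeline: label stands in for the orientation indices and edges is a made-up stand-in for the output of edge(im, 'canny').

```octave
% Sketch: keep orientation labels only where an edge was detected.
label = [1 1 2; 3 3 2; 5 4 2];            % hypothetical orientation indices (1..5)
edges = logical([1 0 1; 0 0 1; 0 1 1]);   % hypothetical edge map (1 = edge)
masked = label .* edges;                  % non-edge pixels become 0
h = hist(masked(:), 0:5) * 100 / numel(masked);  % percentages, bin 0 = "no edge"
eoh = h(2:6);                             % drop the "no edge" bin
disp(eoh);
```

Because the percentages are taken over all the pixels of the region, the five remaining bins do not add up to 100: their sum is the percentage of pixels that lie on an edge.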

I am also interested in calculating global and local histograms (I have already talked about this in previous posts). For example, Fig 1, Fig 2 and Fig 3 present the regions for three different region divisions: 1, 3 and 8 respectively.

Fig 1. 1x1 region divisions
Fig 2. 3x3 region divisions
Fig 3. 8x8 region divisions

I found this code, but I had to make several modifications because of my particular requirements. The most important are:

  • I need to work just with gray scale images
  • I took out an initial filter that seems to be unnecessary
  • I need to extract histograms of different regions
  • I need a linear response. Just a vector with the responses together

I am posting the code with all the modifications:

# parameters
# - the image
# - the number of vertical and horizontal divisions
function [data] = edgeOrientationHistogram(im, r)

% define the filters for the 5 types of edges
f2 = zeros(3,3,5);
f2(:,:,1) = [1 2 1;0 0 0;-1 -2 -1];
f2(:,:,2) = [-1 0 1;-2 0 2;-1 0 1];
f2(:,:,3) = [2 2 -1;2 -1 -1; -1 -1 -1];
f2(:,:,4) = [-1 2 2; -1 -1 2; -1 -1 -1];
f2(:,:,5) = [-1 0 1;0 0 0;1 0 -1];

% the size of the image
ys = size(im,1);
xs = size(im,2);

# The image has to be in gray scale (intensities)
if (isrgb(im))
    im = rgb2gray(im);
endif

# Build a new matrix of the same size of the image
# and 5 dimensions to save the gradients
im2 = zeros(ys,xs,5);

# iterate over the possible directions
for i = 1:5
    # apply the sobel mask
    im2(:,:,i) = filter2(f2(:,:,i), im);
end

# calculate the max sobel gradient
[mmax, maxp] = max(im2,[],3);
# save just the index (type) of the orientation
# and ignore the value of the gradient
im2 = maxp;

# detect the edges using the default Octave parameters
ime = edge(im, 'canny');

# multiply against the types of orientations detected
# by the Sobel masks
im2 = im2.*ime;

# produce a structure to save all the bins of the
# histogram of each region
eoh = zeros(r,r,6);
# for each region
for j = 1:r
    for i = 1:r
        # extract the subimage
        clip = im2(round((j-1)*ys/r+1):round(j*ys/r),round((i-1)*xs/r+1):round(i*xs/r));
        # calculate the histogram for the region
        eoh(j,i,:) = (hist(makelinear(clip), 0:5)*100)/numel(clip);
    end
end

# take out the zeros
eoh = eoh(:,:,2:6);

# represent all the histograms on one vector
data = zeros(1,numel(eoh));
data(:) = eoh(:);

 
The makelinear function is not built into Octave. All it does is convert a matrix into a vector. If you don't have it, you can use the following function.

# makelinear.m
# converts any input matrix into a 1D vector (output)

function data = makelinear(im)
data = zeros(numel(im),1);
data(:) = im(:);
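For reference, the colon index gives the same column vector as the helper above. The small matrix m is only an illustrative example.

```octave
% Sketch: flattening a matrix with the colon index.
m = [1 2; 3 4];
v = m(:);        % column-major order: first column, then second column
disp(v');
```

The values are read column by column, which is also the order in which the region histograms end up in the final feature vector.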

In a previous post I explained how to produce color or intensity histograms of different regions of an image. For example, Fig 1, Fig 2 and Fig 3 present the regions for three different region divisions: 1, 3 and 8 respectively.

Fig 1. 1x1 region divisions
Fig 2. 3x3 region divisions
Fig 3. 8x8 region divisions

The basic goal was to produce small subimages of approximately the same size and then calculate the histogram over each subimage. In this post I will use the existing function regionImHistogram to produce features for a set of images that are in the same directory. The images are numbered in the directory to simplify the process and to trace them back to the data I have in other tables.

For each image I am going to produce a lot of features. The basic idea is to produce histograms of many regions. Because I am concerned about the size of the descriptor, I am going to stop using 256 bins for the histograms. Instead, I am going to use a different number of bins depending on the number of regions the image is divided into: the more regions, the fewer bins. More regions also means fewer pixels per region, so maybe this will give a little more generalization or statistical power to the feature. Here is the code for 6 different region divisions.

% Parameters:
% - directory with the images
% - number of images on the directory
% - name of the file with the features
function extract_cohs(dir, samples, filename)

% The different numbers of divisions per side into which
% the image is going to be divided
fibs = [1,2,3,5,8,13];
% The bins per region
bins = [128, 64, 32, 16, 8, 4];

% A counter of the number of features added
total = 0;
% The ranges that indicate where the set of features
% per region division are going to be saved
ranges = zeros(6, 2);

% This cycle calculates the ranges in the vector
for fib = 1:size(fibs)(2)
    ranges(fib,1) = total + 1;
    total += 3*fibs(fib)*fibs(fib)*bins(fib);
    ranges(fib,2) = total;
endfor

% create the vector that is going to keep all the samples
histo = zeros(samples, total);

% open each image and process it
for ind = 1:samples
    im = imread(strcat(dir, int2str(ind)));
    for fib = 1:size(fibs)(2)
        histo(ind,ranges(fib,1):ranges(fib,2)) = regionImHistogram(im, fibs(fib), bins(fib));
    endfor
endfor

% save the features
save("-text", filename, "histo");

% save the values of the ranges
save("-text", "ranges.dat", "ranges");

The previous code generates 6780 features per image and, depending on the number of images, it could take a while. It's quite straightforward to calculate intensity histograms from this code. Two changes are necessary:

1. Instead of

    total += 3*fibs(fib)*fibs(fib)*bins(fib);

You have to take out the 3*

    total += fibs(fib)*fibs(fib)*bins(fib);

2. After

    im = imread(strcat(dir, int2str(ind)));

You have to transform the image to grayscale

    im = imread(strcat(dir, int2str(ind)));
    if isrgb(im)
        im = rgb2gray(im);
    endif
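The number of features in both variants can be verified with a few lines of arithmetic. This sketch only repeats the sums from extract_cohs; the names per_channel and colour_total are illustrative.

```octave
% Sketch: descriptor length for the colour and the intensity variants.
fibs = [1, 2, 3, 5, 8, 13];             % divisions per side
bins = [128, 64, 32, 16, 8, 4];         % bins per region
per_channel = sum(fibs .^ 2 .* bins);   % one histogram per region
colour_total = 3 * per_channel;         % three channels per region
printf("colour: %d, intensity: %d\n", colour_total, per_channel);
```

The colour version gives the 6780 features mentioned above, and taking out the 3* leaves 2260 features per image for the intensity version.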

I'll be posting some code to produce edge orientation histograms very soon.

In a previous post I explained how to produce the color or intensity histogram of an image. In this post I will post some code to produce them for different regions. The idea remains the same; however, we are going to divide the image into different regions to obtain global and local histograms of the same image.

For example, Fig 1, Fig 2 and Fig 3 present the regions for three different region divisions: 1, 3 and 8 respectively.

Fig 1. 1x1 region divisions
Fig 2. 3x3 region divisions
Fig 3. 8x8 region divisions

The basic goal is to produce small subimages of approximately the same size and then calculate the histogram over each subimage. Here is the code.

function [data] = regionImHistogram(im, r, bins)

% 1. extract the x and y size of the image
ys = size(im,1);
xs = size(im,2);

% 2. calculate the number of pixels of each region
npixels = round(ys/r)*round(xs/r);

% 3. create a structure to keep all the histograms
coh = zeros(bins*3, r*r);

% 4. iterate over all the regions
for j = 1:r
    for i = 1:r
        %5. extract the subimage
        % 14/12/2016: this doesn't work - for some reason it transforms the crop to grayscale
        % clip = im(round((j-1)*ys/r+1):round(j*ys/r),round((i-1)*xs/r+1):round(i*xs/r));

        % 14/12/2016: Use this instead
        clip = imcrop(im,[round((i-1)*xs/r+1) round((j-1)*ys/r+1) round(xs/r)-1  round(ys/r)-1]);

        %6. calculate the histogram and normalize it
        coh(:,(j-1)*r+i) = linearHistogram(clip, bins)/npixels;
    end
end

% 7. put it all in just one vector
data = zeros(1,numel(coh));
data(:) = coh(:);

Notice that instruction 3 creates a matrix of bins*3 x r*r. Remember that a color histogram has one histogram per color channel, so it has three times the number of bins, and we need one per block (r*r, that is 3*3 for a 3x3 division). Instruction 5 extracts the subimage. Instruction 6 uses the function explained in my previous post to build the color/intensity histogram. Instruction 7 is particularly important because I am generating descriptors for training my Adaboost classifier.
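A likely explanation for the problem noted in the 14/12/2016 comment of instruction 5 is the way Octave indexes a three-dimensional array with only two subscripts. The sketch below uses a made-up 4x6 RGB array; the names a and b are illustrative.

```octave
% Sketch: cropping a 3-D (RGB) array with two and with three subscripts.
im = zeros(4, 6, 3);        % hypothetical 4x6 RGB image
im(:,:,2) = 1;              % fill the second channel with 1
im(:,:,3) = 2;              % fill the third channel with 2
a = im(1:2, 1:3);           % two subscripts: only the first channel is returned
b = im(1:2, 1:3, :);        % a third subscript keeps all the channels
disp(size(a));
disp(size(b));
```

With two subscripts the crop has a single plane, so it looks like a grayscale image. Adding the third subscript should be an alternative to imcrop, although I have only checked it on this toy array and not with regionImHistogram itself.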

Then, this would be the code to produce the histograms of a 3x3 region division:

% open the image
im = imread('path/to/image');
% call the function
linear = regionImHistogram(im, 3, 256);
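To see the size and layout of the vector returned by the call above, the following sketch rebuilds only the container used inside regionImHistogram, without reading an image. The marker value 1 and the names len and idx are illustrative assumptions.

```octave
% Sketch: length and layout of the descriptor for r = 3 and 256 bins.
r = 3;
bins = 256;
coh = zeros(bins*3, r*r);   % one column per region, as in instruction 3
coh(:, 2) = 1;              % mark the region with j = 1, i = 2
data = zeros(1, numel(coh));
data(:) = coh(:);           % same flattening as instruction 7
len = numel(data);
idx = find(data);           % positions occupied by the marked region
printf("%d values, region 2 in %d..%d\n", len, idx(1), idx(end));
```

The vector has 3*256*9 = 6912 values, and the 768 values of each region are stored one region after another; the marked region occupies positions 769 to 1536.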

Note that the code to produce the intensity histograms of a 3x3 region division is almost the same:

% open the image
im = imread('path/to/image');
% transform to gray scale
im = rgb2gray(im);
% call the function
linear = regionImHistogram(im, 3, 256);

There is going to be a final post on how to use regionImHistogram to generate multiple histograms of different regions and different numbers of bins.