Lots of features from an edge orientation histogram on a directory of images (Octave/Matlab)

This post follows the same idea as Lots of features from color histograms on a directory of images, but uses the code from Edge orientation histograms in global and local features.

Basically, I wanted to build a collection of edge orientation histograms for a set of images saved in a directory. The histograms are calculated over different regions of each image, which gives a lot of features. The images are numbered, so each file name is just a number. The first thing I noticed was that I shouldn't use the code from Edge orientation histograms in global and local features directly, because it is very inefficient: there is no need to calculate the gradients every time. It is better to calculate them just once and then extract the histograms from regions of that initial result. For this reason I split that function into two: extract_edges and histo_edges. Here is the code for both:

# parameters
# - im: the image (RGB or gray scale)
function [data] = extract_edges(im)

# define the filters for the 5 types of edges
f2 = zeros(3,3,5);
f2(:,:,1) = [1 2 1;0 0 0;-1 -2 -1];
f2(:,:,2) = [-1 0 1;-2 0 2;-1 0 1];
f2(:,:,3) = [2 2 -1;2 -1 -1; -1 -1 -1];
f2(:,:,4) = [-1 2 2; -1 -1 2; -1 -1 -1];
f2(:,:,5) = [-1 0 1;0 0 0;1 0 -1];

# the size of the image
ys = size(im,1);
xs = size(im,2);

# The image has to be in gray scale (intensities)
if (isrgb(im))
    im = rgb2gray(im);
endif

# Build a new matrix with the same size as the image
# and 5 layers to save the response of each filter
im2 = zeros(ys,xs,5);

# iterate over the possible directions
for i = 1:5
    # apply the mask; the image is converted to double so that
    # integer images are accepted and responses can be negative
    im2(:,:,i) = filter2(f2(:,:,i), double(im));
endfor

# find the filter with the maximum response at each pixel
[mmax, maxp] = max(im2,[],3);
# save just the index (type) of the orientation
# and ignore the value of the gradient
im2 = maxp;

# detect the edges using the default Octave parameters
ime = edge(im, 'canny');

# keep the orientation type only on the edge pixels;
# every other pixel becomes 0
data = im2.*ime;
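
To see what the maximum step in extract_edges does, here is a minimal sketch on a toy image that needs nothing beyond core Octave. The 6x6 image and the pixel that is inspected are made up for illustration; the masks are the ones defined above.

```octave
% Toy 6x6 image (made up): bright upper half, dark lower half,
% so there is a single horizontal edge between rows 3 and 4.
im = [ones(3, 6); zeros(3, 6)];

% The same five masks used in extract_edges.
f2 = zeros(3, 3, 5);
f2(:,:,1) = [1 2 1; 0 0 0; -1 -2 -1];
f2(:,:,2) = [-1 0 1; -2 0 2; -1 0 1];
f2(:,:,3) = [2 2 -1; 2 -1 -1; -1 -1 -1];
f2(:,:,4) = [-1 2 2; -1 -1 2; -1 -1 -1];
f2(:,:,5) = [-1 0 1; 0 0 0; 1 0 -1];

% Response of every mask at every pixel.
resp = zeros(6, 6, 5);
for i = 1:5
    resp(:,:,i) = filter2(f2(:,:,i), im);
end

% Index of the mask with the largest response at each pixel.
[mmax, kind] = max(resp, [], 3);

% Inspect an interior pixel that lies on the edge.
at_edge = squeeze(resp(3, 3, :))'   % responses of the 5 masks: 4 0 3 3 0
kind(3, 3)                          % mask 1 has the largest response
```

Note that max compares the signed responses, not their magnitudes. With the toy image flipped upside down, mask 1 would respond with -4 at the same pixel and would no longer be the maximum.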

# parameters
# - im: the image (only used for its size)
# - edges: the matrix returned by extract_edges
# - r: number of regions per dimension (an r x r grid)
function [data] = histo_edges(im, edges, r)

# size of the image
ys = size(im,1);
xs = size(im,2);

# produce a structure to save all the bins of the
# histogram of each region
eoh = zeros(r,r,6);
# for each region
for j = 1:r
    for i = 1:r
        # extract the subimage
        clip = edges(round((j-1)*ys/r+1):round(j*ys/r),round((i-1)*xs/r+1):round(i*xs/r));
        # calculate the histogram for the region, as a
        # percentage of the pixels in the region
        eoh(j,i,:) = (hist(clip(:), 0:5)*100)/numel(clip);
    endfor
endfor

# take out the zeros (the pixels that are not edges)
eoh = eoh(:,:,2:6);

# represent all the histograms in one vector
data = zeros(1,numel(eoh));
data(:) = eoh(:);
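
As a small worked example of the histogram of a single region, the sketch below uses a made-up 3x4 clip (0 for non-edge pixels, 1 to 5 for the edge type, as produced by extract_edges) and follows the same steps as the inner loop above.

```octave
% Made-up 3x4 region of the matrix returned by extract_edges:
% 0 = not an edge, 1..5 = edge type.
clip = [0 0 1 1;
        0 2 0 0;
        5 0 0 1];

% Count how many pixels fall in each of the bins 0..5.
counts = hist(clip(:), 0:5);
counts = counts(:)';             % make sure it is a row vector

% Percentage of the pixels of the region in each bin.
pct = counts * 100 / numel(clip);

% Drop the first bin (non-edge pixels); 5 features remain.
features = pct(2:6)
```

Here 7 of the 12 pixels are not edges, so the five remaining values add up to less than 100: they are percentages of all the pixels in the region, not only of the edge pixels.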

Now, with these two functions, and following the idea of Lots of features from color histograms on a directory of images, it is possible to generate all the features of each image in just one vector:

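# parameters
# - dir: directory (with trailing separator) that contains the images
# - samples: number of images, which are named 1, 2, 3, ...
# - filename: file where the features are saved
function [t_set] = extract_eohs(dir, samples, filename)

# grid sizes: each image is divided in fib x fib regions
fibs = [1,2,3,5,8,13];
total = 0;
# first and last column of the features of each grid size
ranges = zeros(numel(fibs), 2);
for fib = 1:numel(fibs)
    ranges(fib,1) = total + 1;
    # 5 bins for each of the fib x fib regions
    total += 5*fibs(fib)*fibs(fib);
    ranges(fib,2) = total;
endfor

# one row per image, one column per feature
histo = zeros(samples, total);

for ind = 1:samples
    im = imread(strcat(dir, int2str(ind)));
    # calculate the edges just once per image
    edges = extract_edges(im);
    for fib = 1:numel(fibs)
        histo(ind,ranges(fib,1):ranges(fib,2)) = histo_edges(im, edges, fibs(fib));
    endfor
endfor
save("-text", filename, "histo");
save("-text", "ranges.dat", "ranges");

t_set = ranges;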
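
To make the layout of each row of histo concrete, the sketch below recomputes the ranges without the loop and then maps one column back to its grid size, region and edge type. The column number 100 is a made-up example, and the use of ind2sub relies on histo_edges flattening its r x r x 5 array in column-major order.

```octave
fibs = [1, 2, 3, 5, 8, 13];

% Number of features per grid size: 5 bins for each of the fib x fib regions.
sizes = 5 * fibs .^ 2;

% First and last column of each block, as in the loop of extract_eohs.
last_col = cumsum(sizes);
first_col = last_col - sizes + 1;
ranges = [first_col', last_col']

% Hypothetical column of histo that we want to interpret.
col = 100;

% Which grid size does it belong to?
g = find(col >= ranges(:,1) & col <= ranges(:,2));
r = fibs(g);

% Recover the region row j, the region column i and the edge type k.
[j, i, k] = ind2sub([r, r, 5], col - ranges(g,1) + 1);
printf("grid %dx%d, region (%d,%d), edge type %d\n", r, r, j, i, k);
```

With these grid sizes there are 1360 features per image, and column 100 falls in the 5x5 grid (columns 71 to 195).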

6 thoughts on “Lots of features from an edge orientation histogram on a directory of images (Octave/Matlab)”

Yara said on October 9, 2013 at 1:03 pm:

Hello Tico,
Since you have a huge number of features, wouldn’t it be good to use some feature selection algorithm? Would you recommend one?
One more question: how did you label the features?
And why do you need all the features to be in one vector? Is it because of the input parameter of the OpenCV function?
I am asking because I need to do something similar using histograms of oriented edges with an SVM classifier, and I am still not sure how to apply it in OpenCV.
Thanks

tototico said on October 9, 2013 at 4:35 pm:

Hi!

This post is related to Adaboost (not SVM). You can follow all my related posts here: http://robertour.com/2012/01/27/from-adaboost-to-features-a-kind-of-tutorial/. It is a sort of tutorial for Adaboost, and it describes what I did in more detail.

I would say that Adaboost is itself a sort of feature selection algorithm. The original paper is available here: Rapid Object Detection using a Boosted Cascade of Simple Features.

Regarding the labelling of the features, you can use the first column. However, if I remember correctly, you can modify that behaviour. You can refer to this post.

The features usually go in one vector per sample for any classifier (in the end you get a matrix with as many rows as samples and as many columns as features). It is just a generalization that works with anything you want to classify. However, you have to maintain the semantics of the columns for each feature. For example, if a feature is missing for a particular sample, which value are you going to use instead?
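
As a rough sketch of that matrix, assuming the labels go in the first column: the labels below are made up, and a random matrix stands in for the histo matrix saved by extract_eohs.

```octave
% Stand-in for the matrix saved by extract_eohs: 4 samples, 1360 features.
histo = rand(4, 1360);

% Hypothetical class labels, one per sample (1 = positive, -1 = negative).
labels = [1; 1; -1; -1];

% One row per sample: the label first, then all the features.
training = [labels, histo];
size(training)
```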

SVM is a very different classifier from Adaboost. Adaboost needs a lot of features, and you don’t have to care whether they are good or not: it is going to select the useful ones. That makes it perfect for computer vision tasks, because it is usually difficult to decide what is important. The original paper is remarkable in this sense.

That said, histograms of oriented edges are quite informative features, so it is OK to use SVM. But you probably don’t want to have as many ROIs (regions of interest, or partitions of the picture) as I have.

I hope I was clear. This was a couple of years ago, so it is difficult to recall. Feel free to keep asking. I will help as much as I can.

image processing aspirant said on October 20, 2013 at 7:10 am:

Can you please elaborate on what ‘clip’ contains at each iteration?
for j = 1:r
    for i = 1:r
        # extract the subimage
        clip = edges(round((j-1)*ys/r+1):round(j*ys/r),round((i-1)*xs/r+1):round(i*xs/r));
        # calculate the histogram for the region
        eoh(j,i,:) = (hist(makelinear(clip), 0:5)*100)/numel(clip);
    end
end

Thanks in advance

tototico said on October 30, 2013 at 3:17 am:

Sorry for the late reply. Maybe you can take a look at the post about the colour features here.

Basically, you can imagine an r x r grid over the painting. A clip is a cell of that grid, which contains a piece of the painting, i.e. any of the pieces of the Mona Lisa in the previous link.

Well, not exactly that piece of the painting, because extract_edges has already been applied to the image and the clip is cut from its output. So the clip contains directions (represented by numbers from 1 to 5 – not sure here but basically meaning horizontal, vertical, diagonal, inverse diagonal and non-directional), with 0 for the pixels that are not edges, instead of colours.

The instructions that follow build the histogram from that matrix of directions. You can see the sort of histograms I am talking about in Fig. 2 of this other post.
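
To make the grid concrete, here is a small sketch that lists which rows and columns each clip covers, using the same rounding as the loop you quoted. The image size (10 x 7) and r = 3 are made-up values.

```octave
% Hypothetical image size and grid.
ys = 10; xs = 7; r = 3;

% Same index formulas as in histo_edges, for all the cells at once.
n = (1:r)';
rows = [round((n-1)*ys/r + 1), round(n*ys/r)]   % first and last row of each cell
cols = [round((n-1)*xs/r + 1), round(n*xs/r)]   % first and last column of each cell
```

The cell (j, i) of the grid is then edges(rows(j,1):rows(j,2), cols(i,1):cols(i,2)). The cells do not overlap and together they cover the whole image, even when the size is not a multiple of r.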

I hope I was clearer. If not, feel free to ask again.

sheenu said on March 18, 2015 at 5:38 am:

What is the value of r in the hist code?
I am getting an error while running this.

- tototico said on March 19, 2015 at 9:14 am:
  
  The r parameter is the number of regions into which I divide the image along each dimension. If it is 4, I am creating 16 regions and calculating a histogram in each of them. You can take a look at this:
  http://robertour.com/2012/01/26/lots-of-features-from-color-histograms-on-a-directory-of-images/
  
  Which error are you getting?
