There are two ways to approach this, depending on your preference.
Method #1 - Using bwareaopen
A cheap way to do this is to invert the image so that the object pixels are white instead of black, morphologically close the image, and then remove the regions whose area falls below a certain amount. Closing joins regions that are near each other. Because joining the pieces of the "structure" produces one region with a large area, you can look at the area of each region and discard those that fall below a threshold.
You can then get back to the original image by doing a logical AND of the inverted image with the closed result, and then re-inverting this intermediate result. The AND is needed because closing artificially creates object pixels: merging neighbouring parts of the structure fills in the gaps between them, and those filled-in pixels were never part of the original. ANDing with the inverted image keeps only pixels that belong to the original. Since all of this is done on the inverse of the original image, re-inverting returns you to the original convention, in which object pixels are black rather than white.
Something like this:
    %// Read in image from StackOverflow
    im = imread('http://i.stack.imgur.com/A7iT7.png');

    %// Invert image
    im = ~im;

    %// Define 50 x 50 structuring element and close the image
    se = strel('square', 50);
    out = imclose(im, se);

    %// Remove regions whose areas fall below 10000 pixels
    out = bwareaopen(out, 10000);

    %// Remove out extraneous closing areas by ANDing with inverted image
    %// then reinvert to bring back to original label scheme
    out = ~(im & out);

    %// Show the image
    imshow(out);
We get this image:

Notes
- The imclose function performs a morphological closing for you, using the structuring element defined by strel. I used a 50 x 50 square to make sure the window is large enough to join neighbouring object pixels.
- The bwareaopen function takes a binary image and removes regions whose pixel area is below a certain value. After closing, you are left with two connected regions: the top of the image with the structure and the bottom with the text. Experimentally, a threshold of 10,000 pixels removed the bottom region.
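If you need to pick a threshold for a different image, one option is to list the region areas after closing and choose a value that separates the large region from the small ones. The following is only a minimal sketch on a synthetic mask; the image, the variable names, and the threshold of 1000 are assumptions for illustration, not values from the question.

```matlab
%// Synthetic stand-in for the closed, inverted image (hypothetical data):
%// one large white region and one small white region
bw = false(200, 200);
bw(20:120, 20:180) = true;    %// large region, 101 x 161 pixels
bw(160:180, 90:110) = true;   %// small region, 21 x 21 pixels

%// Measure the area of every connected region, largest first
s = regionprops(bw, 'Area');
areas = sort([s.Area], 'descend');
disp(areas);

%// Choose a threshold between the two areas and drop the smaller region
threshold = 1000;
kept = bwareaopen(bw, threshold);
```

Any threshold that sits between the largest area and the next largest one should behave the same way, which is why the exact value of 10,000 above is not critical.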
Method #2 - Using regionprops
Related to Method #1, an alternative that does not depend on a threshold is to go with your original idea. Perform the closing operation, but then measure the area of each connected region and keep the one with the largest area. For this I recommend regionprops, a function designed specifically to measure properties of the individual regions in an image. It returns a structure array of N elements, where N is the total number of distinct connected objects found in the image, and each element contains fields for the properties you asked to have measured. In your case, specify 'Area' and 'PixelIdxList', which give the area and the linear pixel indices of each region.
You then find the maximum area overall, use the corresponding pixel locations to set an output mask, and logically AND that mask with the inverted image as before.
Something like this:
    %// Read in image from StackOverflow
    im = imread('http://i.stack.imgur.com/A7iT7.png');

    %// Invert image
    im = ~im;

    %// Define 50 x 50 structuring element and close the image
    se = strel('square', 50);
    out = imclose(im, se);

    %// Apply regionprops
    s = regionprops(out, 'Area', 'PixelIdxList');

    %// Find the region with the max area
    [~,id] = max([s.Area]);

    %// Create an output mask with the largest area
    %// Make logical
    out = false(size(im));

    %// Set pixels from largest area
    out(s(id).PixelIdxList) = true;

    %// Rest of the logic from before
    %// Remove out extraneous closing areas by ANDing with inverted image
    %// then reinvert to bring back to original label scheme
    out = ~(im & out);

    %// Show the image
    imshow(out);
You should get exactly the same result as with the first method.
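If you want to confirm that the two methods agree, you can run both on the same input and compare the outputs with isequal. Here is a rough sketch on a small synthetic image rather than the image from the question; the layout, the 15 x 15 structuring element, and the threshold of 100 are all assumptions chosen to suit this toy data.

```matlab
%// Synthetic stand-in for the inverted image (hypothetical data):
%// four nearby 5 x 5 squares form the "structure", one distant square is clutter
bw = false(120, 120);
for c = [30 45 60 75]
    bw(30:34, c:c+4) = true;
end
bw(90:94, 50:54) = true;

%// Close with a window wider than the 10-pixel gaps between the squares
closed = imclose(bw, strel('square', 15));

%// Method 1: remove regions below an area threshold
mask1 = bwareaopen(closed, 100);
out1 = ~(bw & mask1);

%// Method 2: keep only the region with the largest area
s = regionprops(closed, 'Area', 'PixelIdxList');
[~, id] = max([s.Area]);
mask2 = false(size(bw));
mask2(s(id).PixelIdxList) = true;
out2 = ~(bw & mask2);

%// Compare the two results
same = isequal(out1, out2);
disp(same);
```

The two methods only coincide when the threshold in Method #1 removes every region except the largest one, so on your own images the comparison is a quick way to check that the threshold was chosen sensibly.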