Writing unit tests for a function that returns a hierarchy of objects

I have a function that performs hierarchical clustering on a list of input vectors. The return value is the root element of an object hierarchy, where each object represents a cluster. I want to check the following things:

  • Does each cluster contain the correct elements (and possibly other properties)?
  • Does each cluster point to the correct child elements?
  • Does each cluster point to the correct parent?

I have two problems here. First, how do I express the expected result in a readable format? Second, how do I write an assertion that accepts isomorphic versions of the expected data that I provide? Suppose that one cluster in the expected hierarchy has two children, A and B, and that the cluster is represented by an object with the properties child1 and child2. I don't care whether child1 corresponds to cluster A or B, only that it corresponds to one of them and that child2 corresponds to the other. The solution should be fairly general, because I will write several tests with different input data.

In fact, my main problem here is finding a way to specify the expected result in a readable and understandable way. Any suggestions?

+4
3 answers

If there are isomorphic results, you should have a predicate that can verify logical equivalence. This will probably be useful in your code base as well, and it will also help you implement the unit tests.

This is the core of Manoj Govindan's answer without the intermediate strings. Since you are (presumably) not interested in the intermediate strings themselves, adding them to the tests would be an unnecessary source of errors.

Regarding the readability problem, you will need to show what you consider an unreadable way of writing the expected answer. Perhaps the equivalence predicate makes the problem go away.
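A minimal sketch of such an equivalence predicate, under stated assumptions: the Cluster class, its attributes elements, child1, child2 and parent, and the helper names equivalent and parents_consistent are all hypothetical stand-ins for whatever the real code uses. Integers stand in for the input vectors.

```python
class Cluster:
    """Hypothetical cluster node; the attribute names are assumptions."""

    def __init__(self, elements, child1=None, child2=None):
        self.elements = frozenset(elements)
        self.child1 = child1
        self.child2 = child2
        self.parent = None
        # Wire up the parent pointers of the children.
        for child in (child1, child2):
            if child is not None:
                child.parent = self


def equivalent(a, b):
    """Return True if two cluster trees are equal up to swapping children."""
    if a is None or b is None:
        return a is None and b is None
    if a.elements != b.elements:
        return False
    # Try the children in the given order first, then swapped.
    if equivalent(a.child1, b.child1) and equivalent(a.child2, b.child2):
        return True
    return equivalent(a.child1, b.child2) and equivalent(a.child2, b.child1)


def parents_consistent(node, expected_parent=None):
    """Return True if every node's parent attribute is its actual parent."""
    if node is None:
        return True
    if node.parent is not expected_parent:
        return False
    return (parents_consistent(node.child1, node)
            and parents_consistent(node.child2, node))
```

In a test, the expected hierarchy can then be written as nested Cluster(...) calls and checked with self.assertTrue(equivalent(actual, expected)), with a separate assertion on parents_consistent(actual) for the parent links.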

+2

This is an offbeat suggestion, and it is a little hacky. Caveat emptor!

First, write a function that creates a string representation of a cluster. You will need to write unit tests to make sure that this function works in all cases. The format can be either plain text or XML (not very human-friendly, but usually easy to work with for hierarchical data). You call this function by passing it the cluster: string_representation(cluster) .

Second, write a variant of it that generates the same output without passing in a real cluster. Something like util.test.generate_string_representation('child1', 'child2') .

Third, modify your unit test assertions to compare the output of string_representation(cluster) with that of generate_string_representation('child1', 'child2') , as appropriate.

    actual = string_representation(f(*args, **kwargs))
    expected = generate_string_representation('child1', 'child2')
    self.assertEqual(actual, expected)

Make sure that both string functions use the same mechanism to format the output. You do not want to end up chasing tiny differences between the strings.
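One way the two string functions could share a formatter is sketched below. This is an illustration only: the Cluster class, the helper _format, the output format, and the (elements, *children) signature of generate_string_representation are assumptions, not the signature used in the snippet above. Child strings are sorted so that swapping child1 and child2 does not change the output.

```python
class Cluster:
    """Hypothetical cluster node; the attribute names are assumptions."""

    def __init__(self, elements, child1=None, child2=None):
        self.elements = list(elements)
        self.child1 = child1
        self.child2 = child2


def _format(elements, child_strings):
    """Shared formatter, so both public functions produce identical text."""
    head = "{" + ",".join(str(e) for e in sorted(elements)) + "}"
    if not child_strings:
        return head
    # Sorting the child strings makes the output independent of child order.
    return head + "(" + " ".join(sorted(child_strings)) + ")"


def string_representation(cluster):
    """Render a real cluster tree as a canonical string."""
    children = [c for c in (cluster.child1, cluster.child2) if c is not None]
    return _format(cluster.elements,
                   [string_representation(c) for c in children])


def generate_string_representation(elements, *children):
    """Render the expected tree from plain data, without a real cluster.

    Each child is itself a tuple of the form (elements, *children).
    """
    return _format(elements,
                   [generate_string_representation(*c) for c in children])
```

With this format, a leaf is written as ([3],) and an inner node as ([1, 2], ([1],), ([2],)), so the expected value stays a short literal in the test.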

Told you it's pretty hacky. I hope others have better answers.

0

There seems to be room for decomposing your method into smaller pieces. The pieces that deal with parsing the input and formatting the output can be separated from the actual clustering logic. That way, the tests around your clustering methods will be smaller and will deal with easily understood and verifiable data structures such as dicts and lists.
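As a rough sketch of that idea, a cluster tree could be converted to nested dicts and lists in a canonical order and then compared with a plain literal. The Cluster class, its attribute names, and the helper to_plain are assumptions made for illustration.

```python
class Cluster:
    """Hypothetical cluster node; the attribute names are assumptions."""

    def __init__(self, elements, child1=None, child2=None):
        self.elements = list(elements)
        self.child1 = child1
        self.child2 = child2


def to_plain(cluster):
    """Convert a cluster tree into nested dicts and lists.

    Elements are sorted, and children are ordered by their sorted
    elements, so isomorphic trees convert to equal structures.
    """
    children = [to_plain(c)
                for c in (cluster.child1, cluster.child2)
                if c is not None]
    children.sort(key=lambda child: child["elements"])
    return {"elements": sorted(cluster.elements), "children": children}


# The expected result is a plain literal, written in the same canonical order.
expected = {
    "elements": [1, 2, 3],
    "children": [
        {"elements": [1, 2], "children": [
            {"elements": [1], "children": []},
            {"elements": [2], "children": []},
        ]},
        {"elements": [3], "children": []},
    ],
}
```

A test can then use self.assertEqual(to_plain(actual), expected), which also gives a readable diff when the structures differ.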

0