Scala Function Chain Documentation

Scala (and functional programming in general) encourages a programming style in which you create functional "chains" of the form

collection.operation1(...).operation2(...)...

where the operations are various combinations of map, filter, etc.

Where the equivalent Java code might require 50 lines, the Scala code can be written in 1 or 2. A function chain can transform an input collection into something completely different.

The downside of such Scala code is that 10 minutes later (not to mention 6 months later) I can't work out what I was thinking, because the notation is so compact and carries no type information (the types are inferred).

How do you document this? Do you put a large block of comments in front of the chain, turning an elegant single-line solution into a voluminous 40-line solution consisting of 39 comment lines? Do you intersperse your comments, as follows?

 collection.
   // Select the items that meet condition X
   filter(predicate_function).
   // Change these items from A's to B's
   map(transformation_function).
   // etc.

Something else? No documentation? (Leave them guessing; they will never "downsize" you, because no one else can maintain the code. :-))

+4
6 answers

I do not write code like this to begin with (unless it is a one-off script or I am experimenting in the REPL).

If I can explain what the code does in one comment and it reads fine, I keep it as a one-liner:

 // Find all real-valued square roots and group them in integer bins
 ds.filter(_ >= 0).map(math.sqrt).groupBy(_.toInt).map(_._2)

If I cannot understand it after carefully reading the chain, I break it into functionally distinct units. For example, if I expected that someone might not realize that the square root of a negative number is not real, I would write:

 // Only non-negative numbers have a real-valued square root
 val nonneg = ds.filter(_ >= 0)
 // Find square roots and group them in integer bins
 nonneg.map(math.sqrt).groupBy(_.toInt).map(_._2)

In particular, if someone does not know the Scala collections library well and does not have the patience to spend five to ten minutes understanding one line of code, then either they should not be working on my code (nor on anything else that does something nontrivial that they do not understand and do not have the patience to understand), or I should know in advance that I am providing, for example, language and math tutoring in addition to writing working code. In that case I supply it by writing a paragraph explaining how the next line works, by breaking the chain up command by command, or by including a comment at the start of each anonymous function explaining what is going on (as needed).

In any case, if you cannot understand what it is doing, you will probably need some intermediate values. They are very useful for a mental reset ("I don’t see how to get from A to C! ... but ... okay, I can understand from A to B. And I can understand from B to C.")
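As a rough sketch of that idea, the intermediate values can also carry explicit type annotations, which restores the type information that inference hides. This assumes `ds` is a `Seq[Double]`; the answer does not say what `ds` is, and the sample data and the object name `SqrtBins` are hypothetical.

```scala
object SqrtBins {
  // Hypothetical sample data; the original answer does not define `ds`.
  val ds: Seq[Double] = Seq(-4.0, 0.0, 1.0, 2.0, 4.0, 5.0, 9.0)

  // Only non-negative numbers have a real-valued square root.
  val nonneg: Seq[Double] = ds.filter(_ >= 0)

  // Square roots, keyed by their integer part.
  val byBin: Map[Int, Seq[Double]] = nonneg.map(math.sqrt).groupBy(_.toInt)

  // Drop the keys and keep only the groups (a Map's iteration order is unspecified).
  val bins: Iterable[Seq[Double]] = byBin.map(_._2)
}
```

Each annotated `val` is a point where a reader can stop and check "from A to B" before moving on to "from B to C".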

+8

If you write comments at this level of detail, you are simply repeating what the code says.

For long function chains, define new functions to replace parts of the chain, and give them meaningful names. Then you can avoid comments: the names of the functions should themselves explain what they do.

The best comments are those that explain why the code does something. Well-written code should make the "how" obvious from the code itself.
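One way this might look, as a sketch only: the `Order` type, the helper names, and the business rule in the comment are all hypothetical and chosen purely for illustration.

```scala
object NamedSteps {
  // Hypothetical domain type, for illustration only.
  final case class Order(customer: String, amountCents: Long, cancelled: Boolean)

  // Each helper names one step of what would otherwise be a single long chain.
  def activeOrders(orders: Seq[Order]): Seq[Order] =
    orders.filterNot(_.cancelled)

  def totalPerCustomer(orders: Seq[Order]): Map[String, Long] =
    orders.groupBy(_.customer).map { case (name, os) =>
      name -> os.map(_.amountCents).sum
    }

  // The names carry the "how"; the comment is reserved for the "why".
  // Why (assumed rule): cancelled orders were never charged, so they must not count.
  def revenuePerCustomer(orders: Seq[Order]): Map[String, Long] =
    totalPerCustomer(activeOrders(orders))
}
```

The explicit return types on the helpers also document the shape of the data at each step.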

+10

If the operations in your chain are all monadic transformations (map, flatMap, filter), then it is often much, much clearer to rewrite the logic as a for-comprehension.

 coll.filter(predicate).map(transform) 

can be

 for (elem <- coll if predicate(elem)) yield transform(elem)

It is even easier to demonstrate the power of the technique if you have a longer sequence of operations, for example, with Cassin's example:

 def eligibleCustomers(products: Seq[Product]) = for {
   product <- products
   customer <- product.customers
   if customer.isPremium  // paying customers only
   if customer.age < 20   // eligible customers only
 } yield customer
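A self-contained sketch comparing a chained form with a for-comprehension over the same data. The `Customer` and `Product` case classes are assumptions (the thread never shows them), as are the object and method names. Guards are written as `if` clauses on the bound element, which the compiler translates into `withFilter` calls.

```scala
object Eligible {
  // Hypothetical minimal types; the thread does not show the real ones.
  final case class Customer(name: String, isPremium: Boolean, age: Int)
  final case class Product(name: String, customers: Seq[Customer])

  // Chained form.
  def eligibleChained(products: Seq[Product]): Seq[Customer] =
    products.flatMap(_.customers).filter(_.isPremium).filter(_.age < 20)

  // Comprehension form; it desugars to flatMap / withFilter / map calls.
  def eligibleFor(products: Seq[Product]): Seq[Customer] =
    for {
      product  <- products
      customer <- product.customers
      if customer.isPremium
      if customer.age < 20
    } yield customer
}
```

Both methods select the same customers in the same order; the comprehension simply gives each intermediate element a name.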
+6

If you do not want to split it into several methods, as hammar suggested, you can split the line and give names (and, if necessary, types) to the intermediate values.

 def eligibleCustomers: List[Customer] = {
   val customers = products.flatMap(_.customers)
   val paying = customers.filter(_.isPremium)
   val eligible = paying.filter(_.age < 20)
   eligible
 }
+2

Line length is a fairly natural indicator of when your chain is getting too long. :)

Of course, this will depend on how trivial the chain is:

 customerdata.filter (_.age < 40).filter (_.city == "Rio").
   filter (_.income > 3000).filter (_.joined < 2005).
   filter (_.sex == 'f'). ...

Recently I had an application of 3 files, one of them somewhat long, comprising 4 classes, one of them nontrivial, and 10 to 20 methods. Each method was 5 to 10 lines long, and each could easily have been condensed into a single line. I had to convince myself that, although measuring elegance in lines of code saved is not completely wrong, saving lines is not a goal in itself.

Splitting a method in two often lowers the complexity of a single line, but not the overall complexity of understanding the whole program.

If the problem domain is complex - filter the data at different levels, exchange them in columns, compare them, group them, compute averages, build graphs, paginate them ... - the complex work must be done somewhere.

The program is not easier to understand; you just have to hit Page Down less often. It takes an adjustment: you have to read each line of code more slowly.

+2

It doesn't bother me now that I'm used to Scala. If you want to be more explicit about types, you can always, for example, replace things like map(_.foo) with map { a: A => a.foo } to make the code more readable in long or complex operations. Not that I usually find the need to do this.
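A small sketch of the two spellings side by side. The case class `A`, its field `foo`, and the sample values are hypothetical stand-ins for the names in the answer. The parameter is parenthesized here, a form accepted by both Scala 2 and Scala 3.

```scala
object ExplicitTypes {
  // Hypothetical type standing in for `A` with a member `foo`.
  final case class A(foo: Int)

  val items: Seq[A] = Seq(A(1), A(2), A(3))

  // Placeholder syntax: the element type is inferred and not visible.
  val terse: Seq[Int] = items.map(_.foo)

  // Same operation with the parameter type spelled out.
  val explicit: Seq[Int] = items.map { (a: A) => a.foo }
}
```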

0