Why should I avoid using local mutable variables in Scala?

Question

I am new to Scala; until now I have mostly used Java. Right now I have warnings all over my code telling me to “Avoid mutable local variables”, and I have a simple question: why?

Suppose I have a small problem: find the maximum of four Ints. My first approach:

    def max4(a: Int, b: Int, c: Int, d: Int): Int = {
      var subMax1 = a
      if (b > a) subMax1 = b
      var subMax2 = c
      if (d > c) subMax2 = d
      if (subMax1 > subMax2) subMax1 else subMax2
    }

After taking this warning into account, I found another solution:

    def max4(a: Int, b: Int, c: Int, d: Int): Int = {
      max(max(a, b), max(c, d))
    }

    def max(a: Int, b: Int): Int = {
      if (a > b) a else b
    }

It looks nicer, but what is the idea behind it?

Whenever I approach a problem, I think about it like this: "Well, we start with this, and then gradually change things until we get an answer." I understand that the problem is that I am trying to change some initial state to get an answer, but I don’t understand why changing things, at least locally, is bad. How do you iterate over a collection, then, in functional languages like Scala?

As an example: suppose we have a list of ints, how to write a function that returns a list of ints that are divisible by 6? It is impossible to come up with a solution without a local mutable variable.

+6

immutability scala functional-programming

Oleksii duzhyi Aug 2 '15 at 19:49

4 answers

This is not so much about Scala as about functional programming methodology in general. The idea is this: if you have constant variables (final in Java), you can use them without any fear that they will change. Likewise, you can parallelize your code without worrying about race conditions or thread-unsafe code.

This is not so important in your example, but imagine the following example:

    import scala.concurrent.Future
    import scala.concurrent.ExecutionContext.Implicits.global

    val variable = ...
    Future { function1(variable) }
    Future { function2(variable) }

With final variables you can be sure that there will be no problems. Otherwise, you would have to check the main thread as well as function1 and function2.

Of course, you can get the same result with mutable variables if you never change them. But with immutable ones, you can be sure that this is the case.
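As a minimal sketch of the point above, the following shares one immutable value between two concurrent tasks. The names `sumOf`, `maxOf` and `run` are hypothetical stand-ins for `function1` and `function2`, and blocking with `Await` is only there to keep the demo small.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SharedImmutableExample {
  // Hypothetical stand-ins for function1 and function2 from the answer.
  def sumOf(xs: Vector[Int]): Int = xs.sum
  def maxOf(xs: Vector[Int]): Int = xs.max

  def run(): (Int, Int) = {
    // An immutable value: neither task can change it under the other.
    val shared = Vector(3, 1, 4, 1, 5)

    val sumTask = Future { sumOf(shared) }
    val maxTask = Future { maxOf(shared) }

    // Combine both results; Await is used only for demonstration purposes.
    Await.result(sumTask.zip(maxTask), 5.seconds)
  }
}
```

Because `shared` is an immutable `Vector` bound to a `val`, the result does not depend on which task happens to run first.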

Edit, in response to your edit:

Local mutable variables are not always bad, so you can use them. However, if you try to think without them, you can arrive at solutions like the one you posted, which is cleaner and can be parallelized easily.

How do you iterate over a collection, then, in functional languages like Scala?

You can always iterate over an immutable collection, as long as you do not change anything. For instance:

    val list = Seq(1, 2, 3)
    for (n <- list) println(n)

Regarding the second thing you said: you have to stop thinking in the traditional way. In functional programming, using map, filter, reduce and so on is normal, as are pattern matching and other concepts that are not typical of OOP. For the example you give:

As an example: suppose we have a list of ints, how to write a function that returns a list of ints that are divisible by 6?

    val list = Seq(1, 6, 10, 12, 18, 20)
    val result = list.filter(_ % 6 == 0)
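To make the map, filter and reduce operations mentioned above a little more concrete, here is a small sketch that chains all three on the same list. The object and value names are made up for illustration.

```scala
object HigherOrderBasics {
  val numbers: Seq[Int] = Seq(1, 6, 10, 12, 18, 20)

  // filter keeps the elements that satisfy the predicate.
  val sixes: Seq[Int] = numbers.filter(_ % 6 == 0)   // 6, 12, 18

  // map transforms every element into a new collection.
  val doubled: Seq[Int] = sixes.map(_ * 2)           // 12, 24, 36

  // reduce combines the elements pairwise into a single value.
  val total: Int = doubled.reduce(_ + _)             // 72
}
```

None of these calls modifies `numbers`; each one returns a new value, so no `var` is needed.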

+5

Álvaro Reneses Aug 2 '15 at 20:04

Your two main questions:

Why warn about local state changes?
How can you iterate over collections without mutable state?

I will answer both.

Warnings

The compiler warns about the use of mutable local variables because they are often a cause of errors. This does not mean that is always the case. However, your sample code is pretty much a classic example of mutable local state being used entirely unnecessarily, in a way that makes the code not only more error-prone and less clear but also less efficient.

Your first code sample is less efficient than your second, functional solution. Why potentially make two assignments to subMax1 when you only ever need one? You ask which of the two inputs is larger anyway, so why not ask that first and then make a single assignment? Why was your first approach to store partial state temporarily, halfway through the process of asking such a simple question?

Your first code sample is also inefficient because of unnecessary code duplication. You repeatedly ask "which is the larger of two values?" Why write the code for that three separate times? Needlessly repeated code is a known bad habit in OOP every bit as much as in FP, and for the same reasons. Each time you repeat code unnecessarily, you open up a potential source of errors. Adding mutable local state (especially when it is not necessary) only adds fragility and the potential for hard-to-spot errors, even in short code. You only need to type subMax1 instead of subMax2 in one place, and you may not notice the error for a while.

The second, FP solution eliminates the code duplication, significantly reducing the chance of error, and shows that there was simply no need for mutable local state. It is also, as you say yourself, cleaner and clearer, and better than the alternative solution in om-nom-nom's answer.

(By the way, the idiomatic Scala way of writing such a simple function is

    def max(a: Int, b: Int) = if (a > b) a else b

whose terser style emphasizes its simplicity and makes the code less verbose.)
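For comparison, here is a hedged sketch of a few equivalent ways to write `max4` without any `var`. The names `max4Infix` and `max4Seq` are invented for this example; `a max b` and `Seq(...).max` come from the standard library.

```scala
object Max4 {
  def max(a: Int, b: Int): Int = if (a > b) a else b

  // Composition of the two-argument function, as in the question's second solution.
  def max4(a: Int, b: Int, c: Int, d: Int): Int = max(max(a, b), max(c, d))

  // The same idea using the `max` method the standard library provides on Int.
  def max4Infix(a: Int, b: Int, c: Int, d: Int): Int = (a max b) max (c max d)

  // Or treat the arguments as a collection and ask it for its maximum.
  def max4Seq(a: Int, b: Int, c: Int, d: Int): Int = Seq(a, b, c, d).max
}
```

All three versions state the comparison once and reuse it, which is the duplication argument made above.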

Your first solution was inefficient and fragile, but it was your first instinct. The warning made you find a better solution. The warning proved its worth. Scala was designed to be accessible to Java developers and is taken up by many people with years of experience in an imperative style and little knowledge of FP. Their first instinct is almost always the same as yours. You have demonstrated how this warning can help improve your code.

There are times when using mutable local state can be faster, but the advice of Scala experts in general (and not just the true FP believers) is to prefer immutability and to reach for mutability only where there is a clear case for it. This goes so much against the instincts of many developers that the warning is useful even if it annoys experienced Scala developers.

It's funny how often some kind of max function comes up in "new to FP/Scala" questions. The questioner is very often tripping over errors caused by their use of local state ... this link both demonstrates the often stubborn attachment to mutable state among some developers and leads me on to your other question.

Functional iteration over collections

There are three functional ways to iterate over collections in Scala.

For comprehensions
Explicit Recursion
Folds and other higher order functions

For comprehensions

Your question:

Suppose we have a list of ints, how to write a function that returns a list of ints that are divisible by 6? It is impossible to come up with a solution without a local mutable variable

Answer: assuming that xs is a list (or some other sequence) of integers, then

 for (x <- xs; if x % 6 == 0) yield x

will give you a sequence (of the same type as xs) containing only those elements that are divisible by 6, if any. No mutable state is required. Scala just iterates over the sequence for you and returns everything that matches your criteria.
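A short sketch of the same comprehension wrapped in functions, to show that the result keeps the collection type of the input. The function names are hypothetical. The compiler rewrites such a comprehension into `withFilter` and `map` calls on the collection.

```scala
object ForComprehensionExample {
  // The guard (`if ...`) drops elements; `yield` builds the result collection.
  def divisibleBy6(xs: List[Int]): List[Int] =
    for (x <- xs if x % 6 == 0) yield x

  // The yielded expression can also transform each surviving element.
  def sixthsOf(xs: Vector[Int]): Vector[Int] =
    for (x <- xs if x % 6 == 0) yield x / 6
}
```

A `List` goes in and a `List` comes out; the same holds for the `Vector` version.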

If you have not yet learned about the power of for comprehensions (also known as sequence comprehensions), you really should. They are a very expressive and powerful piece of Scala syntax. You can even use them with side effects and mutable state if you want (look at the last example of the tutorial I just linked to). That said, there can be unexpected performance penalties, and some developers overuse them.

Explicit Recursion

In the question I linked to at the end of the first section, I give in my answer a very simple, explicitly recursive solution for returning the largest Int from the list.

    def max(xs: List[Int]): Option[Int] = xs match {
      case Nil => None
      case List(x: Int) => Some(x)
      case x :: y :: rest => max( (if (x > y) x else y) :: rest )
    }

I will not explain how pattern matching and explicit recursion work (read my other answer or this one). I will just show you the technique. Most Scala collections can be iterated over recursively, without any need for mutable state. If you need to keep track of something along the way, you pass along an accumulator. (In my example code, I stick the accumulator at the front of the list to keep the code short, but look at the other answers to those questions for a more conventional use of accumulators.)

But here is a (naive) explicitly recursive way of finding those integers divisible by 6:

    def divisibleByN(n: Int, xs: List[Int]): List[Int] = xs match {
      case Nil => Nil
      case x :: rest if x % n == 0 => x :: divisibleByN(n, rest)
      case _ :: rest => divisibleByN(n, rest)
    }

I call it naive because it is not tail recursive and so could blow your stack. A safer version can be written using an accumulator list and an inner helper function, but I leave that exercise to you. The result will be less pretty than the naive version, no matter how you try, but the effort is educational.
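One possible shape for that exercise, offered only as a sketch: an inner helper (here called `loop`, a name chosen for this example) carries an accumulator list, and `@tailrec` asks the compiler to verify that the recursion really is in tail position.

```scala
import scala.annotation.tailrec

object DivisibleTailRec {
  def divisibleByN(n: Int, xs: List[Int]): List[Int] = {
    // acc collects matches in reverse order, because prepending to a List is cheap.
    @tailrec
    def loop(remaining: List[Int], acc: List[Int]): List[Int] = remaining match {
      case Nil                     => acc.reverse
      case x :: rest if x % n == 0 => loop(rest, x :: acc)
      case _ :: rest               => loop(rest, acc)
    }
    loop(xs, Nil)
  }
}
```

The accumulator is built back to front and reversed once at the end, which is part of why this version reads less nicely than the naive one.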

Recursion is a very important technique to learn. However, once you have learned how to do it, the next important thing to learn is that you can usually avoid using it explicitly yourself ...

Folds and other higher order functions

Have you noticed how similar my two explicit recursion examples are? That is because most recursions over a list have the same basic structure. If you write many such functions, you will repeat that structure many times. That makes it boilerplate: a waste of your time and a potential source of errors.

Now, there are many sophisticated ways to explain folds, but one simple view is that they abstract over recursion. They take care of the recursion and of managing the accumulator values for you. All they ask is that you provide an initial value for the accumulator and the function to apply at each iteration.

For example, here is one way to use fold to extract the highest Int from an xs list

 xs.tail.foldRight(xs.head) {(a, b) => if (a > b) a else b}

I know that you are not familiar with folds, so this may look like gibberish to you, but you will surely recognize the lambda (anonymous function) that I pass in on the right. What I am doing is taking the first item in the list (xs.head) and using it as the initial value for the accumulator. Then I tell the rest of the list (xs.tail) to iterate over itself, comparing each item in turn with the accumulator value.

This kind of thing is a common case, so the collection API designers have provided a shorthand version:

 xs.reduce {(a, b) => if (a > b) a else b}

(If you look at the source code, you will see that they implemented it using a fold).
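Both snippets above throw an exception on an empty list, since there is no head to start from. A hedged sketch of two ways to handle that case follows; the function names are made up, while `foldLeft` and `reduceOption` are standard collection methods.

```scala
object FoldMax {
  // Pattern match first, then fold the tail with the head as the initial accumulator.
  def maxOption(xs: List[Int]): Option[Int] = xs match {
    case Nil          => None
    case head :: tail => Some(tail.foldLeft(head)((acc, x) => if (x > acc) x else acc))
  }

  // The library shorthand: reduceOption returns None for an empty collection.
  def maxOptionStd(xs: List[Int]): Option[Int] = xs.reduceOption(_ max _)
}
```

This sketch uses `foldLeft` rather than `foldRight`; for finding a maximum the direction does not change the result.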

Anything you might want to do iteratively can be done to a Scala collection using a fold. Often, the API designers have provided a simpler higher-order function that is implemented, under the hood, using a fold. Want to find those Ints divisible by six again?

 xs.foldRight(Nil: List[Int]) {(x, acc) => if (x % 6 == 0) x :: acc else acc}

This starts with an empty list as the accumulator and iterates over each item, adding only those divisible by 6 to the accumulator. Again, a simpler higher-order function has been provided for you:

 xs filter { _ % 6 == 0 }

Folds and related higher-order functions are harder to understand than for comprehensions or explicit recursion, but they are very powerful and expressive (to anyone who understands them). They eliminate boilerplate, removing a potential source of errors. Because they are implemented by the core language developers, they can be more efficient (and their implementation can change as the language evolves, without breaking your code). Experienced Scala developers use them in preference to for comprehensions or explicit recursion.

TL;DR

Learn for comprehensions
Learn explicit recursion
Do not use them if a higher-order function will do the job.

+1

itsbruce Aug 6 '15 at 11:57

First, you can rewrite your example as follows:

    def max(first: Int, others: Int*): Int =
      if (others.isEmpty) first  // nothing left to compare (also covers a single argument)
      else max(Math.max(first, others.head), others.tail: _*)

This uses varargs and tail recursion to find the largest number. Of course, there are many other ways to do the same thing.

To answer your question: it is a good question, and one I thought about myself when I first started using Scala. Personally, I think the whole immutable/functional approach to programming is somewhat overhyped. But for what it's worth, here are the main arguments in favor of it:

Immutable code is easier to read (subjective)

Immutable code is more robust: it is certainly true that changing mutable state can lead to bugs. Take this for example:

    for (int i = 0; i < 100; i++) {
        for (int j = 0; j < 100; i++) {  // the bug: increments i instead of j
            System.out.println("i is " + i + " and j is " + j);
        }
    }

This is an oversimplified example, but it is still easy to miss the bug, and the compiler won't help you.

Mutable code is generally not thread safe. Even trivial and seemingly atomic operations are unsafe. Take i++, for example: it looks like an atomic operation, but it is actually equivalent to:

    int i = 0;
    int tempI = i + 1;  // read the current value and add one
    i = tempI;          // write the result back as a separate step

Immutable data structures will not let you do something like this, so you have to think explicitly about how to deal with it. Of course, as you point out, local variables are generally thread safe, but there is no guarantee. It is possible to pass a ListBuffer instance variable as a parameter to a method, for example.
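A small sketch of how a seemingly local mutable structure can be changed from elsewhere once it is passed along. The helper `logAndClear` is hypothetical and exists only to illustrate the point.

```scala
import scala.collection.mutable.ListBuffer

object EscapingBuffer {
  // Hypothetical helper that quietly mutates its argument.
  def logAndClear(buffer: ListBuffer[Int]): Int = {
    val seen = buffer.size
    buffer.clear()
    seen
  }

  def run(): (Int, Int) = {
    // `local` is a val, but the object it refers to is still mutable.
    val local = ListBuffer(1, 2, 3)
    val seen = logAndClear(local)
    (seen, local.size) // the caller's buffer is now empty
  }
}
```

With an immutable `List` in place of the `ListBuffer`, the helper could not alter what the caller sees.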

However, there are downsides to the immutable and functional programming styles:

Performance. As a rule, it is slower both at compile time and at runtime. The compiler must enforce immutability, and the JVM must allocate more objects than would be needed with mutable data structures. This is especially true of collections.

Most Scala examples show something like val numbers = List(1,2,3), but hardcoded values are rare in the real world. We usually build collections dynamically (from a database query, etc.). While Scala can reassign the values in a collection, it must still create a new collection object each time you modify it. If you want to add 1000 elements to an (immutable) Scala List, the JVM will need to allocate (and then GC) 1000 objects.
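One common way to build a collection dynamically and still hand out an immutable result is to fill a builder and convert once at the end. This is only a sketch; `loadNumbers` is a hypothetical stand-in for something like reading rows from a database.

```scala
object BuildThenFreeze {
  // Hypothetical loader: accumulates values, then returns an immutable List.
  def loadNumbers(count: Int): List[Int] = {
    val builder = List.newBuilder[Int] // mutable, but never leaves this method
    for (i <- 1 to count) builder += i
    builder.result()
  }
}
```

The mutation is confined to the method body, so callers only ever see an immutable `List`.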

It’s hard to maintain. Functional code can be very hard to read; you often see code like this:

 val data = numbers.foreach(_.map(a => doStuff(a).flatMap(somethingElse)).foldleft("", (a : Int,b: Int) => a + b))

I do not know about you, but I find this code very hard to follow!

It’s hard to debug. Functional code can also be difficult to debug. Try setting a breakpoint halfway through my (terrible) example above.

My advice would be to use a functional/immutable style where it genuinely makes sense and where you and your colleagues feel comfortable with it. Do not use immutable structures just because they are cool or clever. Complicated and clever solutions will get you bonus points at uni, but in the commercial world we want simple solutions to complex problems! :)

0

Toby hobson Aug 2 '15 at 20:58

om-nom-nom · Accepted Answer · 2015-08-02T20:00:15+0000

In your specific case, there is another solution:

    def max4(a: Int, b: Int, c: Int, d: Int): Int = {
      val submax1 = if (a > b) a else b
      val submax2 = if (c > d) c else d
      if (submax1 > submax2) submax1 else submax2
    }

Isn't it easier to follow? Of course, I am a little biased, but I tend to think that it is. BUT do not follow this rule blindly. If you see that some code can be written more readably and concisely in a mutable style, write it that way: the great strength of Scala is that you do not have to commit to either the immutable or the mutable approach; you can switch between them (by the way, the same applies to using the return keyword).

As an example: suppose we have a list of ints, how to write a function that returns a sublist of ints that are divisible by 6? It is impossible to come up with a solution without a local mutable variable.

You can, of course, write such a function using recursion, but, again, if the mutable solution looks good and works well, why not?
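For a side-by-side feel, here is a sketch of the sublist function in both styles. The function names are invented for this example; in the mutable version the `ListBuffer` never leaves the method.

```scala
import scala.collection.mutable.ListBuffer

object MutableVsImmutable {
  // Mutable style: a local buffer that is filled in a loop and then frozen.
  def divisibleBy6Mutable(xs: List[Int]): List[Int] = {
    val result = ListBuffer.empty[Int]
    for (x <- xs) if (x % 6 == 0) result += x
    result.toList
  }

  // Immutable style: the same result using a higher-order function.
  def divisibleBy6Immutable(xs: List[Int]): List[Int] = xs.filter(_ % 6 == 0)
}
```

Both return the same list; which one reads better is the judgment call the answer describes.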
