Sounds like C ++ link access?

I am building a clustering algorithm in C ++, but I am not very well versed in OOP and the state of variables (member data) that are changing. For an algorithm of some complexity, I find this an obstacle to my development.

So, I was thinking about changing the programming language to one of the functional languages: Ocaml or F #. Besides the need to change my mind about how to approach programming, there is something I need to clarify. In C ++, I use a double end queue to copy a time window through data. After a while, the oldest data is deleted and new data is added. Data that is not too old remains in the double-ended queue.

Another, more difficult task is to compare the properties of one of the objects. Each object is data from a certain period of time. And if I have a thousand data objects in a specific time window, I need to compare each of them ranging from one to twenty or thirty, depending. And some properties of the compared object may change as a result of this comparison. In C ++, I do all this with references, which means that I access objects in memory that they are never copied, so the algorithm runs at full speed (for my knowledge of C ++).

I read about functional programming, and the idea that I get is that each function performs some operation and the original data (input) does not change. This means that the language copies the data structure and performs the required conversion. If so, the use of functional programming will significantly slow down the execution of the algorithm. It's right? If not, that is, if there is a quick way to do the conversion in the data, can you show me how to do this? A very small example would be great.

I hope to have some kind of object. I read that both Ocaml and F # are used in research and scientific projects.

+7
f # ocaml
source share
4 answers

At a high level, your question is whether using immutable data is slower than using mutable data. The answer to this question is yes, in some cases it is slower. What is surprising (for me) is how small the penalty is. In most cases (in my experience) the additional time, which is often a log factor, deserves additional modularity and clarity of using immutable data. And in many other cases there is no penalty at all.

The main reason this is not as slow as you expected is that you can freely reuse any parts of the old data. No need to worry that some other part of the calculation will change the data later: it is immutable!

For the same reason, all calls to immutable data are similar to references in C ++. There is no need to make copies of the data, as other parts of the calculation cannot change it.

If you want to work this way, you need to structure your data in order to get reuse. If you cannot easily do this, you may want to use some (controlled) mutation.

Both OCaml and F # are mixed paradigm languages. They allow you to use mutable data if you want.

The most enlightened report on immutable data operations (IMHO) is Chris Okasaki's book Purely Functional Data Structures . (This link to Amazon is for information only, not necessarily an offer to buy a book :-) You can also find most of this information in Okasaki's doctoral dissertation .

+8
source share

In OCaml and F #, you can certainly implement a machine pointer . So you can keep direct links and update them. For example.

type 'a cell = { data : 'a; mutable lhs : 'a cell; mutable rhs : 'a cell; } 

In OCaml, this will display as a pointer to a data structure containing three words: a pointer to data and two pointers to siblings:

  +--------+ +-------+ +-------+ | cell |-------->| data |----->| | +--------+ |-------| +-------+ +---| lhs | | |-------| | | rhs |--+ | +-------+ | | +-------+ | +-------+ +-->| data | --->| data | |-------| |-------| | lhs | | lhs | |-------| |-------| | rhs | | rhs | +-------+ +-------+ 

So, there is nothing special here. This is the same thing that you can choose between a constant and an imperative implementation in C ++. But in C ++, you usually pay a greater cost of persistence due to the lack of support for the language itself. There is a generative garbage collector in OCaml with very cheap placement costs and other optimizations.

So, yes, you can implement your data structure in the usual (imperative) way. But before you do this, you must be sure that you are willing to pay for it. It is much easier to talk about functional code, rather than imperative. This is a topical reason why people choose and use the FP paradigm.

+5
source share

This means that the language copies the data structure and performs the required conversion.

Not necessary. If the objects are immutable (since they are by default for F # record types, in C ++, if all data members are const without using mutable ), then the link is in order.

If so, the use of functional programming will significantly slow down the execution of the algorithm. Is it correct?

Even with the foregoing, functional languages ​​tend to support lazy operations. In F #, with the right data structures / methods, this will be so. But he can also be impatient.

Example (not scary idiomatic, but trying to be clear):

 let Square (is : seq<'t>) = is |> Seq.map(fun n -> n*n) 

and then in

 let res = [1; 2; 3; 4] |> Square 

will not calculate any of the squares until you read the values ​​from re .

+2
source share

It is important to understand this in terms of two factors: mutation and sharing. You seem to be focusing on the aspect of mutation and seemingly ignoring sharing.

Take the standard list-append '@'; he copies left arg and shares right

So, yes, it’s true that you are losing efficiency by copying, but you will receive sharing accordingly. And therefore, if you streamline your data structures to maximize shared access, you will benefit from what you have lost due to the immutability caused by copying.

For the most part, this is “just happening.” However, sometimes you need to configure it.

A general example involving laziness in haskell:

 ones = 1 : ones 

this denotes an endless list of 1s [1,1,1,...] And the implementation can be expected to optimize it before the loop (pie chart)

  +-----------+ | | V | +---------+ | | | | | 1 |-->---+ | | +---------+ 

However, when we generalize it to an infinite list of x-es

 repeat x = x : repeat x 

the implementation has more difficult time detecting a loop because the ones variable has now become a (recursive) call function repeat x

Change it to

 repeat x = let repeat_x = x : repeat_x in repeat_x 

and the cycle (i.e. sharing) is restored.

+1
source share

All Articles