What holds back genetic programming?

I've had reasonable success working with genetic algorithms, but so far I have ignored genetic programming. As far as I know, most programs are still written by programmers, and I'm curious to know what is holding genetic programming back.

Some possible explanations that I thought of are as follows:

  • The search space is too large to find useful programs amid the noise
  • Most real-world applications cannot provide enough data to evaluate fitness across such a space.
  • It is difficult to reduce the effectiveness of many real-world applications to a single fitness measure. In other words, writing a suitable fitness function is likely to take about as much work as writing the actual program.

Any ideas?

+55
algorithm evolutionary-algorithm genetic-programming
Dec 07 '10 at 19:11
14 answers

This is something I have looked at in my own research, and I would say there are many reasons:

  • The vast majority of research in GP focuses on producing mathematical formulas, not the kind of software most programmers write. There are many computer scientists in the field, but very few are focused on producing the kinds of programs you would expect, so progress in that direction has been slow.

  • There is a heavy emphasis on using Lisp because it yields elegant tree structures that are easy to manipulate; unfortunately, imperative programs have been neglected because they pose some difficult problems. For GP to be taken seriously by programmers, it has to produce imperative programs.

  • Real programs contain loop constructs, but loops are hard to implement in GP without ugly restrictions to prevent infinite loops.

  • Genetic programming does not scale well. It is fine for small problems with a small available language, but as you say in your first point, the search space grows too large very quickly.

  • Compared to a human programmer, GP can be very slow. However, it is highly parallelizable, which should become increasingly useful as multi-core processors become the norm.
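
The loop problem above is usually handled with exactly the kind of "ugly restriction" mentioned: every individual is evaluated under a fixed step budget, and non-termination is scored as failure. A minimal sketch (the names and structure here are illustrative, not from any particular GP library):

```python
class BudgetExceeded(Exception):
    """Raised when an evolved program exceeds its step budget."""

def run_with_budget(program, arg, max_steps=10_000):
    """Evaluate an evolved program, aborting runaway loops.

    `program` takes (arg, tick); a real GP interpreter would call
    tick() itself once per evaluated node rather than trusting the
    program to do so.
    """
    steps = 0

    def tick():
        nonlocal steps
        steps += 1
        if steps > max_steps:
            raise BudgetExceeded()

    try:
        return program(arg, tick)
    except BudgetExceeded:
        return None  # score non-termination as worst-case fitness

def well_behaved(x, tick):
    total = 0
    for i in range(x):
        tick()
        total += i
    return total

def runaway(x, tick):
    while True:  # an evolved infinite loop
        tick()
```

Here `run_with_budget(well_behaved, 5)` completes normally, while `runaway` is cut off once it burns through its budget; the fitness function then treats the aborted individual as a failure.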

Another popular answer would be that it is hard to trust automatically generated code. That is true, but in practice I doubt it has much impact, because GP is not yet able to produce the right kinds of programs in the first place.

+37
Dec 07 '10 at 20:00

The hard part about genetic programming is writing a good scoring function:

  • The scoring function must correctly determine whether the algorithm has the desired properties. This is harder than it sounds: much harder than writing a test suite. The evolved algorithm can adapt to any quirk in your scoring function and optimize against it. Trying to evolve strcmp? Your algorithm may instead discover a mathematical pattern in the lengths of your pass/fail test cases.

  • The scoring function must be able to judge whether the algorithm partially works. Genetic programming relies on hill climbing: a tiny beneficial mutation should produce a small incremental improvement in score. If your scoring function can only return true/false, you are searching randomly, not genetically.
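
A toy illustration of this last point, with a string-matching target (both the target and the scoring scheme are purely illustrative):

```python
TARGET = "hello"  # stand-in for whatever behavior is being evolved

def boolean_fitness(candidate: str) -> int:
    # Pass/fail scoring: gives evolution no gradient to climb.
    return 1 if candidate == TARGET else 0

def graded_fitness(candidate: str) -> int:
    # Counts matching characters, so a one-character improvement
    # raises the score and selection can reward partial progress.
    return sum(a == b for a, b in zip(candidate, TARGET))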

If you try your hand at this, you will find that genetic programming is the ultimate in lateral thinking: your programs will solve the problem in every way you did not think of, most of them unexpected and (for serious applications) probably useless. In one well-known case, researchers attempted to evolve an oscillator using basic electronic components. Candidates were judged on the simplicity of the circuit and on whether the output oscillated. The winner was something so simple that the researchers were sure it could not work, but it did: it picked up and amplified radio waves from the environment.

Edit, to add the citation:

Graham-Rowe, Duncan. "Radio emerges from the electronic soup." New Scientist, vol. 175, no. 2358, p. 19 (31 August 2002). Available online at http://www.newscientist.com/news/news.jsp?id=ns99992732

However, that link is now broken; a working one is http://www.newscientist.com/article/dn2732-radio-emerges-from-the-electronic-soup.html

+25
Dec 07 '10 at 19:32

I know I'm late to this party, but I just wanted to make a couple of points. I had the great pleasure of working with John Koza's book Genetic Programming IV.

The kind of programming most of us do every day, wiring up GUIs, pushing pixels, populating databases and so on, is of course not the kind of program GP aims to build.

What John Koza does with his cluster of about a hundred machines (if I remember correctly!) is search for solutions to genuinely interesting problems (think NP-hard). It's sad, but most of the problems we programmers work on every day are not very interesting or complex; they are mostly just annoying and time-consuming.

What Ben Jackson said about the genetically evolved electronic oscillator is the closest thing in this thread to what Dr. Koza and his team actually do. There are many such examples in the GP series of books.

What Tom Castle said here about imperative programs is not entirely true. A flagship example from John and company is an antenna-design run. It is essentially a 3D drawing program, with commands much like the LOGO turtle language, that draws out the antenna: it has moveto and lineto commands, plus operations for pushing and popping matrices on a stack. The GP package I looked at just last week, JGAP, has direct support for this style: a non-terminal container type that can hold void-returning statements, with a return statement at the end. I believe it even has something like local variables, though I did not look too closely.

The original Lisp representation that early GP runs revolved around is dated by now, and it shows. Translating GP results into more useful code is a bit of a rite of passage, and an annoying one in my experience.

TL;DR: GP is not an automatic programming system. It is an automatic invention system.

+8
Jan 29 '13 at 8:51

I would say 1 and 3.

In particular, regarding point 3, I would say that in most cases it is actually easier to write the program itself than to create a suitable objective function and verify that it leads to a correct or acceptable solution (how do you know that an algorithm obtained by genetic programming works as expected?).

As for point 1, I would say that the search space has, in effect, an infinite number of dimensions.

+5
Dec 07 '10 at 19:33

Three things:

  • As Andre says, it is very difficult to write an adequate fitness function. This is the ultimate version of Test-Driven Development: you would need to write unit tests giving 100% coverage of an as-yet-unwritten program, and even then 100% coverage alone is unlikely to be enough. Only once we have solved the problem of formally specifying the exact behavior of every aspect of a complex system, and of verifying that a program meets that specification, will we have a chance.
  • Genetic programming is non-deterministic and better suited to producing approximate solutions than exact ones.
  • Genetic programming applied to any problem of reasonable complexity is phenomenally computationally expensive. As far back as 1999, Genetic Programming Inc. was using a 1,000-node cluster for its work in this area.
+4
Dec 07 '10 at 19:47

GP cannot quickly solve novel problems that lie beyond the knowledge of the person creating the fitness function. So at present it can only be applied to problems we already know how to solve but cannot solve perfectly because of the size of the search space, such as the Traveling Salesman Problem, which can be tackled faster with a GA.
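
To make the TSP-with-a-GA remark concrete, here is a minimal sketch; the operator choices (truncation selection, order crossover, swap mutation) are just one common setup, not taken from any particular paper:

```python
import random

def tour_length(tour, dist):
    # Closed tour: dist from the last city back to the first is included.
    return sum(dist[tour[i - 1]][tour[i]] for i in range(len(tour)))

def order_crossover(p1, p2):
    """Order crossover (OX): copy a slice from p1, fill the rest in p2's order."""
    a, b = sorted(random.sample(range(len(p1)), 2))
    child = [None] * len(p1)
    child[a:b] = p1[a:b]
    fill = [c for c in p2 if c not in child]
    for i in range(len(child)):
        if child[i] is None:
            child[i] = fill.pop(0)
    return child

def evolve_tsp(dist, pop_size=60, generations=200):
    n = len(dist)
    pop = [random.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda t: tour_length(t, dist))
        survivors = pop[: pop_size // 2]          # truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = random.sample(survivors, 2)
            child = order_crossover(p1, p2)
            if random.random() < 0.3:             # swap mutation
                i, j = random.sample(range(n), 2)
                child[i], child[j] = child[j], child[i]
            children.append(child)
        pop = survivors + children
    return min(pop, key=lambda t: tour_length(t, dist))
```

Note how simple the fitness function is here: `tour_length` is a few lines, because the specification of TSP is trivial to state. That is exactly the situation the answer describes, in contrast to evolving arbitrary programs.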

The reason is actually quite simple: to solve novel problems, the primitives you give GP must be simple enough, or complete enough, that GP's search space becomes a true analogue of real genetics.

Genetic programming, like real genetics, follows the same growth pattern as all platform-based systems: GP reaches a point where the basic utilities are baked into the platform, progress levels off, and then it takes a LONG time before it stumbles onto a new paradigm that establishes a new platform.

This evolution video illustrates these platform growth patterns: http://www.youtube.com/watch?v=mcAq9bmCeR0 It takes a long time to evolve a new limb, but once one appears, the extra limb becomes the new normal and the population quickly advances toward the optimum of more limbs. (I should mention that this video most likely uses a GA, not GP, but genetics is genetics.)

It's much the same logic that goes into the theory of the Singularity, BTW.

But for GP this means it is very good for research rather than for practical use inside a program, with a few simple exceptions where the requirements run a little deeper than a GA can handle. Some scheduling programs are an example: there the program search space is quite small, the tools needed to solve the problem are well understood, and a well-defined fitness function is possible.

+4
Dec 08 '10

It is simply because this has been proven theoretically impossible (at least for provably correct programs).

Suppose you had infinite computing power, setting aside the size of the search space, evaluation speed, and other performance concerns. You would still face two problems:

  • Will the generated program halt?
  • How can I be sure the generated program behaves the way I want?

The first problem is a central question of computability theory and is called the halting problem. Turing proved in 1936 that it is undecidable: no single automated procedure can determine, for every program/input pair, whether the program halts. It is possible in some cases, but not in all. So there is not much you can do about this ;)
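
Turing's diagonal argument can be sketched informally in code. Suppose someone hands you a claimed halting oracle `halts(prog)`; you can then build the program it must get wrong (this is only an illustration of the idea, not a rigorous proof):

```python
def make_diagonal(halts):
    """Given a claimed halting oracle, build the program it must misjudge."""
    def g():
        if halts(g):
            while True:   # oracle says "g halts", so g loops forever
                pass
        # oracle says "g loops", so g halts immediately
    return g

# No fixed answer works. For example, an oracle that always answers
# "loops" is refuted because its diagonal program promptly halts.
# (An oracle answering "halts" is refuted too; we just can't *run*
# that case, since its diagonal program never returns.)
always_loops_oracle = lambda prog: False
g = make_diagonal(always_loops_oracle)
```

Calling `g()` here returns immediately, contradicting the oracle's verdict, which is exactly why no oracle can be right about every program.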

The second problem concerns program correctness. In genetic programming, validation is usually done with approximate fitness values, which constitute no proof of correctness. This is comparable to unit testing: it builds confidence over a number of examples but is not a general proof. For example, if I write a primality-testing program and test it only with 3, 5, 7, and 11, then a program that returns true for every odd number will pass.
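
A sketch of exactly that failure mode (the function names are mine, purely illustrative):

```python
def looks_prime(n):
    """An 'evolved' candidate that merely checks oddness."""
    return n % 2 == 1

def is_prime(n):
    """Ground truth, for comparison."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

FITNESS_CASES = [3, 5, 7, 11]  # the test set from the text: all odd primes
# looks_prime agrees with is_prime on every fitness case,
# yet it is wrong on 9 (odd but composite) and on 2 (even but prime).
```

The evolved candidate gets a perfect score on the fitness cases while being wrong almost everywhere else, which is the overfitting problem discussed below.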

The next step would be automated proof. Automated proof of algorithm correctness is in fact deeply connected to automated theorem proving in mathematics: you describe your program in an axiomatized system and try to prove the correctness statement automatically. Here again you run into strong theoretical barriers, namely Gödel's incompleteness theorems. They state, among other things, that even for very simple axiomatized systems capable of arithmetic on the natural numbers, there is no algorithm (no "effective procedure") capable of proving all true theorems about those numbers. Concretely, this means that even for simple programs you may be unable to prove correctness.

Even leaving proofs of correctness aside, using test cases to evaluate a genetic program is very prone to over-specialization, a phenomenon known in machine learning as overfitting. That is, the evolved program may perform perfectly on the provided test cases but go completely haywire on other inputs.

+3
Feb 17 '13 at 11:50

Perhaps because most programmers are programmers, not computer scientists?

Genetic programming requires serious skills. You must be able to pose the problem correctly, and first you need an appropriate problem. You also need to know enough to realize that genetic algorithms are even an option. And the problem must not already have a well-known solution.

Not everyone needs a fancy algorithm. Of all the code written in the world ("standard" webapps, operating systems, device drivers), how much really needs genetic algorithms?

And when it comes down to it, most programming effort is devoted to solving simple problems where a complex approach is not needed.

+2
Dec 07 '10 at 7:12

I think most of the reason is risk management. Bear with me: when a programmer sits down to write a program, they have at least some idea of how long it will take and what they will be able to achieve.

When instead the programmer sits down to write a program that will generate a program (using genetic programming), uncertainty goes through the roof: it is unclear how long the process will take, and unclear how good the final program will be.

There is also uncertainty elsewhere: how easy will it be to modify the program later, or to fix bugs? Generated code can be nearly impossible to debug.

+2
Dec 07 '10 at 19:29

The primordial soup looks suspicious and unappetizing. For my production code, I prefer intelligent design.

+1
Dec 08 '10 at 19:47

My personal opinion, after a couple of years studying GP at university and afterwards: GP does not scale.

Simple tasks expressed as simple GP fitness functions can be solved just fine. But, as mentioned earlier, formulating a fitness function that describes the task of evolving strcmp, or a calculator, or even a simple text editor is next to impossible with the classical approaches.

I do like the way the GP fitness function concept takes TDD to its logical extreme, though :-) Maybe some bright mind will come up with a better way to steer simulated evolution, but that has not happened yet.

I hold out some hope for GP in the area of implicit fitness functions, where I am currently doing some private research. Let's see how far it goes!

Cheers, Jay

+1
Dec 11 '11 at 12:23

There are several projects aimed at solving the problems with GP mentioned above. One example is OpenCog's MOSES.

+1
Oct 12 '13 at 7:47

Programming today amounts to defining a precise specification in a machine-readable way: you tell the machine exactly what you want and what the result should look like, and there is not much more to it. This means that if you want to generate a result via, e.g., genetic programming, you still have to produce that exact machine-readable specification, in the form of a fitness function. That leads to at least the same amount of work, and likely more. So this is simply the wrong field for genetic programming, which was made for problems that are easy to specify but whose solutions are hard to reach.

edit: you might now object that in most projects this precise specification is produced anyway, in the form of tests. I would say that, for genetic programming, tests are too imprecise a way to specify what you need. They are based on examples, not on a formal specification language the way programming is. Genetic programming is therefore likely to produce results that satisfy the test cases yet behave incorrectly in real situations with new inputs. So, to bring tests to a level of precision that matches a specification written in a programming language, you would need a huge number of test cases for every situation, and you would end up doing far more work than just programming.

0
Mar 11 '18

GP and GA, or evolutionary algorithms in general, are more of a hobby. With few exceptions, they are not used in real applications. The reason is that these algorithms are based on randomness: there is no certainty of getting a good answer.

In addition, the field is flooded with subpar research, because it is easy to understand and implement compared with other, more mathematically intensive methods. So students simply find a goal to optimize in some engineering or scientific application, and there is a new paper to publish.

Simply compare the sponsors of evolutionary-computation conferences with those of other AI/ML/optimization conferences. Their "current" value to industry will be clear.

But it is an area of research, and it is up to us to make it work. For now, it is just a hobby!

0
Nov 17 '16 at 11:33


