Why do compilers only look back for type and function declarations?

This is purely to satisfy my curiosity, but why may functions and types be referenced only if they were previously declared in the code, rather than anywhere within the same scope?

This shows up when functions must call each other:

    void foo(int depth) { bar(depth + 2); }
    void bar(int depth) { if (depth > 0) foo(depth - 3); }

Here you need to either move bar above foo or declare bar in advance:

    void bar(int);
    void foo(int depth) { bar(depth + 2); }
    void bar(int depth) { if (depth > 0) foo(depth - 3); }

In more complex cases, you may have to trawl through the #include tree and the #ifndef include guards to figure out why your code does not recognize a function or type defined in some other file, and that is assuming you already know that this is the thing to check.


And then, of course, there is this classic:

    typedef struct {
        Node *next;   /* error: the name Node is not declared yet */
    } Node;

where you need to know that this works instead:

    typedef struct Node {
        struct Node *next;
    } Node;

... but apparently many people do not, so they simply use void * or write out struct Node everywhere.


So, is there a reason these compilers do not also allow forward references? I can understand that the preprocessor only looks backwards, since a macro can be #undefined, but once a type or function is declared it stays declared forever and cannot be overridden by a later definition.

Is this for historical reasons, a consequence of the limited technology available when these languages were first developed? Or is there some logical ambiguity that I am missing?

+6
3 answers

The answer to your question is simply that "it would be much harder to write a compiler otherwise". The language specification says it must work this way, but the reason the specification is worded like that is that it makes the compiler easier to write. To some extent it is also likely that, in the old days, compilers generated code "as they went" and did not first read all the code in a translation unit (source file) and process it afterwards.

Remember that C and C++ compilers are still used on machines that do not have huge amounts of memory or very fast processors. So if you try to compile LARGE amounts of code on a low-spec machine, "we don't want to read ALL the source first and then process it" makes more sense there than it does on a quad-core desktop with 16 GB of RAM. On a modern desktop or laptop you could easily load all the source code of a fairly large project into memory (for example, all the files in LLVM + Clang come to about 450 MB, so they fit comfortably).

Edit: It should be noted that "interpreted languages" such as PHP, JavaScript, and Basic generally do not have this requirement, but other compiled languages usually do. Pascal, for example, has a special forward keyword to tell the compiler "this function exists, but I will tell you what it contains later".

Both Pascal and C (and C++, which follows C in this respect) do allow pointers to structures that are not yet complete. Just this simple "you don't have all the information yet" means the compiler has to create a placeholder for the type information and then "come back and fix it up" [obviously, only where necessary]. But it allows us to write:

    struct X {
        struct X *next;
        ...
    };

or in C ++:

    struct X {
        X *next;
        ...
    };

Edit 2: This blog post by Jan Hubicka, a GCC developer, explains some of the problems with "seeing all the code at once". Granted, most of us do not compile projects the size of Firefox, but large projects really do run into "out of memory" problems even on modern machines when the compiler has to deal with ALL of the code at once, unless the developers put the compiler on a diet now and then.

http://hubicka.blogspot.co.uk/2014/04/linktime-optimization-in-gcc-1-brief.html

+4

The reason is that requiring all the necessary information about an entity to be present before it is used allows the compiler to translate the source file in a single pass. Cf. http://blogs.msdn.com/b/ericlippert/archive/2010/02/04/how-many-passes.aspx for a comparison with C# (which does not require prior declarations, but then everything is inside a class anyway).

C shares this trait with Pascal and a few other languages. Writing such a compiler is easier and, as others have pointed out, it tends to use less memory (but, paradoxically, potentially increases total compilation time, because the declarations in headers are parsed and compiled over and over for every translation unit).

+3

Because any logical deduction requires some "logical order of things".

Here you deliberately ignore the order of things in order to express your idea. Fine, but doing so ignores a great deal.

In C and C++, there is an ordering that tries to solve some common problems by splitting declaration from definition: you may refer to things that are at least declared, even if they are not yet defined.

But this separation of "declared" and "defined" is just a relaxation of the logical order, which is, after all, merely a simplification of things.

Imagine using a plain sequence of statements in natural language to describe your program (unlike a programming language, which tries to express the same thing while also "compiling" it into a practical computer program): A is B, but B depends on C, but B is like C only if A is B, so that D may be A, so that C may be A, so that D is C.

WTF ???

If you can derive a real-world solution to a real-world problem in a form independent of any logical order, then you will have your answer, and you could get very rich just by knowing it.

0
