Why can't non-verbatim string literals contain newline characters?

I am starting to learn C# and I don't understand why regular string literals (i.e. " ") cannot contain literal newline characters. (I'm not talking about the escape sequence \n.) I know that for multi-line strings you are supposed to use verbatim string literals (i.e. @" "), but why?

A regular string produces a "Newline in constant" error; a verbatim string produces no error.
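For readers who cannot see the original screenshots, the two cases look roughly like this. It is only a minimal sketch: the class and member names are invented for illustration, and the failing form is kept in a comment so that the rest compiles.

```csharp
using System;

public static class NewlineInLiteralDemo
{
    // A regular literal split across two source lines does not compile;
    // the compiler reports "Newline in constant" (error CS1010):
    //
    //     string broken = "first line
    //     second line";

    // A verbatim literal may span source lines. The line break stored in
    // the string is whatever line ending the source file itself uses.
    public static readonly string Verbatim = @"first line
second line";

    // A regular literal has to use an escape sequence instead.
    public static readonly string Escaped = "first line\nsecond line";

    public static void Main()
    {
        Console.WriteLine(Verbatim);
        Console.WriteLine(Escaped);
    }
}
```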

I haven't seen it stated explicitly anywhere that you can't use them in regular strings. In fact, apart from passing mentions that I can use verbatim strings for this, everything I've read seems to imply that literal newline characters would be allowed in regular string literals.

Beginning Visual C# 2010 and How to: Generate Multiline String Literals (Visual C#) both show examples of verbatim multi-line strings without any further explanation.

Learning C# 3.0 says the following:

In C#, spaces, tabs, and newlines are considered to be whitespace ... Extra whitespace is generally ignored in C# statements .... The exception to this rule is that whitespace within a string is treated as literal; it is not ignored.

So it is treated as literal? That is what I expected too, but it isn't. The book even includes this tip:

Tip
Visual Basic programmers take note: in C#, the end of a line has no special significance. Statements are ended with semicolons, not newline characters. There is no line continuation character because none is needed.

(I understand that this is about whitespace outside of strings, but why does the end of a line get special treatment inside a string when it doesn't outside of one?)

I finally found my way to string (C# Reference), but I still don't understand:

String literals can contain any character literal. Escape sequences are included. The following example uses the escape sequence \\ for backslash, \u0066 for the letter f, and \n for newline.

It says that escape sequences can be used, but it doesn't say that they must be used. Are literal newline characters not included in "any character literal"? If I have a string that contains a literal tab character instead of its escape sequence \t, there is no error. But if I have a literal newline, I get an error. I even changed the file's line endings from \r\n to \n or \r, with no effect.


Obviously, I can infer from the examples and from Visual Studio's errors that a verbatim string is required if it contains a literal newline character, but everything I've read suggests that it shouldn't be. Why the difference?

+6
3 answers

Okay, shoot. As soon as I posted this, I found the answer.

Are literal newline characters not included in "any character literal"?

Apparently not.

2.4.4.4 Character literals:

character-literal:
    ' character '

character:
    single-character

single-character:
    Any character except ' (U+0027), \ (U+005C), and new-line-character
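Put differently, a newline can only get into a character literal (and, by the analogous rule in the specification, into a regular string literal) through an escape sequence. A small sketch of the accepted spellings follows; the class and constant names are made up for illustration.

```csharp
using System;

public static class EscapeDemo
{
    public const char SimpleEscape = '\n';       // simple escape sequence
    public const char UnicodeEscape = '\u000A';  // Unicode escape for the same character
    public const char HexEscape = '\x0A';        // hexadecimal escape for the same character

    public static void Main()
    {
        // All three spellings denote the same line feed character.
        Console.WriteLine(SimpleEscape == UnicodeEscape && UnicodeEscape == HexEscape);

        // The same escapes work inside regular string literals.
        Console.WriteLine("a\nb" == "a\u000Ab");
    }
}
```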

+5

Probable duplicate of "Why are C/C++ string literal declarations single-line?"

In a nutshell, because the C language does not support it.

A typo that leaves a string literal unclosed would swallow the rest of the file as a single token, leaving the programmer with a compiler error message along the lines of "expected a semicolon at line xxx, column yyy", where the reported location is the end of the source file.

For the most part you don't use multi-line literals, so from a usability standpoint it is better to require that they be made explicit.

Also, given the constrained environment C was developed in (an 8K PDP-11?), I suspect that such a runaway literal could have crashed the compiler.

The C language does support concatenation of adjacent literals, though, which is useful:

 char *txt = "this is line 1\n"
             "this is line 2\n"
             "this is line 3\n" ;

It also supports line splicing with a trailing backslash:

 char *txt = "this is my\n\
multi-line string literal\n\
isn't it nice?\n" ;

These are features I wish C# had.
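C# has no adjacent-literal concatenation, but the + operator applied to string constants is itself a constant expression, so a rough counterpart of the first C snippet can be written as below. This is only a sketch with invented names, not a claim that it covers everything the C features do.

```csharp
using System;

public static class ConcatDemo
{
    // Each piece is a regular literal on its own source line. Because all
    // operands of + are constants, the result may be declared const.
    public const string Text =
        "this is line 1\n" +
        "this is line 2\n" +
        "this is line 3\n";

    public static void Main()
    {
        Console.Write(Text);
    }
}
```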

+1

C# (along with C++, C, and Java, which influenced its syntax) has a very simple rule for whitespace:

You can do what you like with it.

This lets you format code however you want for readability. A Python fan might say that this advantage is overrated, but it is the advantage we are working with.

Newlines in strings would ruin this. It is not clear whether a newline in the source should insert "\u000D", "\u000A", "\u000D\u000A", "\u0085", "\u000B", "\u000C", "\u2028" or "\u2029" into the string. All of these have newline semantics, and each of the first four has a different system for which it is "the only sensible way to do a newline, and all the others are wrong."
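One practical consequence: the line break inside a verbatim literal is whatever the source file happens to contain, so code that cares usually normalizes it. Here is a minimal sketch, assuming that only "\r\n" and a lone "\r" need to be mapped to "\n"; the helper name NormalizeNewlines is hypothetical.

```csharp
using System;

public static class LineEndingDemo
{
    // Maps CR LF and lone CR to LF. Other separators, such as U+0085 or
    // U+2028, are deliberately left untouched in this sketch.
    public static string NormalizeNewlines(string text)
    {
        return text.Replace("\r\n", "\n").Replace("\r", "\n");
    }

    public static void Main()
    {
        // The break stored here depends on how this file was saved.
        string verbatim = @"one
two";

        // After normalizing, the text can be compared with the escaped form.
        Console.WriteLine(NormalizeNewlines(verbatim) == "one\ntwo");
    }
}
```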

You could still argue that the downside of allowing it is overrated. C# does, after all, allow it in one form of string (verbatim), which is not the form that people coming from C++ and the like would expect.

+1

Source: https://habr.com/ru/post/924215/

