How can I prevent certain APIs from being used in submitted source code?

To build an online judge for ACM competitions, we must prevent certain APIs from being called in the source code that users submit. For example, calling shutdown() or socket() is not allowed. If the source code calls such an API, we must either refuse to compile it, report an error during compilation, or raise an error at run time.

I do not know how to do this on Linux or Windows. Can you give me some advice?

+4
7 answers

First: I recommend not reinventing the wheel. Judge systems already exist, so you should probably look at them first (for example, we used DOMjudge as the ACM judge here).

Second: you could, as already suggested, use LD_PRELOAD to link against a restricted library. Another option, which also protects against some other forbidden things, is a sandbox. Set up a chroot environment in which you install only these restricted libraries, so that the forbidden functions are simply not available.
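
A minimal sketch of what such an LD_PRELOAD shim could look like, assuming a glibc-based Linux system. The file name deny.c and the library name libdeny.so are made up for illustration. It only catches dynamically linked calls that go through the socket() wrapper, so it is a convenience, not a security boundary.

```c
/* deny.c (hypothetical file name)
 * Build: gcc -shared -fPIC -o libdeny.so deny.c
 * Run:   LD_PRELOAD=./libdeny.so ./submission
 */
#include <errno.h>
#include <sys/socket.h>

/* Same name and signature as the libc wrapper. A preloaded library is
 * searched before libc, so dynamically linked calls land here. */
int socket(int domain, int type, int protocol)
{
    (void)domain;
    (void)type;
    (void)protocol;
    errno = EPERM;   /* report "operation not permitted" to the caller */
    return -1;
}
```

A submission that calls socket() then sees -1 with errno set to EPERM instead of getting a file descriptor.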

+2

You need to use an isolated sandbox, such as "User-mode Linux" or "capabilities".

The reason is that system calls do not require linking against a library, so LD_PRELOAD is ineffective against code that contains syscall instructions. And trying to stop someone from putting machine code in an array and then jumping to it is incredibly difficult; there are so many ways to do it in C (function pointers, stack-smashing attacks, etc.). A non-writable code segment and a non-executable data segment will help, but the only safe way is to use an unprivileged user account so that the kernel rejects the call with EPERM.
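
To illustrate why interposing on library functions is not enough, here is a harmless sketch that reaches the kernel without calling the usual wrapper. It uses getpid purely as a stand-in for a forbidden call, and the function name raw_getpid is made up. Note that syscall() is itself a libc function; inline assembly would need no library symbol at all.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* Asks the kernel for the process ID without calling getpid().
 * A replacement getpid() in an LD_PRELOAD library never sees this. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```

The same pattern with SYS_socket would slip past a shim that only replaces socket(), which is why the check has to happen in the kernel.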

+2
 #define verboten_api(a1, a2, a3) you may not use this verboten API 

Make sure contestants are forced to use the header that contains the verboten APIs.

GCC provides a deprecated attribute. From the GCC 4.6.1 manual:

deprecated
deprecated (msg)
The deprecated attribute results in a warning if the function is used anywhere in the source file. This is useful when identifying functions that are expected to be removed in a future version of a program. The warning also includes the location of the declaration of the deprecated function, to enable users to easily find further information about why the function is deprecated, or what they should do instead. Note that the warnings only occur for uses:

    int old_fn () __attribute__ ((deprecated));
    int old_fn ();
    int (*fn_ptr)() = old_fn;

results in a warning on line 3 but not line 2. The optional msg argument, which must be a string, will be printed in the warning if present. The deprecated attribute can also be used for variables and types (see Section 6.36 [Variable Attributes], page 341, see Section 6.37 [Type Attributes], page 350.)

Note that GCC also provides options that make it refuse to compile code that uses deprecated functions (for example, -Werror=deprecated-declarations).

These are compile-time checks, not run-time checks. They are probably also intrusive unless you are willing to modify the system headers. In addition, if contestants do not include the system header, they can get away with calling the function anyway.

Consider creating a static library that is linked with their code and defines the prohibited functions, with each function implemented as an assertion that always fails:

    #include <assert.h>

    int verboten_api(int x, int y, char *z)
    {
        /* A string literal is never a null pointer, so this always fails. */
        assert("function verboten_api() called" == 0);
        return -1;
    }

Link the test programs against this library.

0

Answer on Linux:

 nm -D _the_compiled_binary_ | grep ' U ' 

This lists all the dynamic symbols that the binary references but does not define, that is, the library functions it calls.
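
A rough sketch of how that output could be checked against a deny list. It assumes the usual GNU nm line format (an undefined symbol shown as " U name", optionally followed by an "@VERSION" suffix). The function name line_uses_banned and the list contents are made up for illustration.

```c
#include <string.h>

/* Hypothetical deny list; extend it to match the contest rules. */
static const char *const banned[] = { "socket", "shutdown", "connect" };

/* Returns 1 if `line` (one line of `nm -D` output) is an undefined
 * symbol whose name is on the deny list, 0 otherwise. */
int line_uses_banned(const char *line)
{
    const char *p = strstr(line, " U ");
    if (p == NULL)
        return 0;                          /* defined symbol or other line */
    p += 3;                                /* skip the " U " marker */
    size_t len = strcspn(p, "@ \t\r\n");   /* stop at version suffix or EOL */
    for (size_t i = 0; i < sizeof banned / sizeof banned[0]; i++) {
        if (strlen(banned[i]) == len && strncmp(p, banned[i], len) == 0)
            return 1;
    }
    return 0;
}
```

Feeding each line of the nm output through this function flags binaries that import a banned symbol. It says nothing about statically linked code or direct system calls.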

0

Keep in mind that you do not need the socket library to access the network; you can make do with open(), read() and write(). You therefore probably need some kind of sandbox, not just restrictions on what is allowed in the code.

0

Filtering the source code is not enough. Even if the source code never names the API, it can use tricks to call it. For example, a simple regular-expression filter can be defeated by token pasting. And that is only at the source level; once you start thinking about machine code, there are many more possibilities, from simple inline assembly to return-oriented programming, and it can be done in ways that are hard to spot when reviewing the source code, as the Underhanded C Contest shows.
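
As an illustration of how a textual filter can be sidestepped, the sketch below uses the preprocessor's ## operator (token pasting) to build a function name that never appears in the source as a single word. It uses a harmless string-length call as a stand-in for a forbidden one; GLUE and hidden_call are made-up names.

```c
#include <string.h>

/* Pastes two tokens into one identifier during preprocessing. */
#define GLUE(a, b) a##b

size_t hidden_call(const char *s)
{
    /* The callee's name only exists once the preprocessor has run,
     * so a filter scanning the raw text for it finds nothing. */
    return GLUE(str, len)(s);
}
```

Running the filter on the preprocessed output would catch this particular trick, but not the machine-code level ones.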

All APIs ultimately boil down to kernel API calls, since otherwise the programmer could simply copy the API's implementation. There are AFAIK only two safe ways to prevent kernel API calls: either filter them in the kernel, or statically prove that the code cannot call the kernel directly. Other methods, such as LD_PRELOAD, can be circumvented. Bypassing LD_PRELOAD is simple: just make the system call directly.

To filter the API in the kernel, the most modern way is to use seccomp filters, which allow you to restrict system calls and their arguments. With them, you can easily forbid a process from ever calling, for example, the shutdown and socket system calls. Other mechanisms (namespaces, cgroups, chroot, etc.) can be used to add other kinds of constraints on top of the filter.

The alternative approach, statically verifying that the code is safe, is used by Google's Native Client. It restricts the generated assembly code in ways that allow a simple proof that the flow of execution cannot escape the sandbox except through a few well-defined paths. As examples of such rules: no instruction may cross a 32-byte boundary, all jump targets are aligned to a 32-byte boundary, and indirect jumps are allowed only through a pair of instructions that masks the low bits of the target address before the jump, so there is no way to jump into the middle of an instruction.
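
The masking rule can be pictured with a little arithmetic. This is only an illustration of the idea in C, not Native Client's actual code (which does this at the machine-instruction level); the names BUNDLE_SIZE and mask_jump_target are made up.

```c
#include <stdint.h>

#define BUNDLE_SIZE 32u   /* the 32-byte boundary mentioned in the text */

/* Clears the low five bits, so the result is always a multiple of 32
 * and therefore always the start of an aligned bundle. */
uintptr_t mask_jump_target(uintptr_t target)
{
    return target & ~(uintptr_t)(BUNDLE_SIZE - 1);
}
```

Whatever address an attacker computes, the jump lands on a bundle start that the verifier has already inspected.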

0

First you must determine which APIs you do not want to allow. I think it is probably easiest to do static code analysis and raise an error if an unwanted #include occurs.
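
A rough sketch of such a check for a single source line. The function name includes_banned_header and the header list are made up, and the check is purely textual: it does not follow nested includes, and it does not notice a contestant who declares the function by hand instead of including the header.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical list of headers the judge refuses. */
static const char *const banned_headers[] = { "sys/socket.h", "netinet/in.h" };

static const char *skip_space(const char *p)
{
    while (isspace((unsigned char)*p))
        p++;
    return p;
}

/* Returns 1 if `line` is an #include of a banned header, 0 otherwise. */
int includes_banned_header(const char *line)
{
    const char *p = skip_space(line);
    if (*p != '#')
        return 0;
    p = skip_space(p + 1);                 /* "#  include" is legal C */
    if (strncmp(p, "include", 7) != 0)
        return 0;
    p = skip_space(p + 7);
    if (*p != '<' && *p != '"')
        return 0;
    char closer = (*p == '<') ? '>' : '"';
    const char *end = strchr(++p, closer);
    if (end == NULL)
        return 0;
    size_t len = (size_t)(end - p);
    for (size_t i = 0; i < sizeof banned_headers / sizeof banned_headers[0]; i++) {
        if (strlen(banned_headers[i]) == len &&
            strncmp(p, banned_headers[i], len) == 0)
            return 1;
    }
    return 0;
}
```

A line such as `int socket(int, int, int);` passes this check, which is the weakness of header-based filtering.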

-1
