How to distinguish signed and unsigned integers in LLVM

LLVM does not distinguish between signed and unsigned integer types, as described here. There are situations, though, where you need to know whether a particular variable should be interpreted as signed or unsigned, for example when it is extended to a wider size or when it is used in a division. My solution is to keep separate type information for each variable, recording whether it is an integer (signed) or a cardinal (unsigned) type.
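To make the two cases concrete, here is a minimal sketch in plain C++ (no LLVM dependency) of how the same bit pattern gives different results depending on the interpretation. The function names are hypothetical; the casts stand in for what LLVM's `sext`/`zext` and `sdiv`/`udiv` instructions do, and the code assumes a two's complement target.

```cpp
#include <cassert>
#include <cstdint>

// Widen an 8-bit pattern to 32 bits as `sext` would:
// the top bit is copied into the new high bits.
std::int32_t widen_signed(std::uint8_t bits) {
    return static_cast<std::int32_t>(static_cast<std::int8_t>(bits));
}

// Widen an 8-bit pattern to 32 bits as `zext` would:
// the new high bits are filled with zeros.
std::uint32_t widen_unsigned(std::uint8_t bits) {
    return static_cast<std::uint32_t>(bits);
}

// Divide two 32-bit patterns as `sdiv` would (operands read as signed).
// Like `sdiv`, this is undefined for b == 0 and for INT32_MIN / -1.
std::int32_t divide_signed(std::uint32_t a, std::uint32_t b) {
    assert(b != 0);
    return static_cast<std::int32_t>(a) / static_cast<std::int32_t>(b);
}

// Divide two 32-bit patterns as `udiv` would (operands read as unsigned).
std::uint32_t divide_unsigned(std::uint32_t a, std::uint32_t b) {
    assert(b != 0);
    return a / b;
}
```

For instance, `widen_signed(0xFF)` and `widen_unsigned(0xFF)` start from identical bits but produce different 32-bit values, which is why the front end has to know which instruction to pick.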

However, I wonder whether there is a way to "attribute" an LLVM type this way. I looked for some "user data" that could be attached to a type, but there seems to be nothing. It would have to happen when the type is created, since equal types are created only once in LLVM.
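Since equal types are shared, the annotation has to live next to the variable rather than on the type. A rough sketch of such a side table, in plain C++ with hypothetical names (`Signedness`, `SignTable`, `widen_opcode`); none of it is LLVM API, and keying on the variable name is an assumption made for illustration (a front end would more likely key on its own symbol objects):

```cpp
#include <string>
#include <unordered_map>

// Hypothetical front-end bookkeeping; not part of LLVM.
enum class Signedness { Unknown, Signed, Unsigned };

class SignTable {
public:
    // Record the signedness of a source-level variable.
    void declare(const std::string& name, Signedness s) { table_[name] = s; }

    // Look a variable up; Unknown if it was never declared.
    Signedness lookup(const std::string& name) const {
        auto it = table_.find(name);
        return it == table_.end() ? Signedness::Unknown : it->second;
    }

    // Name of the IR instruction to emit when widening `name`.
    // Returns an empty string when the signedness was never recorded.
    std::string widen_opcode(const std::string& name) const {
        switch (lookup(name)) {
            case Signedness::Signed:   return "sext";
            case Signedness::Unsigned: return "zext";
            default:                   return "";
        }
    }

private:
    std::unordered_map<std::string, Signedness> table_;
};
```

Two variables declared with the same width would share one LLVM integer type, yet the table can still answer `sext` for one and `zext` for the other.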

So my question is:

Is there a way to keep track of whether an integer variable should be interpreted as signed or unsigned within the LLVM infrastructure, or is keeping separate information, as I do, the only way?

Thanks.

1 answer

First of all, make sure that you really need to attach extra type metadata, since Clang already handles signed integer operations appropriately, for example by emitting sdiv and srem rather than udiv and urem.

In addition, you can use this to implement some lightweight type inference based on how variables are used in the IR. Note that an operation such as add does not need signedness information, since it is based on the two's complement representation.
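As a rough illustration of that kind of inference, the sketch below classifies a value by the opcodes of the instructions that use it. It is plain C++ working on opcode names as strings, not on real LLVM instruction objects; `SignHint`, `hint_from_opcode` and `infer_from_uses` are hypothetical names, and the opcode lists are one possible choice rather than a complete rule set.

```cpp
#include <string>
#include <vector>

enum class SignHint { Unknown, Signed, Unsigned, Conflict };

// Classify a single LLVM opcode name. Opcodes such as add, sub and mul
// behave identically for both interpretations, so they give no hint.
SignHint hint_from_opcode(const std::string& op) {
    static const char* const kSigned[]   = {"sdiv", "srem", "ashr", "sext", "sitofp"};
    static const char* const kUnsigned[] = {"udiv", "urem", "lshr", "zext", "uitofp"};
    for (const char* s : kSigned)
        if (op == s) return SignHint::Signed;
    for (const char* u : kUnsigned)
        if (op == u) return SignHint::Unsigned;
    return SignHint::Unknown;
}

// Combine the hints from every instruction that uses one value.
SignHint infer_from_uses(const std::vector<std::string>& opcodes) {
    bool seen_signed = false;
    bool seen_unsigned = false;
    for (const std::string& op : opcodes) {
        SignHint h = hint_from_opcode(op);
        if (h == SignHint::Signed) seen_signed = true;
        if (h == SignHint::Unsigned) seen_unsigned = true;
    }
    if (seen_signed && seen_unsigned) return SignHint::Conflict;
    if (seen_signed) return SignHint::Signed;
    if (seen_unsigned) return SignHint::Unsigned;
    return SignHint::Unknown;
}
```

A value used only by sign-agnostic instructions stays `Unknown`, and a value with mixed uses is reported as `Conflict` instead of being guessed; both cases show the limits of inferring signedness from the IR alone.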

Otherwise, I think the best way to do this is to modify the front end (Clang) to emit some custom DWARF debug information. Here is a link that might get you started.

UPDATE: If your goal is to implement static analysis directly on LLVM IR, this paper offers a thorough discussion:

Navas, J.A., Schachte, P., Søndergaard, H., Stuckey, P.J.: Signedness-Agnostic Program Analysis: Precise Integer Bounds for Low-Level Code. In: Jhala, R., Igarashi, A. (eds.) APLAS 2012. LNCS, vol. 7705, pp. 115-130. Springer, Heidelberg (2012)
