GCC 4.7.2 Optimization Issues

Summary

I am porting ST's USB OTG library to a custom STM32F4 board using the latest version of the Sourcery CodeBench Lite toolchain (GCC arm-none-eabi 4.7.2).

When I compile the code with -O0, the program works fine. When I compile with -O1 or -O2, it fails. By "fails" I mean it just stops. There is no hard fault, nothing. (Obviously it is doing something, but I don't have an emulator to debug it and find out, sorry. My hard fault handler is not called.)

More details

I am trying to make a call to the following function ...

void USBD_Init(USB_OTG_CORE_HANDLE *pdev, USB_OTG_CORE_ID_TypeDef coreID, USBD_DEVICE *pDevice, USBD_Class_cb_TypeDef *class_cb, USBD_Usr_cb_TypeDef *usr_cb); 

... but it does not seem to make it into the body of the function. (Is this a symptom of "smashing the stack"?)

The structures passed to this function have the following definitions:

    typedef struct USB_OTG_handle
    {
      USB_OTG_CORE_CFGS cfg;
      USB_OTG_CORE_REGS regs;
      DCD_DEV dev;
    } USB_OTG_CORE_HANDLE, *PUSB_OTG_CORE_HANDLE;

    typedef enum
    {
      USB_OTG_HS_CORE_ID = 0,
      USB_OTG_FS_CORE_ID = 1
    } USB_OTG_CORE_ID_TypeDef;

    typedef struct _Device_TypeDef
    {
      uint8_t *(*GetDeviceDescriptor)(uint8_t speed, uint16_t *length);
      uint8_t *(*GetLangIDStrDescriptor)(uint8_t speed, uint16_t *length);
      uint8_t *(*GetManufacturerStrDescriptor)(uint8_t speed, uint16_t *length);
      uint8_t *(*GetProductStrDescriptor)(uint8_t speed, uint16_t *length);
      uint8_t *(*GetSerialStrDescriptor)(uint8_t speed, uint16_t *length);
      uint8_t *(*GetConfigurationStrDescriptor)(uint8_t speed, uint16_t *length);
      uint8_t *(*GetInterfaceStrDescriptor)(uint8_t speed, uint16_t *length);
    } USBD_DEVICE, *pUSBD_DEVICE;

    typedef struct _Device_cb
    {
      uint8_t (*Init)(void *pdev, uint8_t cfgidx);
      uint8_t (*DeInit)(void *pdev, uint8_t cfgidx);
      /* Control Endpoints */
      uint8_t (*Setup)(void *pdev, USB_SETUP_REQ *req);
      uint8_t (*EP0_TxSent)(void *pdev);
      uint8_t (*EP0_RxReady)(void *pdev);
      /* Class Specific Endpoints */
      uint8_t (*DataIn)(void *pdev, uint8_t epnum);
      uint8_t (*DataOut)(void *pdev, uint8_t epnum);
      uint8_t (*SOF)(void *pdev);
      uint8_t (*IsoINIncomplete)(void *pdev);
      uint8_t (*IsoOUTIncomplete)(void *pdev);
      uint8_t *(*GetConfigDescriptor)(uint8_t speed, uint16_t *length);
      uint8_t *(*GetUsrStrDescriptor)(uint8_t speed, uint8_t index, uint16_t *length);
    } USBD_Class_cb_TypeDef;

    typedef struct _USBD_USR_PROP
    {
      void (*Init)(void);
      void (*DeviceReset)(uint8_t speed);
      void (*DeviceConfigured)(void);
      void (*DeviceSuspended)(void);
      void (*DeviceResumed)(void);
      void (*DeviceConnected)(void);
      void (*DeviceDisconnected)(void);
    } USBD_Usr_cb_TypeDef;

I have tried to include all the source code relevant to this problem. If you want to see the full source, you can download it here: http://www.st.com/st-web-ui/static/active/en/st_prod_software_internet/resource/technical/software/firmware/stm32_f105-07_f2_f4_usb-host-device_lib.zip

Attempted Resolutions

I tried playing with #pragma GCC optimize ("O0") , __attribute__((optimize("O0"))) and declaring certain definitions as volatile , but nothing worked. I would rather just change the code so that it plays well with the optimizer.
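For reference, the two per-function forms mentioned above look roughly like the sketch below. Both are GCC-specific, and the function names and bodies are hypothetical stand-ins rather than code from the ST library. Disabling optimization one function at a time like this is mainly useful for narrowing down where the problem lives, not as a permanent fix.

```c
#include <assert.h>

/* Form 1: the optimize attribute asks GCC to compile this one function
   as if -O0 had been given, whatever the command-line level is.
   checksum_attr is a hypothetical example function. */
__attribute__((optimize("O0")))
static int checksum_attr(const unsigned char *data, int len)
{
    assert(len >= 0);
    int sum = 0;
    for (int i = 0; i < len; i++) {
        sum += data[i];
    }
    return sum;
}

/* Form 2: the pragma applies to every function defined after it, so it is
   wrapped in push_options/pop_options to limit its scope. */
#pragma GCC push_options
#pragma GCC optimize ("O0")
static int checksum_pragma(const unsigned char *data, int len)
{
    assert(len >= 0);
    int sum = 0;
    for (int i = 0; i < len; i++) {
        sum += data[i];
    }
    return sum;
}
#pragma GCC pop_options
```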

Question

How can I change this code to work well with the GCC optimizer?

1 answer

There is nothing wrong with the code you showed, so this answer will be more general.

What are common errors with "close to hardware" code that works correctly unoptimized and fails with higher levels of optimization?

Think about the differences between -O0 and -O1/-O2: the optimization strategies include, among other things, loop unrolling (which does not seem dangerous here), keeping values in registers for as long as possible, dead code elimination, and instruction reordering.

Improved register usage typically leads to problems at higher optimization levels if hardware registers, which can change at any time, are not properly declared volatile (see PokyBrain's comment on the question). The optimized code will try to keep values in CPU registers for as long as possible, so your program will not notice changes on the hardware side. Be sure to declare hardware registers volatile.
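A minimal sketch of such a polling loop follows. The register layout, the STATUS_READY bit, and the function name are hypothetical; a real STM32F4 build would take addresses and bit definitions from the vendor headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "ready" bit in a hypothetical status register. */
#define STATUS_READY (1u << 0)

/* Poll until the READY bit is set or the retry budget runs out.
   Because the pointer is volatile-qualified, the compiler has to perform a
   fresh load on every iteration instead of reusing a value cached in a CPU
   register. Returns 1 on success, 0 on timeout. */
static int wait_until_ready(volatile uint32_t *status_reg, uint32_t max_polls)
{
    assert(status_reg != NULL);
    while (max_polls-- > 0u) {
        if ((*status_reg & STATUS_READY) != 0u) {
            return 1;
        }
    }
    return 0;
}
```

Without the volatile qualifier, an optimizing compiler may load the register once before the loop and never look at it again, so the timeout would always be hit even though the hardware has set the bit.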

Dead code elimination is likely to cause problems if you need to read a hardware register purely for its effect on the hardware, an effect the compiler knows nothing about, and you do nothing with the value you just read. These hardware accesses may be optimized away if the read is not declared properly (the compiler should warn about the unused value, though). Make sure to cast dummy reads to (void).
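A dummy read might be written as in the sketch below. The "read to clear" register and the function name are hypothetical and only illustrate the pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "read to clear" register: on some peripherals, reading a
   status register is what acknowledges a pending flag. */
static void clear_flag_by_reading(volatile uint32_t *status_reg)
{
    assert(status_reg != NULL);
    /* The value is deliberately discarded. In C, the volatile qualifier
       forces the load to be emitted; the (void) cast documents the intent
       and silences warnings about an unused value. */
    (void)*status_reg;
}
```

Note that it is the volatile qualifier that keeps the load alive; the cast alone on a non-volatile object would not prevent the read from being removed.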

Instruction reordering: if you need to access several hardware registers or memory locations in a particular sequence to get the desired result, and the accesses go through pointers that are unrelated as far as the compiler can tell, the compiler is free to reorder the resulting instructions as it sees fit. Declaring the hardware registers volatile only orders the volatile accesses relative to each other, not relative to ordinary memory accesses. To enforce the required access sequence, you will need to sprinkle in memory barriers ( __asm__ __volatile__("" ::: "memory"); ). Be sure to add the necessary memory barriers.
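As a rough illustration, the sketch below fills an ordinary RAM buffer and then writes a volatile "start" register, with a compiler barrier in between. The macro name, the function, and the register are hypothetical; only the inline asm statement itself is standard GCC syntax.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compiler-only barrier (GCC syntax): emits no instruction, but the
   compiler may not move memory accesses across it. */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

/* Hypothetical hand-off: prepare a buffer with plain stores, then write a
   volatile register that tells a peripheral to consume the buffer. */
static void start_transfer(uint8_t *buf, size_t len, uint8_t fill,
                           volatile uint32_t *start_reg)
{
    assert(buf != NULL && start_reg != NULL);
    for (size_t i = 0; i < len; i++) {
        buf[i] = fill;       /* plain (non-volatile) stores */
    }
    COMPILER_BARRIER();      /* keep the buffer stores before the trigger */
    *start_reg = 1u;         /* volatile store: kick off the hardware */
}
```

This barrier only constrains the compiler. If the CPU or bus can also reorder or buffer accesses, a hardware barrier is needed as well; on Cortex-M parts, CMSIS provides __DMB() and __DSB() for that purpose.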

Although unlikely, you may still have hit a compiler bug. Optimization is not an easy task, especially close to the hardware. It might be worth a look at the GCC bug database.

If none of this helps, sometimes you simply cannot avoid digging into the generated assembly code to make sure that it does what it is supposed to do.
