I used the same phone and, coincidentally, the same 6 second averaging interval in an application a few years ago, and I do not remember seeing this behavior on the graph.
I wonder if the problem lies in how the 6 second averages are accumulated. One problem we had was that the sampling interval was not constant but depended on how busy the processor was. The sample is taken at the specified time, but the call to the event handler depends on the scheduler. When the processor is lightly loaded, sampling occurs at a constant frequency, but as the load increases the sampling frequency becomes slower and less stable.
You can write your application to keep CPU utilization low while sampling. We sampled for 6 seconds while doing nothing else, then stopped sampling and processed the set of samples just collected. This was only partially successful, because you cannot control other applications running at the same time, and the scheduler hands processor time to them. On the Xperia Active I found that there could sometimes be a gap of a few seconds between samples, which I attributed to garbage collection in one of the JVMs.
Our solution was to timestamp each sample, then run some quality checks on each set of samples and discard the sets that failed. This is a poor solution, because deciding what is good enough is imprecise, and when the user starts another resource-hungry application most sample sets may be discarded, so additional logic is needed to handle that case.
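As a rough sketch of the kind of quality check I mean (written in Python for brevity; the function name, the thresholds and the millisecond timestamps are all hypothetical, not what our application actually used), a sample set could be rejected when it contains a long gap or when its typical interval is far from the requested one:

```python
from statistics import median


def sample_set_ok(timestamps_ms, expected_dt_ms,
                  max_gap_factor=3.0, max_rate_error=0.25):
    """Return True if a set of sample timestamps looks evenly spaced.

    The two thresholds are illustrative assumptions, not tuned values.
    """
    if len(timestamps_ms) < 2:
        return False
    # Interval between each pair of consecutive samples.
    intervals = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    # Timestamps must be strictly increasing.
    if min(intervals) <= 0:
        return False
    # Reject the set if any single gap is much longer than requested
    # (for example a multi-second stall).
    if max(intervals) > max_gap_factor * expected_dt_ms:
        return False
    # Reject the set if the typical interval drifted from the requested one.
    return abs(median(intervals) - expected_dt_ms) <= max_rate_error * expected_dt_ms


# 6 seconds at a requested 20 ms interval, delivered evenly.
steady = [i * 20 for i in range(300)]
# The same set with a 2 second stall after the 100th sample.
stalled = steady[:100] + [t + 2000 for t in steady[100:]]
# Samples delivered at half the requested rate.
slowed = [i * 40 for i in range(150)]

for name, ts in (("steady", steady), ("stalled", stalled), ("slowed", slowed)):
    print(name, sample_set_ok(ts, expected_dt_ms=20))
```

Picking the two thresholds is exactly the imprecise part: too strict and nearly every set is thrown away on a busy phone, too loose and distorted sets get through.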
The current Android API, which is not available on the Xperia Active, should have eliminated this problem, as samples can be batched as described at https://source.android.com/devices/sensors/hal-interface.html#batch_sensor_flags_sampling_period_maximum_report_latency .
If the algorithm took a fixed number of samples rather than timing them, and the processor worked harder when the bike was going faster (although I'm not sure why it would), that would produce something like the first graph, because the value falls while the bike is descending and rises while it is climbing the hill. That is a lot of assumptions, but a 6 second average of less than 3 m/s^2 seems implausible from my experience with this sensor.
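To show what I mean by taking a fixed number of samples rather than timing them, here is a small Python sketch. The helper names, the 20 ms nominal interval and the millisecond timestamps are assumptions for illustration only; I do not know how the application in question actually groups its samples. When delivery slows to half the requested rate, a window of 300 samples no longer covers 6 seconds, whereas a window defined by timestamps still does:

```python
def windows_by_count(samples, n):
    """Split (timestamp_ms, value) pairs into consecutive groups of n samples."""
    return [samples[i:i + n] for i in range(0, len(samples) - n + 1, n)]


def windows_by_time(samples, window_ms):
    """Split (timestamp_ms, value) pairs into consecutive windows of
    window_ms milliseconds, measured from the first timestamp."""
    if not samples:
        return []
    start = samples[0][0]
    groups = {}
    for t, v in samples:
        groups.setdefault((t - start) // window_ms, []).append((t, v))
    return [groups[k] for k in sorted(groups)]


def span_ms(window):
    """Time covered by a window, from its first to its last sample."""
    return window[-1][0] - window[0][0]


# Requested interval is 20 ms (300 samples per 6 s), but the samples
# actually arrive every 40 ms because the processor is busy.
slow = [(i * 40, 9.8) for i in range(600)]

by_count = windows_by_count(slow, 300)
by_time = windows_by_time(slow, 6000)

print("by count:", len(by_count), "windows, first spans", span_ms(by_count[0]), "ms")
print("by time: ", len(by_time), "windows, first spans", span_ms(by_time[0]), "ms")
```

With count-based windows each "6 second" average here really covers about 12 seconds, so anything that assumes a fixed duration per window would be off by the same factor.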