Using a thread in C++ to report on the progress of calculations

I am writing a general abstract class that can report the state of as many variables as we need. For example, consider the following useless loop:

    int a, b;
    for (int i = 0; i < 10000; ++i) {
        for (int j = 0; j < 1000; ++j) {
            for (int k = 0; k < 1000; ++k) {
                a = i;
                b = j;
            }
        }
    }

It would be nice to see the values of a and b without having to change the loop. In the past I would add if statements like the following:

    int a, b;
    for (int i = 0; i < 10000; ++i) {
        for (int j = 0; j < 1000; ++j) {
            for (int k = 0; k < 1000; ++k) {
                a = i;
                b = j;
                if (a % 100 == 0) {
                    printf("a = %d\n", a);
                }
            }
        }
    }

This would allow me to see the value of a every 100 iterations. However, depending on the calculations being performed, it is sometimes simply impossible to check progress this way. The idea is to be able to leave the computer, come back after some time, and check whichever values you want to see.

For this purpose we can use pthreads. The following code works, and the only reason I am posting it is that I am not sure whether I am using the thread correctly, mainly in how I stop it.

First, consider the file "reporter.h":

    #include <cstdio>
    #include <cstdlib>
    #include <pthread.h>

    void* run_reporter(void*);

    class reporter {
    public:
        pthread_t thread;
        bool stdstream;
        FILE* fp;
        struct timespec sleepTime;
        struct timespec remainingSleepTime;
        const char* filename;
        const int sleepT;
        double totalTime;

        reporter(int st, FILE* fp_):
            fp(fp_), filename(NULL), stdstream(true), sleepT(st) {
            begin_report();
        }
        reporter(int st, const char* fn):
            fp(NULL), filename(fn), stdstream(false), sleepT(st) {
            begin_report();
        }
        void begin_report() {
            totalTime = 0;
            if (!stdstream) fp = fopen(filename, "w");
            fprintf(fp, "reporting every %d seconds ...\n", sleepT);
            if (!stdstream) fclose(fp);
            pthread_create(&thread, NULL, run_reporter, this);
        }
        void sleep() {
            sleepTime.tv_sec = sleepT;
            sleepTime.tv_nsec = 0;
            nanosleep(&sleepTime, &remainingSleepTime);
            totalTime += sleepT;
        }
        virtual void report() = 0;
        void end_report() {
            pthread_cancel(thread);
            // Wrong addition of remaining time, needs to be fixed
            // but non-important at the moment.
            //totalTime += sleepT - remainingSleepTime.tv_sec;
            long sec = remainingSleepTime.tv_sec;
            if (!stdstream) fp = fopen(filename, "a");
            fprintf(fp, "reported for %g seconds.\n", totalTime);
            if (!stdstream) fclose(fp);
        }
    };

    void* run_reporter(void* rep_) {
        reporter* rep = (reporter*)rep_;
        while (1) {
            if (!rep->stdstream) rep->fp = fopen(rep->filename, "a");
            rep->report();
            if (!rep->stdstream) fclose(rep->fp);
            rep->sleep();
        }
    }

This file declares the abstract class reporter; note the pure virtual function report. This is the function that will print the messages. Each reporter has its own thread, which is created when the reporter's constructor is called. To use a reporter object in our useless loop, we can now do:

    #include "reporter.h"

    int main() {
        // Declaration of objects we want to track
        int a = 0;
        int b = 0;

        // Declaration of reporter
        class prog_reporter: public reporter {
        public:
            int& a;
            int& b;
            prog_reporter(int& a_, int& b_):
                a(a_), b(b_), reporter(3, stdout) {}
            void report() {
                fprintf(fp, "(a, b) = (%d, %d)\n", this->a, this->b);
            }
        };

        // Start tracking a and b every 3 seconds
        prog_reporter rep(a, b);

        // Do some useless computation
        for (int i = 0; i < 10000; ++i) {
            for (int j = 0; j < 1000; ++j) {
                for (int k = 0; k < 1000; ++k) {
                    a = i;
                    b = j;
                }
            }
        }

        // Stop reporting
        rep.end_report();
    }

After compiling this code (without the optimization flag) and running it, I get:

    macbook-pro:Desktop jmlopez$ g++ testing.cpp
    macbook-pro:Desktop jmlopez$ ./a.out
    reporting every 3 seconds ...
    (a, b) = (0, 60)
    (a, b) = (1497, 713)
    (a, b) = (2996, 309)
    (a, b) = (4497, 478)
    (a, b) = (5996, 703)
    (a, b) = (7420, 978)
    (a, b) = (8915, 78)
    reported for 18 seconds.

This does exactly what I wanted. With the optimization flag, however, I get:

    macbook-pro:Desktop jmlopez$ g++ testing.cpp -O3
    macbook-pro:Desktop jmlopez$ ./a.out
    reporting every 3 seconds ...
    (a, b) = (0, 0)
    reported for 0 seconds.

This is not surprising, because the compiler probably rewrote my code to give the same answer in less time. My original question was why the reporter would not show me the values of the variables if I made the loops longer, for example:

    for (int i = 0; i < 1000000; ++i) {
        for (int j = 0; j < 100000; ++j) {
            for (int k = 0; k < 100000; ++k) {
                a = i;
                b = j;
            }
        }
    }

After rerunning the code with the optimization flag:

    macbook-pro:Desktop jmlopez$ g++ testing.cpp -O3
    macbook-pro:Desktop jmlopez$ ./a.out
    reporting every 3 seconds ...
    (a, b) = (0, 0)
    (a, b) = (0, 0)
    (a, b) = (0, 0)
    (a, b) = (0, 0)
    (a, b) = (0, 0)
    (a, b) = (0, 0)
    (a, b) = (0, 0)
    (a, b) = (0, 0)
    (a, b) = (0, 0)
    (a, b) = (0, 0)
    (a, b) = (0, 0)
    (a, b) = (0, 0)
    (a, b) = (0, 0)
    (a, b) = (0, 0)
    reported for 39 seconds.

Question: Is this a result of the optimization flag changing the code, so that it simply does not update the variables until the very end?

The main question:

In the reporter's end_report method, I call pthread_cancel. After reading the following answer, I began to doubt whether I should use that function and whether I am terminating the reporter thread correctly. For those familiar with pthreads, are there any obvious holes or potential problems in the way I use the thread?

2 answers

About the main question: you are close. Add a call to pthread_join() (http://linux.die.net/man/3/pthread_join) after pthread_cancel(), and everything should be fine.

The join call ensures that the thread's resources are cleaned up; forgetting it can, in certain cases, lead to exhaustion of thread resources.

And just to add, the important point when using pthread_cancel() (other than remembering to join the thread) is to make sure that the thread you cancel has a so-called cancellation point, which your thread does because it calls nanosleep() (and possibly also fopen, fprintf and fclose, which may be cancellation points). If there is no cancellation point, your thread will keep running.
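The cancel-then-join sequence can be sketched roughly as follows. This is only an illustration, assuming a POSIX system and a build with -pthread; sleepy_worker, cancel_and_join and start_then_stop are hypothetical names and are not part of the reporter class from the question.

```cpp
#include <pthread.h>
#include <time.h>

// Hypothetical worker: loops forever, sleeping one second per iteration.
// nanosleep() is a cancellation point, so a pending cancel takes effect there.
void* sleepy_worker(void*) {
    while (true) {
        struct timespec ts = {1, 0};
        nanosleep(&ts, NULL);
    }
    return NULL;  // not reached
}

// Request cancellation, then join so the thread's resources are released.
// Returns true if the thread terminated because of the cancellation request.
bool cancel_and_join(pthread_t thread) {
    if (pthread_cancel(thread) != 0) return false;
    void* result = NULL;
    if (pthread_join(thread, &result) != 0) return false;
    return result == PTHREAD_CANCELED;
}

// Hypothetical helper: start the worker and stop it again right away.
bool start_then_stop() {
    pthread_t thread;
    if (pthread_create(&thread, NULL, sleepy_worker, NULL) != 0) return false;
    return cancel_and_join(thread);
}
```

In the reporter from the question, the equivalent change would presumably be a pthread_join(thread, NULL) call right after pthread_cancel(thread) in end_report().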


C++ does not know about threads, and your code uses two local variables a and b, with no call inside the loop to a function whose code is unknown to the compiler.

What happens is that a and b end up in registers for the duration of the loop, and they are written back to memory only when the loop ends.

Although it is true that a and b must have a real memory address (since they are passed by reference to an external function), the compiler does not know that some external code that knows the addresses of a and b will run during the loop, and therefore prefers to keep all intermediate values in registers until the loop ends.

If the code in your loop, however, calls an unknown function (that is, a function whose implementation is not visible to the compiler), then the compiler is forced to write a and b back to memory before the call, because it must be paranoid and assume that the reporter, which was given the addresses of a and b, may have passed them on to that unknown function.
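One way to make the intermediate values visible to a second thread, rather than relying on an opaque function call, is to publish them through an atomic variable. The following is a minimal sketch, assuming a C++11 or later compiler (where the language does define threads and atomics) and a build with -pthread; compute and observe_final are hypothetical names, and this is not the approach used in the question's code.

```cpp
#include <atomic>
#include <thread>

// Worker loop: publishes the outer index through an atomic variable.
// Atomic stores are well-defined when another thread reads concurrently;
// in practice mainstream compilers emit each store instead of keeping the
// value in a register until the loop ends.
void compute(std::atomic<int>& a, int outer, int inner) {
    for (int i = 0; i < outer; ++i) {
        for (int j = 0; j < inner; ++j) {
            a.store(i, std::memory_order_relaxed);
        }
    }
}

// Runs compute() on a second thread and polls the shared value from this
// thread until the last outer index has been observed, then returns it.
int observe_final(int outer, int inner) {
    std::atomic<int> a(0);
    std::thread worker([&a, outer, inner] { compute(a, outer, inner); });
    int seen = 0;
    while (seen < outer - 1) {
        seen = a.load(std::memory_order_relaxed);
    }
    worker.join();
    return seen;
}
```

A reporter thread could read such an atomic with load() every few seconds instead of spinning as the observer does here; relaxed ordering is enough when the value is only displayed.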
