(An initial poster and, rather, a new one in programming, so be patient!)
I'm interested in how an efficient general algorithm for printing formatted binary trees (in the CLI environment) and C. Here is some code that I wrote for fun (this is a very simplified version of the original and part of a larger program that supports many BST operations, but it should compile just fine):
#include <stdbool.h> // C99, boolean type support #include <stdio.h> #include <stdlib.h> #include <math.h> #define DATATYPE_IS_DOUBLE #define NDEBUG // disable assertions #include <assert.h> #define WCHARBUF_LINES 20 // def: 20 #define WCHARBUF_COLMS 800 // def: 80 (using a huge number, like 500, is a good idea, // in order to prevent a buffer overflow :) #define RECOMMENDED_CONS_WIDTH 150 #define RECOMMENDED_CONS_WIDTHQ "150" // use the same value, quoted /* Preprocessor directives depending on DATATYPE_IS_* : */ #if defined DATATYPE_IS_INT || defined DATATYPE_IS_LONG #define DTYPE long int #define DTYPE_STRING "INTEGER" #define DTYPE_PRINTF "%*.*ld" #undef DATATYPE_IS_CHAR #elif defined DATATYPE_IS_FLOAT #define DTYPE float #define DTYPE_STRING "FLOAT" #define DTYPE_PRINTF "%*.*f" #undef DATATYPE_IS_CHAR #elif defined DATATYPE_IS_DOUBLE #define DTYPE double #define DTYPE_STRING "DOUBLE" #define DTYPE_PRINTF "%*.*lf" #undef DATATYPE_IS_CHAR #elif defined DATATYPE_IS_CHAR #define DTYPE char #define DTYPE_STRING "CHARACTER" #define DTYPE_PRINTF "%*.*c" /* using the "precision" sub-specifier ( .* ) with a */ /* character will produce a harmless compiler warning */ #else #error "DATATYPE_IS_* preprocessor directive undefined!" #endif typedef struct node_struct { DTYPE data; struct node_struct *left; struct node_struct *right; /* int height; // useful for AVL trees */ } node; typedef struct { node *root; bool IsAVL; // useful for AVL trees long size; } tree; static inline DTYPE get_largest(node *n){ if (n == NULL) return (DTYPE)0; for(; n->right != NULL; n=n->right); return n->data; } static int subtreeheight(node *ST){ if (ST == NULL) return -1; int height_left = subtreeheight(ST->left); int height_right = subtreeheight(ST->right); return (height_left > height_right) ? (height_left + 1) : (height_right + 1); } void prettyprint_tree(tree *T){ if (T == NULL) // if T empty, abort return; #ifndef DATATYPE_IS_CHAR /* then DTYPE is a numeric type */ /* compute spaces, find width: */ int width, i, j; DTYPE max = get_largest(T->root); width = (max < 10) ? 1 : (max < 100) ? 2 : (max < 1000) ? 3 : (max < 10000) ? 4 : (max < 100000) ? 5 : (max < 1000000) ? 6 : (max < 10000000) ? 7 : (max < 100000000) ? 8 : (max < 1000000000) ? 9 : 10; assert (max < 10000000000); width += 2; // needed for prettier results #if defined DATATYPE_IS_FLOAT || defined DATATYPE_IS_DOUBLE width += 2; // because of the decimals! (1 decimal is printed by default...) #endif // float or double int spacesafter = width / 2; int spacesbefore = spacesafter + 1; //int spacesbefore = ceil(width / 2.0); #else /* character input */ int i, j, width = 3, spacesbefore = 2, spacesafter = 1; #endif // #ifndef DATATYPE_IS_CHAR /* start wchar_t printing, using a 2D character array with swprintf() : */ struct columninfo{ // auxiliary structure bool visited; int col; }; wchar_t wcharbuf[WCHARBUF_LINES][WCHARBUF_COLMS]; int line=0; struct columninfo eachline[WCHARBUF_LINES]; for (i=0; i<WCHARBUF_LINES; ++i){ // initialization for (j=0; j<WCHARBUF_COLMS; ++j) wcharbuf[i][j] = (wchar_t)' '; eachline[i].visited = false; eachline[i].col = 0; } int height = subtreeheight(T->root); void recur_swprintf(node *ST, int cur_line, const wchar_t *nullstr){ // nested function, // GCC extension! float offset = width * pow(2, height - cur_line); ++cur_line; if (eachline[cur_line].visited == false) { eachline[cur_line].col = (int) (offset / 2); eachline[cur_line].visited = true; } else{ eachline[cur_line].col += (int) offset; if (eachline[cur_line].col + width > WCHARBUF_COLMS) swprintf(wcharbuf[cur_line], L" BUFFER OVERFLOW DETECTED! "); } if (ST == NULL){ swprintf(wcharbuf[cur_line] + eachline[cur_line].col, L"%*.*s", 0, width, nullstr); if (cur_line <= height){ /* use spaces instead of the nullstr for all the "children" of a NULL node */ recur_swprintf(NULL, cur_line, L" "); recur_swprintf(NULL, cur_line, L" "); } else return; } else{ recur_swprintf(ST->left, cur_line, nullstr); recur_swprintf(ST->right, cur_line, nullstr); swprintf(wcharbuf[cur_line] + eachline[cur_line].col - 1, L"("DTYPE_PRINTF"", spacesbefore, 1, ST->data); //swprintf(wcharbuf[cur_line] + eachline[cur_line].col + spacesafter + 1, L")"); swprintf(wcharbuf[cur_line] + eachline[cur_line].col + spacesafter + 2, L")"); } } void call_recur(tree *tr){ // nested function, GCC extension! (wraps recur_swprintf()) recur_swprintf(tr->root, -1, L"NULL"); } call_recur(T); /* Omit empty columns: */ int omit_cols(void){ // nested function, GCC extension! int col; for (col=0; col<RECOMMENDED_CONS_WIDTH; ++col) for (line=0; line <= height+1; ++line) if (wcharbuf[line][col] != ' ' && wcharbuf[line][col] != '\0') return col; return 0; } /* Use fputwc to transfer the character array to the screen: */ j = omit_cols() - 2; j = (j < 0) ? 0 : j; for (line=0; line <= height+1; ++line){ // assumes RECOMMENDED_CONS_WIDTH console window! fputwc('\n', stdout); // optional blanc line for (i=j; i<j+RECOMMENDED_CONS_WIDTH && i<WCHARBUF_COLMS; ++i) fputwc(wcharbuf[line][i], stdout); fputwc('\n', stdout); } }
(also loaded into the pastebin service to preserve syntax highlighting)
It works pretty well, although automatic width adjustment might be better. The magic of the preprocessor is a little stupid (or even ugly) and is not related to the algorithm, but allows you to use different types of data in the nodes of the tree (I saw that this is a chance to experiment a little with the preprocessor - remember, I'm new!).
The main program should call
system ("mode con: cols =" RECOMMENDED_CONS_WIDTHQ "lines = 2000");
before calling prettyprint_tree () when running inside cmd.exe.
Output Example:
(106.0)
(102.0) (109.0)
(101.5) NULL (107.0) (115.0)
NULL NULL (106.1) NULL (113.0) NULL
NULL NULL NULL NULL
Ideally, the output would be like this (the reason I use the wprintf () family of functions can print Unicode characters anyway):
(107.0)
┌─────┴─────┐
(106.1) NULL
┌───┴───┐
Null null
So my questions are:
- What do you think of this code? (Coding style suggestions are also very welcome!)
- Can it be expanded in an elegant way to include line drawing characters? (Unfortunately, I do not think so.)
- Any other algorithms in C or other imperative languages (or imperative pseudocode)?
- A bit unrelated: what's your opinion on nested functions (GNU non-portable extension)? I believe that this is an elegant way to write recursive parts of a function without having to provide all local variables as arguments (and also useful as a method to hide the implementation), but this may be my Pascal by :-) I'm interested in the opinion of more experienced coders.
Thank you in advance for your answers!
PS. This question is not a duplicate of this .
change Jonathan Leffler wrote a great excellent answer , which is likely to become an “accepted answer” in a few days (if only someone says something equally amazing!). I decided to answer here instead of comments due to space limitations.
- The above code is actually part of a larger homework project (implementing BST operations in a shared library + CLI application using this library). However, the "prettyprint" function was not part of the requirements; I just decided to add myself.
- I also added the function "convert to AVL without rotation", which used "arraystr" as an intermediate view ;-) I forgot that it was not used here. I edited the code to remove it. In addition, a member of the
bool IsAVL structure is nothing but unused; just not used in this particular function. I had to copy / paste the code from different files and make a lot of changes to present the code above. This is a problem that I don’t know how to solve. I would love to publish the whole program, but it is too large and commented on in my native language (not in English!). - The entire project was about 1,600 LOCs (including comments) with several build goals (debug / release / static-linking), and it compiled purely with
-Wall and -Wextra . Assertions and debug messages were automatically enabled / disabled depending on the purpose of the build. I also thought that function prototypes are not needed for nested functions, because all nested functions do not implement any external interface by definition - GCC, of course, did not complain here. I do not know why there are so many warnings in OSX :( - I am using GCC 4.4.1 on Windows 7.
- Although I write and test this program on Windows, I am actually a Linux user ... However, I cannot stand
vim , and instead I use nano (inside the GNU screen) or gedit (shoot me) ! Anyway, I prefer the K & R brace style :) - Portability doesn't really matter, for Linux GCC users, it’s pretty much de facto ... The fact that it also works well under Windows is a great bonus.
- I do not use VCS, perhaps I should. I want to try, but they all seem too complicated for my needs, and I don’t know how to choose them :-)
- You are definitely right in checking for depth overflow, fortunately it is very easy to add.
- Thanks for the
L' ' tip! - I find your suggestion (encapsulating "the entire drawing code so that the image on the screen and related information are in the same structure") is extremely interesting ... but I really do not understand what you mean by "encapsulation". Could you provide 3 or 4 lines of (pseudo) code showing a possible function declaration and / or a possible function call?
This is my first big-ish (and non-trivial) program, and I am very grateful for your advice.
edit # 2: Here is the implementation of the “quick and dirty” method mentioned here .
( edit # 3: I decided to split it into a separate answer , as it is a valid answer to the OP.)
Many answers are mentioned by Graphviz . I already knew about this (many Linux applications are associated with it), but I thought it would be redundant for a 10K CLK executable. However, I will keep this in mind in the future. It seems healthy.