What data structure in C allows me to easily store strings and add strings?

Question

What data structure in C allows me to easily store strings and add strings?

I got a list of string data. 10,20,30 - line numbers

10. string 1 20. string 2 30. string 3

and if the user enters "23 string data". 23 - line number that the user wants to insert. Data should become such that

 10. string 1 20. string 2 23. string data 30. string 3

and if the user enters "40 string data". Data should become such that

 10. string 1 20. string 2 23. string data 30. string 3 40. string data

I am relatively new to data structure C. What data structure should be used to efficiently store this kind of data ? My current direction is to implement a dynamic array or linked list. However, below is a list of the problems I experienced.

The problem with dynamic array:

Using the line number as an index and create enough space to make sure the length of the array is always greater than or equal to the highest line number. Get plenty of memory for an unused index.
Printing data will be a problem. For example. index 0-9 has no allocated memory. Access to it will result in an error. Need to find ways to find out which index is being used?

Problem with linked list:

I can not go to the index (not sure). How to determine which line appears after another and easily insert a line between them?

+7

c linked-list arrays data-structures

ysj Nov 06 '16 at 6:47

source share

6 answers

Suppose the following about your requirements:

No strong real time. (for example, this is not for high frequency trading or machine control.)
It works on a relatively modern PC (RAM is measured in GB, processor frequency in GHz). In particular, it does not start in the embedded system.
Data no more than several tens of thousands of rows.

Then you can use almost any data structure that you like; it does not matter in terms of memory or runtime.

For example, to find the insertion point in a linked list, simply repeat the list. PCs are fast enough to repeat tens of thousands of times before you finish flashing.

Or just select an array of 100,000 lines of 80 characters each. No problems. Or a million lines. Still not a problem. Or 10 million lines, still not a problem. Do you see my thought? (In the array, you need a marker to indicate unused lines. I would use struct line { bool used; char text[80]; } or the like. You can also use arbitrary long lines and save memory with only the char *text member and dynamically distributing or defining the text as a linked list of pieces.)

The choice therefore comes down to what is easiest to use. May be an array.

+4

Peter A. Schneider Nov 06 '16 at 8:21

source share

I know that you are looking for a specialized data structure, but what about using a simple data structure, but sorting it is lazy? You can add new lines to a dynamic array and then sort the array (using qsort ) when you need to print them.

I think this would be better because printing all the lines is probably much less common than adding / inserting the lines. Therefore, you should make adding lines cheaper (in this case, O (1) is amortized), and printing can be more expensive (in this case, O (n log n)). It also simplifies your data structures and allows the C standard library to handle complex parts.

You can do this a little better by keeping a flag that keeps track of whether all data is already known for sorting; this method prints multiple times (or, suppose you try to write a BASIC interpreter many times) will also be cheap. This flag can also be useful if you expect strings to be usually entered in order; then as you add each line:

 alreadySorted = alreadySorted && (new_line_number > last_line_number)

I note that you did not indicate what would happen if a line was added that reuses the existing line number. If you want to replace the old line, you can customize this approach using a stable view and then iterating along the lines to delete lines with duplicate numbers, keeping only the last.

(If you want to make qsort stable for this case, instead of saving only a row for each row, you can save some additional metadata with it (any monotonously increasing counter will do, for example, the current time or just the total number of rows when adding a row). Then, for the comparison function you give qsort , you just need to use this extra data to resolve the links from duplicate line numbers.)

One of the drawbacks of this approach is that deleting rows will not be quick or will not immediately restore memory. However, you did not indicate whether deleting the line is a requirement; even if it is, it is likely to be a rare operation (so being a little more time inefficient or a little more spatially inefficient may be acceptable).

+2

jamesdlin Nov 06 '16 at 9:11

source share

The best solution to this problem is to use a dictionary data type. Of course, depending on the nature of the keys (number of rows), you can perform optimization through the corresponding hash table.

Of course, there is no dictionary implementation in library c. But you can create your own, based on red ebony. Kormen explained this data structure easily https://www.amazon.com/Introduction-Algorithms-3rd-MIT-Press/dp/0262033844

Note. If your collection is small or you rarely change the structure, you can simply use a linked list.

+1

Lmtinytoon Nov 06 '16 at 8:47

source share

My suggestion is to use a linked list and insertion sort, when necessary,

Here is the code modified on originally taken from geeksforgeeks.org,

I have not tested the code, it is just a modified code taken from the site.

  /* C program for insertion sort on a linked list */ #include<stdio.h> #include<stdlib.h> /* Link list node */ struct node { int lineNumber; char *str; struct node* next; }node; // Function to insert a given node in a sorted linked list void sortedInsert(struct node**, struct node*); // function to sort a singly linked list using insertion sort void insertionSort(struct node **head_ref) { // Initialize sorted linked list struct node *sorted = NULL; // Traverse the given linked list and insert every // node to sorted struct node *current = *head_ref; while (current != NULL) { // Store next for next iteration struct node *next = current->next; // insert current in sorted linked list sortedInsert(&sorted, current); // Update current current = next; } // Update head_ref to point to sorted linked list *head_ref = sorted; } /* function to insert a new_node in a list. Note that this function expects a pointer to head_ref as this can modify the head of the input linked list (similar to push())*/ void sortedInsert(struct node** head_ref, struct node* new_node) { struct node* current; /* Special case for the head end */ if (*head_ref == NULL || (*head_ref)->lineNumber >= new_node->lineNumber) { new_node->next = *head_ref; *head_ref = new_node; } else { /* Locate the node before the point of insertion */ current = *head_ref; while (current->next!=NULL && current->next->lineNumber < new_node->lineNumber) { current = current->next; } new_node->next = current->next; current->next = new_node; } } /* BELOW FUNCTIONS ARE JUST UTILITY TO TEST sortedInsert */ /* Function to print linked list */ void printList(struct node *head) { struct node *temp = head; while(temp != NULL) { printf("%d %s \n", temp->lineNumber,temp->str); temp = temp->next; } } /* A utility function to insert a node at the beginning of linked list */ void push(struct node** head_ref, int new_data, char *line) { /* allocate node */ struct node* new_node = (struct node *)malloc(sizeof(struct node)); int len = strlen(line)+1; /* put in the data */ new_node->lineNumber = new_data; new_node->str = malloc(len); strcpy(new_node->str,line); new_node->str[len] = '\0'; /* link the old list off the new node */ new_node->next = (*head_ref); /* move the head to point to the new node */ (*head_ref) = new_node; } // Driver program to test above functions int main(int argc,char *argv[]) { struct node *a = NULL; push(&a, 5 , "TestLine"); push(&a, 1 , "SecondTest"); push(&a, 1 , "SecondTest"); push(&a, 3 , "SecondTest"); insertionSort(&a); printf("\nLinked List after sorting \n"); printList(a); return 0; }

code>

+1

cslrnr Nov 06 '16 at 10:24

source share

I would advise you to use a linked list.

 // Define your list like this typedef struct node { int line; // To hold the line number char * data; struct node * next; } node_t; // To insert node_t* insert(node_t *head, const char * data, int line) // n is line from beginning { // Node to be inserted in given line node_t *newNode; // Allocating Memory newNode = malloc(sizeof(node_t)); // Filling the Data to New Node newNode->data = malloc(strlen(data)+1); // Allocate memory to store data strcpy(newNode->data, data); newNode->line = line; newNode->next = NULL; // It might be our First Node in Linked List if(head == NULL) { //Address of New Node Becomes our head return (head = newNode); } // Node Might be inserted At Head else if(line == 0) { // Joining previous Linked List After new Node newNode->next = head; // Address of New Node Becomes our head return (head = newNode); } // Inserting At the line next to line else { // Pointer to store intermediate address of node // To be used in Traversing node_t * current = head; // Go through to insert at Nth line while(current != NULL) { node_t * next = current->next; //The next Node if((line >= current->line && line < next->line) || (line >= current->line && NULL == next->line)) { // Test if we are at some point between current line and next line or if there is no next // If we are, point newNode to the next node of current newNode->next = current->next; // Now point current towards our New Node current->next = newNode; // Return Head as soon as we have inserted our new node return head; } current = next; // Point current to the next node to continue } } }

If you guarantee that line numbers will always be larger, you can also keep a pointer to the node with the highest line number in each node. This will increase the space, but will result in a result for n (0).

0

KBJ Nov 06 '16 at 8:00

source share

sudo · Accepted Answer · 2016-11-06T07:07:11+0000

I will give two solutions that I could come up with, but this question is probably open.

Use a hash table. The keys are line numbers. Values (string, pointer to next line value) . This provides quick and linear access. Edit: The insert is still O(n) with this. This will only help with the access time, which will be O(1) . The second solution has an O(1) insert.
Assuming you don't have wildly spaced line numbers: use a single L list to store the lines. Also create a separate P array containing a pointer to each k -th node in the list. To access line i , check P[floor(i/k)] , go to the node that points to L , and go forward i mod k times to reach your line. Thus, the access time is O(k) . Insertion time O(1) . Using space for strings n - O(n + max{i}/k) .

The only thing that makes this relevant for C ... is that there is no built-in hash table, of course! So # 2 may be easier to implement.

What data structure in C allows me to easily store strings and add strings?

More articles: