How to Read an Entire Line From a File in C
Solarian Programmer
My programming ramblings
C Programming - read a file line by line with fgets and getline, implement a portable getline version
Posted on April 3, 2019 by Paul
In this article, I will show you how to read a text file line by line in C using the standard C function fgets and the POSIX getline function. At the end of the article, I will write a portable implementation of the getline function that can be used with any standard C compiler.
Reading a file line by line is a trivial problem in many programming languages, but not in C. The standard way of reading a line of text in C is to use the fgets function, which is fine if you know in advance how long a line of text could be.
You can find all the code examples and the input file at the GitHub repo for this article.
Let's start with a simple example of using fgets to read chunks from a text file:
    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        FILE *fp = fopen("lorem.txt", "r");
        if (fp == NULL) {
            perror("Unable to open file!");
            exit(1);
        }

        char chunk[128];

        while (fgets(chunk, sizeof(chunk), fp) != NULL) {
            fputs(chunk, stdout);
            fputs("|*\n", stdout);  // marker string used to show where the content of the chunk array has ended
        }

        fclose(fp);
    }
For testing the code I've used a simple dummy file, lorem.txt. This is a piece from the output of the above program on my machine:
    ~ $ clang -std=c17 -Wall -Wextra -pedantic t0.c -o t0
    ~ $ ./t0
    Lorem ipsum dolor sit amet, consectetur adipiscing elit.
    |*
    Fusce dignissim facilisis ligula consectetur hendrerit. Vestibulum porttitor aliquam luctus. Nam pharetra lorem vel ornare cond|*
    imentum.
    |*
    Praesent et nunc at libero vulputate convallis. Cras egestas nunc vitae eros vehicula hendrerit. Pellentesque in est et sapien |*
    dignissim molestie.
    |*
The code prints the content of the chunk array, as filled after every call to fgets, and a marker string.
If you watch carefully, by scrolling the above text snippet to the right, you can see that the output was truncated to 127 characters per line of text. This was expected because our code can store an entire line from the original text file only if the line can fit inside our chunk array.
What if you need to have the entire line of text available for further processing and not just a piece of the line? A possible solution is to copy or concatenate chunks of text in a separate line buffer until we find the end of line character.
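The "end of line" test that this approach relies on can be sketched as a small helper. This is only an illustration: the name chunk_ends_line is hypothetical and does not appear in the article's code, which performs the same check inline.

```c
#include <stddef.h>
#include <string.h>

// Hypothetical helper (not part of the article's code): returns 1 if the
// chunk filled by fgets ends with '\n', meaning the last piece of the
// current line has been read, and 0 otherwise.
int chunk_ends_line(const char *chunk) {
    size_t n = strlen(chunk);
    return n > 0 && chunk[n - 1] == '\n';
}
```

A chunk that does not end with a newline means either that the line is longer than the chunk array or that the file ended without a final newline.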
Let's start by creating a line buffer that will store the chunks of text. Initially this will have the same length as the chunk array:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void) {
        FILE *fp = fopen("lorem.txt", "r");
        // ...

        char chunk[128];

        // Store the chunks of text into a line buffer
        size_t len = sizeof(chunk);
        char *line = malloc(len);
        if (line == NULL) {
            perror("Unable to allocate memory for the line buffer.");
            exit(1);
        }

        // "Empty" the string
        line[0] = '\0';

        // ...

    }
Next, we are going to append the content of the chunk array to the end of the line string, until we find the end of line character. If necessary, we'll resize the line buffer:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void) {
        // ...

        // "Empty" the string
        line[0] = '\0';

        while (fgets(chunk, sizeof(chunk), fp) != NULL) {
            // Resize the line buffer if necessary
            size_t len_used = strlen(line);
            size_t chunk_used = strlen(chunk);

            // We need room for the chunk plus the terminating '\0'
            if (len - len_used <= chunk_used) {
                len *= 2;
                // Keep the old pointer until realloc succeeds, so it can still be freed
                char *tmp = realloc(line, len);
                if (tmp == NULL) {
                    perror("Unable to reallocate memory for the line buffer.");
                    free(line);
                    exit(1);
                }
                line = tmp;
            }

            // Copy the chunk to the end of the line buffer
            strncpy(line + len_used, chunk, len - len_used);
            len_used += chunk_used;

            // Check if line contains '\n', if yes process the line of text
            if (line[len_used - 1] == '\n') {
                fputs(line, stdout);
                fputs("|*\n", stdout);
                // "Empty" the line buffer
                line[0] = '\0';
            }
        }

        fclose(fp);
        free(line);

        printf("\n\nMax line size: %zu\n", len);
    }
Please note that, in the above code, every time the line buffer needs to be resized its capacity is doubled.
This is the result of running the above code on my machine. For brevity, I kept only the first lines of output:
    ~ $ clang -std=c17 -Wall -Wextra -pedantic t1.c -o t1
    ~ $ ./t1
    Lorem ipsum dolor sit amet, consectetur adipiscing elit.
    |*
    Fusce dignissim facilisis ligula consectetur hendrerit. Vestibulum porttitor aliquam luctus. Nam pharetra lorem vel ornare condimentum.
    |*
    Praesent et nunc at libero vulputate convallis. Cras egestas nunc vitae eros vehicula hendrerit. Pellentesque in est et sapien dignissim molestie.
    |*
    Aliquam erat volutpat. Mauris dignissim augue ac purus placerat scelerisque. Donec eleifend ut nibh eu elementum.
    |*
You can see that, this time, we can print full lines of text and not fixed length chunks like in the initial approach.
Let's modify the above code in order to print the line length instead of the actual text:
    // ...

    int main(void) {
        // ...

        while (fgets(chunk, sizeof(chunk), fp) != NULL) {

            // ...

            // Check if line contains '\n', if yes process the line of text
            if (line[len_used - 1] == '\n') {
                printf("line length: %zu\n", len_used);
                // "Empty" the line buffer
                line[0] = '\0';
            }
        }

        fclose(fp);
        free(line);

        printf("\n\nMax line size: %zu\n", len);
    }
This is the result of running the modified code on my machine:
    ~ $ clang -std=c17 -Wall -Wextra -pedantic t1.c -o t1
    ~ $ ./t1
    line length: 57
    line length: 136
    line length: 147
    line length: 114
    line length: 112
    line length: 95
    line length: 62
    line length: 1
    line length: 428
    line length: 1
    line length: 460
    line length: 1
    line length: 834
    line length: 1
    line length: 821


    Max line size: 1024
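The final capacity of 1024 bytes follows from the doubling strategy: the buffer starts at 128 bytes and is doubled until the longest line (834 characters plus the terminating '\0') fits. A minimal sketch of that arithmetic, using a hypothetical helper named grown_capacity that is not part of the article's code:

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: doubles capacity until it is at least `needed` bytes.
// `needed` should already include the terminating '\0'.
size_t grown_capacity(size_t capacity, size_t needed) {
    if (capacity == 0) {
        capacity = 1;
    }
    while (capacity < needed) {
        if (capacity > SIZE_MAX / 2) {
            return needed;  // doubling would overflow, fall back to the exact size
        }
        capacity *= 2;
    }
    return capacity;
}
```

With these assumptions, grown_capacity(128, 835) goes through 256 and 512 before settling on 1024, which matches the value printed above.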
In the next example, I will show you how to use the getline function available on POSIX systems like Linux, Unix and macOS. Microsoft Visual Studio doesn't have an equivalent function, so you won't be able to easily test this example on a Windows system. However, you should be able to test it if you are using Cygwin or Windows Subsystem for Linux.
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void) {
        FILE *fp = fopen("lorem.txt", "r");
        if (fp == NULL) {
            perror("Unable to open file!");
            exit(1);
        }

        // Read lines using POSIX function getline
        // This code won't work on Windows
        char *line = NULL;
        size_t len = 0;

        while (getline(&line, &len, fp) != -1) {
            printf("line length: %zu\n", strlen(line));
        }

        printf("\n\nMax line size: %zu\n", len);

        fclose(fp);
        free(line);  // getline will resize the input buffer as necessary
                     // the user needs to free the memory when not needed!
    }
Please note how simple it is to use POSIX's getline versus manually buffering chunks of a line like in my previous example. It is unfortunate that the standard C library doesn't include an equivalent function.
When you use getline, don't forget to free the line buffer when you don't need it anymore. Also, calling getline more than once will overwrite the line buffer, so make a copy of the line content if you need to keep it for further processing.
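Keeping a line around after the next getline call means copying it out of the shared buffer. One possible way to do that in standard C is sketched below; the helper name copy_line is hypothetical and not part of the article's code.

```c
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: returns a heap-allocated, null-terminated copy of the
// first n characters of line, or NULL if the allocation fails.
// The caller is responsible for calling free() on the result.
char *copy_line(const char *line, size_t n) {
    char *copy = malloc(n + 1);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, line, n);
    copy[n] = '\0';
    return copy;
}
```

Inside the reading loop, n can be the value returned by getline (the number of characters read, including the newline) or strlen(line). On POSIX systems, strdup does the same job for a whole string.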
This is the result of running the above getline example on a Linux machine:
    ~ $ clang -std=gnu17 -Wall -Wextra -pedantic t2.c -o t2
    ~ $ ./t2
    line length: 57
    line length: 136
    line length: 147
    line length: 114
    line length: 112
    line length: 95
    line length: 62
    line length: 1
    line length: 428
    line length: 1
    line length: 460
    line length: 1
    line length: 834
    line length: 1
    line length: 821


    Max line size: 960
It is interesting to note that, for this particular case, the getline function on Linux resizes the line buffer to a max of 960 bytes. If you run the same code on macOS the line buffer is resized to 1024 bytes. This is due to the different ways in which getline is implemented on different Unix-like systems.
As mentioned before, getline is not present in the C standard library. It could be an interesting exercise to implement a portable version of this function. The idea here is not to implement the most performant version of getline, but rather to implement a simple replacement for non-POSIX systems.
We are going to take the above example and replace the POSIX getline version with our own implementation, say my_getline. Obviously, if you are on a POSIX system, you should use the version provided by the operating system, which was tested by countless users and tuned for optimal performance.
The POSIX getline function has this signature:
    ssize_t getline(char **restrict lineptr, size_t *restrict n, FILE *restrict stream);
Since ssize_t is also a POSIX defined type, usually a 64-bit signed integer, this is how we are going to declare our version:
    int64_t my_getline(char **restrict line, size_t *restrict len, FILE *restrict fp);
In principle we are going to implement the function using the same approach as in one of the above examples, where I've defined a line buffer and kept copying chunks of text in the buffer until we found the end of line character:
    // This will only have effect on Windows with MSVC
    #ifdef _MSC_VER
        #define _CRT_SECURE_NO_WARNINGS 1
        #define restrict __restrict
    #endif
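What follows is a minimal sketch of how my_getline could be written along those lines. It is not the author's original listing: the 128-byte chunk size and the doubling strategy are assumptions carried over from the earlier fgets example, and details such as error reporting through errno and lines with embedded '\0' characters are not handled.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#ifdef _MSC_VER
#define restrict __restrict
#endif

// Sketch of a getline replacement built on fgets (assumed design, see above).
// Returns the number of characters read, including the '\n' if present,
// or -1 on end of file, invalid arguments, or allocation failure.
int64_t my_getline(char **restrict line, size_t *restrict len, FILE *restrict fp) {
    if (line == NULL || len == NULL || fp == NULL) {
        return -1;
    }

    char chunk[128];

    // Allocate (or enlarge) the buffer if the caller's buffer is missing or too small
    if (*line == NULL || *len < sizeof(chunk)) {
        char *tmp = realloc(*line, sizeof(chunk));
        if (tmp == NULL) {
            return -1;
        }
        *line = tmp;
        *len = sizeof(chunk);
    }

    (*line)[0] = '\0';
    size_t len_used = 0;

    while (fgets(chunk, sizeof(chunk), fp) != NULL) {
        size_t chunk_used = strlen(chunk);

        // Double the buffer if the chunk plus the terminating '\0' does not fit
        if (*len - len_used < chunk_used + 1) {
            size_t new_len = *len * 2;
            char *tmp = realloc(*line, new_len);
            if (tmp == NULL) {
                return -1;
            }
            *line = tmp;
            *len = new_len;
        }

        // Append the chunk, including its terminating '\0'
        memcpy(*line + len_used, chunk, chunk_used + 1);
        len_used += chunk_used;

        // A trailing '\n' means the line is complete
        if (len_used > 0 && (*line)[len_used - 1] == '\n') {
            return (int64_t)len_used;
        }
    }

    // End of file: return the last line if it had no trailing '\n', else -1
    return len_used > 0 ? (int64_t)len_used : -1;
}
```

It is meant to be called like the POSIX version: start with a NULL line pointer and a len of 0, loop while the return value is not -1, and free the buffer once at the end.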