std::getline alternative when input line endings are mixed

I am trying to read lines from a std::istream, but the input may contain '\r' and/or '\n', so std::getline is no use.

Sorry to shout, but this seems to need emphasis...

The input may contain either type of newline, or both.

Is there a standard way to do this? At the moment I'm trying

    char c;
    while (in >> c && '\n' != c && '\r' != c)
        out.push_back(c);

... but it skips whitespace. D'oh! std::noskipws needs more fiddling, and now it's misbehaving.

Surely there must be a better way?!?

+7
2 answers

OK, here is one way to do it. I basically wrote an implementation of std::getline that takes a predicate instead of a delimiter character. This gets you two-thirds of the way there:

    #include <istream>
    #include <streambuf>
    #include <string>

    // Like std::getline, but stops at the first character for which p(ch) is true.
    template <class Ch, class Tr, class A, class Pred>
    std::basic_istream<Ch, Tr> &getline(std::basic_istream<Ch, Tr> &is,
                                        std::basic_string<Ch, Tr, A> &str, Pred p) {
        typename std::basic_string<Ch, Tr, A>::size_type nread = 0;
        typename std::basic_istream<Ch, Tr>::sentry guard(is, true);
        if (guard) {
            std::basic_streambuf<Ch, Tr> *sbuf = is.rdbuf();
            str.clear();
            while (nread < str.max_size()) {
                const typename Tr::int_type c1 = sbuf->sbumpc();
                if (Tr::eq_int_type(c1, Tr::eof())) {
                    is.setstate(std::ios_base::eofbit);
                    break;
                }
                ++nread;
                const Ch ch = Tr::to_char_type(c1);
                if (p(ch)) {
                    break;  // delimiter: consumed but not stored
                }
                str.push_back(ch);
            }
        }
        // Nothing extracted, or the string is full: report failure like std::getline.
        if (nread == 0 || nread >= str.max_size()) {
            is.setstate(std::ios_base::failbit);
        }
        return is;
    }

with a functor like this:

    struct is_newline {
        bool operator()(char ch) const {
            return ch == '\n' || ch == '\r';
        }
    };

Now all that remains is to determine whether the line ended with a '\r' or not... if it did, and the next character is '\n', just consume it and ignore it.

EDIT: To put all of this together into a working solution, here is an example:

    #include <string>
    #include <sstream>
    #include <iostream>

    namespace util {

    // Stateful predicate: remembers the last character it was asked about.
    struct is_newline {
        is_newline() : ch_('\0') {}
        bool operator()(char ch) {
            ch_ = ch;
            return ch_ == '\n' || ch_ == '\r';
        }
        char ch_;
    };

    // Like std::getline, but stops at the first character for which p(ch) is true.
    template <class Ch, class Tr, class A, class Pred>
    std::basic_istream<Ch, Tr> &getline(std::basic_istream<Ch, Tr> &is,
                                        std::basic_string<Ch, Tr, A> &str, Pred &p) {
        typename std::basic_string<Ch, Tr, A>::size_type nread = 0;
        typename std::basic_istream<Ch, Tr>::sentry guard(is, true);
        if (guard) {
            std::basic_streambuf<Ch, Tr> *const sbuf = is.rdbuf();
            str.clear();
            while (nread < str.max_size()) {
                const typename Tr::int_type c1 = sbuf->sbumpc();
                if (Tr::eq_int_type(c1, Tr::eof())) {
                    is.setstate(std::ios_base::eofbit);
                    break;
                }
                ++nread;
                const Ch ch = Tr::to_char_type(c1);
                if (p(ch)) {
                    break;  // delimiter: consumed but not stored
                }
                str.push_back(ch);
            }
        }
        if (nread == 0 || nread >= str.max_size()) {
            is.setstate(std::ios_base::failbit);
        }
        return is;
    }

    }  // namespace util

    int main() {
        std::stringstream ss("this\ris a\ntest\r\nyay");
        std::string item;
        util::is_newline is_newline;
        while (util::getline(ss, item, is_newline)) {
            // Treat "\r\n" as a single line break.
            if (is_newline.ch_ == '\r' && ss.peek() == '\n') {
                ss.ignore(1);
            }
            std::cout << '[' << item << ']' << std::endl;
        }
    }

I made a few minor changes to my original example. The Pred p parameter is now a reference, so the predicate can store some data (in particular, the last char it tested). Likewise, I made the predicate's operator() non-const so that it can store that character.
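To illustrate why the reference matters, here is a minimal sketch. The names remember_last, scan_by_value and scan_by_ref are made up for this illustration and are not part of the answer's code; they only show that a predicate passed by value is copied, so whatever it records is lost to the caller.

```cpp
#include <string>

// Hypothetical stateful predicate, modelled on is_newline from the answer.
struct remember_last {
    remember_last() : ch_('\0') {}
    bool operator()(char ch) {
        ch_ = ch;  // record the last character tested
        return ch == '\n' || ch == '\r';
    }
    char ch_;
};

// Takes the predicate by value: p is a private copy of the caller's object.
template <class Pred>
std::string::size_type scan_by_value(const std::string &s, Pred p) {
    std::string::size_type i = 0;
    while (i < s.size() && !p(s[i])) ++i;
    return i;  // index of the first line-break character, or s.size()
}

// Takes the predicate by reference: the caller's object is updated.
template <class Pred>
std::string::size_type scan_by_ref(const std::string &s, Pred &p) {
    std::string::size_type i = 0;
    while (i < s.size() && !p(s[i])) ++i;
    return i;
}
```

After scan_by_value the caller's remember_last still holds its initial '\0', whereas after scan_by_ref it holds the character that stopped the scan. That is the information the loop in main() relies on when it checks is_newline.ch_.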

Basically, I have a string in a std::stringstream that contains all three kinds of line break. I use my util::getline, and if the predicate object says the last char was '\r', I peek() ahead and ignore one character if it is '\n'.

+4

The usual way to read a line is std::getline.

Edit: if your implementation of std::getline is broken, you can write something similar yourself, along these lines:

    std::istream &getline(std::istream &is, std::string &s) {
        char ch;
        s.clear();
        while (is.get(ch) && ch != '\n' && ch != '\r')
            s += ch;
        return is;
    }

I should add that technically this is probably not so much a matter of std::getline being broken as of the underlying stream implementation being broken: it is up to the stream to translate whatever characters signify the end of a line on the platform into a newline character. Regardless of exactly which part is broken, however, if your implementation is broken, this may help compensate for it (then again, if your implementation is broken badly enough, it's hard to be sure this will work either).
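Note that the simple replacement above sees "\r\n" as two line breaks (so it yields an extra empty line), and it leaves the stream in a failed state after a final line that has no terminator. A possible variant that handles both cases might look like the sketch below. The names getline_any and split_lines are invented for this illustration; treat it as a starting point under those assumptions rather than a tested drop-in.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads one line, accepting "\n", "\r" or "\r\n"
// as the terminator. The terminator is consumed but not stored.
std::istream &getline_any(std::istream &is, std::string &s) {
    s.clear();
    char ch;
    bool got_any = false;
    while (is.get(ch)) {
        got_any = true;
        if (ch == '\n') {
            return is;
        }
        if (ch == '\r') {
            // Swallow the '\n' of a "\r\n" pair so it counts as one break.
            if (is.peek() == '\n') {
                is.ignore(1);
            }
            return is;
        }
        s += ch;
    }
    // get() failed. If characters were read first, this was a final line
    // without a terminator: clear failbit (keeping eofbit) so the caller
    // still receives that line.
    if (got_any) {
        is.clear(is.rdstate() & ~std::ios_base::failbit);
    }
    return is;
}

// Hypothetical convenience wrapper used to demonstrate the helper.
std::vector<std::string> split_lines(const std::string &text) {
    std::istringstream in(text);
    std::vector<std::string> lines;
    std::string line;
    while (getline_any(in, line)) {
        lines.push_back(line);
    }
    return lines;
}
```

With this sketch, split_lines("this\ris a\ntest\r\nyay") is intended to produce the four lines "this", "is a", "test" and "yay", matching the example in the other answer.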

+3