I am new to Rust and have been reading "The Rust Programming Language". Its "Error Handling" section contains a "case study" describing a program that reads data from a CSV file using the csv and rustc-serialize libraries (with getopts for argument parsing).
The author writes a search function that steps through the rows of the CSV file using a csv::Reader object, collects the records whose "city" field matches a specified value into a vector, and returns that vector. I took a slightly different approach than the author, but that should not affect my question. My (working) function looks like this:

    extern crate csv;
    extern crate rustc_serialize;

    use std::path::Path;
    use std::fs::File;

    fn search<P>(data_path: P, city: &str) -> Vec<DataRow>
        where P: AsRef<Path>
    {
        let file = File::open(data_path).expect("Opening file failed!");
        let mut reader = csv::Reader::from_reader(file).has_headers(true);

        reader.decode()
              .map(|row| row.expect("Failed decoding row"))
              .filter(|row: &DataRow| row.city == city)
              .collect()
    }

where DataRow is just a plain record type:

    #[derive(Debug, RustcDecodable)]
    struct DataRow {
        country: String,
        city: String,
        accent_city: String,
        region: String,
        population: Option<u64>,
        latitude: Option<f64>,
        longitude: Option<f64>
    }

Now the author poses, as the dreaded "exercise for the reader", the problem of changing this function to return an iterator instead of a vector (eliminating the call to collect). My question is: how can this be done at all, and what are the most concise and idiomatic ways to do it?
A simple attempt, which I believe gets the type signature right, is

    fn search_iter<'a,P>(data_path: P, city: &'a str)
        -> Box<Iterator<Item=DataRow> + 'a>
        where P: AsRef<Path>
    {
        let file = File::open(data_path).expect("Opening file failed!");
        let mut reader = csv::Reader::from_reader(file).has_headers(true);

        Box::new(reader.decode()
                       .map(|row| row.expect("Failed decoding row"))
                       .filter(|row: &DataRow| row.city == city))
    }

I am returning a trait object of type Box<Iterator<Item=DataRow> + 'a> so as not to expose the internal Filter type, and the lifetime 'a is introduced to avoid having to make a local clone of city. But this fails to compile because reader does not live long enough: it is allocated on the stack and is therefore dropped when the function returns, while the iterator produced by decode() still borrows it.
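To make the ownership point concrete, here is a minimal sketch that uses only the standard library instead of the csv crate. The function `search_lines` and its line-based matching are hypothetical stand-ins for `search_iter`. It works because `BufRead::lines` takes the reader by value, so the boxed iterator owns its source and nothing borrowed from the function's stack frame escapes. It is written for a recent toolchain (hence `dyn`) and makes no claim that `csv::Reader` offers a comparable by-value iterator.

```rust
use std::io::{BufRead, Cursor};

// Hypothetical stand-in for `search_iter`: it filters raw lines rather than
// decoded CSV rows. `lines()` consumes `reader`, so the returned iterator
// owns the reader instead of borrowing a local variable.
fn search_lines<'a, R>(reader: R, city: &'a str) -> Box<dyn Iterator<Item = String> + 'a>
where
    R: BufRead + 'a,
{
    Box::new(
        reader
            .lines()
            .map(|line| line.expect("Failed reading line"))
            // `move` copies the `&'a str` reference into the closure, so the
            // closure does not borrow the local binding `city`.
            .filter(move |line| line.split(',').any(|field| field == city)),
    )
}

fn main() {
    // In-memory data standing in for the CSV file.
    let data = "us,boston,Boston\nde,berlin,Berlin\n";
    for line in search_lines(Cursor::new(data), "boston") {
        println!("{}", line);
    }
}
```

The only borrow left in the return type is `city`, which the caller already owns for `'a`, so no clone of it is needed.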
I suppose this means that the reader has to be allocated on the heap (i.e. boxed) from the very beginning, or somehow moved off the stack before the function returns. If I were returning a closure, this is exactly the problem that making it a move closure would solve. But I do not know how to do something similar when I am not returning a function. I tried defining a custom iterator type containing the necessary data, but I could not get it to work, and it kept getting uglier and more contrived (do not make too much of this code, I only include it to show the general direction of my attempts):

    fn search_iter<'a,P>(data_path: P, city: &'a str)
        -> Box<Iterator<Item=DataRow> + 'a>
        where P: AsRef<Path>
    {
        struct ResultIter<'a> {
            reader: csv::Reader<File>,
            wrapped_iterator: Option<Box<Iterator<Item=DataRow> + 'a>>
        }

        impl<'a> Iterator for ResultIter<'a> {
            type Item = DataRow;

            fn next(&mut self) -> Option<DataRow> {
                self.wrapped_iterator.unwrap().next()
            }
        }

        let file = File::open(data_path).expect("Opening file failed!");

        // Incrementally initialise
        let mut result_iter = ResultIter {
            reader: csv::Reader::from_reader(file).has_headers(true),
            wrapped_iterator: None // Uninitialised
        };
        result_iter.wrapped_iterator =
            Some(Box::new(result_iter.reader
                                     .decode()
                                     .map(|row| row.expect("Failed decoding row"))
                                     .filter(|&row: &DataRow| row.city == city)));

        Box::new(result_iter)
    }

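For comparison, a custom iterator type is straightforward to write when the struct only owns its data and does not also hold an iterator borrowing from one of its own fields, which is what the attempt above tries to do. The following is a rough sketch under that assumption, again using only the standard library. `Row`, `CityRows`, and `search_rows` are hypothetical names, and the naive comma splitting stands in for real CSV decoding.

```rust
use std::io::{BufRead, Cursor, Lines};

// Hypothetical row type, much smaller than `DataRow`.
#[derive(Debug, PartialEq)]
struct Row {
    country: String,
    city: String,
}

// Owns both the line source and the search key, so it borrows nothing
// and can be returned from a function freely.
struct CityRows<R: BufRead> {
    lines: Lines<R>,
    city: String,
}

impl<R: BufRead> Iterator for CityRows<R> {
    type Item = Row;

    fn next(&mut self) -> Option<Row> {
        // Pull lines until one matches or the input is exhausted.
        for line in self.lines.by_ref() {
            let line = line.expect("Failed reading line");
            let mut fields = line.split(',');
            if let (Some(country), Some(city)) = (fields.next(), fields.next()) {
                if city == self.city {
                    return Some(Row {
                        country: country.to_string(),
                        city: city.to_string(),
                    });
                }
            }
        }
        None
    }
}

// `lines()` consumes the reader, so `CityRows` ends up owning it.
fn search_rows<R: BufRead>(reader: R, city: &str) -> CityRows<R> {
    CityRows { lines: reader.lines(), city: city.to_string() }
}

fn main() {
    // In-memory data standing in for the CSV file.
    let data = "us,boston\nde,berlin\ngb,boston\n";
    for row in search_rows(Cursor::new(data), "boston") {
        println!("{} {}", row.country, row.city);
    }
}
```

This version clones `city` into the struct for simplicity. Storing a `&'a str` and adding a lifetime parameter to `CityRows` would avoid the clone, at the cost of a more involved signature.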
This question appears to concern the same problem, but the author of the answer solves it by making the relevant data static, which I do not think is an option for this question.
I am using Rust 1.10.0, the current stable version from the Arch Linux rust package.