I am trying to learn more about the FSharp.Data project by using it to read a CSV file. The file is a simplified version of the data from the Kaggle digit recognition contest.
When I read a CSV file containing 785 columns and 113 lines (including the header line), the following two lines of code run very slowly:
type trainingSet = CsvProvider<"Data/trainSmall.csv", ",", CacheRows=false>
let data = trainingSet.Load("Data/trainSmall.csv")
When I send the first line to F# Interactive, it returns in about 10 seconds, but when I send the second line, it takes more than 5 minutes before the interactive prompt responds.
I am running the code on a 2013 MacBook Pro with a 2.6 GHz i5 processor and 16 GB of RAM, using F# 3.0 and Xamarin Studio. I tried the same experiment with Windows 7 / VS2013 running in a VM on the same hardware, and the results are comparable. When I read the same file with R on the same machine, it is so fast that I can't time it with an ordinary watch.
Please advise me on the proper use of the CSV type provider from FSharp.Data!
carstenj