I am trying to use rmongodb to extract information from a MongoDB database for further processing in R. However, I am having trouble getting started. This works:
    cursor <- mongo.find(mongo, "people",
                         query=list(last.name="Smith", first.name="John"),
                         fields=list(address=1L, age=1L))
    while (mongo.cursor.next(cursor)) {
      print(mongo.cursor.value(cursor))
    }
Now, what if I want to find people whose first name is "John", "Bob", or "Catherine"? I tried query=list(last.name="Smith", first.name=c(John, Bob, Catherine)), but that didn't work. Replacing = with %in% did not work either.
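For what it's worth, MongoDB expresses "one of several values" with the `$in` operator, and in rmongodb a query is a nested R list. Below is a minimal sketch of how such a query list could be built, assuming that a character vector of length greater than one is serialized as a BSON array. The `mongo.find` call is commented out because it needs a live connection.

```r
# Names to match; quoted, because bare John/Bob/Catherine would be
# looked up as R variables.
first_names <- c("John", "Bob", "Catherine")

# "$in" is not a syntactic R name, so it has to be quoted as a list name.
query <- list(
  last.name  = "Smith",
  first.name = list("$in" = first_names)
)

# Inspect the nested structure that would be sent as the query document.
str(query)

# Assumed usage against a live connection (not run here):
# cursor <- mongo.find(mongo, "people", query = query,
#                      fields = list(address = 1L, age = 1L))
```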
Another problem is that the database contents are nested, which means that I have subtrees, sub-subtrees, etc. For example, the first.name="John", last.name="Smith" record may have subtrees such as address, age, and occupation, and occupation may in turn have categories as subtrees (for example, the years 2005 to 2012, with an entry such as "unemployed", "clerk", or "entrepreneur" for each year). So, what if I want to find all people with the first name "John" who are 40 years old and were unemployed in 2010? What would the query look like?
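MongoDB reaches into nested documents with dot notation, so a condition on a subtree can be written as a dotted field name. The sketch below assumes a hypothetical layout in which each person has an `occupation` subdocument keyed by year; the field names are illustrative and would need to match the real schema.

```r
# Hypothetical layout: each person document has an `occupation`
# subdocument keyed by year, e.g. occupation = list("2010" = "unemployed").
query <- list(
  first.name        = "John",
  age               = 40L,           # integer, to match an integer field
  "occupation.2010" = "unemployed"   # dot notation: field 2010 inside occupation
)

# All conditions in one query document are combined with a logical AND.
print(names(query))
```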
EDIT in reply to Stennie's answer: Here is an example of my database structure and the query I'm trying to make. Imagine that university graduates are divided into groups (for example, "very good students", "good students", etc.). Each group then contains a list of the people assigned to it, along with their details.
    (0){..}
        _id : (Object ID) class id
        groupname: (string) unique name for this group (eg "beststudents")
        members[11]
            (0){..}
                persid : (integer) 1
                firstname: (string)
                surname: (string)
                age: (integer)
                occupation: (string)
            (1){..}
                persid : (integer) 2
                firstname: (string)
                surname: (string)
                age: (integer)
                occupation: (string)
Now suppose I am interested in the groups named "beststudents" and "goodstudents" and would like to get the "surname" and "occupation" of each member of these groups as an R object, to do some plots, statistics, or whatever. Maybe I also want to refine this query to get only the members younger than 40. After reading Stennie's answer, I tried the following:
    cursor <- mongo.find(mongo, "test.people",
      list(groupname=list('$in'=c("beststudents", "goodstudents")),
           members.age=list('$lt'=40) # I haven't tried this with my DB, so I hope this line is right
      ),
      fields=list(members.surname=1L, members.occupation=1L)
    )
    count <- mongo.count(mongo, "test.people",
      list(groupname=list('$in'=c("beststudents", "goodstudents")),
           members.age=list('$lt'=40)
      )
    )
    surnames <- vector("character", count)
    occupations <- vector("character", count)
    i <- 1
    while (mongo.cursor.next(cursor)) {
      b <- mongo.cursor.value(cursor)
      surnames[i] <- mongo.bson.value(b, "members.surname")
      occupations[i] <- mongo.bson.value(b, "members.occupation")
      i <- i + 1
    }
    df <- as.data.frame(list(surnames=surnames, occupations=occupations))
After running this code, there is no error message, but I get an empty data frame. What is wrong with this code?
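One thing that may matter here: `members` is an array of subdocuments, so each matching group document holds many surnames rather than one, and a condition on `members.age` selects whole group documents rather than individual members. Below is a sketch of flattening one group into a data frame in base R, assuming the document has already been converted to a nested list (for example with `mongo.bson.to.list`). The sample document and the helper `members_to_df` are hypothetical.

```r
# Hypothetical group document, shaped like the structure shown above,
# as a nested R list (e.g. what mongo.bson.to.list(b) might return).
doc <- list(
  groupname = "beststudents",
  members = list(
    list(persid = 1L, firstname = "Ann", surname = "Lee",
         age = 35L, occupation = "clerk"),
    list(persid = 2L, firstname = "Tom", surname = "Ray",
         age = 52L, occupation = "entrepreneur")
  )
)

# Hypothetical helper: one data frame row per member of a group.
members_to_df <- function(doc) {
  data.frame(
    groupname  = doc$groupname,
    surname    = vapply(doc$members, function(m) m$surname, character(1)),
    occupation = vapply(doc$members, function(m) m$occupation, character(1)),
    age        = vapply(doc$members, function(m) m$age, integer(1)),
    stringsAsFactors = FALSE
  )
}

df <- members_to_df(doc)

# The per-person age filter is applied here in R, because the query
# itself only narrows down which group documents are returned.
young <- df[df$age < 40, ]
print(young)
```

With several groups, the per-group data frames could be collected in a list inside the cursor loop and combined at the end with `do.call(rbind, ...)`.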