Strange behavior with MongoDB cursor (DBCursor) in Java: iteration continues although the last element has already been reached

I have a collection called MyCollection that contains 200 items in the database MyDB in MongoDB:

    > use MyDB
    switched to db MyDB
    > db.MyCollection.count()
    200

I get very strange behavior, regardless of which of the different ways I use to load the cursor and iterate over it. This is my code:

    DBCollection collection = getCollection("MyBD", "MyCollection");
    DBCursor cursor = collection.find();
    //DBCursor cursor = collection.find().limit(200);
    System.out.println("Cursor length: "+cursor.length());
    Iterator<DBObject> itrc = cursor.iterator();
    //while(cursor.hasNext()){
    while (itrc.hasNext()) {
        //DBObject obj = (DBObject)cursor.next();
        DBObject obj = (DBObject)itrc.next();
        //BSONObject obj2 = (BSONObject)obj.get("scores");
        Integer intg = (Integer) obj.get("_id");
        System.out.println("_id:"+intg.toString());

        // operations remove and insert on the collection
        // that affect the cursor behavior
        BasicDBList bl = (BasicDBList) obj.get("fieldArray");
        BasicDBObject bdo = new BasicDBObject();
        bdo.put("fieldArray", bl);
        Integer intid = (Integer) obj.get("_id");
        bdo.put("_id", intid);
        String str = (String) obj.get("fieldString");
        bdo.put("fieldString", str);
        collection.remove(obj);
        obj=null;
        collection.insert(bdo);
        if(intg.intValue()==199){
            System.out.println("Reached: "+intg.intValue());
        }
    }

This is the output:

    Cursor length: 200
    _id:0 _id:1 _id:2 _id:3 _id:4 _id:5 _id:6 _id:7 _id:8 _id:9
    _id:10 _id:11 _id:12 _id:13 _id:14 _id:15 _id:16 _id:17 _id:18 _id:19
    _id:20 _id:21 _id:22 _id:23 _id:24 _id:25 _id:26 _id:27 _id:28 _id:29
    _id:30 _id:31 _id:32 _id:33 _id:34 _id:35 _id:36 _id:37 _id:38 _id:39
    _id:40 _id:41 _id:42 _id:43 _id:44 _id:45 _id:46 _id:47 _id:48 _id:49
    _id:50 _id:51 _id:52 _id:53 _id:54 _id:55 _id:56 _id:57 _id:58 _id:59
    _id:60 _id:61 _id:62 _id:63 _id:64 _id:65 _id:66 _id:67 _id:68 _id:69
    _id:113
    _id:101 _id:102 _id:103 _id:104 _id:105 _id:106 _id:107 _id:108 _id:109
    _id:110 _id:111 _id:112 _id:114 _id:115 _id:116 _id:117 _id:118 _id:119
    _id:120 _id:121 _id:122 _id:123 _id:124 _id:125 _id:126 _id:127 _id:128 _id:129
    _id:130 _id:131 _id:132 _id:133 _id:134 _id:135 _id:136 _id:137 _id:138 _id:139
    _id:140 _id:141 _id:142 _id:143 _id:144 _id:145 _id:146 _id:147 _id:148 _id:149
    _id:150 _id:151 _id:152 _id:153 _id:154 _id:155 _id:156 _id:157 _id:158 _id:159
    _id:160 _id:161 _id:162 _id:163 _id:164 _id:165 _id:166 _id:167 _id:168 _id:169
    _id:170 _id:171 _id:172 _id:173 _id:174 _id:175 _id:176 _id:177 _id:178 _id:179
    _id:180 _id:181 _id:182 _id:183 _id:184 _id:185 _id:186 _id:187 _id:188 _id:189
    _id:190 _id:191 _id:192 _id:193 _id:194 _id:195 _id:196 _id:197 _id:198 _id:199
    ***************************
    Reached: 199
    ***************************
    _id:70 _id:71 _id:72 _id:73 _id:74 _id:75 _id:76 _id:77 _id:78 _id:79
    _id:80 _id:81 _id:82 _id:83 _id:84 _id:85 _id:86 _id:87 _id:88 _id:89
    _id:90 _id:91 _id:92 _id:93 _id:94 _id:95 _id:96 _id:97 _id:98 _id:99
    _id:100

As you can see, once the 200th element has been reached (the element with _id: 199), the cursor jumps to the element with _id: 70 and runs 31 more iterations until it reaches the element with _id: 100, instead of finishing at iteration 200 as it should.

I tried two alternatives: the one commented out in the code (using the cursor's own hasNext() method) and the active one (using an Iterator). Both produce the same output.

If I take out the operations on the collection (remove / insert in my case), the strange behavior does not occur.

This is the expected result:

    Cursor length: 200
    _id:0 _id:1 _id:2 _id:3 _id:4 _id:5 _id:6 _id:7 _id:8 _id:9
    _id:10 _id:11 _id:12 _id:13 _id:14 _id:15 _id:16 _id:17 _id:18 _id:19
    _id:20 _id:21 _id:22 _id:23 _id:24 _id:25 _id:26 _id:27 _id:28 _id:29
    _id:30 _id:31 _id:32 _id:33 _id:34 _id:35 _id:36 _id:37 _id:38 _id:39
    _id:40 _id:41 _id:42 _id:43 _id:44 _id:45 _id:46 _id:47 _id:48 _id:49
    _id:50 _id:51 _id:52 _id:53 _id:54 _id:55 _id:56 _id:57 _id:58 _id:59
    _id:60 _id:61 _id:62 _id:63 _id:64 _id:65 _id:66 _id:67 _id:68 _id:69
    _id:113
    _id:101 _id:102 _id:103 _id:104 _id:105 _id:106 _id:107 _id:108 _id:109
    _id:110 _id:111 _id:112 _id:114 _id:115 _id:116 _id:117 _id:118 _id:119
    _id:120 _id:121 _id:122 _id:123 _id:124 _id:125 _id:126 _id:127 _id:128 _id:129
    _id:130 _id:131 _id:132 _id:133 _id:134 _id:135 _id:136 _id:137 _id:138 _id:139
    _id:140 _id:141 _id:142 _id:143 _id:144 _id:145 _id:146 _id:147 _id:148 _id:149
    _id:150 _id:151 _id:152 _id:153 _id:154 _id:155 _id:156 _id:157 _id:158 _id:159
    _id:160 _id:161 _id:162 _id:163 _id:164 _id:165 _id:166 _id:167 _id:168 _id:169
    _id:170 _id:171 _id:172 _id:173 _id:174 _id:175 _id:176 _id:177 _id:178 _id:179
    _id:180 _id:181 _id:182 _id:183 _id:184 _id:185 _id:186 _id:187 _id:188 _id:189
    _id:190 _id:191 _id:192 _id:193 _id:194 _id:195 _id:196 _id:197 _id:198 _id:199
    ***************************
    Reached: 199
    ***************************

I found a similar SO question, but the following is still not clear to me:

  • How do the remove / insert operations affect the cursor's behavior in the way I observed?
  • How do I use the snapshot option?
  • Thinking ahead, what if I need to work with a sorted collection?

By the way, if I use the cursor directly, without an iterator, like this:

    while(cursor.hasNext()){
        DBObject obj = (DBObject)cursor.next();

  • Why do I need to remove the following line?

    System.out.println("Cursor length: "+cursor.length());

in order to avoid the following exception:

    Exception in thread "main" java.lang.IllegalArgumentException: can't switch cursor access methods
        at com.mongodb.DBCursor._checkType(DBCursor.java:412)
        at com.mongodb.DBCursor.hasNext(DBCursor.java:483)
        at tasks.UpdateRemoveHW.main(Test.java:56)
2 answers

Regarding the IllegalArgumentException: calling DBCursor.length() converts the cursor into an array, so calling hasNext() or next() on it afterwards is illegal. If you want to use hasNext() or next(), remove the length() call before iterating.


I'm not sure what is causing the strange behavior in the first part of your question. However, it is generally unsafe to modify any data structure while iterating over it, except through the Iterator.remove() method.
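To illustrate the general point with plain Java collections (not with MongoDB itself), here is a minimal sketch. The class and method names are made up for this example. It shows a modification during a for-each loop being rejected, removal through the iterator working, and a third option that is probably closest to the question's use case: iterating over a copy taken up front while removing and re-inserting in the original.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class; an in-memory analogy, not MongoDB driver code.
final class IterationSafetyDemo {

    // Unsafe: modifies the list directly while a for-each loop iterates it.
    // Returns true if the list's fail-fast iterator detected the change.
    static boolean removeInsideForEach(List<Integer> items) {
        try {
            for (Integer item : items) {
                if (item % 2 == 0) {
                    items.remove(item); // remove(Object), bypasses the iterator
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Safe: removes even numbers through the iterator itself.
    static void removeViaIterator(List<Integer> items) {
        Iterator<Integer> it = items.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
    }

    // Safe: iterates over a copy taken up front, then removes and
    // re-inserts each element in the original. Returns the visit count.
    static int removeAndReinsertViaSnapshot(List<Integer> items) {
        int visits = 0;
        for (Integer item : new ArrayList<>(items)) {
            items.remove(item); // remove(Object)
            items.add(item);    // re-insert at the end
            visits++;
        }
        return visits;
    }

    public static void main(String[] args) {
        List<Integer> a = new ArrayList<>(Arrays.asList(0, 1, 2, 3, 4));
        System.out.println("exception detected: " + removeInsideForEach(a));

        List<Integer> b = new ArrayList<>(Arrays.asList(0, 1, 2, 3, 4));
        removeViaIterator(b);
        System.out.println("after iterator removal: " + b);

        List<Integer> c = new ArrayList<>(Arrays.asList(0, 1, 2, 3, 4));
        System.out.println("visits over snapshot: " + removeAndReinsertViaSnapshot(c));
    }
}
```

The snapshot variant visits each element exactly once because the loop never looks at the list being modified. Whether an equivalent approach fits a MongoDB collection depends on how many documents have to be held in memory.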


The last Warning at the top of the API Documentation for DBCursor indirectly answers the last part of your question:

Warning: Calling toArray or length on a DBCursor will irrevocably turn it into an array. This means that, if the cursor was iterating over ten million results (which it was lazily fetching from the database), suddenly there will be a ten-million element array in memory. Before converting to an array, make sure that there are a reasonable number of results using skip() and limit().

If you read the source code for DBCursor (line 483), where it throws an IllegalArgumentException, you can see that any call to DBCursor.length() turns the cursor into an array, after which all calls to DBCursor.next() or DBCursor.hasNext() become illegal.
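The mechanism can be pictured with a small stand-alone model. This is a simplified sketch written for this answer, not the driver's actual source: the class name ModeLockedCursor and its internals are assumptions, and only the exception message is taken from the stack trace in the question. The first access method used locks the cursor into that mode, and any later call from the other family is rejected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical, simplified model of a cursor that locks into one access mode.
final class ModeLockedCursor<T> {
    private enum Mode { UNSET, ITERATOR, ARRAY }

    private final Iterator<T> source;
    private final List<T> buffered = new ArrayList<>();
    private Mode mode = Mode.UNSET;

    ModeLockedCursor(Iterable<T> data) {
        this.source = data.iterator();
    }

    // The first caller decides the mode; a different mode later is an error.
    private void checkMode(Mode requested) {
        if (mode == Mode.UNSET) {
            mode = requested;
            return;
        }
        if (mode != requested) {
            throw new IllegalArgumentException("can't switch cursor access methods");
        }
    }

    boolean hasNext() {
        checkMode(Mode.ITERATOR);
        return source.hasNext();
    }

    T next() {
        checkMode(Mode.ITERATOR);
        return source.next();
    }

    // Drains every remaining element into memory, then reports the size.
    int length() {
        checkMode(Mode.ARRAY);
        while (source.hasNext()) {
            buffered.add(source.next());
        }
        return buffered.size();
    }

    public static void main(String[] args) {
        ModeLockedCursor<Integer> cursor = new ModeLockedCursor<>(Arrays.asList(1, 2, 3));
        System.out.println("length: " + cursor.length());
        try {
            cursor.hasNext(); // wrong family after length()
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this model, calling length() first and hasNext() second fails in the same way as in the question, while a cursor that is only ever iterated works normally.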

I think this behavior definitely violates the Principle of Least Surprise. Arrays still have Iterators, so it would be better if the internal data structure were hidden and the Iterator methods continued to work. In addition, a call to DBCursor.length() should not need to retrieve the records from the database. I think it should behave similarly to DBCursor.count(), but somehow take limit() and skip() into account, and then cache the result.
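The arithmetic behind that suggestion is small. As a rough sketch of what I have in mind (a hypothetical helper, not something the driver provides under this name), the length could be derived from a count plus the skip and limit values without fetching any documents. The sketch assumes that a limit of 0 means "no limit" and that all three values are non-negative.

```java
// Hypothetical helper illustrating the suggestion above; not driver code.
final class CursorLengthSketch {

    /**
     * @param totalMatches number of documents matching the query (as a count would report)
     * @param skip         number of documents skipped from the start
     * @param limit        maximum number of documents returned; 0 means no limit (assumption)
     * @return how many documents the cursor would actually yield
     */
    static long effectiveLength(long totalMatches, long skip, long limit) {
        if (totalMatches < 0 || skip < 0 || limit < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        long afterSkip = Math.max(0L, totalMatches - skip);
        return limit == 0 ? afterSkip : Math.min(afterSkip, limit);
    }

    public static void main(String[] args) {
        System.out.println(effectiveLength(200, 0, 0));     // 200
        System.out.println(effectiveLength(200, 50, 100));  // 100
        System.out.println(effectiveLength(200, 150, 100)); // 50
        System.out.println(effectiveLength(200, 250, 10));  // 0
    }
}
```

With the 200-item collection from the question and no skip or limit, this yields 200, the same number that cursor.length() printed, but without turning the cursor into an array.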
