Symfony: Doctrine data fixture: how to handle a large csv file?

I am trying to insert data from a "large" CSV file (3 MB / 37,000 rows / 7 columns) into a MySQL database using Doctrine data fixtures.

The process is very slow, and so far it has never completed (maybe I just need to wait longer).

I suspect that Doctrine fixtures are not designed to handle this much data. Should I instead import my CSV directly into the database?

Any idea on how to proceed?

Here is the code:

 <?php
 
 namespace FBN\GuideBundle\DataFixtures\ORM;
 
 use Doctrine\Common\DataFixtures\AbstractFixture;
 use Doctrine\Common\DataFixtures\OrderedFixtureInterface;
 use Doctrine\Common\Persistence\ObjectManager;
 use FBN\GuideBundle\Entity\CoordinatesFRCity as CoordFRCity;
 
 class CoordinatesFRCity extends AbstractFixture implements OrderedFixtureInterface
 {
     public function load(ObjectManager $manager)
     {
         $csv = fopen(dirname(__FILE__).'/Resources/Coordinates/CoordinatesFRCity.csv', 'r');
 
         $i = 0;
         while (!feof($csv)) {
             $line = fgetcsv($csv);
 
             $coordinatesfrcity[$i] = new CoordFRCity();
             $coordinatesfrcity[$i]->setAreaPre2016($line[0]);
             $coordinatesfrcity[$i]->setAreaPost2016($line[1]);
             $coordinatesfrcity[$i]->setDeptNum($line[2]);
             $coordinatesfrcity[$i]->setDeptName($line[3]);
             $coordinatesfrcity[$i]->setdistrict($line[4]);
             $coordinatesfrcity[$i]->setpostCode($line[5]);
             $coordinatesfrcity[$i]->setCity($line[6]);
 
             $manager->persist($coordinatesfrcity[$i]);
             $this->addReference('coordinatesfrcity-'.$i, $coordinatesfrcity[$i]);
 
             $i = $i + 1;
         }
 
         fclose($csv);
 
         $manager->flush();
     }
 
     public function getOrder()
     {
         return 1;
     }
 }
3 answers

Two rules to follow when building large batch imports are:

  • Disable SQL logging ( $manager->getConnection()->getConfiguration()->setSQLLogger(null); ) to avoid the huge memory overhead of logging every query.

  • Flush and clear frequently, not just once at the end. I suggest you add if ($i % 25 == 0) { $manager->flush(); $manager->clear(); } inside your loop, to flush and clear every 25 INSERTs.

EDIT: One last thing I forgot: do not keep references to your entities once you no longer need them. Here, in your loop, you only need the current entity being processed, so do not store the previous entities in the $coordinatesfrcity array. Accumulating them all keeps them alive in memory and can eventually lead to memory exhaustion.
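A minimal sketch of the asker's loop with these suggestions applied (SQL logger disabled, periodic flush/clear, a plain local variable instead of an accumulating array). This is an illustration, not a drop-in replacement: $manager->clear() detaches every entity created so far, so the addReference() calls from the original fixture would hand out detached objects and have been dropped here.

```php
public function load(ObjectManager $manager)
{
    // Rule 1: disable SQL logging so the query log does not grow unbounded.
    $manager->getConnection()->getConfiguration()->setSQLLogger(null);

    $csv = fopen(dirname(__FILE__).'/Resources/Coordinates/CoordinatesFRCity.csv', 'r');
    $i = 0;

    // fgetcsv() returns false at EOF, which also avoids processing a spurious
    // empty last line that the feof() pattern can produce.
    while (($line = fgetcsv($csv)) !== false) {
        // Rule 3: one local variable, overwritten on each iteration.
        $city = new CoordFRCity();
        $city->setAreaPre2016($line[0]);
        $city->setAreaPost2016($line[1]);
        $city->setDeptNum($line[2]);
        $city->setDeptName($line[3]);
        $city->setdistrict($line[4]);
        $city->setpostCode($line[5]);
        $city->setCity($line[6]);

        $manager->persist($city);
        $i++;

        // Rule 2: flush and clear every 25 inserts.
        if ($i % 25 === 0) {
            $manager->flush();
            $manager->clear();
        }
    }

    fclose($csv);

    $manager->flush(); // persist the final partial batch
    $manager->clear();
}
```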


There is a great example in the docs: http://doctrine-orm.readthedocs.org/projects/doctrine-orm/en/latest/reference/batch-processing.html

Use the modulo operator ($i % $batchSize) to implement batch processing; this example inserts 20 rows at a time. You can tune the batch size for your server.

 $batchSize = 20;
 for ($i = 1; $i <= 10000; ++$i) {
     $user = new CmsUser;
     $user->setStatus('user');
     $user->setUsername('user' . $i);
     $user->setName('Mr.Smith-' . $i);
     $em->persist($user);
     if (($i % $batchSize) === 0) {
         $em->flush();
         $em->clear(); // Detaches all objects from Doctrine!
     }
 }
 $em->flush(); // Persist objects that did not make up an entire batch
 $em->clear();

For fixtures that require a lot of memory but do not depend on each other, I get around this problem by using the --append flag to insert one object (or a smaller group of objects) at a time:

 bin/console doctrine:fixtures:load --fixtures="memory_hungry_fixture.file" --append 

Then I write a Bash script that runs this command as many times as I need.
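A sketch of such a wrapper script, assuming the large fixture has been split into chunk files; the Part*Fixture.php names are hypothetical:

```shell
#!/bin/sh
# Sketch only: runs the append-mode fixture load once per chunk file.
# --append prevents each run from purging the rows inserted by the
# previous one.
for part in 1 2 3 4; do
  bin/console doctrine:fixtures:load \
    --fixtures="src/FBN/GuideBundle/DataFixtures/ORM/Part${part}Fixture.php" \
    --append
done
```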

In your case, you could extend the fixtures command with a flag that loads one slice of the file per run: the first 1,000 lines, then the next 1,000, and so on.

