Using Faker to Populate a Database: With an ORM, It's a Piece of Cake

Generating test data for your database requires imagination. Too much imagination, in fact, if you want to populate tables with thousands of reasonably realistic records.

Fortunately, Faker can do the job for you. For instance, to populate an author table with 100 test authors, here is how to make Faker and PDO work together:
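A minimal sketch of such a loop, under a few assumptions: Faker is installed through Composer, and an in-memory SQLite database with a hypothetical author table (first_name, last_name, date_of_birth, email) stands in for the real database so the example runs without a server. Swap the DSN and schema for your own.

```php
<?php
require_once 'vendor/autoload.php'; // assumption: Faker installed via Composer

// Assumption: in-memory SQLite replaces the real database for this sketch.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE author (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    first_name TEXT,
    last_name TEXT,
    date_of_birth TEXT,
    email TEXT
)');

$faker = \Faker\Factory::create();
$stmt = $pdo->prepare(
    'INSERT INTO author (first_name, last_name, date_of_birth, email) VALUES (?, ?, ?, ?)'
);

// Insert 100 fake authors and remember their primary keys.
$insertedPKs = array();
for ($i = 0; $i < 100; $i++) {
    $stmt->execute(array(
        $faker->firstName(),
        $faker->lastName(),
        $faker->date(),   // formatted as Y-m-d by default
        $faker->email(),
    ));
    $insertedPKs[] = $pdo->lastInsertId();
}
```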

If you use an ORM like Propel, you can drop the PDO machinery and work with entity objects instead:
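As a sketch, assuming a Propel project that is already initialized and a generated Author class whose setters match the columns above (setFirstName(), setLastName(), setDateOfBirth(), setEmail() and getId() are assumed names), the loop could look like this:

```php
<?php
require_once 'vendor/autoload.php'; // assumption: Faker installed via Composer
// Assumption: Propel is configured and the generated Author class is autoloadable.

$faker = \Faker\Factory::create();

$insertedPKs = array();
for ($i = 0; $i < 100; $i++) {
    $author = new Author();
    $author->setFirstName($faker->firstName());
    $author->setLastName($faker->lastName());
    $author->setDateOfBirth($faker->date());
    $author->setEmail($faker->email());
    $author->save();                   // the ORM builds and runs the INSERT
    $insertedPKs[] = $author->getId(); // assumed primary key accessor
}
```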

You may notice that the names of the Faker formatters are very similar to those of the author properties. In fact, it's quite easy to choose a Faker formatter based on a column name, or on its type. That's exactly what Faker\ORM\Propel\Populator does: based on the metadata gathered from model introspection, it infers the best formatter to use for each column. The code to populate 100 authors is then much shorter:
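A short sketch of the Populator in use, again assuming Composer autoloading and a configured Propel project with a generated Author class:

```php
<?php
require_once 'vendor/autoload.php'; // assumption: Faker installed via Composer
// Assumption: Propel is configured and the generated Author class is autoloadable.

$generator = \Faker\Factory::create();
$populator = new \Faker\ORM\Propel\Populator($generator);

// Ask for 100 Author rows; formatters are guessed from the model metadata.
$populator->addEntity('Author', 100);

// Runs the inserts and returns the inserted primary keys, grouped by entity class.
$insertedPKs = $populator->execute();
```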

This is so easy that you can populate an entire database in no time. For instance, if an Author has many Books, and a Book has many Reviews, here is how to bootstrap an application for a stress test:
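A sketch under the same assumptions (Composer autoloading, a configured Propel project with generated Author, Book and Review classes). The row counts are arbitrary placeholders chosen for illustration:

```php
<?php
require_once 'vendor/autoload.php'; // assumption: Faker installed via Composer
// Assumption: Propel is configured; Author, Book and Review are generated classes.

$generator = \Faker\Factory::create();
$populator = new \Faker\ORM\Propel\Populator($generator);

// Parents first, so that their primary keys exist when the children are built.
$populator->addEntity('Author', 100);   // placeholder volume
$populator->addEntity('Book', 1000);    // placeholder volume
$populator->addEntity('Review', 10000); // placeholder volume

$insertedPKs = $populator->execute();
```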

This snippet adds a large volume of data to the three tables. Because the Populator collects the inserted primary keys along the way, it can reuse them as random foreign keys on related objects. In this example, each populated Book is related to one of the populated Authors, and each populated Review to one of the populated Books. That is why the order of the addEntity() calls matters: the entities generated first become available as relations for the entities generated next. The result is generated data that looks like real data, including the relationships between entities.

The Faker Populator lets the developer override the formatter used for a particular property. For instance, the Book entity has an ISBN property declared as a string of 13 characters. Faker's guesser would generate a random string like 'gudgtsncrdkfu' for this property, which clearly doesn't look like a real ISBN. It's very easy to pass a closure, keyed by property name, in the third argument of the addEntity() call, so that the populator uses a custom formatter for this property:
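A sketch of such an override, assuming the same Propel setup, a Book model whose property is named ISBN, and a Faker version that ships the ean13() barcode formatter (used here as a stand-in that yields a 13-digit string):

```php
<?php
require_once 'vendor/autoload.php'; // assumption: Faker installed via Composer
// Assumption: Propel is configured; Author and Book are generated classes.

$generator = \Faker\Factory::create();
$populator = new \Faker\ORM\Propel\Populator($generator);

$populator->addEntity('Author', 100);
$populator->addEntity('Book', 1000, array(
    // Custom formatter for the ISBN property: a 13-digit barcode string.
    // The populator calls it with arguments, which this closure ignores.
    'ISBN' => function () use ($generator) {
        return $generator->ean13();
    },
));

$insertedPKs = $populator->execute();
```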

When populating the ISBN property of each new Book entity, the populator now calls the anonymous function instead of the default text formatter. It passes the already inserted PKs and the current entity as arguments (even if the anonymous function shown above ignores them) to ease the crafting of fake data related to other entities or to the current one.

The Propel Populator for Faker is a simple and powerful tool that lets you work in development with close-to-real data volumes. When the time comes to put an application into production, there should be no nasty surprises from slow queries or poor indexes, since these can now be detected earlier, in development.

Published on 24 Oct 2011 with tags development php
