import 27 Oct 2006
The import method can be used to insert multiple sets of values at once. This is more efficient and much faster then creating and iterating over lots of ActiveRecord models and calling the save method. This is because the ActiveRecord::Base#save method sends a single SQL statement to the server. If you have thousands of records to create this will take a long long while because you will be executing a thousands and thousands of statements.

ARE also includes a built-in benchmarking library which outputs to CSV and to HTML.

import Usage

Syntax: Model.import( columns, values, options )

  • columns – an array of column identifiers, The identifiers can be Strings or Symbols. (Required)

  • values – an array of value sets. The order of the elements inside each value set must match up to the order of the column identifiers. (Required)

  • options – a Hash of options. (Optional)

    Basic Example

    class Person < ActiveRecord::Base ; end

    columns = [ :first_name, :last_name ]
    values = [ [ ‘Zach’, ‘Dennis’ ], [ ‘Mark’, ‘Van Holstyn’ ], [ ‘John’, ‘Doe’ ] ]
    Person.import column, values

    The Options Hash

The options Hash can take the following key/value pairs:

  • :validate => true|false – this tells import whether or not to use the validations that you defined on your models before importing the records. By default this is true

  • :on_duplicate_key_update => Array|Hash – this tells import to use the associated Array or Hash to update rows where a duplicate key already exists. This is specific MySQL.

  • :ignore => true|false – this tells import to issue a “INSERT IGNORE” when running each SQL statement. This is useful when you are importing mass amounts of records and you don’t care if MySQL finds duplicate entries.

    :on_duplicate_key Notes

    The :on_duplicate_key_update option will support an Array of column identifiers or it will support a Hash of column to column mappings. For example if you were inserting information for school enrollment and you had 6,000 students that were enrolled you would want to batch process them with the import method. But what if some of the students attended last semester? You don’t want to overwrite their information. Instead you may want to just update their address and contact information. That would look like:

    values = values_for_students_csv_file( ‘students.csv’ ) Students.import( [ :first_name, :last_name, :student_id, :address, :city, :state ], values, :on_duplicate_key_update => [ :address, :city, :state ] )

    Currently the Hash options works the same way except for you explicitly state { :address => :address, :city => :city, :state => :state } . Why is the Hash even there? It’s there because it is anticipated to allow for special different column mapping support and table/data merging support you will be able to move data from one table to the next, where column names may not be identical and you want to map columnA from 1 table to columnB on another.

    Where is this useful?

  • Users want to upload CSV or tab delimited files with lots of data

  • Log files that need to be batch processed

  • Anything that needs to be batched processed

Supported Database Adapters

  • MySQL

    Additional Comments

    I prefer this over maintaining alot of little ruby or bash shell scripts to create a file of SQL statements, and then to run those statements from mysql’s command line client in batch mode. I much prefer to keep the code intact with the rest of my application. It is easier to read, maintain and track.

    Benchmarks

Please see the benchmarks for MySQL.


blog comments powered by Disqus