Simple dataset import/export for Rails
I’ve been doing some Ruby on Rails development recently and one thing that’s bugged me a bit is how to deal with test data while you’re developing the application. Not data that is used just for testing, but data that lives in the development database that you use and fiddle around with while developing the site itself. I’ve found that while developing features of the site I need a fairly large and varied set of data to use, and this data takes a while to enter into the site each time I modify the database.
So instead of repeatedly entering this data I wrote some simple Rake tasks that can automate exporting and importing the data from the database. I can’t take all the credit as the idea was inspired/based on/stolen from the Advanced Rails Recipes book.
Exporting works by writing each table to its own CSV file in the db/dataset directory (or subdirectory) in the rails project. The first line of the file is the name of the columns in the database.
desc "Exports the database contents into CSV files."
task :export => :environment do
require "FasterCSV"
require "FileUtils"
path = File.join(DATASET_PATH, (ENV["DATASET"] || "."))
FileUtils.mkdir_p(path)
ActiveRecord::Base.establish_connection
database = ActiveRecord::Base.connection
(database.tables - SKIP_TABLES).each do |table_name|
FasterCSV.open(File.join(path, "#{table_name}.csv"), "w") do |csv|
# Write column names
csv << database.columns(table_name).map(&:name)
# Write rows
database.select_rows("SELECT * FROM %s" % table_name).each { |row| csv << row }
end
end
end
Importing works much the same way by reading all of the CSV files in the db/dataset directory (or subdirectory) and attempting to load the data into the database. However it does check that the table of that name exists first and skips it otherwise.
desc "Imports data from CSV files into the database."
task :import => :environment do
require "FasterCSV"
path = File.join(DATASET_PATH, (ENV["DATASET"] || "."))
ActiveRecord::Base.establish_connection
database = ActiveRecord::Base.connection
tables = (database.tables - SKIP_TABLES)
database.disable_referential_integrity do
Dir.chdir(path) do
Dir.glob("*.csv") do |file_name|
table_name = file_name[/(.+)\.csv/, 1]
next unless tables.include?(table_name)
rows = FasterCSV.read(file_name)
# Construct a list of column headings from the first row
heading_row = rows.shift.map { |heading| database.quote_column_name(heading) }.join(",")
rows.each do |row|
# Need to escape the data correctly so that inserting is successful
row_data = row.map { |item| database.quote(item) }.join(",")
database.execute("INSERT INTO %s(%s) VALUES (%s);" % [table_name, heading_row, row_data])
end
end
end
end
end
These are faily simple tasks and therefore I’m releasing the code to the public domain, grab the file and use it freely.
To make it work just add it to your lib/tasks directory in your Rails project. The tasks can then be invoked with:
rake dataset:export rake dataset:import
You can also specify a specific subdirectory by passing the DATASET option to rake, like:
rake dataset:export DATASET="dev"
Of course I know there are many things missing from this script, such as being able to specify the rails environment to use, but I’ve kept it simple so adding these features is left as an exercise to the reader.