paperclip with more than 32.000 attachments

those of you who run a rails app with paperclip on a standard linux server and have ever tried to save more than 32.000 attachments may know the problem.

saving new files is simply refused with a meaningless error message.

the problem is as easy to explain as it is to solve.
but sadly, by the time you run into it, it's usually already too late: you have already saved 32.000 attachments.

but to start from the beginning: the linux filesystem ext3 can normally only handle 32.000 subfolders in a folder.
since paperclip creates one subfolder for every attachment, you're headed for disaster rather quickly.
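
if you want to know how close you already are, a quick check could look like the sketch below. the helper name subfolder_count and the constant are my own, not part of paperclip, and the limit value is the commonly cited ext3 figure, so treat it as an assumption for your own system.

```ruby
# commonly cited ext3 limit for subfolders in one folder (assumption, check your filesystem)
EXT3_SUBDIR_LIMIT = 31_998

# hypothetical helper: counts the direct subfolders of a directory
def subfolder_count(dir)
  Dir.children(dir).count { |name| File.directory?(File.join(dir, name)) }
end

# example usage (the path is an assumption):
#   used = subfolder_count("public/system/photos")
#   puts "#{used} of #{EXT3_SUBDIR_LIMIT} subfolders used"
```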

luckily the paperclip makers in their infinite wisdom have already built in a solution for this problem. which works great…if you happen to know about it…
you only have to adjust the path and the url accordingly:

has_attached_file :photo, :styles => {:original => "640x480>"},
  :path => ":rails_root/public/system/:attachment/:id_partition/:style/:filename",
  :url => "/system/:attachment/:id_partition/:style/:filename"

as you can see the parameter :id_partition changes the saving scheme.
this means that paperclip now will convert all ids to 9 digits and split them into subfolders every 3 digits.
so a folder structure like this will emerge:

000/
      001/
      002/
      003/
            001/
            002/
      004/
001/

this way you could save up to a billion files. should be enough for now…
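
to make the scheme concrete, here is a minimal sketch of the padding and splitting described above. it is my own illustration of the idea, not paperclip's actual source, and it assumes integer ids of at most 9 digits.

```ruby
# sketch of id partitioning: pad the id to 9 digits, then split into groups of 3
def id_partition(id)
  ("%09d" % id).scan(/\d{3}/).join("/")
end

puts id_partition(1)        # 000/000/001
puts id_partition(1234)     # 000/001/234
puts id_partition(3001002)  # 003/001/002
```

since every level holds at most 1.000 subfolders, no single folder gets anywhere near the ext3 limit.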

but what about the files we already saved the old way?
for that purpose i wrote a little rake task:

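task "repair" do
  require "config/environment"

  base = "#{RAILS_ROOT}/public/system/photos"
  Dir.foreach(base) do |entry|
    # only handle folders named with a 4 to 9 digit id; this skips ".", ".."
    # and the 3 digit partition folders created below
    if entry =~ /\A\d{4,9}\z/
      # pad the id to 9 digits and split it into 3 groups, e.g. "1234" -> "000/001/234"
      partitioned = ("%09d" % entry.to_i).scan(/\d{3}/).join("/")
      puts "#{entry} -> #{partitioned}"
      `mkdir -p #{base}/#{partitioned}`
      `mv #{base}/#{entry}/* #{base}/#{partitioned}/`
    end
  end
end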

please be advised that this is a rather hacky solution: in this example i only convert photos whose ids have four or more digits, so folders with shorter ids are skipped.

so if you have time and know a nice regex to solve this in a more general way, please leave a comment.
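
one possible way to handle ids of any length (up to 9 digits) is to move everything into a separate target folder, so that new folders like "000" are never mistaken for old ids. the sketch below is only an illustration: the helper names partitioned_name and migrate_attachments are hypothetical, and it assumes the target folder is empty and different from the source.

```ruby
require "fileutils"

# hypothetical helper: "1234" -> "000/001/234", nil for non-numeric names
def partitioned_name(entry)
  return nil unless entry =~ /\A\d{1,9}\z/
  entry.rjust(9, "0").scan(/\d{3}/).join("/")
end

# moves every numeric subfolder of source to its partitioned place under target.
# assumes target is a different, initially empty folder.
def migrate_attachments(source, target)
  Dir.children(source).each do |entry|
    old_path = File.join(source, entry)
    partition = partitioned_name(entry)
    next if partition.nil? || !File.directory?(old_path)

    destination = File.join(target, partition)
    FileUtils.mkdir_p(File.dirname(destination)) # create e.g. target/000/001
    FileUtils.mv(old_path, destination)          # old folder becomes target/000/001/234
  end
end
```

once everything is moved you can swap the target folder in for the old one. try it on a copy of your data first.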

enjoy.

4 responses to “paperclip with more than 32.000 attachments”

  1. jofr

    I think the method is named “ID Partitioning”. 37Signals used ID partitioning already in 2006. Attachment_fu has a partition option as well.

  2. Tom Dunning

    here we go:

    source_path = "#{RAILS_ROOT}/private_downloads/resources/files"

    Dir.foreach(source_path) do |file|
      next if file == '.' || file == '..'
      # pad the file to 9 digits, then split the file into 3 parts
      a, b, c = file.rjust(9, '0').scan(/\d{3}/)
      puts "moving: #{file} => #{a}/#{b}/#{c}"
      target_path = "#{RAILS_ROOT}/private_downloads/resources/files_p/#{a}/#{b}/#{c}"
      `mkdir -p #{target_path}` # create the space in files_p (partitioned version)
      `mv #{source_path}/#{file}/* #{target_path}/`
    end

    `rm -rf #{RAILS_ROOT}/private_downloads/resources/files`
    `mv #{RAILS_ROOT}/private_downloads/resources/files_p #{RAILS_ROOT}/private_downloads/resources/files` # rename to old once finished.

    This will take any length of file name and split up into the new format for you.

  3. Aaron

    I know this is a reasonably old post now, but what actually happens when you reach the 1 billion attachment mark?
