For a recent project we needed (and wanted) a simple solution to generate PDF files. Ideally, the solution would use HTML for the general layout and design of the generated PDF, working just like a normal view in Rails.

After testing a number of potential PDF solutions, we came across a neat little library called HTMLDOC. It takes basic HTML and converts it to PDF, among many other output formats.

For the situation and solution we wanted, it does a great job, especially for the price. To make it even easier to use, there is also a Ruby HTMLDOC Gem to go along with it. Score!

To generate the PDF files we used plain old HTML, something we were familiar with. Exactly like creating a normal rhtml view.

Below follows our experience installing and using HTMLDOC to get PDF file generation out of our Rails application. This has been tested and used on Linux and MacOS X 10.4.9.

Note: You will need the proper tools installed to compile HTMLDOC. On MacOS X, that usually consists of installing the developer tools.

1. Installing HTMLDOC

The first thing we need to do is get HTMLDOC downloaded, compiled, and installed. Copy and paste the following into your console:

curl -O http://ftp.easysw.com/pub/htmldoc/snapshots/htmldoc-1.9.x-r1521.tar.gz

tar zxvf htmldoc-1.9.x-r1521.tar.gz

cd htmldoc-1.9.x-r1521

./configure --prefix=/usr/local

make

sudo make install
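
Once the install finishes, it can be handy to confirm that the htmldoc binary is actually reachable before wiring it into Rails. The following is a minimal sketch in plain Ruby; find_executable is a hypothetical helper name, not part of HTMLDOC or the gem, and it assumes a Unix-style PATH.

```ruby
# Sketch: look for an executable called `name` in the directories listed in
# `path_env` (defaults to the current PATH). Returns the full path or nil.
# `find_executable` is a hypothetical helper, not part of the HTMLDOC gem.
def find_executable(name, path_env = ENV['PATH'].to_s)
  path_env.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?
    candidate = File.join(dir, name)
    # Only accept regular files that the current user may execute.
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end
```

Calling find_executable('htmldoc') should return something like /usr/local/bin/htmldoc when the --prefix above was used, or nil when the binary is not on the PATH. On Windows the binary would be named htmldoc.exe, which this sketch does not handle.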

2. Install the HTMLDOC Gem

Now that we have the HTMLDOC application ready to go, we want to install the HTMLDOC Ruby Gem to interface HTMLDOC with our Rails application:

sudo gem install htmldoc

3. Configuring Your Application

Next we need to configure our application. Open up your config/environment.rb file and add the following (to the end):

Mime::Type.register 'application/pdf', :pdf
require 'htmldoc'

Note: There is a way to make Rails handle the ‘.pdf’ extension format directly, but when we tried it, every page prompted a file download no matter which format we requested. After many attempts to fix this, we settled on the following solution:

4. PDF Renderer

We also added a method to our app/controllers/application.rb file to help DRY up the PDF generation, sort of like the render methods already included in Rails:

def render_to_pdf(options = nil)
  data = render_to_string(options)
  pdf = PDF::HTMLDoc.new
  pdf.set_option :bodycolor, :white
  pdf.set_option :toc, false
  pdf.set_option :portrait, true
  pdf.set_option :links, false
  pdf.set_option :webpage, true
  pdf.set_option :left, '2cm'
  pdf.set_option :right, '2cm'
  pdf << data
  pdf.generate
end

Just pass it the same options you would pass render. Check the HTMLDOC Gem rdoc page for more options and configurations.
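
If different reports need different settings (landscape instead of portrait, say), one option is to keep the defaults in a hash and merge per-call overrides into it. This is only a sketch: default_pdf_options and pdf_options are hypothetical helper names, and it assumes the gem accepts a :landscape option mirroring HTMLDOC's --landscape command-line flag, which is worth checking against the gem's rdoc.

```ruby
# Defaults copied from the render_to_pdf method above.
def default_pdf_options
  {
    :bodycolor => :white,
    :toc       => false,
    :portrait  => true,
    :links     => false,
    :webpage   => true,
    :left      => '2cm',
    :right     => '2cm'
  }
end

# Hypothetical helper: merge per-call overrides into the defaults.
# Asking for :landscape drops :portrait so the two never conflict.
def pdf_options(overrides = {})
  options = default_pdf_options.merge(overrides)
  options.delete(:portrait) if options[:landscape]
  options
end

# Inside render_to_pdf the merged hash could then be applied with:
#   pdf_options(overrides).each { |key, value| pdf.set_option key, value }
```

With this in place, pdf_options(:landscape => true) returns the defaults minus :portrait plus :landscape, while pdf_options with no arguments returns the same settings as the original method.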

Example

Here is an example controller method:

  def index
    @items = Item.find(:all)
 
    respond_to do |format|
      format.html # index.html
      format.xml { head :ok }
      format.pdf { send_data render_to_pdf({ :action => 'index.rpdf', :layout => 'pdf_report' }) }
    end
  end

Pretty typical Rails, no? We tell it to explicitly use that action/view and to use a different layout file.

Now an example of a view:

<h3>Showing: <%= pluralize(@items.size, 'item') %></h3>
<table><tbody>
<tr>
<th>Field1</th>
<th>Field2</th>
<th>Field3</th>
</tr>
<% @items.each do |item| %>
<tr>
<td><%= item.field1 %></td>
<td><%= item.field2 %></td>
<td><%= item.field3 %></td>
</tr>
<% end %>
</tbody></table>

Maybe it’s just me, but that sure beats the examples I have seen that use the PDF Writer plugin.

Finally, to generate a link to the PDF, assuming you are using restful routes:

 
<%= link_to 'PDF', formatted_items_path(:pdf) %>

HTMLDOC may not be perfect, but I found that its ease of use for generating nicely formatted PDF files far outweighed its limitations. I hope you found this useful.

Update: 12-26-2007

I also made a little helper method for images. Put this in your application_helper.rb:

def pdf_image_tag(image, options = {})
  options[:src] = File.expand_path(RAILS_ROOT) + '/public/images/' + image 
  tag(:img, options)
end
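
The path-building part of that helper can be tried outside Rails. Below is a small sketch that takes the application root as an argument instead of reading RAILS_ROOT; pdf_image_path is a hypothetical name used only for illustration, and the example assumes Unix-style paths.

```ruby
# Hypothetical helper: build the absolute filesystem path HTMLDOC needs for an
# image stored under public/images. `rails_root` is passed in explicitly here;
# inside a Rails helper it would be RAILS_ROOT.
def pdf_image_path(image, rails_root)
  File.join(File.expand_path(rails_root), 'public', 'images', image)
end
```

File.join avoids doubled or missing slashes when the root does or does not end in one, which plain string concatenation does not guard against.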


45 Comments

  1. Lovely approach; I have used the same in my projects. The only real problem I have had is with embedded, generated images, which are sometimes lost, as if htmldoc expects file paths and not URLs. It is on my to-do list to look at fixing this with some pre-caching.

    Tried a number of other approaches before this and definitely the best and really easy to implement

    Robert.

  2. Chris Kaukis says:

    Robert,

    I had the same problem with images, but have not looked into it yet as it was not of primary concern.
    Chris

  3. Jamie Hill says:

    Looks like a nice alternative to PDF Writer. Does it take CSS into account?

  4. Chris Kaukis says:

    Jamie,

    I am not entirely sure. I believe the website states the nightly builds have at least some support for CSS.

    Chris

  5. Dan Kubb says:

    Would you mind posting an example of what the example view renders to when rendered as PDF?

    I’ve got some code using PDF::Writer, and managing the templates is a pain. If I could design the templates using normal HTML views, and then render them as PDF with some reasonable control over the style, I’d be a happy man.

  6. Chris Kaukis says:

    Dan,

    I will try and post an example PDF tomorrow morning. However, I can tell you it looks pretty close to what a normal HTML page would look like.

    @Robert Shell: I tested images and you give it the full path. For example /Users/chris/Projects/railsapp/public/images/logo.gif

    Chris

  7. Chris W says:

    If you have to have css support and don’t mind spending some money, check this solution out.

    HTML / CSS to PDF using Ruby on Rails
    http://sublog.subimage.com/articles/2007/05/29/html-css-to-pdf-using-ruby-on-rails

  12. Dave says:

    WTF is formatted_items_path? I can’t find documentation on this anywhere.

  13. Chris Kaukis says:

    @Dave:

    It comes from restful routes. Example:

    map.resources :posts

    gives you:

    posts_path as well as formatted_posts_path(:format)

    Try rake routes within your project directory to see all your routes.

    Chris

  14. Dave says:

    also, how do you name the pdf

  15. Dave says:

    Chris, thanks for the clarification on the routing for me. Much obliged.

  16. Chris Kaukis says:

    @Dave:

    send_data has an option for filename.

    send_data render_to_pdf({ :action => 'index.rpdf', :layout => 'pdf_report' }), :filename => "foobar.pdf"

    Chris

  17. Nidhika says:

    When I put
    ” Mime::Type.register ‘application/pdf’, :pdf
    require ‘htmldoc’”

    at the end of the config/environment.rb file, the WEBrick server does not restart. How can I solve this issue?
    Can anybody help me?

    Thanks
    Nidhika

  18. @Nidhika: ensure that you are using the ‘ mark and not the full quotation mark if you copy/pasted from the post.

  19. Matt says:

    Hi all, am I the only one who can’t ever get images to be included in the document using PDF::HTMLDoc?

    If I run HTMLDOC from the command line with supposedly the same arguments (basically just --webpage) then it works a treat, even if I point it at the URL for my dynamically generated rails page. However if I call it using the Rails GEM then it renders everything except the images.

    It’s driving me bonkers and pretty urgent. I’ve tried all sorts of things including the pdf_image_tag helper above with numerous modifications (eg. adding file:/// at the start).

    Cheers,
    Matt

  20. Chris Kaukis says:

    Matt,

    I found that I had to use absolute path to the image. Thus, for example:

    /Users/chris/src/my_rails_app/public/images/some_image.png

    I hope that helps.

  21. Matt says:

    Hi Chris,

    Thanks for that. The trouble is (and I should have mentioned) that it’s on Windoze. If I set it to ‘c:\rails\project\public\images\some_image.jpg’, load the page directly and save it to disk and then load the page in explorer it looks fine (that is, it loads the image/s from disk). But the Ruby htmldoc plugin still doesn’t get them.

    I’ve also tried prepending file:/// and a few other things.

    //matt

  22. Kelly says:

    Hi Chris. Quick question: I have everything wired up, but I’m actually calling render_to_pdf from another controller in my app. It mostly works but gives me this error:
    undefined method `size’ for false:FalseClass
    I walked through it with the debugger and it seems that data is nil, so it can’t call size. I assume that is because the model for my controller is not an option passed to render_to_pdf, but I am not sure.

  23. Kelly says:

    As an addition when I debug it I get here in the streaming and I see data is null..

    ********************
    def send_data(data, options = {}) #:doc:
      logger.info "Sending data #{options[:filename]}" unless logger.nil?
      send_file_headers! options.merge(:length => data.size)
      @performed_render = false
    ********************
    is there a way to pass the record I have when I call this?
    right now we do send_data render_to_pdf but is there a way to say @object so I can pass the object created by the finder? send_data @object.render_to_pdf does not work either as I get a missing method render_to_string that way

  24. Christoph says:

    same error here, false class when using an image… any solution?

  25. Chris Kaukis says:

    It sounds like you are missing something. I would have to see your code to help more I think. Sorry.

  26. Christoph says:

    I tracked it down to some weird htmldoc output, which contains whitespaces sometimes.

    Here is a patch that solves my problem:
    http://textmode.at/2008/5/14/ruby-htmldoc-gem-falseclass-error

    Not sure if it’s the same problem as the user above has.
    I contacted the author htmldoc people.

    greetings
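
Independent of the linked patch (which is not reproduced here), a controller can guard against handing false or nil to send_data by checking the generated data first. The sketch below relies only on the fact that PDF files begin with the marker %PDF-; pdf_data? and ensure_pdf! are hypothetical helper names, and tolerating leading whitespace is an assumption based on the symptom described above.

```ruby
# Hypothetical check: does `data` look like PDF output?
# A PDF file starts with "%PDF-"; leading whitespace is tolerated here because
# stray whitespace in the output is the symptom described in the comment above.
def pdf_data?(data)
  # String#b returns a binary copy, so lstrip is safe on raw PDF bytes.
  data.is_a?(String) && data.b.lstrip.start_with?('%PDF-')
end

# Hypothetical wrapper: return the PDF bytes without leading whitespace, or
# raise instead of passing false/nil on to send_data.
def ensure_pdf!(data)
  unless pdf_data?(data)
    raise ArgumentError, 'PDF generation failed or returned non-PDF data'
  end
  data.b.lstrip
end
```

In a controller this could be used as send_data ensure_pdf!(render_to_pdf(...)), so a failed generation surfaces as a clear exception rather than as undefined method `size' for false:FalseClass.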

  27. kajinski says:

    Christoph, YOU ROCK.

    Thanks for the patch!

    I was getting the exact same error when using the pdf_image_tag helper. Applied the patch and voilà! The image renders perfectly.

  28. Chris says:

    sorry to ask what may be a simple answer, but how do you apply that patch?

    thx…

  29. srishti says:

    How can we use html doc in rails 1.2.3

  33. Webagentur says:

    Hey, that has me very helped. Thanks!

  34. dafinn says:

    First of all, Great work! Due to the Patch of Christoph printing Images with a complete path is working now!
    BUT… I am storing Articles in my Database. I use FCKEditor for the User-interface – the text is stored HTML formatted and the path for Images is stored in the HTML Code in form of “/public/uploads/Images” and so on. I have explored that HTMLDOC needs the full path(as some People above). For Example http://0.0.0.0:3000/public/uploads/Images
    Has anyone an idea how to fix this? I have seen the pdf_image_tag but i am not sure how it works…

  35. Eric Wagoner says:

    dafinn:

    I’ve got an app that does the same. I wrote a very simple helper that expands the image paths.

    def write_full_image_path(text)
      newtext = text.gsub('img src="', 'img src="' + File.expand_path(RAILS_ROOT) + '/public/')
      newtext
    end

    Then, in my .rpdf files, I call for write_full_image_path(model.formatted_text)
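
The same idea can be sketched with a regular expression so that it also copes with single-quoted attributes, attributes appearing before src, and images that already use a full URL. This is an illustrative variant, not the commenter's code: expand_image_paths is a hypothetical name, and public_dir stands in for the absolute path to the application's public directory so the sketch runs outside Rails.

```ruby
# Hypothetical variant of write_full_image_path.
# `public_dir` is the absolute path to the app's public directory.
def expand_image_paths(html, public_dir)
  html.gsub(/(<img\b[^>]*?\bsrc=)(["'])(.*?)\2/i) do
    prefix, quote, src = $1, $2, $3
    if src =~ %r{\A[a-z][a-z0-9+.-]*://}i
      # Leave absolute URLs (http://, file://, ...) untouched.
      "#{prefix}#{quote}#{src}#{quote}"
    else
      # Drop leading slashes, then join onto the public directory.
      local = File.join(public_dir, src.sub(%r{\A/+}, ''))
      "#{prefix}#{quote}#{local}#{quote}"
    end
  end
end
```

Inside a Rails 2 helper, public_dir would typically be File.join(File.expand_path(RAILS_ROOT), 'public').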

  36. Francesco says:

    Just a note for Ubuntu users: HTMLDOC must be installed by hand (configure, make, make install as stated here by author): if you install it with apt-get then HTMLDOC is not working in rails :)
    With Debian is fine to install htmldoc with apt-get install.

    The patch for images is necessary in both Debian and Ubuntu.

    Thank you *very much* to all the people who shared their knowledge here :)

  37. Robert says:

    def self.generate_pdf(url, links=false)
      doc = Document.new
      doc.mime = 'application/pdf'
      pdf = PDF::HTMLDoc.new
      pdf.set_option :bodycolor, :white
      pdf.set_option :links, links
      pdf.set_option :webpage, true
      pdf.set_option :path, "#{RAILS_ROOT}/public/"
      pdf << url
      pdf.footer ".1."
      if pdf.generate
        puts "Successfully generated a PDF file"
        doc.body = pdf.generate
      else
        puts "ERROR!--------------------------------------------"
        for error in pdf.errors
          puts error
        end
      end
      doc
    end

    The important line is the pdf.set_option :path, "#{RAILS_ROOT}/public/"

    It tells htmldoc to look in your public folder for images.

    If you manually create the img tag this will work and the images will be the same on the web as in the pdf. Using image_tag however will not work, because rails adds on the uid to each image.

    If you were adventurous, you could parse out the uid from the tags with gsub. Now if I could just figure out how to keep the errors without having to generate twice.
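
Both ideas from this comment can be sketched in a few lines. The first helper assumes the identifier Rails appends to image URLs is an all-digit query string (for example logo.gif?1198683400); the second shows one way to call generate only once by keeping its result. strip_asset_ids and generate_once are hypothetical names, and generate_once assumes only that the object responds to generate and errors, as in the code above.

```ruby
# Hypothetical helper: remove a trailing all-digit query string from the src
# attribute of img tags, leaving every other attribute untouched.
def strip_asset_ids(html)
  html.gsub(/(<img\b[^>]*?\bsrc=)(["'])([^"'?]*)\?\d+\2/i) do
    "#{$1}#{$2}#{$3}#{$2}"
  end
end

# Hypothetical pattern for generating once: store the result and branch on it.
# `generator` is any object responding to #generate and #errors.
def generate_once(generator)
  result = generator.generate
  return result if result
  raise "PDF generation failed: #{generator.errors.inspect}"
end
```

In the method above, this would mean assigning the return value of pdf.generate to a local variable and using that variable for doc.body, so the error branch still has access to pdf.errors without a second run.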

  38. Robert says:

    saving for later

  39. bharati says:

    getting file does not begin with ‘%pdf-’ error.. Can anyone help me..

  40. kams says:

    Hello,

    getting error as for rails application.
    “‘htmldoc’ is not recognized as an internal or external command,\noperable program or batch file.\n”

    Thank you

  41. Fernando says:

    Hi, I want to know how can I set the page as landscape.

  42. Yves-Eric says:

    Nice article, I had things up and running in a few minutes…

    But show stopper for me: HTMLDOC does not support CJK (Chinese, Japanese, Korean) languages.

  43. Robert Hall says:

    Having some issues with HTMLDoc, Rails 2.2.2 and Phusion Passenger – Apache. With the GEM installed, passenger refuses to start up with an “Unknown error” and doesn’t log anything so debugging is tough. Have you had any successful implementations on Rails 2.2.2 with Passenger? Suggestions on how to proceed?

  44. Hi Robert. We haven’t tried HTML doc with passenger. We have successfully used Prawn with that setup though. I’d give that a try.
