When you need to convert PDF files to images on a Linux server, pdftoppm (from the Poppler utilities) is a fast and reliable tool. In this post, we’ll look at how to invoke pdftoppm from Elixir and how to run multiple conversions in parallel to improve throughput.

Installing pdftoppm

On most Linux distributions, pdftoppm is part of the poppler-utils package. On macOS, the Homebrew formula is simply called poppler.

# Debian / Ubuntu
sudo apt install poppler-utils

# Alpine
apk add poppler-utils

# macOS
brew install poppler

You can verify the installation with:

pdftoppm -h
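
The same check can be made from Elixir before any conversion is attempted. This is a minimal sketch; the module name PdfTools and the function pdftoppm_path/0 are hypothetical names used only for illustration.

```elixir
defmodule PdfTools do
  # Hypothetical helper: looks up pdftoppm on the PATH seen by the running VM.
  def pdftoppm_path do
    case System.find_executable("pdftoppm") do
      nil -> {:error, :pdftoppm_not_found}
      path -> {:ok, path}
    end
  end
end

IO.inspect(PdfTools.pdftoppm_path(), label: "pdftoppm")
```

Checking once at application start keeps a missing binary from surfacing later as a failure in the middle of a batch.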

Basic pdftoppm usage

To convert a PDF to JPEG images at 150 DPI:

pdftoppm -jpeg -r 150 input.pdf output/page

This produces files like:

output/page-1.jpg
output/page-2.jpg

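Each page becomes a separate image. The output directory must already exist, because pdftoppm does not create it. pdftoppm also zero-pads the page number to the width of the document's page count, so a 12-page PDF yields page-01.jpg through page-12.jpg.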
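
For comparison with the stdout-based approach later in the post, the file-based command can be run from Elixir with System.cmd/3. This is a rough sketch: the module PdfToFiles and its function names are made up for this example, and it assumes pdftoppm is on the PATH (System.cmd/3 raises if the executable cannot be found).

```elixir
defmodule PdfToFiles do
  # Builds the argument list for: pdftoppm -jpeg -r DPI input.pdf prefix
  def args(pdf_path, prefix, dpi \\ 150) do
    ["-jpeg", "-r", Integer.to_string(dpi), pdf_path, prefix]
  end

  # Runs pdftoppm and returns :ok or {:error, {:exit_status, status}}.
  def convert(pdf_path, prefix, dpi \\ 150) do
    # pdftoppm does not create the output directory, so create it first.
    File.mkdir_p!(Path.dirname(prefix))

    case System.cmd("pdftoppm", args(pdf_path, prefix, dpi)) do
      {_output, 0} -> :ok
      {_output, status} -> {:error, {:exit_status, status}}
    end
  end
end

IO.inspect(PdfToFiles.args("input.pdf", "output/page"))
```

Passing the arguments as a list avoids shell quoting issues with file names that contain spaces.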

Writing image data to stdout with pdftoppm

In some setups it is useful to avoid temporary files and let pdftoppm write the rendered image directly to stdout. From Elixir, you can then capture that output and persist it yourself. This post shows how to do this cleanly, keeping pdftoppm's diagnostic messages out of the image data so failures are easy to handle.

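pdftoppm writes images to files by default, but if you omit the PPM-file-prefix argument it writes the image data to stdout instead. Because there is only one output stream, restrict the range to a single page with -f and -l.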

To render a single page as JPEG to stdout:

pdftoppm -jpeg -r 150 -f 1 -l 1 -jpegopt quality=85 -aa yes -aaVector yes input.pdf

On success:

stdout contains the binary JPEG data
stderr is normally empty (Poppler may still print warnings for damaged PDFs that it manages to render)

On failure:

stdout is empty
stderr contains the error message

This makes it a good fit for piping and programmatic use.

Why `System.cmd/3` is not enough

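System.cmd/3 returns stdout together with the exit status, and with stderr_to_stdout: true it merges stderr into the same binary. It cannot return the two streams separately, and it has no timeout option. Since we explicitly want:

image data from stdout, with nothing else mixed in
error messages from stderr kept out of that data
a timeout for conversions that hang

we use a Port. Note that a Port opened without :stderr_to_stdout delivers stdout only: pdftoppm's stderr goes to the BEAM's own stderr rather than to your code.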
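
The two capture modes of System.cmd/3 can be seen with a stand-in command instead of pdftoppm. The sketch below assumes a POSIX sh on the PATH; the script simply writes one line to each stream.

```elixir
# A stand-in child process: one line on stdout, one line on stderr.
script = "echo image-bytes; echo some-error 1>&2"

# Default: only stdout is returned; stderr goes to the VM's own stderr.
{stdout_only, 0} = System.cmd("sh", ["-c", script])

# With stderr_to_stdout: true both streams arrive in a single binary.
{merged, 0} = System.cmd("sh", ["-c", script], stderr_to_stdout: true)

IO.inspect(stdout_only, label: "stdout only")
IO.inspect(merged, label: "merged")
```

In the merged case there is no reliable way to tell image bytes from message text afterwards, which is why merging is a poor fit for binary output.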

Converting a single page from Elixir

The function below renders a single page to JPEG, saves the image to disk, and returns structured errors when something goes wrong.

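defmodule PdfToImage do
  # Renders one page to JPEG via pdftoppm's stdout and writes it to output_file.
  def convert_page(pdf_path, page, output_file, opts \\ []) do
    dpi = Keyword.get(opts, :dpi, 150)

    args = [
      "-jpeg",
      "-jpegopt", "quality=85",
      "-aa", "yes",
      "-aaVector", "yes",
      "-r", to_string(dpi),
      "-f", to_string(page),
      "-l", to_string(page),
      pdf_path
    ]

    case System.find_executable("pdftoppm") do
      nil ->
        {:error, :pdftoppm_not_found}

      executable ->
        # Without :stderr_to_stdout the port delivers stdout only. pdftoppm's
        # stderr goes to the BEAM's own stderr and never mixes with the image.
        port =
          Port.open(
            {:spawn_executable, executable},
            [:binary, :exit_status, args: args]
          )

        collect_output(port, output_file, [])
    end
  end

  # Accumulates stdout chunks as iodata until the process exits.
  defp collect_output(port, output_file, chunks) do
    receive do
      {^port, {:data, data}} ->
        collect_output(port, output_file, [chunks, data])

      {^port, {:exit_status, 0}} ->
        # File.write/2 returns :ok or {:error, reason} instead of raising.
        File.write(output_file, chunks)

      {^port, {:exit_status, status}} ->
        # stderr is not available here, so only the exit status is returned.
        {:error, {:exit_status, status}}
    after
      # Fires when no message arrives for 30 seconds; it is not a total deadline.
      30_000 ->
        # Closing the port closes the pipes; it does not send a kill signal.
        Port.close(port)
        {:error, :timeout}
    end
  end
end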

Usage:

PdfToImage.convert_page(
  "input.pdf",
  1,
  "output/page-1.jpg",
  dpi: 200
)
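
An exit status of 0 does not by itself prove that the captured bytes are an image. One cheap sanity check, sketched here with a hypothetical JpegCheck module that is not part of the original code, is to look for the JPEG start-of-image marker before writing the data to disk.

```elixir
defmodule JpegCheck do
  # A JPEG stream starts with the SOI marker FF D8, followed by the
  # FF byte that introduces the next marker segment.
  def jpeg?(<<0xFF, 0xD8, 0xFF, _rest::binary>>), do: true
  def jpeg?(_other), do: false
end

IO.inspect(JpegCheck.jpeg?(<<0xFF, 0xD8, 0xFF, 0xE0, 0x00>>), label: "jpeg header")
IO.inspect(JpegCheck.jpeg?("Syntax Error: not a PDF"), label: "text")
```

This only inspects the first three bytes, so it catches empty or textual output but does not validate the rest of the file.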

Parallelizing page conversion

Because each page conversion is independent, this approach works well with Task.async_stream/3.

pages = 1..10

Task.async_stream(
  pages,
  fn page ->
    PdfToImage.convert_page(
      "input.pdf",
      page,
      "output/page-#{page}.jpg"
    )
  end,
  max_concurrency: System.schedulers_online(),
  timeout: :infinity
)
|> Enum.to_list()

Each task spawns its own pdftoppm process, captures binary image data from stdout, and only writes a file once rendering succeeds.
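
The stream above discards its results. A possible way to keep track of which pages failed is sketched below. PageResults is a hypothetical module, and the convert argument is any one-argument function that returns :ok or {:error, reason}, so the example runs with a stand-in instead of a real pdftoppm call.

```elixir
defmodule PageResults do
  # Runs `convert` for every page in parallel and groups the outcomes.
  # Task.async_stream/3 keeps input order by default, so zipping is safe.
  def run(pages, convert) do
    grouped =
      pages
      |> Task.async_stream(convert,
        max_concurrency: System.schedulers_online(),
        timeout: :infinity
      )
      |> Enum.zip(pages)
      |> Enum.reduce(%{ok: [], failed: []}, fn
        {{:ok, :ok}, page}, acc ->
          %{acc | ok: [page | acc.ok]}

        {{:ok, {:error, reason}}, page}, acc ->
          %{acc | failed: [{page, reason} | acc.failed]}

        # Only seen when the caller traps exits or uses on_timeout: :kill_task.
        {{:exit, reason}, page}, acc ->
          %{acc | failed: [{page, {:exit, reason}} | acc.failed]}
      end)

    %{ok: Enum.reverse(grouped.ok), failed: Enum.reverse(grouped.failed)}
  end
end

# Stand-in converter: pretends that page 2 fails.
fake_convert = fn
  2 -> {:error, {:exit_status, 1}}
  _page -> :ok
end

IO.inspect(PageResults.run(1..3, fake_convert))
```

In real use, the stand-in would be replaced by a function that calls PdfToImage.convert_page/4 for the given page.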

Error handling characteristics

On success, only stdout is used and written to disk
On failure, no file is created
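The returned error carries pdftoppm's exit status. A plain port does not capture stderr, so the error text itself is written to the BEAM's own stderr unless you redirect it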
This makes it suitable for background jobs and structured logging
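
One way to obtain stderr as a value separate from stdout, assuming a POSIX sh is available, is to let a small shell wrapper redirect it to a temporary file. The sketch below is not from the original code: SeparateStreams is a hypothetical module, and it uses System.cmd/3 for brevity, so it has no timeout. Only the short error text goes through the file; the image data still arrives on stdout.

```elixir
defmodule SeparateStreams do
  # Runs `exe` with `args` and returns {stdout, stderr, exit_status}.
  def run(exe, args) do
    name = "stderr-#{System.pid()}-#{System.unique_integer([:positive])}.log"
    err_path = Path.join(System.tmp_dir!(), name)

    # In `sh -c SCRIPT A B C`, $0 is A and "$@" is B C, so the file name and
    # the command are passed as arguments, never pasted into the script text.
    script = ~S(exec "$@" 2>"$0")

    try do
      {stdout, status} = System.cmd("sh", ["-c", script, err_path, exe | args])

      stderr =
        case File.read(err_path) do
          {:ok, data} -> data
          {:error, _reason} -> ""
        end

      {stdout, stderr, status}
    after
      # Remove the temporary file whether or not the command succeeded.
      File.rm(err_path)
    end
  end
end

# Stand-in for a failing pdftoppm call: writes to both streams, exits with 3.
demo = SeparateStreams.run("sh", ["-c", "echo data; echo oops 1>&2; exit 3"])
IO.inspect(demo)
```

With pdftoppm, exe would be the path returned by System.find_executable/1 and args the same list used for the port.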

Conclusion

By letting pdftoppm write image data to stdout and capturing it via a Port, you gain full control over I/O, error handling, and parallel execution. This avoids temporary files, keeps failure cases clean, and integrates well with Elixir’s concurrency primitives.

🐥 Using pdftoppm from Elixir to convert PDF files to images

February 1, 2026
