2. Installation

Dependencies

The IDML format is complex, and extracting information from it is difficult. Ideally, this repository would consist of a single Haskell file, like Pandoc’s other readers. However, that would take an enormous amount of time to develop.

Others have explored reading IDML files, notably the idml2xml-frontend project, distributed under the FreeBSD license. Our converter builds on that work: it acts as a bridge between idml2xml-frontend and Pandoc by converting the Hub XML output of idml2xml-frontend into DocBook 5.1.

The principal dependencies are:

  • Python 3.x;

  • Java 1.7+;

  • the dependencies of the Python package idml2docbook;

  • idml2xml-frontend.

To understand the purpose of each dependency, see the conversion graph.
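The "Java 1.7+" requirement uses the legacy version scheme, in which Java 7 reports itself as "1.7". The following is a minimal sketch, not part of this repository, of how the output of java -version could be interpreted under both the legacy and the modern scheme. The function name java_major_version is hypothetical.

```python
import re


def java_major_version(version_output: str) -> int:
    """Extract the Java major version from the text printed by `java -version`.

    Handles the legacy scheme ("1.7.0_80" means Java 7) as well as the
    modern scheme ("17.0.2" means Java 17). Hypothetical helper.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no Java version string found")
    first = int(match.group(1))
    second = match.group(2)
    # Legacy releases report "1.<major>"; newer ones report "<major>" directly.
    if first == 1 and second is not None:
        return int(second)
    return first
```

Note that java -version writes to standard error, so when calling it through subprocess.run(["java", "-version"], capture_output=True, text=True), the text to parse is in the stderr attribute of the result.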

Installation with install.sh

First, clone the idml-pandoc-reader repository:

git clone https://gitlab.com/deborderbollore/idml-pandoc-reader

An install.sh script for macOS and Linux is provided to make getting started easier. Installation on Windows should also be possible by adapting the steps described below, although this has not yet been tested. The script mainly serves to:

  • check that Java (>= 7.0.0) is installed;

  • check that Git is installed;

  • install idml2xml-frontend;

  • check that Python 3 and pip (>= 21.0) are installed;

  • install Python dependencies from requirements.txt;

  • generate a basic .env environment file;

  • optionally install the idml2docbook module via pip install .;

  • run a test command to verify the installation is valid.
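If you prefer to check the prerequisites by hand before running the script, the presence checks listed above can be approximated as follows. This is only a sketch and not the logic of install.sh itself. The executable names (java, git, python3, pip) are assumptions and may differ on your system, for example pip3 instead of pip.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Assumed executable names; adjust them to your system.
REQUIRED_TOOLS = ("java", "git", "python3", "pip")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that cannot be found on the PATH.

    `which` is injectable so the function can be exercised without
    depending on what is installed on the current machine.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("All prerequisites found on PATH.")
```

This only confirms that the executables exist. It does not check the minimum versions mentioned in the list.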

To run this script, start by making it executable:

chmod +x ./install.sh

You can then start the installation:

./install.sh

Note: For large IDML files, it may be necessary to increase the Java heap size, for example to 2048m or 4096m.

Environment configuration (.env)

The .env.sample file shows an example configuration file.

At a minimum, the converter must be able to run idml2xml-frontend. The IDML2HUBXML_SCRIPT_FOLDER entry in the .env file must therefore hold the absolute path of the idml2xml-frontend directory on your machine. This is likely the most important line in your .env file; the installation script usually fills it in automatically.

The key/value pairs in the .env file allow you to override the default values of the idml2docbook package. For more information on these variables, see the list of options.
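To illustrate the requirement on IDML2HUBXML_SCRIPT_FOLDER, here is a minimal sketch that reads simple KEY=value lines and reports why the folder setting would not work. It is not how the idml2docbook package loads its configuration, and the function names parse_env and script_folder_problem are hypothetical.

```python
import os
from typing import Dict, Optional


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def script_folder_problem(values: Dict[str, str]) -> Optional[str]:
    """Return a description of the problem, or None if the setting looks valid."""
    folder = values.get("IDML2HUBXML_SCRIPT_FOLDER")
    if not folder:
        return "IDML2HUBXML_SCRIPT_FOLDER is not set"
    if not os.path.isabs(folder):
        return "IDML2HUBXML_SCRIPT_FOLDER is not an absolute path"
    if not os.path.isdir(folder):
        return "IDML2HUBXML_SCRIPT_FOLDER does not point to an existing directory"
    return None
```

For example, script_folder_problem(parse_env(open(".env").read())) returns None when the entry is an absolute path to an existing directory.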

Pandoc

Pandoc is not technically a dependency, since the idml2docbook converter does not require it to function.

However, a slightly modified version of Pandoc was developed to support reading paragraph and character styles. To use it, you must compile it from source.

It is also possible to use the main version of Pandoc, but without style mapping. A pull request integrating these new features into the main Pandoc branch is close to being merged.

Configuration test with the modified version of Pandoc

To verify that the dependencies are installed and the .env file is correctly set up, you can test the converter in your terminal with the following command:

pandoc hello_world.idml -f idml.lua -t markdown

The result should then be:

::: {wrapper="1" role="NormalParagraphStyle"}
Hello world!
:::
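The same check can be automated. The sketch below runs the command shown above and compares its output with the expected result, ignoring trailing whitespace. It assumes that pandoc, idml.lua and hello_world.idml are reachable from the current directory. The function names are hypothetical.

```python
import subprocess
from typing import List

# Expected Markdown output, copied from the example above.
EXPECTED = '::: {wrapper="1" role="NormalParagraphStyle"}\nHello world!\n:::'


def smoke_test_command(idml_path: str = "hello_world.idml") -> List[str]:
    """Build the test command as an argument list."""
    return ["pandoc", idml_path, "-f", "idml.lua", "-t", "markdown"]


def output_matches(actual: str, expected: str = EXPECTED) -> bool:
    """Compare two outputs line by line, ignoring trailing whitespace."""
    def normalize(text: str) -> List[str]:
        return [line.rstrip() for line in text.strip().splitlines()]
    return normalize(actual) == normalize(expected)


def run_smoke_test(idml_path: str = "hello_world.idml") -> bool:
    """Run pandoc and report whether its output matches the expected result."""
    result = subprocess.run(
        smoke_test_command(idml_path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if pandoc exits with an error
    )
    return output_matches(result.stdout)
```

A failing comparison usually points to a missing dependency, an incorrect .env file, or the use of the main Pandoc version, which does not yet expose the style information.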