Converting LaTeX/PDF to HTML/Markdown for Jekyll-based blogs

Shuvomoy Das Gupta

January 6, 2023

In this blog post, we discuss how to convert a .tex or .pdf file to .html and then use that .html in a Markdown blog post.

.tex to .md

Suppose the .tex file is named test.tex. Change to the directory that contains this file and run make4ht by typing the following in the terminal:

cd "path_of_the_folder_containing_test.tex"
make4ht test.tex "mathml,mathjax"

Here, make4ht is a build system for tex4ht that ships with TeX Live, so please install TeX Live if you do not have it already. The make4ht command creates a file named test.html in the same folder. Open test.html in a text editor and copy its entire content. Then open the Markdown file that will become the blog post and paste the copied content between two ~~~ lines, as follows.

~~~
paste the content of test.html
~~~

That's it: the Markdown file can now be used as a blog post on a Jekyll-based website.
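
The copy-and-paste step can also be scripted. The following is a minimal sketch, not part of the original workflow: the function name wrap_html_in_post and the file name post.md are hypothetical, and it assumes that appending the wrapped HTML to the end of the post is what you want.

```shell
# Hypothetical helper: append the content of an HTML file to a Markdown post,
# wrapped between two ~~~ lines.
# $1: HTML file produced by make4ht (e.g. test.html)
# $2: Markdown file used for the blog post (assumed name, e.g. post.md)
wrap_html_in_post() {
    {
        printf '~~~\n'
        cat "$1"
        printf '\n~~~\n'
    } >> "$2"
}
```

For example, running wrap_html_in_post test.html post.md after the make4ht step appends the wrapped HTML to post.md. Because the output is appended rather than overwritten, any front matter already in the post is left untouched.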

.pdf to .html

To convert a PDF to HTML on a Linux-based OS (or Linux on Windows with WSL), follow these steps:

  • install ttfautohint by typing the following in the terminal: sudo apt install ttfautohint

  • download and install pdf2htmlEX from the link https://shuvomoy.github.io/blogs/assets/pdf2htmlEX/pdf2htmlEX-0.18.8.rc1-master-20200630-Ubuntu-focal-x86_64.deb

  • go to the folder containing the PDF file by typing cd DIR_NAME in the terminal

  • convert the file into HTML format by typing pdf2htmlEX --zoom 1.75 --external-hint-tool=ttfautohint --process-outline=0 "filename.pdf", which will create the filename.html file

  • copy filename.html into your posts folder; the page will then have a URL such as: http://localhost:8000/posts/filename/
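
The conversion and copy steps above can be combined into one shell function. This is only a sketch under some assumptions: the function name pdf_to_post is hypothetical, ttfautohint and pdf2htmlEX are assumed to be installed already, and the posts folder is assumed to exist. The pdf2htmlEX flags are the ones used in the steps above.

```shell
# Hypothetical helper: convert a PDF to HTML and copy the result to a posts folder.
# $1: path to the PDF file (e.g. DIR_NAME/filename.pdf)
# $2: existing posts folder of the site (assumed layout)
# The body runs in a subshell, so the caller's working directory is unchanged.
pdf_to_post() (
    # Resolve the posts folder to an absolute path before changing directory.
    posts_dir=$(cd "$2" && pwd) || exit 1
    base=$(basename "$1" .pdf)

    # Go to the folder containing the PDF file.
    cd "$(dirname "$1")" || exit 1

    # Convert the PDF; this creates $base.html in the current folder.
    pdf2htmlEX --zoom 1.75 --external-hint-tool=ttfautohint \
        --process-outline=0 "$base.pdf" || exit 1

    # Copy the generated HTML file to the posts folder.
    cp "$base.html" "$posts_dir/"
)
```

For example, pdf_to_post DIR_NAME/filename.pdf path_to_posts_folder would leave filename.html in the posts folder, where path_to_posts_folder stands for wherever your site keeps its posts.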