MMD to LaTeX: Apostrophes in Section Labels

Marko
Posts: 32
Joined: Thu May 07, 2015 7:05 am
Platform: Mac + iOS

Mon Jul 03, 2017 1:46 pm Post

Hello community,

I write my academic papers in MultiMarkdown and convert them to LaTeX documents, but I have a problem. The compiled .tex file contains automatically generated labels for every (sub-)section, and I use folder names as titles (as set in the Formatting tab of the compile screen). I believe this is the default behavior for MultiMarkdown. However, some of my folder names contain special characters (especially apostrophes), and these always cause an error when I compile the .tex file, because special characters are not allowed in labels. For example, I have a folder called “Aristotle’s Thesis”. This creates the label \label{aristotle’sthesis}.

Is there a way to tell Scrivener to ignore special characters when creating labels in LaTeX? Or is there another solution?

Thanks a lot!

liz
Posts: 75
Joined: Sat Mar 25, 2017 6:40 pm
Platform: Win + iOS

Mon Jul 03, 2017 3:15 pm Post

Barring some esoteric knowledge about MMD that I don't have, I think you might be out of luck in this regard if you want to use MMD to automatically generate sections. I like to use periods in labels, as in \label{section.label}, and MMD strips these out too. To get around this I use "Compile As-Is" and add a replacement that wraps the label syntax in an HTML comment. But then you need to do the same thing with the section syntax, because "Compile As-Is" drops that too.

Marko
Posts: 32
Joined: Thu May 07, 2015 7:05 am
Platform: Mac + iOS

Mon Jul 03, 2017 4:45 pm Post

Thanks liz, you gave me an idea. Here’s my solution, and maybe it even fits your workflow, too.

In the Formatting tab, I defined a prefix for folders:

Code:

<!--\section{<$title>}-->

I used \subsection and \subsubsection for folders at levels 2 and 3. I then unchecked the box that inserts the folder titles automatically. The result is as before, but no label is created automatically. You can then add the label manually (inside HTML comment tags) in the text or notes of the folder. This way, I don’t have to manually tick Compile As-Is for each folder.

I actually put

Code:

<!--\pagebreak

\section{<$title>}-->

in the prefix, in order to automatically put a page break before headings of level 1, without the need to manually check the Pg Break Before box. Pretty neat.

AmberV
Posts: 24654
Joined: Sun Jun 18, 2006 4:30 am
Platform: Mac + Linux
Location: Ourense, Galiza

Mon Jul 03, 2017 9:45 pm Post

My guess (at least based on how you typed them in) is that the root problem here is that you are using typographic quotes and apostrophes in your titles—MMD generally assumes you are not using typographic punctuation at any point—it has its own smart quotes feature built-in for one thing. By and large they will work fine these days, especially if you are using UTF–8 packages or XeLaTeX, but it looks like MMD itself is blind to them as characters that need to be stripped from labels.

If your title were “Aristotle's Thesis”, then you would find the resulting label to be “aristotlesthesis”, and any cross-references you used like [Aristotle's Thesis] would generate the same target for autoref.
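
For illustration, the LaTeX that MMD produces for that straight-apostrophe title should look roughly like the sketch below. The exact markup depends on the MMD version, so treat the wording around the cross-reference as an assumption; only the label name comes from the behavior described above.

```latex
% Sketch of the LaTeX for the heading "# Aristotle's Thesis".
% The label is the title lowercased, with the space and the
% ASCII apostrophe removed.
\section{Aristotle's Thesis}
\label{aristotlesthesis}

% A cross-reference written as [Aristotle's Thesis] points at the
% same target. The surrounding text is an assumption; \autoref
% itself requires the hyperref package.
See Aristotle's Thesis (\autoref{aristotlesthesis}).
```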

That little quirk aside, if you want to generate labels out of Scrivener, do note you can use the <$title> placeholder more than once, but you might want to use the <$title_no_spaces> placeholder instead. This works for a title prefix for instance:

Code:

<!--\section{<$title>}
\label{<$title_no_spaces>}-->


Likewise, you can use that latter placeholder in a cross-reference if you create a Scrivener Link pointing from the placeholder text to the title in question (linking a placeholder causes it to draw its data from the linked item).
.:.
Ioa Petra'ka
“Whole sight, or all the rest is desolation.” —John Fowles

Marko
Posts: 32
Joined: Thu May 07, 2015 7:05 am
Platform: Mac + iOS

Tue Jul 04, 2017 6:02 am Post

As far as I can tell, the problem is that I need to use \usepackage[utf8]{inputenc}. With this package loaded, characters other than a–z and 0–9 (and maybe some punctuation) can no longer be used in labels (everything else works fine). Other than that, I have never had problems with typographic symbols and MMD. Since I regularly switch between German and English, I usually deactivate any smart quote replacements and just type the correct ones myself.
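
A minimal test file along these lines shows the setup, assuming pdfLaTeX with inputenc; the document class and the body text are placeholders made up for the example. The typographic apostrophe is fine in the title, while the label sticks to plain ASCII:

```latex
% Compile with pdflatex.
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}

\begin{document}

\section{Aristotle’s Thesis}   % typographic apostrophe in the title is fine
\label{aristotlesthesis}       % ASCII-only label avoids the error

As argued in section~\ref{aristotlesthesis}, \dots

\end{document}
```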

Thanks for your further tips! I’m just starting to learn how to use the placeholder tags effectively. Thanks!

nontroppo
Posts: 1288
Joined: Mon Mar 05, 2007 5:22 pm
Platform: Mac
Location: Airstrip One

Tue Jul 04, 2017 7:53 am Post

Pandoc correctly handles apostrophes and quotes when generating auto-identifiers (http://pandoc.org/MANUAL.html#extension ... dentifiers).

Code:

\hypertarget{annas-talk}{%
\subsection{\texorpdfstring{``Anna's
Talk''}{Anna's Talk}}\label{annas-talk}}

liz
Posts: 75
Joined: Sat Mar 25, 2017 6:40 pm
Platform: Win + iOS

Tue Jul 04, 2017 3:20 pm Post

I like your idea about using prefixes in Compile.
It didn't even occur to me that you might be having an encoding problem because...I use XeTeX. As Ioa pointed out, you can use XeTeX (or LuaTeX) instead of inputenc. It's a better solution and a personal crusade of mine. It is absurd that in 2017 the LaTeX default is ASCII. This will change only when people stop using pdftex.

Marko
Posts: 32
Joined: Thu May 07, 2015 7:05 am
Platform: Mac + iOS

Fri Jul 07, 2017 5:34 am Post

That’s what I tried first, but that came with other problems with my LaTeX header that I’m not really inclined to solve right now. Not the least of which was that I couldn't even change the font to Times New Roman. Unfortunately, it's not just a matter of switching to XeTeX; I need different code.

Maybe when I have a little time and when I'm between writing projects I'll take another shot at it, but for the time being, I'm satisfied with what I have.

liz
Posts: 75
Joined: Sat Mar 25, 2017 6:40 pm
Platform: Win + iOS

Fri Jul 07, 2017 4:27 pm Post

I totally hear you about that. The entire TeX ecosystem is more than a little frustrating when it comes to getting stuff set up so that it works the way you want it to. XeTeX does two things very well: handling UTF-8 and handling fonts (the fontspec package is a huge improvement over loading fonts for pdftex). But unfortunately some older packages break and, like everything else with TeX, it requires some doing to get it set up.
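
For what it's worth, a bare-bones XeLaTeX file for the font case mentioned above might look like this. It is only a sketch: it assumes Times New Roman is installed as a system font, and the label with the typographic apostrophe is just there to check that non-ASCII labels get through.

```latex
% Compile with xelatex (or lualatex). No inputenc is needed,
% because UTF-8 is the native input encoding.
\documentclass{article}
\usepackage{fontspec}
\setmainfont{Times New Roman}  % assumes the font is installed on the system

\begin{document}

\section{Aristotle’s Thesis}
\label{aristotle’sthesis}      % non-ASCII label, the kind that failed under pdfLaTeX

See section~\ref{aristotle’sthesis}.

\end{document}
```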
Thanks again for the tips and happy writing.

MrGruff
Posts: 208
Joined: Tue Jun 05, 2007 4:22 pm
Location: UK

Mon Jul 10, 2017 11:19 am Post

The simplest answer to this is to add a label in the title of the document, enclosed in square brackets.

So your title in Scrivener would look like: Aristotle's thesis [aristotlesthesis]

On compile to MMD that gives the correct title Aristotle's thesis, and the label you have chosen.

I use this feature for creating short labels for very long titles (much easier for creating cross-references without typos) and also for creating unique labels for documents with identical titles (in my distance learning course handbooks each module has a 'module overview', which would generate lots of duplicate labels). It has the benefit over inserting raw latex that it doesn't clutter up the text with code.
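
As a sketch of the duplicate-title case, two documents both titled "Module overview" could carry distinct handles. The label names here are made up for illustration, and the Scrivener titles are shown in the comments:

```latex
% Scrivener/MMD titles (hypothetical):
%   Module overview [mod1overview]
%   Module overview [mod2overview]
% Expected LaTeX after conversion: same visible title, unique labels.
\section{Module overview}
\label{mod1overview}

\section{Module overview}
\label{mod2overview}

% Each section can then be referenced unambiguously
% (\autoref requires the hyperref package).
See \autoref{mod1overview} and \autoref{mod2overview}.
```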

AmberV
Posts: 24654
Joined: Sun Jun 18, 2006 4:30 am
Platform: Mac + Linux
Location: Ourense, Galiza

Mon Jul 10, 2017 6:34 pm Post

True, but in that specific case it would probably be easier to let MMD handle it. If you type “Aristotle's Thesis” (with a straight apostrophe) into the binder title and then refer to it in the text as [Aristotle's Thesis], the result will be the same as manually giving it “aristotlesthesis” as a handle:

Input:

Code:

# Aristotle's Thesis

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua: [Aristotle’s Thesis].

# Aristotle’s Thesis

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua: [Aristotle's Thesis].

# Aristotle's Thesis [aristotlesthesis]

This should produce the same HTML as the first heading, which will confusing things in a normal document since this would create a duplicate label---but here it is useful to see that the automatic result is the same as the manual result.


Output (HTML):

Code:

<h1 id="aristotlesthesis">Aristotle&#8217;s Thesis</h1>

<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua: <a href="#aristotle’sthesis">Aristotle’s Thesis</a>.</p>

<h1 id="aristotle’sthesis">Aristotle’s Thesis</h1>

<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua: <a href="#aristotlesthesis">Aristotle&#8217;s Thesis</a>.</p>

<h1 id="aristotlesthesis">Aristotle&#8217;s Thesis</h1>

<p>This should produce the same HTML as the first heading, which will confusing things in a normal document since this would create a duplicate label&#8212;but here it is useful to see that the automatic result is the same as the manual result.</p>


Note the first title is given a valid ID and a typographic apostrophe in the visible title. Likewise the visible cross-reference link in the second paragraph uses a valid ID reference and typographic punctuation in the visible text.

The heading with the punctuation already baked into it has an invalid ID in both the <h1> and href pointer (in the first paragraph).

I do agree that using a custom handle is probably a better approach than inserting raw LaTeX code though.