Paragraph Formatting when Compiling to .epub

rupansansei
Posts: 6
Joined: Wed Nov 14, 2012 7:42 pm
Platform: Mac

Sat Sep 27, 2014 10:30 pm Post

Hey everyone. I'm hoping to get a little help with paragraph formatting.

I'm trying to compile to epub as simply as possible, but no matter what I do, the compile seems to try and style the paragraphs. I just want basic paragraphs with a single line break between each, which should be accomplished by the unstyled construction:

<p> text </p>

<p> text </p>

But when I compile, it automatically styles the paragraphs with bottom margins of 0em (if I have just one line break between the text), or, if I add an extra carriage return, it adds an empty paragraph like this:

<p> text </p>

<p> <br /> </p>

<p> text </p>

I tried using the option Convert paragraph spacing to Plain Text in the Compile manager but had no success (nor was I sure that's what I needed to do). Any suggestions would be greatly appreciated.

AmberV
Posts: 20608
Joined: Sun Jun 18, 2006 4:30 am
Platform: Mac + Linux
Location: Santiago de Compostela, Galiza

Sat Sep 27, 2014 10:59 pm Post

Have you tried using Markdown? Via the Pandoc converter, you can get a very clean, stripped down ePub base with semantic HTML usage throughout. The stuff that comes out of the Mac RTF->HTML converter is admittedly a bit messy (what you can expect from a word processor, basically, where emphasis is on appearance), but there isn’t a whole lot we can do about that.

There are tricks that could make things easier on yourself, while sticking to the standard rich text workflow. Keeping things as consistent as possible, so that you have a minimal amount of styling in the CSS files will make your life easier. You can just find which p# is responsible for the body text, and insert your formatting there. Thus you can customise the look of headings, block quotes, and everything else that has a different but consistent look. It’s not ideal, because those numbers can change meaning, and then you have to fix your stylesheet to the new numbers—but again consistency can help there, too. You probably won’t have to fix the CSS numbering later on if you settle on the formatting types needed for the document, first.
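As a rough illustration of finding which p# class carries the body text, here is a minimal Python sketch. It assumes you have unpacked the compiled ePub and read one of its HTML files into a string; the function name and the sample markup are made up for the example and are not Scrivener output. The idea is simply that the most frequent class on `<p>` tags is most likely the body text.

```python
import re
from collections import Counter


def count_paragraph_classes(html: str) -> Counter:
    """Count how often each class value appears on <p> tags."""
    # \b after "<p" keeps tags such as <pre> from matching.
    classes = re.findall(r'<p\b[^>]*\bclass="([^"]*)"', html)
    return Counter(classes)


# Hypothetical sample standing in for one compiled chapter file.
sample = (
    '<p class="p1">Chapter One</p>'
    '<p class="p2">First body paragraph.</p>'
    '<p class="p2">Second body paragraph.</p>'
    '<p class="p3"><br /></p>'
)

counts = count_paragraph_classes(sample)
# The most common class is the likeliest candidate for body text.
body_class, _ = counts.most_common(1)[0]
print(body_class)
```

Once the body class is known, the formatting for it can be edited in the stylesheet in one place.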

Like I say though, if you’re looking to use Scrivener to create a clean ePub base so that you can more easily style the book using your technical knowledge—I think Markdown is the right approach, if you don’t mind treating the writing in Scrivener like a plain-text editor, of course. :)
.:.
Ioa Petra'ka
“Whole sight, or all the rest is desolation.” —John Fowles

rupansansei
Posts: 6
Joined: Wed Nov 14, 2012 7:42 pm
Platform: Mac

Sun Sep 28, 2014 12:54 pm Post

Interesting. Thanks for the suggestion. So I'm doing some research now - Pandoc is a converter program? So I could convert all my text in Scrivener to Markdown, compile it into a MultiMarkdown file, and then use pandoc to convert it into epub?

I think that might work. I'll give it a try.

Basically, I'm trying to figure out the approach that will take the least work. I'm producing a very, very basic ebook with limited styling. The only thing that I really change at all is the paragraph spacing - sometimes I have paragraphs with 0.0em margins on the bottom. Other than that, I center title text, and I have a larger font for the title page of the book. So all in all, really just like 4-5 paragraph styles. (Scrivener epub outputs with 5 different stylesheets, many of which overlap.)

So I feel like most of my paragraphs are just basic <p> with no style, but I'm having trouble getting Scrivener to recognize that.

Thanks again. I'll give it a shot.

AmberV
Posts: 20608
Joined: Sun Jun 18, 2006 4:30 am
Platform: Mac + Linux
Location: Santiago de Compostela, Galiza

Sun Sep 28, 2014 6:17 pm Post

So I could convert all my text in Scrivener to Markdown, compile it into a MultiMarkdown file, and then use pandoc to convert it into epub?


Yes, that’s it in a nutshell, though if you already have the majority of this written—that may prove to be more trouble than it is worth. There is a handy tool for converting rich italic/bold styling to Markdown formatting, in the Format/Convert/ sub-menu, so if that is the extent of your formatting in the body text—it may be enough. Obviously you would want to test this theory on a copy of the project.

I would test with a smaller project, with just a few chapters dragged over from your main project and converted to Markdown, to see how it goes. That avoids a huge amount of work toward something that may not work for you.

The only thing that I really change at all is the paragraph spacing - sometimes I have paragraphs with 0.0em margins on the bottom


That in particular may be difficult with Pandoc. I’m not as familiar with how to use it at the level of customising default output.

So I feel like most of my paragraphs are just basic <p> with no style, but I’m having trouble getting Scrivener to recognize that.


Well, that’s the difference between using a rich text engine to generate HTML and using an HTML editor or just coding it oneself. :) There is no such thing as a “plain paragraph” in rich text. Something always has some formatting, otherwise you wouldn’t be able to see it, and so rich text -> HTML converters just do what they can, and print that format.

rupansansei
Posts: 6
Joined: Wed Nov 14, 2012 7:42 pm
Platform: Mac

Sun Sep 28, 2014 8:40 pm Post

Excellent. Thanks for the reply. I have a copy of the project that I've been using to do test compiles to fool around with stuff.

I understand the difficulties, but I don't understand why they haven't been able to teach these rich text editors that extra line breaks are accomplished with the <p> tag. Oh well.

I'd found the Convert to Markdown feature earlier, so I knew about that trick. Would compiling to Multimarkdown do all of that automatically? I'll probably figure this out on my own in just a second when I give it a try.

Thanks again for the suggestion. I hope it works! Otherwise I'll be styling each paragraph individually! If I can get a majority of the epub in plain <p>, I feel like that will save me a load of time.

rupansansei
Posts: 6
Joined: Wed Nov 14, 2012 7:42 pm
Platform: Mac

Sun Sep 28, 2014 9:30 pm Post

I just converted a test file with pandoc. It's interesting. Many parts of it are formatted exactly how I expected (simple paragraphing), but the project as a whole is too disorganized to use. I have folders for each chapter with docs under each, and for some reason pandoc has not separated out each document. Oh well. It was a good thought. I'll probably have to go through each page. Thanks again!

asotir
Posts: 186
Joined: Sun Jun 24, 2012 10:38 pm
Platform: Mac

Sun Sep 28, 2014 11:00 pm Post

Free epub editor Sigil lets you search and replace through all your chapters. If your chapter titles are set up to compile as h# level headers, then you can clean up the blank p-br-/p paragraphs with one go. If all your p's are formatted the same, then the built-in OSX text converters Scrivener uses won't create too many different styles – maybe only the one <p class="p1"> so one search and replace could get all those converted back to plain p's.
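The two clean-ups described above can be sketched as regular expressions. This is a minimal Python illustration and not a Sigil feature; the function names and the sample markup are invented, and the class name "p1" is only the example from the paragraph above. It assumes the compiled HTML is simple enough for regex matching (no nested tags inside the spacer paragraphs). Similar patterns could probably be adapted to a regex-capable search and replace in an ePub editor, but test on a copy first.

```python
import re

# A paragraph whose only content is a line break, e.g. <p class="p2"><br /></p>,
# plus any whitespace that follows it.
EMPTY_SPACER = re.compile(r'<p\b[^>]*>\s*<br\s*/?>\s*</p>\s*', re.IGNORECASE)


def strip_spacers(html: str) -> str:
    """Remove the blank p-br-/p paragraphs."""
    return EMPTY_SPACER.sub('', html)


def unstyle_paragraphs(html: str, class_name: str = "p1") -> str:
    """Turn <p class="p1"> back into a plain <p>."""
    pattern = re.compile(r'<p\s+class="%s"\s*>' % re.escape(class_name))
    return pattern.sub('<p>', html)


# Hypothetical sample standing in for a compiled chapter.
sample = (
    '<p class="p1">One</p>\n'
    '<p class="p2"><br /></p>\n'
    '<p class="p1">Two</p>'
)

cleaned = unstyle_paragraphs(strip_spacers(sample))
print(cleaned)
```

Running the spacer removal first means the second pass only has to deal with real text paragraphs.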

Also, you can compile straight to MultiMarkdown and use one of the plainer Markdown scripts to output very clean HTML; Sigil will then import this and you can split it up or release the epub as one internal file. Another way to go here is to compile to HTML (this creates only the one internal stylesheet and fewer p.# styles), clean it up in a text editor, and then bring that into Sigil (or even Calibre) for manual re-editing and cleaning.

The simplest way to go, though, is to accept what Scrivener and the OSX tools give you. The code may not be as clean as you would like but so long as the output looks OK, your readers will not be irked. And you will have saved yourself some time and headaches.

- asotir

rupansansei
Posts: 6
Joined: Wed Nov 14, 2012 7:42 pm
Platform: Mac

Mon Sep 29, 2014 4:15 am Post

The simplest way to go, though, is to accept what Scrivener and the OSX tools give you. The code may not be as clean as you would like but so long as the output looks OK, your readers will not be irked. And you will have saved yourself some time and headaches.


That's an excellent point - thanks for the advice, asotir. I need to think about what my ultimate goal is - to have it look right for the readers. I'm just worried about the formatting ending up looking strange on different readers. It would be nice for it to look simple.

I have Sigil and was thinking that find and replace would be an easy way to go about formatting the paragraphs. I think that will be my best bet. And then I can input the images manually in Sigil as well to make sure they are all coded evenly.

Thanks for the advice.

AmberV
Posts: 20608
Joined: Sun Jun 18, 2006 4:30 am
Platform: Mac + Linux
Location: Santiago de Compostela, Galiza

Mon Sep 29, 2014 6:48 pm Post

I understand the difficulties, but I don’t understand why they haven’t been able to teach these rich text editors that extra line breaks are accomplished with the <p> tag. Oh well.


Well, they do, as you noted above, but some readers will not show an empty element as being anything, so if you just have an empty p element by itself, it won’t serve as a spacer. Putting something in it resolves that. I wouldn’t myself put a br, but there is probably no “right” way of doing that, since it’s kind of the wrong approach to begin with. If something needs extra space, that should be done with formatting, not with content work-arounds like extra paragraph breaks. A paragraph with an extended margin or padding on the bottom edge is what really should be used. But a word processor isn’t going to do that by itself.

That is something you can influence with the paragraph spacing tools in Scrivener. Maybe consider spacing instead of a literal empty line, where you need them. I wouldn’t overly worry about this though, unless you want to adjust the amount of spacing and find that difficult with a full carriage return in there.

I’d found the Convert to Markdown feature earlier, so I knew about that trick. Would compiling to Multimarkdown do all of that automatically? I’ll probably figure this out on my own in just a second when I give it a try.


Not at this time, it’s a fully manual process and only handles bold and italic. There are a few things Scrivener can generate MultiMarkdown code for. Check the user manual’s chapter on MMD for details on what can be compiled.

But that is why I wondered if switching to a different form of writing at this point may be too much work. I suppose if the book is relatively “simple” in regards to formatting, it wouldn’t be too much.

I have folders for each chapter with docs under each, and for some reason pandoc has not separated out each document.


Do these document levels have headings in your Formatting pane settings? You’ll need that, so that you can get h2 or h3 or whatever headings, and then you will need to instruct Pandoc to use that depth for where file breaks should occur internally in the ePub, with --epub-chapter-level=#, where the number is the h# depth.
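For illustration, the command line described above could be assembled like this. It is a minimal Python sketch: the helper name and the file names book.md and book.epub are hypothetical, and the only option taken from this thread is --epub-chapter-level. Newer Pandoc releases may name this option differently, so check `pandoc --help` for the version you have installed.

```python
import shlex


def pandoc_epub_command(source: str, output: str, chapter_level: int) -> list:
    """Build the argument list for a Pandoc ePub conversion.

    chapter_level is the h# depth at which Pandoc should start a new
    internal file inside the ePub.
    """
    return [
        "pandoc",
        source,
        "-o", output,
        f"--epub-chapter-level={chapter_level}",
    ]


# Hypothetical file names; split at h2 headings.
cmd = pandoc_epub_command("book.md", "book.epub", chapter_level=2)
print(shlex.join(cmd))

# To actually run the conversion (requires Pandoc to be installed):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

With chapter folders compiled as h1 and the documents under them as h2, a level of 2 would be the assumption to try first.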

That is a technical detail however—you shouldn’t need to cut things up into tiny files inside the ePub, this setting is more for optimising against very long HTML files, which can bog down some readers. If you’re mainly just interested in manipulating the ToC, then all you need to know is that headings are used to create the ToC (even coming from the same technical HTML file, that doesn’t really matter, save for the performance concerns mentioned).

(And of course, one has full control of the ToC in Sigil/Calibre after compiling, so this is mainly about getting the majority of the grunt work done if you can’t find a setting that works perfectly.)

The simplest way to go, though, is to accept what Scrivener and the OSX tools give you. The code may not be as clean as you would like but so long as the output looks OK, your readers will not be irked. And you will have saved yourself some time and headaches.


I do second that, unless you are an HTML/CSS jockey and you’re looking for a clean start to do semantic styling against (and not having to wrestle with sequentially assigned class names).

Here’s the thing: you’ll get a consistent look with what Scrivener generates natively, because it’s using the same exact HTML/CSS tools a guru would be using. It might be a little messier and more verbose than a hand-coded CSS file (let alone that the engine creates a new CSS document for every HTML file even if the style is the same in every single one), but it works, and it works on the same principle as a hand-crafted ePub. So that’s my advice—unless you want to get in there and design the book personally, don’t worry too much about Scrivener’s output not being good for e-books. We do take care to make sure the output looks good in a broad range of readers.