If you’re new to Leanpub, you may have seen references to something called “Markua”.
In this article, we’re going to explain what Markua is in some detail, but without being too technical about things.
This will involve a brief discussion of the history of book writing and publishing, so we hope it will be a little fun, too.
(Just a brief warning in advance for all you experts out there: if you’re from the book publishing industry, or a book history expert, or you already know very well what “Markdown” and “plain text” are, please don’t shoot us for where we’ve chosen to jump into the historical timeline here, and for the many liberties we’ll take with some details along the way, in the interests of telling a fun story!)
OK, here we go!
The Before Times
In the olden days, people used to write books on paper.
Actually, that’s already a bit misleading, and a good example of how bad our common-sense way of talking about everyday things can often be.
Strictly speaking, authors didn’t write books; they wrote manuscripts.
They would do this on paper, using a pen, or one of these:
(Actually, there are still people who prefer to write manuscripts with pens, especially for first drafts, and there are also people who still prefer typewriters, too. You may have heard of this guy.)
The book’s “manuscript” was the complete set of pieces of paper for the project.
When they were finished writing their manuscripts, authors would then give them to publishing companies, who would get someone to turn the manuscripts into bound paper books (we’ll call them the “book-maker”), and then the publisher would handle all the marketing, and things like getting the books into bookstores.
Side note: sometimes, novels were first published “serially”, chapter by chapter, in magazines or newspapers. In that case, a novel may have been fully published, but never actually have been turned into a “book”! Also: one of the reasons some nineteenth-century novels are so long is that they became hits as they were being published serially, so the authors naturally expanded their plots to make the novels longer, to cash in on the magazine sales.
But in any case, from the author’s perspective, the writing process with serial publishing was the same: you wrote your manuscript on pieces of paper, and then gave those pieces of paper to the magazine publisher. (For a twenty-first century take on all of this, please check out this video.)
It’s easy to get focused on how different these technologies — pen and paper, or typewriter and paper — are from the digital technologies we use to write books nowadays.
But another huge difference has to do with process.
Specifically, in the ink-and-paper days, the writer’s job — writing the manuscript — and the book-maker’s job — producing the book from the manuscript — were basically completely separate things.
There’s a lot we could talk about here, but given the point of this article, we want to focus on the fact that the writer was not expected to provide the book-maker with a text that looked anything like what the text would eventually look like in the book.
In other words, in the old days, it was not the author’s job to provide any “formatted” text, or to do any formatting themselves whatsoever. You just wrote your words on the paper, and if you wanted any formatting, you had to write out explicitly the instructions you wanted to give to the book-maker, like, “put the picture of that boat here,” or “these words should be in italics”, for example.
For any particularly sharp readers: this difference between then and now is why it wasn’t a mistake when we wrote, a few sentences back, about the technologies “we use to write books nowadays”. With the advent of what’s called “desktop publishing”, it now does actually make sense (well, kind of) to talk about “writing” a book, as opposed to only writing the manuscript for the book.
That’s because with everything from good old PageMaker to InDesign and Word, web page publishing, and especially today’s self-publishing short-form fiction apps and other platforms for storytellers, writers often do expect to format their writing themselves, and expect their words to be formatted in their book, ebook, or web page in exactly the same way those words are formatted on their screen when they type them.
(For an interesting account of one of the earlier efforts at real desktop publishing, check out this interview from our Frontmatter podcast, with special guest Thad McIlroy.)
As we’ve said, in the past, an ordinary pen-and-ink writer’s ordinary manuscript had no formatting, ordinarily. There was nothing in bold. There was nothing in italics.
In that sense, the words in the manuscript were “plain text”.
(Hold your horses there, all you experts; we’ll explain the concept of “plain text” more fully later on!)
What that means is that the words of the text themselves contain no information about how they are meant to be formatted. An “a” in plain text is just a plain old “a”: the only information it can be said to contain for the book-maker is, “This is an a”.
Actually, that’s not true, of course: a plain text “a” also includes the information, “This is a lower-case a”. But that’s about it for the information the letter contains, and that’s not exactly what people usually mean by “formatting”, anyway.
If you would like to see an example of what a whole novel presented in plain text looks like, please go here.
(Oh, and if you’ve ever wondered where the terms “upper case” and “lower case” come from, it’s how those really old-timey book-makers stored individual letters.)
As we’ve already mentioned, old-timey writers wrote their book formatting instructions directly in their manuscript, indicating manually what they wanted to be in bold, and what they wanted to be in italics, just to pick those two popular ways of formatting words in books.
Now, the relationship between writers and book-makers regarding formatting instructions has always been a bit ad hoc, and still is today.
So, the formatting instructions written on the manuscript often just involved the writer circling some words, drawing an arrow pointing at the circle, and then writing in the margin of the page, at the non-pointy end of the arrow, “Put this in italics!”
These instructions, written on the page by the author, are called “markup”.
(Yes, that’s just one letter different from “Markua”. Please adjust your Wordle algorithm accordingly!)
Here is an example of a typewritten manuscript with some “markup” on it:
Now, over time, writers and publishers got bored of spelling things like this out (sorry) explicitly, over and over again.
Eventually, they settled on various conventions for writing down common book formatting instructions in their manuscripts.
This became even more important when people called “editors” got involved, where there was more than one person “marking up” the manuscript, and where there may even have been some back-and-forth between the writer and the editor on the same sheet of paper passed back and forth between them, during the writing process.
One common convention for a writer’s book formatting markup was to underline the words in their manuscript that they wanted to appear in italics in their book. It was easy enough to add an underline feature to a typewriter, which stamped a little line under each letter; typing italics directly in a manuscript would have involved providing a totally separate set of letter keys for the typewriter, which would have been expensive and pointless!
(Which means someone almost certainly did make a typewriter with italics, by the way. If you know of an example, tell us about it and we’ll add a link here!)
Here’s another example: you know how with our magical writing machines these days, we can easily “undo” stuff? Well, you can’t do that with ink and paper. Even erasing something written in pencil leaves a trace; so does something else that seems really weird today but basically was actually a lifesaver in the old days, called Wite-Out.
So, what happens if you’re a writer or an editor, and you cross out a word on a manuscript sheet, and then realize you shouldn’t have?
Well, you could circle the word and draw an arrow pointing at it and then, at the start of the arrow, write something like “Actually, I changed my mind, keep this word.” But manuscript pages would get pretty messy pretty quickly, if writers and editors always resorted to that kind of thing.
So, eventually they settled on a conventional markup instruction for “undo” in this case: as anyone trained in the conventions of the markup that editors use knows, if something like this happens, you write “stet” next to the crossed-out word — because Latin — meaning you basically want to tell the book-maker to actually let the crossed-out word in the manuscript stand, and use it when they produce the book.
Anyway, now imagine eventually you and your editor and your publisher and their book-maker come up with a complete set of these markup conventions, that covers every category of book-related markup instruction you might need to write in your manuscript, before you pass it along to your book-maker: what you’ve got yourself now, my friend, is a whole “markup syntax”, assuming there’s some system to it, and the set of markup instructions you’ve settled on aren’t entirely arbitrary.
Plain Text, Computers Edition
OK, remember earlier when we asked experts in these things to give us a pass, when we talked about handwritten or typewritten things being “plain text”? That was a misleading anachronism, but also a very tempting analogy, and we gave in to that temptation, since the image conveys something important about formatted versus non-formatted letters.
The big question here is: what’s a letter?
Now, we all know what a written or typed letter is. It’s a letter. Duh.
But as we mentioned earlier, common sense can be misleading.
For example, using common sense, answer this simple question:
“How many letters are on the following line?”
A a
Two, duh. An A and an a. Two letters.
Actually no, there’s only one letter, as anyone with common sense can see, retorts your gruff uncle. It’s the first letter of the alphabet. And there can be only one first letter. Put that in your kale latte and smoke it.
Now, this is a classic demonstration of the type-token distinction, which is super fascinating on its own.
But we just used it here, as it applies to talking about what letters are, to make a point: there can actually be a lot more to think about than you think there is, when it comes to what “a letter” actually is.
And that’s especially true when you’re writing using a computer, like we all do nowadays.
Now, this may be surprising, but since all computer code is actually writing, computer people have always taken the question “What’s a letter?” very seriously.
If you’ve got a writing app handy, go ahead and open it up, and hit the “a”. What’s that thing you now see next to the blinking cursor?
Well, it’s a graphical representation of some computer code. When you hit the “a”, your writing app configured something specific on your computer or device that the app reads, and then displays as an “a”, following the instructions written in the app’s code to display that specific something as an “a”, in whatever way the app has been instructed to format and represent the letter “a”.
So what is it that the writing app is configuring, and then “reading”, and presenting graphically in the shape “a”, according to the app’s programmed instructions?
Well, it’s basically a set of 1s and 0s that the app has been programmed to present graphically as an “a”.
What specific set of 1s and 0s counts as an “a”, you might ask?
Well, here’s the thing: in principle, any set of 1s and 0s can be programmed to be represented graphically as an “a” in your writing app.
In some cases, what that specific set of 1s and 0s is may be unique to your writing app. If you hit the “a” in Word, for example, the underlying set of 1s and 0s that Word reads as an “a” is different from what it might be in some other app.
And as crazy as it might sound, guess what: the specific set of 1s and 0s that means “a” in your writing app today can actually be changed in future versions of that particular app!
It’s kind of mind-bendy in a way, but computer code is written in apps too, and the code programmers write in their apps then also needs to be “read”, by the very computing machine being programmed, in order for that coding instruction to be “executed” or carried out properly in the end by the computing machine itself.
(Yup, as magical and otherworldly as they may seem to some, ultimately, computers are just good old dumb machines. Ominous voice: For now.)
Now, since all computer code is written using letters and other characters, it would be a huge problem for computer people, if writing computer code meant the underlying letters were in fact based on an infinite set of arbitrary sequences of 1s and 0s. When it’s reading code, you’d basically need every computer to run a decoder to know “Oh yeah, in this case, this specific sequence of 1s and 0s is an ‘a’”.
Having a single standardized set of sequences of 1s and 0s that represent specific characters was therefore really important for the aforementioned computer people.
That’s why they came up with something called ASCII, which handled English reasonably well, but which didn’t handle languages like Chinese or Japanese at all. So then we got Unicode, but that’s a bit too much to go into for an easygoing article like this.
So, now we have our answer to the question, What’s an “a”?
In ASCII, it’s 01100001.
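If you’d like to check that for yourself, here is a small sketch in Python. The language is just our pick for illustration, and the helper names `to_bits` and `from_bits` are made up for this example; nothing here is part of Markua or Leanpub.

```python
def to_bits(character: str) -> str:
    """Return the 8-digit pattern of 1s and 0s for one ASCII character."""
    # encode() raises UnicodeEncodeError if the character is not in ASCII.
    code = character.encode("ascii")[0]
    return format(code, "08b")


def from_bits(bits: str) -> str:
    """Turn a pattern of 1s and 0s back into the ASCII character it stands for."""
    return bytes([int(bits, 2)]).decode("ascii")


print(to_bits("a"))           # 01100001
print(to_bits("A"))           # 01000001
print(from_bits("01100001"))  # a
```

Notice that the “A” and the “a” from our gruff-uncle quiz differ by exactly one digit in ASCII.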
To see the complete list of ASCII “control characters”, go here.
(OK, we can’t completely ignore Unicode. Although it means getting a bit ahead of ourselves here, what we’re talking about from a technical perspective, and just to be clear that as an author this is something you’d never have to know, is using Unicode in the UTF-8 encoding. ASCII just happens to be a subset of Unicode encoded in UTF-8. So, if you’re writing in English, you can pretend that you’re writing in ASCII, and you’re technically not wrong.)
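Here is a quick way to see that for yourself, again as a hedged sketch in Python rather than anything an author would ever need to run. The sample word and the sample character are our own picks for illustration.

```python
english = "blurst"

# For plain English letters, the ASCII bytes and the UTF-8 bytes are identical.
same = english.encode("ascii") == english.encode("utf-8")
print(same)  # True

# A character outside ASCII still works in UTF-8; it just takes more bytes.
japanese = "本"
print(len(japanese.encode("utf-8")))  # 3
```

Trying `japanese.encode("ascii")` instead would fail with a `UnicodeEncodeError`, which is the “didn’t handle Japanese at all” problem in action.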
What Computer People and Literary Archivists Have in Common
Now, at this point, if you’re into books and writing, you might be thinking something like this:
“I’m ok with pretending all of this is very interesting in a way, but for all us ordinary humans, who cares about plain text, and standardized sets of 1s and 0s that stand for letters and other characters, which only matter as far as computers reading those letters and other characters are concerned? Why shouldn’t authors just use Word or whatever for writing books? After all, a book is not a computer program, and readers and writers are people who know an ‘a’ very well when they see it.”
Clearly, you’re no literary archivist.
The thing is, once authors started writing on computers, literary archivists were confronted with a version of the problem the computer people faced way back in the sixties.
What do you do with a pre-computer-era author’s manuscripts when they die? Well, you get all their papers, and put them safely in some boxes in a room in a building somewhere, where they will be carefully guarded until the end of time by your friendly neighborhood literary archivist.
And what do you do if you’re a biographer or historian or a scholar, and you want to read some dead author’s manuscripts? You get permission to go into that building, and ask an archivist to go into the room and get the manuscripts for you, so you can page through them very carefully, and get distracted trying to guess what the author was eating when they wrote some chapter, if you find funky stains on the pages.
But, as a literary archivist, what would you do with a dead author’s manuscripts if they were all written in, say, Word?
Well, what you’d do is you’d get some boring old computer disks and put them safely in some boxes in a room in a building somewhere, where they would be carefully guarded until the end of time by your friendly neighborhood literary archivist.
Now, put yourself in the shoes of a biographer or historian or whomever, who wants to read those manuscripts.
What if the archivist comes back from that room with a disk and you’re like, what do I look like, a 1984 Apple Macintosh? I can’t read that.
Or, say you do have a working computer from the author’s era, but you don’t have the same version of Word they used?
Now multiply that problem across an author’s lifetime. Even a mildly aging Gen Xer’s manuscript archive could include numerous writing apps, across numerous versions of each app itself, and across 30 years’ worth of storage and computing devices and operating systems, and what have you.
And now add on to that pile of problems the issue of digital obsolescence, and you’ve got a real nightmare for all those archivists and researchers, and basically anyone concerned about the people of the future trying to peer into the digital past.
And that’s why writing books in plain text, in a defined and standardized markup syntax, is so important: computer programs will always be able to read plain text files because in a sense computer programs all come from plain text files.
And if the book formatting markup syntax used in manuscripts is standardized, that means anyone who gets their hands on the standard can interpret all the book-making messages written in the manuscript itself, forever.
Neither of those things is possible if you’re writing in some arbitrary formatted or binary text format, with some ad hoc arbitrary markup shorthand you’ve devised yourself, for indicating something’s a chapter heading, or a figure, or an image that, say, you want centered on the book page.
And yes, for any wags out there, of course you could just explicitly write out “This is a chapter heading,” or whatever, for every single book-production attribute you mark up in your manuscript, with no need for a standardized syntax. But as we said earlier, that would be super boring to do over and over and over again, so no one who actually works in the book world does it. Probably.
Good proof that people who write and format text for publication, and hence for presentation to other people, won’t put up with writing out lengthy repetitive instructions for formatting things, is something called Markdown, that was invented by a person named John Gruber.
Now, this John Gruber person very much enjoys writing and publishing articles for presentation to other people on screens on the web.
But what John Gruber evidently does not enjoy is writing out at length all the formatting instructions that computers displaying things on screens on the web need to know, things like, “I want this word to be displayed on the web in italics”, or “I want this word to link to this website”.
In case you’re unfamiliar with it, web pages are written in plain text in a standardized “language” called HTML.
“HTML” stands for — wait for it — HyperText Markup Language.
So, let’s say you were writing a blog post, and you wanted to type a sentence that would be displayed to people like this:
It was the best of times, it was the blurst of times.
Seems simple, right? Just type the sentence, highlight the word “blurst”, and select “Italics” from some menu? Right?
Well, you can’t do that in plain text, because the letters making up the word “blurst” are plain text, and can’t contain any hidden information about how they’re supposed to be formatted for display to other people.
So, in HTML, at one time anyway, you would type this:
It was the best of times, it was the <i>blurst</i> of times.
But, the whole “<i>” part made some web standards people really mad, since that wasn’t “semantic” enough.
So, ahem, in HTML, you would type this:
It was the best of times, it was the <em>blurst</em> of times.
In HTML, the <em> means “OK computer, start formatting letters in emphasis here”, and the </em>, the one with the slash in it — the / — means “stop formatting letters in emphasis here”. And “emphasis” usually means “italic”. But don’t say italic.
(And if you think that’s bad, some people have some <strong> opinions about the similarly “unsemantic” <b> tag for bold…)
Things get even more tedious if you want to make the word a link.
Let’s say that instead of the word “blurst” being in italics, you wanted it to be a link to a web page, say, for example, where people could view a related video clip.
Seems simple, right? Just type the sentence, highlight the word “blurst”, right-click it and select “Hyperlink…”, right? Then in the end it would look something like this on the screen:
It was the best of times, it was the blurst of times.
Actually, here’s how you’d have to do it if you were writing it in HTML:
It was the best of times, it was the <a href="https://youtu.be/no_elVGGgW8">blurst</a> of times.
The <a href="https://youtu.be/no_elVGGgW8"> part indicates “OK computer, start formatting letters here as a link to the web page address, the one I typed in between the quotation marks”, and the closing part with the slash, </a>, indicates “stop formatting letters as a link here”.
Well, John Gruber apparently got really tired of doing that all the time, and one day he was like hey, wait a minute, computers can interpret instructions, that’s how all programs work, so why can’t I just get them to read some more basic markup syntax I invent, and interpret that as the right HTML markup, like “make this italics” or “make this a link”?
When he invented this markup syntax for writing text for publication on web pages, and called it Markdown (get it?), he decided asterisks would be a good substitute for <em> and </em>.
So, in Markdown, if you want the word “blurst” to show up on a web page in italics, in your writing app, you surround it with asterisks, like this:
It was the best of times, it was the *blurst* of times.
For linking words to web pages, he was like, square brackets and round brackets can handle that just as well as all this “a href” nonsense.
So, if you want the word “blurst” to show up on a web page as a link to another web page, you do this:
It was the best of times, it was the [blurst](https://youtu.be/no_elVGGgW8) of times.
The part in the square brackets — [blurst] — is the word people will see as a link they can click on, and the part in round brackets after — (https://youtu.be/no_elVGGgW8) — is where clicking the link will take them.
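To make the “get the computer to interpret it” idea concrete, here is a toy sketch in Python that turns just these two bits of Markdown into the HTML shown earlier. To be clear, this is not how Gruber’s Markdown or any real Markdown processor is implemented; the function name `toy_markdown_to_html` is made up for this example, and it only understands asterisks and simple links.

```python
import re


def toy_markdown_to_html(text: str) -> str:
    """Convert *emphasis* and [label](url) links to HTML. A toy, not real Markdown."""
    # [label](url)  ->  <a href="url">label</a>
    text = re.sub(r"\[([^\]]+)\]\(([^)\s]+)\)", r'<a href="\2">\1</a>', text)
    # *words*  ->  <em>words</em>
    text = re.sub(r"\*([^*\n]+)\*", r"<em>\1</em>", text)
    return text


print(toy_markdown_to_html("It was the best of times, it was the *blurst* of times."))
# It was the best of times, it was the <em>blurst</em> of times.

print(toy_markdown_to_html(
    "It was the best of times, it was the [blurst](https://youtu.be/no_elVGGgW8) of times."
))
# It was the best of times, it was the <a href="https://youtu.be/no_elVGGgW8">blurst</a> of times.
```

The point is only that the shorthand and the HTML carry the same instruction, so a program can translate one into the other and the writer never has to type the long version.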
Gruber then went on and came up with systematically similar shorthand formatting conventions for all kinds of other things you’d typically want to do in a blog post, like adding images and other stuff.
Markdown is popular because it’s so easy and saves bloggers so much time. Since the syntax is standardized and public, and since it’s written in plain text, it can be used by anyone making any app intended for other people to use for writing stuff that they want to display to other people on a screen somewhere, even if it’s not specifically for a blog post, or anything like that.
So, What is Markua?
All right, with all that behind us, we can now go ahead and answer this question, saying:
Markua is basically Markdown for books. It’s a standardized plain text book manuscript markup syntax, which authors and publishers can use to produce formatted books from plain text manuscript documents.
Anyone who gets their hands on the open Markua standard can interpret all the book-related formatting instructions written in a plain text manuscript document using Markua markup, and produce a properly-formatted book from the manuscript in exactly the way the manuscript author (or its editors) intended.
And that’s not just the case for ebooks, by the way: Markua can also be used to create the digital files that are used by machines nowadays to produce print books, too.
(Yes, all paper books are made from digital files nowadays, except for the exceptions. The same kind of thing applies to “ordering books online”, incidentally; you don’t really think your local bookstore people are like filling out paper forms and sending them off by post, to order the books you ask them for, or to order any of the books on sale in their shops, right?)
If authors and publishers adopted Markua as a standard for the book industry, this would make book writing and production way more efficient. Nobody would have to keep converting from one thing to another, or compromise on what app to use together, and then try to keep up with everything over time, as all the apps inevitably change.
And it would make those literary archivists, and the historians of the future, really happy, too.
Writing in Markua
Just like with Markdown, Markua is meant to make things simpler for book writers.
A key point that’s kind of hard to explain until you’re familiar with it, is that when you’re writing in plain text, you don’t need to learn a new app’s user interface to know how to do things (are the “Styles” under “Edit” or “Format” or what?!?), like how to make a chapter heading, or other features of books like that.
Typically, with Markua, you write your book manuscript in “.txt” documents — digital files with a file name and then “.txt” at the end, like “chapter-1.txt” (or whatever you want to call it). This is a plain text format that can be read on any computer.
(Incidentally, on Leanpub, you can also write a book in a web browser, where you don’t need to handle any files.)
A good place to start is with the first thing you’ll typically do when you start writing a book, which is to start your first chapter.
When you’re writing in Markua, if you want to indicate that something is a chapter title, in your .txt document, all you do is type a #, a space, and then the chapter title on a line, like this:
# Chapter One
And if you want to put words in italics, you do it just like in Markdown, surrounding the italicized word or words with asterisks, like this:
It was the best of times, it was the *blurst* of times.
To see an actual .txt document with a single chapter, showing this example, please click this download link and open the .txt file on your computer.
Here’s what that would automatically look like in a PDF ebook file:
That page would appear similarly in an EPUB file (for most ebook readers and apps) or a MOBI file (for Kindle readers and the Kindle app), although how it would appear exactly would depend on your own personal settings (dark mode, font size, etc.).
When you generate book files from your manuscript, the Markua “processor” you’re using (like Leanpub’s book generators) will automatically “see” all the chapter headings and create a Table of Contents for you at the beginning of your book, like this:
And with that, believe it or not, you actually know pretty much everything you’ll need to know to write most novels in plain text, using the Markua book formatting markup syntax!
There’s a lot more to some books than that, though. For example, some books have indexes at the end, or images, or figures and lists of figures, or a section at the beginning with the page numbers in Roman numerals, and lots of other stuff like that.
For examples of all the book-type formatting and features that Markua can do, please see the manual here.
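As a rough illustration of the table-of-contents step described above, here is a sketch in Python of how a processor might “see” chapter headings in a plain text manuscript. This is our own simplified guess, not Leanpub’s actual book generator: the function name `chapter_titles` and the sample manuscript are made up, and the sketch assumes that only lines starting with a single # and a space count as chapter titles.

```python
def chapter_titles(manuscript: str) -> list[str]:
    """Collect the chapter titles from a plain text manuscript."""
    titles = []
    for line in manuscript.splitlines():
        # A chapter heading is a "#", then a space, then the title.
        if line.startswith("# "):
            titles.append(line[2:].strip())
    return titles


# A made-up two-chapter manuscript, as it might appear in a .txt file.
manuscript = """# Chapter One

It was the best of times, it was the *blurst* of times.

# Chapter Two

It was the age of wisdom.
"""

print(chapter_titles(manuscript))  # ['Chapter One', 'Chapter Two']
```

A real processor has a lot more to worry about, but the basic idea holds: because the headings are marked up in a standard way in plain text, any program can find them.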
OK, What Now?
It’s easy to get started writing a book in Markua! Just go here to create a new book:
You may want to start by choosing the Browser writing mode. (It’s free, by the way; just make sure to select the “Free” plan.)
If you scroll through this tutorial for the Browser writing mode, you’ll learn everything you need to know about writing an ordinary book or novel manuscript in Markua, and how to publish it on the Leanpub bookstore when you’re ready.
When you create a new book, some default Markua content will be added to get you going.
And if you’d like to download a zip file with a manuscript folder and some .txt documents that show you how things work if you’re writing in files on your computer (as opposed to writing in a browser), please click this download link.
You can find easy-to-follow, comprehensive walkthroughs for all of Leanpub’s writing modes here.
Finally, Some Technical Stuff
The complete Markua manual is free to read online here:
…and the full Markua specification is here:
Finally, For Real This Time
The history of the book is a fascinating subject in its own right, and of course goes back way before typewriters and even pens, which is where we dropped into the timeline at the beginning of this article. If you’re interested in the real story, you may want to start your journey on Wikipedia here or here.
All right, that’s it!