Wednesday, March 13, 2013

The Line-Length Problem

The bane of publishing code for consumption on a variety of platforms is that the available horizontal space varies.  I've blogged elsewhere that I want to avoid horizontal scrolling or bad line breaks in code, and I'm working with my publisher on how to do that. I'd like your help, too.

My understanding is that on Kindle and iPad (the platforms for which I currently have some data), the size of the text you see depends on both the font size specified in the document's CSS (which you, as a reader, typically can't control) and the font size specified for the device (which you, as a reader, typically can).  The response to my earlier post about font choices showed a marked preference for code in a fixed-pitch font, so that's what I plan to use in Effective C++11. I've received the following information regarding how many characters fit on a line in Kindle and iPad in various combinations of device and CSS font sizes and device orientations:
It's interesting that on iPad, using the device in landscape mode shows two columns instead of one, thus providing less horizontal space per line. As an author, this means I actually have more room to work with when the device is used in portrait mode.

As you can see, if I limit my code displays to 45 characters per line, that should display without problems under all but two combinations of settings above.  I think that 45 characters per line would look strange on devices with more horizontal room, however, and the data also show that for many combinations of settings, I could use up to 60 characters per line (which is about what I'd have in a printed book).  I'm not a fan of lowest-common-denominator constraint satisfaction: I don't want to penalize people whose devices and settings allow wider lines for the benefit of people whose devices and settings allow only narrower ones. My thought is therefore that I'll format my code displays twice, once with no more than 45 characters/line and once with up to 60. As an example of what that could mean in real life, here's some sample code from Item 3 of the current (third) edition of Effective C++. As is all code in that book, it's in a proportional font:

Here it is formatted in a fixed-pitch font with no more than 60 characters/line:
class TextBlock {  
public:
  ...  

  const char&
  operator[](std::size_t position) const   // operator[] for
  { return text[position]; }               // const objects

  char&
  operator[](std::size_t position)       // operator[] for
  { return text[position]; }             // non-const objects

private:
  std::string text;  
};
And here it is again with no more than 45 characters/line:
class TextBlock {  
public:
  ...  

  // operator[] for const objects
  const char&
  operator[](std::size_t position) const
  { return text[position]; }

  // operator[] for non-const objects
  char& operator[](std::size_t position)
  { return text[position]; }

private:
  std::string text;  
};
Do you think it's worth my formatting code displays twice, once for wide lines and once for narrow ones, or do you think that using narrow formatting everywhere would suffice? Don't worry about how much work it is for me. That's my problem. Focus on what would work better for you.
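
Whichever limits end up being used, checking a code display against them is mechanical. Here is a minimal sketch of such a check. The function name overlongLines is hypothetical, and the code assumes plain ASCII text in a fixed-pitch font, so that one byte corresponds to one character cell:

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Returns the 1-based numbers of the lines in `code`
// that are longer than `limit` characters.
// Assumption: ASCII text, so size() counts characters.
std::vector<std::size_t>
overlongLines(const std::string& code, std::size_t limit)
{
  assert(limit > 0);   // a zero limit is surely a mistake

  std::vector<std::size_t> result;
  std::istringstream in(code);
  std::string line;
  std::size_t number = 0;

  while (std::getline(in, line)) {
    ++number;
    if (line.size() > limit) result.push_back(number);
  }
  return result;
}
```

Running each display through overlongLines with a limit of 45 and again with a limit of 60 would show which lines need to be rebroken for the narrow and wide versions, respectively.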

Assuming for the moment that formatting the code twice is preferable, there's a logistical issue that has to be addressed, namely, how to write a single manuscript that can generate documents with one of two sets of code displays. My plan had been to use Microsoft Word and to use conditional text to switch between code displays, i.e., to set up "wide" and "narrow" configurations and hide the code displays that did not correspond to the current configuration. Alas, Microsoft Word 2010 (the version I'm using) lacks support for conditional text, something that quite surprised me, because both FrameMaker and OpenOffice/LibreOffice have had it for years.  Switching to a different document authoring system leads to new problems, because the publication process for my book is likely to use Microsoft Word as the point of entry. Even if I produce my manuscript using, say, OpenOffice, it will probably be converted into Word as step 0, so anything Word can't represent is likely to be troublesome. (Before you bombard me with suggestions to use LaTeX or some other markup language, I'm on record as viewing those as inferior to WYSIWYG systems, as I detail here.)
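
To make the conditional-text idea concrete, here is a minimal sketch of the selection step, done outside Word. Everything in it is an assumption for illustration: the marker lines @wide, @narrow, and @end and the function selectVariant are hypothetical, and the manuscript is treated as plain text, which a Word document is not.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical markers, one per line in the manuscript:
//   @wide ... @end     display for the "wide" config
//   @narrow ... @end   display for the "narrow" config
// Returns the manuscript with only the displays for
// `config` kept; text outside any block is always kept.
std::string selectVariant(const std::string& manuscript,
                          const std::string& config)
{
  assert(config == "wide" || config == "narrow");

  std::istringstream in(manuscript);
  std::string out;
  std::string line;
  std::string current;   // empty while outside a block

  while (std::getline(in, line)) {
    if (line == "@wide" || line == "@narrow") {
      current = line.substr(1);        // drop the '@'
    } else if (line == "@end") {
      current.clear();
    } else if (current.empty() || current == config) {
      out += line;
      out += '\n';
    }
  }
  return out;
}
```

Both versions of each display live side by side in the one manuscript, and calling selectVariant with "wide" or "narrow" produces the document for the corresponding configuration. That is the behavior I had hoped to get from conditional text.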

Do you have any ideas about how I should approach the production of code displays that look good on all "reasonable" publication platforms and that can reasonably be produced and maintained by my authoring tool, which is highly likely to be Word 2010?

Thanks,

Scott


16 comments:

DeadMG said...

I absolutely think that formatting your code again for 60+char users would be appreciated. I don't know exactly how much effort goes into a reformatting, but it is very frustrating to have large viewing areas and not be able to make good use of them, as presumably the user paid additional money for a larger device.

Unknown said...
This comment has been removed by the author.
Anonymous said...

Yes, two formats would be worthwhile. Since Word is a given, you could mark code sections by style and loop through them, calling out to an indenter (and a syntax highlighter like VS if needed) to reformat them according to the new settings you want. You could adapt this approach to check for syntax errors as well.

Neil said...

I'd be disappointed with a 45-char length if my device supported longer. Though I will be purchasing the printed book, so I'm not sure it matters. (I'm assuming you'll make full use of the printed book's available space.)

As for the Word conditional-text problem, have you looked into fields and macros? I'm wondering if you could have placeholder fields throughout the document for all the code listings, and then run a macro which would insert the appropriate text into each field based on some condition, such as user input or some document property.

Once you run the macro once, the text would stay in the fields so you would be able to see the code listings in whatever format you most recently ran the macro for, rather than having to imagine what code listing goes where.

The macro could read the actual code text from some other file or database to keep it manageable, since I doubt you'd want to embed the code strings into the macro!

Zahir said...

I really don't know about the formatting, but trailing comments were the real reason I gave up on the Kindle version and went for the hard copy.

Anonymous said...

The 60-character version is 14 lines; the 45-character version is 15 lines. Is the extra work really worth it?

I think I'll stick to portrait mode on my Kindle. I'd rather have code and comment on the same page than have beautiful code and the comment on the next page.

Shon's Blog said...

I generally read in portrait mode on my iPad, and would definitely prefer the 60+ chars/line formatting.

Trailing comments on the same line are a big help in understanding code examples in books.

Scott Meyers said...

@Neil: A printed book is a wide platform, so if there is special support for wide platforms, the book will get it. If we decide to go with a single-width code display for all platforms, the printed book will get the same width as all other platforms. Practically speaking, there is one document (the manuscript I produce), and there are multiple output devices (platforms). I'm trying to figure out what needs to go into the document such that it displays well on all output devices. Print books are not special. They're just one of several target output devices.

Regarding fields and macros, I have looked into them a bit, but my impression is that they would be hard to work with for code displays. I definitely want to edit the source code in the document, because I need to apply formatting, plus I choose line breaks and even identifier names based on what will fit in the available horizontal space. Writing code for a book is not the same as writing code for a software project, because code for a book (1) has to fit in the available horizontal space, (2) should avoid bad page breaks, and (3) should correspond to and support the surrounding prose. Code in a real project has as its primary goal to have the desired behavior. Code in a book has as its primary goal to explain something.

Scott Meyers said...

@Zahir: Do you mean that trailing comments did not display properly on your Kindle? If so, was the problem that they wrapped?

I can't speak for other authors, but I can tell you that my books were designed to look good in only one context: print (or PDF, which is essentially the same as print, in terms of appearance). It's thus no surprise that they don't necessarily look good in other contexts, e.g., narrower layout formats. Hence my goal for this project: to increase the likelihood that the book will look good in a range of reading contexts.

Scott Meyers said...

@Anonymous: Whether it's worth the trouble to format code displays twice is something I'm trying to decide :-)

Anonymous said...

Either choice seems equally fine to me as long as there is no horizontal scrolling or line breaks. I always prefer portrait mode anyway.

Tom Panning said...

I vote for both line lengths. I find the wide version much more readable. I would also probably want access to both formats: wide for my iPad, and narrow for my Nexus 4, since I read programming books on both.

Anonymous said...
This comment has been removed by a blog administrator.
Bart Vandewoestyne said...

I vote for only one version of the code, the one with the comments above the code instead of in the form of endline comments.

Chapter '32.5 Commenting Techniques' in Steve McConnell's Code Complete, 2nd edition, has a nice discussion on why endline comments are less preferable. I quite agree with what he writes.

Scott Meyers said...

@Bart Vandewoestyne: McConnell's objections to endline comments are reasonable for real code, but they are less appropriate for code in books designed to explain things. For example, the time needed to format the comments at the end of the line is no different from the time taken to format anything else for maximum clarity, and the demands of maintaining the comments in a code base (the book) that is unlikely to change over time can be ignored. Furthermore, real code is generally viewed on computer screens where the content can be scrolled so that the viewer sees the context that is most important. Printed books have no such capability, and even digital books are often broken into unscrollable pages. This means that the best way to minimize the likelihood that a page break occurs in a bad place is to avoid techniques that spread code across too many lines. Endline comments are one way to achieve that.

As I noted in a previous comment, writing code for production software and writing code for books are two different tasks, and techniques suitable for one may not be suitable for the other.

Anonymous said...

I have returned a number of great programming books because of the formatting, so the effort you're putting in will be greatly appreciated.