READ ME:
This document describes the process for creating the on-screen (HTML/CSS) and HTML -> PDF versions of the Federal Reporter.
Federal Reporter 1.0 has the following technology dependencies:
- Font-embedding (font-face)
- Required to use Century, the official document font of the United States Court System. [link]
- PrinceXML
- Required to produce pretty-print PDFs that work cross-platform and have reliable pagination for citation purposes. [link]
THE HTML/CSS VERSION: To produce the HTML/CSS version of the Federal Reporter as in our examples, we recommend obtaining a high-quality font in the Century family to ensure maximum legibility. We chose Century Modern FS by FontSpring for its superior quality and its clear, concise End User License Agreement (a one-time $16.95 purchase fee, plus a $5 @font-face license for use on unlimited websites). We also chose Manuskript Gotisch as a decorative font, in keeping with the spirit of the original Federal Reporter documents. We believe this use to be within the scope of the current EULA. A high-quality public domain version of Century was not available at the time of this writing. As an alternative, Times Roman is an adequate substitute and is the default in most modern web browsers.
UPDATE 02.22.11: Using Cocos Old to complement the Blackletter font used elsewhere in the documents.
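As a rough illustration of the font embedding described above, a minimal @font-face sketch might look like the following. The file paths and formats are hypothetical placeholders, not the names shipped with the licensed font, and the fallback stack reflects the Times Roman substitute mentioned above.

```css
/* Sketch only: the font file paths below are assumed, not the actual licensed files. */
@font-face {
  font-family: "Century Modern FS";   /* name used to refer to the font in later rules */
  src: url("fonts/century-modern.woff") format("woff"),
       url("fonts/century-modern.ttf") format("truetype");
  font-weight: normal;
  font-style: normal;
}

body {
  /* Fall back to Times if the embedded font cannot be loaded. */
  font-family: "Century Modern FS", "Times New Roman", Times, serif;
}
```

A decorative face such as Manuskript Gotisch would need its own @font-face rule, applied only to the elements that use it.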
Once the proper typefaces are obtained, a stylesheet must be developed to support all of the default elements in the base HTML. As an example, we will use 0001.f.0001.html. The original, unmodified version of this file can be viewed here. Because the stylesheet must be applied across the entire collection of documents (approx. 9,000), modifications to the base HTML must be kept to a minimum. A diff of the two example files reveals three modifications to the HTML:
- 1. <link rel="stylesheet" href="screen.css" type="text/css" />
- Required to apply custom CSS.
- 2. <p class="page"> changed to <span class="page">
- Required to denote page breaks inline.
- 3. <span class="num">¶ #</span>
- Required to add paragraph indicators along the left margin.
The stylesheet, screen.css, was created based on all of the default elements in the base HTML, plus the necessary modifications outlined above.
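The rules below sketch how screen.css might style the two modified elements. This is not the project's actual stylesheet: the colors, sizes, and offsets are illustrative, and the sketch assumes each <span class="num"> sits inside a <p>.

```css
/* Inline page-break marker: <span class="page"> stays within the text flow. */
span.page {
  color: #888;
  font-size: 0.8em;
}

/* Leave room on the left for the paragraph indicators (assumed width). */
body {
  margin-left: 5em;
}

/* Make each paragraph the positioning context for its indicator. */
p {
  position: relative;
}

/* Paragraph indicator: <span class="num"> is pulled into the left margin. */
span.num {
  position: absolute;
  left: -4em;
  width: 3em;
  text-align: right;
  font-size: 0.8em;
  color: #888;
}
```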
THE PRETTY PRINT VERSION (PrinceXML): The stylesheet for the HTML -> PDF conversion, print.css, is based upon the preliminary style work for the HTML/CSS version (i.e. screen.css) with the addition of style formatting techniques that apply to a paged-media environment.
Unfortunately, modern web browsers are not equipped to handle paged media properly. An interim solution was therefore required to produce PDF versions that render in any web browser or on any mobile device, and that can be downloaded in bulk.
PrinceXML was chosen for this task.
Our "pretty print" PDFs were achieved through the following workflow:
- 1. Download and install PrinceXML
- See the PrinceXML README file for instructions.
- 2. Create a "print.css" file based on "screen.css"
- Add page-size parameters, headers and footers.
- 3. Generate a test set of documents using the following command:
prince 0001.f.0001.html -o 0001.f.0001.pdf
Please note: PrinceXML requires valid XHTML in order to perform an HTML -> PDF conversion. It is possible, but not probable, that the base HTML files will need to be scrubbed first.
- 4. Modify the .page style declaration in print.css to include "page-break-before:always"
- This will reproduce the page breaks of the original scan PDF (see the original scan PDF for reference).
- 5. Fine-tune the <body> font size and line-height (and any other relevant style declarations and page parameters) until the page breaks match the original scan PDF exactly.
- It will take several iterations until you achieve the desired results.
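Steps 2, 4 and 5 of the workflow above could be sketched in print.css roughly as follows. All values here are assumptions for illustration: the page size, margins, header text, and font metrics would need to be tuned against the original scan PDF. The display: block declaration is also an assumption, added because CSS 2.1 applies page-break properties to block-level boxes, while .page is an inline <span> in the modified HTML.

```css
/* Step 2: page-size parameters, header and footer (all values assumed). */
@page {
  size: 6in 9in;                      /* placeholder trim size */
  margin: 0.75in 0.75in 1in 0.75in;

  @top-center {
    content: "THE FEDERAL REPORTER";  /* placeholder header text */
    font-size: 9pt;
  }

  @bottom-center {
    content: counter(page);           /* running page number */
    font-size: 9pt;
  }
}

/* Step 4: force a new page wherever the source marks a page break. */
.page {
  display: block;                     /* assumption: see the note above */
  page-break-before: always;
}

/* Step 5: adjust these until the breaks match the original scan PDF. */
body {
  font-size: 10pt;
  line-height: 1.3;
}
```

Because the forced breaks fix where each page starts, the font size and line-height mainly determine whether each page's text fits without spilling onto an extra page.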
TO DO:
- 1. Make the headers the same on the right- and left-facing pages on the pretty print version.
- (See 0001.f.0001.pdf for reference.)
- 2. Remove the attribution line from the footer of the pretty print PDF on every page except the last.
- (See 0001.f.0001.pdf for reference.)
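One possible approach to item 1, offered as an untested sketch rather than a confirmed fix: declare the header once on the generic @page rule so that left- and right-facing pages inherit the same content, and limit any :left / :right rules to properties that are meant to differ. The header text and margin values are placeholders. This sketch does not address item 2.

```css
/* One header for every page, declared on the generic @page rule. */
@page {
  @top-center {
    content: "THE FEDERAL REPORTER";  /* placeholder header text */
  }
}

/* If facing pages need mirrored margins, vary only the margins here. */
@page :left {
  margin-left: 1in;
  margin-right: 0.75in;
}

@page :right {
  margin-left: 0.75in;
  margin-right: 1in;
}
```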