C06, was G3E

From BNC file G3E:  P.D. James, _Death of an Expert Witness_, Sphere
Books Ltd, London, 1979.


As a BNC file, this text contains a number of linguistic oddities which
proved, when compared with a printed copy of the same edition of the
novel, to have been introduced in the process of creating the BNC electronic
file.  Some of them (for instance an inverted comma on one line which
is recorded in the BNC file as a comma in the line above) are undoubtedly
optical-character-recognition problems; others, such as the introduction
of paragraph breaks where no breaks exist in the original typography,
are more mysterious.  (It is possible, in the case of complex dialogue
containing internal quotations, that the BNC compilation process attempted
to "normalize" the paragraphing via some algorithm which gave unsuitable
answers in certain cases, though it is far from clear that this is the
explanation.  A query we sent to the BNC discussion list received no
substantive answer.)

These corrections were possible only because we happened to have our
own copy of the P.D. James whodunnit.  They may give an impression of the
extent of uncorrected typographical errors in other BNC texts.


The LUCY file corrects the errors found, as follows:

01937  comma after "fortunate" is full stop in BNC file

01947  "punished the books":  "books" is "boo's" in BNC file

01950  comma after "Lorrimer's death" is full stop in BNC file

01955  "The Times" not marked as italicized in BNC file

01958  BNC omits comma after "Stephen Copley"

01978-9  Before "Inference", BNC inserts a paragraph break, </p><p>,
    which does not occur in the original source; the paragraph
    break is followed in BNC by a &quot symbol, apparently
    intended as an open-quotation mark, but in fact the internal
    quotation closes before "Inference", and that word and what
    follows are within the top-level quotation only, which
    terminates after "a word you say."

01997  BNC introduces a spurious paragraph break between
    "Middlemass." and "By".

01998  BNC introduces a spurious paragraph break between
    "jury?" and "Suppose".

02015  BNC introduces a spurious comma after "manifestly".

02024-5  BNC introduces a spurious paragraph break and (open) &quot mark
    before "It's a game".

02030  "enough" reads "&mdash;enough" in BNC.

02030  BNC omits (closing, single) inverted comma after "makes them up."
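
Several of the corrections above have the same shape: a </p><p> break,
sometimes followed by a &quot mark, removed from between two known stretches
of text.  The following is a minimal sketch of how such a correction might be
applied mechanically.  It is not the procedure actually used for the LUCY
file; the function name remove_spurious_break, its parameters, and the sample
strings are all hypothetical, and the sketch assumes the break appears
literally as </p><p> with optional surrounding whitespace.

```python
import re

def remove_spurious_break(text, before, after, drop_quote=False):
    """Join `before` and `after` with a single space where they are
    separated only by a </p><p> break (hypothetical helper).

    If drop_quote is True, a &quot mark (with or without a trailing
    semicolon) directly after the break is removed as well.
    """
    quote = r"(?:&quot;?\s*)?" if drop_quote else ""
    pattern = re.compile(
        re.escape(before) + r"\s*</p>\s*<p>\s*" + quote + re.escape(after)
    )
    # A function is used as the replacement so that any backslashes in
    # `before` or `after` are inserted literally.
    return pattern.sub(lambda m: before + " " + after, text)

# Invented sample strings, not quotations from the BNC file.
sample = "... Middlemass.</p><p>By ..."
fixed = remove_spurious_break(sample, "Middlemass.", "By")

quoted = "... A.</p>\n<p>&quot;It's a game ..."
fixed_quoted = remove_spurious_break(quoted, "A.", "It's a game", drop_quote=True)
```

Because the pattern is anchored on the text either side of the break, genuine
paragraph breaks elsewhere in the file are left alone; each correction still
has to be identified by hand against the printed copy.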
