TEI Encoding by Edith M. Dunlap
While practicing TEI encoding for lab 1b (on one of the Dorr letters), I started to feel that this whole encoding thing might not be so bad, until I remembered that we will have to deal with more complicated types of encoding in the future. Nonetheless, I enjoyed this lab because it was easy and because it let me practice using some of the elements on our elements list.
Since we were dealing with what Liu calls a “facsimile” (natural expressions of a material model of text) of a letter that is physically housed in the Phillips Memorial Library and written in (what I call) fancy 1834 handwriting, I was extremely thankful for the transcription that the library made available to readers. While encoding, I was presented with many choices about what to include in the code, as is typical with TEI. For example, I had to decide which names needed to be encoded and whether they were place names, organization names, or personal names. These decisions can change depending on the kind of text or document being encoded. A piece of (traditional) prose needs few line breaks, whereas a poem requires many more (as do the Dorr letters, although they are not poetry). Prose also rarely calls for salutations or signatures, unless a letter or another piece of text that requires these elements is embedded within it.
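To make these choices concrete, here is a minimal sketch of how a short letter might be tagged with TEI elements for names, line breaks, a salutation, and a signature. Every name, the date, and all of the wording are invented placeholders, not text from the Dorr letters, and the exact elements a project uses depend on its own guidelines.

```xml
<!-- Hypothetical letter fragment: names, date, and wording are invented placeholders. -->
<div type="letter">
  <opener>
    <dateline><placeName>Providence</placeName>, <date when="1834-01-01">1 January 1834</date></dateline>
    <salute>Dear Sir,</salute>
  </opener>
  <!-- Each <lb/> records a line break in the handwritten original. -->
  <p>I write on behalf of the <orgName>Example Society</orgName><lb/>
    to thank <persName>Mr. John Smith</persName> for his visit<lb/>
    to <placeName>Boston</placeName> last week.</p>
  <closer>
    <salute>Yours truly,</salute>
    <signed><persName>A. Writer</persName></signed>
  </closer>
</div>
```

A piece of ordinary prose would likely need only the paragraph, without the opener, the closer, or most of the line breaks.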
Interpretation is a huge part of encoding, especially when it comes to TEI. Depending on who is doing the encoding, the relevance of certain elements could differ, causing some things to be left out and others to be included. Different people also view things differently, which means that some elements could be substituted for others. For example, one encoder might choose one element where another encoder would choose a closely related one. Although the two have subtle differences, it would be fairly easy to use them interchangeably.
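As one illustration of this kind of substitution, the sketch below tags the same invented name in two ways. This pair is my own choice of example and may not be the pair of elements meant above. In the TEI Guidelines, the more specific element is described as a shorthand for the general one with a type attribute.

```xml
<!-- Two encodings of the same invented name; either could be a defensible choice. -->
<p>
  <!-- General-purpose name element, narrowed with a type attribute -->
  <name type="person">John Smith</name>
  <!-- Dedicated personal-name element -->
  <persName>John Smith</persName>
</p>
```

Two encoders who made different choices here would produce files that look different but record nearly the same interpretation.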
Digitally representing a physical object was not as difficult as I had expected. In fact, I only came across a couple of problems while encoding the Dorr letters. One task that was more tedious than difficult was creating a line break for each and every line of the facsimile. Something that was difficult, though, was remembering whether each element should be on its own line. At first, I kept things like organization names on the same line as the surrounding text. However, after checking the way the library encoded the document, I saw that the library placed these elements on their own lines, so I followed that convention. Having the transcription on hand made this process far less hectic than it could have been, because I could copy and paste the lines.
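The layout question can be sketched with an invented sentence written both ways. The wording and the organization name are placeholders. As far as I understand, both versions contain the same elements and differ only in whitespace, which many tools normalize, so putting an element on its own line is best treated as a readability convention that a project may adopt.

```xml
<!-- Hypothetical sentence, not from the Dorr letters, shown in two layouts. -->
<div>
  <!-- Layout 1: the organization name stays inline with the surrounding text -->
  <p>a meeting of the <orgName>Example Society</orgName> was held<lb/></p>

  <!-- Layout 2: the organization name sits on its own line in the file -->
  <p>a meeting of the
    <orgName>Example Society</orgName>
    was held<lb/></p>
</div>
```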