-
Presentation of *Digital Muqtabas* at conference 'Books in Motion' in Beirut
-
Workflow: Mark-up of page breaks
The following semi-automatic workflow makes it possible to quickly mark up page breaks and link them to facsimiles. At the absolute minimum, changes should be committed to GitHub after every step. Note that running XSLT 2.0 stylesheets requires a processing engine, most likely the open-source Saxon processor. It comes built into many commercial XML editing tools, but I am currently not aware of any open-source and free-of-cost implementation. But as this project is envisioned as a collaboration, let’s collaborate: as long as someone commits TEI files with manually added page breaks (see step 2.1 below) and then sends us a pull request, another collaborator with access to commercial software can run the transformations.

1. Mark-up

1.1 Page breaks

Page breaks are recorded with the empty milestone element <pb/>. If a page break separates block-level elements such as <div>, <p> or <lg>, the empty <pb/> is placed between the two elements and on the same level within the XML tree.

    <pb/>
    <div>
        <p>Some text in a paragraph <pb/> that spans across pages</p>
        <p>Some text in a paragraph that does not span across pages</p>
        <pb/>
        <p> <!-- --> </p>
    </div>
    <pb/>
    <div>
        <p></p>
        <!-- -->
    </div>

Page breaks found in al-maktaba al-shāmila do not correspond to those in the original printed copies. They were therefore marked as <pb ed="shamila">. Page breaks corresponding to the original printed edition are identified by @ed="print". Dār Ṣādir in Beirut published a reprint in 1992, which is entirely unmarked as such but for the information on the binding...
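The @ed convention for page breaks described in this post can be checked with a few lines of code. What follows is only a minimal sketch in Python, not part of the project's XSLT workflow: the sample fragment, the function name `count_page_breaks` and the "(no @ed)" label are made up for illustration, and the sketch assumes the files use the standard TEI namespace.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Standard TEI namespace; assumed here, adjust if the files differ.
TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical sample fragment, modelled on the mark-up rules in the post.
SAMPLE = """<body xmlns="http://www.tei-c.org/ns/1.0">
  <pb ed="print"/>
  <div>
    <p>Some text in a paragraph <pb ed="print"/> that spans
       <pb ed="shamila"/> across pages</p>
    <p>Some text in a paragraph that does not span across pages</p>
    <pb ed="print"/>
    <p/>
  </div>
  <pb/>
</body>"""


def count_page_breaks(xml_text):
    """Tally <pb/> elements by the value of their @ed attribute."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    # iter() walks the whole tree, so <pb/> inside <p> is found as well.
    for pb in root.iter(f"{{{TEI_NS}}}pb"):
        counts[pb.get("ed", "(no @ed)")] += 1
    return dict(counts)


counts = count_page_breaks(SAMPLE)
print(counts)
```

A tally like this makes it easy to spot page breaks that still lack an @ed attribute before a pull request is sent.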
-
How to contribute
1. Go to GitHub and register a free account. GitHub has fantastic tutorials that you might want to study if you are not yet familiar with the service.
2. Fork the repository of the edition you are interested in.
3. Edit the XML of the edition. This can be done either directly on GitHub, with any text editor on your local machine, or with an online XML editor such as oXygen web that can hook into your GitHub account.
4. Send us a pull request.
5. We will review and merge your changes.
-
Archive.sakhrit.co's failure as a source for digitised imagery of Arabic journals
Recently, a colleague pointed me to yet another gray online library of Arabic material, one that is entirely dedicated to cultural and literary journals. Arshīf al-majallāt al-adabiyya wa-l-thaqafiyya al-ʿarabiyya (archive.sakhrit.co) presents a large number of Arabic journals over very long publication periods, providing:

* Partially watermarked digital imagery
* Functional tables of contents for each issue, including author, title, and page number
* Some bibliographic metadata on the issue level

They do not provide a digital, machine-readable text.

Focus of the corpus

The focus is on cultural and scientific journals of the 20th century, but they also have some journals of the late 19th and early 20th centuries, among them:

* Cairo: al-Muqtaṭaf, al-Ustādh, al-Hilāl, al-Bayān, al-Manār, al-Jāmiʿa (al-ʿUthmāniyya), al-Zuhūr
* Lebanon: al-Mashriq
* Syria: al-Muqtabas

As one would imagine, I was excited to see a seemingly complete scan of al-Muqtabas among the journals hosted by archive.sakhrit. I am currently working on a digital scholarly and collaborative edition of this journal (see the project’s GitHub repository and blog) and had only found accessible scans of volumes 1 to 8. Thus, the prospect of an additional and potentially complete scan, including volume 9, was exciting. But after my initial enthusiasm, I was in for a serious disappointment.

Quality of the corpus

As with other gray libraries, such as al-Maktaba al-Shāmila (shamela.ws), archive.sakhrit is quiet about the personnel or company behind it. It remains unclear where the originals came from, who scanned them, and who transcribed the heads, authors, and page numbers seemingly available for every article. The rather illegal / gray...