Sitzextase

A blog for very infrequent and work-related posts

Presentation of *Digital Muqtabas* at conference 'Books in Motion' in Beirut

4 min read · May 10, 2016

2016 · presentation conferences digital editions arabic periodicals · presentation project_dh project_oape
Workflow: Mark-up of page breaks

The following semi-automatic workflow makes it possible to quickly mark up page breaks and link them to facsimiles. Changes should be committed to GitHub after every step at the absolute minimum.

Note that running XSLT 2.0 stylesheets requires a processing engine, most likely the open source Saxon processor. It comes built into many commercial XML editing tools, but currently I am not aware of any open-source and free-of-cost implementation. But as this project is envisioned as a collaboration, let’s collaborate. As long as someone commits TEI files with manually added page breaks (see step 2.1 below) and then sends us a pull request, another collaborator with access to commercial software can run the transformations.

1. Mark-up

1.1 Page breaks

Page breaks are recorded with the empty milestone element <pb/>. If a page break separates block-level elements such as <div>, <p> or <lg>, the empty <pb/> is placed between the two elements and on the same level within the XML tree.

    <pb/>
    <div>
        <p>Some text in a paragraph <pb/> that spans across pages</p>
        <p>Some text in a paragraph that does not span across pages</p>
        <pb/>
        <p>  </p>
    </div>
    <pb/>
    <div>
        <p></p>
    </div>

Page breaks found in al-maktaba al-shāmila do not correspond to those in the original printed copies. They were therefore marked as <pb ed="shamila">. Page breaks corresponding to the original printed edition are identified by @ed="print". Dār Ṣādir in Beirut published a reprint in 1992, which is entirely unmarked as such but for the information on the binding...
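To illustrate the @ed convention described above, here is a minimal sketch in Python that counts <pb/> elements per @ed value. The sample fragment, the function name count_page_breaks, and the assumption that the files use the standard TEI namespace are all made up for illustration; this is not part of the project's actual XSLT workflow.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Assumption: the edition files use the standard TEI namespace.
TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical fragment modelled on the example in the post.
SAMPLE = """<body xmlns="http://www.tei-c.org/ns/1.0">
  <pb ed="print"/>
  <div>
    <p>Some text in a paragraph <pb ed="shamila"/> that spans across pages</p>
    <p>Some text in a paragraph that does not span across pages</p>
    <pb ed="print"/>
    <p/>
  </div>
  <pb/>
</body>"""


def count_page_breaks(xml_text):
    """Count <pb/> elements, grouped by the value of their @ed attribute."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    # iter() walks the whole tree, so <pb/> inside <p> is found as well.
    for pb in root.iter(f"{{{TEI_NS}}}pb"):
        counts[pb.get("ed", "(no @ed)")] += 1
    return counts


if __name__ == "__main__":
    for edition, number in sorted(count_page_breaks(SAMPLE).items()):
        print(f"{edition}: {number}")
```

A count like this can serve as a quick sanity check after adding page breaks manually, for instance to spot <pb/> elements that still lack an @ed attribute.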

1 min read · May 08, 2016 · OpenArabicPE

2016
How to contribute

1. Go to GitHub and register a free account. GitHub has fantastic tutorials that you might want to study if you are not yet familiar with the service.
2. Fork the repository of the edition you are interested in.
3. Edit the XML of the edition. This can be done directly on GitHub, with any text editor on your local machine, or with an online XML editor such as oXygen web that can hook into your GitHub account.
4. Send us a pull request.
5. We will review and merge your changes.
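Before sending a pull request, it can help to confirm that the edited file is still well-formed XML. The following is a minimal sketch in Python; the function name is_well_formed is hypothetical, and the check covers well-formedness only, not validity against any TEI schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, None


if __name__ == "__main__":
    # A paragraph with an unclosed <p> is reported as not well-formed.
    print(is_well_formed("<div><p>Some text <pb/> more text</p></div>"))
    print(is_well_formed("<div><p>Some text <pb/> more text</div>"))
```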

1 min read · April 24, 2016 · OpenArabicPE

2016
Archive.sakhrit.co's failure as a source for digitised imagery of Arabic journals

Recently, a colleague pointed me to yet another gray online library of Arabic material, one that was entirely dedicated to cultural and literary journals. Arshīf al-majallāt al-adabiyya wa-l-thaqafiyya al-ʿarabiyya (archive.sakhrit.co) presents a large number of Arabic journals over very long publication periods, providing:

- Partially watermarked digital imagery
- Functional tables of content for each issue, including author, title, page number
- Some bibliographic metadata on the issue level

They do not provide a digital, machine-readable text.

Focus of the corpus

The focus is on cultural and scientific journals of the 20th century but they also have some journals of the late 19th and early 20th centuries, among them:

- Cairo: al-Muqtaṭaf, al-Ustādh, al-Hilāl, al-Bayān, al-Manār, al-Jāmiʿa (al-ʿUthmāniyya), al-Zuhūr
- Lebanon: al-Mashriq
- Syria: al-Muqtabas

As one would imagine, I was excited to see a seemingly complete scan of al-Muqtabas among the journals hosted by archive.sakhrit. I am currently working on a digital scholarly and collaborative edition of this journal (see the project’s GitHub repository and blog) and only found accessible scans of volumes 1 to 8. Thus, the prospect of an additional and potentially complete scan, including volume 9, was exciting. But after my initial enthusiasm, I was in for a serious disappointment.

Quality of the corpus

As with other gray libraries, such as al-Maktaba al-Shāmila (shamela.ws), archive.sakhrit is quiet about the personnel or company behind it. It remains unclear where the originals came from, who scanned them, who transcribed the heads, authors, and page numbers seemingly available for every article. The rather illegal / gray...

1 min read · April 22, 2016 · OpenArabicPE

2016
Archive.sakhrit.co's failure as a source for digitised imagery of Arabic journals

6 min read · April 22, 2016

2016 · review resources online libraries digitized resources · blog
