CAA2015 KEEP THE REVOLUTION GOING >>>
Proceedings of the 43rd Annual Conference on Computer Applications and Quantitative Methods in Archaeology
edited by Stefano Campana, Roberto Scopigno, Gabriella Carpentiero and Marianna Cirillo
Volume 1

Archaeopress Archaeology
Archaeopress Publishing Ltd, Gordon House, 276 Banbury Road, Oxford OX2 7ED
www.archaeopress.com

ISBN 978 1 78491 337 3
ISBN 978 1 78491 338 0 (e-Pdf)
© Archaeopress and the individual authors 2016

CAA2015 is available to download from Archaeopress Open Access site. All rights reserved. No part of this book may be reproduced, or transmitted, in any form or by any means, electronic, mechanical, photocopying or otherwise, without the prior written permission of the copyright owners. This book is available direct from Archaeopress or from our website www.archaeopress.com

GQBWiki Goes Open

Stefano Costa (
[email protected], http://orcid.org/0000-0003-1124-3174)
Alessandro Carabia ([email protected])
Department of Historical Sciences and Cultural Heritage, University of Siena, Italy

Abstract: GQBWiki is a wiki website for the archaeological research project on the Byzantine city of Gortyn, based on open source software and publicly accessible under the CC-BY-SA license. It has been used since 2005 to record excavation data together with other content ranging from bibliography to reflexive documentation. While the MediaWiki software lacks native capabilities for structured querying, the Semantic MediaWiki extension has been used to provide all the infrastructure necessary for linked open data. GQBWiki is the on-going result of collaborative work and strives to give attribution to all contributors in a transparent way, with all the challenges that a non-traditional publication workflow brings.

Keywords: Multivocality, Collaborative authorship, Open data, Open source

1 GQBWiki

GQBWiki is an online wiki website (http://www.gortinabizantina.it/wiki/) dedicated to the archaeological research project in the Byzantine Quarter near the Pythion shrine in Gortyn (Crete), run by the University of Siena. It has been operational since 2006. While fieldwork at the site started in 2001, it was only in 2005 that we decided to start building a digital archive where the documentation could be collectively created and curated, not limited to excavation data stricto sensu, migrating content over from previous relational databases for stratigraphic data. In 2005, choosing a wiki over other available systems seemed to provide strategic advantages, such as being online, available whenever and wherever there was an Internet connection, and more generally facilitating the creation of an encyclopaedic resource about the research project. GQBWiki was restricted to the research team members until April 2015, so there was no benefit in terms of visibility of the resources that were created and updated. In retrospect, the choice of an online platform brought several ‘revolutionary’ advantages whose full potential took us some time to appreciate. In this paper, we outline the current status of GQBWiki and what we think we have learned in the past 10 years, particularly with respect to our first discussion on the same topic (Zanini and Costa 2006) and a wider overview of the situation for knowledge sharing in the archaeological world (Zanini and Costa 2009).

The research project at GQB focuses on the Late Antique and Early Byzantine phases of the urban area of Gortyn, and is therefore part of the rather large topic of the end of the ancient Mediterranean city. With this premise in mind, it seemed natural for GQBWiki to become a comprehensive archive where the archaeological record could become part of a hypertext, and could be linked to historical evidence, broader interpretive texts and so on.

On the technical side, GQBWiki is based on the popular MediaWiki software, better known for being used by Wikipedia and other related websites, but also available for use by third parties under the GNU General Public License. Since we adopted it in 2005, MediaWiki has been constantly updated and improved by the Wikimedia Foundation and by other contributors, so far reducing the risk of finding ourselves with an obsolete tool, while other pieces of wiki software were abandoned in the meantime. We acknowledge that the maintenance of such a complex tool is well beyond the technical capabilities of a small team, not to mention the increasing need to keep web-based software free from security bugs that may put the privacy of users at risk. MediaWiki is built on the well-known LAMP platform (Linux, Apache, MySQL, PHP) and can be run with no difficulty on any web hosting service, at least in its basic functionality.

GQBWiki is used all year round, but it is essential during the fieldwork season. Due to the lack of an Internet connection at the mission house in Agioi Deka near Gortyn, it was only in 2012 that we could work directly on GQBWiki, using a commercial mobile broadband Internet provider. In the previous years, we would simply take advantage of software freedom and the flexibility of GNU/Linux systems to install a local wireless network with a web server running MediaWiki on a spare laptop and a local ‘clone’ of GQBWiki (the online version was put in read-only mode). At the end of each field season, the updated content was put online again until the next year. This approach does not seem very widespread and in our case it has become obsolete, but it worked well as an alternative to file-based collaboration, where all team members work separately and there is a collation process at the end, while retaining a ‘slow’ pace as described by Caraher (2015).

The entire content of GQBWiki is in Italian: our team is not international and it would be unnatural to write our documentation in English. Italian is also known by many scholars of the ancient world. In fact, it could be argued that Greek is the language that is actually missing from GQBWiki, because the research project is taking place in Greece, under the control of the Greek authorities. The prevalence of the English language on the Web and in academic literature is indisputable (especially in a paper written in English), but its advantages over multilingualism are less clear. The Wikimedia movement has taken a clear practical stance in favour of multilingualism, with hundreds of Wikipedias in minority languages. Since we adopted the software platform of the Wikimedia movement, it seems appropriate to reflect on this global issue, not just from our privileged point of view and with the concern of visibility and academic value of our work, but also from the perspective of making knowledge available to as many people as possible.

The big step we are taking in 2015 is opening GQBWiki to the public, even before there is a print edition of the research project, under a Creative Commons – Attribution – Share-Alike license (again, the same used by Wikipedia). By doing this, we hope to provide a useful digital resource for those working in Mediterranean archaeology, for example by sharing digital images of finds from dated contexts (a very common quest in this field of studies). At the same time, making GQBWiki open is a straightforward way to elicit and stimulate feedback about our archive as a whole. Unfortunately, for the moment the ability to edit content is limited to team members and registered users (mainly due to the need to avoid spam). This is perhaps not even considered in most cases when similar digital resources go online, but it seems worth pointing out that it would be very interesting for anyone, and especially other scholars, to be able to comment on pages and even provide alternate interpretations for site features and finds. We are keen on registering new users on demand, but not with the effortless registration procedure that is standard for modern websites. However, this limitation is in our available time, not in the software.

2 Dealing with limitations, exploring possibilities

A quick numerical summary of GQBWiki shows that, at the time of writing, there are 2089 pages, with 16190 internal links and 27618 single edits. Pages range from stratigraphic units to find records, but it is journal entries that play a central role in the navigation path, rather than the useful but confusing list of stratigraphic units. There are also pages about team members, both as a means of collective memory and as a kind of meta-documentation. GQBWiki contains data about who excavated a certain stratigraphic unit, so in a sense we have become part of the data we create, and made it explicit. There is a category of pages devoted to bibliographic references, usually with extensive notes linking evidence from other sites and regions to GQB and, as noted above, ‘incubators’ for ideas and written content that will be included in the print publication. Internal links are certainly one of the main strengths of wikis, and GQBWiki is no exception: looking at the broad categories outlined above, it is important to point out that there is no restriction on links, and any page can point to any number of other pages, regardless of their ‘category’.

Wiki systems are by definition multi-user, both technically and socially. The net result is that GQBWiki is an incarnation of written multivocality, probably not of the same kind envisaged by Ian Hodder, but nevertheless stimulating, especially when we consider that all users/members have access to the same total amount of information, both for reading and editing. Users can edit any page as they see fit, fixing small typos or changing the functional interpretation of a deposit. The reality is less radical than it may seem, though. Each wiki page preserves its own ‘history’ of edits, providing an overview of who has been adding (or removing) content, when, etc., as anyone familiar with Wikipedia will find normal (we hope that members of an academic audience have a basic understanding of these tools). This allows relevant meta-information to be immediately available, such as the last date when a page was updated – and therefore whether the content is possibly outdated. In a general sense, the page history provides an overview of ‘who contributed what’ with respect to the page under examination.

The wiki home page guides both the casual reader and the regular contributors to the various sections of the website, acting as a table of contents for the various areas of interest and the levels of detail. At a glance, the contents range from excavation data to interpretive texts, providing a necessary companion to the final GQB publication, the tone of which will be narrative and holistic rather than enumerative. There are certainly parallels with other similar systems that were built in the same years, like the one developed for Villa Magna by Andrew Dufton and Elizabeth Fentress, both in terms of types of content and of technical solutions.

When comparing GQBWiki and our use of MediaWiki to other ‘archaeological information systems’, one aspect that should be immediately clear is that we are not proposing wiki systems as the best solution for any archaeology research project, particularly from a technical standpoint. There are limitations that make GQBWiki imperfect, if not in principle at least in practice, and it is important to recognise these limitations. The most substantial limitation is with spatial data (context plans, sections, etc.), which MediaWiki is completely unable to support natively. Looking at this seemingly unacceptable issue from a broader perspective, we can observe that in ‘traditional’ site archives and archaeological information systems, alphanumeric, graphic and spatial data are managed in the same platform, while interpretation and publication are left on their own. In GQBWiki, on the other hand, spatial data is missing from the wiki and is managed through separate software tools, but all other content is part of the same platform. Spatial data consists mainly of context plans that are rendered as static raster images and uploaded to the wiki in batch, and then dynamically loaded in the relevant pages based on the semantic features outlined below.

A wiki page is a free form web page, where a lightweight markup is used instead of HTML to ease authoring. Therefore, any schematisation (such as the requirement that all stratigraphic context records have the same appearance and minimum information) is obtained by means of discipline and templates, not unlike Wikipedia content. There can be as many templates as needed in a wiki page, for formatting parts of content in specific ways (e.g. the well-known ‘infobox’ in the top right) or for more complex tasks.

The consequence of the ‘flat nature’ of a wiki is that in several cases the content ends up being very raw, not just in the technical sense of ‘raw data’, but also in terms of human readability and usability: if, for example, on a certain day the archaeologist did not feel like writing more than one sentence in their journal, that will be the content for that day. There are minimum requirements that are directly derived from those of paper recording sheets, but since our methodological toolbox leaned towards using multimedia, the amount of mandatory data has been reduced (Zanini and Costa 2006). The structure of a wiki is only created by adding content and links. Having no predefined structure is stimulating on an intellectual level, because every bit of information has the same theoretical importance within the documentation system and there is room for both data and discussion of uncertainty. In practice, however, we need to create lists of pages, entry points and navigation paths that will guide both contributors and readers, keeping in mind that MediaWiki has a very good internal search engine, and that is usually the quickest and most effective way of finding a specific page. Having no separation between structure and data also means that both can be changed by editing wiki pages, and that this can be done at any moment. Following in the steps of Wikipedia, structured information in GQBWiki is stored in lists and ‘infobox’ templates. If we decide to record a new piece of information in a page, or a category of pages, there is no underlying structure separate from the frontend ‘Edit this page’ button. Another significant enabler is that MediaWiki markup encourages the kind of copy-and-paste editing made of trial and error (edit, save, review, edit again) that was so beneficial to the early development of the Web in the 1990s.

Again, great advantages come together with limitations: despite being based on a relational database (MySQL), MediaWiki is not a database and there is no native support for retrieving structured information using SQL-like queries. After an initial period of confidence in this ‘dictatorship of the unstructured’, it became clear that it was impractical to be left without the capability of doing structured queries on our knowledge base. At the same time, the amount of information we already had in place was substantial, and team members were pleased with the general functionality of the wiki, despite a slow learning process. Using Semantic MediaWiki, an extension to the base software package, we added a ‘thin ontology’ layer to GQBWiki, not with the aim of building a Semantic Web resource, but as the most convenient way of adding typical ‘relational’ functionality to our wiki. So, we could add dynamic content blocks like ‘a gallery of images of the context at the bottom of each context page’. In practice, this works by turning internal wikilinks into ‘typed’ links: an image page is linked to the page of the item it depicts, conveying both the link and the relationship between these two pages; a page about a stratigraphic unit is linked to another stratigraphic unit by expressing the type of stratigraphic relationship between the two (following the Italian standard of highly descriptive ‘physical relationship’ as opposed to the British/MoLAS ‘earlier than/later than’ standard). At a basic level, Semantic MediaWiki usage is equivalent to the creation of a custom ontology, based on properties, that is only valid for the wiki in use, but there is a possibility of ‘mapping’ the internal properties to universal URI-based properties. In the example of the image-item link, the ‘depicts’ relation becomes a local mirror of the equivalent FOAF property, where FOAF is the ‘Friend of a Friend’ ontology, one of the earliest and most widespread Semantic Web vocabularies in use. This makes for another case of serendipity: we started using a tool that worked natively on the web, before it was widely acknowledged that it would have been the only sensible choice in just a few years. GQBWiki has had unique, clean URLs for every excavation context and find since the very beginning, even though it was only in more recent years that we understood how this represented a possibility for doing other things, such as linked open data. The idea that external vocabularies (such as Nomisma.org for coins) can be used to link content from GQB to other online archives and catalogues is based on the assumption of an ‘open world’ of information where there are both internal (wiki)links and external links in a continuum.

3 Collaborative authorship and attribution

Apart from the technical aspects discussed above, there is a second set of problems that are of equal interest and touch on the intrinsic difference of wiki authorship from traditional publication, again from a standpoint where GQBWiki is first of all the recording of a research process, and the archaeological excavation is only one part of that process, as is the digital archive. The material wiki practice of creating content confronts us with problems such as: how do we manage contributions ranging from simple digitisation and data entry of analog records to fully digital stratigraphic data?

In a traditional setting, the path from content creation to publication is more or less linear, from the bottom up, with checks for consistency at each step. With thousands of pages, each one accessible separately, the need for a solid review is even stronger, but the difficulty is in the systematic application of review procedures in a way that is both efficient and quick, otherwise new contributions will stagnate. Therefore, content review happens on an opportunistic basis in GQBWiki, and it is not enforced. In general, the internal review process has worked well for us, but some content is still outdated or missing. A complete external peer review seems unlikely, and we do not expect a substantial amount of feedback even after opening the wiki, as most potential contributors would have their own archives to curate.

Another issue we think we are dealing with is attribution for all the digital work done by supervisors and undergraduate students alike. The approach seen in GQBWiki takes inspiration from initiatives like Fair Cite (2012), which tackles the problem of ‘how best to cite a web-based collaborative project developed in the humanities’ and whose names should be included in the citation. At the bottom of each wiki page, a list of all contributors has links to each user page, and the suggested citation for a single page contains the URL of a special visualisation showing that list. Furthermore, ‘bot’ users like the prolific GQBot (controlled by the pywikibot software) give us a chance to reflect upon the contribution of machines to our work, not only as mere tools, but as executors of instructions that we only prepare, for repetitive work like batch uploading of images or importing from databases. Our work is collaborative in this sense, too.

4 Conclusions

After ten years working with GQBWiki we are convinced that the benefits exceed the disadvantages, and that making this body of knowledge open will further increase its value for the wider archaeological community.

A decade could seem a long time span, since most digital works can easily become obsolete in even less time: the truth is that we are collectively used to rapid decay cycles of our digital archives and publications, while traditional paper-based publication has stood the test of time. When, in an academic context, we put our data and studies online, it usually means that we want them to be accessible and we want to ensure them a long life. Being on the Web does not make data automatically linked and open, but as described above GQBWiki is incrementally going in that direction, finding common ground with other existing initiatives in the field of ceramic studies (Gruber and Smith 2015), numismatics (Gruber et al. 2014) and ancient world studies in general (Elliott and Gillies 2009) with a very practical, URI-focused stance, and we hope that GQBWiki URIs will make an appearance in linked open data graphs. That said, we also think that a more pronounced focus on the human components of any technological platform is needed, and Web 2.0 is no different in this respect (Shanks and Whitmore 2012). In our experience, a wiki needs to be actively used in order to have a chance to survive, and having a long-term archive of wiki content or any other archaeological data in a ‘frozen’ form is increasingly unsatisfactory, since the discoverability of such content is not getting better. Other, separate wikis that we started for other research projects are unfortunately not as thriving as the one described in this paper.

So far, GQBWiki is the virtual workplace of our research team: consulted and updated by users all the time from many places in Europe, with huge peaks of activity reached during the excavation campaigns (Carabia 2013). The availability of excavation data, interpretive texts, diaries, pictures and so on, all on the same platform concurrently and without any hierarchical limitation, has been a transformative environment for our work.

We believe that this approach is fruitful at the research team level and can be adopted on a wider basis. In an ideal situation, new studies about specific aspects of archaeological interest (for example, the type of artisanal activity recognised in 8th-century contexts from Byzantine Gortyn) would result not only in a specialist, peer-reviewed publication, but also in the updating of a range of ‘wiki pages’ about Byzantine craftsmanship, the history of Crete, or the work of Italian archaeologists abroad.

At a global scale, Wikipedia represents the main way of accessing the knowledge landscape for a majority of Internet users. Archaeology is well represented on Wikipedia but expert contributions are scarce, driven by the lack of incentive for academics to contribute and the rarity of collaboration-driven publication among archaeologists (Hadley 2013). For very general topics, Wikipedia is recognised as the right platform and there are known patterns for contributing content, debating contrasting views, accommodating different types of source material and so on. It is less clear whether more specialist content (for example the chronology of a very specific type of ceramic production – even a minor one – or the calibrated radiocarbon date for an occupation sub-phase in an otherwise settlement) can fit in the Wikipedia notability guidelines. As we have shown, the tools and some of the good practice to develop long-term collaborative platforms are already in place.

Should we start working in a collaborative and incremental fashion, rather than starting from scratch at each new study?

Bibliography

Carabia, A. 2013. Wiki=beta: il modus vivendi di un sistema per documentare la ricerca. Archeologia e Calcolatori Supp. 4: 209-13. Firenze, All’Insegna del Giglio.

Caraher, W. 2015. Industrial Archaeology and Student Resistance. [Online] Available from: https://mediterraneanworld.wordpress.com/2015/06/26/industrial-archaeology-and-student-resistance/ [accessed: 8 July 2015].

Elliott, T., Gillies, S. 2009. Digital Geography and Classics, «Digital Humanities Quarterly» 3, 1. Changing the Center of Gravity: Transforming Classical Studies Through Cyberinfrastructure. [Online] Available from: http://www.digitalhumanities.org/dhq/vol/3/1/000031/000031.html [accessed: 13 November 2015].

Fair Cite 2012. Current Citation Practices in Academia. [Online] Available from: https://faircite.wordpress.com/ [accessed: 31 March 2015].

Gruber, E., Heath, S., Meadows, A., Pett, D., Tolle, K., Wigg-Wolf, D. 2014. Semantic Web Technologies Applied to Numismatic Collections. In P. Verhagen (ed.), Archaeology in the Digital Era: Papers from the 40th Annual Conference of Computer Applications and Quantitative Methods in Archaeology (CAA), Southampton, 26-29 March 2012: 264-74. Amsterdam, Amsterdam University Press.

Gruber, E., Smith, T. J. 2015. Linked Open Greek Pottery. In F. Giligny, F. Djindjian, L. Costa, P. Moscati, S. Robert (eds.), Proceedings of the 42nd Annual Conference on Computer Applications and Quantitative Methods in Archaeology. CAA 2014 - 21st Century Archaeology: 205-14. Oxford, Archaeopress.

Hadley, P. 2013. WikiProject Archaeology. [Online] Available from: http://pathadley.net/projects/wikiproject-archaeology/ [accessed: 8 July 2015].

Mediawiki Contributors 2015. MediaWiki. [Online] Available from: https://www.mediawiki.org/ [accessed: 8 July 2015].

Semantic Mediawiki Contributors 2015. Semantic MediaWiki. [Online] Available from: http://www.semantic-mediawiki.org/ [accessed: 8 July 2015].

Shanks, M., Whitmore, C. 2012. Archaeology 2.0? Review of Archaeology 2.0: New Approaches to Communication and Collaboration. Internet Archaeology 32. [Online] Available from: http://dx.doi.org/10.11141/ia.32.7

Zanini, E., Costa, S. 2006. Organizzare il processo conoscitivo nell’indagine archeologica: riflessioni metodologiche ed esperimenti digitali. Archeologia e Calcolatori 17: 241-64.

Zanini, E., Costa, S. 2009. Sharing knowledge in archaeology: looking forward the next decade(s). In M. Tsipopoulou (ed.), Digital Heritage in the new knowledge environment: 69-72. Athens, Hellenic Ministry of Culture.
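The ‘typed’ links discussed in section 2 can be illustrated with a minimal Python sketch. It assumes Semantic MediaWiki's inline annotation syntax, `[[property::value]]`, and shows how such links yield subject-predicate-object statements, how a local property such as ‘depicts’ could be mapped to the FOAF URI, and how a per-context image gallery could be derived from them. The page titles, the `covers` property, the sample wikitext and the function names are all hypothetical and are not taken from GQBWiki; the actual property names used there may differ.

```python
import re
from typing import Dict, List, NamedTuple

# Semantic MediaWiki inline annotations look like [[property::value]] or
# [[property::value|label]]; plain links such as [[US 1200]] are not matched.
TYPED_LINK = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")

# Assumed mapping from wiki-local property names to external URIs.
# Only a 'depicts' relation is mentioned in the paper; its exact local name is a guess.
PROPERTY_URIS = {"depicts": "http://xmlns.com/foaf/0.1/depicts"}


class Triple(NamedTuple):
    subject: str    # title of the page that contains the typed link
    predicate: str  # external URI when a mapping exists, else the local name
    obj: str        # title of the page the link points to


def extract_triples(page_title: str, wikitext: str) -> List[Triple]:
    """Turn every typed link in a page's wikitext into a triple."""
    triples = []
    for prop, value in TYPED_LINK.findall(wikitext):
        local_name = prop.strip()
        predicate = PROPERTY_URIS.get(local_name.lower(), local_name)
        triples.append(Triple(page_title, predicate, value.strip()))
    return triples


def images_of(context_title: str, pages: Dict[str, str]) -> List[str]:
    """List the pages that 'depict' a context: the data behind an image gallery."""
    depicts = PROPERTY_URIS["depicts"]
    return sorted(
        title
        for title, text in pages.items()
        if any(t.predicate == depicts and t.obj == context_title
               for t in extract_triples(title, text))
    )


# Hypothetical wiki pages (title -> wikitext), for illustration only.
SAMPLE_PAGES = {
    "File:US 1234 north.jpg": "View from the north. [[depicts::US 1234]]",
    "File:US 1234 detail.jpg": "Detail of [[depicts::US 1234|the floor]].",
    "File:US 1300.jpg": "[[depicts::US 1300]]",
    "US 1234": "Beaten earth floor, [[covers::US 1300]], see also [[US 1200]].",
}
```

Calling `images_of("US 1234", SAMPLE_PAGES)` returns the two image pages annotated as depicting that context. In a Semantic MediaWiki installation this lookup is performed by the extension's own query mechanism rather than by parsing wikitext by hand; the sketch only shows why a typed link carries enough information to answer such a query.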