Presentation slides

The A32 Quick Start Guide, Version 1.0 Alpha, was presented in Gothenburg in May 2009.


Corpus and contributions
[Section edited by Matteo D'Alfonso]

Choose your corpus

You are about to create a digital environment in which you will probably have to manage a large number of primary and secondary sources relating to an author or a subject. The first step is to choose a consistent corpus, which you can enlarge later, e.g. by collecting further contributions. By corpus we mean a given set of items relevant to your community, such as the works or handwritten documents of a writer, philosopher, poet, etc., as well as books or articles written about his or her work and life. Most of the communities using Talia focus on literary remains containing handwritten documents. The first need of these communities was to publish a facsimile edition of their corpus. If you are interested in this aspect, you first need images of the originals.

Do you need images of your corpus?

If you don’t need facsimiles of the originals, skip this section and go directly to the next one; otherwise follow these steps:

  1. Contact the owner, e.g. an archive, museum or foundation (see: Policy:Manuscripts and Policy:Facsimiles). The originals are probably kept in a public or private archive or library. Contact the owner of the materials and check whether they are willing to let you digitise them. Please see the Policy section for questions relating to IPR for facsimile reproductions of handwritten documents.
  2. Ask the owner or a private company to digitise the corpus. Most public libraries have a department for the digital reproduction of their books and handwritten documents. If possible, involve this department in your project; if you can't, look for a private company. The standard technical requirements that the images must meet in order to support scholarly use are listed in the Technical section.

Schopenhauersource worked well with the following company: Mikro Universe GmbH.

Classify your corpus for the web

Once you have chosen the corpus, define a naming scheme for all the relevant items. As your platform is intended to collect contributions on the corpus you have chosen, the names will work as pointers for these contributions. They will then be useful both for citing the sources of your research and for navigating through them on the website.
Moreover, the names you choose should become suffixes of your website's URL and so help form the URL of each page that shows the contents.

http://www.schopenhauersource.org/NL-I,1r is the URL for the first page (1r) of the first book (NL-I) of Schopenhauer's handwritten documents, and under this address you will find all contributions published in Schopenhauersource that refer to this page.
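As an illustration of how an item name becomes the suffix of a page address, here is a minimal Python sketch. The base address and the name NL-I,1r come from the example above; the function name `item_url` is hypothetical and is not part of Talia.

```python
from urllib.parse import quote

# Base address taken from the Schopenhauersource example above.
BASE_URL = "http://www.schopenhauersource.org/"


def item_url(name: str, base_url: str = BASE_URL) -> str:
    """Append an item name to the base URL (hypothetical helper, not a Talia API)."""
    # Keep the comma literal so that "NL-I,1r" stays readable;
    # anything else that is not URL-safe (e.g. a blank) is percent-encoded.
    return base_url.rstrip("/") + "/" + quote(name, safe=",")


print(item_url("NL-I,1r"))
```

A name that contains a blank, such as "NL I", would come out percent-encoded ("NL%20I"), which is one reason to normalise names before using them in addresses.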

Characteristics of the names

Since they will be part of a URL, these names have to be compatible with the requirements of the Web (see also the URL page on Wikipedia for more information and pointers to additional resources). Only use characters that are compatible with the syntax of a URL.

Some characters are reserved as delimiters and may have to be percent-encoded when they are used as ordinary data in a URL. If you want your URLs to look nice, it is better to avoid these characters in the object names. The set of reserved characters includes: ":" / "/" / "?" / "#" / "[" / "]" / "@". The following characters should also be avoided: "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="

See RFC3986 for a normative definition of the URL syntax.
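A candidate name can be checked against the two character sets listed above before it is adopted. The following Python sketch does this; the function name `problematic_characters` is an assumption made for the illustration.

```python
from typing import List

# The two character sets listed above (RFC 3986 "gen-delims" and "sub-delims").
GEN_DELIMS = set(":/?#[]@")
SUB_DELIMS = set("!$&'()*+,;=")


def problematic_characters(name: str) -> List[str]:
    """Return the reserved characters found in `name`, in order of first appearance."""
    found: List[str] = []
    for ch in name:
        if (ch in GEN_DELIMS or ch in SUB_DELIMS) and ch not in found:
            found.append(ch)
    return found


print(problematic_characters("NL-I,1r"))   # the comma is reported
print(problematic_characters("NL-I_1r"))   # nothing is reported
```

The comma in NL-I,1r is reported because it belongs to the second set. RFC 3986 nevertheless allows these "sub-delims" unencoded inside a path segment, which is why the Schopenhauersource address still works; the check only flags characters worth a second look.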

Since the names will be cited by scholars, who already have standards for citing primary and secondary sources, the naming schema should be compatible with this scholarly tradition. This will ensure the stability of citations from and to paper publications. Unless the sources you are naming have never been edited before, it is in your interest not to create an entirely new naming schema but to import the names that scholars usually use to refer to their research items (e.g. standard acronyms of works, the established classification of handwritten documents in an archive, the page numbering of a standard edition, etc.). Modify them only as far as is needed to use them in an electronic environment, for example by replacing blanks “ ” with hyphens “-” or underscores “_”, depending on your needs. You should then try to establish consensus around your classification proposal by discussing it with the holders and curators of the originals or by involving the communities in its definition.
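The replacement of blanks can be automated. This is a minimal Python sketch, assuming that blanks are the only change needed; the function name `to_web_name` and the sample names are hypothetical.

```python
import re


def to_web_name(scholarly_name: str, separator: str = "-") -> str:
    """Replace blanks in an established name with a hyphen or an underscore."""
    if separator not in ("-", "_"):
        raise ValueError("separator must be '-' or '_'")
    # Trim the ends, then collapse each run of whitespace into one separator.
    return re.sub(r"\s+", separator, scholarly_name.strip())


print(to_web_name("NL I"))            # NL-I
print(to_web_name("MS  A 12", "_"))   # MS_A_12
```

Everything except the blanks is left untouched, so the web name stays recognisable to anyone who knows the printed citation.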

Therefore we suggest that:

  • you use the existing classifications in the archives where the documents are stored;
  • if your corpus or part of it has already been edited, you take the editors' classification into account and import it into your own. The aim is to ease the link between the world of paper and the Web, so that a scholar reading an article that cites these primary sources can, after a little practice, easily find the references. In any case you will at least have to create a table of aliases;
  • you respect a given use of hyphens “-”, commas “,” and square brackets “[ ]” to indicate respectively the partition of the corpus into notebooks, the partition of each notebook into pages, and the partition of each page into parts of text.
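The convention in the last point can be read mechanically. The Python sketch below splits a name at the comma and at the square brackets, leaving the hyphenated notebook identifier whole. It assumes names of the form NOTEBOOK, NOTEBOOK,PAGE or NOTEBOOK,PAGE[PART]; the bracketed form "NL-I,1r[2]" is an invented example, and the names `ItemName` and `parse_item_name` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class ItemName(NamedTuple):
    notebook: str            # e.g. "NL-I" (the hyphen separates corpus and notebook)
    page: Optional[str]      # e.g. "1r", introduced by a comma
    part: Optional[str]      # e.g. "2", enclosed in square brackets


# Assumed grammar: NOTEBOOK [ "," PAGE ] [ "[" PART "]" ]
_NAME = re.compile(
    r"^(?P<notebook>[^,\[\]]+)"
    r"(?:,(?P<page>[^,\[\]]+))?"
    r"(?:\[(?P<part>[^\[\]]+)\])?$"
)


def parse_item_name(name: str) -> ItemName:
    """Split a name into notebook, page and part of text."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"not a valid item name: {name!r}")
    return ItemName(match["notebook"], match["page"], match["part"])


print(parse_item_name("NL-I,1r"))
print(parse_item_name("NL-I,1r[2]"))
```

A parser like this lets the platform list all pages of a notebook, or all contributions on one page, directly from the names.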

For an overview of the criteria used in Nietzschesource's naming schema, see: Matteo d’Alfonso, Barbara Keiko Saile, Classifying manuscripts, works and iconography. See also: Id., Classifying manuscripts, works and iconography, part two: Defining manuscripts notes, texts passages, images details.

Publish the naming schema

Once you have established your naming schema, publish it as a separate contribution and document the classification criteria you followed to generate it.

To ensure maximum interoperability between your platform, other editions of your corpus and the secondary literature around it, you should also publish a table of concordances with other existing classifications.
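A concordance table can be kept as a simple CSV file and turned into a lookup from any other classification to the platform name. The Python sketch below shows one possible layout; the column names, the shelfmarks and the edition references are invented placeholders, not real data.

```python
import csv
import io
from typing import Dict

# Invented sample: one row per item, one column per classification.
CONCORDANCE_CSV = """platform_name,archive_shelfmark,print_edition
"NL-I,1r",EXAMPLE-SHELFMARK-1,"Example Edition, p. 1"
"NL-I,1v",EXAMPLE-SHELFMARK-2,"Example Edition, p. 2"
"""


def load_concordance(text: str) -> Dict[str, str]:
    """Map every alias in the table to the platform name of its row."""
    index: Dict[str, str] = {}
    for row in csv.DictReader(io.StringIO(text)):
        for column, value in row.items():
            if column != "platform_name" and value:
                index[value] = row["platform_name"]
    return index


aliases = load_concordance(CONCORDANCE_CSV)
print(aliases["Example Edition, p. 1"])   # NL-I,1r
```

The same file can be published as it is, so that readers and other projects can use the concordance without going through the platform.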

Using the naming schema for uploading the facsimile

To publish the facsimiles of your corpus on your platform, you will have to give the right name to each image file. You probably have a set of files with arbitrary names, and you now have to connect each classification name to the relevant file name.
Please see the Technical section for more information on this matter.
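Connecting classification names to file names amounts to a renaming plan. The Python sketch below builds such a plan without touching any file; the directory, the file names and the function name `plan_renames` are hypothetical, and the Technical section may prescribe different conventions.

```python
from pathlib import Path
from typing import Dict, List, Tuple


def plan_renames(mapping: Dict[str, str], directory: str) -> List[Tuple[Path, Path]]:
    """Pair each existing file with its new, classification-based file name."""
    plan: List[Tuple[Path, Path]] = []
    for old_name, item_name in mapping.items():
        source = Path(directory) / old_name
        # Keep the original extension (e.g. ".tif") and swap only the stem.
        target = source.with_name(item_name + source.suffix)
        plan.append((source, target))
    targets = [target for _, target in plan]
    if len(set(targets)) != len(targets):
        raise ValueError("two files would receive the same name")
    return plan


# Hypothetical scanner output mapped to classification names.
for source, target in plan_renames(
    {"scan_0001.tif": "NL-I,1r", "scan_0002.tif": "NL-I,1v"}, "scans"
):
    print(source, "->", target)
```

Reviewing the plan first and applying it afterwards, for instance with `source.rename(target)`, keeps mistakes in the mapping from overwriting images.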

Define metadata schemes for the digital objects

Metadata are all the additional information about the data you publish on your platform, such as the name of the creator of a file, the file size, the location of the originals of the facsimiles, etc. There is no maximum amount of metadata, but there is a minimum, and there are standards for recording it. Make sure you collect at least the minimum metadata required for compatibility with Europeana (see: http://dev.europeana.eu/provide_content.php).
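Whatever minimum you adopt, it can be checked for every record before publication. In the Python sketch below the field names are placeholders based on the examples in the paragraph above; they are not the Europeana requirements, which should be taken from the Europeana documentation.

```python
from typing import Dict, List, Sequence

# Placeholder minimum, NOT the Europeana field list.
REQUIRED_FIELDS = ("creator", "file_size", "location_of_original")


def missing_fields(record: Dict[str, object],
                   required: Sequence[str] = REQUIRED_FIELDS) -> List[str]:
    """Return the required fields that are absent or empty in `record`."""
    missing: List[str] = []
    for field in required:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


record = {"creator": "Example Archive", "file_size": 1048576}
print(missing_fields(record))   # ['location_of_original']
```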

Involve your community

A Scholarly Community on the Web isn’t complete unless you also activate an electronic, open access publishing system. To do so you must involve your community. First, ask your colleagues to contribute to the classification, the production of scholarly contributions and any other material relevant to your corpus.

In order to structure the production and publication of scholarly contributions you need to:

  1. Create an editorial board. You should try to gather specialists in your corpus in order to gain consensus for your platform. They will have to review submitted contributions and decide whether they can be accepted for publication on the platform and hence linked to other contributions already present on it.
  2. Establish the rules for accepting contributions. It is important that the criteria for accepting contributions are transparent and that the rules the reviewers have to follow are clear.
  3. Establish the kinds of contributions you will accept. You don't necessarily have to accept all kinds of scholarly contributions. Choose in advance whether you intend to publish editions, essays, philological contributions, etc., and set a clear definition of the standard you want to reach.
  4. Establish the format of the contributions. You don't have to accept all formats. Prefer contributions in XML format. Talia can manage other formats well, but the scholarly quality of your online publications will suffer from poor standards (like Word).
  5. Establish the editorial rules. For every kind of contribution you accept, you have to set the editorial rules it has to follow. This covers not only the format of the file but also how the file has to be edited: which font, layout, text-encoding scheme and so on you require.
  6. Select a model licence to govern the relations between your platform and the authors.
  7. Publish the documentation. The names of all peer reviewers, all the above-mentioned rules and criteria, and the kind of licence you have chosen have to be explicitly listed and published on a web page containing the documentation of your platform.
Last Updated on Monday, 12 October 2009 13:11