Please use the following text to cite this item:
Abel, Andrea; Glaznieks, Aivars; Culy, Chris; Nicolas, Lionel and Stemle, Egon W., 2014, KoKo German L1 Learner Corpus v3, CLARIN DSpace, http://hdl.handle.net/20.500.12124/12
dc.contributor.author: Abel, Andrea
dc.contributor.author: Glaznieks, Aivars
dc.contributor.author: Culy, Chris
dc.contributor.author: Nicolas, Lionel
dc.contributor.author: Stemle, Egon W.
dc.date.accessioned: 2019-09-19T14:27:45Z
dc.date.available: 2019-09-19T14:27:45Z
dc.date.issued: 2014-12
dc.description: The KoKo Corpus is an error-annotated learner corpus of L1 German speakers. It was created with the aim of investigating and describing the writing skills of German-speaking secondary-school pupils at the end of their school career by analysing authentic texts produced in classrooms. The corpus consists of 1503 argumentative essays with manual transcription annotations and linguistic error annotations. Transcription annotations reflect surface features of the text, such as the graphical arrangement and self-corrections. Error annotations relate to the orthographic level (including punctuation errors), and a subset of the texts (n=597) also contains error annotations on the grammatical level. The corpus-building process was guided by two goals: (1) to describe writing skills at the transition from secondary school to university, and (2) to determine external factors that may influence the distribution of writing skills, such as region and sociolinguistic (gender, age), socio-economic, and language-related biographical factors (L1, preferred variety of German, reading and writing habits, etc.). The pupils were selected from three German-speaking areas: North Tyrol (Austria), South Tyrol (Italy), and Thuringia (Germany). Classes were sampled randomly, using the size of the city in which the school was located (small vs. medium vs. big) and the type of school (general education vs. education specific to a particular profession) as strata for the sampling. Since data were collected during regular courses, the typical composition of secondary-school classes in the three regions is represented in the corpus as a whole. Most of the participants are German native speakers (n=1319, 82.7%).
Person-related metadata provide information about the writer's L1, the writer's gender, the type of school the essay comes from, the location of the school the essay comes from, and the grade attended at the time of data collection. In addition, the corpus is automatically annotated, including tokenisation, sentence splitting, POS tagging, and lemmatisation.
dc.identifier.uri: http://hdl.handle.net/20.500.12124/12
dc.language.iso: deu
dc.publisher: Institute for Applied Linguistics, Eurac Research
dc.relation.isbasedon: https://gitlab.inf.unibz.it/commul/koko/data/bundle/-/tags/v3
dc.relation.isreferencedby: http://www.lrec-conf.org/proceedings/lrec2014/pdf/934_Paper.pdf
dc.relation.isreplacedby: http://hdl.handle.net/20.500.12124/77
dc.relation.replaces: http://hdl.handle.net/20.500.12124/11
dc.rights: CLARIN ACADEMIC END-USER LICENCE (ACA-BY-NC-NORED 1.0)
dc.rights.label: ACA
dc.rights.uri: https://gitlab.inf.unibz.it/commul/var/eurac-licenses/-/raw/v1.0/EULA-CLARIN-ACA-BY-NC-NORED.md
dc.source.uri: http://www.korpus-suedtirol.it/KoKo.html
dc.subject: learner corpus
dc.subject: German varieties
dc.subject: students in secondary school
dc.subject: argumentative essays
dc.title: KoKo German L1 Learner Corpus v3
dc.type: corpus
local.branding: Learner Language
local.contact.person: Corpus Manager, clarin@eurac.edu, Eurac Research CLARIN Centre (ERCC)
local.demo.uri: https://commul.eurac.edu/annis/koko
local.files.count: 7
local.files.size: 148995332
local.has.files: yes
local.hasCMDI: false
local.hidden: false
local.language.name: German
local.size.info: 1503 texts
local.size.info: 950,000 tokens
metashare.ResourceInfo#ContentInfo.mediaType: text

Version History

Version  Date  Summary
2024-06-01 00:00:00
3*  2014-12-01 00:00:00
2012-12-01 00:00:00
2012-12-01 00:00:00
* Selected version
This item is Academic Use and licensed under: CLARIN ACADEMIC END-USER LICENCE (ACA-BY-NC-NORED 1.0)