This is not the latest version of this item. The latest version can be found here.
KrdWrd CANOLA Corpus 1.0
Please use the following text to cite this item:
Stemle, Egon W. and Steger, Johannes M., 2010, KrdWrd CANOLA Corpus 1.0, CLARIN DSpace, http://hdl.handle.net/20.500.12124/8
Authors
Stemle, Egon W.; Steger, Johannes M.
Item identifier
http://hdl.handle.net/20.500.12124/8
Project URL
Date issued
2010-09-10
Size
216 files
Language(s)
English
Description
The CANOLA Corpus is a visually annotated English web corpus for training classification engines to remove boilerplate from unseen web pages. It was harvested, annotated, and evaluated with the tools and infrastructure of the KrdWrd Project.
Collections
This item is Publicly Available
and licensed under: