OpenMinTeD Catalogue of Corpora

Find easily accessible corpora of scholarly content and mine them!


A catalogue of corpora (datasets) made up mainly of Open Access scholarly publications. Users can view publicly available corpora that have been created with the OpenMinTeD Corpus Builder for Scholarly Works or manually uploaded to the OpenMinTeD platform. The catalogue can be browsed and searched via the faceted navigation facility or a Google-like free-text search query. All users can view the descriptions of the corpora (with administrative and technical information such as language, domain, keywords, licence and resource creator), as well as the contents and, when available, the metadata descriptions of the individual files that compose them. In addition, registered users can process the corpora with the TDM applications offered by OpenMinTeD and download them in accordance with their licensing conditions.

The catalogue is intended for users interested in finding corpora in various languages and domains that are easily accessible and ready to be processed with TDM applications. The use of a uniform metadata schema for their description facilitates comparison and contrast, and thereby selection of the appropriate corpus.

Scientific categorisation
  • Generic
    • Generic
  • Data Analysis
    • Other
Target users
  • Researchers
Resource availability and languages
  • English

The EOSC Portal is being jointly developed and maintained by the EOSC-hub, eInfraCentral and OpenAIRE-Advance projects, funded by the European Union’s Horizon 2020 research and innovation programme with the contribution of the European Commission.

2018 EOSC Portal