
Projects

Showing 1-20 of 228 projects.

| Name | Description | # Ann. | Author | Maintainer | Updated_at | Status |
| --- | --- | --- | --- | --- | --- | --- |
| Find2ER | Find the Findings of Enzymatic Reaction | 0 | Akihiro Kameda | Akihiro Kameda | 2015-02-20 | Testing |
| Lectin | | 3.52 K | | angata | 2018-02-09 | |
| AlvisNLP-Test | Project for testing AlvisNLP PubAnnotation server during BLAH3. | 17 | | Bibliome | 2017-01-20 | Testing |
| SNPPhenoExt | | 3 | behrouz bokharaeian | bokharaeian | 2016-04-30 | Developing |
| Preeclampsia | A collection of titles and abstracts of "Preeclampsia"-related papers. They were extracted from PubMed using the MeSH term "Preeclampsia" and specifying the language to be "English", on 11th September, 2017. The texts were then annotated by PubDictionaries using the dictionary "Preeclampsia". | 58.9 K | | callahan_tiff | 2018-02-27 | Developing |
| NCBIDiseaseCorpus | The NCBI disease corpus is fully annotated at the mention and concept level to serve as a research resource for the biomedical natural language processing community. | 6.88 K | Rezarta Islamaj Doğan, Robert Leaman, Zhiyong Lu | Chih-Hsuan Wei | 2015-08-06 | Released |
| tmVarCorpus | Wei C-H, Harris BR, Kao H-Y, Lu Z (2013) tmVar: A text mining approach for extracting sequence variants in biomedical literature, Bioinformatics, 29(11) 1433-1439, doi:10.1093/bioinformatics/btt156. | 1.43 K | Chih-Hsuan Wei, Bethany R. Harris, Hung-Yu Kao and Zhiyong Lu | Chih-Hsuan Wei | 2015-08-06 | Released |
| Test_PubTator | | 62 | | Chih-Hsuan Wei | 2016-04-20 | Testing |
| Ab3P-abbreviations | This corpus was developed during the creation of the Ab3P abbreviation definition identification tool. It includes 1250 manually annotated MEDLINE records. This gold standard includes 1221 abbreviation-definition pairs. Reference: Abbreviation definition identification based on automatic precision estimates. Sunghwan Sohn, Donald C Comeau, Won Kim and W John Wilbur. BMC Bioinformatics 2008, 9:402. DOI: 10.1186/1471-2105-9-402 | 2.34 K | Sunghwan Sohn, Donald C Comeau, Won Kim and W John Wilbur | comeau | 2016-07-29 | Beta |
| DLUT931 | DLUT NLP Lab. Test our event extraction result for 16 GE task. | 4.57 K | | DLUT931 | 2016-05-17 | Testing |
| test01 | | 0 | | Erika Asamizu | 2015-09-11 | Testing |
| Erin_test | @ Yonsei University | 0 | Erin | ErinHJ_Kim | 2017-07-13 | Testing |
| bionlp-st-2016-SeeDev-test | Entities annotations from the test set of the BioNLP-ST 2016 SeeDev task. The SeeDev task focuses on seed storage and reserve accumulation on the model organism, Arabidopsis thaliana. The SeeDev task is based on the knowledge model Gene Regulation Network for Arabidopsis (GRNA) that meets the needs of text-mining (i.e. manual annotation of texts and automatic information extraction), experimental data indexing and retrieval and reuse in other plant systems. It is also expected to meet the requirements of the integration of the text knowledge with knowledge derived from experimental data in view of modeling in systems biology. The GRNA model defines 16 different types of entities, and 22 types of event (in five sets of event types) that may be combined in complex events. For more information, please refer to the task website. All annotations: Train set, Development set, Test set (without events). | 184 | | EstelleChaix | 2018-01-13 | Released |
| bionlp-st-2016-SeeDev-training | Entities and event annotations from the training set of the BioNLP-ST 2016 SeeDev task. The SeeDev task focuses on seed storage and reserve accumulation on the model organism, Arabidopsis thaliana. The SeeDev task is based on the knowledge model Gene Regulation Network for Arabidopsis (GRNA) that meets the needs of text-mining (i.e. manual annotation of texts and automatic information extraction), experimental data indexing and retrieval and reuse in other plant systems. It is also expected to meet the requirements of the integration of the text knowledge with knowledge derived from experimental data in view of modeling in systems biology. The GRNA model defines 16 different types of entities, and 22 types of event (in five sets of event types) that may be combined in complex events. For more information, please refer to the task website. All annotations: Train set, Development set, Test set (without events). | 35 | | EstelleChaix | 2018-01-13 | Released |
| bionlp-st-2016-SeeDev-dev | Entities and event annotations from the development set of the BioNLP-ST 2016 SeeDev task. The SeeDev task focuses on seed storage and reserve accumulation on the model organism, Arabidopsis thaliana. The SeeDev task is based on the knowledge model Gene Regulation Network for Arabidopsis (GRNA) that meets the needs of text-mining (i.e. manual annotation of texts and automatic information extraction), experimental data indexing and retrieval and reuse in other plant systems. It is also expected to meet the requirements of the integration of the text knowledge with knowledge derived from experimental data in view of modeling in systems biology. The GRNA model defines 16 different types of entities, and 22 types of event (in five sets of event types) that may be combined in complex events. For more information, please refer to the task website. All annotations: Train set, Development set, Test set (without events). | 61 | | EstelleChaix | 2018-01-13 | Released |
| disease_gene_microbe_small | Small version (48 abstracts that mention both Crohn's and S. aureus) for development purposes. Abbreviation: dgm. Content: annotated abstracts on Crohn's disease or on Staphylococcus aureus (according to the jensenlab.org indexing resources). Entity types (three for a start): organisms (NCBI Taxonomy taxa), disease (Disease Ontology terms), human genes (ENSEMBL proteins). Aim: explore indirect associations of diseases to microbial species in this corpus via gene co-mentions. | 536 | | evangelos | 2018-01-13 | Testing |
| SPECIES800_autotagged | This project comprises the SPECIES800 corpus documents automatically annotated by the Jensenlab tagger. Annotated entity types are: genes/proteins from the mentioned organisms (and any human ones); PubChem Compound identifiers; NCBI Taxonomy entries; Gene Ontology cellular component terms; BRENDA Tissue Ontology terms; Disease Ontology terms; Environment Ontology terms. The SPECIES 800 (S800) corpus comprises 800 PubMed abstracts. In its original form species mentions were manually identified and mapped to the corresponding NCBI Taxonomy identifiers. Described in: The SPECIES and ORGANISMS Resources for Fast and Accurate Identification of Taxonomic Names in Text. Pafilis E, Frankild SP, Fanini L, Faulwetter S, Pavloudi C, et al. (2013). PLoS ONE, 2013, 8(6): e65390. doi:10.1371/journal.pone.0065390. The manually annotated corpus is also available as a PubAnnotation project. | 0 | Evangelos Pafilis, Sampo Pyysalo, Lars Juhl Jensen | evangelos | 2015-11-20 | Testing |
| SPECIES800 | SPECIES 800 (S800): an abstract-based manually annotated corpus. S800 comprises 800 PubMed abstracts in which organism mentions were identified and mapped to the corresponding NCBI Taxonomy identifiers. Described in: The SPECIES and ORGANISMS Resources for Fast and Accurate Identification of Taxonomic Names in Text. Pafilis E, Frankild SP, Fanini L, Faulwetter S, Pavloudi C, et al. (2013). PLoS ONE, 2013, 8(6): e65390. doi:10.1371/journal.pone.0065390 | 3.71 K | Evangelos Pafilis, Sune P. Frankild, Lucia Fanini, Sarah Faulwetter, Christina Pavloudi, Aikaterini Vasileiadou, Christos Arvanitidis, Lars Juhl Jensen | evangelos | 2018-04-25 | Released |
| disease_ontology_term_microbe | | 5 | | evangelos | 2018-04-25 | Developing |
| Genomics_Informatics | Genomics & Informatics (NLM title abbreviation: Genomics Inform) is the official journal of the Korea Genome Organization. A text corpus for this journal annotated with various levels of linguistic information would be a valuable resource, as the process of information extraction requires syntactic, semantic, and higher levels of natural language processing. In this study, we publish our new corpus called GNI Corpus version 1.0, extracted and annotated from full texts of Genomics & Informatics, with an NLTK (Natural Language ToolKit)-based text mining script. The preliminary version of the corpus could be used as a training and testing set of a system that serves a variety of functions for future biomedical text mining. | 35.3 K | Hyun-Seok Park | ewha-bio | 2018-11-27 | Beta |