This repository is a clone of an existing project, made for my own learning.
Annotated Corpus for Named Entity Recognition
This corpus, derived from the Groningen Meaning Bank, is tagged and annotated specifically for training a classifier to predict named entities such as geographical entities, persons, events, and locations.
The corpus is included in the repository here.
- numpy
- Pillow
- torch>=1.2
- tabulate
- tqdm
Please refer to requirements.txt for the exact versions I used.
The code is GPU-compatible.
- Create train, val, test splits using:
python build_dataset.py
- Build the words and tags vocabularies and dataset parameters using:
python build_vocab.py --min_count_word=1 --min_count_tag=1
- Train the model using:
python train.py --model_dir=experiments/base_model --restore_file=best
Feel free to change training parameters such as learning_rate, batch_size, and num_epochs by editing params.json.
- Evaluate the model using:
python evaluate.py --model_dir=experiments/base_model --restore_file=best
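
The two data preparation steps above (splitting the corpus, then building vocabularies with a minimum count threshold) can be sketched roughly as follows. This is a minimal illustration, not the actual contents of build_dataset.py or build_vocab.py: the function names `split_dataset` and `build_vocab`, the split fractions, the seed, and the toy sentences are all assumptions made for the example.

```python
import random
from collections import Counter


def split_dataset(samples, train_frac=0.7, val_frac=0.15, seed=230):
    """Shuffle samples and cut them into train/val/test lists.

    The fractions and seed are illustrative assumptions, not the
    values used by build_dataset.py.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # deterministic shuffle
    n_train = int(train_frac * len(shuffled))
    n_val = int(val_frac * len(shuffled))
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains
    return train, val, test


def build_vocab(sequences, min_count=1):
    """Return the sorted tokens that occur at least min_count times."""
    counts = Counter(token for seq in sequences for token in seq)
    return sorted(tok for tok, c in counts.items() if c >= min_count)


# Toy (words, tags) pairs; the tags are only meant to resemble NER labels.
base = [
    (["John", "lives", "in", "London"], ["B-per", "O", "O", "B-geo"]),
    (["Mary", "visited", "Paris"], ["B-per", "O", "B-geo"]),
    (["He", "works", "in", "Berlin"], ["O", "O", "O", "B-geo"]),
    (["Anna", "left", "London"], ["B-per", "O", "B-geo"]),
    (["They", "met", "in", "Paris"], ["O", "O", "O", "B-geo"]),
]
dataset = base * 2  # ten samples in total

train, val, test = split_dataset(dataset)

# Vocabularies are built from the training split only in this sketch.
word_vocab = build_vocab((words for words, _ in train), min_count=1)
tag_vocab = build_vocab((tags for _, tags in train), min_count=1)

print(len(train), len(val), len(test))
print(tag_vocab)
```

Raising `min_count` drops rare tokens from the vocabulary, which is the role the --min_count_word and --min_count_tag flags appear to play in the command above. With a value of 1, every token seen at least once is kept.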
Source code borrowed from here.