A collaborative data annotation platform for NLP research in low-resource language settings. Built on top of doccano and customized for the NLP4LRL project.
Live platform: annotate.nlp4lrl.com
- Collaborative annotation with role-based access (admin / annotator)
- Multi-language support, including low-resource and morphologically rich languages
- Annotation types: NER / sequence labeling, text classification, sequence-to-sequence
- Mobile support
- Dark theme
- RESTful API
Requirements: Python 3.10+, Poetry, Node 18+, Yarn
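The version requirements above can be checked before installing anything. The following is a minimal sketch, not part of the repository: the function names `version_ge`, `check_tool`, and `check_requirements` are hypothetical, and it assumes that the Python interpreter is available as `python3` and that each tool prints its version in response to `--version`.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not shipped with this repository.

# Succeed when dotted version $1 is at least $2 (compares major, then minor).
version_ge() {
  local have_major have_minor want_major want_minor
  IFS=. read -r have_major have_minor _ <<< "$1"
  IFS=. read -r want_major want_minor _ <<< "$2"
  have_minor=${have_minor:-0}
  want_minor=${want_minor:-0}
  if (( have_major != want_major )); then
    (( have_major > want_major ))
    return
  fi
  (( have_minor >= want_minor ))
}

# Report whether tool $1 is installed and, if $2 is given, at least that version.
check_tool() {
  local name=$1 minimum=${2:-} version
  if ! command -v "$name" >/dev/null 2>&1; then
    echo "missing: $name"
    return 1
  fi
  if [[ -z $minimum ]]; then
    echo "ok: $name"
    return 0
  fi
  # Take the first dotted number found in the tool's --version output.
  version=$("$name" --version 2>&1 | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1)
  if [[ -n $version ]] && version_ge "$version" "$minimum"; then
    echo "ok: $name $version"
  else
    echo "too old or unknown: $name ${version:-?} (need $minimum+)"
    return 1
  fi
}

# Check the four tools listed under Requirements; non-zero status if any check fails.
check_requirements() {
  local status=0
  check_tool python3 3.10 || status=1
  check_tool poetry || status=1
  check_tool node 18 || status=1
  check_tool yarn || status=1
  return "$status"
}
```

After sourcing the file, running `check_requirements` prints one line per tool. Versions are compared numerically, so Python 3.9 is correctly treated as older than 3.10.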
git clone https://github.com/NLP4LRL/doccano.git
cd doccano
Backend:
cd backend
poetry install
poetry run python manage.py migrate
poetry run python manage.py create_roles
poetry run python manage.py create_admin --noinput \
--username admin --email admin@example.com --password yourpassword
poetry run python manage.py runserver
In a second terminal, start the task worker:
cd backend
poetry run celery --app=config worker --loglevel=INFO --concurrency=1
Frontend:
cd frontend
yarn install
yarn dev # http://localhost:3000
See Doccano_Deployment_Notes_NLP4LRL.md for the full deployment guide including HTTPS setup, nginx configuration, and the Docker image build workflow for the VPS.
cp docker/.env.example docker/.env
# Edit docker/.env with your credentials
docker compose -f docker/docker-compose.prod.yml up -d
The production stack includes: Django + Gunicorn, PostgreSQL, RabbitMQ, Celery, Flower, and Nginx.
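Because docker/.env starts as a copy of docker/.env.example, it can drift out of date when the example file later gains new settings. The sketch below is a hypothetical helper, not part of the repository. It assumes both files use plain `KEY=value` lines and lists every key that appears in the example file but not in the real one. It makes no assumption about which keys this project actually defines.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not shipped with this repository.

# Print each KEY from the example file ($1) that has no KEY= line in the env file ($2).
missing_env_keys() {
  local example=$1 actual=$2 key
  # Keep only KEY=... lines (comments and blank lines are skipped), then take the key name.
  grep -E '^[A-Za-z_][A-Za-z0-9_]*=' "$example" | cut -d= -f1 | while read -r key; do
    grep -q "^${key}=" "$actual" || echo "$key"
  done
}
```

For example, `missing_env_keys docker/.env.example docker/.env` prints nothing when every key is present. It only checks that a key exists, not that its value has been changed from the placeholder.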
The VPS has only 2 GB of RAM, which is too little to build the Docker images there. Build them locally and push to Docker Hub:
docker buildx build \
--platform linux/amd64 \
-f docker/Dockerfile.nginx \
-t billofosuhene/doccano-frontend:nlp4lrl \
--push \
.
Then on the VPS:
docker compose -f docker/docker-compose.prod.yml pull nginx
docker compose -f docker/docker-compose.prod.yml up -d nginx
UI customizations live on the frontend/nlp4lrl-ui-rebrand branch. See Doccano_Deployment_Notes_NLP4LRL.md for a full list of changes made.
This platform is a customized fork of doccano by Hiroki Nakayama et al. If you use doccano in your research, please cite:
@misc{doccano,
title={{doccano}: Text Annotation Tool for Human},
url={https://github.com/doccano/doccano},
note={Software available from https://github.com/doccano/doccano},
author={
Hiroki Nakayama and
Takahiro Kubo and
Junya Kamura and
Yasufumi Taniguchi and
Xu Liang},
year={2018},
}
For questions about the NLP4LRL annotation platform, visit nlp4lrl.com.
