Tiv Heritage Archive

The Tiv Heritage Archive is a comprehensive digital preservation and cultural knowledge platform dedicated to documenting, archiving, and making accessible the rich cultural heritage of the Tiv people of Nigeria. It functions as a living digital library for Tiv language, traditions, history, and knowledge systems — serving language learners, researchers, educators, community members, and the global Tiv diaspora.

The platform is built on a fully custom PHP 8 MVC framework developed from scratch — with a proprietary router, base controller, ORM-like model layer, view renderer, and file-based caching system. The database schema spans 25+ tables (utf8mb4 encoding for full Tiv diacritic support) covering 7 content categories, a translation engine, community submissions, knowledge graph linking, team management, and a security/audit subsystem.

At the heart of the platform is an 8-layer hybrid translation engine designed to prioritise cultural accuracy before falling back to AI. It matches exact proverbs first, then curated phrase pairs, partial phrases, single dictionary words, category lookups, word-by-word grammar-rule reconstruction, and finally an optional Hugging Face NLLB AI model (facebook/nllb-200-distilled-600M) as the last resort — ensuring that culturally significant phrases are never overridden by raw machine translation.
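The fallback ordering can be sketched as a chain of resolvers tried in priority order until one returns a match. This is a minimal illustration only; the interface and layer class names are hypothetical, not the archive's actual API:

```php
<?php
// Sketch of a priority-ordered translation pipeline. Each layer returns
// a translation string, or null to defer to the next layer down.

interface TranslationLayer {
    public function translate(string $text): ?string;
}

final class TranslationEngine {
    /** @param TranslationLayer[] $layers ordered from most to least authoritative */
    public function __construct(private array $layers) {}

    public function translate(string $text): string {
        foreach ($this->layers as $layer) {
            $result = $layer->translate($text);
            if ($result !== null) {
                return $result; // a higher-priority (cultural) layer always wins
            }
        }
        return $text; // no layer matched: return the input unchanged
    }
}

// Assembled in the order described: exact proverbs first, AI last.
$engine = new TranslationEngine([
    new ExactProverbLayer($pdo),
    new PhrasePairLayer($pdo),
    new PartialPhraseLayer($pdo),
    new DictionaryWordLayer($pdo),
    new CategoryLookupLayer($pdo),
    new GrammarRuleLayer($pdo),
    new NllbFallbackLayer($huggingFaceClient), // facebook/nllb-200-distilled-600M
]);
```

Because the AI layer sits last in the array, a raw NLLB guess can never shadow a curated proverb or phrase pair.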

The system supports a freemium monetisation model (5 translations/day for guests, 15 for registered users, unlimited for paid subscribers via Paystack/Flutterwave), a moderated community contribution workflow with audio pronunciation upload, a knowledge graph that links content items across categories, a team contributor showcase with live statistics, and a full admin panel with CRUD, duplicate detection, CSV bulk import, and IP-based security.


[Screenshot: Tiv Heritage Archive]

Key Features

33 features built into this project

Seven-category archive — Names, Proverbs, Plants, Festivals, Foods, Words, Animals
8-layer hybrid Tiv ↔ English translation engine with cultural priority ordering
Hugging Face NLLB AI fallback (facebook/nllb-200-distilled-600M) for unmatched text
Community contribution system with moderator review and approval workflow
Audio pronunciation recording and upload (WebM/MP3/OGG/WAV up to 10 MB)
Freemium translation model — daily limits for guests/users and paid unlimited subscriptions
Crowdsourced missing-word queue with public suggestions and admin review
Translation feedback and star-rating system that converts accepted corrections into phrases
Knowledge graph — cross-content links between plants, festivals, foods, proverbs, and animals
Festival photo gallery supporting multiple images per event with sort ordering
4-tier RBAC — User, Contributor, Moderator, Admin with hierarchical permissions
IP-based rate limiting — 5-attempt lockout and auto-block after 10 failed login attempts
File-based caching system (Cache::remember) for archive counts and recent items
CSV bulk word import for rapid dictionary expansion
Global full-text search across all content categories
Team contributor showcase with live per-category contribution statistics
Learning module with embedded YouTube educational videos and difficulty levels
Progressive Web App — Service Worker and Web App Manifest for offline support
Activity audit log (JSON old/new value deltas) for all admin and moderator actions
Duplicate detection and removal tool for content quality control
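The file-based `Cache::remember` helper listed above can be approximated as follows; this is a sketch under assumptions, and the archive's real signature and storage path may differ:

```php
<?php
// Sketch of a file-based Cache::remember, as used for archive counts
// and recent items. Stores serialised values alongside an expiry time.

final class Cache {
    private const DIR = __DIR__ . '/../storage/cache/';

    public static function remember(string $key, int $ttlSeconds, callable $compute): mixed {
        $path = self::DIR . sha1($key) . '.cache';
        if (is_file($path)) {
            $entry = unserialize(file_get_contents($path));
            if ($entry['expires'] > time()) {
                return $entry['value']; // cache hit, still fresh
            }
        }
        $value = $compute(); // cache miss or stale entry: recompute
        file_put_contents(
            $path,
            serialize(['expires' => time() + $ttlSeconds, 'value' => $value]),
            LOCK_EX
        );
        return $value;
    }
}

// e.g. cache the proverb count for ten minutes ($db is illustrative):
$count = Cache::remember('proverb_count', 600, fn () => $db->count('proverbs'));
```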

Challenges & Solutions

Technical problems encountered during development and how each was resolved.

1

Designing a translation engine for a low-resource African language like Tiv required an entirely custom pipeline rather than relying on generic NLP tools. The solution was an 8-layer architecture that queries exact proverbs first (preserving cultural nuance), then curated phrase pairs, partial matches, dictionary words, category lookups, word-by-word grammar reconstruction, and finally the NLLB AI model as a last resort. This layered approach ensures that culturally significant phrases always take precedence over statistical AI guesses, while a missing-words queue and community suggestion workflow continuously expand the dictionary without admin bottlenecks.

2

Building a fully custom MVC framework from scratch — including a regex-based router, CSRF middleware, file-based caching, and an ORM-like base model with full-text search — meant every security concern had to be handled manually. Session ID regeneration every 5 minutes, bcrypt password hashing, progressive IP blocking, Content Security Policy headers, and JSON delta audit logs were all implemented without a framework scaffolding these defaults, requiring careful design to avoid common vulnerabilities.
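Two of those hand-rolled defences can be sketched like this, following the figures in the text (session rotation every 5 minutes, 5-attempt lockout, auto-block after 10 failures); the `login_attempts` table and function names are illustrative assumptions:

```php
<?php
// 1. Rotate the session ID every 5 minutes to limit fixation/hijacking.
function rotateSession(): void {
    if (!isset($_SESSION['rotated_at']) || time() - $_SESSION['rotated_at'] > 300) {
        session_regenerate_id(true);   // true = invalidate the old session ID
        $_SESSION['rotated_at'] = time();
    }
}

// 2. Progressive IP blocking on failed logins (hypothetical schema).
function recordFailedLogin(PDO $pdo, string $ip): void {
    $pdo->prepare('INSERT INTO login_attempts (ip, attempted_at) VALUES (?, NOW())')
        ->execute([$ip]);
}

function loginState(PDO $pdo, string $ip): string {
    $stmt = $pdo->prepare(
        'SELECT COUNT(*) FROM login_attempts
         WHERE ip = ? AND attempted_at > NOW() - INTERVAL 1 HOUR'
    );
    $stmt->execute([$ip]);
    $failures = (int) $stmt->fetchColumn();

    if ($failures >= 10) return 'blocked'; // auto-block threshold
    if ($failures >= 5)  return 'locked';  // temporary lockout threshold
    return 'ok';
}
```

Passwords themselves would go through `password_hash()` with the bcrypt algorithm, and every mutating form request through the CSRF middleware.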

3

Modelling the knowledge graph across 25+ tables with a generic knowledge_links pivot presented a schema challenge: any content type (plant, festival, food, proverb) needed to be linkable to any other, without a proliferation of join tables. The solution was a polymorphic source_table/source_id + target_table/target_id design, enabling relationship types like "plant_used_in_festival" or "ingredient_in_food" across all categories through a single table.
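The polymorphic pivot described above might look like this; the DDL and helper are illustrative sketches of the design, not the archive's exact schema:

```php
<?php
// One pivot table links any content row to any other, so no
// per-pair join tables are needed.
$ddl = <<<SQL
CREATE TABLE knowledge_links (
    id            INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    source_table  VARCHAR(50)  NOT NULL,  -- e.g. 'plants'
    source_id     INT UNSIGNED NOT NULL,
    target_table  VARCHAR(50)  NOT NULL,  -- e.g. 'festivals'
    target_id     INT UNSIGNED NOT NULL,
    link_type     VARCHAR(50)  NOT NULL,  -- e.g. 'plant_used_in_festival'
    UNIQUE KEY uniq_link (source_table, source_id, target_table, target_id, link_type)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
SQL;

// Link a plant to a festival without a dedicated join table:
function link(PDO $pdo, string $srcTable, int $srcId,
              string $dstTable, int $dstId, string $type): void {
    $pdo->prepare(
        'INSERT IGNORE INTO knowledge_links
         (source_table, source_id, target_table, target_id, link_type)
         VALUES (?, ?, ?, ?, ?)'
    )->execute([$srcTable, $srcId, $dstTable, $dstId, $type]);
}

link($pdo, 'plants', 7, 'festivals', 3, 'plant_used_in_festival');
```

The trade-off of this design is that the database cannot enforce foreign keys on `source_id`/`target_id`, so referential integrity has to be maintained at the application layer.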

4

Supporting the freemium translation model required tracking daily usage for both guests (IP-based) and authenticated users (DB-based) while simultaneously recording payment subscriptions, credit balances, and expiry dates. The translation_logs, translation_payments, and site_settings tables were designed together so that the engine checks payment status first, falls back to the daily limit, and logs every request — giving the admin full analytics on which words are missing and where the translation pipeline fails most often.
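The check order described above (paid status first, then the daily quota, keyed by user ID for members and by IP for guests) can be sketched as follows; table and column names are assumptions based on the text:

```php
<?php
// Sketch of the freemium gate: 5 translations/day for guests,
// 15 for registered users, unlimited with an active subscription.
function canTranslate(PDO $pdo, ?int $userId, string $ip): bool {
    // 1. An active paid subscription bypasses all limits.
    if ($userId !== null) {
        $stmt = $pdo->prepare(
            'SELECT COUNT(*) FROM translation_payments
             WHERE user_id = ? AND expires_at > NOW()'
        );
        $stmt->execute([$userId]);
        if ((int) $stmt->fetchColumn() > 0) return true;
    }

    // 2. Otherwise enforce today's quota against the request log.
    $limit = $userId !== null ? 15 : 5;
    $stmt = $userId !== null
        ? $pdo->prepare('SELECT COUNT(*) FROM translation_logs
                         WHERE user_id = ? AND DATE(created_at) = CURDATE()')
        : $pdo->prepare('SELECT COUNT(*) FROM translation_logs
                         WHERE ip = ? AND user_id IS NULL AND DATE(created_at) = CURDATE()');
    $stmt->execute([$userId ?? $ip]);

    return (int) $stmt->fetchColumn() < $limit;
}
```

Logging every request into `translation_logs` regardless of outcome is what feeds the admin analytics on missing words and pipeline failure points.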