In a major boost to natural language processing capabilities for Indian languages, the IIT Bombay-led BharatGPT group has collaborated with Seetha Mahalaxmi Healthcare (SML) to introduce ‘Hanooman’ – a suite of Indic language models that currently respond in 11 Indian languages. With plans to expand support to over 20 languages, these AI models aim to enable text, speech, video and multimedia generation for diverse applications spanning the healthcare, governance, finance and education sectors.
Key Details
- Series of large language models (LLMs) responding in 11 Indian languages presently
- Multimodal AI tools generating text, speech, video in Indian languages
- Size ranging from 1.5 billion to 40 billion parameters
- Working across 4 key areas – healthcare, governance, finance, education
BharatGPT Ecosystem
- Research consortium of 8 IITs led by IIT Bombay
- Backed by the Department of Science & Technology, SML and Reliance Jio
- Aims to develop India-specific LLMs similar to ChatGPT
Significance of Indigenous Models
- Mitigate concerns around data privacy and relevance
- Address India's language diversity
- Enhance access to and proliferation of AI
LLM Architecture, Training and Applications
Architecture
The Hanooman models are based on the transformer architecture, composed of:
- Embeddings to numerically represent text
- Encoders to establish contextual relationships
- Decoders to generate target text
Self-attention mechanism
The self-attention mechanism detects correlations across the input data, helping the models capture longer-range dependencies in text.
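For reference, the standard scaled dot-product attention used in transformer models (from the original "Attention Is All You Need" paper) is shown below; the source does not specify Hanooman's exact variant, so this is the textbook formulation:

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V$$

where $Q$, $K$ and $V$ are query, key and value projections of the input sequence and $d_k$ is the key dimension.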
Training
- Models trained on large Indian language corpora
- Corpus contains text from diverse sources
- Helps the models understand complex concepts and relationships in text
Applications
- Natural Language Understanding
- Machine Translation
- Question Answering
- Text Summarization
- Sentiment Analysis
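As an illustration of these task categories, the snippet below uses the generic Hugging Face `pipeline` API with its default English models; it is a minimal sketch of the task types only, not of the Hanooman models, which are not publicly exposed through this API in the source.

```python
# Minimal sketch of common LLM application tasks via Hugging Face pipelines.
# Default (English) models are used as stand-ins for illustration.
from transformers import pipeline

# Sentiment analysis
sentiment = pipeline("sentiment-analysis")
print(sentiment("The new Indic language models are impressive."))

# Text summarization
summarizer = pipeline("summarization")
article = ("Hanooman is a suite of Indic language models developed by the "
           "BharatGPT group for text, speech and multimedia generation "
           "across healthcare, governance, finance and education.")
print(summarizer(article, max_length=30, min_length=10))

# Question answering over a given context
qa = pipeline("question-answering")
print(qa(question="Who developed Hanooman?",
         context="Hanooman was developed by the IIT Bombay-led BharatGPT group."))
```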
Specialized Models – VizzhyGPT and LegalGPT
Apart from the base Hanooman models, customized models have also been developed for specific domains:
VizzhyGPT
- Fine-tuned AI model for healthcare
- Trained on large volumes of Indian medical data
- Applications: Medical chat, lab report assessments etc.
LegalGPT
- Targets legal domain
- Trained on Indian legal data
- Applications: Review of legal contracts, case law analysis etc.
These demonstrate the potential of building task-specific LLMs given sufficient domain training data.
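As a rough illustration of how such domain models are typically built, here is a minimal fine-tuning sketch using the Hugging Face Transformers library. Note that `gpt2` and `medical_corpus.txt` are placeholders: the actual Hanooman base models and their medical/legal training corpora are not public.

```python
# Minimal sketch of domain fine-tuning with Hugging Face Transformers.
# "gpt2" and "medical_corpus.txt" are placeholders, not the actual
# Hanooman base model or corpus.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)
from datasets import load_dataset

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Hypothetical domain corpus: one document per line in a plain-text file.
dataset = load_dataset("text", data_files={"train": "medical_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True,
                                 remove_columns=["text"])

# mlm=False selects the causal (next-word) language modeling objective.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="domain-lm",
                           num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```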
Key Data and Figures
Parameter Size of Select Hanooman Models:
| Model | Parameter Size |
| --- | --- |
| Hanooman 1.5B | 1.5 billion |
| Hanooman 40B | 40 billion |
| Hanooman-Tamil | 2.8 billion |
Languages Covered: Hindi, Tamil, Marathi, Bengali, Telugu, Malayalam, Kannada, Gujarati, Punjabi, Odia, Urdu
Neural Network Architecture
The Hanooman models are based on a transformer neural network architecture. This consists of an encoder and a decoder:
- Encoder: Breaks the input text into smaller chunks and draws contextual relationships between words through self-attention mechanisms, capturing dependencies regardless of position in the text.
- Decoder: Uses the encoder's output to generate the target text word by word, predicting each next word from all previous words. This helps generate coherent, relevant output text.
The self-attention layer connects all positions of the input sequence and computes a representation by integrating information from the entire sequence, allowing the model to capture long-range dependencies in text.
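The following is a minimal single-head self-attention sketch in NumPy, illustrating the computation described above. The projection matrices and dimensions are arbitrary toy values, not Hanooman's actual configuration.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention over a sequence of token embeddings X.

    X: (seq_len, d_model) input embeddings
    Wq, Wk, Wv: (d_model, d_k) query/key/value projection matrices
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # pairwise scores between all positions
    weights = softmax(scores, axis=-1)       # each position attends to every other
    return weights @ V                       # context-integrated representations

# Toy example: 5 tokens, 16-dim embeddings, 8-dim attention head
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 16))
Wq, Wk, Wv = (rng.normal(size=(16, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (5, 8)
```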
Training Techniques
Some key training techniques used with Hanooman models:
- Transfer Learning: The models are first pre-trained on large unlabeled corpora such as Wikipedia to obtain general language understanding capabilities, which are then transferred by fine-tuning on downstream tasks. This reduces compute requirements.
- Self-Supervised Learning: Pre-training tasks are formulated to exploit unlabeled data and capture linguistic properties. For example, the masked language modeling task predicts randomly masked words from their surrounding context (see the sketch after this list).
- Multitask Learning: Different end tasks such as translation and Q&A are incorporated into a single model, enabling learned representations to be shared across tasks. This improves overall performance.
- Curriculum Learning: Model training progresses from simpler to more complex material by gradually increasing dataset difficulty over epochs.
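Below is a toy sketch of the masked language modeling objective mentioned under self-supervised learning. Real implementations operate on integer token ids using a tokenizer's mask token, but the corruption logic follows the same idea.

```python
import random

MASK = "[MASK]"  # illustrative mask token; real tokenizers use integer ids

def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Corrupt a token sequence for a masked-language-modeling objective.

    Returns the masked input and the labels the model must predict
    (None where no prediction is required), mirroring BERT-style pretraining.
    """
    rng = random.Random(seed)
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)  # hide the token from the model
            labels.append(tok)   # ...but keep it as the training target
        else:
            inputs.append(tok)
            labels.append(None)  # no loss computed at this position
    return inputs, labels

sentence = "the model predicts masked words from the surrounding context".split()
masked, targets = mask_tokens(sentence)
print(masked)
print(targets)
```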
The development of indigenous language models like Hanooman underscores India’s advancements in AI research and applications. As these models evolve further, they will empower citizens by enhancing access to information in native languages while also presenting new opportunities for innovation. Robust policy frameworks and interdisciplinary collaboration will be vital to guide the responsible and ethical development of this technology.
