WhatsApp Database Analysis
WhatsApp is one of the most widely used messaging applications in India, making it a critical source of evidence in cyber crime investigations. Understanding its database structure and storage locations is essential for any mobile forensics examiner.
WhatsApp Data Locations
# Android WhatsApp Locations
/data/data/com.whatsapp/databases/
msgstore.db # Main message database
wa.db # Contacts database
axolotl.db # Encryption keys
chatsettings.db # Chat-specific settings
/data/media/0/WhatsApp/ # Android 11+: /data/media/0/Android/media/com.whatsapp/WhatsApp/
Media/ # Sent/received media files
Databases/ # Encrypted backup (msgstore.db.crypt14)
Backups/ # Local backups
# iOS WhatsApp Locations
/private/var/mobile/Containers/Shared/AppGroup/[UUID]/
ChatStorage.sqlite # Main message database
ContactsV2.sqlite # Contacts
Media/ # Media files
Message Database Schema (msgstore.db)
| Table | Key Columns | Forensic Value |
|---|---|---|
| messages | key_remote_jid, data, timestamp, status, media_wa_type | Message content, sender/receiver, timestamps |
| chat_list | key_remote_jid, subject, creation, last_read_message_table_id | All conversations, group info |
| message_media | message_row_id, file_path, file_size, media_duration | Media file references |
| messages_quotes | message_row_id, quoted_row_id | Reply/quote relationships |
| call_log | jid, duration, timestamp, video_call | Voice/video call records |
-- Extract all messages with contact info
SELECT
m.key_remote_jid AS contact,
m.key_from_me AS sent_by_me,
m.data AS message_text,
datetime(m.timestamp/1000, 'unixepoch') AS message_time_utc, -- UTC; adding 'localtime' would apply the examiner workstation's time zone
m.media_wa_type AS media_type,
m.status
FROM messages m
ORDER BY m.timestamp DESC;
-- Find messages between specific dates (all of January 2024, bounds interpreted as UTC)
SELECT * FROM messages
WHERE timestamp >= CAST(strftime('%s', '2024-01-01') AS INTEGER) * 1000
AND timestamp < CAST(strftime('%s', '2024-02-01') AS INTEGER) * 1000;
-- Find candidate deleted messages (row still exists but the text in data is NULL)
SELECT * FROM messages
WHERE data IS NULL
AND media_wa_type = 0
AND key_remote_jid IS NOT NULL;
WhatsApp uses end-to-end encryption for messages in transit, but the working database (msgstore.db) in the app's private directory is not encrypted by WhatsApp itself. Reading it requires root access or a full file system acquisition. The crypt14 backup files on shared storage are encrypted and must be decrypted using the key file stored in the app's private data on the device.
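The queries above can also be run from a script against a working copy of the database. The following is a minimal Python sketch, assuming the legacy messages table and the column names listed in the schema table (other WhatsApp versions may use different table names). The helper names read_messages and whatsapp_ms_to_utc are illustrative and not part of any forensic tool.

```python
import sqlite3
from datetime import datetime, timezone
from pathlib import Path


def whatsapp_ms_to_utc(ts_ms):
    """Convert a millisecond Unix timestamp to an ISO 8601 UTC string."""
    return datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc).isoformat()


def read_messages(db_path):
    """Return messages from a working copy of msgstore.db, oldest first.

    Assumes the legacy `messages` table; adjust names for other schemas.
    """
    # mode=ro opens the file read-only so the query cannot alter the copy.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT key_remote_jid, key_from_me, data, timestamp "
            "FROM messages ORDER BY timestamp ASC"
        ).fetchall()
    finally:
        conn.close()
    return [
        {
            "contact": jid,
            "sent_by_me": bool(from_me),
            "text": text,
            "time_utc": whatsapp_ms_to_utc(ts),
        }
        for jid, from_me, text, ts in rows
    ]
```

Calling read_messages on the path of a copied msgstore.db returns one dictionary per message with the timestamp already expressed in UTC, which avoids depending on the time zone of the analysis workstation.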
Browser History Extraction
Mobile browser data provides valuable evidence of web activity, searches, downloads, and login credentials. Each browser stores data in slightly different formats and locations.
Browser Data Locations
Chrome (Android)
/data/data/com.android.chrome/app_chrome/Default/
History, Cookies, Login Data, Web Data
Safari (iOS)
/private/var/mobile/Library/Safari/
History.db, Bookmarks.db, BrowserState.db
Firefox (Android)
/data/data/org.mozilla.firefox/files/
places.sqlite (current releases); browser.db, permissions.sqlite under mozilla/[profile]/ (legacy releases)
Samsung Internet
/data/data/com.sec.android.app.sbrowser/databases/
history.db, bookmarks.db
Chrome History Analysis
-- Extract browsing history with visit count
SELECT
u.url,
u.title,
datetime(v.visit_time/1000000-11644473600, 'unixepoch') AS visit_time,
u.visit_count,
v.transition
FROM urls u
JOIN visits v ON u.id = v.url
ORDER BY v.visit_time DESC;
-- Extract search terms
SELECT
term,
url_id
FROM keyword_search_terms
ORDER BY url_id DESC;
-- Extract downloads
SELECT
target_path,
tab_url,
datetime(start_time/1000000-11644473600, 'unixepoch') AS download_time,
total_bytes
FROM downloads;
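The arithmetic in the queries above (divide by 1,000,000, then subtract 11644473600) converts Chrome's timestamps, which count microseconds since 1601-01-01 UTC, into Unix time. The same conversion can be sketched in Python when values are exported and processed outside SQLite. The function names below are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Chrome/WebKit timestamps count microseconds from this epoch.
WEBKIT_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def webkit_to_utc(webkit_us):
    """Convert a Chrome timestamp (microseconds since 1601) to UTC."""
    return WEBKIT_EPOCH + timedelta(microseconds=webkit_us)


def utc_to_webkit(moment):
    """Convert a timezone-aware datetime to a Chrome timestamp."""
    return (moment - WEBKIT_EPOCH) // timedelta(microseconds=1)
```

The inverse function is useful for building WHERE clauses, for example to find visits inside a known time window without converting every row.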
Location Data Analysis
Mobile devices continuously collect location data through GPS, cell towers, and WiFi networks. This data is stored by both the operating system and individual applications.
Location Data Sources
| Source | Location (Android) | Data Stored |
|---|---|---|
| Google Location History | /data/data/com.google.android.gms/ | GPS coordinates, timestamps, activity type |
| Google Maps | /data/data/com.google.android.apps.maps/ | Search history, saved places, directions |
| Photos EXIF | /data/media/0/DCIM/ | GPS coordinates embedded in images |
| WiFi Networks | /data/misc/wifi/ | SSID, BSSID (can be geolocated) |
| Cell Towers (SIM) | LOCI file on SIM card | TMSI and Location Area Identity (MCC, MNC, LAC) of last registration; no Cell ID is stored |
iOS Location Databases
# Significant Locations (requires encryption key)
/private/var/mobile/Library/Caches/com.apple.routined/
Cache.sqlite # Significant locations learned by device
# Location Services Cache
/private/var/root/Library/Caches/locationd/
consolidated.db # Historical location cache (legacy)
cache_encryptedB.db # Cell tower and WiFi locations
# App-specific location data
/private/var/mobile/Containers/Data/Application/[UUID]/
# Individual apps store their own location history
Use EXIF extraction tools to obtain GPS coordinates from photos taken with location tagging enabled. The metadata can include latitude, longitude, altitude, timestamp, and camera direction. This can place the device at a specific location at a specific time, which is powerful evidence for establishing presence. Because EXIF fields can be edited, corroborate them with other sources.
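EXIF stores latitude and longitude as degrees, minutes, and seconds plus a hemisphere reference (N/S, E/W), so extracted values usually need converting to decimal degrees before mapping. A minimal sketch of that conversion follows. The function name dms_to_decimal is illustrative, and the input is assumed to be already parsed out of the image by an EXIF tool.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert EXIF (degrees, minutes, seconds) and a hemisphere ref to decimal degrees.

    `dms` may hold ints, floats, or Fractions (EXIF stores rationals).
    `ref` is 'N', 'S', 'E', or 'W'; south and west become negative.
    """
    if ref not in ("N", "S", "E", "W"):
        raise ValueError(f"unexpected hemisphere reference: {ref!r}")
    degrees, minutes, seconds = (float(part) for part in dms)
    value = degrees + minutes / 60 + seconds / 3600
    return -value if ref in ("S", "W") else value


# Example with rational values as they might appear in EXIF data.
latitude = dms_to_decimal((Fraction(19), Fraction(4), Fraction(336, 10)), "N")
longitude = dms_to_decimal((Fraction(72), Fraction(52), Fraction(3972, 100)), "E")
```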
Social Media App Artifacts
Social media applications store significant amounts of user data locally, including messages, posts, friend lists, and activity logs.
Key Social Media Data Locations
Instagram
Direct messages, stories viewed, search history, liked posts, downloaded media
Facebook Messenger
Messages, calls, photos shared, stickers, reactions, read receipts
Telegram
Messages, secret chats (if accessible), channels, groups, media cache
TikTok
Viewed videos, saved videos, messages, search history, drafts
# Facebook/Messenger
/data/data/com.facebook.orca/databases/
threads_db2 # Messenger conversations
prefs_db # User preferences
# Instagram
/data/data/com.instagram.android/databases/
direct.db # Direct messages
/data/data/com.instagram.android/shared_prefs/
# Account tokens and settings
# Telegram
/data/data/org.telegram.messenger/files/
cache4.db # Messages and chats
# Note: Secret chats use device-specific encryption
# Twitter/X
/data/data/com.twitter.android/databases/
[user_id]-66.db # Timeline and DMs
Deleted Message Recovery
Recovering deleted messages is often critical in investigations. Success depends on the app, device state, and time elapsed since deletion.
Methods for Deleted Data Recovery
| Method | Applicability | Success Rate |
|---|---|---|
| SQLite WAL Recovery | Apps using SQLite with WAL mode | High if WAL not checkpointed |
| SQLite Freelist Analysis | All SQLite databases | Medium - depends on overwrites |
| Physical Image Carving | Full physical acquisition | Variable - fragments may be found |
| Cloud Backup Recovery | If cloud backup was made before deletion | High if backup exists |
| Notification Databases | System notification logs | Low - limited retention |
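To judge whether WAL recovery is worth attempting, it helps to see how many page frames a -wal file holds. The sketch below reads the 32-byte WAL header and derives the number of frame slots from the file size, based on the published SQLite WAL layout (32-byte file header, 24-byte header per frame). The function name summarize_wal is illustrative, and the count includes frames left over from earlier checkpoint cycles, which are often the forensically interesting ones.

```python
import os
import struct

WAL_MAGIC = (0x377F0682, 0x377F0683)  # the two valid WAL magic numbers
WAL_HEADER_SIZE = 32
WAL_FRAME_HEADER_SIZE = 24


def summarize_wal(wal_path):
    """Return page size, checkpoint sequence, and frame slot count of a WAL file."""
    size = os.path.getsize(wal_path)
    with open(wal_path, "rb") as handle:
        header = handle.read(WAL_HEADER_SIZE)
    if len(header) < WAL_HEADER_SIZE:
        raise ValueError("file is too small to contain a WAL header")
    # All header fields are big-endian 32-bit unsigned integers.
    magic, _version, page_size, checkpoint_seq = struct.unpack(">IIII", header[:16])
    if magic not in WAL_MAGIC:
        raise ValueError("not a SQLite WAL file")
    frame_size = WAL_FRAME_HEADER_SIZE + page_size
    return {
        "page_size": page_size,
        "checkpoint_seq": checkpoint_seq,
        "frame_slots": (size - WAL_HEADER_SIZE) // frame_size,
    }
```

This reads the file as raw bytes and never opens the database through SQLite, so it does not trigger a checkpoint.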
SQLite Deleted Record Recovery
# Check for WAL (Write-Ahead Log) files
msgstore.db-wal # Committed changes not yet checkpointed into the main database
msgstore.db-shm # Shared memory index for the WAL
# WAL files may contain deleted records that haven't been
# checkpointed (written back) to the main database
# Work on copies: opening a database with SQLite can trigger a checkpoint
# Freelist pages contain previously deleted data
# Use specialized tools to extract:
# - sqlparse.py (SQLite deleted records parser script)
# - undark (SQLite deleted record recovery)
# - Commercial tools: Oxygen, Cellebrite, Magnet AXIOM
# Manual approach: Examine database pages for fragments
-- List tables and root pages, then check page size and freelist size
SELECT name, rootpage FROM sqlite_master
WHERE type = 'table';
PRAGMA page_size;
PRAGMA freelist_count;
# Then use hex editor to examine raw page content
When a WhatsApp message is deleted, its row is removed from the table, but the record's bytes may remain in freed space within a page or on freelist pages until SQLite reuses them. "Delete for Everyone" removes content from the data field but leaves metadata. Media files are deleted from storage but may be recoverable from unallocated space.
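Whether freelist analysis is likely to yield anything can be checked from the 100-byte database header without opening the file through SQLite. The following sketch reads the page size and freelist fields at their documented header offsets (16, 32, and 36). The function name read_sqlite_header is illustrative.

```python
import struct

SQLITE_MAGIC = b"SQLite format 3\x00"


def read_sqlite_header(db_path):
    """Read page size and freelist fields from a SQLite database header."""
    with open(db_path, "rb") as handle:
        header = handle.read(100)
    if len(header) < 100 or header[:16] != SQLITE_MAGIC:
        raise ValueError("not a SQLite 3 database file")
    # Offset 16: page size as a big-endian 16-bit value; 1 means 65536.
    page_size = struct.unpack(">H", header[16:18])[0]
    if page_size == 1:
        page_size = 65536
    # Offset 32: first freelist trunk page; offset 36: total freelist pages.
    first_trunk, freelist_pages = struct.unpack(">II", header[32:40])
    return {
        "page_size": page_size,
        "first_freelist_trunk": first_trunk,
        "freelist_pages": freelist_pages,
    }
```

A non-zero freelist_pages value means the file still holds pages that once belonged to tables or indexes. The first trunk page starts at byte offset (first_freelist_trunk - 1) * page_size, which is a reasonable starting point in a hex editor.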
Timeline Correlation
Building a comprehensive timeline by correlating artifacts from multiple sources is essential for reconstructing events and establishing patterns of behavior.
Timeline Data Sources
- Call Logs: Incoming/outgoing calls with timestamps and duration
- Messages: SMS, WhatsApp, and other messenger timestamps
- Browser History: Web visits with precise timestamps
- Location Data: GPS coordinates and cell tower connections
- Photos: EXIF timestamps and GPS coordinates
- App Usage: Android UsageStats, iOS Screen Time data
- System Logs: Device boot times, app installations, updates
Normalize all timestamps to a single time zone (preferably UTC) and record the original offset of each source. Verify the device's time settings: if the device clock was incorrect, its timestamps may all be shifted. Cross-reference with external data sources (CDR from the service provider, CCTV timestamps) to validate timeline accuracy.
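The normalization step can be sketched as follows. This is a minimal example, assuming timestamps arrive as "YYYY-MM-DD HH:MM" text in a known source time zone. The names normalize_to_utc and clock_skew are illustrative, and clock_skew is taken to mean how far the device clock ran ahead of true time, as established from an external reference.

```python
from datetime import datetime, timedelta, timezone

# India Standard Time is UTC+05:30 with no daylight saving.
IST = timezone(timedelta(hours=5, minutes=30))


def normalize_to_utc(local_text, source_tz, clock_skew=timedelta(0)):
    """Parse a local timestamp, correct for device clock error, return UTC.

    clock_skew is (device clock - true time); positive if the device ran fast.
    """
    naive = datetime.strptime(local_text, "%Y-%m-%d %H:%M")
    aware = naive.replace(tzinfo=source_tz)
    return (aware - clock_skew).astimezone(timezone.utc)


photo_time = normalize_to_utc("2024-01-15 14:30", IST)
```

Here 14:30 IST becomes 09:00 UTC. Keep the original value alongside the normalized one in the report so the conversion can be re-checked.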
Correlation Example
# Example: Establishing presence at crime scene
Time: 2024-01-15 14:30 IST
Source: Photo EXIF
GPS: 19.0760, 72.8777 (Mumbai)
Evidence: Photo taken at location
Time: 2024-01-15 14:32 IST
Source: WhatsApp
Message: "I'm here" sent to contact
Evidence: Self-admission of presence
Time: 2024-01-15 14:35 IST
Source: Cell Tower (CDR)
LAC/Cell: 12345/6789
Evidence: Connected to tower covering area
Time: 2024-01-15 14:40 IST
Source: Google Maps
Search: "nearest ATM"
Evidence: Activity at location
Corroboration: Multiple independent sources
confirm presence at location during timeframe
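The corroboration step in the example above can be sketched as a simple check: given events already normalized to UTC, list the distinct sources that fall within a tolerance of a time of interest. The event values below are the example's IST times converted to UTC, and the function name sources_in_window is illustrative.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc

# (timestamp in UTC, source) pairs taken from the correlation example.
EVENTS = [
    (datetime(2024, 1, 15, 9, 0, tzinfo=UTC), "Photo EXIF"),
    (datetime(2024, 1, 15, 9, 2, tzinfo=UTC), "WhatsApp"),
    (datetime(2024, 1, 15, 9, 5, tzinfo=UTC), "Cell Tower (CDR)"),
    (datetime(2024, 1, 15, 9, 10, tzinfo=UTC), "Google Maps"),
]


def sources_in_window(events, center, tolerance):
    """Return the sorted distinct sources with an event within +/- tolerance."""
    return sorted({source for ts, source in events if abs(ts - center) <= tolerance})


corroborating = sources_in_window(
    EVENTS, datetime(2024, 1, 15, 9, 5, tzinfo=UTC), timedelta(minutes=5)
)
```

Counting distinct sources rather than events matters here, since several artifacts from one app are weaker corroboration than single artifacts from independent sources.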
- WhatsApp stores messages in msgstore.db (Android) or ChatStorage.sqlite (iOS) - learn the schema for effective queries
- Browser history provides valuable evidence - Chrome, Safari, and Firefox each use SQLite databases with different schemas
- Location data comes from multiple sources: GPS, cell towers, WiFi networks, and photo EXIF metadata
- Social media apps store local caches of messages, media, and activity that may persist even after cloud deletion
- Deleted messages may be recoverable from SQLite WAL files, freelist pages, or unallocated space
- Timeline correlation using multiple data sources provides stronger evidence than any single artifact
- Always normalize timestamps to UTC and verify device clock accuracy against external references
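- Document extraction methods and maintain chain of custody to meet the electronic evidence requirements of Section 65B of the Indian Evidence Act and Section 63 of the Bharatiya Sakshya Adhiniyam (BSA)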
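One common way to support the chain of custody is to record a cryptographic hash of each extracted file at acquisition and again before analysis. The sketch below computes a SHA-256 digest with Python's standard library. The function name sha256_file is illustrative, and the choice of SHA-256 is an assumption rather than a requirement stated in this material.

```python
import hashlib


def sha256_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large images do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Matching digests at both points show that the working copy is bit-for-bit identical to what was acquired.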