Welcome to the LogSentinelAI Wiki! This comprehensive guide covers everything you need to know about using LogSentinelAI for intelligent log analysis.
```bash
# Install LogSentinelAI
pip install logsentinelai

# Or using UV (recommended)
uv add logsentinelai
```
```bash
# Copy configuration template
cp config.template config

# Edit configuration (set your LLM provider)
nano config
```

```bash
# Download the GeoIP database
logsentinelai-geoip-download
```
```bash
# Analyze Apache access logs
logsentinelai-httpd-access /var/log/apache2/access.log

# Or use sample data
logsentinelai-httpd-access sample-logs/access-100.log
```
- Python: 3.11 or higher
- Memory: 4GB RAM minimum, 8GB recommended
- Storage: 1GB free space for GeoIP database
- Network: Internet connection for LLM API calls
```bash
pip install logsentinelai
```
```bash
# Install UV first
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install LogSentinelAI
uv add logsentinelai
```
```bash
git clone https://github.com/call518/LogSentinelAI.git
cd LogSentinelAI
uv sync
```
- Configuration: Copy and edit `config.template`
- GeoIP Database: Run `logsentinelai-geoip-download`
- Test Installation: Run `logsentinelai --help`
```toml
[llm]
provider = "openai"  # openai, ollama, vllm
model = "gpt-4o-mini"
api_key = "your-api-key"  # Only for OpenAI
base_url = ""  # For Ollama/vLLM

[elasticsearch]
enabled = true
host = "localhost"
port = 9200
index_prefix = "logsentinelai"

[geoip]
enabled = true
database_path = "~/.logsentinelai/GeoLite2-City.mmdb"

[analysis]
language = "english"
max_tokens = 4000
temperature = 0.1
```
```toml
[llm]
provider = "openai"
model = "gpt-4o-mini"  # or gpt-4, gpt-3.5-turbo
api_key = "sk-your-api-key-here"
```
```toml
[llm]
provider = "ollama"
model = "llama3.1:8b"  # or llama3.1:70b, mistral, etc.
base_url = "http://localhost:11434"
```
```toml
[llm]
provider = "vllm"
model = "meta-llama/Llama-3.1-8B-Instruct"
base_url = "http://localhost:8000"
```
```bash
# Basic analysis
logsentinelai-httpd-access /var/log/apache2/access.log

# With Elasticsearch output
logsentinelai-httpd-access /var/log/nginx/access.log --output elasticsearch

# Real-time monitoring
logsentinelai-httpd-access /var/log/apache2/access.log --monitor
```
What it detects:
- SQL injection attempts
- XSS attacks
- Brute force attacks
- Suspicious user agents
- Unusual request patterns
- Geographic anomalies
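The actual detection is performed by the LLM, but the flavor of these patterns can be sketched with a few illustrative regexes (a simplified sketch with hypothetical names, not the project's logic; real attacks are far more varied and often obfuscated, which is why LLM analysis is used):

```python
import re

# Illustrative signatures only -- the real analysis is LLM-driven and can
# catch obfuscated variants these regexes would miss.
SUSPICIOUS_PATTERNS = {
    "sql_injection": re.compile(
        r"(\bUNION\b.*\bSELECT\b|'\s*OR\s+'1'\s*=\s*'1)", re.IGNORECASE
    ),
    "xss": re.compile(r"<script\b|javascript:", re.IGNORECASE),
    "path_traversal": re.compile(r"\.\./"),
}

def match_threats(request_line: str) -> list[str]:
    """Return the names of all patterns matching a raw request line."""
    return [name for name, pat in SUSPICIOUS_PATTERNS.items()
            if pat.search(request_line)]
```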
```bash
logsentinelai-httpd-apache /var/log/apache2/error.log
```
What it detects:
- Configuration errors
- Module failures
- Security-related errors
- Performance issues
```bash
logsentinelai-linux-system /var/log/syslog
```
What it detects:
- Authentication failures
- Service crashes
- Security events
- System anomalies
```bash
# TCPDump output analysis
logsentinelai-tcpdump /path/to/tcpdump.log

# Direct from network interface
sudo tcpdump -i eth0 -w - | logsentinelai-tcpdump -
```
What it detects:
- Network intrusion attempts
- Suspicious traffic patterns
- Protocol anomalies
- Data exfiltration
1. Get API Key
   - Visit https://platform.openai.com/api-keys
   - Create a new API key
   - Copy the key
2. Configure LogSentinelAI
   ```toml
   [llm]
   provider = "openai"
   model = "gpt-4o-mini"
   api_key = "sk-your-key-here"
   ```
3. Test Configuration
   ```bash
   logsentinelai-httpd-access sample-logs/access-100.log
   ```
1. Install Ollama
   ```bash
   curl -fsSL https://ollama.com/install.sh | sh
   ```
2. Pull Model
   ```bash
   ollama pull llama3.1:8b
   ```
3. Configure LogSentinelAI
   ```toml
   [llm]
   provider = "ollama"
   model = "llama3.1:8b"
   base_url = "http://localhost:11434"
   ```
| Use Case | OpenAI | Ollama | Performance |
|---|---|---|---|
| High Accuracy | gpt-4o | llama3.1:70b | Excellent |
| Balanced | gpt-4o-mini | llama3.1:8b | Good |
| Fast/Local | gpt-3.5-turbo | mistral:7b | Fast |
```bash
# Start Elasticsearch
docker run -d \
  --name elasticsearch \
  -p 9200:9200 \
  -e "discovery.type=single-node" \
  -e "xpack.security.enabled=false" \
  elasticsearch:8.11.0
```
```toml
[elasticsearch]
enabled = true
host = "localhost"
port = 9200
index_prefix = "logsentinelai"
use_ssl = false
verify_certs = false
```
LogSentinelAI automatically creates optimized index templates for:

- Security Events: `logsentinelai-security-*`
- Raw Logs: `logsentinelai-logs-*`
- Metadata: `logsentinelai-metadata-*`
Default ILM policy:
- Hot Phase: 7 days
- Warm Phase: 30 days
- Cold Phase: 90 days
- Delete: 365 days
1. Download Dashboard
   - Get `Kibana-9.0.3-Dashboard-LogSentinelAI.ndjson` from the repository
2. Import in Kibana
   - Go to Stack Management → Saved Objects
   - Click "Import"
   - Select the `.ndjson` file
3. Configure Index Patterns
   - Go to Stack Management → Index Patterns
   - Create pattern: `logsentinelai-*`
- Security Overview: Real-time threat detection
- Geographic Analysis: Attack origin mapping
- Timeline Analysis: Event chronology
- Top Attackers: Most active threat sources
- Attack Types: Categorized threat analysis
⚠️ Important: For SSH connections, the target host must be added to your system's known_hosts file first. Run `ssh-keyscan -H <hostname> >> ~/.ssh/known_hosts` or manually connect once to accept the host key.
```toml
[ssh]
enabled = true
host = "remote-server.com"
username = "loguser"
key_file = "~/.ssh/id_rsa"
```
```bash
# Analyze remote logs
logsentinelai-httpd-access \
  --ssh-host remote-server.com \
  --ssh-user loguser \
  --ssh-key ~/.ssh/id_rsa \
  /var/log/apache2/access.log
```
- Use SSH keys, not passwords
- Limit SSH user permissions
- Use dedicated log analysis user
- Consider SSH tunneling for security
```bash
# Monitor Apache logs in real-time
logsentinelai-httpd-access /var/log/apache2/access.log --monitor

# With sampling (analyze every 100th entry)
logsentinelai-httpd-access /var/log/apache2/access.log --monitor --sample-rate 100
```
- Live Analysis: Process logs as they're written
- Sampling: Reduce load on high-traffic systems
- Real-time Alerts: Immediate threat detection
- Continuous Indexing: Stream to Elasticsearch
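Conceptually, sampling takes every Nth entry from the incoming stream. A minimal sketch of that behaviour (illustrative only; the real monitoring logic lives in `core/monitoring.py`):

```python
from typing import Iterable, Iterator

def sample_stream(lines: Iterable[str], sample_rate: int = 1) -> Iterator[str]:
    """Yield every `sample_rate`-th line (1 = analyze everything)."""
    for i, line in enumerate(lines):
        if i % sample_rate == 0:
            yield line
```

With `--sample-rate 100`, only one entry in a hundred would reach the LLM, which keeps cost and latency manageable on high-traffic servers.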
Edit `src/logsentinelai/core/prompts.py`:
```python
HTTPD_ACCESS_PROMPT = """
Analyze this Apache/Nginx access log for security threats:

Focus on:
1. SQL injection patterns
2. XSS attempts
3. Your custom criteria here

Log entry: {log_entry}
"""
```
Change analysis language in config:
```toml
[analysis]
language = "korean"  # korean, japanese, spanish, etc.
```
```bash
# Process multiple files
logsentinelai-httpd-access /var/log/apache2/access.log.* --batch

# Parallel processing
logsentinelai-httpd-access /var/log/*.log --parallel 4
```
```toml
[analysis]
batch_size = 100   # Process 100 entries at once
max_tokens = 2000  # Reduce token limit
```
- Use smaller models for high-volume analysis
- Enable sampling for real-time monitoring
- Cache results for repeated patterns
```text
logsentinelai-httpd-access [OPTIONS] LOG_FILE

Options:
  --output [json|elasticsearch|stdout]  Output format
  --monitor                             Real-time monitoring
  --sample-rate INTEGER                 Sampling rate for monitoring
  --ssh-host TEXT                       SSH hostname
  --ssh-user TEXT                       SSH username
  --ssh-key TEXT                        SSH key file path
  --help                                Show help message
```
```bash
logsentinelai-httpd-apache [OPTIONS] LOG_FILE
# Similar options to httpd-access
```

```bash
logsentinelai-linux-system [OPTIONS] LOG_FILE
# Similar options to httpd-access
```

```bash
logsentinelai-tcpdump [OPTIONS] LOG_FILE
# Similar options to httpd-access
```
```text
logsentinelai-geoip-download [OPTIONS]

Options:
  --force  Force re-download even if database exists
  --help   Show help message
```
All commands support:
- `--config PATH`: Custom configuration file
- `--verbose`: Enable verbose logging
- `--quiet`: Suppress output except errors
```toml
[llm]
# LLM Provider Configuration
provider = "openai"    # openai, ollama, vllm
model = "gpt-4o-mini"  # Model name
api_key = ""           # API key (OpenAI only)
base_url = ""          # Base URL (Ollama/vLLM)
timeout = 30           # Request timeout (seconds)
max_retries = 3        # Maximum retry attempts

[elasticsearch]
# Elasticsearch Configuration
enabled = true                  # Enable Elasticsearch output
host = "localhost"              # Elasticsearch host
port = 9200                     # Elasticsearch port
index_prefix = "logsentinelai"  # Index prefix
use_ssl = false                 # Use SSL connection
verify_certs = true             # Verify SSL certificates
username = ""                   # Authentication username
password = ""                   # Authentication password

[geoip]
# GeoIP Configuration
enabled = true                                        # Enable GeoIP lookups
database_path = "~/.logsentinelai/GeoLite2-City.mmdb" # City database includes coordinates
fallback_country = "Unknown"                          # Fallback for unknown IPs
cache_size = 1000                                     # Cache size for performance
include_private_ips = false                           # Include private IPs in processing

[analysis]
# Analysis Configuration
language = "english"  # Output language
max_tokens = 4000     # Maximum tokens per request
temperature = 0.1     # LLM temperature (creativity)
batch_size = 50       # Batch processing size
enable_cache = true   # Enable result caching

[ssh]
# SSH Configuration
enabled = false                # Enable SSH functionality
default_host = ""              # Default SSH host
default_user = ""              # Default SSH user
default_key = "~/.ssh/id_rsa"  # Default SSH key

[logging]
# Logging Configuration
level = "INFO"  # Log level (DEBUG, INFO, WARNING, ERROR)
format = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
file = ""       # Log file path (empty = stdout)
```
```json
{
  "timestamp": "2024-01-15T10:30:45Z",
  "log_type": "httpd_access",
  "original_log": "192.168.1.100 - - [15/Jan/2024:10:30:45 +0000] \"GET /admin.php HTTP/1.1\" 200 1234",
  "analysis": {
    "threat_detected": true,
    "threat_type": "suspicious_access",
    "severity": "medium",
    "confidence": 0.85,
    "description": "Access to admin interface from unusual IP",
    "recommendations": [
      "Monitor this IP for further suspicious activity",
      "Consider implementing IP-based access controls"
    ]
  },
  "parsed_fields": {
    "ip_address": "192.168.1.100",
    "timestamp": "15/Jan/2024:10:30:45 +0000",
    "method": "GET",
    "path": "/admin.php",
    "status_code": 200,
    "response_size": 1234
  },
  "enrichment": {
    "geoip": {
      "ip": "192.168.1.100",
      "country_code": "US",
      "country_name": "United States",
      "city": "New York",
      "latitude": 40.7128,
      "longitude": -74.0060
    },
    "reputation": {
      "is_known_bad": false,
      "threat_score": 0.3
    }
  },
  "metadata": {
    "analyzer_version": "0.2.3",
    "model_used": "gpt-4o-mini",
    "processing_time": 1.2
  }
}
```
| Field | Type | Description |
|---|---|---|
| `threat_detected` | boolean | Whether a threat was detected |
| `threat_type` | string | Type of threat (sql_injection, xss, brute_force, etc.) |
| `severity` | string | Severity level (low, medium, high, critical) |
| `confidence` | float | Confidence score (0.0-1.0) |
| `description` | string | Human-readable description |
| `recommendations` | array | Recommended actions |
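As a rough illustration, the schema in this table maps onto a dataclass with basic validation (illustrative only; the project defines Pydantic models for its structured output):

```python
from dataclasses import dataclass, field

@dataclass
class AnalysisResult:
    threat_detected: bool
    threat_type: str
    severity: str       # one of: low, medium, high, critical
    confidence: float   # 0.0-1.0
    description: str
    recommendations: list[str] = field(default_factory=list)

    def __post_init__(self):
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be within 0.0-1.0")
        if self.severity not in {"low", "medium", "high", "critical"}:
            raise ValueError(f"unknown severity: {self.severity}")
```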
Problem: API calls to LLM provider failing
Solutions:
- Check API key validity
- Verify network connectivity
- Check provider status page
- Increase timeout in config
```bash
# Test connectivity
curl -H "Authorization: Bearer $OPENAI_API_KEY" \
  https://api.openai.com/v1/models
```
Problem: GeoIP lookups failing
Solutions:
```bash
# Re-download database (City database includes coordinates)
logsentinelai-geoip-download

# Check database location and verify it's the City database
ls -la ~/.logsentinelai/GeoLite2-City.mmdb

# Test GeoIP functionality
python -c "from logsentinelai.core.geoip import get_geoip_lookup; g=get_geoip_lookup(); print(g.lookup_geoip('8.8.8.8'))"
```
Problem: Cannot connect to Elasticsearch
Solutions:
- Check Elasticsearch status: `curl http://localhost:9200`
- Verify configuration in config file
- Check network connectivity
Problem: Cannot read log files
Solutions:
```bash
# Add user to log group
sudo usermod -a -G adm $USER

# Change log file permissions
sudo chmod 644 /var/log/apache2/access.log
```
Enable debug logging:
```toml
[logging]
level = "DEBUG"
```
Or use command line:
```bash
logsentinelai-httpd-access --verbose /var/log/apache2/access.log
```
- Reduce `batch_size` in config
- Use smaller LLM models
- Enable sampling for large files
- Use a local LLM (Ollama) instead of an API
- Reduce `max_tokens`
- Enable parallel processing
```bash
# Clone repository
git clone https://github.com/call518/LogSentinelAI.git
cd LogSentinelAI

# Install development dependencies
uv sync

# Setup pre-commit hooks
pre-commit install
```
```bash
# Format code
black src/
isort src/

# Type checking
mypy src/

# Linting
flake8 src/
```
1. Create analyzer file: `src/logsentinelai/analyzers/your_analyzer.py`
2. Define Pydantic models for structured output
3. Create LLM prompts in `src/logsentinelai/core/prompts.py`
4. Add CLI entry point in `pyproject.toml`
5. Add tests in `tests/`
- Fork the repository
- Create feature branch
- Make changes following style guide
- Add tests
- Submit pull request
```python
from logsentinelai.core.commons import LogAnalyzer

analyzer = LogAnalyzer(config_path="config")
results = analyzer.analyze_file("access.log", log_type="httpd_access")
```
```python
from logsentinelai.core.elasticsearch import ElasticsearchClient

es_client = ElasticsearchClient(config)
es_client.index_security_event(event_data)
```
```python
from logsentinelai.core.geoip import GeoIPLookup

geoip = GeoIPLookup()

# Get comprehensive location data including coordinates
location = geoip.lookup_geoip("8.8.8.8")
# Returns: {"ip": "8.8.8.8", "country_code": "US", "country_name": "United States",
#           "city": "Mountain View", "latitude": 37.406, "longitude": -122.078}

# Legacy method for backward compatibility (country only)
country = geoip.lookup_country("8.8.8.8")
```
```python
from logsentinelai.analyzers.httpd_access import analyze_httpd_access_logs

# Analyze logs programmatically
results = analyze_httpd_access_logs(
    log_file="access.log",
    output_format="json",
    config_path="config"
)

for result in results:
    if result.analysis.threat_detected:
        print(f"Threat detected: {result.analysis.description}")
```
```text
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Log Sources   │───▶│  LogSentinelAI  │───▶│  Elasticsearch  │
│                 │    │                 │    │                 │
│ • Local Files   │    │ ┌─────────────┐ │    │ • Security      │
│ • Remote SSH    │    │ │ Log Parser  │ │    │   Events        │
│ • Real-time     │    │ └─────────────┘ │    │ • Raw Logs      │
│                 │    │ ┌─────────────┐ │    │ • Metadata      │
│                 │    │ │     LLM     │ │    │                 │
│                 │    │ │  Analysis   │ │    │                 │
│                 │    │ └─────────────┘ │    │                 │
│                 │    │ ┌─────────────┐ │    │                 │
│                 │    │ │    GeoIP    │ │    │                 │
│                 │    │ │ Enrichment  │ │    │                 │
│                 │    │ └─────────────┘ │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │
                                ▼
                       ┌─────────────────┐
                       │     Kibana      │
                       │    Dashboard    │
                       │                 │
                       │ • Visualization │
                       │ • Alerts        │
                       │ • Analytics     │
                       └─────────────────┘
```
```text
src/logsentinelai/
├── analyzers/              # Log type-specific analyzers
│   ├── httpd_access.py     # Apache/Nginx access logs
│   ├── httpd_apache.py     # Apache error logs
│   ├── linux_system.py     # Linux system logs
│   └── tcpdump_packet.py   # Network packet analysis
├── core/                   # Core functionality
│   ├── commons.py          # Common analysis functions
│   ├── config.py           # Configuration management
│   ├── elasticsearch.py    # Elasticsearch integration
│   ├── geoip.py            # GeoIP functionality
│   ├── llm.py              # LLM provider interface
│   ├── monitoring.py       # Real-time monitoring
│   ├── prompts.py          # LLM prompt templates
│   ├── ssh.py              # SSH remote access
│   └── utils.py            # Utility functions
├── utils/                  # Additional utilities
│   └── geoip_downloader.py # GeoIP database management
└── cli.py                  # Command-line interface
```
- Input: Log files (local/remote)
- Parsing: Extract structured data
- Analysis: LLM-powered threat detection
- Enrichment: GeoIP, reputation data
- Output: JSON, Elasticsearch, stdout
- Visualization: Kibana dashboards
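The flow above can be sketched as a chain of small functions, one per stage, each stubbed out with trivial logic (names and rules here are purely illustrative, not the project's implementation):

```python
def parse(line: str) -> dict:
    # Stage 2 (Parsing): extract structured fields -- stubbed to just the IP.
    return {"ip": line.split(" ", 1)[0], "raw": line}

def analyze(record: dict) -> dict:
    # Stage 3 (Analysis): LLM-powered detection, stubbed with a trivial rule.
    record["threat_detected"] = "/admin" in record["raw"]
    return record

def enrich(record: dict) -> dict:
    # Stage 4 (Enrichment): GeoIP / reputation lookup, stubbed.
    record["country_code"] = "US" if record["ip"].startswith("8.") else "Unknown"
    return record

def process(lines):
    # Stages 1-5: input -> parse -> analyze -> enrich -> output records.
    return [enrich(analyze(parse(line))) for line in lines]
```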
This wiki provides comprehensive documentation for LogSentinelAI. For specific questions or issues, please:

- Create an Issue
- Join Discussions
- Email Support

Happy Log Analyzing!