Skip to content

Zero Hassle String Extraction & Analysis - Extract, analyze, and manage strings from JSON, YAML, CSV, TOML, INI, and ENV files. Features pattern recognition, validation, and comprehensive reporting.

License

Notifications You must be signed in to change notification settings

nolindnaidoo/string-le

Repository files navigation

Strings-LE Logo

Strings-LE: Zero Hassle String Extraction

Instantly extract every user-visible string in precise order
JSON, YAML, CSV, TOML, INI, ENV

VSCode Marketplace Version Open VSX Version Build Status MIT License

Tested on Ubuntu, macOS, and Windows for maximum compatibility.


String-LE Demo

Command Palette

πŸ™ Thank You

If String-LE saves you time, a quick rating helps other developers discover it:
⭐ VS Code Marketplace β€’ Open VSX

βœ… Why String-LE?

Extract strings from any file format β€” JSON, CSV, ENV, YAML, TOML β€” in one click. No regex. No scripts. Just results.

String-LE intelligently identifies user-visible text while filtering out numbers, IDs, URLs, and technical noise. Get a clean, ordered set of strings ready for i18n, documentation, or validation.

  • i18n without the hassle

    Instantly extract and de-duplicate locale strings into en.json, fr.json, or any language pack. Keep translation files lean and in sync across your codebase.

  • Clarity across configs & APIs
    Surface every user-facing message. Validation errors, system prompts, and API responses are cataloged in one place.

  • Confident edits in complex specs
    Flatten nested strings into a simple list you can safely edit without breaking structure or formatting.

  • Stream massive CSVs
    Work with millions of rows without locking up VS Code. Select only the columns you need and stream results directly to the editor.

  • Automatic cleanup built-in

    • Sort for stable diffs and reviews
    • Dedupe to eliminate noise
  • Fast at any scale
    Benchmarked at millions of lines per second, String-LE keeps up with large datasets and enterprise monorepos without slowing you down.

πŸš€ More from the LE Family

πŸ’‘ Use Cases

  • i18n & Localization - Extract user-visible strings for translation files and language packs
  • Content Management - Pull titles, descriptions, and messages from CMS exports for auditing
  • API Validation - Extract user-facing messages and errors from API responses for documentation
  • Documentation Audits - Get all text content from docs for reviews and updates

πŸš€ Quick Start

  1. Install from VS Code Marketplace or Open VSX
  2. Open any supported file type (Cmd/Ctrl + P String-LE).
  3. Run Quick Extract (Cmd+Alt+E / Ctrl+Alt+E / Status Bar).

βš™οΈ Configuration

Basic Settings

  • string-le.openResultsSideBySide – Open to the side
  • string-le.csv.streamingEnabled – Toggle CSV streaming
  • string-le.dedupeEnabled – Auto-dedupe strings
  • string-le.sortEnabled – Auto-sort output

Advanced Options

  • Safety Guards – File size warnings & thresholds
  • Notification Levels – Control verbosity and alerts

Real-World Configuration Examples

i18n Workflow

{
  "string-le.openResultsSideBySide": true,
  "string-le.dedupeEnabled": true,
  "string-le.sortEnabled": true,
  "string-le.csv.streamingEnabled": false
}

Large Dataset Processing

{
  "string-le.csv.streamingEnabled": true,
  "string-le.dedupeEnabled": false,
  "string-le.sortEnabled": false,
  "string-le.openResultsSideBySide": false
}

Content Audit

{
  "string-le.dedupeEnabled": true,
  "string-le.sortEnabled": true,
  "string-le.openResultsSideBySide": true,
  "string-le.csv.streamingEnabled": false
}

⚠️ Behaviors & Limits

  • CSV support assumes standard delimiter/quoting; unusual dialects not supported
  • Large outputs can be slow to open β€” use Copy when prompted
  • Streaming applies only to CSV; other formats load in memory
  • Multi-line strings (e.g., YAML block scalars) are only partially supported
  • CSV multi-line and all-column extracts stream to the editor only (no auto-copy)
  • Sorting and deduplication apply to final strings, not their original positions
  • Fallback mode uses quoted-string heuristics and may include false positives

See CONFIGURATION.md.

⚑ Performance

String-LE is built for speed and handles files from 100KB to 30MB+. See detailed benchmarks.

Format File Size Throughput Duration Memory Tested On
JSON 0.13MB 4703509 ~1.14 < 1MB Apple Silicon
JSON 1.31MB 6337898 ~8.47 < 1MB Apple Silicon
JSON 6.55MB 6181529 ~43.42 < 1MB Apple Silicon
ENV 5K lines 1,144,828 ~0.29 < 1MB Apple Silicon

Note: CSV format tests encountered parsing errors with generated test data. Real-world CSV performance may vary.
Real-World Performance: Tested with actual data up to 20MB (practical limit: 1MB warning, 10MB error threshold)
Performance Monitoring: Built-in real-time tracking with configurable thresholds
Full Metrics: docs/PERFORMANCE.md β€’ Test Environment: macOS, Bun 1.2.22, Node 22.x

  • Sorting: Adds 10-20% processing time when enabled
  • Hardware Requirements: Minimum 4GB RAM, recommended 8GB+ for large datasets

See PERFORMANCE.md.

🌍 Language Support

13 languages: English, German, Spanish, French, Indonesian, Italian, Japanese, Korean, Portuguese (Brazil), Russian, Ukrainian, Vietnamese, Chinese (Simplified)

🧩 System Requirements

VS Code 1.70.0+ β€’ Platform Windows, macOS, Linux
Memory 500MB recommended for large files

πŸ”’ Privacy

100% local processing. No data leaves your machine. Optional logging: string-le.telemetryEnabled

πŸ”§ Troubleshooting

Not detecting strings?
Ensure file is saved with supported extension (.json, .yaml, .csv, .toml, .ini, .env)

Large files slow?
Enable CSV streaming: string-le.csv.streamingEnabled: true

Need help?
Check Issues or enable logging: string-le.telemetryEnabled: true

❓ FAQ

What strings are extracted?
User-visible text only (excludes IDs, numbers, technical values, variable names)

Can I get deduplication?
Yes, enable string-le.dedupeEnabled: true to remove duplicates automatically

Max file size?
Up to 500MB with CSV streaming. Practical limit: 10MB for other formats

Perfect for i18n?
Absolutely! Extract strings for translation files, locale management, and content audits

πŸ“Š Testing

92 unit tests β€’ 44.11% function coverage, 9.05% line coverage
Powered by Vitest β€’ Run with bun test --coverage

See TESTING.md.


Copyright Β© 2025
@nolindnaidoo. All rights reserved.