A Python version of the samesame repo that generates homograph strings.
Updated Aug 22, 2018 - HTML
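As a rough illustration of what generating homograph strings can look like, here is a minimal sketch. It is not the samesame implementation: the `CONFUSABLES` table and the `homograph_variants` function are hypothetical names, and the table covers only a handful of Latin-to-Cyrillic look-alikes.

```python
from itertools import product

# Hypothetical, deliberately tiny look-alike table (Latin -> Cyrillic).
# A real tool would use a much larger confusables list.
CONFUSABLES = {
    "a": ["\u0430"],  # CYRILLIC SMALL LETTER A
    "e": ["\u0435"],  # CYRILLIC SMALL LETTER IE
    "o": ["\u043e"],  # CYRILLIC SMALL LETTER O
    "p": ["\u0440"],  # CYRILLIC SMALL LETTER ER
    "c": ["\u0441"],  # CYRILLIC SMALL LETTER ES
}


def homograph_variants(text):
    """Yield every string that swaps at least one character for a look-alike."""
    # For each position, the choices are the original character plus its look-alikes.
    choices = [[ch] + CONFUSABLES.get(ch, []) for ch in text]
    for combo in product(*choices):
        candidate = "".join(combo)
        if candidate != text:  # skip the unmodified input
            yield candidate


variants = list(homograph_variants("ape"))
print(len(variants), "variants generated")
```

The number of variants grows exponentially with the number of substitutable characters, so a practical generator would likely cap or sample the output.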
Code and Resources for "LLM-Powered Grapheme-to-Phoneme Conversion: Benchmark and Case Study", introducing methods to leverage LLMs for G2P tasks without additional training, featuring Sentence-Bench and Kaamel-Dict.
Given a TLD zone file, PhishCanary extracts International Domain Names (IDNs) that are homoglyphs of specified target domain names.
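The matching idea described above can be sketched as follows. This is an assumption-laden illustration, not PhishCanary's actual code: `TO_LATIN`, `decode_label`, `skeleton`, and `find_homoglyph_idns` are hypothetical names, the look-alike table is tiny, and zone-file parsing is reduced to a plain list of domain names.

```python
# Hypothetical look-alike table: a few Cyrillic letters mapped to the
# Latin letters they resemble.
TO_LATIN = str.maketrans({
    "\u0430": "a", "\u0435": "e", "\u043e": "o", "\u0440": "p", "\u0441": "c",
})


def decode_label(label):
    """Decode one DNS label, turning an 'xn--' A-label into its Unicode form."""
    if label.startswith("xn--"):
        try:
            return label[4:].encode("ascii").decode("punycode")
        except UnicodeError:
            return label  # leave malformed labels untouched
    return label


def skeleton(domain):
    """Decode each label, then fold known look-alikes to Latin letters."""
    labels = domain.rstrip(".").lower().split(".")
    return ".".join(decode_label(label) for label in labels).translate(TO_LATIN)


def find_homoglyph_idns(zone_names, targets):
    """Return the IDNs whose folded form equals one of the target domains."""
    wanted = {target.lower() for target in targets}
    hits = []
    for name in zone_names:
        if "xn--" not in name.lower():
            continue  # only internationalized names are of interest
        if skeleton(name) in wanted:
            hits.append(name)
    return hits


# Build a sample IDN: "apple.com" with a Cyrillic "a" as the first letter.
sample = "xn--" + "\u0430pple".encode("punycode").decode("ascii") + ".com"
print(find_homoglyph_idns([sample, "example.com"], ["apple.com"]))
```

Folding both sides to a common skeleton keeps the comparison a simple set lookup, which matters when a zone file holds millions of names.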
A Persian grapheme-to-phoneme (G2P) model designed for homograph disambiguation, fine-tuned using the HomoRich dataset to improve pronunciation accuracy.
Benchmarking notebooks for various Persian G2P models, comparing their performance on the SentenceBench dataset, including Homo-GE2PE and Homo-T5.
HomoRich: The first large-scale Persian homograph dataset for G2P conversion, featuring 528K annotated sentences with balanced pronunciation variants and dual phoneme representations.
A simple JavaScript project that checks a given URL for possible homograph, homoglyph, and IDN tricks, as well as other suspicious formatting.
Detects hidden Unicode characters and homoglyphs, revealing non-printable, zero-width, and visually similar characters in text strings.
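A minimal sketch of the hidden-character part of this idea, using only Python's standard `unicodedata` module. The function name `find_hidden_characters` and the choice of which categories to flag are assumptions for illustration, not taken from the listed project; homoglyph detection is out of scope here.

```python
import unicodedata


def find_hidden_characters(text):
    """Return (index, code point, name) for characters that are easy to miss."""
    findings = []
    for index, ch in enumerate(text):
        if ch in "\t\n\r":
            continue  # ordinary whitespace controls are not reported
        category = unicodedata.category(ch)
        # Cf: format characters (zero-width space/joiner, bidi marks, BOM).
        # Cc: control characters.
        # Zs other than the plain space: no-break and other unusual spaces.
        if category in ("Cf", "Cc") or (category == "Zs" and ch != " "):
            name = unicodedata.name(ch, "<unnamed>")
            findings.append((index, f"U+{ord(ch):04X}", name))
    return findings


for finding in find_hidden_characters("pay\u200bpal"):
    print(finding)
```

Reporting the index and code point, rather than stripping the characters, lets a reviewer see exactly where the text was tampered with.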
Solving various image processing, machine learning, and deep learning problems. Assignments for Computer Vision Course in UGR.