Source: Schneier on Security
Article note: Complexity always has costs.
Unicode is absurdly complex because it tries not just to represent goddamn everything but to let all of it be mixed together, and we pay for that shit everywhere it appears.
My usual preference is "make parsers vigorously flag and/or refuse to allow mixed pages."
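As a rough illustration of that preference, here is a minimal sketch of a check that flags identifiers mixing more than one script. It reads "mixed" as "mixed scripts", which is an assumption about the note's intent. The function names `scripts_in` and `is_mixed_script` are hypothetical, and the script guess is a crude heuristic: it takes the first word of each character's Unicode name, because Python's standard library does not expose the Unicode Script property. A real tool would more likely follow the Unicode security recommendations.

```python
import unicodedata


def scripts_in(identifier: str) -> set:
    """Guess the scripts used by the letters of an identifier.

    Heuristic (an assumption, not the Unicode Script property): the first
    word of a character's Unicode name, e.g. "LATIN", "CYRILLIC", "GREEK".
    """
    scripts = set()
    for ch in identifier:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def is_mixed_script(identifier: str) -> bool:
    """True when the identifier's letters appear to come from several scripts."""
    return len(scripts_in(identifier)) > 1


# "p\u0430ypal" contains CYRILLIC SMALL LETTER A in place of the Latin "a".
print(is_mixed_script("p\u0430ypal"))  # True
print(is_mixed_script("paypal"))       # False
```

The heuristic misjudges some cases (fullwidth Latin letters, for instance, have names that start with "FULLWIDTH"), so it is only a starting point for the "flag or refuse" policy.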
Really interesting research demonstrating how to hide vulnerabilities in source code by manipulating how Unicode text is displayed. It’s really clever, and not the sort of attack one would normally think about.
From Ross Anderson’s blog:
This potentially devastating attack is tracked as CVE-2021-42574, while a related attack that uses homoglyphs (visually similar characters) is tracked as CVE-2021-42694. This work has been under embargo for a 99-day period, giving time for a major coordinated disclosure effort in which many compilers, interpreters, code editors, and repositories have implemented defenses.
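To make the first attack concrete: it depends on Unicode's explicit bidirectional formatting characters, which can make source code display in a different order than the one the compiler reads. A defense along the lines described above might simply report any such character in a source file. The sketch below is one possible scanner, not the check any particular compiler or editor ships. The function name `find_bidi_controls` is hypothetical, and the set of characters covers only the embedding, override, and isolate controls.

```python
import unicodedata

# Explicit bidirectional formatting characters (embeddings, overrides,
# isolates). Written as escapes so this file contains none of them itself.
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",
    "\u2066", "\u2067", "\u2068", "\u2069",
}


def find_bidi_controls(source: str) -> list:
    """Return (line, column, character name) for each bidi control found."""
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                hits.append((line_no, col, unicodedata.name(ch)))
    return hits


# A string literal containing an override and isolates, in the style of
# the attack; the sample text itself is made up for illustration.
sample = 'if access != "user\u202e \u2066// check\u2069 \u2066":\n    pass\n'
for line_no, col, name in find_bidi_controls(sample):
    print(f"line {line_no}, column {col}: {name}")
```

Whether a hit should be a warning or a hard error is a policy choice. Legitimate right-to-left text in comments and strings can contain these characters, so a stricter tool might only reject controls left unterminated at the end of a comment or string literal.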
Website for the attack. Rust security advisory.
Brian Krebs has a blog post.
EDITED TO ADD (11/12): An older paper on similar issues.