Noroboto: Lying Fonts and Mitigation in Rust

83 points by piker 3 days ago

phuff 22 hours ago

I think this is _potentially_ an attack on the LLM's understanding, but it doesn't seem likely to stand up to legal scrutiny?

Seems like this is pretty clearly a case of fraudulent misrepresentation (https://www.law.cornell.edu/wex/fraudulent_misrepresentation) which kinda nullifies the contract, if I understand correctly:

  Fraudulent misrepresentation is a tort claim, typically arising in the field of contract law, that occurs when a defendant makes an intentional or reckless misrepresentation of fact or opinion with the intention to coerce a party into action or inaction on the basis of that misrepresentation.
  To determine whether fraudulent misrepresentation occurred, the court will look for six factors:
    A representation was made
    The representation was false 
    That when made, the defendant knew that the representation was false or that the defendant made the statement recklessly without knowledge of its truth
    That the fraudulent misrepresentation was made with the intention that the plaintiff rely on it
    That the plaintiff did rely on the fraudulent misrepresentation
    That the plaintiff suffered harm as a result of the fraudulent misrepresentation
  Like most claims under contract law, the standard remedy for fraudulent misrepresentation is damages.

piker 22 hours ago

That would be an open question in every jurisdiction. There wasn't really a representation here, but it might be something more like the doctrine of "mistake". It's also not clear "your honor I never read the contract but my LLM told me it was okay to sign" is a great argument either. Doubly true for your $1,500/hour law firm duped by something like this.
[Edit: by "nullify" you probably mean "void" or "voidable" which are remedies in equity, and the "never read it" argument carries even more burden there. As the citation notes the traditional remedy for contract issues is damages (i.e., cash payment).]

Aurornis 22 hours ago

The LLM part is confusing people.
You can remove the LLM from the story and see how the trick would be a legal problem even with only humans involved: If you put an extra clause in a contract in white font that says “Oh and also if you agree to this you owe me $1,000” because you want to selectively hide it from reviewers but benefit from the text, no court is going to look kindly on you.
- piker 22 hours ago
  
  That’s not really a good analogy. (For blind people maybe. That is addressed in the accompanying legal post.) Here, only automation systems are actually vulnerable. The text on the screen is the same as the print, which is what the party signs.
- josephg 18 hours ago
  
  The trick is this:
  The white text is not visible to humans, and therefore not binding as part of the contract. But if lawyers use LLMs to assess the contract in part of the negotiation process, the LLM will be confused by the contract's contents.
  You could - for example - say the contract is for $10000. Then use unicode tricks to make any LLM reading it think the contract is only for $1000. The LLM will say this is good value, and not worth negotiating hard over. The human signs.
  Would anyone notice? Would a judge care? A human signed the contract. If they didn't do proper due diligence, it's their own fault.
  
  rcxdude 17 hours ago
  
  I would be surprised if a judge looks favorably on such shenanigans.
  
  shakna 17 hours ago
  
  It would surprise me if the judge of such a case did not tell both sides off. Both fraud and negligence are problems.
  
  Aurornis 4 hours ago
  
  You would be surprised, then.
  If one party is intentionally misleading the other and employing technology to do it, they are the villain.
  The law doesn’t “both sides” these issues and cancel bad behavior out because the other side didn’t notice something.
  
  SolarNet 14 hours ago
  
  If they notice. Again, a printed version of the contract that is signed has no evidence of the attack. The attack is on getting your legal LLM to hallucinate specific things about what you are signing.
  I doubt a judge will look favorably on people saying "but my LLM said it was 1k"... 'cause they are known to hallucinate.
  
  like_any_other 7 hours ago
  
  Sabotaging due diligence, even if that diligence is performed with unreliable tools, is probably not legally great. What if the attack was against plain text search, so that a computer search for a phrase turns up zero results, but the phrase is still there, legible to a human? (E.g. as an embedded picture, or some font hackery)
  
  Aurornis 4 hours ago
  
  > The white text is not visible to humans, and therefore not binding as part of the contract.
  Using font tricks doesn’t make part of a contract not legally binding.
  Intentionally tricking an LLM doesn’t make the other party immune to the consequences of intentionally misleading the other party.
- makeitdouble 17 hours ago
  
  Your point on the LLM not being needed is right. Putting it in another context: what about writing a full contract on a sheet in pencil, then erasing everything and printing the final revised version on the same sheet with a printer?
  If the other party somehow relies on scanning the physically etched version of the contract, and not the printer ink laid on top, to digitize the contract, would you be legally responsible for their automated process misreading the document?

PufPufPuf 23 hours ago

Wouldn't ligatures be a more effective attack vector for the "Maryland -> Delaware" case? That's all that ligatures do -- render a specific sequence of characters as something else.
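
To make the ligature idea concrete, here is a minimal sketch in Rust. It is a toy model, not a real font shaper: `render_with_ligatures` and its table of `(stored sequence, drawn text)` pairs are hypothetical names used only for illustration. It shows why a check that verifies each letter on its own would pass while a whole word is drawn as something else.

```rust
/// Toy model of ligature substitution (hypothetical, not a real shaping engine).
/// Each entry is (sequence of stored characters, what the single ligature glyph draws).
fn render_with_ligatures(stored: &str, ligatures: &[(&str, &str)]) -> String {
    let mut out = String::new();
    let mut rest = stored;
    'outer: while !rest.is_empty() {
        // Try every ligature at the current position.
        for &(seq, drawn) in ligatures {
            if !seq.is_empty() && rest.starts_with(seq) {
                out.push_str(drawn);
                rest = &rest[seq.len()..];
                continue 'outer;
            }
        }
        // No ligature applies: the character is drawn as itself.
        let c = rest.chars().next().unwrap();
        out.push(c);
        rest = &rest[c.len_utf8()..];
    }
    out
}

fn main() {
    let ligatures = [("Delaware", "Maryland")];
    let stored = "governed by the laws of Delaware";
    // What a text extractor or LLM reads:
    println!("stored:   {stored}");
    // What a human sees on screen or in print:
    println!("rendered: {}", render_with_ligatures(stored, &ligatures));
}
```

In this model every individual letter still renders as itself, so only a check that looks at sequences (or at the rendered page) would notice the swap.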

stavros 23 hours ago

Came here to say this, I saw the initial video and thought they used ligatures, and then I was surprised the actual post was much more complicated.

piker 22 hours ago

We're definitely not TrueType experts and took the relatively "straightforward" approach of generating a small custom font for each mapping. If it's possible to render "Maryland" with ligatures while mapping the same string to "Delaware" in Unicode, then that's just another example of the vector. Really interesting stuff, and we'll be checking it out!
- Conscat 22 hours ago
  
  These are some very extreme examples of this that push the feature's limits:
  https://news.ycombinator.com/item?id=47256810
  https://news.ycombinator.com/item?id=26495059
- trebligdivad 16 hours ago
  
  Yeh there's lots of fun things like this; PDF is a full programming language so I think in principle you can generate PDFs that display different things to different people depending on the tools used etc. I've heard it said some of the incorrect text mapping stuff has been used in the past as a copy-protection silly to stop people copy/pasting content. (It's also a pain for those using screen readers).

projektfu 6 hours ago

One of the options in Paperless-NG is to always rasterize and OCR the PDFs because font obfuscation is already a thing in PDF land. I'm not sure why my gas bill needs to be obfuscated but here we are.

You could argue that it's legal malpractice to not do this for contracts 100% of the time.
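
A rough sketch of what that check could look like once both strings are in hand, assuming the embedded text layer and the OCR output have already been extracted by other tools. The function names `normalize` and `mismatches` are hypothetical, and the word-by-word positional comparison is deliberately naive; a real pipeline would need alignment that tolerates OCR noise.

```rust
/// Lowercase each whitespace-separated word and keep only alphanumeric characters.
fn normalize(text: &str) -> Vec<String> {
    text.split_whitespace()
        .map(|w| {
            w.chars()
                .filter(|c| c.is_alphanumeric())
                .flat_map(|c| c.to_lowercase())
                .collect::<String>()
        })
        .filter(|w| !w.is_empty())
        .collect()
}

/// Return (word index, text-layer word, OCR word) wherever the two disagree.
/// Assumption: both inputs describe the same page in the same reading order.
fn mismatches(text_layer: &str, ocr: &str) -> Vec<(usize, String, String)> {
    let a = normalize(text_layer);
    let b = normalize(ocr);
    let mut out = Vec::new();
    for i in 0..a.len().max(b.len()) {
        let x = a.get(i).cloned().unwrap_or_default();
        let y = b.get(i).cloned().unwrap_or_default();
        if x != y {
            out.push((i, x, y));
        }
    }
    out
}

fn main() {
    let text_layer = "governed by the laws of Delaware."; // what copy/paste yields
    let ocr = "Governed by the laws of Maryland."; // what the rendered page shows
    for (i, embedded, seen) in mismatches(text_layer, ocr) {
        println!("word {i}: text layer says {embedded:?}, page shows {seen:?}");
    }
}
```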

echoangle 1 day ago

At that point you can just paste a screenshot of your doc into word and celebrate.

Also, the mitigation can probably be fooled with ligatures since they are only verifying the letters alone as far as I skimmed.

I don’t even understand the threat model. Is my opponent in a court case going to use this on the PDF they give the court? Surely the judge will be pretty annoyed since you can’t even ctrl+f in the files then.

piker 1 day ago

That's true for the full obfuscation, but not for the replacement. For replacement there's really nothing like it. We shared the full obfuscation as just a PoC.
[Edit: The point here is not to prove some massive "gotcha", but rather demonstrate that there is a whole class of vulnerabilities that these pipelines are subject to. There will be follow-up posts that pack much more punch.]
- echoangle 1 day ago
  
  Assuming you’re the author since you also posted it: I just stealth-edited my comment, could you maybe talk about the threat model a bit more? I am not a lawyer so I don’t really see when I would want to do this.
  Also, I hope the "lame exploit" I just edited out was not too offensive, it’s always great when people try to find attacks to make systems more safe.
  
  piker 23 hours ago
  
  Absolutely, and we definitely agree this particular attack is "lame" in the sense of not allowing CVE, etc.
  But, we're working on a lot of these (as we encounter them in developing Tritium), and the point really is just to demonstrate that LLMs can be blind to ineffective implementations of the specs and other tricks.
  As mentioned in the accompanying LegalQuants post, we see a lot of these available in the pipelines of applications like Claude for Legal, Harvey, Legora and others.
  The most nefarious case here requires crafting a number of custom fonts to do character-swapping. It's less discoverable but may be sanctionable to your point.
  But bear in mind this particular "attack" was vibe coded in a day or two and most of the frontier models fail to pick up on it. As "AI native" firms come on line, and aim to be increasingly end-to-end automated, these will become real legal issues.
  And there will be a lot of them available.
  
  minimaltom 23 hours ago
  
  It seems like the main attack scenario for this + legal AI would be during discovery: if opposing counsel gave you a poisoned PDF, and you threw it into one of these products to help you sift through it and got bad answers.
  However, wouldn't this be a rather risky move? Courts authorized the discovery, so I imagine the judge might lose their marbles and throw the hammer at them if this came to light.
  
  piker 22 hours ago
  
  Yes, this particular vector is probably better in contracting than discovery. There is a duty of candor to the court and court rules that might come into play. In the case of contracting the attacker would be exposed to the jurisdiction's law of contracts. That might call it a "misrepresentation" or fraudulent thus making the contract void or voidable, but it's not clear "your honor I never read the contract but my LLM told me it was okay to sign" is a great argument either.
  
  minimaltom 21 hours ago
  
  In this case you can say "the contract we reviewed was poisoned via technical means to show different words, depending on how the file was read". Perhaps if pushed you can say you loaded it into GCAI/Harvey/Legora and read/reviewed it there.
  There is no parallel construction where this wasn't deliberate & malicious, so it seems really high risk given the judge would rip you a new one if discovered.
  You can't rely on the defense that the other party didn't read it, if you made it show different words depending on how it was loaded.
  
  dwallin 13 hours ago
  
  You don’t need to make that argument. “I changed the font when reviewing to one that is easier for me to read.”

sheept 22 hours ago

Wouldn't it also work just to render the visible text as an image/path, then put invisible text objects over it?

I've heard suggestions like having white/invisible text in resumes for tricking applicant tracking systems,[0] but it's apparently mitigated by showing recruiters the plain text version of the resume.

[0] example: https://news.ycombinator.com/item?id=36857909

fourgreen 22 hours ago

I also had this idea a few months ago: https://czterycztery.pl/blog/show.php?f=1763873614-EN https://czterycztery.pl/inne/zmylkowy_pdf/#english

chaidhat 19 hours ago

That's very creative. I don't understand why the other comments are so critical. I think it is a good idea to always keep in mind new vulnerabilities and this certainly is one that I never thought about.

mproud 1 day ago

Someone could also just make a font file that swaps all of the characters around. So like an A looks like a Z, and a Z looks like an A.
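
As a toy illustration of that swap, the sketch below models a font as a plain map from the code point stored in the file to the letter shape that gets drawn. `swapped_font` and `render` are hypothetical helpers, and only the uppercase A/Z pair is swapped here to keep the example readable.

```rust
use std::collections::HashMap;

/// Hypothetical model of a font: stored code point -> letter shape drawn.
/// Only 'A' and 'Z' are swapped; everything else draws as itself.
fn swapped_font() -> HashMap<char, char> {
    HashMap::from([('A', 'Z'), ('Z', 'A')])
}

/// What a human would see when `stored` is drawn with `font`.
fn render(stored: &str, font: &HashMap<char, char>) -> String {
    stored
        .chars()
        .map(|c| *font.get(&c).unwrap_or(&c))
        .collect()
}

fn main() {
    let font = swapped_font();
    let stored = "Exhibit Z"; // what copy/paste, search, or an LLM reads
    println!("stored:   {stored}");
    println!("rendered: {}", render(stored, &font)); // prints "rendered: Exhibit A"
}
```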

piker 1 day ago

Covered in the post! It's the more aggressive approach for sure.

xiaod 22 hours ago

The compile-time vs runtime safety tradeoff is worth calling out. For infrastructure tooling where correctness matters more than iteration speed, the upfront cost pays dividends in reduced production incidents.

laserbeam 15 hours ago

I’m fairly certain if you give any substitution cypher to an LLM it will decipher the message. And that’s all I see here, a substitution cypher in a private area of unicode.

At best this is an adversarial attack to poison LLM training data… at worst this screws up accessibility tools (like screen readers) and copy paste.

archargelod 12 hours ago

> I’m fairly certain if you give any substitution cypher to an LLM it will decipher the message.
*with sufficiently long cyphertext
You can construct the encoding so that every 2-5 words use a brand-new key. Remember, Unicode is big enough to fit over 10000 English alphabets.
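
A minimal sketch of that rotating-key idea, under stated assumptions: it handles only lowercase ASCII letters, uses the Basic Multilingual Plane Private Use Area (U+E000 to U+F8FF), switches to a fresh 26-code-point block every `WORDS_PER_KEY` words, and stands in for the custom font with a hypothetical `glyph_for` function. None of these names come from the post.

```rust
const PUA_START: u32 = 0xE000; // first code point of the BMP Private Use Area
const PUA_END: u32 = 0xF8FF; // last code point of the BMP Private Use Area
const BLOCKS: u32 = (PUA_END - PUA_START + 1) / 26; // whole 26-letter blocks that fit
const WORDS_PER_KEY: usize = 3; // assumption: rotate the key every 3 words

/// Replace lowercase ASCII letters with Private Use Area code points,
/// moving to a new block of 26 code points every WORDS_PER_KEY words.
fn encode(text: &str) -> String {
    let mut out = String::new();
    for (i, word) in text.split_whitespace().enumerate() {
        if i > 0 {
            out.push(' ');
        }
        let block = (i / WORDS_PER_KEY) as u32 % BLOCKS;
        for c in word.chars() {
            if c.is_ascii_lowercase() {
                let cp = PUA_START + block * 26 + (c as u32 - 'a' as u32);
                out.push(char::from_u32(cp).unwrap_or(c));
            } else {
                out.push(c); // anything else is left untouched
            }
        }
    }
    out
}

/// Stand-in for the custom font: which letter shape a code point would draw.
fn glyph_for(c: char) -> char {
    let cp = c as u32;
    if (PUA_START..=PUA_END).contains(&cp) {
        char::from_u32('a' as u32 + (cp - PUA_START) % 26).unwrap_or(c)
    } else {
        c
    }
}

fn main() {
    let stored = encode("the cat sat on the mat");
    let rendered: String = stored.chars().map(glyph_for).collect();
    println!("rendered: {rendered}");
    // The two occurrences of "the" are stored as different code points.
    let words: Vec<&str> = stored.split(' ').collect();
    println!("same stored form for both 'the': {}", words[0] == words[4]);
}
```

Because repeated words no longer share a stored form, frequency analysis over the extracted text has much less to work with than in a single 1-to-1 mapping.
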
- piker 10 hours ago
  
  This is addressed in the post! ChatGPT 5.5 out of the box deciphered the first 1-to-1 mapping. We then scrambled it as you suggest and thwarted that.