
points by westurner 2 years ago

"Automated Unit Test Improvement using Large Language Models at Meta" (2024) https://arxiv.org/abs/2402.09171 :

> This paper describes Meta's TestGen-LLM tool, which uses LLMs to automatically improve existing human-written tests. TestGen-LLM verifies that its generated test classes successfully clear a set of filters that assure measurable improvement over the original test suite, thereby eliminating problems due to LLM hallucination. [...] We believe this is the first report on industrial scale deployment of LLM-generated code backed by such assurances of code improvement.

Coverage-guided unit test improvement with LLMs might be efficient too.

https://github.com/topics/coverage-guided-fuzzing :

- e.g. Google/syzkaller is a coverage-guided syscall fuzzer: https://github.com/google/syzkaller

- GitLab CI supports coverage-guided fuzzing: https://docs.gitlab.com/ee/user/application_security/coverag...

- oss-fuzz, osv
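
A minimal sketch of the coverage-guided idea, assuming candidate test inputs arrive from some generator (an LLM, a fuzzer, or anything else): keep a candidate only if it executes lines that the existing tests do not. This is not TestGen-LLM's actual filter pipeline; `classify`, `lines_covered`, and `keep_improving` are hypothetical names for illustration, and real tools would use coverage.py or compiler instrumentation instead of `sys.settrace`.

```python
import sys


def classify(n: int) -> str:
    """Hypothetical function under test."""
    if n < 0:
        return "negative"
    if n == 0:
        return "zero"
    if n > 100:
        return "large"
    return "small"


def lines_covered(func, arg):
    """Return the set of line numbers inside `func` executed by func(arg)."""
    code = func.__code__
    hit = set()

    def tracer(frame, event, _arg):
        # Only trace frames that belong to the function under test.
        if frame.f_code is code:
            if event == "line":
                hit.add(frame.f_lineno)
            return tracer
        return None

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(arg)
    finally:
        sys.settrace(previous)
    return hit


def keep_improving(func, existing_inputs, candidate_inputs):
    """Keep only the candidates that add line coverage over what is already covered."""
    covered = set()
    for x in existing_inputs:
        covered |= lines_covered(func, x)

    kept = []
    for x in candidate_inputs:
        new_lines = lines_covered(func, x) - covered
        if new_lines:
            kept.append(x)
            covered |= new_lines
    return kept


# The existing suite only exercises the "small" branch.
kept = keep_improving(classify, existing_inputs=[5], candidate_inputs=[7, -3, 0, 500, -9])
print(kept)  # [-3, 0, 500]
```

Candidates 7 and -9 are dropped because they only re-run lines that are already covered, which is the same "measurable improvement" requirement the abstract describes, reduced to line coverage on one function.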

Additional ways to improve tests:

Hypothesis and Pynguin can generate tests from type annotations.
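
A rough stdlib-only sketch of what "tests from type annotations" means: read a function's annotations, draw random arguments of those types, and check a property of the result. This is not the Hypothesis or Pynguin API (Hypothesis exposes this idea through `hypothesis.strategies.from_type`, with shrinking and far better generators); `GENERATORS`, `random_args`, `check_property`, `clamp`, and `buggy_abs` are hypothetical names for illustration.

```python
import random
import typing

# Hypothetical per-type generators; real tools cover far more types.
GENERATORS = {
    int: lambda rng: rng.randint(-1000, 1000),
    float: lambda rng: rng.uniform(-1e3, 1e3),
    bool: lambda rng: rng.random() < 0.5,
    str: lambda rng: "".join(rng.choice("abc xyz") for _ in range(rng.randint(0, 8))),
}


def random_args(func, rng):
    """Build keyword arguments for `func` from its parameter annotations."""
    hints = typing.get_type_hints(func)
    hints.pop("return", None)
    return {name: GENERATORS[tp](rng) for name, tp in hints.items()}


def check_property(func, prop, trials=200, seed=0):
    """Return the first failing kwargs (a counterexample), or None if all trials pass."""
    rng = random.Random(seed)
    for _ in range(trials):
        kwargs = random_args(func, rng)
        result = func(**kwargs)
        if not prop(kwargs, result):
            return kwargs
    return None


def clamp(value: int, low: int, high: int) -> int:
    return max(low, min(value, high))


def in_range(kwargs, result):
    # Only meaningful when the bounds are ordered.
    return kwargs["low"] > kwargs["high"] or kwargs["low"] <= result <= kwargs["high"]


def buggy_abs(x: int) -> int:
    return x  # deliberately wrong: negative inputs stay negative


print(check_property(clamp, in_range))  # None
print(check_property(buggy_abs, lambda kwargs, result: result >= 0))
```

The second call prints a counterexample dict with a negative `x`. Unannotated code gives a generator like this nothing to work from, which is where annotation-inference tools become useful.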

There are various tools to generate type annotations for Python code:

> pytype (Google) [1], PyAnnotate (Dropbox) [2], and MonkeyType (Instagram) [3] all do dynamic / runtime PEP-484 type annotation type inference [4] to generate type annotations. https://news.ycombinator.com/item?id=39139198

icontract-hypothesis generates tests from icontract Design by Contract (DbC) constraints on types, values, and invariants, specified as precondition and postcondition @decorators: https://github.com/mristin/icontract-hypothesis

Nagini and deal-solver attempt to formally verify Python code, with or without unit tests: https://news.ycombinator.com/item?id=39139198

Additional research:

"Fuzz target generation using LLMs" (2023) https://google.github.io/oss-fuzz/research/llms/target_gener... https://security.googleblog.com/2023/08/ai-powered-fuzzing-b... https://hn.algolia.com/?q=AI-Powered+Fuzzing%3A+Breaking+the...

OSSF//fuzz-introspector//doc/Features.md: https://github.com/ossf/fuzz-introspector/blob/main/doc/Feat...

https://scholar.google.com/scholar?hl=en&as_sdt=0%2C43&q=Fuz... :

- "Large Language Models Based Fuzzing Techniques: A Survey" (2024) https://arxiv.org/abs/2402.00350 : > This survey provides a systematic overview of the approaches that fuse LLMs and fuzzing tests for software testing. In this paper, a statistical analysis and discussion of the literature in three areas, namely LLMs, fuzzing test, and fuzzing test generated based on LLMs, are conducted by summarising the state-of-the-art methods up until 2024

DeanMey 2 years ago

Thanks for sharing this. By far the best tool I've seen on the market centered around code integrity is CodiumAI (https://www.codium.ai/). It generates unit tests based on entire code repos. It also integrates into the SDLC through a PR Agent on GitHub or GitLab. My whole team uses it.

westurner 2 years ago

Any take on whether an LLM trained solely on formally verified code will generate unverifiable code?