Show HN: Speeding up LLM inference 2x (possibly)

(asciinema.org)

419 points | by kolinko 14 days ago

114 comments