Have any studies been done on the use of newer or less popular programming languages in the era of LLMs? I'd guess that the relatively low number of examples and the overall amount of code available publicly in a particular language means that LLM output is less likely to be good.
If the hypothesis is correct, it sets an incredibly high bar for starting a new programming language today. Not only does one need to develop compiler, runtime, libraries, and IDE support (which is a tall order by itself), but one must also provide enough data for LLMs to be trained on, or even provide a custom fine-tuned snapshot of one of the open models for the new language.
Research takes some time, both to do and to publish. In my area (programming languages), we have 4 major conferences a year, each with something like a 6-to-8-month lag between submission and publication, assuming the submission is accepted by a double-blind peer review process.
I don't work in this area (I have a very unfavorable view of LLMs broadly), but I have colleagues who are working on various aspects of what you ask about, e.g., developing testing frameworks to help ensure output is valid or having the LLMs generate easily-checkable tests for their own generated code, developing alternate means of constraining output (think of, like, a special kind of type system), using LLMs in a way similar to program synthesis, etc. If there is fruit to be borne from this, I would expect to start seeing more publications about it at high-profile venues in the next year or two (or next week, which is when ICFP and SPLASH and their colocated workshops will convene this year, but I haven't seen the publications list to know if there's anything LLM-related yet).
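To make one of those directions concrete, here's a minimal sketch (in Python; generate() is a hypothetical stand-in for whatever model client you'd actually use) of the "have the model produce easily-checkable tests for its own generated code" idea: the output is only accepted if the generated tests pass against the generated implementation.

    import subprocess
    import sys
    import tempfile
    from pathlib import Path

    def generate(prompt: str) -> str:
        """Hypothetical stand-in for an LLM call; replace with a real client."""
        raise NotImplementedError

    def accept_if_self_tests_pass(task: str):
        # Ask the model for an implementation, then for tests of that implementation.
        code = generate(f"Write a Python module named solution.py that solves: {task}")
        tests = generate(f"Write unittest tests (test_solution.py) for this module:\n{code}")

        with tempfile.TemporaryDirectory() as d:
            Path(d, "solution.py").write_text(code)
            Path(d, "test_solution.py").write_text(tests)
            # Run the generated tests against the generated code; reject on failure.
            result = subprocess.run(
                [sys.executable, "-m", "unittest", "discover", "-s", d],
                capture_output=True, text=True,
            )
        return code if result.returncode == 0 else None

That's obviously just the scaffolding; the research questions are about whether the generated checks are actually meaningful, and how to constrain the model so they are.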
ICFP and SPLASH are this week, actually! Here's the program website for anyone interested: https://conf.researchr.org/program/icfp-splash-2025/program-...
(I have a pretty unfavorable view of LLMs myself, but) a quick search for "LLM" does find four sessions of the colocated LMPL workshop that are explicitly about LLMs and AI agents, plus a spread of other work across the schedule. ("LMPL" stands for "Language Models and Programming Languages", so I guess that's no surprise.)
Do most people consider it important for LLMs to be able to generate code for the language they use? I think I'd consider it a positive if they can't.
Just anecdotally, I'm more productive in languages that I know _and_ that LLMs understand well than in languages that I'm merely experienced with.
As much as I dislike Go as a language, LLMs are very good at it. Java somewhat too, and Python a fair amount, though less (and LLMs write Python I don't like). Swift, however, I love programming in, but LLMs are pretty bad at it. We also have an internal config language that our LLMs are trained on, but it's complex and not very ergonomic, and LLMs aren't good at it.
It's not only the amount of code but also the quality of the available code. If a language has a low barrier to entry (e.g. Python, JavaScript), there will be a lot of beginner code. If a language has good static analysis and type checking (e.g. Rust, Scala, Haskell), the available code is free of certain classes of errors by construction.
I see that difference in LLM-generated code when switching languages: generated Rust code is of much higher quality than Python code, for example.
> Not only does one need to develop compiler, runtime, libraries, and IDE support (which is a tall order by itself)
CC can do that by itself in a loop, in ~3mo apparently. https://cursed-lang.org/
I know it's a meme project, but it's still impressive. And CC is at the point where you can take the repo of that language, ask it to "make it support emoji variables", and $5 later it works. So yeah, pretty impressive that we're already there.
On the other hand, it opens up the opportunity to build a language that is extremely easy to use with LLMs. I suspect a lot of the issues in LLM usage come from the fact that programming languages are built for humans.
More languages should treat code in docs as actual runnable code/tests. E.g. Elixir has doctests: https://hexdocs.pm/elixir/docs-tests-and-with.html
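Python has the same idea in the standard library: the doctest module runs the >>> examples embedded in docstrings as tests. A minimal sketch (file and function names are just illustrative):

    # doctest_demo.py
    def add(a, b):
        """Add two numbers.

        >>> add(2, 3)
        5
        >>> add(-1, 1)
        0
        """
        return a + b

    if __name__ == "__main__":
        import doctest
        doctest.testmod()  # runs the >>> examples above as tests

Running "python doctest_demo.py -v" executes the examples and fails loudly if the docs and the code drift apart.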
See also Opalang or Ur/Web for very similar ideas, both released ~15 years ago.