All Python packaging challenges are solved, apparently. The lesson learned is that there is no single solution for all problems. Getting more strings attached to VC-funded companies and leaning on their infrastructure is a high risk for any FOSS community.
Well I started with pip because it's what I was told to use. But it was slow and had footguns. And then I started using virtualenv, but that only solved part of the problem. So I switched to conda, which sometimes worked but wrecked my shell profile and often led to things mysteriously using the wrong version of a package. So someone told me to use pipenv, which was great until it was abandoned and picked up by someone who routinely broke the latest published version. So someone told me to use poetry, but it became unusably slow. So I switched back to pip with the built-in venv, but now I have the same problems I had before, with fewer features. So I switched to uv, because it actually worked. But the dependency I need is built and packaged differently for different operating systems and flavors of GPU, and now my coworkers can't get the project to install on their laptops.
I'm so glad all the Python packaging challenges are "solved"
I started with "sudo apt install python" a long time ago and this installed python2. This was during the decades-long transition from python2 to python3, so half the programs didn't work so I installed python3 via "sudo apt install python3". Of course now I had to switch between python2 and python3 depending on the program I wanted to run, that's why Debian/Ubuntu had "sudo update-alternatives --config python" for managing the symlink for "python" to either python2 or python3. But shortly after that, python3-based applications also didn't want to start with python3, because apt installed python3.4, but Python developers want to use the latest new features offered by python3.5 . Luckily, Debian/Ubuntu provided python3.5 in their backports/updates repositories. So for a couple of weeks things sort of worked, but then python3.7 was released, which definitely was too fresh for being offered in the OS distribution repositories, but thanks to the deadsnakes PPA, I could obtain a fourth-party build by fiddling with some PPA commands or adding some entries of debatable provenance to /etc/apt/lists.conf. So now I could get python3.7 via "sudo apt install python3.7". All went well again. Until some time later when I updated Home Assistant to its latest monthly release, which broke my installation, because the Home Assistant devs love the latest python3.8 features. And because python3.8 wasn't provided anymore in the deadsnakes PPA for my Ubuntu version, I had to look for a new alternative. Building python from source never worked, but thank heavens there is this new thing called pyenv (cf. pyenv), and with some luck as well as spending a weekend for understanding the differences between pyenv, pyvenv, venv, virtualenv (a.k.a. python-virtualenv), and pyenv-virtualenv, Home Assistant started up again.
This wall of text is an abridged excursion of my installing-python-on-Linux experience.
There is also my installing-python-on-Windows experience, which includes: official installer (exe or msi?) from python.org; some Windows-provided system application python, installable by setting a checkbox in Windows's system properties; NuGet, winget, Microsoft Store Python; WSL, WSL2; anaconda, conda, miniconda; WinPython...
I understand this is meant as caricature, but for local development, tools like mise or asdf are really something I've never looked back from. For containers it's either a versioned Docker image or compiling it yourself.
I started at about the same time you did, and I've never seen an instance of software expecting a Python version newer than what is in Debian stable. It happens all the time for Nodejs, Go, or Rust though.
Your comment shows the sad state of software quality these days. Rust is the same, move fast and break things. And lately Mesa has started to suffer from the same disease. These days you basically need the same build env as the one on the developer's machine or the build will fail.
What's wrong with just using virtualenv? I never used anything else, and I never felt the need to. Maybe it's not as shiny as the other tools, but it just works.
Nothing is inherently wrong with virtualenv. All these tools make virtual environments and offer some way to manage them. But virtualenv doesn't solve the problem of dependency management.
I've walked the same rocky path and have the bleeding feet to show for it! My problem is that now my packaging/environment mental model is so muddled I frequently mix up the commands...
I felt like Python packaging was more or less fine, right up until pip started to warn me that I couldn't globally install packages anymore. So now I need to make a billion venvs to install the same ML and plotting libraries and dependencies that I don't want in a requirements.txt for the project.
I just want packaging to fuck off and leave me alone. Changes here are always bad, because they're changes.
I'd otherwise agree but this problem seems unique to Python. I don't have problems like this with npm or composer or rubygems. Or at least very infrequently. It's almost every time I need to update dependencies or install on a new machine that the Python ecosystem decides I'm not worthy.
I think pip made some poor design choices very early, but pip stuck around for a long time and people kept using it. Of course things got out of control, and then people kept inventing new package managers until uv came along. I don't know enough about Python to understand how people could live with that for so long.
This comes across as uninformed at best and ignorant at worst. Python still doesn't have a reliable way to handle native dependencies across different platforms. pip and setuptools cannot be the end all be all of this packaging ecosystem nor should they be.
I agree, now I just use uv and forget about it. It does use up a fair bit of disk, but disk is cheap and the bootstrapping time reduction makes working with python a pleasure again
I recently did the same at work, just converted all our pip stuff to use uv pip but otherwise no changes to the venv/requirements.txt workflow and everything just got much faster - it's a no-brainer.
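For anyone curious, the swap really is mechanical (a sketch of our setup; assumes a plain requirements.txt workflow, and how you install uv itself varies):

    $ pip install uv                      # or use Astral's standalone installer
    $ uv venv .venv                       # stands in for: python -m venv .venv
    $ uv pip install -r requirements.txt  # stands in for: pip install -r requirements.txt

Same venv, same requirements.txt, just a much faster resolver and installer in front of them.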
But the increased resource usage is real. Now around 10% of our builds get OOM killed because the build container isn't provisioned big enough to handle uv's excessive memory usage. I've considered reducing the number of available threads to try to throttle the non-deterministic allocation behavior, but that would presumably make it slower too, so instead we just click the re-run job button. Even with that manual intervention 10% of the time, it is so much faster than pip that it's worth it.
Please open an issue with some details about the memory usage. We're happy to investigate and feedback on how it's working in production is always helpful.
We run on-prem k8s and do the pip install stage in a 2CPU/4GB Gitlab runner, which feels like it should be sufficient for the uv:python3.12-bookworm image. We have about 100 deps that aside from numpy/pandas/pyarrow are pretty lightweight. No GPU stuff. I tried 2CPU/8GB runners but it still OOMed occasionally so didn't seem worth using up those resources for the normal case. I don't know enough about the uv internals to understand why it's so expensive, but it feels counter-intuitive because the whole venv is "only" around 500MB.
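If I do end up throttling it, the knobs I'd reach for are uv's concurrency settings (a hedged sketch; check the uv docs for the exact variable names and defaults in your version):

    # assumed environment variables documented by uv
    export UV_CONCURRENT_DOWNLOADS=4
    export UV_CONCURRENT_BUILDS=1
    export UV_CONCURRENT_INSTALLS=2
    uv pip install -r requirements.txt

Presumably that trades some speed back for a flatter memory profile.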
I've been dealing with python vs debian for the last three hours and am deeply angry with the ecosystem. Solved it is not.
Debian decided you should use venv for everything. But when packages are installed in a venv, random cmake nonsense does not find them. There are apt-get level packages, some things find those, others do not. Names are not consistent. There's a thing called pipx which my console recommended for much the same experience. Also the vestiges of 2 vs 3 are still kicking around in the form of refusing to find a package based on the number being present or absent.
Whatever c++headerparser might be, I'm left very sure that hacking python out of the build tree and leaving it on the trash heap of history is the proper thing to do.
These tools together solve a fraction of the problem. The other parts of the problem are interfacing with classic C and C++ libraries and handling different hardware and different OSes. It is not even funny how tricky it is to use the same GPU/CUDA versions but with different CPU architectures, and hopefully most people don't need to be exposed to it. Sometimes parts of the stack depend on a different version of a C++ library than other parts of the stack. Or some require different kernel modules or CUDA driver settings. But I would be happy if there was a standardized way to at least link to the same C++ libraries, hopefully with the same ABI, across different clusters or different OS versions. Python is so far from solved…
uv is venv + insanely fast pip. I’ve used it every day for 5+ months and I still stare in amazement every time I use it. It’s probably the most joy I’ve ever gotten out of technology.
I've been burned too many times by embracing open source products like this.
We've been fed promises like these before. They will inevitably get acquired. Years of documentation, issues, and pull requests will be deleted with little-to-no notice. An exclusively commercial replacement will materialize from the new company that is inexplicably missing the features you relied on in the first place.
For what it's worth, I understand this concern. However, I want to emphasize that pyx is intentionally distinct from Astral's tools. From the announcement post:
> Beyond the product itself, pyx is also an instantiation of our strategy: our tools remain free, open source, and permissively licensed — forever. Nothing changes there. Instead, we'll offer paid, hosted services that represent the "natural next thing you need" when you're already using our tools: the Astral platform.
Basically, we're hoping to address this concern by building a separate sustainable commercial product rather than monetizing our open source tools.
I believe that you are sincere and truthful in what you say.
Unfortunately, the integrity of employees is no guard against the greed of investors.
Maybe next year investors change the CEO and entire management and they start monetizing the open source tools. There is no way of knowing. But history tells us that there is a non-trivial chance of this happening.
It makes sense, but the danger can come when non-paying users unwittingly become dependent on a service that is subsidized by paying customers. What you're describing could make sense if pyx is only private, but what if there is some kind of free-to-use pyx server that people start using? They may not realize they're building on sand until the VC investors start tightening the screws and insist you stop wasting money by providing the free service.
(Even with an entirely private setup, there is the risk that it will encourage too much developer attention to shift to working within that silo and thus starve the non-paying community of support, although I think this risk is less, given Python's enormous breadth of usage across communities of various levels of monetization.)
The entire reason people choose "permissive licenses" is so that it won't last forever. At best, the community can fork the old version without any future features.
I don't think this is true -- a license's virality doesn't mean that its copyright holders can't switch a future version to a proprietary license; past grants don't imply grants to future work under any open source license.
Correct; however, without a CLA and assuming there are outside contributors, relicensing the existing code would be mildly painful, if not downright impossible.
You're saying that would be more painful in a viral license setting, right? If so I agree, although I think there's a pretty long track record of financially incentivized companies being willing to take that pain. MongoDB's AGPL transition comes to mind.
But, to refocus on the case at hand: Astral's tools don't require contributors to sign a CLA. I understand (and am sympathetic) to the suspicion here, but the bigger picture here is that Astral wants to build services as a product, rather than compromising the open source nature of its tools. That's why the announcement tries to cleanly differentiate between the two.
They have no need to; the current repos show everything is under MIT/Apache. They could close the source at any time and not worry about a CLA.
>bigger picture here is that Astral wants to build services as a product
What services? pyx? Looks nice but I doubt my boss is going to pay for it. More likely they just say "Whatever, package is in PyPi, use that."
UV, Ruff, Ty. Again, maybe they can get some data/quant firm who REALLY cares about speed to use their products. Everyone else will emit a long sigh, grab pip/poetry, black and mypy and move on.
> I think there's a pretty long track record of financially incentivized companies being willing to take that pain. MongoDB's AGPL transition comes to mind.
MongoDB had a CLA from the start, didn't it?
> Astral's tools don't require contributors to sign a CLA.
yeah? who's going to court for multiple years to fight giants? Maybe they'll pull a RedHat and put the code behind a subscription service. Still OSS right?
This is just plain false and honestly close-minded. People choose permissive licenses for all sorts of reasons. Some might want to close it off later, but lots of people prefer the non-viral nature of permissive licenses, because it doesn't constrain others' license choice in the future. Still others think that permissive licenses are more free than copyleft, and choose them for that reason. Please don't just accuse vast groups of people of being bad-faith actors just because you disagree with their license choice.
By viral, do you mean licenses like GPL that force those who have modified the code to release their changes (if they're distributing binaries that include those changes)?
Because FWIW CPython is not GPL. They have their own license but do not require modifications to be made public.
They've been the only game in town for a while, and their pricing reflects it. But this project is only for Python (for now?) so JFrog is not _immediately_ in danger.
Will pyx describe a server protocol that could be implemented by others, or otherwise provide software that others can use to host their own servers? (Or maybe even that PyPI can use to improve its own offering?) That is, when using "paid, hosted services like pyx", is one paying for the ability to use the pyx software in and of itself, or is one simply paying for access to Astral's particular server that runs it?
I might not be following: what would that protocol entail? pyx uses the same PEP 503/691 interfaces as every other Python index, but those interfaces would likely not be immediately useful to PyPI itself (since it already has them).
> or is one simply paying for access to Astral's particular server that runs it?
pyx is currently a service being offered by Astral. So it's not something you can currently self-host, if that's what you mean.
> pyx uses the same PEP 503/691 interfaces as every other Python index
... Then how can it make decisions about how to serve the package request that PyPI can't? Is there not some extension to the protocol so that uv can tell it more about the client system?
Interesting. I'll have to look into this further. (I've bookmarked the entire thread for reference, especially since I'm a bit attached to some of my other comments here ;) )
Has that ever happened in the Python ecosystem specifically? It seems like there would be a community fork led by a couple of medium-size tech companies within days of something like that happening, and all users except the most enterprise-brained would switch.
This is a valid concern, but astral just has an amazing track record.
I was surprised to see the community here on HN responding so cautiously. Been developing in python for about a decade now- whenever astral does something I get excited!
Pyx represents the server side, not the client side. The analogue in the pre-existing Python world is PyPI.
Many ideas are being added to recent versions of pip that are at least inspired by what uv has done — and many things are possible in uv specifically because of community-wide standards development that also benefits pip. However, pip has some really gnarly internal infrastructure that prevents it from taking advantage of a lot of uv's good ideas (which in turn are not all original). That has a lot to do with why I'm making PAPER.
For just one example: uv can quickly install previously installed packages by hard-linking a bunch of files from the cache. For pip to follow suit, it would have to completely redo its caching strategy from the ground up, because right now its cache is designed to save only download effort and not anything else about the installation process. It remembers entire wheels, but finding them in that cache requires knowing the URL from which they were downloaded. Because PyPI organizes the packages in its own database with its own custom URL scheme, pip would have to reach out to PyPI across the Internet in order to figure out where it put its own downloads!
> However, pip has some really gnarly internal infrastructure that prevents it from taking advantage of a lot of uv's good ideas (which in turn are not all original).
FWIW, as a pip maintainer, I don't strongly agree with this statement, I think if pip had the same full time employee resources that uv has enjoyed over the last year that a lot of these issues could be solved.
I'm not saying here that pip doesn't have some gnarly internal details, just that the bigger thing holding it back is the lack of maintainer resources.
> For just one example: uv can quickly install previously installed packages by hard-linking a bunch of files from the cache. For pip to follow suit, it would have to completely redo its caching strategy from the ground up, because right now its cache is designed to save only download effort and not anything else about the installation process.
I actually think this isn't a great example, evidenced by the lack of a download or wheel command from uv due to those features not aligning with uv's caching strategy.
That said, I do think there are other good examples to your point, like uv's ability to prefetch package metadata, I don't think we're going to be able to implement that in pip any time soon due to probably the need for a complete overhaul of the resolver.
> FWIW, as a pip maintainer, I don't strongly agree with this statement, I think if pip had the same full time employee resources that uv has enjoyed over the last year that a lot of these issues could be solved.
Fair enough. I'm sure if someone were paying me a competitive salary to develop my projects, they'd be getting done much faster, too.
> I actually think this isn't a great example, evidenced by the lack of a download or wheel command from uv due to those features not aligning with uv's caching strategy.
I guess you're talking about the fact that uv's cache only stores the unpacked version, rather than the original wheel? I'm planning to keep the wheel around, too. But my point was more that because of this cache structure, pip can't even just grab the wheel from its cache without hitting the Internet, on top of not having a place to put a cache of the unpacked files.
> uv's ability to prefetch package metadata,
You mean, as opposed to obtaining it per version, lazily? Because the .metadata file mechanism does seem to work pretty well nowadays.
> I don't think we're going to be able to implement that in pip any time soon due to probably the need for a complete overhaul of the resolver.
Ugh, yeah. I know the resolution logic has been extracted as a specific package, but it's been challenging trying to figure out how to actually use that in a project that isn't pip.
This doesn't generalize: you could have said the same thing about pip versus easy_install, but pip clearly has worthwhile improvements over easy_install that were never merged back into the latter.
Pip is broken and has been for years, they're uninterested in fixing the search. Or even removing the search or replacing it with a message/link to the package index.
imo, if pip's preference is to ship broken functionality, then what is/is not shipped with pip is not meaningful.
This is not a charitable interpretation. The more charitable read is that fixing search is non-trivial and has interlocking considerations that go beyond what pip's volunteer maintainers reasonably want to or can pick up.
(And for the record: it isn't their fault at all. `pip search` doesn't work because PyPI removed the search API. PyPI removed that API for very good reasons[1].)
That was 7 years ago. If it's not coming back, the CLI should make that clear, instead of giving a temporary "cannot connect" message that implies it could work, if you wait a minute and try again.
It was three years ago; 2018 is when they considered removing the command, not when the search API was actually removed from PyPI.
And this is part of the interlocking considerations I mentioned: there are private indices that supply the XML-RPC API, and breaking them doesn't seem justifiable[1].
Does that seem like a real solution to you? That it's ok to represent a never-functional operation as one that might maybe work? ...because it could work if you jump through a bunch of undocumented hoops?
It's so wild to me that so many people are apparently against making a user-friendly update. The whole thing seems very against pep8 (it's surprising, complicated, non-specific, etc.).
I don't know what to tell you; I just gave an example of it being functional for a subset of users, who don't deserve to be broken just because it's non-functional on PyPI.
Nobody wants anything to be user-unfriendly. You're taking a very small view into Python packaging and extending it to motives, when resources (not even financial ones) are the primary challenge.
Is code to detect when the user is not in that subset and say that it doesn't work really really hard for some non-obvious reason? If the default case for the vast majority of users doesn't work, it doesn't seem like printing a more useful error message to them should be that hard.
> it doesn't seem like printing a more useful error message to them should be that hard.
I think the existing error message is useful:
$ pip search foo
ERROR: XMLRPC request failed [code: -32500]
RuntimeError: PyPI no longer supports 'pip search' (or XML-RPC search). Please use https://pypi.org/search (via a browser) instead. See https://warehouse.pypa.io/api-reference/xml-rpc.html#deprecated-methods for more information.
It says (1) what failed, (2) why it failed, (3) links to a replacement, and (4) links to a deprecation explanation. That last link could maybe then link back to the pip issue or include some more context, but it's a far cry from being not helpful.
> Why is it so hard to install PyTorch, or CUDA, or libraries like FlashAttention or DeepSpeed that build against PyTorch and CUDA?
This is so true! On Windows (and WSL) it is also exacerbated by some packages requiring the use of compilers bundled with outdated Visual Studio versions, some of which are only available by manually crafting download paths. I can't wait for a better dev experience.
Stuff like that led me fully away from Ruby (due to Rails), which is a shame, I see videos of people chugging along with Ruby and loving it, and it looks like a fun language, but when the only way I can get a dev environment setup for Rails is using DigitalOcean droplets, I've lost all interest. It would always fail at compiling something for Rails. I would have loved to partake in the Rails hype back in 2012, but over the years the install / setup process was always a nightmare.
I went with Python because I never had this issue. Now with any AI / CUDA stuff it's a bit of a nightmare, to the point where you use someone's setup shell script instead of trying to use pip at all.
Do I get it right that this issue is on Windows? I've never heard of the issues you describe while working with Linux. I've seen people struggle with macOS a bit due to brew having different versions of some library or other, mostly when self-compiling Ruby.
I had issues on Mac, Windows and Linux... It was obnoxious. It led me to adopt a very simple rule: if I cannot get your framework / programming language up and running in under 10 minutes (barring compilation time / download speeds) I am not going to use your tools / language. I shouldn't be struggling with the most basic of hello worlds with your language / framework. I don't in like 100% of the other languages I already use, why should I struggle to use a new language?
The mmdetection library (https://github.com/open-mmlab/mmdetection/issues) also has hundreds of version-related issues. Admittedly, that library has not seen any updates for over a year now, but it is sad that things just break and become basically unusable on modern Linux operating systems because NVIDIA can't stop breaking backwards and forwards compatibility for what is essentially just fancy matrix multiplication.
On Linux, good luck if you're using anything besides the officially NVIDIA-supported Ubuntu version. Just 24.04 instead of 22.04 brings regular random breakages and issues, and running on Arch Linux is just endless pain.
Have you tried conda? Since the integration of mamba its solver is fast and the breadth of packages is impressive. Also, if you have to support Windows and Python with native extensions, conda is a godsend.
I would recommend learning a little bit of C compilation and build systems. Ruby/Rails is about as polished as you could get for a very popular project. Maybe libyaml will be a problem once in a while if you're compiling Ruby from scratch, but otherwise this normally works without a hassle. And those skills will apply everywhere else. As long as we have C libraries, this is about as good as it gets, regardless of the language/runtime.
I'm surprised to hear that. Ruby was the first language in my life/career where I felt good about the dependency management and packaging solution. Even when I was a novice, I don't remember running into any problems that weren't obviously my fault (for example, installing the Ruby library for PostgreSQL before I had installed the Postgres libraries on the OS).
Meanwhile, I didn't feel like Python had reached the bare minimum for package management until Pipenv came on the scene. It wasn't until Poetry (in 2019? 2020?) that I felt like the ecosystem had reached what Ruby had back in 2010 or 2011 when bundler had become mostly stable.
Bundler has always been the best package manager of any language that I've used, but dealing with gem extensions can still be a pain. I've had lots of fun bugs where an extension worked in dev but not prod because of differences in library versions. I ended up creating a docker image for development that matched our production environment and that pretty much solved those problems.
Let's be honest here - whilst some experiences are better/worse than others, there doesn't seem to be a dependency management system that isn't (at least half) broken.
I use Go a lot, the journey has been
- No dependency management
- Glide
- Depmod
- I forget the name of the precursor - I just remembered, VGo
- Modules
We still have proxying, vendoring, versioning problems
Python: VirtualEnv
Rust: Cargo
Java: Maven and Gradle
Ruby: Gems
Even OS dependency management is painful - yum, apt (which was a major positive when I switched to Debian based systems), pkg (BSD people), homebrew (semi-official?)
Dependency management in the wild is a major headache. Go (I only mention it because I am most familiar with it) did away with some compilation dependency issues by shipping binaries with no dependencies (meaning that it didn't matter which version of Linux you built your binary on, it will run on any Linux of the same arch - none of that "wrong libc" 'fun'), but you still have issues when two different people build the same binary and need extra dependency management (vendoring brings with it caching problems - is the version in the cache up to date, will updating one version of one dependency break everything - what fun)
NuGet for C# has always been fantastic, and I like Cargo, though sometimes waiting for freaking ever for things to build does kill me on the inside a little bit. I do wish Go had a better package manager trajectory, I can only hope they continue to work on it, there were a few years I refused to work on any Go projects because setup was a nightmare.
Have you tried JRuby? It might be a bit too large for your droplet, but it has the java versions of most gems and you can produce cross-platform jars using warbler.
In-house app that is easy to develop and needs to be cross-platform (windows, linux, aix, as400). The speed picks up as it runs, usually handling 3000-5000 eps on old hardware.
And yet operating distributed systems built on it is a world of pain. Elasticsearch, I am looking at you. Even with modern hardware resources, running and scaling on top of the JVM remains an expensive, frustrating endeavor because of its limitations.
In addition to elasticsearch's metrics, there's like 4 JVM metrics I have to watch constantly on all my clusters to make sure the JVM and its GC is happy.
Given that WSL is pretty much just Linux, I don't see what relevance Visual Studio compiler versions have to it. WSL binaries are always built using Linux toolchains.
At the same time, even on Windows, libc has been stable since Win10 - that's 10 years now. Which is to say, any binary compiled by VC++ 2015 or later is C-ABI-compatible with any other such binary. The only reasons why someone might need a specific compiler version are if they are relying on some language features not supported by older ones, or if they're trying to pass C++ types across the ABI boundary, which is a fairly rare case.
This is the right direction for Python packaging, especially for GPU-heavy workflows. Two concrete things I'm excited about: 1) curated, compatibility-tested indices per accelerator (CUDA/ROCm/CPU) so teams stop bikeshedding over torch/cu* matrixes, and 2) making metadata queryable so clients can resolve up front and install in parallel. If pyx can reduce the 'pip trial-and-error' loop for ML by shipping narrower, hardware-targeted artifacts (e.g., SM/arch-specific builds) and predictable hashes, that alone saves hours per environment. Also +1 to keeping tools OSS and monetizing the hosted service—clear separation builds trust. Curious: will pyx expose dependency graph and reverse-dependency endpoints (e.g., "what breaks if X→Y?") and SBOM/signing attestation for supply-chain checks?
In my experience, Anaconda (including Miniconda, Micromamba, IntelPython, et al.) is still the default choice in scientific computing and machine learning.
It's useful because it also packages a lot of other deps like CUDA drivers, DB drivers, git, openssl, etc. When you don't have admin rights, it's really handy to be able to install them and there's no other equivalent in the Python world. That being said, the fact conda (and derivatives) do not follow any of the PEPs about package management is driving me insane. The ergonomics are bad as well with defaults like auto activation of the base env and bad dependency solver for the longest time (fixed now), weird linking of shared libs, etc.
When was that ever a part of the definition? It was part of the early Unix culture, sure, but even many contemporary OSes didn't ship with compilers, which were a separate (and often very expensive!) piece of software.
OTOH today most Linux distros don't install any dev tools by default on a clean install. And, ironically, a clean install of Windows has .NET, which includes a C# compiler.
there's something about these comments ("name-collision") that drives me up the wall. do y'all realize multiple things can have the same name? for example, did you know there are many people with exactly the same names?
and yet no one bemoans this (hospitals don't consult name registries before filling out birth certificates). that's because it's almost always extremely clear from context.
> The real pyx
what makes that pyx any more "real" than this pyx? it's the extension of the language py plus a single letter. there are probably a thousand projects that could rightfully use that combination of letters as a name.
Human naming has nothing to do with software naming, which seems obvious but apparently not. Python package creators should check the PyPI registry for names and generally avoid name collisions where reasonable. Common sense applies, for reduced confusion for users globally and also for potential legal issues if any party trademarks their software name. What makes one pyx more real than the other is that one was first and took the spot on PyPI. Simple as that. https://pypi.org/project/PyX/
Python packaging has a lot of standards, but I would say most of them (especially in the last decade) don't really compete with each other. They lean more towards the "slow accretion of generally considered useful features" style.
This itself is IMO a product of Python having a relatively healthy consensus-driven standardization process for packaging, rather than an authoritative one. If Python had more of an authoritative approach, I don't think the language would have done as well as it has.
Do you really think Python’s consensus-driven language development is better than authoritarian?
I am honestly tired of the Python packaging situation. I breathe a sigh of relief in languages like Go and Rust with an “authoritative” built-in solution.
I wouldn’t mind the 30 different packaging solutions as long as there was an authoritative “correct” solution. All the others would then be opt-in enhancements as needed.
I guess a good thought experiment would be if we were to design a packaging system (or decide not to) for a new PL like python, what would it look like?
> I breathe a sigh of relief in language like Go and Rust with an “authoritative” built-in solution.
I don't know about Go, but Rust's packaging isn't authoritative in the sense that I meant. There's no packaging BDFL; improvements to Rust packaging happen through a standards process that closely mirrors that of Python's PEPs.
I think the actual difference between Rust and Python is that Rust made the (IMO correct) decision early on to build a single tool for package management, whereas Python has historically had a single installer and left every other part of package management up to the user. That's a product of the fact that Python is more of a patchwork ecosystem and community than Rust is, plus the fact that it's a lot older and a lot bigger (in terms of breadth of user installation base).
Basically, hindsight is 20/20. Rust rightly benefited from Python's hard lesson about not having one tool, but they also rightly benefited from Python's good experience with consensus-driven standardization.
Welp, I guess it's time to start pulling all the uv deps out of our builds and enjoy the extra 5 minutes of calm per deploy. I'm not gonna do another VC-fueled supply chain poisoning switcheroo under duress of someone else's time crunch to start churning profit.
As I said a couple weeks ago, they're gonna have to cash out at some point. The move won't be around uv -- it'll be a protected private PyPI or something.
Not sure what you're trying to get at here. Charlie Marsh has literally said this himself; see e.g. this post he made last September:
> "An example of what this might look like (we may not do this, but it's helpful to have a concrete example of the strategy) would be something like an enterprise-focused private package registry."
Astral doesn't really have a business model yet, it has potential business models.
The issue is that there isn't a clean business model that will produce the kind of profits that will satisfy their VCs - not that there isn't any business model that will help support a business like theirs.
Private package management would probably work fine if they hadn't taken VC money.
Cash out is a bit of a negative word here. They've shown the ability to build categorically better tooling, so I'm sure a lot of companies would be happy to pay them to fix even more of their problems.
I haven't adopted uv yet; I'm watching to see what their move will be. We recently had to review our use of Anaconda tools due to their changes, then review Qt's license changes. Not looking forward to another license ordeal.
We're hoping that building a commercial service makes it clear that we have a sustainable business model and that our tools (like uv) will remain free and permissively licensed.
I think having a credible, proven business model is a feature of an open source project - without one there are unanswered questions about ongoing maintenance.
I've been wondering where the commercial service would come in and this sounds like just the right product that aligns with what you're already doing and serves a real need. Setting up scalable private registries for python is awful.
Fortunately for a lot of what uv does, one can simply switch to something else like Poetry. Not exactly a zero-code lift but if you use pyproject.toml, there are other tools.
Of course if you are on one of the edge cases of something only uv does, well... that's more of an issue.
What does GPU-aware mean in terms of a registry? Will `uv` inspect my local GPU spec and decide what the best set of packages would be to pull from Pyx?
Since this is a private, paid-for registry aimed at corporate clients, will there be an option to expose those registries externally as a public instance, but paid for by the company? That is, can I as a vendor pay for a Pyx registry for my own set of packages, and then provide that registry as an entrypoint for my customers?
> Will `uv` inspect my local GPU spec and decide what the best set of packages would be to pull from Pyx?
We actually support this basic idea today, even without pyx. You can run (e.g.) `uv pip install --torch-backend=auto torch` to automatically install a version of PyTorch based on your machine's GPU from the PyTorch index.
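Concretely, that looks like this (a sketch; the cu121 label is just one of the published PyTorch index flavors):

    $ uv pip install --torch-backend=auto torch
    # roughly what you'd otherwise do by hand, picking the matching index yourself:
    $ uv pip install torch --index-url https://download.pytorch.org/whl/cu121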
pyx takes that idea and pushes it further. Instead of "just" supporting PyTorch, the registry has a curated index for each supported hardware accelerator, and we populate that index with pre-built artifacts across a wide range of packages, versions, Python versions, PyTorch versions, etc., all with consistent and coherent metadata.
So there are two parts to it: (1) when you point to pyx, it becomes much easier to get the right, pre-built, mutually compatible versions of these things (and faster to install them); and (2) the uv client can point you to the "right" pyx index automatically (that part works regardless of whether you're using pyx, it's just more limited).
> Since this is a private, paid-for registry aimed at corporate clients, will there be an option to expose those registries externally as a public instance, but paid for by the company? That is, can I as a vendor pay for a Pyx registry for my own set of packages, and then provide that registry as an entrypoint for my customers?
We don't support this yet but it's come up a few times with users. If you're interested in it concretely feel free to email me (charlie@).
What happens in a situation where I have access to a login node, from which I can install packages, but the computing nodes don't have internet access? Can I define the target system in some hardware.toml and install for it even if my local system is different?
To be more specific, I'd like to do `uv --dump-system hardware.toml` in the computing node and then in the login node (or my laptop for that matter) just do `uv install my-package --target-system hardware.toml` and get an environment I can just copy over.
Yes, we let you override our detection of your hardware. Though we haven't implemented dumping detected information on one platform for use on another, it's definitely feasible, e.g., we're exploring a static metadata format as a part of the wheel variant proposal https://github.com/wheelnext/pep_xxx_wheel_variants/issues/4...
Astral folks that are around - there seems to be a bit of confusion in the product page that the blog post makes a little more clear.
> The next step in Python packaging
The headline is the confusing bit I think - "oh no, another tool already?"
IMO you should lean into stating this is going to be a paid product (answering how you plan to make money and become sustainable), and highlight that this will help solve private packaging problems.
I'm excited by this announcement by the way. Setting up scalable private python registries is a huge pain. Looking forward to it!
Is there a big enough commercial market for private Python package registries to support an entire company and its staff? Looks like they're hiring for $250k engineers, starting a $26k/year OSS fund, etc. Expenses seem a bit high if this is their first project unless they plan on being acquired?
It's interesting because the value is definitely there. Every single Python developer you meet (many of whom are highly paid) has a story about wasting a bunch of time on these things. The question is how much of this value Astral can capture.
I think based on the quality of their work, there's also an important component which is trust. I'd trust and pay for a product from them much more readily than an open source solution with flaky maintainers.
Yeah, they certainly generate a lot of value by providing excellent productivity tooling. The question is how they capture some of that value, which is notoriously hard with an OSS license. A non-OSS license, on the other hand, creates the Adobe trap, where companies deploy more and more aggressive monetization strategies, making life worse and worse for users of the software.
Just one data point, but if it's as nice to use as their open source tools and not outrageously expensive, I'd be a customer. Current offerings for private python package registries are kind of meh. Always wondered why github doesn't offer this.
The real thing that I hope someone is able to solve is downloading such huge amounts of unnecessary code. As I understand, the bulk of the torch binary is just a huge nvfatbin compiled for every SM under the sun when you usually just want it to run on whatever accelerators you have on hand. Even just making narrow builds of like `pytorch-sm120a` (with stuff like cuBLAS thin binaries paired with it too) as part of a handy uv extra or something like that would make it much quicker and easier.
Another piece is that PyPI has no index— it's just a giant list of URLs [1] where any required metadata (eg, the OS, python version, etc) is encoded in the filename. That makes it trivial to throw behind a CDN since it's all super static, but it has some important limitations:
- there's no way to do an installation dry run without pre-downloading all the packages (to get their dep info)
- there's no way to get hashes of the archives
- there's no way to do things like reverse-search (show me everything that depends on x)
I'm assuming that a big part of pyx is introducing a dynamically served (or maybe even queryable) endpoint that can return package metadata and let uv plan ahead better, identify problems and conflicts before they happen, install packages in parallel, etc.
Astral has an excellent track record on the engineering and design side, so I expect that whatever they do in this space will basically make sense, it will eventually be codified in a PEP, and PyPI will implement the same endpoint so that other tools like pip and poetry can adopt it.
For sdists, this is impossible until we can drop support for a bunch of older packages that don't follow modern standards (which is to say, including the actual "built" metadata as a PKG-INFO file, and having that file include static data for at least name, version and dependencies). I'm told there are real-world projects out there for which this is currently impossible, because the dependencies... depend on things that can't be known without inspecting the end user's environment. At any rate, this isn't a PyPI problem.
> there's no way to get hashes of the archives
This is provided as a URL fragment on the URLs, as described in https://peps.python.org/pep-0503/. Per PEP 658, the hash for the corresponding metadata files is provided in the data-dist-info-metadata (and data-core-metadata) attributes of the links.
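For illustration, an entry on a Simple-API project page looks roughly like this (hypothetical package; path and hashes elided):

    <a href="https://files.pythonhosted.org/packages/.../example_pkg-1.0-cp312-cp312-manylinux_2_17_x86_64.whl#sha256=..."
       data-requires-python="&gt;=3.9"
       data-core-metadata="sha256=...">example_pkg-1.0-cp312-cp312-manylinux_2_17_x86_64.whl</a>

The interpreter/platform info lives only in the filename, the archive hash in the URL fragment, and the hash of the PEP 658 .metadata file in the data attribute, so a resolver can fetch the small metadata file instead of the whole wheel.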
Ah interesting, thanks for that! I was frustrated once again recently to note that `pip install --dry-run` required me to pre-download all packages, so I assumed nothing had changed.
You could do worse than to start using --only-binary=:all: by default. (It's even been proposed as default behaviour: https://github.com/pypa/pip/issues/9140) Even if you can't actually install that way, it will point out the places where sdists are needed.
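Something like this (a sketch; --dry-run and --report need a reasonably recent pip):

    $ pip install --dry-run --only-binary=:all: --report report.json -r requirements.txt
    # errors out for anything that only ships an sdist; report.json lists what would be installed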
In principle, separate metadata availability should still at least be possible for most sdists eventually. But I'm not the one calling the shots here.
Should I expect that to download only metadata and not whole wheels/sdists for everything? Or does that depend on everything in my requirements file being available as a wheel?
I'm brushing up on Python for a new job, and boy what a ride. Not because of the language itself but the tooling around packages. I'm coming from Go and TS/JS, and while these two ecosystems have their own pros and cons, at least they are more or less straightforward to get onboarded with (there are 1 or 2 tools you need to know about). In Python there are dozens of tools/concepts related to packaging: pip, easy_install, setuptools, setup.py, pypy, poetry, uv, venv, virtualenv, pipenv, wheels, ...
There's even an entire website dedicated to this topic: https://packaging.python.org
Don't understand how a private company like Astral is leading here. Why is that hard for the Python community to come up with a single tool to rule them all? (I know https://xkcd.com/927/). Like, you could even copy what Go or Node are doing, and make it Python-aware; no shame on that. Instead we have these who-knows-how-long-they-will-last tools every now and then.
They should remove the "There should be one-- and preferably only one --obvious way to do it." from the Python Zen.
I don't know, I was looking at TS tutorials the other day and there seemed to be at least half a dozen "bundlers", with different tutorials suggesting different ones to use. It took me a while to figure out I could just directly invoke "tsc" to generate JavaScript from TypeScript.
It's not an easy task, and when there's already lots of established practices, habits, and opinions, it becomes even more difficult to get around the various pain points. There's been many attempts: pip (the standard) is slow, lacks dependency resolution, and struggles with reproducible builds. Conda is heavy, slow to solve environments, and mixes Python with non-Python dependencies, which makes understanding some setups very complicated. Poetry improves dependency management but is sluggish and adds unnecessary complexity for simple scripts/projects. Pipenv makes things simpler, but also has the same issue of slow resolution and inconsistent lock files. Those are the ones I've used over the years at least.
uv addressed these flaws with speed, solid dependency resolution, and a simple interface that builds on what people are already used to. It unifies virtual environment and package management, supports reproducible builds, and integrates easily with modern workflows.
> In Python there are dozens of tools/concepts related to packaging: pip, easy_install, setuptools, setup.py, pypy, poetry, uv, venv, virtualenv, pipenv, wheels,
Some of those are package tools, some are dependency managers, some are runtime environments, some are package formats...
Some are obsolete at this point, and others by necessity cover different portions of programming language technologies.
I guess what I'm saying is, for the average software engineer, there's not too many more choices in Python for programming facilities than in Javascript.
I don't know what guides you're reading, but I haven't touched easy_install in at least a decade. Its successor, pip, had effectively replaced all use cases for it by around 2010.
I work with Python, Node and Go and I don't think any of them have great package systems. Go has an amazing module isolation system and boy do I wish hiding functions within a module/package was as easy in Python as it is in Go. What saves Go is the standard library which makes it possible to write almost everything without needing external dependencies. You've worked with JavaScript and I really don't see how Python is different. I'd argue that Deno and JSR is the only "sane" approach to packages and security, but it's hardly leading and NPM is owned by Microsoft so it's not like you have a great "open source" platform there either. On top of that you have the "fun" parts of ESM vs CommonJS.
Anyway, if you're familiar with Node then I think you can view pip and venv as the npm of Python. Things like Poetry are Yarn, made as replacements because pip sort of sucks. UV on the other hand is a drop-in replacement for pip and venv (and other things) similar to how pnpm is basically npm. I can't answer your question on why there isn't a "good" tool to rule them all, but the single tool has been pip since 2014, and since UV is a drop-in, it's very easy to use UV in development and pip in production.
I think it's reasonable to worry about what happens when Astral needs to make money for their investors, but that's the beauty of UV compared to a lot of other Python tools. It's extremely easy to replace because it's essentially just smarter pip. I do hope Astral succeeds with their pyx, private registries and so on by the way.
I appreciate everything they’ve done but the group which maintains Pip and the package index is categorically incapable of shipping anything at a good velocity.
It’s entirely volunteer based so I don’t blame them, but the reality is that it’s holding back the ecosystem.
I suspect it’s also a misalignment of interests. No one there really invests in improving UX.
> the group which maintains Pip and the package index is categorically incapable of shipping anything at a good velocity.
> It’s entirely volunteer based so I don’t blame them
It's not just that they're volunteers; it's the legacy codebase they're stuck with, and the use cases that people will expect them to continue supporting.
> I suspect it’s also a misalignment of interests. No one there really invests in improving UX.
"Invest" is the operative word here. When I read discussions in the community around tools like pip, a common theme is that the developers don't consider themselves competent to redesign the UX, and there is no money from anywhere to hire someone who would be. The PSF operates on an annual budget on the order of $4 million, and a big chunk of that is taken up by PyCon, supporting programs like PyLadies, generic marketing efforts, etc. Meanwhile, total bandwidth use at PyPI has crossed into the exabyte range (it was ~600 petabytes in 2023 and growing rapidly). They would be completely screwed without Fastly's incredible in-kind donation.
Indeed, they broke a few features in the last few years and made the excuse "we can't support them, we're volunteers." Well, how about stop breaking things that worked for a decade? That would take less effort.
They had time to force "--break-system-packages" on us though, something no one asked for.
> how about stop breaking things that worked for a decade?
They aren't doing this.
> They had time to force "--break-system-packages" on us though, something no one asked for.
The maintainers of several Linux distros asked for it very explicitly, and cooperated to design the feature. The rationale is extensively documented in the proposal (https://peps.python.org/pep-0668/). This is especially important for distros where the system package manager is itself implemented in Python, since corrupting the system Python environment could produce a state that is effectively unrecoverable (at least without detailed Python-specific know-how).
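Mechanically it's just a marker file that the distro drops next to the standard library; when pip sees it, it refuses to install into that environment unless you pass --break-system-packages. Roughly (path and wording vary by distro; paraphrased, not Debian's exact text):

    # e.g. /usr/lib/python3.12/EXTERNALLY-MANAGED
    [externally-managed]
    Error=This environment is managed by the system package manager.
     Install Python packages via apt, or create a virtual environment
     with 'python3 -m venv' and use pip inside it instead.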
I was being a bit facetious; sure, someone asked for it, but it was pretty dumb. This has never "corrupted" anything, it's rare (it hasn't happened to me in the last 15 years), and it's simple to fix when you know what you're doing.
Not everyone can simply fix it, so a better solution would be to isolate the system python, allow more than one installed, etc.
Distros already do this to some extent.
> Why is that hard for the Python community to come up with a single tool to rule them all? (I know https://xkcd.com/927/).
because they're obsessed with fixing non-issues (switching out pgp signing for something you can only get from microsoft, sorry, "trusted providers", arguing about mission statements, etc.)
whilst ignoring the multiple elephants in the room (namespacing, crap slow packaging tool that has to download everything because the index sucks, mix of 4 badly documented tools to build anything, index that operates on filenames, etc.)
You don't need to know most of those things. For twenty years, right up until last year, I used setup.py and pip exclusively, with a venv for each job at work. Wheels are simply prebuilt .zips. That's about an hour of learning, more or less.
Now we have pyproject.toml and uv to learn. This is another hour or so of learning, but well worth it.
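For the uv half, the project-level workflow is only a few commands (a sketch; the project and package names are just examples):

    $ uv init demo && cd demo   # scaffolds a pyproject.toml
    $ uv add requests           # adds the dependency and writes uv.lock
    $ uv run python -c "import requests; print(requests.__version__)"

uv creates and manages the venv behind the scenes, so there's nothing to activate.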
Astral is stepping up because no one else did. Guido never cared about packaging and that's why it has been the wild west until now.
> Why is that hard for the Python community to come up with a single tool to rule them all?
Whatever I would say at this point about PyPA would be so uncharitable that dang would descend on me with the holy hammer of banishment, but you can get my drift. I just don't trust them to come out with good tooling. The plethora they have produced so far is quite telling.
That said, pip covers 99% of my needs when I need to do anything with Python. There are ecosystems that have it way worse, so I count my blessings. But apparently, since Poetry and uv exist, my 99% are not many other people's 99%.
If I wanted to package my Python stuff, though, I'm getting confused. Is it now setup.py or pyproject.toml? Or maybe both? What if I need to support an older Python version as seen in some old-but-still-supported Linux distributions?
> They should remove the "There should be one-- and preferably only one --obvious way to do it." from the Python Zen.
Granted, tooling is different from the language itself. Although PyPA could benefit from a decade having a BDFL.
> If I wanted to package my Python stuff, though, I'm getting confused. Is it now setup.py or pyproject.toml? Or maybe both? What if I need to support an older Python version as seen in some old-but-still-supported Linux distributions?
Your Python version is irrelevant, as long as your tools and code both run under that version. The current ecosystem standard is to move in lock-step with the Python versions that the core Python team supports. If you want to offer extended support, you should expect to require more know-how, regardless. (I'm happy to receive emails about this kind of thing; I use this username, on the Proton email service.)
Nowadays, you should really always use at least pyproject.toml.
If your distribution will include code in non-Python languages and you choose to use Setuptools to build your package, you will also need a setup.py. But your use of setup.py will be limited to just the part that explains how to compile your non-Python code; don't use it to describe project metadata, or to orchestrate testing, or to implement your own project management commands, or any of the other advanced stuff people used to do when Setuptools was the only game in town.
In general, create pyproject.toml first, and then figure out if you need anything else in addition. Keeping your metadata in pyproject.toml is the sane, modern way, and if we could just get everyone on board, tools like pip could be considerably simpler. Please read https://blog.ganssle.io/articles/2021/10/setup-py-deprecated... for details about modern use of Setuptools.
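For a pure-Python project, the whole thing can be as small as this (names and version pins are placeholders; swap the build backend for Flit or Hatchling if you prefer):

    [build-system]
    requires = ["setuptools>=68"]
    build-backend = "setuptools.build_meta"

    [project]
    name = "example-package"
    version = "0.1.0"
    requires-python = ">=3.9"
    dependencies = ["requests>=2.31"]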
Regardless of your project, I strongly recommend considering alternatives to Setuptools. It was never designed for its current role and has been stuck maintaining tons of legacy cruft. If your project is pure Python, Flit is my current recommendation as long as you can live with its opinionated choices (in particular, you must have a single top-level package name in your distribution). For projects that need to access a C compiler for a little bit, consider Hatch. If you're making the next Numpy, keep in mind that they switched over to Meson. (I also have thrown my hat in this ring, although I really need to get back to that project...)
If you use any of those alternatives, you may have some tool-specific configuration that you do in pyproject.toml, but you may also have to include arbitrary code analogous to setup.py to orchestrate the build process. There's only so far you can get with a config file; real-world project builds get ferociously complex.
> Why is that hard for the Python community to come up with a single tool to rule them all?
0. A lot of those "tools and concepts" are actually completely irrelevant or redundant. easy_install has for all practical purposes been dead for many years. virtualenv was the original third party project that formed the basis for the standard library venv, which has been separately maintained for people who want particular additional features; it doesn't count as a separate concept. The setup.py file is a configuration file for Setuptools that also happens to be Python code. You only need to understand it if you use Setuptools, and the entire point is that you can use other things now (specifically because configuring metadata with Python code is a terrible idea that we tolerated for far too long). Wheels are just the distribution format and you don't need to know anything about how they're structured as an end user or as a developer of ordinary Python code — only as someone who makes packaging tools. And "pypy" is an alternate implementation of Python — maybe you meant PyPI? But that's just the place that hosts your packages; no relevant "concept" there.
Imagine if I wanted to make the same argument about JavaScript and I said that it's complicated because you have to understand ".tar.gz (I think, from previous discussion here? I can't even find documentation for how the package is actually stored as a package on disk), Node.js, NPM, TypeScript, www.npmjs.com, package.json..." That's basically what you're doing here.
But even besides that, you don't have to know about all the competing alternatives. If you know how to use pip, and your only goal is to install packages, you can completely ignore all the other tools that install packages (including poetry and uv). You only have any reason to care about pipenv if you want to use pip and care about the specific things that pipenv does and haven't chosen a different way to address the problem. Many pip users won't have to care about it.
1. A lot of people actively do not want it that way. The Unix philosophy actually does have some upsides, and there are tons of Python users out there who have zero interest in participating in an "ecosystem" where they share their code publicly even on GitHub, never mind PyPI — so no matter what you say should be the files that give project metadata or what they should contain or how they should be formatted, you aren't going to get any buy-in. But beyond that, different people have different needs and a tool that tries to make everyone happy is going to require tons of irrelevant cruft for almost everyone.
2. Reverse compatibility. The Python world — both the packaging system and the language itself — has been trying to get people to do things in better, saner ways for many years now; but people will scream bloody murder if their ancient stuff breaks in any way, even when they are advised years in advance of future plans to drop support. Keep in mind here that Python is more than twice as old as Go.
3. Things are simple for Go/JS/TS users because they normally only have to worry about that one programming language. Python packages (especially the best-known, "serious" ones used for heavyweight tasks) very commonly must interface with code written in many other programming languages (C and C++ are very common, but you can also find Rust, Fortran and many more; and Numpy must work with both C and Fortran), and there are many different ways to interface (and nothing that Python could possibly do at a language level to prevent that): by using an explicit FFI, by dlopen() etc. hooks, by shelling out to a subprocess, and more. And users expect that they can just install the Python package and have all of that stuff just work. Often that means that compiled-language code has to be rebuilt locally; and the install tools are expected to be able to download and set up a build system, build code in an isolated environment, etc. etc. All of this is way beyond the expectations placed on something like NPM.
4. The competition is deliberate. Standards — the clearest example being the PEPs 517/518/621 that define the pyproject.toml schema — were created specifically to enable both competition and interoperation. Uv is gaining market share because a lot of people like its paradigms. Imagine if, when people in the Python community first started thinking about the problems and limitations of tools from the early days, they had decided to try to come up with all the paradigms themselves. Imagine if they had gotten them wrong, and then set them in stone for everyone else. When you imagine this, keep in mind that projects like pip and setuptools date to the 2000s. People were simply not thinking about open-source ecosystems in the same way in that era.
> They should remove the "There should be one-- and preferably only one --obvious way to do it." from the Python Zen.
First, define "it". The task is orders of magnitude greater than you might naively imagine. I know, because I've been an active participant in the surrounding discussion for a couple of years, aside from working on developing my own tooling.
> Don't understand how a private company like Astral is leading here. Why is that hard for the Python community to come up with a single tool to rule them all? (I know https://xkcd.com/927/). Like, you could even copy what Go or Node are doing, and make it Python-aware; no shame on that. Instead we have these who-knows-how-long-they-will-last tools every now and then.
Python packaging is (largely) solving problems that Go and Node packaging are not even trying to address.
Not the person you're replying to, so I don't know if this is what he had in mind, but with Python packages you can distribute more than just Python. Some packages contain C/C++/Fortran/Rust/others? source code that pip will try to automatically build upon install. Of course you can't expect everyone to have a dev environment set up, so packages can also contain pre-compiled binary for any combination of windows/mac/linux + amd/arm + glibc/musl + CPython/pypy (did I miss any?).
I don't know much about go, and I've only scratched the surface with node, but as far as node goes I think it just distributes JS? So that would be one answer to what Python packaging is trying to solve that node isn't trying to address.
> any combination of windows/mac/linux + amd/arm + glibc/musl + CPython/pypy (did I miss any?).
From a standards perspective, it is a combination of a Python version/implementation, a "platform" and an "ABI". (After all, the glibc/musl distinction doesn't make sense on Windows.)
Aside from CPython/pypy, the system recognizes IronPython (a C# implementation) and Jython (a Java implementation) under the version "tag"; of course these implementations may have their own independent versioning with only a rough correspondence to CPython releases.
The ABI tag largely corresponds to the implementation and version tag, but for example for CPython builds it also indicates whether Python was built in debug or release mode, and from 3.13 onward whether the GIL is enabled.
The platform tag covers Mac, Windows, several generic glibc Linux standards (called "manylinux" and designed to smooth over minor differences between distros), and now also some generic musl Linux standards (called "musllinux"). Basic CPU information (arm vs intel, 32- vs 64-bit etc.) is also jammed in here.
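To make that concrete, here's a small sketch using the third-party "packaging" library (the wheel filename is just an example) that splits a filename into the interpreter, ABI and platform tags described above:

    # pip install packaging
    from packaging.utils import parse_wheel_filename

    name, version, build, tags = parse_wheel_filename(
        "numpy-2.3.2-cp312-cp312-manylinux_2_28_x86_64.whl"
    )
    for tag in tags:
        # prints: cp312 cp312 manylinux_2_28_x86_64
        print(tag.interpreter, tag.abi, tag.platform)

Installers essentially compare these tags against the set of tags the running interpreter supports when deciding which wheel, if any, they can use.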
> Some packages contain C/C++/Fortran/Rust/others? source code that pip will try to automatically build upon install.
And in the TS/JS world we have React.Native that has a flexible pluggable model that allows creating XCode projects with autodiscovered dependencies in C, C++, Swift and other languages.
It's also flexible enough to allow third-party products like Sentry to integrate into the build process to upload debug symbols to the Sentry servers on release builds.
So no, Python is really not unique in its requirements.
Specifically simultaneous distribution of precompiled binaries for many different OS and hardware configurations and built-on-demand source distribution of non-Python software to be used as dependencies with as little (ideally none) host setup by the user all installable under a single name/version everywhere.
imagine a world without: failed to build native gem extension
Strange to reject something when you don't even understand what it is.
PYX is a package registry, and therefore an alternative to PyPI (like how JSR is an alternative to NPM).
The alternative to `pip` that Astral has built is called `uv`. Feel free to not use that either, but personally I'd consider it if I were you. It has full pip compatibility (with significantly more speed), and many other nice features besides.
I wonder whether it will have a flat namespace that everyone competes over or whether the top-level keys will be user/project identifiers of some sort. I hope the latter.
Fundamentally we still have the flat namespace of top level python imports, which is the same as the package name for ~95% of projects, so I'm not sure how they could really change that.
Package names and module names are not coupled to each other. You could have package name like "company-foo" and import it as "foo" or "bar" or anything else.
But you can if you want have a non-flat namespace for imports using PEP 420 – Implicit Namespace Packages, so all your different packages "company-foo", "company-bar", etc. can be installed into the "company" namespace and all just work.
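To make that concrete, here's a runnable sketch (the directory layout and the "company-foo"/"company-bar" distributions are hypothetical) of how PEP 420 lets separately installed distributions share a top-level namespace:

    import os
    import sys
    import tempfile

    # Simulate two separately installed distributions ("company-foo" and
    # "company-bar") that each provide a subpackage of the "company" namespace.
    roots = []
    for sub in ("foo", "bar"):
        root = tempfile.mkdtemp()
        os.makedirs(os.path.join(root, "company", sub))
        with open(os.path.join(root, "company", sub, "__init__.py"), "w") as f:
            f.write(f"NAME = {sub!r}\n")
        roots.append(root)
    # Crucially, neither root contains a company/__init__.py file.

    sys.path[:0] = roots
    from company import foo, bar
    print(foo.NAME, bar.NAME)   # prints: foo bar

Because the "company" directory has no __init__.py, the import system merges the two locations into one namespace package instead of letting the first one found shadow the other.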
Nothing stops an index from validating that wheels use the same name or namespace as their package names. Sdists with arbitrary backends would not be possible, but you could enforce what backends were allowed for certain users.
That's because npm finally decided to adopt enough features as time went on that it could be used in place of yarn, and eventually, if they adopt enough of the features of pnpm, it will replace that too.
Though speaking as a long time developer in the ecosystem, switching between npm, yarn, and pnpm is fairly trivial in my experience. Especially after node_gyp went away
It's a good thing that a newcomer came and showed the world some new concepts which ended up being adopted by the old tool. In the Haskell world, everybody used cabal, everybody switched to stack, and then everybody switched back to cabal once it got its new-build commands ready.
Node ecosystem still has the problem where if you try to build a project two years later, chances are good it won't work, because breaking changes are so common, and this is then multiplied across all the tiny packages that are dependencies of your dependencies.
Interesting watching this part of the landscape heating up. For repos you've got stalwarts like Artifactory and Nexus, with upstart Cloudsmith. For libraries you've got the OG ActiveState, Chainguard Libraries and, until someone is distracted by a shiny next week, Google Assured Open Source.
Sounds like Pyx is trying to do a bit of both.
Disclosure: I have interacted a bunch with folks from all of these things. Never worked for or been paid by, though.
Aren't the two ways of spacing the em dash in that quote a joke about how it's not actually possible to do that? (And there's a third way in another line of the Zen.)
Do I buy it? Not sure. But apparently there’s more to this line than what it suggests.
Can I ask a dumb question? Why does Ruby (for example) not have this problem, while Python still can't ship a standard solution that isn't constantly changing and rolled up in some corporate offering?
Python packaging for Python-only modules has never been a problem. When people say they hate Python packaging, they are usually talking about being able to successfully install dependencies without much thinking.
But the biggest reason that doesn't just work is the dependencies that have to be compiled, which bring their own problems.
Have you ever had a C dependency in Node or Ruby on a system that wasn't the same system it was built on? Turns out it sucks in all languages. It's just that the number of C-level packages in Python is much larger than in, say, Ruby, so the likelihood of a problem is significantly higher.
Ruby is mostly used in web dev, where most if not all of your dependencies tend to be pure Ruby.
Python is used heavily for DS/ML/AI, which is exactly the area where native code packages are necessary and prevalent. Worse yet is that those packages often involve GPU code, and things like CUDA bring their own complications.
If you're writing web apps in Python, dependencies haven't really been a problem for a long time now.
> Ruby is mostly used in web dev, where most if not all of your dependencies tend to be pure Ruby.
There is literally no such thing as a rails app that’s pure ruby. Rails depends on nokogiri, which is a libxml2 wrapper, and all activerecord database adapters are C bindings. Ruby development involves dealing with frequent native extension compilation, just like python.
Before you answer that you have to answer what problem this is solving that PyPI doesn’t already address. uv works great against “legacy” package indexes so I’m not really clear why it’s needed other than to introduce lock-in to a for-profit facility.
Because CPython and PyPA are dysfunctional organizations in the hands of people who are in the right (often corporate) cliques. Don't expect anything from there.
I really want to pay someone money to run package repo mirrors for me, but my problems have been more with npm than with Pypi. Astral, if you're listening.... maybe tackle JS packaging too?
The only thing that is unclear to me is to what extent this setup depends on the package publisher. PyPI might be terrible, but at least it just works when you want to publish. My worry is that this leads to more complexity for the people looking to use a piece of free software, not just for the maintainer.
Maybe they are only targeting dev tooling companies as a way to simplify how they distribute. Especially in the accelerated compute era.
> Emphasis mine. It would indeed be hard to survive without that kind of support from a corporation. A user on HN estimated the yearly cost of this traffic at around 12 million USD/year (according to AWS Cloudfront rates), more than four times the full operating budget of the Python Software Foundation as of 2024.
(As the user in question: three times.)
> Leverage progress in the systems programming ecosystem to create repeatable builds. Turn prebuilt binaries from “sources” into cacheable artifacts that can be deleted and reconstructed at will. Institute a way of creating secondary caches that can start shouldering some of the workload.
This doesn't avoid the need for the wheels to exist and be publicly available. People running CI systems should figure out local caching that actually works, sure. But if you delete that cacheable artifact on the public PyPI website for something like, I don't know, numpy-2.3.2-cp312-cp312-win_arm64.whl, you're going to be re-creating it (and having it downloaded again) constantly. Windows users are just not going to be able to build that locally.
And you know, space usage isn't the problem here — we're talking about a few hard drives' worth of space. The number of downloads is the problem. Outside of CI, I guess that's mostly driven by end users defaulting to the latest version of everything, every time they make a new virtual environment, rather than using whatever's in their package installer's cache. I do know that uv makes the latter a lot easier.
I hate that they are using the pyx name; it's the extension for Cython files. It's going to cause at least a moment of confusion for people. They could have easily checked for name collision in the Python ecosystem but they chose not to do that; that's like a middle finger gesture to the community.
I wanted to start a business exactly like this years ago, when I actually worked in Python. I ended up not doing so, because at the time (circa 2014-2015) I was told it would never take off, no way to get funding.
I'm glad you're able to do what ultimately I was not!
Been waiting to see what Astral would do first (with regards to product). Seems like a mix of Artifactory and conda: Artifactory providing a package server, and conda trying to fix the difficulty that comes from Python packages with compiled components or dependencies. That's mostly solved by wheels, but of course PyTorch wheels requiring a specific CUDA version can still be a mess that conda fixes.
Given Astral's heavy involvement in the wheelnext project I suspect this index is an early adopter of Wheel Variants which are an attempt to solve the problems of CUDA (and that entire class of problems not just CUDA specifically) in a more automated way than even conda: https://wheelnext.dev/proposals/pepxxx_wheel_variant_support...
Not exactly -- part of pyx is a registry (and that part speaks the same standards as PyPI), but the bigger picture is that pyx is part of a larger effort to make Python packaging faster and more cohesive for developers.
To be precise: pyx isn't intended to be a public registry or a free service; it's something Astral will be selling. It'll support private packages and corporate use cases that are (reasonably IMO) beyond PyPI's scope.
FWIW, I think the full paragraph around that snippet is important context:
> Beyond the product itself, pyx is also an instantiation of our strategy: our tools remain free, open source, and permissively licensed — forever. Nothing changes there. Instead, we'll offer paid, hosted services that represent the "natural next thing you need" when you're already using our tools: the Astral platform.
I feel like I must be the crazy one for never having a problem with just vanilla pip, PyPi, and venv (or virtualenv for old Python 2 stuff). Maybe it's just my use case?
But I don’t get it. How does it work? Why is it able to solve the Python runtime dependency problem? I thought uv had kinda already solved that? Why is a new thingy majig needed?
> Why is it able to solve the Python runtime dependency problem? I thought uv had kinda already solved that?
The dependencies in question are compiled C code that Python interfaces with. Handling dependencies for a graph of packages that are all implemented in pure Python, is trivial.
C never really solved all the ABI issues and especially for GPU stuff you end up having to link against very specific details of the local architecture. Not all of these can be adequately expressed in the current Python package metadata system.
Aside from that, a lot of people would like to have packages that use pre-installed dependencies that came with the system, but the package metadata isn't designed to express those dependencies, and you're also on your own for actually figuring out where they are at runtime, even if you take it on blind faith that the user separately installed them.
> especially for GPU stuff you end up having to link against very specific details of the local architecture.
Hrm. This doesn’t sound right to me. Any project should target a particular version of Cuda and then the runtime machine simply needs to have that version available. Right?
> a lot of people would like to have packages that use pre-installed dependencies that came with the system
Those people are wrong. Everything these days requires Docker because it's the only way to deploy software that can reliably not crash on startup. (This is almost entirely a Linux self-induced problem.)
What are the reasons that Python can't implement the same sort of module/packaging system as NodeJS? That seems to work well enough.
Executing a Python script in the same directory as some sort of project.json file that contains all the complicated dependency details would be a pretty good solution to me. But I'm probably missing a whole bunch of details. (Feel free to educate me).
In general I really dislike the current system of having to use new environment variables in a new session in order to isolate Py scripts. It has always seemed like a hack with lots of footguns. Especially if you forget which console is open.
It can. That's what uv is. Put an '#!/usr/bin/env -S uv run python' shebang in your script, add a `pyproject.toml` with all of your deps, and you're done.
There are a bunch of problems with PyPI. For example, there's no metadata API, you have to actually fetch each wheel file and inspect it to figure out certain things about the packages you're trying to resolve/install.
It would be nice if they contributed improvements upstream, but then they can't capture revenue from doing it. I guess it's better to have an alternative and improved PyPI, than to have no improvements and a sense of pride.
There is a lot of other stuff going on with Pyx, but "uv-native metadata APIs" is the relevant one for this example.
I'm guessing it's the right PyTorch and FlashAttention and TransformerEngine and xformers and all that for the machine you're on without a bunch of ninja-built CUDA capability pain.
They explicitly mention PyTorch in the blog post. That's where the big money in Python is, and that's where PyPI utterly fails.
I spend little time with Python, but I didn’t have any problems using uv. Given how great uv is, I’d like to use pyx, but first it would be good if they could provide a solid argument for using it.
Please stop posting like this. HN is not for self-promotion; it's for sharing and discussing interesting topics and projects. We ban people who post like this continually.
I actually think this is great. If Astral can figure out a way to make money using a private registry (something that is used mainly by companies), then they'll have the resources to keep building their amazing open-source projects — Ruff and uv. That's a huge win for Python.
100% agree. I am more than happy to see Astral taking steps in this direction. People can continue to use uv, ruff, and ty without having to pay anything, but companies that benefit tremendously from open source initiatives can pay for a private package registry and directly support the continued development of said tools.
In particular I think it's nice for uv and ruff to remain open source, not open core. And as you say, companies always need paid private registries, for their internal software. A true win-win.
I've been limiting myself to whatever is available on Debian and it's been fine for me for several years now.
I don't understand why people who don't do weird AI stuff would use any of that instead of sticking to distribution packages and having the occasional 1 or 2 external modules that aren't packaged.
Indeed, to expand on my remark: I wrote Python in academia for ~6 years and then professionally for nearly a decade in data science, data engineering and backend web apps. Virtualenv was fine. Pipenv had a nicer CLI and easier to use dependency pinning. But fundamentally all this stuff worked fine.
Because making external modules cooperate with the system environment is awkward at best (and explicitly safeguarded against since 3.11, since it can cause serious problems otherwise even with "user" installs), and installing the distro's packages in a separate environment is not supported as far as I can tell. And because the system environment is often deliberately crippled; it may not even include the entire standard library.
I started at about the same time you did, and I've never seen an instance of software expecting a Python version newer than what is in Debian stable. It happens all the time for Nodejs, Go, or Rust though.
Your comment shows the sad state of software quality these days. Rust is the same: move fast and break things. And lately Mesa has also started to suffer from the same disease. These days you basically need the same build environment as the one on the developer's machine or the build will fail.
What's wrong with just using virtualenv? I never used anything else, and I never felt the need to. Maybe it's not as shiny as the other tools, but it just works.
Nothing is inherently wrong with virtualenv. All these tools make virtual environments and offer some way to manage them. But virtualenv doesn't solve the problem of dependency management.
Man I used python sparingly over the years and I still had to deal with all those package manager changes. Worse than the JS bundling almost?
I've walked the same rocky path and have the bleeding feet to show for it! My problem is that now my packaging/environment mental model is so muddled I frequently mix up the commands...
Even the way you import packages is kinda wack
You forgot the wheels and eggs
You can have my `easy_install` when you pry it from my cold dead fingers.
I felt like Python packaging was more or less fine, right up until pip started to warn me that I couldn't globally install packages anymore. So I need to make a billion venvs to install the same ML and plotting libraries and dependencies that I don't want in a requirements.txt for the project.
I just want packaging to fuck off and leave me alone. Changes here are always bad, because they're changes.
I'd otherwise agree but this problem seems unique to Python. I don't have problems like this with npm or composer or rubygems. Or at least very infrequently. It's almost every time I need to update dependencies or install on a new machine that the Python ecosystem decides I'm not worthy.
I think pip made some poor design choices very early, but pip stuck around for a long time and people kept using it. Of course things got out of control, and people kept inventing new package managers until uv came along. I don't know enough about Python to understand how people could live with that for so long.
Every big Python repo has a Dockerfile, which is much less common in JS.
> pip started to warn me that I couldn't globally install packages anymore
Yeah, I had that on my work computer. I just created a venv and source it in my .bashrc.
Hahaha that is an awesome middle finger to pip :-)
You can turn that off and allow global packages again if you want.
Or install it with the OS package manager or something similar.
You assume the OS package manager I happen to be using even has packages for some of the libraries I want to use.
Nothing about their post indicated they assumed that.
They offered two options, so you can go do the other one if it doesn't work for you.
[dead]
> All python packaging challenges are solved.
This comes across as uninformed at best and ignorant at worst. Python still doesn't have a reliable way to handle native dependencies across different platforms. pip and setuptools cannot be the end all be all of this packaging ecosystem nor should they be.
Try doing CUDA stuff. It's a chemical fire. And the money to be made by solving it would fund arbitrary largesse towards OSS in perpetuity.
I share your concern, but I have saved so much time with uv already that I figure I'll ride it till the VC enshittification kills the host.
Hopefully by that point the community is centralized enough to move in one direction.
I've been heartened by the progress that opentofu has made, so I think if it gets enough momentum it could survive the inevitable money grab
I agree, now I just use uv and forget about it. It does use up a fair bit of disk, but disk is cheap and the bootstrapping time reduction makes working with python a pleasure again
I recently did the same at work, just converted all our pip stuff to use uv pip but otherwise no changes to the venv/requirements.txt workflow and everything just got much faster - it's a no-brainer.
But the increased resource usage is real. Now around 10% of our builds get OOM killed because the build container isn't provisioned big enough to handle uv's excessive memory usage. I've considered reducing the number of available threads to try throttle the non-deterministic allocation behavior, but that would presumably make it slower too, so instead we just click the re-run job button. Even with that manual intervention 10% of the time, it is so much faster than pip it's worth it.
Please open an issue with some details about the memory usage. We're happy to investigate and feedback on how it's working in production is always helpful.
(I work on uv)
Last time I looked into this I found this unresolved issue, which is pretty much the same thing: https://github.com/astral-sh/uv/issues/7004
We run on-prem k8s and do the pip install stage in a 2CPU/4GB Gitlab runner, which feels like it should be sufficient for the uv:python3.12-bookworm image. We have about 100 deps that aside from numpy/pandas/pyarrow are pretty lightweight. No GPU stuff. I tried 2CPU/8GB runners but it still OOMed occasionally so didn't seem worth using up those resources for the normal case. I don't know enough about the uv internals to understand why it's so expensive, but it feels counter-intuitive because the whole venv is "only" around 500MB.
Couldn’t agree more and the `uv run executable.sh` that contains a shebang, imports and then python is just magical.
Is that much different than the python inline script format?
https://peps.python.org/pep-0723/
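For reference, a PEP 723 script looks something like this (the dependency is just an example); with the shebang shown, uv resolves and runs it in an ephemeral environment:

    #!/usr/bin/env -S uv run --script
    # /// script
    # requires-python = ">=3.12"
    # dependencies = ["requests"]
    # ///
    import requests

    # Any compliant tool can read the metadata block above; uv uses it to
    # build a throwaway environment with requests installed before running.
    print(requests.get("https://pypi.org/simple/").status_code)

So not very different at all: uv's `uv run --script` understands exactly this inline format.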
I've been dealing with python vs debian for the last three hours and am deeply angry with the ecosystem. Solved it is not.
Debian decided you should use venv for everything. But when packages are installed in a venv, random cmake nonsense does not find them. There are apt-get level packages; some things find those, others do not. Names are not consistent. There's a thing called pipx, which my console recommended, for much the same experience. Also the vestiges of 2 vs 3 are still kicking around, in the form of tools refusing to find a package based on the number being present or absent.
Whatever c++headerparser might be, I'm left very sure that hacking python out of the build tree and leaving it on the trash heap of history is the proper thing to do.
From what I hear, uv is the "solved" way, and venv by hand is the old way.
These tools together solve a fraction of the problem. The other parts of the problem are interfacing with classic C and C++ libraries and handling different hardware and different OSes. It is not even funny how tricky it is to use the same GPU/CUDA versions but with different CPU architectures, and hopefully most people don't need to be exposed to it. Sometimes parts of the stack depend on a different version of a C++ library than other parts of the stack. Or some require different kernel modules or CUDA driver settings. But I would be happy if there was a standardized way to at least link to the same C++ libraries, hopefully with the same ABI, across different clusters or different OS versions. Python is so far from solved…
uv is venv + insanely fast pip. I’ve used it every day for 5+ months and I still stare in amazement every time I use it. It’s probably the most joy I’ve ever gotten out of technology.
Installing packages is the most joy you've ever gotten outta tech?
Not a project you built, or something you're proud of? Installing packages?
I know, it's like everyone's lost their mind, right?
Nope. You just haven't wrestled with python packages for long enough to appreciate the change.
pip is the default still
No. This is the only thing that python still doesn’t have just working. Otherwise there would be no excitement for anything new in this space.
solved with Docker
sorry, I guess you're new here? Here, try this Kool-Aid, I think it will help you fit in. Oh, don't mind that "MongoDB" logo on the glass, that's old.
I've been burned too many times by embracing open source products like this.
We've been fed promises like these before. They will inevitably get acquired. Years of documentation, issues, and pull requests will be deleted with little-to-no notice. An exclusively commercial replacement will materialize from the new company that is inexplicably missing the features you relied on in the first place.
For what it's worth, I understand this concern. However, I want to emphasize that pyx is intentionally distinct from Astral's tools. From the announcement post:
> Beyond the product itself, pyx is also an instantiation of our strategy: our tools remain free, open source, and permissively licensed — forever. Nothing changes there. Instead, we'll offer paid, hosted services that represent the "natural next thing you need" when you're already using our tools: the Astral platform.
Basically, we're hoping to address this concern by building a separate sustainable commercial product rather than monetizing our open source tools.
I believe that you are sincere and truthful in what you say.
Unfortunately, the integrity of employees is no guard against the greed of investors.
Maybe next year investors change the CEO and entire management and they start monetizing the open source tools. There is no way of knowing. But history tells us that there is a non-trivial chance of this happening.
It makes sense, but the danger can come when non-paying users unwittingly become dependent on a service that is subsidized by paying customers. What you're describing could make sense if pyx is only private, but what if there is some kind of free-to-use pyx server that people start using? They may not realize they're building on sand until the VC investors start tightening the screws and insist you stop wasting money by providing the free service.
(Even with an entirely private setup, there is the risk that it will encourage too much developer attention to shift to working within that silo and thus starve the non-paying community of support, although I think this risk is less, given Python's enormous breadth of usage across communities of various levels of monetization.)
The entire reason people choose "permissive licenses" is so that it won't last forever. At best, the community can fork the old version without any future features.
Only viral licenses are forever.
I don't think this is true -- a license's virality doesn't mean that its copyright holders can't switch a future version to a proprietary license; past grants don't imply grants to future work under any open source license.
Correct; however, without a CLA and assuming there are outside contributors, relicensing the existing code would be mildly painful, if not downright impossible.
You're saying that would be more painful in a viral license setting, right? If so I agree, although I think there's a pretty long track record of financially incentivized companies being willing to take that pain. MongoDB's AGPL transition comes to mind.
But, to refocus on the case at hand: Astral's tools don't require contributors to sign a CLA. I understand (and am sympathetic) to the suspicion here, but the bigger picture here is that Astral wants to build services as a product, rather than compromising the open source nature of its tools. That's why the announcement tries to cleanly differentiate between the two.
They have no need to, current repos show everything is under MIT/Apache. They could close the source at any time and not worry about CLA.
>bigger picture here is that Astral wants to build services as a product
What services? pyx? Looks nice but I doubt my boss is going to pay for it. More likely they just say "Whatever, package is in PyPi, use that."
UV, Ruff, Ty. Again, maybe they can get some data/quant firm who REALLY cares about speed to use their products. Everyone else will emit a long sigh, grab pip/poetry, black and mypy and move on.
> I think there's a pretty long track record of financially incentivized companies being willing to take that pain. MongoDB's AGPL transition comes to mind.
MongoDB had a CLA from the start, didn't it?
> Astral's tools don't require contributors to sign a CLA.
That's a pretty vital difference!
yeah? who's going to court for multiple years to fight giants? Maybe they'll pull a RedHat and put the code behind a subscription service. Still OSS right?
This is just plain false and honestly close-minded. People choose permissive licenses for all sorts of reasons. Some might want to close it off later, but lots of people prefer the non-viral nature of permissive licenses, because it doesn't constrain others' license choice in the future. Still others think that permissive licenses are more free than copyleft, and choose them for that reason. Please don't just accuse vast groups of people of being bad-faith actors just because you disagree with their license choice.
By viral, do you mean licenses like GPL that force those who have modified the code to release their changes (if they're distributing binaries that include those changes)?
Because FWIW CPython is not GPL. They have their own license but do not require modifications to be made public.
Or they want to get more people to use it.
I think you are making a good point, but please don't use the old Steve Ballmer FUD term, "viral." Copyleft is a better term.
I don't think the connotation these days is particular negative, in the sense it's being used here. See, e.g., "viral video".
The word "left" is now very charged too, maybe even more than "viral".
Every word is charged now, so you might as well use it. "Copyleft" is a fine pun on "copyright".
Careful, or you'll get the copyleftists calling you a neo-libre-al.
they call everybody a nazi so by definition, they like living in a nazi country
> Basically, we're hoping to address this concern by building a separate sustainable commercial product rather than monetizing our open source tools.
jfrog artifactory suddenly very scared for its safety
Only a matter of time before someone makes something better than Artifactory. It’s a low bar to hit imho.
They've been the only game in town for a while, and their pricing reflects it. But this project is only for Python (for now?) so JFrog is not _immediately_ in danger.
Will pyx describe a server protocol that could be implemented by others, or otherwise provide software that others can use to host their own servers? (Or maybe even that PyPI can use to improve its own offering?) That is, when using "paid, hosted services like pyx", is one paying for the ability to use the pyx software in and of itself, or is one simply paying for access to Astral's particular server that runs it?
I might not be following: what would that protocol entail? pyx uses the same PEP 503/691 interfaces as every other Python index, but those interfaces would likely not be immediately useful to PyPI itself (since it already has them).
> or is one simply paying for access to Astral's particular server that runs it?
pyx is currently a service being offered by Astral. So it's not something you can currently self-host, if that's what you mean.
> pyx uses the same PEP 503/691 interfaces as every other Python index
... Then how can it make decisions about how to serve the package request that PyPI can't? Is there not some extension to the protocol so that uv can tell it more about the client system?
The repository API allows server-driven content negotiation[1], so pyx can service specialized requests while also honoring the normal 503/691 ones.
[1]: https://packaging.python.org/en/latest/specifications/simple...
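As a rough illustration of what that negotiation looks like on the wire (this uses PyPI's public index and the standard PEP 691 JSON media type, nothing pyx-specific):

    # Ask an index for the JSON form of a project page via the Accept header;
    # a client that doesn't send it gets the traditional HTML (PEP 503) page.
    import json
    import urllib.request

    req = urllib.request.Request(
        "https://pypi.org/simple/numpy/",
        headers={"Accept": "application/vnd.pypi.simple.v1+json"},
    )
    with urllib.request.urlopen(req) as resp:
        page = json.load(resp)

    print(page["meta"]["api-version"])    # e.g. "1.3"
    print(page["files"][0]["filename"])   # first artifact listed for numpy

A server that additionally understands richer media types can hand back extra information to clients that ask for it, while still serving the standard responses to everyone else.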
Interesting. I'll have to look into this further. (I've bookmarked the entire thread for reference, especially since I'm a bit attached to some of my other comments here ;) )
Ah.
> honoring the normal 503/691 ones.
Embrace
> pyx can service specialized requests
Extend
... ;)
Snark aside, you're missing the part where pyx doesn't compete with PyPI. It's a private service.
How does it being private mean it doesn't compete with PyPI?
Has that ever happened in the Python ecosystem specifically? It seems like there would be a community fork led by a couple of medium-size tech companies within days of something like that happening, and all users except the most enterprise-brained would switch.
This is a valid concern, but astral just has an amazing track record.
I was surprised to see the community here on HN responding so cautiously. Been developing in python for about a decade now- whenever astral does something I get excited!
> This is a valid concern, but astral just has an amazing track record.
The issue is, track record is not relevant when the next investors take over.
I agree. If any of the stuff was worthwhile to pursue, it would be merged into pip.
Pyx represents the server side, not the client side. The analogue in the pre-existing Python world is PyPI.
Many ideas are being added to recent versions of pip that are at least inspired by what uv has done — and many things are possible in uv specifically because of community-wide standards development that also benefits pip. However, pip has some really gnarly internal infrastructure that prevents it from taking advantage of a lot of uv's good ideas (which in turn are not all original). That has a lot to do with why I'm making PAPER.
For just one example: uv can quickly install previously installed packages by hard-linking a bunch of files from the cache. For pip to follow suit, it would have to completely redo its caching strategy from the ground up, because right now its cache is designed to save only download effort and not anything else about the installation process. It remembers entire wheels, but finding them in that cache requires knowing the URL from which they were downloaded. Because PyPI organizes the packages in its own database with its own custom URL scheme, pip would have to reach out to PyPI across the Internet in order to figure out where it put its own downloads!
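To make the hard-linking idea concrete, here's a minimal sketch (not uv's actual code; the function and directory layout are made up) of materializing a cached, already-unpacked package into an environment without copying:

    import os
    import shutil
    from pathlib import Path

    def link_tree(cached_pkg: Path, site_packages: Path) -> None:
        # Hard-link every file of an unpacked wheel from the cache into
        # site-packages; fall back to copying when linking isn't possible
        # (e.g. the cache lives on a different filesystem).
        for src in cached_pkg.rglob("*"):
            dest = site_packages / src.relative_to(cached_pkg)
            if src.is_dir():
                dest.mkdir(parents=True, exist_ok=True)
            else:
                try:
                    os.link(src, dest)       # near-instant, no extra disk used
                except OSError:
                    shutil.copy2(src, dest)  # cross-device fallback

The interesting part is what the cache has to look like for this to work at all: files stored unpacked and addressable by package name and version, which is exactly what pip's URL-keyed download cache doesn't give you.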
> However, pip has some really gnarly internal infrastructure that prevents it from taking advantage of a lot of uv's good ideas (which in turn are not all original).
FWIW, as a pip maintainer, I don't strongly agree with this statement, I think if pip had the same full time employee resources that uv has enjoyed over the last year that a lot of these issues could be solved.
I'm not saying here that pip doesn't have some gnarly internal details, just that the bigger thing holding it back is the lack of maintainer resources.
> For just one example: uv can quickly install previously installed packages by hard-linking a bunch of files from the cache. For pip to follow suit, it would have to completely redo its caching strategy from the ground up, because right now its cache is designed to save only download effort and not anything else about the installation process.
I actually think this isn't a great example, evidenced by the lack of a download or wheel command from uv due to those features not aligning with uv's caching strategy.
That said, I do think there are other good examples to your point, like uv's ability to prefetch package metadata, I don't think we're going to be able to implement that in pip any time soon due to probably the need for a complete overhaul of the resolver.
Good to see you again.
> FWIW, as a pip maintainer, I don't strongly agree with this statement, I think if pip had the same full time employee resources that uv has enjoyed over the last year that a lot of these issues could be solved.
Fair enough. I'm sure if someone were paying me a competitive salary to develop my projects, they'd be getting done much faster, too.
> I actually think this isn't a great example, evidenced by the lack of a download or wheel command from uv due to those features not aligning with uv's caching strategy.
I guess you're talking about the fact that uv's cache only stores the unpacked version, rather than the original wheel? I'm planning to keep the wheel around, too. But my point was more that because of this cache structure, pip can't even just grab the wheel from its cache without hitting the Internet, on top of not having a place to put a cache of the unpacked files.
> uv's ability to prefetch package metadata,
You mean, as opposed to obtaining it per version, lazily? Because it does seem like the .metadata file system works pretty well nowadays.
> I don't think we're going to be able to implement that in pip any time soon due to probably the need for a complete overhaul of the resolver.
Ugh, yeah. I know the resolution logic has been extracted as a specific package, but it's been challenging trying to figure out how to actually use that in a project that isn't pip.
> For just one example: uv can quickly install previously installed packages by hard-linking a bunch of files from the cache.
Conda has been able to do this for years.
This doesn't generalize: you could have said the same thing about pip versus easy_install, but pip clearly has worthwhile improvements over easy_install that were never merged back into the latter.
Pip is broken and has been for years; they're uninterested in fixing the search, or even in removing it or replacing it with a message/link to the package index.
imo, if pip's preference is to ship broken functionality, then what is/is not shipped with pip is not meaningful.
This is not a charitable interpretation. The more charitable read is that fixing search is non-trivial and has interlocking considerations that go beyond what pip's volunteer maintainers reasonably want to or can pick up.
(And for the record: it isn't their fault at all. `pip search` doesn't work because PyPI removed the search API. PyPI removed that API for very good reasons[1].)
[1]: https://github.com/pypa/pip/issues/5216
That was 7 years ago. If it's not coming back, the CLI should make that clear, instead of giving a temporary "cannot connect" message that implies it could work, if you wait a minute and try again.
It was three years ago; 2018 is when they considered removing the command, not when the search API was actually removed from PyPI.
And this is part of the interlocking considerations I mentioned: there are private indices that supply the XML-RPC API, and breaking them doesn't seem justifiable[1].
Edit: fixed the link.
[1]: https://github.com/pypa/pip/issues/5216#issuecomment-1235329...
Does that seem like a real solution to you? That it's ok to represent a never-functional operation as one that might maybe work? ...because it could work if you jump through a bunch of undocumented hoops?
It's so wild to me that so many people are apparently against making a user-friendly update. The whole thing seems very against pep8 (it's surprising, complicated, non-specific, etc).
I don't know what to tell you; I just gave an example of it being functional for a subset of users, who don't deserve to be broken just because it's non-functional on PyPI.
Nobody wants anything to be user-unfriendly. You're taking a very small view into Python packaging and extending it to motives, when resources (not even financial ones) are the primary challenge.
Is it really that hard, for some non-obvious reason, to detect when the user is not in that subset and say that it doesn't work? If the default case for the vast majority of users doesn't work, it doesn't seem like printing a more useful error message to them should be that hard.
> it doesn't seem like printing a more useful error message to them should be that hard.
I think the existing error message is useful:
It says (1) what failed, (2) why it failed, (3) links to a replacement, and (4) links to a deprecation explanation. That last link could maybe then link back to the pip issue or include some more context, but it's a far cry from being not helpful.
It's not that complex - just try it.
> Why is it so hard to install PyTorch, or CUDA, or libraries like FlashAttention or DeepSpeed that build against PyTorch and CUDA?
This is so true! On Windows (and WSL) it is also exacerbated by some packages requiring the use of compilers bundled with outdated Visual Studio versions, some of which are only available by manually crafting download paths. I can't wait for a better dev experience.
Stuff like that led me fully away from Ruby (due to Rails), which is a shame, I see videos of people chugging along with Ruby and loving it, and it looks like a fun language, but when the only way I can get a dev environment setup for Rails is using DigitalOcean droplets, I've lost all interest. It would always fail at compiling something for Rails. I would have loved to partake in the Rails hype back in 2012, but over the years the install / setup process was always a nightmare.
I went with Python because I never had this issue. Now with any AI / CUDA stuff its a bit of a nightmare to the point where you use someone's setup shell script instead of trying to use pip at all.
Do I get it right that this issue is specific to Windows? I've never heard of the issues you describe while working with Linux. I've seen people struggle with macOS a bit due to brew having different versions of some library or other, mostly when self-compiling Ruby.
I had issues on Mac, Windows and Linux... It was obnoxious. It led me to adopt a very simple rule: if I cannot get your framework / programming language up and running in under 10 minutes (barring compilation time / download speeds) I am not going to use your tools / language. I shouldn't be struggling with the most basic of hello worlds with your language / framework. I don't in like 100% of the other languages I already use, why should I struggle to use a new language?
There certainly are issues on Linux as well. The Detectron2 library alone has several hundred issues related to incorrect versions of something: https://github.com/facebookresearch/detectron2/issues
The mmdetection library (https://github.com/open-mmlab/mmdetection/issues) also has hundreds of version-related issues. Admittedly, that library has not seen any updates for over a year now, but it is sad that things just break and become basically unusable on modern Linux operating systems because NVIDIA can't stop breaking backwards and forwards compatibility for what is essentially just fancy matrix multiplication.
On Linux, good luck if you're using anything besides the officially NVIDIA-supported Ubuntu version. Just 24.04 instead of 22.04 has regular random breakages and issues, and running on Arch Linux is just endless pain.
Have you tried conda? Since the integration of mamba its solver is fast and the breadth of packages is impressive. Also, if you have to support Windows and Python with native extensions, conda is a godsend.
I would recommend learning a little bit of C compilation and build systems. Ruby/Rails is about as polished as you could get for a very popular project. Maybe libyaml will be a problem once in a while if you're compiling Ruby from scratch, but otherwise this normally works without a hassle. And those skills will apply everywhere else. As long as we have C libraries, this is about as good as it gets, regardless of the language/runtime.
I'm surprised to hear that. Ruby was the first language in my life/career where I felt good about the dependency management and packaging solution. Even when I was a novice, I don't remember running into any problems that weren't obviously my fault (for example, installing the Ruby library for PostgreSQL before I had installed the Postgres libraries on the OS).
Meanwhile, I didn't feel like Python had reached the bare minimum for package management until Pipenv came on the scene. It wasn't until Poetry (in 2019? 2020?) that I felt like the ecosystem had reached what Ruby had back in 2010 or 2011 when bundler had become mostly stable.
Bundler has always been the best package manager of any language that I've used, but dealing with gem extensions can still be a pain. I've had lots of fun bugs where an extension worked in dev but not prod because of differences in library versions. I ended up creating a docker image for development that matched our production environment and that pretty much solved those problems.
> I ended up creating a docker image for development that matched our production environment and that pretty much solved those problems.
docker has massively improved things - but it still has edge cases (you have to be really pushing it hard to find them though)
Let's be honest here - whilst some experiences are better/worse than others, there doesn't seem to be a dependency management system that isn't (at least half) broken.
I use Go a lot, the journey has been
- No dependency management
- Glide
- Dep
- I forget the name of the precursor - I just remembered, VGo
- Modules
We still have proxying, vendoring, versioning problems
Python: VirtualEnv
Rust: Cargo
Java: Maven and Gradle
Ruby: Gems
Even OS dependency management is painful - yum, apt (which was a major positive when I switched to Debian based systems), pkg (BSD people), homebrew (semi-official?)
Dependency management in the wild is a major headache. Go (which I only mention because I am most familiar with it) did away with some compilation dependency issues by shipping binaries with no dependencies (meaning it doesn't matter which version of Linux you built your binary on, it will run on any Linux of the same arch - none of that "wrong libc" 'fun'), but you still have issues when two different people build the same binary and need extra dependency management (vendoring brings with it caching problems - is the version in the cache up to date, will updating one version of one dependency break everything - what fun).
NuGet for C# has always been fantastic, and I like Cargo, though sometimes waiting for freaking ever for things to build does kill me on the inside a little bit. I do wish Go had a better package manager trajectory, I can only hope they continue to work on it, there were a few years I refused to work on any Go projects because setup was a nightmare.
Have you tried JRuby? It might be a bit too large for your droplet, but it has the java versions of most gems and you can produce cross-platform jars using warbler.
The speed of Ruby with the memory management of Java, what's not to love?
Also, now you have two problems.
For me, it solves a very specific problem.
In-house app that is easy to develop and needs to be cross-platform (windows, linux, aix, as400). The speed picks up as it runs, usually handling 3000-5000 eps on old hardware.
Java has quite possibly the best production GC of all the VMs out there.
And yet operating distributed systems built on it is a world of pain. Elasticsearch, I am looking at you. With modern hardware resources, the limitations of running and scaling on top of the JVM make it an expensive, frustrating endeavor.
In addition to elasticsearch's metrics, there's like 4 JVM metrics I have to watch constantly on all my clusters to make sure the JVM and its GC is happy.
Have you tried Nix?
https://nixos.org
I'm on Arch these days, but nix would have maybe helped, but this was 2010s where as far as I remember, nobody was talking about NixOS.
Given they mentioned Windows (and not WSL) that might not be a viable option. AFAIK, Windows is not natively supported by nixpkgs.
Given that WSL is pretty much just Linux, I don't see what relevance Visual Studio compiler versions have to it. WSL binaries are always built using Linux toolchains.
At the same time, even on Windows, libc has been stable since Win10 - that's 10 years now. Which is to say, any binary compiled by VC++ 2015 or later is C-ABI-compatible with any other such binary. The only reasons why someone might need a specific compiler version is if they are relying on some language features not supported by older ones, or because they're trying to pass C++ types across the ABI boundary, which is a fairly rare case.
This is the right direction for Python packaging, especially for GPU-heavy workflows. Two concrete things I'm excited about: 1) curated, compatibility-tested indices per accelerator (CUDA/ROCm/CPU) so teams stop bikeshedding over torch/cu* matrixes, and 2) making metadata queryable so clients can resolve up front and install in parallel. If pyx can reduce the 'pip trial-and-error' loop for ML by shipping narrower, hardware-targeted artifacts (e.g., SM/arch-specific builds) and predictable hashes, that alone saves hours per environment. Also +1 to keeping tools OSS and monetizing the hosted service—clear separation builds trust. Curious: will pyx expose dependency graph and reverse-dependency endpoints (e.g., "what breaks if X→Y?") and SBOM/signing attestation for supply-chain checks?
This was basically the reason to use anaconda back in the day.
In my experience, Anaconda (including Miniconda, Micromamba, IntelPython, et al.) is still the default choice in scientific computing and machine learning.
It's useful because it also packages a lot of other deps like CUDA drivers, DB drivers, git, openssl, etc. When you don't have admin rights, it's really handy to be able to install them and there's no other equivalent in the Python world. That being said, the fact conda (and derivatives) do not follow any of the PEPs about package management is driving me insane. The ergonomics are bad as well with defaults like auto activation of the base env and bad dependency solver for the longest time (fixed now), weird linking of shared libs, etc.
Anaconda was a good idea until it would break apt on Ubuntu and make my job that much harder. That became the reason _not_ to use Anaconda in my book.
venv made these problems start to disappear, and now uv and Nix have closed the loop for me.
How did it manage to do that?
Not saying it didn't; I've just never run into that after a decade of using the thing on various Nixes.
In the past, part of the definition of an operating system was that it ships with a compiler.
When was that ever a part of the definition? It was part of the early Unix culture, sure, but even many contemporary OSes didn't ship with compilers, which were a separate (and often very expensive!) piece of software.
OTOH today most Linux distros don't install any dev tools by default on a clean install. And, ironically, a clean install of Windows has .NET, which includes a C# compiler.
The real pyx is an absolutely wonderful graphing package. It's like Tex in that everything looks wonderful and publication-quality.
https://pyx-project.org/gallery/graph/index.html
there's something about these comments ("name-collision") that drives me up the wall. do y'all realize multiple things can have the same name? for example, did you know there are many people with exactly the same names:
https://www.buzzfeed.com/kristenharris1/famous-people-same-n...
and yet no one bemoans this (hospitals don't consult name registries before filling out birth certificates). that's because it's almost always extremely clear from context.
> The real pyx
what makes that pyx any more "real" than this pyx? it's the extension of the language py plus a single letter. there are probably a thousand projects that could rightfully use that combination of letters as a name.
Human naming has nothing to do with software naming, which seems obvious but apparently not. Python package creators should check the PyPI registry for names and generally avoid name collisions where reasonable. Common sense applies, both to reduce confusion for users globally and to avoid potential legal issues if any party trademarks their software name. What makes one pyx more real than the other is that one was there first and took the spot on PyPI. Simple as that. https://pypi.org/project/PyX/
> https://pypi.org/project/PyX/
the last release is Oct 16, 2022. are we doing this like jerseys - the name is now retired because pyx won all the championships?
it's the pyx you get with `pip install pyx`?
This is effectively what Charlie said they were going to build last September when quizzed about their intended business model on Mastodon: https://hachyderm.io/@charliermarsh/113103564055291456
Soon: there are 14 competing Python packaging standards.
This is a joke, obviously. We've had more than 14 for years.
Python packaging has a lot of standards, but I would say most of them (especially in the last decade) don't really compete with each other. They lean more towards the "slow accretion of generally considered useful features" style.
This itself is IMO a product of Python having a relatively healthy consensus-driven standardization process for packaging, rather than an authoritative one. If Python had more of an authoritative approach, I don't think the language would have done as well as it has.
(Source: I've written at least 5 PEPs.)
Do you really think Python's consensus-driven language development is better than an authoritative approach?
I am honestly tired of the Python packaging situation. I breathe a sigh of relief in languages like Go and Rust with an "authoritative" built-in solution.
I wouldn't mind the 30 different packaging solutions as long as there was an authoritative "correct" solution. All the others would then be opt-in enhancements as needed.
I guess a good thought experiment would be if we were to design a packaging system (or decide not to) for a new PL like python, what would it look like?
> I breathe a sigh of relief in language like Go and Rust with an “authoritative” built-in solution.
I don't know about Go, but Rust's packaging isn't authoritative in the sense that I meant. There's no packaging BDFL; improvements to Rust packaging happen through a standards process that closely mirrors that of Python's PEPs.
I think the actual difference between Rust and Python is that Rust made the (IMO correct) decision early on to build a single tool for package management, whereas Python has historically had a single installer and left every other part of package management up to the user. That's a product of the fact that Python is more of a patchwork ecosystem and community than Rust is, plus the fact that it's a lot older and a lot bigger (in terms of breadth of user installation base).
Basically, hindsight is 20/20. Rust rightly benefited from Python's hard lesson about not having one tool, but they also rightly benefited from Python's good experience with consensus-driven standardization.
Welp, I guess it's time to start pulling all the uv deps out of our builds and enjoy the extra 5 minutes of calm per deploy. I'm not gonna do another VC-fueled supply chain poisoning switcheroo under duress of someone else's time crunch to start churning profit.
As I said a couple weeks ago, they're gonna have to cash out at some point. The move won't be around uv -- it'll be a protected private PyPI or something.
https://news.ycombinator.com/item?id=44712558
Now what do we have here?
Not sure what you're trying to get at here. Charlie Marsh has literally said this himself; see e.g. this post he made last September:
> "An example of what this might look like (we may not do this, but it's helpful to have a concrete example of the strategy) would be something like an enterprise-focused private package registry."
https://hachyderm.io/@charliermarsh/113103605702842937
Astral has been very transparent about their business model.
Astral doesn't really have a business model yet, it has potential business models.
The issue is that there isn't a clean business model that will produce the kind of profits that will satisfy their VCs - not that there isn't any business model that will help support a business like theirs.
Private package management would probably work fine if they hadn't taken VC money.
Cash out is a bit of a negative word here. They've shown the ability to build categorically better tooling, so I'm sure a lot of companies would be happy to pay them to fix even more of their problems.
It’s not negative, it’s accurate. The playbook is well known and users should be informed.
I haven't adopted uv yet; I'm watching to see what their next move will be. We recently had to review our use of Anaconda tools due to their changes, then review Qt's license changes. Not looking forward to another license ordeal.
We're hoping that building a commercial service makes it clear that we have a sustainable business model and that our tools (like uv) will remain free and permissively licensed.
(I work at Astral)
I think having a credible, proven business model is a feature of an open source project - without one there are unanswered questions about ongoing maintenance.
I'm glad to see Astral taking steps towards that.
I've been wondering where the commercial service would come in and this sounds like just the right product that aligns with what you're already doing and serves a real need. Setting up scalable private registries for python is awful.
You know what they say: The best time to adopt uv was last year...
In all seriousness, I'm all in on uv. Better than any competition by a mile. Also makes my training and clients much happier.
Given how widely popular uv is, I'm pretty sure that in the event of any impactful license change it would immediately get forked.
Fortunately for a lot of what uv does, one can simply switch to something else like Poetry. Not exactly a zero-code lift but if you use pyproject.toml, there are other tools.
Of course if you are on one of the edge cases of something only uv does, well... that's more of an issue.
What does GPU-aware mean in terms of a registry? Will `uv` inspect my local GPU spec and decide what the best set of packages would be to pull from Pyx?
Since this is a private, paid-for registry aimed at corporate clients, will there be an option to expose those registries externally as a public instance, but paid for by the company? That is, can I as a vendor pay for a Pyx registry for my own set of packages, and then provide that registry as an entrypoint for my customers?
> Will `uv` inspect my local GPU spec and decide what the best set of packages would be to pull from Pyx?
We actually support this basic idea today, even without pyx. You can run (e.g.) `uv pip install --torch-backend=auto torch` to automatically install a version of PyTorch based on your machine's GPU from the PyTorch index.
pyx takes that idea and pushes it further. Instead of "just" supporting PyTorch, the registry has a curated index for each supported hardware accelerator, and we populate that index with pre-built artifacts across a wide range of packages, versions, Python versions, PyTorch versions, etc., all with consistent and coherent metadata.
So there are two parts to it: (1) when you point to pyx, it becomes much easier to get the right, pre-built, mutually compatible versions of these things (and faster to install them); and (2) the uv client can point you to the "right" pyx index automatically (that part works regardless of whether you're using pyx, it's just more limited).
> Since this is a private, paid-for registry aimed at corporate clients, will there be an option to expose those registries externally as a public instance, but paid for by the company? That is, can I as a vendor pay for a Pyx registry for my own set of packages, and then provide that registry as an entrypoint for my customers?
We don't support this yet but it's come up a few times with users. If you're interested in it concretely feel free to email me (charlie@).
Hi Charlie
what happens in a situation in which I might have access to a login node, from which I can install packages, but then the computing nodes don't have internet access. Can I define in some hardware.toml the target system and install there even if my local system is different?
To be more specific, I'd like to do `uv --dump-system hardware.toml` in the computing node and then in the login node (or my laptop for that matter) just do `uv install my-package --target-system hardware.toml` and get an environment I can just copy over.
Yes, we let you override our detection of your hardware. Though we haven't implemented dumping detected information on one platform for use on another, it's definitely feasible, e.g., we're exploring a static metadata format as a part of the wheel variant proposal https://github.com/wheelnext/pep_xxx_wheel_variants/issues/4...
Astral folks that are around - there seems to be a bit of confusion in the product page that the blog post makes a little more clear.
> The next step in Python packaging
The headline is the confusing bit I think - "oh no, another tool already?"
IMO you should lean into stating this is going to be a paid product (answering how you plan to make money and become sustainable), and highlight that this will help solve private packaging problems.
I'm excited by this announcement by the way. Setting up scalable private python registries is a huge pain. Looking forward to it!
Thanks for the feedback!
Is there a big enough commercial market for private Python package registries to support an entire company and its staff? Looks like they're hiring for $250k engineers, starting a $26k/year OSS fund, etc. Expenses seem a bit high if this is their first project unless they plan on being acquired?
It's interesting because the value is definitely there. Every single Python developer you meet (many of whom are highly paid) has a story about wasting a bunch of time on these things. The question is how much of this value Astral can capture.
I think based on the quality of their work, there's also an important component which is trust. I'd trust and pay for a product from them much more readily than an open source solution with flaky maintainers.
Yeah, they certainly generate a lot of value by providing excellent productivity tooling. The question is how they capture some of that value, which is notoriously hard with an OSS license. A non-OSS license creates the Adobe trap on the other hand, where companies deploy more and more aggressive monetization strategies, making life worse and worse for users of the software.
Ask Docker how that worked out.
Continuum has been doing something very similar with Anaconda, and they've been around for over a decade now.
Just one data point, but if it's as nice to use as their open source tools and not outrageously expensive, I'd be a customer. Current offerings for private python package registries are kind of meh. Always wondered why github doesn't offer this.
The real thing that I hope someone is able to solve is downloading such huge amounts of unnecessary code. As I understand, the bulk of the torch binary is just a huge nvfatbin compiled for every SM under the sun when you usually just want it to run on whatever accelerators you have on hand. Even just making narrow builds of like `pytorch-sm120a` (with stuff like cuBLAS thin binaries paired with it too) as part of a handy uv extra or something like that would make it much quicker and easier.
Another piece is that PyPI has no index— it's just a giant list of URLs [1] where any required metadata (eg, the OS, python version, etc) is encoded in the filename. That makes it trivial to throw behind a CDN since it's all super static, but it has some important limitations:
- there's no way to do an installation dry run without pre-downloading all the packages (to get their dep info)
- there's no way to get hashes of the archives
- there's no way to do things like reverse-search (show me everything that depends on x)
I'm assuming that a big part of pyx is introducing a dynamically served (or maybe even queryable) endpoint that can return package metadata and let uv plan ahead better, identify problems and conflicts before they happen, install packages in parallel, etc.
Astral has an excellent track record on the engineering and design side, so I expect that whatever they do in this space will basically make sense, it will eventually be codified in a PEP, and PyPI will implement the same endpoint so that other tools like pip and poetry can adopt it.
[1]: Top-level: https://pypi.org/simple/ Individual package: https://pypi.org/simple/pyyaml/
Your information is out of date.
> there's no way to do an installation dry run without pre-downloading all the packages (to get their dep info)
Not true for wheels; PyPI implements https://peps.python.org/pep-0658/ here. You can pre-download just the dependency info instead.
For sdists, this is impossible until we can drop support for a bunch of older packages that don't follow modern standards (which is to say, including the actual "built" metadata as a PKG-INFO file, and having that file include static data for at least name, version and dependencies). I'm told there are real-world projects out there for which this is currently impossible, because the dependencies... depend on things that can't be known without inspecting the end user's environment. At any rate, this isn't a PyPI problem.
> there's no way to get hashes of the archives
This is provided as a URL fragment on the URLs, as described in https://peps.python.org/pep-0503/. Per PEP 658, the hash for the corresponding metadata files is provided in the data-dist-info-metadata (and data-core-metadata) attributes of the links.
But yes, there is no reverse-search support.
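To make that concrete, here's a rough sketch (stdlib only; "requests" is just an example project, and the JSON key is "core-metadata" in newer responses, with "dist-info-metadata" as the legacy spelling) of pulling a wheel's dependency list without downloading the wheel itself:

    import json
    import urllib.request

    def wheel_requirements(project: str) -> None:
        # PEP 691: ask the Simple API for JSON instead of HTML.
        req = urllib.request.Request(
            f"https://pypi.org/simple/{project}/",
            headers={"Accept": "application/vnd.pypi.simple.v1+json"},
        )
        with urllib.request.urlopen(req) as resp:
            index = json.load(resp)
        # Newest files are listed last; walk backwards to find a recent wheel.
        for f in reversed(index["files"]):
            # PEP 658: a truthy "core-metadata" (or legacy "dist-info-metadata")
            # field means a METADATA sidecar sits at <wheel URL> + ".metadata".
            if f["filename"].endswith(".whl") and (
                f.get("core-metadata") or f.get("dist-info-metadata")
            ):
                with urllib.request.urlopen(f["url"] + ".metadata") as meta:
                    metadata = meta.read().decode()
                deps = [line for line in metadata.splitlines()
                        if line.startswith("Requires-Dist:")]
                print(f["filename"], *deps, sep="\n  ")
                return

    wheel_requirements("requests")  # any project with wheels on PyPI works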
Ah interesting, thanks for that! I was frustrated once again recently to note that `pip install --dry-run` required me to pre-download all packages, so I assumed nothing had changed.
You could do worse than to start using --only-binary=:all: by default. (It's even been proposed as default behaviour: https://github.com/pypa/pip/issues/9140) Even if you can't actually install that way, it will point out the places where sdists are needed.
In principle, separate metadata availability should still at least be possible for most sdists eventually. But I'm not the one calling the shots here.
For clarity, if I do a `pip install --dry-run` against my requirements file, should I expect that to download only metadata and not whole wheels/sdists for everything? Or does that depend on everything in my requirements file being available as a wheel?
I'm brushing up with Python for a new job, and boy, what a ride. Not because of the language itself but because of the tooling around packages. I'm coming from Go and TS/JS, and while those two ecosystems have their own pros and cons, at least they are more or less straightforward to get onboarded with (there are one or two tools you need to know about). In Python there are dozens of tools/concepts related to packaging: pip, easy_install, setuptools, setup.py, pypy, poetry, uv, venv, virtualenv, pipenv, wheels, ... There's even an entire website dedicated to this topic: https://packaging.python.org
Don't understand how a private company like Astral is leading here. Why is that hard for the Python community to come up with a single tool to rule them all? (I know https://xkcd.com/927/). Like, you could even copy what Go or Node are doing, and make it Python-aware; no shame on that. Instead we have these who-knows-how-long-they-will-last tools every now and then.
They should remove the "There should be one-- and preferably only one --obvious way to do it." from the Python Zen.
I don't know, I was looking at TS tutorials the other day and there seemed to be at least half a dozen "bundlers", with different tutorials suggesting different ones to use. It took me a while to figure out I could just directly invoke "tsc" to generate JavaScript from TypeScript.
Python gained popularity in academic circles because it was easy, not because it was good.
It's a pain in the ass to work with professionally.
It's not an easy task, and when there's already lots of established practices, habits, and opinions, it becomes even more difficult to get around the various pain points. There's been many attempts: pip (the standard) is slow, lacks dependency resolution, and struggles with reproducible builds. Conda is heavy, slow to solve environments, and mixes Python with non-Python dependencies, which makes understanding some setups very complicated. Poetry improves dependency management but is sluggish and adds unnecessary complexity for simple scripts/projects. Pipenv makes things simpler, but also has the same issue of slow resolution and inconsistent lock files. Those are the ones I've used over the years at least.
uv addressed these flaws with speed, solid dependency resolution, and a simple interface that builds on what people are already used to. It unifies virtual environment and package management, supports reproducible builds, and integrates easily with modern workflows.
> In Python there are dozens of tools/concepts related to packaging: pip, easy_install, setuptools, setup.py, pypy, poetry, uv, venv, virtualenv, pipenv, wheels,
Some of those are package tools, some are dependency managers, some are runtime environments, some are package formats...
Some are obsolete at this point, and others by necessity cover different portions of programming language technologies.
I guess what I'm saying is, for the average software engineer, there's not too many more choices in Python for programming facilities than in Javascript.
You're right, it's not like there are actually 14 competing standards, but there are still too many—and that goes for Javascript as well.
I don't know why you were downvoted. You are absolutely correct.
> easy_install
I don't know what guides you're reading, but I haven't touched easy_install in at least a decade. Its successor, pip, had effectively replaced all use cases for it by around 2010.
> I don't know what guides you're reading but I haven't touched easy_install in at least a decade.
It is mentioned in the "Explanations and Discussions" section [0] of the linked Python Packaging guide.
Old indeed, but can still be found at the top level of the current docs.
[0] https://packaging.python.org/en/latest/#explanations-and-dis...
I work with Python, Node and Go and I don't think any of them have great package systems. Go has an amazing module isolation system and boy do I wish hiding functions within a module/package was as easy in Python as it is in Go. What saves Go is the standard library which makes it possible to write almost everything without needing external dependencies. You've worked with JavaScript and I really don't see how Python is different. I'd argue that Deno and JSR is the only "sane" approach to packages and security, but it's hardly leading and NPM is owned by Microsoft so it's not like you have a great "open source" platform there either. On top of that you have the "fun" parts of ESM vs CommonJS.
Anyway, if you're familiar with Node then I think you can view pip and venv as the npm of Python. Things like Poetry are Yarn, made as replacements because pip sort of sucks. UV on the other hand is a drop-in replacement for pip and venv (and other things) similar to how pnpm is basically npm. I can't answer your question on why there isn't a "good" tool to rule them all, but the single tool has been pip since 2014, and since UV is a drop-in, it's very easy to use UV in development and pip in production.
I think it's reasonable to worry about what happens when Astral needs to make money for their investors, but that's the beauty of UV compared to a lot of other Python tools. It's extremely easy to replace because it's essentially just smarter pip. I do hope Astral succeeds with their pyx, private registries and so on by the way.
I appreciate everything they’ve done but the group which maintains Pip and the package index is categorically incapable of shipping anything at a good velocity.
It’s entirely volunteer based so I don’t blame them, but the reality is that it’s holding back the ecosystem.
I suspect it’s also a misalignment of interests. No one there really invests in improving UX.
> the group which maintains Pip and the package index is categorically incapable of shipping anything at a good velocity.
> It’s entirely volunteer based so I don’t blame them
It's not just that they're volunteers; it's the legacy codebase they're stuck with, and the use cases that people will expect them to continue supporting.
> I suspect it’s also a misalignment of interests. No one there really invests in improving UX.
"Invest" is the operative word here. When I read discussions in the community around tools like pip, a common theme is that the developers don't consider themselves competent to redesign the UX, and there is no money from anywhere to hire someone who would be. The PSF operates on an annual budget on the order of $4 million, and a big chunk of that is taken up by PyCon, supporting programs like PyLadies, generic marketing efforts, etc. Meanwhile, total bandwidth use at PyPI has crossed into the exabyte range (it was ~600 petabytes in 2023 and growing rapidly). They would be completely screwed without Fastly's incredible in-kind donation.
Indeed, they broke a few features in the last few years and made the excuse "we can't support them, we're volunteers." Well, how about stop breaking things that worked for a decade? That would take less effort.
They had time to force "--break-system-packages" on us though, something no one asked for.
> how about stop breaking things that worked for a decade?
They aren't doing this.
> They had time to force "--break-system-packages" on us though, something no one asked for.
The maintainers of several Linux distros asked for it very explicitly, and cooperated to design the feature. The rationale is extensively documented in the proposal (https://peps.python.org/pep-0668/). This is especially important for distros where the system package manager is itself implemented in Python, since corrupting the system Python environment could produce a state that is effectively unrecoverable (at least without detailed Python-specific know-how).
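For what it's worth, the mechanism PEP 668 specifies is tiny; roughly this (a sketch of the spec, not pip's actual code):

    import sysconfig
    from pathlib import Path

    # PEP 668: the distro drops an EXTERNALLY-MANAGED marker file into the
    # interpreter's stdlib directory; installers that find it refuse to modify
    # the environment unless you explicitly pass --break-system-packages.
    marker = Path(sysconfig.get_path("stdlib")) / "EXTERNALLY-MANAGED"
    if marker.exists():
        print("Environment is managed by the OS package manager:")
        print(marker.read_text()[:300])  # the file carries a distro-specific hint
    else:
        print("Not externally managed; pip installs here normally.")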
Oh really?
- https://github.com/pypa/packaging/issues/774
- https://github.com/pypa/setuptools/issues/3548
- https://github.com/pypa/pip/issues/7953
I relied on those for a decade, maybe two.
> something no one asked for
I was being facetious; sure, someone asked for it, but it was pretty dumb. This has never "corrupted" anything for me, it's rare (it hasn't happened to me in the last 15 years), and it's simply fixed when you're knowledgeable.
Not everyone can simply fix it, so a better solution would be to isolate the system python, allow more than one installed, etc. Distros already do this to some extent.
> Why is that hard for the Python community to come up with a single tool to rule them all? (I know https://xkcd.com/927/).
because they're obsessed with fixing non-issues (switching out pgp signing for something you can only get from microsoft, sorry, "trusted providers", arguing about mission statements, etc.)
whilst ignoring the multiple elephants in the room (namespacing, crap slow packaging tool that has to download everything because the index sucks, mix of 4 badly documented tools to build anything, index that operates on filenames, etc.)
You don't need to know most of those things. Until last year I used setup.py and pip exclusively for twenty years, with a venv for each job at work. Wheels are simply prebuilt .zips. That's about an hour of learning more or less.
Now we have pyproject.toml and uv to learn. This is another hour or so of learning, but well worth it.
Astral is stepping up because no one else did. Guido never cared about packaging and that's why it has been the wild west until now.
> Why is that hard for the Python community to come up with a single tool to rule them all?
Whatever I would say at this point about PyPA would be so uncharitable that dang would descend on me with the holy hammer of banishment, but you can get my drift. I just don't trust them to come out with good tooling. The plethora they have produced so far is quite telling.
That said, pip covers 99% of my needs when I need to do anything with Python. There are ecosystems that have it way worse, so I count my blessings. But apparently, since Poetry and uv exist, my 99% are not many other people's 99%.
If I wanted to package my Python stuff, though, I'm getting confused. Is it now setup.py or pyproject.toml? Or maybe both? What if I need to support an older Python version as seen in some old-but-still-supported Linux distributions?
> They should remove the "There should be one-- and preferably only one --obvious way to do it." from the Python Zen.
Granted, tooling is different from the language itself. Although PyPA could benefit from a decade having a BDFL.
> If I wanted to package my Python stuff, though, I'm getting confused. Is it now setup.py or pyproject.toml? Or maybe both? What if I need to support an older Python version as seen in some old-but-still-supported Linux distributions?
Your Python version is irrelevant, as long as your tools and code both run under that version. The current ecosystem standard is to move in lock-step with the Python versions that the core Python team supports. If you want to offer extended support, you should expect to require more know-how, regardless. (I'm happy to receive emails about this kind of thing; I use this username, on the Proton email service.)
Nowadays, you should really always use at least pyproject.toml.
If your distribution will include code in non-Python languages and you choose to use Setuptools to build your package, you will also need a setup.py. But your use of setup.py will be limited to just the part that explains how to compile your non-Python code; don't use it to describe project metadata, or to orchestrate testing, or to implement your own project management commands, or any of the other advanced stuff people used to do when Setuptools was the only game in town.
In general, create pyproject.toml first, and then figure out if you need anything else in addition. Keeping your metadata in pyproject.toml is the sane, modern way, and if we could just get everyone on board, tools like pip could be considerably simpler. Please read https://blog.ganssle.io/articles/2021/10/setup-py-deprecated... for details about modern use of Setuptools.
Regardless of your project, I strongly recommend considering alternatives to Setuptools. It was never designed for its current role and has been stuck maintaining tons of legacy cruft. If your project is pure Python, Flit is my current recommendation as long as you can live with its opinionated choices (in particular, you must have a single top-level package name in your distribution). For projects that need to access a C compiler for a little bit, consider Hatch. If you're making the next Numpy, keep in mind that they switched over to Meson. (I also have thrown my hat in this ring, although I really need to get back to that project...)
If you use any of those alternatives, you may have some tool-specific configuration that you do in pyproject.toml, but you may also have to include arbitrary code analogous to setup.py to orchestrate the build process. There's only so far you can get with a config file; real-world project builds get ferociously complex.
> Why is that hard for the Python community to come up with a single tool to rule them all?
0. A lot of those "tools and concepts" are actually completely irrelevant or redundant. easy_install has for all practical purposes been dead for many years. virtualenv was the original third party project that formed the basis for the standard library venv, which has been separately maintained for people who want particular additional features; it doesn't count as a separate concept. The setup.py file is a configuration file for Setuptools that also happens to be Python code. You only need to understand it if you use Setuptools, and the entire point is that you can use other things now (specifically because configuring metadata with Python code is a terrible idea that we tolerated for far too long). Wheels are just the distribution format and you don't need to know anything about how they're structured as an end user or as a developer of ordinary Python code — only as someone who makes packaging tools. And "pypy" is an alternate implementation of Python — maybe you meant PyPI? But that's just the place that hosts your packages; no relevant "concept" there.
Imagine if I wanted to make the same argument about JavaScript and I said that it's complicated because you have to understand ".tar.gz (I think, from previous discussion here? I can't even find documentation for how the package is actually stored as a package on disk), Node.js, NPM, TypeScript, www.npmjs.com, package.json..." That's basically what you're doing here.
But even besides that, you don't have to know about all the competing alternatives. If you know how to use pip, and your only goal is to install packages, you can completely ignore all the other tools that install packages (including poetry and uv). You only have any reason to care about pipenv if you want to use pip and care about the specific things that pipenv does and haven't chosen a different way to address the problem. Many pip users won't have to care about it.
1. A lot of people actively do not want it that way. The Unix philosophy actually does have some upsides, and there are tons of Python users out there who have zero interest in participating in an "ecosystem" where they share their code publicly even on GitHub, never mind PyPI — so no matter what you say should be the files that give project metadata or what they should contain or how they should be formatted, you aren't going to get any buy-in. But beyond that, different people have different needs and a tool that tries to make everyone happy is going to require tons of irrelevant cruft for almost everyone.
2. Reverse compatibility. The Python world — both the packaging system and the language itself — has been trying to get people to do things in better, saner ways for many years now; but people will scream bloody murder if their ancient stuff breaks in any way, even when they are advised years in advance of future plans to drop support. Keep in mind here that Python is more than twice as old as Go.
3. Things are simple for Go/JS/TS users because they normally only have to worry about that one programming language. Python packages (especially the best-known, "serious" ones used for heavyweight tasks) very commonly must interface with code written in many other programming languages (C and C++ are very common, but you can also find Rust, Fortran and many more; and Numpy must work with both C and Fortran), and there are many different ways to interface (and nothing that Python could possibly do at a language level to prevent that): by using an explicit FFI, by dlopen() etc. hooks, by shelling out to a subprocess, and more. And users expect that they can just install the Python package and have all of that stuff just work. Often that means that compiled-language code has to be rebuilt locally; and the install tools are expected to be able to download and set up a build system, build code in an isolated environment, etc. etc. All of this is way beyond the expectations placed on something like NPM.
4. The competition is deliberate. Standards — the clearest example being the PEPs 517/518/621 that define the pyproject.toml schema — were created specifically to enable both competition and interoperation. Uv is gaining market share because a lot of people like its paradigms. Imagine if, when people in the Python community first started thinking about the problems and limitations of the early tools, they had decided to come up with all the paradigms themselves. Imagine if they had gotten them wrong, and then set them in stone for everyone else. When you imagine this, keep in mind that projects like pip and setuptools date to the 2000s. People were simply not thinking about open-source ecosystems in the same way in that era.
> They should remove the "There should be one-- and preferably only one --obvious way to do it." from the Python Zen.
First, define "it". The task is orders of magnitude greater than you might naively imagine. I know, because I've been an active participant in the surrounding discussion for a couple of years, aside from working on developing my own tooling.
Second, see https://news.ycombinator.com/item?id=44763692 . It doesn't mean what you appear to think it does.
> Don't understand how a private company like Astral is leading here. Why is that hard for the Python community to come up with a single tool to rule them all? (I know https://xkcd.com/927/). Like, you could even copy what Go or Node are doing, and make it Python-aware; no shame on that. Instead we have these who-knows-how-long-they-will-last tools every now and then.
Python packaging is (largely) solving problems that Go and Node packaging are not even trying to address.
As someone outside those communities, could you elaborate?
Not the person you're replying to, so I don't know if this is what he had in mind, but with Python packages you can distribute more than just Python. Some packages contain C/C++/Fortran/Rust/others? source code that pip will try to automatically build upon install. Of course you can't expect everyone to have a dev environment set up, so packages can also contain pre-compiled binary for any combination of windows/mac/linux + amd/arm + glibc/musl + CPython/pypy (did I miss any?).
I don't know much about go, and I've only scratched the surface with node, but as far as node goes I think it just distributes JS? So that would be one answer to what Python packaging is trying to solve that node isn't trying to address.
> any combination of windows/mac/linux + amd/arm + glibc/musl + CPython/pypy (did I miss any?).
From a standards perspective, it is a combination of a Python version/implementation, a "platform" and an "ABI". (After all, the glibc/musl distinction doesn't make sense on Windows.)
Aside from CPython/pypy, the system recognizes IronPython (a C# implementation) and Jython (a Java implementation) under the version "tag"; of course these implementations may have their own independent versioning with only a rough correspondence to CPython releases.
The ABI tag largely corresponds to the implementation and version tag, but for example for CPython builds it also indicates whether Python was built in debug or release mode, and from 3.13 onward whether the GIL is enabled.
The platform tag covers Mac, Windows, several generic glibc Linux standards (called "manylinux" and designed to smooth over minor differences between distros), and now also some generic musl Linux standards (called "musllinux"). Basic CPU information (arm vs intel, 32- vs 64-bit etc.) is also jammed in here.
Details are available at https://packaging.python.org/en/latest/specifications/platfo... .
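If you're curious which tags your own interpreter accepts, here's a quick sketch using the third-party packaging library (the same one pip vendors internally):

    from packaging.tags import sys_tags  # pip install packaging

    # Prints the interpreter/ABI/platform triples this environment will accept,
    # in priority order, e.g. cp312-cp312-manylinux_2_17_x86_64.
    for i, tag in enumerate(sys_tags()):
        print(tag)
        if i >= 9:  # the full list runs to hundreds of entries on Linux
            break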
> Some packages contain C/C++/Fortran/Rust/others? source code that pip will try to automatically build upon install.
And in the TS/JS world we have React Native, which has a flexible pluggable model that allows creating Xcode projects with autodiscovered dependencies in C, C++, Swift and other languages.
It's also flexible enough to allow third-party products like Sentry to integrate into the build process to upload debug symbols to the Sentry servers on release builds.
So no, Python is really not unique in its requirements.
Specifically: simultaneous distribution of precompiled binaries for many different OS and hardware configurations, plus built-on-demand source distribution of non-Python software to be used as dependencies, with as little host setup by the user as possible (ideally none), all installable under a single name/version everywhere.
imagine a world without: failed to build native gem extension
I don’t know how I feel about one company dominating this space. I love what they do but what happens 5 years down the road?
No thanks. For the majority of my use cases pip is just fine. I'm not here to chase time just to live life
This is a low quality comment.
From the guidelines [1]
> Please don't post shallow dismissals, especially of other people's work.
[1] https://news.ycombinator.com/newsguidelines.html
Strange to reject something when you don't even understand what it is.
PYX is a package registry, and therefore an alternative to PyPI (like how JSR is an alternative to NPM).
The alternative to `pip` that Astral has built is called `uv`. Feel free to not use that either, but personally I'd consider it if I were you. It has full pip compatibility (with significantly more speed), and many other nice features besides.
I wonder whether it will have a flat namespace that everyone competes over or whether the top-level keys will be user/project identifiers of some sort. I hope the latter.
Fundamentally we still have the flat namespace of top level python imports, which is the same as the package name for ~95% of projects, so I'm not sure how they could really change that.
Package names and module names are not coupled to each other. You could have package name like "company-foo" and import it as "foo" or "bar" or anything else.
But you can, if you want, have a non-flat namespace for imports using PEP 420 – Implicit Namespace Packages, so all your different packages "company-foo", "company-bar", etc. can be installed into the "company" namespace and just work.
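A self-contained sketch of how that plays out (all names are hypothetical; in real life the two directories would be two separately installed distributions sitting in site-packages):

    import sys
    import tempfile
    from pathlib import Path

    # Two "distributions" contribute modules to one shared "company" namespace.
    # Crucially, there is no __init__.py anywhere, so PEP 420 kicks in and
    # company.__path__ spans both directories.
    root = Path(tempfile.mkdtemp())
    for dist, mod in [("company_foo", "foo"), ("company_bar", "bar")]:
        pkg = root / dist / "company"
        pkg.mkdir(parents=True)
        (pkg / f"{mod}.py").write_text(f"NAME = {mod!r}\n")
        sys.path.append(str(root / dist))  # stand-in for two installed dists

    import company.foo
    import company.bar
    print(company.foo.NAME, company.bar.NAME)  # -> foo bar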
Nothing stops an index from validating that wheels use the same name or namespace as their package names. Sdists with arbitrary backends would not be possible, but you could enforce what backends were allowed for certain users.
Once we learn to namespace things it’s gonna be so nice. Seems we keep re-learning that lesson…
I feel like github solved this problem pretty well.
I lost track of how many different ways to install a Python library there are at the moment.
Much better than the Node a handful of years back. Everybody used NPM, everybody switched to Yarn, everybody switched back to NPM.
That's because npm finally decided to adopt enough features as time went on that it could be used in place of yarn, and eventually, if it adopts enough of the features of pnpm, it will replace that too.
Though speaking as a long time developer in the ecosystem, switching between npm, yarn, and pnpm is fairly trivial in my experience. Especially after node_gyp went away
It's a good thing that a newcomer came and showed the world some new concepts which ended up being adopted by the old tool. In the Haskell world, everybody used cabal, everybody switched to stack, and then everybody switched back to cabal once it got its new-build commands ready.
pnpm for life
The idea that every single directory for every single project down 18 subdirectories deep should have their own copy of is-even is insanity
I think node has a better tooling and ecosystem right now. Astral is doing a great job to reduce the gap.
Node ecosystem still has the problem where if you try to build a project two years later, chances are good it won't work, because breaking changes are so common, and this is then multiplied across all the tiny packages that are dependencies of your dependencies.
Don't lockfiles help?
Interesting watching this part of the landscape heating up. For repos you've got stalwarts like Artifactory and Nexus, with upstart Cloudsmith. For libraries you've got the OG ActiveState, Chainguard Libraries and, until someone is distracted by a shiny next week, Google Assured Open Source.
Sounds like Pyx is trying to do a bit of both.
Disclosure: I have interacted a bunch with folks from all of these things. Never worked for or been paid by, though.
> There should be one-- and preferably only one --obvious way to do it.
It’s been pointed out to me before that:
the two ways of spacing the em dash on that quote is a joke about how it's not actually possible to do that? (And there's a third way in another line of the zen)
Do I buy it? Not sure. But apparently there’s more to this line than what it suggests.
Not only is this the main reason Python turned me off; they can't even stick to it with their umpteen package managers, which is hypocrisy.
Can I ask a dumb question? Why does Ruby (for example) not have this problem, while Python still can't ship a standard solution that isn't constantly changing and rolled up in some corporate offering?
GPU/c-bindings.
Python packaging for Python-only modules has never been a problem. When people say they hate Python packaging, they are usually talking about being able to successfully install dependencies without much thinking. But the biggest reason that doesn't work is the dependencies that have to be compiled, which bring their own problems. Have you ever had a C dependency in Node or Ruby on a system that wasn't the one it was built for? Turns out it sucks in all the languages. It's just that the number of C-level packages in Python is much larger than in, say, Ruby, so the likelihood of a problem is significantly larger.
Especially in the GPU accelerated space.
Ruby is mostly used in web dev, where most if not all of your dependencies tend to be pure Ruby.
Python is used heavily for DS/ML/AI, which is exactly the area where native code packages are necessary and prevalent. Worse yet is that those packages often involve GPU code, and things like CUDA bring their own complications.
If you're writing web apps in Python, dependencies haven't really been a problem for a long time now.
Before you answer that you have to answer what problem this is solving that PyPI doesn’t already address. uv works great against “legacy” package indexes so I’m not really clear why it’s needed other than to introduce lock-in to a for-profit facility.
Because CPython and PyPA are dysfunctional organizations in the hands of people who are in the right (often corporate) cliques. Don't expect anything from there.
I really want to pay someone money to run package repo mirrors for me, but my problems have been more with npm than with Pypi. Astral, if you're listening.... maybe tackle JS packaging too?
Cool idea! I think I could benefit from this at my job if they're able to eat Anaconda's lunch and provide secure, self-hosted artifacts.
The only thing that's unclear to me is to what extent this setup depends on the package publisher. PyPI might be terrible, but at least it just works when you want to publish; my worry is that this adds complexity for the people looking to use a piece of free software rather than for the maintainer.
Maybe they are only targeting dev tooling companies as a way to simplify how they distribute. Especially in the accelerated compute era.
When this releases it will be crazy. I've always wondered why something like this didn't already exist.
Really useful concept especially for school.
Is this going to solve the combinatorial explosion of pre-building native dependencies for every possible target?
Python should get rid of its training wheels :^)
https://kristoff.it/blog/python-training-wheels/
> Emphasis mine. It would indeed be hard to survive without that kind of support from a corporation. A user on HN estimated the yearly cost of this traffic at around 12 million USD/year (according to AWS Cloudfront rates), more than four times the full operating budget of the Python Software Foundation as of 2024.
(As the user in question: three times.)
> Leverage progress in the systems programming ecosystem to create repeatable builds. Turn prebuilt binaries from “sources” into cacheable artifacts that can be deleted and reconstructed at will. Institute a way of creating secondary caches that can start shouldering some of the workload.
This doesn't avoid the need for the wheels to exist and be publicly available. People running CI systems should figure out local caching that actually works, sure. But if you delete that cacheable artifact on the public PyPI website for something like, I don't know, numpy-2.3.2-cp312-cp312-win_arm64.whl, you're going to be re-creating it (and having it downloaded again) constantly. Windows users are just not going to be able to build that locally.
And you know, space usage isn't the problem here — we're talking about a few hard drives' worth of space. The number of downloads is the problem. Outside of CI, I guess that's mostly driven by end users defaulting to the latest version of everything, every time they make a new virtual environment, rather than using whatever's in their package installer's cache. I do know that uv makes the latter a lot easier.
I hate that they are using the pyx name; it's the extension for Cython files. It's going to cause at least a moment of confusion for people. They could have easily checked for name collision in the Python ecosystem but they chose not to do that; that's like a middle finger gesture to the community.
Good on you guys!
I wanted to start a business exactly like this years ago, when I actually worked in Python. I ended up not doing so, because at the time (circa 2014-2015) I was told it would never take off, no way to get funding.
I'm glad you're able to do what ultimately I was not!
Been waiting to see what Astral would do first (with regards to product). Seems like a mix of Artifactory and conda? Artifactory providing a package server, and conda trying to fix the difficulty that comes from Python packages with compiled components or dependencies - mostly solved by wheels, but of course PyTorch wheels requiring a specific CUDA version can still be a mess that conda fixes.
Given Astral's heavy involvement in the wheelnext project I suspect this index is an early adopter of Wheel Variants which are an attempt to solve the problems of CUDA (and that entire class of problems not just CUDA specifically) in a more automated way than even conda: https://wheelnext.dev/proposals/pepxxx_wheel_variant_support...
It's actually not powered by Wheel Variants right now, though we are generally early adopters of the initiative :)
Well it was just a guess, "GPU-aware" is a bit mysterious to those of us on the outside ;).
Python packaging is the least zen of python thing about python.
Pyx is just a registry, just like PyPI, or did I misunderstand it?
Not exactly -- part of pyx is a registry (and that part speaks the same standards as PyPI), but the bigger picture is that pyx is part of a larger effort to make Python packaging faster and more cohesive for developers.
To be precise: pyx isn't intended to be a public registry or a free service; it's something Astral will be selling. It'll support private packages and corporate use cases that are (reasonably IMO) beyond PyPI's scope.
(FD: I work on pyx.)
Sounds like it. Also ..
> pyx is also an instantiation of our strategy: our tools remain free, open source, and permissively licensed — forever.
FWIW, I think the full paragraph around that snippet is important context:
> Beyond the product itself, pyx is also an instantiation of our strategy: our tools remain free, open source, and permissively licensed — forever. Nothing changes there. Instead, we'll offer paid, hosted services that represent the "natural next thing you need" when you're already using our tools: the Astral platform.
pyx itself is not a tool, it's a service.
How do you pronounce "pyx"? Pikes, picks, pie-ex?
We've been pronouncing it pea-why-ecks, like uv (you-vee) and ty (tee-why). But I wouldn't say that's permanent yet.
that's fascinating - I've definitely been saying "you've" and "tie". I assumed this was "picks"
definitely will just say picks...
py-x like pie-ex?
I demand it be pronounced like the existing English word "pyx" (aka "pyxis"), meaning a weird religious box:
https://en.wikipedia.org/wiki/Pyx
I do not trust Astral.
Much ad language.
They do not explain what an installation of their software does to my system.
They use the word "platform".
It's hosted SaaS; you don't install it on your system.
I feel like I must be the crazy one for never having a problem with just vanilla pip, PyPI, and venv (or virtualenv for old Python 2 stuff). Maybe it's just my use case?
Astral is the coolest startup
Been waiting for something like this to make it easier to manage multi-package projects.
If uv can install a Python version in 32ms (a time that I saw and confirmed just this morning), then sign me up.
Neat. uv is spectacular.
But I don’t get it. How does it work? Why is it able to solve the Python runtime dependency problem? I thought uv had kinda already solved that? Why is a new thingy majig needed?
> Why is it able to solve the Python runtime dependency problem? I thought uv had kinda already solved that?
The dependencies in question are compiled C code that Python interfaces with. Handling dependencies for a graph of packages that are all implemented in pure Python, is trivial.
C never really solved all the ABI issues and especially for GPU stuff you end up having to link against very specific details of the local architecture. Not all of these can be adequately expressed in the current Python package metadata system.
Aside from that, a lot of people would like to have packages that use pre-installed dependencies that came with the system, but the package metadata isn't designed to express those dependencies, and you're also on your own for actually figuring out where they are at runtime, even if you take it on blind faith that the user separately installed them.
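For the runtime half of that, about the best the stdlib offers is a hunt-and-hope approach; a sketch, with "ssl" as an arbitrary example library:

    import ctypes
    import ctypes.util

    # Python package metadata can't declare "needs the system's libssl", so
    # code that wants a system library has to go looking for it at runtime
    # and hope the user installed it via their OS package manager.
    name = ctypes.util.find_library("ssl")
    if name is None:
        raise RuntimeError("libssl not found; install it with your OS package manager")
    libssl = ctypes.CDLL(name)
    print("loaded", name)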
This is not really my wheelhouse, though; you'll find a much better write-up at https://pypackaging-native.github.io/ .
> especially for GPU stuff you end up having to link against very specific details of the local architecture.
Hrm. This doesn’t sound right to me. Any project should target a particular version of Cuda and then the runtime machine simply needs to have that version available. Right?
> a lot of people would like to have packages that use pre-installed dependencies that came with the system
Those people are wrong. Everything these days requires Docker because it's the only way to deploy software that can reliably not crash on startup. (This is almost entirely a Linux self-induced problem.)
This is a server hosted SaaS product that acts as a private registry for your company's Python packages.
uv becomes a client that can install those packages from your private registry.
I imagine pip will be able to install them from a PYX registry too.
What are the reasons that Python can't implement the same sort of module/packaging system as NodeJS? That seems to work well enough.
Executing a Python script in the same directory as some sort of project.json file that contains all the complicated dependency details would be a pretty good solution to me. But I'm probably missing a whole bunch of details. (Feel free to educate me).
In general I really dislike the current system of having to use new environment variables in a new session in order to isolate Py scripts. It has always seemed like a hack with lots of footguns. Especially if you forget which console is open.
It can. That's what uv is. Put an '#!/usr/bin/env -S uv run python' shebang in your script, add a `pyproject.toml` with all of your deps, and you're done.
> add a `pyproject.toml` with all of your deps
Or put your dependencies inline in the script (https://packaging.python.org/en/latest/specifications/inline...), and it's just a single file script again. But with dependency support
EDIT to clarify it was just the pyproject.toml that inline metadata could replace
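For concreteness, a minimal hypothetical script in that inline-metadata format (PEP 723) might look roughly like the sketch below; `requests` is just a stand-in dependency, and a runner that understands the format (e.g. `uv run script.py`) sets up the environment on the fly:

    #!/usr/bin/env -S uv run --script
    # /// script
    # requires-python = ">=3.12"
    # dependencies = [
    #     "requests",
    # ]
    # ///
    import requests

    # The dependency list travels with the code: no separate pyproject.toml
    # or manually managed virtualenv for a one-file script.
    print(requests.get("https://example.com").status_code)

Any tool that implements the spec can run it the same way, so the file stays portable between machines.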
Again! ezsetup, setuptools, conda, poetry, uv, now this.
I wonder if Nix has been considered.
this is so comical, entirely https://xkcd.com/927/
Python has burned me with its packaging so many times.
Yay, _another_, probably incompatible, Python package manager has arrived.
Probably the more useful blog post: https://astral.sh/blog/introducing-pyx
Thanks, we've updated this now from https://astral.sh/pyx.
Thanks, that’s a bit less cryptic than the linked page.
Still don’t get how they are solving what they claim to solve.
I'm guessing from the uv page [0] it's mainly if the speed of pip is a problem for you?
[0] https://docs.astral.sh/uv/
There are a bunch of problems with PyPI. For example, there's no metadata API, you have to actually fetch each wheel file and inspect it to figure out certain things about the packages you're trying to resolve/install.
It would be nice if they contributed improvements upstream, but then they can't capture revenue from doing it. I guess it's better to have an alternative and improved PyPI than to have no improvements and a sense of pride.
There is a lot of other stuff going on with Pyx, but "uv-native metadata APIs" is the relevant one for this example.
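As a rough illustration of that fetch-and-inspect dance (a hypothetical sketch, not how uv or pip actually implement it), a client that only has the wheel's URL ends up downloading the whole archive just to read its Requires-Dist lines:

    import io
    import sys
    import zipfile
    import urllib.request
    from email.parser import Parser

    def requires_dist_from_wheel(wheel_url: str) -> list[str]:
        # Download the entire wheel archive just to read a few lines of metadata.
        with urllib.request.urlopen(wheel_url) as resp:
            wheel_bytes = resp.read()
        with zipfile.ZipFile(io.BytesIO(wheel_bytes)) as whl:
            # Core metadata (including Requires-Dist) lives in *.dist-info/METADATA.
            meta_name = next(n for n in whl.namelist()
                             if n.endswith(".dist-info/METADATA"))
            metadata = Parser().parsestr(whl.read(meta_name).decode("utf-8"))
        return metadata.get_all("Requires-Dist") or []

    if __name__ == "__main__":
        # Usage: python inspect_wheel.py <URL of a .whl file>
        print(requires_dist_from_wheel(sys.argv[1]))

A registry that serves this metadata directly (which is presumably what "uv-native metadata APIs" refers to) lets a resolver skip the download entirely for packages it only needs to rule out.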
I'm guessing it's the right PyTorch and FlashAttention and TransformerEngine and xformers and all that for the machine you're on without a bunch of ninja-built CUDA capability pain.
They explicitly mention PyTorch in the blog post. That's where the big money in Python is, and that's where PyPI utterly fails.
I spend little time with Python, but I didn’t have any problems using uv. Given how great uv is, I’d like to use pyx, but first it would be good if they could provide a solid argument for using it.
I suspect that, in order to succeed, they will need to build something that is isomorphic to Nix.
Yeah, and uv2nix is already pretty good! I wonder if pyx will be competing with uv2nix.
It's easy to compete with Nix tooling, but pretty hard to compete with the breadth of nixpkgs.
They already built uv, which works extremely well for that
[flagged]
[flagged]
Please stop posting like this. HN is not for self-promotion; it's for sharing and discussing interesting topics and projects. We ban people who post like this continually.
[flagged]
I'm a fan of Deno's
> Waitlist
> Private registry
ouch.
I actually think this is great. If Astral can figure out a way to make money using a private registry (something that is used mainly by companies), then they'll have the resources to keep building their amazing open-source projects, Ruff and uv. That's a huge win for Python.
100% agree. I am more than happy to see Astral taking steps in this direction. People can continue to use uv, ruff, and ty without having to pay anything, but companies that benefit tremendously from open source initiatives can pay for a private package registry and directly support the continued development of said tools.
In particular I think it's nice for uv and ruff to remain open source, not open core. And as you say, companies always need paid private registries, for their internal software. A true win-win.
> Modern
I'll pass. I'd rather have the battle-tested old thing, thanks.
I've been limiting myself to whatever is available on Debian and it's been fine for me for several years.
I don't understand why people who don't do weird AI stuff would use any of that instead of sticking to distribution packages and having the occasional 1 or 2 external modules that aren't packaged.
Indeed, to expand on my remark: I wrote Python in academia for ~6 years and then professionally for nearly a decade in data science, data engineering and backend web apps. Virtualenv was fine. Pipenv had a nicer CLI and easier-to-use dependency pinning. But fundamentally all this stuff worked fine.
Because making external modules cooperate with the system environment is awkward at best (and explicitly safeguarded against since 3.11, since it can cause serious problems otherwise even with "user" installs), and installing the distro's packages in a separate environment is not supported as far as I can tell. And because the system environment is often deliberately crippled; it may not even include the entire standard library.
> the system environment is often deliberately crippled; it may not even include the entire standard library.
Well, getting a good system is step 1.