Size of the typst/packages repository #2024
Comments
We are well aware of this issue, and even if a shallow clone wouldn't help, a sparse checkout should normally keep the repository size smaller on your disk. As I said in #2007, I plan to rework the packaging guidelines at some point, and explanations on how to do that (as well as general tips on how to keep a package small) will definitely be part of it. In the long term, we will probably move away from Git and GitHub for storing and reviewing packages, and use a custom solution instead.
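For readers who want to try the sparse checkout mentioned above, here is a minimal sketch, not official guidance. It only builds and prints the two git invocations; the repository URL, the target directory, and the package path `packages/preview/my-package` are assumptions made for illustration. It combines `git clone --sparse` with a blob-less partial clone (`--filter=blob:none`), which is one possible way to also limit what gets downloaded.

```python
import shlex


def sparse_clone_commands(repo_url, target_dir, package_dir):
    """Return the git invocations for a blob-less, sparse clone of one directory."""
    return [
        # Fetch commits and trees now; file contents are downloaded on demand.
        ["git", "clone", "--filter=blob:none", "--sparse", repo_url, target_dir],
        # Restrict the working tree to the one package directory of interest.
        ["git", "-C", target_dir, "sparse-checkout", "set", package_dir],
    ]


if __name__ == "__main__":
    commands = sparse_clone_commands(
        "https://github.com/typst/packages",  # assumed URL of the repository
        "packages",                           # assumed local directory name
        "packages/preview/my-package",        # hypothetical package path
    )
    for cmd in commands:
        print(shlex.join(cmd))
```

The printed commands can be pasted into a shell. Whether this matches the workflow the maintainers will eventually document is not known from this thread.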
Would you be able to suggest a way to do this incrementally, so that we can get this useful documentation out right now, without conflicting with your medium-term plans for the README? It's a shame to find out how to generate thumbnails or clone the repository only later on, after having already done both in a non-optimal way, because the information is hard to find. For example, an immediate plan would be to create a doc/ directory, and move the content of the current README into subfiles there (with links from the main README). I would think of the following separate topics:
This is mostly for package authors. The main README should work as a landing page for package users by briefly mentioning how to use packages, and how to browse the set of existing packages. (The documentation on local packages can also be useful there.) Would you be willing to review a PR that splits up the existing guidelines in this way?
Yes, that would be extremely helpful. There are a few other things I have in mind when it comes to improving the docs, but this can be an incremental process; splitting everything as you suggested would be a good first step.
I am not very knowledgeable in this area, but I have wondered for a long time whether Typst would consider moving away from a monolithic package repo in favor of a more distributed approach based on a registry. One example is the Julia programming language. Every package lives in its own repo (on any publicly accessible host) and is then registered within an official registry (e.g., the general registry: https://github.com/JuliaRegistries/General).
Idea: Registries for Typst packages
Would it be worth exploring whether we can define some sort of "registry" (or find a more suitable name), as is done in the Rust ecosystem with
// Use shared registry identifier
#import "@registry-identifier/my-package:0.1.1"
// Preview registry which will be replaced in the future
#import "@preview/my-package:0.1.1"
// Use direct url
#import "@{https://my-registry.org}/my-package:0.1.1"
I also see some caveats and it might be reasonable to not simply "throw around" new registries lightly.
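To make the three proposed import forms concrete, here is a small sketch of how such a specifier could be split into registry, package name, and version. It is purely illustrative: the `@{url}` form is only the proposal from this comment, and the `PackageSpec` type, the `parse_spec` function, and the exact character rules in the pattern are assumptions, not Typst's actual grammar.

```python
import re
from typing import NamedTuple


class PackageSpec(NamedTuple):
    registry: str  # namespace identifier, or registry URL for the @{...} form
    is_url: bool   # True when the registry was given as a direct URL
    name: str
    version: str


# Hypothetical grammar: "@<namespace>/<name>:<major>.<minor>.<patch>"
# or "@{<url>}/<name>:<major>.<minor>.<patch>".
_SPEC = re.compile(
    r"^@(?:\{(?P<url>[^}]+)\}|(?P<ns>[a-z0-9][a-z0-9-]*))"
    r"/(?P<name>[a-z0-9][a-z0-9-]*)"
    r":(?P<version>\d+\.\d+\.\d+)$"
)


def parse_spec(spec: str) -> PackageSpec:
    """Split a package specifier into its parts, or raise ValueError."""
    match = _SPEC.match(spec)
    if match is None:
        raise ValueError(f"not a package specifier: {spec!r}")
    url = match.group("url")
    return PackageSpec(
        registry=url if url is not None else match.group("ns"),
        is_url=url is not None,
        name=match.group("name"),
        version=match.group("version"),
    )
```

One design question this makes visible: a short identifier needs some shared table that maps it to a location, while the URL form needs no such table but ties documents to a specific host.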
Personally I think that this is somewhat illusory, especially the idea that "registries manage themselves": no, actually sizeable ecosystems have people who pour a lot of work into the repository (crates.io has dedicated volunteers, so does the opam-repository for the OCaml community that I'm more familiar with), and this also requires a lot of tooling and architecture. My understanding from a distance is that Typst is both an open-source tool and also a company, and that the people who are trying to make a living working on it currently want something that is more integrated than a repository of stuff hosted elsewhere. (For example maybe they want to evolve the repository format at the same time as their cloud frontend, and for this being able to update previous package versions is actually very convenient.) I don't have a strong opinion on which approach is better, but I think it's their choice to make, based on concrete needs. If we want to stick to a repository that hosts package content (in addition to metadata), but reduce the size that people have to download to contribute a package, there are plenty of technical solutions around. If people want to change the social organization of the package repository, this could/should be discussed separately.
@gasche Thanks for the insights. I was in no way under the impression that this is the solution to the problem; I was simply offering a perspective. I personally do not really care about what the actual solution might be. What counts to me is that I can contribute my packages with a low barrier (which the current solution offers imho). I think one problem which needs to be addressed in the future is the question of namespaces (aka what else to do than Maybe now that I think about it, it would be more similar to the pypi.org package index. And maybe the name "registry" was misleading in the beginning.
The typst/packages repository takes 1.9 GiB on my machine with a current clone. As Typst gets more popular, its size will grow faster than linearly, and there is a risk that working with the package repository becomes painful in practice: at some point, people with low bandwidth will have trouble cloning the repository to contribute their own package.
The size of the repository is currently roughly:
(in particular, doing a shallow clone will not help much)
On my current checkout of the repository, there are
In the short term, the following could work:
Replacing identical asset files with symbolic links can be done by package authors if they are told how to do it, or by repository maintainers after the fact. (git already deduplicates its internal data, so it is not strictly necessary to do it at package-submission time.) A quick experiment suggests that doing this with the current repository should shrink its size from 1.4 GiB to 947 MiB, which is a sizeable win.
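The symlink idea can be sketched as follows. This is not the script used for the quick experiment above, just one possible approach under stated assumptions: files are considered identical when their SHA-256 digests match, the first path in sorted order is kept as the real file, and the function names `find_duplicates` and `link_duplicates` are made up for this illustration. It also assumes a platform where symbolic links are available, and it says nothing about whether the package tooling follows them.

```python
import hashlib
import os
from collections import defaultdict
from pathlib import Path


def find_duplicates(root):
    """Group regular files under root by the SHA-256 digest of their content."""
    by_hash = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if ".git" in path.parts:
            continue  # never touch git's own object store
        if path.is_file() and not path.is_symlink():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_hash[digest].append(path)
    return [paths for paths in by_hash.values() if len(paths) > 1]


def link_duplicates(root):
    """Replace each duplicate except the first with a relative symlink.

    Returns the number of bytes no longer stored as separate copies.
    """
    saved = 0
    for first, *rest in find_duplicates(root):
        for dup in rest:
            saved += dup.stat().st_size
            # A relative target keeps the link valid wherever the tree is cloned.
            target = os.path.relpath(first, start=dup.parent)
            dup.unlink()
            dup.symlink_to(target)
    return saved
```

Calling `find_duplicates` alone gives a dry run: it reports the groups of identical files without modifying anything, which is probably the safer first step on a real checkout.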
In the long term, I think that repository maintainers should maybe consider git-lfs or other options. The end goal would be that package authors do not need to download all other packages to submit theirs.