Provide a minimal repository for lightweight submodule usage #245

Open
pasabanov opened this issue Feb 6, 2025 · 4 comments

pasabanov commented Feb 6, 2025

Is your feature request related to a problem? Please describe.

Your repository is convenient to use as a submodule.
However, the downloaded submodule is clearly larger than what the user actually needs, as it contains two copies of the header files, as well as other files.

Describe the solution you'd like

I’ve created a minimal one-file repository that tracks tomlplusplus:
https://github.com/pasabanov/tomlpp-min

This version is intended for more economical submodule addition with reduced size. Notably, a shallow clone of this repository (which is also its full size, since it consists of a single commit) is almost 10 times smaller than the original repository:

git submodule add --depth 1 https://github.com/pasabanov/tomlpp-min toml++
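
One caveat worth checking: as far as I know, `--depth 1` only affects the clone made by `git submodule add` on the current machine and is not written to `.gitmodules`. A minimal sketch of recording the preference with the documented `submodule.<name>.shallow` setting follows; the helper name `mark_submodule_shallow` is made up for illustration.

```shell
# Hypothetical helper: record that a submodule should be cloned shallowly.
# $1 is the submodule name as it appears in .gitmodules (e.g. toml++).
mark_submodule_shallow() {
    # Writes "shallow = true" under [submodule "<name>"] in .gitmodules.
    git config -f .gitmodules "submodule.$1.shallow" true
}
```

After adding the submodule, running `mark_submodule_shallow toml++` and committing `.gitmodules` should let other people who clone the parent project get the shallow variant as well.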

To create this repository, I squashed all commits from the initial commit up to v3.4.0 and removed all unnecessary files.
In the commit message, I included all contributors who worked on toml.hpp and the include directory in the original repository (both lists turned out to be identical).

To gather the authors, I combined the output of the following commands:

git log --pretty=format:"Co-authored-by: %an <%ae>" -- toml.hpp | sort | uniq
git log --pretty=format:"Co-authored-by: %an <%ae>" -- include | sort | uniq

as well as lines like "Co-authored-by: *" from:

git log toml.hpp | grep auth
git log include | grep auth

Would you be interested in maintaining a similar minimal repository? I believe it could be useful for those who want a lightweight submodule without extra files.

I think the maintenance process could simply be to make one squashed commit for each new release.

@pasabanov pasabanov added the feature New feature or enhancement request label Feb 6, 2025
Owner

marzer commented Feb 6, 2025

Hi there,

Interesting. I wasn't aware the clone times/sizes of this repo were anything close to a problem, to be honest. 😅

> Would you be interested in maintaining a similar minimal repository?

Sure, you can add me as a maintainer to it, that'll be fine. This repo is long overdue for some refactors and a new release, so I might end up contributing eventually.

> I think the maintenance process could simply [...]

If you can capture your process in a re-usable shell-script and have that in the repo itself that'd be great.
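
As a rough sketch of what such a script could look like (not the process actually used to create tomlpp-min): the hypothetical function `publish_release` below copies `toml.hpp` as it exists at a release tag into the minimal repository and commits it with `Co-authored-by:` trailers collected from the upstream history. The function name, argument order, and commit message format are all assumptions.

```shell
# Hypothetical helper: add one squashed commit to the minimal repo for a release.
# Usage: publish_release <upstream-checkout> <tag> <minimal-repo-checkout>
publish_release() {
    local upstream=$1 tag=$2 minimal=$3 coauthors

    # Take the single header exactly as it exists at the release tag.
    git -C "$upstream" show "$tag:toml.hpp" > "$minimal/toml.hpp" || return 1

    # Credit everyone who touched the header or include/ up to that tag.
    # sort -u on the text before '<' keeps one line per name.
    coauthors=$(git -C "$upstream" log \
        --pretty='Co-authored-by: %an <%ae>%n%b' "$tag" -- toml.hpp include \
        | grep 'Co-authored-by:' | sort -u -t '<' -k1,1)

    # The second -m becomes a separate paragraph holding the trailers.
    git -C "$minimal" add toml.hpp
    git -C "$minimal" commit --quiet -m "toml++ $tag" -m "$coauthors"
}
```

A call might look like `publish_release ../tomlplusplus v3.4.0 .`, run from inside a checkout of the minimal repository.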

Author

pasabanov commented Feb 6, 2025

> I wasn't aware the clone times/sizes of this repo were anything close to a problem, to be honest. 😅

Yeah, about 5 megabytes of data isn't that much, but if there's a simple way to download less data (almost 10 times less), it would be convenient in any case.
Also, for small projects, 5 megabytes can be on par with or even greater than the project itself.
I think it's always a good practice to keep your storage usage minimized.

> you can add me as a maintainer to it

Ok, I'll add you.
I initially thought that, as the author, you could create your own minimized project. This would be more logical, making the minimized repository the official source.
But if you're okay with being a maintainer in my repository instead, that's fine too.
As another option, I could transfer this repository to you, so you could either maintain it yourself or add me as a maintainer.

> If you can capture your process in a re-usable shell-script and have that in the repo itself that'd be great.

This would be beneficial.
I don't mind adding a few kilobytes to the repository, as the single file in it is already half a megabyte.

Also, a workflow for checking the match between files in the original and minimized repositories could be implemented.

But I think we can hold off on that, at least until the next release version.
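
For reference, such a check could be quite small. This is only a sketch: the function name `header_matches` and its arguments are hypothetical, and it assumes both repositories are checked out locally.

```shell
# Hypothetical check: succeed only if the minimal repo's toml.hpp is
# byte-identical to the header at the given upstream tag.
# Usage: header_matches <upstream-checkout> <tag> <minimal-repo-checkout>
header_matches() {
    # cmp reads the upstream version from stdin ("-") and compares it
    # silently (-s) with the file in the minimal repo.
    git -C "$1" show "$2:toml.hpp" | cmp -s - "$3/toml.hpp"
}
```

A CI job could then run something like `header_matches upstream v3.4.0 . || exit 1` after checking out both repositories.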

For now, I can present a few drafts:

  1. Bad: for some reason, it glues two (only two) lines in the output into one:
    (git log --pretty=format:"Co-authored-by: %an <%ae>" toml.hpp include; git log toml.hpp include | grep Co-authored-by: | awk '{$1=$1}1') | sort | uniq
  2. Better: inserts a new line with sed 's/>C/>\nC/':
    (git log --pretty=format:"Co-authored-by: %an <%ae>" toml.hpp include; git log toml.hpp include | grep Co-authored-by: | awk '{$1=$1}1') | sed 's/>C/>\nC/' | sort | uniq
  3. Best so far: removes lines with matching names but different emails, keeping only the last email:
    (git log --pretty=format:"Co-authored-by: %an <%ae>" toml.hpp include; git log toml.hpp include | grep Co-authored-by: | awk '{$1=$1}1') | sed 's/>C/>\nC/' | sort | uniq | awk -F ' <' '{if(seen[$1]++==0)print$0}'
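
A likely explanation for the glued lines: `--pretty=format:` puts newlines between entries but not after the last one, so the first `git log` ends without a trailing newline and the first line of the second command is appended to it. Using `tformat:` (terminator semantics) avoids that without the `sed` step. A minimal sketch, where the function name `list_coauthors` is made up for illustration:

```shell
# Hypothetical helper: list authors and co-author trailers for the given paths.
# tformat: ends every entry with a newline, so the two outputs concatenate cleanly.
list_coauthors() {
    {
        # One line per commit author.
        git log --pretty=tformat:'Co-authored-by: %an <%ae>' -- "$@"
        # Co-author trailers found in the commit message bodies.
        git log --pretty=tformat:'%b' -- "$@" | grep 'Co-authored-by:'
    } | sort -u
}
```

It would be called as `list_coauthors toml.hpp include` from inside the upstream checkout.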

Author

pasabanov commented Feb 6, 2025

  4. A shorter version (keeps only the last email):
    git log --pretty=format:"Co-authored-by: %an <%ae>%n%b" toml.hpp include | grep 'Co-authored-by:' | awk '{$1=$1}1' | sort | uniq | awk -F '<' '{if(seen[$1]++==0)print$0}'
  5. The shortest version (keeps the shortest email):
    git log --pretty="Co-authored-by: %an <%ae>%n%b" toml.hpp include | grep 'Co-authored-by:' | sort -u -t '<' -k1,1
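
To illustrate how the `seen[$1]++` filter in these one-liners behaves, here is a small standalone sketch on invented sample data (the names and addresses are hypothetical). It keeps the first line encountered for each name. When the filter runs after `sort`, "first" means first in sort order, which is not necessarily the most recent address.

```shell
# Keep the first line seen for each name; the name is everything before " <".
dedupe_by_name() {
    awk -F ' <' '!seen[$1]++'
}

# Hypothetical sample input: Alice appears twice with different addresses.
printf '%s\n' \
    'Co-authored-by: Alice <alice@new.example>' \
    'Co-authored-by: Alice <alice@old.example>' \
    'Co-authored-by: Bob <bob@example.com>' \
    | dedupe_by_name
```

This prints the `alice@new.example` line and the Bob line, dropping the second Alice entry.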

Owner

marzer commented Feb 6, 2025

Any of 3, 4 and 5 are fine with me.
