Commit bbb6466

introduced GIT_LFS_SKIP_SMUDGE=1 as an option now for the package submission workflow; it doesn't hurt us any more if users don't set it
1 parent a64992d commit bbb6466

2 files changed

Lines changed: 21 additions & 18 deletions

File tree

archive_reviewer_guide.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@
44

55
The role of the Poseidon package reviewer is to help ensure quality standards for Poseidon's public package archives. Fortunately, many aspects of the Poseidon schema are machine-testable. Automatic validation catches various structural issues right away, for example missing mandatory columns in the Poseidon .janno file (such as the `Poseidon_ID`).
66

7-
But there are some aspects we cannot check, such as the scientific correctness of the given information. And there are other we don't want to formally check, because they are not included in the core definition of a Poseidon package, but just policy for our public archives. For these, we rely on a checklist every package author has to fill, and finally manual reviews.
7+
But there are some aspects we cannot check, such as the scientific correctness of the given information. And there are others we don't want to formally check, because they are not included in the core definition of a Poseidon package, but just policy for our public archives. For these, we rely on a checklist every package author has to fill, and finally reviews.
88

99
## GitHub Pull Requests
1010

archive_submission_guide.md

Lines changed: 20 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -6,8 +6,6 @@ The Poseidon framework has a strongly decentralized philosophy and relies very m
66

77
We assume you have some basic knowledge about using command line software like [`trident`](trident), and how to handle Git and GitHub. If not, then you can become knowledgeable quickly about the latter, for example [here](https://githubtraining.github.io/training-manual).
88

9-
!> Never clone the archive repositories without `GIT_LFS_SKIP_SMUDGE=1`. Always clone with `GIT_LFS_SKIP_SMUDGE=1 git clone ...`.
10-
119
## Archive curation roles
1210

1311
To manage package submissions and modifications in our archives, we define the following roles, which are synonymous to the respective roles within github:
@@ -55,37 +53,42 @@ This is mandatory. Please also run [`trident validate`](trident?id=validate-comm
5553

5654
### Submitting the package
5755

58-
The procedure for the actual submission is then as follows (a shorter, slightly more hands-on tutorial is available [here](https://mpi-eva-archaeogenetics.github.io/comp_human_adna_book/poseidon.html#contributing-to-the-community-archive))
56+
The procedure for the actual submission is then as follows:
5957

60-
**1. Fork and then clone the GitHub repository for the archive you want to modify.**
58+
**1. Fork the GitHub repository for the archive you want to modify.**
6159

6260
You need to be logged into github with your user account. You can then navigate to our github repository: <https://github.com/poseidon-framework/community-archive> and hit the "Fork" button near the top of the page.
6361

6462
You will then have a copy of the entire repository under your own user name: `https://github.com/<yourGithubUserName>/community-archive`.
6563

66-
For the following to work, you need to have setup your github account in a way that allows you to communicate with github via the command line. For this, you need to configure an SSH public-key, so github really knows it's you. Find out more about it here: <https://docs.github.com/en/authentication/connecting-to-github-with-ssh/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent>.
64+
**2. Clone (download) your fork.**
6765

68-
!> To safe our [Git LFS](https://docs.github.com/en/repositories/working-with-files/managing-large-files/about-git-large-file-storage) bandwidth, **please clone in a way that does not download the large data files from GitHub** (they should be downloaded from our webserver with [`trident fetch`](trident?id=fetch-command)).
66+
For the following to work, you need to have set up your github account in a way that allows you to communicate with github via the command line. For this, you need to configure an SSH public-key, so github really knows it's you. Find out more about it here: <https://docs.github.com/en/authentication/connecting-to-github-with-ssh/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent>.
6967

70-
At the same time you need to be able to add new LFS files. A proper setup for this includes the following steps:
68+
You need to be able to add new [Git LFS](https://docs.github.com/en/repositories/working-with-files/managing-large-files/about-git-large-file-storage) files. A proper setup for this includes the following two steps:
7169

72-
- downloading and [installing Git LFS](https://git-lfs.github.com/),
73-
- setting it up for your user with `git lfs install`
74-
- cloning the repo **with the `GIT_LFS_SKIP_SMUDGE` environment variable**, which prevents downloading the LFS files despite Git LFS being enabled:
70+
1. downloading and [installing Git LFS](https://git-lfs.github.com/),
71+
2. setting it up for your user with `git lfs install`
7572

76-
```
77-
GIT_LFS_SKIP_SMUDGE=1 git clone git@github.com:<yourGitHubUserName>/community-archive.git
78-
```
73+
You can then clone the fork repository.
7974

80-
As a consequence the large files will not be downloaded, but only stub files, representing the real files on the LFS server. This clone is only for submission purposes after all -- you can not work with the genotype data in it. `2021_Wang_EastAsia/2021_Wang_EastAsia.bed` for example will look like this:
75+
To save some time and storage space on your system, you can clone in a way that does not download the large data files in the repository. You can do so by setting the `GIT_LFS_SKIP_SMUDGE` environment variable. As a consequence, only stub files representing the real files on the LFS server will be downloaded, not the large files themselves. This clone is only for submission purposes after all -- you will probably not work with the genotype data in it. `2021_Wang_EastAsia/2021_Wang_EastAsia.bed` for example will look like this:
8176

8277
```
8378
version https://git-lfs.github.com/spec/v1
8479
oid sha256:766e7c9f79c1659dfb924c901420f01e8720557a0ec37f2a694f6a29cdc0a55e
8580
size 177553875
8681
```
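To tell such a stub apart from a real genotype file, one can test whether the file starts with the LFS spec line shown above. The following is only a minimal sketch; `is_lfs_pointer` is a hypothetical helper name and not part of git, Git LFS, or trident.

```shell
# Succeeds (exit code 0) if the given file is a Git LFS pointer stub.
# is_lfs_pointer is a hypothetical helper, not an existing command.
is_lfs_pointer() {
  # A stub begins with this 42-character spec line; real .bed files are binary.
  head -c 42 "$1" 2>/dev/null |
    grep -qxF "version https://git-lfs.github.com/spec/v1"
}
```

It could be used as `is_lfs_pointer 2021_Wang_EastAsia/2021_Wang_EastAsia.bed && echo "stub only"`.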
8782

88-
**2. Copy your new package into your local clone.**
83+
The clone command with `GIT_LFS_SKIP_SMUDGE` set is as follows:
84+
85+
```
86+
GIT_LFS_SKIP_SMUDGE=1 git clone git@github.com:<yourGitHubUserName>/community-archive.git
87+
```
88+
89+
If you want to download the large files as well, then omit `GIT_LFS_SKIP_SMUDGE=1`.
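Note that the `VAR=value command` form sets the variable for that single command only, so later git commands in the same shell are not affected. A small illustration, assuming the variable is not otherwise exported in your shell:

```shell
# Make sure the variable is not already set in this shell.
unset GIT_LFS_SKIP_SMUDGE

# The prefix form sets the variable only for this one child process.
GIT_LFS_SKIP_SMUDGE=1 sh -c 'echo "during: ${GIT_LFS_SKIP_SMUDGE:-unset}"'

# A later command no longer sees it.
sh -c 'echo "after: ${GIT_LFS_SKIP_SMUDGE:-unset}"'
```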
90+
91+
**3. Copy your new package into your local clone.**
8992

9093
You should now copy your package including the full genotype data into the cloned repository as a new package directory. The directory should include the genotype data. Git (with Git LFS enabled) and GitHub will detect automatically that it should treat them as LFS files. Then commit the changes and push:
9194

@@ -97,7 +100,7 @@ git push
97100

98101
If you accidentally pushed the large files as normal files, for example if your LFS setup was incomplete, you can fix this with `git lfs migrate import --no-rewrite path/to/file.bed` (see [here](https://github.com/git-lfs/git-lfs/blob/main/docs/man/git-lfs-migrate.adoc#import-without-rewriting-history)).
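To reduce the risk of this happening, one can check before committing whether git would route a file through the LFS filter. The sketch below assumes that the repository's `.gitattributes` file assigns `filter=lfs` to the genotype file extensions; `tracked_by_lfs` is a hypothetical helper name.

```shell
# Succeeds if git would treat the given path as an LFS file,
# according to the .gitattributes of the current repository.
# tracked_by_lfs is a hypothetical helper, not an existing command.
tracked_by_lfs() {
  # git check-attr prints "<path>: filter: <value>"
  [ "$(git check-attr filter -- "$1")" = "$1: filter: lfs" ]
}
```

Run from inside the clone, `tracked_by_lfs path/to/file.bed || echo "not an LFS path"` would flag files that are about to be committed as normal files.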
99102

100-
**3. Submit a pull request from your fork to merge your updates into our repository.**
103+
**4. Submit a pull request from your fork to merge your updates into our repository.**
101104

102105
Having successfully pushed your branch to your fork on github, you need to now tell github to propose your branch as a submission to our master repository. This is done through [github Pull Requests](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/about-pull-requests).
103106

@@ -111,7 +114,7 @@ If you identify a mistake in any package, be it in the context data (`.janno` fi
111114

112115
**1. Fork and clone the GitHub repository that contains the package you want to improve.**
113116

114-
Just as above described for the package submission, please remember to clone with `GIT_LFS_SKIP_SMUDGE=1`. Individual LFS files can be downloaded with `git lfs pull --include "PATH-TO-FILE"`. This is necessary if you would like to modify not just the context- and meta data, but also the genotype data of a package.
117+
Just as described above for the package submission. If you cloned with `GIT_LFS_SKIP_SMUDGE=1` but now want to edit individual LFS files, then you can download them with `git lfs pull --include "PATH-TO-FILE"`.
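For a whole package it can be convenient to list which files are still stubs. The following sketch prints one `git lfs pull --include` command per stub below a directory, without running anything; `list_stub_pulls` is a hypothetical helper name, and the paths are assumed to be relative to the repository root.

```shell
# Print a `git lfs pull --include` command for every LFS stub below a directory.
# list_stub_pulls is a hypothetical helper; it only prints, it does not download.
list_stub_pulls() {
  find "$1" -type f | sort | while IFS= read -r f; do
    # Stubs start with the LFS spec line (42 characters).
    if head -c 42 "$f" | grep -qxF "version https://git-lfs.github.com/spec/v1"; then
      printf 'git lfs pull --include "%s"\n' "$f"
    fi
  done
}
```

Called as `list_stub_pulls 2021_Wang_EastAsia` from the repository root, it prints the commands to review and then run by hand.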
115118

116119
**2. Modify the files you want to change.**
117120
