
Commit 3b52b4f

committed
update
1 parent 51cf2af commit 3b52b4f

2 files changed

Lines changed: 49 additions & 28 deletions


config.yml

Lines changed: 2 additions & 2 deletions
@@ -81,10 +81,10 @@ languages:
     parent: Help
     weight: 20
     url: /open-source-for-social-science/formatting-help/
-  - name: Developer guide
+  - name: Contributing to Harmony
     parent: Help
     weight: 30
-    url: /developer-guide/
+    url: /open-source-for-social-science/contributing-to-harmony-nlp-project/
   - name: Releases
     parent: Help
     weight: 35

content/en/contributing-to-harmony.md

Lines changed: 47 additions & 26 deletions
@@ -1,37 +1,43 @@
 ---
 title: "Guide to contributing to Harmony"
 aliases:
+- "/contributing/"
 - "/contributing-to-harmony/"
 - "/developer-guide/"
 url: "/open-source-for-social-science/contributing-to-harmony-nlp-project/"
 ---
-Welcome to Harmony! Thank you for considering contributing to the Harmony project. This guide consolidates all the resources, tools, and tips you need to start contributing effectively. We value your input, no matter what skill set you have - whether you’re a developer, researcher, or data and AI enthusiast, or even if you’re in an entirely different field.
+
+Thank you for considering contributing to the Harmony project. This guide consolidates all the resources, tools, and tips you need to start contributing effectively. We value your input, no matter what skill set you have - whether you’re a developer, researcher, or data and AI enthusiast, or even if you’re in an entirely different field.
 {{< youtube cEZppTBj1NI >}}
 ## Getting started with Harmony
 We recommend that you try the free web tool: [Harmony App](https://harmonydata.ac.uk/app) to understand how the tool works and what it does.
 Join our [Discord Server](https://discord.gg/harmonydata), where you can interact with users and contributors, ask questions, and collaborate on ideas.
-Follow us on social media platforms: [Twitter](https://twitter.com/harmony_data), [LinkedIn](https://www.linkedin.com/company/harmonydata/), [Facebook](#Facebook-link), [YouTube](https://www.youtube.com/@harmonydata)
-We’ve split this guide into two sections, depending on whether you are intending to use Harmony in research or contribute to the code base:
-* I’m using Harmony -> see [Contributing to Harmony as a user](#contributing-to-harmony-as-a-user-eg-a-social-scientist-or-other-researcher)
-* I would like to contribute to the code base of Harmony -> see [Contributing to Harmony as a developer](#contributing-to-harmony-as-a-developer)
-
+Please also follow us on social media platforms: [Twitter](https://twitter.com/harmony_data), [LinkedIn](https://www.linkedin.com/company/harmonydata/), [Facebook](#Facebook-link), [YouTube](https://www.youtube.com/@harmonydata).
 
+We’ve split this guide into two sections, depending on whether you are intending to use Harmony in research or contribute to the code base:
+* I’m using Harmony 👉 see [Contributing to Harmony as a user](#contributing-to-harmony-as-a-user-eg-a-social-scientist-or-other-researcher)
+* I would like to contribute to the code base of Harmony 👉 see [Contributing to Harmony as a developer](#contributing-to-harmony-as-a-developer)
 
 
 ## Contributing to Harmony as a user (e.g. a social scientist or other researcher)
 
 There are two ways you can use Harmony in your work:
 
-1. Using your internet browser at https://harmonydata.ac.uk/app
-2. If you can code in Python or R, by using the Python or R library
+1. Using the Harmony web app from your browser at https://harmonydata.ac.uk/app
+2. Using Harmony from code, via the Python or R library.
 
 Some academic users start off on the web tool, and then switch to the [Python](https://github.com/harmonydata/harmony) or [R library](https://github.com/harmonydata/harmony_r) and work in Jupyter Notebooks or R Markdown to use the tool as part of their workflow.
 
+We have some example notebooks to help you get started with the Python and R libraries:
+* [Python Colab notebook](https://colab.research.google.com/github/harmonydata/harmony/blob/main/Harmony_example_walkthrough.ipynb)
+* [R markdown](https://harmonydata.ac.uk/harmony_r_example.nb.html)
+* [R Colab notebook](https://colab.research.google.com/github/harmonydata/experiments/blob/main/Harmony_R_example.ipynb)
+
 Here’s how you can help the project:
 
-1. Say hi on Discord (or [use our contact form](/contact/) if you don’t have Discord). If you are using Harmony in your work, please let us know on Discord as we’re always really excited to hear from new users around the world. We’d like to know what you’re using it for, and if there are any features that you would like added, or any pain points that you would like to see resolved.
+1. Say hi on Discord (or [use our contact form](/contact/) if you don’t have Discord). If you are using Harmony in your work, please let us know on Discord as we’re always really excited to hear from new users around the world. We’d like to know what you’re using it for, and if there are any features that you would like added, or any pain points that you would like to see resolved. And of course we would love to know what came out of your research!
 2. Create issues in Github. If you find a bug or would like to request a new feature, please visit the [issue board on Github](https://github.com/harmonydata/harmony/issues) and create a new issue. Before creating your issue, please check if the issue in question hasn’t already been created by somebody else. If so, it's often better to just leave a comment on an existing issue, rather than creating a new one.
-3. Publicise the project. The project really benefits from people sharing us on social media, giving us shout-outs at events, and generally spreading awareness about Harmony. If your project has a website, please link to harmonydata.ac.uk and describe how you used Harmony.
+3. Publicise the project. The project benefits from people sharing us on social media, giving us shout-outs at events, and generally spreading awareness about Harmony. If your project has a website, please link to harmonydata.ac.uk and describe how you used Harmony.
 4. Cite us. If you use Harmony in your research, please cite the tool and the accompanying publication. [Citation details are here](/frequently-asked-questions/#how-do-i-cite-harmony).
 5. Attend events or invite us to talk. Please come to some of our in-person [events](/events/), and we are also interested in any opportunities to speak at your event.
 6. Share data. If you have any data that you’re trying to harmonise with Harmony and it’s not sensitive or restricted, please share it with us so that we can see how people are using the tool and understand what kind of input it should expect.
@@ -53,7 +59,9 @@ We have four main repositories on Github under the [harmonydata](https://github.
 2. Harmony API: https://github.com/harmonydata/harmonyapi - the Python API runs with Pydantic and Fast API. You can run this locally. The public Harmony web app is running this application as a Docker container on an on-prem server (CentOS, 16 GB).
 3. Harmony front end: https://github.com/harmonydata - this is everything to do with the front end and graphical interface of Harmony. We welcome feedback and contributions on front end and UX issues.
 4. R: https://github.com/harmonydata/harmony_r - the R port is on [CRAN](https://cran.r-project.org/web/packages/harmonydata/index.html) and it is slightly less mature than Python so we really appreciate if you can help move the R package forward.
+
 ### Installing Harmony
+
 You can use Windows, Linux or Mac. We have made some videos to help you install Python and Harmony:
 
 
@@ -72,24 +80,32 @@ Here are the steps to get started:
 * Run the [example Colab notebook](https://colab.research.google.com/github/harmonydata/harmony/blob/main/Harmony_example_walkthrough.ipynb)
 * We recommend Anaconda and Jupyter Notebook
 * Then you can do `pip install harmonydata` to install Harmony once Python has been installed.
+
 ### What should I work on?
+
 Each of the repositories has its own issue tracker. Before taking on an issue, please check that nobody is already working on it. Please write a comment at the bottom of an issue that you would like to pick up, so that other people don’t duplicate your work.
+
 {{< image src="/images/issue-tracker.png" title="Issue tracker" >}}
+
 *Above: This is a preview of the issues board. You can see that some issues are tagged “good first issue”. So they are good for new people to pick up.*
 * Issues for the core Python library are here: [https://github.com/harmonydata/harmony/issues](https://github.com/harmonydata/harmony/issues)
 * Issues for the API are here: [https://github.com/harmonydata/harmonyapi/issues](https://github.com/harmonydata/harmonyapi/issues)
 * Issues for the front end are here: [https://github.com/harmonydata/app/issues](https://github.com/harmonydata/app/issues)
 * Issues for the R port are here: [https://github.com/harmonydata/harmony_r/issues](https://github.com/harmonydata/harmony_r/issues)
+
 You can also ask on Discord if you have any questions about how best to contribute.
+
 It’s a good idea to check the open pull requests to check that nobody has already worked on that issue and submitted code back to the main project. For example, here are the pull requests for the Python library: https://github.com/harmonydata/harmony/pulls
 
-## Coding Harmony
+### Coding Harmony
+
 Harmony is mostly coded in Python. We use [Pycharm IDE](https://www.jetbrains.com/pycharm/) by JetBrains. Please ensure you are familiar with Python, [HuggingFace](https://huggingface.co/), and [FastAPI](https://fastapi.tiangolo.com/), or Javascript and [React](https://react.dev/) if you want to work on the front end.
+
 * Please use [Pycharm default linter](https://www.reddit.com/r/pycharm/comments/mm77el/what_is_the_default_linter_in_pycharm/) - this is a set of rules of how many whitespace characters are allowed in a line, and in general provides consistency for formatting of human readable code and comments. If you use a different one (such as VS Code's linter, or pylint), this will make the code history hard to follow, so please be consistent. If one person uses spaces and another uses tabs, it's hard to manage it and keep track of code changes.
 * Please run unit tests before pushing. We use test driven development. That means that every commit gets tested automatically by Github and will get a green tick or red cross if the tests pass or fail. All the repos have tests in a folder called `tests` and you can run them on your computer and Github actions will run them when you commit. They will tell you if you break any functionality.
 * Check your PR hasn’t got any extra files made by your IDE that shouldn’t be committed, such as .vscode or DS_Store (Mac). It's a common mistake for beginners to bulk commit the entire contents of a directory including files which are not part of the project. For example, Mac puts extra hidden files inside folders when you open them in the file browser. Try not to let them clutter our code base. They make code hard to manage and in some cases can break the tool.
 
-## Unit tests and code stability
+### Unit tests and code stability
 
 Harmony uses the [pytest](http://doc.pytest.org/) framework for testing. For more info on this, see the [pytest documentation](http://docs.pytest.org/en/latest/contents.html). To be interpreted and run, all test files and test functions need to be prefixed with `test_`.
 
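Editor's illustration (not part of the commit): a minimal sketch of the `test_` naming convention that the pytest paragraph above describes. The file name `tests/test_example.py` and the helper `normalise_question` are hypothetical, invented only for this example; they are not part of the Harmony code base.

```python
# tests/test_example.py
# Hypothetical file name: pytest collects files whose names start with test_.


def normalise_question(text: str) -> str:
    """Hypothetical helper (not part of Harmony): collapse whitespace and lower-case."""
    return " ".join(text.split()).lower()


def test_normalise_question_collapses_whitespace():
    # pytest discovers this function because its name starts with test_.
    assert normalise_question("  Feeling   nervous ") == "feeling nervous"


def test_normalise_question_handles_empty_string():
    # An empty input should stay empty rather than raise an error.
    assert normalise_question("") == ""
```

Running `pytest` from the repository root should collect both functions; a function in the same file without the `test_` prefix, such as `normalise_question`, is treated as ordinary code and is not run as a test.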
@@ -103,10 +119,15 @@ Finally, the app repo [https://github.com/harmonydata/app](https://github.com/ha
 
 
 ### Workflow for contributing
+
 The preferred workflow for contributing to Harmony’s repository is to fork the repository that you’re working on, such as [the Python library](https://github.com/harmonydata/harmony/), on GitHub, clone, and develop on a new branch.
+
 When you’re done, please run all unit tests and you can submit your changes back to the main project as a pull request. If you have worked on the core Python library, please also test your changes in the context of the Python API.
+
 ### Process of forking and making a pull request
+
 If you are able to fix an issue, please feel free to submit your code back to the project by [making a pull request](https://github.blog/developer-skills/github-education/beginners-guide-to-github-merging-a-pull-request) (PR) but if you don't know how to do that, don't worry! You can always send us your work on Discord or by email. Here's a brief overview of the steps for making a pull request.
+
 1. Fork the [main project repository](https://github.com/harmonydata/harmony) by clicking on the ‘Fork’ button near the top right of the page. This creates a copy of the code under your GitHub user account. For more details on how to fork a repository see [this guide](https://help.github.com/articles/fork-a-repo/).
 2. [Clone](https://docs.github.com/en/github/creating-cloning-and-archiving-repositories/cloning-a-repository) your fork of the Harmony repo from your GitHub account to your local disk:
 ```
@@ -145,32 +166,32 @@ git checkout -b <feature-branch>
 Always use a feature branch. It’s good practice to never work on the main branch! Name the feature branch after your contribution.
 
 7. Develop your contribution on your feature branch. Add changed files using git add and then git commit files to record your changes in Git:
-
-```
-git add <modified_files>
-git commit
-```
-
-8. When finished, push the changes to your GitHub account with:
-
-```
-git push --set-upstream origin my-feature-branch
-```
-
+`git add <modified_files>`;
+`git commit`
+8. When finished, push the changes to your GitHub account with: `git push --set-upstream origin my-feature-branch`
 9. Follow [these instructions](https://help.github.com/articles/creating-a-pull-request-from-a-fork) to create a pull request from your fork. If your work is still work in progress, open a draft pull request.
 
 ### Making a good pull request
+
 We recommend to open a pull request early, so that other contributors become aware of your work and can give you feedback early on.
-Please make your pull requests atomic. That is, please try to fix only one issue per pull request. If your pull request addresses issues A, B and C, it is very hard for moderators to merge it.
+
+Please make your pull requests atomic. That is, please try to fix only one issue per pull request. If your pull request addresses three separate issues, it is very hard for moderators to merge it.
+
 Please don’t make huge changes, such as adding many third party dependencies to requirements.txt, as this can quickly make the project bloated and we would ideally discuss alternatives before any more dependencies are added.
+
 If you introduce a new feature, please can you document it, for example by making a script example in the [script examples repository](https://github.com/harmonydata/harmony_examples) so that people will know how to use it.
+
 ### Help on Git and pull requests
 If any of the above seems like magic to you, look up the [Git documentation](https://gitscm.com/documentation). If you get stuck, chat with us on [Discord](https://discord.gg/harmonydata), or contact us at [harmonydata.ac.uk](https://harmonydata.ac.uk/contact).
 
 ### Troubleshooting the code base
+
 #### Errors from third party libraries when you try to run Harmony
+
 When you first try running the code, you may encounter some errors. This is often because a 3rd party package such as Numpy, Pandas, Lxml or Huggingface has updated itself and broken a dependency somewhere. It’s a good idea to Google for the error and check if you can fix it with a simple change in version to the package that’s causing the issue.
-### Troubleshooting the API repo submodules after git clone
+
+#### Troubleshooting the API repository after git clone
+
 After you have cloned the repository at https://github.com/harmonydata/harmonyapi, if the folder inside called `harmony` is empty, or at any point you get an error like the below, please check you have cloned with `--recurse-submodules` as below:
 {{< image src="/images/error_no_submodules.png" title="Error for missing submodules" >}}
 
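Editor's illustration (not part of the commit): a rough sketch of how the empty-`harmony`-folder check described in the last hunk could be automated. The function names are hypothetical. The folder name `harmony` comes from the text above, while the suggested `git submodule update --init --recursive` command is standard Git usage assumed here, not something quoted from the guide.

```python
from pathlib import Path


def submodule_is_populated(repo_dir: str, submodule: str = "harmony") -> bool:
    """Return True if the submodule folder exists and contains at least one entry."""
    folder = Path(repo_dir) / submodule
    return folder.is_dir() and any(folder.iterdir())


def suggest_fix(repo_dir: str) -> str:
    """Return a short hint on what to do about the submodule (hypothetical helper)."""
    if submodule_is_populated(repo_dir):
        return "harmony submodule looks populated"
    # Assumption: standard Git command for fetching submodules after a plain clone.
    return "run: git submodule update --init --recursive"


if __name__ == "__main__":
    # Check the current directory, assumed to be a local clone of harmonyapi.
    print(suggest_fix("."))
```

An empty folder usually means the repository was cloned without `--recurse-submodules`, which is the situation the guide warns about.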