Pawel plesniak emuhammad/split shell fixing #888
Conversation
|
The current implementation uses a superuser to own all the processes, similar to the way in which Docker works (during development with @emmuhamm, I was the superuser). Some further development needs to happen before clean-up, review, integration tests, etc.
|
|
Reference notes
Env var Emir of 028, PM on 029, permissions failures. K8s did not help - root controller and local conn srv failed.
Changes made
Removed k8s name validation - blocks external PM code |
|
Note that as of 66c8707, there is a conflict that needs to be resolved. I've rebased (not merged) this branch on top of develop, since I feel like I'm always missing something when merging develop into this feature branch, resulting in an invalid set of changes. However, as a backup, I have saved a copy of the branch. Soon after this message is posted, I'll force push the changes after the rebase, which should work. Note that we should delete the backup branch when this PR gets merged. |
66c8707 to 0114aa4
0114aa4 to a4d174f
|
Going to need to do some git surgery. I've copied the branch to PawelPlesniakEmuhammad/SplitShellFixing-after-941 just in case something gets messed up.
Update 1: local testing works, as expected. Tomorrow I will be cleaning up the first 5 commits to pull out the 'essentials' and the 'testing' commits and make them obvious. When that is done, the branch will finally be rebased on top of develop.
Update 2: just finished cleaning up the first 5 commits; it is now clearer what they do. This should help in rebasing onto develop.
Update 3: this has now been rebased successfully onto the nightly of 22 June. |
2ba33eb to 6039f4a
Add multi host and multi user support to the process manager
6039f4a to f042c4e
|
PMaaS #888 Will need to investigate everything that has been commented out to check if it can be reintroduced |
|
TODOs before merging:
|
|
Integtests running in a |
|
Integration tests passed on
Note that the failure was associated with a stateful command timing out, and is not associated with this PR. Re-running this specific test to confirm. |
|
The This is now ready for another review |
Description
Fixes #905
This is an intermediary branch for #830. It does not address all the issues, but it provides a separate branch to allow other PRs to go in without disrupting develop too much.
This PR has had other PRs merged into it that have addressed the following issues:
#709
#910
#904
#586
#326
#937
Type of change
List of required branches from other repositories
N/A
Change log
Several changes have been included, and this PR introduces progress on the various deployment mechanisms of the run control, focusing on the following:
process-manager
process-manager through a process-manager-shell
unified-shell connected to the standalone process-manager
Note - this development is fully focused on the ssh implementation of the process manager. This implementation has not yet been tested on the k8s process manager, for which a separate issue has been listed.
Suggested manual testing checklist
One can now deploy a standalone process manager and run sessions from it. Note that in this case, the user who owns (in Linux terms, the user who started) the drunc-process-manager process must have rwx access to the DBT_AREA_ROOT from which the process manager client is being run. If this access is not available, drunc will report this and block the boot.
To start a standalone process manager, use
Note - the 50000 is the port number through which the process manager will communicate. This does not have to be
You can connect to it using a process manager shell (in a separate terminal!) as
Note - <HOSTNAME> refers to the host on which the process manager is running, and can be given as localhost. One can then start a session as e.g. (from within the drunc-process-manager-shell)
From here you will be able to execute the session stateful commands by connecting to the segment's top controller to take a run.
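Before booting a session, the two preconditions described above (rwx access to the DBT_AREA_ROOT and a process manager listening on the chosen port) can be checked by hand. The following is a minimal sketch only, not part of drunc: the helper names has_rwx and port_reachable are hypothetical, reading DBT_AREA_ROOT from the environment is an assumption, and the host and port are the localhost and 50000 values used in this checklist. The access check is only meaningful when run as the user who owns the drunc-process-manager process, and the port check only confirms that something accepts TCP connections, not that it is the process manager.

```python
import os
import socket
from pathlib import Path


def has_rwx(path: str) -> bool:
    """Return True if the current user can read, write and traverse `path`."""
    # Hypothetical helper: drunc performs its own check and reports failures itself.
    return Path(path).is_dir() and os.access(path, os.R_OK | os.W_OK | os.X_OK)


def port_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    # Hypothetical helper: a plain TCP probe, it does not talk to the service.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumption: the work area is exposed through the DBT_AREA_ROOT variable.
    area = os.environ.get("DBT_AREA_ROOT", "")
    print(f"rwx access to DBT_AREA_ROOT ({area!r}): {has_rwx(area)}")
    # Assumption: the process manager was started on localhost with port 50000.
    print(f"localhost:50000 reachable: {port_reachable('localhost', 50000)}")
```

If either check reports False, the manual test steps in this section are unlikely to get as far as booting a session.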
One can also run a session through the unified shell (in a separate terminal!) using this process manager as
This can also be done cross-host, e.g. if the process manager is running on np04-srv-019 and a *local* session is run from np04-srv-028, the session applications will start on 028.
Developer checklist
Prior to marking this as "Ready for Review"
Tests run on: np04-srv-028 from release NFD_DEV_260629_A9
Unit tests - some tests can't be run on the CI. This is documented. If this PR checks a feature that can't be tested with CI, this has been marked appropriately.
Integration tests - the daqsystemtest_integtest_bundle requires a lot of resources, and connections to the EHN1 infrastructure. Check the cross-referenced list if you can't run these. The developer needs to run at least the .
(pytest --marker) passed
daqsystemtest_integtest_bundle.sh -k minimal_system_quick_test.py
daqsystemtest_integtest_bundle.sh
Final checklist prior to marking this as "Ready for Review"
Reviewer checklist
drunc are in the log files
drunc failure appears:
Once the features are validated and both the unit and integration tests pass, the PR is ready to be merged.
Choose one of the following and complete all substeps
Prior to merging
Once completed, the reviewer can merge the PR.
Notification message for a Slack channel
Note - this should be sent to #dunedaq-integration for general workflow changes outside a release candidate period, and to #daq-release-prep otherwise.
For a single merge that changes the user workflow
For a co-ordinated merge