Optimise module to filename#4600
Conversation
ab06e51 to
a11ff5b
Compare
a11ff5b to
2d8386c
Compare
|
I'm running this code in production on our large codebase for the 3rd day. a) My users are super happy, a few quotes: (it was more than 5 minutes before ;) Ok, that's all for the shameless plug ;) I think I observed a few reload / performance issues which are not related and would definitely do some followup issues. I'm investigating (this is difficult to describe right now. Looks like HLS is not able to give diagnostic back if it is typechecking a file which depends on the file I just modified and if this typechecking takes time). I also observed a surprising behavior, I need to investigate if that's linked to my MR or to something else:
It looks like a multi-home issue, see. #3237, which would make sense considering that I had done nothing to fix multiple home. But this is still rather surprising because HLS was supposed to restart and hence should have completely ignored the old selda version. Maybe it picked some information from the cache. |
We handle FOI(File of interest) and none-FOI differently. |
@soulomoon Thank you. This is indeed interesting. I will confirm that the bug I'm observing is not linked to my change and open an issue. This is not a dramatic issue (e.g. it is not everyday that you restart the editor with breaking changes in the packages from the package database), but it is not convenient and can be confusing. |
|
I've found a bug, when looking for module in the import path, an exception is raised if the import path is empty. |
2d8386c to
fdf3085
Compare
Fixed in latest commit. |
fdf3085 to
c856c6a
Compare
|
Improved the support for currently edit files but not yet saved to the disk. The implementation will be improved for performance but that's a good first draft. This fixs a lot of unit test, so it is good. I do have a few (around 20) tests faliling in ghcide-tests, but otherwise, most of the tests are ok. Still todo:
|
|
Multiple-home-unit seems to be fixed now (maybe I've not pushed the MR), but the tests are fine. I have 3 failing tests related to I'll work on removing the |
c856c6a to
b20c634
Compare
|
Latest push should fix hlint and actually remove the I have 3 (2 are flaky) remaining issues with |
98fc906 to
2270264
Compare
|
I will need a bit of help here. I'm blocked with the following errors locally: Note that out of the 3 tests related to In the context of this test, My understanding here is limited and especially, it is difficult for me to understand why my changes may have any impact on these tests. In such context, it is tempting to think that there is another problem such as a race condition / timing. During my investigation, I found the code in pPrint ("sessionOpts", cfp, v)
case HM.lookup (toNormalizedFilePath' cfp) v of
Just (opts, old_di) -> do
deps_ok <- checkDependencyInfo old_di
pPrint ("deps_ok", deps_ok, old_di)
if not deps_ok
then do
-- If the dependencies are out of date then clear both caches and start
-- again.
modifyVar_ fileToFlags (const (return Map.empty))
modifyVar_ filesMap (const (return HM.empty))
-- Keep the same name cache
modifyVar_ hscEnvs (return . Map.adjust (const []) hieYaml )
pPrint ("consultCradle not deps_ok", cfp)
consultCradle hieYaml cfp
else return (opts, Map.keys old_di)
Nothing -> do
pPrint ("consultCradle Nothing", cfp)
consultCradle hieYaml cfp(the I observed that the However, It seems that The result of this is that when calling And For each targetfile, we will create a I'll continue this investigation later. |
2270264 to
cce8f01
Compare
cce8f01 to
f817d89
Compare
f817d89 to
878bfdb
Compare
|
I've rebased and rerun the CI. Latest run was showing only a segfault with GHC 9.10 on window and one failure in the call hierarchy plugin on window with ghc 9.6, which may hence be a flakyness. The latest commit is a tentative fix for the issue described in #4600 (comment) and that I tried to describe and extract in #4754 (And really, I should answer the poor @soulomoon who tried to help me). In order to be in a mergable state, I need to:
|
| -- and also not find 'TargetModule Foo'. | ||
| fs <- filterM (IO.doesFileExist . fromNormalizedFilePath) targetLocations | ||
| pure $ map (\fp -> (TargetFile fp, Set.singleton fp)) (nubOrd (f:fs)) | ||
| pure $ do |
There was a problem hiding this comment.
@fendor, this is the offending change.
When extendKnownTargets is called with files which are not yet available on disk, the TargetFile f contains a file (either .hs or .hs-boot) and targetLocations contains a list of files and it seems that it always contains the .hs and .hs-boot.
However, these file do not exist on the disk, so the filterM always filter them out.
This "fix" allows the file to be added.
The bug is certainly NOT here, see #4754 for more details.
My current understanding is that there is a caching process which, when a .hs or .hs-boot file is registered, it calls extendKnownTargets with both .hs and .hs-boot files listed in targetLocations, but only the .hs or .hs-boot in TargetFile f.
However, it seems that there is some sort of caching which once a .hs-boot file or .hs file is registered, it blocks registration for the other file.
It means that depending on the registration order, we may end up with only one or the other file in the knownTarget.
The result is that during module name to filename association (the target of this MR), we won't be able to find either the .hs or the .hs-boot file, because we look into the knownTarget map for them.
Note that I don't understand why it was not failing before, because the previous implementation was actually looking in the knownTarget too, and on the filesystem. But for files not on the filesystem (as it is the case for tests and newly created files), it seems that the previous implementation was behaving the same.
My guess is that, because it seems that it is a race, my change had an impact on the way extendKnownTargets is called concurrently and as a result the problem appears more often. But I had not been able to confirm that the issue was there before my MR.
There was a problem hiding this comment.
If we want to extendKnownTargets, we should do so in the restart thread, see Note [Serializing runs in separate thread] . Otherwise we risk of destablizing the build system.
For the .hs or .hs-boot mess, one of the possible solution, we can update the sessionLoader to invalidate related cache and consult the cradle again if the missing .hs or .hs-boot is added.
take a look at |
878bfdb to
1c7b787
Compare
980890f to
15c312d
Compare
|
I think I had cleaned the module map logic initialisation: a) Now only computed once, when a new file is added / removed. The performances are back and it also handle if a new file is added to the |
0bdc5c8 to
550ba22
Compare
550ba22 to
f3c3a35
Compare
f3c3a35 to
e03e03c
Compare
|
My latest commit broke everything, sorry, this was a mistake. I've removed it from the patch stack. |
e03e03c to
01ecdba
Compare
5bb451c to
3a5eef0
Compare
3a5eef0 to
bf73604
Compare
|
I'm doing another tentative at finishing this. Many things are unclear for me. @soulomoon, you suggested to hook/have a look at However, I do not, instead, I'm reading And I'm doing that exactly at the some location as before in In my previous writings, here and in #4754, I claimed that it could be a race condition in the way However, I've today found another experience: It usually fails after a few iterations: However, with the folliwng change, adding a That's interesting, because it shows that: a) The issue is clearly a race condition, adding 200ms of thread delay seems to fix the problem (e.g. it usually fails in less than 5 iterations, and this addition get no failure in 157 iteration). I'll try to understand if the race condition I clearly reproduced here is linked to a mistake on my side (highly probable) OR if it was already existing, but the previous code was robust to it. As I described in #4754, I'm convinced that there is a race condition which depending on scheduling, the content of Stop. I've got it. 335 : let getTargetFor modName nfp
336 : | Just (TargetFile nfp') <- HM.lookupKey (TargetFile nfp) targets = do
337 : -- reuse the existing NormalizedFilePath in order to maximize sharing
338 : itExists <- getFileExists nfp'
339 : return $ if itExists then Just nfp' else Nothing
340 : | Just tt <- HM.lookup (TargetModule modName) targets = do
341 : -- reuse the existing NormalizedFilePath in order to maximize sharing
342 : let nfp' = fromMaybe nfp $ HashSet.lookupElement nfp tt
343 : itExists <- getFileExists nfp'
344 : return $ if itExists then Just nfp' else Nothing
345 : | otherwise = do
346 : itExists <- getFileExists nfp
347 344: return $ if itExists then Just nfp else NothingThe previous code was calling So: a) I think I can confirm that there is an issue in the way |
|
Actually, I'm wondering if instead of using |
842248a to
3f3f788
Compare
|
Tentative with using the This is not a version ready for review yet (e.g. the merge between the module cache and the VFS is done on every lookup for module, so that's still a process |
3f3f788 to
b8cbb03
Compare
9314197 to
58113c6
Compare
|
The ugly hack which uses the merge of VFS and known target content seems to work (tests 100% passes locally, and were just close to 100% green also in CI) I've now pushed a change which move the costly module -> filepath extension from known_targets / vfs to the cached location. Hopefully CI will be green tomorrow (that's already too late for me). Next steps:
|
|
The latest push is there check that CI is still OK. if green, I'll: a) Squash everything |
29533e8 to
35f5f6b
Compare
|
I've cleaned the MR and I'm ready for review. See the commit message for guidance about the different changes. I'll also add two inline comments in the review to ask two questions related to: a) Cache invalidation |
| ShakeExtras{moduleToPathCache} <- getShakeExtras | ||
|
|
||
| cache <- liftIO (readTVarIO moduleToPathCache) | ||
| case Map.lookup (envUnique env_eq) cache of |
There was a problem hiding this comment.
This is the cache logic which is improper.
I would like the rule GetModulesPaths to be invalidated when:
GetKnownTargetsis invalidated (for this session). It is easy to do that by moving the call toextendModuleMapWithKnownTargetsoutside of the cache lookup here, but this operation is too costly to be redone for each file of the session.- The cached value to be shared for all the file in the same session. I had not been able to implement that and this is why I've manually cached that inside
ShakeExtras
So roughly, if there is a solution to cache GetModulesPaths for all files of the session and invalidate it when GetKnownTargets changes, it could solve most of my problems.
| let pathsFromTargetsMap = HS.toList $ mconcat (HM.elems knownTargets) | ||
| let pathsFromVFS = catMaybes $ map uriToNormalizedFilePath (Map.keys vfs) | ||
|
|
||
| -- This is the list of all paths known by HLS right now. |
There was a problem hiding this comment.
This is a place which is a bit fuzzy. If you look at the original getTargetFor (Removed elsewhere in this diff), it was actually taking into account the values from knownTargets and from the VFS, so it is coherent to use both, but it seems a bit overkill.
35f5f6b to
3babef7
Compare
|
This tests: Fails in the CI, it was not failing this morning. I've broken something while cleaning up, I'm investigating, but I don't think this blocks the overall review process. |
This is related to #4598. This changes the file to module associating logic done during dependency graph building. Before, each time an import of module `Foo.Bar` is found, HLS is testing inside all the import path for the existence of a relevant file. It means that for `i` import paths and `m` modules to locate, `m * n` filesystem operations are done. Note also that this involves a lot of complex string concatenation primitive to build the `FilePath`. A module is tested for each `import` for each of the file of the project. We also test for `boot` files, doubling the number of test. In #4598 we have a project with `1100` modules, in more than 250 import paths and we count more than `17000` `import` statments, resulting on over 6 millions test for file existences. This project was blocking for more than 3 minutes during HLS startup. This commit changes the way this is computed: - A new rule `GetModulesPaths` generates a `ModuleToFilenames` map, associating module name to `NormalizedFilePath`. This rules builds a cache of all the know module -> filepath associations, based on a scan of the filesystem extended with the content of the `knownTargets` and `VFS` in `extendModuleMapWithKnownTargets`. This rule is cached in `ShakeExtras` in `moduleToPathCache`. - The lookup phase is not changed much, but instead of testing the knownTargets and VFS for all possibly paths, taking into account all possible import path, it just does an immediate lookup in the `ModuleToFilenames` maps. The performance improvement is as follows: - The number of IO operation is dramatically reduced, from multiples millions to a few recursive directories listing. - A lot of the boilerplate of converting path had be removed. - On my project, the graph building time is reduced from a few minutes to 3s. A note about performance Most users won't see the benefits of these change, but I think they apply to everbody: - We are still doing 1 lookup per `import` per module. But the lookup result is not multiples IO, so it should be faster by a large amount. - Most project only have 1 (or a few) import paths so won't benefit as dramatically as me from this. Points opens for discussions: a) In the previous implementation, a lookup in `knownTargets` was followed by a lookup in the VFS in case the value was not found in `knownTargets`. We observed in #4754 that there may be a race condition when populating `knownTargets` in the context of `-boot` files, leading to sometime missing either the `.hs-boot` or the `.hs` file. The previous implementation was still working because of the fallsback on the VFS. In this new implementation, I tried to only target the VFS, however it seems that some file are not part of the VFS, this is unclear for me yet. Extending the module maps (read from the real filesystem) by both the content of the VFS and `knownTargets` give good results. b) I'm not confident about the implementation of reexported modules, especially in multiple home unit context, but tests are greens. c) Cache invalidation. I had not fully understood the logic of the build graph and cache invalidation. The current status is that the `GetModulesPaths` rule is always rerun for each file, but the content is manually cached inside `ShakeExtras` and never invalidated. As a result, we cannot observe any change inside the import paths directories, and changes on `knownTargets` and the VFS are also ignored.
3babef7 to
8a32756
Compare
|
The regression should be fixed now. |




(only latests commit are relevant, the other are from other MR which should be merged before this work is merged. Or we can rebase)
This is related to #4598.
This changes the file to module associating logic done during dependency
graph building.
Before, each time a module
Foo.Baris found, HLS is testing inside allthe import path for the existence of a relevant fiel.. It means that for
iimport paths andmmodules to locate,m * nfilesystemoperations are done. Note also that this involves a lot of complex
string concatenation primitive to build the
FilePath.A module is tested for each
importfor each of the file of theproject. We also test for
bootfiles, doubling the number of test.In #4598 we have a project with
1100modules, in more than 250 importpaths and we count more than
17000importstatments, resulting onover 6 millions test for file existences. This project was blocking for
more than 3 minutes during HLS startup.
This commit changes the way this is computed:
Map ModuleName FilePath(the real type is a bit moreinvolved for performance, multiples unit and boot files handling) is
built by scanning all the import paths for files representing the
different modules.
haskell module, this will never do more job that listing the files of
the project.
Maplookup.The performance improvement is as follows:
millions to a few recursive directories listing.
to 3s.
Limitations:
Mapif the content of one directory change?interested, performance can be damaged. TODO: add a diagnostic during
this phase so the user can learn about this issue.
Code status:
lookupis not fully restored, especially it does not include thehandling of home unit as well as reexport.
TVarstored as a toplevel identifier using
unsafePerformIO. This is to be improved.