Infinite loop when saving org file #177
Comments
I’d love to fix the issue. Two questions:
I suspect there is something specific to Doom Emacs. I do not use it and I don’t know how to use it |
Also do you use the latest commit of org-transclusion? I can see you use the main branch but I do not see which commit. |
I tried doom emacs. I am not sure if this is helpful, but I cannot reproduce the issue. Sorry, it's really hard for me to understand what is going on on your end. |
I've run into this a couple of times myself. I don't use doom, so I don't think it's exclusive to that. I don't know exactly what caused it, but I suspect it had something to do with undo. (Or maybe undo-fu in particular). If I can find a way to trigger this consistently I'll let you know. |
Thank you. I really struggle with this error; I'd really appreciate it if we can find a reproducible procedure. I don't use I have encountered the error when I was developing features. In my experience, it was caused by the incorrect way transclusion is removed and the original |
Hi @nobiot, sorry I did not try to dig more into this issue. I would need to set up a blank emacs with the faulty file I've been using. If I can, I'll for sure try to improve this issue by identifying the reproduction steps. I wanted to let you know of the issue, which is for now too problematic for me to adopt transclusion in my org production (as I almost have to kill emacs to get out of the infinite loop). I'd love to see this package included in the doom distribution; it's a great community which could help you get lots of traction (they are on Discord to help). To answer your previous question: I was on main 'up to date' 2d0502f. |
@nobiot Could you point me to where in the code (or to the commit in which) you fixed this for the occasions you know about? I'm thinking that, to debug this, we could throw in a check that sees if the buffer size is growing since save was triggered and, if so, throw an error so we can debug-on-error instead of freezing emacs. |
So looking a little closer this should be simple to do. We already have |
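The buffer-growth check suggested above could look roughly like the sketch below. All names prefixed with `demo/` are hypothetical and are not part of org-transclusion; the growth factor of 2 is an arbitrary threshold chosen for illustration. It only shows the idea of recording the buffer size when a save starts and signalling an error once the buffer has grown suspiciously.

```elisp
;; Hypothetical helpers for illustration; not part of org-transclusion.
(defvar-local demo/size-at-save-start nil
  "Buffer size recorded when a save begins.")

(defun demo/record-size ()
  "Remember the current buffer size."
  (setq demo/size-at-save-start (buffer-size)))

(defun demo/check-growth (&optional factor)
  "Signal an error if the buffer exceeds FACTOR times its recorded size.
FACTOR defaults to 2, an arbitrary threshold for this sketch."
  (let* ((start (or demo/size-at-save-start 0))
         (limit (* (or factor 2) (max 1 start))))
    (when (> (buffer-size) limit)
      (error "Buffer grew from %d to %d chars during save; aborting"
             start (buffer-size)))))

(defun demo/run-growth-check ()
  "Simulate a runaway insertion loop; return `aborted' if the guard fires."
  (with-temp-buffer
    (insert "#+transclude: [[file:a.org]]\n")
    (demo/record-size)
    (condition-case nil
        (progn
          (dotimes (_ 1000)
            (insert "#+transclude: [[file:a.org]]\n")
            (demo/check-growth))
          'finished)
      (error 'aborted))))
```

In real use the error would be left to propagate rather than caught, so that `debug-on-error` can produce a backtrace instead of Emacs hanging.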
Reported in GitHub issues #109 #177. I cannot reproduce the issue myself so far. I am putting in place (1) a small preventive measure and (2) a heuristic to detect and break the infinite loop on save-buffer.

(1) Org-transclusion (OT) tries not to save the transcluded buffer content and instead saves only the #+transclude keyword line to the file. To achieve this, OT uses 'before-save-hook' and 'after-save-hook' to remove all the transclusions and then add them all back. This operation relies on the point value returned by the 'org-transclusion-remove' function. In this commit, the point (integer) is changed to a marker. This way, any arbitrary buffer change between the remove-all and add-all processes has less impact on the points of reference: markers automatically move to adapt to the new buffer state. I suspect something like 'whitespace-cleanup' put in 'before-save-hook' might dislocate the positions in some situations. This preventive measure will hopefully preempt the issue.

(2) The heuristic is simple but should work if an unexpected number of loop iterations happens. Since it simply compares the length of a list, and 'dolist' loops over the same list, logically this should be redundant; however, since the infinite loop itself is an anomaly to me, this heuristic might catch the issue and break the loop.

As you can see, both attempts are not based on causal analysis but are rather "stabbing in the dark" heuristics.
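To illustrate why the commit message prefers a marker over an integer position, here is a small self-contained sketch. It is not code from org-transclusion; `demo/position-vs-marker` is a hypothetical name, and the inserted "extra text" merely stands in for any unrelated buffer change that happens between removing and re-adding transclusions.

```elisp
(defun demo/position-vs-marker ()
  "Return (INTEGER-CHAR MARKER-CHAR) after text is inserted before both.
INTEGER-CHAR is the character found at the saved integer position and
MARKER-CHAR the character found at the marker."
  (with-temp-buffer
    (insert "line one\n#+transclude: [[file:a.org]]\n")
    (goto-char (point-min))
    (search-forward "#+transclude")
    (let* ((pos (match-beginning 0))   ; plain integer position
           (mkr (copy-marker pos)))    ; marker at the same place
      ;; Simulate an unrelated edit earlier in the buffer, such as
      ;; something run from `before-save-hook'.
      (goto-char (point-min))
      (insert "extra text\n")
      (list (char-after pos) (char-after mkr)))))
```

After the edit, the marker still sits on the "#" of the keyword line, while the saved integer now refers to a character inside the newly inserted text.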
In order to contain the issue of infinite loop, I have pushed commit 43c478c. I'd appreciate hearing from anyone who bumps into the error message: "org-transclusion: Aborting. You may be in an infinite loop". Commit message: fix: heuristics to identify & break infinite loop on save. Reported in GitHub issues #109 #177. |
Hi nobiot, thanks for the great package! I'm also running into this issue on 1.3.2, likely because I use evil-mode and have set: Right now it's unfortunately unusable for me, as it repeats the links infinitely and I can't break out other than killing the emacs session. How can I try out this fix to see if it helps? |
@japhir Thank you. I don't use evil mode and I can't reproduce the issue any longer. The key is to reproduce the issue reliably so that we can analyze the code that causes the infinite loop. What I can suggest is:
I have begun to suspect there may be some inconsistency with evil mode... but I can't use vim keybindings so I am no use here. |
@nobiot I've observed this behavior a few times, and I'm not using evil. It seems to trigger when undoing just the right amount in the org file with the transclusions and then saving. I'll see if I can reproduce with emacs -Q. |
@devcarbon-com Thank you. I see, undoing never occurred to me. I am really curious to see if anyone can repro reliably with emacs -Q... Have you seen the infinite loop after commit 43c478c I mentioned above? It was merged in May 2023. |
It was after that, only a couple months ago, but I'm not entirely sure if my version was current at the time. |
Okay, I just reproduced this bug. Emacs version:
org-transclusion version: org file:
code.el:
(defun bar ()
  (interactive)
  (message "hello"))
Steps to reproduce:
|
(edited to be accurate) |
@devcarbon-com Thank you for the steps. It's the first time I have seen concrete steps to repro. I feel one step closer to knowing what's really happening. Now I have tried following the steps with
(add-to-list 'load-path "~/.config/emacs/elpa/org-transclusion-1.3.2.0.20230819.63913")
(load-library "org-transclusion")
(define-key global-map (kbd "<f12>") #'org-transclusion-add)
(define-key global-map (kbd "C-z") #'undo)
infinite-loop.org is identical to yours.
I cannot exactly follow your steps, so I tried this: In the buffer
I cannot reproduce the infinite loop.
|
You can do these steps only when you have customized |
Ah, yes, I see where I was not clear now. Following your steps, I get the exact same behavior as you do, and do not get an infinite loop. The key difference seems to be using a file that is already written, vs. writing one from scratch. Perhaps also the timing of activating org-transclusion-mode. Undo amalgamation may also be a factor. At first I could not get consistent results, until I added an 'intentional mistake'.
|
Note that saving the buffer in step 10 seems to also be a key step. If you leave this out, it works without issue (no loop, and the transclusion is removed on undo). |
@devcarbon-com Thank you for the detail! I can reproduce the infinite loop now. I need to spend some time to get my head around it, though. It's a great leap forward! |
Omg, devcarbon-com managed to reproduce exactly what I was doing at the time that I encountered the crash, but I just didn't recall it anymore! Short description of what I was doing at the time: I was working on some data analysis in R, using org-mode and org-babel. At some point I started to accumulate a few too many functions, so I tangled the functions to separate files in the R subdirectory, so I could make a package out of it. After that, I wanted to work in those R files directly to make debugging and potential duplication errors easier to deal with. It was the first time in a long time using org-transclusion, so I typed out the new I would not have been able to make a reproducible example but your description triggered my memory 😄 |
FYI The workaround is this:
This workaround is motivated by the observation that the infinite loop issue happens mostly because of the tiny misspelling by missing the colon ":". The hope is to minimize the experience of infinite loop. -- I realize it's not fixing the root cause. I have gone back to it but now I can only intermittently reproduce the problem (but more reliably than before, but not always). When I see the infinite loop happening, I am not able to determine the root cause... It would be great if anyone out there has experience in fixing this type of issue in Emacs and can help us. |
@nobiot Yes that is my understanding - buffer-undo-list corruption is simply one way in which text-properties regarding But there may be N number of ways in which the text-property may get corrupted - and for a permanent fix - we should make Maybe even try to recreate the However that property may get corrupted. |
I tried reproducing the issue by let-binding
@akashpal-21 I was not ever able to reproduce the issue. If it would still be helpful for me to report on these values, would you please send an Elisp snippet for me to run in |
@josephmturner Hmm that's good to hear - I didn't know how to emulate GC circumstances. To get the values we can debug the For example I cannot reproduce the bug myself - but I can report on partial corruption of one property -- that I talked about earlier that does not result in the infinite loop but causes corruption still. For infinite loop the beg and end of the overlay should equate - this causes the Please allow me a minute to attach a screen record to show the partial corruption still - I cannot reproduce the full corruption myself and therefore get the infinite loop - |
@josephmturner Just when I told you I cannot replicate it, I fell into it. Now I forgot how to exit when such a situation arises. I cannot quit emacs; it won't let me quit in any way. Allow me a minute to recover. |
untitled.mp4 Partial corruption case as noted earlier - I cannot now replicate the infinite loop since |
Ok recreated it again !! Lmao - Please see the video untitled.mp4 |
Thank you @akashpal-21! I can now reproduce the issue. I was able to stop the hung Emacs with
Furthermore, I was able to go back to the transclusion buffer, put point on the
|
Resolves nobiot#177 by making `org-transclusion-add` and `org-transclusion-remove` not affect the buffer undo history.
Here's a recipe for reproducing the infinite loop on the current In
Then in the org-mode buffer which appears containing the transcluded word "foobar", repeat the following steps until Emacs hangs:
On my machine, this reliably reproduces the infinite loop after repeating these steps a few times. For some reason I don't yet understand, I wasn't able to reproduce the loop with non-interactive calls to |
I thought not modifying the buffer-undo-list when adding/removing transclusions would solve the problem, but I was wrong. Please ignore the accidental commit above which claims to resolve this issue. |
I have tested different Emacs versions by compiling them from source: 29.1.90, 29.1, 29.2, 29.3. I have not been able to reliably reproduce the infinite loop with @devcarbon-com's procedure -- I used to be able to, but no longer. I have managed to make it happen a couple of times (I cannot record the exact steps) with 29.1 and 29.3. Based on @akashpal-21's analysis, I have pushed a new branch and commit to force the infinite loop to occur -- I still do not know exactly when we will arrive at this condition in real use of transclusions, but I think the branch can be used as a test harness to craft a preventive measure and test it. I also modified @josephmturner's automation code as below to be able to reproduce the infinite loop easily with a command. Instruction:
Infinite loop starts as soon as you save the buffer. Stop it immediately with
(defun test/infinite-loop ()
  (interactive)
  (let ((code-file (make-temp-file "org-transclusion-test-code" nil ".el"))
        (org-file (make-temp-file "org-transclusion-test-org" nil ".org")))
    ;; *Change to location on your machine where org-transclusion is installed.*
    (add-to-list 'load-path "~/src/org-transclusion/")
    (load-library "org-transclusion")
    (with-temp-file code-file
      (insert "(defun bar ()\n")
      (insert " (interactive)\n")
      (insert " (message \"hello\"))"))
    (with-current-buffer (find-file org-file)
      ;; Inhibit read-only so that we can easily remove the problem for
      ;; testing purposes.
      (setq-local inhibit-read-only t)
      (org-transclusion-mode +1)
      (insert "* OT test\n")
      ;; Colon after #+transclude intentionally omitted
      (insert
       (format "#+transclude: [[./%s::bar][bar]] :thing-at-point defun :src elisp"
               (file-relative-name code-file)))
      (org-transclusion-add)))) |
@josephmturner, I posted the comment above before noticing your latest comment about recipe for repro. I will come back to it later. Thank you. |
I also think so - particularly because org-transclusion can never generate this result -- it is impossible for a null file to be transcluded
The problem is inherited from the environment org-transclusion runs in. It does not occur under normal operation but only under an exceedingly rare alignment of circumstances. The real problem isn't that the bug exists, but that when it is reached, in perhaps 1 in 100000 cases, the result is a catastrophe: the user is imprisoned until they figure out a way to exit. The remove protocol should refuse to entertain the impossible case of a 0-length overlay. It doesn't even need to try to rectify errors, but it should give the user the choice to manually delete the overlays and be allowed to save and quit. |
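The "refuse the 0-length case" idea above can be sketched as a simple guard. The `demo/` names below are hypothetical and not part of org-transclusion. Deleting the text between two markers is used here only as one easy way to make them coincide; the thread does not establish that this is how the collapse happens in practice.

```elisp
(defun demo/region-removable-p (beg-mkr end-mkr)
  "Return non-nil if BEG-MKR and END-MKR delimit a non-empty region."
  (let ((beg (marker-position beg-mkr))
        (end (marker-position end-mkr)))
    (and beg end (< beg end))))

(defun demo/collapsed-markers ()
  "Return (BEFORE AFTER): the guard's result before and after collapse."
  (with-temp-buffer
    (insert "head BODY tail")
    (let* ((beg (copy-marker 6))    ; just before "BODY"
           (end (copy-marker 10))   ; just after "BODY"
           (before (demo/region-removable-p beg end)))
      ;; Deleting the text between the markers leaves both at position 6.
      (delete-region 6 10)
      (list before (demo/region-removable-p beg end)))))
```

A caller could consult such a predicate before removing anything and, when it returns nil, stop with a message instead of looping, leaving the user free to clean up manually and save.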
Collective analysis efforts have found that the infinite loop occurs when the variables `beg` and `end` have an identical value in `org-transclusion-remove`. They are bound at the top of the function like this:
(beg (marker-position (get-char-property (point) 'org-transclusion-beg-mkr)))
(end (marker-position (get-char-property (point) 'org-transclusion-end-mkr)))
These values are used to determine the size of the transclusion in order to remove it. The size of the overlay on the source cannot be used because a filter can alter the text size. `text-property-search-forward` can be a reliable way to do this.
The message "(org-transclusion) Something is off" is still a WIP version. I think we need to come up with a better message if the method above is deemed viable as a preventive measure.
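As a rough illustration of measuring a region from text properties rather than from markers, the sketch below uses `text-property-search-forward` on a made-up property. The property name `demo-id` and the `demo/` functions are assumptions for this example only; they are not the properties or functions org-transclusion uses, and this is not the code from the commit.

```elisp
(require 'text-property-search)

(defun demo/find-tagged-region (id)
  "Return (BEG . END) of the first text whose `demo-id' property equals ID.
Return nil if there is no such text or the region is empty."
  (save-excursion
    (goto-char (point-min))
    ;; A PREDICATE of t means the property value must be `equal' to ID.
    (let ((match (text-property-search-forward 'demo-id id t)))
      (when match
        (let ((beg (prop-match-beginning match))
              (end (prop-match-end match)))
          (and (< beg end) (cons beg end)))))))

(defun demo/tagged-region-example ()
  "Return the regions found for a present and an absent id."
  (with-temp-buffer
    (insert "head ")
    (insert (propertize "BODY" 'demo-id "abc"))
    (insert " tail")
    (list (demo/find-tagged-region "abc")
          (demo/find-tagged-region "missing"))))
```

Because the extent is read from the text itself, it does not depend on two separately stored markers still being distinct.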
My first attempt at a preventive measure. The reproduce recipe from @josephmturner works on my end but only intermittently. So I used 7ad7936 to force infinite loop and to test the preventive measure. It seems to work with multiple transclusions. |
@nobiot It is always a pleasure to see you come up with solutions, I introduced the change to the |
@nobiot Your
We could use Thank you! |
I have managed to reproduce the infinite loop 100% of the time on my end -- code in the "details" toggle below. The theory of what happens is this, and the code supports the theory. [Fact / design of
[Now what happens]
Well, my description in human language may not be precise and accurate, but based on this theory, I have come up with the code. Now I can reproduce the infinite loop condition in 100% of my attempts. If this theory is confirmed, I think I need to work on making the current "workaround" the way forward, removing the current use of markers. Let me know how you go... To use the code provided below, adjust the src location, evaluate the snippet, and call the command
(defun test/repro-loop ()
  (interactive)
  (let ((source-file (make-temp-file "org-transclusion-test-source"))
        (org-file (make-temp-file "org-transclusion-test-org" nil ".org")))
    ;; *Change to location on your machine where org-transclusion is installed.*
    (add-to-list 'load-path "~/src/org-transclusion/")
    (load-library "org-transclusion")
    (with-temp-file source-file (insert "foobar"))
    (pop-to-buffer (find-file org-file))
    (insert (format "#+transclude: [[./%s]]" (file-relative-name source-file)))
    (org-transclusion-add)
    (save-buffer)))
(defun test/force-gc ()
  (interactive)
  (undo)
  ;; This save-buffer is the key. If you comment it out, the infinite loop won't
  ;; happen.
  (save-buffer)
  ;; Garbage collect is also the key
  (garbage-collect)
  (undo-redo)
  (describe-text-properties 1))
(defun test/combine ()
  (interactive)
  (test/repro-loop)
  ;; `undo-boundary' is necessary to get undo to work through calling the test
  ;; functions in Elisp.
  (undo-boundary)
  (test/force-gc)) |
My opps at silicon valley have hidden the comment but feels good to know my intuition was correct We finally have a confirmed fix to this issue. Both by simulating the conditions in a deterministic manner and solving it using a more general solution -- rather than specific. |
Can you reproduce the problem on your end with the code, too? |
Me too.
This is very clear. Thank you, @nobiot! IIUC, It would be nice if we could mark certain markers so that GC doesn't collect them. Like specifying "weakness" in |
Thank you all -- just letting you know that it looks like I won't have much capacity until mid-late June -- I will try to come back earlier to this work here, but I cannot promise. I wanted to let you know before I "disappear". I will try not to go completely radio silent but I may. Talk to you soon =) |
Reporting a related issue: duplication of the To reproduce: Introduce a transclusion: Result: Software Versions: This is new behavior, but I do not know from exactly when. Unfortunately, this makes org-transclusion incompatible with my workflow, though I do endorse the project overall. Thank you in advance for your attention to this matter. |
@jpt4 interesting, but this should be unrelated to this issue - can you help me diagnose this issue further? It would be better if you open a new issue and post the full stack - that way we can understand what is causing the error. (toggle-debug-on-error) |
@akashpal-21, thank you, have done so: #257 |
Just to be sure, the original issue of infinite loop is still to be fixed. We have a reproduction case, and I am on it. It's just I haven't got round to it yet, but I will. I'll keep this issue open. |
For future reference, these couple of threads on the Emacs mailing list and bug tracker may be relevant: https://yhetil.org/emacs-devel/87ttcfncub.fsf@gmail.com/T/#t From these threads, I learned that Emacs |
What?
It seems org-transclusion has a bug when saving file and rendering. When the file is saved for the first time on a freshly launched emacs : no issue.
Then when I try to save the file a second time (without modification): emacs goes into an infinite loop and becomes unresponsive.
I managed to make it stop by using pkill --signal USR2 emacs and got a backtrace of what it was doing (see screenshot). The backtrace seems to show an interaction with org-element; org-transclusion was writing #+transclude: an infinite number of times (see the number of lines; my file is originally 200 lines long). Looks like it's related to the way org-transclusion saves files (as mentioned in #109).
I suspect the bug is a bad interaction with org-element--cache-active-p, which grows very quickly. I noticed that running org-element-cache-reset can help when the file is opened again. Emacs also becomes unresponsive when I just move the cursor around the #+transclude: lines... 💥
Doom Emacs
I am running doom-emacs:
Is there any chance we could solve this issue? I would love to be able to use this package to design my programming courses!
Thanks!