Sometimes Windows installation failed #14
running This leads to a Windows corrupt
Hello, Can you open a bug report on https://bugs.launchpad.net/cloudbase-init and use the git review workflow to push a fix for the bug? Thank you. Let me know if you need help with using 'git review'. This repo is just a mirror of https://opendev.org/x/cloudbase-init.git and the bug tracking/code patches are done using launchpad and review.opendev.org |
@jiaweien have you tried using the registry keys in HKEY_LOCAL_MACHINE\SYSTEM\Setup\Status\ChildCompletion to achieve the same result? I think the process checking can be very brittle in case of newer Windows versions. |
I'm sorry. Since I had other work to do, I haven't tried using the registry keys to test it. However, when the Windows installation failed, I pressed Shift+F10 to open a terminal (this is also how I looked at the cloudbase-init and setup logs and learned that there might be a problem with the execution timing of cloudbase-init), ran regedit to open the registry, changed the setup.exe value under HKEY_LOCAL_MACHINE\SYSTEM\Setup\Status\ChildCompletion to 3, and rebooted Windows, after which the system could be accessed. The chance of this happening again is extremely low: I've been testing ever since and so far I haven't hit it again. If I run into it again, I'll upload a screenshot. |
Hello, I'll do it later. Thanks for the reminder. |
@ader1990 Someone told me recently that he had the same problem. I haven't come up with a better solution than process checking yet. So, could you review it? |
@jiaweien I understand the issue here, but we need to find a more reliable way to fix this problem, as looking for processes with a hardcoded executable name (like setup.exe) can lead to an infinite loop or a failure if the admin installed some service that has setup.exe in its path. If I make a patch that uses HKEY_LOCAL_MACHINE\SYSTEM\Setup\Status\ChildCompletion to detect the issue, could you test it? |
@ader1990 Okay, I'll test it later and tell you the result when it's done. |
@ader1990 During this period, I tested Windows Server 2008 R2 and Windows Server 2012 R2 several times (both in Chinese). I checked whether the registry value of SYSTEM\Setup\Status\SetupFinalTasks was 3 to determine whether the initialization had completed. So far, there have been no initialization failures. But I'm not sure whether I have fixed the problem or simply haven't tested enough. I also included my changes in the patches I submitted. I wonder if that's what you suggested. |
@ader1990 |
On a Windows 10, this is what these registry keys contain:

    Get-ChildItem -Path HKLM:\system\Setup\Status\

        Hive: HKEY_LOCAL_MACHINE\system\Setup\Status

    Name               Property
    ----               --------
    ChildCompletion    setup.exe           : 3
                       oobeldr.exe         : 3
                       SetupFinalTasks     : 3
    SysprepStatus      GeneralizationState : 7
    UnattendPasses     oobeSystem          : 2

I need to check if all Windows versions have these keys to make sure there is no regression when we check the child completion keys. |
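A minimal sketch of how that cross-version check could be automated, assuming Python's standard `winreg` module on the machine under test. The names `EXPECTED_VALUES`, `query_registry` and `find_missing_values` are hypothetical helpers and not part of cloudbase-init. The expected value names are only those from the Windows 10 listing above, so other Windows versions may legitimately differ.

```python
# Value names copied from the Windows 10 listing above (an assumption
# for other Windows versions, which is exactly what needs checking).
EXPECTED_VALUES = {
    "ChildCompletion": ("setup.exe", "oobeldr.exe", "SetupFinalTasks"),
    "SysprepStatus": ("GeneralizationState",),
    "UnattendPasses": ("oobeSystem",),
}

STATUS_ROOT = "SYSTEM\\Setup\\Status"


def query_registry(subkey, value_name):
    """Return one value under HKLM SYSTEM/Setup/Status, or None if absent."""
    import winreg  # Windows-only standard library module

    path = STATUS_ROOT + "\\" + subkey
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path, 0,
                            winreg.KEY_READ) as key:
            return winreg.QueryValueEx(key, value_name)[0]
    except FileNotFoundError:
        # Raised when either the key or the value does not exist.
        return None


def find_missing_values(query=query_registry):
    """List the expected values that `query` reports as absent."""
    missing = []
    for subkey, names in EXPECTED_VALUES.items():
        for name in names:
            if query(subkey, name) is None:
                missing.append(subkey + "\\" + name)
    return missing
```

Running `find_missing_values()` on each Windows version of interest would return an empty list where all of these values exist. The `query` parameter is there so the logic can be exercised without a registry.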
@jiaweien Here is a suggestion to improve the code (it needs refactoring or better placement, though). I would prefer putting the ChildCompletion check just before the plugin reboot because, from my understanding, you get this error only if you run the SetHostname plugin in the normal cloudbase-init run. The cloudbase-init installer configures the SetHostname plugin to run during the Sysprep specialize step (which is the recommended way). Your current code will hang forever if cloudbase-init is installed using the default installer from cloudbase.it. The reason is that during the unattend phase, the values under ChildCompletion are SetupFinalTasks = 0 and setup.exe = 1.

    def wait_for_boot_completion(self):
        try:
            # Wait until sysprep generalization has finished.
            with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE,
                                "SYSTEM\\Setup\\Status\\SysprepStatus", 0,
                                winreg.KEY_READ) as key:
                while True:
                    gen_state = winreg.QueryValueEx(key,
                                                    "GeneralizationState")[0]
                    if gen_state == 7:
                        break
                    time.sleep(0.1)
                    LOG.info('Waiting for sysprep completion. '
                             'GeneralizationState: %d', gen_state)
            # Then wait (at most 1000 polls) for a known ChildCompletion state.
            with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE,
                                "SYSTEM\\Setup\\Status\\ChildCompletion", 0,
                                winreg.KEY_READ) as key:
                steps = 0
                while steps < 1000:
                    setup_state = winreg.QueryValueEx(key,
                                                      "setup.exe")[0]
                    setup_final_state = winreg.QueryValueEx(key,
                                                            "SetupFinalTasks")[0]
                    if setup_state == 3 and setup_final_state == 3:
                        LOG.info('Sysprep entered desired mode for service run.')
                        break
                    if setup_state == 1 and setup_final_state == 0:
                        LOG.info('Sysprep entered desired mode for unattended run.')
                        break
                    time.sleep(0.1)
                    LOG.info('Waiting for sysprep completion. '
                             'setup_state: %(setup_state)d and '
                             'setup_final_state: %(setup_final_state)d',
                             {"setup_state": setup_state,
                              "setup_final_state": setup_final_state})
                    steps = steps + 1
        except WindowsError as ex:
            if ex.winerror == 2:
                LOG.debug('Sysprep data not found in the registry, '
                          'skipping sysprep completion check.')
            else:
                raise ex |
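For testing the decision logic in the suggestion above without a Windows registry, the state classification can be separated from the polling. This is only a sketch under assumptions: `classify`, `wait_for_mode` and the `read_states` callable are hypothetical names, and the two state pairs are the ones quoted in the comment above. The default timeout of 100 seconds mirrors the 1000 polls at 0.1 s in the suggested code.

```python
import time

# (setup.exe, SetupFinalTasks) pairs taken from the suggestion above.
SERVICE_RUN = (3, 3)
UNATTEND_RUN = (1, 0)


def classify(setup_state, setup_final_state):
    """Map the two ChildCompletion values to a run mode, or None if unsettled."""
    pair = (setup_state, setup_final_state)
    if pair == SERVICE_RUN:
        return "service"
    if pair == UNATTEND_RUN:
        return "unattend"
    return None


def wait_for_mode(read_states, timeout=100.0, interval=0.1,
                  clock=time.monotonic, sleep=time.sleep):
    """Poll `read_states` until a known mode appears or `timeout` elapses.

    `read_states` is a hypothetical callable returning the
    (setup.exe, SetupFinalTasks) pair, e.g. a thin wrapper over winreg.
    Returns "service", "unattend", or None on timeout.
    """
    deadline = clock() + timeout
    while True:
        mode = classify(*read_states())
        if mode is not None:
            return mode
        if clock() >= deadline:
            return None
        sleep(interval)
```

Returning `None` on timeout leaves it to the caller to decide whether to proceed with the reboot or to abort, which the bounded loop in the suggestion leaves implicit.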
@ader1990 Maybe having the cloudbase-init installer configure the SetHostname plugin to run during the Sysprep specialize step is the best choice. |
When using the cloudbase-init MSI installer, the installer has an option to sysprep the machine at the end of the installation. By default, it configures SetHostname Plugin to run during the Sysprep specialize step. What my code does is to handle two scenarios:
|
I'm very sorry. I may have misunderstood you. I thought you wanted to use your code without refactoring. |
Hello @jiaweien, Did you manage to solve this issue with my suggestion above, or did you use a special unattend.xml? Do you still need some code change in Cloudbase-Init? Thank you, |
@ader1990 |
Hello @ader1990, You're right, my code will hang forever when cloudbase-init is installed with the MSI installer and its option to sysprep the machine at the end of the installation is used. I also tested your code many times, and it always completed the initialization of the virtual machine. But I still wonder whether these changes only reduce the likelihood of initialization failure rather than prevent it altogether. If I choose to use the cloudbase-init service to initialize the server, could there still be a point in time when GeneralizationState = 7, setup.exe = 1 and SetupFinalTasks = 0, so that the server is restarted directly after the SetHostname plugin completes? (That's just my suspicion; it didn't happen during my tests.)
Also, I am not sure whether I could check for the existence of the "Cloudbase-init-unattend.log" file to determine whether cloudbase-init was started by the service or by the unattend step (I see that when it is started by the unattend step, it logs to "Cloudbase-init-unattend.log"). e.g.
Do you think this is better? Or might it not be necessary at all? I'd like to hear your thoughts. Thank you, |
Hello @jiaweien, You can check the config options that are set in the unattend conf, like https://github.com/cloudbase/cloudbase-init-installer/blob/master/CloudbaseInitSetup/Actions/ConfFileActions.js#L86, but that is brittle and might change in the future. The best course of action would still be the registry keys check, as those keys are a reliable way of identifying in which stage cloudbase-init is run. Still, checking that stop_service_on_exit is false can reliably tell that cloudbase-init is not running in normal mode, under a Windows service, but this change cannot be included in the upstream codebase, whereas the code I shared can and should be 100% reliable. In this case, I will create a patch with those checks and would be glad if you could test the new MSI. Thank you, |
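For local experiments only, the stop_service_on_exit check mentioned above could be sketched as follows. This assumes the conf file is a plain INI file with a [DEFAULT] section and reads it with Python's standard `configparser` rather than with cloudbase-init's own configuration loading. The function name `is_unattend_run` and the fallback of `True` when the option is absent are assumptions made for illustration.

```python
import configparser


def is_unattend_run(conf_text):
    """Return True if the conf text sets stop_service_on_exit to false.

    Assumption: a missing option is treated as True (normal service run).
    """
    # interpolation=None keeps literal '%' characters in values intact;
    # strict=False tolerates duplicated options or sections.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    stop_on_exit = parser.getboolean("DEFAULT", "stop_service_on_exit",
                                     fallback=True)
    return not stop_on_exit
```

As noted in the comment above, this depends on what the installer happens to write into the unattend conf, so it is a diagnostic aid and not a substitute for the registry check.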
Hello @ader1990, Yes, I also noticed that allow_reboot is false in cloudbase-init-unattend.conf when I was testing earlier. I am looking forward to your new Cloudbase-init.msi and will test it again then. Thank you |
On the OpenStack platform, I create virtual machines from a Windows image template prepared with sysprep (I used Windows 2012 Datacenter, Chinese edition).
Sometimes cloudbase-init reboots Windows after SetHostname while setup.exe is still running, although the probability of that happening is very small.
At that point the value of SYSTEM\Setup\Status\SysprepStatus\GeneralizationState is already 7, but the OOBE setup is still running. The reboot makes the Windows installation fail with an exception, and restarting again ends the same way.
I apologize for not keeping the log files, but please refer to the timestamps in the sysprep and cloudbase-init logs for details.