To start, all users who navigate to the ondemand website first encounter the apache server. Any errors encountered at this step will be in the log(s) at <code>/var/log/apache2/error.log</code></p>
2163
-
</li>
2164
-
<li>
2165
-
<p>CAS
2166
-
Apache then redirects the users to CAS for authentication. You can <code>grep -r $user /var/cache/apache2/mod_auth_cas/</code> to check if users have been authed to CAS and a cookie has been set.</p>
2167
-
</li>
2168
-
<li>
2169
-
<p>Apache part deux
2170
-
CAS brings us back to apache and here apache runs all sorts of OOD Lua hooks. Any errors encountered at this step will be in the l
2160
+
<li>Apache</li>
2161
+
</ol>
2162
+
<p>To start, all users who navigate to the ondemand website first encounter the apache server. Any errors encountered at this step will be in the log(s) at <code>/var/log/apache2/error.log</code></p>
2163
+
<ol>
2164
+
<li>CAS</li>
2165
+
</ol>
2166
+
<p>Apache then redirects the users to CAS for authentication. You can <code>grep -r $user /var/cache/apache2/mod_auth_cas/</code> to check if users have been authed to CAS and a cookie has been set.</p>
2167
+
<ol>
2168
+
<li>Apache part deux</li>
2169
+
</ol>
2170
+
<p>CAS brings us back to apache and here apache runs all sorts of OOD Lua hooks. Any errors encountered at this step will be in the l
2171
2171
og(s) at <code>/var/log/apache2/$fqdn_error.log</code></p>
2172
-
</li>
2173
-
<li>
2174
-
<p>The PUN (Per User Nginx) session
2175
-
Apache then starts an Nginx server as the user, and most things like the main dashboard, submitting jobs, running apps, etc. happen here in the PUN. Any errors encountered at this step will be in the logs at <code>/var/log/ondemand-nginx/$user/error.log</code>. You can also see what might be happening here by running commands like <code>ps aux | grep $USER</code> to see the user's PUN, or <code>ps aux | grep -i nginx</code> to see all the PUNs. From the ondemand web UI there's an option to "Restart Web Server" which essentially kills and restarts the user's PUN.</p>
2176
-
</li>
2177
-
<li>
2178
-
<p>/pun/sys/dashboard
2179
-
The dashboard is mostly covered in section 4, but just wanted to denote that apache then redirects us here after the PUN has been started where users can do everything else. At this step OOD will warn you about things like "Home Directory Not Found" and such. If you get this far I'd recommend you troubleshoot issues with users' home dir, NASii, and free space: <code>df | grep $HOME</code>, <code>du -sh $HOME</code>, <code>journalctl -u autofs</code>, and umount stuff. Check that <code>$HOME/ondemand</code> exists perhaps.</p>
2180
-
</li>
2181
-
<li>
2182
-
<p>OOD Apps
2183
-
When users start an app like JupyterLab or a VNC desktop, the job is submitted by the user's PUN, and here OOD copies and renders (with ERB) the global app template from <code>/var/www/ood/apps/sys/<app_name>/template/*</code> to <code>$HOME/ondemand/data/sys/dashboard/batch_connect/sys/<app_name>/(output)/<session_id></code>. Any errors encountered at this step will be in <code>$HOME/ondemand/data/sys/dashboard/batch_connect/sys/<app_name>/(output)/<session_id>/*.log</code>.</p>
2184
-
</li>
2185
-
<li>
2186
-
<p>Misc
2187
-
Maybe the ondemand server is just in some invalid state and needs to be reset. I'd recommend you check the puppet conf at <code>/etc/puppetlabs/puppet/puppet.conf</code>, run <code>puppet agent -t</code>, and maybe restart the machine. Running puppet will force restart the apache server and regenerate OOD from the ood config yamls. Then you can restart the server by either ssh-ing to the server and running <code>reboot</code>, or by ssh-ing to proxmox and running <code>qm reset <vmid></code> as root. TIP: you can find the vmid by finding the server in <code>qm list</code>.</p>
2188
-
</li>
2172
+
<ol>
2173
+
<li>The PUN (Per User Nginx) session</li>
2174
+
</ol>
2175
+
<p>Apache then starts an Nginx server as the user, and most things like the main dashboard, submitting jobs, running apps, etc. happen here in the PUN. Any errors encountered at this step will be in the logs at <code>/var/log/ondemand-nginx/$user/error.log</code>. You can also see what might be happening here by running commands like <code>ps aux | grep $USER</code> to see the user's PUN, or <code>ps aux | grep -i nginx</code> to see all the PUNs. From the ondemand web UI there's an option to "Restart Web Server" which essentially kills and restarts the user's PUN.</p>
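<p>The PUN log check can be wrapped in a small shell helper. This is only a sketch: <code>pun_errors</code> is a hypothetical function name (not an Open OnDemand command), and the default log root is simply the path quoted in this step.</p>

```shell
# Hypothetical helper: print the last lines of one user's PUN error log.
# Usage: pun_errors <user> [line_count] [log_root]
pun_errors() {
    local user="$1"
    local lines="${2:-20}"
    # Default log root is the location named in this document; override it if yours differs.
    local log_root="${3:-/var/log/ondemand-nginx}"
    local log="$log_root/$user/error.log"

    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 1
    fi
    tail -n "$lines" "$log"
}
```

<p>For example, <code>pun_errors alice 50</code> would print the last 50 lines for a user named alice, assuming you have permission to read the log.</p>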
2176
+
<ol>
2177
+
<li>/pun/sys/dashboard</li>
2178
+
</ol>
2179
+
<p>The dashboard is mostly covered in section 4, but just wanted to denote that apache then redirects us here after the PUN has been started where users can do everything else. At this step OOD will warn you about things like "Home Directory Not Found" and such. If you get this far I'd recommend you troubleshoot issues with users' home dir, NASii, and free space: <code>df | grep $HOME</code>, <code>du -sh $HOME</code>, <code>journalctl -u autofs</code>, and umount stuff. Check that <code>$HOME/ondemand</code> exists perhaps.</p>
2180
+
<ol>
2181
+
<li>OOD Apps</li>
2182
+
</ol>
2183
+
<p>When users start an app like JupyterLab or a VNC desktop, the job is submitted by the user's PUN, and here OOD copies and renders (with ERB) the global app template from <code>/var/www/ood/apps/sys/<app_name>/template/*</code> to <code>$HOME/ondemand/data/sys/dashboard/batch_connect/sys/<app_name>/(output)/<session_id></code>. Any errors encountered at this step will be in <code>$HOME/ondemand/data/sys/dashboard/batch_connect/sys/<app_name>/(output)/<session_id>/*.log</code>.</p>
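<p>To find those session logs quickly, something like the following might help. It is a minimal sketch: <code>ood_session_logs</code> is a hypothetical helper name, and it searches recursively below the app's directory so it does not depend on whether the <code>output</code> level is present.</p>

```shell
# Hypothetical helper: list every *.log file for one app's sessions under a home dir.
# Usage: ood_session_logs <home_dir> <app_name>
ood_session_logs() {
    local home_dir="$1"
    local app_name="$2"
    # Base path as described in this document.
    local base="$home_dir/ondemand/data/sys/dashboard/batch_connect/sys/$app_name"

    if [ ! -d "$base" ]; then
        echo "no session data under $base" >&2
        return 1
    fi
    # Session directories live somewhere below $base, so search recursively.
    find "$base" -type f -name '*.log' | sort
}
```

<p>Running <code>ood_session_logs "$HOME" &lt;app_name&gt;</code> as the affected user (or with their home directory as the first argument) lists the candidate log files to read.</p>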
2184
+
<ol>
2185
+
<li>Misc</li>
2189
2186
</ol>
2187
+
<p>Maybe the ondemand server is just in some invalid state and needs to be reset. I'd recommend you check the puppet conf at <code>/etc/puppetlabs/puppet/puppet.conf</code>, run <code>puppet agent -t</code>, and maybe restart the machine. Running puppet will force restart the apache server and regenerate OOD from the ood config yamls. Then you can restart the server by either ssh-ing to the server and running <code>reboot</code>, or by ssh-ing to proxmox and running <code>qm reset <vmid></code> as root. TIP: you can find the vmid by finding the server in <code>qm list</code>.</p>
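<p>As a recap of the log locations named in the steps above, here is a small lookup sketch. The function name <code>ood_log_for_stage</code> and the stage labels (<code>apache</code>, <code>hooks</code>, <code>pun</code>) are made up for illustration; only the paths come from this document. Note the braces in <code>${fqdn}_error.log</code>: without them a shell would look for a variable called <code>fqdn_error</code>.</p>

```shell
# Hypothetical helper: print the log file to check for a given stage of the login flow.
# Usage: ood_log_for_stage <apache|hooks|pun> [user] [fqdn]
ood_log_for_stage() {
    local stage="$1"
    local user="${2:-${USER:-}}"
    local fqdn="${3:-$(hostname -f)}"

    case "$stage" in
        apache) echo "/var/log/apache2/error.log" ;;             # first contact with Apache
        hooks)  echo "/var/log/apache2/${fqdn}_error.log" ;;     # Apache after CAS, OOD Lua hooks
        pun)    echo "/var/log/ondemand-nginx/${user}/error.log" ;;  # the user's PUN
        *)
            echo "unknown stage: $stage" >&2
            return 1
            ;;
    esac
}
```

<p>For example, <code>tail -f "$(ood_log_for_stage pun alice)"</code> would follow the PUN log for a user named alice.</p>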