Performance / Some documents take hours to convert #93
Which version are you using? With the current develop branch, I was able to generate the HTML from the XML notice downloaded from the TED website:
It does take some time (almost 4 minutes), I guess because the notice has 158 lots. But it does work in the end. Please note that, as indicated in the README, the eForms Notice Viewer is a sample application:
So we haven't looked into performance improvements for it. We would be happy to accept contributions in this area. Please also note that we offer an online service to convert XML notices to HTML or PDF. In addition, you can download the notice in HTML from the TED website: https://ted.europa.eu/en/simap/developers-corner-for-reusers#download-notices-various-formats Regarding your suggestion to include the HTML files in the bulk download, please contact the TED HelpDesk: https://ted.europa.eu/en/contact
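Since the conversion time above is attributed to the number of lots in the notice, counting the lots beforehand can give a rough idea of which files will be slow. The following is a minimal Python sketch, not part of the eForms Notice Viewer. It assumes that each lot appears in the notice XML as an element with the local name `ProcurementProjectLot`; that element name, and the helper names `count_elements` and `count_lots_in_file`, are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def count_elements(root: ET.Element, local_name: str) -> int:
    """Count elements under root whose tag matches local_name, ignoring namespaces."""
    count = 0
    for element in root.iter():
        tag = element.tag
        # ElementTree stores namespaced tags as "{namespace}local"; keep the local part.
        if isinstance(tag, str) and tag.rsplit("}", 1)[-1] == local_name:
            count += 1
    return count


def count_lots_in_file(path: str) -> int:
    """Parse a notice file and count its lots (element name is an assumption)."""
    root = ET.parse(path).getroot()
    return count_elements(root, "ProcurementProjectLot")
```

Calling `count_lots_in_file` on a downloaded notice returns the number of matching elements, which could be used to sort a bulk download so that small notices are converted first.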
I was using 0.10.0, but the problem occurred even with the current 0.11.0-SNAPSHOT. In the end it seemed to be a caching issue: after calling the program with -f, it converted the file in 3 minutes.
I know, but I am not a Java programmer and I don't know the eForms SDK at all. We are a small agency and do not have the capacity to employ a Java programmer.
So can't you provide the software that is used there to the public?
I am afraid that this download page and also the API you mentioned earlier are rate limited. Are these HTML files generated on the fly when you download them, or are they already stored on the server? In the latter case it would be very easy for you / the EU to pack them into an archive and provide a link... Anyway, thanks for your reply!
Hello, some documents, for example today's document 00315692_2024.xml, take hours to be converted into HTML. It seems as if the software is running in an endless loop, because it generates thousands of lines like this:
<section title="block040115">5.1.12 <span class="label">Bedingungen für die Auftragsvergabe</span><section title="block04011503"><span class="label">Bedingungen für die Einreichung</span><span class="text">:</span><section title="block0401150301"><span class="label">Elektronische Einreichung</span><span class="text">:&#x2008;</span><span class="dynamic-label">Erforderlich</span></section>
After two hours it had generated 23,000 lines like that and I killed the process...
Is this a bug in the software or in the document?
In general it would be great if the performance of this tool could be improved or if the bulk download files could be provided in HTML. This would be one conversion run for you, while at the moment lots of bulk download users have to convert the XML to HTML on their own.
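For anyone batch-converting bulk downloads in the meantime, one way to avoid watching a single notice run for hours is to wrap each conversion in a time limit. The sketch below is a generic Python wrapper, not something provided by the viewer. The function name `run_with_timeout` is hypothetical, and the command passed to it is whatever invocation you already use; no particular command line is assumed here.

```python
import subprocess
import sys


def run_with_timeout(command: list[str], timeout_seconds: float) -> str:
    """Run command and return "ok", "failed" (non-zero exit) or "timeout"."""
    try:
        result = subprocess.run(
            command,
            timeout=timeout_seconds,    # the child is killed when this expires
            stdout=subprocess.DEVNULL,  # discard output; redirect to a file if needed
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    # Stand-in command for demonstration only: a Python process that exits at once.
    print(run_with_timeout([sys.executable, "-c", "pass"], timeout_seconds=10))
```

Notices that come back as `"timeout"` can be collected in a list and retried later or reported, instead of blocking the rest of the batch.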