QGIS Server performance questions #10

lucamanga · 2020-06-05T10:24:00Z

Hello,
I'm running WFS queries against your QGIS Docker image: https://hub.docker.com/r/3liz/qgis-map-server/
I noticed some performance issues with WFS queries.
The first time, it takes 5 seconds to identify a point; the second time, the response is instantaneous. But the third time, after some idle time (about 45 seconds), it takes 5 seconds again.

Logs of first REQUEST: 10:16:32 -> 10:16:37

2020-06-05 10:16:32,479 INFO    [3498]  Qgis: Server: BBOX:664037,5103713,664037,5103713,EPSG:25832
2020-06-05 10:16:32,479 INFO    [3498]  Qgis: Server: MAP:mercati3.qgs
2020-06-05 10:16:32,479 INFO    [3498]  Qgis: Server: OUTPUTFORMAT:application/json
2020-06-05 10:16:32,479 INFO    [3498]  Qgis: Server: PROPERTYNAME:*
2020-06-05 10:16:32,479 INFO    [3498]  Qgis: Server: REQUEST:GetFeature
2020-06-05 10:16:32,479 INFO    [3498]  Qgis: Server: SERVICE:WFS
2020-06-05 10:16:32,479 INFO    [3498]  Qgis: Server: TYPENAME:jlucia
2020-06-05 10:16:32,479 INFO    [3498]  Qgis: Server: VERSION:1.1.0
2020-06-05 10:16:32,480 INFO    [3498]  Qgis: Server: WFS Request parameters:
2020-06-05 10:16:32,480 INFO    [3498]  Qgis: Server:  - OUTPUTFORMAT : application/json
2020-06-05 10:16:32,480 INFO    [3498]  Qgis: Server:  - PROPERTYNAME : *
2020-06-05 10:16:32,480 INFO    [3498]  Qgis: Server:  - TYPENAME : jlucia
2020-06-05 10:16:32,480 INFO    [3498]  Qgis: Server:  - BBOX : 664037,5103713,664037,5103713,EPSG:25832
2020-06-05 10:16:32,480 INFO    [3498]  Qgis: Server:  - VERSION : 1.1.0
2020-06-05 10:16:37,515 DEBUG   [3498]  b'\x83S\xb5(\xa6J\x11\xea\xa2\x8a\x02B\xac\x1a\x00\x02': Flushing response data: (461 bytes)
2020-06-05 10:16:37,516 DEBUG   [25]    SND worker: b'\x83S\xb5(\xa6J\x11\xea\xa2\x8a\x02B\xac\x1a\x00\x02' -> client: b'OWS-SERVER-1' : b'\xa0\xf4\xec*\xa7\x15\x11\xea\x87i\x02B\xac\x1a\x00\x02'
2020-06-05 10:16:37,517 DEBUG   [3498]  b'\x83S\xb5(\xa6J\x11\xea\xa2\x8a\x02B\xac\x1a\x00\x02': Flushing response data: (4 bytes)
2020-06-05 10:16:37,517 DEBUG   [25]    SND worker: b'\x83S\xb5(\xa6J\x11\xea\xa2\x8a\x02B\xac\x1a\x00\x02' -> client: b'OWS-SERVER-1' : b'\xa0\xf4\xec*\xa7\x15\x11\xea\x87i\x02B\xac\x1a\x00\x02'
2020-06-05 10:16:37,517 DEBUG   [25]    SND worker: b'\x83S\xb5(\xa6J\x11\xea\xa2\x8a\x02B\xac\x1a\x00\x02' -> client: b'OWS-SERVER-1' : b'\xa0\xf4\xec*\xa7\x15\x11\xea\x87i\x02B\xac\x1a\x00\x02'
2020-06-05 10:16:37,517 INFO    [3498]  Qgis: Server: Request finished in 5038 ms
2020-06-05 10:16:37,518 DEBUG   [25]    READY b'\x83S\xb5(\xa6J\x11\xea\xa2\x8a\x02B\xac\x1a\x00\x02'
2020-06-05 10:16:37,582 RREQ    [1]             206     GET     ?MAP=mercati3.qgs&OUTPUTFORMAT=application%2Fjson&SERVICE=WFS&PROPERTYNAME=%2A&REQUEST=GetFeature&TYPENAME=jlucia&VERSION=1.1.0&BBOX=664037%2C5103713%2C664037%2C5103713%2CEPSG%3A25832  5104    -1
2020-06-05 10:16:37,636 REQ     [1]     192.168.10.78   200     GET     /ows/?MAP=mercati3.qgs&outputFormat=application%2Fjson&service=WFS&propertyname=%2A&request=GetFeature&typename=jlucia&version=1.1.0&bbox=664037%2C5103713%2C664037%2C5103713%2CEPSG%3A25832     5158    -1       python-requests/2.18.4

SECOND request: the response is instant

2020-06-05 10:16:51,697 INFO    [10917] Qgis: Server:  - VERSION : 1.1.0
2020-06-05 10:16:51,740 DEBUG   [10917] b'\x13\x97\x18$\xa0\xa5\x11\xea\x89\xcb\x02B\xac\x1a\x00\x02': Flushing response data: (461 bytes)

THIRD request after 30-45 seconds: 10:19:26 -> 10:19:31, again 5 seconds

2020-06-05 10:19:26,048 INFO    [10917] Qgis: Server:  - VERSION : 1.1.0
2020-06-05 10:19:31,081 DEBUG   [10917] b'\x13\x97\x18$\xa0\xa5\x11\xea\x89\xcb\x02B\xac\x1a\x00\x02': Flushing response data: (461 bytes)


dmarteau · 2020-06-05T12:00:11Z

It is known that the first time a project has to be loaded in QGIS Server, it may take an outrageous amount of time, depending on the number of layers and the data sources involved.

Now you must know that py-qgis-server runs QGIS Server workers in child processes to handle requests (QGIS Server by itself is not asynchronous) and that there is no shared cache between those workers (an issue that cannot be solved without rewriting a large part of the QGIS code).

Because of this, you may experience latency each time a project has to be loaded into a worker's cache, and
you will only get consistently optimal response times once the project has been loaded into every worker's cache.
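
To make the explanation above concrete, here is a toy simulation of workers that each keep a private project cache. It is not py-qgis-server code: the round-robin dispatch is a simplification of the real dispatcher, and `LOAD_MS` and `SERVE_MS` are assumed costs loosely based on the timings in the logs.

```python
from itertools import cycle

LOAD_MS = 5000  # assumed cost of loading a project into one worker's cache
SERVE_MS = 40   # assumed cost of a request once the project is cached


class Worker:
    """One worker process with its own private project cache."""

    def __init__(self, name):
        self.name = name
        self.cache = set()

    def handle(self, project):
        """Return the simulated latency in ms for one request."""
        if project in self.cache:
            return SERVE_MS
        self.cache.add(project)  # cache miss: this worker loads the project
        return LOAD_MS + SERVE_MS


def simulate(n_workers, n_requests, project="mercati3.qgs"):
    """Dispatch requests round-robin and return the latency of each one."""
    workers = [Worker(f"w{i}") for i in range(n_workers)]
    dispatch = cycle(workers)
    return [next(dispatch).handle(project) for _ in range(n_requests)]


if __name__ == "__main__":
    print(simulate(n_workers=2, n_requests=5))  # [5040, 5040, 40, 40, 40]
```

In this model, every worker pays the loading cost once per project, so the number of slow requests grows with the number of workers.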

Depending on the nature and number of projects you are using (number of layers, big data sources, ...), you may have to use different proxy strategies (for example, you may implement sharding with several py-qgis-server instances). If you have only a few projects, you may also consider seeding with multiple initial requests until all workers have their projects loaded.
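
The seeding idea could look roughly like the sketch below. It assumes that a WFS GetCapabilities request is enough to make a worker load the project, and that sending the requests concurrently makes it more likely that different workers pick them up; neither is guaranteed. The helper names, the host and the port are hypothetical; only the `/ows/` path and the project name come from the logs.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode
from urllib.request import urlopen


def warmup_url(base_url, project):
    """Build a lightweight WFS GetCapabilities URL for one project."""
    query = urlencode({"MAP": project, "SERVICE": "WFS", "REQUEST": "GetCapabilities"})
    return f"{base_url}?{query}"


def http_status(url, timeout=120):
    """Fetch the URL and return the HTTP status code."""
    with urlopen(url, timeout=timeout) as response:
        return response.status


def seed(base_url, projects, n_workers, fetch=http_status):
    """Send n_workers concurrent warm-up requests per project."""
    urls = [warmup_url(base_url, p) for p in projects for _ in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(fetch, urls))


# Example call (hypothetical host and port):
# seed("http://localhost:8080/ows/", ["mercati3.qgs"], n_workers=4)
```

The `fetch` parameter only exists so the function can be tried without a running server.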

lucamanga · 2020-06-05T13:14:29Z

Interesting. How do you implement sharding with several py-qgis-server instances?

dmarteau · 2020-06-05T13:23:57Z

You start several py-qgis-server instances and put nginx in front as a reverse proxy, with consistent hashing on the MAP parameter.
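
As an illustration of what consistent hashing on the MAP parameter means, here is a minimal hash ring in Python. This is a sketch of the technique, not the nginx implementation; the backend names are hypothetical.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a 64-bit integer position on the ring."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent hash ring with virtual nodes for each backend."""

    def __init__(self, backends, replicas=100):
        # Each backend is placed at `replicas` pseudo-random points on the ring.
        self._ring = sorted(
            (_hash(f"{backend}#{i}"), backend)
            for backend in backends
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def backend_for(self, map_param):
        """Return the backend owning the first ring point after the key's hash."""
        index = bisect.bisect(self._points, _hash(map_param)) % len(self._ring)
        return self._ring[index][1]


if __name__ == "__main__":
    ring = HashRing(["qgis-a:8080", "qgis-b:8080", "qgis-c:8080"])
    print(ring.backend_for("mercati3.qgs"))
```

A given project always lands on the same instance, so it only has to be cached there, and removing one instance only remaps the projects that instance was serving.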

lucamanga · 2020-06-05T13:48:40Z

I noticed that there is a QGSRV_CACHE_ROOTDIR variable. Would it help?

dmarteau · 2020-06-05T15:19:26Z

No, QGSRV_CACHE_ROOTDIR sets the location of the project files. The configuration is not well documented and we are working on it. You may adjust the number of workers with QGSRV_SERVER_WORKERS.

kikislater · 2020-06-30T12:04:46Z

QGSRV_SERVER_WORKERS

Interesting discussion!
So how do workers work? Is there a rule relating CPU threads to workers?

dmarteau · 2020-06-30T14:31:00Z

These are not threads, it is real multiprocessing. Requests are distributed using fair queuing with 0MQ messaging. You may also distribute your workers over a whole cluster by running worker-only and proxy-only containers.

kikislater · 2020-07-07T07:05:51Z

OK, understood, thank you. So it means a big infrastructure is needed to achieve good performance. But when I tried lizmap-docker-compose with a big project on 8 vCPUs + 8 GB RAM, it never reached full machine load: RAM does not reach its maximum and CPU stays at around 25% per vCPU.

Even with aggressive parameters in:

qgis server (QGSRV_SERVER_WORKERS from 2 to 8 to 32, which may be stupid but I was just trying to saturate the machine, and QGIS_SERVER_MAX_THREADS: 8),
php fpm: pm.start_servers, pm.min_spare_servers, pm.max_spare_servers ... I will play later with other fpm parameters such as PM_CHILD_PROCESS, PM_MAX_CHILDREN, PM_MAX_REQUESTS and PM_PROCESS_IDLE_TIMEOUT; setting them as environment variables does nothing at this time in the current docker configuration. The first ones are written to the fpm configuration with the bash command sed.
nginx: worker_processes and worker_connections

So at this time, my question is: any idea where this limitation comes from?

docker? (I don't think so, because I manage to get 100% vCPU when running stress or stress-ng in the qgis server, lizmap, nginx and redis containers)
project reading? 2.2 MB
...

dmarteau · 2020-07-07T08:15:38Z

You will only saturate your CPU with computation-intensive jobs. This is highly dependent on the context and the kind of project; as a rule of thumb, you may expect jobs to spend most of their time doing I/O, which means they have almost no impact on CPU demand.

Increasing the number of workers will change neither the loading time nor the time spent internally by one worker to process your request: it enables you to process more requests at the same time, scaling with your request rate.
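
A back-of-the-envelope way to read this: if one worker needs a fixed time per request, adding workers raises the request rate you can sustain, not the speed of a single request. The sketch below assumes a constant per-request time and ignores queuing effects and project loading; the function names are illustrative.

```python
import math


def max_request_rate(n_workers, service_time_s):
    """Upper bound on requests per second when every worker is busy."""
    return n_workers / service_time_s


def workers_needed(request_rate, service_time_s):
    """Smallest worker count whose capacity covers the given request rate."""
    return max(1, math.ceil(request_rate * service_time_s))


if __name__ == "__main__":
    # 4 workers, 0.5 s per request: at most 8 requests per second in total,
    # while each individual request still takes 0.5 s.
    print(max_request_rate(4, 0.5))  # 8.0
    # Sustaining 10 requests per second at 0.5 s each needs 5 workers.
    print(workers_needed(10, 0.5))   # 5
```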

Because of this, you must also set proper values for php-fpm depending on your scenario.

As said before, performance depends on many factors and the appropriate solution depends on what you want to improve.

IMHO, here are the questions to ask:

The number of workers, the PHP configuration and the cache size will play a role with:

What is the expected request rate?
How are requests distributed across projects?
How many different projects do I have to handle?

And the following will impact the internal performance of each worker:

How many layers are there in the projects?
How big is the data I have to handle?
How is access to my backend databases? (I have seen many performance issues caused by bad settings in PostGIS databases.)

The former questions target your infrastructure choices; the latter depend for the most part on QGIS internal performance.

kikislater · 2020-07-07T09:49:10Z

OK, thank you for the reply. docker stats clearly shows the I/O.
Docker has one limitation regarding I/O: by default it is restricted, depending on the Linux distribution. docker.service needs to be updated from

LimitNOFILE=1048576
LimitNPROC=1048576

to

LimitNOFILE=infinity
LimitNPROC=infinity

Tested on Lizmap, and it shows an improvement on pre-cached layers.
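
To see which limits a process actually gets, a small check like the following can be run inside the container (for example through docker exec). It is only a sketch using Python's standard `resource` module, which is Unix-only; mapping the systemd names LimitNOFILE and LimitNPROC to RLIMIT_NOFILE and RLIMIT_NPROC is the usual correspondence, and the function names are illustrative.

```python
import resource


def describe(limit):
    """Render a raw rlimit value the way systemd would write it."""
    return "infinity" if limit == resource.RLIM_INFINITY else str(limit)


def current_limits():
    """Return {systemd name: (soft, hard)} for the current process."""
    names = {"LimitNOFILE": "RLIMIT_NOFILE", "LimitNPROC": "RLIMIT_NPROC"}
    report = {}
    for systemd_name, attr in names.items():
        if hasattr(resource, attr):  # RLIMIT_NPROC is not available on every platform
            soft, hard = resource.getrlimit(getattr(resource, attr))
            report[systemd_name] = (describe(soft), describe(hard))
    return report


if __name__ == "__main__":
    for name, (soft, hard) in current_limits().items():
        print(f"{name}: soft={soft} hard={hard}")
```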

dmarteau · 2020-07-07T11:06:14Z

Tested on Lizmap, and it shows an improvement on pre-cached layers.

Good to know thanks !

dmarteau · 2020-07-07T11:09:22Z

@kikislater

Do you have some metrics? It could be interesting to investigate the performance gain.

kikislater · 2020-07-07T11:26:24Z

No, sorry, just a visual impression, but you can read some input about I/O on Docker, with metrics, here: moby/moby#21485

So you could test it yourself by reading / writing inside and outside your container.
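
One rough way to run such a comparison is a small write/read timing script executed once on the host and once inside the container, pointed at the directory of interest. This is only a sketch with an illustrative function name: the file is small, and the read pass will mostly hit the page cache, so the numbers are indicative at best.

```python
import os
import tempfile
import time

MB = 1024 * 1024


def write_read_throughput(directory, size_mb=64):
    """Return (write MB/s, read MB/s) for a temporary file in `directory`."""
    chunk = os.urandom(MB)
    with tempfile.NamedTemporaryFile(dir=directory) as handle:
        start = time.perf_counter()
        for _ in range(size_mb):
            handle.write(chunk)
        handle.flush()
        os.fsync(handle.fileno())  # make sure the data reached the disk
        write_s = time.perf_counter() - start

        handle.seek(0)
        start = time.perf_counter()
        while handle.read(MB):  # read back until end of file
            pass
        read_s = time.perf_counter() - start
    return size_mb / write_s, size_mb / read_s


if __name__ == "__main__":
    write_rate, read_rate = write_read_throughput(tempfile.gettempdir())
    print(f"write: {write_rate:.0f} MB/s, read: {read_rate:.0f} MB/s")
```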

The PR is at the end: moby/moby#24307
Consider looking at TasksMax=infinity as well, in the same systemd service, as it was not mentioned in the PR and is related to your kernel option.

lucamanga · 2020-07-07T11:52:51Z

On Tue, 7 Jul 2020 at 13:26, Sylvain POULAIN <notifications@github.com> wrote:

Consider looking at TasksMax=infinity as well in same systemd service as it was not mention in PR and related to your kernel option

Where can I try this setting (along with the others, LimitNOFILE and LimitNPROC)?


kikislater · 2020-07-07T14:51:40Z

It depends on your host Linux distribution. Some distributions already have it well tuned. It may not be a final solution and needs more testing with QGIS Server.
It used to be in /usr/lib/systemd/system/docker.service

TANK2003 · 2020-09-04T07:58:07Z

I see a new config option, "SERVER_RESTARTMON". Can we use it to improve the internal performance of each worker, by updating the file that "SERVER_RESTARTMON" is watching before a user makes an OWS request?

Is there a way to set a timeout before a request returns a '422 Unprocessable Entity'?

Thanks for your work !

dmarteau · 2020-09-04T12:12:59Z

@TANK2003 SERVER_RESTARTMON is just a very simple way to ask the workers to make a graceful restart, for example when you are updating plugins; it is not really related to internal performance.
The main process broadcasts a notification to the workers: they restart as soon as they have finished their current processing, while new incoming requests are held back by the dispatcher. This ensures that no requests are lost during the restart process.
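
For instance, a deployment script could request such a restart after updating plugins by touching the watched file. This is a minimal sketch that assumes SERVER_RESTARTMON points at a file path and that a change of its modification time is what triggers the notification; the function name and the path in the example are hypothetical.

```python
from pathlib import Path


def request_graceful_restart(watched_file):
    """Update the watched file's modification time and return the new mtime."""
    path = Path(watched_file)
    path.touch()  # creates the file if missing, otherwise bumps its mtime
    return path.stat().st_mtime


# Example call (hypothetical path, must match the SERVER_RESTARTMON setting):
# request_graceful_restart("/srv/qgis/restart-trigger")
```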

'422 Unprocessable Entity' has nothing to do with timeouts; it is sent when you have invalid layers in strict checking mode.

dmarteau closed this as completed Jan 11, 2021

dmarteau pinned this issue Mar 11, 2021

dmarteau mentioned this issue Mar 15, 2022

running proxy and workers with docker #36

Closed
