Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Metadata Provider: Improve startup times by avoiding directory scanning #1089

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

nicjansma
Copy link
Contributor

@nicjansma nicjansma commented Nov 10, 2023

There are a few open Issues sharing problems with Android startup times being slow, especially if there are a lot of files and directories to parse.

For example: e.g. #937 and #839

On one of my devices (Anbernic RG405M), I am using Pegasus with 30+ directories and 40k+ files, each with their own screenshot. Each directory has a complete metadata.pegasus.txt covering all of the files. Pegasus is regularly taking over a minute to start up. This is frustratingly slow, so I looked for opportunities to make it faster.

In profiling the application startup, one thing I found is that the pegasus_metadata provider uses a QDirIterator to go over each directory's contents, when it's looking for matching metadata.pegasus.txt, metadata.txt or *.metadata.pegasus.txt files. I don't know a ton about QT, but from a few threads I've read, QDirIterator may run a stat() on each file it finds (requesting more details from the file system). This is a lot more expensive than just getting a list of file names (with no details), which likely contributes to a slow directory processing time when reading directories with a lot of files.

It also appears we do a full QDirIterator on each directory at least twice:

  • Once to find all *metadata*.txt files
  • Once to find all files that match the metadata's include/exclude list of file extensions

I first attempted a couple ways of making the QDirIterators themselves faster. Here are some startup time results for my tests:

  • Latest Pegasus with Only show existing games checked: 222s
  • Latest Pegasus with Only show existing games disabled: 50s
  • Applying a nameFilter to the iterator and specifying only metadata.pegasus.txt, metadata.txt, *.metadata.pegasus.txt and *.metadata.txt: 50s (same as above)
    • (looks like internally, the nameFilter must still get and stat() all files before returning, so this didn't save any time)
  • Trying to use QDir::entryList() instead of a QDirIterator - 50s (same as above)
    • (looks like internally, this calls QDirIterator)

So, none of the above changes helped.

I believe each of the two QDirIterators phases was contributing to about 20-25s of startup time, so next I looked to find a way to avoid QDirIterator entirely.

Avoiding the first QDirIterator

One fix we could apply to save time: when looking for matching metadata files, run a small QDir::exists() check on just metadata.pegasus.txt and metadata.txt first and return immediately without a full directory scan if so (avoiding the slow stat() calls to the other files).

In the "best case" scenario (where metadata.pegasus.txt exists), this eliminates a QDirIterator pass entirely, reducing startup time in half (to about 25s).

One downside of doing this: It could cause a compatibility concern if people have both a metadata.pegasus.txt and then additional *.metadata.pegasus.txt files. Do we expect that to be common? I think we could explain this limitation in the release notes and wiki to avoid confusion: use one or the other file naming scheme, not both.

Avoiding the second QDirIterator

I reviewed the remaining 25 seconds of startup time, and most of it was spent doing the second QDirIterator. This happens after the metadata files have been parsed, in PegasusFilter.cpp, when applying the include/exclude list to the list of file extensions. It's looking for additional files to include dynamically, if I understand it correctly?

As far as I can tell, this step isn't necessary if you have a "full" metadata file (with all of the files you want defined already), and have Only show existing games unchecked. I think, in the spirit of #899, we should avoid doing this inclusion/exclusion scan if that option is un-checked. In that case, we "trust" the metadata file to be the complete view of the world and shouldn't be looking at the disk.

This removes the second QDirIterator entirely.

Results

On my Android device that was taking 50+ seconds to show the theme, it now loads in under 5 seconds. The rest of the startup time is mostly spent parsing the metadata.pegasus.txt file format. On my faster devices, the theme loads near instantly (except for Media processing, which I have a separate PR for).

@mmatyas
Copy link
Owner

mmatyas commented Nov 12, 2023

Thank you very much for the detailed and thorough investigation! Yes, the current implementation is perhaps a bit naive, and focuses more on being cross-platform and avoiding corner cases, than on raw speed. In Qt, as far as I know QDirIterator is the only built-in method for iterating the contents of a directory, and with some useful filtering features. For smaller collections, and the original target of embedded Linux (eg. the Raspberry Pi) it had an acceptable performance. Later, requests for handling large datasets, and also platform changes, like the filesystem update in Android 10+ made the issue more apparent.

Based on your analysis, it does seem like QDirIterator provides more information than what we need most of the time. Looking into the Qt code, it seems the iterator creates a file info object for each file, which in turn calls statx on construction. This is convenient in some places, like finding runnable scripts, but there are indeed many places where we merely run through file paths.

I was interested in how QDirIterator compares to other solutions, and made a little benchmark. Running on desktop Linux on a test set of ~40 000 files, the classic readdir iteration ran for about 100 ms for me, including the char to QString conversion (we might need to do some sort of Unicode conversion), which accounts for 15% of the duration. QDirIterator ran for about 280 ms, almost three times longer. There's also the C++17 std::filesystem::directory_iterator which might not do any stat calls, but in practice seems to stand between the two with ~200 ms run time. The readdir implementation does seem promising, however I worry about its cross platform compatibility; some experiments could be done there.

About the first QDirIterator

The prefixed metadata file names is a feature that was asked for, and we should keep it. There are two important use cases:

  • People who like to keep the metadata with the games (including all games together) can split a single big text file at that location (eg. by gaming platform or emulator)
  • People who keep the metadata separately from the games, at a global location, can create a separate file for each of their directory or platform

So we cannot use the shortcut of just checking the existence of a particular path. It is also unlikely, but possible that the path is a directory. However, these checks can be done after we iterated through the file names, as opposed to querying each file, like QDirIterator.

About the second QDirIterator

If a Collection in the metadata file has an extension or regex property, Pegasus has to find all files recursively that match the expression. Then, Game entries can extend the Collection with additional games, or add more information to the previously found entries.

Originally Pegasus showed only the games where the game files actually existed. With large game sets, these file checks also took a considerable time, which is why an option was added to show every game, and assume that the metadata file is correctly filled with valid games. Thus the settings option only disables an additional filtering step, but not the game lookup step.

However, note that this scanning only happens if the Collection has the properties mentioned above. If you know that your metadata file contains everything you need as a list of Game entries, then you can remove the extension line, which skips the scanning step.

In case scanning is necessary, the iteration there happens entirely over file names, so this is another place that would benefit from such optimization.

@nicjansma
Copy link
Contributor Author

nicjansma commented Nov 14, 2023

@mmatyas appreciate the feedback ❤️ and if it's not clear, I love Pegasus -- great work.

I should have mentioned as well, that I was trying Pegasus on several devices (Windows desktop, Retroid Pocket 3+ (Android), Ayn Odin 2 Pro (Android), and others). For some reason, Pegasus startup was especially slow on the RG405M where the other devices loaded a lot faster (<10s), using the same content set. So maybe there's something especially bad with the hardware/cpu/sd reader in the RG405M?

I love your research into the other solutions, and yes a straight readdir seems very promising. I agree having a more cross-platform solution (e.g. QT) is ideal. I was reviewing the QT docs as well to look for an alternate that didn't do any stats, but couldn't find anything obvious.

1st QDirIterator

Totally agree the prefixed metadata files feature should be kept!

To be clear, the proposed PR here takes a short-cut to look for just metadata.pegasus.txt and/or metadata.txt. If either of those files are found, it will short-cut out and not do the full directory iteration.

If not found, it proceeds with a full directory scan, as before.

The only downside I see is if an existing user has both a metadata.pegasus.txt and a separate collection of MyOtherPrefix.metadata.pegasus.txt files in the same directory. In that case, this change would only reflect the contents of the first file, not the prefixed files.

But I'm wondering, how common would that be? Someone using MyOtherPrefix.metadata.pegasus.txt files would likely have a bunch of prefixed files and not also a standalone metadata.pegasus.txt file in the same directory, right? We could just document that limitation (you shouldn't have a standalone plus prefixed files in the same path), so this optimization/short-cut could be more efficient.

2nd QDirIterator

That's a good point. I had also tested just removing my extensions: ... lines from all of my metadata files, and startup was a lot faster (because the directory scan was skipped). I can certainly update my metadatas to be that way, since I pre-generate all of my metadata for my devices.

My suggestion is that the option Only show existing games is logically equivalent today to "don't scan my files" when it's un-checked. This PR also applies that "don't scan my files" logic to the extensions scan, which is what I assume people who un-check Only show existing games would also want.

But maybe both of those changes are moot if we can convince ourselves readdir() is portable enough?

Thanks for your feedback! Happy to make any updates if you want to take either of the above changes.

@mmatyas
Copy link
Owner

mmatyas commented Nov 14, 2023

Thanks for sharing your points!

Hm, perhaps the issue on RG405M might not be the hardware, but the software? As far as I know, ever since the file system changes on Android, all file operations must go through a permission checking layer. This is where the RG405M might not excel, though the Retroid Pocket 3+ also supposed to have a recent Android, so I might be wrong here.

1st QDirIterator

I certainly agree it would be a good shortcut, but I think making a difference between the file namings would introduce an unnecessary friction. I would like to avoid such points of "gotcha", where something has to be kept in mind because it is not obvious from our design.

I believe having both kinds of files wouldn't be rare actually; I can imagine people to start splitting a large metadata file into smaller chunks when the main file gets too big. I think we can also assume some sort of categorization: while it is absolutely possible that someone might store even 100 000 games in a single directory, and a metadata file next to them, I feel most at that point would start making some subdirectories. If we assume humanly manageable directory sizes, as our general target, then a non-recursive, non-stat-calling directory iteration should also took a reasonable time.

2nd QDirIterator

I agree the wording might not be the most clear there; we can probably improve that, or add a description to the option. Perhaps the documentation too could be extended with a page for the settings options.

There are people who add details to every game, but also people who rely on auto-collecting all game files and assets, and specify details to none or only but a few of them in the metadata file. This option is also used by the optional third-party data sources, when a file is referenced in their data. A possible rewording should account for all the diverse use cases.

Regarding readdir

Yes, now that we have a good candidate for optimizing the whole directory iteration, and possibly through the whole codebase, it is possible that the situation changes significantly. It's probably not a magic solution, but perhaps we can actually avoid any user-side changes, and keep things as internal optimizations. We can always do new measurements if our performance gains turn out to be less than expected.

@nicjansma
Copy link
Contributor Author

Sounds good to me!

I'll take a stab at a lower-level readdir()-ish solution soon. I have a good set of files to test and can launch on all of the supported OSs, so hopefully we can build some confidence in testing it. Worse comes to worse, it could be an experimental flag/option for a while until we get more people testing it.

@dontsnm
Copy link

dontsnm commented Dec 16, 2023

News about this pull?

@mmatyas
Copy link
Owner

mmatyas commented Dec 17, 2023

I might actually have a little time today/next week for some experimenting.

@mmatyas
Copy link
Owner

mmatyas commented Dec 17, 2023

So I've replaced the QDirIterators in the metadata provider code, and while the result is measurably faster, by about 15–20% at best, it's not as fast as I was expecting it. Looking into some perf analysis it seems the string operations, UTF8 <-> UTF16 conversions, and calculating things like absolute paths do have a cost. The code could be still optimized a bit, and I've measured on a RAM-backed storage, so the results might not be representative.

Some important notes: the dirent method will likely not work out of the box on Windows with MSVC, so different filesystem backends would need to be implemented. The fast path to use dirent itself too would rely on a non-POSIX standard field to tell the file's type, and when it comes to symlinks, stat is still needed. I'm also concerned about Unicode support, which I haven't tested yet. I've also run into issues with subdirectory iteration in the test files, but that might be on me.

Let me know if you also made an implementation and got different results!

@j8r
Copy link
Contributor

j8r commented Dec 19, 2023

What's the cause of this UTF8-UTF16 conversions, is it related to windows?
Can checking itfthe string is all ASCII could help optimizing?

The step further can be to have some sort of cache. All mtime of directories could be stored, and their files they have.
At startup, check all directories in the cache: if any mtime has changed for a directory, iterate over it.
Could it work?

@mmatyas
Copy link
Owner

mmatyas commented Dec 19, 2023

Qt uses UTF16 internally, while different platforms use their platform-specific encoding. Some conversion will be always necessary, which is handled internally by Qt. Conversion has a cost, but this shouldn't be the bottleneck. I haven't tested my experiment on Windows yet, so it's entirely possible that the results would validate the implementation there.

It is possible to change directory contents without updating the directory's modification date, so trusting purely that value would likely be inaccurate (but of course you mentioned this too in a different issue). It could be experimented with though. There are also other low-level directory iteration solutions we could try for more raw speed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants