-
-
Notifications
You must be signed in to change notification settings - Fork 791
Enable setting large WebView content #4062
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
* extended webview example app
|
@freakboy3742 Please review this PR |
johnzhou721
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 small notes, otherwise this is a great feature.
| if self._large_content_filepath and os.path.exists( | ||
| self._large_content_filepath | ||
| ): | ||
| os.remove(self._large_content_filepath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure of the goals of WebView.... but navigating back to the previous page is an usual use case for a web view widget. As written, if you navigate to another page from the static content loaded, and you navigate back, it'll probably error. IMO it would be safer to defer removing the large content file when the widget is discussed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I could use a static file name or a list of uuid. Then, the temporary file(s) could be deleted in the WebView destructor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of a cache folder is that you shouldn't need to do this sort of cleanup; it should be an automated operation that happens independent of widget lifecycle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved the cache cleanup to the destructor
changes/4062.misc.md
Outdated
| @@ -0,0 +1 @@ | |||
| The 2MB limit for setting static WebView content has been removed for Windows | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 cosmetic things:
- missing period at end
- IMO it'll be clearer to say "2MB size limit" not just "2MB limit".
- This shouldn't be a
miscnote -- it has user impact. I believebugfixwould be appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, fixed
|
Would this feature also make sense for other platforms, even when there is no documented upper limit? |
|
@t-arn The way you test for that is by making a non platform-specific test, i.e. removing the branch of line 210 of your testbed change. And then we see if it fails on other platforms... if it does but you feel it will be scope creep if worked on in this PR, add a property to the probes and use that to exclude platforms, instead of platform-specific checks in the testbed. If you want to exclude platforms stuff, instead of a probe you can also use a probe-specific method to perform an operation specific to the platform, and pytest.skip on others. That makes it easier to find missing functionalities through test probes. |
This isn't a "feature" - it's a bug fix for a specific detail/limitation of the Winforms implementation of WebView. If a similar limitation exists for other platforms, then yes - that should be resolved; any test that you add for this feature should be in the testbed, which means any platform-specific limitation will be exposed. |
freakboy3742
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Broadly makes sense; a couple of comments about cache strategy inline.
| # according to the Microsoft documentation, the max content size is | ||
| # 2 MB, but in fact, the limit seems to be at 1.5 MB | ||
| file_name = str(uuid.uuid4()) + ".html" | ||
| self._large_content_filepath = toga.App.app.paths.cache / file_name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that a root_url is a required part of the set_content() call, that would seem to be the appropriate cache key. I'd also suggest that the base cache path isn't an appropriate directory location - we should be putting webview cache content into some sort of internal Toga directory (e.g., paths.cache / "toga/webview"). There's also a need to differentiate this widget from others - if I have 2 web views ,both displaying "index.html", I don't want them to collide in the cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should be putting webview cache content into some sort of internal Toga directory (e.g., paths.cache / "toga/webview")
Agreed.
But I'm not so happy with the idea of using the root_url as a filename. An URL makes a bad filename under Windows: It contains characters that are not allowed, like : or / and filenames have a length restriction and we'd need to make them unique, for example by adding a timestamp. Is handling all this worth the effort?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a fair concern. In which case, I'd suggest a hash that is derived from the root_url (e.g., a sha256 hash), rather than a random uuid4.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it so important for you to use the root_url for creating the filename? Hashing the root_url will solve the problem of the invalid characters and the length concern. But we would still need to add something to make the hashed unique. Why is this any better than a random uuid4?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of a cache folder is that you shouldn't need to do this sort of cleanup
IMHO the cache folder is meant to store temporary files and I feel bad about leaving behind large files when it is easy to clean them up.
But as suggested by John, we should probably make the cleanup in the widget's destructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it so important for you to use the root_url for creating the filename? Hashing the root_url will solve the problem of the invalid characters and the length concern. But we would still need to add something to make the hashed unique. Why is this any better than a random uuid4?
Because https://example.com will produce the same sha256 every time; uuid4() won't. There's no way to reconstruct a uuid() from the content once the initial generation is done.
The uniqueness that is required is uniqueness on URL, on a per-webview instance basis. The ID of the webview, and a sha256 of the URL gives you all the uniqueness you need.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But as suggested by John, we should probably make the cleanup in the widget's destructor.
That would work for most cases; but there's no guarantee that the widget's destructor will actually run if the app crashes or the user CTRL-C's the app. But I guess cleaning up on "normal" shutdown makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ID of the webview, and a sha256 of the URL gives you all the uniqueness you need.
Since the root_url is ignored in set_content() on Windows, users will not give much attention to what they use as root_url. I'm personally inclined to always use "about:blank" as this is what Microsoft's CoreWebView2.NavigateToString is doing. Using the hashed root_url in cases like this will overwrite cache files that actually might have different content.
If you want to keep a relation between the cache filename and a certain set_content() operation, we'd better use the hashed content and webview ID as a filename. But this is expensive for large contents.
If you don't like uuid4() for some reason, we could also use str(time.time()) which gives us the current nanoseconds, which is practically just as unique as a uuid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ID of the webview, and a sha256 of the URL gives you all the uniqueness you need.
Since the root_url is ignored in set_content() on Windows, users will not give much attention to what they use as root_url.
Ok - but root_url is a required part of the API, so if you're ignoring it, your app isn't going to operate cross-platform. Having functionality break when you're deliberately not using a required part of the API seems like a user problem to me :-)
If you don't like uuid4() for some reason, we could also use
str(time.time())which gives us the current nanoseconds, which is practically just as unique as a uuid.
It's very likely to be unique, but not guaranteed; and it's not reversible, because you can't now what time the widget was populated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, fixed
| if self._large_content_filepath and os.path.exists( | ||
| self._large_content_filepath | ||
| ): | ||
| os.remove(self._large_content_filepath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of a cache folder is that you shouldn't need to do this sort of cleanup; it should be an automated operation that happens independent of widget lifecycle.
freakboy3742
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of notes inline; plus the test failure (which will be addressed by addressing the comments).
| if len(content) > 1572834: | ||
| # according to the Microsoft documentation, the max content size is | ||
| # 2 MB, but in fact, the limit seems to be at about 1.5 MB | ||
| os.mkdirs(self._large_content_dir, exist_ok=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_last_content_dir is a Path object; it has it's own native .mkdir() method .We don't need to import os.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, fixed
| # There appears to be no way to pass the root_url. | ||
| self.native.NavigateToString(content) | ||
|
|
||
| def get_large_content_file(self, root_url): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this needs to be factored out as a method. It's not being re-used anywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, fixed
|
@freakboy3742 please re-review this PR |
freakboy3742
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Getting closer; a couple of issues related to destruction behavior and testing.
core/src/toga/widgets/webview.py
Outdated
| else: | ||
| self.url = url | ||
|
|
||
| def __del__(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow why this is needed. If the interface object is being deleted, then either the child has been deleted, or it should be a candidate to be deleted. The only way this wouldn't be the case is if something else is retaining a reference to _impl.
If this is to force a specific behavior in the testbed, the underlying problem is that garbage collection is an inherently unpredictable activity. The fix isn't to add fixes like this to try and make it predictable - it's to remove the dependency on an unpredictable behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, fixed
|
|
||
| async def test_static_large_content(widget, probe, on_load): | ||
| """Static large content can be loaded into the page on windows""" | ||
| if toga.platform.current_platform == "windows": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't put platform-specific flags like this in tests. Either the test needs to be run on all platforms, or there's some sort of probe-level feature to check for why the test shouldn't run.
In this case, checking that a large block of content can be added to the widget is a valid test, regardless of the underlying platform. We shouldn't need to validate the existence of an implementation-specific cache file - either the content loads, or it doesn't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, test is now started for all platforms
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're missing my point. There should be no references to current_platform in a test. If there's platform-specific logic - put it in the probe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there's platform-specific logic - put it in the probe
OK, understood.
I added platform-specific code because there already is other platform-specific code in test_webview.py of the testbed. We should probably clean up those parts as well (in an other PR...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, fixed
| with open(file_path, "w") as f: | ||
| f.write(content) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not a major deal, but this sort of pattern is a lot easier with Pathlib:
| with open(file_path, "w") as f: | |
| f.write(content) | |
| file_path.write_text(content, encoding='utf-8') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, fixed
|
@freakboy3742 The latest testbed test for webview clearly shows, that Android also has a limitation on the content size (exactly 2 MB). Linux also seems to have problems with large content. Here, I don't quite see what the real problem is. Is the content too large? Where exactly is the limit? Or, is it "only" a problem in the test assertion (url)? What about the other platforms? Do they have a size limit? Where is the limit? How should we proceed? I don't have a Linux environment. So, if the limit there is lower than 1.5 MB, it will be difficult for me to determine this limit. |
I have no idea. 2MB is evidently safe, because it's passing tests. Linux isn't failing across the board; it's only Qt.
|
* implemented workaround for qt * removed platform-specific code from testbed test
|
The temporary cache file can be created on Android, but it cannot be loaded due to a net::ERR_ACCESS_DENIED error. It seems that I need to implement a WebViewAssetLoader and then use the URL |
This PR adds a workaround, so that WebView content > 2MB can still be displayed without causing a native error: The content is written to a temporary file in the 'Cache' directory. This file is deleted when the WebView implementation is garbage collected.
Although the Microsoft documentation says that the limit is at 2 MB, the tests have revealed that the limit actually is at 1.5 MB
On Android and Qt, the limit is at 2 MB
PR Checklist: