Skip to content

Conversation

@t-arn
Copy link
Contributor

@t-arn t-arn commented Jan 8, 2026

This PR adds a workaround, so that WebView content > 2MB can still be displayed without causing a native error: The content is written to a temporary file in the 'Cache' directory. This file is deleted when the WebView implementation is garbage collected.

Although the Microsoft documentation says that the limit is at 2 MB, the tests have revealed that the limit actually is at 1.5 MB

On Android and Qt, the limit is at 2 MB

PR Checklist:

  • All new features have been tested
  • All new features have been documented
  • I have read the CONTRIBUTING.md file
  • I will abide by the code of conduct

@t-arn
Copy link
Contributor Author

t-arn commented Jan 8, 2026

@freakboy3742 Please review this PR

Copy link
Contributor

@johnzhou721 johnzhou721 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 small notes, otherwise this is a great feature.

if self._large_content_filepath and os.path.exists(
self._large_content_filepath
):
os.remove(self._large_content_filepath)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure of the goals of WebView.... but navigating back to the previous page is an usual use case for a web view widget. As written, if you navigate to another page from the static content loaded, and you navigate back, it'll probably error. IMO it would be safer to defer removing the large content file when the widget is discussed.

Copy link
Contributor Author

@t-arn t-arn Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I could use a static file name or a list of uuid. Then, the temporary file(s) could be deleted in the WebView destructor

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose of a cache folder is that you shouldn't need to do this sort of cleanup; it should be an automated operation that happens independent of widget lifecycle.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved the cache cleanup to the destructor

@@ -0,0 +1 @@
The 2MB limit for setting static WebView content has been removed for Windows
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 cosmetic things:

  1. missing period at end
  2. IMO it'll be clearer to say "2MB size limit" not just "2MB limit".
  3. This shouldn't be a misc note -- it has user impact. I believe bugfix would be appropriate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, fixed

@t-arn
Copy link
Contributor Author

t-arn commented Jan 9, 2026

Would this feature also make sense for other platforms, even when there is no documented upper limit?

@johnzhou721
Copy link
Contributor

@t-arn The way you test for that is by making a non platform-specific test, i.e. removing the branch of line 210 of your testbed change. And then we see if it fails on other platforms... if it does but you feel it will be scope creep if worked on in this PR, add a property to the probes and use that to exclude platforms, instead of platform-specific checks in the testbed.

If you want to exclude platforms stuff, instead of a probe you can also use a probe-specific method to perform an operation specific to the platform, and pytest.skip on others. That makes it easier to find missing functionalities through test probes.

@freakboy3742
Copy link
Member

Would this feature also make sense for other platforms, even when there is no documented upper limit?

This isn't a "feature" - it's a bug fix for a specific detail/limitation of the Winforms implementation of WebView. If a similar limitation exists for other platforms, then yes - that should be resolved; any test that you add for this feature should be in the testbed, which means any platform-specific limitation will be exposed.

Copy link
Member

@freakboy3742 freakboy3742 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Broadly makes sense; a couple of comments about cache strategy inline.

# according to the Microsoft documentation, the max content size is
# 2 MB, but in fact, the limit seems to be at 1.5 MB
file_name = str(uuid.uuid4()) + ".html"
self._large_content_filepath = toga.App.app.paths.cache / file_name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that a root_url is a required part of the set_content() call, that would seem to be the appropriate cache key. I'd also suggest that the base cache path isn't an appropriate directory location - we should be putting webview cache content into some sort of internal Toga directory (e.g., paths.cache / "toga/webview"). There's also a need to differentiate this widget from others - if I have 2 web views ,both displaying "index.html", I don't want them to collide in the cache.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should be putting webview cache content into some sort of internal Toga directory (e.g., paths.cache / "toga/webview")

Agreed.

But I'm not so happy with the idea of using the root_url as a filename. An URL makes a bad filename under Windows: It contains characters that are not allowed, like : or / and filenames have a length restriction and we'd need to make them unique, for example by adding a timestamp. Is handling all this worth the effort?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a fair concern. In which case, I'd suggest a hash that is derived from the root_url (e.g., a sha256 hash), rather than a random uuid4.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it so important for you to use the root_url for creating the filename? Hashing the root_url will solve the problem of the invalid characters and the length concern. But we would still need to add something to make the hashed unique. Why is this any better than a random uuid4?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose of a cache folder is that you shouldn't need to do this sort of cleanup

IMHO the cache folder is meant to store temporary files and I feel bad about leaving behind large files when it is easy to clean them up.

But as suggested by John, we should probably make the cleanup in the widget's destructor.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it so important for you to use the root_url for creating the filename? Hashing the root_url will solve the problem of the invalid characters and the length concern. But we would still need to add something to make the hashed unique. Why is this any better than a random uuid4?

Because https://example.com will produce the same sha256 every time; uuid4() won't. There's no way to reconstruct a uuid() from the content once the initial generation is done.

The uniqueness that is required is uniqueness on URL, on a per-webview instance basis. The ID of the webview, and a sha256 of the URL gives you all the uniqueness you need.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But as suggested by John, we should probably make the cleanup in the widget's destructor.

That would work for most cases; but there's no guarantee that the widget's destructor will actually run if the app crashes or the user CTRL-C's the app. But I guess cleaning up on "normal" shutdown makes sense.

Copy link
Contributor Author

@t-arn t-arn Jan 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@freakboy3742

The ID of the webview, and a sha256 of the URL gives you all the uniqueness you need.

Since the root_url is ignored in set_content() on Windows, users will not give much attention to what they use as root_url. I'm personally inclined to always use "about:blank" as this is what Microsoft's CoreWebView2.NavigateToString is doing. Using the hashed root_url in cases like this will overwrite cache files that actually might have different content.

If you want to keep a relation between the cache filename and a certain set_content() operation, we'd better use the hashed content and webview ID as a filename. But this is expensive for large contents.

If you don't like uuid4() for some reason, we could also use str(time.time()) which gives us the current nanoseconds, which is practically just as unique as a uuid.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ID of the webview, and a sha256 of the URL gives you all the uniqueness you need.

Since the root_url is ignored in set_content() on Windows, users will not give much attention to what they use as root_url.

Ok - but root_url is a required part of the API, so if you're ignoring it, your app isn't going to operate cross-platform. Having functionality break when you're deliberately not using a required part of the API seems like a user problem to me :-)

If you don't like uuid4() for some reason, we could also use str(time.time()) which gives us the current nanoseconds, which is practically just as unique as a uuid.

It's very likely to be unique, but not guaranteed; and it's not reversible, because you can't now what time the widget was populated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, fixed

if self._large_content_filepath and os.path.exists(
self._large_content_filepath
):
os.remove(self._large_content_filepath)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose of a cache folder is that you shouldn't need to do this sort of cleanup; it should be an automated operation that happens independent of widget lifecycle.

Copy link
Member

@freakboy3742 freakboy3742 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of notes inline; plus the test failure (which will be addressed by addressing the comments).

if len(content) > 1572834:
# according to the Microsoft documentation, the max content size is
# 2 MB, but in fact, the limit seems to be at about 1.5 MB
os.mkdirs(self._large_content_dir, exist_ok=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_last_content_dir is a Path object; it has it's own native .mkdir() method .We don't need to import os.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, fixed

# There appears to be no way to pass the root_url.
self.native.NavigateToString(content)

def get_large_content_file(self, root_url):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this needs to be factored out as a method. It's not being re-used anywhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, fixed

@t-arn
Copy link
Contributor Author

t-arn commented Jan 19, 2026

@freakboy3742 please re-review this PR

Copy link
Member

@freakboy3742 freakboy3742 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting closer; a couple of issues related to destruction behavior and testing.

else:
self.url = url

def __del__(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow why this is needed. If the interface object is being deleted, then either the child has been deleted, or it should be a candidate to be deleted. The only way this wouldn't be the case is if something else is retaining a reference to _impl.

If this is to force a specific behavior in the testbed, the underlying problem is that garbage collection is an inherently unpredictable activity. The fix isn't to add fixes like this to try and make it predictable - it's to remove the dependency on an unpredictable behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, fixed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, fixed


async def test_static_large_content(widget, probe, on_load):
"""Static large content can be loaded into the page on windows"""
if toga.platform.current_platform == "windows":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't put platform-specific flags like this in tests. Either the test needs to be run on all platforms, or there's some sort of probe-level feature to check for why the test shouldn't run.

In this case, checking that a large block of content can be added to the widget is a valid test, regardless of the underlying platform. We shouldn't need to validate the existence of an implementation-specific cache file - either the content loads, or it doesn't.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, test is now started for all platforms

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're missing my point. There should be no references to current_platform in a test. If there's platform-specific logic - put it in the probe.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's platform-specific logic - put it in the probe

OK, understood.

I added platform-specific code because there already is other platform-specific code in test_webview.py of the testbed. We should probably clean up those parts as well (in an other PR...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, fixed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, fixed

Comment on lines 255 to 256
with open(file_path, "w") as f:
f.write(content)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a major deal, but this sort of pattern is a lot easier with Pathlib:

Suggested change
with open(file_path, "w") as f:
f.write(content)
file_path.write_text(content, encoding='utf-8')

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, fixed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, fixed

@t-arn
Copy link
Contributor Author

t-arn commented Jan 20, 2026

@freakboy3742 The latest testbed test for webview clearly shows, that Android also has a limitation on the content size (exactly 2 MB).

Linux also seems to have problems with large content. Here, I don't quite see what the real problem is. Is the content too large? Where exactly is the limit? Or, is it "only" a problem in the test assertion (url)?

What about the other platforms? Do they have a size limit? Where is the limit?

How should we proceed?
Should I implement the cache file workaround for all platforms for contents larger than 1.5 MB (or the possibly lower max size on Linux)? Or, use a specific limit for every platform? Do we set a limit even when there is no official limit?

I don't have a Linux environment. So, if the limit there is lower than 1.5 MB, it will be difficult for me to determine this limit.

@freakboy3742
Copy link
Member

What about the other platforms? Do they have a size limit? Where is the limit?

I have no idea. 2MB is evidently safe, because it's passing tests.

Linux isn't failing across the board; it's only Qt.

How should we proceed?

  1. Implement the same type of workaround for any platform where you're comfortable doing so (I'm guessing you're OK with Android - but that's not a requirement if you don't want to)
  2. Implement a probe method for large_content_url() that does the URL transform required by the test. On platforms that still fail, the implementation can be pytest.skip() indicating there's a problem. On platforms that are OK, don't implement the method for other backends, and use "if probe doesn't implement the method, use default handling" behavior.
  3. Open an issue logging the problem and the platforms where it is known to be an issue. This PR (and the changenote) should then be modified to reference that issue number; this PR will then reference, but not close that issue.

t-arn added 3 commits January 22, 2026 22:56
* implemented workaround for qt
* removed platform-specific code from testbed test
@t-arn t-arn changed the title Enable setting large WebView content on Windows Enable setting large WebView content Jan 22, 2026
@t-arn
Copy link
Contributor Author

t-arn commented Jan 24, 2026

The temporary cache file can be created on Android, but it cannot be loaded due to a net::ERR_ACCESS_DENIED error. It seems that I need to implement a WebViewAssetLoader and then use the URL https://appassets.androidplatform.net/cache/toga/webview-xxx/yyyy.html to load the file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants