
Conversation


@pedrohsdb pedrohsdb commented Jan 28, 2026

Summary

  • Fix critical bug where self._page was used instead of self.page in _wait_for_page_ready_before_action
  • Added diagnostic logging for screenshot timeouts to help debug Flinks issues

Bug Fix

The SkyvernPage class uses self.page, but _wait_for_page_ready_before_action was checking self._page, which doesn't exist. This caused every cached script action to log:

AttributeError: 'ScriptSkyvernPage' object has no attribute '_page'
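
In essence the fix is a one-word attribute change. A minimal sketch of the corrected method, assuming the flow shown in the sequence diagram further down this thread (the import path and the readiness call are illustrative, not the verbatim diff):

```python
from skyvern.webeye.utils.page import SkyvernFrame  # assumed import path


class ScriptSkyvernPage:
    """Sketch only: the real class has many more members."""

    def __init__(self, page=None) -> None:
        self.page = page  # public attribute; there is no `_page`

    async def _wait_for_page_ready_before_action(self) -> None:
        # Before the fix this checked `self._page`, which was never defined,
        # so every cached script action raised AttributeError.
        if self.page is None:
            return
        skyvern_frame = await SkyvernFrame.create_instance(frame=self.page)
        await skyvern_frame.wait_for_page_ready()  # illustrative readiness call, per the diagram below
```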

Added Logging for Screenshot Diagnostics

Screenshot timeout now logs context:

Screenshot timeout | timeout_ms=3000 | url=https://... | viewport=1920x1080 | full_page=False | mode=detailed
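
For a sense of where these fields come from, here is a self-contained illustration using Playwright and structlog; the helper name and call sites are hypothetical, not the PR's actual _current_viewpoint_screenshot_helper():

```python
import structlog
from playwright.async_api import Page
from playwright.async_api import TimeoutError as PlaywrightTimeoutError

LOG = structlog.get_logger()


async def take_screenshot_with_context(
    page: Page, timeout_ms: int = 3000, full_page: bool = False, mode: str = "detailed"
) -> bytes:
    """Take a screenshot and, on timeout, log the same context fields the PR adds."""
    try:
        return await page.screenshot(timeout=timeout_ms, full_page=full_page)
    except PlaywrightTimeoutError:
        viewport = page.viewport_size
        viewport_info = f"{viewport['width']}x{viewport['height']}" if viewport else "unknown"
        LOG.warning(
            "Screenshot timeout",
            timeout_ms=timeout_ms,
            url=page.url,
            viewport=viewport_info,
            full_page=full_page,
            mode=mode,
        )
        raise
```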

Scrape retry attempts now logged:

Scrape attempt failed, will retry with next strategy | attempt=1 | scrape_type=normal | error_type=FailedToTakeScreenshot | url=...
Scrape attempt failed, will retry with next strategy | attempt=2 | scrape_type=normal | error_type=FailedToTakeScreenshot | url=...
All scrape attempts failed | total_attempts=3 | error_type=FailedToTakeScreenshot | url=... | step_order=0 | step_retry=1

This will help identify:

  • Which URLs are causing timeouts
  • What timeout value is actually being used
  • Page viewport size (large pages may timeout)
  • Which retry attempt failed
  • Step context (order and retry index)
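
The retry logging follows a warn-then-error pattern. A rough, self-contained sketch of that pattern (the SCRAPE_TYPE_ORDER values and the FailedToTakeScreenshot stand-in below are illustrative; the real loop lives in build_and_record_step_prompt and is quoted in CodeRabbit's review later in this thread):

```python
import structlog

LOG = structlog.get_logger()

# Illustrative values only; skyvern defines the real SCRAPE_TYPE_ORDER constant.
SCRAPE_TYPE_ORDER = ["normal", "normal", "reload_page"]


class FailedToTakeScreenshot(Exception):
    """Stand-in for skyvern's exception of the same name."""


def scrape_with_fallback(scrape_once, url: str):
    """Warn on intermediate failures; log an error and re-raise on the final one."""
    for idx, scrape_type in enumerate(SCRAPE_TYPE_ORDER):
        try:
            return scrape_once(scrape_type)
        except FailedToTakeScreenshot as e:
            if idx < len(SCRAPE_TYPE_ORDER) - 1:
                LOG.warning(
                    "Scrape attempt failed, will retry with next strategy",
                    attempt=idx + 1,
                    scrape_type=scrape_type,
                    error_type=e.__class__.__name__,
                    url=url,
                )
                continue
            LOG.error(
                "All scrape attempts failed",
                total_attempts=len(SCRAPE_TYPE_ORDER),
                error_type=e.__class__.__name__,
                url=url,
                exc_info=True,
            )
            raise
```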

Test plan

  • Deploy to staging and verify logs appear correctly
  • Have Nick deploy and check if new logs provide insight into slowness
  • Monitor for patterns in failing URLs/viewport sizes

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Enhanced error logging for screenshot and scraping operations with improved diagnostic context.
  • Chores

    • Internal refactoring to improve code consistency and maintainability.

✏️ Tip: You can customize this high-level summary in your review settings.


Important

Fix AttributeError in ScriptSkyvernPage and enhance logging for screenshot and scrape diagnostics.

  • Bug Fix:
    • Fix AttributeError in ScriptSkyvernPage._wait_for_page_ready_before_action by using self.page instead of self._page.
  • Logging Enhancements:
    • Add detailed logging for screenshot timeouts in _current_viewpoint_screenshot_helper() in page.py.
    • Log scrape retry attempts and failures in build_and_record_step_prompt() in agent.py with context like URL and error type.

This description was created by Ellipsis for 48c131a. You can customize this summary. It will automatically update as commits are pushed.



🐛 This PR fixes a critical AttributeError in the _wait_for_page_ready_before_action method by correcting the page attribute reference from self._page to self.page. Additionally, it enhances diagnostic logging for screenshot timeouts and scraping retry attempts to improve debugging capabilities for performance issues.

🔍 Detailed Analysis

Key Changes

  • Bug Fix: Corrected attribute reference in ScriptSkyvernPage._wait_for_page_ready_before_action from non-existent self._page to proper self.page
  • Enhanced Logging: Added comprehensive diagnostic logging for screenshot timeouts with URL, viewport, and timeout context
  • Scraping Diagnostics: Improved logging for scrape retry attempts and failures with detailed error context and step information

Technical Implementation

sequenceDiagram
    participant SP as ScriptSkyvernPage
    participant SF as SkyvernFrame
    participant Page as Browser Page
    
    SP->>SP: _wait_for_page_ready_before_action()
    SP->>SP: Check if self.page exists (fixed from self._page)
    SP->>SF: create_instance(frame=self.page)
    SF->>Page: wait_for_page_ready()
    Page-->>SF: Ready state
    SF-->>SP: Completion
    
    Note over SP: Previously failed with AttributeError
    Note over SP: Now works correctly with self.page

Impact

  • Stability Improvement: Eliminates AttributeError exceptions that were occurring on every cached script action
  • Enhanced Debugging: Screenshot timeouts now log comprehensive context including URL, viewport size, timeout values, and error details
  • Better Monitoring: Scraping retry attempts are now properly logged with attempt counts, error types, and step context for easier troubleshooting
  • Operational Visibility: Teams can now identify patterns in failing URLs, viewport sizes, and timeout scenarios to optimize performance

Created with Palmier

## Summary by CodeRabbit

* **Refactor**
  * Use the public page attribute for readiness handling to improve stability when a page is unavailable.

* **Chores / Diagnostics**
  * Enhanced screenshot error/debug logs with URL and viewport info.
  * Improved scrape retry and failure logs to include attempt counts, error context, and next-step details.

* **Tests**
  * Added unit tests validating readiness behavior and resilience around frame creation and wait routines.

✏️ Tip: You can customize this high-level summary in your review settings.

----

> [!IMPORTANT]
> Fix `AttributeError` in `ScriptSkyvernPage` and enhance logging for diagnostics.
>
>   - **Bug Fix**:
>     - Fix `AttributeError` in `ScriptSkyvernPage._wait_for_page_ready_before_action` by replacing `self._page` with `self.page`.
>   - **Logging Enhancements**:
>     - Add detailed logging for screenshot timeouts in `_current_viewpoint_screenshot_helper()` in `page.py`.
>     - Log scrape retry attempts and failures in `build_and_record_step_prompt()` in `agent.py`.
>   - **Testing**:
>     - Add unit tests in `test_script_skyvern_page.py` to verify correct attribute access and exception handling in `_wait_for_page_ready_before_action`.
>
> This description was created by [Ellipsis](https://www.ellipsis.dev?ref=Skyvern-AI%2Fskyvern-cloud&utm_source=github&utm_medium=referral) for cbe137ff395b4ef7b17a3b86bf839e46fc357a1f. You can [customize](https://app.ellipsis.dev/Skyvern-AI/settings/summaries) this summary. It will automatically update as commits are pushed.


@ellipsis-dev ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed everything up to 48c131a in 26 seconds. Click for details.
  • Reviewed 95 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 0 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

Workflow ID: wflow_dB4dVETPyB1Jr4Py

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.


@ellipsis-dev ellipsis-dev bot left a comment


Important

Looks good to me! 👍

Reviewed 48c131a in 27 seconds. Click for details.
  • Reviewed 95 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 0 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

Workflow ID: wflow_016Xp61R5neh31rs

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.


coderabbitai bot commented Jan 28, 2026

Walkthrough

This PR refactors page reference handling by replacing self._page with self.page in SkyvernPage, enhances error logging in screenshot and scraping operations with structured context (URL, viewport information, timeout details), and improves retry logic to differentiate between intermediate and final failures.

Changes

  • Page reference refactoring (skyvern/core/script_generations/script_skyvern_page.py): replaces self._page with self.page in the _wait_for_page_ready_before_action method and updates frame creation to use the correct page reference.
  • Error handling and logging improvements (skyvern/forge/agent.py, skyvern/webeye/utils/page.py): enhanced error logging in scraping retry logic to log warnings for intermediate failures and errors for final failures; added structured context collection (URL, viewport size, timeout details) in screenshot error handling for improved debugging.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

Suggested reviewers

  • wintonzheng
  • suchintan

Poem

🐰 A page now clear, with references bright,
No more the hidden _page in sight!
When screenshots fail, we log with care,
The viewport, URL—context to spare!
Retries learn to whisper, then shout their pain,
Debugging dreams flow like spring rain! 🌿

🚥 Pre-merge checks: ✅ 2 passed, ❌ 1 failed

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: docstring coverage is 33.33%, which is insufficient; the required threshold is 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)
  • Description Check ✅ Passed: check skipped because CodeRabbit’s high-level summary is enabled.
  • Title check ✅ Passed: the title 'Fix AttributeError in _wait_for_page_ready_before_action' directly and precisely describes the primary bug fix in the changeset, correcting a self._page reference that caused an AttributeError.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@pedrohsdb pedrohsdb merged commit 912d8df into main Jan 28, 2026
7 of 8 checks passed
@pedrohsdb pedrohsdb deleted the pedro/fix-wait-for-page-ready-attribute-error branch January 28, 2026 01:25

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@skyvern/forge/agent.py`:
- Around line 2458-2475: The failure logs currently include full URLs
(task.url); create a small helper like _redact_url(raw_url: str) that uses
urllib.parse.urlsplit/urlunsplit to strip query and fragment, import those
functions at the top of the module, and replace usages of task.url in the
LOG.warning and LOG.error calls inside the scrape retry block (the block where
SCRAPE_TYPE_ORDER is iterated and where step.order/step.retry_index are logged)
with _redact_url(task.url) so logs never emit sensitive query strings or tokens.
🧹 Nitpick comments (1)
skyvern/webeye/utils/page.py (1)

78-85: Avoid blanket exception when deriving viewport_info.
You can compute it safely without a generic except, which keeps unexpected failures visible.

♻️ Suggested refactor
-    try:
-        viewport = page.viewport_size
-        viewport_info = f"{viewport['width']}x{viewport['height']}" if viewport else "unknown"
-    except Exception:
-        viewport_info = "unknown"
+    viewport = page.viewport_size or {}
+    width = viewport.get("width")
+    height = viewport.get("height")
+    viewport_info = f"{width}x{height}" if width and height else "unknown"

As per coding guidelines "Use specific exception classes rather than generic exceptions in error handling".

Comment on lines 2458 to +2475
if idx < len(SCRAPE_TYPE_ORDER) - 1:
    LOG.warning(
        "Scrape attempt failed, will retry with next strategy",
        attempt=idx + 1,
        scrape_type=scrape_type.value if hasattr(scrape_type, "value") else str(scrape_type),
        error_type=e.__class__.__name__,
        url=task.url,
    )
    continue
LOG.exception(f"{e.__class__.__name__} happened in two normal attempts and reload-page retry")
LOG.error(
    "All scrape attempts failed",
    total_attempts=len(SCRAPE_TYPE_ORDER),
    error_type=e.__class__.__name__,
    url=task.url,
    step_order=step.order,
    step_retry=step.retry_index,
    exc_info=True,
)

⚠️ Potential issue | 🟠 Major

Redact URLs in scrape failure logs.
These logs include full URLs, which may carry credentials or tokens in query strings. Please sanitize before logging.

🔒 Suggested change
-                        url=task.url,
+                        url=_redact_url(task.url),
...
-                        url=task.url,
+                        url=_redact_url(task.url),

Example helper (place near the top of the module):

from urllib.parse import urlsplit, urlunsplit

def _redact_url(raw_url: str) -> str:
    parts = urlsplit(raw_url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))

As per coding guidelines "Never expose sensitive information in error messages".
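
For illustration with a made-up URL: such a helper would turn https://example.com/login?token=abc123 into https://example.com/login, so the log line stays useful for spotting failing pages while the query string never reaches the logs.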

🤖 Prompt for AI Agents
In `@skyvern/forge/agent.py` around lines 2458 - 2475, The failure logs currently
include full URLs (task.url); create a small helper like _redact_url(raw_url:
str) that uses urllib.parse.urlsplit/urlunsplit to strip query and fragment,
import those functions at the top of the module, and replace usages of task.url
in the LOG.warning and LOG.error calls inside the scrape retry block (the block
where SCRAPE_TYPE_ORDER is iterated and where step.order/step.retry_index are
logged) with _redact_url(task.url) so logs never emit sensitive query strings or
tokens.
