refactor!: Introduce fully typed clients #604

Open
vdusek wants to merge 2 commits into master from generated-typed-clients

Conversation

@vdusek
Contributor

@vdusek vdusek commented Feb 5, 2026

Summary

This is a major refactoring that introduces fully typed Pydantic models throughout the client library. The models are generated from the OpenAPI specifications. All API responses now return typed objects instead of raw dictionaries.

This follows up on apify/apify-docs#2182.

Issues

Packages

  • Adds a direct dependency on Pydantic.
  • Removes the dependency on apify-shared.
  • Adds datamodel-code-generator as a dev dependency for model generation.

Key changes

  • Uses datamodel-code-generator tool configured via pyproject.toml to generate Pydantic models based on the OpenAPI specs.
  • Refactors the whole codebase to adopt the new generated models.
  • All resource clients now return typed Pydantic models (Actor, Task, Run, etc.).
  • Adds response wrappers for validating and extracting API response data.
  • Updates list methods to return typed pagination models.
  • Documentation examples now use typed attribute access.
  • Updates the SDK to use the new typed client.
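
The idea of a response wrapper combined with a typed pagination model can be sketched as below. This is only an illustration under stated assumptions: `Actor`, `PaginatedList`, and `parse_actor_list` are hypothetical names, the field set is invented for the example, and stdlib dataclasses stand in for the generated Pydantic models.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Generic, TypeVar

T = TypeVar('T')


# Hypothetical stand-in for a generated model.
@dataclass(frozen=True)
class Actor:
    id: str
    name: str


# Hypothetical typed pagination container.
@dataclass(frozen=True)
class PaginatedList(Generic[T]):
    items: list[T]
    total: int
    offset: int
    limit: int


def parse_actor_list(response: dict[str, Any]) -> PaginatedList[Actor]:
    """Check the assumed `data` envelope and convert raw items into typed objects."""
    data = response.get('data')
    if not isinstance(data, dict):
        raise ValueError('Response is missing the "data" object')
    return PaginatedList(
        items=[Actor(**item) for item in data['items']],
        total=data['total'],
        offset=data['offset'],
        limit=data['limit'],
    )


page = parse_actor_list({
    'data': {
        'total': 1,
        'offset': 0,
        'limit': 10,
        'items': [{'id': 'a1', 'name': 'my-actor'}],
    }
})
first_name = page.items[0].name  # attribute access on a typed item
```

With real Pydantic models the validation step would also check field types, which the dataclass stand-in above does not do.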

Architecture

  • Removes the deep inheritance hierarchies (3 to 5 levels).
  • Removes the inline imports that were previously needed to work around circular dependencies.
    • This required introducing a ClientRegistry, because sibling resource clients import one another.
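
A registry of this kind could look roughly like the sketch below. It is not the PR's actual `ClientRegistry`; the attribute names, the two client classes, and the `last_run` method are assumptions made for illustration. The point is that a resource client asks the registry for its sibling class instead of importing the sibling's module.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ClientRegistry:
    """Holds client classes so that sibling clients never import each other."""

    actor_client: type | None = None
    run_client: type | None = None


class RunClient:
    def __init__(self, registry: ClientRegistry, run_id: str) -> None:
        self.registry = registry
        self.run_id = run_id


class ActorClient:
    def __init__(self, registry: ClientRegistry, actor_id: str) -> None:
        self.registry = registry
        self.actor_id = actor_id

    def last_run(self) -> RunClient:
        # Look the sibling class up in the registry instead of importing it.
        return self.registry.run_client(self.registry, f'{self.actor_id}-last')


# In a real package, only the top-level client module would import every class
# and populate the registry; here everything lives in one file for brevity.
registry = ClientRegistry(actor_client=ActorClient, run_client=RunClient)
run_client = ActorClient(registry, 'my-actor').last_run()
```

Because the lookup happens at call time, the modules defining the clients can be imported in any order.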

Breaking changes

  • Client methods now return Pydantic models instead of dicts.
    • Access patterns change from dict-style (result['key']) to attribute-style (result.key).
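
The change in access pattern can be illustrated with a small sketch. `Run` here is a hypothetical stand-in built with stdlib dataclasses; the real client returns generated Pydantic models whose fields come from the OpenAPI spec.

```python
from dataclasses import dataclass


# Hypothetical stand-in for a generated model.
@dataclass(frozen=True)
class Run:
    id: str
    status: str


raw = {'id': 'abc123', 'status': 'SUCCEEDED'}  # shape of an old dict result
run = Run(**raw)                                # shape of a new typed result

# Before: dict-style access; a misspelled key only fails at runtime.
old_status = raw['status']

# After: attribute-style access; type checkers and IDEs can verify the name.
new_status = run.status
```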

Test plan

  • Raised test concurrency to 16 workers.
  • Added many new tests; coverage is now ~95%.
    • Unit tests: never call the production API; they cover utilities and other functionality through mocks.
    • Integration tests: call the production API.
  • Thanks to the new tests, I was able to fix many issues in the OpenAPI specs.

Next steps

  • Explore the generation of resource clients using openapi-python-client.
  • Fully automate model updates based on changes in apify-api/openapi.
  • This will be released as part of the Apify client v3.0.

@vdusek vdusek requested a review from Pijukatel February 5, 2026 12:10
@vdusek vdusek requested a review from Mantisus February 5, 2026 12:10
@github-actions github-actions bot added this to the 133rd sprint - Tooling team milestone Feb 5, 2026
@github-actions github-actions bot added the t-tooling Issues with this label are in the ownership of the tooling team. label Feb 5, 2026
@vdusek vdusek requested a review from janbuchar February 5, 2026 12:10
@codecov

codecov bot commented Feb 5, 2026

Codecov Report

❌ Patch coverage is 96.56357% with 60 lines in your changes missing coverage. Please review.
✅ Project coverage is 96.32%. Comparing base (308ddf3) to head (f4c476b).

Files with missing lines | Patch % | Lines
...apify_client/_resource_clients/_resource_client.py | 89.86% | 15 Missing ⚠️
src/apify_client/_http_clients/_http_client.py | 93.65% | 8 Missing ⚠️
src/apify_client/_resource_clients/run.py | 93.96% | 7 Missing ⚠️
src/apify_client/_resource_clients/user.py | 86.95% | 6 Missing ⚠️
src/apify_client/_apify_client.py | 96.15% | 4 Missing ⚠️
src/apify_client/_resource_clients/dataset.py | 94.93% | 4 Missing ⚠️
.../apify_client/_resource_clients/key_value_store.py | 96.36% | 4 Missing ⚠️
src/apify_client/_representations.py | 93.54% | 2 Missing ⚠️
src/apify_client/_resource_clients/actor.py | 97.22% | 2 Missing ⚠️
src/apify_client/_resource_clients/schedule.py | 94.11% | 2 Missing ⚠️
... and 5 more
Additional details and impacted files
@@             Coverage Diff             @@
##           master     #604       +/-   ##
===========================================
+ Coverage   76.02%   96.32%   +20.29%     
===========================================
  Files          42       43        +1     
  Lines        2482     4274     +1792     
===========================================
+ Hits         1887     4117     +2230     
+ Misses        595      157      -438     
Flag | Coverage Δ
integration | 94.52% <93.75%> (+25.74%) ⬆️
unit | 75.19% <58.36%> (+9.88%) ⬆️

@vdusek vdusek added the adhoc Ad-hoc unplanned task added during the sprint. label Feb 5, 2026
@Pijukatel Pijukatel self-requested a review February 9, 2026 14:41
@vdusek vdusek requested a review from Copilot February 10, 2026 19:48

Copilot AI left a comment

Pull request overview

Major refactor that replaces dict-based API responses with generated, fully typed Pydantic models across the client, plus a new impit-based HTTP layer.

Changes:

  • Introduces new typed resource clients and response wrappers around generated Pydantic models.
  • Replaces legacy HTTP client with new SyncHttpClient / AsyncHttpClient and updates tests/docs accordingly.
  • Adds extensive new unit/integration coverage and model generation tooling configuration.

Reviewed changes

Copilot reviewed 100 out of 106 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
tests/unit/test_client_timeouts.py Updates timeout tests to use new http clients + timedelta.
tests/unit/test_client_request_queue.py Updates request-queue tests for typed responses and explicit API URLs.
tests/unit/test_client_headers.py Removes legacy header tests (likely superseded by new HTTP layer).
tests/unit/test_client_errors.py Switches error tests to new http clients and improves assertion naming.
tests/unit/test_actor_start_params.py Adds unit tests ensuring timeout query param is used with typed client.
tests/unit/conftest.py Removes patch_basic_url fixture; keeps httpserver fixture.
tests/unit/README.md Documents unit-test isolation requirements.
tests/integration/test_webhook_dispatch.py Adds unified sync/async integration coverage for webhook dispatch endpoints.
tests/integration/test_webhook.py Adds unified sync/async integration coverage for webhooks.
tests/integration/test_user.py Adds unified sync/async integration coverage for user endpoints (typed).
tests/integration/test_store.py Reworks store tests into unified sync/async style with typed models.
tests/integration/test_schedule.py Adds unified sync/async integration coverage for schedules.
tests/integration/test_run_collection.py Removes legacy run-collection integration tests (dict-based).
tests/integration/test_log.py Adds unified sync/async integration coverage for logs.
tests/integration/test_build.py Adds unified sync/async integration coverage for builds.
tests/integration/test_basic.py Removes legacy basic integration tests (dict-based).
tests/integration/test_apify_client.py Adds unified basic integration test using typed models.
tests/integration/test_actor_version.py Adds unified sync/async integration coverage for actor versions.
tests/integration/test_actor_env_var.py Adds unified sync/async integration coverage for actor env vars.
tests/integration/test_actor.py Adds unified sync/async integration coverage for actors.
tests/integration/integration_test_utils.py Removes old integration utilities (now likely in conftest).
tests/integration/README.md Documents integration tests’ requirement for real API tokens.
src/apify_client/errors.py Improves error docstrings and hardens JSON error extraction.
src/apify_client/clients/resource_clients/run_collection.py Removes legacy client implementation (dict-based).
src/apify_client/clients/resource_clients/build.py Removes legacy build client implementation (dict-based).
src/apify_client/clients/resource_clients/actor_env_var_collection.py Removes legacy env var collection client (dict-based).
src/apify_client/clients/base/resource_collection_client.py Removes legacy base list/create abstraction (dict-based).
src/apify_client/clients/base/resource_client.py Removes legacy base resource client abstraction (dict-based).
src/apify_client/clients/base/base_client.py Removes legacy base client (replaced by new resource client base).
src/apify_client/clients/base/actor_job_base_client.py Removes legacy polling/abort base (replaced elsewhere).
src/apify_client/clients/base/__init__.py Removes legacy base exports.
src/apify_client/clients/__init__.py Removes legacy large re-export module.
src/apify_client/_types.py Removes old JSON/ListPage types (moved/replaced).
src/apify_client/_statistics.py Renames Statistics -> ClientStatistics.
src/apify_client/_resource_clients/webhook_dispatch_collection.py Converts to typed models + new base resource client.
src/apify_client/_resource_clients/webhook_dispatch.py Converts get() to typed models and explicit 404 handling.
src/apify_client/_resource_clients/webhook_collection.py Converts list/create to typed models and new representation helpers.
src/apify_client/_resource_clients/webhook.py Converts get/update/delete/test/dispatches to typed models and new base.
src/apify_client/_resource_clients/user.py Converts user endpoints to typed models and new base.
src/apify_client/_resource_clients/store_collection.py Converts store listing to typed models and new base.
src/apify_client/_resource_clients/schedule_collection.py Converts schedule list/create to typed models and new base.
src/apify_client/_resource_clients/schedule.py Converts schedule get/update/delete/log to typed models and new base.
src/apify_client/_resource_clients/run_collection.py Adds new typed run collection client with pagination iteration.
src/apify_client/_resource_clients/request_queue_collection.py Converts list/get_or_create to typed models and new base.
src/apify_client/_resource_clients/log.py Updates to new base client, typed run usage, and task lifecycle handling.
src/apify_client/_resource_clients/key_value_store_collection.py Converts list/get_or_create to typed models and new base.
src/apify_client/_resource_clients/dataset_collection.py Converts list/get_or_create to typed models and new base.
src/apify_client/_resource_clients/build_collection.py Converts list to typed models and new base.
src/apify_client/_resource_clients/build.py Adds new typed build client with log + wait_for_finish.
src/apify_client/_resource_clients/actor_version_collection.py Converts list/create to typed models and new base.
src/apify_client/_resource_clients/actor_env_var_collection.py Adds new typed actor env var collection client.
src/apify_client/_resource_clients/actor_env_var.py Converts env var get/update/delete to typed models and new base.
src/apify_client/_resource_clients/__init__.py Expands exports for log streaming/status watchers; updates resource exports.
src/apify_client/_logging.py Refactors logging helpers, adds redirect logger utilities, updates context injection.
src/apify_client/_internal_models.py Adds internal minimal Pydantic models for polling/validation (non-generated).
src/apify_client/_http_clients/_sync.py Adds new synchronous impit-based HTTP client with retries/backoff.
src/apify_client/_http_clients/_base.py Adds shared HTTP base: headers, param conversion, gzip JSON, timeout scaling.
src/apify_client/_http_clients/_async.py Adds new async impit-based HTTP client with retries/backoff.
src/apify_client/_http_clients/__init__.py Adds package entrypoint for new HTTP clients.
src/apify_client/_http_client.py Removes legacy HTTP client implementation.
src/apify_client/_consts.py Adds shared constants (timeouts, retries, terminal statuses, API URL).
src/apify_client/_client_registry.py Adds sync/async registries for DI and avoiding circular imports.
src/apify_client/__init__.py Switches public import to new _apify_client module.
scripts/utils.py Improves docstring conversion and typing in scripts.
scripts/fix_async_docstrings.py Skips _http_clients and handles missing sync class/methods.
scripts/check_async_docstrings.py Skips _http_clients and handles missing sync classes.
pyproject.toml Updates dependencies, adds datamodel-codegen config, fixes ruff ignore for generated models, bumps test concurrency, adds generate-models task.
docs/03_examples/code/03_retrieve_sync.py Updates examples to attribute access for typed models.
docs/03_examples/code/03_retrieve_async.py Updates examples to attribute access for typed models.
docs/03_examples/code/02_tasks_sync.py Updates examples to typed Task/Run + attribute access.
docs/03_examples/code/02_tasks_async.py Updates examples to typed Task/Run + attribute access.
docs/03_examples/code/01_input_sync.py Updates timeout usage to timedelta.
docs/03_examples/code/01_input_async.py Updates timeout usage to timedelta.
docs/02_concepts/code/05_retries_sync.py Updates retry/timeout params to timedelta.
docs/02_concepts/code/05_retries_async.py Updates retry/timeout params to timedelta.
docs/02_concepts/code/01_async_support.py Updates example to attribute access for typed run id.
docs/01_overview/code/01_usage_sync.py Updates example to attribute access for typed dataset id.
docs/01_overview/code/01_usage_async.py Updates example to attribute access for typed dataset id.
.github/workflows/_tests.yaml Increases CI test concurrency default to 16.

@vdusek vdusek force-pushed the generated-typed-clients branch from a745fa3 to 10fb739 Compare February 18, 2026 12:49

Copilot AI left a comment

Pull request overview

Copilot reviewed 101 out of 103 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (2)

src/apify_client/_logging.py:1

  • In async streaming, checking self._streaming_task.done() before raising RuntimeError prevents detecting when a task exists but has already completed. If a completed task exists and start() is called again, it will create a new task, potentially leading to resource leaks if the old task wasn't properly cleaned up. The condition should either always reject if self._streaming_task exists, or should set self._streaming_task = None after completion.
from __future__ import annotations

src/apify_client/_logging.py:1

  • Same issue as the streaming task - checking self._logging_task.done() before raising RuntimeError means a completed logging task won't prevent start() from being called again, potentially creating resource leaks. The condition should either always reject if self._logging_task exists, or should set self._logging_task = None after completion.
from __future__ import annotations
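
One of the two remedies named in these suppressed comments, clearing the task reference once the task completes, could be sketched as follows. `StreamingLog` and its methods are hypothetical and are not the code in `_logging.py`; only stdlib `asyncio` calls are used.

```python
import asyncio
from typing import Optional


class StreamingLog:
    """Hypothetical stand-in for an async log streamer."""

    def __init__(self) -> None:
        self._streaming_task: Optional[asyncio.Task] = None
        self.starts = 0

    async def _stream(self) -> None:
        await asyncio.sleep(0)  # placeholder for reading log chunks

    def _on_done(self, task: asyncio.Task) -> None:
        # Drop the reference so a finished task cannot linger.
        if self._streaming_task is task:
            self._streaming_task = None

    def start(self) -> asyncio.Task:
        # Reject whenever a task reference exists; finished tasks are cleared
        # by the done callback, so this only fires for an active task.
        if self._streaming_task is not None:
            raise RuntimeError('Streaming task already active')
        self.starts += 1
        self._streaming_task = asyncio.create_task(self._stream())
        self._streaming_task.add_done_callback(self._on_done)
        return self._streaming_task


async def main() -> int:
    log = StreamingLog()
    await log.start()
    await asyncio.sleep(0)  # give the done callback a chance to run
    await log.start()       # allowed again: the first task was cleared
    return log.starts


starts = asyncio.run(main())
```

With this shape, `start()` never needs to consult `done()`, because a completed task is no longer referenced.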

result = await maybe_await(second_build_client.abort())
aborted_build = cast('Build', result)
assert aborted_build is not None
assert aborted_build.status.value in ('SUCCEEDED', 'FAILED', 'ABORTED')

Copilot AI Feb 18, 2026

This assertion allows 'ABORTED' as a valid status, but the comment on line 188 says "Test abort on already finished build". If the build is already finished with SUCCEEDED or FAILED status, calling abort() should not change it to ABORTED. The assertion should only check for SUCCEEDED or FAILED to match the test's intent, or the comment should be updated to clarify that ABORTED is also acceptable.

Suggested change
- assert aborted_build.status.value in ('SUCCEEDED', 'FAILED', 'ABORTED')
+ assert aborted_build.status.value in ('SUCCEEDED', 'FAILED')

  try:
      response_data = response.json()
-     if 'error' in response_data:
+     if isinstance(response_data, dict) and 'error' in response_data:

Copilot AI Feb 18, 2026

The type check isinstance(response_data, dict) is added, but the subsequent access response_data['error']['message'] and response_data['error']['type'] still assumes nested dict structure without validation. Consider also checking that response_data['error'] is a dict before accessing its keys to prevent potential KeyError or TypeError.
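
A minimal sketch of the stricter check suggested here, assuming the error payload has the shape `{'error': {'message': ..., 'type': ...}}` as implied by the comment. The helper name `extract_error` is hypothetical and is not taken from `errors.py`.

```python
from typing import Any, Optional, Tuple


def extract_error(response_data: Any) -> Tuple[Optional[str], Optional[str]]:
    """Return (message, type) from an error payload, or (None, None)."""
    # The top level must be a dict before looking up 'error'.
    if not isinstance(response_data, dict):
        return None, None
    error = response_data.get('error')
    # The nested value must be a dict too, otherwise indexing would fail.
    if not isinstance(error, dict):
        return None, None
    return error.get('message'), error.get('type')


ok = extract_error({'error': {'message': 'Not found', 'type': 'record-not-found'}})
bad_nested = extract_error({'error': 'oops'})
bad_top = extract_error(['not', 'a', 'dict'])
```

Using `.get()` on the inner dict also covers payloads where `message` or `type` is absent.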

  def set_current_package_version(version: str) -> None:
      with open(PYPROJECT_TOML_FILE_PATH, 'r+', encoding='utf-8') as pyproject_toml_file:
-         updated_pyproject_toml_file_lines = []
+         updated_pyproject_toml_file_lines = list[str]()

Copilot AI Feb 18, 2026

The expression list[str] on its own is a generic alias, but calling it as list[str]() instantiates the underlying list, so on Python 3.9+ this does yield an empty list and .append() works. The annotated literal, updated_pyproject_toml_file_lines: list[str] = [], is the more conventional spelling and avoids the extra call.

Suggested change
- updated_pyproject_toml_file_lines = list[str]()
+ updated_pyproject_toml_file_lines: list[str] = []
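
A quick runtime check of the two spellings, assuming Python 3.9 or newer (where `list[str]` can be subscripted at runtime). Subscripting yields a generic alias, and calling that alias instantiates the underlying `list`, so both forms produce an ordinary empty list.

```python
alias = list[str]                 # a generic alias, not a list instance
empty_via_alias = list[str]()     # calling the alias constructs a plain list
empty_annotated: list[str] = []   # annotated literal, the more common spelling

empty_via_alias.append('line')    # works: this is a regular list
alias_is_list_instance = isinstance(alias, list)
```

The annotated literal remains the more conventional choice, since it avoids the extra call.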

@@ -17,8 +17,6 @@

# Find all classes which end with "ClientAsync" (there should be at most 1 per file)
async_class = red.find('ClassNode', name=re.compile('.*ClientAsync$'))

Copilot AI Feb 18, 2026

After removing the check for async_class existence (line 20-21 in the old code), the code now assumes async_class is never None. If no async class is found in a file, this will cause AttributeError when accessing async_class.name on line 22. The check should be restored, or a comment should explain why it's guaranteed to exist.

Suggested change
- async_class = red.find('ClassNode', name=re.compile('.*ClientAsync$'))
+ async_class = red.find('ClassNode', name=re.compile('.*ClientAsync$'))
+ if async_class is None:
+     # No async client class in this file, nothing to fix
+     continue

@vdusek
Contributor Author

vdusek commented Feb 18, 2026

Everything has been addressed. Feel free to add more feedback or reopen any conversations if you feel something wasn't properly resolved.

Labels

  • adhoc: Ad-hoc unplanned task added during the sprint.
  • t-tooling: Issues with this label are in the ownership of the tooling team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

  • Introduce fully typed Python API client
  • Write integration tests

3 participants