FEAT: Pass all connection string params to mssql-py-core for bulk copy#439
FEAT: Pass all connection string params to mssql-py-core for bulk copy#439bewithgaurav merged 21 commits intomainfrom
Conversation
- Add extract_auth_type() fallback for Windows Interactive where process_connection_string returns auth_type=None (DDBC handles auth natively but bulkcopy needs it for Azure Identity) - Preserve auth_type in fallthrough return so bulkcopy can attempt fresh token acquisition even when connect-time token fails - Fix get_raw_token docstring to match actual credential behavior - Use 'is None' instead of '== None' in test assertion
Raises ValueError with supported types instead of KeyError when an unsupported auth_type is passed to _acquire_token.
…llthrough return, clean up extract_auth_type parsing
… bewithgaurav/bulkcopy-accesstoken-support
Drop app, workstationid, language, connect_timeout, mars_connection from key_map — the parser rejects these keywords so they never reach connstr_to_pycore_params. Also clean up bool_keys/int_keys to match.
048ec5d to
4953f9c
Compare
📊 Code Coverage Report
Diff CoverageDiff: main...HEAD, staged and unstaged changes
Summary
mssql_python/helpers.pyLines 271-279 271 # mars_connection) are silently dropped here. In the normal connect()
272 # path the parser validates keywords first (validate_keywords=True),
273 # but bulkcopy parses with validation off, so this mapping is the
274 # authoritative filter in that path.
! 275 key_map = {
276 # auth / credentials
277 "uid": "user_name",
278 "pwd": "password",
279 "trusted_connection": "trusted_connection",Lines 303-311 303 "packet size": "packet_size",
304 "connectretrycount": "connect_retry_count",
305 "connectretryinterval": "connect_retry_interval",
306 }
! 307 int_keys = {
308 "packet_size",
309 "connect_retry_count",
310 "connect_retry_interval",
311 "keep_alive",Lines 311-329 311 "keep_alive",
312 "keep_alive_interval",
313 }
314
! 315 pycore_params: dict = {}
316
! 317 for connstr_key, pycore_key in key_map.items():
! 318 raw_value = params.get(connstr_key)
! 319 if raw_value is None:
! 320 continue
321
322 # First-wins: match ODBC behaviour — first synonym in the
323 # connection string takes precedence (e.g. Addr before Server).
! 324 if pycore_key in pycore_params:
! 325 continue
326
327 # ODBC values are always strings; py-core expects native types for int keys.
328 # Boolean params (trust_server_certificate, multi_subnet_failover) are passed
329 # as strings — all Yes/No validation is in connection.rs for single-locationLines 327-345 327 # ODBC values are always strings; py-core expects native types for int keys.
328 # Boolean params (trust_server_certificate, multi_subnet_failover) are passed
329 # as strings — all Yes/No validation is in connection.rs for single-location
330 # consistency with Encrypt, ApplicationIntent, IPAddressPreference, etc.
! 331 if pycore_key in int_keys:
332 # Numeric params (timeouts, packet size, etc.) — skip on bad input
! 333 try:
! 334 pycore_params[pycore_key] = int(raw_value)
! 335 except (ValueError, TypeError):
! 336 pass # let py-core fall back to its compiled-in default
337 else:
338 # String params (server, database, encryption, etc.) — pass through
! 339 pycore_params[pycore_key] = raw_value
340
! 341 return pycore_params
342
343
344 # Settings functionality moved here to avoid circular imports📋 Files Needing Attention📉 Files with overall lowest coverage (click to expand)mssql_python.pybind.logger_bridge.hpp: 58.8%
mssql_python.pybind.logger_bridge.cpp: 59.2%
mssql_python.row.py: 66.2%
mssql_python.pybind.ddbc_bindings.cpp: 69.3%
mssql_python.pybind.ddbc_bindings.h: 69.7%
mssql_python.pybind.connection.connection.cpp: 75.3%
mssql_python.helpers.py: 77.4%
mssql_python.ddbc_bindings.py: 79.6%
mssql_python.pybind.connection.connection_pool.cpp: 79.6%
mssql_python.cursor.py: 84.7%🔗 Quick Links
|
|
code coverage isn't reflecting correct numbers since the tests are a part of mssql-py-core as of now |
Use jq --rawfile instead of ${{ env.* }} expansion for
PATCH_COVERAGE_SUMMARY and LOW_COVERAGE_FILES to prevent
raw Python source in diff hunks from being interpreted as
shell commands.
4953f9c to
0b1941c
Compare
…ate and multi_subnet_failover as strings
There was a problem hiding this comment.
Pull request overview
Refactors the bulk copy (“py-core”) connection context construction so that parsed ODBC connection-string parameters are centrally translated into the parameter format expected by mssql_py_core, reducing inline mapping logic in cursor.bulkcopy().
Changes:
- Added
connstr_to_pycore_params()helper to map/convert parsed connection-string params into py-core snake_case keys (with basic int coercion). - Updated
Cursor._bulkcopy()to buildpycore_contextvia the new helper and simplified sensitive-key cleanup. - Adjusted PR code coverage workflow to pass multiline env values into
jqvia shell variables.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
mssql_python/helpers.py |
Introduces connstr_to_pycore_params() for bulkcopy py-core parameter translation. |
mssql_python/cursor.py |
Uses the new helper to build py-core bulkcopy connection context and streamlines sensitive key cleanup. |
.github/workflows/pr-code-coverage.yml |
Tweaks jq --arg inputs to use shell vars for better handling of multiline env values. |
Comments suppressed due to low confidence (1)
mssql_python/cursor.py:2651
- In the Azure AD token flow, pycore_context is built from connstr_to_pycore_params(params) before adding access_token, which means user_name/password from the original connection string may also be included (and retained until finally). Previously, SQL credentials were only added in the non-AAD path. Consider explicitly removing user_name/password when _auth_type is set (or when setting access_token) to minimize sensitive-data exposure and avoid ambiguous/dual auth inputs being passed to mssql-py-core.
# Translate parsed connection string into the dict py-core expects.
pycore_context = connstr_to_pycore_params(params)
# Token acquisition — only thing cursor must handle (needs azure-identity SDK)
if self.connection._auth_type:
# Fresh token acquisition for mssql-py-core connection
from mssql_python.auth import AADAuth
try:
raw_token = AADAuth.get_raw_token(self.connection._auth_type)
except (RuntimeError, ValueError) as e:
raise RuntimeError(
f"Bulk copy failed: unable to acquire Azure AD token "
f"for auth_type '{self.connection._auth_type}': {e}"
) from e
pycore_context["access_token"] = raw_token
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
_normalize_params and connstr_to_pycore_params now skip a canonical key that was already set by an earlier synonym. Matches ODBC Driver 18 fFromAttrOrProp first-wins semantics confirmed via live test against sqlcconn.cpp. Add TestSynonymFirstWins (12 tests) covering server/addr/ address, trustservercertificate snake_case, and packetsize with-space synonym groups.
db08e21 to
dffadd6
Compare
… bewithgaurav/bulkcopy-connstr-to-pycore
sumitmsft
left a comment
There was a problem hiding this comment.
I am approving this with one question. Although I know the answer but just want to confirm:
Previously bulkcopy effectively defaulted TrustServerCertificate to “yes” when not specified (it used params.get("trustservercertificate", "yes") ... and passed a boolean to py-core).
Now we only forward trust_server_certificate if the keyword is present, which means bulkcopy depends on mssql-py-core’s default when it’s absent.
Since ODBC 18.5 is “secure by default” (Encrypt on; TrustServerCertificate off), can you confirm:
- What does mssql-py-core do when trust_server_certificate is not provided?
- Does this match ODBC 18.5’s effective defaults and our non-bulkcopy connection behavior?
|
@sumitmsft - thanks for the review
|
Work Item / Issue Reference
Summary
This pull request refactors and simplifies how ODBC connection string parameters are translated for use with the bulk copy feature in
mssql_python. The key change is the introduction of a new helper function,connstr_to_pycore_params, which centralizes and standardizes the parameter mapping logic. This reduces code duplication, improves maintainability, and ensures sensitive data is handled more securely.Bulk copy connection context refactor:
connstr_to_pycore_paramsinmssql_python/helpers.pyto translate ODBC connection string parameters into the format expected by the py-core bulk copy path, including type conversion and key normalization.mssql_python/cursor.pyto useconnstr_to_pycore_paramsfor building thepycore_contextdictionary, replacing the previous inline mapping and logic. [1] [2]Security and code quality improvements:
popcalls.Documentation and maintainability:
cursor.pyindicating the mapping from ODBC connection-string keywords.