The Cosmos data transfer extension provides source and sink capabilities for reading from and writing to containers in Cosmos DB using the Core (SQL) API. Source and sink both support string, number, and boolean property values, arrays, and hierarchical nested object structures.
Note: When specifying the Cosmos extension as the Source or Sink property in configuration, use the name Cosmos-nosql.
The Cosmos extension preserves all JSON properties during data migration, including properties that start with special characters like $type, $id, and $ref. These properties are commonly used by serialization libraries (such as Newtonsoft.Json) to store type information for polymorphic objects or reference tracking.
Example: If your source data contains documents with $type properties used for type discrimination:
{
"id": "1",
"name": "Dog",
"myFavouritePet": {
"$type": "MyProject.Pets.Dog, MyProject",
"Name": "Foo",
"OtherName": "OtherFoo"
}
}

These properties will be preserved exactly as they appear in the source when migrating to the destination. This ensures that applications using type information embedded in JSON properties will continue to work correctly after migration.
Note: Prior to version 3.1.0, properties starting with `$` were filtered out during migration. If you need the old behavior, use an earlier version of the tool.
| Setting | Description | Default |
|---|---|---|
| ConnectionString | Cosmos DB connection string (AccountEndpoint + AccountKey) | |
| UseRbacAuth | Use Role Based Access Control for authentication | false |
| AccountEndpoint | Cosmos DB account endpoint (required for RBAC) | |
| EnableInteractiveCredentials | Prompt for Azure login if default credentials are unavailable | false |
| Database | Cosmos DB database name | |
| Container | Cosmos DB container name | |
| WebProxy | Proxy server URL for Cosmos DB connections | |
| InitClientEncryption | Enable Always Encrypted feature | false |
| LimitToEndpoint | Restrict client to endpoint (see CosmosClientOptions.LimitToEndpoint) | false |
| DisableSslValidation | Disable SSL certificate validation (for local dev only; not for production) | false |
| AllowBulkExecution | Enable bulk execution for optimized performance. Warning: May affect consistency and error handling. | false |
Source and sink require settings used to locate and access the Cosmos DB account. This can be done in one of two ways:
- Using a `ConnectionString` that includes an AccountEndpoint and AccountKey
- Using RBAC (Role Based Access Control) by setting `UseRbacAuth` to true and specifying `AccountEndpoint`, and optionally `EnableInteractiveCredentials` to prompt the user to log in to Azure if default credentials are not available. See migrate-passwordless for how to configure Cosmos DB for passwordless access.
The extension supports bulk execution for Cosmos DB operations. When the AllowBulkExecution setting is set to true, operations such as bulk inserts and updates are optimized for performance. Use with caution, as bulk execution may affect consistency and error handling. Default is false.
Example:
{
"ConnectionString": "AccountEndpoint=https://...",
"Database": "myDb",
"Container": "myContainer",
"AllowBulkExecution": true
}

Source and sink settings both also require parameters to specify the data location within a Cosmos DB account:
- `Database`
- `Container`
Source supports the following optional parameters:
- `IncludeMetadataFields` (`false` by default) - Enables inclusion of built-in Cosmos fields prefixed with `"_"`, for example `"_etag"` and `"_ts"`.
- `PartitionKeyValue` - Allows for filtering to a single partition.
- `Query` - Allows further filtering using a Cosmos SQL statement.
- `WebProxy` (`null` by default) - Enables connections through a proxy.
- `UseDefaultProxyCredentials` (`false` by default) - When `true`, includes default credentials in the WebProxy request. Use this when connecting through an authenticated proxy that returns `407 Proxy Authentication Required`.
- `UseDefaultCredentials` (`false` by default) - When `true`, configures the underlying HttpClient with default network credentials. Use this when the connection to Cosmos DB requires authentication through a proxy.
- `PreAuthenticate` (`false` by default) - When `true`, enables pre-authentication on the HttpClient, which sends credentials with the initial request rather than waiting for a 401/407 challenge. This can save extra round-trips but should only be used when the endpoint is trusted.
Source and Sink support Always Encrypted as an optional parameter. When InitClientEncryption is set to true, the extension will initialize the Cosmos client with the Always Encrypted feature enabled. This allows for the use of encrypted fields in the Cosmos DB container. The extension will automatically decrypt the fields when reading from the source and encrypt the fields when writing to the sink.
The extension will also automatically handle the encryption keys and encryption policy for the client, but it requires UseRbacAuth to be set to true and the user to have the necessary permissions to access the key vault.
Note: To use Always Encrypted, the Cosmos DB container must be pre-configured with the necessary encryption policy, and the user must have the necessary permissions to access the key vault.
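As a rough illustration of the requirements above, source or sink settings with Always Encrypted turned on might look like the following. This is a sketch only: the endpoint is left elided, the database and container names are placeholders, and it assumes the container's encryption policy and the key vault permissions are already in place.

```json
{
  "UseRbacAuth": true,
  "AccountEndpoint": "https://...",
  "Database": "myDb",
  "Container": "myContainer",
  "InitClientEncryption": true
}
```

`ConnectionString` is left out here because client encryption requires `UseRbacAuth` to be `true`.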
{
"ConnectionString": "AccountEndpoint=https://...",
"Database":"myDb",
"Container":"myContainer",
"IncludeMetadataFields": false,
"PartitionKeyValue":"123",
"Query":"SELECT * FROM c WHERE c.category='event'",
"WebProxy":"http://yourproxy.server.com/",
"UseDefaultProxyCredentials": true,
"UseDefaultCredentials": true,
"PreAuthenticate": true
}

Or with RBAC:
{
"UseRbacAuth": true,
"AccountEndpoint": "https://...",
"EnableInteractiveCredentials": true,
"Database":"myDb",
"Container":"myContainer",
"IncludeMetadataFields": false,
"PartitionKeyValue":"123",
"Query":"SELECT * FROM c WHERE c.category='event'",
"InitClientEncryption": false,
"WebProxy":"http://yourproxy.server.com/",
"UseDefaultProxyCredentials": true,
"UseDefaultCredentials": true,
"PreAuthenticate": true
}

For development purposes with SSL validation disabled:
{
"ConnectionString": "AccountEndpoint=https://localhost:8081/;AccountKey=C2y6yDj...",
"Database":"myDb",
"Container":"myContainer",
"DisableSslValidation": true
}

Sink supports the following additional parameters:

- `PartitionKeyPath`: Specifies the partition key path when creating the container (e.g., `/id`) if it does not exist.
- `PartitionKeyPaths`: Use this to supply an array of up to 3 paths for hierarchical partition keys.
- `UseAutoscaleForDatabase`: Specifies whether the database is created with autoscale or manual throughput. Defaults to `false` (manual).
- `RecreateContainer`: Optional, defaults to `false`. Deletes and recreates the container to ensure only newly imported data is present.
- `CreatedContainerMaxThroughput`: Specifies the initial throughput (in RUs) for a newly created container.
- `UseAutoscaleForCreatedContainer`: Enables autoscale for the newly created container.
- `UseSharedThroughput`: Set to `true` to use shared throughput provisioned at the database level.
- `BatchSize`: Optional, defaults to `100`. Sets the number of items to accumulate before inserting.
- `WriteMode`: Specifies the type of data write to use. Options:
  - `InsertStream`
  - `Insert`
  - `UpsertStream`
  - `Upsert`
- `ConnectionMode`: Controls how the client connects to the Cosmos DB service. Options:
  - `Gateway` (default)
  - `Direct`
- `WebProxy`: Optional. Specifies the proxy server URL to use for connections (e.g., `http://yourproxy.server.com/`).
- `UseDefaultProxyCredentials`: Optional, defaults to `false`. When `true`, includes default credentials in the WebProxy request. Use this when connecting through an authenticated proxy that returns `407 Proxy Authentication Required`.
- `UseDefaultCredentials`: Optional, defaults to `false`. When `true`, configures the underlying HttpClient with default network credentials. Use this when the connection to Cosmos DB requires authentication through a proxy.
- `PreAuthenticate`: Optional, defaults to `false`. When `true`, enables pre-authentication on the HttpClient, which sends credentials with the initial request rather than waiting for a 401/407 challenge. This can save extra round-trips but should only be used when the endpoint is trusted.
- `LimitToEndpoint`: Optional, defaults to `false`. When the value of this property is `false`, the Cosmos DB SDK will automatically discover write and read regions, and use them when the configured application region is not available. When set to `true`, availability is limited to the endpoint specified.
  - Note: See the CosmosClientOptions.LimitToEndpoint property. When using the Cosmos DB Emulator Container for Linux, it has been observed that setting the value to `true` enables import and export of data.
- `DisableSslValidation`: Optional, defaults to `false`. Disables SSL certificate validation for development/emulator scenarios.
  - ⚠️ WARNING: Only use this for development purposes. Never use in production environments as it disables critical security checks and makes connections vulnerable to man-in-the-middle attacks.
- `IsServerlessAccount`: Specifies whether the target account uses Serverless instead of Provisioned throughput, which affects the way containers are created.
  - Note: Serverless accounts cannot have shared throughput. See Azure Cosmos DB serverless account type.
- `PreserveMixedCaseIds`: Optional, defaults to `false`. Writes `id` fields with their original casing while generating a separate lowercased `id` field as required by Cosmos.
- `IgnoreNullValues`: Optional. Excludes fields with null values when writing to Cosmos DB.
- `InitClientEncryption`: Optional, defaults to `false`. Uses client-side encryption with the container. Can only be used with `UseRbacAuth` set to `true`.
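For the emulator scenario mentioned in the `LimitToEndpoint` note, a sink configuration could plausibly combine that setting with `DisableSslValidation`. The sketch below reuses the local endpoint and truncated key from the development example. Whether both settings are needed depends on the emulator setup, so treat it as a starting point, not a verified configuration.

```json
{
  "ConnectionString": "AccountEndpoint=https://localhost:8081/;AccountKey=C2y6yDj...",
  "Database": "myDb",
  "Container": "myContainer",
  "PartitionKeyPath": "/id",
  "LimitToEndpoint": true,
  "DisableSslValidation": true
}
```

As with any use of `DisableSslValidation`, this is for local development only.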
{
"ConnectionString": "AccountEndpoint=https://...",
"Database":"myDb",
"Container":"myContainer",
"PartitionKeyPath":"/id",
"RecreateContainer": false,
"BatchSize": 100,
"ConnectionMode": "Gateway",
"MaxRetryCount": 5,
"InitialRetryDurationMs": 200,
"CreatedContainerMaxThroughput": 1000,
"UseAutoscaleForDatabase": false,
"UseAutoscaleForCreatedContainer": true,
"WriteMode": "InsertStream",
"PreserveMixedCaseIds": false,
"IgnoreNullValues": false,
"IsServerlessAccount": false,
"UseSharedThroughput": false,
"InitClientEncryption": false,
"LimitToEndpoint": false
}
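
A variation on the sink settings above for hierarchical partition keys, assuming `PartitionKeyPaths` is supplied in place of `PartitionKeyPath`. The paths `/tenantId` and `/userId` are hypothetical; substitute properties from your own documents (up to 3 paths). `WriteMode` is set to `Upsert` purely to show one of the other listed options.

```json
{
  "ConnectionString": "AccountEndpoint=https://...",
  "Database": "myDb",
  "Container": "myContainer",
  "PartitionKeyPaths": ["/tenantId", "/userId"],
  "WriteMode": "Upsert"
}
```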