# SmartConnector External Data Sources

{% hint style="success" %}
**Audience:** Administrators, Solution Architects, Developers

**Purpose:** Explains the external database and cloud storage connections supported inside the <code class="expression">space.vars.smartconnector</code> SQL environment, how to configure credentials securely, and the limitations to plan around before building a <code class="expression">space.vars.workflow</code> that depends on an external source.
{% endhint %}

## Overview

<code class="expression">space.vars.smartconnectors</code> can query external databases and cloud storage directly from the SQL processing environment. This allows a <code class="expression">space.vars.workflow</code> to pull reference data from a PostgreSQL or MySQL database, read a file from an S3 bucket, or fetch a remote CSV over HTTP without uploading a file to the <code class="expression">space.vars.smartconnector</code>. External data source connections are batch-oriented and intended for scheduled or event-driven <code class="expression">space.vars.workflows</code> that need to enrich, join, or transform data against systems outside of <code class="expression">space.vars.Kizen\_company\_name</code>.

Credentials for external sources are stored as JSON-encoded integration secrets and referenced from SQL by name. Sensitive values are never written directly into a SQL statement.

***

## Supported External Sources

The following external data source connections are supported within the <code class="expression">space.vars.smartconnector</code> SQL environment. Each is accessed through a ClickHouse table function referenced directly in SQL. For full reference on each function's arguments and behavior, see the [ClickHouse documentation](https://clickhouse.com/docs) linked in each section.

#### **PostgreSQL**

Query a PostgreSQL database using the [`postgresql()`](https://clickhouse.com/docs/sql-reference/table-functions/postgresql) table function, referencing an integration secret that holds the connection parameters.

```sql
SELECT *
FROM postgresql(my_postgres_secret, table='customers');
```

See **Configuring Credentials** below for the JSON shape of the secret.

#### **MySQL**

Query a MySQL database using the [`mysql()`](https://clickhouse.com/docs/sql-reference/table-functions/mysql) table function, referencing an integration secret. The MySQL function uses the same named-collection pattern as Postgres.

```sql
SELECT *
FROM mysql(my_mysql_secret, table='orders');
```

#### **S3**

Read <code class="expression">space.vars.objects</code> from an S3 bucket using the [`s3()`](https://clickhouse.com/docs/sql-reference/table-functions/s3) table function. Public buckets require the `NOSIGN` parameter so ClickHouse does not attempt to sign the request with credentials.

```sql
SELECT *
FROM s3('https://bucket.s3.amazonaws.com/path/file.csv', NOSIGN, 'CSV');
```

For private buckets, store AWS credentials as an integration secret and reference it in the table function.

#### **URL / HTTP**

Fetch data from an HTTP or HTTPS endpoint using the [`url()`](https://clickhouse.com/docs/sql-reference/table-functions/url) table function. CSV and other common tabular formats are supported.

```sql
SELECT *
FROM url('https://example.com/data.csv', 'CSV');
```

***

## Configuring Credentials

Credentials for external data sources are stored as JSON-encoded integration secrets and referenced from SQL. This keeps sensitive values out of <code class="expression">space.vars.workflow</code> definitions, out of dry run reports, and out of execution history.

The configuration flow has three steps: identify the parameters the table function needs, encode them as JSON, and save that JSON as an integration secret your SQL can reference.

{% stepper %}
{% step %}

#### **Identify the required parameters**

Each ClickHouse table function takes a defined set of connection parameters. For PostgreSQL, the parameters are:

| Parameter  | Description                                   |
| ---------- | --------------------------------------------- |
| `host`     | Hostname or IP of the PostgreSQL server       |
| `port`     | Port number (typically 5432)                  |
| `database` | Database name                                 |
| `user`     | Username with read access to the target table |
| `password` | Password for the user                         |
| `schema`   | (Optional) Schema name; defaults to `public`  |

For MySQL parameters, see the [`mysql()` table function reference](https://clickhouse.com/docs/sql-reference/table-functions/mysql).
{% endstep %}

{% step %}

#### **Encode the parameters as JSON**

Format the connection parameters as a JSON <code class="expression">space.vars.object</code>.

```json
{
  "host": "db.example.com",
  "port": 5432,
  "database": "analytics",
  "user": "kizen_reader",
  "password": "REPLACE_WITH_PASSWORD",
  "schema": "public"
}
```

{% endstep %}

{% step %}

#### **Save as an integration secret**

Save the JSON as an integration secret in <code class="expression">space.vars.Kizen\_company\_name</code> so it can be referenced by name from SQL. Once saved, the secret name is what you reference in your SQL table function. See the per-source examples above.

{% hint style="info" %}
**Note:** Credentials stored as integration secrets are masked in execution logs and never appear in SQL output, dry run reports, XLS output reports, or execution history. Hard-coded credentials in SQL are not supported.
{% endhint %}
{% endstep %}
{% endstepper %}

***

## Known Limitations

**Kafka is not supported:** Kafka integration cannot be used within the <code class="expression">space.vars.smartconnector</code>r SQL environment because the required SASL cryptography libraries are not available in the <code class="expression">space.vars.Kizen\_company\_name</code> environment. Kafka connector code that works outside of <code class="expression">space.vars.Kizen\_company\_name</code> will not execute successfully inside a S<code class="expression">space.vars.smartconnector</code>.

**Streaming data sources are out of scope for SmartConnectors:** <code class="expression">space.vars.smartconnectors</code> are batch-oriented by design and are not intended to consume continuous event streams. <code class="expression">space.vars.workflows</code> that require streaming ingestion should be handled outside of the <code class="expression">space.vars.smartconnector</code> framework.

#### Security Considerations

External connections rely on integration secrets for credential management. Credentials are masked in execution logs and are never exposed in SQL output, so sensitive values do not appear in dry run reports, XLS output reports, or execution history.

Credentials must never be written directly into SQL.

***

## What's Next

Once SQL is reading from the right inputs, whether from input tables or an external source, the next step in the <code class="expression">space.vars.workflow</code> is declaring how that output flows downstream. Execution Variables are the typed contract between SQL output and the rest of the <code class="expression">space.vars.workflow</code>. Continue to [SmartConnector Execution Variables](/docs/concepts/smartconnectors/smartconnector-execution-variables.md).

<details>

<summary>Related Topics</summary>

* [SmartConnector Types](/docs/concepts/smartconnectors/smartconnector-types.md)
* [SmartConnector Diff Checking](/docs/concepts/smartconnectors/smartconnector-diff-checking.md)
* [SmartConnector Execution Variables](/docs/concepts/smartconnectors/smartconnector-execution-variables.md)

</details>


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://developer.kizen.com/docs/concepts/smartconnectors/smartconnector-external-data-sources.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
