OceanBase (Oracle Mode)
OceanBase is a natively distributed relational database developed by Ant Group. It is compatible with both MySQL and Oracle syntax and features high availability, high performance, and strong consistency. Tapdata supports OceanBase (Oracle Mode) as both a source and a target database, enabling you to build real-time, multi-source data pipelines for synchronization and integration across heterogeneous systems.
Supported Versions and Architectures
- Version: OceanBase 4.0 and above
- Architecture: Single-node or clustered deployments
Supported Data Types
Category | Data Types |
---|---|
String | CHAR, NCHAR, VARCHAR2, NVARCHAR2 |
Numeric | INTEGER, NUMBER, FLOAT, BINARY_FLOAT, BINARY_DOUBLE |
Date/Time | DATE, TIMESTAMP, TIMESTAMP WITH TIME ZONE, TIMESTAMP WITH LOCAL TIME ZONE, INTERVAL |
LOB | CLOB, NCLOB, BLOB, XMLTYPE |
Supported Sync Operations
DML: INSERT, UPDATE, DELETE
Tip: When OceanBase is used as a target database, you can configure write policies through advanced settings in the task node: for insert conflicts, you can choose to update or discard; for update failures, you can choose to insert or just log the errors.
DDL: ADD COLUMN, CHANGE COLUMN, DROP COLUMN, RENAME COLUMN
Prerequisites
- As Source Database
- As Target Database
As Source Database
Ensure the network where the TapData Agent resides is included in the tenant allowlist of OceanBase and has internal network access.
Log in to the tenant as root and create a user for data synchronization using the following command format:
CREATE USER 'username' IDENTIFIED BY 'password';
- username: Enter user name.
- password: Corresponding password.
Grant read permissions to the user at the database level. You can also apply more fine-grained permission controls as needed:
- Full Data Synchronization
-- Replace 'username' with the actual username
GRANT
CREATE SESSION,
SELECT ANY TABLE
TO username;
- Full + Incremental Data Synchronization
-- Replace 'username' with the actual username
GRANT CREATE SESSION,
ALTER SESSION,
EXECUTE_CATALOG_ROLE,
SELECT ANY DICTIONARY,
SELECT ANY TRANSACTION,
SELECT ANY TABLE
TO username;
To support incremental data reading from OceanBase, deploy and configure the following components:
Deploy OBProxy: the proxy layer that handles client connections and load balancing.
Deploy oblogproxy: the incremental log proxy that connects to OceanBase and fetches CDC logs.
Contact the Tapdata team to obtain the OB-Log-Decoder installation package.
In Oracle mode, the OceanBase oblogproxy component only supports C/C++ clients. To enable cross-language incremental data consumption, Tapdata provides a dedicated OB-Log-Decoder component. This component wraps liboblog to parse and output CDC data efficiently and integrates seamlessly with the Tapdata platform.
After extracting the OB-Log-Decoder package, copy the obcdcServer file to the OceanBase installation path:
${work_directory}/home/admin/oceanbase/bin
Set the dynamic library environment variable:
export LD_LIBRARY_PATH=${work_directory}/home/admin/oceanbase/lib64/
Start the log decoding service, using -p to specify the config path:
./obcdcServer -p ${cdc_conf_path}
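The copy, environment, and start steps above can be collected into one small script. This is only a sketch under stated assumptions: the function names are hypothetical, the extracted package directory is assumed to contain obcdcServer at its top level, and the three arguments stand in for ${work_directory}, the extraction directory, and ${cdc_conf_path}.

```shell
#!/bin/sh
# Sketch of the OB-Log-Decoder deployment steps described above.
# All function names and arguments are illustrative placeholders.

# Print the directory that obcdcServer is copied into.
decoder_bin_dir() {
    printf '%s/home/admin/oceanbase/bin\n' "$1"
}

# Print the directory exported as LD_LIBRARY_PATH.
decoder_lib_dir() {
    printf '%s/home/admin/oceanbase/lib64/\n' "$1"
}

# Copy the binary, set the library path, and start the decoder.
# $1 = work directory, $2 = extracted package directory, $3 = config path
deploy_decoder() {
    bin_dir=$(decoder_bin_dir "$1")
    cp "$2/obcdcServer" "$bin_dir/" || return 1
    LD_LIBRARY_PATH=$(decoder_lib_dir "$1")
    export LD_LIBRARY_PATH
    # Run from the bin directory in a subshell so the caller's
    # working directory is left unchanged.
    ( cd "$bin_dir" && ./obcdcServer -p "$3" )
}
```

A call might look like `deploy_decoder /data/ob /tmp/ob-log-decoder /data/ob/cdc.conf`, where all three paths are made up for illustration.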
The default CDC user for OceanBase in Oracle mode is cluster_user=root@sys.
As Target Database
Ensure the network where the TapData Agent resides is included in the tenant allowlist of OceanBase and has internal network access.
Log in to the tenant as root and create a user for data synchronization using the following command format:
CREATE USER 'username' IDENTIFIED BY 'password';
- username: Enter user name.
- password: Corresponding password.
Grant full permissions to the user at the database level. You can also apply more fine-grained permission controls as needed:
-- Replace 'username' with the actual username
-- Under the user's own Schema
GRANT
CREATE SESSION,
CREATE ANY TABLE,
UNLIMITED TABLESPACE
TO username;
Connect to OceanBase (Oracle Mode)
In the left navigation bar, click Connections.
Click Create on the right side of the page.
In the pop-up dialog, search for and select OceanBase (Oracle).
On the redirected page, fill in the connection details as described below:
- Connection Settings
- Name: Enter a business-meaningful name for easy identification and management.
- Type: Select whether OceanBase will act as a source or target.
- Host: Host address of the database, typically the OBProxy deployment address. Both IP addresses and domain names are supported.
- Port: Access port for the database. Default is 2881.
- Tenant: Logical tenant name in OceanBase (similar to an instance in traditional databases).
- Database: Name of the database. One connection corresponds to one database; create multiple connections for multiple databases.
- User: Username within the tenant.
- Password: Password corresponding to the username.
- Connection Parameter String: Optional JDBC parameters (e.g., encoding, SSL); leave empty if not needed.
- RPC Port: OBProxy’s service port to access OceanBase, default is 2882.
- Raw Log Server Host: Host (IP or domain) where OB-Log-Decoder is deployed.
- Raw Log Server Port: Listening port for OB-Log-Decoder, default is 8190.
- CDC User: Account for accessing incremental logs, default is root@sys.
- CDC Password: Password for the CDC account.
- Time Zone: Defaults to UTC (UTC+0). If changed, only affects time zone–less types like TIMESTAMP. TIMESTAMP WITH TIME ZONE and DATE are unaffected.
- Advanced Settings
- CDC Log Caching: Mines the source database's incremental logs once and lets multiple tasks share the same mining process, which reduces duplicate reads and minimizes the impact of incremental synchronization on the source database. After enabling this feature, you must select an external storage to store the incremental log information.
- Agent Settings: Defaults to Platform Automatic Allocation; you can also manually specify an agent.
- Model Load Time: If the data source has fewer than 10,000 models, their schema is refreshed every hour. If the number of models exceeds 10,000, the refresh takes place daily at the time you specify.
- Enable Heartbeat Table: Available when OceanBase is used as a source, or as both source and target. Tapdata will create a _tapdata_heartbeat_table in the source and update it every 10 seconds for health monitoring (requires proper permissions).
Click Test. If successful, click Save.
If the connection test fails, follow the on-screen instructions to resolve the issue.
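When a connection test fails, one thing worth ruling out is basic network reachability from the Agent host to the ports listed in the connection settings. The sketch below uses the default ports from this page (2881, 2882, 8190). It assumes that bash and the GNU coreutils timeout command are available, that the RPC port is reached on the same host as the database port, and that hosts are given as names or IPv4 addresses; the function names are hypothetical.

```shell
#!/bin/sh
# Reachability sketch for the default ports named in the connection settings.
# Host arguments are placeholders; adjust ports if non-default values are used.

# Print one host:port line per endpoint to check.
# $1 = OBProxy (database) host, $2 = OB-Log-Decoder host
endpoints() {
    printf '%s:2881\n' "$1"   # Port (database access)
    printf '%s:2882\n' "$1"   # RPC Port
    printf '%s:8190\n' "$2"   # Raw Log Server Port
}

# Return 0 if a TCP connection to host ($1) and port ($2) opens within 3 seconds.
# Relies on bash's /dev/tcp pseudo-device.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Check every endpoint and print OK or FAIL for each.
check_all() {
    endpoints "$1" "$2" | while IFS=: read -r host port; do
        if port_open "$host" "$port"; then
            echo "OK   $host:$port"
        else
            echo "FAIL $host:$port"
        fi
    done
}
```

For example, `check_all obproxy.internal decoder.internal` (both host names are made up) prints one line per endpoint. A FAIL line points to an allowlist, firewall, or service problem rather than a credentials problem.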
Node Advanced Features
When OceanBase (Oracle Mode) is used as a source node in a data replication or transformation task, Tapdata provides built-in advanced options to improve performance and handle complex scenarios.
Single Table Concurrent Read (Disabled by default): When enabled, Tapdata splits large tables based on partitions and launches multiple read threads in parallel, reading different partitions simultaneously. This significantly improves full read performance. Recommended for large tables and when database resources are sufficient.
- Max Split Count for Single Table: Defaults to 16.
- Single Table Concurrent Read Thread Size: Defaults to 8.
These parameters can be adjusted in the UI.
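As a rough analogy for the general idea of partition-based concurrent reads, the sketch below processes a list of partitions with a bounded number of parallel workers. It does not describe Tapdata's internal implementation: the function name is hypothetical, the echo command stands in for an actual partition read, and xargs -P is assumed to be available (it is supported by GNU and BSD xargs).

```shell
#!/bin/sh
# Analogy only: read each partition with at most $1 workers in parallel.
# $1 = worker count (compare "Thread Size"), remaining args = partition names
read_partitions_parallel() {
    threads=$1
    shift
    # One partition name per line; xargs starts up to $threads workers,
    # each handling a single partition at a time.
    printf '%s\n' "$@" |
        xargs -P "$threads" -n 1 sh -c 'echo "read partition $1"' _
}
```

Calling `read_partitions_parallel 8 p0 p1 p2 p3` prints one line per partition in whatever order the workers finish. With more partitions than workers, the remaining partitions are picked up as workers become free.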