Attunity MS-SQL CDC - transient storage capacity planning
Following article addresses the general considerations regarding transient storage allocations.
The basic question is: How much space should be allocated for the transient storage.
The answer relies on the anticipated volumes of CDC-ed data and the expected availability period of that data.
This actually means considering:
As one attempts to plan the required capacity, some simple calculations and assumptions should be made first:
- How many records are planned to be CDC-ed within the availability period.
- What is the storage size of the CDC-ed data?
- How much extra information is placed at the transient storage.
- How this details relate to the transient storage cleanup policy threshold figures.
+---+---------+---------+---+---------+---------+
- Logged data placed at the transient storage consists of INSERT/DELETE/COMMIT/ROLLBACK operations of all REPLICATE-d records.
- In addition to that - BEGIN records are stored also.
- A transient storage record is prefixed with a 2 bytes LENGTH field.
- A transient storage record consists of a Header portion and a Data portion as described below.
|Len|Header#1 | Data#1 |Len|Header#2 | Data#2 |
+---+---------+---------+---+---------+---------+
<-- Record #1 --><-- Record #2 -->
Transient storage record Header portion:
As of V5000, storage is optimized for processing speed, not for space.
Consequently - header information is bundled “AS IS” into a textual format – without any meaningful conversion / compression.
Roughly speaking - 100 bytes should be assumed.
In case where names of the CDC-ed tables’ are long, header information may slightly increase accordingly.
The Data portion is also saved “AS IS” – namely in a binary MS-SQL storage format.
Storage size of common MS-SQL types
- There are 4 leading bytes of informational/identifying data.
- Fixed size data type fields occupy their fixed storage size – as listed below.
- For variable size data types, a 2 byte LENGTH field is maintained, plus the actual size of the allocated data itself.
- For ‘N’ (=NLS) types – length should be interpreted by basic storage unit multiples (normally =2). Thus - NCHAR (2) will occupy 4 bytes.
- A variable number of more embedded control fields are maintained there. Assuming 8 bytes per record at most, as extra control fields is fair.
- Capacity planning calculations should not consider optional NULL values.
+------------------------+--------------------------------+
| data type | Allocated size (bytes) |
+------------------------+--------------------------------+
| Char | As declared |
+------------------------+--------------------------------+
| Datetime | 8 |
+------------------------+--------------------------------+
| Decimal / numeric | Depends on precision: |
| | 1 – 9 5 bytes |
| | 10-19 10 bytes |
| | 20-28 13 bytes |
| | 29-38 17 bytes |
+------------------------+--------------------------------+
| Float | 8 |
+------------------------+--------------------------------+
| Int | 4 |
+------------------------+--------------------------------+
| Money | 8 |
+------------------------+--------------------------------+
| Nchar | As declared (x2) |
+------------------------+--------------------------------+
| Nvarchar | As declared (x2) (at most) |
+------------------------+--------------------------------+
| Real | 4 |
+------------------------+--------------------------------+
| Smalldatetime | 4 |
+------------------------+--------------------------------+
| Smallint | 2 |
+------------------------+--------------------------------+
| Smallmoney | 4 |
+------------------------+--------------------------------+
| Tinyint | 1 |
+------------------------+--------------------------------+
| Varchar | As declared (at most) |
+------------------------+--------------------------------+
| Bigint | 8 |
+------------------------+--------------------------------+
| Ntext | 16 (Pointer / descriptor only) |
+------------------------+--------------------------------+
| Text | 16 (Pointer / descriptor only) |
+------------------------+--------------------------------+
| Uniqueidentifier | 16 |
+------------------------+--------------------------------+