Introduction
Lately I came across a topic to load data directly into Clustered Columnstore Index (CCI) via SSIS package. It was not working really well for the first time, so I needed to search around and test different SSIS parameters. This article does not discover anything new and is just a short summary for me when I will need this knowledge in the future.
Solution
It is important to set following SSIS OLEDB destination parameters:
Figure 1 – SSIS Destination Parameters (CCI)
- Table Lock – Do not use it for CCI as it will block possible parallel inserts.
- Rows Per batch – 1048576 is the size of delta store so data isn’t really inserted into delta-store and then compressed but compressed directly.
- Maximum insert commit size – always use 0! – Bit confusing as I would set it to 1048576, but 0 means to deliver data in a single batch, allowing to reach the maximum allowed number of rows per Row Group (2).
Other than that it’s good to check and set DefaultBufferMaxRows and DefaultBufferSize property of the package.
Figure 2 – SSIS Package Buffer Settings
DefaultMaxBufferSize can be useful when your rows in one batch cannot fit into SSIS package memory.
Conclusion
Loading data into CCI is not definitely fastest. (Although there are improvements in SQL 2016 regarding parallelism.) Table with row heap is always better then CCI destination regarding bulk insert performance. CCI load can be significantly improved trough using right SSIS parameters and parallelizing load procedure (1).