Vertica recommends you always create a resource pool specifically for each scheduler. Schedulers assume they have exclusive use of the resource pool they are assigned. Using a separate pool for a scheduler lets you fine-tune its impact on your Vertica cluster's performance. You create resource pools within Vertica using the CREATE RESOURCE POOL statement.

If you do not create and assign a resource pool for your scheduler, it uses a portion of the GENERAL resource pool. Vertica suggests you do not use the GENERAL pool for schedulers used in production environments. This fallback to the GENERAL pool is intended as a convenience while you test your scheduler configuration. When you are ready to deploy your scheduler, create a resource pool that you have tuned to its specific needs. Each time you start a scheduler that is using the GENERAL pool, the vkconfig utility displays a warning message.

Not allocating enough resources to your schedulers can result in errors. For example, you may get OVERSHOT DEADLINE FOR FRAME errors if the scheduler cannot load data from all of its assigned topics within a single frame.

See Resource Pool Architecture for more information about resource pools.

Key Resource Pool Settings

A microbatch is a unit of work that processes the partitions of a single Kafka topic within the duration of a frame. The following resource pool settings play an important role in how Vertica loads microbatches and processes partitions:

PLANNEDCONCURRENCY determines the number of microbatches (COPY statements) the scheduler sends to Vertica simultaneously. At the start of each frame, the scheduler creates the number of scheduler threads specified by PLANNEDCONCURRENCY. Each scheduler thread connects to Vertica and loads one microbatch at a time. If there are more microbatches than scheduler threads, the scheduler queues the extra microbatches and loads them as threads become available.

EXECUTIONPARALLELISM determines the maximum number of threads each node creates to process a microbatch's partitions. When a microbatch is loaded into Vertica, its partitions are distributed evenly among the nodes in the cluster. During each frame, a node creates a maximum of one thread for each partition. Each thread reads from one partition at a time until processing completes or the frame ends. If there are more partitions than threads across all nodes, the remaining partitions are processed as threads become available.

QUEUETIMEOUT provides manual control over resource timings. Set the resource pool parameter QUEUETIMEOUT to 0 to allow the scheduler to manage timings. After all of the microbatches are processed, the scheduler waits for the remainder of the frame before it processes the next microbatch. A properly sized configuration includes rest time to plan for traffic surges. See Choosing a Frame Duration for information about the impacts of frame duration size.

For example, consider a resource pool named weblogs_pool, created with CREATE RESOURCE POOL, that loads 2 microbatches simultaneously and lets each node in the Vertica cluster create 10 threads per microbatch to process partitions. For a three-node Vertica cluster, weblogs_pool provides resources for each node to create up to 10 threads to process partitions, or 30 total threads per microbatch.
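
The weblogs_pool example described in this section can be sketched as a single statement. This is a minimal sketch, not the original statement from the documentation: the parameter values are the ones named in the text (2 simultaneous microbatches, 10 threads per node, QUEUETIMEOUT of 0), and memory sizing parameters such as MEMORYSIZE are deliberately left out because the text does not give a value for them.

```sql
-- Sketch only: values are taken from the prose above; memory sizing is omitted.
CREATE RESOURCE POOL weblogs_pool
    PLANNEDCONCURRENCY 2      -- load 2 microbatches (COPY statements) at a time
    EXECUTIONPARALLELISM 10   -- up to 10 threads per node for each microbatch
    QUEUETIMEOUT 0;           -- let the scheduler manage timings
```

With these settings on a three-node cluster, each microbatch can use up to 3 x 10 = 30 threads, and a topic with more than 30 partitions would have its remaining partitions processed as threads become free.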