Salesforce Bulk API: Add Batch To The Job


The Salesforce Bulk API provides a robust solution for managing sizable data volumes within your Salesforce instance. Its capabilities facilitate the efficient loading and deletion of large datasets. To refresh your memory on fundamental Bulk API concepts, I encourage you to revisit my prior article on Bulk API Basics.

In Bulk API, jobs act as containers for your data operations. You define the operation type (insert, update, delete) and target object for the job. Batches, on the other hand, hold the actual data records to be processed within the job. Each job can have multiple batches, allowing you to segment your data for optimal performance and error handling.
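
A quick sketch makes the job-and-batch relationship concrete. The following Python snippet creates a Bulk API (classic) job with the requests library; the instance URL, session ID, and the Invoice__c object name are placeholders you would swap for your own org's values.

```python
import requests

# Placeholders: substitute your org's instance URL, session ID, and API version.
INSTANCE = "https://yourInstance.my.salesforce.com"
SESSION_ID = "<session id obtained at login>"
API_VERSION = "58.0"

# A job is the container: it fixes the operation, object, and content type
# before any batches are added to it.
job_xml = """<?xml version="1.0" encoding="UTF-8"?>
<jobInfo xmlns="http://www.force.com/2009/06/asyncapi/dataload">
  <operation>insert</operation>
  <object>Invoice__c</object>
  <contentType>XML</contentType>
</jobInfo>"""

response = requests.post(
    f"{INSTANCE}/services/async/{API_VERSION}/job",
    headers={
        "X-SFDC-Session": SESSION_ID,
        "Content-Type": "application/xml; charset=UTF-8",
    },
    data=job_xml,
)
print(response.text)  # the jobInfo response carries the new job's id
```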


Ways to Add Batches to a Bulk Job


There are two primary ways to add batches to a bulk job:


POST Request


This is the traditional Bulk API approach, where you send a POST request to the /services/async/<API version>/job/<jobId>/batch endpoint. The request body carries the records themselves, in CSV, XML, or JSON format (matching the contentType declared when the job was created), along with relevant headers such as Content-Type and X-SFDC-Session for authorization.
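
As a minimal sketch, posting a CSV batch to an existing job could look like this in Python (the instance URL, session ID, and job ID are placeholders, and the job is assumed to have been created with a CSV contentType):

```python
import requests

INSTANCE = "https://yourInstance.my.salesforce.com"   # placeholder
SESSION_ID = "<session id>"                           # placeholder
JOB_ID = "<jobId returned when the job was created>"  # placeholder

# The records travel in the request body; their format must match the
# contentType declared when the job was created (CSV here).
csv_records = (
    "Name,Customer__c\n"
    "INV-001,001XXXXXXXXXXXXXXX\n"
    "INV-002,001XXXXXXXXXXXXXXX\n"
)

response = requests.post(
    f"{INSTANCE}/services/async/58.0/job/{JOB_ID}/batch",
    headers={
        "X-SFDC-Session": SESSION_ID,
        "Content-Type": "text/csv; charset=UTF-8",
    },
    data=csv_records,
)
print(response.text)  # batchInfo XML with the new batch's id and state
```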


Bulk API 2.0


This newer version simplifies batch handling: you create the job with a POST to the /services/data/<API version>/jobs/ingest endpoint, upload all of your CSV records in a single PUT request to the job's batches resource, and then mark the job as UploadComplete. Salesforce divides the uploaded data into batches internally, so you never manage batches yourself.
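
Here is a hedged sketch of that 2.0 flow, again with placeholder org values: create the ingest job, upload every record in one PUT request, then mark the upload complete so Salesforce can split the data into batches internally.

```python
import requests

INSTANCE = "https://yourInstance.my.salesforce.com"  # placeholder
AUTH = {"Authorization": "Bearer <access token>"}    # placeholder token

# 1. Create the ingest job via the REST /jobs/ingest resource.
job = requests.post(
    f"{INSTANCE}/services/data/v58.0/jobs/ingest",
    headers={**AUTH, "Content-Type": "application/json"},
    json={"object": "Invoice__c", "operation": "insert", "contentType": "CSV"},
).json()

# 2. Upload all records in a single PUT; Salesforce handles batching itself.
requests.put(
    f"{INSTANCE}/services/data/v58.0/jobs/ingest/{job['id']}/batches",
    headers={**AUTH, "Content-Type": "text/csv"},
    data="Name,Customer__c\nINV-001,001XXXXXXXXXXXXXXX\n",
)

# 3. Signal that the upload is finished so processing can begin.
requests.patch(
    f"{INSTANCE}/services/data/v58.0/jobs/ingest/{job['id']}",
    headers={**AUTH, "Content-Type": "application/json"},
    json={"state": "UploadComplete"},
)
```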

Key Considerations for Adding Batches

  • Batch Size: In the original Bulk API, each batch can contain at most 10,000 records and 10 MB of data. Work within those limits and adjust batch size based on your data size and processing needs.
  • Record Format: Ensure your records adhere to the correct format for your target object, including required fields and proper data types.
  • Error Handling: Anticipate potential errors at the batch level. Leverage batch success flags and detailed error reports to identify and address issues efficiently.
  • Monitoring and Tracking: Utilize Bulk API job and batch information endpoints to monitor progress, track completion, and diagnose any anomalies (a minimal polling sketch follows this list).
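
For the monitoring point above, here is a minimal polling sketch against the classic batch-status endpoint (placeholder org values; the XML namespace is the one Salesforce uses for Bulk API responses):

```python
import time
import xml.etree.ElementTree as ET

import requests

INSTANCE = "https://yourInstance.my.salesforce.com"  # placeholder
SESSION_ID = "<session id>"                          # placeholder
JOB_ID, BATCH_ID = "<jobId>", "<batchId>"            # placeholders

NS = "{http://www.force.com/2009/06/asyncapi/dataload}"

# Poll the batch until it leaves the Queued/InProgress states.
while True:
    info = requests.get(
        f"{INSTANCE}/services/async/58.0/job/{JOB_ID}/batch/{BATCH_ID}",
        headers={"X-SFDC-Session": SESSION_ID},
    )
    state = ET.fromstring(info.content).find(f"{NS}state").text
    if state not in ("Queued", "InProgress"):
        break
    time.sleep(10)  # back off between polls to respect API limits

print("final batch state:", state)
```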

Advanced Batch Techniques

  • Chunking: Break down large files into smaller chunks to prevent timeouts and handle processing failures gracefully (see the chunking sketch after this list).
  • Sequencing: Define dependencies between batches to ensure the correct order of execution for complex data operations.
  • Compression: Reduce data transfer size and improve processing times by compressing your batch files before sending them to Salesforce.
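
For the chunking bullet above, a small illustrative helper (the 10,000-record ceiling is the classic Bulk API's per-batch limit; the submit step is a stand-in for the POST shown earlier):

```python
# Illustrative helper: split records into chunks that respect the
# classic Bulk API's 10,000-records-per-batch limit.
def chunk_records(records, chunk_size=10_000):
    for start in range(0, len(records), chunk_size):
        yield records[start:start + chunk_size]

# Sample data purely for demonstration.
records = [{"Name": f"INV-{i:05d}"} for i in range(25_000)]

for chunk in chunk_records(records):
    # In a real script, each chunk would become one POST to .../job/<jobId>/batch.
    print(f"would submit a batch of {len(chunk)} records")
```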

In this article, we will explore how to add batches to a Bulk API job in a step-by-step approach:

  • Identify the Invoice Object and Account ID
    • Locate the custom object "Invoice" within your Salesforce org.
    • Retrieve the Account ID that will serve as the Customer ID for the Invoice.


  • Construct the API Request Endpoint
    • Utilize your preferred API request tool.
    • Formulate the endpoint URL adhering to the following structure: <Salesforce instance URL>/services/async/<API version>/job/<jobId>/batch
  • Configure Request Headers
    • Navigate to the "Headers" section within the request tool.
    • Incorporate the necessary headers:
      • "Content-Type" (application/xml for an XML payload)
      • "X-SFDC-Session" (your session ID)

  • Prepare the Data Payload
    • Generate an XML file named "Invoices.xml" to house the records intended for processing.
    • Meticulously construct XML nodes encompassing all mandatory object properties and their corresponding values.
    • Embed the Account ID within the "Customer__c" field to establish the Account lookup (a sample file follows this step).
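
A sample Invoices.xml could look like the following. The sObjects root element and its namespace are the Bulk API's XML batch format; the Name values and the Account ID in Customer__c are placeholders, and any other required fields on your Invoice object would be added as sibling nodes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sObjects xmlns="http://www.force.com/2009/06/asyncapi/dataload">
  <sObject>
    <Name>INV-00001</Name>
    <!-- placeholder Account ID: populates the Customer__c lookup -->
    <Customer__c>001XXXXXXXXXXXXXXX</Customer__c>
  </sObject>
  <sObject>
    <Name>INV-00002</Name>
    <Customer__c>001XXXXXXXXXXXXXXX</Customer__c>
  </sObject>
</sObjects>
```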

  • Attach the Data Payload
    • Switch to the "Body" section of the request.
    • Select "binary" as the data format, aligning with file-based processing.
    • Affix the "Invoices.xml" file to the request body.
  • Initiate the API Request
    • Click the "Send" button to dispatch the API request (a scripted equivalent follows this step).

  • Analyze the Response
    • Scrutinize the "batchInfo" object embedded within the API response, paying close attention to these key elements:
      • id: The unique identifier of the batch.
      • jobId: The identifier of the job with which the batch is associated.
      • state: The current status of the batch (Queued, InProgress, Completed, Failed, NotProcessed).
      • createdDate: The timestamp denoting the batch's creation.
      • systemModstamp: The timestamp indicating the batch's last modification.
      • numberRecordsProcessed: The number of records processed so far in the batch.
      • numberRecordsFailed: The total count of records that failed processing.
      • totalProcessingTime: The overall processing duration in milliseconds.
      • apiActiveProcessingTime: The time actively spent processing the batch within Salesforce's servers (in milliseconds).
      • apexProcessingTime: The time consumed executing Apex triggers during batch processing (in milliseconds).
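
To read those fields programmatically rather than by eye, a short parsing sketch (it assumes `response` holds the reply from the batch request sent above):

```python
import xml.etree.ElementTree as ET

NS = "{http://www.force.com/2009/06/asyncapi/dataload}"
batch_info = ET.fromstring(response.content)  # reply from the batch POST above

# Print each element called out above; elements absent from the
# response are simply skipped.
for field in ("id", "jobId", "state", "createdDate", "systemModstamp",
              "numberRecordsProcessed", "numberRecordsFailed",
              "totalProcessingTime", "apiActiveProcessingTime",
              "apexProcessingTime"):
    element = batch_info.find(f"{NS}{field}")
    if element is not None:
        print(f"{field}: {element.text}")
```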


Conclusion

Salesforce Bulk API unlocks powerful capabilities for managing large datasets within your Salesforce org. By leveraging jobs and batches efficiently, you can seamlessly load, update, or delete substantial data volumes with agility and precision. Remember the key considerations for adding batches, and explore advanced techniques like chunking, sequencing, and compression to optimize your workflows. Don't hesitate to revisit the Bulk API Basics article for a refresher on the core concepts. Embrace the Bulk API and empower your Salesforce instance to scale with your data needs.

