
Thread: File Channel - Target File Cleanup

  1. #1
    NigelP, Junior Member (joined May 2015, 2 posts)

    File Channel - Target File Cleanup

    Hello,

    I'm currently working with a customer that's using File Channel (FC) to move data into the cloud. They're using FC because of its one-to-many functionality.

    Because one Target endpoint will feed many tasks, we can't enable "Delete processed files", so we need another method for removing processed files.

    We have learnt that there's a naming convention for Full Load files so we can safely delete them:

    Code:
    The first 8 digits in the directory name are the id of the table and the remaining characters are the timestamp in hex. If you look at the Replicate log you can see that every table gets a unique numeric id.
    For example, in the following message from the log:
    00004256: 2017-08-15T16:12:55 [TASK_MANAGER ]I: Start loading table 'TEST'.'NATION' (Id = 2) by subtask 2. Start load timestamp 000556CA8973BF40 (replicationtask_util.c:1040)

    you can see that the table TEST.NATION got id=2, so the first 8 digits of the load directory that includes the table will be 00000002: load00000002xxxxxxx (xxx is the timestamp in hex).
    And:

    Code:
    Following the example I gave in my last comment, the FL directory name that was created is:
    load00000002000556CA8973BF40
    (as you can see, the id correlates to the table id and the timestamp correlates to the start load timestamp).
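The naming convention quoted above can be checked mechanically before anything is deleted. The following is only a rough sketch in POSIX shell: the function name `parse_load_dir` is made up for illustration, and it assumes the 8-digit id is decimal and that the timestamp is always 16 hex digits, as in the single example from the log.

```shell
#!/bin/sh
# Split a Full Load directory name of the form
#   load<8-digit table id><16-hex-digit timestamp>
# and print "<table id> <timestamp as a decimal integer>".
# Returns 1 if the name does not follow that pattern.
parse_load_dir() {
    name=$1
    # Expect exactly: "load" + 8 digits + 16 hex digits.
    if ! printf '%s\n' "$name" | grep -Eq '^load[0-9]{8}[0-9A-Fa-f]{16}$'; then
        return 1
    fi
    id_part=$(printf '%s' "$name" | cut -c5-12)
    ts_part=$(printf '%s' "$name" | cut -c13-28)
    # Strip leading zeros from the id (assumption: the id is decimal).
    id=$(printf '%s' "$id_part" | sed 's/^0*//')
    [ -n "$id" ] || id=0
    # Convert the hex timestamp to a decimal integer.
    ts=$(printf '%d' "0x$ts_part")
    printf '%s %s\n' "$id" "$ts"
}

parse_load_dir load00000002000556CA8973BF40
```

For the directory from the log example this prints `2 1502802773000000`. If that number is read as microseconds since the Unix epoch (an assumption, nothing in the thread confirms it), it falls on 2017-08-15, in line with the date in the log message.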
    So, my question is: does anyone have any experience creating scripts/processes for removing processed CDC files?

    Thanks in advance,

    Nigel.

  2. #2
    Hein, Senior Member (joined Dec 2007, Nashua, NH - USA, 119 posts)
    Originally posted by NigelP:
    my question is: does anyone have any experience creating scripts/processes for removing processed CDC files?

    Nigel.
    Yes, me. :-)

    First I want to start with: KISS
    Just use an OS command to delete any file older than 2 days in the tree(s) and never look back.
    Google quickly found a PowerShell variant:
    Code:
    $limit = (Get-Date).AddDays(-2)
    $path = "C:\Some\Path"

    # Delete files older than the $limit.
    Get-ChildItem -Path $path -Recurse -Force | Where-Object { !$_.PSIsContainer -and $_.CreationTime -lt $limit } | Remove-Item -Force

    # Delete any empty directories left behind after deleting the old files.
    Get-ChildItem -Path $path -Recurse -Force | Where-Object { $_.PSIsContainer -and (Get-ChildItem -Path $_.FullName -Recurse -Force | Where-Object { !$_.PSIsContainer }) -eq $null } | Remove-Item -Force -Recurse
    On Linux, I think it is:
    Code:
    # -type f limits the delete to regular files, so rm is never handed a directory.
    find /Some/Path* -type f -mtime +2 -exec rm {} \;
    For bonus points, before execution make sure no (remote) task has more than 2 days of latency or has been stopped for more than 2 days.
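A Linux counterpart to both PowerShell steps (delete old files, then remove the directories left empty) might look roughly like the sketch below. The function name `purge_old_files` is made up for illustration, and `-mindepth`, `-empty` and `-delete` assume GNU or BSD find since they are not in POSIX. Note that with GNU find, `-mtime +2` ignores fractions of a day, so it matches files last modified at least 3 full days ago.

```shell
#!/bin/sh
# Delete regular files older than N days under a root directory,
# then prune any directories that are left empty.
# Usage: purge_old_files /Some/Path [days]   (days defaults to 2)
purge_old_files() {
    root=$1
    days=${2:-2}
    [ -d "$root" ] || { echo "not a directory: $root" >&2; return 1; }
    # Remove old regular files only; directories are handled below.
    find "$root" -type f -mtime +"$days" -delete
    # -delete implies depth-first traversal, so nested empty directories
    # are removed bottom-up; -mindepth 1 keeps the root itself.
    find "$root" -mindepth 1 -type d -empty -delete
}
```

Trying it on a scratch copy of the tree first is probably wise, since there is no dry-run mode here.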

    Now, if you do as I do rather than as I say: I don't keep it simple, and add MOVE and REPORT steps on the files before the delete.
    I would run the attached script pretty much every day to move 'old' FCD files to a backup folder, in case they were needed after all.
    Before running the script I would delete everything left in the backup folder by the prior run.
    If I needed space, I'd blindly empty the backup folder and/or run the script with a tighter timeline.
    Most customers who need FCD files from more than 2 days back are likely better off just re-loading.

    When you check out the script, pay extra attention to "$fc_backup" and "$cutoff"
    A run without a "true" argument like "1" will only show what would be done, without doing it.
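For readers without the attachment, the move-and-report idea can be sketched in POSIX shell as below. This is not the attached script, only an illustration of the same behaviour: the function name `archive_old_files` and the `CUTOFF_DAYS` variable are invented here, and it assumes the backup folder lives outside the File Channel tree and that file names contain no newlines.

```shell
#!/bin/sh
# Move files older than $cutoff days from a File Channel tree into
# $fc_backup, reporting one line per file. Without a third argument
# of "1" it only reports what would be moved.
# Usage: archive_old_files <fc_root> <fc_backup> [1]
archive_old_files() {
    fc_root=${1%/}          # drop a trailing slash, if any
    fc_backup=$2
    doit=${3:-0}
    cutoff=${CUTOFF_DAYS:-2}
    find "$fc_root" -type f -mtime +"$cutoff" | while IFS= read -r f; do
        rel=${f#"$fc_root"/}    # path relative to the tree root
        if [ "$doit" = "1" ]; then
            # Recreate the sub-directory under the backup folder, then move.
            mkdir -p "$fc_backup/$(dirname "$rel")"
            mv "$f" "$fc_backup/$rel" && echo "moved $rel"
        else
            echo "would move $rel"
        fi
    done
}
```

Running `archive_old_files /Some/Path /Some/Backup` lists the candidates, and adding `1` as a third argument performs the move.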

    Hein