
Thread: Helper scripts to Migrate tasks between environments, like DEV to QA, or QA to PROD

  1. #1
    Hein is offline Senior Member
    Join Date
    Dec 2007
    Location
    Nashua, NH - USA.
    Posts
    143
    Rep Power
    11

    Helper scripts to Migrate tasks between environments, like DEV to QA, or QA to PROD

    Migrating Replicate tasks from one Environment to the next can be a huge topic requiring some consulting.
    To discuss this properly we have to take NAMING CONVENTIONS, and SOURCE CONTROL into account.

    In the simplest form, the same task name and endpoint names are used for a given task no matter which environment it runs in.

    The 'trick' here is that each environment has its own details behind the endpoints (EPs).
    For example the source-EP on the QA server points to the QA database, using QA credentials.
    To 'promote' a task from DEV to QA and (onwards to PROD), all one has to do is EXPORT the task definition on one server, strip the endpoints from the JSON, and IMPORT on the next server.
    Replicate 5.1.2 makes this really easy, no longer requiring server access: use the EXPORT button on the TASKS page or the TASK DESIGN page, and IMPORT on the TASKS page.
    Using a reasonable editor (anything but WORD or NOTEPAD) which is JSON-syntax aware (Notepad++, UltraEdit, TextPad...) it is easy enough to collapse the verbose individual sections and remove 'databases'.

    And yet it is still a bit tricky to get right: "do I need to remove that comma before it or not? Yes, if it was the last section, as 'databases' mostly is. Where is my closing curly?"
    It is specifically tricky when developers are not allowed to touch a production server, and the final deployment step is executed by 'operations' or 'deployment' resources which are unlikely to be JSON or Replicate savvy.

    The attached scripts initially came into existence to help with just that: cleanly and repeatably strip the 'databases' section from a JSON task file.
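    In Python, that strip step can be sketched like this. The JSON shape below is a simplified, hypothetical stand-in for a real export (a real Replicate export nests more sections), but the idea is the same: drop the top-level 'databases' key before importing.

```python
import json

# Hypothetical single-task export; real exports contain more sections.
task_json = '''
{
  "name": "personnel_east_QA",
  "tasks": [ {"task": {"name": "personnel_east_QA"}} ],
  "databases": [ {"name": "S_ORA_personnel_east_QA"} ]
}
'''

def strip_databases(text):
    # Parse, drop the section if present (no error when absent), re-serialize.
    doc = json.loads(text)
    doc.pop("databases", None)
    return json.dumps(doc, indent=2)

print(strip_databases(task_json))
```

    Doing this on the parsed structure, rather than with a text editor, sidesteps the "do I remove that comma?" problem entirely.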

    Typically, more edits are desirable, as most customers settle on a naming convention where tasks and endpoints are identifiable.
    (Making the EP names unique prevents accidental replacement if an incorrect JSON file is ever imported.)
    For example, source "S_ORA_personnel_east_QA", target "T_PDW_personnel_QA" and task "personnel_east_QA", with the JSON file named accordingly.
    Now you need to RENAME the tasks to move between systems: an easy single-location edit, but still.
    And one probably has to rename the endpoints in a couple of places, and remove 'databases' to let the 'new' task latch on to pre-defined EndPoints on the target servers.
    The result might be "S_ORA_personnel_east_PROD", "T_PDW_personnel_PROD" and "personnel_east_PROD"
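    A sketch of that rename step, using the hypothetical names above. The nested structure here is invented for illustration; a real export has many more fields, which is why a recursive suffix swap is convenient: it covers every occurrence in one pass.

```python
import json

# Hypothetical structure with the example names from this post.
task = {
    "task": {"name": "personnel_east_QA"},
    "source": {"rep_source": {"database_name": "S_ORA_personnel_east_QA"}},
    "target": {"rep_target": {"database_name": "T_PDW_personnel_QA"}},
}

def promote(obj, old="_QA", new="_PROD"):
    """Recursively swap the environment suffix in every string value."""
    if isinstance(obj, dict):
        return {k: promote(v, old, new) for k, v in obj.items()}
    if isinstance(obj, list):
        return [promote(v, old, new) for v in obj]
    if isinstance(obj, str) and obj.endswith(old):
        return obj[: -len(old)] + new
    return obj

print(json.dumps(promote(task), indent=2))
```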

    Note: I am a strong believer in separating responsibilities for the core tasks, and the databases.
    Do not let 'random' imports destroy your carefully crafted EP definitions.
    Luckily they probably wouldn't work anyway, as typically the encrypted password can only be decrypted on the same server against the same mk.dat.
    Keep entering usernames/passwords for the (notably production) databases separate from the task design.

    The attached scripts were created with those challenges in mind.

    Here is the "Help" for the PowerShell implementation:

    Code:
    PS C:\> C:\scripts\modify_replicate_json.ps1 help
    
    
    modify_replicate_json.ps1 - Hein van den Heuvel - Attunity - 18-Feb-2017
    
    
        This script is used to process a task JSON from one environment to be imported
        on another environment, for example from DEV to QA to PROD.
    
    
        Typically the Endpoints must NOT be carried over, as existing endpoints on
        the targeted server will be used with a different target database (server)
        and different credentials.
    
    
        We also expect that the endpoint name as well as the taskname will need to be
        changed according to the naming standards in place to match the new environment.
    
    
        This script can adapt the following elements in a Replication Definition:
    
    
        - Remove Databases section ( Endpoints ) should they exist;
              failing to remove could make existing endpoints inoperable on import.
        - Sort explicit included tables by owner,name
        - Remove original UUID if present.
        - Change Task Name
        - Adaptation of End-Points to provided new source and target names
        - Remove Tasks section
        - List top-level sections. Typically:
              tasks, databases, replication_environment, notifications, error_behavior, scheduler
        - Remove specified top-level section
        - Add or Replace Explicitly selected table
    
    
        IF the switch "Reference" is used, then NO content change is done.
        The purpose of this is to generate a template going through the same
        formatting, for easy comparisons with tools like WinDiff.
    
    
        Options:
    
    
        -InputJson          Required input file, with single task or full Replication_Definition
        -OutputJson         Name of output Json file formatted similar to Replicate, but not exactly the same.
        -SourceEP           New Name for Source Endpoint (aka "database") IF single task.
        -TargetEP           New Name for Target Endpoint IF single task.
        -TaskName           TaskName to act on, if not the first and only task.
        -RenameTaskTo       New Name for Task IF single task.
        -SectionName        Name of json section to act on.
        -AddTables_CSV_file Name of CSV file with tables to ADD to the current tables in task
        -NewTables_CSV_file Name of CSV file with tables to REPLACE the current tables in task with
    
    
        -- Switches --
    
    
        -Reference          Generate unchanged contents with script-specific formatting for comparisons
        -NotSortTables      Stop the script from Sorting the explicitly included table list.
        -DeleteEP           Remove the "databases" section from the InputJson
        -DeleteTasks        Remove the "tasks" section from the InputJson
        -DeleteSection      Request removal of json section identified by -SectionName
    
    
        -TaskOverview       List Tasks in the json file.
        -EPOverview         List Databases (End Points) section in the json file
        -SectionOverview    List the sections in the json file, useful as quick verification.
        -help               This text
    And here, much the same, for PERL (Linux: may have to obtain JSON.PM)

    Code:
    C:\>perl \scripts\modify_replicate_json.pl -h
    
    
      modify_task_json.pl    Hein van den Heuvel, Attunity, Feb-19 2017
    
    
    Usage:
            perl \scripts\modify_replicate_json.pl [-o output] [-l tables.csv] task.json
            -h for help
    
    
    
    
        This script can be used to set the same "explicit_included_tables" to all
        the Attunity Replicate task definitions provided in the task file, based on a simple CSV table list.
        Should be easy enough to tweak this to change other sections.
    
    
        The main input file is created using the 5.1.2(or later) GUI EXPORT button or:
            # repctl exportrepository [task=...]
    
    
        The main output file usage is as source for the 5.1.2(or later) GUI IMPORT button or:
            # repctl importrepository json_file=...
    
    
        The table file is a simple CSV list with owner,table_name[,rows] entries.
        Rows, is an optional number for "estimated_size".
        Any line not starting with a 'word' is considered comment and is ignored.
        NO header line, unless commented out.
    
    
        When no output is provided it just provides an overview listing for the
        input json file, with tasknames and matching tablenames.
    
    
        Requesting an output without providing a table list is useful to get
        an exact reference file layout to compare against (WinDiff, UltraCompare,
        Beyond Compare), as the pretty-printed generated JSON, while functionally
        the same, will have a different layout from the original source.
    
    
        Note 1: "-d databases" (to Delete EndPoints) is recommended when renaming EndPoints in a task.
    
    
        Note 2: Unfortunately the perl JSON.PM module does NOT preserve ORDER for (nested) hash elements.
                The output JSON will have randomly ordered elements.
                That will work fine, but it is impossible to compare the text.
                Best suggestion is to import and export to get predictable formatting.
    
    
        Arguments:
                        Input file specification. Mandatory
        Options with parameters:
                -o      Output file specification.
                -l      list file containing owner,table_names
                -n      task Name match string (Perl Regex), default all.
                -N      task Name SKIP string (Perl Regex), default none.
                -i      Case insensitive Duplicate test for ADD
                -p      property name for Property Overview (optional)
                -r      Rename Task
                -s      Rename SOURCE EndPoint
                -t      Rename TARGET EndPoint
                -d      Comma separated list sections to delete such as 'tasks' or 'databases'
    
    
        Switches:
                -A      Add tables, default is to replace.
                -T      Task Overview
                -E      EndPoint Overview (aka databases)
                -P      Properties (Sections) overview.
                -R      Reference Run, no functional changes, just formatting for file comparisons.
                -S      Do NOT sort tables (owner.name)
                -H      Display this Help text
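    The CSV table list described in the help above (owner,table_name[,rows] with non-word-leading lines as comments) could be parsed along these lines. This is an illustrative sketch, not the attached script; the "estimated_size" key name comes from the help text, the rest of the entry layout is assumed.

```python
import csv
import io
import re

# Sample list in the documented format: owner,table_name[,rows];
# lines not starting with a 'word' character are comments.
csv_text = """\
# owner,table_name,rows   (commented-out header)
HR,EMPLOYEES,107
HR,DEPARTMENTS
SCOTT,EMP,14
"""

def parse_table_list(text):
    tables = []
    for row in csv.reader(io.StringIO(text)):
        if not row or not re.match(r"\w", row[0]):
            continue  # skip blanks and comment lines
        entry = {"owner": row[0], "name": row[1]}
        if len(row) > 2 and row[2].strip():
            entry["estimated_size"] = int(row[2])  # optional rows count
        tables.append(entry)
    # sort by owner,name, as the scripts do unless -S / -NotSortTables is used
    return sorted(tables, key=lambda t: (t["owner"], t["name"]))

print(parse_table_list(csv_text))
```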

    And here is a sample output for the "end-point overview" sub-function:

    Code:
    PS C:\> C:\scripts\modify_replicate_json.ps1 
       'C:\Program Files\Attunity\Replicate\Data\imports\Replication_Definition.json'  -EPOverview
    
    
        EndPoint Name             Role   Lic Type         Used By...
        --------------------      ------ --- --------     ------------
        A_sql_server_target       TARGET Yes SQL_SERVER
        DSC_FIS_BSP_STMP          TARGET Yes HADOOP       FIS_BSP_STMP_T1
        GIFTCARDS                 SOURCE Yes MYSQL        GIFTCARDS_GIFTCARDS
        Hadoop_target             TARGET Yes HADOOP
        Kafka                     TARGET Yes KAFKA
        MySQL_To_FC               SOURCE Yes MYSQL        MySQL_to_MSSQL
        NETEZZA_DBGCARDS01        TARGET Yes NETEZZA      GIFTCARDS_GIFTCARDS
        New Endpoint Connection   TARGET Yes SQL_SERVER
        New Endpoint Connection 2 SOURCE Yes RMS
        Oracle_Source             SOURCE Yes ORACLE       Oracle_Source, Oracle_to_SqlServer, mark-test, oraHR-to-oraHein
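    The core of such an overview is just a walk over the 'databases' section. A minimal sketch follows; the key names ("name", "role", "type_id") are assumptions for illustration and the real export schema may differ.

```python
import json

# Hypothetical export fragment; the real schema may use different key names.
export = json.loads('''
{
  "databases": [
    {"name": "Oracle_Source", "role": "SOURCE", "type_id": "ORACLE"},
    {"name": "Kafka",         "role": "TARGET", "type_id": "KAFKA"}
  ]
}
''')

def ep_overview(doc):
    # Collect (name, role, type) for each endpoint in the export.
    return [(db["name"], db["role"], db["type_id"])
            for db in doc.get("databases", [])]

for name, role, type_id in ep_overview(export):
    print(f"{name:25} {role:6} {type_id}")
```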

    Give it a whirl?

    Let us know if it helped you, and what (generically usable) improvements would be desirable.

    Cheers,
    Hein
    Attached Files
    Last edited by Hein; 11-09-2018 at 05:44 AM. Reason: fix attached script.

  2. #2
    Hein is offline Senior Member
    There was a bug-lette in the modify_replicate_json_ps1.txt attachment, incorrectly using an empty-parameter test.
    BAD: ($DriverFile -ne '') # because non-existing is not the same as empty ... may depend on PowerShell versions?
    GOOD: ($DriverFile) # simpler, better

    Also, there is now a version which can use the Attunity Enterprise Manager (AEM) REST API to retrieve and store task JSON.
    Please note though that the attached version is for AEM 6.2 and earlier.
    With AEM 6.3 some names are changing. An updated version will follow after further testing.

    AEM_62_Replicate_JSON_ps1.txt
