Get Data Moving 1
Results 1 to 4 of 4

Thread: Intro to manipulating JSON file from perl scripts. Examples in replies.

  1. #1
    Hein is offline Senior Member
    Join Date
    Dec 2007
    Location
    Nashua, NH - USA.
    Posts
    151
    Rep Power
    12

    Intro to manipulating JSON file from perl scripts. Examples in replies.

    [This topic is a companion to : repctl-gettablesstatus ]

    Perl is distributed with a JSON (JavaScript Object Notation) module which notably has to_json and from_json functions.
    Using those functions is the most convenient ways to handle JSON data files from REPCTL commands... in Perl that is.
    Obviously addressing those structures from JavaScript is easier still, but I don't know much about JS and just enough Perl.

    The two functions mentioned, convert json text to and from perl arrays, simple or associative as needed.
    The tricky part is that those arrays are 'nested', following the nested structure of the json.
    So the array elements are typically references to other arrays.
    De-reference is needed to 'drill down'.
    For example, the top element for an Attunity replicate tasks will be addressed as:
    Code:
    $decoded_json->{'cmd.replication_definition'}

    $decoded_json is a variable pointing to an associative array.
    It's (only) key value is 'cmd.replication_definition' the contents for that element is an other associative array with keys such as 'tasks', 'error_behavior', 'databases',...
    And under element 'tasks' we'll find a simple array (list) of 'task'.
    That array can be addressed as:
    Code:
    @tasks = @{$decoded_json->{'cmd.replication_definition'}->{'tasks'}}
    Head spinning yet? How does one know what is what... and where?
    Well, the element names such as 'task', 'name', 'target_names' are all keys in associative arrays.
    The boxes [ ... ] in the json file, correspond with boxes = lists = simple arrays in perl , for example pointed to by 'tasks', 'target_names' and 'rules'

    The perl utility function "DUMPER" can be used to 'see' the json-structures in a perl representation (albeit with an ugly amount of leading spaces causing nasty line wraps for a typical Attunity Replicate task.


    Enough said... let's get to work!
    Here is your first exercise.
    Dump a task export as an array to the screen, and fetch the name of the (first) task.
    There are really only a handful lines of code below, and some fluff to keep me honest.
    See if it makes sense?

    Hein.


    Code:
    #  dump_json.pl    Hein van den Heuvel, Attunity, January-2014
    #
    # Teaser perl script to show how read a JSON file and DUMP as a PERL array.
    # Just feed is a (small!) Attunity replicate exportrepository output file.
    # (or any other JSON file for that matter.
    #
    #    repctl exportrepository task=<some_task>
    #    perl dump_json.pl <some_task>.json 
    #
    # Hein
                
    use strict;
    use warnings;
    use JSON qw( from_json );
    use Data::Dumper;  
    
    
    my ($json_text, $decoded_json, $dumped);
    
    
    # Glue all input lines into a single string.
    #
    $json_text .= $_ for (<>); 
    
    
    print "$. lines, ", length($json_text), " bytes.\n";
    
    
    # This is where the magic happens, parse a json string into a perl array
    #
    $decoded_json = from_json( $json_text ); # Magic!
    
    
    #  "cmd.replication_definition" makes us start out too deep to print pretty.
    #  Dump into a string, and zap many of the leading spaces,  before printing.
    #
    $dumped =  Dumper $decoded_json;
    $dumped =~ s/\n {44}/\n/g;   
    print $dumped;
    
    
    # Now show how to access just one element, 'deep' down.
    # To make the de-referencing not too crazy, pick up the array of tasks first.
    # Then print the name element from the associative array task from the first element.
    #
    my @tasks = @{$decoded_json->{'cmd.replication_definition'}->{'tasks'}};
    print "\n\n(1) Task name = ", $tasks[0]->{'task'}->{'name'};
    
    
    # Show how as an alternative, we could have picked up the associative array 'task;
    # from the first array element of 'tasks' and then use that as a base.
    #
    my %task = %{@{$decoded_json->{'cmd.replication_definition'}->{'tasks'}}[0]};
    print "\n\n(2) Task name = ", $task{'task'}->{'name'};
    Attached Files Attached Files
    Last edited by Hein; 01-30-2014 at 03:15 PM.

  2. #2
    Hein is offline Senior Member
    Join Date
    Dec 2007
    Location
    Nashua, NH - USA.
    Posts
    151
    Rep Power
    12

    High level report on tasks in json listing

    This reply discusses the reptask_list_pl.txt attachment in this topic.
    This is a Perl script, disguised as .txt to appease the Forum gods.

    Expected usage is:

    Code:
    #    repctl exportrepository 
    #    perl reptask_list.pl C:"\Program Files\Attunity\Replicate\data\imports"\Replication_Definition.json  > tasks_overview.txt
    or
    #    repctl exportrepository task=<some_task_name>
    #    perl reptask_list.pl C:"\Program Files\Attunity\Replicate\data\imports"\<some_task_name>.json


    It builds on the 'dump' example by traversing all the tasks from the export.
    This time it reports on all of them, not just the first [0] entry and it also reports on databases used by them.
    This is an action where perl really shines as is is trivial to build the associative arrays.
    Here is the core...

    Code:
    my @tasks = @{$decoded_json->{'cmd.replication_definition'}->{'tasks'}};
    for my $task (@tasks) {
    	my $target = $task->{'task'}->{'target_names'}->[0];  # Only first one for now.
    	my $source = $task->{'task'}->{'source_name'};
            my $name   = $task->{'task'}->{'name'};
    	$tasks{$name} =  "$source,$target" ;
    }
    Example output below.
    Hope this helps someone, someday.


    Code:
    # perl ..\..\Replicate\reptask_list_json.pl Replication_Definition.json
    
    
    Task Name                 :                Source Database ->                Target Database
    ------------------------- : ------------------------------ -> ------------------------------
    MMto1                     :                MM (SQL_SERVER) -> Seg01_Target (SQL_SERVER)
    MMto2                     :                MM (SQL_SERVER) -> Seg02_Target (SQL_SERVER)
    Seg01_to_MM               :      Seg01_source (SQL_SERVER) -> Seg00_Target (SQL_SERVER)
    Seg02_to_MM               :      Seg02_source (SQL_SERVER) -> Seg00_Target (SQL_SERVER)
    Swing2sql                 :     Swingbench-Oracle (ORACLE) -> Seg00_Target (SQL_SERVER)
    Swing2sql2                :     Swingbench-Oracle (ORACLE) -> hein_sqlserver (SQL_SERVER)
    SwingToCT                 :     Swingbench-Oracle (ORACLE) -> hein_sqlserver (SQL_SERVER)
    TXNto0                    :               TXN (SQL_SERVER) -> Seg00_Target (SQL_SERVER)
    Vertica_to_Oracle         :             VerticaODBC (ODBC) -> Oracle Target (ORACLE)
    bd_0_2                    :      Seg00_Source (SQL_SERVER) -> Seg02_Target (SQL_SERVER)
    bd_2_0                    :      Seg02_source (SQL_SERVER) -> Seg00_Target (SQL_SERVER)
    bundy                     :                 bundy (RMSAIS) -> hein_sqlserver (SQL_SERVER)
    demo                      :                 bundy (RMSAIS) -> hein_sqlserver (SQL_SERVER)
    fc2vertica                :     FileChannel (FILE_CHANNEL) -> vertica-target (VERTICA)
    hw1                       :     Swingbench-Oracle (ORACLE) -> hein_sqlserver (SQL_SERVER)
    kevin                     :     Swingbench-Oracle (ORACLE) -> hein_sqlserver (SQL_SERVER)
    ora 2 ora                 :     Swingbench-Oracle (ORACLE) -> Oracle Target (ORACLE)
    ora2fc                    :     Swingbench-Oracle (ORACLE) -> file-channel (FILE_CHANNEL)
    ora2vertica1              :     Swingbench-Oracle (ORACLE) -> vertica-target (VERTICA)
    test                      :     Swingbench-Oracle (ORACLE) -> hein_sqlserver (SQL_SERVER)
    
    
    
    
    Database Name                 Role            Type   Used By Tasks
    ------------------------- : ------    ------------ - -------------------------
    ASM                       : SOURCE          ORACLE -
    FileChannel               : SOURCE    FILE_CHANNEL - fc2vertica
    MM                        : SOURCE      SQL_SERVER - MMto1, MMto2
    MM_Target                 : TARGET      SQL_SERVER -
    Oracle Target             : TARGET          ORACLE - Vertica_to_Oracle, ora 2 ora
    Seg00_Source              : SOURCE      SQL_SERVER - bd_0_2
    Seg00_Target              : TARGET      SQL_SERVER - Seg01_to_MM, Seg02_to_MM, Swing2sql, TXNto0, bd_2_0
    Seg01_Target              : TARGET      SQL_SERVER - MMto1
    Seg01_source              : SOURCE      SQL_SERVER - Seg01_to_MM
    Seg02_Target              : TARGET      SQL_SERVER - MMto2, bd_0_2
    Seg02_source              : SOURCE      SQL_SERVER - Seg02_to_MM, bd_2_0
    Swingbench-Oracle         : SOURCE          ORACLE - Swing2sql, Swing2sql2, SwingToCT, hw1, kevin, ora 2 ora, ora2fc, ora2vertica1,
    test
    TXN                       : SOURCE      SQL_SERVER - TXNto0
    VerticaODBC               : SOURCE            ODBC - Vertica_to_Oracle
    bundy                     : SOURCE          RMSAIS - bundy, demo
    file-channel              : TARGET    FILE_CHANNEL - ora2fc
    generic-NUL               : TARGET    GENERIC_FILE -
    hein_sqlserver            : TARGET      SQL_SERVER - Swing2sql2, SwingToCT, bundy, demo, hw1, kevin, test
    sql source                : SOURCE      SQL_SERVER -
    vertica-target            : TARGET         VERTICA - fc2vertica, ora2vertica1
    Last edited by Hein; 01-30-2014 at 03:30 PM.

  3. #3
    Hein is offline Senior Member
    Join Date
    Dec 2007
    Location
    Nashua, NH - USA.
    Posts
    151
    Rep Power
    12

    perl script to set a set of global_manipulation rules on many tasks.

    This reply discusses the set_global_manipulation.pl script which is attached to this topic as set_global_manipulation_pl.txt


    It takes the reptask_list.pl example in the prior reply a step further.
    This script not only reports on json files, but modifies and generates one.
    (ok, maybe 3 steps further).

    First question to answer is probably WHY one might want to do this with a script and not simply bang away at the GUI?

    We had two main reasons.

    1) The customer in question had 9 manipulations to be applied to 17 tasks, to be repeated on 6 sites.
    -- it is too error prone to enter, and too hard to verify to be correct.

    2) to date (Replicate V3.1.0.18) one can not control the order of the (global) manipulations in a tasks.
    -- the last rule touched, becomes the last rule to be executed. So any correction would require carefully ordered 'touching' all other rules to make benign changes to get those in 'the right place' whether that be for aesthetic, or functional reason, or maybe just because you want it 'just so'.

    This script got a little out of hand…. As always.
    But now you have a nice® framework to play with.
    I started out to make it too generic, as always, and punted with just doing the 'global manipulation' (for now :-)
    The script has build-in help which can be obtained activating it with the -h option

    The 'crux' is rather neat, if I dare say so.
    Here is the one perl line that adds or over-writes the global manipulation for a task:
    Code:
    $task->{'global_manipulation'} = $manipulation_json;
    Clean right?
    Usage:
    Code:
    # perl set_global_manipulation.pl    –o –new.json    -m manipulation.json    original.json


    This script can be used to set the same global_manipulation to all the Attunity Replicate task definition provided in the input file.
    It should be easy enough to tweak this script to change other sections.


    The main input file is created using REPCTL EXPORTREPOSITORY [TASK=...]


    The manipulation file is the desired global transformation section.
    This would typically be 'snipped' from an export file formatted like a nameless structure:


    Code:
                                      {
                                            "name": "global_manipulation",
                                            "rules":        [{
                                                :
                                            }]
                                       }

    When no output is requested the script just provides an overview listing for the input json file.
    The listing will show: taskname, number-of-rules, source-db, first-target-db

    Requesting an output without providing a manipulation is useful the get the exact a reference file layout to compare against.
    The pretty-print generated json, while functionally the same, will have a different layout from the original source json.
    Suggested compare tools: windiff, ultra-compare, beyond-compare...

    Worked out usage example, below.

    Enjoy!
    Hein



    First a run without actually adding a manipulation in orer to get a 'reference' output old.tmp.


    Code:
    # perl set_global_manipulation.pl  -o old.tmp original.json
    37321 input lines, 700953 bytes.
    nr TaskName              Rules SourceDB                TargetDB
    -- ---------------       ----- ---------------         ---------------
     1 TSK_DB1_CEPROC_FL         9 S_DB1_CEPROC_FL_R       T_CEPROC_FL
     2 TSK_DB1_ELIGPROC_FL       2 S_DB1_ELIGPROC_FL_R     T_ELIGPROC_FL
     3 TSK_DB1_ENCOUNTER_FL      2 S_DB1_ENCOUNTER_FL_R    T_ENCOUNTER_FL
     4 TSK_DB1_PLANDATA_FL       2 S_DB1_PLANDATA_FL_R     T_NZ_PLANDATA_FL
     5 TSK_DB1_PLANDOCUMENT      2 S_DB1_PLANDOCUMENT_FL_R T_PLANDOCUMENT_FL
     6 TSK_DB1_REPORTDB_FL       2 S_DB1_REPORTDB_FL_R     T_REPORTDB_FL
     7 TSK_DB2_DBA_Monitor_Test  0 S_DB2_Monitor_DBA_Test  T_DB3_DBA_Monitor_Test
     8 TSK_DB2_CERPROC_OH        3 S_DB2_CEPROC_OH         T_NZ_CEPROC_OH
     9 TSK_DB2_PLANDOCUMENT_NM   2 S_DB2_PLANDOCUMENT_NM   T_NZ_PLANDOCUMENT_NM
    10 TSK_DB2_DB3_DBA_Monitor   0 S_SQL_DBA_Monitor       T_SQL_DB3_DBA_Monitor
    11 TSK_DB3_ELIGPROC_NM       2 S_DB3_ELIGPROC_NM       T_ELIGPROC_NM
    12 TSK_DB3_ENCOUNTER_NM      2 S_DB3_ENCOUNTER_NM      T_ENCOUNTER_NM
    13 TSK_DB3_PLANDATA_NM       2 S_DB3_PLANDATA_NM       T_PLANDATA_NM
    14 TSK_DB3_CEPROC_NM         9 S_DB3_CEPROC_NM         T_CEPROC_NM
    15 TSK_DB3_REPORTDB_NM       2 S_DB3_REPORTDB_NM       T_REPORTDB_NM
    16 TSK_DB3_NZ_CORPDB_NM      2 S_DB3_CORPDB_NM         T_NZ_CORPDB_NM
    17 Troubleshooting           6 S_DB1_ELIGPROC_FL_R     T_ELIGPROC_FL
    18 Troubleshooting_NM_CEPROC 9 S_DB3_CEPROC_NM         T_NZ_CORPDB_NM
    Second a run adding the manipulation, and generating the real target json file.

    Code:
    # perl set_global_manipulation.pl  –o new.tmp -m manipulation.txt  original.json
    37321 input lines, 700953 bytes.
    112 manipulation lines, 2507 bytes.
     1 TSK_DB1_CEPROC_FL         9 S_DB1_CEPROC_FL_R       T_CEPROC_FL
     2 TSK_DB1_ELIGPROC_FL       2 S_DB1_ELIGPROC_FL_R     T_ELIGPROC_FL
    :
    <snip>
    :
     
    # dir *.tmp
    01/28/2014  05:29 PM         1,726,445 old.tmp
    01/28/2014  05:35 PM         1,821,107 new.tmp
    Third a verification run without adding the manipulation, using the new file as input… just to see if all tasks now have 9 rules.
    Code:
    # perl set_global_manipulation.pl  new.tmp
    44864 input lines, 761079 bytes.
     1 TSK_DB1_CEPROC_FL         9 S_DB1_CEPROC_FL_R       T_CEPROC_FL
     2 TSK_DB1_ELIGPROC_FL       9 S_DB1_ELIGPROC_FL_R     T_ELIGPROC_FL
    3 TSK_DB1_ENCOUNTER_FL       9 S_DB1_ENCOUNTER_FL_R    T_ENCOUNTER_FL
    :
    <snip>
    :
    Finally test the real thing!
    Use REPCTL to load the json before and after into the (TEMP!) repository, export and diff
    Code:
    # repctl -d \temp\replicate\data importrepository original.json
    # repctl -d \temp\replicate\data exportrepository
    command exportrepository response:
    :
    # copy \temp\replicate\data\imports\Replication_Definition.json old.json
    #
    # repctl -d \temp\replicate\data importrepository json_file=new.tmp
    [importrepository command] Succeeded
    # repctl -d \temp\replicate\data exportrepository
    command exportrepository response:
    :
    # copy \temp\replicate\data\imports\Replication_Definition.json new.json
    # windiff new.json old.json

  4. #4
    Hein is offline Senior Member
    Join Date
    Dec 2007
    Location
    Nashua, NH - USA.
    Posts
    151
    Rep Power
    12

    Adding wholesale table manipulations

    Currently (Replicate 3.1), does not have a Global Manipulation option for Filters and some (column) transformations which are available per table.
    This is understandable as is it not considered too likely that for example the same filter could be applicable to many tables.
    Still, we had a request for just that. This is now submitted as a formal enhancement request, but in the mean time one cold consider the attached "set_table_manipulation_pl.txt" perl script.
    This script is build as once step up (down?!) from the set_global_manipulation.pl script in the prior reply.
    The main challenge as to deal with manipulation being present at all in a task, and whether any table of interest already had manipulations.
    The script offer the option to replace, or add., select or skip by task/owner/table name match.
    For now it only has been very minimally tested, as it is not clear how useful this script might be, or not.

    There is some interesting, if I dare say, new core code to implement creating a fresh manipulation.
    It involves hooking up a fresh anonymous, initialized, datastructure as an array element. Have a look:

    Code:
    for $task (@tasks) {
       :
            $task->{manipulations} = [] unless $task->{manipulations};
       :
        for $table (@tables) {
       :
            for (sort keys %todo) {   # Keys for this array look like "owner.table"
                  push @{$task->{manipulations}}, 
                    { 'name' => $_, 'table_manipulation' =>
                        { 'owner' => (split /\./, $_)[0],
                           'name'  => (split /\./, $_)[1],
                        'keep_columns_without_manipulation' => \1,
                        %manipulations }
                    } ;
            }
    Help instructions below.

    Enjoy,
    Hein.






    This script can be used to set the same global_manipulation to all
    the Attunity Replicate task definition provided in the input file.
    Should be easy enough to tweak this to change other sections.

    Usage:
    perl $0 [-o output] [-m manipulation] input.json
    -h for help

    The main input file is created using REPCTL EXPORTREPOSITORY [TASK=...]


    The manipulation file is the desired global transformation section, 'snipped'
    from such export file formatted like a nameless structure:


    Code:
    {        "filter_columns":    [{
                        "column_name":    "DEPARTMENT_ID",
                        "ranges":    "..123"
                    }],
                "transform_columns":    [{
                        :
                        }, {
                        "column_name":    "",
                        "new_column_name":    "middle",
                        "action":    "ADD",
                        "new_data_type":    "kAR_DATA_TYPE_STR",
                        "length":    50,
                        "computation_expression":    "'Malcolm'"
                    }]
        }
    When no output is provided is just provides an overview listing for the
    input json file, with tasknames and matching tablenames.


    Requesting an output without providing a manipulation is useful the get
    the exact a reference file layout to compare against (windiff, ultra-compare,
    beyond-compare) as the pretty-print generated json while functionally the
    same will have a different layout from the original source.


    Code:
        Arguments:
                        Input file specification. Mandatory      
        Options:    
                -o        Output file specification.
                -m        Manipulation file specification.
                -h        Display this Help text
                -a        Add manipulation, default is to replace.
                -n        task Name match string (Perl Regex), default all.
                -s        Schema match string (Perl Regex), default any.
                -t        Table match string (Perl Regex), default all.
                -N        task Name SKIP string (Perl Regex), default none.
                -S        Schema SKIP string (Perl Regex), default none.
                -T        Table SKIP string (Perl Regex), default none.
                -v        Verbose
    Attached Files Attached Files
    Last edited by Hein; 02-03-2014 at 02:33 PM.

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •