Author: rajatbagga
Posted: Thu Nov 10, 2016 6:15 am (GMT 5.5)
Hello everyone,
For the sake of my example, I mocked up the data. In the actual data, my key is 5,PD.
Arun, please see my comments below against each of your questions:
Quote:
> Like Enrico has pointed out already, 1M should not be a matter of concern.
Now, here is the issue. Input file 2 is 5000/FB and carries a great deal of information, so even a 1M record count could mean a very large physical file, and that count varies rather than staying static. It may not be a matter of concern right now, but going forward it could potentially cause issues.
Quote:
> Meanwhile it might help if you post some missing information.
> - Are the input data sets already in sorted order of the keys shown? At least the input-2 does not seem to be, from the sample data. But are they sorted in your 'real' data sets?
> - Do you need the output data sets to be sorted OR to preserve the input order for some reason?
> - What about the LRECL/RECFM of these data sets?
- Input file 1 is 80/FB, sorted on the key at 1,5,PD. Input file 2 is 5000/FB and is sorted only on 11,5,PD (the first key position); the other ten key positions lie between columns 80 and 131 (10 * 5 = 50 bytes).
- The sequence of the output datasets doesn't matter to me.
- Output 1 is 5000/FB with the same record layout as input file 2, and output 2 is 80/FB with the same record layout as input file 1.
Yes, in input file 2 the key can be present in 11 spots on a single record: 11,5,PD as the first occurrence, and the remaining ten occurrences of 5,PD spanning columns 80 to 131. The same key can appear in several of those spots within one record, and it can also appear on other records.
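In case it helps to show what I have been toying with, here is a rough, untested sketch of how those 11 key positions might be flattened into one fixed, searchable column: copy input file 2 once, tag each record with a sequence number, then use OUTFIL REPEAT to write 11 copies of each record with a different key occurrence pulled into the same column on each copy. The appended-field columns (5001, 5009, 5011), the 5015-byte work-file LRECL, and the assumption that the ten trailing keys sit contiguously at columns 80, 85, ..., 125 are all mine and would need adjusting to the real layout:

Code:

  OPTION COPY
* Tag each record with an 8-byte sequence number at 5001 so the
* original record can be identified (and de-duplicated) later.
  INREC OVERLAY=(5001:SEQNUM,8,ZD)
* Write 11 copies of each record, numbered 01-11 at 5009 (RESTART
* begins numbering again when the sequence number changes), and
* pull a different key occurrence into fixed column 5011 per copy.
  OUTFIL REPEAT=11,
   IFTHEN=(WHEN=INIT,OVERLAY=(5009:SEQNUM,2,ZD,RESTART=(5001,8))),
   IFTHEN=(WHEN=(5009,2,ZD,EQ,1),OVERLAY=(5011:11,5)),
   IFTHEN=(WHEN=(5009,2,ZD,EQ,2),OVERLAY=(5011:80,5)),
   IFTHEN=(WHEN=(5009,2,ZD,EQ,3),OVERLAY=(5011:85,5)),
   IFTHEN=(WHEN=(5009,2,ZD,EQ,4),OVERLAY=(5011:90,5)),
   IFTHEN=(WHEN=(5009,2,ZD,EQ,5),OVERLAY=(5011:95,5)),
   IFTHEN=(WHEN=(5009,2,ZD,EQ,6),OVERLAY=(5011:100,5)),
   IFTHEN=(WHEN=(5009,2,ZD,EQ,7),OVERLAY=(5011:105,5)),
   IFTHEN=(WHEN=(5009,2,ZD,EQ,8),OVERLAY=(5011:110,5)),
   IFTHEN=(WHEN=(5009,2,ZD,EQ,9),OVERLAY=(5011:115,5)),
   IFTHEN=(WHEN=(5009,2,ZD,EQ,10),OVERLAY=(5011:120,5)),
   IFTHEN=(WHEN=(5009,2,ZD,EQ,11),OVERLAY=(5011:125,5))

The SORTOUT of this step would be the 5015/FB work file used by the join below.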
I am not sure if this can be done, but I was thinking there might be a way to treat these key values like an array in DFSORT, along the lines of the sketch above: first gather all the keys from input file 2 into one searchable spot, then search them for the key from input file 1; if it is found, I write the input file 2 record to output file 1. That way I get all the matching records, but I still need to figure out how to get the unmatched records from input file 1 into output file 2.
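Continuing the sketch (again untested and purely illustrative): JOINKEYS on the normalized key column could do the array search. One pass keeps the paired records for output 1, de-duplicating on the carried sequence number so a file 2 record that matches on more than one of its 11 keys is written only once; a second pass with JOIN UNPAIRED,F1,ONLY keeps the input file 1 records that matched nothing, for output 2. SORTJNF2 here is the 5015/FB work file from the step above:

Code:

* Step A - paired records give the matches for output 1 (5000/FB).
  JOINKEYS F1=SORTJNF1,FIELDS=(1,5,A)
  JOINKEYS F2=SORTJNF2,FIELDS=(5011,5,A)
  REFORMAT FIELDS=(F2:1,5008)
* Keep one copy per original file 2 record (sequence number at 5001).
  SORT FIELDS=(5001,8,CH,A)
  SUM FIELDS=NONE
* Drop the appended fields to get back to 5000 bytes.
  OUTREC BUILD=(1,5000)

* Step B (a separate job step) - input file 1 records that matched
* none of the 11 key positions go to output 2 (80/FB).
  JOINKEYS F1=SORTJNF1,FIELDS=(1,5,A)
  JOINKEYS F2=SORTJNF2,FIELDS=(5011,5,A)
  JOIN UNPAIRED,F1,ONLY
  REFORMAT FIELDS=(F1:1,80)
  OPTION COPY

That would still leave my original worry, though: the work file holds 11 copies of input file 2, and the join reads it twice.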
Thanks and Regards,
Rajat