Awk to create directory then subdirectory with zip in it


The awk below creates sub-directories inside a directory. The directory name is always the last line of a block in file1 (blocks are separated by an empty line). A sub-directory is created if the id on line 2 of an entry in file2 (always the first 6 digits, in the format xx-xxxx) is found in $2 of file1. This is what the current awk does.

If there is a match and a sub-directory is created, the corresponding https line (line 1 of the entry in file2) is always a link to a zip file to download. I cannot seem to download the zip from that link into the sub-folder and extract it. The download code works and downloads the zip, but it has to be entered manually in the terminal. I apologize for the long post; I wanted to include all the details needed to solve this.



file1



xxx_006 19-0000_xxx-yyy-aaa
xxx_007 19-0001_zzz-bbb-ccc
FolderName_001_001

yyyy_0287 19-0v02-xxx
yyyy_0289 19-0v31-xxxx
yyyy_0293 19-0v05-xxxx
FolderName_002_002


file2



https://xx.yy.zz/path/to/file.zip
19-0v05-xxx_000_001
cc112233
https://xx.yy.zz/path/to/download/file.zip
19-0v31-xxx-001-000
bb4456784
https://xx.yy.zz/path/to/file.zip
19-0v02-xxx_000_001
aaa331232


awk edit



cmd_fmt='mkdir -p "%s/%s"
# run the awk command
awk -v cmd_fmt="$cmd_fmt" '
    # create an associative array (key/value pairs) based on the file1
    NR==FNR { for(i=2; i<NF; i+=2) a[substr($i,1,7)] = $NF; next }

    # retrieve the first 7-char of each line in file2 as the key to test against the above hash
    { k = substr($0, 1, 7) }

    # if find k, then print
    k in a { print a[k] "\t" $0 "\t" l }
    # save prev line to 'l' which is supposed to be the URL
    { l = $0 }
' RS= file1 RS='\n' file2 | while IFS=$'\t' read -r base_dir sub_dir link;
do
    echo "download [$link] to '$base_dir/$sub_dir'"
    # bash command lines to make sub-folders and download files
    create the format text used in sprintf() to run the desired shell commands
    cd "%s/%s" && curl -O -v -k -X GET %s -H "Content-Type:application/x-www-form-urlencoded" -H "Authorization:xxxx" && filename="%s"; unzip "${filename##*/}"; '
done


desired awk output



FolderName_002_002 --- directory
19-0v02-xxx_000_001 --- sub-folder
https://xx.yy.zz/path/to/file.zip --- zip and extracted downloaded to sub-folder
19-0v05-xxx_000_001 --- sub-folder
https://xx.yy.zz/path/to/file.zip --- zip and extracted downloaded to sub-folder
19-0v31-xxx-001-000 --- sub-folder
https://xx.yy.zz/path/to/file.zip --- zip and extracted downloaded to sub-folder









      awk






asked Mar 7 at 11:37 by cm0728, edited Mar 12 at 20:07






















1 Answer














I believe your question is related to this one: Bash loop to make directory, if numerical id found in file

You can run all the commands in a single awk system() call; just organize them properly, for example:



# create the format text used in sprintf() to run the desired shell commands
cmd_fmt='mkdir -p "%s/%s" && cd "%s/%s" && curl -O -v -k -X GET "%s" -H "Content-Type:application/x-www-form-urlencoded" -H "Authorization:xxx" && filename="%s" && unzip "${filename##*/}" && rm -f "${filename##*/}"; '

# run the awk command
awk -v cmd_fmt="$cmd_fmt" '
    # create an associative array (key/value pairs) based on the file1
    NR==FNR { for(i=2; i<NF; i+=2) a[substr($i,1,7)] = $NF; next }

    # retrieve the first 7-char of each line in file2 as the key to test against the above hash
    { k = substr($0, 1, 7) }

    # if find k, then build the shell command (printed here as a dry run)
    k in a { cmd = sprintf(cmd_fmt, a[k], $0, a[k], $0, l, l); print(cmd) }

    # save prev line to l, which is supposed to be the URL
    { l = $0 }
' RS= file1 RS='\n' file2


Change print to system to execute the command.

Note: the above unzip and rm commands might not work if the file names contain URL-encoded characters.
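Before running anything that downloads, it may help to check the matching step on its own. The following is only a sketch under stated assumptions: it recreates the sample file1 and file2 from the question in a temporary directory and prints one tab-separated line (directory, sub-folder, URL) per match. The names list_matches, tmpdir, matches and prev are made up for this illustration.

```shell
#!/bin/sh
# Dry run of the matching step only: nothing is created or downloaded.
# list_matches, tmpdir, matches and prev are illustrative names, not from the question.

list_matches() {
    # $1 = file1 (blocks separated by a blank line), $2 = file2 (URL line, then id line)
    awk '
        # file1 is read in paragraph mode: map each 7-char id to the last field of its block
        NR==FNR { for (i = 2; i < NF; i += 2) a[substr($i, 1, 7)] = $NF; next }

        # file2 is read line by line: take the first 7 chars as the lookup key
        { k = substr($0, 1, 7) }

        # on a match, print directory, sub-folder and the previous line (the URL)
        k in a { print a[k] "\t" $0 "\t" prev }

        # remember the current line for the next iteration
        { prev = $0 }
    ' RS= "$1" RS='\n' "$2"
}

tmpdir=$(mktemp -d)

cat > "$tmpdir/file1" <<'EOF'
xxx_006 19-0000_xxx-yyy-aaa
xxx_007 19-0001_zzz-bbb-ccc
FolderName_001_001

yyyy_0287 19-0v02-xxx
yyyy_0289 19-0v31-xxxx
yyyy_0293 19-0v05-xxxx
FolderName_002_002
EOF

cat > "$tmpdir/file2" <<'EOF'
https://xx.yy.zz/path/to/file.zip
19-0v05-xxx_000_001
cc112233
https://xx.yy.zz/path/to/download/file.zip
19-0v31-xxx-001-000
bb4456784
https://xx.yy.zz/path/to/file.zip
19-0v02-xxx_000_001
aaa331232
EOF

matches=$(list_matches "$tmpdir/file1" "$tmpdir/file2")
printf '%s\n' "$matches"
rm -rf "$tmpdir"
```

With the sample data this should list three matches, all under FolderName_002_002, since none of the ids in file2 belong to the first block of file1.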



Update based on your awk edit:

You can also just print the required info from the awk line and then process it in bash; there is no need to do everything in awk (also remove the line that defines cmd_fmt in your awk edit section):



awk '
    # create an associative array (key/value pairs) based on the file1
    NR==FNR { for(i=2; i<NF; i+=2) a[substr($i,1,7)] = $NF; next }

    # retrieve the first 7-char of each line in file2 as the key to test against the above hash
    { k = substr($0, 1, 7) }

    # if find k, then print
    k in a { print a[k] "\t" $0 "\t" l }

    # save prev line to l, which is supposed to be the URL
    { l = $0 }

' RS= file1 RS='\n' file2 | while IFS=$'\t' read -r base_dir sub_dir link; do
    echo "download [$link] to '$base_dir/$sub_dir'"
    # bash command lines to make sub-folders and download files
    mkdir -p "$base_dir/$sub_dir"
    # skip this entry if the sub-folder cannot be entered
    cd "$base_dir/$sub_dir" || continue

    if curl -O -v -k -X GET "$link" -H "Content-Type:application/x-www-form-urlencoded" -H "Authorization:xxxx" >/dev/null 2>&1; then
        echo " + processing $link"
        # remove query_string from the link, since it might contain '/'
        filename="${link%\?*}"
        # remove path from filename and run `unzip`
        unzip "${filename##*/}"
    else
        echo " + error downloading: $link"
    fi

    # return to the base directory if it's a relative path
    # if all are absolute paths, then just comment out the following line
    cd ../..
done


Note: I did not test the curl line and don't know what the filenames could be for different links. filename="${link%\?*}" removes the trailing query string from the link, and "${filename##*/}" then removes all characters up to the last '/', which leaves only the file name. The file actually downloaded by your curl command might be named differently, so you will have to check and adjust on your end.
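The two parameter expansions can be tried in isolation, without any download. This is a minimal sketch, assuming the links look like the ones in file2; the function name zip_name_from_url and the second URL with a query string are hypothetical and only there to illustrate the stripping.

```shell
#!/bin/sh
# Derive the local file name from a download link using POSIX parameter expansion.
# zip_name_from_url is an illustrative helper name, not part of the original answer.

zip_name_from_url() {
    # drop the query string first (shortest suffix starting at a literal '?'),
    # because a query string may itself contain '/'
    _no_query=${1%\?*}
    # then drop everything up to and including the last '/'
    printf '%s\n' "${_no_query##*/}"
}

# a link taken from file2 in the question
zip_name_from_url 'https://xx.yy.zz/path/to/download/file.zip'

# a made-up link with a query string that contains a slash
zip_name_from_url 'https://xx.yy.zz/path/to/file.zip?token=a/b'
```

Both calls print file.zip. Whether curl actually saves the download under that name depends on the server and the options used, so it is worth checking the directory after a first run.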






• Thank you very much :).
  – cm0728 Mar 8 at 22:00

• The zip archive $filename is extracted with the same name, so I need to rename $filename to tmp and unzip tmp, but I cannot seem to do that without making the command error. Thank you :). && filename="%s"; unzip "${filename##*/}" && rm -f "${filename##*/}"; '
  – cm0728 Mar 12 at 16:05

• @cm0728, you are very welcome :). Not sure about your actual data files, but it seems much easier to just post-process the actual files under bash. The awk system() function might not invoke bash, and thus the bash parameter expansion will not work.
  – jxc Mar 12 at 16:26

• I can see the output on the terminal but the directories are not created. I added the cd "%s/%s" && curl under the #bash download. Thank you :).
  – cm0728 Mar 12 at 17:02

• Adding the curl didn't work. Is there something else? Thank you :).
  – cm0728 Mar 12 at 19:06










          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          1














          I believe your question is related to this one: Bash loop to make directory, if numerical id found in file



          You can run all commands in one awk system() funcion, just organize them properly, for example:



          # create the format text used in sprintf() to run the desired shell commands
          cmd_fmt='mkdir -p "%s/%s" && cd "%s/%s" && curl -O -v -k -X GET %s -H "Content- Type:application/x-www-form-urlencoded" -H "Authorization:xxx" && filename="%s"; unzip "$filename##*/" && rm -f "$filename##*/"; '

          # run the awk command
          awk -v cmd_fmt="$cmd_fmt" '
          # create an associative array (key/value pairs) based on the file1
          NR==FNR for(i=2; i<NF; i+=2) a[substr($i,1,7)] = $NF; next

          # retrieve the first 7-char of each line in file2 as the key to test against the above hash
          k = substr($0, 1, 7)

          # if find k, then run the system command
          k in a cmd = sprintf(cmd_fmt, a[k], $0, a[k], $0, l, l); print(cmd)

          # save prev line to 'l' which is supposed to be the URL
          l = $0
          ' RS= file1 RS='n' file2


          change print to system to execute the command.



          Note: the above unzip and rm commands might not work if file names contains URL encoded chars.



          Update based on your awk edit:



          you can also just print the required info from awk line and then process them in bash, no need to do everything in awk(also remove the line to define cmd_fmt in your awk edit section):



          awk '
          # create an associative array (key/value pairs) based on the file1
          NR==FNR for(i=2; i<NF; i+=2) a[substr($i,1,7)] = $NF; next

          # retrieve the first 7-char of each line in file2 as the key to test against the above hash
          k = substr($0, 1, 7)

          # if find k, then print
          k in a print a[k] "t" $0 "t" l

          # save prev line to 'l' which is supposed to be the URL
          l = $0

          ' RS= file1 RS='n' file2 | while IFS=$'t' read -r base_dir sub_dir link; do
          echo "download [$link] to '$base_dir/$sub_dir'"
          # bash command lines to make sub-folders and download files
          mkdir -p "$base_dir/$sub_dir"
          cd "$base_dir/$sub_dir"

          if curl -O -v -k -X GET "$link" -H "Content-Type:application/x-www-form-urlencoded" -H "Authorization:xxxx" >/dev/null 2>&1; then
          echo " + processing $link"
          # remove query_string from the link, since it might contains '/'
          filename="$link%?*"
          # remove path from filename and run `unzip`
          unzip "$filename##*/"
          else
          echo " + error downloading: $link"
          fi

          # return to the base directory if it's a relative path
          # if all are absolute paths, then just comment out the following line
          cd ../..
          done


          Note: I did not test the curl line and dont know what the filenames could be for different links. filename="$link##*/" is to remove all chars before the last '/', which will leave filename and potential query_strings. "$filename%?*" is to remove the trailing query strings from filename. Actually filename downloaded by your curl command might be different which you will have to check and adjust from your end.






          share|improve this answer

























          • Thank you very much :).

            – cm0728
            Mar 8 at 22:00











          • The zip archive $filename is extracted with as same name, so I need to rename $filename to tmp and unzip tmp, but I can not seem to without making the command error. Thank you :). && filename="%s"; unzip "$filename##*/" && rm -f "$filename##*/"; '

            – cm0728
            Mar 12 at 16:05






          • 1





            @cm0728, you are very welcome:). Not sure about your actual data files, but it seems to be much easier to just post-processing the actual files under bash. the awk system() function might not invoke the bash and thus the bash parameter expansion will not work.

            – jxc
            Mar 12 at 16:26






          • 1





            I can see the output on the terminal but the directories are not created. I added the cd "%s/%s" && curl under the #bash download. Thank you :).

            – cm0728
            Mar 12 at 17:02












          • addind the curl didn't work. Is there something else? Thank you :).

            – cm0728
            Mar 12 at 19:06















          1














          I believe your question is related to this one: Bash loop to make directory, if numerical id found in file



          You can run all commands in one awk system() funcion, just organize them properly, for example:



          # create the format text used in sprintf() to run the desired shell commands
          cmd_fmt='mkdir -p "%s/%s" && cd "%s/%s" && curl -O -v -k -X GET %s -H "Content- Type:application/x-www-form-urlencoded" -H "Authorization:xxx" && filename="%s"; unzip "$filename##*/" && rm -f "$filename##*/"; '

          # run the awk command
          awk -v cmd_fmt="$cmd_fmt" '
          # create an associative array (key/value pairs) based on the file1
          NR==FNR for(i=2; i<NF; i+=2) a[substr($i,1,7)] = $NF; next

          # retrieve the first 7-char of each line in file2 as the key to test against the above hash
          k = substr($0, 1, 7)

          # if find k, then run the system command
          k in a cmd = sprintf(cmd_fmt, a[k], $0, a[k], $0, l, l); print(cmd)

          # save prev line to 'l' which is supposed to be the URL
          l = $0
          ' RS= file1 RS='n' file2


          change print to system to execute the command.



          Note: the above unzip and rm commands might not work if file names contains URL encoded chars.



          Update based on your awk edit:



          you can also just print the required info from awk line and then process them in bash, no need to do everything in awk(also remove the line to define cmd_fmt in your awk edit section):



          awk '
          # create an associative array (key/value pairs) based on the file1
          NR==FNR for(i=2; i<NF; i+=2) a[substr($i,1,7)] = $NF; next

          # retrieve the first 7-char of each line in file2 as the key to test against the above hash
          k = substr($0, 1, 7)

          # if find k, then print
          k in a print a[k] "t" $0 "t" l

          # save prev line to 'l' which is supposed to be the URL
          l = $0

          ' RS= file1 RS='n' file2 | while IFS=$'t' read -r base_dir sub_dir link; do
          echo "download [$link] to '$base_dir/$sub_dir'"
          # bash command lines to make sub-folders and download files
          mkdir -p "$base_dir/$sub_dir"
          cd "$base_dir/$sub_dir"

          if curl -O -v -k -X GET "$link" -H "Content-Type:application/x-www-form-urlencoded" -H "Authorization:xxxx" >/dev/null 2>&1; then
          echo " + processing $link"
          # remove query_string from the link, since it might contains '/'
          filename="$link%?*"
          # remove path from filename and run `unzip`
          unzip "$filename##*/"
          else
          echo " + error downloading: $link"
          fi

          # return to the base directory if it's a relative path
          # if all are absolute paths, then just comment out the following line
          cd ../..
          done


          Note: I did not test the curl line and dont know what the filenames could be for different links. filename="$link##*/" is to remove all chars before the last '/', which will leave filename and potential query_strings. "$filename%?*" is to remove the trailing query strings from filename. Actually filename downloaded by your curl command might be different which you will have to check and adjust from your end.






          share|improve this answer

























          • Thank you very much :).

            – cm0728
            Mar 8 at 22:00











          • The zip archive $filename is extracted with as same name, so I need to rename $filename to tmp and unzip tmp, but I can not seem to without making the command error. Thank you :). && filename="%s"; unzip "$filename##*/" && rm -f "$filename##*/"; '

            – cm0728
            Mar 12 at 16:05






          • 1





            @cm0728, you are very welcome:). Not sure about your actual data files, but it seems to be much easier to just post-processing the actual files under bash. the awk system() function might not invoke the bash and thus the bash parameter expansion will not work.

            – jxc
            Mar 12 at 16:26






          • 1





            I can see the output on the terminal but the directories are not created. I added the cd "%s/%s" && curl under the #bash download. Thank you :).

            – cm0728
            Mar 12 at 17:02












          • addind the curl didn't work. Is there something else? Thank you :).

            – cm0728
            Mar 12 at 19:06













          1












          1








          1







          I believe your question is related to this one: Bash loop to make directory, if numerical id found in file



          You can run all commands in one awk system() funcion, just organize them properly, for example:



          # create the format text used in sprintf() to run the desired shell commands
          cmd_fmt='mkdir -p "%s/%s" && cd "%s/%s" && curl -O -v -k -X GET %s -H "Content- Type:application/x-www-form-urlencoded" -H "Authorization:xxx" && filename="%s"; unzip "$filename##*/" && rm -f "$filename##*/"; '

          # run the awk command
          awk -v cmd_fmt="$cmd_fmt" '
          # create an associative array (key/value pairs) based on the file1
          NR==FNR for(i=2; i<NF; i+=2) a[substr($i,1,7)] = $NF; next

          # retrieve the first 7-char of each line in file2 as the key to test against the above hash
          k = substr($0, 1, 7)

          # if find k, then run the system command
          k in a cmd = sprintf(cmd_fmt, a[k], $0, a[k], $0, l, l); print(cmd)

          # save prev line to 'l' which is supposed to be the URL
          l = $0
          ' RS= file1 RS='n' file2


          change print to system to execute the command.



          Note: the above unzip and rm commands might not work if file names contains URL encoded chars.



          Update based on your awk edit:



          you can also just print the required info from awk line and then process them in bash, no need to do everything in awk(also remove the line to define cmd_fmt in your awk edit section):



          awk '
          # create an associative array (key/value pairs) based on the file1
          NR==FNR { for(i=2; i<NF; i+=2) a[substr($i,1,7)] = $NF; next }

          # retrieve the first 7-char of each line in file2 as the key to test against the above hash
          { k = substr($0, 1, 7) }

          # if find k, then print
          k in a { print a[k] "\t" $0 "\t" l }

          # save prev line to 'l' which is supposed to be the URL
          { l = $0 }

          ' RS= file1 RS='\n' file2 | while IFS=$'\t' read -r base_dir sub_dir link; do
              echo "download [$link] to '$base_dir/$sub_dir'"
              # bash command lines to make sub-folders and download files
              mkdir -p "$base_dir/$sub_dir"
              # skip this entry if the directory cannot be entered
              cd "$base_dir/$sub_dir" || continue

              if curl -O -v -k -X GET "$link" -H "Content-Type:application/x-www-form-urlencoded" -H "Authorization:xxxx" >/dev/null 2>&1; then
                  echo " + processing $link"
                  # remove query_string from the link, since it might contain '/'
                  filename="${link%\?*}"
                  # remove path from filename and run `unzip`
                  unzip "${filename##*/}"
              else
                  echo " + error downloading: $link"
              fi

              # return to the base directory if it's a relative path
              # if all are absolute paths, then just comment out the following line
              cd ../..
          done


          Note: I did not test the curl line and don't know what the file names could be for different links. filename="${link%\?*}" removes the trailing query string from the link, and "${filename##*/}" then removes everything up to and including the last '/', which leaves the file name. The file actually downloaded by your curl command might be named differently, so you will have to check and adjust this on your end.
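To see what the two parameter expansions from the loop do in isolation, here is a small sketch. The function name zip_name_from_link and the example.com URL are invented for illustration; the real links may look different.

```shell
# Hypothetical helper: derive the local file name from a download link.
zip_name_from_link() {
    link=$1
    # Drop the query string first: it starts at '?' and may itself contain '/'.
    filename="${link%\?*}"
    # Then drop everything up to and including the last '/'.
    printf '%s\n' "${filename##*/}"
}

zip_name_from_link 'https://example.com/files/12345/archive.zip?token=a/b'
```

Stripping the query string before the path matters here: doing it the other way round would cut at the '/' inside the token and leave only the tail of the query string.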














          edited Mar 12 at 21:10

























          answered Mar 7 at 14:08









          jxc













          • Thank you very much :).

            – cm0728
            Mar 8 at 22:00











          • The zip archive $filename is extracted with the same name, so I need to rename $filename to tmp and unzip tmp, but I cannot seem to do that without making the command fail. Thank you :). && filename="%s"; unzip "${filename##*/}" && rm -f "${filename##*/}"; '

            – cm0728
            Mar 12 at 16:05






          • @cm0728, you are very welcome :). I am not sure about your actual data files, but it seems much easier to post-process the actual files in bash. The awk system() function might not invoke bash, so the bash parameter expansion might not work.

            – jxc
            Mar 12 at 16:26






          • I can see the output on the terminal, but the directories are not created. I added the cd "%s/%s" && curl under the #bash download. Thank you :).

            – cm0728
            Mar 12 at 17:02












          • Adding the curl didn't work. Is there something else? Thank you :).

            – cm0728
            Mar 12 at 19:06



































