Awk to create directory then subdirectory with zip in it

The awk below will create sub-directories in a directory. The directory name is always the last line of each block in file1 (blocks are separated by an empty line). A sub-directory is created if the number in line 2 of a file2 record (always the first 6 digits, in the format xx-xxxx, so 7 characters including the hyphen) is found in $2 of file1. This is the current awk output.

If there is a match and a sub-directory is created, then the corresponding line 1 of that file2 record (the https line) will always be a link to a zip file for download. I cannot seem to download that zip into the sub-folder and extract it. The download code executes and downloads the zip, but it has to be added manually in the terminal. I apologize for the long post; I wanted to include all the details needed to solve this.

file1

xxx_006 19-0000_xxx-yyy-aaa
xxx_007 19-0001_zzz-bbb-ccc
FolderName_001_001

yyyy_0287 19-0v02-xxx
yyyy_0289 19-0v31-xxxx
yyyy_0293 19-0v05-xxxx
FolderName_002_002

file2

https://xx.yy.zz/path/to/file.zip
19-0v05-xxx_000_001
 cc112233
https://xx.yy.zz/path/to/download/file.zip
19-0v31-xxx-001-000
bb4456784
https://xx.yy.zz/path/to/file.zip
19-0v02-xxx_000_001
aaa331232

awk edit

cmd_fmt='mkdir -p "%s/%s"
# run the awk command
awk -v cmd_fmt="$cmd_fmt" '
# create an associative array (key/value pairs) based on the file1
NR==FNR { for(i=2; i<NF; i+=2) a[substr($i,1,7)] = $NF; next }

# retrieve the first 7-char of each line in file2 as the key to test against the above hash
{ k = substr($0, 1, 7) }

# if find k, then print
k in a { print a[k] "\t" $0 "\t" l }
# save prev line to 'l' which is supposed to be the URL
{ l = $0 }
' RS= file1 RS='\n' file2 | while IFS=$'\t' read -r base_dir sub_dir link;
do
echo "download [$link] to '$base_dir/$sub_dir'"
# bash command lines to make sub-folders and download files
 create the format text used in sprintf() to run the desired shell commands
cd "%s/%s" && curl -O -v -k -X GET %s -H "Content-Type:application/x-www-form-urlencoded" -H "Authorization:xxxx" && filename="%s"; unzip "${filename##*/}"; '
done

desired awk output

FolderName_002_002 --- directory
 19-0v02-xxx_000_001 --- sub-folder
 https://xx.yy.zz/path/to/file.zip --- zip downloaded to sub-folder and extracted
 19-0v05-xxx_000_001 --- sub-folder
 https://xx.yy.zz/path/to/file.zip --- zip downloaded to sub-folder and extracted
 19-0v31-xxx-001-000 --- sub-folder
 https://xx.yy.zz/path/to/file.zip --- zip downloaded to sub-folder and extracted

edited Mar 12 at 20:07

asked Mar 7 at 11:37

cm0728

1 Answer
I believe your question is related to this one: Bash loop to make directory, if numerical id found in file

You can run all the commands in one awk system() call; just organize them properly, for example:

# create the format text used in sprintf() to run the desired shell commands
cmd_fmt='mkdir -p "%s/%s" && cd "%s/%s" && curl -O -v -k -X GET "%s" -H "Content-Type:application/x-www-form-urlencoded" -H "Authorization:xxx" && filename="%s"; unzip "${filename##*/}" && rm -f "${filename##*/}"; '

# run the awk command
awk -v cmd_fmt="$cmd_fmt" '
    # create an associative array (key/value pairs) based on the file1
    NR==FNR { for(i=2; i<NF; i+=2) a[substr($i,1,7)] = $NF; next }

    # retrieve the first 7-char of each line in file2 as the key to test against the above hash
    { k = substr($0, 1, 7) }

    # if find k, then build the shell command (printed here as a dry run)
    k in a { cmd = sprintf(cmd_fmt, a[k], $0, a[k], $0, l, l); print(cmd) }

    # save prev line to l, which is supposed to be the URL
    { l = $0 }
' RS= file1 RS='\n' file2

Change print to system to execute the command.

Note: the above unzip and rm commands might not work if the file names contain URL-encoded characters.
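To illustrate the `system()` route on a small scale, here is a minimal sketch that builds one `mkdir -p` command per input line and runs it. It assumes tab-separated `directory<TAB>sub-directory` input. The `shquote()` helper and the `workdir` temporary directory are hypothetical names introduced only for this example. Keep in mind that `system()` hands the command to `sh`, not necessarily to bash.

```shell
#!/usr/bin/env bash
# Sketch only: run one shell command per input line from awk via system().
workdir=$(mktemp -d)          # hypothetical scratch directory for the demo
cd "$workdir" || exit 1

printf '%s\t%s\n' 'FolderName_002_002' '19-0v05-xxx_000_001' |
awk -F'\t' '
    # shquote(): hypothetical helper that wraps s in single quotes for sh,
    # escaping any embedded single quote (octal 047)
    function shquote(s) { gsub(/\047/, "\047\\\047\047", s); return "\047" s "\047" }
    {
        cmd = "mkdir -p " shquote($1 "/" $2)
        print "running: " cmd
        # system() returns the exit status of the command
        if (system(cmd) != 0) print "failed: " cmd > "/dev/stderr"
    }
'
```

Quoting every interpolated value this way keeps directory names with spaces or quotes from being split or interpreted by the shell.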

Update based on your awk edit:

You can also just print the required info from the awk line and then process it in bash; there is no need to do everything in awk (also remove the line that defines cmd_fmt in your awk edit section):

awk '
    # create an associative array (key/value pairs) based on the file1
    NR==FNR { for(i=2; i<NF; i+=2) a[substr($i,1,7)] = $NF; next }

    # retrieve the first 7-char of each line in file2 as the key to test against the above hash
    { k = substr($0, 1, 7) }

    # if find k, then print
    k in a { print a[k] "\t" $0 "\t" l }

    # save prev line to l, which is supposed to be the URL
    { l = $0 }

' RS= file1 RS='\n' file2 | while IFS=$'\t' read -r base_dir sub_dir link; do
    echo "download [$link] to '$base_dir/$sub_dir'"
    # bash command lines to make sub-folders and download files
    mkdir -p "$base_dir/$sub_dir"
    # skip this entry if the directory cannot be entered
    cd "$base_dir/$sub_dir" || continue

    if curl -O -v -k -X GET "$link" -H "Content-Type:application/x-www-form-urlencoded" -H "Authorization:xxxx" >/dev/null 2>&1; then
        echo " + processing $link"
        # remove query_string from the link, since it might contain '/'
        filename="${link%\?*}"
        # remove path from filename and run `unzip`
        unzip "${filename##*/}"
    else
        echo " + error downloading: $link"
    fi

    # return to the base directory if it's a relative path
    # if all are absolute paths, then just comment out the following line
    cd ../..
done
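The loop above returns with `cd ../..`, which only works when `base_dir` is a single relative path component. A possible variation, sketched below, runs each download in a subshell so the directory change never leaks into the next iteration. `process_triples` and `fetch_and_unzip` are hypothetical names; `fetch_and_unzip` is a stand-in for the curl and unzip steps so that the sketch runs without network access.

```shell
#!/usr/bin/env bash
# Sketch only: handle "base_dir<TAB>sub_dir<TAB>link" lines, one subshell each.

# Hypothetical stand-in for the curl + unzip steps: it just records the link.
fetch_and_unzip() {
    printf '%s\n' "$1" > link.txt
}

process_triples() {
    local base_dir sub_dir link
    while IFS=$'\t' read -r base_dir sub_dir link; do
        mkdir -p "$base_dir/$sub_dir" || continue
        # The subshell keeps the cd local, so no "cd ../.." is needed afterwards.
        (
            cd "$base_dir/$sub_dir" || exit 1
            fetch_and_unzip "$link"
        ) || echo " + error processing: $link" >&2
    done
}

workdir=$(mktemp -d)          # hypothetical scratch directory for the demo
cd "$workdir" || exit 1
printf '%s\t%s\t%s\n' \
    'FolderName_002_002' '19-0v05-xxx_000_001' 'https://xx.yy.zz/path/to/file.zip' |
    process_triples
```

In real use, the output of the awk command would be piped into `process_triples`, and `fetch_and_unzip` would be replaced by the actual download and extraction commands.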

Note: I did not test the curl line and don't know what the file names could be for different links. filename="${link%\?*}" removes the trailing query string from the link, and "${filename##*/}" then removes all characters up to the last '/', which leaves only the file name. The file actually downloaded by your curl command might be named differently, which you will have to check and adjust on your end.
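The two parameter expansions can be tried on their own without downloading anything. The sketch below wraps them in a hypothetical helper, `url_to_filename`, and feeds it a made-up URL whose query string contains a '/'.

```shell
#!/usr/bin/env bash
# Sketch only: derive a local file name from a download link.
url_to_filename() {
    local link=$1
    # %% removes the longest matching suffix, i.e. everything from the first '?'
    local no_query=${link%%\?*}
    # ## removes the longest matching prefix, i.e. everything up to the last '/'
    printf '%s\n' "${no_query##*/}"
}

url_to_filename 'https://xx.yy.zz/path/to/file.zip?token=a/b'   # prints: file.zip
```

Stripping the query string first matters here: if the path were removed first, the '/' inside the query string would cut the name down to `b`.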

edited Mar 12 at 21:10

answered Mar 7 at 14:08

jxc

Thank you very much :).

– cm0728
Mar 8 at 22:00

The zip archive $filename is extracted with the same name, so I need to rename $filename to tmp and unzip tmp, but I cannot seem to do that without making the command error. Thank you :). && filename="%s"; unzip "${filename##*/}" && rm -f "${filename##*/}"; '

– cm0728
Mar 12 at 16:05

@cm0728, you are very welcome :). I am not sure about your actual data files, but it seems much easier to just post-process the actual files in bash. The awk system() function might not invoke bash, and thus the bash parameter expansion might not work.

– jxc
Mar 12 at 16:26

I can see the output on the terminal but the directories are not created. I added the cd "%s/%s" && curl under the #bash download. Thank you :).

– cm0728
Mar 12 at 17:02

Adding the curl didn't work. Is there something else? Thank you :).

– cm0728
Mar 12 at 19:06
