Awk to create directory then subdirectory with zip in it
The `awk` below creates sub-directories inside a directory (the directory name is always the last line of each block in file1, and blocks are separated by an empty line) if the number in line 2 of a file2 entry (always the first 6 digits, in the format xx-xxxx) is found in `$2` of file1. This is what the current awk does.

When there is a match and a sub-directory is created, the corresponding line 1 (https) in file2 is always a link to a zip file to download. I cannot seem to download the zip from that link into the sub-folder and extract it there. The download code executes and downloads the zip, but only when it is added to the terminal manually. I apologize for the long post; I wanted to include all the details needed to solve this.
file1

    xxx_006 19-0000_xxx-yyy-aaa
    xxx_007 19-0001_zzz-bbb-ccc
    FolderName_001_001

    yyyy_0287 19-0v02-xxx
    yyyy_0289 19-0v31-xxxx
    yyyy_0293 19-0v05-xxxx
    FolderName_002_002

file2

    https://xx.yy.zz/path/to/file.zip
    19-0v05-xxx_000_001
    cc112233
    https://xx.yy.zz/path/to/download/file.zip
    19-0v31-xxx-001-000
    bb4456784
    https://xx.yy.zz/path/to/file.zip
    19-0v02-xxx_000_001
    aaa331232
awk edit

    cmd_fmt='mkdir -p "%s/%s"

    # run the awk command
    awk -v cmd_fmt="$cmd_fmt" '
        # create an associative array (key/value pairs) based on the file1
        NR==FNR { for(i=2; i<NF; i+=2) a[substr($i,1,7)] = $NF; next }
        # retrieve the first 7-char of each line in file2 as the key to test against the above hash
        { k = substr($0, 1, 7) }
        # if find k, then print
        k in a { print a[k] "\t" $0 "\t" l }
        # save prev line to 'l' which is supposed to be the URL
        { l = $0 }
    ' RS= file1 RS='\n' file2 | while IFS=$'\t' read -r base_dir sub_dir link;
    do
        echo "download [$link] to '$base_dir/$sub_dir'"
        # bash command lines to make sub-folders and download files
        create the format text used in sprintf() to run the desired shell commands
        cd "%s/%s" && curl -O -v -k -X GET %s -H "Content-Type:application/x-www-form-urlencoded" -H "Authorization:xxxx" && filename="%s"; unzip "${filename##*/}"; '
    done
desired awk output

    FolderName_002_002 --- directory
        19-0v02-xxx_000_001 --- sub-folder
            https://xx.yy.zz/path/to/file.zip --- zip downloaded to the sub-folder and extracted
        19-0v05-xxx_000_001 --- sub-folder
            https://xx.yy.zz/path/to/file.zip --- zip downloaded to the sub-folder and extracted
        19-0v31-xxx-001-000 --- sub-folder
            https://xx.yy.zz/path/to/file.zip --- zip downloaded to the sub-folder and extracted
awk
asked Mar 7 at 11:37, edited Mar 12 at 20:07 – cm0728
1 Answer
I believe your question is related to this one: Bash loop to make directory, if numerical id found in file

You can run all the commands in a single awk `system()` call; just organize them properly, for example:
    # create the format text used in sprintf() to run the desired shell commands
    cmd_fmt='mkdir -p "%s/%s" && cd "%s/%s" && curl -O -v -k -X GET "%s" -H "Content-Type:application/x-www-form-urlencoded" -H "Authorization:xxx" && filename="%s"; unzip "${filename##*/}" && rm -f "${filename##*/}"; '

    # run the awk command
    awk -v cmd_fmt="$cmd_fmt" '
        # create an associative array (key/value pairs) based on file1
        NR==FNR { for(i=2; i<NF; i+=2) a[substr($i,1,7)] = $NF; next }
        # take the first 7 characters of each line in file2 as the key to test against the above hash
        { k = substr($0, 1, 7) }
        # if k is found, build the shell command and print it
        k in a { cmd = sprintf(cmd_fmt, a[k], $0, a[k], $0, l, l); print(cmd) }
        # save the previous line in l, which is supposed to be the URL
        { l = $0 }
    ' RS= file1 RS='\n' file2
Change `print` to `system` to execute the command.

Note: the `unzip` and `rm` commands above might not work if the file names contain URL-encoded characters.
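Before switching `print` to `system`, it may help to check what the lookup alone yields. The following is a minimal sketch, not part of the original answer: it writes the sample file1 and file2 from the question into a temporary directory (the blank line between the two file1 blocks is assumed from the question's description) and prints one tab-separated line per match. The function name `lookup_matches` and the variable `prev` are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "directory<TAB>sub-folder<TAB>URL" for every match.
lookup_matches() {
    awk '
        # file1, read in paragraph mode: map each 7-char id prefix to the last field of its block
        NR==FNR { for (i = 2; i < NF; i += 2) a[substr($i, 1, 7)] = $NF; next }
        # file2, read line by line: test the first 7 characters of the line
        { k = substr($0, 1, 7) }
        k in a { print a[k] "\t" $0 "\t" prev }
        # remember this line so that the next line can use it as its URL
        { prev = $0 }
    ' RS= "$1" RS='\n' "$2"
}

# Sample data copied from the question, written to a throwaway directory.
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
xxx_006 19-0000_xxx-yyy-aaa
xxx_007 19-0001_zzz-bbb-ccc
FolderName_001_001

yyyy_0287 19-0v02-xxx
yyyy_0289 19-0v31-xxxx
yyyy_0293 19-0v05-xxxx
FolderName_002_002
EOF
cat > "$tmp/file2" <<'EOF'
https://xx.yy.zz/path/to/file.zip
19-0v05-xxx_000_001
cc112233
https://xx.yy.zz/path/to/download/file.zip
19-0v31-xxx-001-000
bb4456784
https://xx.yy.zz/path/to/file.zip
19-0v02-xxx_000_001
aaa331232
EOF
lookup_matches "$tmp/file1" "$tmp/file2"
rm -rf "$tmp"
```

With this sample data the sketch prints three lines, one per sub-folder, each starting with FolderName_002_002 and ending with the URL taken from the line before the matching id.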
Update based on your awk edit:

You can also just print the required info from the `awk` line and then process it in bash; there is no need to do everything in `awk` (also remove the line that defines `cmd_fmt` in your awk edit section):
    awk '
        # create an associative array (key/value pairs) based on file1
        NR==FNR { for(i=2; i<NF; i+=2) a[substr($i,1,7)] = $NF; next }
        # take the first 7 characters of each line in file2 as the key to test against the above hash
        { k = substr($0, 1, 7) }
        # if k is found, then print
        k in a { print a[k] "\t" $0 "\t" l }
        # save the previous line in l, which is supposed to be the URL
        { l = $0 }
    ' RS= file1 RS='\n' file2 | while IFS=$'\t' read -r base_dir sub_dir link; do
        echo "download [$link] to '$base_dir/$sub_dir'"
        # bash command lines to make sub-folders and download files
        mkdir -p "$base_dir/$sub_dir"
        # skip this entry if the sub-folder cannot be entered
        cd "$base_dir/$sub_dir" || continue
        if curl -O -v -k -X GET "$link" -H "Content-Type:application/x-www-form-urlencoded" -H "Authorization:xxxx" >/dev/null 2>&1; then
            echo " + processing $link"
            # remove query_string from the link, since it might contain '/'
            filename="${link%\?*}"
            # remove path from filename and run `unzip`
            unzip "${filename##*/}"
        else
            echo " + error downloading: $link"
        fi
        # return to the base directory if it's a relative path
        # if all are absolute paths, then just comment out the following line
        cd ../..
    done
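The `cd ../..` at the end of the loop only returns to the starting point when `$base_dir/$sub_dir` is a relative path exactly two levels deep and the earlier `cd` succeeded. One possible alternative, sketched here as an assumption rather than as part of the answer, is to run the per-directory commands in a subshell so the caller's working directory never changes. `run_in_dir` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create a directory if needed, then run a command inside it.
# The parentheses start a subshell, so the cd does not affect the caller.
run_in_dir() {
    local dir=$1
    shift
    mkdir -p -- "$dir" || return
    ( cd -- "$dir" && "$@" )
}

# Example: create an empty marker file inside a nested directory under a temp dir.
base=$(mktemp -d)
run_in_dir "$base/FolderName_002_002/19-0v02-xxx_000_001" touch marker
ls "$base/FolderName_002_002/19-0v02-xxx_000_001"
rm -rf "$base"
```

In the loop above, the `curl` and `unzip` steps could be wrapped in a function and passed to `run_in_dir`, which would make the trailing `cd ../..` unnecessary.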
Note: I did not test the `curl` line and don't know what the file names could be for different links. `filename="${link%\?*}"` removes a trailing query string from the link (it is stripped first because a query string might itself contain '/'), and `"${filename##*/}"` then removes all characters up to the last '/', which leaves the file name. The file actually downloaded by your `curl` command might be named differently, which you will have to check and adjust on your end.
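The two parameter expansions can be tried on their own, without downloading anything. The sketch below is an illustration under assumptions: `zip_name_from_url` is a hypothetical helper, the query string is invented, and it uses `%%` so that everything from the first `?` onward is dropped. It only derives a name from the URL text; the name that `curl -O` actually writes should still be checked.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive a local file name from a download URL.
zip_name_from_url() {
    local url=$1
    # drop the query string first, since it may itself contain "/"
    local no_query=${url%%\?*}
    # then drop everything up to and including the last "/"
    printf '%s\n' "${no_query##*/}"
}

# Both calls print the same name: file.zip
zip_name_from_url 'https://xx.yy.zz/path/to/file.zip'
zip_name_from_url 'https://xx.yy.zz/path/to/download/file.zip?token=a/b'
```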
answered Mar 7 at 14:08, edited Mar 12 at 21:10 – jxc
Thank you very much :). – cm0728 Mar 8 at 22:00

The zip archive `$filename` is extracted with the same name, so I need to rename `$filename` to tmp and unzip tmp, but I cannot seem to do that without making the command error. Thank you :). `&& filename="%s"; unzip "${filename##*/}" && rm -f "${filename##*/}"; '` – cm0728 Mar 12 at 16:05

@cm0728, you are very welcome :). I am not sure about your actual data files, but it seems much easier to just post-process the actual files under bash. The awk `system()` function might not invoke bash, and thus the bash parameter expansion will not work. – jxc Mar 12 at 16:26

I can see the output on the terminal but the directories are not created. I added the `cd "%s/%s" && curl` under the `#bash download`. Thank you :). – cm0728 Mar 12 at 17:02

Adding the `curl` didn't work. Is there something else? Thank you :). – cm0728 Mar 12 at 19:06
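On the point about `system()`: awk hands the command string to `sh`, which is not necessarily bash. The `${var##pattern}` and `${var%pattern}` expansions are part of POSIX `sh`, so they are available there, while bash-only features such as arrays may not be. A minimal sketch to check this on a given machine follows; the helper name `basename_via_awk_system` is made up for illustration, and the quoting is only adequate for simple URLs like the sample one, not for untrusted input.

```shell
#!/usr/bin/env bash
# Hypothetical helper: let awk system() strip the path from a URL using
# the POSIX ${var##pattern} expansion inside the spawned shell.
basename_via_awk_system() {
    awk -v url="$1" 'BEGIN {
        # build the command: f="<url>"; echo "${f##*/}"
        cmd = sprintf("f=\"%s\"; echo \"${f##*/}\"", url)
        system(cmd)
    }'
}

basename_via_awk_system 'https://xx.yy.zz/path/to/file.zip'
```

If this prints the bare file name, the shell behind `system()` handles the expansion, and a failure in a longer command string would have to come from somewhere else, such as the quoting of the format text.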