create it… terminal, push it… S3
In this post, we'll use the Linux shell to generate a batch of files, add some text to each file, push all the files to S3, and then read the files back using the AWS S3 CLI.
The AWS docs mention that 1000 objects are retrieved at a time by the AWS CLI.
By default, the AWS CLI uses a page size of 1000 and retrieves all available items. For example, if you run
aws s3api list-objects on an Amazon S3 bucket that contains 3,500 objects, the AWS CLI automatically makes four calls to Amazon S3, handling the service-specific pagination logic for you in the background and returning all 3,500 objects in the final output.
Pagination here only concerns the API calls that the AWS CLI makes behind the scenes. It doesn't affect the user-facing output at all, unlike the pagination a developer routinely thinks about when dealing with a RESTful web API. I personally find this slightly counterintuitive.
There are several types of shell programming languages.
sh (Bourne Shell), bash (Bourne Again Shell), ksh (Korn Shell), zsh (Z-Shell)
A shell is a program that wraps around the kernel of the computer and lets you communicate with it. It's named "shell" because it covers the kernel up, as is typically the function of a shell. Now what is the kernel, you ask?
The kernel is the core of the operating system that controls all the tasks of the system while the shell is the interface that allows the users to communicate with the kernel. -Jagadish Hiremath
Newer Macs use zsh by default, so I will be using that as well. The good news, from what I've read, is that zsh is very similar to bash, which makes it easy to switch between them.
First off, let’s create some txt files with zsh. To create a file with zsh, use the touch command.
It will create the file in the current folder. But what about creating multiple files? We will use brace expansion to generate any number of files. Let's create 2000, since AWS returns 1000 records per API call by default. And let's not forget to create a folder for these txt files.
mkdir aws-begin-txt-files (create the folder/directory)
cd aws-begin-txt-files (move into the folder/directory)
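Putting the file-creation step together, here is a quick sketch. The "aws-file" prefix is just an example name I'm choosing; any prefix works:

```shell
# Create 2000 empty files in one shot using brace expansion;
# {1..2000} expands to 1 2 3 ... 2000 in both zsh and bash
touch aws-file{1..2000}.txt

# Count the files to confirm we got all 2000
ls | wc -l
```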
We should now see a terminal window filled with txt files.
Now we add some text to each file. Let's add the phrase "hello aws-beginners!" to each file. We will use a classic for loop:
for i in *.txt
do
  echo "hello aws-beginners!" >> "$i"
done
If you open any one of the files using nano, you will see the text; the loop added it to every file.
Now that we have our files, let's upload them to S3.
We aren't going to upload manually with the mouse and the AWS console, as tempted as you may be. Rather, we will leverage the power of the command line: the AWS CLI. Before using it, we have to install it.
I’m using a mac, so here is the command straight from the AWS docs :
$ curl "https://awscli.amazonaws.com/AWSCLIV2.pkg" -o "AWSCLIV2.pkg"
$ sudo installer -pkg AWSCLIV2.pkg -target /
After this, you will need to configure the AWS CLI with your credentials, so have your Access Key ID and Secret Access Key handy. The Access Key ID is always visible in the IAM (Identity and Access Management) users section, but the Secret Access Key is only shown one time, when the access key is created.
You can create a new access key at any time by clicking the "Create access key" button. Once you have your credentials, we configure the AWS CLI.
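Configuration is a single interactive command. The values shown below are placeholders, not real credentials, and the region/format answers are just common defaults:

```shell
aws configure
# AWS Access Key ID [None]: AKIAxxxxxxxxxxxxxxxx
# AWS Secret Access Key [None]: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# Default region name [None]: us-east-1
# Default output format [None]: json
```

This writes your credentials to ~/.aws/credentials so every subsequent aws command can use them.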
Now to the meat of our task. Let’s create an S3 bucket, and upload the files to the bucket.
The --recursive flag is required when copying over multiple files; an error results without it. Notice that before copying, we typed the aws s3 ls (list) command to view our S3 buckets.
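For reference, the commands behind this step look roughly like the following. The bucket name aws-begin-txt-bucket is an example I made up (S3 bucket names must be globally unique), and these commands need valid AWS credentials, so treat this as a sketch:

```shell
# List your existing S3 buckets
aws s3 ls

# Make a new bucket (mb = make bucket); pick your own globally unique name
aws s3 mb s3://aws-begin-txt-bucket

# Copy every file in the current folder into the bucket;
# --recursive is required when copying multiple files
aws s3 cp . s3://aws-begin-txt-bucket --recursive
```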
Now we head over to the AWS console to verify the files are indeed present.
Now we will look at a couple of commands having to do with the AWS pagination concept we briefly touched upon.
This will display on the screen however many items you choose.
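Assuming the flag in question is --max-items (my reading of the step above), a sketch against our example bucket looks like this:

```shell
# Show only the first 5 objects to the user; the CLI still
# handles any underlying pagination for you
aws s3api list-objects-v2 --bucket aws-begin-txt-bucket --max-items 5
```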
The page-size is the number of items the AWS CLI retrieves per API call, and again, it does NOT control how many items are shown to us as the user. In other words, if you set --page-size to 50, all 2000 items will still be retrieved, but instead of performing 2 API calls of 1000 items each, 40 API calls of 50 items each will be performed. This could be useful for performance issues that may arise with larger API responses. You will also notice that with a smaller page-size, the AWS CLI takes longer to retrieve the records. Why? Because more API calls are being made over the internet.
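A sketch of the page-size behavior, again using the made-up example bucket name:

```shell
# All 2000 objects are still returned to us, but fetched 50 per
# API call (40 calls instead of 2), so this runs noticeably slower
aws s3api list-objects-v2 --bucket aws-begin-txt-bucket --page-size 50
```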