How to Read Amazon S3 Files Directly in Java/Spring Boot Without Downloading

Amazon Simple Storage Service (S3) offers a reliable and scalable solution for object storage in the cloud.

In today's cloud-native world, developers frequently leverage cloud storage services like Amazon S3 to store and manage files efficiently. However, traditional approaches involving downloading files locally can introduce performance bottlenecks.

In this tutorial, we'll explore how to streamline your workflow and boost application speed by directly reading files from S3, bypassing the need for local file transfers.

The complete code for this project is available on my GitHub repository.


Dependencies

To get started with this tutorial, add the following dependencies to your Spring Boot project. They provide the components needed to interact with AWS services, including S3.

Gradle

ext {
    springCloudAwsVersion = '3.0.0'
}

dependencies {
    // Spring Cloud AWS BOM (bill of materials)
    implementation platform("io.awspring.cloud:spring-cloud-aws-dependencies:${springCloudAwsVersion}")
    implementation 'io.awspring.cloud:spring-cloud-aws-starter-s3'
    // AWS provides a high-level file transfer utility called Transfer Manager, along with a CRT-based S3 client.
    // The starter automatically configures and registers a software.amazon.awssdk.transfer.s3.S3TransferManager bean if the following dependency is added to the project:
    implementation 'software.amazon.awssdk:s3-transfer-manager'
    // Transfer Manager works best with the CRT S3 client. To auto-configure a CRT-based S3AsyncClient, add the following dependency to your project:
    implementation 'software.amazon.awssdk.crt:aws-crt'
}

Maven

<dependencies>
    <dependency>
        <groupId>io.awspring.cloud</groupId>
        <artifactId>spring-cloud-aws-starter-s3</artifactId>
    </dependency>
	
    <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>s3-transfer-manager</artifactId>
    </dependency>
	
    <dependency>
        <groupId>software.amazon.awssdk.crt</groupId>
        <artifactId>aws-crt</artifactId>
    </dependency>
</dependencies>

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>io.awspring.cloud</groupId>
            <artifactId>spring-cloud-aws-dependencies</artifactId>
            <version>3.0.0</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

Configuration and Credentials

Spring Cloud AWS offers auto-configuration for S3Client, S3TransferManager, and S3Template, making setup a breeze. Here's what you need to add to your application.properties file:

 # s3 Configuration
 spring.cloud.aws.credentials.access-key=
 spring.cloud.aws.credentials.secret-key=
 # Configures the endpoint used by S3Client. I'm working in the Asia Pacific (Mumbai) region, so I've configured the ap-south-1 region and endpoint.
 spring.cloud.aws.s3.endpoint=https://s3.ap-south-1.amazonaws.com
 spring.cloud.aws.region.static=ap-south-1

IAM Permissions for S3 Read Access

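To grant your IAM user or role permission to read files from S3, attach the following policy to it. Note that the resource arn:aws:s3:::*/* matches every object in every bucket; to limit access, replace the first * with a specific bucket name.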

 {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "readAccess",
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::*/*"
        }
    ]
 }

Reading S3 Objects Directly into Java Streams

S3Client is a Java interface for interacting with Amazon S3. We'll use S3Client to directly read the S3 object into an input stream. Then, we'll use a BufferedReader to process the file line by line.

 
 import org.springframework.stereotype.Service;
 import software.amazon.awssdk.core.ResponseInputStream;
 import software.amazon.awssdk.services.s3.S3Client;
 import software.amazon.awssdk.services.s3.model.GetObjectRequest;
 import software.amazon.awssdk.services.s3.model.GetObjectResponse;

 import java.io.BufferedReader;
 import java.io.InputStreamReader;
 import java.nio.charset.StandardCharsets;
 import java.util.Arrays;

 // The class name is illustrative; any Spring-managed bean works.
 @Service
 public class S3FileReader {

    // Auto-configured by Spring Cloud AWS and injected through the constructor.
    private final S3Client s3Client;

    public S3FileReader(S3Client s3Client) {
        this.s3Client = s3Client;
    }

    public void readFileUsingS3Client(String bucketName, String key) {
        GetObjectRequest getObjectRequest = GetObjectRequest.builder()
                .bucket(bucketName)
                .key(key)
                .build();

        // try-with-resources closes the reader and the underlying S3 stream,
        // which releases the HTTP connection. Assumes the object is UTF-8 text.
        try (ResponseInputStream<GetObjectResponse> objectStream = s3Client.getObject(getObjectRequest);
             BufferedReader bufferedReader = new BufferedReader(
                     new InputStreamReader(objectStream, StandardCharsets.UTF_8))) {

            String line;
            while ((line = bufferedReader.readLine()) != null) {
                // Since the file format is CSV, we'll split each line by commas (,) to access individual columns.
                String[] split = line.split(",");
                System.out.println("line: " + Arrays.toString(split));
            }
        } catch (Exception exception) {
            exception.printStackTrace();
        }
    }
 }

The getObject() method of S3Client allows us to read an S3 object directly into a Java input stream. It takes a GetObjectRequest object as an argument, specifying the bucket name and object key.
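Because the stream returned by getObject() is an ordinary java.io.InputStream, the line-by-line parsing can be kept separate from the S3 call and tried out against any stream. The following is a minimal sketch of that idea, not part of the original project: the class name CsvLineStreamer and the method forEachRow are hypothetical, the input is assumed to be UTF-8, and the naive comma split assumes the CSV has no quoted fields.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

// Hypothetical helper, for illustration only.
public class CsvLineStreamer {

    // Reads the stream one line at a time, hands each row to the handler,
    // and returns the number of rows seen. Only the current line is held in memory.
    public static long forEachRow(InputStream in, Consumer<String[]> rowHandler) {
        long rowCount = 0;
        // try-with-resources closes the reader and the wrapped stream.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Naive split: assumes no quoted fields; -1 keeps trailing empty columns.
                rowHandler.accept(line.split(",", -1));
                rowCount++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rowCount;
    }

    public static void main(String[] args) {
        // An in-memory stream stands in for the S3 response here.
        InputStream sample = new ByteArrayInputStream(
                "id,name\n1,alice\n2,bob\n".getBytes(StandardCharsets.UTF_8));
        long rows = forEachRow(sample, row -> System.out.println(String.join(" | ", row)));
        System.out.println("rows read: " + rows);
    }
}
```

In the S3 case, the stream returned by s3Client.getObject(getObjectRequest) could be passed as the first argument in place of the in-memory sample.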

Conclusion

Reading a file directly from Amazon S3 as a stream removes the step of downloading it to local storage first. Because the object is processed line by line, the application can start working as soon as the first bytes arrive and never has to hold the whole file in memory, which reduces both latency and memory usage.

To enable public access to your S3 objects without exposing your AWS credentials, explore how to generate presigned URLs in our comprehensive guide.


Keep learning and keep growing.
