Publishing Conda Packages on Amazon S3
We worked together with AWS Deadline Cloud to improve S3 support in our tools! All rattler-based tools now seamlessly authenticate using default credentials on your system, making it simple to upload, download and index packages on S3 buckets.
This means rattler-build upload, rattler-index, and pixi now all natively talk to S3 buckets - which makes this by far the easiest way to run small-scale private channels.
Let’s take a look at the integration:
# Build a package with rattler-build and publish it to `my-bucket`
rattler-build publish --recipe ./bzip2/recipe.yaml --to s3://my-bucket

# Use the package with pixi
pixi global install -c s3://my-bucket bzip2

# And lastly, this works too (pixi publish coming later)
pixi build && pixi upload s3 -c s3://my-bucket ./my-package-1.2.3-h12345_0.conda

# Manually index (if not using `publish`) with rattler-index
rattler-index s3 s3://my-bucket
The new rattler-build publish functionality
Often, you want to build a package and share it. Until now, that was a multi-step process: build, upload, and index (for S3 buckets or the filesystem). And when building a minor update to a package, you also had to edit the build number in the recipe.
The new rattler-build publish command makes this very smooth. It first builds the recipes, then uploads the resulting packages to the specified channel (s3://my-bucket, artifactory://conda.company.com/my-channel, …) and indexes where necessary.
If you upload to prefix.dev and use trusted publishing, it’s also simple to generate and upload a sigstore attestation with --create-attestation on Github Actions.
For quick iteration and rebuilding, you can also conveniently bump the build number with publish … --build-number=+1.
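For example, a quick rebuild of the bzip2 recipe from the snippet above (same placeholder recipe path and bucket) could look like this:

# Rebuild the same recipe with the build number bumped by one, then upload and re-index
rattler-build publish --recipe ./bzip2/recipe.yaml --to s3://my-bucket --build-number=+1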
Using Conda packages with AWS Deadline Cloud
AWS Deadline Cloud is a fully managed service that lets you get a scalable visual compute farm up and running in minutes. It provides packages for software such as Blender, Cinema 4D, Houdini, Maya, V-Ray, Nuke, VRED, and more, conveniently accessible from a default Conda queue environment in your cloud render farm - and the Deadline Cloud team is an early adopter of rattler-build for next-gen conda packaging.
When you need to make your own packages for your farm, the new rattler-build publish functionality makes it simpler than ever to create and update a private conda channel on S3. Once your farm is configured and your channel is initialized, you can build a package for your farm in a single step. For example, you could build the NeRF Studio sample recipe on Github on a CUDA Linux host by running:
rattler-build publish ./nerfstudio/recipe/recipe.yaml \
  --to s3://amzn-s3-demo-bucket/Conda/Default \
  -c conda-forge \
  --build-number=+1
Technical Details
Natively Authenticating with AWS
When you log in to AWS using the CLI or SDKs, credentials are stored in a few different ways on your machine. Thankfully, the Rust crates we use already supported loading credentials from all the standard locations that AWS tools use.
IAM Roles and Instance Metadata
A highly secure method when running on AWS infrastructure: Amazon Elastic Compute Cloud (Amazon EC2) instances, Amazon Elastic Container Service (Amazon ECS) tasks, and AWS Lambda functions can assume AWS Identity and Access Management (IAM) roles automatically. The credentials are fetched from the instance metadata service (IMDS) and rotated automatically - no need to manage long-term credentials at all!
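Concretely, this is roughly what the SDK does for you on an EC2 instance, shown here with plain curl against IMDSv2 purely for illustration (the role name and whether IMDSv2 is enforced depend on your setup; you never need to do this yourself):

# Request a session token from the instance metadata service (IMDSv2)
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")

# List the IAM role attached to the instance; appending the role name to this URL
# returns the temporary credentials (AccessKeyId, SecretAccessKey, Token, Expiration)
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  "http://169.254.169.254/latest/meta-data/iam/security-credentials/"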
Our implementation uses the aws-config and aws-sdk-s3 crates, which handle this credential chain automatically. They check environment variables first, then the credentials file, and finally fall back to instance metadata. This means your code works seamlessly whether you're developing locally or running in production on AWS.
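As a quick sanity check (not part of the rattler tools themselves), the AWS CLI resolves credentials through essentially the same default chain, so you can confirm up front which identity a publish will run as:

# Confirm which account/role the default credential chain resolves to
aws sts get-caller-identity

# The same invocation then works unchanged, whether the credentials came from
# environment variables, ~/.aws/credentials, or instance metadata
rattler-build publish --recipe ./bzip2/recipe.yaml --to s3://my-bucket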
OIDC Authentication with S3 from Github Actions
Another cool feature we unlocked by using native AWS authentication methods is the ability to use OIDC from Github Actions to assume an AWS role without having to set an API key. The OIDC token from Github Actions is exchanged for short-lived AWS credentials that are only valid for the duration of the workflow, making this more secure than API keys (you cannot lose an API key that does not exist).
To configure this, you can use the AWS aws-actions/configure-aws-credentials Action on Github:
# Note: the job needs `permissions: id-token: write` for the OIDC exchange to work
- name: Configure AWS (OIDC)
  uses: aws-actions/configure-aws-credentials@v4.3.1
  with:
    aws-region: ${{ env.AWS_REGION }}
    role-to-assume: arn:aws:iam::239378270001:role/conda-rattler-e2e-test
- run: rattler-build publish --to s3://my-channel ...
You can follow the guide by Github on how to configure this on the AWS side, or take a look at our example.
Environment Variables
The classic method is to set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY as environment variables. This is often used in CI/CD pipelines. You can also specify AWS_SESSION_TOKEN for temporary credentials and AWS_REGION to set your default region.
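In a CI job, that could look like this (all values are placeholders; the bucket is the one from the earlier examples):

# Static or temporary credentials picked up from the environment (placeholder values)
export AWS_ACCESS_KEY_ID="AKIA..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_SESSION_TOKEN="..."   # only needed for temporary credentials
export AWS_REGION="eu-central-1"

rattler-index s3 s3://my-bucket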
Login with console credentials
AWS recently launched the aws login command, which lets you use your AWS console credentials to log in and obtain short-term credentials for the AWS CLI and SDKs. rattler-build, rattler-index, and pixi have all been updated to support this method.
You can configure multiple logins to different profiles, then select a profile using the AWS_PROFILE environment variable.
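For example (the profile name here is made up):

# Use a specific profile for a single command
AWS_PROFILE=my-work-profile rattler-index s3 s3://my-bucket

# Or export it for the whole shell session
export AWS_PROFILE=my-work-profile
pixi global install -c s3://my-bucket bzip2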
Technical Highlight: Preventing Race Conditions in S3
When uploading packages from multiple CI pipelines, you could run into race conditions - especially when indexing and writing the repodata.json files: two or more pipelines could attempt to write to the same file concurrently. Luckily, the S3 API has a mechanism to prevent this: If-Match headers! With them we can validate that the file we are about to overwrite is the same one we just read (meaning no other process has overwritten it in the meantime).
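As a rough sketch of the pattern (not the exact code in rattler-index, and assuming an AWS CLI recent enough that s3api put-object exposes --if-match; bucket and key are placeholders), the read-modify-write loop looks like this:

# Remember the ETag of the repodata.json we are about to modify
ETAG=$(aws s3api head-object --bucket my-bucket --key noarch/repodata.json \
  --query ETag --output text)
aws s3api get-object --bucket my-bucket --key noarch/repodata.json repodata.json

# ... merge the new package entries into repodata.json locally ...

# Write it back only if nobody changed it in the meantime; a
# "412 Precondition Failed" means another pipeline won the race: re-read and retry
aws s3api put-object --bucket my-bucket --key noarch/repodata.json \
  --body repodata.json --if-match "$ETAG"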
Conclusion
You might wonder: isn't prefix also offering a package registry? And yes, we obviously are. But we are determined to make Conda ubiquitous, and that's only possible if there is absolutely no vendor lock-in. Of course, we still strive to offer a superior experience on our platform: channels with a CDN, extremely fast indexing, user management, download count insights, a frontend to browse packages, trusted publishing, sigstore attestations, ...
If you try out the new S3 functionality, let us know how it goes. A huge thanks also to Mark Wiebe from AWS Deadline Cloud for the guidance in designing a great workflow around AWS services.